Underground personnel detection and tracking based on improved YOLOv5s and DeepSORT

邵小强; 李鑫; 杨涛; 杨永德; 刘士博; 原泽文

doi:10.13199/j.cnki.cst.2022-1933

Abstract: Real-time monitoring and tracking of moving targets is essential to building smart mines. Underground inspection robots make real-time monitoring of workers possible, but uneven lighting, coal dust interference, and similar factors prevent traditional image detection algorithms from detecting workers accurately. To address this, the paper proposes an underground personnel detection and tracking algorithm based on improved YOLOv5s and DeepSORT that can be deployed on underground inspection robots. First, a dataset is built from video recorded by surveillance cameras and inspection robots. An improved YOLOv5s network then identifies underground personnel. Because the complex network structure and large parameter count of detection and tracking algorithms limit the response speed of the detection model, an improved lightweight ShuffleNetV2 network replaces the original YOLOv5s backbone, CSP-Darknet53. To reduce interference from complex image backgrounds and increase the attention paid to workers, a Transformer self-attention module is integrated into the improved ShuffleNetV2. In addition, the FPN+PAN structure in the Neck is replaced with a BiFPN structure so that multi-scale features are fused effectively and inference information is transmitted effectively. Next, an improved DeepSORT is used to encode and track personnel. Because the underground environment is dark, poorly lit, and lacking in texture, DeepSORT has difficulty extracting personnel appearance information, so the small residual network in DeepSORT is replaced with a deeper convolutional network to strengthen appearance feature extraction. Finally, the improved algorithm is validated on a public pedestrian dataset and a self-built underground personnel detection and tracking dataset. The results show that, compared with the original YOLOv5s model, the improved detection model raises average detection accuracy by 5.2%, reduces the number of parameters by 41%, and increases speed by 21%. The improved YOLOv5s-DeepSORT tracking method reaches an accuracy of 89.17% and a speed of 67 FPS, so it can be deployed on underground inspection robots for real-time detection and tracking of workers.
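Two of the building blocks named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the standard channel shuffle from the ShuffleNet literature and the fast normalized fusion commonly associated with BiFPN. The function names `channel_shuffle` and `fast_normalized_fusion` are hypothetical, and feature maps are represented as flat lists of equal length, on the assumption that they have already been resized to a common resolution.

```python
from typing import List, Sequence


def channel_shuffle(channels: Sequence, groups: int) -> List:
    """Channel shuffle as used in ShuffleNet-style blocks (illustrative).

    Views the channel axis as (groups, channels_per_group), transposes it
    to (channels_per_group, groups), and flattens it again, so that
    information is mixed across groups.
    """
    n = len(channels)
    if groups <= 0 or n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    # Output slot (i, g) takes input channel g * per_group + i.
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


def fast_normalized_fusion(features: Sequence[Sequence[float]],
                           weights: Sequence[float],
                           eps: float = 1e-4) -> List[float]:
    """BiFPN-style weighted fusion of same-shape feature maps (illustrative).

    Computes sum_i(w_i * F_i) / (eps + sum_i(w_i)), where each weight is
    first clipped to be non-negative (ReLU). In a real network the weights
    would be learned; here they are plain numbers (an assumption).
    """
    if not features or len(features) != len(weights):
        raise ValueError("need one weight per feature map")
    length = len(features[0])
    if any(len(f) != length for f in features):
        raise ValueError("feature maps must have the same shape")
    clipped = [max(0.0, w) for w in weights]
    denom = eps + sum(clipped)
    return [sum(w * f[k] for w, f in zip(clipped, features)) / denom
            for k in range(length)]


if __name__ == "__main__":
    print(channel_shuffle(list(range(6)), groups=2))
    print(fast_normalized_fusion([[1.0, 2.0], [3.0, 4.0]], [1.0, 1.0]))
```

With equal weights the fusion is close to a plain average of the inputs, and a negative weight is clipped to zero so that the corresponding feature map is ignored.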
