Conveyor belt deviation detection method based on a dual-flow fusion network

  • Abstract: Among traditional conveyor belt deviation detection methods, contact-based detection is costly and non-contact detection has low accuracy. With the development of artificial intelligence, methods based on convolutional neural networks (CNNs) have effectively improved detection accuracy, but because convolution is an inherently local operation, they perceive long-distance and global information poorly, which makes further gains in belt edge detection accuracy difficult. To address these problems: ① By combining the local feature extraction ability of convolution in a traditional CNN with the Transformer's ability to perceive global, long-distance information, a dual-flow belt edge detection network model that fuses global and local information, the Dual-Flow Transformer Network (DFTNet), is proposed; it improves belt edge detection accuracy and suppresses interference from image noise and background. ② A CNN-Transformer feature fusion module is designed to form a dual-flow encoder-decoder structure. This design fuses global context information more effectively, avoids the need to pre-train the Transformer on large-scale datasets, and allows the network structure to be adjusted flexibly. ③ Using conveyor belt images collected from real industrial sites, a conveyor belt dataset is constructed that covers five different scenes with multiple viewing angles and positions. Experimental results show that DFTNet achieves the best overall performance, with a mean intersection over union (mIoU) of 91.08%, accuracy (ACC) of 99.48%, mean precision (mPrecision) of 91.88%, and mean recall (mRecall) of 96.22%. These are 25.36%, 0.29%, 17.70%, and 29.46% higher, respectively, than the purely convolutional HRNet, and 29.5%, 0.32%, 24.77%, and 34.13% higher, respectively, than the fully convolutional network (FCN), and DFTNet also shows considerable improvements in parameter count and computation speed. In addition, DFTNet processes images at 53.07 fps, which meets industrial real-time requirements and gives the method considerable practical value.
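The four metrics quoted in the abstract (mIoU, ACC, mPrecision, mRecall) can all be derived from a per-class confusion matrix over pixels. The following is a minimal sketch under the standard textbook definitions; the abstract does not describe the authors' evaluation code, so the function name `segmentation_metrics` and the toy matrix are illustrative assumptions, not the paper's implementation.

```python
def segmentation_metrics(conf):
    """Compute mIoU, ACC, mPrecision and mRecall from a confusion matrix.

    conf[i][j] is the number of pixels whose true class is i and whose
    predicted class is j. Hypothetical helper, standard definitions assumed.
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    ious, precisions, recalls = [], [], []
    for c in range(n):
        tp = conf[c][c]                                  # correctly labelled pixels of class c
        fn = sum(conf[c]) - tp                           # class-c pixels that were missed
        fp = sum(conf[r][c] for r in range(n)) - tp      # pixels wrongly labelled as class c
        ious.append(tp / (tp + fp + fn) if tp + fp + fn else 0.0)
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        recalls.append(tp / (tp + fn) if tp + fn else 0.0)
    correct = sum(conf[c][c] for c in range(n))
    return {
        "mIoU": sum(ious) / n,                 # per-class IoU averaged over classes
        "ACC": correct / total if total else 0.0,  # overall pixel accuracy
        "mPrecision": sum(precisions) / n,
        "mRecall": sum(recalls) / n,
    }


# Toy two-class example (background vs. belt edge); the counts are made up.
toy_conf = [[8, 2],
            [1, 9]]
metrics = segmentation_metrics(toy_conf)
```

Because pixel accuracy is pooled over all pixels while mIoU, mPrecision, and mRecall are averaged per class, a dominant background class can keep ACC high even when the edge class is segmented poorly. This is one plausible reading of why the reported ACC gaps between models are small while the mIoU gaps are large.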

     
