Coalmine image super-resolution reconstruction via fusing multi-dimensional feature and residual attention network
Abstract: Underground coal mine environments are complex. Affected by lighting, coal dust, and water mist, captured images often have blurred details and missing textures, and the resulting low-resolution images constrain the intelligent development of coal mine safety monitoring. Image super-resolution reconstruction, an important image processing technique, aims to recover clear high-resolution images from low-resolution coalmine images, thereby significantly improving the reliability of intelligent monitoring and safety management in coal mines. To address quality degradation such as missing edge texture information and blurred details in coalmine images, a super-resolution reconstruction method that fuses multi-dimensional features with a residual attention network is proposed. First, a multi-branch network fuses dynamic convolution and channel attention in parallel, capturing different spatial statistics through “horizontal-channel” and “vertical-channel” interactions. Second, a recursive sparse self-attention mechanism is designed to aggregate representative feature maps with linear complexity, adaptively selecting the weight allocation and reducing information redundancy during computation. Finally, the basic unit for deep feature extraction is built from standard multi-head self-attention and residual connections; the resulting features and the shallow features are fed together into the reconstruction module via skip connections to complete the super-resolution reconstruction of coalmine images. Experimental results indicate that the proposed method clearly outperforms existing mainstream algorithms in both objective evaluation metrics and subjective visual analysis. In tests on the coalmine dataset, at 2× and 4× scaling factors, LPIPS (Learned Perceptual Image Patch Similarity) decreases by an average of 10.97% and 9.91%, and PSNR (Peak Signal-to-Noise Ratio) increases by an average of 4.10% and 2.30%, respectively, demonstrating the method's effectiveness in restoring the structure and texture details of coalmine images.
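For reference, the standard multi-head self-attention with a residual connection that the abstract names can be written in its textbook form as below. This is a sketch of the standard definitions only: the symbols $X$, $W_i^{Q}$, $W_i^{K}$, $W_i^{V}$, $W^{O}$, $h$, and $d_k$ are notation assumed here, and the abstract does not state where normalization or feed-forward layers sit inside the authors' basic unit.

```latex
\begin{aligned}
\mathrm{Attention}(Q, K, V) &= \mathrm{softmax}\!\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V,\\
\mathrm{head}_i &= \mathrm{Attention}\!\left(X W_i^{Q},\; X W_i^{K},\; X W_i^{V}\right), \quad i = 1, \dots, h,\\
\mathrm{MSA}(X) &= \mathrm{Concat}(\mathrm{head}_1, \dots, \mathrm{head}_h)\, W^{O},\\
Y &= X + \mathrm{MSA}(X).
\end{aligned}
```

The last line is the residual connection: the block output adds the attention result back onto its input, so the deep branch only has to learn a correction to the features it receives.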
Table 1 Performance comparison of different super-resolution algorithms on three benchmark datasets (each cell is PSNR↑/SSIM↑)

| Scale factor | Model | Set5 | Set14 | Urban100 |
|---|---|---|---|---|
| ×2 | SRCNN | 36.66/0.9542 | 32.45/0.9067 | 29.50/0.8946 |
| ×2 | CSRCNN | 37.45/0.9570 | 34.34/0.9240 | 29.88/0.9020 |
| ×2 | VDSR | 37.53/0.9587 | 33.03/0.9124 | 30.76/0.9140 |
| ×2 | EDSR | 38.11/0.9602 | 33.92/0.9195 | 32.93/0.9351 |
| ×2 | SPAN | 38.27/0.9614 | 34.34/0.9240 | 33.34/0.9384 |
| ×2 | ELAN | 38.11/0.9609 | 33.82/0.9196 | 32.62/0.9328 |
| ×2 | SwinIR | 38.35/0.9620 | 34.14/0.9227 | 33.40/0.9393 |
| ×2 | DAT | 38.34/0.9619 | 34.43/0.9247 | 33.54/0.9402 |
| ×2 | Ours | 38.43/0.9637 | 34.94/0.9232 | 33.81/0.9725 |
| ×4 | SRCNN | 30.49/0.8628 | 27.50/0.7315 | 24.47/0.7229 |
| ×4 | CSRCNN | 31.01/0.8702 | 28.47/0.7720 | 24.62/0.7280 |
| ×4 | VDSR | 31.35/0.8838 | 28.01/0.7674 | 25.18/0.7524 |
| ×4 | EDSR | 32.46/0.8968 | 28.80/0.7876 | 26.64/0.8033 |
| ×4 | SPAN | 32.63/0.9002 | 28.87/0.7889 | 26.82/0.8087 |
| ×4 | ELAN | 32.47/0.8983 | 28.81/0.7868 | 26.60/0.8015 |
| ×4 | SwinIR | 32.72/0.9021 | 28.94/0.7914 | 27.07/0.8164 |
| ×4 | DAT | 32.74/0.9013 | 29.02/0.7914 | 27.14/0.8149 |
| ×4 | Ours | 32.91/0.9342 | 29.91/0.8891 | 27.72/0.8549 |
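Since PSNR is the main fidelity metric reported here, a minimal sketch of its standard definition may help when reading the numbers. It assumes 8-bit grayscale images stored as nested lists; the function name `psnr` and the sample pixel values are hypothetical. The source does not state its evaluation protocol (for example, which channel is scored or whether borders are cropped), so this is not a reproduction of the authors' pipeline.

```python
import math


def psnr(reference, restored, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images.

    Both images are nested lists of pixel intensities (rows of values).
    """
    if len(reference) != len(restored) or any(
        len(a) != len(b) for a, b in zip(reference, restored)
    ):
        raise ValueError("images must have the same shape")

    squared_error = 0.0
    count = 0
    for ref_row, res_row in zip(reference, restored):
        for r, s in zip(ref_row, res_row):
            squared_error += (r - s) ** 2
            count += 1
    if count == 0:
        raise ValueError("images must not be empty")

    mse = squared_error / count  # mean squared error over all pixels
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Hypothetical 3x3 patches: a reference and a slightly perturbed restoration.
hr = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
sr = [[54, 55, 60], [59, 77, 63], [75, 62, 60]]
print(f"PSNR = {psnr(hr, sr):.2f} dB")
```

Because PSNR is logarithmic in the mean squared error, a gain of a few tenths of a dB between neighbouring rows of the table corresponds to a noticeable reduction in pixel-wise error.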
Table 2 Performance comparison of different algorithms on the coalmine dataset

| Scale factor | Model | PSNR↑ | SSIM↑ | LPIPS↓ | VIF↑ |
|---|---|---|---|---|---|
| ×2 | SRCNN | 28.4619 | 0.9503 | 0.2971 | 0.6012 |
| ×2 | CSRCNN | 30.6238 | 0.9596 | 0.2934 | 0.6103 |
| ×2 | VDSR | 30.6404 | 0.9664 | 0.2831 | 0.6265 |
| ×2 | EDSR | 30.7148 | 0.9675 | 0.2799 | 0.7154 |
| ×2 | SPAN | 30.8366 | 0.9700 | 0.2776 | 0.7518 |
| ×2 | ELAN | 31.0719 | 0.9713 | 0.2768 | 0.7789 |
| ×2 | SwinIR | 31.8084 | 0.9755 | 0.2654 | 0.8324 |
| ×2 | DAT | 32.0146 | 0.9716 | 0.2381 | 0.8693 |
| ×2 | Ours | 33.3269 | 0.9796 | 0.2118 | 0.8820 |
| ×4 | SRCNN | 23.1245 | 0.8021 | 0.3980 | 0.5872 |
| ×4 | CSRCNN | 23.6153 | 0.8153 | 0.3128 | 0.5981 |
| ×4 | VDSR | 23.9827 | 0.8276 | 0.3135 | 0.6318 |
| ×4 | EDSR | 24.7814 | 0.8432 | 0.3106 | 0.6387 |
| ×4 | SPAN | 25.4756 | 0.8575 | 0.3082 | 0.6805 |
| ×4 | ELAN | 26.1092 | 0.8762 | 0.3054 | 0.7264 |
| ×4 | SwinIR | 26.7248 | 0.8743 | 0.2931 | 0.7702 |
| ×4 | DAT | 27.3493 | 0.8837 | 0.2905 | 0.7903 |
| ×4 | Ours | 27.9792 | 0.8885 | 0.2617 | 0.8147 |
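The percentage gains quoted in the abstract can be checked against Table 2 with a short calculation. The sketch below assumes the comparison baseline is DAT, the strongest competing method in the table; the source describes the figures only as averages and does not confirm this choice. The helper name `relative_change` is hypothetical, and the values are copied from the DAT and Ours rows.

```python
def relative_change(baseline, ours):
    """Percentage change of `ours` relative to `baseline`."""
    return (ours - baseline) / baseline * 100.0


# (PSNR, LPIPS) pairs copied from the DAT and Ours rows of Table 2.
table2 = {
    "x2": {"DAT": (32.0146, 0.2381), "Ours": (33.3269, 0.2118)},
    "x4": {"DAT": (27.3493, 0.2905), "Ours": (27.9792, 0.2617)},
}

for scale, rows in table2.items():
    psnr_gain = relative_change(rows["DAT"][0], rows["Ours"][0])
    # LPIPS is lower-is-better, so report the size of the decrease.
    lpips_drop = -relative_change(rows["DAT"][1], rows["Ours"][1])
    print(f"{scale}: PSNR +{psnr_gain:.2f}%, LPIPS -{lpips_drop:.2f}%")
```

Under this assumption the PSNR gains come out at about 4.10% and 2.30% and the ×4 LPIPS decrease at about 9.91%, matching the abstract. The ×2 LPIPS decrease comes out near 11.05% rather than the stated 10.97%, so the exact baseline or averaging used by the authors remains unconfirmed.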
Table 3 Comparison of different algorithms on the MOS evaluation metric

| Scale factor | SRCNN | CSRCNN | SPAN | VDSR | SwinIR | DAT | Ours |
|---|---|---|---|---|---|---|---|
| ×2 | 0.5270 | 0.5492 | 0.5992 | 0.6108 | 0.6326 | 0.7194 | 0.7426 |
| ×4 | 0.4342 | 0.4782 | 0.4889 | 0.5412 | 0.5625 | 0.5701 | 0.6041 |
Table 4 Comparison of the impact of different modules on model performance under the 2× scaling factor (each result cell is PSNR↑/SSIM↑)

| Model | MSA | MDIA | RS_SA | Coalmine images | Set5 | Set14 | Urban100 |
|---|---|---|---|---|---|---|---|
| a | √ | × | × | 32.15/0.9610 | 37.74/0.9525 | 33.28/0.9156 | 32.45/0.9278 |
| b | × | × | √ | 32.68/0.9712 | 37.89/0.9621 | 34.61/0.9198 | 33.18/0.9447 |
| c | × | √ | × | 32.79/0.9643 | 38.01/0.9597 | 34.68/0.9169 | 33.03/0.9541 |
| d | × | √ | √ | 33.33/0.9738 | 38.43/0.9637 | 34.94/0.9232 | 33.81/0.9725 |