Coalmine image super-resolution reconstruction via fusing multi-dimensional feature and residual attention network

CHENG Jian, MI Lifei, LI Hao, LI Heping, WANG Guangfu, MA Yongzhuang

Citation: CHENG Jian, MI Lifei, LI Hao, et al. Coalmine image super-resolution reconstruction via fusing multi-dimensional feature and residual attention network[J]. Coal Science and Technology, 2024, 52(11): 117-128. DOI: 10.12438/cst.2024-1055

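Funding: National Key Research and Development Program of China (2023YFC2907600); Key Special Project of the Science and Technology Innovation and Entrepreneurship Fund of Tiandi Science and Technology Co., Ltd. (2021-TD-ZD002, 2022-2-TD-ZD001)

    Author biography:

    CHENG Jian (b. 1974), male, from Pingchang, Sichuan; research fellow, doctoral supervisor, PhD. E-mail: jiancheng@tsinghua.org.cn

    Corresponding author:

    LI Heping (b. 1978), male, from Jingmen, Hubei; professor-level senior engineer, doctoral supervisor, PhD. E-mail: lihp@ccteg-bigdata.com

  • CLC number: TD76; TP18; TP391.41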

Coalmine image super-resolution reconstruction via fusing multi-dimensional feature and residual attention network

  • Abstract:

    The underground environment of coal mines is complex. Under the influence of lighting, coal dust, and water mist, captured images often suffer from blurred details and missing textures, and the resulting low-resolution images constrain the intelligent development of coal mine safety monitoring. Image super-resolution reconstruction, an important image processing technique, aims to recover clear high-resolution images from low-resolution coalmine images, thereby significantly enhancing the reliability of intelligent monitoring and safety management in coal mines. To address quality degradation such as missing edge texture information and blurred details in coalmine images, the authors propose a coalmine image super-resolution reconstruction method that fuses multi-dimensional features with a residual attention network. First, a multi-branch network integrates dynamic convolution and channel attention in parallel, capturing different spatial statistical characteristics through “horizontal-channel” and “vertical-channel” interactions. Second, a recursive sparse self-attention mechanism is designed to aggregate representative feature maps with linear complexity, adaptively selecting the weight distribution and reducing information redundancy during computation. Finally, the basic unit for deep feature extraction is built from the standard multi-head self-attention mechanism and residual connections; the resulting features and the shallow features are fed jointly into the reconstruction module through skip connections to complete the super-resolution reconstruction of coalmine images. Experimental results indicate that the proposed method clearly outperforms existing mainstream algorithms in both objective evaluation metrics and subjective visual analysis. In tests on the coalmine dataset, LPIPS (Learned Perceptual Image Patch Similarity) decreases by an average of 10.97% and 9.91%, and PSNR (Peak Signal-to-Noise Ratio) increases by an average of 4.10% and 2.30%, at the 2× and 4× scaling factors respectively, demonstrating the method's effectiveness in restoring the structure and texture details of coalmine images.
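
    The abstract describes a deep feature extraction unit built from self-attention and residual connections. As a rough illustration of that general idea only, the following is a minimal NumPy sketch of single-head scaled dot-product self-attention wrapped in a residual connection. It is not the paper's RMA, MDIA, or RS_SA design; the function name, the token count, the feature dimension, and the random projection matrices are all hypothetical.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = x - x.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def residual_self_attention(tokens, w_q, w_k, w_v):
    """Return tokens + Attention(tokens) and the attention weights.

    tokens: (n, d) array of n feature vectors (e.g. flattened image patches).
    w_q, w_k, w_v: (d, d) projection matrices (hypothetical, untrained here).
    """
    q, k, v = tokens @ w_q, tokens @ w_k, tokens @ w_v
    scores = q @ k.T / np.sqrt(q.shape[-1])   # (n, n) pairwise similarities
    weights = softmax(scores, axis=-1)        # each row sums to 1
    attended = weights @ v                    # weighted mix of value vectors
    return tokens + attended, weights         # residual (skip) connection


# Illustrative sizes only: 16 tokens with 8 features each.
rng = np.random.default_rng(0)
n_tokens, dim = 16, 8
tokens = rng.standard_normal((n_tokens, dim))
w_q, w_k, w_v = (rng.standard_normal((dim, dim)) * 0.1 for _ in range(3))

out, weights = residual_self_attention(tokens, w_q, w_k, w_v)
print(out.shape, weights.shape)
```

    The residual path means the block can fall back to the identity when the attention branch contributes nothing, which is one common reason such connections are used in deep restoration networks. Note that this dense form costs quadratic time in the number of tokens, whereas the abstract states that the proposed recursive sparse variant runs with linear complexity.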

  • Figure 1. Network structure of the coalmine image super-resolution reconstruction method fusing multi-dimensional features and a residual attention network

    Figure 2. Residual Mixed Attention (RMA)

    Figure 3. Multi-Dimensional Interactive Attention module (MDIA)

    Figure 4. Recursive Sparse Self-Attention (RS_SA)

    Figure 5. Examples from the coalmine image dataset

    Figure 6. Comparison of reconstruction results on coalmine images at the ×4 scaling factor

    Figure 7. Comparison of pixel-difference color maps of reconstruction results from different algorithms

    Table 1. Performance comparison of different super-resolution algorithms on three benchmark datasets (each cell is PSNR↑/SSIM↑)

    Scale  Model   Set5          Set14         Urban100
    ×2     SRCNN   36.66/0.9542  32.45/0.9067  29.50/0.8946
    ×2     CSRCNN  37.45/0.9570  34.34/0.9240  29.88/0.9020
    ×2     VDSR    37.53/0.9587  33.03/0.9124  30.76/0.9140
    ×2     EDSR    38.11/0.9602  33.92/0.9195  32.93/0.9351
    ×2     SPAN    38.27/0.9614  34.34/0.9240  33.34/0.9384
    ×2     ELAN    38.11/0.9609  33.82/0.9196  32.62/0.9328
    ×2     SwinIR  38.35/0.9620  34.14/0.9227  33.40/0.9393
    ×2     DAT     38.34/0.9619  34.43/0.9247  33.54/0.9402
    ×2     Ours    38.43/0.9637  34.94/0.9232  33.81/0.9725
    ×4     SRCNN   30.49/0.8628  27.50/0.7315  24.47/0.7229
    ×4     CSRCNN  31.01/0.8702  28.47/0.7720  24.62/0.7280
    ×4     VDSR    31.35/0.8838  28.01/0.7674  25.18/0.7524
    ×4     EDSR    32.46/0.8968  28.80/0.7876  26.64/0.8033
    ×4     SPAN    32.63/0.9002  28.87/0.7889  26.82/0.8087
    ×4     ELAN    32.47/0.8983  28.81/0.7868  26.60/0.8015
    ×4     SwinIR  32.72/0.9021  28.94/0.7914  27.07/0.8164
    ×4     DAT     32.74/0.9013  29.02/0.7914  27.14/0.8149
    ×4     Ours    32.91/0.9342  29.91/0.8891  27.72/0.8549
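
    Table 1 reports PSNR in decibels. For readers unfamiliar with the metric, the following is a minimal pure-Python sketch of the standard definition, PSNR = 10·log10(MAX² / MSE), for 8-bit grayscale images. The page does not say which evaluation protocol the authors used (for example, luminance-channel evaluation or border cropping, both common in super-resolution benchmarks), so the function name `psnr`, the `max_value` default, and the toy pixel values are assumptions for illustration only.

```python
import math


def psnr(reference, reconstructed, max_value=255.0):
    """PSNR in dB between two equally sized images given as lists of pixel rows."""
    ref_flat = [p for row in reference for p in row]
    rec_flat = [p for row in reconstructed for p in row]
    if len(ref_flat) != len(rec_flat):
        raise ValueError("images must contain the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(ref_flat, rec_flat)) / len(ref_flat)
    if mse == 0:
        return math.inf  # identical images: PSNR is unbounded
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 3x3 "images" with made-up pixel values.
reference = [[52, 55, 61], [59, 79, 61], [76, 62, 59]]
reconstructed = [[54, 55, 60], [59, 77, 61], [75, 62, 60]]
print(f"{psnr(reference, reconstructed):.2f} dB")
```

    Because PSNR is logarithmic in the mean squared error, a gain of a few tenths of a decibel between neighbouring rows of the table corresponds to only a few percent reduction in MSE.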

    Table 2. Performance comparison of different algorithms on the coalmine dataset

    Scale  Model   PSNR↑    SSIM↑   LPIPS↓  VIF↑
    ×2     SRCNN   28.4619  0.9503  0.2971  0.6012
    ×2     CSRCNN  30.6238  0.9596  0.2934  0.6103
    ×2     VDSR    30.6404  0.9664  0.2831  0.6265
    ×2     EDSR    30.7148  0.9675  0.2799  0.7154
    ×2     SPAN    30.8366  0.9700  0.2776  0.7518
    ×2     ELAN    31.0719  0.9713  0.2768  0.7789
    ×2     SwinIR  31.8084  0.9755  0.2654  0.8324
    ×2     DAT     32.0146  0.9716  0.2381  0.8693
    ×2     Ours    33.3269  0.9796  0.2118  0.8820
    ×4     SRCNN   23.1245  0.8021  0.3980  0.5872
    ×4     CSRCNN  23.6153  0.8153  0.3128  0.5981
    ×4     VDSR    23.9827  0.8276  0.3135  0.6318
    ×4     EDSR    24.7814  0.8432  0.3106  0.6387
    ×4     SPAN    25.4756  0.8575  0.3082  0.6805
    ×4     ELAN    26.1092  0.8762  0.3054  0.7264
    ×4     SwinIR  26.7248  0.8743  0.2931  0.7702
    ×4     DAT     27.3493  0.8837  0.2905  0.7903
    ×4     Ours    27.9792  0.8885  0.2617  0.8147
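
    The abstract quotes percentage changes in PSNR and LPIPS on the coalmine dataset without naming the baseline. As a worked check, the sketch below computes the relative change of the proposed method against DAT, the strongest competing row in Table 2; choosing DAT as the baseline is an assumption, not something the page states. Under that assumption the PSNR gains come out at about 4.10% (×2) and 2.30% (×4) and the ×4 LPIPS reduction at about 9.91%, matching the abstract, while the ×2 LPIPS reduction comes out at about 11.05% rather than the quoted 10.97%.

```python
def relative_change_percent(baseline, value):
    """Signed relative change of value with respect to baseline, in percent."""
    return (value - baseline) / baseline * 100.0


# Values copied from Table 2; the choice of DAT as baseline is an assumption.
table2 = {
    "x2": {
        "DAT": {"PSNR": 32.0146, "LPIPS": 0.2381},
        "Ours": {"PSNR": 33.3269, "LPIPS": 0.2118},
    },
    "x4": {
        "DAT": {"PSNR": 27.3493, "LPIPS": 0.2905},
        "Ours": {"PSNR": 27.9792, "LPIPS": 0.2617},
    },
}

changes = {}
for scale, rows in table2.items():
    for metric in ("PSNR", "LPIPS"):
        change = relative_change_percent(rows["DAT"][metric], rows["Ours"][metric])
        changes[(scale, metric)] = change
        print(f"{scale} {metric}: {change:+.2f}%")
```

    A positive change is an improvement for PSNR (higher is better), while a negative change is an improvement for LPIPS (lower is better).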

    Table 3. Comparison of different algorithms on the MOS (mean opinion score) evaluation metric

    Scale  SRCNN   CSRCNN  SPAN    VDSR    SwinIR  DAT     Ours
    ×2     0.5270  0.5492  0.5992  0.6108  0.6326  0.7194  0.7426
    ×4     0.4342  0.4782  0.4889  0.5412  0.5625  0.5701  0.6041

    Table 4. Comparison of the impact of different modules on model performance at the 2× scaling factor (each result cell is PSNR↑/SSIM↑; in the module columns MSA, MDIA, and RS_SA only the × marks survive, and their column positions are not recoverable)

    Model  Module marks  Coalmine images  Set5          Set14         Urban100
    a      ××            32.15/0.9610     37.74/0.9525  33.28/0.9156  32.45/0.9278
    b      ××            32.68/0.9712     37.89/0.9621  34.61/0.9198  33.18/0.9447
    c      ××            32.79/0.9643     38.01/0.9597  34.68/0.9169  33.03/0.9541
    d      ×             33.33/0.9738     38.43/0.9637  34.94/0.9232  33.81/0.9725
Publication history
  • Received: 2024-07-18
  • Published online: 2024-11-06
  • Issue date: 2024-11-24
