Abstract:
The complexity and variability of underground environments pose severe challenges to safe production, making video surveillance a key technological means of ensuring operational safety. Visible-light and thermal-infrared images both play important roles in underground monitoring owing to their respective advantages; however, the inherent limitations of single-modal images cannot meet the information-completeness requirements of intelligent underground monitoring. Fusing the heterogeneous images to exploit their complementary strengths is therefore an effective solution to this problem.

To address the poor lighting adaptability, local feature loss, and artifact interference exhibited by traditional fusion algorithms in the special underground environment, a dual-encoder fusion algorithm for heterogeneous underground images based on low-light correction is proposed. First, to avoid the feature confusion that arises when a single encoder extracts features from heterogeneous images, which leaves insufficient original-image information in the fused result, separate visible-light and thermal-infrared encoders are designed based on convolutional neural network and Transformer architectures, respectively, to extract the features of the original heterogeneous images effectively. Second, to mitigate the local feature loss in fused images caused by uneven lighting in underground environments, a selective lighting feature enhancement module is designed to improve the visual quality of low-illumination regions. Third, a parallel global-and-local feature extraction module is designed to capture both macro-level semantic information and micro-level detail, thereby enriching the features of the fused images.
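The selective lighting enhancement idea described above can be illustrated with a minimal sketch: estimate a coarse illumination map, then brighten only the regions it marks as dark. The box-filter estimator, gamma correction, and threshold below are illustrative assumptions, not the paper's learned module.

```python
import numpy as np

def illumination_map(img, ksize=15):
    """Coarse illumination estimate via a local box-filter mean.

    img: float array in [0, 1], shape (H, W). A hand-crafted stand-in
    (assumption) for a learned illumination estimator.
    """
    pad = ksize // 2
    padded = np.pad(img, pad, mode="edge")
    # Sliding-window mean via an integral image (summed-area table).
    ii = np.cumsum(np.cumsum(padded, axis=0), axis=1)
    ii = np.pad(ii, ((1, 0), (1, 0)))  # zero row/col so indexing is uniform
    h, w = img.shape
    s = (ii[ksize:ksize + h, ksize:ksize + w]
         - ii[:h, ksize:ksize + w]
         - ii[ksize:ksize + h, :w]
         + ii[:h, :w])
    return s / (ksize * ksize)

def selective_enhance(img, gamma=0.5, thresh=0.3):
    """Brighten only low-illumination regions; leave well-lit areas untouched."""
    light = illumination_map(img)
    # Soft mask: 1 in fully dark regions, 0 where illumination >= thresh.
    mask = np.clip((thresh - light) / thresh, 0.0, 1.0)
    enhanced = img ** gamma  # gamma < 1 lifts dark pixels
    return mask * enhanced + (1.0 - mask) * img
```

The soft mask is what makes the enhancement "selective": well-lit regions pass through unchanged, so the correction cannot wash out areas that were already visible.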
Finally, to alleviate artifact interference in the fused images, a low-light correction loss function guided by illumination information is proposed to help the decoder dynamically adjust the fusion weights, strengthening the fused image's retention of complementary information from the heterogeneous sources.

To verify the advantages of the proposed algorithm, it was compared with nine existing fusion algorithms on a self-built dataset. The experimental results show that the proposed algorithm effectively mitigates the local information loss caused by uneven lighting, reduces artifact interference in the fused results, and improves the information completeness of the fused images. Compared with the baseline algorithms, the proposed algorithm shows significant advantages in five core metrics: spatial frequency, average gradient, spectral correlation difference, visual information fidelity, and correlation coefficient. Moreover, the fused images accord better with human visual perception.
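The illumination-guided weighting behind the proposed loss can be sketched as follows. This is a hand-crafted approximation under stated assumptions: per-pixel visible brightness serves as the illumination confidence, standing in for the paper's learned fusion weights and loss formulation.

```python
import numpy as np

def fuse_with_illumination(vis, ir):
    """Illumination-guided fusion: dark visible regions lean on the IR image.

    vis, ir: float arrays in [0, 1], same shape. The per-pixel weight
    (visible brightness itself) is an illustrative assumption.
    """
    w = vis  # brightness as a proxy for illumination confidence
    return w * vis + (1.0 - w) * ir

def low_light_correction_loss(fused, vis, ir):
    """Illumination-weighted intensity loss (assumed form): in well-lit
    pixels the fused image should follow the visible image; in dark
    pixels it should follow the thermal-infrared image."""
    w = vis
    return np.mean(w * (fused - vis) ** 2 + (1.0 - w) * (fused - ir) ** 2)
```

Because the weight varies per pixel, the loss pushes the decoder toward the visible source exactly where lighting is adequate and toward the infrared source in dark regions, which is one way to suppress artifacts that arise from blending the two modalities uniformly.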