Research on a Generative Adversarial Algorithm for Atmospheric Turbulence Image Restoration Incorporating Non-Local Attention
Abstract: Random optical path variations caused by atmospheric turbulence introduce severe geometric distortion, pixel-level jitter, and dynamic blur into long-range image sequences, which hinders high-level semantic understanding and downstream use of the images. Traditional physical-model methods rely on strong prior assumptions and are computationally expensive, while existing 2D deep learning methods struggle to exploit spatio-temporal correlations. To address both problems, this study proposes a spatio-temporal residual-aware Wasserstein Generative Adversarial Network (GAN) based on a 3D non-local attention mechanism for blind restoration of turbulence-degraded image sequences. The algorithm extends non-local attention to the 3D spatio-temporal domain: it builds a full spatio-temporal affinity matrix to capture cross-frame and long-range spatial dependencies in the video sequence, which allows it to correct non-rigid geometric deformation. Because the cost of these high-dimensional feature interactions grows quadratically, a spatial sub-sampling strategy is applied to the key and value transformation paths of the attention mechanism, which substantially reduces computation while preserving the global receptive field. The network also integrates a temporal alignment module based on deformable convolution and a spatial attention enhancement module to handle the temporal inconsistency caused by turbulence. The optimization objective combines a perceptual loss, a pixel-level mean squared error loss, and a Wasserstein adversarial loss, guiding the generator to reconstruct sharp images with both structural fidelity and high-frequency texture detail. Experimental results show that the proposed method achieves superior restoration performance on synthetic turbulence data.
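The saving from sub-sampling the key and value paths can be illustrated with a rough count of affinity-matrix entries. The sketch below is only an illustration under assumed settings: the function name `affinity_entries`, the feature-map size (5 frames of 64 x 64 positions), and the stride of 2 are hypothetical and are not taken from the paper.

```python
# Rough count of affinity-matrix entries for 3D non-local attention.
# All sizes below are hypothetical; the abstract does not state them.

def affinity_entries(frames, height, width, stride=1):
    """Return (num_queries, num_keys, num_entries) for one attention block.

    Queries cover every spatio-temporal position. Keys and values are
    sub-sampled spatially by `stride` (assumed here to behave like
    stride x stride pooling); the temporal axis is left untouched.
    """
    num_queries = frames * height * width
    # Ceiling division, so odd sizes still keep a border position.
    num_keys = frames * -(-height // stride) * -(-width // stride)
    return num_queries, num_keys, num_queries * num_keys


full = affinity_entries(5, 64, 64)               # no sub-sampling
reduced = affinity_entries(5, 64, 64, stride=2)  # keys/values sub-sampled

print(full)                   # (20480, 20480, 419430400)
print(reduced)                # (20480, 5120, 104857600)
print(full[2] / reduced[2])   # 4.0
```

Under these assumptions every query position still attends to keys drawn from all frames and all spatial regions, so the receptive field stays global, while the affinity matrix shrinks by roughly the square of the stride. The cost remains quadratic in the number of positions; only the constant factor is reduced.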
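Cite this article: Su Yusheng (苏钰盛), Zhou Yihan (周奕含). Research on a Generative Adversarial Algorithm for Atmospheric Turbulence Image Restoration Incorporating Non-Local Attention [J]. Advances in Applied Mathematics (应用数学进展), 2026, 15(4): 220-231. https://doi.org/10.12677/aam.2026.154152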
