Lightweight Image Super-Resolution Reconstruction Algorithm Based on a Hybrid Perception and Frequency-Adaptive Gating Network
Abstract: Transformer-based methods have made significant progress in single-image super-resolution (SR) thanks to their ability to model long-range dependencies. However, existing lightweight Transformer architectures usually cut computation through channel compression or sparse window mechanisms, which weakens the spatial continuity of local features. In addition, the inherent low-pass filtering behavior of self-attention limits the network's ability to recover high-frequency texture details. To address this frequency bias and loss of local information, this paper proposes HPG-SR, a lightweight image super-resolution reconstruction algorithm based on a Hybrid Perception and Frequency-Adaptive Gating network. First, a Hybrid Perception Gated Attention module runs a local perception branch in parallel with window attention and combines them through a learnable gating mechanism, explicitly enhancing local high-frequency details while retaining the global receptive field of large windows. Second, a Multi-Scale Gated Feed-Forward Network replaces the traditional static activation function with dual-path multi-scale convolutions and context gating, improving the network's adaptive selection of features across different frequencies. Finally, a Contrast-Aware Feature Refinement module uses standard-deviation statistics to strengthen feature responses in texture-rich regions. Extensive experiments on five benchmark datasets show that HPG-SR outperforms state-of-the-art lightweight SR methods at comparable parameter counts and computational cost. The improvement in detail recovery is most pronounced on Urban100, a dataset with complex textures.
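The abstract does not give the internals of the Contrast-Aware Feature Refinement module, so the following is only a minimal NumPy sketch of the general idea it names: a per-channel gate driven by the spatial standard deviation. It assumes a descriptor of the form "standard deviation plus mean", as used by contrast-aware channel attention in earlier lightweight SR networks. The function names, the `weight` and `bias` parameters, and the toy inputs are all hypothetical and are not taken from HPG-SR.

```python
import numpy as np


def contrast_descriptor(features):
    """Per-channel contrast statistic for a (C, H, W) feature map.

    Assumed form: spatial standard deviation plus spatial mean.
    Returns an array of shape (C,).
    """
    mean = features.mean(axis=(1, 2))
    std = features.std(axis=(1, 2))
    return std + mean


def contrast_aware_refine(features, weight, bias):
    """Rescale each channel by a sigmoid gate computed from its descriptor.

    weight (C, C) and bias (C,) are hypothetical learnable parameters.
    Returns the rescaled features and the per-channel gate values.
    """
    descriptor = contrast_descriptor(features)
    gate = 1.0 / (1.0 + np.exp(-(weight @ descriptor + bias)))
    return features * gate[:, None, None], gate


# Toy input: two channels with the same mean but different texture.
flat = np.full((8, 8), 0.5)                       # no texture, std = 0
checker = np.indices((8, 8)).sum(axis=0) % 2      # 0/1 checkerboard
textured = 0.5 + 0.4 * (2 * checker - 1)          # values 0.1 and 0.9
feats = np.stack([flat, textured])

# Identity weight and zero bias keep the example easy to follow.
out, gate = contrast_aware_refine(feats, np.eye(2), np.zeros(2))
print(contrast_descriptor(feats))  # descriptor per channel
print(gate)                        # gate per channel
```

With these toy parameters the textured channel receives the larger gate value, because its standard deviation raises the descriptor while the two channel means are equal. In a trained network the mapping from descriptor to gate would be learned rather than fixed.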
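Cite this article: 庞梦鑫, 董智红, 曹鹏, 张鸣赟. Lightweight Image Super-Resolution Reconstruction Algorithm Based on a Hybrid Perception and Frequency-Adaptive Gating Network [J]. Computer Science and Application (计算机科学与应用), 2026, 16(1): 8-19. https://doi.org/10.12677/csa.2026.161002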
