HFF-UNet: A Lightweight Hybrid Feature Fusion Network for Medical Image Segmentation
DOI: 10.12677/jisp.2025.144041. Supported by research project funding.
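Authors: Jiang Linfeng (蒋林烽), Lu Hongxuan (卢洪轩), Xu Airu (徐爱茹), Yu Ziyi (余子怡), Yao Xingxing (姚兴兴)*: School of Mathematics and Physics, Wuhan Institute of Technology, Wuhan, Hubei; Zhang Yaoyan (张耀严): School of Optical Information and Energy Engineering, Wuhan Institute of Technology, Wuhan, Hubei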
Keywords: Medical Image Segmentation, Lightweight Network, Attention Mechanism, Feature Fusion, U-Net
Abstract: Medical image segmentation plays a key role in disease diagnosis and treatment planning, but existing methods such as U-Net and its variants still suffer from insufficient global context modeling, inefficient feature fusion, and high model complexity. To address these issues, this paper proposes HFF-UNet (Hybrid Feature Fusion U-Net), a lightweight medical image segmentation network built on three modules: 1) an efficient multi-scale attention module that strengthens multi-scale feature representation; 2) a Pyramid Pooling Excitation Module that refines the skip connections and narrows the semantic gap between the encoder and decoder; and 3) a Hybrid Feature Fusion Block that optimizes feature fusion and improves detail recovery. On the public GlaS and CVC-ClinicDB datasets, compared with the original U-Net, HFF-UNet improves the Dice coefficient by 1.80% and 2.05% and the IoU by 2.06% and 1.89%, respectively, while reducing the parameter count by 92.20% and 82.45% and the computational cost by 90.89% and 73.48%, respectively. It also significantly outperforms existing lightweight models. Ablation studies further validate the effectiveness of each module. The study offers a solution for complex medical image segmentation that balances accuracy and efficiency and has substantial clinical application value.
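The Dice coefficient and IoU reported in the abstract are standard overlap metrics between a predicted mask and a ground-truth mask. The following is a minimal sketch of how they are commonly computed for binary masks, not the authors' evaluation code. The function name `dice_and_iou`, the nested-list mask format, and the convention that two empty masks score 1.0 are assumptions made for illustration.

```python
from typing import Sequence, Tuple


def dice_and_iou(pred: Sequence[Sequence[int]],
                 target: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Return (Dice, IoU) for two binary masks given as nested 0/1 lists."""
    # Flatten both masks into lists of booleans.
    p = [bool(v) for row in pred for v in row]
    t = [bool(v) for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")

    inter = sum(a and b for a, b in zip(p, t))  # |P ∩ T|
    p_sum, t_sum = sum(p), sum(t)               # |P|, |T|
    union = p_sum + t_sum - inter               # |P ∪ T|

    # Assumed convention: two empty masks count as a perfect match.
    if union == 0:
        return 1.0, 1.0

    dice = 2 * inter / (p_sum + t_sum)
    iou = inter / union
    return dice, iou


if __name__ == "__main__":
    pred = [[1, 1, 0],
            [0, 1, 0]]
    target = [[1, 0, 0],
              [0, 1, 1]]
    dice, iou = dice_and_iou(pred, target)
    print(f"Dice={dice:.4f} IoU={iou:.4f}")  # Dice=0.6667 IoU=0.5000
```

For a single pair of masks the two metrics are related by Dice = 2·IoU / (1 + IoU), so they always rank one prediction the same way. Averages over a dataset can still differ, which is one reason papers usually report both.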
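Cite this article: Jiang, L., Lu, H., Zhang, Y., Xu, A., Yu, Z. and Yao, X. (2025) HFF-UNet: A Lightweight Hybrid Feature Fusion Network for Medical Image Segmentation. Journal of Image and Signal Processing, 14(4), 443-456. https://doi.org/10.12677/jisp.2025.144041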
