Segmentation Network for Paediatric Renograms Based on an Attention Mechanism
DOI: 10.12677/mos.2024.135454
Author: Li Zhili, School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai
Keywords: Kidney Image Segmentation, Deep Learning, Semantic Segmentation, Attention Mechanism
Abstract: This paper proposes a novel network architecture that addresses U-Net's partial loss of global information in paediatric renogram segmentation. Because the U-Net encoder reduces the spatial resolution of the input image through repeated downsampling, the model cannot fully exploit the image's global context, and a large amount of important information is lost in the process. The Transformer architecture is therefore introduced into the deep layers of U-Net to learn target position information and the relationship between local and global features. In addition, a Multiscale Channel and Spatial Attention (MSCA) module is designed; it fuses multiscale features to compensate for the information lost during pooling, effectively capturing the global context of the feature map while suppressing background noise. Finally, comparative and ablation experiments on paediatric renogram data collected at Xinhua Hospital verify the effectiveness of the proposed method.
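To make the channel-then-spatial attention idea concrete, the following is a minimal NumPy sketch of the general pattern (in the spirit of CBAM-style attention), not the paper's exact MSCA module: channel attention gates each feature channel by a weight derived from global average pooling, and spatial attention then reweights every pixel location by a map that pools across channels. The learned convolutions and the multiscale feature fusion of the real module are omitted for brevity; all function names here are illustrative.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(x):
    # x: feature map of shape (C, H, W).
    # Squeeze: global average pooling gives one scalar per channel;
    # each channel is then gated by its sigmoid-activated weight.
    w = sigmoid(x.mean(axis=(1, 2)))          # shape (C,)
    return x * w[:, None, None]

def spatial_attention(x):
    # Collapse the channel axis with mean and max pooling, then
    # combine them into a single spatial weight map that can
    # suppress background regions of the feature map.
    avg = x.mean(axis=0)                       # shape (H, W)
    mx = x.max(axis=0)                         # shape (H, W)
    w = sigmoid((avg + mx) / 2.0)              # shape (H, W)
    return x * w[None, :, :]

# Example: apply channel attention followed by spatial attention.
feat = np.random.randn(8, 16, 16)
out = spatial_attention(channel_attention(feat))
print(out.shape)  # (8, 16, 16) -- attention preserves the feature shape
```

In a full implementation the pooled descriptors would pass through small learned layers (e.g. a bottleneck MLP for the channel branch and a convolution for the spatial branch) before the sigmoid, and multiscale features from several encoder stages would be fused first.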
Article citation: Li, Z. (2024) Segmentation Network for Paediatric Renograms Based on an Attention Mechanism. Modeling and Simulation, 13(5), 5021-5032. https://doi.org/10.12677/mos.2024.135454
