Polyp Image Segmentation Network Based on Dual-Branch Feature Extraction
DOI: 10.12677/SEA.2024.131001. Supported by the National Natural Science Foundation of China.
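Authors: Yuanjie Lin, Xiaoxiang Han, Keyan Chen, Weikun Zhang: School of Health Science and Engineering, University of Shanghai for Science and Technology, Shanghai; Qiaohong Liu*: School of Medical Instruments, Shanghai University of Medicine and Health Sciences, Shanghai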
Keywords: Medical Image Segmentation; Polyp Segmentation; CNN; Transformer; Feature Fusion; Feature Extraction
Abstract: Colon polyp segmentation is a key step in extracting pathological information from colon polyp images and is important for the diagnosis and treatment of colorectal cancer. Polyps vary widely in shape and size, and lesion tissue is often hard to distinguish from the background. To address these problems, this paper proposes DST-Net, a medical polyp segmentation network with dual-branch feature extraction based on a CNN and the Swin Transformer, which combines the strength of convolutional neural networks in extracting local features with that of Transformers in extracting global features. DST-Net is an encoder-decoder architecture. First, two encoders, one based on VGG and one based on atrous convolution (AC), extract local boundary features and multi-scale features, respectively. Next, the bottleneck module between the encoder and decoder uses two consecutive Swin Transformer blocks, exploiting the Transformer's ability to model long-range dependencies to further strengthen global feature extraction. Finally, a channel attention block (CAB) is applied in the skip connections between the encoder and decoder so that the network pays more attention to suspicious and complex regions. The method was validated on two public polyp datasets, CVC-ClinicDB and Kvasir. The results show that the model outperforms the other methods compared and segments colon polyps accurately and effectively.
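The abstract says that one encoder branch relies on atrous (dilated) convolution, abbreviated AC, to capture multi-scale features. The following is a minimal pure-Python sketch of that general operation in one dimension, not the authors' implementation: the function names `dilated_conv1d` and `effective_kernel_size`, the toy signal, and the kernel are all hypothetical and chosen only for illustration. It shows how increasing the dilation rate widens the span of input covered by a kernel without adding weights.

```python
def effective_kernel_size(kernel_size, dilation):
    """Input span covered by a kernel whose taps are `dilation` samples apart."""
    return (kernel_size - 1) * dilation + 1


def dilated_conv1d(signal, kernel, dilation=1):
    """1D cross-correlation with a dilated kernel, no padding, stride 1.

    Illustrative helper (hypothetical name), not taken from the paper.
    """
    span = effective_kernel_size(len(kernel), dilation)
    out = []
    for start in range(len(signal) - span + 1):
        acc = 0.0
        for k, weight in enumerate(kernel):
            # Consecutive kernel taps read inputs `dilation` positions apart.
            acc += weight * signal[start + k * dilation]
        out.append(acc)
    return out


signal = [1, 2, 3, 4, 5, 6, 7]  # toy input, an assumption for the demo
kernel = [1, 0, -1]             # simple difference filter

dense = dilated_conv1d(signal, kernel, dilation=1)   # compares x[i] with x[i+2]
atrous = dilated_conv1d(signal, kernel, dilation=2)  # compares x[i] with x[i+4]

print(dense)   # [-2.0, -2.0, -2.0, -2.0, -2.0]
print(atrous)  # [-4.0, -4.0, -4.0]
```

With the same three weights, the dilated kernel spans five input samples instead of three. Running several such kernels at different dilation rates in parallel is one common way to gather context at multiple scales; whether DST-Net arranges its atrous branch this way is not stated in the abstract.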
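Citation: Lin, Y., Liu, Q., Han, X., Chen, K. and Zhang, W. (2024) Polyp Image Segmentation Network Based on Dual-Branch Feature Extraction. Software Engineering and Applications, 13(1), 1-10. https://doi.org/10.12677/SEA.2024.131001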
