Colorectal Cancer Slice Image Segmentation Based on VMamba-CNN Hybrid
DOI: 10.12677/mos.2025.144331 | Supported by the National Natural Science Foundation of China
Authors: Wang Shaoyu, Chen Qingkui (School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai); Huang Chen (Department of Gastrointestinal Surgery, Shanghai General Hospital, Shanghai Jiao Tong University School of Medicine, Shanghai)
Keywords: Medical Image Segmentation, Convolutional Neural Network, Colorectal Cancer, VMamba
Abstract: This study proposes VMDC-Unet, a colorectal cancer (CRC) pathological slice image segmentation method built on a hybrid architecture of VMamba and a convolutional neural network (CNN), aiming to address the limitations of traditional methods in handling tumor heterogeneity, complex backgrounds, and blurred boundaries. The method combines VMamba's long-range dependency modeling capability with the CNN's strength in local feature extraction, introduces an improved ConvNext module to enhance fine-grained feature extraction, and designs a local self-attention mechanism to improve the efficiency of feature fusion across skip connections. Experimental results show that VMDC-Unet outperforms the baseline models in both segmentation accuracy and generalization on the SJTU_GSFPH and GlaS datasets, and ablation studies further verify the effectiveness of each module. This work offers a multi-model collaboration strategy for medical image segmentation; its combination of global dependency modeling and local feature enhancement provides reliable technical support for precise CRC diagnosis and treatment.
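The paper's exact layer definitions are not reproduced on this page, so as a purely illustrative sketch of the skip-connection idea the abstract describes (fusing encoder and decoder features and refining the result with windowed local self-attention), here is a toy 1-D version in plain Python. All function names are hypothetical, and the identity query/key/value projections and scalar features are simplifications; the real VMDC-Unet operates on 2-D multi-channel feature maps with learned projections.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def local_self_attention(seq, window=3):
    """Toy 1-D local self-attention: each position attends only to
    neighbours within `window` steps on either side, using identity
    query/key/value projections on scalar features."""
    out = []
    n = len(seq)
    for i in range(n):
        lo, hi = max(0, i - window), min(n, i + window + 1)
        scores = [seq[i] * seq[j] for j in range(lo, hi)]   # dot-product scores
        weights = softmax(scores)                           # normalise per window
        out.append(sum(w * seq[j] for w, j in zip(weights, range(lo, hi))))
    return out

def fuse_skip(encoder_feat, decoder_feat, window=3):
    """Sketch of skip-connection fusion: combine encoder and decoder
    features, then refine the sum with local self-attention."""
    summed = [e + d for e, d in zip(encoder_feat, decoder_feat)]
    return local_self_attention(summed, window)
```

Because the attention weights in each window sum to one, every refined value stays within the range of the fused features it attends to, so the mechanism reweights local context rather than inventing new magnitudes.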
Citation: Wang, S., Chen, Q. and Huang, C. (2025) Colorectal Cancer Slice Image Segmentation Based on VMamba-CNN Hybrid. Modeling and Simulation, 14, 799-810. https://doi.org/10.12677/mos.2025.144331

References

[1] Sung, H., Ferlay, J., Siegel, R.L., Laversanne, M., Soerjomataram, I., Jemal, A., et al. (2021) Global Cancer Statistics 2020: GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. CA: A Cancer Journal for Clinicians, 71, 209-249.
[2] Rawla, P., Sunkara, T. and Barsouk, A. (2019) Epidemiology of Colorectal Cancer: Incidence, Mortality, Survival, and Risk Factors. Gastroenterology Review, 14, 89-103.
[3] Kumar, N., Verma, R., Sharma, S., Bhargava, S., Vahadane, A. and Sethi, A. (2017) A Dataset and a Technique for Generalized Nuclear Segmentation for Computational Pathology. IEEE Transactions on Medical Imaging, 36, 1550-1560.
[4] Litjens, G., Kooi, T., Bejnordi, B.E., Setio, A.A.A., Ciompi, F., Ghafoorian, M., et al. (2017) A Survey on Deep Learning in Medical Image Analysis. Medical Image Analysis, 42, 60-88.
[5] Thakur, N., Yoon, H. and Chong, Y. (2020) Current Trends of Artificial Intelligence for Colorectal Cancer Pathology Image Analysis: A Systematic Review. Cancers, 12, Article 1884.
[6] Otsu, N. (1979) A Threshold Selection Method from Gray-Level Histograms. IEEE Transactions on Systems, Man, and Cybernetics, 9, 62-66.
[7] Canny, J. (1986) A Computational Approach to Edge Detection. IEEE Transactions on Pattern Analysis and Machine Intelligence, 8, 679-698.
[8] Adams, R. and Bischof, L. (1994) Seeded Region Growing. IEEE Transactions on Pattern Analysis and Machine Intelligence, 16, 641-647.
[9] Yin, X.H., Wang, Y.C. and Li, D.Y. (2021) A Survey of Medical Image Segmentation Techniques Based on Improvements to the U-Net Structure. Journal of Software, 32, 519-550. (In Chinese)
[10] Zhang, W.Z., Yu, Q., Su, J.S., et al. (2024) From U-Net to Transformer: A Survey of the Application of Deep Models in Medical Image Segmentation. Journal of Computer Applications, 44(S1), 204-222. (In Chinese)
[11] Chen, Z., Tong, J.J. and Pan, Z.Y. (2023) Liver Tumor Segmentation Algorithm Based on Attention-ResUNet. Computer Era, No. 10, 100-104. (In Chinese)
[12] Ronneberger, O., Fischer, P. and Brox, T. (2015) U-Net: Convolutional Networks for Biomedical Image Segmentation. In: Navab, N., Hornegger, J., Wells, W. and Frangi, A., Eds., Medical Image Computing and Computer-Assisted Intervention - MICCAI 2015, Springer, 234-241.
[13] Zhou, Z., Rahman Siddiquee, M.M., Tajbakhsh, N. and Liang, J. (2018) UNet++: A Nested U-Net Architecture for Medical Image Segmentation. In: Stoyanov, D., et al., Eds., Deep Learning in Medical Image Analysis and Multimodal Learning for Clinical Decision Support, Springer, 3-11.
[14] Oktay, O., Schlemper, J., Folgoc, L.L., et al. (2018) Attention U-Net: Learning Where to Look for the Pancreas. arXiv: 1804.03999.
[15] Xiao, X., Lian, S., Luo, Z. and Li, S. (2018) Weighted Res-UNet for High-Quality Retina Vessel Segmentation. 2018 9th International Conference on Information Technology in Medicine and Education (ITME), Hangzhou, 19-21 October 2018, 327-331.
[16] Huang, H., Lin, L., Tong, R., Hu, H., Zhang, Q., Iwamoto, Y., et al. (2020) UNet 3+: A Full-Scale Connected UNet for Medical Image Segmentation. ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Barcelona, 4-8 May 2020, 1055-1059.
[17] Chen, J., Lu, Y., Yu, Q., et al. (2021) TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation. arXiv: 2102.04306.
[18] Cao, H., Wang, Y., Chen, J., Jiang, D., Zhang, X., Tian, Q., et al. (2023) Swin-Unet: Unet-Like Pure Transformer for Medical Image Segmentation. In: Karlinsky, L., Michaeli, T. and Nishino, K., Eds., Computer Vision - ECCV 2022 Workshops, Springer, 205-218.
[19] Wang, H., Cao, P., Wang, J. and Zaiane, O.R. (2022) UCTransNet: Rethinking the Skip Connections in U-Net from a Channel-Wise Perspective with Transformer. Proceedings of the AAAI Conference on Artificial Intelligence, 36, 2441-2449.
[20] Zhou, H.Y., Guo, J., Zhang, Y., et al. (2021) nnFormer: Interleaved Transformer for Volumetric Segmentation. arXiv: 2109.03201.
[21] Liu, Z., Mao, H., Wu, C., Feichtenhofer, C., Darrell, T. and Xie, S. (2022) A ConvNet for the 2020s. 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), New Orleans, 18-24 June 2022, 11966-11976.
[22] Gu, A., Goel, K. and Ré, C. (2021) Efficiently Modeling Long Sequences with Structured State Spaces. arXiv: 2111.00396.
[23] Gu, A. and Dao, T. (2023) Mamba: Linear-Time Sequence Modeling with Selective State Spaces. arXiv: 2312.00752.
[24] Liu, Y., Tian, Y., Zhao, Y., et al. (2024) VMamba: Visual State Space Model. arXiv: 2401.10166.