A Transformer-CNN Network Fusing Gradient Magnitude-Guided Gating for Breast Ultrasound Image Segmentation
Abstract: Accurate segmentation of breast lesions in ultrasound images is crucial for early diagnosis and treatment planning of breast cancer. However, due to inherent noise, blurred boundaries, and the morphological diversity of lesions, traditional segmentation methods based solely on Convolutional Neural Networks or Transformers face significant challenges in balancing global semantic modeling with local detail preservation. To address this, we propose a Gradient-guided Spatial Attention Gate and deeply integrate it into a novel Transformer-CNN parallel hybrid architecture, forming GMG-TCNet (Gradient-guided Mixed Attention Network). The network dynamically extracts and enhances edge prior information through an adaptive gradient gate, combined with a spatial attention mechanism to suppress background interference, achieving precise alignment and fusion of multi-scale features. Systematic experiments on four public breast ultrasound datasets demonstrate that GMG-TCNet significantly outperforms mainstream models such as U-Net, Swin-UNet, and TransUNet on key metrics including mDice, mIoU, and Precision, excelling particularly in boundary segmentation accuracy. Ablation studies further validate the effectiveness of the gradient guidance mechanism and the dual-branch structure. This study not only provides a high-performance solution for breast ultrasound image segmentation but also offers new insights into the synergistic modeling of prior knowledge and deep learning.
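The core idea of the gradient gate described above — using the image's gradient magnitude as an edge prior that modulates feature maps — can be illustrated with a minimal NumPy sketch. This is an illustrative approximation, not the paper's implementation: the function names (`sobel_gradient_magnitude`, `gradient_gate`) and the sharpness parameter `alpha` are hypothetical, and in the actual network the gating would operate on learned multi-scale features with learnable parameters rather than a fixed sigmoid.

```python
import numpy as np

def sobel_gradient_magnitude(img):
    """Per-pixel gradient magnitude via 3x3 Sobel filters (zero padding)."""
    kx = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
    ky = kx.T
    p = np.pad(img.astype(np.float64), 1)
    h, w = img.shape
    gx = np.zeros((h, w))
    gy = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            win = p[i:i + 3, j:j + 3]
            gx[i, j] = (win * kx).sum()
            gy[i, j] = (win * ky).sum()
    return np.sqrt(gx ** 2 + gy ** 2)

def gradient_gate(features, img, alpha=1.0):
    """Gate a (C, H, W) feature tensor with a sigmoid of the normalized
    gradient magnitude of the (H, W) grayscale image.

    alpha is a hypothetical gate-sharpness parameter; in a trainable model
    it (and the 0.5 offset) would be learned rather than fixed.
    """
    g = sobel_gradient_magnitude(img)
    g = (g - g.min()) / (g.max() - g.min() + 1e-8)    # normalize to [0, 1]
    gate = 1.0 / (1.0 + np.exp(-alpha * (g - 0.5)))   # sigmoid gate in (0, 1)
    return features * gate[None, :, :]                # broadcast over channels
```

With a synthetic step-edge image, pixels on the lesion boundary receive a stronger gate response than flat background regions, which is the intended effect of injecting the edge prior before feature fusion.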
References
[1] Ronneberger, O., Fischer, P. and Brox, T. (2015) U-Net: Convolutional Networks for Biomedical Image Segmentation. In: Navab, N., et al., Eds., Medical Image Computing and Computer-Assisted Intervention—MICCAI 2015, Springer International Publishing, 234-241.
[2] Vaswani, A., Shazeer, N., Parmar, N., et al. (2017) Attention Is All You Need. http://arxiv.org/abs/1706.03762
[3] Chen, Y., Li, J., Xiao, H., et al. (2017) Dual Path Networks. Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, 4-9 December 2017, 4470-4478.
[4] Zhang, E.H., Lin, S., Chen, J.L., et al. (2026) PFTransCNN: Pathological Image Segmentation Based on CNN-Transformer Dual-Branch Fusion. Microelectronics & Computer, 43(3), 88-97. (In Chinese)
[5] Wu, T., Tang, S., Zhang, R., Cao, J. and Zhang, Y. (2021) CGNet: A Light-Weight Context Guided Network for Semantic Segmentation. IEEE Transactions on Image Processing, 30, 1169-1179.
[6] Deng, M., Xu, J.F., Xiao, H.X., et al. (2025) An Efficient Channel Attention Medical Image Segmentation Network Based on Improved TransUNet. Journal of Computer Applications, 45(12), 4037-4044. (In Chinese)
[7] Chen, J.N., Lu, Y.Y., Yu, Q.H., et al. (2021) TransUNet: Transformers Make Strong Encoders for Medical Image Segmentation. https://arxiv.org/abs/2102.04306
[8] Cao, H., Wang, Y.Y., Chen, J., et al. (2021) Swin-Unet: Unet-Like Pure Transformer for Medical Image Segmentation. https://arxiv.org/abs/2105.05537