Recurrence Risk Prediction on Breast MRI via Clinical-Image Multimodal Fusion
DOI: 10.12677/aam.2026.154148
Author: Kailiang Hao (郝凯亮), School of Mathematics and Statistics, Qingdao University, Qingdao, Shandong
Keywords: Breast MRI; Recurrence Prediction; Multimodal Fusion; Class Imbalance; XGBoost
Abstract: Assessing breast cancer recurrence risk matters for postoperative follow-up management, adjustment of adjuvant treatment, and early intervention in high-risk patients. This study compares clinical, imaging, and clinical-image multimodal fusion models for recurrence risk prediction on breast MRI, and examines the value of fusion when the positive rate is low. Using the public Duke-Breast-Cancer-MRI cohort, clinical variables were cleaned, encoded, and screened, and three-channel lesion ROI inputs (Pre/Post/Sub) were constructed. A stratified split with a fixed random seed produced training, validation, and test sets of 588, 148, and 184 samples, respectively. Clinical variable screening was performed within the training set only, applying in order low-variance removal (variance threshold 0.01), high-correlation filtering (absolute correlation threshold 0.90), and univariate relevance screening; 67 clinical variables were retained for modeling. To address class imbalance, with a recurrence-positive rate of about 9.46%, training combined weighted sampling, Focal Loss, and class weighting, and inference used a recall-prioritized threshold selection strategy subject to a specificity lower bound of 0.75. On the test set, the Fusion-DualMeta model achieved an AUC of 0.8633, an AUPRC of 0.2801, a sensitivity of 0.9412, a specificity of 0.7844, an F1 score of 0.4638, and an MCC of 0.4667. Compared with Clinical-XGBoost, its AUC, F1 score, sensitivity, and MCC were higher by 0.0673, 0.1480, 0.4118, and 0.2253, respectively. Compared with Image-EmbROI, the same metrics were higher by 0.0902, 0.1951, 0.4118, and 0.2818, and the number of false negatives fell from 8 to 1. Given the sample size, the degree of class imbalance, and the need for interpretability, a meta-learning-based late-fusion strategy was adopted, and its suitability is discussed relative to feature concatenation, attention-based fusion, and tensor-product fusion. The results indicate that, with the specificity constraint maintained, clinical-image multimodal fusion can substantially improve the identification of patients at high risk of breast cancer recurrence and may provide practical support for follow-up screening and decision-making in breast cancer care.
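The recall-prioritized thresholding with a minimum specificity of 0.75 mentioned in the abstract can be illustrated with a minimal sketch. This is not the author's code: the function name `select_threshold`, the toy labels and scores, and the tie-breaking rule (prefer higher specificity when sensitivity is equal) are all assumptions made for illustration, as is the idea that the threshold would be chosen on held-out validation scores.

```python
import numpy as np


def select_threshold(y_true, scores, min_specificity=0.75):
    """Pick the score cutoff with the highest sensitivity among cutoffs
    whose specificity is at least `min_specificity`.

    Hypothetical helper: ties in sensitivity are broken by higher
    specificity, which is an assumption, not a detail from the paper.
    Returns (threshold, sensitivity, specificity).
    """
    y_true = np.asarray(y_true).astype(bool)
    scores = np.asarray(scores, dtype=float)
    n_pos = y_true.sum()
    n_neg = (~y_true).sum()
    if n_pos == 0 or n_neg == 0:
        raise ValueError("both classes must be present")

    best_key, best_t = None, None
    for t in np.unique(scores):          # every observed score is a candidate
        pred = scores >= t               # predict "recurrence" at or above t
        sens = (pred & y_true).sum() / n_pos
        spec = (~pred & ~y_true).sum() / n_neg
        if spec < min_specificity:       # enforce the specificity floor
            continue
        key = (sens, spec)               # recall first, then specificity
        if best_key is None or key > best_key:
            best_key, best_t = key, float(t)

    if best_t is None:
        raise ValueError("no threshold satisfies the specificity floor")
    return best_t, float(best_key[0]), float(best_key[1])


# Toy validation data (made up): 8 negatives, 2 positives.
y_val = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
s_val = [0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.65, 0.9]
threshold, sensitivity, specificity = select_threshold(y_val, s_val)
print(threshold, sensitivity, specificity)  # 0.65 1.0 0.75
```

In this toy case the cutoff 0.65 captures both positives while keeping 6 of 8 negatives below the cutoff, so it sits exactly on the 0.75 floor; lowering the cutoff further would violate the constraint without any further gain in recall.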
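Cite this article: Kailiang Hao. Recurrence Risk Prediction on Breast MRI via Clinical-Image Multimodal Fusion [J]. Advances in Applied Mathematics (应用数学进展), 2026, 15(4): 182-191. https://doi.org/10.12677/aam.2026.154148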
