A Machine Learning-Based Model for Prognostic Survival Analysis in Pancreatic Cancer Patients
DOI: 10.12677/sa.2026.152043
Authors: Wenbin Fu (付文斌), Jing Yi (易静)*: School of Public Health, Chongqing Medical University, Chongqing
Keywords: Pancreatic Cancer, Machine Learning, Adaptive Weighting, SHAP
Abstract: Objective: Traditional regression models struggle with high-dimensional, non-linear features, and deep learning models lack clinical interpretability. This study aims to construct a survival analysis model for pancreatic cancer prognosis that is both highly accurate and interpretable. Methods: Using data from 591 patients, XGBoost, Support Vector Machine (SVM), Multilayer Perceptron (MLP), and Convolutional Neural Network (CNN) were selected as base models. A two-stage ensemble strategy combining adaptive weighting and hierarchical fusion was proposed, and SHAP (SHapley Additive exPlanations) was used to quantify the contributions of 24 clinical features. Results: The proposed model achieved an accuracy of 90.0%, an area under the curve (AUC) of 0.90, and a recall of 89.0%, a 2% improvement over fixed-weight fusion. Adaptive weighting and hierarchical fusion contributed gains of 1.5% and 0.5%, respectively. Age, healthcare accessibility, and a processed-food diet were identified as the key prognostic factors. Conclusion: The method addresses the shortcomings of single models in complex disease prediction and achieves clinical interpretability while maintaining high accuracy, providing a reliable basis for developing personalized treatment plans.
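The abstract does not give the weighting formula or the fusion hierarchy, so the following is only a minimal sketch of how a two-stage scheme of this kind could look, not the authors' implementation. It assumes that adaptive weights are a softmax over each base model's validation AUC, that the four base models are grouped into a hypothetical "classical" pair (XGBoost, SVM) and a "neural" pair (MLP, CNN), and that each group is scored by the weighted mean of its members' AUCs. The function names, the `temperature` parameter, the grouping, and all numbers are invented for illustration.

```python
import math


def adaptive_weights(val_scores, temperature=0.05):
    """Softmax over validation scores, so stronger models get larger weights.

    The softmax form and the temperature value are assumptions, not the paper's.
    """
    top = max(val_scores.values())  # subtract the max for numerical stability
    exps = {k: math.exp((v - top) / temperature) for k, v in val_scores.items()}
    total = sum(exps.values())
    return {k: e / total for k, e in exps.items()}


def weighted_fusion(probs, weights):
    """Weighted average of per-model positive-class probabilities."""
    n = len(next(iter(probs.values())))
    return [sum(weights[k] * probs[k][i] for k in probs) for i in range(n)]


def hierarchical_fusion(probs, val_scores, groups, temperature=0.05):
    """Stage 1 fuses models inside each group; stage 2 fuses the group outputs."""
    group_probs, group_scores = {}, {}
    for name, members in groups.items():
        w = adaptive_weights({m: val_scores[m] for m in members}, temperature)
        group_probs[name] = weighted_fusion({m: probs[m] for m in members}, w)
        # Assumed group score: weighted mean of the members' validation scores.
        group_scores[name] = sum(w[m] * val_scores[m] for m in members)
    top_weights = adaptive_weights(group_scores, temperature)
    return weighted_fusion(group_probs, top_weights)


# Toy inputs (illustrative only, not results from the study).
val_auc = {"xgboost": 0.88, "svm": 0.84, "mlp": 0.86, "cnn": 0.85}
probs = {
    "xgboost": [0.91, 0.20, 0.65],
    "svm": [0.80, 0.35, 0.55],
    "mlp": [0.85, 0.25, 0.70],
    "cnn": [0.78, 0.30, 0.60],
}
groups = {"classical": ["xgboost", "svm"], "neural": ["mlp", "cnn"]}

fused = hierarchical_fusion(probs, val_auc, groups)
labels = [int(p >= 0.5) for p in fused]  # threshold the fused probabilities
```

Because every stage is a convex combination, the fused output stays a valid probability. A fixed-weight baseline, the comparison the abstract reports, would correspond to replacing `adaptive_weights` with equal weights.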
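Cite this article: Fu, W.B. and Yi, J. (2026) A Machine Learning-Based Model for Prognostic Survival Analysis in Pancreatic Cancer Patients. Statistics and Application, 15(2), 155-162. https://doi.org/10.12677/sa.2026.152043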
