Study on Prediction of Cardiovascular Disease Based on Combined Machine Learning Models
Abstract: Combining machine learning with medicine has become a clear trend in recent years. Most studies apply machine learning to predicting diseases such as cancer, yet cardiovascular disease has a higher mortality rate than cancer and other diseases, which makes prediction models for cardiovascular disease especially important. This study examines single and combined machine learning models for cardiovascular disease prediction. The combined models are built in three steps: first, base models are constructed from six machine learning algorithms; second, the best-performing base model is selected; third, that model is combined with the other five models to form serial or parallel combined models. The results show that the serial combined model based on logistic regression and random forest performs best, with an AUC about 12% higher than that of the single model.
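The abstract does not spell out how the serial and parallel combinations are wired, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes that "parallel" means every model scores the same input and the scores are weight-averaged, and that "serial" means the first model's predicted probability is appended as an extra input feature for the second model. The stand-in models (`bp_score`, `age_score`, `second_stage`), the feature layout, and the toy data are all hypothetical; in practice the callables would wrap trained classifiers such as logistic regression and random forest.

```python
from typing import Callable, List, Sequence

Features = Sequence[float]
Model = Callable[[Features], float]  # maps a feature vector to P(disease) in [0, 1]


def parallel_combine(models: Sequence[Model], weights: Sequence[float]) -> Model:
    """Parallel (assumed): all models score the same input; scores are weight-averaged."""
    total = float(sum(weights))
    if not models or len(models) != len(weights) or total <= 0:
        raise ValueError("need one weight per model and a positive weight sum")

    def combined(x: Features) -> float:
        return sum(w * m(x) for m, w in zip(models, weights)) / total

    return combined


def serial_combine(first: Model, second: Model) -> Model:
    """Serial (assumed): the first model's score becomes an extra feature of the second."""

    def combined(x: Features) -> float:
        augmented: List[float] = list(x) + [first(x)]
        return second(augmented)

    return combined


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative label")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical stand-ins for trained models; features are [systolic_bp, age].
def bp_score(x: Features) -> float:
    return min(1.0, max(0.0, (x[0] - 100.0) / 80.0))


def age_score(x: Features) -> float:
    return min(1.0, max(0.0, (x[1] - 30.0) / 50.0))


def second_stage(x: Features) -> float:
    # x[-1] is the upstream model's score appended by serial_combine.
    return 0.7 * x[-1] + 0.3 * age_score(x)


# Made-up toy records, for illustration only (not data from the study).
toy_X = [[110.0, 35.0], [150.0, 62.0], [125.0, 70.0], [170.0, 48.0]]
toy_y = [0, 1, 0, 1]

serial = serial_combine(bp_score, second_stage)
parallel = parallel_combine([bp_score, age_score], [1.0, 1.0])

serial_auc = auc(toy_y, [serial(x) for x in toy_X])
parallel_auc = auc(toy_y, [parallel(x) for x in toy_X])
```

Treating each model as a plain callable keeps the two wiring patterns separate from any particular learning library, so the same comparison by AUC applies to whichever base models are selected. The toy scores say nothing about which combination performs better on real data.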