Analysis of the Predictive Performance of Different Classification Models for Heart Disease
Abstract:
Medical statistical data come in many types and large volumes, and ordinary parametric regression models sometimes fail to predict such data with the required accuracy. This paper proposes a nonparametric improved logistic regression model based on Nadaraya-Watson (NW) estimation, which relaxes the link-function assumption of the logistic regression model. The proposed model is fitted to heart disease diagnosis data, and its predictive ROC curve is compared with that of the ordinary logistic regression model. On the sample data, the nonparametric improved logistic regression model predicts better than the ordinary logistic regression model. In addition, the paper compares the predictive performance of a nonparametric machine learning method, the support vector machine (SVM), with that of the two models above, plotting the predictive ROC curves of SVMs under different kernel functions. The comparison shows that the SVM with a polynomial kernel gives the best prediction of whether a patient has heart disease.
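To make the Nadaraya-Watson idea concrete, the following is a minimal sketch of an NW estimate of P(Y = 1 | X = x) for a binary response and a single scalar predictor. It is not the model proposed in the paper: the Gaussian kernel, the fixed bandwidth, the function names (`gaussian_kernel`, `nw_probability`), and the toy data are all assumptions made for illustration.

```python
import math


def gaussian_kernel(u):
    """Standard Gaussian kernel (an assumed choice; the paper's kernel is not stated)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def nw_probability(x0, xs, ys, bandwidth):
    """Nadaraya-Watson estimate of P(Y = 1 | X = x0) for a scalar predictor.

    The estimate is a kernel-weighted average of the 0/1 labels, so no
    parametric link function (such as the logit) is assumed.
    """
    weights = [gaussian_kernel((x0 - x) / bandwidth) for x in xs]
    total = sum(weights)
    if total == 0.0:
        raise ValueError("all kernel weights are zero; try a larger bandwidth")
    return sum(w * y for w, y in zip(weights, ys)) / total


# Hypothetical toy data: the label switches from 0 to 1 as x grows.
xs = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
ys = [0, 0, 0, 1, 1, 1]

p_low = nw_probability(0.5, xs, ys, bandwidth=1.0)
p_mid = nw_probability(2.5, xs, ys, bandwidth=1.0)
p_high = nw_probability(4.5, xs, ys, bandwidth=1.0)
```

Because the toy data are symmetric about x = 2.5, the estimate there is 0.5, and it rises from below 0.5 to above 0.5 as x moves from the low end to the high end. In practice the bandwidth would be chosen by a data-driven rule such as cross-validation rather than fixed by hand.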