Research on a Probabilistic Classification Model Based on a Scoring Function
DOI: 10.12677/AAM.2022.1110743. Supported by the National Natural Science Foundation of China.
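Author: 李佳洁 (Li Jiajie), School of Mathematics and Statistics, Nanjing University of Information Science and Technology, Nanjing, Jiangsu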
Keywords: Classification, Scoring Function, Accuracy, Machine Learning
Abstract: Modern statistics offers many classification methods, and in data analysis a more accurate classification yields more valuable results. For the binary classification problem, this paper proposes MKL, a probabilistic classification model based on a scoring function, and proves theoretically that the proposed MKL estimator is consistent. Empirically, the paper optimizes the continuous (smoothed) version of the MKL statistic directly with a quasi-Newton algorithm, and reports the classification performance in a simulation study and on a heart failure dataset. The method weighs the trade-offs among predictive power, computational complexity, and practical interpretability, and offers advantages over existing classification methods.
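The idea of smoothing a discontinuous classification statistic and then optimizing it directly with a quasi-Newton method can be illustrated with a minimal sketch. The abstract does not define the MKL statistic, so the code below substitutes a smoothed classification accuracy as a stand-in: the indicator of a positive score is replaced by a logistic sigmoid. The function names `smoothed_accuracy_loss`, `fit`, and `predict`, the small ridge term, and the synthetic two-class Gaussian data are all assumptions made for illustration and are not taken from the paper.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import expit  # numerically stable logistic sigmoid


def smoothed_accuracy_loss(beta, X, y, ridge=1e-3):
    """Return the negative smoothed accuracy and its gradient.

    The indicator 1{x'beta > 0} is replaced by sigmoid(x'beta), which makes
    the objective differentiable in beta. This is a stand-in statistic,
    NOT the paper's MKL statistic. The ridge term is an added assumption
    that keeps the minimizer bounded.
    """
    p = expit(X @ beta)            # smooth surrogate for 1{score > 0}
    sign = 2.0 * y - 1.0           # +1 for class 1, -1 for class 0
    loss = -np.mean(y * p + (1.0 - y) * (1.0 - p)) + ridge * (beta @ beta)
    grad = -(X.T @ (sign * p * (1.0 - p))) / len(y) + 2.0 * ridge * beta
    return loss, grad


def fit(X, y):
    """Minimize the smoothed loss with BFGS, a quasi-Newton method."""
    X1 = np.column_stack([np.ones(len(X)), X])  # prepend intercept column
    result = minimize(
        smoothed_accuracy_loss,
        np.zeros(X1.shape[1]),
        args=(X1, y),
        jac=True,          # the objective returns (value, gradient)
        method="BFGS",
    )
    return result.x


def predict(beta, X):
    """Classify as 1 when the linear score is positive."""
    X1 = np.column_stack([np.ones(len(X)), X])
    return (X1 @ beta > 0).astype(int)


# Synthetic data (hypothetical): two Gaussian classes centred at (+1, +1)
# and (-1, -1) with unit variance.
rng = np.random.default_rng(0)
n = 400
y = rng.integers(0, 2, size=n).astype(float)
X = rng.normal(size=(n, 2)) + np.where(y[:, None] == 1, 1.0, -1.0)

beta_hat = fit(X, y)
accuracy = np.mean(predict(beta_hat, X) == y)
print("estimated coefficients:", beta_hat)
print("training accuracy:", accuracy)
```

The sigmoid surrogate is only one possible smoothing. A sharper or flatter surrogate changes how closely the smooth objective tracks the original step-function statistic, and the paper's own continuous version of MKL may differ.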
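Cite this article: 李佳洁 (Li Jiajie). Research on a Probabilistic Classification Model Based on a Scoring Function [J]. Advances in Applied Mathematics (应用数学进展), 2022, 11(10): 7000-7011. https://doi.org/10.12677/AAM.2022.1110743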
