Research on Personal Credit Evaluation Model Based on AdaBoost Algorithm
Abstract: This paper uses data mining methods to build a credit scoring model for the era of Internet finance. Indicators are first selected from data provided by a credit reporting agency and the data are preprocessed; this step shows that the variables directly reflecting a customer's credit history, repayment pressure, and repayment ability contribute most to classification. The SMOTE algorithm is then used to oversample the processed data and reduce the class imbalance of the data set. On the resulting training set, the best single-level decision tree (decision stump) weak classifiers are learned and combined into a strong classifier with the AdaBoost boosting algorithm; model accuracy is largely stable at about 80%, and the average classifier performance reaches 88%. Finally, weighing classifier performance, classification cost, and program running time, the model with 2000 weak classifiers is selected to predict the loan overdue status of the test-set samples.
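The pipeline summarized in the abstract (oversample the minority class with SMOTE, then boost decision stumps with AdaBoost) can be illustrated with a minimal sketch. This is not the paper's implementation: the toy two-feature data, the function names (`smote`, `best_stump`, `adaboost`, `predict`), the neighbour count `k`, the five boosting rounds, and the label convention (+1 for overdue, -1 for not overdue) are all assumptions made for illustration, and the SMOTE step is a simplified interpolation between nearest minority neighbours.

```python
import math
import random

def smote(minority, n_new, k=3, rng=None):
    """Simplified SMOTE: interpolate between a minority point and one of
    its k nearest minority neighbours to create n_new synthetic points."""
    rng = rng or random.Random(0)
    synthetic = []
    for _ in range(n_new):
        x = rng.choice(minority)
        neighbours = sorted((p for p in minority if p is not x),
                            key=lambda p: math.dist(x, p))[:k]
        nb = rng.choice(neighbours)
        gap = rng.random()  # position along the segment from x to nb
        synthetic.append([xi + gap * (ni - xi) for xi, ni in zip(x, nb)])
    return synthetic

def stump_predict(X, feat, thresh, sign):
    """Decision stump: predict `sign` when the feature is <= thresh, else -sign."""
    return [sign if row[feat] <= thresh else -sign for row in X]

def best_stump(X, y, w):
    """Exhaustively search for the stump with the lowest weighted error."""
    best = None
    for feat in range(len(X[0])):
        for thresh in sorted({row[feat] for row in X}):
            for sign in (1, -1):
                pred = stump_predict(X, feat, thresh, sign)
                err = sum(wi for wi, p, t in zip(w, pred, y) if p != t)
                if best is None or err < best[0]:
                    best = (err, feat, thresh, sign)
    return best

def adaboost(X, y, n_rounds):
    """Discrete AdaBoost over decision stumps; labels must be +1 or -1."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        err, feat, thresh, sign = best_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid log(0) and division by zero
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, feat, thresh, sign))
        pred = stump_predict(X, feat, thresh, sign)
        # Raise the weight of misclassified samples, then renormalize.
        w = [wi * math.exp(-alpha * t * p) for wi, t, p in zip(w, y, pred)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def predict(ensemble, X):
    """Sign of the alpha-weighted vote of all stumps."""
    scores = [0.0] * len(X)
    for alpha, feat, thresh, sign in ensemble:
        for i, p in enumerate(stump_predict(X, feat, thresh, sign)):
            scores[i] += alpha * p
    return [1 if s > 0 else -1 for s in scores]

# Hypothetical toy data: 6 non-overdue (-1) and 3 overdue (+1) customers.
majority = [[1.0, 1.0], [1.5, 2.0], [2.0, 1.0], [2.5, 2.5], [1.0, 3.0], [3.0, 1.5]]
minority = [[6.0, 6.0], [7.0, 6.5], [6.5, 7.5]]

minority_aug = minority + smote(minority, n_new=3, k=2)  # balance the classes
X = majority + minority_aug
y = [-1] * len(majority) + [1] * len(minority_aug)

model = adaboost(X, y, n_rounds=5)
train_pred = predict(model, X)
```

In this sketch the number of boosting rounds plays the role of the "number of classifiers" that the abstract sets to 2000; a larger value generally trades longer running time for a finer-grained ensemble, which is the kind of trade-off the abstract says was weighed.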
Cite this article: Jin Junling. Research on Personal Credit Evaluation Model Based on AdaBoost Algorithm [J]. Advances in Social Sciences, 2018, 7(10): 1724-1734. https://doi.org/10.12677/ASS.2018.710257
