Research on Default Prediction of Online Lending Borrowers Based on Machine Learning
Abstract: The rapid growth of the online lending industry has exposed the shortcomings of traditional risk control in the timeliness, comprehensiveness and hierarchy of its data. Meanwhile, the rise of machine learning allows online lending platforms to build intelligent risk control models from multi-dimensional big data, so that they can assess personal credit status more accurately and reduce default risk effectively. Using borrower loan risk data provided by CCX Credit Technology, this paper builds prediction models with logistic regression (Logistic), XGBoost and a neural network (NN) and compares their results. XGBoost is highly flexible: it allows custom optimization objectives and evaluation criteria, and it exposes many parameters with wide tuning ranges. As a result, the model built on XGBoost achieves relatively high accuracy in predicting defaults by online lending borrowers. The paper also uses an automated parameter tuning tool to traverse all parameter combinations, which makes model tuning much more convenient.
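The idea of traversing every parameter combination can be illustrated with a minimal sketch. It is not the authors' pipeline: it assumes synthetic two-feature data instead of the CCX dataset, and a small hand-written logistic regression instead of XGBoost, so that it runs with the Python standard library alone. The names `make_data`, `fit_logistic`, `grid_search` and the grid values are hypothetical and chosen only for illustration.

```python
import itertools
import math
import random


def make_data(n, seed):
    """Synthetic borrowers (assumption): two features, label 1 = default."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x1, x2 = rng.gauss(0, 1), rng.gauss(0, 1)
        p = 1 / (1 + math.exp(-(1.5 * x1 - 1.0 * x2)))
        rows.append(((x1, x2), 1 if rng.random() < p else 0))
    return rows


def sigmoid(z):
    # Clamp the input so math.exp cannot overflow.
    z = max(-30.0, min(30.0, z))
    return 1 / (1 + math.exp(-z))


def fit_logistic(rows, lr, l2, epochs=100):
    """Batch gradient descent for L2-regularised logistic regression."""
    w, b, n = [0.0, 0.0], 0.0, len(rows)
    for _ in range(epochs):
        gw, gb = [0.0, 0.0], 0.0
        for x, y in rows:
            err = sigmoid(w[0] * x[0] + w[1] * x[1] + b) - y
            gw[0] += err * x[0]
            gw[1] += err * x[1]
            gb += err
        w = [w[j] - lr * (gw[j] / n + l2 * w[j]) for j in range(2)]
        b -= lr * gb / n
    return w, b


def accuracy(model, rows):
    """Share of rows whose predicted label (threshold 0.5) matches the truth."""
    w, b = model
    hits = sum(
        (sigmoid(w[0] * x[0] + w[1] * x[1] + b) >= 0.5) == (y == 1)
        for x, y in rows
    )
    return hits / len(rows)


def grid_search(train, valid, grid):
    """Fit one model per parameter combination; keep the best on validation data."""
    names = sorted(grid)
    results = []
    for values in itertools.product(*(grid[k] for k in names)):
        params = dict(zip(names, values))
        model = fit_logistic(train, **params)
        results.append((accuracy(model, valid), params))
    best_score, best_params = max(results, key=lambda r: r[0])
    return best_score, best_params, results


train = make_data(400, seed=0)
valid = make_data(200, seed=1)
grid = {"lr": [0.01, 0.1, 1.0], "l2": [0.0, 0.01, 0.1]}  # hypothetical grid
best_score, best_params, results = grid_search(train, valid, grid)
print(len(results), best_params, round(best_score, 3))
```

The number of fits is the product of the grid sizes (here 3 x 3 = 9), which is why exhaustive search becomes costly for models with many tunable parameters such as XGBoost.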
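Cite this article: Wang Xiangting, Zhao Zixuan, Wang Shutan, Liu Ningning. Research on Default Prediction of Online Lending Borrowers Based on Machine Learning [J]. Service Science and Management, 2019, 8(1): 40-48. https://doi.org/10.12677/SSEM.2019.81006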
