Analysis of Credit Card Customer Default Probability Prediction Using Least Squares Ramp Loss Geometric NHSVM
Abstract: Using the credit card customer default dataset from the UCI Machine Learning Repository, this paper builds a predictive model in which the response variable is whether a customer has defaulted on a credit card payment, and the explanatory variables are 23 features describing customer information, monthly repayment status, and monthly repayment amounts. To improve the accuracy and efficiency of credit card default probability assessment in financial services, we propose an improved ramp loss least squares geometric nonparallel hyperplane support vector machine (RLS-GNHSVM). The model combines the ramp loss function with the least squares geometric nonparallel hyperplane support vector machine, with the aim of overcoming the sensitivity to outliers that degrades the performance of traditional convex loss functions. It keeps predictive performance stable when the data are noisy or contain outliers, and it also improves prediction accuracy. In the empirical study, RLS-GNHSVM outperforms three other mainstream models in predicting credit card customer default probability, giving financial institutions a more precise risk assessment tool.
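The robustness argument in the abstract rests on the ramp loss being bounded, while a convex loss such as the hinge loss grows without limit for badly misclassified points. The sketch below illustrates this with the commonly used ramp loss, written as the difference of two hinge functions. It is not the exact loss used in RLS-GNHSVM, which the abstract does not specify; the function names, the truncation parameter `s`, and the sample margins are assumptions chosen for illustration.

```python
def hinge_loss(margin, a=1.0):
    """Convex hinge function H_a(z) = max(0, a - z); unbounded as z -> -inf."""
    return max(0.0, a - margin)


def ramp_loss(margin, s=-1.0):
    """Ramp loss R_s(z) = H_1(z) - H_s(z) with s < 1; capped at 1 - s."""
    return hinge_loss(margin, 1.0) - hinge_loss(margin, s)


# Hypothetical margins y * f(x): the last two mimic outliers far on the wrong side.
margins = [2.0, 0.5, -1.0, -10.0, -100.0]
hinge_values = [hinge_loss(m) for m in margins]
ramp_values = [ramp_loss(m) for m in margins]

for m, h, r in zip(margins, hinge_values, ramp_values):
    print(f"margin={m:7.1f}  hinge={h:6.1f}  ramp={r:4.1f}")
```

With `s = -1`, the ramp loss of the two outlying points stays at the cap of 2, whereas their hinge loss is 11 and 101, so a single extreme point cannot dominate the objective. The price is that the ramp loss is non-convex, which is why models built on it are usually trained with an iterative scheme rather than a single convex solve.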