Research on the Construction and Application of a WOE-Logistic Credit Scorecard Model
DOI: 10.12677/csa.2025.1511279. Supported by research project funding.
Authors: Sun Na, Liu Zhengyong*: Hebei Key Laboratory of Financial Technology Application, Hebei Finance University, Baoding, Hebei
Keywords: Credit Scorecard, WOE, Logistic Regression
Abstract: Using the Give Me Some Credit dataset, this study develops a credit scorecard model that combines WOE (weight of evidence) encoding with logistic regression, aiming to address core challenges that financial institutions face in credit risk assessment. The main contributions are twofold. First, it proposes an optimized feature discretization method in which the WOE transformation handles nonlinear relationships and improves model interpretability. Second, it constructs a comprehensive evaluation framework covering the KS statistic, PSI stability, and multiple decision thresholds, which makes model validation more thorough and more applicable to business use. Empirical results show that the model achieves an AUC of 0.85 and a KS statistic of 0.452 on the test set, indicating strong risk discrimination, while the PSI indicator confirms that the model is stable across different groups. The methodological framework provides a technical reference for credit risk assessment, and its evaluation framework can be extended to other financial risk prediction scenarios. The study still leaves room for improvement in the depth of feature engineering and the breadth of model comparison, which points to directions for future research.
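The three quantities named in the abstract (WOE, the KS statistic, and PSI) can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `woe_table`, `ks_statistic`, and `psi`, the additive smoothing term `eps`, and the sign convention WOE = ln(share of goods / share of bads) are all assumptions made for illustration, and the paper may use the opposite sign or a different binning and smoothing scheme.

```python
import math
from collections import defaultdict


def woe_table(bin_ids, labels, eps=0.5):
    """Weight of evidence per bin (hypothetical helper).

    labels: 1 = default (bad), 0 = non-default (good).
    Assumed convention: WOE = ln(share of goods in bin / share of bads in bin).
    """
    good = defaultdict(float)
    bad = defaultdict(float)
    for b, y in zip(bin_ids, labels):
        if y == 1:
            bad[b] += 1
        else:
            good[b] += 1
    bins = sorted(set(good) | set(bad))
    # eps is additive smoothing (an assumption) so empty cells never yield log(0).
    total_good = sum(good[b] + eps for b in bins)
    total_bad = sum(bad[b] + eps for b in bins)
    return {
        b: math.log(((good[b] + eps) / total_good) / ((bad[b] + eps) / total_bad))
        for b in bins
    }


def ks_statistic(scores, labels):
    """Maximum gap between the cumulative score distributions of bads and goods.

    Requires at least one observation of each class.
    """
    pairs = sorted(zip(scores, labels))
    n_bad = sum(labels)
    n_good = len(labels) - n_bad
    cum_bad = cum_good = 0
    ks = 0.0
    i = 0
    while i < len(pairs):
        j = i
        # Consume every observation tied at this score before comparing the CDFs.
        while j < len(pairs) and pairs[j][0] == pairs[i][0]:
            if pairs[j][1] == 1:
                cum_bad += 1
            else:
                cum_good += 1
            j += 1
        ks = max(ks, abs(cum_bad / n_bad - cum_good / n_good))
        i = j
    return ks


def psi(expected, actual, eps=1e-6):
    """Population stability index between two binned distributions (fractions per bin)."""
    total = 0.0
    for e, a in zip(expected, actual):
        e = max(e, eps)  # floor tiny fractions to keep the logarithm finite
        a = max(a, eps)
        total += (a - e) * math.log(a / e)
    return total
```

In a scorecard workflow of this kind, each binned feature would typically be replaced by its WOE value before fitting the logistic regression, with KS computed on the predicted default probabilities and PSI computed between the binned score distributions of two samples.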
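Citation: Sun Na, Liu Zhengyong. Research on the Construction and Application of a WOE-Logistic Credit Scorecard Model [J]. Computer Science and Application, 2025, 15(11): 19-32. https://doi.org/10.12677/csa.2025.1511279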
