Research on an Imbalanced Classification Model for Credit Card Fraud Based on Data Reconstruction and Threshold Adaptation
Abstract: As credit card transactions have become widespread, fraud detection has become a core challenge in bank risk control. The crux of the problem is that fraudulent transactions make up an extremely small share of all transactions, so the data are highly imbalanced and conventional classification models largely fail. To address this, this paper proposes an imbalanced classification model based on data reconstruction and threshold adaptation. Using the Kaggle credit card fraud dataset, the study first performs data reconstruction through feature selection and sample balancing, improving data quality and class distribution at the source. Then, building on a logistic regression model, it replaces the default decision threshold of 0.5 with a threshold adaptation mechanism that systematically tunes the classification decision boundary. The results show that the method effectively mitigates the prediction bias caused by class imbalance. Specifically, data reconstruction significantly improved the model's ability to identify fraudulent transactions, while threshold adaptation balanced recall against the false positive rate according to business requirements. Together, the two components form an efficient and practical fraud detection solution and offer a methodological reference for similar problems in financial risk control.
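The threshold adaptation step can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not state the selection rule, so the rule used here (maximize recall subject to a cap on the false positive rate) is an assumption, as are the function names `recall_and_fpr` and `pick_threshold`, the `max_fpr` parameter, the candidate grid, and the toy scores. In practice the scores would be fraud probabilities produced by the fitted logistic regression model on a validation set.

```python
from typing import Optional, Sequence, Tuple


def recall_and_fpr(scores: Sequence[float], labels: Sequence[int],
                   threshold: float) -> Tuple[float, float]:
    """Recall and false positive rate when score >= threshold is flagged as fraud."""
    tp = fp = fn = tn = 0
    for score, label in zip(scores, labels):
        flagged = score >= threshold
        if label == 1:
            if flagged:
                tp += 1
            else:
                fn += 1
        else:
            if flagged:
                fp += 1
            else:
                tn += 1
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    fpr = fp / (fp + tn) if (fp + tn) else 0.0
    return recall, fpr


def pick_threshold(scores: Sequence[float], labels: Sequence[int],
                   max_fpr: float,
                   candidates: Optional[Sequence[float]] = None
                   ) -> Tuple[float, float, float]:
    """Return (threshold, recall, fpr) with the highest recall among
    candidates whose false positive rate stays within max_fpr.

    Assumed rule, chosen for illustration: ties on recall go to the
    lower false positive rate, then to the earlier candidate.
    """
    if candidates is None:
        # Hypothetical grid: 0.01, 0.02, ..., 0.99
        candidates = [i / 100 for i in range(1, 100)]
    best = None
    for t in candidates:
        recall, fpr = recall_and_fpr(scores, labels, t)
        if fpr > max_fpr:
            continue  # violates the business cap on false alarms
        if best is None or (recall, -fpr) > (best[1], -best[2]):
            best = (t, recall, fpr)
    if best is None:
        raise ValueError("no candidate threshold satisfies the FPR budget")
    return best


# Toy validation scores (made up for illustration), 1 = fraud.
demo_scores = [0.05, 0.10, 0.20, 0.30, 0.35, 0.45, 0.60, 0.80]
demo_labels = [0, 0, 0, 1, 0, 1, 0, 1]

print("default 0.5:", recall_and_fpr(demo_scores, demo_labels, 0.5))
print("adapted:", pick_threshold(demo_scores, demo_labels, max_fpr=0.4))
```

On this toy data the default threshold misses two of the three fraud cases, while the adapted threshold recovers all three at the cost of more false alarms, which is the trade-off the abstract describes. The cap `max_fpr` stands in for whatever business constraint applies; a cost-weighted objective would be an equally plausible choice.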