A Credit Card Fraud Detection Method Based on CNN-SVM
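DOI: 10.12677/AAM.2021.102044. Supported by the National Natural Science Foundation of China.
Authors: Chong Ding, Leiya Chang, Yingchuan Jing*: College of Mathematics, Taiyuan University of Technology, Jinzhong, Shanxi
Keywords: Credit Card Fraud; Imbalanced Data; SMOTE; Convolutional Neural Network; Support Vector Machine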
Abstract: With economic growth and the spread of credit cards, fraudulent and irregular credit card transactions have become increasingly common, causing substantial economic losses to the state and to individuals. Credit card transaction data are large in volume, high-dimensional, and highly imbalanced (normal samples far outnumber fraudulent ones), so fraud detection systems need further improvement. To reduce losses for banks and cardholders, this paper proposes a method that combines a convolutional neural network (CNN) with a support vector machine (SVM), referred to as CNN-SVM. The model first applies the SMOTE algorithm to oversample the minority class in the original data and balance the class distribution, then uses the CNN to extract implicit features, and finally uses the SVM to classify the extracted features. A case study with comparative experiments shows that the CNN-SVM fraud detection model is more accurate and performs better than traditional classification models.
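The first stage of the pipeline, SMOTE oversampling, can be illustrated with a minimal sketch. It is not the authors' implementation: the function name `smote`, the toy data, and the parameters `n_new`, `k`, and `seed` are assumptions made for illustration, and the base sample is drawn at random rather than iterating over every minority sample. The core idea is kept: each synthetic point is interpolated between a minority sample and one of its k nearest minority neighbours. The CNN feature extractor and the SVM classifier are not shown.

```python
import math
import random


def smote(minority, n_new, k=5, seed=0):
    """Create n_new synthetic points by interpolating between minority samples."""
    if len(minority) < 2:
        raise ValueError("SMOTE needs at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)  # cannot have more neighbours than other samples
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # k nearest minority neighbours of the base point (excluding itself)
        others = [p for j, p in enumerate(minority) if j != i]
        others.sort(key=lambda p: math.dist(base, p))
        neighbour = rng.choice(others[:k])
        gap = rng.random()  # interpolation weight in [0, 1)
        synthetic.append([b + gap * (n - b) for b, n in zip(base, neighbour)])
    return synthetic


# Toy data (hypothetical): 8 normal transactions, 3 fraudulent ones.
normal = [[float(i), float(i % 3)] for i in range(8)]
fraud = [[10.0, 10.0], [11.0, 10.5], [10.5, 11.5]]

# Oversample the fraud class until both classes have the same size.
new_fraud = smote(fraud, n_new=len(normal) - len(fraud), k=2)
balanced_fraud = fraud + new_fraud
print(len(normal), len(balanced_fraud))  # prints "8 8"
```

Because every synthetic point lies on a segment between two real fraud samples, the new points stay inside the region already occupied by the minority class. In a full pipeline the oversampling would normally be applied to the training split only, so that the test set keeps the original class ratio.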
Citation: Ding, C., Chang, L. and Jing, Y. A Credit Card Fraud Detection Method Based on CNN-SVM [J]. Advances in Applied Mathematics, 2021, 10(2): 386-395. https://doi.org/10.12677/AAM.2021.102044
