Research on Intrusion Detection Data Classification Based on Deep Learning
DOI: 10.12677/AAM.2023.126274. Supported by the National Natural Science Foundation of China.
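Author: Jin Ying: School of Cybersecurity, Chengdu University of Information Technology, Chengdu, Sichuan; Sichuan Key Laboratory of Advanced Cryptography and System Security, Chengdu University of Information Technology, Chengdu, Sichuan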
Keywords: Generative Adversarial Network (GAN); Intrusion Detection; Imbalanced Data Classification; Deep Forest; Feature Extraction
Abstract: To address the low detection and classification accuracy caused by class imbalance in intrusion detection datasets, an intrusion detection model combining a generative adversarial network (GAN) with deep forest is designed. First, drawing on the adversarial idea unique to GANs, the classes that need to be generated are selected according to the classification results on the original data, and the samples the dataset lacks are generated to alleviate the imbalance. Then, to resolve the conflict between the complexity of network traffic features and the high data processing cost of the deep forest model, a feature extraction method combining principal component analysis and linear discriminant analysis is designed; it removes redundant computation in the deep forest model and improves its data transfer and processing capability. Experimental results show that the proposed method reaches a classification detection accuracy of 96%, and that the detection accuracy on minority-class data is 30% higher than before balancing.
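The class-selection step described in the abstract (using classification results on the original data to decide which classes the GAN should generate) can be illustrated with a minimal sketch. This is not the paper's procedure: the recall cutoff `recall_threshold`, the rule of topping each selected class up to the size of the largest class, the function names, and the toy labels are all assumptions made for illustration.

```python
from collections import Counter


def per_class_recall(y_true, y_pred):
    """Return {label: fraction of that label's samples predicted correctly}."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}


def classes_to_generate(y_true, y_pred, recall_threshold=0.8):
    """Plan how many synthetic samples each poorly classified class needs.

    A class is selected when its recall falls below `recall_threshold`
    (a hypothetical cutoff). Each selected class is topped up to the size
    of the largest class (also an assumption, not the paper's rule).
    """
    counts = Counter(y_true)
    target = max(counts.values())
    recall = per_class_recall(y_true, y_pred)
    return {
        label: target - counts[label]
        for label in counts
        if recall[label] < recall_threshold and counts[label] < target
    }


if __name__ == "__main__":
    # Toy labels (hypothetical): "normal" dominates, "r2l" and "u2r" are rare.
    y_true = ["normal"] * 6 + ["dos"] * 4 + ["r2l"] * 2 + ["u2r"] * 2
    y_pred = ["normal"] * 6 + ["dos"] * 4 + ["normal", "r2l"] + ["normal", "normal"]
    print(classes_to_generate(y_true, y_pred))  # {'r2l': 4, 'u2r': 4}
```

In this sketch the returned dictionary would tell a generator how many samples to synthesize per class; the well-classified classes are left untouched, so only classes that are both rare and poorly detected are augmented.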
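Cite this article: Jin, Y. Research on Intrusion Detection Data Classification Based on Deep Learning [J]. Advances in Applied Mathematics, 2023, 12(6): 2736-2748. https://doi.org/10.12677/AAM.2023.126274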
