PCA and SE-ResNet-VIT Based Malware Detection Method
DOI: 10.12677/CSA.2023.139177. Supported by research project funding.
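Authors: Fan Cong (凡聪): School of Computer Science, Guangdong University of Technology, Guangzhou, Guangdong; Zhang Jie (张杰): School of Automation, Guangdong University of Technology, Guangzhou, Guangdong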
Keywords: Malware Detection; Principal Component Analysis; SE-ResNet; Vision Transformer; Ensemble Model
Abstract: The number of malware samples has grown steadily in recent years, posing serious security risks to users. To protect host systems from malware and improve detection accuracy, this paper proposes a malware detection method based on Principal Component Analysis (PCA) dimensionality reduction and an SE-ResNet-VIT ensemble model. Because software data is high-dimensional and noisy, PCA is first applied to the software samples under test to extract principal components and remove redundant features. The SE-ResNet-VIT model is an ensemble that combines an SE-ResNet modified with a bilinear fusion mechanism and the encoder of VIT (Vision Transformer). The improved SE-ResNet extracts more information from local features and combines those features to strengthen the model's representational capacity, while the VIT encoder uses its attention mechanism to learn dependencies within the data and can handle long sequences. By extracting features in these two different ways, the method captures the semantic information of software more accurately and thereby improves detection accuracy. In comparative experiments on the Ember datasets, the method achieved accuracies of 97.05% and 98.45%, respectively, an improvement of 1.94% to 5.95% over several existing detection methods, indicating better detection accuracy and generalization ability.
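The PCA step described in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `pca_reduce`, the synthetic data, and the choice of five components are all assumptions made for illustration, and the sketch uses a plain SVD-based PCA in NumPy rather than whatever tooling the paper used.

```python
import numpy as np


def pca_reduce(X, n_components):
    """Project the rows of X onto their top principal components.

    Hypothetical helper for illustration; not taken from the paper.
    Returns (projected data, principal directions, explained-variance ratios).
    """
    X = np.asarray(X, dtype=float)
    # Center each feature so the SVD captures variance, not the mean offset.
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by singular value.
    _, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    # Squared singular values are proportional to the variance per direction.
    explained_ratio = (S ** 2) / np.sum(S ** 2)
    return X_centered @ components.T, components, explained_ratio[:n_components]


# Synthetic stand-in for software feature vectors (an assumption, not Ember):
# 200 samples with 50 features driven by 5 latent factors plus small noise.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 5))
mixing = rng.normal(size=(5, 50))
X = latent @ mixing + 0.01 * rng.normal(size=(200, 50))

Z, components, ratio = pca_reduce(X, n_components=5)
print(Z.shape)      # reduced representation: one 5-dimensional row per sample
print(ratio.sum())  # fraction of total variance kept by the 5 components
```

In this toy setting the redundant features are linear mixtures of a few latent factors, so a handful of components retains nearly all of the variance. On real software features the number of components would need to be chosen empirically, for example from the cumulative explained-variance ratio.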
Cite this article: Fan Cong, Zhang Jie. PCA and SE-ResNet-VIT Based Malware Detection Method [J]. Computer Science and Application, 2023, 13(9): 1785-1795. https://doi.org/10.12677/CSA.2023.139177
