Encrypted Traffic Classification Model CLGAN Based on Semi-Supervised Learning
DOI: 10.12677/mos.2025.145416. Supported by the National Natural Science Foundation of China.
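Authors: Di Zhan: School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai; Liu Ya: School of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai; Lion Rock Labs of Cyberspace Security, Hong Kong; Zhao Fengyu: Department of Information and Intelligent Engineering, Shanghai Publishing and Printing College, Shanghai; Qu Bo: Institute of Cyberspace Technology, HKCT Institute of Higher Education, Hong Kong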
Keywords: Encrypted Traffic Classification; Deep Learning; Semi-Supervised Learning; Attention Mechanism; Generative Adversarial Network
Abstract: The rapid growth and increasing complexity of encrypted traffic pose new challenges for network traffic classification. Existing methods for encrypted traffic classification suffer from limited feature extraction ability and a heavy dependence on labeled data. To address these problems, this paper proposes CLGAN, a semi-supervised encrypted traffic classification model based on the Generative Adversarial Network (GAN). The GAN discriminator uses a cascade of a Convolutional Neural Network (CNN) and a Long Short-Term Memory network (LSTM), so the model combines the spatial feature extraction of the CNN with the temporal feature capture of the LSTM and thereby improves classification performance. On the public datasets ISCX2012 VPN-nonVPN and USTC-TFC2016, CLGAN is compared with the semi-supervised model DCGAN, a semi-supervised model based on an Autoencoder (AE), and the supervised model CNN-LSTM. The results show that CLGAN has stronger feature extraction and generalization ability when labeled data is scarce. With 2000 labeled samples, the classification accuracy of CLGAN is about 3% higher than that of CNN-LSTM; compared with DCGAN, its accuracy is about 4% higher across the different numbers of labeled samples tested.
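The CNN-LSTM cascade described in the abstract can be made more concrete by tracing tensor shapes through such a stack. The sketch below is only an illustration under stated assumptions: the input length (784 bytes), the two convolution/pooling stages, the LSTM hidden size, and the number of output logits are all hypothetical values chosen for the example, not the configuration reported in the paper. The helper names `conv1d_out_len` and `cascade_shapes` are likewise invented here.

```python
# Illustrative shape walk-through of a CNN -> LSTM cascade.
# All sizes below are hypothetical and are not taken from the paper.

def conv1d_out_len(length, kernel, stride=1, padding=0):
    """Output length of a 1-D convolution or pooling window (no dilation)."""
    return (length + 2 * padding - kernel) // stride + 1


def cascade_shapes(input_len, conv_layers, lstm_hidden, num_outputs):
    """Trace (name, shape) pairs for one sample through the cascade.

    conv_layers: list of (out_channels, kernel, stride, pool) tuples.
    """
    shapes = [("input", (1, input_len))]  # (channels, length)
    length = input_len
    channels = 1
    for i, (out_ch, kernel, stride, pool) in enumerate(conv_layers, start=1):
        length = conv1d_out_len(length, kernel, stride)
        shapes.append((f"conv{i}", (out_ch, length)))
        # Non-overlapping pooling: window size and stride are both `pool`.
        length = conv1d_out_len(length, pool, pool)
        shapes.append((f"pool{i}", (out_ch, length)))
        channels = out_ch
    # The CNN feature map is read as a sequence: one time step per
    # remaining position, with the channels as that step's feature vector.
    shapes.append(("lstm_input", (length, channels)))  # (time steps, features)
    # Assumption: only the last hidden state feeds the classifier head.
    shapes.append(("lstm_last_hidden", (lstm_hidden,)))
    shapes.append(("logits", (num_outputs,)))
    return shapes


if __name__ == "__main__":
    trace = cascade_shapes(
        input_len=784,  # hypothetical: first 784 bytes of a flow
        conv_layers=[(32, 5, 1, 2), (64, 5, 1, 2)],
        lstm_hidden=128,
        num_outputs=12,
    )
    for name, shape in trace:
        print(f"{name:>16}: {shape}")
```

Under these assumed sizes, the convolutional stages shorten the byte sequence while widening its channel dimension, and the LSTM then reads the reduced feature map position by position. This is one plausible way to connect the two components; the paper's actual layer arrangement may differ.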
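Cite this article: Di Zhan, Liu Ya, Zhao Fengyu, Qu Bo. Encrypted Traffic Classification Model CLGAN Based on Semi-Supervised Learning [J]. Modeling and Simulation, 2025, 14(5): 579-590. https://doi.org/10.12677/mos.2025.145416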
