Structural Knowledge-Enhanced Approach for Network Device Log Understanding
Abstract: As network systems grow in scale and complexity, system logs have become a key data source for fault diagnosis and operations management. However, existing log understanding methods largely ignore the structured features of log text and the relationships among system components, which limits their ability to interpret logs in complex fault scenarios. To address this, this paper proposes a structural knowledge-enhanced method for network device log understanding. First, we construct a log knowledge graph with a three-layer facility-error-severity semantic hierarchy that explicitly models fault propagation paths and dependencies among system components. On this basis, we design a structured masked prediction task that applies a higher masking probability to key structured fields in logs, guiding the model to learn semantic representations of system architecture and error types. We also propose a graph neural network-enhanced text alignment mechanism that uses self-attention to dynamically fuse the graph embeddings of multiple entities, aligning the structural information of the knowledge graph with textual semantics. Experimental results show that the proposed method significantly outperforms mainstream baseline models on multiple task metrics, validating its effectiveness and generalization across tasks.
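The structured masked prediction task described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the log layout `FACILITY/SEVERITY/MNEMONIC: message`, the two masking rates, and the function names `split_header` and `structured_mask` are all assumptions made for illustration. The abstract only states that key structured fields receive a higher masking probability than ordinary tokens.

```python
import random

# Hypothetical masking rates. The abstract only says structured fields are
# masked with a higher probability; these specific values are assumptions.
STRUCT_MASK_PROB = 0.5
PLAIN_MASK_PROB = 0.15
MASK = "[MASK]"


def split_header(log_line):
    """Split an assumed 'FACILITY/SEVERITY/MNEMONIC: message' line.

    Returns the header fields and the whitespace-separated message tokens.
    """
    header, _, message = log_line.partition(":")
    fields = header.strip().split("/")
    return fields, message.split()


def structured_mask(log_line, rng):
    """Mask tokens, using a higher rate for structured header fields.

    Returns (masked_tokens, labels). labels[i] holds the original token
    where position i was masked, and None elsewhere.
    """
    fields, words = split_header(log_line)
    tokens = [(f, True) for f in fields] + [(w, False) for w in words]
    masked, labels = [], []
    for token, is_structured in tokens:
        prob = STRUCT_MASK_PROB if is_structured else PLAIN_MASK_PROB
        if rng.random() < prob:
            masked.append(MASK)
            labels.append(token)  # prediction target for this position
        else:
            masked.append(token)
            labels.append(None)   # position does not contribute to the loss
    return masked, labels


if __name__ == "__main__":
    rng = random.Random(0)
    line = "IFNET/4/LINK_STATE: interface GE0/0/1 changed state to down"
    masked, labels = structured_mask(line, rng)
    print(masked)
    print(labels)
```

In a real pre-training pipeline the masking would operate on subword tokens and feed a masked language modeling loss; the sketch only shows how field-dependent masking rates bias the prediction targets toward facility, severity, and error-type fields.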
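Cite this article: Cao Xianglong, Zhang Mingxi, Yin Songze, Li Yuchen, Wang Lingxuan. Structural Knowledge-Enhanced Approach for Network Device Log Understanding [J]. Software Engineering and Applications, 2025, 14(5): 1013-1025. https://doi.org/10.12677/sea.2025.145090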
