An Interpretable Few-Shot Text Classification Model Based on Graph Neural Networks and Knowledge Graphs
Abstract: Few-shot text classification has a wide range of applications, but existing methods face two key challenges: data scarcity and limited interpretability. To address both, this paper proposes ARExplainer, an interpretable few-shot text classification method augmented with data and reasoning. ARExplainer draws on the generalization ability of large language models (LLMs) to diversify the training samples, which alleviates the data bottleneck of few-shot learning. To improve interpretability, it builds a knowledge-graph-driven reasoning engine that uses a graph attention network to extract verifiable symbolic reasoning paths, giving each classification decision a logical basis. Finally, a prompt-based explanation generator produces concise, clear natural language explanations. Experiments show that ARExplainer significantly outperforms the strongest baseline model in the 1-shot setting. In addition, a comparison with automatically generated explanations and human annotations confirms that the natural language explanations from ARExplainer are easier for humans to understand.
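As a rough illustration of what extracting a symbolic reasoning path with attention over a knowledge graph can look like, the sketch below walks a toy graph and records the triples it follows. It is not the ARExplainer implementation: the graph `KG`, the embeddings `EMB`, and the functions `attention` and `best_path` are all hypothetical, and plain dot-product attention stands in for the learned scoring function of a graph attention network.

```python
import math

# Toy knowledge graph (hypothetical): head -> list of (relation, tail).
KG = {
    "striker": [("plays_in", "football"), ("synonym_of", "forward")],
    "football": [("is_a", "sport")],
    "forward": [("is_a", "direction")],
}

# Hypothetical 2-d node embeddings; a real system would learn these.
EMB = {
    "striker": (1.0, 0.2),
    "football": (0.9, 0.1),
    "forward": (0.1, 0.9),
    "sport": (1.0, 0.0),
    "direction": (0.0, 1.0),
}


def attention(node):
    """Softmax over dot-product scores between a node and its neighbours.

    Assumption: dot products replace the learned GAT scoring function.
    """
    edges = KG.get(node, [])
    scores = [
        sum(a * b for a, b in zip(EMB[node], EMB[tail])) for _, tail in edges
    ]
    if not scores:
        return []
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [(rel, tail, e / total) for (rel, tail), e in zip(edges, exps)]


def best_path(start, max_hops=3):
    """Greedily follow the highest-attention edge, recording each triple."""
    path, node = [], start
    for _ in range(max_hops):
        scored = attention(node)
        if not scored:
            break  # no outgoing edges: the path ends here
        rel, tail, weight = max(scored, key=lambda item: item[2])
        path.append((node, rel, tail, round(weight, 3)))
        node = tail
    return path


if __name__ == "__main__":
    for head, rel, tail, weight in best_path("striker"):
        print(f"{head} --{rel}--> {tail} (attention {weight})")
```

Each recorded triple can be checked directly against the graph, which is the sense in which such a path is verifiable; a prompt-based generator could then verbalize the triples into an explanation.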
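Cite this article: Zhou Zixuan (周子璇), Li Qiuyao (李秋瑶). An Interpretable Few-Shot Text Classification Model Based on Graph Neural Networks and Knowledge Graphs [J]. Computer Science and Application (计算机科学与应用), 2026, 16(1): 115-129. https://doi.org/10.12677/csa.2026.161010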
