A Comprehensive Survey of Machine-Generated Text Detection
DOI: 10.12677/airr.2026.151005
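Authors: 孙一凯 (Sun Yikai): School of Computer Science, Beijing Information Science and Technology University, Beijing; 王洪俊 (Wang Hongjun): School of Computer Science, Beijing Information Science and Technology University, Beijing; TRS Information Technology Co., Ltd. (拓尔思信息技术股份有限公司), Beijing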
Keywords: Large Language Models, Natural Language Processing, Text Detection
Abstract: As the boundary between artificial intelligence-generated content (AIGC) and human-written text blurs, machine-generated text detection has become an important research direction in natural language processing. This paper surveys the technical evolution of the field: watermarking techniques that actively embed hidden signals, traditional methods based on statistical features, supervised learning methods that use pre-trained language models (e.g., RoBERTa, DeBERTa) as discriminators, and probability-based detection methods that draw on model prediction uncertainty and differences in feature distributions. In recent years, localized detection and interpretability analysis have become new research hotspots, enabling detection systems to identify the specific generated segments and explain the basis for their decisions. However, cross-model generalization, multilingual scenarios, and adversarial robustness remain open problems. Future research will aim to build more robust and interpretable detection frameworks and to combine causal reasoning with multimodal information to improve detection performance, advancing the practical deployment and standardization of LLM-generated text detection.
Citation: 孙一凯, 王洪俊. A Comprehensive Survey of Machine-Generated Text Detection [J]. Artificial Intelligence and Robotics Research, 2026, 15(1): 38-49. https://doi.org/10.12677/airr.2026.151005
