A Study on Optimizing Machine Translation Quality Based on Dependency Syntax
Abstract: Translations produced by large language models are often shaped by the structure of the source text, and their quality is difficult to guarantee. Meanwhile, machine translation quality is still assessed largely through human evaluation, and no general-purpose assessment framework exists. In response, this study proposes a method for evaluating and optimizing translation quality based on dependency parsing. A Chinese-English parallel news corpus in the field of artificial intelligence is constructed, and dependency parsing tools are used to extract and compare original English texts and machine-translated texts in terms of mean dependency distance, the distribution of key parts of speech, dependency relation types, and the use of subordinate clauses. The results indicate that the machine-translated texts exhibit several structural problems: a higher mean dependency distance, an excessive proportion of adjectival modifiers, an underrepresentation of noun compound structures, and infrequent use of subordinate clauses. Based on these findings, the study proposes a set of syntactic optimization strategies. A comparison of dependency-structure indicators before and after optimization shows that the strategies improve the syntactic structure of the machine-translated texts.
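The indicators named in the abstract can be illustrated with a minimal sketch. It assumes the commonly used definition of mean dependency distance (the average absolute difference between the linear positions of each dependent and its head, with the root excluded) and Universal Dependencies relation labels; the abstract does not state which parser, label set, or exact formula the study uses. The `Token` type, the function names, the `CLAUSE_RELATIONS` set, and the hand-annotated example sentence are all hypothetical and serve only as illustration.

```python
from collections import Counter
from typing import Dict, List, NamedTuple


class Token(NamedTuple):
    idx: int     # 1-based position of the word in the sentence
    form: str    # surface form
    head: int    # position of the head word; 0 marks the root
    deprel: str  # dependency relation label (UD-style, an assumption)


Sentence = List[Token]

# Assumed set of UD relations that introduce a subordinate clause.
CLAUSE_RELATIONS = {"advcl", "acl", "acl:relcl", "ccomp", "xcomp", "csubj"}


def mean_dependency_distance(sentences: List[Sentence]) -> float:
    """Average |dependent position - head position| over all non-root tokens."""
    distances = [abs(t.idx - t.head) for s in sentences for t in s if t.head != 0]
    return sum(distances) / len(distances) if distances else 0.0


def relation_shares(sentences: List[Sentence]) -> Dict[str, float]:
    """Share of each relation label among all non-root dependencies."""
    counts = Counter(t.deprel for s in sentences for t in s if t.head != 0)
    total = sum(counts.values())
    return {rel: n / total for rel, n in counts.items()} if total else {}


def clause_count(sentences: List[Sentence]) -> int:
    """Number of dependencies whose label is in CLAUSE_RELATIONS."""
    return sum(1 for s in sentences for t in s if t.deprel in CLAUSE_RELATIONS)


# Hand-annotated toy parse of "Large language models translate news"
# (illustrative only, not taken from the study's corpus).
sentence = [
    Token(1, "Large", 3, "amod"),
    Token(2, "language", 3, "compound"),
    Token(3, "models", 4, "nsubj"),
    Token(4, "translate", 0, "root"),
    Token(5, "news", 4, "obj"),
]

mdd = mean_dependency_distance([sentence])
shares = relation_shares([sentence])
clauses = clause_count([sentence])
print(mdd, shares.get("amod"), shares.get("compound"), clauses)
```

Running the same functions over the parsed original English texts and the parsed machine translations would give the kind of side-by-side indicator comparison the abstract describes. Whether punctuation tokens are counted is a design choice left open here.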
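Cite this article: 孔晓宏 (Kong Xiaohong). 基于依存句法的机器译文质量优化方法研究 [A Study on Optimizing Machine Translation Quality Based on Dependency Syntax] [J]. 现代语言学 (Modern Linguistics), 2026, 14(1): 454-460. https://doi.org/10.12677/ml.2026.141059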
