Research on Construction and Application of a Trae-Based C-E Parallel Corpus for TCM Classics
DOI: 10.12677/ml.2026.146605; supported by research project funding
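Authors: Feng Junfang (冯俊芳): Dezhou Hospital, Qilu Hospital of Shandong University, Dezhou, Shandong; Li Xinsheng (栗心生): Shandong Huayu University of Technology, Dezhou, Shandong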
Keywords: English Translation of Traditional Chinese Medical Classics; Parallel Corpus; Trae Platform; Text Alignment; RAG Retrieval
Abstract: Traditional Chinese medical classics are key documentary carriers of traditional Chinese medicine (TCM), and their English translation and dissemination matter for the internationalization of TCM and the global promotion of Chinese culture. However, current research on their English translation is hampered by scattered corpora, non-standard annotation, and low retrieval efficiency. This paper proposes a Trae-based method for constructing a parallel corpus of English translations of TCM classics, and designs and implements a complete technical framework spanning corpus collection, text cleaning, paragraph alignment, and intelligent comparison. The system uses Python as its core development language and draws on the Trae platform's MCP protocol extension capability and retrieval-augmented generation (RAG) to automate the processing of multi-version parallel corpora for classics such as Huangdi Neijing Suwen (《黄帝内经·素问》, "Essential Questions in Yellow Emperor's Inner Canon") and Shanghan Lun (《伤寒论》). The study indicates that the framework can significantly improve the efficiency and quality of corpus construction, gives translation researchers and practitioners solid technical support, and offers a useful reference for the digital transformation of TCM translation studies.
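The cleaning and paragraph-alignment stages named in the abstract can be pictured with a minimal Python sketch. It is not the authors' implementation and does not touch Trae, MCP, or RAG. The names `clean_text`, `align_paragraphs`, and `AlignedPair`, the blank-line paragraph convention, the strict one-to-one alignment, and the length-ratio thresholds are all assumptions made for illustration.

```python
from __future__ import annotations

import re
import unicodedata
from dataclasses import dataclass


@dataclass
class AlignedPair:
    index: int
    source: str       # source-language paragraph (e.g. classical Chinese)
    target: str       # translated paragraph (e.g. English)
    suspicious: bool  # True when the length ratio falls outside the bounds


def clean_text(raw: str) -> list[str]:
    """Normalize to NFC, collapse whitespace, and split on blank lines.

    Assumption: paragraphs are separated by one or more blank lines.
    """
    text = unicodedata.normalize("NFC", raw)
    paragraphs = []
    for block in re.split(r"\n\s*\n", text):
        block = re.sub(r"\s+", " ", block).strip()
        if block:
            paragraphs.append(block)
    return paragraphs


def align_paragraphs(
    source_paras: list[str],
    target_paras: list[str],
    min_ratio: float = 0.5,   # hypothetical lower bound on len(target)/len(source)
    max_ratio: float = 8.0,   # hypothetical upper bound
) -> list[AlignedPair]:
    """Pair paragraphs by position and flag implausible length ratios.

    Assumption: both versions have the same number of paragraphs; a real
    pipeline would need to handle merged or split paragraphs.
    """
    if len(source_paras) != len(target_paras):
        raise ValueError("paragraph counts differ; manual review needed")
    pairs = []
    for i, (src, tgt) in enumerate(zip(source_paras, target_paras)):
        ratio = len(tgt) / max(len(src), 1)
        flagged = not (min_ratio <= ratio <= max_ratio)
        pairs.append(AlignedPair(i, src, tgt, flagged))
    return pairs


if __name__ == "__main__":
    # Placeholder text, not taken from any real edition or translation.
    zh = "第一段原文。\n\n第二段原文。"
    en = "First paragraph of the translation.\n\n  Second   paragraph.  "
    for pair in align_paragraphs(clean_text(zh), clean_text(en)):
        print(pair)
```

Positional pairing with a length-ratio check is the simplest possible baseline. It only surfaces pairs that deserve a human look and would not by itself resolve editions whose paragraph boundaries differ.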
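Citation: Feng Junfang, Li Xinsheng. Research on Construction and Application of a Trae-Based C-E Parallel Corpus for TCM Classics [J]. Modern Linguistics, 2026, 14(6): 969-982. https://doi.org/10.12677/ml.2026.146605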
