High-Quality Raw Materials for Everyday Discourse Research: An Introduction to the CED Corpus
DOI: 10.12677/ml.2024.128721. Supported by research project funding.
Author: Fu Xiaoli, School of Foreign Languages, Hebei Normal University, Shijiazhuang, Hebei
Keywords: Corpus of Everyday Discourse; Spoken Chinese; Corpus Collection; Annotation Schemes; Corpus Composition
Abstract: As research on everyday Chinese discourse deepens, building corpora to support it has become increasingly important. Naturally occurring language produced in interpersonal interaction is high-quality raw material for scientific research. Describing and analyzing such data in detail can reveal the usage patterns and essential features of everyday Chinese discourse. This article introduces the “Corpus of Everyday Chinese Discourse” (CED), which has been built and put into use. It gives a working definition of “everyday discourse” and reports the corpus's guiding principles, research methods, data collection standards, annotation scheme, and composition, in the hope that other researchers may find it useful.
Citation: Fu Xiaoli. High-Quality Raw Materials for Everyday Discourse Research: An Introduction to the CED Corpus [J]. Modern Linguistics, 2024, 12(8): 536-542. https://doi.org/10.12677/ml.2024.128721
