Research on the Fairness of Generative AI-Empowered EFL Classroom Assessment: From the Dual Perspectives of Teachers and Students
Abstract: In recent years, generative artificial intelligence (GenAI) has been widely applied in educational assessment, posing new challenges to assessment fairness. This mixed-methods study draws on a questionnaire survey of 209 university teachers and 637 students, together with in-depth interviews, to examine how teachers and students perceive assessment fairness in GenAI-empowered EFL (English as a Foreign Language) teaching and assessment, how the two groups differ, and which factors influence those perceptions. The findings are as follows: 1) Teachers and students are generally cautious about the fairness of GenAI involvement in assessment; perceived fairness risk is highest in the scenario where “the same scoring criteria are applied to both AI-assisted and independently completed assignments”. 2) Teachers and students differ significantly in their fairness concerns: teachers focus more on the formal fairness and technical neutrality of scoring rules, whereas students care more about equal opportunity in access to resources and fair competition. 3) The core factors constraining assessment fairness are the suitability of scoring criteria, the accessibility of AI tools, and individual technological literacy. The results provide empirical evidence for building fair assessment and a standardized teaching assessment system in the era of digital intelligence.
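Cite this article: Wang Jing, Dong Zihan, Dai Wenhui, Zhang Zehan. Research on the Fairness of Generative AI-Empowered EFL Classroom Assessment: From the Dual Perspectives of Teachers and Students [J]. Advances in Education (教育进展), 2026, 16(5): 826-834. https://doi.org/10.12677/ae.2026.165926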
