Design of an Intelligent Marking System Based on LLM
Abstract: As education expands in scale and demand for personalization grows, traditional manual grading faces serious challenges: low efficiency, strong subjectivity, and delayed feedback. To address these problems, this study proposes an intelligent grading system based on large language models (LLMs) that aims to improve grading efficiency, fairness, and explainability. The system is built on the Transformer architecture and uses a hybrid scoring algorithm that combines instruction fine-tuning with rule-constrained reinforcement learning. Historical exam data and expert scoring rules are used to adapt the LLM to the grading domain, effectively addressing the problem of inconsistent scoring on subjective questions. A modular design automates the full workflow: data preprocessing, multi-stage scoring, error-cause tracing, and personalized feedback generation. The work makes three contributions. First, it combines the semantic understanding of the LLM with the hard constraints of a rule engine, balancing algorithmic flexibility against assessment rigor. Second, it visualizes attention weights and highlights the evidence behind each score, which addresses the “black box” trust barrier in educational settings. Third, it pairs lightweight fine-tuning (LoRA) with a vector database so that the system remains feasible to engineer under high concurrency. The system provides technical support for large-scale examinations and personalized teaching, and helps move educational assessment from “experience-driven” to “data-intelligence-driven”. Future work will extend to multimodal answer parsing and dynamic tracking of student learning, deepening the practical value of integrating AI with education.
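The first contribution, pairing the LLM's semantic judgment with the hard constraints of a rule engine, can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and is not taken from the system itself: the `Rubric` and `GradingResult` types, the `apply_rule_constraints` function, and the keyword-penalty rule are hypothetical, and the model's proposed score is passed in as a plain number rather than produced by a real LLM call.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Rubric:
    """Hypothetical rubric: a score ceiling plus required answer points."""
    max_score: float
    # Maps a required keyword to the penalty applied when it is missing.
    required_keywords: Dict[str, float]


@dataclass
class GradingResult:
    score: float
    reasons: List[str] = field(default_factory=list)


def apply_rule_constraints(answer: str, model_score: float, rubric: Rubric) -> GradingResult:
    """Post-process a model-proposed score with hard rubric rules.

    The model score stands in for the LLM's semantic judgment; the rules
    below are fixed constraints that the model cannot override.
    """
    reasons: List[str] = []

    # Rule 1: the score must stay inside the rubric's valid range.
    score = min(max(model_score, 0.0), rubric.max_score)
    if score != model_score:
        reasons.append(f"score clamped to [0, {rubric.max_score}]")

    # Rule 2: deduct a fixed penalty for each required point that is absent.
    lowered = answer.lower()
    for keyword, penalty in rubric.required_keywords.items():
        if keyword.lower() not in lowered:
            score -= penalty
            reasons.append(f"missing required point: {keyword}")

    # Penalties never push the score below zero.
    return GradingResult(score=max(score, 0.0), reasons=reasons)


rubric = Rubric(max_score=10.0, required_keywords={"photosynthesis": 3.0, "chlorophyll": 2.0})
result = apply_rule_constraints("Plants use photosynthesis to make sugar.", 12.0, rubric)
print(result.score, result.reasons)
```

The recorded `reasons` list is one simple way such a design could expose the evidence behind a score, in the spirit of the explainability goal stated in the abstract.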