Design and Implementation of a Multimodal Educational Question-Answering Platform
DOI: 10.12677/csa.2025.159219
Author: Miao Dan, School of Artificial Intelligence and Computer Science, North China University of Technology, Beijing
Keywords: Multimodal Educational Q&A Platform; Gesture-Based Drawing
Abstract: Traditional educational Q&A platforms offer a single mode of interaction and cannot fully meet users' needs to express questions and obtain answers in complex knowledge scenarios. To address this problem, this paper builds a multimodal Q&A platform on Spring Boot and Next.js that integrates four interaction modalities: text, image, gesture drawing, and voice. The platform has a three-layer architecture: a front-end interaction layer (Next.js implements the interactive interface), a data service layer (Spring Boot manages sessions and data persistence), and a multimodal processing layer (Python integrates Alibaba Cloud image generation and visual understanding APIs, gesture drawing with OpenCV and MediaPipe, and PyAudio-based speech recognition). The design and implementation of the platform improve the diversity of educational Q&A interaction and the user experience.
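The layered architecture described in the abstract suggests that the multimodal processing layer routes each request by modality. The following is a minimal sketch of such a dispatcher, not the paper's implementation: the `Request` shape, the handler names, and the returned dictionary are all hypothetical, and the handlers are placeholders standing in for the cloud vision API, the OpenCV/MediaPipe drawing pipeline, and speech recognition.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class Request:
    """One user turn forwarded by the data service layer (hypothetical shape)."""
    session_id: str
    modality: str  # "text", "image", "gesture", or "voice"
    payload: str   # the text itself, or a reference to an uploaded file


# Placeholder handlers (assumptions). A real system would call the vision API,
# the gesture-drawing pipeline, or the speech recognizer at these points.
def handle_text(payload: str) -> str:
    return f"text question: {payload}"


def handle_image(payload: str) -> str:
    return f"image to describe: {payload}"


def handle_gesture(payload: str) -> str:
    return f"gesture canvas to interpret: {payload}"


def handle_voice(payload: str) -> str:
    return f"audio to transcribe: {payload}"


# One entry per interaction modality named in the abstract.
HANDLERS: Dict[str, Callable[[str], str]] = {
    "text": handle_text,
    "image": handle_image,
    "gesture": handle_gesture,
    "voice": handle_voice,
}


def dispatch(request: Request) -> dict:
    """Route a request to the handler registered for its modality."""
    handler = HANDLERS.get(request.modality)
    if handler is None:
        raise ValueError(f"unsupported modality: {request.modality!r}")
    return {
        "session_id": request.session_id,
        "modality": request.modality,
        "result": handler(request.payload),
    }
```

Keeping the routing in a lookup table means a further modality could be added by registering one more handler, without changing the code that the data service layer calls.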
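Cite this article: Miao Dan. Design and Implementation of a Multimodal Educational Question-Answering Platform [J]. Computer Science and Application, 2025, 15(9): 7-15. https://doi.org/10.12677/csa.2025.159219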
