# Research on a Campus Encyclopedia Question-Answering Robot Based on Natural Language Processing

College students repeatedly ask the same simple questions about campus life. This work designs an intelligent question-answering robot that saves time for both teachers and students. As the information age advances, artificial intelligence is being developed and applied ever more widely. The system uses word segmentation and short-text similarity calculation from natural language processing, and is implemented as a Web application based on MySQL and Spring Boot. The application was deployed on the Internet and tested at our school with good results.

1. Introduction

2. Related Work and Theoretical Foundations

2.1. Related Work in China and Abroad

2.2. Word Segmentation

2.3. Semantic Matching Model

Figure 1. Framework of SimNet

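1) The input layer converts the word sequence of a text into a sequence of word embeddings through a lookup table.

2) The representation layer builds a sentence-level representation: it combines the embeddings of individual words into one or more low-dimensional, compact semantic vectors that carry global information.

3) The matching layer performs the matching. It computes similarity from the text vectors produced by the representation layer. Two kinds of matching algorithm are available: Representation-based Match and Interaction-based Match.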
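The first two layers can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the toy vocabulary, the embedding values, the zero-vector fallback for unknown words, and the choice of a plain average (bag-of-words) as the representation. It is not the network used in the system described here.

```python
from typing import Dict, List

# Hypothetical toy lookup table: word -> 3-dimensional embedding.
# Real systems learn these vectors; the values here are made up.
EMBEDDINGS: Dict[str, List[float]] = {
    "library": [1.0, 0.0, 2.0],
    "opening": [0.0, 1.0, 0.0],
    "hours": [2.0, 1.0, 1.0],
}
UNKNOWN = [0.0, 0.0, 0.0]  # assumed fallback for out-of-vocabulary words


def input_layer(words: List[str]) -> List[List[float]]:
    """Input layer: map each word to its embedding via the lookup table."""
    return [EMBEDDINGS.get(w, UNKNOWN) for w in words]


def representation_layer(embedded: List[List[float]]) -> List[float]:
    """Representation layer (assumed bag-of-words): average the word vectors."""
    if not embedded:
        return list(UNKNOWN)
    dim = len(embedded[0])
    return [sum(vec[i] for vec in embedded) / len(embedded) for i in range(dim)]


sentence_vector = representation_layer(input_layer(["library", "opening", "hours"]))
print(sentence_vector)
```

The resulting sentence vector is what the matching layer would receive as input.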

① Representation-based Match

Figure 2. Cosine (left) vs MLP (right)

$\cos\theta =\frac{X_{1}X_{2}+Y_{1}Y_{2}}{\sqrt{X_{1}^{2}+Y_{1}^{2}}\times \sqrt{X_{2}^{2}+Y_{2}^{2}}}$ (1)

$\cos\theta =\frac{\sum_{i=1}^{n}\left(A_{i}\times B_{i}\right)}{\sqrt{\sum_{i=1}^{n}A_{i}^{2}}\times \sqrt{\sum_{i=1}^{n}B_{i}^{2}}}=\frac{A\cdot B}{|A|\times |B|}$ (2)
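Equation (2) translates directly into code. The sketch below is a plain-Python illustration, not the system's implementation; the function name `cosine_similarity` and the convention of returning 0 when either vector has zero length are assumptions.

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between vectors a and b, following Eq. (2)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # assumed convention: a zero vector matches nothing
    return dot / (norm_a * norm_b)


print(cosine_similarity([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
print(cosine_similarity([3.0, 4.0], [6.0, 8.0]))  # parallel vectors
```

Orthogonal vectors score 0 and parallel vectors score 1, so the value can serve as a similarity score between two sentence vectors regardless of their lengths.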

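MLP (Multilayer Perceptron) is a feedforward artificial neural network that maps a set of input vectors to a set of output vectors. It is usually trained with the backpropagation algorithm.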
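As a rough illustration of MLP-based matching, the sketch below runs the forward pass of a one-hidden-layer network over two concatenated sentence vectors. The layer sizes, the ReLU and sigmoid activations, and the hand-set weights are all hypothetical; a real model would learn its parameters with backpropagation, which is omitted here.

```python
import math
from typing import List


def dense(x: List[float], weights: List[List[float]], bias: List[float]) -> List[float]:
    """One fully connected layer: weights[j] holds the weights of output unit j."""
    return [sum(w * xi for w, xi in zip(row, x)) + b for row, b in zip(weights, bias)]


def relu(x: List[float]) -> List[float]:
    """Element-wise rectified linear activation."""
    return [max(0.0, v) for v in x]


def sigmoid(v: float) -> float:
    """Squash a real number into the interval (0, 1)."""
    return 1.0 / (1.0 + math.exp(-v))


def mlp_match(u: List[float], v: List[float]) -> float:
    """Score a pair of 2-dimensional sentence vectors with a small MLP."""
    x = u + v  # concatenate the two vectors into one 4-dimensional input
    # Hypothetical hand-set parameters; a trained model would learn these.
    w1 = [[1.0, 0.0, -1.0, 0.0], [0.0, 1.0, 0.0, -1.0]]
    b1 = [0.0, 0.0]
    w2 = [[-1.0, -1.0]]
    b2 = [1.0]
    hidden = relu(dense(x, w1, b1))
    return sigmoid(dense(hidden, w2, b2)[0])


score = mlp_match([0.5, 0.2], [0.5, 0.2])
print(score)
```

Unlike the cosine measure, the score here depends on the network's parameters, which is why the MLP variant needs labeled training data.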

② Interaction-based Match

3. System Design

Figure 3. System function frame diagram

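4. System Implementation

4.1. Development Tools and Runtime Environment

4.2. Implementation of Core Components

1) Retrieval Process Implementation

2) Word Segmentation and Question Matching Implementation

4.3. Functional Implementation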

Figure 4. System operation flow chart

Figure 5. The system’s front page

Figure 6. Quiz interface

Figure 7. Express information management module interface

5. Conclusion

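Funding: 2018 National and Provincial College Students' Innovation and Entrepreneurship Training Program projects (201811347024, 201811347086); 2019 Zhongkai University of Agriculture and Engineering university-level Quality Engineering Project (KA190573919); 2019 annual project of the Guangzhou Philosophy and Social Sciences Development "13th Five-Year Plan" (2019gzgj125).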

