Association Matching Technology for Voice Files and Text Files in Comprehensive Transportation Law Enforcement, Based on TF-IDF and jieba Word Segmentation
DOI: 10.12677/OJTT.2023.125041. Supported by research project funding.
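Authors: Liu Wenping (刘文平), Li Yanchun (李艳春), Zhang He (张贺), Zhang Yuchi (张宇驰), Ding Ding (丁鼎): Beijing Transportation Comprehensive Law Enforcement Corps, Beijing; Yu Quan (于泉), Wang Chuanyang (王传炀): School of Electrical and Control Engineering, North China University of Technology, Beijing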
Keywords: TF-IDF; jieba word segmentation; comprehensive transportation law enforcement; association matching; hearing
Abstract: Hearings in comprehensive transportation administrative law enforcement have traditionally been held in person, and the hearing clerk must keep a detailed record of the entire proceeding. Because that record must afterwards be archived together with the evidence materials of the whole case, the workload placed on enforcement officers is heavy. To give technical support to the hearing stage of the case-handling workflow, this work uses the TF-IDF algorithm to extract keywords from hearing content and uses jieba word segmentation to refine the process, yielding a technique for association matching between voice files and text files. The technique matches the transcribed hearing speech to the key element information of a case, so that a complete chain of evidence can be built and administrative penalties rest on documented grounds. The aim is to improve the adequacy and accuracy of penalty decisions in hearing cases and to support the modernization of the government's governance system and governance capacity.
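The abstract does not spell out the matching procedure, so the following is only a minimal sketch of how TF-IDF keyword weighting could link a hearing transcript to a case text file. It assumes the text has already been segmented into tokens (for Chinese text, `jieba.lcut(text)` returns such a token list), uses a smoothed IDF formula chosen here for illustration, and ranks candidates by cosine similarity. The function names, file names, and sample tokens are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def tfidf_vectors(docs):
    """Turn a list of token lists into one {term: weight} dict per document."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        total = len(doc)
        vectors.append({
            # Smoothed IDF (an assumption of this sketch, not the paper's formula).
            term: (count / total) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse {term: weight} vectors."""
    dot = sum(w * b.get(term, 0.0) for term, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_keywords(vector, k=3):
    """Return the k highest-weighted terms (ties broken alphabetically)."""
    ranked = sorted(vector.items(), key=lambda kv: (-kv[1], kv[0]))
    return [term for term, _ in ranked[:k]]


def match_transcript(transcript_tokens, case_files):
    """Return (file_name, score) of the case file most similar to the transcript."""
    names = list(case_files)
    vectors = tfidf_vectors([transcript_tokens] + [case_files[n] for n in names])
    query, candidates = vectors[0], vectors[1:]
    scores = {name: cosine(query, vec) for name, vec in zip(names, candidates)}
    best = max(scores, key=scores.get)
    return best, scores[best]


# Hypothetical, already-segmented sample data (made up for illustration).
transcript = ["货车", "超载", "听证", "罚款", "超载"]
case_files = {
    "case_A.txt": ["货车", "超载", "称重", "罚款"],
    "case_B.txt": ["出租车", "拒载", "投诉", "罚款"],
}

best, score = match_transcript(transcript, case_files)
keywords = top_keywords(tfidf_vectors([transcript] + list(case_files.values()))[0], k=2)
print(best, round(score, 3), keywords)
```

In a real pipeline the transcript would come from a speech-to-text step and the tokens from jieba, possibly with a custom dictionary of enforcement terms. jieba also ships its own TF-IDF keyword extractor, `jieba.analyse.extract_tags`, which could replace the hand-written weighting above.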
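Cite this article: Liu Wenping, Li Yanchun, Zhang He, Zhang Yuchi, Ding Ding, Yu Quan, Wang Chuanyang. Association Matching Technology for Voice Files and Text Files in Comprehensive Transportation Law Enforcement, Based on TF-IDF and jieba Word Segmentation [J]. Open Journal of Transportation Technologies (交通技术), 2023, 12(5): 377-384. https://doi.org/10.12677/OJTT.2023.125041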
