A Cloud-Edge Collaborative Voice Anti-Fraud System Based on HarmonyOS

doi:10.12677/sea.2026.153039



This work was supported by research project funding.
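Authors: Huang Chunqin (黄纯琴), Yang Jinchun (杨金春), Yan Yan (严言), Yang Zaixin (杨再鑫), Zang Miao (臧淼): School of Artificial Intelligence and Computer Science, North China University of Technology, Beijing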
Keywords: HarmonyOS; Voice Anti-Fraud; Cloud-Edge Collaboration; Speech Recognition; Fraud Detection

Abstract: Telecom fraud scripts keep evolving, and users have no real-time warning about the risk of what is being said during a call. To address this, this paper designs and implements a cloud-edge collaborative voice anti-fraud system based on HarmonyOS. The system separates the front end from the back end. The front end, built with the HarmonyOS ArkTS framework, handles multi-source audio playback, audio selection, user interaction, and result visualization. The back end, built on Python Flask, performs audio preprocessing, calls the Baidu short speech recognition API, and detects fraud in the recognized text. To improve real-time performance and engineering practicality, the system builds on large-model semantic analysis with a hybrid detection strategy that combines local rule-based pre-screening, supplementary large-model semantic judgment, and result caching. It outputs structured results comprising a risk level, a confidence score, the reason for the judgment, and safety suggestions. On a self-constructed speech dataset of telecom fraud and normal calls, overall fraud detection accuracy reaches 91.6%, and recall for key fraud phrases such as "safe account", "frozen funds", and "court subpoena" exceeds 95%. Over Wi-Fi, the average end-to-end response latency is about 2 s, and all 100 consecutive detections succeeded. In a supplementary experiment on 20 representative text samples, the optimized hybrid detection pipeline classified every sample correctly, with an average response time of about 0.38 s. These results indicate that the system identifies typical fraud scripts effectively and balances accuracy against real-time performance, and that it can serve as a reference for developing intelligent anti-fraud applications in the HarmonyOS ecosystem.
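The hybrid detection strategy described in the abstract (rule pre-screening first, large-model judgment as a fallback, and cached results) might be sketched in Python roughly as follows. This is a minimal illustration, not the authors' implementation: the names `DetectionResult`, `rule_prescreen`, `stub_semantic_judge`, and `make_detector`, the English phrase list, and the fixed confidence values are all assumptions made for this example, and the large-model call is replaced by a stub.

```python
from dataclasses import dataclass
from functools import lru_cache
from typing import Callable, Optional

# Hypothetical rule list: the three phrases quoted in the abstract, in English.
HIGH_RISK_PHRASES = ("safe account", "frozen funds", "court subpoena")


@dataclass(frozen=True)
class DetectionResult:
    """Structured output with the four fields named in the abstract."""
    risk_level: str    # "high" or "low" in this sketch
    confidence: float  # 0.0 to 1.0
    reason: str
    suggestion: str


def rule_prescreen(text: str) -> Optional[DetectionResult]:
    """Return a high-risk result if a listed phrase appears, else None."""
    lowered = text.lower()
    hits = [phrase for phrase in HIGH_RISK_PHRASES if phrase in lowered]
    if not hits:
        return None
    return DetectionResult(
        risk_level="high",
        confidence=0.95,  # assumed fixed score for a rule hit
        reason="Matched fraud phrases: " + ", ".join(hits),
        suggestion="Hang up and do not transfer any money.",
    )


def stub_semantic_judge(text: str) -> DetectionResult:
    """Stand-in for the large-model call; always reports low risk."""
    return DetectionResult(
        risk_level="low",
        confidence=0.5,
        reason="No rule matched; the stub judge reports no risk.",
        suggestion="No action needed.",
    )


def make_detector(
    semantic_judge: Callable[[str], DetectionResult] = stub_semantic_judge,
    cache_size: int = 256,
) -> Callable[[str], DetectionResult]:
    """Build a detector: rules first, then the judge, with results cached."""

    @lru_cache(maxsize=cache_size)
    def detect(text: str) -> DetectionResult:
        rule_result = rule_prescreen(text)
        if rule_result is not None:
            return rule_result       # fast path: no model call needed
        return semantic_judge(text)  # slow path: semantic judgment

    return detect
```

Calling `detect = make_detector()` and then `detect("Please move your money to a safe account")` takes the rule path and never reaches the judge, while a repeated non-matching text is answered from the cache on its second call. Putting the cheap rule check before the model call and memoizing results is one plausible way to get the kind of latency reduction the abstract reports, though the paper's actual rules, thresholds, and cache policy are not given here.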

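Cite this article: Huang Chunqin, Yang Jinchun, Yan Yan, Yang Zaixin, Zang Miao. A Cloud-Edge Collaborative Voice Anti-Fraud System Based on HarmonyOS [J]. Software Engineering and Applications, 2026, 15(3): 412-422. https://doi.org/10.12677/sea.2026.153039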

