Design of Walking Assistance Software for Blind People Based on Deep Learning
DOI: 10.12677/SEA.2022.112034
Authors: Nawei Shi, Huazhang Wang*: College of Electrical Engineering, Southwest Minzu University, Chengdu, Sichuan
Keywords: Travel for the Blind, Video Detection, Speech Synthesis
Abstract: With the progress of the times, a technology-driven society has brought great convenience to some people's lives, but it has also made life harder for visually impaired people. Guide devices for the blind designed in China still have shortcomings such as limited intelligence, low accuracy, and the inability to raise real-time obstacle alerts. To address these shortcomings, this paper proposes a software design for assisting blind pedestrians that combines computer vision and natural language processing. Video detection is used to detect passing vehicles, pedestrians, and obstacles in real time and to analyze their positions. The detected targets are then expanded into richer spoken phrases, and speech synthesis broadcasts the surrounding targets in real time, giving the user spoken guidance. Test results indicate that the system offers a low-cost way to improve practicality, protect user safety, and increase obstacle-detection accuracy, making travel and daily life more convenient for blind people.
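As a rough illustration of the "analyze position, then enrich the phrasing" step summarized in the abstract, the sketch below turns detector output into a sentence that a speech synthesizer could read aloud. The abstract does not give the paper's actual rules, so everything here is an assumption made for illustration: the `Detection` structure, the one-third split of the frame into left, ahead, and right, and the phrase template are all hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Detection:
    """One detected object; a hypothetical structure, not taken from the paper."""
    label: str      # class name reported by the detector, e.g. "car"
    x_min: float    # left edge of the bounding box, in pixels
    x_max: float    # right edge of the bounding box, in pixels


def horizontal_zone(det: Detection, frame_width: float) -> str:
    """Map the box centre to a coarse zone using assumed one-third splits."""
    centre = (det.x_min + det.x_max) / 2.0
    if centre < frame_width / 3.0:
        return "on your left"
    if centre < 2.0 * frame_width / 3.0:
        return "ahead"
    return "on your right"


def announcement(detections: List[Detection], frame_width: float) -> str:
    """Build one sentence that a text-to-speech engine could read aloud."""
    if not detections:
        return "The path looks clear."
    parts = [f"{d.label} {horizontal_zone(d, frame_width)}" for d in detections]
    return "Caution: " + ", ".join(parts) + "."


if __name__ == "__main__":
    frame = [Detection("car", 20, 180), Detection("person", 300, 380)]
    print(announcement(frame, frame_width=640))
```

In a full system the resulting string would be passed to a speech-synthesis model; that stage is left out here because the abstract gives no interface details for it.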
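Cite this article: Shi, N. and Wang, H. Design of Walking Assistance Software for Blind People Based on Deep Learning [J]. Software Engineering and Applications, 2022, 11(2): 320-329. https://doi.org/10.12677/SEA.2022.112034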
