Indoor Dynamic Visual SLAM Algorithm Based on an Improved YOLOv8n Object Detection Network
DOI: 10.12677/jsta.2025.131008
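Authors: Li Jiawei (李家伟), Weng Falu (翁发禄), Chen Weidong (陈伟东), Liang Chuanfu (梁传福): School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou, Jiangxi
Keywords: Visual SLAM, Object Detection, Indoor Dynamic Environment, Optical Flow Method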
Abstract: Simultaneous Localization and Mapping (SLAM) systems lose robustness and localization accuracy in dynamic indoor environments. To address this, an indoor dynamic visual SLAM algorithm based on an improved YOLOv8n object detection network is proposed. First, the YOLOv8n network is selected as the baseline; Ghost convolution replaces the original convolution and a small object detection layer is added, reducing the model size and improving detection speed. Second, the improved YOLOv8n network is combined with the Lucas-Kanade (LK) sparse optical flow method and introduced into the tracking thread of the visual SLAM system, where it identifies dynamic objects in the scene and filters out dynamic feature points. Finally, only static feature points are used for feature matching and pose estimation. Experimental results show that, on the dynamic sequences of the TUM dataset, the root mean square error (RMSE) of the absolute trajectory is reduced by an average of 96.62% compared with ORB-SLAM2, significantly improving system robustness and localization accuracy. Compared with systems such as DS-SLAM and DynaSLAM, the proposed system also strikes a more effective balance between detection speed and localization accuracy.
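The dynamic-point filtering step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the decision rule (compare each point's optical-flow displacement with the median displacement of background points), the function name `filter_static_points`, the `DYNAMIC_CLASSES` set, and the `flow_thresh` value are all assumptions made for illustration. The sketch also assumes that matched point positions in two consecutive frames and the detector's bounding boxes are already available.

```python
from math import hypot
from statistics import median

# Object classes treated as potentially dynamic (an assumption for this sketch).
DYNAMIC_CLASSES = {"person"}


def in_box(pt, box):
    """Return True if point (x, y) lies inside box (x1, y1, x2, y2)."""
    x, y = pt
    x1, y1, x2, y2 = box
    return x1 <= x <= x2 and y1 <= y <= y2


def filter_static_points(prev_pts, curr_pts, detections, flow_thresh=2.0):
    """Return the indices of feature points judged to be static.

    prev_pts, curr_pts: matched (x, y) positions in two consecutive frames,
        for example the output of a sparse LK optical flow tracker.
    detections: list of (class_name, (x1, y1, x2, y2)) for the current frame.
    flow_thresh: hypothetical pixel threshold on motion relative to background.
    """
    boxes = [box for name, box in detections if name in DYNAMIC_CLASSES]
    flows = [(cx - px, cy - py)
             for (px, py), (cx, cy) in zip(prev_pts, curr_pts)]
    inside = [any(in_box(c, b) for b in boxes) for c in curr_pts]

    # Points outside every dynamic box approximate camera-induced motion.
    background = [f for f, ins in zip(flows, inside) if not ins]
    if not background:
        # No background reference: conservatively reject all points,
        # since every one of them lies inside a dynamic box.
        return []
    bg_dx = median(f[0] for f in background)
    bg_dy = median(f[1] for f in background)

    static = []
    for i, ((dx, dy), ins) in enumerate(zip(flows, inside)):
        if not ins:
            static.append(i)  # outside dynamic boxes: keep
        elif hypot(dx - bg_dx, dy - bg_dy) <= flow_thresh:
            static.append(i)  # inside a box but moving with the background
    return static


prev = [(10, 10), (20, 20), (100, 100), (110, 110)]
curr = [(11, 10), (21, 20), (101, 100), (120, 110)]
dets = [("person", (90, 90, 130, 130)), ("chair", (0, 0, 30, 30))]
print(filter_static_points(prev, curr, dets))
```

In this toy input, the last point sits inside the person box and moves 10 pixels while the background moves 1 pixel, so it is the only point rejected. In a real pipeline the matched positions could come from a tracker such as OpenCV's `cv2.calcOpticalFlowPyrLK`, and the surviving indices would select the feature points passed on to matching and pose estimation.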
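Citation: Li, J., Weng, F., Chen, W. and Liang, C. (2025) Indoor Dynamic Visual SLAM Algorithm Based on an Improved YOLOv8n Object Detection Network. Journal of Sensor Technology and Application (传感器技术与应用), 13(1), 55-66. https://doi.org/10.12677/jsta.2025.131008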
