Research on Key Frame Selection Strategy in Visual Semantic SLAM
Abstract: Visual simultaneous localization and mapping (visual SLAM) is an important research area in computer science and a key technology for autonomous driving, environment perception, and robotics. In recent years, with the rapid development of deep learning, semantic segmentation, one of its core derivative technologies, has found a very wide range of applications by providing pixel-level image understanding. To combine semantic segmentation with visual SLAM and explore its application there, this paper builds on ORBSLAM2 and the SegNet semantic segmentation network and proposes a key frame selection strategy for semantic SLAM that meets the requirement of acquiring semantic information in real time. A semantic delay performance test shows that the improved strategy keeps the key frames whose semantic information is used close to the frame currently used by the other threads (the current tracking frame), and that its delay performance is better than that of the traditional sequential key frame selection strategy.
Cite this article: Xu, C. and Kuang, J. (2023) Research on Key Frame Selection Strategy in Visual Semantic SLAM. Computer Science and Application, 13(12), 2230-2235. https://doi.org/10.12677/CSA.2023.1312223
