A 3D Object Detection Algorithm Based on LiDAR and Camera Fusion for Night Scenes
DOI: 10.12677/airr.2025.144097
Author: Zebin Chen (陈泽彬), School of Automation, Guangdong University of Technology, Guangzhou, Guangdong
Keywords: 3D Object Detection, Night Scene, Multimodal Fusion, Deep Learning
Abstract: Object detection algorithms that fuse LiDAR and camera data have improved the perception performance of autonomous driving systems. However, previous multimodal algorithms were designed for daytime scenes; at night, their detection performance degrades because of low ambient brightness and overexposure caused by artificial lighting. To address this, the paper studies an improved multimodal detection algorithm for nighttime scenes. First, a statistics-based pixel mask module for overexposed regions is introduced, which analyzes pixel saturation characteristics to suppress interference from overexposed regions. Second, a channel-weighted fusion module based on weight normalization is introduced, which uses a channel-wise dynamic weight allocation mechanism to optimize the fusion of features from the different modalities. Experiments on the full nuScenes dataset and its nighttime subset show that the improved model reaches a mean average precision (mAP) of 74.5%, an improvement over the baseline method, which verifies the reliability of the algorithm and indicates good engineering value.
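The two modules named in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract gives no formulas, so the brightness rule (mean plus `k` standard deviations), the saturation cutoff `sat_max`, the use of softmax as the weight normalization, and all function and parameter names are assumptions made for illustration. Features are simplified to one scalar per channel rather than full feature maps.

```python
import colorsys
import math
import statistics


def overexposure_mask(image, k=1.0, sat_max=0.2):
    """Return a 0/1 mask for an image given as rows of (r, g, b) in [0, 1].

    A pixel is marked 0 (treated as overexposed) when it is both
    statistically bright (value >= mean + k * std over the frame) and
    nearly colorless (HSV saturation <= sat_max). Both rules are
    assumptions; the paper's exact criterion is not given in the abstract.
    """
    hsv = [[colorsys.rgb_to_hsv(*px) for px in row] for row in image]
    values = [v for row in hsv for (_, _, v) in row]
    thresh = statistics.fmean(values) + k * statistics.pstdev(values)
    return [
        [0 if (v >= thresh and s <= sat_max) else 1 for (_, s, v) in row]
        for row in hsv
    ]


def channel_weighted_fusion(lidar_feat, camera_feat, lidar_logits, camera_logits):
    """Fuse two per-channel feature vectors with normalized channel weights.

    For each channel, the two modality scores are passed through a softmax
    so the weights sum to 1, then the features are blended. In a real
    network the scores would be predicted from the features; here they are
    hypothetical inputs.
    """
    fused = []
    for fl, fc, al, ac in zip(lidar_feat, camera_feat, lidar_logits, camera_logits):
        m = max(al, ac)  # subtract the max for numerical stability
        el, ec = math.exp(al - m), math.exp(ac - m)
        wl = el / (el + ec)
        fused.append(wl * fl + (1.0 - wl) * fc)
    return fused


# A dark 2x2 frame with one blown-out white pixel (e.g. a headlight).
frame = [[(0.1, 0.1, 0.1), (0.1, 0.1, 0.1)],
         [(0.1, 0.1, 0.1), (1.0, 1.0, 1.0)]]
mask = overexposure_mask(frame)

# Equal scores give an even blend of the two modalities in every channel.
fused = channel_weighted_fusion([2.0, 4.0], [0.0, 0.0], [0.0, 0.0], [0.0, 0.0])
```

In this sketch the mask would be multiplied into the camera image or its features before fusion, so that overexposed pixels contribute nothing, while the per-channel weights let the fusion lean on LiDAR in channels where the camera signal is unreliable.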
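Citation: Chen, Z. A 3D Object Detection Algorithm Based on LiDAR and Camera Fusion for Night Scenes [J]. Artificial Intelligence and Robotics Research, 2025, 14(4): 1025-1033. https://doi.org/10.12677/airr.2025.144097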
