Research on Image-Guided Depth Completion Algorithms for Autonomous Driving
Abstract: Depth completion combines sparse depth measurements with RGB images to recover high-resolution dense depth maps, a capability that is crucial in autonomous driving. Most existing methods rely on spatial propagation, iteratively refining an initial dense depth estimate. That initial estimate, however, is usually produced by extracting features with conventional convolutions, which introduces convolution noise in regions without valid measurements and recovers object boundaries poorly. To address these problems, this paper proposes a Dual-branch Multi-scale Depth Completion Network (DM-Net) that follows a two-stage “interpolation + update” strategy. The interpolation branch introduces guided bilateral interpolation and combines it with submanifold sparse convolution and atrous spatial pyramid pooling (ASPP) to generate the initial dense depth map; it also incorporates a channel-spatial attention mechanism, the Convolutional Block Attention Module (CBAM), to weight cross-modal features dynamically. The update branch then refines the initial result with an iterative propagation mechanism, improving global consistency and the recovery of local detail. Experiments on the KITTI dataset show that the proposed network outperforms other mainstream methods.
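The two-stage “interpolation + update” idea can be illustrated with a small NumPy sketch. This is not the DM-Net implementation: the network learns its weights, whereas the sketch below assumes hand-crafted Gaussian weights, a grayscale guide image, and a 4-neighbour propagation step. The function names (`guided_bilateral_interpolate`, `propagate`) and all parameters (`radius`, `sigma_s`, `sigma_r`, `iters`) are hypothetical and chosen only for illustration.

```python
import numpy as np

def _shift(a, dy, dx, r, mode):
    """Return an array where output[y, x] = a[y + dy, x + dx], padding the borders."""
    h, w = a.shape
    p = np.pad(a, r, mode=mode)
    return p[r + dy:r + dy + h, r + dx:r + dx + w]

def guided_bilateral_interpolate(sparse, guide, radius=3, sigma_s=1.5, sigma_r=0.1):
    """Stage 1 (assumed form): fill every pixel with a weighted mean of the valid
    sparse depths in its window. Weights decay with spatial distance and with the
    guide-image difference, so samples across an image edge contribute little.
    Pixels with no valid depth (value 0) never contribute."""
    num = np.zeros(sparse.shape, dtype=float)
    den = np.zeros(sparse.shape, dtype=float)
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            d = _shift(sparse, dy, dx, radius, "constant")  # neighbour depth
            g = _shift(guide, dy, dx, radius, "edge")       # neighbour guide value
            w_s = np.exp(-(dy * dy + dx * dx) / (2 * sigma_s ** 2))
            w_r = np.exp(-((g - guide) ** 2) / (2 * sigma_r ** 2))
            wgt = w_s * w_r * (d > 0)                       # mask out empty pixels
            num += wgt * d
            den += wgt
    # Pixels with no usable neighbour stay 0 (still unknown).
    return np.where(den > 1e-8, num / np.maximum(den, 1e-8), 0.0)

def propagate(dense, sparse, guide, iters=5, sigma_r=0.1):
    """Stage 2 (assumed form): iteratively average each pixel with its 4 neighbours,
    using guide-image affinities, then re-impose the measured depths."""
    offsets = [(-1, 0), (1, 0), (0, -1), (0, 1)]
    aff = [np.exp(-((_shift(guide, dy, dx, 1, "edge") - guide) ** 2) / (2 * sigma_r ** 2))
           for dy, dx in offsets]
    total = 1.0 + sum(aff)          # self-weight of 1 plus neighbour affinities
    valid = sparse > 0
    out = dense.astype(float).copy()
    for _ in range(iters):
        acc = out.copy()
        for (dy, dx), a in zip(offsets, aff):
            acc += a * _shift(out, dy, dx, 1, "edge")
        out = acc / total
        out[valid] = sparse[valid]  # keep the sparse measurements fixed
    return out

# Toy scene: a vertical image edge separating a near surface (2 m) from a far one (10 m).
guide = np.zeros((8, 8))
guide[:, 4:] = 1.0
sparse = np.zeros((8, 8))
sparse[1, 1] = sparse[5, 2] = 2.0
sparse[2, 6] = sparse[6, 5] = 10.0

initial = guided_bilateral_interpolate(sparse, guide)
refined = propagate(initial, sparse, guide)
```

In this toy scene the guide term keeps the two depth levels from mixing across the image edge, while setting `sigma_r` very large (effectively disabling guidance) lets far-side samples bleed into near-side pixels. That contrast is the boundary problem the abstract attributes to image-agnostic initial estimates.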
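Cite this article: Qiu, Y. (仇毅) (2026) Research on Image-Guided Depth Completion Algorithms for Autonomous Driving. Artificial Intelligence and Robotics Research (人工智能与机器人研究), 15(1), 87-96. https://doi.org/10.12677/airr.2026.151010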
