Research on a Multi-Object Detection Algorithm Based on Roadside LiDAR
Abstract: Roadside point cloud object detection suffers from poor long-range accuracy in complex scenes and from high memory and computational overhead. Existing point-based detectors usually downsample the input point cloud with random sampling or farthest point sampling, which ignores the importance of foreground points. This paper proposes a point-based, single-stage 3D object detection algorithm for the roadside setting. The algorithm uses learnable, task-oriented, instance-aware downsampling to hierarchically select the foreground points of the objects of interest. In addition, a context center-aware module is introduced to estimate instance centers more precisely, which improves detection accuracy and reduces false detections. To validate its performance, experiments are conducted on DAIR-V2X-I, a roadside dataset for autonomous driving. The results show that the proposed algorithm improves detection accuracy over other publicly available algorithms. Compared with the single-stage detector 3DSSD, it raises AP on DAIR-V2X-I at moderate difficulty by 0.99%, 2.4%, and 4.7% for cars, pedestrians, and cyclists, respectively.
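The contrast the abstract draws between farthest point sampling and foreground-aware downsampling can be illustrated with a small toy sketch. It is not the paper's implementation: the scene, the function names, and in particular the per-point `scores` (a stand-in for whatever a learned network would predict) are all assumptions made for illustration.

```python
import math
import random


def farthest_point_sampling(points, k):
    """Pick k indices so that each new point is farthest from those already chosen."""
    chosen = [0]  # start from the first point (an arbitrary choice)
    # min_dist[i] = distance from point i to its nearest chosen point
    min_dist = [math.dist(p, points[0]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return chosen


def score_topk_sampling(scores, k):
    """Keep the k points with the highest (hypothetical) foreground scores."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]


def make_toy_scene(seed=0):
    """Build a synthetic scene: mostly ground points plus one small distant object."""
    rng = random.Random(seed)
    # 200 background points spread over a 100 m x 100 m ground patch
    background = [(rng.uniform(0, 100), rng.uniform(0, 100), 0.0) for _ in range(200)]
    # 10 foreground points clustered around one far-away object
    foreground = [(90 + rng.uniform(-1, 1), 90 + rng.uniform(-1, 1), 1.0) for _ in range(10)]
    points = background + foreground
    is_foreground = [False] * 200 + [True] * 10
    # Assumed stand-in for a learned per-point score: foreground scores higher
    scores = [0.8 + 0.2 * rng.random() if fg else 0.5 * rng.random() for fg in is_foreground]
    return points, is_foreground, scores


def foreground_kept(indices, is_foreground):
    """Count how many of the sampled indices are foreground points."""
    return sum(is_foreground[i] for i in indices)


if __name__ == "__main__":
    points, is_fg, scores = make_toy_scene()
    fps_idx = farthest_point_sampling(points, 10)
    topk_idx = score_topk_sampling(scores, 10)
    print("foreground kept by FPS:  ", foreground_kept(fps_idx, is_fg))
    print("foreground kept by top-k:", foreground_kept(topk_idx, is_fg))
```

In this toy scene, farthest point sampling spreads its budget over the whole ground patch, while score-based selection spends it on the small distant object. That only holds as far as the scores are reliable, which is why the quality of the learned foreground prediction matters.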
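Cite this article: Xu Zhengyao (许正尧). Research on a Multi-Object Detection Algorithm Based on Roadside LiDAR [J]. Artificial Intelligence and Robotics Research (人工智能与机器人研究), 2025, 14(6): 1424-1432. https://doi.org/10.12677/airr.2025.146133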
