[1] Ma, Y., Wang, T., Bai, X., Yang, H., Hou, Y., Wang, Y., et al. (2024) Vision-Centric BEV Perception: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46, 10978-10997.
[2] Wang, Y., Guizilini, V.C., Zhang, T., et al. (2022) DETR3D: 3D Object Detection from Multi-View Images via 3D-to-2D Queries. Proceedings of the 5th Conference on Robot Learning, London, 8-11 November 2021, 180-191.
[3] Liu, Y., Wang, T., Zhang, X. and Sun, J. (2022) PETR: Position Embedding Transformation for Multi-View 3D Object Detection. In: Lecture Notes in Computer Science, Springer, 531-548.
[4] Chen, Z., Li, Z., Zhang, S., Fang, L., Jiang, Q. and Zhao, F. (2022) Graph-DETR3D: Rethinking Overlapping Regions for Multi-View 3D Object Detection. Proceedings of the 30th ACM International Conference on Multimedia, Lisboa, 10-14 October 2022, 5999-6008.
[5] Peng, L., Chen, Z., Fu, Z., Liang, P. and Cheng, E. (2023) BEVSegFormer: Bird’s Eye View Semantic Segmentation from Arbitrary Camera Rigs. 2023 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV), Waikoloa, 2-7 January 2023, 5924-5932.
[6] Reinauer, R., Caorsi, M. and Berkouk, N. (2021) Persformer: A Transformer Architecture for Topological Machine Learning.
[7] Li, Z., Wang, W., Li, H., Xie, E., Sima, C., Lu, T., et al. (2025) BEVFormer: Learning Bird’s-Eye-View Representation from LiDAR-Camera via Spatiotemporal Transformers. IEEE Transactions on Pattern Analysis and Machine Intelligence, 47, 2020-2036.
[8] Liu, Y., Yan, J., Jia, F., Li, S., Gao, A., Wang, T., et al. (2023) PETRv2: A Unified Framework for 3D Perception from Multi-Camera Images. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, 1-6 October 2023, 3239-3249.
[9] Dosovitskiy, A., Beyer, L., Kolesnikov, A., et al. (2020) An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale.
[10] Hu, H., Wang, F., Su, J., et al. (2023) EA-LSS: Edge-Aware Lift-Splat-Shot Framework for 3D BEV Object Detection.
[11] Simonyan, K. and Zisserman, A. (2014) Very Deep Convolutional Networks for Large-Scale Image Recognition.
[12] Huang, J., Huang, G., Zhu, Z., et al. (2021) BEVDet: High-Performance Multi-Camera 3D Object Detection in Bird-Eye-View.
[13] Daubechies, I., DeVore, R., Foucart, S., Hanin, B. and Petrova, G. (2021) Nonlinear Approximation and (Deep) ReLU Networks. Constructive Approximation, 55, 127-172.
[14] Gürbüz, Y.Z., Şener, O. and Alatan, A.A. (2023) Generalized Sum Pooling for Metric Learning. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), Paris, 1-6 October 2023, 5439-5450.
[15] Zou, Z., Chen, K., Shi, Z., Guo, Y. and Ye, J. (2023) Object Detection in 20 Years: A Survey. Proceedings of the IEEE, 111, 257-276.
[16] Hao, S., Zhou, Y. and Guo, Y. (2020) A Brief Survey on Semantic Segmentation with Deep Learning. Neurocomputing, 406, 302-321.
[17] Chen, X., Yang, C., Mo, J., Sun, Y., Karmouni, H., Jiang, Y., et al. (2024) CSPNeXt: A New Efficient Token Hybrid Backbone. Engineering Applications of Artificial Intelligence, 132, Article 107886.
[18] Lin, T., Dollár, P., Girshick, R., He, K., Hariharan, B. and Belongie, S. (2017) Feature Pyramid Networks for Object Detection. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, 21-26 July 2017, 936-944.
[19] Caesar, H., Bankiti, V., Lang, A.H., Vora, S., Liong, V.E., Xu, Q., et al. (2020) nuScenes: A Multimodal Dataset for Autonomous Driving. 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), Seattle, 13-19 June 2020, 11618-11628.
[20] Zhang, Z., Schwing, A.G., Fidler, S. and Urtasun, R. (2015) Monocular Object Instance Segmentation and Depth Ordering with CNNs. 2015 IEEE International Conference on Computer Vision (ICCV), Santiago, 7-13 December 2015, 2614-2622.
[21] Zhou, X., Wang, D. and Krähenbühl, P. (2019) Objects as Points.
[22] Wang, T., Zhu, X., Pang, J. and Lin, D. (2021) FCOS3D: Fully Convolutional One-Stage Monocular 3D Object Detection. 2021 IEEE/CVF International Conference on Computer Vision Workshops (ICCVW), Montreal, 11-17 October 2021, 913-922.