A Survey on Machine Learning for Autonomous Driving Technologies
Abstract: Machine learning enables intelligent agents to learn from data and generalize, and thereby to understand previously unseen scenarios. As a key enabler of autonomous driving, it is shifting system design from hand-crafted rules toward data-driven methods, and it is an important route to raising vehicle intelligence and achieving high-level autonomous driving. To clarify how machine learning has developed and is currently applied in this field, this survey takes the way machine learning advances the optimization of autonomous driving systems as its main thread and analyzes two core technical routes: from manual design to data-driven approaches, and from modular architectures to end-to-end frameworks. It reviews in detail the research and application status of machine learning in each part of the system, summarizes current limitations of autonomous driving systems in areas such as decision interpretability and the handling of corner cases, and describes the advantages that frontier techniques, represented by world models and large models, show in complex traffic environments. Finally, it identifies key pathways for future progress in autonomous driving technology and offers ideas for realizing higher-level autonomous driving.
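Cite this article: Yang Qianyuan, Wen Jiayan. A Survey on Machine Learning for Autonomous Driving Technologies [J]. Modeling and Simulation (建模与仿真), 2026, 15(4): 55-71. https://doi.org/10.12677/mos.2026.154053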
