Heuristic Reinforcement Learning-Based Multi-Objective Speed Coordination for Autonomous Vehicles
DOI: 10.12677/csa.2025.159224
Author: Qi Li (李琦), Shanghai SEARI Intelligent System Co., Ltd. (上海电科智能系统股份有限公司), Shanghai
Keywords: Car Following, Velocity Control, Reinforcement Learning, Heuristic
Abstract: Autonomous driving technology has advanced rapidly in recent years, yet achieving safe, efficient, and comfortable speed control remains a key challenge. This paper proposes a car-following control model based on deep reinforcement learning, specifically the Deep Deterministic Policy Gradient (DDPG) algorithm. The reinforcement learning output is constrained by the Intelligent Driver Model (IDM), and the reward function combines safety, efficiency, and comfort objectives. The model was trained and tested on 1341 car-following events extracted from the Next Generation Simulation (NGSIM) dataset, and it was compared against a DDPG algorithm without the IDM constraint to evaluate its performance. The results suggest that the method can support the development of better autonomous driving systems, has some practical value, and can serve as a reference for autonomous driving system development.
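The idea of constraining a reinforcement learning output with IDM can be illustrated with a minimal sketch. The IDM acceleration formula below follows the standard form from Treiber et al. (2000). Everything else is an assumption made for illustration and is not taken from the paper: the parameter values, the function names `idm_acceleration` and `constrain_action`, and the choice to clip the learned action to a fixed band around the IDM acceleration. The abstract does not specify how the constraint is actually applied.

```python
import math


def idm_acceleration(v, v_lead, gap, v0=33.3, T=1.5, s0=2.0,
                     a_max=1.0, b=2.0, delta=4.0):
    """Standard IDM acceleration (m/s^2).

    v, v_lead: follower and leader speeds (m/s); gap: bumper-to-bumper gap (m).
    The default parameter values are illustrative assumptions only.
    """
    dv = v - v_lead  # approach rate; positive when closing in on the leader
    # Desired dynamic gap s*
    s_star = s0 + max(0.0, v * T + v * dv / (2.0 * math.sqrt(a_max * b)))
    return a_max * (1.0 - (v / v0) ** delta - (s_star / gap) ** 2)


def constrain_action(a_rl, a_idm, band=1.0):
    """Hypothetical constraint: clip the learned acceleration a_rl
    to the interval [a_idm - band, a_idm + band]."""
    return min(max(a_rl, a_idm - band), a_idm + band)


if __name__ == "__main__":
    a_idm = idm_acceleration(v=20.0, v_lead=18.0, gap=30.0)
    a_policy = 2.5  # stand-in for a DDPG actor output
    print(constrain_action(a_policy, a_idm))
```

In this sketch the band width controls how far the learned policy may deviate from the heuristic. A narrow band keeps behavior close to IDM, while a wide band leaves more freedom to the learned policy.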
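Cite this article: Li, Q. (2025) Heuristic Reinforcement Learning-Based Multi-Objective Speed Coordination for Autonomous Vehicles. Computer Science and Application, 15(9), 63-72. https://doi.org/10.12677/csa.2025.159224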
