A Reinforcement Learning Model for Autonomous Lane Change Decision Based on Risk Assessment

doi:10.12677/orf.2025.155227
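Authors: Li Qi (李琦), Shanghai SEARI Intelligent System Co., Ltd., Shanghai; Zhou Lulu (周鲁露), Traffic Management Corps, Shanghai Municipal Public Security Bureau, Shanghai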
Keywords: Autonomous Driving; Lane-Change Decision; Experience Replay; Deep Reinforcement Learning

Abstract: As autonomous driving technology develops rapidly, safe and efficient lane-change decision-making by the agent has become a core challenge for improving driving safety and traffic efficiency. Existing lane-change decision methods based on deep reinforcement learning often neglect dynamic microscopic information such as the driving styles of surrounding vehicles, and they rely on single-objective reward functions, which leads to insufficient decision robustness and an unstable training process. To address these problems, this paper takes the deep Q-network (DQN) algorithm with experience replay as its base architecture and proposes a lane-change decision optimization method for autonomous driving that incorporates driving-style awareness. The method makes two core contributions. First, it moves beyond the traditional DQN's reliance on macroscopic kinematic information alone: by quantifying the driving styles of neighboring vehicles (for example, aggressive or conservative) and constructing a risk coefficient, it forms a risk state (Risk) representation that incorporates microscopic driving style, which improves the accuracy with which the agent perceives risk in a dynamic environment. Second, to counter the decision bias caused by a single reward objective, it designs a multi-objective reward function that combines safety, efficiency, and rule compliance, and adjusts the weights to guide the agent toward a balanced driving policy. The experience replay mechanism, in turn, keeps the training process stable. To verify its performance, the proposed algorithm is compared with the traditional DQN and Double DQN algorithms on the SUMO simulation platform. The results show that the proposed algorithm has significant advantages in key metrics including lane-change success rate, collision rate, and average traffic efficiency, providing a safer and more efficient solution for agent decision-making in autonomous driving scenarios.
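The abstract names three mechanisms (a style-aware risk coefficient, a weighted multi-objective reward, and experience replay) but gives no formulas for them. The Python sketch below is only an illustration of how such pieces might fit together, not the authors' implementation. Every name and number in it is an assumption: the `STYLE_FACTOR` multipliers, the inverse time-to-collision form of `risk_coefficient`, the three reward terms, and the default weights are all hypothetical.

```python
import random
from collections import deque

# Hypothetical style multipliers. The abstract mentions aggressive and
# conservative styles but does not state how they are quantified.
STYLE_FACTOR = {"aggressive": 1.5, "normal": 1.0, "conservative": 0.7}


def risk_coefficient(gap_m: float, closing_speed_mps: float, style: str) -> float:
    """Assumed form: inverse time-to-collision scaled by style, clipped to [0, 1]."""
    if closing_speed_mps <= 0.0:
        return 0.0  # the neighbor is not approaching, so no closing risk
    inverse_ttc = closing_speed_mps / max(gap_m, 0.1)
    return min(1.0, STYLE_FACTOR[style] * inverse_ttc)


def reward(risk: float, speed_mps: float, speed_limit_mps: float,
           rule_violation: bool, weights=(0.5, 0.3, 0.2)) -> float:
    """Assumed weighted sum of safety, efficiency, and rule-compliance terms."""
    w_safe, w_eff, w_rule = weights
    safety = -risk                                      # penalize risky states
    efficiency = min(speed_mps / speed_limit_mps, 1.0)  # reward progress
    compliance = -1.0 if rule_violation else 0.0        # penalize violations
    return w_safe * safety + w_eff * efficiency + w_rule * compliance


class ReplayBuffer:
    """Fixed-size transition store sampled uniformly at random."""

    def __init__(self, capacity: int):
        self.buffer = deque(maxlen=capacity)  # oldest transitions drop out first

    def push(self, state, action, reward_value, next_state, done):
        self.buffer.append((state, action, reward_value, next_state, done))

    def sample(self, batch_size: int):
        return random.sample(list(self.buffer), batch_size)

    def __len__(self):
        return len(self.buffer)


# Usage: the risk value is appended to a kinematic state before storage.
risk = risk_coefficient(gap_m=20.0, closing_speed_mps=5.0, style="aggressive")
state = (25.0, 20.0, risk)  # hypothetical (ego speed, gap, risk) state
buffer = ReplayBuffer(capacity=1000)
buffer.push(state, 1, reward(risk, 25.0, 30.0, False), state, False)
```

Under these assumed forms, an aggressive neighbor at the same gap and closing speed yields a higher risk value than a conservative one, which in turn lowers the safety term of the reward. Sampling minibatches uniformly from the buffer is the standard way experience replay breaks the correlation between consecutive transitions.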

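Cite this article: Li Qi, Zhou Lulu. A Reinforcement Learning Model for Autonomous Lane Change Decision Based on Risk Assessment [J]. Operations Research and Fuzziology, 2025, 15(5): 13-25. https://doi.org/10.12677/orf.2025.155227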

