|
[1]
|
Yang, L., Qi, J.T., Xiao, J. and Yong, X. (2014) A Literature Review of UAV 3D Path Planning. Proceeding of the 11th World Congress on Intelligent Control and Automation, Shenyang, 29 June-4 July 2014, 2376-2381. [Google Scholar] [CrossRef]
|
|
[2]
|
聂虹宇, 张广玉, 李德才, 等. 多旋翼无人机的环境感知与运动规划方法综述[J]. 信息与控制, 2025, 54(3): 353-371.
|
|
[3]
|
李晓辉, 苗苗, 冉保健, 等. 基于改进A*算法的无人机避障路径规划[J]. 计算机系统应用, 2021, 30(2): 255-259.
|
|
[4]
|
李亚飞, 赵瑞. 城市复杂环境下多目标无人机路径规划研究[J]. 南京航空航天大学学报, 2024, 56(6): 1002-1012.
|
|
[5]
|
Huang, Y., Li, H., Dai, Y., Lu, G. and Duan, M. (2024) A 3D Path Planning Algorithm for UAVs Based on an Improved Artificial Potential Field and Bidirectional RRT. Drones, 8, Article 760. [Google Scholar] [CrossRef]
|
|
[6]
|
曾国奇, 赵民强, 刘方圆, 等. 基于网格PRM的无人机多约束航路规划[J]. 系统工程与电子技术, 2016, 38(10): 2310-2316.
|
|
[7]
|
Tripicchio, P., Unetti, M., D’Avella, S. and Avizzano, C.A. (2023) Smooth Coverage Path Planning for UAVs with Model Predictive Control Trajectory Tracking. Electronics, 12, Article 2310. [Google Scholar] [CrossRef]
|
|
[8]
|
Azar, A.T., Koubaa, A., Ali Mohamed, N., Ibrahim, H.A., Ibrahim, Z.F., Kazim, M., et al. (2021) Drone Deep Reinforcement Learning: A Review. Electronics, 10, Article 999. [Google Scholar] [CrossRef]
|
|
[9]
|
Sun, H., Zhang, W., Yu, R. and Zhang, Y. (2021) Motion Planning for Mobile Robots—Focusing on Deep Reinforcement Learning: A Systematic Review. IEEE Access, 9, 69061-69081. [Google Scholar] [CrossRef]
|
|
[10]
|
Zhu, K. and Zhang, T. (2021) Deep Reinforcement Learning Based Mobile Robot Navigation: A Review. Tsinghua Science and Technology, 26, 674-691. [Google Scholar] [CrossRef]
|
|
[11]
|
熊斯, 李逸琛, 欧阳权, 等. 基于强化学习的无人机集群航迹规划研究综述[J]. 空间电子技术, 2025, 22(6): 1-8, 123.
|
|
[12]
|
Tanimoto, Y. and Fukumizu, K. (2024) State-Separated SARSA: A Practical Sequential Decision-Making Algorithm with Recovering Rewards. arXiv: 2403.11520.
|
|
[13]
|
许振阳, 陈谋, 韩增亮, 等. 复杂环境下基于TCPDQN算法的低空飞行器动态航路规划[J]. 机器人, 2025, 47(3): 383-393.
|
|
[14]
|
Watkins, C.J. and Watkins, P. (1989) Learning from Delayed Rewards. Ph.D. Thesis, King’s College.
|
|
[15]
|
张泽华, 杨波, 傅广, 等. 基于SARSA的动态蜂群算法求解作业车间调度问题[J]. 组合机床与自动化加工技术, 2023(6): 188-192.
|
|
[16]
|
陈一波, 赵知劲. 基于SARSA学习的跳频系统智能抗干扰决策算法[J]. 现代电子技术, 2023, 46(1): 31-35.
|
|
[17]
|
司彦娜, 普杰信, 于晓升, 等. 基于径向基神经网络的多步SARSA控制算法[J]. 控制与决策, 2023, 38(4): 944-950.
|
|
[18]
|
黄鑫, 张志佳, 孙平, 等. 基于深度强化学习的路径规划算法综述[J]. 机器人, 2026, 48(1): 196-216.
|
|
[19]
|
于天浩, 周航, 贾鑫悦, 等. 基于改进DQN算法的无人机路径规划算法研究[J]. 航空计算技术, 2025, 55(6): 59-63, 79.
|
|
[20]
|
王艺霖, 张烈平, 尹亚梦, 等. 基于改进DDQN的移动机器人路径规划算法[J]. 桂林航天工业学院学报, 2025, 30(5): 770-783.
|
|
[21]
|
Van Hasselt, H., Guez, A. and Silver, D. (2016) Deep Reinforcement Learning with Double Q-Learning. Proceedings of the AAAI Conference on Artificial Intelligence, 30, 2094-2100. [Google Scholar] [CrossRef]
|
|
[22]
|
Wang, Z., Schaul, T., Hessel, M., et al. (2016) Dueling Network Architectures for Deep Reinforcement Learning. Inter-national Conference on Machine Learning, New York, 19-24 June 2016, 1995-2003.
|
|
[23]
|
苏江玉. 基于深度强化学习的USV路径规划算法研究[D]: [硕士学位论文]. 哈尔滨: 哈尔滨工程大学, 2023.
|
|
[24]
|
武曲, 张义, 郭坤, 等. 基于DPES Dueling DQN的路径规划方法研究[J]. 计算机应用与软件, 2023, 40(6): 147-153, 233.
|
|
[25]
|
Xu, Y., Wei, Y., Wang, D., Jiang, K. and Deng, H. (2023) Multi-UAV Path Planning in GPS and Communication Denial Environment. Sensors, 23, Article 2997. [Google Scholar] [CrossRef] [PubMed]
|
|
[26]
|
Schulman, J., Levine, S., Abbeel, P., et al. (2015) Trust Region Policy Optimization. International Conference on Ma-chine Learning, Lille, 6-11 July 2015, 1889-1897.
|
|
[27]
|
万宇航, 朱子璐, 钟春富, 等. 基于改进PPO算法的机械臂动态路径规划[J]. 系统仿真学报, 2025, 37(6): 1462-1473.
|
|
[28]
|
程浩鹏, 朱涵, 杨高奇, 等. 深度强化学习及智能路径规划应用综述[J]. 现代计算机, 2022, 28(21): 1-10.
|
|
[29]
|
Barto, A.G., Sutton, R.S. and Anderson, C.W. (1983) Neuronlike Adaptive Elements That Can Solve Difficult Learning Control Problems. IEEE Transactions on Systems, Man, and Cybernetics, 13, 834-846. [Google Scholar] [CrossRef]
|
|
[30]
|
Lillicrap, T.P., Hunt, J.J., Pritzel, A., et al. (2016) Continuous Control with Deep Reinforcement Learning. International Conference on Learning Representations, San Juan, 2-4 May 2016.
|
|
[31]
|
Silver, D., Lever, G., Heess, N., et al. (2014) Deterministic Policy Gradient Algorithms. International Conference on Machine Learning, Beijing, 21-26 June 2014, 387-395.
|
|
[32]
|
王树森. 深度强化学习[M]. 北京: 人民邮电出版社, 2022.
|
|
[33]
|
Fujimoto, S., Hoof, H. and Meger, D. (2018) Addressing Function Approximation Error in Actor-Critic Methods. International Conference on Machine Learning, Stockholm, 10-15 July 2018, 1587-1596.
|
|
[34]
|
Mnih, V., Badia, A.P., Mirza, M., et al. (2016) Asynchronous Methods for Deep Reinforcement Learning. International Conference on Machine Learning, New York, 19-24 June 2016, 1928-1937.
|
|
[35]
|
Haarnoja, T., Zhou, A., Abbeel, P., et al. (2018) Soft Actor-Critic: Off-Policy Maximum Entropy Deep Reinforcement Learning with a Stochastic Actor. International Conference on Machine Learning, Stockholm, 10-15 July 2018, 1861-1870.
|
|
[36]
|
周明鑫. 基于强化学习的多智能体自主任务分配[D]: [硕士学位论文]. 哈尔滨: 哈尔滨工程大学, 2022.
|
|
[37]
|
Wu, J., Sun, Y., Li, D., Shi, J., Li, X., Gao, L., et al. (2023) An Adaptive Conversion Speed Q-Learning Algorithm for Search and Rescue UAV Path Planning in Unknown Environments. IEEE Transactions on Vehicular Technology, 72, 15391-15404. [Google Scholar] [CrossRef]
|
|
[38]
|
Saeed, R.A., Ali, E.S., Abdelhaq, M., Alsaqour, R., Ahmed, F.R.A. and Saad, A.M.E. (2024) Energy Efficient Path Planning Scheme for Unmanned Aerial Vehicle Using Hybrid Generic Algorithm-Based Q-Learning Optimization. IEEE Access, 12, 13400-13417. [Google Scholar] [CrossRef]
|
|
[39]
|
王现磊, 郝文宁, 陈刚, 等. 基于模拟退火策略的SARSA强化学习方法[J]. 计算机仿真, 2019, 36(4): 219-222, 228.
|
|
[40]
|
Chao, Y., Dillmann, R., Roennau, A. and Xiong, Z. (2024) E-DQN-Based Path Planning Method for Drones in Airsim Simulator under Unknown Environment. Biomimetics, 9, Article 238. [Google Scholar] [CrossRef] [PubMed]
|
|
[41]
|
Zhu, Y., Tan, Y., Chen, Y., Chen, L. and Lee, K.Y. (2024) UAV Path Planning Based on Random Obstacle Training and Linear Soft Update of DRL in Dense Urban Environment. Energies, 17, Article 2762. [Google Scholar] [CrossRef]
|
|
[42]
|
Jiang, W., Bao, C., Xu, G. and Wang, Y. (2021) Research on Autonomous Obstacle Avoidance and Target Tracking of UAV Based on Improved Dueling DQN Algorithm. 2021 China Automation Congress (CAC), Beijing, 22-24 October 2021, 5110-5115. [Google Scholar] [CrossRef]
|
|
[43]
|
Qi, C., Wu, C., Lei, L., Li, X. and Cong, P. (2022) UAV Path Planning Based on the Improved PPO Algorithm. 2022 Asia Conference on Advanced Robotics, Automation, and Control Engineering (ARACE), Qingdao, 26-28 August 2022, 193-199. [Google Scholar] [CrossRef]
|
|
[44]
|
Tian, S., Li, Y., Zhang, X., Zheng, L., Cheng, L., She, W., et al. (2024) Fast UAV Path Planning in Urban Environments Based on Three-Step Experience Buffer Sampling DDPG. Digital Communications and Networks, 10, 813-826. [Google Scholar] [CrossRef]
|
|
[45]
|
牟文心, 时宏伟. 基于改进TD3算法的无人机轨迹规划[J]. 计算机系统应用, 2024, 33(12): 197-209.
|
|
[46]
|
Zhao, F.Y., Li, D.Y., Wang, Z.X., Mao, J.L. and Wang, N.Y. (2024) Autonomous Localized Path Planning Algorithm for UAVs Based on TD3 Strategy. Scientific Reports, 14, Article No. 763. [Google Scholar] [CrossRef] [PubMed]
|
|
[47]
|
Zhou, Y., Shu, J., Hao, H., Song, H. and Lai, X. (2023) UAV 3D Online Track Planning Based on Improved SAC Algorithm. Journal of the Brazilian Society of Mechanical Sciences and Engineering, 46, Article No. 12. [Google Scholar] [CrossRef]
|
|
[48]
|
赵天隆, 陈龙胜, 张存富, 等. 融合强化学习与改进人工势场的无人机编队路径规划[J]. 航空兵器, 2025, 32(5): 54-63.
|
|
[49]
|
Wang, W., Zhang, G., Da, Q. and Tian, Y. (2024) Path Planning with Improved Dueling DQN Algorithm for UAVs in Unknown Dynamic Environment. In: Li, S., Ed., Computational and Experimental Simulations in Engineering, Springer, 453-465. [Google Scholar] [CrossRef]
|
|
[50]
|
Zhang, Y., Ding, M., Yuan, Y., Zhang, J., Yang, Q., Shi, G., et al. (2024) Multi-UAV Cooperative Pursuit of a Fast-Moving Target UAV Based on the GM-TD3 Algorithm. Drones, 8, Article 557. [Google Scholar] [CrossRef]
|
|
[51]
|
Qiao, B., Jia, Z., Xiao, B. and Qian, H. (2025) Game Maneuver Decision-Making for Multi-UAV via PPO-A3C-PER Learning Method. In: Yan, L., Duan, H. and Deng, Y., Eds., Advances in Guidance, Navigation and Control, Springer, 72-81. [Google Scholar] [CrossRef]
|
|
[52]
|
陈麒杰, 晋玉强, 韩露. 无人机路径规划算法研究综述[J]. 飞航导弹, 2020(5): 54-58.
|