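[1] Li, R.Y., Peng, H.M., Li, R.G. and Zhao, K. (2020) Survey on Reinforcement Learning Algorithms and Applications. Computer Systems & Applications, 29(12), 13-25. (In Chinese)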
[2] Sun, Y., Cao, L., Chen, X.L., Xu, Z.X., et al. (2020) Survey of Multi-Agent Deep Reinforcement Learning. Computer Engineering and Applications, 56(5), 13-24. (In Chinese)
[3] Bellman, R. (1956) Dynamic Programming and Lagrange Multipliers. Proceedings of the National Academy of Sciences, 42, 767-769.
[4] Sutton, R.S. (1988) Learning to Predict by the Methods of Temporal Differences. Machine Learning, 3, 9-44.
[5] Watkins, C.J.C.H. and Dayan, P. (1992) Technical Note: Q-Learning. Machine Learning, 8, 279-292.
[6] Rummery, G.A. and Niranjan, M. (1994) On-Line Q-Learning Using Connectionist Systems. Technical Report, 1-7.
[7] Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., et al. (2015) Human-Level Control through Deep Reinforcement Learning. Nature, 518, 529-533.
[8] Mnih, V., Kavukcuoglu, K., Silver, D., et al. (2013) Playing Atari with Deep Reinforcement Learning.
[9] Silver, D., Huang, A., Maddison, C.J., Guez, A., Sifre, L., van den Driessche, G., et al. (2016) Mastering the Game of Go with Deep Neural Networks and Tree Search. Nature, 529, 484-489.
[10] Du, W. and Ding, S.F. (2019) Overview on Multi-Agent Reinforcement Learning. Computer Science, 46(8), 1-8. (In Chinese)
[11] Sunehag, P., Lever, G., Gruslys, A., Czarnecki, W.M., Zambaldi, V., Jaderberg, M., et al. (2018) Value-Decomposition Networks for Cooperative Multi-Agent Learning Based on Team Reward. International Joint Conference on Autonomous Agents and Multiagent Systems, Stockholm, 10-15 July 2018, 2085-2087.
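[12] Li, S.X. (2021) Research on Key Technologies and Applications of Multi-Agent Cooperation Based on Reinforcement Learning. Doctoral Dissertation, PLA Strategic Support Force Information Engineering University, Zhengzhou. (In Chinese)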
[13] Rashid, T., Samvelyan, M., Witt, C.S., et al. (2018) QMIX: Monotonic Value Function Factorization for Deep Multi-Agent Reinforcement Learning. International Conference on Machine Learning, Stockholm, 10-15 July 2018, 4292-4301.
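[14] Wu, H.L. (2021) Research on Flight Conflict Resolution Strategies Based on Cooperative Multi-Agent Reinforcement Learning. Doctoral Dissertation, Sichuan University, Sichuan. (In Chinese)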