|
[1]
|
Mu, S.M., Chu, T.G. and Wang, L. (2005) Coordinated Collective Motion in a Motile Particle Group with a Leader. Physica A: Statistical Mechanics & Its Applications, 351, 211-226.
|
|
[2]
|
Nash, J.F. (1953) Two-Person Cooperative Games. Econometrica, 21, 128-140.
|
|
[3]
|
Nash, J.F. (1951) Non-Cooperative Games. Annals of Mathematics, 54, 286-295.
|
|
[4]
|
Starr, A.W. and Ho, Y.C. (1969) Nonzero-Sum Differential Games. Journal of Optimization Theory and Applications, 3, 184-206.
|
|
[5]
|
Vamvoudakis, K.G. and Lewis, F.L. (2011) Multi-Player Non-Zero-Sum Games: Online Adaptive Learning Solution of Coupled Hamilton-Jacobi Equations. Automatica, 47, 1556-1569.
|
|
[6]
|
Yang, D.S., Pang, Y.H. and Zhou, B.W. (2019) Fault Diagnosis for Energy Internet Using Correlation Processing-Based Convolutional Neural Networks. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 49, 1739-1748.
|
|
[7]
|
Yang, X.F. and Gao, J.W. (2016) Linear-Quadratic Uncertain Differential Game with Application to Resource Extraction Problem. IEEE Transactions on Fuzzy Systems, 24, 819-826.
|
|
[8]
|
Hong, Y.G., Hu, J.P. and Gao, L.X. (2006) Tracking Control for Multi-Agent Consensus with an Active Leader and Variable Topology. Automatica, 42, 1177-1182.
|
|
[9]
|
Ren, W., Moore, K.L. and Chen, Y.Q. (2007) High-Order and Model Reference Consensus Algorithms in Cooperative Control of Multivehicle Systems. Journal of Dynamic Systems, Measurement, and Control, 129, 678-688.
|
|
[10]
|
Freiling, G., Jank, G. and Abou-Kandil, H. (1996) On Global Existence of Solutions to Coupled Matrix Riccati Equations in Closed-Loop Nash Games. IEEE Transactions on Automatic Control, 41, 264-269.
|
|
[11]
|
Abu-Khalaf, M., Lewis, F.L. and Huang, J. (2006) Policy Iterations on the Hamilton-Jacobi-Isaacs Equation for H∞ State Feedback Control with Input Saturation. IEEE Transactions on Automatic Control, 51, 1989-1995.
|
|
[12]
|
Lewis, F.L. and Vrabie, D. (2009) Reinforcement Learning and Adaptive Dynamic Programming for Feedback Control. IEEE Circuits & Systems Magazine, 9, 32-50.
|
|
[13]
|
He, H.B., Ni, Z. and Fu, J. (2012) A Three-Network Architecture for On-Line Learning and Optimization Based on Adaptive Dynamic Programming. Neurocomputing, 78, 3-13.
|
|
[14]
|
Dierks, T. and Jagannathan, S. (2012) Online Optimal Control of Affine Nonlinear Discrete-Time Systems with Unknown Internal Dynamics by Using Time-Based Policy Update. IEEE Transactions on Neural Networks & Learning Systems, 23, 1118-1129.
|
|
[15]
|
Wei, Q.L., Wang, F.Y. and Liu, D.R. (2014) Finite-Approximation-Error-Based Discrete-Time Iterative Adaptive Dynamic Programming. IEEE Transactions on Cybernetics, 44, 2820-2833.
|
|
[16]
|
Ni, Z., He, H.B. and Zhao, D.B. (2015) GrDHP: A General Utility Function Representation for Dual Heuristic Dynamic Programming. IEEE Transactions on Neural Networks & Learning Systems, 26, 614-627.
|
|
[17]
|
Wei, Q.L., Liu, D.R. and Lin, H.Q. (2016) Value Iteration Adaptive Dynamic Programming for Optimal Control of Discrete-Time Nonlinear Systems. IEEE Transactions on Cybernetics, 46, 840-853.
|
|
[18]
|
Gao, W.N. and Jiang, Z.P. (2016) Adaptive Dynamic Programming and Adaptive Optimal Output Regulation of Linear Systems. IEEE Transactions on Automatic Control, 61, 4164-4169.
|
|
[19]
|
Zhang, H.G., Liang, H.J. and Wang, Z.S. (2017) Optimal Output Regulation for Heterogeneous Multiagent Systems via Adaptive Dynamic Programming. IEEE Transactions on Neural Networks & Learning Systems, 28, 18-29.
|
|
[20]
|
Yang, Y.L., Wunsch, D. and Yin, Y.X. (2017) Hamiltonian-Driven Adaptive Dynamic Programming for Continuous Nonlinear Dynamical Systems. IEEE Transactions on Neural Networks & Learning Systems, 28, 1929-1940.
|
|
[21]
|
Sun, J.L. and Long, T. (2020) Event-Triggered Distributed Zero-Sum Differential Game for Nonlinear Multi-Agent Systems Using Adaptive Dynamic Programming. ISA Transactions, 110, 39-52.
|
|
[22]
|
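Luo, A., Xiao, W.B., Zhou, Q., et al. (2021) Reinforcement-Learning-Based Optimal Control for a Class of Nonlinear Systems with Input Constraints. Control Theory & Applications, Online. (In Chinese)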
|
|
[23]
|
Zhu, Y.H., Zhao, D.B. and Li, X.J. (2017) Iterative Adaptive Dynamic Programming for Solving Unknown Nonlinear Zero-Sum Game Based on Online Data. IEEE Transactions on Neural Networks & Learning Systems, 28, 714-725.
|
|
[24]
|
Yasini, S., Sistani, M.B. and Karimpour, A. (2014) Approximate Dynamic Programming for Two-Player Zero-Sum Game Related to H∞ Control of Unknown Nonlinear Continuous-Time Systems. International Journal of Control, Automation and Systems, 13, 99-109.
|
|
[25]
|
Song, R. and Zhu, L. (2019) Stable Value Iteration for Two-Player Zero-Sum Game of Discrete-Time Nonlinear Systems Based on Adaptive Dynamic Programming. Neurocomputing, 340, 180-195.
|
|
[26]
|
Vamvoudakis, K.G., Safaei, F.R.P. and Hespanha, J.P. (2019) Robust Event-Triggered Output Feedback Learning Algorithm for Voltage Source Inverters with Unknown Load and Parameter Variations. International Journal of Robust and Nonlinear Control, 29, 3502-3517.
|
|
[27]
|
Yang, D.S., Li, T. and Zhang, H.G. (2019) Event-Trigger-Based Robust Control for Nonlinear Constrained-Input Systems Using Reinforcement Learning Method. Neurocomputing, 340, 158-170.
|
|
[28]
|
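Zhang, Z.Y. and Zhao, X.Y. (2020) Stochastic Linear Quadratic Optimal Tracking Control for Stochastic Discrete-Time Systems Based on the Q-Learning Algorithm. Journal of Nanjing University of Information Science & Technology, 13(5), 548-555. (In Chinese)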
|
|
[29]
|
Abouheaf, M.L., Lewis, F.L. and Vamvoudakis, K.G. (2014) Multi-Agent Discrete-Time Graphical Games and Reinforcement Learning Solutions. Automatica, 50, 3038-3053.
|
|
[30]
|
Yang, N., Xiao, J.W. and Wang, Y.W. (2018) Non-Zero Sum Differential Graphical Game: Cluster Synchronisation for Multi-Agents with Partially Unknown Dynamics. International Journal of Control, 92, 2408-2419.
|
|
[31]
|
Jiang, H., Zhang, H.G. and Han, J. (2018) Iterative Adaptive Dynamic Programming Methods with Neural Network Implementation for Multiplayer Zero-Sum Games. Neurocomputing, 307, 54-60.
|
|
[32]
|
Liu, D.R., Li, H.L. and Wang, D. (2013) Neural-Network-Based Zero-Sum Game for Discrete-Time Nonlinear Systems via Iterative Adaptive Dynamic Programming Algorithm. Neurocomputing, 110, 92-100.
|
|
[33]
|
Li, C.J. and Ma, G.F. (2011) Optimal Control. Science Press, Beijing, 216-218. (In Chinese)
|
|
[34]
|
Wu, S.Z. (2007) Optimal Control Theory and Applications. China Machine Press, Beijing, 193-194. (In Chinese)
|
|
[35]
|
Luy, N.T. (2017) Distributed Cooperative H∞ Optimal Tracking Control of MIMO Nonlinear Multi-Agent Systems in Strict-Feedback Form via Adaptive Dynamic Programming. International Journal of Control, 91, 952-968.
|
|
[36]
|
Jiao, Q., Modares, H. and Xu, S.Y. (2016) Multi-Agent Zero-Sum Differential Graphical Games for Disturbance Rejection in Distributed Control. Automatica, 69, 24-34.
|
|
[37]
|
Vamvoudakis, K.G., Lewis, F.L. and Hudas, G.R. (2012) Multi-Agent Differential Graphical Games: Online Adaptive Learning Solution for Synchronization with Optimality. Automatica, 48, 1598-1611.
|