Reinforcement Learning-Based Consensus Tracking Control Algorithm for Multi-Agent Systems
Abstract: This paper presents a novel reinforcement learning-based model-free adaptive control algorithm for discrete-time nonlinear multi-agent systems with unknown dynamics. An equivalent dynamic linearization technique is used to design the optimal controller. The Q-learning strategy and the actor-critic neural network are restructured to facilitate consensus control. The proposed reinforcement learning approach adjusts the linearization parameters in real time using only input-output data. Numerical simulations validate the effectiveness of the method.
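As a rough illustration of how linearization parameters can be adjusted from input-output data alone, the sketch below implements the classical compact-form dynamic linearization scheme from the model-free adaptive control literature for a single agent. It is not the algorithm proposed in this paper: the paper tunes the parameters with Q-learning and an actor-critic network, whereas this sketch uses the standard projection estimator. The toy plant, the gains (`eta`, `mu`, `rho`, `lam`) and the function name `simulate_cfdl_mfac` are assumptions chosen for illustration only.

```python
import math


def simulate_cfdl_mfac(steps=300, y_ref=1.0, eta=0.5, mu=1.0,
                       rho=0.5, lam=1.0, phi_init=1.0, eps=1e-5):
    """Track a constant reference with compact-form dynamic linearization.

    The controller never reads the plant equations; it only uses the
    measured output y(k) and its own past inputs u(k-1), u(k-2).
    """

    def plant(y, u):
        # Hypothetical nonlinear plant, unknown to the controller.
        return 0.6 * y + 0.1 * math.sin(y) + 0.5 * u

    y_prev, y = 0.0, 0.0          # y(k-1), y(k)
    u_prev2, u_prev = 0.0, 0.0    # u(k-2), u(k-1)
    phi_hat = phi_init            # estimated pseudo-partial derivative
    errors = []

    for _ in range(steps):
        du_prev = u_prev - u_prev2   # delta u(k-1)
        dy = y - y_prev              # delta y(k)

        # Projection-type update of the linearization parameter.
        phi_hat += eta * du_prev / (mu + du_prev ** 2) * (dy - phi_hat * du_prev)

        # Standard reset: keep the estimate away from zero and on the
        # same sign as its initial value.
        if abs(phi_hat) < eps or phi_hat * phi_init < 0:
            phi_hat = phi_init

        # One-step-ahead control law built from the current estimate.
        error = y_ref - y
        u = u_prev + rho * phi_hat / (lam + phi_hat ** 2) * error
        errors.append(error)

        # Apply the input and shift the stored samples.
        y_prev, y = y, plant(y, u)
        u_prev2, u_prev = u_prev, u

    return errors, phi_hat


if __name__ == "__main__":
    errs, phi = simulate_cfdl_mfac()
    print(f"final tracking error: {errs[-1]:.2e}, phi_hat: {phi:.3f}")
```

In a multi-agent consensus setting, the single-agent error `y_ref - y` would typically be replaced by a distributed error built from neighbors' outputs, and a learned policy would take the place of the fixed gains used here.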
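Cite this article: 刘人志 (Liu Renzhi). Reinforcement Learning-Based Consensus Tracking Control Algorithm for Multi-Agent Systems [J]. 计算机科学与应用 (Computer Science and Application), 2025, 15(4): 374-381. https://doi.org/10.12677/csa.2025.154110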
