Q-Learning Based Adaptive Behavior Selection Parrot Optimizer—QLAB-PO Algorithm
DOI: 10.12677/csa.2026.164108
Author: Zhang Luoming (章洛铭): College of Computer Science and Artificial Intelligence, Wenzhou University, Wenzhou, Zhejiang
Keywords: Parrot Optimizer, Q-Learning, Adaptive Behavior Selection
Abstract: On complex optimization problems, the original Parrot Optimizer (PO) suffers from limited behavior selection, slow convergence, and a tendency to become trapped in local optima. To address these problems, this paper proposes a Q-Learning Based Adaptive Behavior Selection Parrot Optimizer (QLAB-PO). The algorithm introduces the Q-learning mechanism from reinforcement learning into PO and builds a Q-table over behavior choices, so that the search strategy is selected adaptively according to the current search state. In addition to the four original behavior modes, it adds a swarm learning behavior and an adaptive mutation behavior, and uses Q-learning to adjust the selected strategy dynamically. Experimental results on the CEC2017 benchmark test functions show that QLAB-PO significantly outperforms the original PO and other mainstream metaheuristic algorithms in both convergence speed and solution accuracy, validating the effectiveness and superiority of the proposed algorithm.
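The behavior-selection idea summarized in the abstract can be illustrated with a small tabular Q-learning sketch. This is only a rough illustration under stated assumptions, not the paper's implementation: the abstract does not specify the state encoding, the reward, the exploration rule, or the hyperparameters, so the three-state encoding, the epsilon-greedy rule, and the values of `alpha` and `gamma` below are all hypothetical. Only the count of six behaviors (four original PO behaviors plus swarm learning and adaptive mutation) comes from the abstract.

```python
import random

# Assumption: three coarse search states (e.g. improving / stagnating / worsening).
N_STATES = 3
# From the abstract: four original PO behaviors + swarm learning + adaptive mutation.
N_BEHAVIORS = 6


def make_q_table(n_states=N_STATES, n_actions=N_BEHAVIORS):
    """Create a Q-table of zeros, one row per state and one column per behavior."""
    return [[0.0] * n_actions for _ in range(n_states)]


def select_behavior(q_table, state, epsilon, rng=random):
    """Epsilon-greedy selection (an assumed exploration rule).

    With probability epsilon pick a random behavior index; otherwise pick
    the behavior with the highest Q-value in the current state.
    """
    row = q_table[state]
    if rng.random() < epsilon:
        return rng.randrange(len(row))
    return max(range(len(row)), key=row.__getitem__)


def update_q(q_table, state, action, reward, next_state, alpha=0.1, gamma=0.9):
    """Standard one-step Q-learning update.

    Q(s, a) <- Q(s, a) + alpha * (r + gamma * max_a' Q(s', a') - Q(s, a))
    alpha (learning rate) and gamma (discount factor) are illustrative values.
    """
    td_target = reward + gamma * max(q_table[next_state])
    q_table[state][action] += alpha * (td_target - q_table[state][action])
```

In an optimizer loop, each individual would call `select_behavior` to choose which of the six position-update rules to apply, evaluate the new position, and then call `update_q` with a reward derived from the change in fitness (for example, a positive reward when fitness improves). That reward design is likewise an assumption here.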
Cite this article: Zhang, L. (2026) Q-Learning Based Adaptive Behavior Selection Parrot Optimizer—QLAB-PO Algorithm. Computer Science and Application, 16(4), 42-55. https://doi.org/10.12677/csa.2026.164108
