Stock Trading Strategy Based on the PPO-DBN Algorithm with a Value at Risk (VaR) Exploration Mechanism
Abstract: As artificial intelligence techniques and algorithms have advanced, they have been applied ever more widely to decision-making in financial trading markets. In particular, using deep reinforcement learning to simulate a trading environment and make trading decisions has become an active research topic. This paper therefore proposes a PPO-DBN algorithm with a Value at Risk (VaR) based exploration mechanism. The algorithm combines Proximal Policy Optimization (PPO) with a Deep Belief Net (DBN) and, during training, uses the VaR of the current market to dynamically adjust the exploration rate of ε-greedy. To better capture changes in market data, a volatility-driven Adaptive Moving Average (AMA), whose moving-average window is adjusted dynamically according to market volatility, is introduced to construct the state space, and the daily change in assets is used as the reward function for training. Finally, the algorithm is validated experimentally on market data for six groups of stocks from the Chinese stock market. The experimental results show that the proposed algorithm performs well on multiple evaluation metrics, including the Sharpe ratio and the rate of return.
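The abstract names two adaptive components without giving their formulas: an ε-greedy exploration rate driven by VaR, and a moving-average window driven by volatility. The sketch below is one possible reading, not the authors' method. Every function name, default value, and mapping here is an assumption made for illustration: VaR is estimated by plain historical simulation, ε is assumed to shrink linearly as VaR rises, and the window is assumed to scale inversely with the standard deviation of returns.

```python
import statistics


def historical_var(returns, confidence=0.95):
    """Historical-simulation VaR, returned as a non-negative loss fraction.

    Assumption: the abstract does not say how VaR is estimated.
    """
    ordered = sorted(returns)
    # Position of the (1 - confidence) empirical quantile of the returns.
    idx = int((1.0 - confidence) * len(ordered))
    idx = min(max(idx, 0), len(ordered) - 1)
    # A positive quantile return means no loss at this confidence level.
    return max(0.0, -ordered[idx])


def var_scaled_epsilon(var, var_cap=0.05, eps_min=0.01, eps_max=0.30):
    """Map VaR linearly onto [eps_min, eps_max].

    Assumption: higher risk means less exploration. The abstract only says
    that VaR adjusts the rate, not in which direction.
    """
    ratio = min(var / var_cap, 1.0)
    return eps_max - ratio * (eps_max - eps_min)


def adaptive_window(returns, base_window=20, min_window=5, max_window=60,
                    ref_vol=0.02):
    """Choose a moving-average window that shrinks as volatility grows.

    Assumption: volatility is the population standard deviation of returns,
    and ref_vol is the level at which the window equals base_window.
    """
    vol = statistics.pstdev(returns)
    if vol == 0.0:
        return max_window
    window = round(base_window * ref_vol / vol)
    return min(max(window, min_window), max_window)


def adaptive_moving_average(prices, window):
    """Simple mean of the most recent `window` prices."""
    recent = prices[-window:]
    return sum(recent) / len(recent)


if __name__ == "__main__":
    daily_returns = [-0.04, -0.02, -0.01, 0.0, 0.01,
                     0.02, 0.03, 0.01, -0.03, 0.02]
    var = historical_var(daily_returns)
    print("VaR:", var)
    print("epsilon:", var_scaled_epsilon(var))
    print("window:", adaptive_window(daily_returns))
```

In a training loop, values like these would be recomputed from a rolling window of recent returns at each step, with the resulting ε controlling how often the agent takes a random action and the resulting window determining the moving-average feature in the state.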
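Cite this article: Li Guannan, Zhang Guokai. Stock Trading Strategy Based on the PPO-DBN Algorithm with a Value at Risk (VaR) Exploration Mechanism[J]. Operations Research and Fuzziology, 2025, 15(3): 128-139. https://doi.org/10.12677/orf.2025.153146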
