Portfolio Management Problem Based on Deep Reinforcement Learning
Abstract: Portfolio management has long been an active topic in finance, and a growing body of work applies artificial intelligence to it. Reinforcement learning is the most prominent of these techniques: it is a branch of machine learning in which an agent continually adjusts its policy based on feedback from the environment, without requiring labels to be specified in advance. Deep learning, in turn, can extract high-order features, so this paper uses deep reinforcement learning to address the portfolio management problem. Because financial series contain a large amount of noise, the empirical wavelet transform (EWT) is used to denoise the stock price series, and technical indicators built from the denoised series are fed into the model. A temporal convolutional network (TCN) extracts the temporal features of each stock, a multi-head attention network extracts the spatial relationships between stocks, and the result is passed to a fully connected layer and then through a sigmoid function and a softmax function to obtain the portfolio weights. The model is trained with the basic policy gradient method, using the Sharpe ratio as the objective function, and is evaluated on three datasets (DJIA, HSI, and DAX), with six metrics used to compare the strategies. The experiments indicate that the proposed model has certain advantages and can achieve relatively low risk together with relatively high return.
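The last two steps of the pipeline described in the abstract, turning network outputs into portfolio weights and scoring them with the Sharpe ratio, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration: the abstract does not specify how the sigmoid and softmax are composed (here they are applied in sequence), the scores and returns are made-up numbers, and the function names `portfolio_weights`, `portfolio_returns`, and `sharpe_ratio` are hypothetical. The Sharpe ratio is computed per period without annualization, and no policy gradient update is shown.

```python
import math
from statistics import mean, pstdev


def portfolio_weights(scores):
    """Map raw per-asset scores to long-only weights that sum to 1.

    Assumption: sigmoid squashes each score into (0, 1), then softmax
    normalizes the squashed values into weights.
    """
    squashed = [1.0 / (1.0 + math.exp(-s)) for s in scores]
    peak = max(squashed)  # subtract the max for numerical stability
    exps = [math.exp(v - peak) for v in squashed]
    total = sum(exps)
    return [e / total for e in exps]


def portfolio_returns(weights, asset_returns):
    """Per-period portfolio return for fixed weights (no transaction costs)."""
    return [sum(w * r for w, r in zip(weights, period)) for period in asset_returns]


def sharpe_ratio(returns, risk_free=0.0, eps=1e-8):
    """Mean excess return divided by its standard deviation (not annualized)."""
    excess = [r - risk_free for r in returns]
    return mean(excess) / (pstdev(excess) + eps)


# Hypothetical network outputs for three assets and made-up period returns.
scores = [0.5, -1.0, 2.0]
asset_returns = [
    [0.010, -0.002, 0.004],
    [-0.005, 0.003, 0.006],
    [0.007, 0.001, -0.001],
]
weights = portfolio_weights(scores)
objective = sharpe_ratio(portfolio_returns(weights, asset_returns))
print(weights, objective)
```

In a policy gradient setup of the kind the abstract mentions, a quantity like `objective` would be maximized with respect to the network parameters that produce `scores`; that optimization loop is outside the scope of this sketch.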
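Cite this article: Xu Zhixing (胥智星). Portfolio Management Problem Based on Deep Reinforcement Learning [J]. Operations Research and Fuzziology (运筹与模糊学), 2023, 13(3): 1427-1441. https://doi.org/10.12677/ORF.2023.133144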
