Machine Learning-Based Stock Return Prediction and Portfolio Research
Abstract: The rapid development of computers and the Internet has led to the gradual rise of quantitative investment worldwide. The author combines machine learning models with a multi-factor model to build a quantitative stock selection model, and uses daily data on the constituents of the SSE 50 index from 2016 to 2022 for model training and out-of-sample prediction. The results show that: 1) investment strategies built on stock selection with three models (Random Forest, Support Vector Machine, and XGBoost) are able to beat the market; 2) investment returns are strongly affected by market conditions, and in a falling market an active strategy, even one that beats the market, is not guaranteed to earn more than the risk-free rate of return.
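The distinction between the two findings in the abstract can be illustrated with a minimal sketch. It does not reproduce the paper's models or data: the predicted scores stand in for the output of any return-prediction model, the tickers and numbers are hypothetical, and the market is proxied by the equal-weighted average of the stock universe, which is an assumption made here for simplicity. The function names `select_top_k` and `evaluate` are likewise illustrative.

```python
from statistics import mean


def select_top_k(predicted, k):
    """Return the k tickers with the highest predicted return."""
    ranked = sorted(predicted, key=predicted.get, reverse=True)
    return ranked[:k]


def evaluate(predicted, realized, k, risk_free):
    """Compare an equal-weighted top-k portfolio with two benchmarks.

    The market return is proxied by the equal-weighted average of all
    realized returns (an assumption of this sketch, not of the paper).
    """
    picks = select_top_k(predicted, k)
    portfolio = mean(realized[t] for t in picks)
    market = mean(realized.values())
    return {
        "picks": picks,
        "portfolio": portfolio,
        "market": market,
        "beats_market": portfolio > market,
        "beats_risk_free": portfolio > risk_free,
    }


# Hypothetical one-period returns in a falling market.
predicted = {"A": 0.02, "B": -0.01, "C": 0.01, "D": -0.03}
realized = {"A": -0.01, "B": -0.06, "C": 0.00, "D": -0.09}

result = evaluate(predicted, realized, k=2, risk_free=0.002)
print(result)
```

In this made-up period the selected portfolio loses less than the market proxy, so it beats the market, yet its return is still below the risk-free rate. This is the situation the second finding describes.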
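Cite this article: Chen Xin. Machine Learning-Based Stock Return Prediction and Portfolio Research [J]. Operations Research and Fuzziology (运筹与模糊学), 2024, 14(2): 599-609. https://doi.org/10.12677/orf.2024.142163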
