Quantitative Stock Selection Model Based on Text Analysis and LSTM
Abstract: Stock market fluctuations strongly influence investors' decisions, which makes quantitative stock selection important. This study combines text analysis with a Long Short-Term Memory (LSTM) neural network to build a stock selection model. The model first applies TF-IDF and TextRank to financial news and industry briefs to identify popular industries and stocks with favorable coverage, then uses an LSTM to predict the trend of each selected stock's lowest price as a reference for investors. In evaluation, the model achieves a mean squared error (MSE) of 0.02037, a mean absolute error (MAE) of 0.09182, and a coefficient of determination (R²) of 0.9754, indicating good predictive performance. The text analysis identifies the technology industry as a popular and favorable industry, and two of its stocks, Jiuzhou Group (九洲集团) and Focus Technology (焦点科技), are selected for a case study, which confirms that the LSTM with a sliding window predicts more accurately than the LSTM without one. These results open a new direction for quantitative stock selection and are expected to help investors make more scientific decisions.
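The sliding-window idea and the three reported metrics can be illustrated with a minimal sketch. The abstract does not state the window length, the data, or the network configuration, so everything below is an assumption: the function names (`make_windows`, `mse`, `mae`, `r2`) are hypothetical, the prices are made up, and a naive "repeat the last value" forecaster stands in for the LSTM.

```python
from typing import List, Sequence, Tuple


def make_windows(series: Sequence[float], window: int) -> Tuple[List[List[float]], List[float]]:
    """Turn a price series into (inputs, targets) pairs.

    Each input holds `window` consecutive values; its target is the next value.
    """
    if window < 1 or window >= len(series):
        raise ValueError("window must satisfy 1 <= window < len(series)")
    n = len(series) - window
    inputs = [list(series[i:i + window]) for i in range(n)]
    targets = [series[i + window] for i in range(n)]
    return inputs, targets


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def r2(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Made-up daily low prices, for illustration only (not data from the study).
lows = [10.0, 10.2, 10.1, 10.4, 10.6, 10.5, 10.8]
X, y = make_windows(lows, window=3)

# Stand-in forecaster: repeat the last value of each window (NOT an LSTM).
y_hat = [w[-1] for w in X]

print("windows:", len(X), "first input:", X[0], "first target:", y[0])
print("MSE:", mse(y, y_hat), "MAE:", mae(y, y_hat), "R2:", r2(y, y_hat))
```

In a real pipeline the windows in `X` would be fed to the LSTM in place of the stand-in forecaster, and the same three functions would score its predictions on held-out data.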