Forecast Analysis of Shanghai Composite Index Based on Machine Learning Method
DOI: 10.12677/HJDM.2016.61001
Author: Wu Rengkang, School of Statistics and Mathematics, Yunnan University of Finance and Economics, Kunming, Yunnan
Keywords: Shanghai Composite Index, Machine Learning, Random Forests, Support Vector Machine (SVM)
Abstract: The Shanghai Composite Index is a key index that investors follow closely. It not only reflects the basic state of China's stock market but also serves as an important indicator of the direction of the national economy, so forecasting the index and assessing its trend matter greatly for stabilizing the market and guiding investors. Stock market data, however, come from a typical nonlinear system, and traditional statistical forecasting methods achieve low accuracy on them. Using R, this paper trains six recent methods from the machine learning field on a training set: decision trees, boosting, bagging, random forests, support vector machines (SVM), and neural networks, obtaining one model per method. Ten-fold cross-validation is then used to compute the prediction mean squared error of each method for comparison. The better-performing models are selected, and their predictions are compared visually with the actual data. The results show that two of the methods, random forests and SVM, fit the data well and predict with high accuracy.
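The comparison procedure described in the abstract, ten-fold cross-validation with mean squared error as the score, can be sketched as follows. This is a minimal illustration in Python rather than the R code used in the paper, which is not reproduced here. The data are synthetic, and the two models (`fit_mean`, `fit_line`) are simple hypothetical stand-ins, not any of the six methods studied; all function names are assumptions made for this sketch.

```python
import random
from statistics import mean


def k_fold_indices(n, k=10, seed=0):
    """Shuffle the indices 0..n-1 and split them into k nearly equal folds."""
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def fit_mean(xs, ys):
    """Stand-in baseline: always predict the training mean of y."""
    m = mean(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Stand-in model: ordinary least squares for y = a + b * x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    b = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sxx if sxx else 0.0
    a = my - b * mx
    return lambda x: a + b * x


def cv_mse(fit, xs, ys, k=10, seed=0):
    """Average held-out mean squared error over k folds."""
    fold_mses = []
    for test_idx in k_fold_indices(len(xs), k, seed):
        held_out = set(test_idx)
        train = [i for i in range(len(xs)) if i not in held_out]
        # Fit on the other k-1 folds, then score on the held-out fold only.
        predict = fit([xs[i] for i in train], [ys[i] for i in train])
        fold_mses.append(mean((ys[i] - predict(xs[i])) ** 2 for i in test_idx))
    return mean(fold_mses)


# Synthetic data (an assumption for illustration, not index data).
rng = random.Random(42)
xs = [float(i) for i in range(200)]
ys = [3.0 + 0.5 * x + rng.gauss(0.0, 1.0) for x in xs]

scores = {
    "mean baseline": cv_mse(fit_mean, xs, ys),
    "linear fit": cv_mse(fit_line, xs, ys),
}
# Rank the candidate models by cross-validated MSE, lowest first.
for name, mse in sorted(scores.items(), key=lambda kv: kv[1]):
    print(f"{name}: CV MSE = {mse:.3f}")
```

Any model that exposes the same `fit(xs, ys) -> predict` shape could be passed to `cv_mse`, so the ranking step stays the same regardless of which methods are compared. Note that the folds here are shuffled at random; for time-ordered data such as an index series, whether shuffling is appropriate is a separate design decision.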
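Cite this article: Wu Rengkang. Forecast Analysis of Shanghai Composite Index Based on Machine Learning Method [J]. Hans Journal of Data Mining, 2016, 6(1): 1-8. http://dx.doi.org/10.12677/HJDM.2016.61001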
