Research on Security Selection by Naive Bayes Classifier Based on a New Feature Selection Method
Abstract: This paper proposes a naive Bayes classifier for securities selection built on a new feature selection method. First, using 2011 trading data for 50 companies listed on the Shenzhen Stock Exchange and 18 commonly used indicators, the new method, which combines mutual information with principal component analysis, selects the factors used for classification. Second, a naive Bayes classifier is built on the data from the first 10 months, and its prediction accuracy is tested on the data from the last two months. The empirical analysis shows that the classifier reaches an average accuracy of 75%, which indicates practical value.
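The two-stage procedure in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not say how mutual information and principal component analysis are combined, so the code assumes one plausible reading (rank indicators by mutual information with the class label, keep the top few, then project them onto principal components). The data are synthetic, and the bin count, the number of retained indicators and components, the Gaussian likelihood, and all function names are illustrative assumptions.

```python
import numpy as np

def mutual_information(x, y, bins=5):
    """Plug-in estimate of I(X; Y) in nats, using equal-width bins for x."""
    edges = np.histogram_bin_edges(x, bins=bins)
    x_binned = np.digitize(x, edges[1:-1])  # bin indices 0 .. bins-1
    mi = 0.0
    for xv in np.unique(x_binned):
        p_x = np.mean(x_binned == xv)
        for yv in np.unique(y):
            p_y = np.mean(y == yv)
            p_xy = np.mean((x_binned == xv) & (y == yv))
            if p_xy > 0:
                mi += p_xy * np.log(p_xy / (p_x * p_y))
    return mi

def fit_pca(X, n_components):
    """PCA on standardized columns; returns mean, std, and loading matrix."""
    mean, std = X.mean(axis=0), X.std(axis=0)
    std[std == 0] = 1.0
    _, _, vt = np.linalg.svd((X - mean) / std, full_matrices=False)
    return mean, std, vt[:n_components].T

def fit_gaussian_nb(X, y):
    """Per-class prior, feature means, and feature variances."""
    return {c: (np.mean(y == c), X[y == c].mean(axis=0), X[y == c].var(axis=0) + 1e-9)
            for c in np.unique(y)}

def predict_gaussian_nb(params, X):
    """Pick the class with the largest log prior plus summed log likelihood."""
    classes = list(params)
    scores = []
    for c in classes:
        prior, mu, var = params[c]
        log_lik = -0.5 * (np.log(2 * np.pi * var) + (X - mu) ** 2 / var).sum(axis=1)
        scores.append(np.log(prior) + log_lik)
    return np.array(classes)[np.argmax(scores, axis=0)]

# Synthetic stand-in for the data: 50 companies x 12 months x 18 indicators.
rng = np.random.default_rng(0)
n_companies, n_months, n_indicators = 50, 12, 18
month = np.repeat(np.arange(n_months), n_companies)
y = rng.integers(0, 2, size=n_companies * n_months)  # hypothetical binary class label
X = rng.normal(size=(len(y), n_indicators))
X[:, :4] += 1.5 * y[:, None]  # assumption: only indicators 0-3 carry class signal

# First 10 months for training, last 2 months for testing, as in the abstract.
train, test = month < 10, month >= 10

# Stage 1 (assumed combination): MI ranking, then PCA on the retained indicators.
mi = np.array([mutual_information(X[train, j], y[train]) for j in range(n_indicators)])
selected = np.argsort(mi)[::-1][:6]
mean, std, components = fit_pca(X[train][:, selected], n_components=3)

def project(A):
    """Apply the training-set standardization and PCA loadings."""
    return ((A[:, selected] - mean) / std) @ components

# Stage 2: naive Bayes on the projected factors, scored on the hold-out months.
model = fit_gaussian_nb(project(X[train]), y[train])
accuracy = np.mean(predict_gaussian_nb(model, project(X[test])) == y[test])
print(f"selected indicators: {sorted(selected.tolist())}, hold-out accuracy: {accuracy:.2f}")
```

Fitting the standardization and the PCA loadings on the training months only, then reusing them for the test months, keeps information from the last two months out of the model. The accuracy printed here reflects the synthetic data and is unrelated to the 75% reported in the paper.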
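Cite this article: Guo, P.P., Liu, H.J. and Li, S.S. (2019) Research on Security Selection by Naive Bayes Classifier Based on a New Feature Selection Method. Advances in Applied Mathematics, 8(1), 41-49. https://doi.org/10.12677/AAM.2019.81005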
