Application of Logit Models with Bi-Level Variable Selection in Predicting Stock Prices Movement Trends
Abstract: Stock prices are influenced by many factors, including economic conditions, investors' psychological expectations, stock market trends, and macroeconomic policy. Accurately predicting the direction of stock price movements is therefore an important and difficult problem in finance. In this paper, we combine technical indicator analysis with penalized trinomial logit models that perform bi-level variable selection, and propose Sparse Group Lasso/Group Bridge/Composite MCP/Group Exponential Lasso penalized trinomial logit models to predict up trends, sideways trends and down trends in stock prices. First, 58 important technical indicators are selected and divided into 13 non-overlapping groups, and models are built separately for three U.S. stocks: Cencora (COR), Cisco Systems (CSCO) and McDonald's (MCD). Second, parameter estimates are obtained on the training set, and predictive performance is evaluated on the test set using the confusion matrix, accuracy, Kappa and HUM. Finally, the proposed methods are compared with Group Lasso/Group SCAD/Group MCP penalized trinomial logit models, support vector machines (SVM), random forests (RF) and artificial neural networks (ANN). The results show that, taking the evaluation metrics together, the proposed methods outperform all six comparison methods. The approach can therefore effectively improve prediction accuracy and bring investors higher returns.
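Two of the evaluation metrics named in the abstract, accuracy and Kappa, can be read directly off a three-class confusion matrix. The following is a minimal sketch, assuming Kappa means Cohen's kappa and that rows are true classes and columns are predicted classes. The function name `accuracy_and_kappa` and the counts in `conf` are hypothetical illustrations, not results from the paper, and HUM is not computed here.

```python
def accuracy_and_kappa(conf):
    """Return (accuracy, Cohen's kappa) for a square confusion matrix.

    conf[i][j] is the number of samples whose true class is i and
    whose predicted class is j (assumed layout).
    """
    k = len(conf)
    n = sum(sum(row) for row in conf)

    # Observed agreement: share of samples on the diagonal.
    p_o = sum(conf[i][i] for i in range(k)) / n

    # Chance agreement: product of the row and column marginals.
    row_tot = [sum(row) for row in conf]
    col_tot = [sum(conf[i][j] for i in range(k)) for j in range(k)]
    p_e = sum(r * c for r, c in zip(row_tot, col_tot)) / n ** 2

    kappa = (p_o - p_e) / (1.0 - p_e)
    return p_o, kappa


# Hypothetical counts for the classes (up, sideways, down); not from the paper.
conf = [
    [30, 5, 5],
    [6, 28, 6],
    [4, 7, 29],
]
acc, kappa = accuracy_and_kappa(conf)
print(f"accuracy = {acc:.4f}, kappa = {kappa:.4f}")
```

Kappa corrects accuracy for the agreement expected by chance, which matters for a three-class problem in which one trend may dominate the test period.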
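Cite this article: Guo, S.M. Application of Logit Models with Bi-Level Variable Selection in Predicting Stock Prices Movement Trends [J]. Statistics and Application, 2026, 15(1): 156-168. https://doi.org/10.12677/sa.2026.151016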
