The Study of Variational Inference Methods for HMMs’ Parameters
DOI: 10.12677/sa.2026.155119
Author: Zhang Lei, School of Statistics and Data Science, Jilin University of Finance and Economics, Changchun, Jilin
Keywords: Multivariate t-Distribution, Variational Inference (VI), Hidden Markov Model (HMM), Stock Index Forecasting
Abstract: Hidden Markov models (HMMs) are widely used across many fields because of their modeling power and sound mathematical foundation. However, their parameters are traditionally estimated by maximum likelihood estimation (MLE), which struggles to capture complex distributional structure and to quantify model uncertainty; this limitation is most pronounced for financial data with outliers or heavy tails. To address this, the paper introduces a variational inference (VI) framework and constructs a variational HMM based on the multivariate t-distribution (VBtHMM), with a detailed derivation. Numerical simulations are then used to test the correctness and effectiveness of its parameter estimation and state prediction. Finally, an empirical analysis on two real stock index data sets, the S&P 500 and the CSI 300, shows that VBtHMM yields more efficient and robust parameter estimates and retains strong parameter estimation and state prediction performance on data containing outliers or non-Gaussian noise.
Cite this article: Zhang, L. (2026) The Study of Variational Inference Methods for HMMs’ Parameters. Statistics and Application, 15(5), 192-207. https://doi.org/10.12677/sa.2026.155119
