Forecasting Shanghai Crude Oil Futures Prices with LSTM and GRU Neural Networks
DOI: 10.12677/aam.2025.1410415. Supported by research project funding.
Authors: 徐瑞敏, 贝乐敏*, 王舒琪, 徐星瑶, 何佳祺 (School of Modern Finance, Jiaxing Nanhu University, Jiaxing, Zhejiang)
Keywords: ARIMA, VARIMA, LSTM, GRU, Crude Oil Futures, Price Prediction
Abstract: This study compares four models (ARIMA, VARIMA, LSTM, and GRU) for forecasting Shanghai crude oil futures prices. The linear models ARIMA and VARIMA, constrained by their linearity assumption, produce large errors on the test set (RMSE > 9.5) and respond with a lag to abrupt volatility events. The deep learning models perform markedly better. GRU, with its two coupled gates, gives the best trade-off between efficiency and accuracy (MAE = 4.6928, with 25.6% fewer parameters than LSTM), while LSTM, with its three separate gates, captures long-term trends robustly (R² = 0.9647). Ablation experiments support the necessity of the gating design: removing the forget gate from LSTM increases the error by 50%, which points to its central role in filtering noise. The results offer a deep learning approach to oil price forecasting that combines accuracy with efficiency, and a theoretical basis for building intelligent risk-control systems.
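The quantities quoted in the abstract can be made concrete with a small sketch. The Python below shows the standard definitions of RMSE, MAE, and R², and counts the weights of a single LSTM layer versus a single GRU layer in their textbook formulations (one bias vector per gate block). The toy prices, `input_dim`, and `hidden` are illustrative assumptions, not the paper's data or configuration, and the function names are hypothetical.

```python
import math

def rmse(y_true, y_pred):
    """Root mean squared error."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def lstm_params(input_dim, hidden):
    # Four weight blocks: input, forget, and output gates plus the candidate cell.
    # Each block has input weights, recurrent weights, and one bias vector.
    return 4 * (hidden * (input_dim + hidden) + hidden)

def gru_params(input_dim, hidden):
    # Three weight blocks: update and reset gates plus the candidate state.
    # Assumes one bias vector per block (textbook formulation).
    return 3 * (hidden * (input_dim + hidden) + hidden)

# Hypothetical toy series, for illustration only.
y_true = [1.0, 2.0, 3.0, 4.0]
y_pred = [1.0, 2.0, 3.0, 6.0]
print(rmse(y_true, y_pred), mae(y_true, y_pred), r2(y_true, y_pred))

# Hypothetical layer sizes: one input feature, 64 hidden units.
n_lstm, n_gru = lstm_params(1, 64), gru_params(1, 64)
print(n_lstm, n_gru, 1 - n_gru / n_lstm)
```

Under these assumptions a GRU layer has exactly one quarter fewer weights than an LSTM layer of the same size, because it has three weight blocks instead of four. The 25.6% reported in the abstract refers to the authors' full networks, whose architecture is not given here, and some library implementations use an extra bias vector per block, so exact counts vary.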
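Cite this article: 徐瑞敏, 贝乐敏, 王舒琪, 徐星瑶, 何佳祺. Forecasting Shanghai Crude Oil Futures Prices with LSTM and GRU Neural Networks [J]. Advances in Applied Mathematics (应用数学进展), 2025, 14(10): 1-15. https://doi.org/10.12677/aam.2025.1410415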
