Research on Fruit Sales Forecasting for Fresh Food E-Commerce Based on Ensemble Learning
Abstract: Fresh food e-commerce is growing rapidly but faces high spoilage rates, high logistics costs, and complex inventory management, so accurate sales forecasting is crucial for optimizing supply chain management. This paper addresses sales forecasting for the fruit category of fresh products. First, a 30-dimensional feature set is designed, comprising lag features, rolling statistical features, temporal features, periodic features, and external-factor features; it accounts for influences specific to e-commerce, such as promotions, holidays, temperature, and extreme weather. On this basis, three algorithms (random forest, gradient boosting decision tree, and ridge regression) are combined by simple averaging to build an ensemble learning prediction model. Experimental results show that the ensemble model achieves an MSE of 0.00286, an R² of 0.9075, and a MAPE of 5.62%, a marked improvement in prediction accuracy and stability over traditional time series methods and single machine learning models. Its MSE is 63.6% lower than that of the moving average method and 75.1% lower than that of support vector regression. The study provides effective technical support for inventory management and operational decision-making on fresh food e-commerce platforms and has both theoretical and practical value.
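The pipeline summarized in the abstract (lag and rolling features, equal-weight averaging of base-model predictions, and evaluation by MSE, R² and MAPE) can be illustrated with a minimal sketch. All function names, lag choices, and window sizes below are hypothetical and chosen only for illustration; the paper's actual 30 features and its fitted random forest, gradient boosting, and ridge regression models are not reproduced here. The averaging step accepts predictions from any already-trained models.

```python
from statistics import mean


def lag_and_rolling_features(sales, lags=(1, 7), window=7):
    """Build one feature row per day using past sales only (no look-ahead).

    The lags and window are illustrative assumptions, not the paper's settings.
    """
    start = max(max(lags), window)
    rows = []
    for t in range(start, len(sales)):
        past = sales[t - window:t]           # the `window` days strictly before day t
        row = [sales[t - k] for k in lags]   # lag features
        row.append(mean(past))               # rolling mean
        row.append(max(past) - min(past))    # rolling range
        rows.append(row)
    return rows, sales[start:]               # feature rows and aligned targets


def simple_average(predictions_per_model):
    """Equal-weight ensemble: average the base models' predictions day by day."""
    return [mean(day) for day in zip(*predictions_per_model)]


def mse(y_true, y_pred):
    """Mean squared error."""
    return mean((a - b) ** 2 for a, b in zip(y_true, y_pred))


def r2(y_true, y_pred):
    """Coefficient of determination."""
    y_bar = mean(y_true)
    ss_res = sum((a - b) ** 2 for a, b in zip(y_true, y_pred))
    ss_tot = sum((a - y_bar) ** 2 for a in y_true)
    return 1 - ss_res / ss_tot


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent (targets must be non-zero)."""
    return 100 * mean(abs((a - b) / a) for a, b in zip(y_true, y_pred))


# Toy usage with made-up numbers standing in for three base models' outputs.
y_true = [4.0, 8.0]
ensemble = simple_average([[2.0, 4.0], [4.0, 8.0], [6.0, 6.0]])
scores = (mse(y_true, ensemble), r2(y_true, ensemble), mape(y_true, ensemble))
```

Simple averaging needs no extra training step and no held-out data for fitting weights, which makes it a common baseline combiner when the base models have broadly comparable accuracy.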
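Cite this article: Zhao Xiaodan, Liu Yuanhua. Research on Fruit Sales Forecasting for Fresh Food E-Commerce Based on Ensemble Learning [J]. E-Commerce Letters (电子商务评论), 2025, 14(12): 7215-7226. https://doi.org/10.12677/ecl.2025.14124724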
