A Research Study on Used Car Price Prediction Based on the LightGBM Model
Abstract: Used car prices in China are difficult to appraise, and the market lacks transparency. To address this, the paper proposes an optimized prediction method that combines feature engineering with ensemble learning. First, 150,000 used car records from a trading platform are preprocessed through outlier correction, missing-value imputation, and feature derivation. Next, key features are selected by correlation analysis. Four models (XGBoost, Random Forest, CatBoost, and LightGBM) are then compared; the parameter-tuned LightGBM model performs best, with a mean absolute error (MAE) of 487.03 yuan, a coefficient of determination (R²) of 0.9708, and a mean absolute percentage error (MAPE) of 13.20%. Finally, the model is applied to 50,000 test records to generate point predictions and interval predictions of price. The experiments indicate that the proposed method improves the accuracy and reliability of used car price evaluation.
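For readers unfamiliar with the three reported metrics, the following is a minimal sketch of how MAE, MAPE, and R² are conventionally computed. It is not the paper's code: the function names and the four sample prices are made up for illustration, and the textbook definitions are assumed.

```python
from math import fsum


def mae(y_true, y_pred):
    """Mean absolute error, in the same unit as the prices (yuan here)."""
    return fsum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true, y_pred):
    """Mean absolute percentage error, in percent.

    Assumes no true price is zero, otherwise the ratio is undefined.
    """
    ratios = (abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * fsum(ratios) / len(y_true)


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = fsum(y_true) / len(y_true)
    ss_res = fsum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = fsum((t - mean_true) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot


# Hypothetical prices in yuan, invented for this example only.
actual = [10000.0, 20000.0, 30000.0, 40000.0]
predicted = [11000.0, 19000.0, 33000.0, 38000.0]

print(f"MAE  = {mae(actual, predicted):.2f} yuan")
print(f"MAPE = {mape(actual, predicted):.2f} %")
print(f"R2   = {r2(actual, predicted):.4f}")
```

MAE weights every yuan of error equally, whereas MAPE weights errors relative to the true price, so a model can show a small MAE alongside a comparatively large MAPE when many cars in the data are inexpensive.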
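Cite this article: Yang Yue. A Research Study on Used Car Price Prediction Based on the LightGBM Model [J]. Statistics and Application (统计学与应用), 2026, 15(3): 153-164. https://doi.org/10.12677/sa.2026.153064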
