Application of the Random Forest Algorithm in Predicting the Two Zones of Overburden Strata
Abstract: To address the difficulty of predicting the heights of the "caving zone" and the "water-conducting fracture zone" (the two zones) during coal mining, a height prediction model for the two zones was built with the random forest algorithm. The model accounts for the influence of complex geological structures, strata combinations, mining thickness, mining depth, and other factors. It draws on the algorithm's dual randomness (Bagging bootstrap sampling and random feature selection), its good noise resistance, and its ability to select important features automatically. The coefficient of determination (R²) was used to evaluate the model. The results show that the model's average prediction R² is 0.8704, which falls in the "strong explanatory ability" range (0.8 ≤ R² < 0.9), so the model can provide a reliable basis for engineering practice. Compared with traditional methods such as linear regression and KNN, the random forest algorithm shows stronger robustness and accuracy.
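The two sources of randomness and the R² criterion named in the abstract can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the feature names, the sample count, and the helper function names are hypothetical, chosen only to mirror the factors the abstract lists.

```python
import random

def bootstrap_indices(n_samples, rng):
    """Bagging step: draw n_samples row indices with replacement."""
    return [rng.randrange(n_samples) for _ in range(n_samples)]

def random_feature_subset(feature_names, k, rng):
    """Feature-randomness step: pick k distinct candidate features."""
    return rng.sample(feature_names, k)

def r2_score(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def has_strong_explanatory_ability(r2):
    """Band quoted in the abstract: 0.8 <= R2 < 0.9."""
    return 0.8 <= r2 < 0.9

rng = random.Random(0)  # fixed seed so the sketch is repeatable
# Hypothetical feature names, based on the factors listed in the abstract.
features = ["geological_structure", "strata_combination",
            "mining_thickness", "mining_depth"]

rows = bootstrap_indices(10, rng)                     # rows used to grow one tree
candidates = random_feature_subset(features, 2, rng)  # features tried at one split
```

Each tree in the forest would be grown on its own `rows` sample, with a fresh `candidates` subset drawn at every split. The reported average R² of 0.8704 satisfies `has_strong_explanatory_ability`.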