
Research on Prediction of Coal and Gas Outburst Based on Integrated Learning
DOI: 10.12677/ME.2023.112018

Abstract: To improve the accuracy and feasibility of coal and gas outburst prediction, principal component analysis is used to reduce the dimensionality of the original data on 12 factors affecting coal and gas outburst, yielding principal components that retain 85% of the information in the original data. The 8 principal components are used as input to AdaBoost, with single-layer decision trees as weak classifiers, to establish a coal and gas outburst prediction model that combines principal component analysis and AdaBoost. In a case study, 64 sets of data are used as training samples and 16 sets as prediction samples, and the stability of the model is assessed from the confusion matrix. The results show that the prediction model, based on the AdaBoost algorithm with single-layer decision trees as weak classifiers, reaches a prediction accuracy of 100% with stable overall performance, which can provide a theoretical basis for safe production.

1. Introduction

2. Algorithm Principles

2.1. Principal Component Analysis

Figure 1. Algorithm flow chart

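1) Assume that the training data have a uniform weight distribution, that is, every training sample carries the same weight in the base classifier. The input is the training set $T=\left\{\left({x}_{1},{y}_{1}\right),\left({x}_{2},{y}_{2}\right),\cdots ,\left({x}_{N},{y}_{N}\right)\right\}$, where ${x}_{i}\in X$, $X$ is the instance space, and ${y}_{i}\in \left\{-1,+1\right\}$.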

2) Initialize the weight distribution of the training data, ${D}_{1}=\left({w}_{11},\cdots ,{w}_{1i},\cdots ,{w}_{1N}\right)$, ${w}_{1i}=\frac{1}{N}$, $i=1,2,3,\cdots ,N$. For $m=1,2,3,\cdots ,M$, learn from the training set weighted by ${D}_{m}$ to obtain the base classifier ${G}_{m}\left(x\right)$, and compute the error rate of ${G}_{m}\left(x\right)$ on the training set, ${e}_{m}=P\left({G}_{m}\left({x}_{i}\right)\ne {y}_{i}\right)=\underset{i=1}{\overset{N}{\sum }}{w}_{mi}I\left({G}_{m}\left({x}_{i}\right)\ne {y}_{i}\right)$. Compute the coefficient of ${G}_{m}\left(x\right)$, ${a}_{m}=\frac{1}{2}\mathrm{log}\frac{1-{e}_{m}}{{e}_{m}}$, and update the weight distribution of the training set, ${D}_{m+1}=\left({w}_{m+1,1},\cdots ,{w}_{m+1,i},\cdots ,{w}_{m+1,N}\right)$, ${w}_{m+1,i}=\frac{{w}_{mi}}{{Z}_{m}}\mathrm{exp}\left(-{a}_{m}{y}_{i}{G}_{m}\left({x}_{i}\right)\right)$, $i=1,2,3,\cdots ,N$. Here ${Z}_{m}$ is the normalization factor, ${Z}_{m}=\underset{i=1}{\overset{N}{\sum }}{w}_{mi}\mathrm{exp}\left(-{a}_{m}{y}_{i}{G}_{m}\left({x}_{i}\right)\right)$, which makes ${D}_{m+1}$ a probability distribution.

3) Form the linear combination of the base classifiers, $f\left(x\right)=\underset{m=1}{\overset{M}{\sum }}{a}_{m}{G}_{m}\left(x\right)$, to obtain the final classifier $G\left(x\right)=\text{sign}\left(f\left(x\right)\right)=\text{sign}\left(\underset{m=1}{\overset{M}{\sum }}{a}_{m}{G}_{m}\left(x\right)\right)$. Figure 1 shows the algorithm flow.
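The three steps above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the exhaustive threshold search used to fit each single-layer decision tree, the clipping of the error rate to avoid a division by zero, and the ten-point one-dimensional toy data set are all introduced here for illustration and do not come from the paper.

```python
import math

def stump_predict(stump, x):
    """Single-layer decision tree: threshold one feature, return +1 or -1."""
    feature, threshold, polarity = stump
    return polarity if x[feature] <= threshold else -polarity

def fit_stump(X, y, w):
    """Exhaustively search for the stump with the smallest weighted error."""
    best, best_err = None, float("inf")
    for feature in range(len(X[0])):
        for threshold in sorted({x[feature] for x in X}):
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                # e_m = sum_i w_mi * I(G_m(x_i) != y_i)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err

def adaboost_fit(X, y, M=10):
    """Return a list of (a_m, stump) pairs after M boosting rounds."""
    n = len(X)
    w = [1.0 / n] * n                        # D_1: uniform weights 1/N
    ensemble = []
    for _ in range(M):
        stump, e = fit_stump(X, y, w)
        e = min(max(e, 1e-10), 1 - 1e-10)    # assumption: clip to avoid log(0)
        a = 0.5 * math.log((1 - e) / e)      # coefficient a_m
        ensemble.append((a, stump))
        # Unnormalized update w_mi * exp(-a_m * y_i * G_m(x_i))
        w = [wi * math.exp(-a * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        Z = sum(w)                           # normalization factor Z_m
        w = [wi / Z for wi in w]             # D_{m+1} sums to 1
    return ensemble

def adaboost_predict(ensemble, x):
    """Final classifier G(x) = sign(sum_m a_m * G_m(x))."""
    f = sum(a * stump_predict(stump, x) for a, stump in ensemble)
    return 1 if f >= 0 else -1

# Hypothetical toy data (not from the paper): one feature, labels in {-1, +1}.
X = [[float(i)] for i in range(10)]
y = [1, 1, 1, -1, -1, -1, 1, 1, 1, -1]

ensemble = adaboost_fit(X, y, M=3)
predictions = [adaboost_predict(ensemble, x) for x in X]
print(predictions == y)
```

No single stump can separate this toy set, but the weighted vote of three stumps classifies all ten training points correctly, which is the effect the reweighting in step 2 is designed to produce. In practice the principal components obtained from the raw data would replace the toy feature.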

3. Case Analysis

3.1. Factors Influencing Coal and Gas Outburst

Table 1. Raw data of main influencing factors of coal and gas

3.2. Principal Component Analysis of the Raw Data

Table 2. Variance contribution rate and cumulative contribution rate

Table 3. Principal component analysis results

Table 4. Comparison of predicted results with the real situation

4. Conclusions