Abstract:
Accurate prediction of gas emission is of great practical significance for safe production in coal mines. However, gas emission at the working face follows complex laws, and multicollinearity among the factors influencing gas emission volume seriously degrades prediction accuracy. To study the relationship between gas emission and its influencing factors, eliminate multicollinearity among these factors, and avoid the curse of dimensionality and over-fitting when predicting gas emission volume, the least absolute shrinkage and selection operator (Lasso) penalized regression method was adopted for simulation and prediction. Starting from the original feature space, the Least Angle Regression (LARS) algorithm was used to reduce dimensionality by eliminating irrelevant and redundant features, which screened out six high-influence factors: coal seam depth, coal seam thickness, coal seam gas content, coal seam volatile yield, air capacity, and coal seam spacing. The data set was then divided into ten parts for cross-validation, with nine parts used as training data and the remaining part used as test data in turn. The parameters that gave the highest recognition rate on the test set were selected to establish a prediction model, which was then applied to coal mine field data. Finally, the Lasso method was compared with a traditional principal component analysis prediction model. The results show that the Lasso penalized regression model better preserves the meaning of the features in the original data set. Its mean relative error is 6.52%, its mean relative change value is 0.006, and its root mean square error is 3.20, all of which are superior to the results of the principal component analysis regression model, indicating that the Lasso model has better prediction accuracy and stronger generalization ability.
The method can provide a theoretical reference for underground gas prevention and control, and it is also relevant to high-dimensional, small-sample prediction problems in other engineering fields.
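The workflow summarized in the abstract (Lasso feature screening, ten-fold cross-validation to choose the penalty, then error evaluation) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes synthetic data in place of the mine field data, and it solves the Lasso by coordinate descent rather than the LARS algorithm named in the abstract, so that it runs with the Python standard library only. All function names, the penalty grid `lambdas`, and the data generator `make_data` are hypothetical.

```python
import math
import random


def soft_threshold(z, t):
    """Soft-thresholding operator behind the Lasso coordinate update."""
    if z > t:
        return z - t
    if z < -t:
        return z + t
    return 0.0


def fit_lasso(X, y, lam, n_iter=100):
    """Minimise (1/2n)*||y - b0 - Xb||^2 + lam*||b||_1 by coordinate descent.

    Features are standardised internally; coefficients are returned on the
    original scale. This replaces LARS purely for illustration.
    """
    n, p = len(X), len(X[0])
    means = [sum(r[j] for r in X) / n for j in range(p)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in X) / n) or 1.0
            for j in range(p)]
    Z = [[(r[j] - means[j]) / stds[j] for j in range(p)] for r in X]
    y_mean = sum(y) / n
    beta = [0.0] * p
    resid = [v - y_mean for v in y]
    for _ in range(n_iter):
        for j in range(p):
            # Columns have unit variance, so the update needs no denominator.
            rho = sum(Z[i][j] * resid[i] for i in range(n)) / n + beta[j]
            new = soft_threshold(rho, lam)
            if new != beta[j]:
                delta = new - beta[j]
                for i in range(n):
                    resid[i] -= delta * Z[i][j]
                beta[j] = new
    coef = [beta[j] / stds[j] for j in range(p)]
    intercept = y_mean - sum(c * m for c, m in zip(coef, means))
    return intercept, coef


def predict(model, X):
    intercept, coef = model
    return [intercept + sum(c * v for c, v in zip(coef, row)) for row in X]


def rmse(y_true, y_pred):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y_true, y_pred)) / len(y_true))


def mean_relative_error(y_true, y_pred):
    # Assumes every target value is non-zero.
    return sum(abs(a - b) / abs(a) for a, b in zip(y_true, y_pred)) / len(y_true)


def cross_validate(X, y, lambdas, k=10, seed=0):
    """k-fold CV: each fold is the test set once, the other k-1 folds train."""
    idx = list(range(len(X)))
    random.Random(seed).shuffle(idx)
    folds = [idx[f::k] for f in range(k)]
    scores = {}
    for lam in lambdas:
        errs = []
        for fold in folds:
            held_out = set(fold)
            train = [i for i in idx if i not in held_out]
            model = fit_lasso([X[i] for i in train], [y[i] for i in train], lam)
            errs.append(rmse([y[i] for i in fold],
                             predict(model, [X[i] for i in fold])))
        scores[lam] = sum(errs) / k
    return min(scores, key=scores.get), scores


def make_data(n=60, seed=1):
    """Synthetic stand-in: 8 features, feature 6 nearly duplicates feature 0."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        x = [rng.gauss(0, 1) for _ in range(6)]
        x.append(x[0] + rng.gauss(0, 0.05))  # deliberate multicollinearity
        x.append(rng.gauss(0, 1))
        X.append(x)
        y.append(20 + 3 * x[0] + 2 * x[1] - 2 * x[2] + rng.gauss(0, 0.3))
    return X, y


X, y = make_data()
lambdas = [0.01, 0.05, 0.1, 0.3, 1.0]
best_lam, cv_scores = cross_validate(X, y, lambdas)
intercept, coef = fit_lasso(X, y, best_lam)
selected = [j for j, c in enumerate(coef) if abs(c) > 1e-8]
fitted = predict((intercept, coef), X)
print("best penalty:", best_lam)
print("selected feature indices:", selected)
print("in-sample RMSE:", rmse(y, fitted), "MRE:", mean_relative_error(y, fitted))
```

Features whose coefficients are shrunk exactly to zero are dropped, which is how the Lasso penalty performs screening while keeping the remaining features in their original, interpretable form. With scikit-learn available, `sklearn.linear_model.LassoLarsCV(cv=10)` would be a closer match to the LARS-based procedure the abstract describes.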