Application of coupled FICA-FCM feature extraction and QPSO-SVM classifier model to predict coal and gas outbursts

Abstract: Coal and gas outburst sample data are affected by many kinds of influencing factors; the classification accuracy on these samples is low, and some samples are always misclassified, which suggests inaccurate annotation. To reduce the impact of the noise introduced by labeling errors on classification, this paper proposes a model that couples a classifier with clustering analysis to improve the prediction accuracy of coal and gas outbursts. First, the high-order statistical characteristics of the fast independent component analysis (FICA) method are used to extract features from the coal and gas outburst samples and obtain the independent nonlinear main components. Then, the fuzzy c-means (FCM) algorithm is used to cluster the samples. Finally, based on the improved clustering results, a support vector machine (SVM) classifier optimized by the quantum particle swarm optimization (QPSO) algorithm compares its classification results with the existing labels, and the labels are corrected to improve the classification accuracy. The results show that the coupled model performs significantly better than other existing classifiers for coal and gas outburst prediction.


1. Introduction
Coal and gas outburst is a dynamic process with a complex mechanism and many influencing factors; its mechanism and evolution law are still not clear. Outburst disasters cannot yet be effectively pre-warned and prevented, which seriously threatens safe production in Chinese coal mines. Therefore, rapid and accurate prediction of coal and gas outbursts has important practical significance for coal mine safety [1]. Due to the uncertainty and randomness of outbursts, there are many complicated nonlinear relations between outbursts and each index. In the prediction process, it is very important to extract the complex relationships among the outburst indexes, which greatly affects the design and performance of the classifier and determines the efficiency and accuracy of the whole prediction model [2]. However, sample data on coal and gas outbursts are very difficult to obtain, the samples are few, and their labels may be wrong. Existing classification models do not classify outburst data well: samples with wrong labels easily generate noise, which degrades the classification results. Therefore, various methods are needed to find and correct the mislabeled samples and improve the prediction accuracy.
To solve these key problems, many scholars at home and abroad have studied coal and gas outburst prediction methods in recent years, focusing mainly on clustering analysis and classification. Clustering analysis of outburst data is unsupervised learning: a mathematical method for grouping data of unknown type, whose main task is to partition the data according to the similarity or dependence of the patterns to be classified, so that data in the same class are as similar as possible and data in different classes are as different as possible. Clustering analysis can reveal the internal relations and differences between data and discover their internal structure and laws. In practice, clustering can serve as a preprocessing step for other mining algorithms, as a precursor to learning tasks such as classification that provides a basis for further analysis, or as an independent mining algorithm: by searching for the internal distribution structure of the samples, it reveals their inherent properties and laws. Although unsupervised pattern recognition cannot give the explicit class information that supervised learning provides, it can adaptively mine the distribution characteristics of a large amount of data according to a similarity measure, reflecting the similarities and differences between data. Unsupervised pattern recognition is mainly used to process unlabeled data, that is, data without explicit category information, so its classification process has no supervision or guidance.
Its classification principle is to group the data using some algorithm and a similarity measure between samples; the resulting classes carry no specific category information, but they can reflect the internal relationships, distribution structure, or potential classification rules of the data. Typically, the Euclidean distance between sample vectors describes the similarity between samples: nearby samples are placed in one class and distant samples in different classes, so a large data set is divided into several clusters in which similar objects are grouped together and objects in different clusters differ. Typical clustering algorithms applied to coal and gas outbursts include fuzzy clustering [3], k-means [4], dcaainet [5], projection pursuit clustering [6], and self-organizing map neural networks [7], which researchers have combined with other methods in different fields. A clustering model can discover the regularities of the sample data, but existing clustering models suffer from low computational efficiency, high complexity, and low accuracy. The FCM algorithm can mine the structural information and inherent regularity of the sample set and classify the samples correctly according to sample similarity and probability density estimation. However, most current clustering algorithms share the following practical problems: the data distribution and the number of clusters must be known in advance; the solution easily falls into a local minimum; and data sets with unclear class boundaries cannot be handled.
Moreover, the clustering measure often fails to accurately describe the similarity between a data sample and the cluster prototype, which lowers the feasibility and effectiveness of the algorithm. It is therefore necessary to explore new mining theories and models, improve some traditional algorithms, or study how to combine existing algorithms to address coal and gas outbursts. The goal remains that of clustering in general: to make the similarity within classes as large as possible and the similarity between classes as small as possible.
FCM is the most common clustering algorithm, but it shares the problems listed above. Feature extraction has a very important impact on clustering algorithm design and data processing: good features simplify the algorithm, improve its efficiency, and yield an ideal clustering result. Each index feature plays a different role in the clustering process: some play a leading role, some contribute little, and some introduce noise that negatively affects cluster formation. It is therefore necessary to select and extract features; the task is to find the most effective features among the many features of the data set and map the data points from the high-dimensional feature space to a low-dimensional one, making the clusters contained in the data set easier to distinguish. Effective features should highlight the similarities within a class and the differences between classes, and be strongly immune to noise.
Reasonable feature extraction simplifies the clustering algorithm, improves its efficiency, and yields an ideal clustering result; it not only saves computation time but also produces a more compact and easily learned model. To realize rapid, accurate, and dynamic prediction of coal and gas outbursts, various influencing factors must be considered; feature extraction methods include principal component analysis, independent component analysis, projection pursuit optimized by genetic algorithms, and clustering optimized by the artificial fish swarm algorithm [8][9][10]. These methods have their own advantages and disadvantages and improve the accuracy of outburst prediction. However, because of the complexity and diversity of the outburst process, there are many uncertain factors in the outburst characteristics. Existing feature extraction methods change the physical distribution of the original features, cannot effectively identify the main influencing factors among the initial features, and find it difficult to mine the complex relationships between the original features, which leads to low classification accuracy.
Measured data and theoretical studies show that coal and gas outbursts are affected by many factors; some are definite and quantitative, others fuzzy and qualitative. There are nonlinear correlations between different factors, and the evaluation process is accompanied by strong fuzziness. The data collected during an outburst often contain a large amount of redundant information, and the variables are usually correlated or collinear. If these factors are used directly, the large amount of redundant correlated data degrades the prediction accuracy of the model. Therefore, a reasonable selection of variables is important for improving the predictability of the model.
Fast independent component analysis (FICA) can effectively extract the non-Gaussian, nonlinear independent components of process data and reduce the complexity of the detection model, while the FCM algorithm can mine the structural information and internal regularity of the sample set and classify the samples correctly according to sample similarity and probability density estimation. This paper therefore explores the combination of the FICA and FCM methods to improve the clustering effect and efficiency for coal and gas outbursts.
Common classification models for coal and gas outbursts include SVM and its variants [15], neural networks and their variants [12], random forests [13], and decision trees [14]. Random forest is an ensemble forecasting method suitable for nonlinear, small-sample problems, but its prediction accuracy and generalization ability are limited, there is no complete supporting theory, and its accuracy on small samples is low. BP neural networks easily fall into local optima, converge slowly, and require many training samples, so BP predictions of outbursts have large errors and are not ideal. SVM has clear advantages on nonlinear, small-sample problems and is well suited to outburst prediction; however, its penalty factor is difficult to determine, the results are very sensitive to this setting, and an improper value causes over-fitting and other problems. The kernel parameters of the SVM are the key factor affecting classifier performance, so obtaining the optimal combination of kernel function parameters is essential. Many scholars have studied SVM kernel parameter optimization in depth; for example, reference [27] uses a grid search to select the SVM parameters, but the accuracy of the resulting model is limited.
Reference [28] uses a genetic algorithm to optimize the SVM parameters, which reduces the training time but easily falls into local optima; such optimization is also slow and unstable. The defect of these optimization algorithms is that they easily fall into local optima, the parameters obtained are not optimal, and the efficiency is low, so a better optimization algorithm is needed. QPSO is a global search algorithm that is theoretically guaranteed to find a good optimum in the search space. Compared with particle swarm optimization, the iterative equation of QPSO needs no particle velocity vector and fewer parameters to adjust, so it is easier to implement. Experimental results on widely used benchmark functions show that QPSO outperforms standard particle swarm optimization [25][26]. For the classifier parameter optimization problem, the QPSO algorithm [40] is used here to optimize the basic SVM parameters (the penalty factor and the kernel parameter) and the adjustable parameters to obtain the optimal combination. Its effectiveness is verified through function fitting and optimization of the related parameters in outburst prediction, and the recognition of coal and gas outbursts is greatly improved. Regarding kernel function selection and intelligent parameter optimization for SVM, another important contribution of this paper is the design of effective SVM kernel parameters based on effective feature extraction.
Aiming at the key problems in coal and gas outburst prediction, this paper proposes a comprehensive model that combines a feature extraction and clustering model with an intelligently optimized classification model.
FICA feature extraction
The ICA [15][16] model assumes that the observed mixed signal variables y = (y1, y2, ...) are composed of n independent unknown source signals s = (s1, s2, ..., sn) through a linear mixture, so the relationship between the mixed signal y and the source signal s can be written in vector form as y = Ws, where W is the unknown mixing coefficient matrix.
Here n is the unknown, unpredictable noise, which follows a Gaussian distribution. ICA recovers the source signals and the mixing matrix from y. The main idea is to remove the correlation between the data variables of the non-Gaussian original signal so that the components of each variable are statistically independent of each other. ICA separates crossed signals well and can mine the effective information hidden by signal crossing more deeply; it performs well in signal separation, redundancy elimination, and noise reduction, realizing the decomposition and fusion of the information. In this paper, equation (1) is analyzed with the FastICA algorithm, which is more efficient than traditional ICA.

y = Ws + n    (1)
Because the FICA [17][18][19] algorithm takes maximal negentropy as its search direction, we first discuss the negentropy criterion. According to information theory, among all random variables with equal variance the Gaussian variable has the largest entropy, so entropy can be used to measure non-Gaussianity; a corrected form of entropy, called negentropy, is commonly used. By the central limit theorem, if a random variable is the sum of many independent random variables with finite mean and variance, then regardless of their distributions the sum is close to Gaussian; in other words, any single source is more non-Gaussian than the mixture.
Therefore, in the separation process, the mutual independence of the separated results can be expressed through their non-Gaussianity: when the non-Gaussian measure reaches its maximum, the independent components have been separated. Negentropy is defined as

J(Y) = H(Y_gauss) − H(Y)    (2)

where Y_gauss is a Gaussian random variable with the same variance as Y and H(·) is the differential entropy. According to information theory, among random variables with the same variance, the Gaussian variable has the maximum differential entropy; the stronger the non-Gaussianity of Y, the smaller its differential entropy and the larger J(Y), so negentropy can be used as a non-Gaussian measure of random variables. Since computing differential entropy by equation (2) requires knowing the probability density function, which is impractical, the following approximation is adopted:

J(W) ≈ [E{G(WᵀY)} − E{G(V)}]²    (3)

where G(·) is some non-quadratic function and V is a Gaussian variable with a standard normal distribution. In this way, maximizing J(W) is transformed into maximizing E{G(WᵀY)}: when the de-mixing matrix W is solved, each independent component has the strongest non-Gaussianity and the best separation effect. The fixed-point iteration

w(k+1) = E{Y g(w(k)ᵀY)} − E{g′(w(k)ᵀY)} w(k)    (4)

is used for the calculation, where g(·) is the derivative of G(·). At each step, w(k+1) is normalized by

w(k+1) = w(k+1) / ‖w(k+1)‖    (5)

The specific iterative steps are as follows:
- Step 1: select an initial value w;
- Step 2: calculate w(k+1) according to (4);
- Step 3: normalize w(k+1) according to (5);
- Step 4: if the change in w between adjacent iterations is less than a given value, stop the iteration; otherwise return to Step 2.
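The fixed-point iteration above can be sketched in a few lines of NumPy. This is a minimal one-unit illustration, not the authors' implementation: the mixing matrix, the two sources, and the nonlinearity G(u) = log cosh(u) (so g = tanh and g′ = 1 − tanh²) are assumptions chosen for the demonstration, and the observations are whitened first, as FastICA requires.

```python
import numpy as np

def whiten(X):
    """Centre the observed mixtures and whiten them to unit covariance,
    which FastICA requires before the fixed-point iteration."""
    Xc = X - X.mean(axis=1, keepdims=True)
    d, E = np.linalg.eigh(Xc @ Xc.T / Xc.shape[1])
    return (E / np.sqrt(d)) @ E.T @ Xc

def fastica_one_unit(Y, max_iter=200, tol=1e-6, seed=0):
    """One-unit FastICA (Steps 1-4 above) with G(u) = log cosh(u).
    Y: whitened data of shape (n_mixtures, n_samples)."""
    rng = np.random.default_rng(seed)
    w = rng.standard_normal(Y.shape[0])
    w /= np.linalg.norm(w)                                   # Step 1: initial w
    for _ in range(max_iter):
        wy = np.tanh(w @ Y)                                  # g(w^T y)
        w_new = (Y * wy).mean(axis=1) - (1 - wy ** 2).mean() * w  # Step 2
        w_new /= np.linalg.norm(w_new)                       # Step 3: normalize
        if abs(abs(w_new @ w) - 1.0) < tol:                  # Step 4: converged?
            return w_new
        w = w_new
    return w

# Demo on a hypothetical 2x2 mixture of two non-Gaussian sources
rng = np.random.default_rng(1)
S = np.vstack([rng.uniform(-1, 1, 2000), np.sign(rng.standard_normal(2000))])
X = np.array([[1.0, 0.5], [0.5, 1.0]]) @ S        # observed y = Ws
Y = whiten(X)
w = fastica_one_unit(Y)
s_hat = w @ Y                                     # one recovered component
```

After convergence, the projection w ᵀY closely matches one of the original independent sources up to sign and scale.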

FCM clustering analysis
Clustering analysis [20][21][22] is an unsupervised method that decomposes data into subgroups or clusters based on the similarity between instances. It can be divided into two categories: hard clustering and soft clustering. In hard clustering (such as k-means), an instance xi belongs to one and only one cluster, the one most similar to xi; in fuzzy clustering, an instance on the boundary between multiple clusters does not necessarily belong exclusively to one of them.
The idea is to maximize the similarity between objects assigned to the same cluster while minimizing the similarity between different clusters.
Let the sample set be X = {x1, x2, x3, ..., xn} and the cluster centers be {v1, v2, v3, ..., vc}. The membership degree u_ij of sample x_j (j = 1, 2, ..., n) to cluster center v_i (i = 1, 2, ..., c) minimizes the objective function

J(U, V) = Σ_{i=1}^{c} Σ_{j=1}^{n} u_ij^m ‖x_j − v_i‖²

subject to Σ_{i=1}^{c} u_ij = 1 and 0 ≤ u_ij ≤ 1, where m is a constant that controls the fuzziness of the clustering result and ‖x_j − v_i‖ is the Euclidean distance between the j-th sample and the i-th cluster center. The Lagrange multiplier method yields the iterative update formulas of the model:

v_i = Σ_{j=1}^{n} u_ij^m x_j / Σ_{j=1}^{n} u_ij^m

u_ij = 1 / Σ_{k=1}^{c} (‖x_j − v_i‖ / ‖x_j − v_k‖)^{2/(m−1)}

The FCM iterations run until the membership matrix reaches the required precision, which is measured from one iteration to the next:

‖U^{(r+1)} − U^{(r)}‖ < ε

where U^{(r)} and U^{(r+1)} are the partition matrices at iterations r and r + 1, respectively, ‖·‖ denotes a matrix norm, and the iteration stops when the difference between two successive partitions is less than the predefined accuracy ε.
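A minimal NumPy sketch of the FCM loop described above, alternating the standard centre and membership updates until the partition matrix stabilizes; the two-blob demo data are hypothetical, not the outburst samples.

```python
import numpy as np

def fcm(X, c, m=2.0, eps=1e-5, max_iter=300, seed=0):
    """Fuzzy c-means: alternate the centre and membership updates until
    the change in the partition matrix U drops below eps.
    X: (n_samples, n_features); returns U of shape (c, n_samples) and V."""
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0)                                # each column sums to 1
    for _ in range(max_iter):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)  # centre update
        D = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)
        W = np.fmax(D, 1e-12) ** (-2.0 / (m - 1.0))
        U_new = W / W.sum(axis=0)                     # membership update
        if np.linalg.norm(U_new - U) < eps:           # ||U(r+1) - U(r)|| < eps
            return U_new, V
        U = U_new
    return U, V

# Demo on two hypothetical well-separated groups of samples
rng = np.random.default_rng(2)
X = np.vstack([rng.standard_normal((50, 2)) + 5.0,
               rng.standard_normal((50, 2)) - 5.0])
U, V = fcm(X, c=2)
labels = U.argmax(axis=0)                             # hardened cluster labels
```

Taking the arg-max over memberships hardens the fuzzy partition into class labels, which is how the clustering result is compared with the existing sample labels later in the paper.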

FICA-FCM
FCM clustering is a flexible fuzzy partition scheme: compared with the hard classification of k-means, FCM is a soft classification algorithm. Its mathematical theory is complete, but its shortcomings are also prominent: it is very sensitive to noise and outliers and has poor robustness to noise. The high-order statistical signal processing adopted by FICA extracts useful information from the high-order statistics of non-Gaussian signals, providing a means for many signal processing problems that the second-order statistical methods cannot solve: it finds a nonlinear representation that makes the components statistically independent, or as independent as possible, of the non-Gaussian signals. However, FICA is an unsupervised learning method and cannot extract the correlations between class labels and features, which reduces the accuracy and sensitivity of feature extraction [23][24][25]. Each index feature of the outburst influencing factors plays a different role in the clustering process: some play a leading, positive role in cluster formation, some contribute little, and some introduce noise that negatively affects cluster generation. Features must therefore be selected and extracted: the task is to find the most effective features among the many features of the data set and map the data points from the high-dimensional feature space to a low-dimensional one, making the clusters contained in the data set easier to distinguish. Coupling FICA feature extraction with FCM clustering addresses both weaknesses: FICA supplies independent, noise-reduced components on which FCM is less sensitive to noise and outliers.

SVM
SVM [26][27][28][29][30] is based on VC theory and the structural risk minimization criterion of statistical learning theory. The SVM algorithm combines many techniques and methods, such as kernels, sparse solutions, slack variables, convex quadratic programming, and the maximum-margin hyperplane, and has advantages on small-sample, nonlinear, high-dimensional problems and in avoiding local minima, achieving global optimization, and generalizing well. SVMs can be divided into linearly separable and nonlinear cases. In the linearly separable case, the task is to find the optimal classification line with the largest margin between the two classes; for nonlinearly separable problems, a kernel function realizes a nonlinear transformation so that the pattern classification problem in the original space becomes a pattern classification problem in a higher-dimensional space [41]. According to the characteristics of the data, this paper uses the nonlinear SVM, whose learning process finds the optimal separating hyperplane that maximizes the margin between the two classes. The principle is as follows: x_i is the input information, y_i ∈ {±1} is the output target, and i = 1, ..., l indexes the training samples. To find the separating hyperplane in the new feature space, the hyperplane is assumed linear there, and the linear estimation function is defined as

f(x) = w·φ(x) + b

where φ(x) is a nonlinear mapping from the input space to a high-dimensional feature space, w is the weight vector, and b is the bias. The classification hyperplane must satisfy the constraints

y_i (w·φ(x_i) + b) ≥ 1 − ξ_i,  ξ_i ≥ 0

To measure the degree to which w and b violate the constraints, a non-negative slack variable ξ_i is introduced for each sample, and the optimization objective becomes

min (1/2)‖w‖² + C Σ_{i=1}^{l} ξ_i

where Σ ξ_i is the loss function and C is the penalty parameter.
By introducing Lagrange multipliers α_i, the above optimization problem is transformed into the dual quadratic programming problem

max Σ_{i=1}^{l} α_i − (1/2) Σ_{i=1}^{l} Σ_{j=1}^{l} α_i α_j y_i y_j K(x_i, x_j),  s.t. Σ_{i=1}^{l} α_i y_i = 0, 0 ≤ α_i ≤ C

With a kernel function, SVM can be used in nonlinear classification problems: the nonlinear mapping sends the original data to a linearly separable high-dimensional feature space, and the nonlinear decision function is

f(x) = sign( Σ_{i=1}^{l} α_i y_i K(x_i, x) + b )

where K(x_i, x_j) = φ(x_i)·φ(x_j) is the kernel function. The Gaussian kernel is

K(x_i, x_j) = exp(−g‖x_i − x_j‖²)

where C and g are the penalty and kernel parameters of the SVM. At present, there is no generally effective method in the literature for selecting the kernel function and its parameters, yet appropriate kernel parameter settings have an important impact on prediction accuracy; an optimization algorithm can find the most suitable values. The literature shows that the Gaussian kernel has significant advantages in solving nonlinear problems thanks to its simple, effective, and reliable computation and its ability to realize the nonlinear mapping between input and output data, so the Gaussian kernel is selected as the kernel function of the SVM in this paper.
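As an illustration of an RBF-kernel SVM with hand-picked C and g, the following sketch uses scikit-learn's SVC on a synthetic nonlinear two-class problem; make_moons stands in for the outburst samples, and the parameter values are placeholders rather than the paper's optimized ones.

```python
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import MinMaxScaler
from sklearn.svm import SVC

# Synthetic nonlinear two-class data standing in for the outburst samples
X, y = make_moons(n_samples=200, noise=0.2, random_state=0)

# C is the penalty parameter and gamma the Gaussian-kernel parameter g;
# these values are illustrative placeholders, not the optimized ones
clf = make_pipeline(MinMaxScaler(), SVC(kernel="rbf", C=10.0, gamma=1.0))
scores = cross_val_score(clf, X, y, cv=10)           # ten-fold cross-validation
print(round(scores.mean(), 3))
```

The MinMaxScaler step mirrors the [0, 1] normalization applied to the outburst indexes before training, and the ten-fold cross-validation matches the evaluation protocol described later.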

QPSO
The main defect of PSO is that global convergence cannot be guaranteed and too many parameters must be set, which is not conducive to finding the optimal parameters of the model to be optimized: the change of particle position lacks randomness and easily falls into a local optimum. To solve this problem, Sun et al. proposed the quantum particle swarm optimization algorithm (QPSO) [30], inspired by trajectory analysis of particle swarm optimization and by quantum mechanics. The algorithm abandons the particle's velocity attribute, so the update of a particle's position is independent of its previous motion; randomness is added to the particle positions, and the particles move according to the following iterative formulas:

mbest = (1/M) Σ_{i=1}^{M} pbest_i

p_ij(t) = φ_ij · pbest_ij(t) + (1 − φ_ij) · gbest_j(t)

X_ij(t+1) = p_ij(t) ± α · |mbest_j − X_ij(t)| · ln(1/u)

where gbest is the best position among all pbests in the particle swarm, mbest is the average of the particles' historical best positions pbest_i, X_ij is the position of the i-th particle in dimension j, and M is the size of the particle swarm. p_ij(t) is the local attractor governing the update of the i-th particle's position; u and φ_ij are random values drawn uniformly from [0, 1]; and α, the contraction-expansion factor, is the only parameter controlling the particle convergence speed in the quantum particle swarm algorithm and is very sensitive to the population size and the maximum number of iterations. Here the value range of α is set to (0.5, 1) by searching; combining particle swarm optimization with the position update above yields quantum particle swarm optimization.
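A compact sketch of the update rule above, applied to a benchmark sphere function; the swarm size, iteration count, bounds, and the fixed α = 0.75 are illustrative choices, not the paper's settings.

```python
import numpy as np

def qpso(f, dim, n_particles=30, n_iter=200, alpha=0.75, bounds=(-10.0, 10.0), seed=0):
    """Minimal QPSO: X = p +/- alpha*|mbest - X|*ln(1/u), with local
    attractor p = phi*pbest + (1-phi)*gbest and no velocity term."""
    rng = np.random.default_rng(seed)
    lo, hi = bounds
    X = rng.uniform(lo, hi, (n_particles, dim))
    pbest = X.copy()
    pbest_val = np.array([f(x) for x in X])
    gbest = pbest[pbest_val.argmin()].copy()
    for _ in range(n_iter):
        mbest = pbest.mean(axis=0)                    # mean of personal bests
        phi = rng.random((n_particles, dim))
        u = rng.random((n_particles, dim))
        p = phi * pbest + (1.0 - phi) * gbest         # local attractors
        sign = np.where(rng.random((n_particles, dim)) < 0.5, -1.0, 1.0)
        X = np.clip(p + sign * alpha * np.abs(mbest - X) * np.log(1.0 / u), lo, hi)
        vals = np.array([f(x) for x in X])
        better = vals < pbest_val
        pbest[better], pbest_val[better] = X[better], vals[better]
        gbest = pbest[pbest_val.argmin()].copy()
    return gbest, pbest_val.min()

# Minimize a 5-dimensional sphere function as a benchmark
best_x, best_f = qpso(lambda x: float(np.sum(x ** 2)), dim=5)
print(best_f)
```

Note that no velocity vector appears anywhere in the loop, which is the practical difference from standard PSO noted in the text.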

QPSO-SVM
At present, SVM kernel parameter optimization algorithms easily fall into local optima, the parameter values obtained may not be optimal, and the operation efficiency is low. We use an intelligent kernel parameter optimization algorithm based on QPSO and k-fold cross-validation to obtain the optimal kernel parameter values of the SVM. The specific flowchart of QPSO-SVM is shown in Fig. 2.
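The QPSO-SVM loop can be sketched as follows, assuming scikit-learn for the SVM and cross-validation; the search range for (log10 C, log10 g), the small swarm budget, and the make_moons stand-in data are all illustrative assumptions rather than the paper's configuration.

```python
import numpy as np
from sklearn.datasets import make_moons
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

# Stand-in data for the outburst samples (the real set is not public)
X, y = make_moons(n_samples=150, noise=0.25, random_state=0)

def fitness(p):
    """k-fold CV error of an RBF-SVM with log10-scaled p = (log10 C, log10 g)."""
    C, gamma = 10.0 ** p[0], 10.0 ** p[1]
    return 1.0 - cross_val_score(SVC(kernel="rbf", C=C, gamma=gamma), X, y, cv=5).mean()

# Small-budget QPSO over the 2-d (log10 C, log10 g) search space
rng = np.random.default_rng(0)
n, iters, alpha, lo, hi = 10, 15, 0.75, -2.0, 2.0
P = rng.uniform(lo, hi, (n, 2))
pbest, pbest_val = P.copy(), np.array([fitness(p) for p in P])
gbest = pbest[pbest_val.argmin()].copy()
for _ in range(iters):
    mbest = pbest.mean(axis=0)
    phi, u = rng.random((n, 2)), rng.random((n, 2))
    attract = phi * pbest + (1.0 - phi) * gbest
    sign = np.where(rng.random((n, 2)) < 0.5, -1.0, 1.0)
    P = np.clip(attract + sign * alpha * np.abs(mbest - P) * np.log(1.0 / u), lo, hi)
    vals = np.array([fitness(p) for p in P])
    better = vals < pbest_val
    pbest[better], pbest_val[better] = P[better], vals[better]
    gbest = pbest[pbest_val.argmin()].copy()

best_C, best_gamma = 10.0 ** gbest
print(best_C, best_gamma, 1.0 - pbest_val.min())
```

Searching in log space keeps the swarm from wasting its budget on wildly different magnitudes of C and g, a common design choice when tuning RBF-SVM parameters.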

FICA-FCM + QPSO-SVM comprehensive model
Based on different design ideas and learning strategies, clustering analysis can be used as a separate process that reveals the inherent properties and laws of the data samples by searching for their internal distribution structure, or as a precursor to other learning tasks such as classification, providing the basis for further data analysis. Because of the complexity and diversity of coal and gas outbursts, the sample data also exhibit complex relationships.
Mining the internal relationships of the sample data is very helpful for improving prediction accuracy. When the supervised SVM alone is used to predict coal and gas outburst samples, the classification accuracy is low and some samples are always misclassified, which degrades the classifier's performance. To mine the essential structural rules of the sample data and analyze the structural information and internal regularity of the sample set, the samples are classified correctly according to sample similarity and probability density estimation. In view of the above problems, this paper combines FICA-FCM clustering with the QPSO-SVM classifier: the clustering result is used to check and correct the sample labels before the classifier is trained.

Dataset description and preprocessing
The influencing factors of coal and gas outbursts include geological stress, gas, and the physical properties of the coal seam. The experimental data come from the historical records of the Pingdingshan No. 8 mine in Henan Province. With reference to previous studies [31], the outburst influencing indexes include the following: gas pressure, initial velocity of gas output, initial velocity of gas emission, coal seam firmness coefficient, structural coal thickness, and fault structure complexity. These indexes are operable, extensive, and applicable in engineering practice. To make the experimental results more objective, ten-fold cross-validation is used to verify the classification effect: the data set is randomly divided into ten parts, each part is taken in turn as the testing set while the other nine serve as the training set, the corresponding classification method is trained on the training set and evaluated on the testing set, and the corresponding ten testing results are obtained. In this paper, outburst prediction is a two-class problem (outburst or normal). Because the indexes are acquired in different units with different meanings, their orders of magnitude differ greatly, which affects the convergence speed and accuracy of the algorithm. Before training the model, this paper normalizes the data into the range [0, 1], removing the dimensions and making the indexes more comparable.
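The [0, 1] normalization step can be written per index column as x′ = (x − min)/(max − min); the three rows below are hypothetical values invented for illustration, not the Pingdingshan measurements.

```python
import numpy as np

def minmax_normalise(X):
    """Rescale each index column to [0, 1]: x' = (x - min) / (max - min).
    Columns with zero range are left at 0 to avoid division by zero."""
    lo, hi = X.min(axis=0), X.max(axis=0)
    return (X - lo) / np.where(hi > lo, hi - lo, 1.0)

# Hypothetical rows: gas pressure, initial velocity of gas emission,
# coal seam firmness coefficient (values invented for illustration)
X = np.array([[0.74, 4.1, 0.31],
              [1.95, 9.8, 0.55],
              [0.52, 3.2, 0.28]])
Xn = minmax_normalise(X)
print(Xn.min(axis=0), Xn.max(axis=0))
```

After the rescaling, every index spans exactly [0, 1], so no single index dominates the distance computations in the clustering or the kernel.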

Experimental environment and parameter setting
[Figure: flowchart of the proposed procedure — input sample data; ICA feature extraction; QPSO optimization of the QPSO-SVM model; classifier results; modify results and calculate classification accuracy.]
Statistical results concerning 1) accuracy, 2) precision, 3) sensitivity, and 4) specificity for each approach, along with their standard deviations, are calculated according to the related references, and the runtimes of all approaches are reported as well. To eliminate random factors, each experiment is performed ten times with different random seeds, and the average performance of the ten repetitions is taken as the final result. The compared methods (SVM, KNN, NB) are sensitive to the values of their main controlling parameters. The kernel function of the SVM is the Gaussian kernel, so two parameters, the kernel parameter γ and the regularization parameter C, determine the best classification; we use the combination of QPSO optimization and ten-fold cross-validation to find the parameter structure of the classifier. The parameters of the other classifiers are set according to testing. The parameter settings for feature extraction are as follows: the number of components of FICA is set to 3, and the parameters of KPCA and FA are 4 and 4, respectively.

3.3.1 Performance comparison of combinations of different feature extraction and clustering analysis methods
To verify the effectiveness of the combination of FCM and FICA, we employ different clustering algorithms and different feature extraction methods to predict coal and gas outbursts. Tables 1-3 give the performance comparison results using SVM. Table 1 compares different clustering methods: classification with clustering analysis achieves higher accuracy than classification without it. Among the clustering methods, FCM has the highest accuracy, reaching 92%, followed by the self-organizing map (SOM) neural network and K-means, with values of 90% and 92%, respectively. This indicates that the original labels of the sample data contain errors; the label errors can be corrected by the clustering algorithm, which improves the prediction accuracy. To illustrate the effectiveness of the feature extraction method proposed in this paper, Table 2 gives the comparison results of different feature extraction methods combined with FCM.
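The four statistics above follow directly from the confusion-matrix counts. The sketch below (with illustrative labels, not the paper's results) shows one way to compute them together with the mean and standard deviation over ten repetitions:

```python
import numpy as np

def outburst_metrics(y_true, y_pred):
    """Accuracy, precision, sensitivity and specificity for the two-class
    (outburst = 1, normal = 0) problem, from confusion-matrix counts."""
    y_true, y_pred = np.asarray(y_true), np.asarray(y_pred)
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "precision": tp / (tp + fp) if tp + fp else 0.0,
        "sensitivity": tp / (tp + fn) if tp + fn else 0.0,  # true-positive rate
        "specificity": tn / (tn + fp) if tn + fp else 0.0,  # true-negative rate
    }

# Mean and standard deviation over ten repetitions, mirroring the averaging
# protocol described above (illustrative fixed labels, so the std is 0 here).
runs = [outburst_metrics([1, 1, 0, 0], [1, 0, 0, 0]) for _ in range(10)]
acc_mean = np.mean([r["accuracy"] for r in runs])
acc_std = np.std([r["accuracy"] for r in runs])
```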
We find that the accuracy of FICA-FCM is the highest, reaching 100%, while the accuracies of KPCA and FA are 0.93 and 0.98, respectively. This means that those feature extraction methods cannot extract feature information from the coal and gas outburst sample data as effectively, and that FICA-FCM is well suited to coal and gas outburst prediction. With FICA feature extraction, the feature points of the original signal can be compressed and the redundant information removed; the higher-order moment information of the data set is used to mine the relatively independent information in the original signal, so that feature information hidden in the mixed signals can be separated effectively, improving the modeling of the characteristic difference information. FICA relies only on higher-order statistics, so it can effectively extract non-Gaussian components and global information from the process data. The complexity of the detection model is reduced, the sample data are extracted effectively, and redundancy and noise are eliminated, so the FICA algorithm is well suited to feature extraction for coal and gas outbursts; meanwhile, an effective clustering algorithm built on this feature extraction can further improve the clustering effect and efficiency. Table 3 shows the experimental results of FICA combined with different clustering methods. The accuracy of FCM is the highest, reaching 100%, and the other clustering algorithms are all lower; among them, FCM best matches the characteristics of the sample data and clusters them more effectively. To verify the importance of each algorithm in the proposed feature-extraction-based clustering procedure, we run experiments on the data set with FICA feature extraction alone, FCM alone, and the combination of the two.
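As an illustration of this extraction step, FastICA as implemented in scikit-learn unmixes the indexes into statistically independent, non-Gaussian components via higher-order statistics. The fragment below is a minimal sketch on random placeholder data, keeping 3 components as in the parameter setting above:

```python
import numpy as np
from sklearn.decomposition import FastICA

# Placeholder for the normalized index matrix (60 samples x 6 indexes);
# the real mine records are not public.
rng = np.random.default_rng(0)
X = rng.uniform(size=(60, 6))

# FastICA maximizes non-Gaussianity (a negentropy approximation built on
# higher-order statistics) to recover independent components.
ica = FastICA(n_components=3, random_state=0)
S = ica.fit_transform(X)   # independent components fed to FCM / the classifier
```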
3.3.2 Performance comparison of different stages of the proposed method
Table 4 shows the accuracy and efficiency of each stage of the classifier. The accuracy of FICA-FCM is the highest and its runtime is the shortest: the indexes are 0.99, 0.93, and 1, and the execution time is 0.45 seconds; the accuracy of the FICA method alone on the classifier is 0.75 with an execution time of 3.4 seconds; and the accuracy of the FCM algorithm without feature extraction is 0.98 with an execution time of 3.7 seconds. This shows that feature extraction is necessary because of the redundancy and noise in coal and gas outburst sample data: FICA extracts the nonlinear low-dimensional essential structure, which improves both the effect and the efficiency of FCM. FCM alone is sensitive to the initial values of the data, strongly depends on the quality of the initialization, and easily falls into a local saddle point. K-means is sensitive to the initially selected centroids, so different random seeds give different clustering results, which greatly affects the outcome, and its iterative procedure may reach only a local optimum rather than the global one. For SOM, the number of hidden neurons is difficult to determine, so the neurons are often not fully utilized; neurons far from the learning vectors can never win and become dead nodes, and the learning rate and termination of the clustering network must be set and controlled manually, which affects the learning progress. In contrast, FCM based on FICA can reveal the potential data structure and the correlations between variables, describe complex structural features, and uncover potential dynamic laws. By revealing the inherent low-dimensional geometric properties, the FICA-driven process reduces the number of indicators FCM needs, accelerates clustering, and provides more accurate classification estimates for the SVM classifier; the proposed algorithm is thus better matched to the characteristics of coal and gas outburst sample data.
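A minimal sketch of the FCM iteration itself, in the generic textbook formulation (alternating centre and membership updates with fuzzifier m = 2; not the paper's exact implementation), is:

```python
import numpy as np

def fcm(X, c=2, m=2.0, n_iter=100, seed=0):
    """Fuzzy c-means: alternate between updating cluster centres from the
    fuzzy memberships and memberships from the distances to the centres."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    U = rng.random((n, c))
    U /= U.sum(axis=1, keepdims=True)              # membership rows sum to 1
    for _ in range(n_iter):
        W = U ** m                                  # fuzzified memberships
        centers = (W.T @ X) / W.sum(axis=0)[:, None]
        d = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2) + 1e-12
        # u_ik = d_ik^(-2/(m-1)) / sum_j d_jk^(-2/(m-1))
        U = 1.0 / (d ** (2 / (m - 1)) *
                   np.sum(d ** (-2 / (m - 1)), axis=1, keepdims=True))
    return U, centers

# Toy usage: two well-separated point clouds; hard labels via argmax.
X = np.vstack([np.zeros((10, 2)), np.full((10, 2), 5.0)])
U, centers = fcm(X, c=2)
labels = U.argmax(axis=1)
```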

3.3.3 Comparison of different classifiers
We also compare the computation time of the different feature extraction methods on different classifiers. As Table 9 shows, the running time of our method is 0.45 seconds, less total time than the methods of reference [3] (2.98 s), reference [4] (1.27 s), and reference [32] (2.19 s); only reference [9], at 0.32 s, runs faster than ours. This is because the FICA-FCM method provides more effective classification information for the classifier, improving both the performance and the efficiency of classification. In the classifier design, the proposed SVM employs the QPSO algorithm to improve accuracy at some additional time cost. To sum up, our proposed method achieves higher classification accuracy at lower computation cost than the other classification methods. Hence, we conclude that, owing to its effective feature extraction and higher accuracy, the proposed method is more applicable than the other classification methods for coal and gas outburst classification when few samples are available. However, the proposed model has the following limitations: it has only been validated on one available data set, which comes from the Pingdingshan No.8 mine, so data sets from other mines should be tested to establish better generalization performance; and the paper solves a two-class classification problem, whereas multi-class outburst classification is highly in demand.
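The QPSO search over the SVM parameters (C, γ) can be sketched as follows. This is a generic quantum-behaved PSO formulation with assumed settings (log-scale search box, 8 particles, 15 iterations, toy labels), not the paper's tuned configuration:

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.uniform(size=(40, 3))
y = (X[:, 0] + X[:, 1] > 1.0).astype(int)   # toy separable labels

def fitness(pos):
    # pos holds log10(C) and log10(gamma); fitness = mean CV accuracy.
    C, gamma = 10.0 ** pos
    return cross_val_score(SVC(kernel="rbf", C=C, gamma=gamma), X, y, cv=3).mean()

n_particles, n_iter, dim = 8, 15, 2
lo, hi = np.array([-2.0, -3.0]), np.array([3.0, 2.0])   # search box (log scale)
pos = rng.uniform(lo, hi, size=(n_particles, dim))
pbest, pbest_fit = pos.copy(), np.array([fitness(p) for p in pos])
gbest = pbest[pbest_fit.argmax()]
for t in range(n_iter):
    beta = 1.0 - 0.5 * t / n_iter            # contraction-expansion coefficient
    mbest = pbest.mean(axis=0)               # mean of all personal bests
    for i in range(n_particles):
        phi = rng.random(dim)
        attractor = phi * pbest[i] + (1 - phi) * gbest
        u = 1.0 - rng.random(dim)            # in (0, 1], keeps log finite
        sign = np.where(rng.random(dim) < 0.5, 1.0, -1.0)
        pos[i] = np.clip(attractor + sign * beta * np.abs(mbest - pos[i])
                         * np.log(1.0 / u), lo, hi)
        f = fitness(pos[i])
        if f > pbest_fit[i]:
            pbest_fit[i], pbest[i] = f, pos[i]
    gbest = pbest[pbest_fit.argmax()]
best_C, best_gamma = 10.0 ** gbest           # parameters handed to the final SVM
```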

Conclusion
The classification effect of existing models for coal and gas outbursts is not very good: the classification results of some samples are always wrong, there may be wrongly labeled samples, and the wrong sample data easily generate noise that degrades the classifier. A clustering analysis model can find the rules in the sample data characteristics and serve as a pretreatment for SVM classification, improving classification accuracy, but existing clustering analysis models have low computational efficiency, high complexity, and low accuracy. FICA feature extraction can reveal the potential data structure of the samples and the correlations between variables, describe complex structural characteristics, and discover potential dynamic laws; the FCM algorithm can mine and analyze the structural information and internal regularity of the sample data set, realizing correct classification of samples according to sample similarity and probability-density estimation. The combination of FICA and FCM improves the clustering effect and efficiency for coal and gas outbursts. On the basis of the classification results obtained from this clustering analysis and feature extraction, the clustered labels are compared with the original labels and the classifier results are modified: considering the sample classification error rate of the SVM classifier, wrongly labeled samples with low classification accuracy are found and their labels are corrected, improving classification performance. The optimal SVM classification model is worth further study to reduce the influence of noise caused by annotation errors on prediction accuracy. Although the proposed model gives significant results for coal and gas outbursts, this study still has some limitations.
In future work, we will seek more suitable feature extraction and clustering analysis methods and modify existing classifiers to improve the comprehensive performance of coal and gas outburst prediction.

Availability of data and material
The raw/processed data required to reproduce these findings cannot be shared at this time as the data also forms part of an ongoing study. Some or all data, models, or code generated or used during the study are available from the corresponding author by request.

Declaration of Competing Interest
The authors declare no competing financial interest.