Research on CNN Coal and Rock Recognition Method Based on Hyperspectral Data


To address the problem of coal gangue identification at fully mechanized mining faces and in coal washing, this article proposes a CNN coal and rock identification method based on hyperspectral data. First, coal and rock spectral data are collected with a near-infrared spectrometer, and the 120 sets of collected data are filtered with four methods: first-order differential (FD), second-order differential (SD), standard normal variate transformation (SNV), and polynomial smoothing. This preprocessing of the coal and rock reflectance spectra enhances the spectral reflectance and absorption features and effectively removes the spectral curve noise generated by instrument performance and environmental factors. A CNN model is then constructed. By comparing the accuracy obtained with combinations of three parameters, the most appropriate learning rate, number of feature extraction layers, and dropout rate are selected to produce the best CNN classifier for coal and rock recognition from hyperspectral data. Experiments show that the recognition accuracy of the one-dimensional convolutional neural network model proposed in this paper reaches 94.6%, which is higher than that of BP (57%), SVM (72%) and DBN (86%), verifying the advantages and effectiveness of the proposed method.


Introduction
In the process of underground coal mining, it is necessary to distinguish the interface between the coal seam and the rock layer, and use this as a basis to control the lifting of the rocker arm of the shearer. However, how to quickly and accurately identify the rock and coal seam is still a technical problem in the field of coal mining.
In recent years, domestic and foreign scholars have carried out a large number of experiments and theoretical studies and have proposed a variety of coal and rock identification methods. The Pittsburgh Research Center of the U.S. Bureau of Mines first proposed a coal and rock identification technology based on infrared detection. Because cutting rock and cutting coal heat the picks to different temperatures, a highly sensitive infrared temperature sensor measures the temperature of the shearer picks to judge whether a coal seam or a rock formation is being cut [1]. Wang Haijian et al. proposed a coal-rock interface perception and recognition method based on multi-sensor information fusion. Considering picks with different degrees of wear, they used the vibration, current, acoustic emission and infrared flash temperature signals recorded while cutting different proportions of coal and rock to establish a sample library of multiple cutting signal characteristics under different degrees of pick wear. An "and" decision criterion constructed on the basis of D-S theory then identifies the coal-rock interface accurately [2]. However, this method relies on the relative movement of the pick and the rock layer or coal seam. When the mechanical properties of the coal seam and the rock layer differ little, the accuracy is extremely low, so the method is difficult to apply in practice. Si Lei et al. proposed a diagnosis method based on a Probabilistic Neural Network (PNN) and the Fruit Fly Optimization Algorithm (FOA) using the vibration of the rocker transmission part [3]. However, online real-time processing of high-frequency signals places very high demands on equipment that the computer carried by the shearer can hardly meet, and the vibration signal changes greatly with the shearer's pose. Zhang Ning et al. proposed a coal-rock interface recognition method based on principal component analysis and a BP neural network. This method first extracts the time-domain signal of the shearer drum torque, then compresses it with principal component analysis, and finally feeds the resulting signal to the BP neural network for coal and rock identification [4]. Yang En et al. established a support vector classification (SVC) model using two in-situ coal and rock data sets, and the recognition accuracy was greatly improved [5].

With the development of deep learning, many scholars have applied image recognition technology to coal and rock recognition. Cross-sectional images of coal seams and rock formations are obtained first, and image enhancement and denoising techniques are then used to extract features and identify coal and rock [6][7][8][9]. Zhang Bin et al. combined the regression-based deep learning target detection algorithm YOLOv2 with a linear imaging model and used it to intelligently identify and locate coal and rock in images collected underground. The results show that the recognition accuracy of YOLOv2 for coal and rock reached 78% [10].
Near-infrared reflectance spectroscopy is efficient, rapid, accurate, and non-destructive to the sample. Its principle is that the near-infrared spectral region coincides with the absorption region of the combination and overtone bands of the hydrogen-containing groups (OH, NH, CH) in organic molecules, so the near-infrared spectrum obtained by scanning a sample with a near-infrared spectrometer carries the characteristic information of these groups. The application of this technology in remote sensing is very mature, and it is now also used in the quantitative analysis and detection of coal, minerals, soil, etc. [11][12][13][14]. Farrand, W.H., et al. used methods including the expert system/spectral feature fitting program MICA (Material Identification and Characterization Algorithm) and low-abundance substance detection to match the absorption features in image spectra with the features in a user-defined spectral library, and tracked acid-producing minerals and trace metals spreading from mines in northwestern India [15]. Qi, Y.B., et al. used Savitzky-Golay and continuum removal methods to smooth soil spectral data and combined the spectral angle cosine (SAC) algorithm with the spectral correlation coefficient (SCC) algorithm to classify soil spectra [16]. Scafutto, R.D.M., et al. used airborne hyperspectral thermal infrared data to detect petroleum hydrocarbons in continental areas [17]. Zhao Kai [18] used principal component analysis and a two-layer clustering method, self-organizing map neural network with fuzzy C-means clustering (SOM-FCM), to optimize the coal samples and reduce the dimension of coal near-infrared spectroscopy data, and built a coal ash prediction model based on a GA-BP neural network, which effectively improved the accuracy of model learning. Lei Meng [19] studied a variety of learning algorithms to improve the quality of the modeling spectral data and the performance of machine-learning prediction models for near-infrared spectroscopy analysis of coal ash. With rising requirements for intelligent tunneling technology and the development of big data technology, the number of samples has exploded, which places higher demands on coal and rock identification from hyperspectral data.
The authors collected 120 block samples of carbonaceous shale and bituminous coal from the same fully mechanized underground mining face. In the laboratory, near-infrared (1000-2500 nm) reflectance spectra were collected at a distance of 1.5 m from the surface of the block samples. A one-dimensional convolutional neural network (1D-CNN) coal and rock prediction model is established based on the coal ash yield and its reflectance spectrum.
2 Reflectance spectrum data collection of coal and rock samples

Experimental conditions
A total of 120 block samples of carbonaceous shale and bituminous coal were collected at the junction of the roof and the coal seam of the same underground working face in one coal mine. The samples, 96 coal and 24 rock, were stored in ziplock bags. All samples are black and look relatively similar. The reflectance spectrum of each sample was collected in the laboratory on a flat surface of the sample. To reduce the influence of the bidirectional reflectance characteristics of the measured coal and rock materials [20][21][22], a 100 W tungsten halogen spotlight illuminated the center of the selected flat surface at an incident angle of 90°, forming a round light spot with a diameter of about 10 cm and an illuminance of about 20 000 lux. The Dutch NeoSpectra near-infrared spectrometer was used for spectrum collection. Its spectral wavelength range is 1 000-2 500 nm and its spectral resolution is 8 nm.
First, a PTFE whiteboard is used for reflectance reference calibration. A coupling lens collimates the beam. The coupling lens is connected to the head end of a quartz optical fiber, with a laser pointer tied beside it, and the tail end of the fiber is connected to the spectrometer for spectrum collection. The collimator is fixed in place and aimed at the detection target. Its axis is adjusted to be perpendicular to the sample surface at the center of the light spot, and the distance between the collimator and the center of the spot is l = 1.5 m [5]. As shown in the experimental diagram in Fig. 1, the spectrum collected by the spectrometer is the average reflectance spectrum of the circular area on the sample surface that is subtended by the field angle of the collimating lens. The computer is connected to the spectrometer via USB 3.0 to collect the reflectance spectrum curves. The spectrum curves are shown in Fig. 2, after which the spectra are preprocessed.

... absorption valleys, which are mainly caused by the hydroxyl groups in the gangue. The spectral curve of bituminous coal has low reflectivity over the whole waveband and changes relatively gently. Beyond 1200 nm its reflectance changes smoothly, as does that of the gangue. In the 1850-2050 nm and 2150-2350 nm bands there are absorption valleys caused by hydroxyl groups. The main differences between the spectra of the two sample types are: ① In 900-2500 nm, the reflectivity of the gangue is higher than that of coal overall. ② The spectral reflectance of gangue rises rapidly at 1900-2150 nm with a large slope, while the slope of coal in this band is small. ③ In the near-infrared band 2300-2500 nm, the reflectivity of the gangue decreases, while that of most coal rises or remains basically unchanged.
2.2.1. Differential processing
Differentiating the spectral data can eliminate the influence of background drift. The first derivative eliminates a constant background offset, and the second derivative eliminates a linear background drift. In the computer, each spectrum is stored as a two-dimensional array (one dimension stores the abscissa of the spectrum, which is the wavelength, and the other stores the absorbance or spectral response value). When the number of spectral points is p and the wavelength interval is the same for all spectra, the spectral information of n samples can be combined and stored in an n×p data matrix. The spectrum can be differentiated with a multi-point numerical differentiation formula:

$$ y'_i = \frac{-2y_{i-2} - y_{i-1} + y_{i+1} + 2y_{i+2}}{10\,\Delta\lambda} \tag{1} $$

$$ y''_i = \frac{2y_{i-2} - y_{i-1} - 2y_i - y_{i+1} + 2y_{i+2}}{7\,\Delta\lambda^{2}} \tag{2} $$

Formulas (1) and (2) are obtained from the 5-point quadratic smoothing formula, where $y$ is the reflectivity and $\Delta\lambda$ is the wavelength interval.
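The 5-point differentiation described here can be sketched as follows. This is a minimal illustration, assuming that formulas (1) and (2) use the standard coefficients of a quadratic least-squares fit over five equally spaced points. The function name `five_point_derivatives` and the synthetic spectrum are hypothetical and are not taken from the experiment.

```python
def five_point_derivatives(y, step):
    """First and second derivatives from a 5-point quadratic least-squares fit.

    y    : reflectance values on an equally spaced wavelength grid
    step : wavelength interval of the grid
    Only interior points i = 2 .. len(y) - 3 are returned, because the
    formula needs two neighbours on each side.
    """
    first, second = [], []
    for i in range(2, len(y) - 2):
        a, b, c, d, e = y[i - 2], y[i - 1], y[i], y[i + 1], y[i + 2]
        first.append((-2 * a - b + d + 2 * e) / (10 * step))
        second.append((2 * a - b - 2 * c - d + 2 * e) / (7 * step ** 2))
    return first, second


# Synthetic check (not measured data): a parabola on a hypothetical 8 nm grid.
step = 8.0
offsets = [step * k for k in range(9)]
spectrum = [0.5 + 2e-4 * t + 1e-6 * t ** 2 for t in offsets]
fd, sd = five_point_derivatives(spectrum, step)
```

Because the fit is quadratic, the sketch reproduces the derivatives of a parabola exactly, which makes a convenient sanity check before applying it to measured spectra.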

2.2.2. Standard Normal Variate (SNV)
Whether a sample is solid or liquid, it is difficult to achieve an ideal uniform state. The inhomogeneity of the sample scatters light as it passes through or reflects back from the sample, and this scattering introduces errors into the sample spectrum. SNV can be used to correct spectral errors caused by scattering. The method assumes that, within each spectrum, the absorbance values at the wavelength points follow a certain distribution (such as a normal distribution). Based on this assumption, each spectrum is preprocessed to bring it as close as possible to the "ideal" spectrum, that is, the spectrum free of scattering errors. SNV subtracts from the original spectrum the average absorbance of all its spectral points and then divides by the standard deviation S of the spectrum. In essence it standardizes the original spectral data, that is:

$$ x_{ik}^{\mathrm{SNV}} = \frac{x_{ik} - \bar{x}_i}{S_i}, \qquad \bar{x}_i = \frac{1}{p}\sum_{k=1}^{p} x_{ik}, \qquad S_i = \sqrt{\frac{\sum_{k=1}^{p}\left(x_{ik} - \bar{x}_i\right)^{2}}{p-1}} $$

In the formula, $i = 1, 2, \ldots, n$ and $k = 1, 2, \ldots, p$, where n is the number of samples and p is the number of spectral points.
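As a minimal sketch of the SNV transformation described in this subsection, the snippet below standardizes each spectrum (each row of the n×p matrix) by its own mean and standard deviation. The use of the sample standard deviation (divisor p - 1) is an assumption, and the two short spectra are made-up values for illustration only.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard normal variate transform of one spectrum (one matrix row)."""
    m = mean(spectrum)
    s = stdev(spectrum)  # sample standard deviation, divisor p - 1 (assumed)
    return [(x - m) / s for x in spectrum]


def snv_matrix(spectra):
    """Apply SNV independently to every row of an n x p list of lists."""
    return [snv(row) for row in spectra]


# Made-up spectra: the second row is the first one scaled by 2, which mimics
# a purely multiplicative scattering effect.
spectra = [[0.10, 0.12, 0.15, 0.11],
           [0.20, 0.24, 0.30, 0.22]]
corrected = snv_matrix(spectra)
```

After the transform both rows coincide, which illustrates how SNV removes a multiplicative scale difference between spectra of the same shape.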

2.2.3. Polynomial smoothing filter
To effectively remove the spectral curve noise caused by instrument performance and environmental factors, the authors use a polynomial smoothing filter algorithm to filter and denoise the experimentally collected spectral data. In the smoothing formula, the input is the vector of reflectance spectrum data points with λ as the center wavelength point and r as the interval range of the wavelength points; m is the sample number; A is the smoothing matrix, calculated from the power-function polynomial basis matrix over the interval around the center wavelength point; and Y is the resulting smoothed spectral vector.
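One way to realise such a polynomial smoothing filter is a least-squares polynomial fit in a sliding window (a Savitzky-Golay-style filter). The sketch below is an illustration under assumptions, not the authors' implementation: `r` is taken to be the window half-width, the polynomial degree is set to 2, and only the row of the smoothing matrix that yields the value at the window center is computed. All function names are hypothetical.

```python
def solve(a, b):
    """Solve the small linear system a x = b by Gauss-Jordan elimination."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        # Partial pivoting: move the largest entry of this column to the diagonal.
        pivot = max(range(col, n), key=lambda k: abs(aug[k][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for k in range(n):
            if k != col:
                factor = aug[k][col] / aug[col][col]
                aug[k] = [x - factor * y for x, y in zip(aug[k], aug[col])]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def smoothing_weights(r, degree):
    """Weights giving the fitted polynomial's value at the window center."""
    # Power-function basis matrix B with rows [1, t, t^2, ...] for t = -r .. r.
    basis = [[t ** k for k in range(degree + 1)] for t in range(-r, r + 1)]
    # Normal matrix B^T B.
    normal = [[sum(row[i] * row[j] for row in basis) for j in range(degree + 1)]
              for i in range(degree + 1)]
    # The center value is the constant coefficient, so solve (B^T B) z = e0
    # and form the weights as B z.
    z = solve(normal, [1.0] + [0.0] * degree)
    return [sum(bk * zk for bk, zk in zip(row, z)) for row in basis]


def smooth(spectrum, r=2, degree=2):
    """Smooth interior points; the r points at each edge are left unchanged."""
    weights = smoothing_weights(r, degree)
    out = list(spectrum)
    for i in range(r, len(spectrum) - r):
        out[i] = sum(w * spectrum[i - r + k] for k, w in enumerate(weights))
    return out


weights = smoothing_weights(2, 2)
smoothed = smooth([float(t * t) for t in range(7)])
```

For a 5-point window and degree 2 the weights reduce to the familiar (-3, 12, 17, 12, -3)/35, and a spectrum that is already a quadratic passes through the filter unchanged.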

CNN prediction model based on reflectance spectrum
Recently, one-dimensional CNNs have been proposed to process one-dimensional signals and have achieved excellent performance and high efficiency. In a relatively short period of time, they have become popular in various signal processing applications, such as early arrhythmia detection in electrocardiogram (ECG) beats, structural damage detection and high-power engine fault monitoring. Compared with a two-dimensional CNN, a one-dimensional CNN can directly process 1D signals and automatically learn complex features from training samples. Compared with traditional fault diagnosis methods, it can process the raw data directly in an end-to-end manner. This improves the flexibility of the classification model and reduces the dependence on expert knowledge.

Principle of one-dimensional CNN
A one-dimensional CNN is mainly composed of convolution layers, pooling layers and a fully connected layer. The general one-dimensional CNN architecture is shown in Fig. 3. A one-dimensional signal is sent to the input layer of the one-dimensional CNN. A convolution operation is carried out between the input signal and the corresponding convolution kernel to generate the input feature map. The activation function is then applied to generate the output feature map of the convolution layer. The output of the convolution layer can be expressed as:

$$ x_j^{l} = f\left(\sum_{i \in M_j} x_i^{l-1} * k_{ij}^{l} + b_j^{l}\right) $$

In the equation, $x_j^{l}$ is the $j$-th output feature map of layer $l$, $M_j$ is the set of input feature maps, $k_{ij}^{l}$ is the convolution kernel, $b_j^{l}$ is the bias, $*$ denotes convolution and $f(\cdot)$ is the activation function.

After the convolution layer, a pooling layer is usually used, which not only reduces the dimension of the features extracted by the preceding convolution layer and the calculation cost, but also provides basic translation invariance for the features. The equation is as follows:

$$ x_j^{l} = f\left(\beta_j^{l}\,\mathrm{Max}\left(x_j^{l-1}\right) + b_j^{l}\right) $$

In the equation, $\mathrm{Max}(\cdot)$ is a sub-sampling function (maximum sampling is selected in this paper), $\beta_j^{l}$ is the weight coefficient and $b_j^{l}$ is the bias.

The output of each neuron in the pooling layer becomes the input of each neuron in the fully connected layer. Generally, the fully connected layer acts as the classifier of the whole one-dimensional CNN.
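The convolution, activation and pooling steps described in this subsection can be illustrated with a small self-contained sketch. It assumes a single input channel, a "valid" convolution implemented as cross-correlation (as is common in CNN libraries), ReLU as the activation, and max pooling with window 2. The signal and kernel values are arbitrary illustrative numbers, not learned weights.

```python
def conv1d(signal, kernel, bias=0.0):
    """'Valid' 1D convolution in the CNN sense (cross-correlation, no flip)."""
    k = len(kernel)
    return [sum(signal[i + j] * kernel[j] for j in range(k)) + bias
            for i in range(len(signal) - k + 1)]


def relu(values):
    """Rectified linear activation applied element by element."""
    return [max(0.0, v) for v in values]


def max_pool(values, size=2):
    """Non-overlapping max pooling; a trailing partial window is dropped."""
    return [max(values[i:i + size])
            for i in range(0, len(values) - size + 1, size)]


# Arbitrary illustrative input and kernel.
signal = [0.2, 0.4, 0.1, 0.5, 0.9, 0.3, 0.2, 0.6]
kernel = [1.0, 0.0, -1.0]

feature_map = relu(conv1d(signal, kernel))  # output of the convolution layer
pooled = max_pool(feature_map)              # halved by the pooling layer
```

In a full network the flattened pooled features of the last layer would be passed to the fully connected classifier.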

Influence of hyperparameter settings on classification accuracy
This paper proposes a one-dimensional convolutional neural network coal and rock recognition model, shown in Fig. 4, which takes raw hyperspectral data as input samples. The network takes no other features as input, only the raw spectral data samples. In the feature extraction stage, four identical feature extraction layers extract features from each input sample. Each feature extraction layer consists of two convolution layers, a batch normalization layer, a ReLU activation layer and a pooling layer. After the four feature extraction layers, a flatten layer transforms the two-dimensional feature matrix composed of one-dimensional feature maps into a one-dimensional feature vector for the classifier. Within each feature extraction layer, the data pass through the batch normalization layer after the two convolutions, so that they are normalized before entering the ReLU activation layer, which improves the speed, performance and stability of the neural network. The pooling layer uses max pooling, which keeps the maximum activation among neighboring neurons in each feature map and halves the number of values in each feature map, reducing the complexity of the network and the risk of over-fitting.
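To make the effect of the repeated halving concrete, the following sketch tracks the length of a feature map through the feature extraction layers. The kernel size of 3, the padding modes, and the 256-point input are hypothetical values chosen only for illustration, since the text does not state them.

```python
def feature_length(n_points, n_blocks=4, kernel=3, padding="same", pool=2):
    """Length of each feature map after n_blocks feature extraction layers.

    Each block applies two convolutions followed by max pooling that halves
    the length. With "same" padding a convolution keeps the length; with
    "valid" padding it shortens the map by kernel - 1 points.
    """
    length = n_points
    for _ in range(n_blocks):
        for _ in range(2):  # two convolution layers per block
            if padding == "valid":
                length -= kernel - 1
        length //= pool     # max pooling halves the feature map
    return length


def flattened_size(n_points, channels, **kwargs):
    """Size of the vector produced by the flatten layer (channels x length)."""
    return channels * feature_length(n_points, **kwargs)


# Hypothetical example: 256 spectral points and 32 feature maps in the last block.
length_same = feature_length(256)
length_valid = feature_length(256, padding="valid")
vector_size = flattened_size(256, channels=32)
```

With four halvings the per-map length drops by a factor of 16, which is why the flatten layer yields a compact vector even for long spectra.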
To study the influence of the hyperparameters of the one-dimensional CNN on classification performance, we consider three factors: (A) learning rate, (B) number of feature extraction layers, and (C) dropout rate. Each factor has five levels, with ranges selected according to past experience. The levels of each test factor are shown in Table 1. The designed tests reveal the influence of the different hyperparameters on the classification performance of the one-dimensional CNN and identify the optimal parameter combination. The test results are shown in Fig. 5. The trend of the influence of each factor on each evaluation index is shown in Fig. 8. As the learning rate increases, accuracy first increases significantly and then decreases, and it is highest at a learning rate of 0.03, as shown in Fig. 8(a). As shown in Fig. 8(b), when the number of feature extraction layers increases from 1 to 5, the classification accuracy first increases and then decreases, which shows that too many feature extraction layers lead to over-fitting and harm generalization. In Fig. 8(c), the classification accuracy is highest when the dropout rate is 0.425. Taking classification accuracy as the index, we therefore train the CNN with a learning rate of 0.03, four feature extraction layers, and a dropout rate of 0.425. With this dropout rate, 42.5% of the nodes are randomly omitted during training to reduce over-fitting, which would otherwise lead to high training accuracy and low test accuracy. These hyperparameters are used to train a new one-dimensional CNN. The classification accuracy of the model is 95.30%, which only shows that the hyperparameter set is optimal. Evaluated on the test set, the final classification accuracy is 94.60%. The experimental results of the control variable tests are shown in Fig. 6.
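The role of the dropout rate can be illustrated with a short sketch. It assumes the common "inverted dropout" variant, in which the surviving activations are rescaled by 1 / (1 - rate) during training. The text does not say which variant was used, and the activation values and random seed here are arbitrary.

```python
import random


def dropout(activations, rate, rng):
    """Inverted dropout for training: zero each node with probability `rate`
    and rescale the survivors so the expected activation is unchanged."""
    keep = 1.0 - rate
    return [a / keep if rng.random() >= rate else 0.0 for a in activations]


rng = random.Random(0)            # fixed seed so the sketch is repeatable
activations = [1.0] * 10000       # arbitrary illustrative activations
dropped = dropout(activations, 0.425, rng)

# Fraction of nodes omitted in this pass; it should be close to the rate.
zero_fraction = dropped.count(0.0) / len(dropped)
```

At test time dropout is switched off, so with this variant no further rescaling is needed when the model is evaluated.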

Comparative test of neural network models
To verify the effectiveness of the CNN-based method, it is compared with the commonly used traditional machine learning methods BP and SVM and with the deep learning method DBN, which also performs well in coal and rock identification. To ensure comparability between the methods, the optimizer parameter configuration, cost function and activation function used by the 1D-CNN and DBN are kept consistent. The SVM uses a Gaussian kernel function, with the penalty coefficient and other parameters at their default values. Each method is tested 10 times. The results are shown in Fig. 7.
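A repeated-test comparison of this kind can be summarised by computing the accuracy of each run and then the mean and spread over the runs. The sketch below shows one way to do this. The labels and run accuracies are toy values for illustration and are not results from this paper.

```python
from statistics import mean, pstdev


def accuracy(y_true, y_pred):
    """Fraction of samples whose predicted label equals the true label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def summarize(run_accuracies):
    """Mean and population standard deviation over repeated test runs."""
    return mean(run_accuracies), pstdev(run_accuracies)


# Toy labels for a single run (not experimental data).
y_true = ["coal", "coal", "rock", "coal"]
y_pred = ["coal", "rock", "rock", "coal"]
single_run = accuracy(y_true, y_pred)

# Toy accuracies from two hypothetical runs of one method.
avg, spread = summarize([0.9, 1.0])
```

Reporting the spread alongside the mean makes it easier to judge whether a gap between two methods exceeds the run-to-run variation.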
With every method applied directly to the original data, the following conclusions can be drawn from Fig. 7: ① Compared with the other models, the one-dimensional CNN method used in this paper performs best, with an accuracy of 94.6%. ② The accuracies of 1D-CNN and DBN were 94.6% and 87%, respectively, and the difference between them was not very large. ③ For complex classification problems, the traditional machine learning methods BP and SVM perform clearly worse than the deep learning methods, with an accuracy of 75% for SVM and lower for BP. The reason is that traditional shallow machine learning algorithms have limited feature extraction ability on complex classification problems and cannot accurately characterize the mapping relationship between the data. In conclusion, the one-dimensional CNN used in this paper is more advantageous and effective than the other methods.

Fig. 7 The accuracy of each method in ten tests