Artificial Intelligence in Source Discrimination of Mine Water: A Deep Learning Algorithm for Water Source Discrimination

With increasing coal mining depth, the source of mine water inrush becomes increasingly complex. The problem of distinguishing the source of mine water in mines and tunnels is addressed here by studying the hydrochemical components of the Pingdingshan coalfield and applying an artificial intelligence (AI) method to discriminate the source of the mine water. In total, 496 mine water samples were collected. Six ion contents of the mine water are used as the input data set: Na⁺+K⁺, Ca²⁺, Mg²⁺, Cl⁻, SO₄²⁻ and HCO₃⁻. The mine water in the Pingdingshan coalfield is classified into surface water, Quaternary pore water, Carboniferous limestone karst water, Permian sandstone water, and Cambrian limestone karst water. Each type of water is encoded with a number from 0 to 4, and the one-hot encoding method is applied to these numbers to form the output set. On the basis of the hydrochemical data processing, a deep learning model was designed and trained on the hydrochemical data. Ten new samples of mine water were tested to determine the precision of the model; nine were predicted correctly. The deep learning model presented here provides significant guidance for the discrimination of mine water sources.


Background
With increasing coal mining depth, the source of mine water inrush becomes increasingly complex. Water inrush can cause serious mine disasters, especially under the complicated hydrogeological conditions found in parts of China, which are uncommon elsewhere in the world 1 . Therefore, rapid and accurate discrimination of the source of water inrush is important and necessary both for resuming production and for rescuing miners 2 .
The upcoming technological revolution has been termed Industry 4.0. Artificial intelligence (AI) has already been used for parameter identification of groundwater systems, groundwater management and mine hydrogeology. Both today and in the foreseeable future, it is important to take advantage of new technological developments and innovations in the source discrimination of mine water inrush. The development that has taken the world by storm in the past few years is artificial intelligence, which has been widely adopted in many fields, such as computer vision, intelligent robots, natural language processing and data mining 3 . As an important branch of AI, deep learning is a hot topic in various fields because of its strong ability to automatically extract high-level representations from complex data, and it has been applied widely in natural science, social science and engineering 4 . The value of source discrimination of mine water in the prevention and control of mine water hazards has been well established over the past several decades 5 .
Hydrochemistry and mathematical methods are widely used to identify water sources in hydrogeology. The proportions of ions such as Na⁺, K⁺, Ca²⁺ and Mg²⁺ differ greatly between the aquifers of a mine. However, even within the same aquifer, the hydrochemical ion contents can differ considerably 6 . Therefore, mathematical geology methods, such as Bayes discriminant analysis and principal component analysis, are used for the source identification of mine water. Characteristic ion contrasts and ion proportion coefficients have been applied to aquifers with distinct chemical characteristics to establish a characteristic-index discrimination system 7 .
Because of its artificial neural network structure, deep learning excels at identifying patterns in unstructured data such as images, sound, video, and text. As a result, deep learning is rapidly transforming many industries, including healthcare, energy, finance, and transportation.
These industries are rethinking their traditional business processes 8 . Therefore, studying the source discrimination of mine water with artificial intelligence is of great importance. In this paper, deep learning algorithms are used to process the main ionic composition of groundwater to better discriminate the source of mine water 9 .
The organization of the paper is as follows. Section 2 presents the geological and hydrogeological conditions of the study area. The source discrimination of mine water in the framework of the DNN model is introduced in detail in Section 3. The results of deep learning for the source discrimination of mine water are demonstrated in Section 5. This paper closes with some conclusions and final remarks.

Outline of the coalfield
The Pingdingshan coalfield (113°00′-114°00′E, 33°30′-34°00′N), located in the central and western parts of Henan Province, China (Fig. 1), is the third largest coal producer in China. The coalfield is approximately 40 km long E-W and 20 km wide N-S. It lies in a low hilly area and is divided into eastern and western parts by the Guodishan fault. Structurally, it is a large syncline with symmetric, gently dipping limbs. The coal-bearing sediments are mostly Permian in age, comprising sandstone, siltstone and carbonaceous shale, and are overlain by Neogene, Paleogene and Quaternary deposits. The entire sequence is underlain by Cambrian karstic limestone (Fig. 1).

Major structures
The main coal-bearing measures are dominated by strike-parallel compressional structures. Of these, most folds and faults are concentrated within a narrow zone, known locally as a compressional zone or disturbance zone, which, together with the Likou syncline, has been interpreted as having formed during Indosinian (Late Triassic) orogenic compression.

Stratigraphy
The exposed strata in the Pingdingshan coalfield, from oldest to youngest, are the Archean metamorphic rock series, Upper Proterozoic Sinian, Lower Paleozoic Cambrian-Ordovician, Upper Paleozoic Carboniferous-Permian, Mesozoic Triassic, and Cenozoic Paleogene, Neogene and Quaternary, as shown in Fig. 2. The main coal-bearing strata in the study area are Carboniferous-Permian.

Hydrogeological background
The research area is situated in a transitional zone from the warm temperate zone to the subtropical zone, with a long-term average precipitation of 747.4 mm/year, mainly concentrated between July and September. The geomorphology in the east and south is an alluvial plain with a sediment thickness of 200~500 m and a ground elevation of +75 to +80 m. Elsewhere, with surface elevations varying from 900 m to 1040 m, the topography is low in the southeast and high in the northwest. Influenced by the topography, the surface water is mainly distributed in the south and north of the mining area, namely, the Shahe River, Ruhe River, Zhanhe River and Baiguishan Reservoir. The Ruhe River and the Shahe River are perennial rivers that lie on the northern and southern margins of the study area. There are also some seasonal rivers and man-made ditches, such as the Zhanhe River, Beigan Canal and Xigan Canal. The riverbed cuts into Cambrian limestone or Neogene marl, which provides a certain recharge to the limestone groundwater of the Qikuang mine in the southwest of the Pingdingshan coalfield.
The main aquifer is the limestone aquifer of the Taiyuan Formation. On the basis of borehole pumping test data, the unit water inflow of the Taiyuan Formation karst aquifer is 0.00018~0.3569 L/(s·m), and the permeability coefficient is 0.0076~3.047 m/day.

Data
In the Pingdingshan coal mine, 496 mine water data points were collected. Because of the large amount of data, only a subset is shown in Table 2. Table 2 clearly shows that the ion contents span five orders of magnitude. Such differences in scale make the raw values difficult to compare directly, so the data must be standardized, that is, brought to a common scale and format. The raw data are normalized column by column according to Eq. (1):

Z_ij = (x_ij − mean(x_j)) / std(x_j),    (1)

where the subscript i denotes the row of the data matrix, the subscript j denotes the column, Z_ij represents the data after standardization, x_ij represents the source data, and mean(x_j) and std(x_j) denote the mean and standard deviation of the corresponding column 10 .

Table 2 Hydrochemical compositions and discriminant results of the water-filling aquifers (unit: mg/L). In the last column, which gives the groundwater type (label), 0 represents surface water, 1 pore water of the Quaternary, 2 karst water of the Carboniferous limestone, 3 Permian sandstone water, and 4 karst water of the Cambrian limestone.

In the datasets, the label column is categorical data (string values). These labels have no specific order of preference, and since they are strings, the deep learning model cannot work on them directly 11 . One approach to this problem is label encoding, which assigns a numerical value to each label; for example, surface water and pore water of the Quaternary are mapped to 0 and 1.
However, this can introduce bias into the model, as it will give higher preference to pore water of the Quaternary because 1 > 0, whereas ideally both labels are equally important in the datasets. To address this issue, we use the one-hot encoding technique, which creates a binary vector of length 5. Here, the label 'surface water', encoded as 0, corresponds to the binary vector [0,0,0,0,1], as shown in Table 3.
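The two preprocessing steps described above, column-wise z-score standardization per Eq. (1) and one-hot encoding of the five water-source labels, can be sketched as follows. The sample values are illustrative only, not taken from Table 2.

```python
import numpy as np

# Illustrative ion concentrations (mg/L): rows are water samples, columns are
# Na+ + K+, Ca2+, Mg2+, Cl-, SO4^2-, HCO3^-. The numbers are made up.
x = np.array([
    [120.0, 45.0, 12.0, 80.0, 150.0, 300.0],
    [300.0, 10.0,  5.0, 40.0, 500.0, 200.0],
    [ 60.0, 90.0, 30.0, 20.0,  60.0, 400.0],
])

# Eq. (1): subtract each column's mean and divide by its standard deviation;
# afterwards every column has mean ~0 and standard deviation 1.
z = (x - x.mean(axis=0)) / x.std(axis=0)

def one_hot(labels, num_classes=5):
    """One-hot encode water-source labels 0..4. Following Table 3, label 0
    (surface water) maps to [0, 0, 0, 0, 1], i.e. positions are reversed."""
    vectors = np.zeros((len(labels), num_classes), dtype=int)
    for row, label in enumerate(labels):
        vectors[row, num_classes - 1 - label] = 1
    return vectors

labels = [0, 2, 4]   # surface water, Carboniferous karst, Cambrian karst
print(z)
print(one_hot(labels))
```

Standardizing per column rather than over the whole matrix matters here because, as Table 2 shows, the ion contents span five orders of magnitude between columns.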

Deep learning basics
A machine learning algorithm is an algorithm that is able to learn from data. Most modern deep learning models are based on artificial neural networks (ANNs), a class of supervised learning techniques that mimic biological neural networks (Fig. 3). An ANN is built from one or more layers, each containing a series of neurons 12 . The weights and biases between neurons are adjusted as learning proceeds, with the aim of minimizing the loss between the predicted output and the actual output. Training an ANN is thus the process of adjusting its weights and biases, which is carried out by the back propagation procedure. In this procedure, the gradient descent algorithm updates the weights and biases of the neurons by estimating the gradient of the loss function: each weight and bias receives an adjustment proportional to the partial derivative of the loss function with respect to its current value. With an increasing number of layers, however, the problem of vanishing gradients makes ANNs hard to train 13 .
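The gradient-descent update described above can be sketched for a single linear neuron with a squared-error loss. All numbers (weight, bias, input, target, learning rate) are illustrative, not values from the trained model.

```python
# One gradient-descent step for a single linear neuron with squared-error
# loss E = 0.5 * (y - t)^2; all numbers are illustrative.
w, b = 0.5, 0.1         # current weight and bias
x, t = 2.0, 1.0         # input and target output
lr = 0.1                # learning rate

y = w * x + b           # forward pass: predicted output
grad = y - t            # dE/dy, which equals dE/dz for a linear unit
w -= lr * grad * x      # adjustment proportional to dE/dw = (y - t) * x
b -= lr * grad          # adjustment proportional to dE/db = (y - t)

print(w, b)             # updated parameters: 0.48, 0.09
```

Repeating this step over many samples and many epochs is, in essence, the training loop; deep networks apply the same rule layer by layer via back propagation.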
Typically, when training an ANN model, we have access to a training set; we can compute some error measure on the training set, called the training error, and we reduce this training error. Thus far, what we have described is simply an optimization problem. The training and test data are assumed to be generated by a probability distribution over datasets 14 .

Deep learning architectures
Deep learning is a subset of machine learning built on artificial neural networks with many layers. The idea is that the additional levels of abstraction improve the capability of the network to generalize to unseen data, and hence to outperform traditional ANNs on data outside the training set. The learning process is 'deep' because the structure of the artificial neural network consists of an input layer, an output layer, and multiple hidden layers. Each layer contains units that transform the input data into information that the next layer can use for a certain predictive task 15 .
While indisputably powerful tools, traditional artificial neural networks (ANNs) and more classical machine learning techniques rely on developers identifying the typical features that describe the problem. In this work, a deep learning approach is applied to the problem of source discrimination of mine water inrush. Deep learning further exploits the power of ANNs by relying on the network itself to identify, extract, and combine the inputs into abstract features that contain much more pertinent information for solving the problem, that is, predicting the output, as illustrated in Fig. 4. The six input quantities are the contents of Na⁺+K⁺, Ca²⁺, Mg²⁺, Cl⁻, SO₄²⁻ and HCO₃⁻. Every neuron accepts inputs from neurons in the previous layer and applies a linear or nonlinear activation function (e.g., ReLU). The contents of the six ions are delivered from the input layer to the output layer, where the output layer corresponds to the quantity to be predicted: surface water, pore water of the Quaternary, Permian sandstone water, karst water of the Carboniferous limestone, and karst water of the Cambrian limestone.
An ANN with three hidden layers and one output layer is shown in Fig. 5. Every layer constitutes a module through which one can backpropagate gradients. At every layer, we first compute the total input z to every unit, which is a weighted sum of the outputs of the units in the layer below. Then, a nonlinear function f is applied to z to obtain the output of the unit. For the sake of simplicity, the bias terms are omitted. The nonlinear function used in the hidden layers is the rectified linear unit (ReLU), f(z) = max(0, z). At the output layer, the softmax function, widely used in recent years, is applied to calculate the probability of each water source 16 .
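The forward pass just described can be sketched in NumPy: weighted sums through three ReLU hidden layers, then a softmax over the five water-source classes. The weights here are random placeholders rather than trained values, the hidden-layer widths are illustrative, and biases are omitted as in the text.

```python
import numpy as np

def relu(z):
    """Rectified linear unit f(z) = max(0, z)."""
    return np.maximum(0.0, z)

def softmax(z):
    """Convert output-layer scores into class probabilities."""
    e = np.exp(z - z.max())          # subtract the max for numerical stability
    return e / e.sum()

# Toy network: six standardized ion contents in, five class probabilities out.
rng = np.random.default_rng(0)
x = rng.normal(size=6)               # stand-in for one standardized sample
W1 = rng.normal(size=(6, 16))        # hidden-layer widths are illustrative
W2 = rng.normal(size=(16, 16))
W3 = rng.normal(size=(16, 16))
W4 = rng.normal(size=(16, 5))

h1 = relu(x @ W1)                    # each layer: weighted sum, then ReLU
h2 = relu(h1 @ W2)
h3 = relu(h2 @ W3)
p = softmax(h3 @ W4)                 # probability of each water source

print(p, p.sum())                    # the five probabilities sum to 1
```

The predicted class is simply the index of the largest probability, which in the paper's encoding maps back to one of the five water types.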
At every hidden layer, we calculate the error derivative with respect to the output of every unit, which is a weighted sum of the error derivatives with respect to the total inputs of the units in the layer above. Then, we convert the error derivative with respect to the output into the error derivative with respect to the input by multiplying it by the gradient of f. At the output layer, the error derivative with respect to the output of a unit is obtained by differentiating the cost function. This gives y_l − t_l if the cost function for unit l is ½(y_l − t_l)², where t_l is the target value. Once ∂E/∂z_k is known, the error derivative for the weight w_jk on the connection from unit j in the layer below is simply y_j ∂E/∂z_k.
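This derivative bookkeeping can be verified numerically for a single output unit. The values of y_j, w_jk and the target t are illustrative, and a linear output unit is assumed so that ∂E/∂z equals y − t directly.

```python
# Error derivatives for one output unit k fed by unit j below, with a linear
# output and cost E = 0.5 * (y_k - t)^2; all values are illustrative.
y_j  = 0.8               # output of unit j in the layer below
w_jk = 0.3               # weight on the connection j -> k
t    = 1.0               # target value

z_k = y_j * w_jk         # total input to unit k (bias omitted)
y_k = z_k                # linear output unit for simplicity
dE_dz = y_k - t          # differentiating the cost gives y_k - t
dE_dw = y_j * dE_dz      # dE/dw_jk = y_j * dE/dz_k

# Finite-difference check of the analytic gradient
eps = 1e-6
E = lambda w: 0.5 * (y_j * w - t) ** 2
numeric = (E(w_jk + eps) - E(w_jk - eps)) / (2 * eps)
print(dE_dw, numeric)    # the analytic and numeric gradients agree
```

The finite-difference comparison is a standard sanity check when implementing back propagation by hand.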
The Python deep learning library Keras, with a TensorFlow backend and GPU acceleration, is used to train the ANN. TensorFlow is an end-to-end open-source platform for machine learning. It has a comprehensive, flexible ecosystem of tools, libraries, and community resources that allows researchers to push the state of the art in deep learning and developers to easily build and deploy DL-powered applications. The model parameters of the DNN model are shown in Table 4.

Results And Discussion
The entire training process and its results can be visualized with TensorBoard, a browser-based application that helps visualize training parameters (weights, biases and metrics). It displays the distribution of tensors as histograms, which can be used to check whether the weights and biases change as expected in every epoch. We plot the histogram of the weights of the first fully connected layer every 20 iterations. TensorBoard takes an arbitrarily sized and shaped tensor and compresses it into a histogram data structure consisting of many bins with widths and counts.
The distributions view draws on the same data as the histograms but presents it in a different form (Fig. 6). The distributions of the weights and biases of the first layer are shown in Fig. 6(a) and (c). The abscissa represents the training step, and the ordinate represents the range of weight values, showing at a glance whether the layer is actually learning by adjusting its weights. Roughly equal numbers of weights take values between -0.8 and 0.8; a few weights take slightly smaller or larger values, so the layer might not be using its full potential. This suggests that the weights were initialized from a uniform distribution with zero mean over the range -0.8 to 0.8 (Fig. 6(c) and (d)). The histogram of the layer forms a bell-curve-like shape: the values are centered around a specific value but may also be greater or smaller than it. Each slice in the histogram visualizer displays a single histogram. The slices are organized by step: older slices are further 'back' and darker, while newer slices are closer to the foreground and lighter in color; the y-axis on the right shows the step number. Most values cluster around the mean of 0, but the values range from -1.3 to 1.2. With increasing training steps, the color of the curves gradually becomes lighter from back to front. Each of the many slices in Fig. 6(b) and (d) represents the frequency of a weight value in the weight distribution at one step.
Accuracy and loss are unitless numbers that indicate how closely the classifier fits the validation data. A loss value of 0 represents a perfect fit: the closer the loss is to 0, the better the fit. Separate loss plots are provided for the training batches. The training accuracy and loss of the deep neural network and the BP neural network are compared in Fig. 7, drawn with Matplotlib. In Fig. 7, the abscissa represents the number of forward and back-propagation passes, and the ordinate represents the accuracy or loss. The blue curves show the accuracy and loss of the deep neural network, and the red curves show those of the BP neural network. The accuracy of the deep neural network is higher than that of the BP network, and its loss is lower, which means that the deep neural network performs better than the BP neural network in the source discrimination of mine water.
The probability of each water-source class is calculated by the softmax function (Fig. 8). Ten mine water samples were input into the trained DNN model to test its prediction accuracy; the same data were also input into the BP model. The prediction results are shown in Table 5. From the table, we can see that nine water samples were predicted correctly by the DNN model and one incorrectly. With the BP model, four water samples were predicted correctly and six incorrectly. The prediction results can also be seen in the 3-D histogram shown in Fig. 9.
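The reported test accuracies (9/10 for the DNN, 4/10 for the BP network) are simple fractions of correctly predicted samples. The label sequences below are illustrative stand-ins chosen to reproduce those fractions, not the actual samples of Table 5.

```python
def accuracy(predicted, actual):
    """Fraction of samples whose predicted class matches the true class."""
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)

# Illustrative labels only (0..4 encode the five water types as in Table 2).
actual   = [0, 1, 2, 3, 4, 0, 1, 2, 3, 4]
dnn_pred = [0, 1, 2, 3, 4, 0, 1, 2, 3, 0]   # one sample wrong -> 0.9
bp_pred  = [0, 1, 2, 3, 0, 1, 2, 0, 1, 0]   # six samples wrong -> 0.4

print(accuracy(dnn_pred, actual))  # 0.9
print(accuracy(bp_pred, actual))   # 0.4
```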

Conclusions And Outlooks
In the research reported here, we apply deep learning methods to discriminate the source of mine water.
(1) Deep learning allows computational models that are composed of multiple processing layers to learn representations of data with multiple levels of abstraction. The method has dramatically improved the state of the art in the source discrimination of mine water. Deep learning discovers intricate structures in large data sets by using the back propagation algorithm to indicate how a machine should change its internal parameters that are used to compute the representation in each layer from the representation in the previous layer.
(2) On the basis of the hydrochemical data processing, a deep learning model was designed and trained on the hydrochemical data. Ten new samples of mine water were tested to determine the precision of the model. Nine samples of mine water were predicted correctly. The deep learning model presented here provides significant guidance for the discrimination of mine water.
(3) This high predictive accuracy, combined with the very low computational cost (execution of the full framework takes on the order of milliseconds), makes the developed network very well suited for discriminating the source of mine water.

Figure captions: the geological section of the Pingdingshan coalfield; softmax layer as the output layer; 3-D histogram of the probability of the source of mine water.