Predictive Modeling of Uniaxial Compressive Strength of Rocks for Protecting Environment Using Artificial Neural Network

Abstract: Sedimentary rocks provide information on past environments at the Earth's surface. As a result, they are the principal narrators of former climate, life, and important events at the Earth's surface. Direct destructive laboratory tests are complex and expensive, which aggravates the data scarcity problem and makes the development of intelligent indirect methods an integral step in addressing the problems faced by rock engineering projects. This study established an artificial neural network (ANN) approach to predict the uniaxial compressive strength (UCS, in MPa) of soft sedimentary rocks from three input parameters: dry density (ρd) in g/cm³, Brazilian tensile strength (BTS) in MPa, and point load index (Is(50)) in MPa. Three ANN models were developed on different partitions of the dataset: M1 used the overall dataset, M2 used 70% of the data for training and 30% for testing, and M3 used 60% for training and 40% for testing. In addition, multiple linear regression (MLR) was performed for comparison with the proposed ANN models to verify the accuracy of the predicted values. Performance indices were calculated for all established models. On the testing dataset, the M3 ANN model achieved the highest coefficient of determination (R²), the smallest root mean squared error (RMSE), the highest variance accounted for (VAF) and a reliable a10-index, with values of 0.99, 0.00060, 0.99 and 0.99 respectively. It is therefore proposed as the best-fit prediction model, among those developed in this study, for the UCS of soft sedimentary rocks at the Thar Coalfield, Pakistan. Moreover, a sensitivity analysis showed that BTS and Is(50) were the most influential parameters in predicting UCS.


Introduction
Sedimentary rocks provide information about past environments at the Earth's surface. As such, they are the primary narrators of the climate, life, and important events of the Earth's past. Uniaxial compressive strength (UCS) is an essential rock strength parameter widely used in the design of rock structures (Madhubabu et al. 2016; Asheghi et al. 2019). UCS is an integral parameter in rock characterization, tunnel construction, slope stability analysis, bridge construction, and other rock-related problems (Abdi et al. 2019; Abdi et al. 2018; Shahri et al. 2020; Barzegar et al. 2019; Gockceoglu et al. 2004; Baykasoğlu et al. 2008). Direct determination of UCS following the procedures of the ISRM (International Society for Rock Mechanics) and ASTM (American Society for Testing and Materials) is complex, time-consuming, and expensive, which makes testing infeasible for engineering projects where a large amount of data is needed. To overcome these shortcomings, this study establishes artificial neural network predictive models for the estimation of UCS.
Many researchers have developed predictive methods to deal with such complex problems using intelligent techniques such as the artificial neural network (ANN) and the adaptive neuro-fuzzy inference system (ANFIS) (Tiryaki 2008; Ozcelik et al. 2013; Rajesh-Kumar et al. 2013; Kong et al. 2018; Teymen et al. 2020; Kamani et al. 2020; Cabalar et al. 2012; Bashari et al. 2011; Umrao et al. 2018). Currently, intelligent methods such as ANN, ANFIS, particle swarm optimization (PSO), and the genetic algorithm (GA) are frequently applied to solve problems related to rock structure design (Asheghi et al. 2019). These methods are considered fast and economical, and they have achieved good agreement between the measured and predicted values of rock mechanical properties such as UCS and the modulus of elasticity E (in MPa) (Teymen et al. 2020). Torabi-Kaveh et al. (2015) employed ANN and multiple regression methods to estimate UCS, and their findings indicated that the ANN method performed better. Yagiz et al. (2012) analyzed ANN and multiple regression for predicting the UCS of carbonate rocks and found that the ANN method is in good agreement with traditional multiple regression. Ceryan et al. (2013) also employed ANN and regression methods to predict the UCS of carbonate rocks and reported that the ANN results were highly accurate. Mohamad et al. (2020) used a PSO-based ANN method to estimate the UCS of soft rocks with Brazilian tensile strength (BTS) in MPa, point load index (Is(50)) in MPa, and ultrasonic velocity (Vp) in m/s as input parameters, and demonstrated the high performance of the proposed model. The ANN has proved to be a key method among the intelligent methods and is therefore widely used to solve challenging problems that rely on laboratory experimental data, owing to its high efficiency and ability to learn from inputs (Aboutaleb et al. 2018). Based on the reliable predictions of ANN methods, some researchers have estimated various mechanical properties of rocks by analyzing the correlation among various physical parameters (Bejarbaneh et al. 2018; Fakir et al. 2017). Yin et al. (2020) employed the ANN back-propagation algorithm, which has been considered the best prediction method in previous studies. Table 1 shows previous studies using intelligent methods to predict UCS.
This study applied the ANN approach to estimate UCS from three input parameters: dry density (ρd) in g/cm³, Brazilian tensile strength (BTS) in MPa, and point load index (Is(50)) in MPa. A total of 37 soft sedimentary rock core samples of each rock type were randomly selected from Block IX of the Thar Coalfield. For the developed ANN models, the dataset is distributed as follows: model 1 (M1) uses the overall dataset, model 2 (M2) uses 70% of the data for training and 30% for testing, and model 3 (M3) uses 60% for training and 40% for testing. Simple and multiple regression analyses were also performed for comparison with the proposed ANN models to check the accuracy of the predicted values. Performance indices were calculated for the established models. In addition, a sensitivity analysis was performed to determine the effect of each input variable on the estimated values of UCS. Direct destructive laboratory tests are complex and expensive, which aggravates the data scarcity problem and makes the development of intelligent indirect methods an integral step in addressing the problems faced by rock engineering projects.

Building dataset
In this study, soft sedimentary rock samples were collected from Block IX of the Thar Coalfield, Pakistan. Fig. 1 shows the geological site of the collected rock samples. Initially, a total of 37 randomly selected rock cores of each type were prepared and subdivided into standardized samples according to the ISRM and ASTM standards to maintain the same rock core dimensions and the same geological and geotechnical features. Next, these rock samples were tested in the laboratory at the Department of Mining Engineering, Mehran University of Engineering and Technology, to determine their physical and mechanical parameters, namely ρd in g/cm³, BTS in MPa, Is(50) in MPa and UCS in MPa, using a universal testing machine (UTM) and a point load testing device (TS-706), as shown in Fig. 2. Table 2 presents the dataset of physical and mechanical parameters. Table 3 shows the minimum, maximum, average, and standard deviation of the parameters of the rock samples determined in the laboratory.

Methods
The ANN approach was employed to predict UCS from three inputs: ρd (g/cm³), BTS (MPa), and Is(50) (MPa). Fig. 5 shows the flow chart of the predictive modeling process for UCS. The dataset of 37 samples was divided among the established models (M1, M2 and M3) as presented in Table 4. Moreover, a sensitivity analysis based on the cosine amplitude method was carried out to estimate the influence of each input variable on the output.
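The partitioning of the 37 samples into training and testing subsets can be illustrated with a short Python sketch. It assumes a simple random hold-out split; the function name `split_indices`, the fixed seed, and the rounding of the training-set size are illustrative assumptions, and the actual sample counts are those given in Table 4.

```python
import random

def split_indices(n_samples, train_fraction, seed=0):
    """Randomly split sample indices into training and testing subsets."""
    rng = random.Random(seed)            # fixed seed only for repeatability
    order = list(range(n_samples))
    rng.shuffle(order)
    # Rounding the training-set size is an assumption of this sketch.
    n_train = round(train_fraction * n_samples)
    return order[:n_train], order[n_train:]

N_SAMPLES = 37  # number of samples reported in the study

m2_train, m2_test = split_indices(N_SAMPLES, 0.70)  # M2: 70% / 30%
m3_train, m3_test = split_indices(N_SAMPLES, 0.60)  # M3: 60% / 40%

print(len(m2_train), len(m2_test))
print(len(m3_train), len(m3_test))
```

Model M1 needs no split because it uses the overall dataset.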

Artificial Neural Network
The concept of the artificial neural network (ANN) was originally introduced by Frank Rosenblatt in 1958 (Alexx 2001). The ANN is considered to be the most common and effective soft computing technique (Alizadeh et al. 2018; Asteris et al. 2019) and is modeled on the function of the human brain's nervous system (Ly et al. 2020; Pham et al. 2020; Le et al. 2020a; Le et al. 2020b). This technique is mainly used to solve complex rock structure design problems in mining, civil, geotechnical, and geological engineering. The ANN structure is an essential factor in designing the final prediction model, as the structure affects the learning capability and performance of the network. The ANN is structured in three layers (an input layer, a hidden layer, and an output layer) composed of interconnected units called neurons, and the method is used to identify the correlation between the specified input and output parameters (Asteris et al. 2019; Pham et al. 2020). Fig. 6 shows the structure of the ANN used to estimate UCS in this research.
Each neuron is connected to the neurons of the next layer through weighted links, and the network must contain enough neurons to handle the complexity of the problem (Rashidian et al. 2014; Fidan et al. 2019; Gowida et al. 2019). Eq. 1 is used to evaluate the approximate number of neurons in the hidden layer, since an improper choice of this number often leads to under-fitting or over-fitting and must be avoided.
The ANN toolbox in MATLAB 2018a was used in this study to develop the feed-forward back-propagation (FFBP) ANN model with a 3-7-1 architecture, that is, three input neurons, seven hidden neurons, and one output neuron. Back-propagation (BP) is the most commonly applied learning algorithm in multilayer networks (Hajihassani et al. 2014; Ekemen Keskin et al. 2020). The predictive input parameters ρd, BTS, and Is(50) were assigned to an input layer composed of three neurons to predict UCS at the output layer. The ANN models M1, M2 and M3 were trained, tested and validated. One hundred epochs were used to train the models, and training was stopped at the minimum validation error to prevent overfitting. Fig. 7 shows the validation curves for the training performance of the ANN models of UCS.
The M3 model demonstrates the best performance curve for UCS, with a validation error of 0.014, reached at epoch 0. Fig. 8 shows the scatter plots of the predicted UCS against the measured UCS: M1 for the overall dataset, and M2 and M3 for the training and testing datasets.
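To make the 3-7-1 feed-forward back-propagation structure concrete, the following is a minimal pure-Python sketch. It is not the MATLAB toolbox procedure used in the study: the tanh hidden activation, the linear output, the learning rate, the per-sample weight updates, and the synthetic normalized data are all assumptions chosen for illustration. The sketch trains for 100 epochs and keeps the weights that gave the lowest validation error, which mirrors the stopping idea described above.

```python
import copy
import math
import random

def init_network(n_in=3, n_hidden=7, seed=0):
    """Create a 3-7-1 network with small random weights."""
    rng = random.Random(seed)
    return {
        "w1": [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hidden)],
        "b1": [0.0] * n_hidden,
        "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
        "b2": 0.0,
    }

def forward(net, x):
    """Return the tanh hidden activations and the linear output."""
    hidden = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
              for row, b in zip(net["w1"], net["b1"])]
    return hidden, sum(w * h for w, h in zip(net["w2"], hidden)) + net["b2"]

def mse(net, data):
    """Mean squared error of the network over (inputs, target) pairs."""
    return sum((forward(net, x)[1] - y) ** 2 for x, y in data) / len(data)

def train(net, train_data, val_data, epochs=100, lr=0.05):
    """Back-propagation with per-sample updates; returns the weights that
    gave the lowest validation error, together with that error."""
    best_err, best_net = mse(net, val_data), copy.deepcopy(net)
    for _ in range(epochs):
        for x, y in train_data:
            hidden, out = forward(net, x)
            delta_out = out - y  # gradient of 0.5 * (out - y)^2 w.r.t. out
            for j, h in enumerate(hidden):
                # Hidden-layer gradient uses w2[j] before it is updated.
                delta_h = delta_out * net["w2"][j] * (1.0 - h * h)
                net["w2"][j] -= lr * delta_out * h
                net["b1"][j] -= lr * delta_h
                for i, xi in enumerate(x):
                    net["w1"][j][i] -= lr * delta_h * xi
            net["b2"] -= lr * delta_out
        err = mse(net, val_data)
        if err < best_err:
            best_err, best_net = err, copy.deepcopy(net)
    return best_net, best_err

# Synthetic, normalized stand-in data (NOT the measured Thar Coalfield values).
rng = random.Random(1)
samples = []
for _ in range(37):
    x = [rng.random() for _ in range(3)]
    samples.append((x, 0.2 * x[0] + 0.5 * x[1] + 0.3 * x[2]))
train_data, val_data = samples[:22], samples[22:]

net = init_network()
initial_err = mse(net, val_data)
best_net, best_err = train(net, train_data, val_data)
print(f"validation MSE: {initial_err:.5f} -> {best_err:.5f}")
```

Keeping a copy of the best weights, rather than halting at the first increase in validation error, is one simple way to realize a minimum-validation-error criterion; other stopping rules are equally plausible.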

Multiple Linear Regression
SPSS (version 23) was used to conduct a multiple linear regression (MLR) analysis to determine whether a linear relationship exists between the dependent variable and the independent variables. Regression analysis is used to determine the significance of the independent variables in determining the values of the dependent variable (Sajid 2020a; Sajid et al. 2017). More precisely, the purpose of the regression analysis in this study is to compare the performance of the ANN with that of conventional linear regression. This approach has also been used in several recent studies on the application of artificial neural networks and linear regression analysis (Sajid 2020b). The basic linear regression equation, written for the dependent and independent variables of this study, is as follows (Eq. 2):
$$Y = a + b_1 X_1 + b_2 X_2 + b_3 X_3 \qquad (2)$$
where $Y$ represents the dependent variable, $a$ represents the regression constant, $b_i$ represents the regression coefficient, and $X_i$ represents the value of the independent variable.
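As an illustration of how the regression constant and coefficients of a multiple linear regression are obtained, the sketch below fits an ordinary least-squares model by solving the normal equations in pure Python. It does not reproduce the SPSS procedure; the function names `fit_mlr` and `predict_mlr` and the small example dataset are hypothetical and do not come from the study.

```python
def fit_mlr(X, y):
    """Ordinary least squares via the normal equations.
    Returns [a, b1, ..., bk]: the constant followed by the coefficients."""
    rows = [[1.0] + list(x) for x in X]  # prepend the intercept column
    k = len(rows[0])
    # Augmented system (A^T A | A^T y)
    aug = [[sum(r[i] * r[j] for r in rows) for j in range(k)]
           + [sum(r[i] * yi for r, yi in zip(rows, y))]
           for i in range(k)]
    # Gauss-Jordan elimination with partial pivoting
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(k):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                aug[r] = [v - factor * p for v, p in zip(aug[r], aug[col])]
    return [aug[i][k] / aug[i][i] for i in range(k)]

def predict_mlr(coeffs, x):
    """Evaluate a + b1*x1 + ... + bk*xk for one sample."""
    return coeffs[0] + sum(b * xi for b, xi in zip(coeffs[1:], x))

# Made-up example with three predictors and an exactly linear response.
X = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (1, 1, 0), (0, 1, 1), (2, 1, 3)]
true_coeffs = [1.0, 2.0, -3.0, 0.5]
y = [predict_mlr(true_coeffs, x) for x in X]

coeffs = fit_mlr(X, y)
print([round(c, 6) for c in coeffs])
```

In the study the three predictors would be ρd, BTS and Is(50), and the response would be the measured UCS.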

Model Evaluation
This study used the ANN and MLR methods. To verify the prediction results of the models M1, M2 and M3, performance indices were calculated from the measured and predicted values of each model. Eqs. 3, 4, 5 and 6 were used to find the R², RMSE, VAF and a10-index of each model, respectively. The a10-index is a new engineering index that was applied to further assess the reliability of the models.
$$R^2 = \frac{\left[\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})\right]^2}{\sum_{i=1}^{n} (y_i - \bar{y})^2 \sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2} \qquad (3)$$
$$RMSE = \sqrt{\frac{1}{n}\sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \qquad (4)$$
$$VAF = \left[1 - \frac{\operatorname{var}(y_i - \hat{y}_i)}{\operatorname{var}(y_i)}\right] \times 100 \qquad (5)$$
$$a10\text{-index} = \frac{m10}{M} \qquad (6)$$
where $y_i$ is the measured value, $\hat{y}_i$ is the predicted value, $\bar{y}$ and $\bar{\hat{y}}$ are the means of the measured and predicted values, respectively, and $n$ is the number of data points. $m10$ denotes the number of data points for which the ratio of measured UCS to predicted UCS lies between 0.90 and 1.10, and $M$ represents the total number of data points.
The first step is to determine whether the data under consideration are appropriate for linear regression analysis. Numerous tests are suggested in the literature for this purpose. Apart from R², another frequently used test is the ANOVA test. In the first case, linear regression was used to determine the relationship between the dependent variable, the measured UCS, and three independent variables: ρd, BTS, and Is(50). Table 6 lists the R² values of UCS estimated with the MLR equations of models M1, M2 and M3 on the overall, training and testing data: 0.65 for M1, 0.62 and 0.83 for M2, and 0.65 and 0.84 for M3, respectively. The R² values of UCS are therefore quite satisfactory for the M1, M2 and M3 models. Furthermore, the ANOVA test rejected the null hypothesis at a significance level of P < 0.001.
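The four performance indices referred to as Eqs. 3 to 6 can be sketched in Python as follows. The sketch assumes their commonly used definitions: R² as the squared Pearson correlation between measured and predicted values, VAF based on the variance of the residuals and expressed in percent, and the a10-index as the fraction of samples whose measured-to-predicted ratio lies between 0.90 and 1.10. The function names and the example numbers are illustrative and are not data from the study.

```python
import math

def r_squared(measured, predicted):
    """Squared Pearson correlation between measured and predicted values."""
    n = len(measured)
    mean_m = sum(measured) / n
    mean_p = sum(predicted) / n
    cov = sum((m - mean_m) * (p - mean_p) for m, p in zip(measured, predicted))
    ss_m = sum((m - mean_m) ** 2 for m in measured)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    return cov ** 2 / (ss_m * ss_p)

def rmse(measured, predicted):
    """Root mean squared error."""
    n = len(measured)
    return math.sqrt(sum((m - p) ** 2 for m, p in zip(measured, predicted)) / n)

def variance(values):
    """Population variance."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)

def vaf(measured, predicted):
    """Variance accounted for, in percent."""
    residuals = [m - p for m, p in zip(measured, predicted)]
    return (1.0 - variance(residuals) / variance(measured)) * 100.0

def a10_index(measured, predicted):
    """Fraction of samples with measured/predicted between 0.90 and 1.10."""
    within = sum(1 for m, p in zip(measured, predicted) if 0.90 <= m / p <= 1.10)
    return within / len(measured)

# Made-up values for illustration only.
measured = [10.0, 20.0, 30.0, 40.0]
predicted = [10.0, 20.0, 30.0, 50.0]

print(r_squared(measured, predicted))
print(rmse(measured, predicted))
print(vaf(measured, predicted))
print(a10_index(measured, predicted))
```

With these definitions a perfect prediction gives R² = 1, RMSE = 0, VAF = 100 and an a10-index of 1, and the a10-index cannot exceed 1.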

Taylor Diagram
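The Taylor diagram gives a concise graphical summary of how closely patterns match each other in terms of their correlation and standard deviation. The correlation used in the Taylor diagram is expressed in Eq. 7:
$$R = \frac{\frac{1}{Z}\sum_{n=1}^{Z} (l_n - \bar{l})(m_n - \bar{m})}{\sigma_l \sigma_m} \qquad (7)$$
where $R$ denotes the correlation, $Z$ denotes the number of discrete points, $l_n$ and $m_n$ represent the two variables, $\sigma_l$ and $\sigma_m$ are the standard deviations of $l$ and $m$, and $\bar{l}$ and $\bar{m}$ denote the averages of $l_n$ and $m_n$.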
Fig. 15 shows the Taylor diagram relating the correlation, RMSE and standard deviation of the original and predicted UCS for the M2 and M3 ANN and MLR models at the testing stage. The predictions of the M3 ANN model are highly correlated with the original values and, compared with the other developed models, their standard deviation is similar to that of the original values. Thus, the M3 ANN model with R² = 0.99 is the most suitable of the developed models for predicting the UCS of soft sedimentary rocks in the Thar Coalfield, Pakistan. Ideally, the best-fit prediction model is the one with the highest R², the lowest RMSE, the highest VAF and a reliable a10-index. Therefore, according to Fig. 15, the M3 (ANN) model gave the best results on the testing dataset and is proposed as the best-fit prediction model for UCS in this study.
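The quantities summarized by a Taylor diagram can be computed with a few lines of Python. The sketch below returns the two standard deviations, the correlation, and the centered RMS difference, and checks the law-of-cosines relation that allows all of them to be drawn on one diagram. The function name `taylor_statistics` and the example values are illustrative assumptions, not data from the study.

```python
import math

def taylor_statistics(reference, model):
    """Return (std_ref, std_model, correlation, centered RMS difference)."""
    n = len(reference)
    mean_r = sum(reference) / n
    mean_m = sum(model) / n
    std_r = math.sqrt(sum((r - mean_r) ** 2 for r in reference) / n)
    std_m = math.sqrt(sum((m - mean_m) ** 2 for m in model) / n)
    cov = sum((r - mean_r) * (m - mean_m) for r, m in zip(reference, model)) / n
    corr = cov / (std_r * std_m)
    crmsd = math.sqrt(
        sum(((m - mean_m) - (r - mean_r)) ** 2 for r, m in zip(reference, model)) / n
    )
    return std_r, std_m, corr, crmsd

# Made-up measured and predicted values for illustration only.
measured = [1.2, 0.8, 1.5, 2.1, 1.7, 0.9]
predicted = [1.1, 0.9, 1.4, 2.3, 1.6, 1.0]

std_r, std_m, corr, crmsd = taylor_statistics(measured, predicted)
# Geometric relation behind the diagram:
# crmsd^2 = std_r^2 + std_m^2 - 2 * std_r * std_m * corr
identity_rhs = std_r ** 2 + std_m ** 2 - 2.0 * std_r * std_m * corr
print(std_r, std_m, corr, crmsd)
```

A model plots close to the reference point when its correlation is high and its standard deviation is close to that of the measured values, which is the criterion applied to the M3 ANN model above.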

Sensitivity Analysis
It is important to identify accurately the parameters that most strongly influence the UCS of the rock, since they can be critical in the design of a structure. Therefore, the cosine amplitude method (Momeni et al. 2014; Ji et al. 2017) is used in this study to determine the relative influence of the input parameters on the output. The general formula of the adopted method is given in Eq. 8:
$$r_{ij} = \frac{\sum_{k=1}^{n} x_{ik}\, x_{jk}}{\sqrt{\sum_{k=1}^{n} x_{ik}^2 \sum_{k=1}^{n} x_{jk}^2}} \qquad (8)$$
where $x_i$ and $x_j$ are the input and output values and $n$ denotes the number of data points in the testing stage. The value of $r_{ij}$ ranges between 0 and 1 and indicates the strength of the relation between each variable and the target. According to Eq. 8, if $r_{ij}$ of any parameter is 0, there is no significant relationship between this parameter and the target. Conversely, when $r_{ij}$ is equal or close to 1, there is a significant relationship, and the parameter can greatly influence the UCS of the rocks.
Fig. 16 The effect of input variables on the result of the established model.
Fig. 16 shows the relationship between each input parameter (ρd, BTS, and Is(50)) of the developed model and the output (UCS). It can be seen from the figure that the Brazilian tensile strength and the point load index are the most influential parameters in predicting UCS.
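The cosine amplitude method can be sketched in a few lines of Python. The function below computes the strength of relation between one input series and the output, and the example ranks three inputs by that strength. The function name `cosine_amplitude`, the dictionary keys, and all numerical values are hypothetical and serve only to show the calculation; they are not the measured data or the results of Fig. 16.

```python
import math

def cosine_amplitude(x, y):
    """Strength of relation between an input series x and an output series y."""
    numerator = sum(a * b for a, b in zip(x, y))
    denominator = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return numerator / denominator

# Made-up values for illustration only.
inputs = {
    "rho_d": [1.9, 2.1, 1.8, 2.0, 2.2],
    "BTS": [0.10, 0.25, 0.15, 0.30, 0.20],
    "Is50": [0.05, 0.12, 0.07, 0.16, 0.09],
}
ucs = [0.9, 2.0, 1.2, 2.6, 1.6]

strengths = {name: cosine_amplitude(values, ucs) for name, values in inputs.items()}
ranking = sorted(strengths, key=strengths.get, reverse=True)
for name in ranking:
    print(f"{name}: {strengths[name]:.3f}")
```

For non-negative data the value lies between 0 and 1, with values near 1 indicating a strong relation between the input and the output.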

Conclusions
In this study, an intelligent method was used to predict the UCS of soft sedimentary rocks collected from Block IX of the Thar Coalfield, using ρd, BTS and Is(50) as input parameters. The physical and mechanical properties of the rock samples were determined in the laboratory in accordance with ISRM and ASTM standards at the Department of Mining Engineering, Mehran University of Engineering and Technology. The predictive performance of the ANN and MLR models was assessed by seeking the highest R², the smallest RMSE, the highest VAF and a reliable a10-index. For the ANN models, the R², RMSE, VAF and a10-index were 0.98, 0.02568, 0.98 and 0.98 respectively for M1; 0.87 and 0.91, 0.02932 and 0.00030, 0.99 and 0.99, and 1.03 and 1.10 respectively for the training and testing datasets of M2; and 0.91 and 0.99, 0.02232 and 0.00060, 0.97 and 0.99, and 1.04 and 0.99 respectively for the training and testing datasets of M3. In comparison, for the MLR models the R², RMSE, VAF and a10-index were 0.68, 0.00764, 0.99 and 1.07 respectively for M1; 0.62 and 0.86, 0.00001 and 0.36488, 0.81 and 0.97, and 1.06 and 0.92 respectively for the training and testing datasets of M2; and 0.65 and 0.84, 1.10895 and 0.01245, 0.82 and 0.99, and 1.01 and 1.10 respectively for the training and testing datasets of M3. Thus, the M3 (ANN) model yielded the best results on the testing dataset and is proposed as the best-fit prediction model for UCS in this study. Finally, the sensitivity analysis showed that BTS and Is(50) were the most influential parameters in predicting UCS.

Future Work
The current study used only the artificial neural network to predict UCS, in comparison with multiple linear regression, and other techniques might produce more suitable results. Future work could therefore expand the dataset used in this study and employ techniques such as support vector machine (SVM), random forest (RF), extreme gradient boosting (XGBoost), and boosted decision tree regression (BDTR) to further understand the problem.

Fig. 2 (a) Universal testing machine (UTM), (b) deformed rock core specimen for the Brazilian tensile strength test, (c) deformed rock core specimen for the UCS test, (d) point load testing device (TS-706), and (e) deformed rock core specimen for the point load index test. (Source for Fig. 2 (e): Geology 2017)
Fig. 3 shows histogram plots of the original dataset of this study: (a) dry density (g/cm³), (b) BTS (MPa), (c) Is(50) (MPa), and (d) UCS (MPa). Fig. 4 shows the pairwise plot of the parameters and UCS in the original dataset. Notably, none of the parameters is strongly correlated with UCS on its own, so all the parameters are used for UCS prediction. Fig. 4 shows a moderate positive correlation of BTS and Is(50) with UCS, whereas the dry density shows a negative correlation with UCS.

Fig. 9 ANN model M1 results for UCS plotted against the measured data.

Fig. 11 ANN model M2 results for UCS plotted against the measured data at the (a) training and (b) testing data.

Fig. 11 shows the predicted outputs of the ANN M2 model for UCS against the measured data for the training and testing data. The R² values of the M2 model are 0.87 and 0.91 for the training and testing data, respectively. For the M2 results on the training data, Fig. 12a displays the aggregated comparison of predicted against measured values of UCS, and Fig. 12b shows the change in relative error between the measured and predicted values. The MSE value of M2 is 0.00086. Fig. 12c shows the error histogram of model M2. The errors are distributed close to zero, which indicates that the performance of the proposed model M2 is satisfactory and reliable. Similarly, for the M2 results on the testing data, Fig. 12d shows the aggregated comparison of predicted against measured values of UCS, and Fig. 12e shows the change in relative error between the measured and predicted values. The MSE value of M2 is approximately 0. Fig. 12f shows the error histogram of the M2 model. Again the errors are distributed close to zero, which indicates that the performance of the proposed model M2 is acceptable.

Fig. 13 ANN model M3 results for UCS plotted against the measured data at the (a) training and (b) testing data.

Fig. 15 Demonstration of Taylor diagram at the testing data based on the ANN and MLR.

Table 1. Previous studies using intelligent methods to predict UCS.

Fig. 1 Geological site of collected rock samples.

Table 2. Physical and mechanical parameters of dataset.

Table 3. The minimum, maximum, average, and standard deviation of dataset.

Table 4. The dataset distribution for ANN and MLR models.

Table 7. Performance indices of ANN and MLR models at the overall dataset, training dataset and testing dataset for UCS.