The establishment of prediction model for soil liquefaction based on the seismic energy using the neural network

In the development of the prediction model for soil liquefaction, compared to the stress-based method, the energy-based methods proposed and developed in recent years are closer to the essence of soil liquefaction which is about the energy dissipation. Therefore, considering the weak nonlinear relationship found by the previous research, the fuzzy neural network (FNN) and BP neural network (BPNN) were adopted to try to obtain a prediction model which is the most proper to this nonlinear relationship. Firstly, the database including 284 cases obtained from laboratory test was divided into 3 separate groups denoted as training, validation set and testing sets by the ratio of 5:1:1; then, the FNN model and BPNN model were iterated to determine the model parameter by referring to the variation of fitness value and relative error of validation set; at the same time, the optimization algorithm of genetic algorithm (GA) was adopted to BPNN to find the best coefficients; besides, the parameter of Cc\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${C}_{c}$$\end{document} and D50\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${D}_{50}$$\end{document} was respectively excluded from the database to test their influence degree according to the prediction error; finally, six prediction results of FNN and genetic algorithm BP neural network (GABP) were compared with the previously proposed models. The results showed that the relationship of capacity energy to the influencing parameters could not be fitted as a fully linear relationship; the FNN model can learn the role of Cc\documentclass[12pt]{minimal} \usepackage{amsmath} \usepackage{wasysym} \usepackage{amsfonts} \usepackage{amssymb} \usepackage{amsbsy} \usepackage{mathrsfs} \usepackage{upgreek} \setlength{\oddsidemargin}{-69pt} \begin{document}$${C}_{c}$$\end{document} in affecting the capacity energy while the GABP model needs not to take it into account; the FNN and GABP model all fitted a good weakly nonlinear relationship for the capacity energy, and the GABP model is a better prediction model for capacity energy so far.


Introduction
During an earthquake, significant damage can result due to instability of the soil in the area affected by internal seismic waves, resulting in huge human loss and property damages every year (Towhata 2008;Javdanian 2019;Sassa and Takagawa 2018;Xue and Yang 2013). Therefore, before the engineering design, the soil liquefaction potential needs to be estimated in advance based on the geology survey, especially for the high seismic magnitude area. Aiming to this, researchers have proposed many prediction models based on the cyclic stress ratio (CSR) and cyclic resistance ratio (CRR); the typical method is to establish a mathematic formulation to fit the relationship between the CRR, CSR and some macroscopic parameters based on the field test such as the standard penetration test (SPT) and the cone penetration test (CPT). These stress-based methods are incapable of catching the essence of soil liquefaction, which is the accumulated strain realized by dissipating seismic energy (Kokusho 2013(Kokusho , 2017Azeiteiro et al. 2017;Tsaparli et al. 2017); the study on the prediction model based on the capacity energy is also conducted simultaneously, however, compared with the stress-based model, it was less focused on. Sonmezer (2019) conducted 36 deformation controlling cyclic simple shear tests to find that the dissipated energy is strongly dependent on relative density and effective stress and concluded a relationship with multiple regression analysis. Dief and Figueroa (2007) studied the influence of relative density and effective confining pressure, as well as the effect of different grain size distribution on the energy per unit volume required for liquefaction; the generalized relationships were obtained by performing regression analyses between the energy per unit volume at the onset of liquefaction and liquefaction affecting parameters. Ni et al. (2020) investigated the variations of the excess pore water pressure ratio and the double-amplitude axial strain with the dissipated energy by conducting a series of undrained cyclic triaxial tests on both loose and medium-dense sand. Jafariavval and Derakhshani (2020) adopted the M50 algorithm to fit the best correlation between parameters and derived formulas with a simple structure. Ke et al. (2020) proposed a unified viscous energy dissipation ratio (VEDR) based on the relationship between cyclic stress and strain to study the energy dissipation at each cycle. Millen et al. (2020) presented an energy-based approach for estimating the time to liquefaction; it is directly related to kinetic energy and is uniquely related to liquefaction triggering.
At the same time, with the development of artificial intelligence, due to its strong learning capacity, some AI algorithms were adopted to take the place of the traditional mathematic model, including linear genetic programming (LGP), multi expression programming (MEP), standard genetic programming (GP) (Alavi and Gandomi 2012), adaptive neuro-fuzzy inference system (ANFIS) (Cabalar et al. 2012), and the multivariate adaptive regression splines (MARS) (Zhang et al. 2015). Ghani et al. (2021) developed PCA (principal component analysis)-based advanced hybrid computational models to predict the liquefaction behavior of soils. Based on cone penetration test (CPT) and shear wave velocity test (Vs) measurements, Zhao et al. (2021) proposed the PSO-KELM model by combining kernel extreme learning machine (KELM) with particle swarm optimization (PSO) to assess soil liquefaction potential. Zhang and Wang (2020) proposed a novel hybrid classifier ensemble to improve generalizability by combining the predictions of seven base classifiers. Considering the unbalance database of in situ tests, such as the standard penetration test (SPT), Das et al. (2020) and Pham (2021) respectively used multi-objective feature selection algorithms (MOFS) and Forward Neural Network (FNN) to estimate the activation of soil liquefaction under seismic condition. Ghorbani and Eslami (2021) used the evolutionary polynomial regression to train an energybased evaluation model for the liquefaction of sand-clay mixtures using a dataset from shaking table experiments. Pirhadi et al. (2019) applied the response surface method (RSM) to develop six new strain energy models to estimate the capacity energy for soil liquefaction; besides, the effect of fine content was also studied. Pirhadi et al. (2018) obtained a new equation for soil liquefaction evaluation in sandy soil, in which two new earthquake parameters: standardized cumulative absolute velocity and closest distance from the site to the rupture surface (CAV5 and rrup) were introduced. Samui et al. (2015) adopted minimax probability machine regression (MPMR) and extreme learning machine (ELM) for the prediction of seismic liquefaction of soil based on strain energy.
According to the research result above, the relationship between the capacity energy and those measurable parameters is weak nonlinear, can be generally fitted as a linear function. However, it will induce an inevitable error if the relationship is purposely fitted as linear. Fortunately, as a supervised learning algorithm, the neural network has a powerful ability of nonlinear interpolation and has been applied to many complicated geotechnical engineering problems. Besides, because the nonlinearity of the fitted model can be controlled by adjusting the structure of the neural network, the technology of the neural network is also tried to evaluate the capacity energy. Baziar and Jafarian (2007) developed an artificial neural network (ANN) model to fit the relationship between soils initial parameters and the strain energy required for liquefaction in sands and silty sands. (Chen et al. 2013) investigated the nonlinear relationship between an increase in pore water pressure and the dissipation of seismic energy through the triaxial shear test of saturated sand and calculate the capacity energy, and presented the principle of nonlinear energy dissipation using an ANN to assess liquefaction potential.
Rahbarzare and Azadi (2019) presented a neuro-fuzzy GMDH (NF-GMDH) model for evaluation of the soil liquefaction potential through cyclic laboratory tests. Also, the gravitational search algorithm (GSA) was used in this energy-based model.
In this study, two kinds of neural networks, FNN and BPNN, were selected and tried to establish a most reasonable model to predict the capacity energy. To make it convenient to indicate their better prediction performance compared to available models, the dataset containing 284 cases used in the previous research was chosen and divided into 3 separate groups denoted as training, validation set and testing sets by the ratio of 5:1:1. Besides, the optimization algorithm of GA was adopted to BPNN to find the best initial coefficients. Firstly, the FNN model and GABP model were iterated to determine the model parameter by referring to the variation of fitness value and relative error of every case; simultaneously, the parameter of C c and D 50 was respectively excluded to test their influence degree according to their prediction error; then, the performance of FNN and GABP on the testing set were compared and analyzed to indicate the advantage of the later model; finally, six prediction results were compared with the previously proposed models to prove the higher accuracy of this newly proposed model. The introduction of the dataset

The analysis of the determined parameters
According to the established CPT-and SPT-based model, the essential parameters include the relative density D r , the effective overburden stress ′ and the fine content FC ; for the stratum in certain depth, the ′ can be represented by mean effective stress p ′ by assuming the lateral stress coefficient is fixed. Except these measurable macroscopic parameters, some mesoscopic parameters characterizing the soil structure also need to be introduced because the energy is dissipated by the friction among soil particles, and it is related to the particle morphology; however, due to the limited measurement technology, the present quantifiable parameters only include the coefficient of uniformity C u , coefficient of curvature C c and mean grain size D 50 . Therefore, these six parameters were determined. In this paper, these parameters' effect on the dissipated energy were analyzed. The D r and p ′ are both related to the contact number and contact force in the soil skeleton, and their increase will compact the soil skeleton to increase the movement resistance of soil grains, increasing the capacity energy under the condition of the same deformation. The variation of FC will influence the soil skeleton structure, with the FC , the sand particle structure is gradually filled with the fine particle, then it transforms to the fine particle structure; at the same time, the CSR firstly increases and then decreases, which means the capacity energy also varied. Besides, the FC also change the C u , C c and D 50 .

The introduction and division of the dataset
In this paper, to collect the capacity energy which cannot be obtained in history cases, the data used in this paper were collected by Baziar and Jafarian (2007), including 217 cyclic triaxial (Green 2001), 61 cyclic torsional shear (Towhata and Ishihara 1985) and 6 cyclic simple shear tests (VELACS project), in which these parameters above can be more accurately measured compared to the field test, improving the performance of the established model. Part of the dataset is presented in Table 1, the capacity energy Log(W) is the output parameter. The analysis for the parameter sensitivity had been conducted by (Alavi and Gandomi 2012; Ghorbani and Eslami 2021), but the parameter C c was ignored by (Alavi and Gandomi 2012; Zhang et al. 2015;Jafariavval and Derakhshani 2020); besides, the parameter D 50 was indicated to have a negative effect on the model accuracy in our submitted paper. So, to verify the effect of C c and D 50 , other two datasbases excluding these two parameters were trained, respectively. Table 1 The part of the database The database is randomly divided into three separate groups denoted as training, validation set and testing sets by the ratio of 5:1:1, in which the training set is used to train neural networks; the validation set is used to validate the performance of the training model to avoid the overfitting once the training of network model has been successfully accomplished; the test set is used to evaluate the generalization ability of the final neural networks and does not participate in the training of the neural networks. In general, the distribution form of the training dataset will affect the calculation efficiency of the gradient descent algorithm, and data preprocessing can improve the solving speed and accuracy. In this paper, the linear normalization method was used.
x = where X max and X min represent the maximum and minimum value of every input parameter; X is the input value before normalization; x is the normalized input value; Y max and Y min represents the maximum and minimum value of the output parameter; Y is the output value before normalization; y is the normalized output value.

The introduction of the applied neural network
The fuzzy neural network FNN is established by combining fuzzy theory and neural network; it is equipped with the advantage of neural network and fuzzy technology. Simultaneously, the rule reasoning and fuzzy concept are applied in the node to improve the transparency of the FNN; thus, the ability to interpret, reason and adapt is greatly increased. In this paper, the required structure of FNN is illustrated in Fig. 1.
It can be seen that there are five layers in the structure. The first layer is the input layer, and each node represents one parameter; the second layer is the fuzzy layer which can be calculated as follows: where x p represents the value of the pth input parameter; h p is the output of the pth membership function of the hth cluster; c h p and h p is the centre and width of membership function, which is used to evaluate the membership degree of input values to a fuzzy set of the input variables; H f is the fuzzy partition number, and the total node number of this layer is ∑ np p=1 H f . The third layer is fuzzy rule layer matching antecedent of fuzzy rule and calculates applicability of each rule, namely adopting fuzzy operator as a multiplicative operator, its value can be obtained for the second layer following: where Φ h is the fitness of the hth fuzzy logic rule of the current layer.
The fourth layer is the normalized layer; its value of each node is the same as that of the third layer, and normalized calculation is realized to avoid oscillation due to excessive modification parameters in the learning process.
( Finally, the predicted value is output in the output layer: h p x p + h 0 ;y f is the output value of the normalized capacity energy; h p and h 0 is the parameter of the neuron-fuzzy system.

Multi-layer BP neural network
BP neural network is one of the most widely used and acknowledged artificial neural networks nowadays, typically including an input layer, an output layer, and one or more hidden layers between them, as shown in Fig. 2.
The hyperparameters of the BP neural network, including the number of hidden layers, the number of neurons for each layer, the definition of the objective function, the selection of excitation function, and the approach of initializing weight and bias, all need to be set before network training. During the BP network learning process, the errors are subsequently backwards propagated through the network to adjust the weights and thresholds of the connections between two layers according to the gradient descent algorithm so as to minimize the sum of the absolute error between the output value of the network and the actual output value. BP neural network can theoretically approximate any nonlinear continuous function under the condition of reasonable structure and appropriate weights.
The relationship between the input layer and the first hidden layer can be expressed in Eq. (7) The relationship between the first hidden layer and the second hidden layer can be expressed in Eq. (8) where O h 1 i and O h 2 h 1 represents the output of the first and second hidden layer, when the number of hidden layer is more than two, their relationship is same as Eq. (8). x i is the normalized input parameter; w h 1 i , b h 1 and w h 2 h 1 , b h 2 are the weight and threshold of these two hidden layers; np is the input number; H 1 is the number of neuron in the first hidden layer.
The relationship between the last hidden layer and the output layer is: where y oh 2 represents the normalized output value; w oh 2 and b o represent the weight and threshold value of the output layer; H 2 is the number of neuron in the last hidden layer.

The establishment of the prediction model and the performance
As mentioned above, the relationship between the capacity energy and these determined parameters presents weak nonlinearity; according to the fitted linear mathematic formulations collected by (Alavi and Gandomi 2012), their value of the coefficient of correlation R ranges from 0.806 to 0.997. Therefore, while calibrating the model parameter, the neural network structure needs to be carefully tried to avoid the problem of over linearity or over nonlinearity.

The establishment procedure of the FNN model
In this paper, the structure of FNN is a multi-layer of forwarding feedback of local approximation and error backpropagation algorithm, which can be adopted in the network learning and training process. This error is defined as loss function and calculated as follows.
where x m i and x p i represents the actual value and predicted value of the capacity energy; n represents the number of case in the training set and the validation set.
Backpropagation means that if there is a difference between predicted output and actual output, hyperparameters of the last layer will be corrected first, and then the hyperparameter correction procedure will be processed on the input layer. To determine the network coefficient, Levenberg-Marquardt algorithm (LM) was used, which is considered a classic method for optimization. (Asvar et al. 2018;Leng et al. 2019).
In this paper, the number of the input parameter of three datasets is 6 and 5, respectively; the output index is the capacity energy log (W) , so the n p = 6, 5 in the first layer; at the same time, the node number m in the second layer was determined to be 6 by considering the overfitting of the model. The model is trained following steps: (1) Set the learning ratio = 0.001 , correct coefficient = 0.01 , the maximum iteration number t = 1000.
(2) The Gaussian subject function in Eq. (11) was employed because it is able to comprehensive consider n the membership of each element and the actual situation.
The initial value of c j i and j i is set randomly.

The establishment of the GABP model
Because the performance of the BP neural network depends on the local gradient, and the final training result is determined by the initial weight and bias, the inappropriate initial grid parameter will induce an optimal local solution. In the hybrid optimization algorithm, the GA is a global optimization algorithm to reach the approximate solution of problem search space by simulating biological evolution behavior. As a kind of random search algorithm, GA evolves population through individual hereditary, crossover, and mutation; thereby, it is utilized to optimize the initial value of weight and threshold in BP neural network, as shown in Fig. 3. After obtaining the optimized weight and threshold, they are applied to the BP neural network to be locally optimized to achieve more accurate prediction.
In the GA, the encoding firstly proceeds, and Real-number coding is utilized to avoid the conversion of number systems and long chromosome. In this paper, the chromosome of each individual consists of weights and bias. GA's Fitness function is used to evaluate the superiority and inferiority of each individual.
where x m i and x p i represents the actual value and predicted value of the capacity energy; n represents the number of case in the training set and the validation set. Figure 5a illustrates the fitness function value of the best individual in each iteration for GA.
In this optimized GA-BP algorithm, Eq. (10) is also applied to evaluate the neural network, and Roulette algorithm is selected for the selection operator. The population size is set as 10 and inherited to the next generation with 10% mutation and 30% crossover fraction; at the same time, five generations are employed. In the BP neural network, there is two hidden layer with six neurons and the learning rate is 0.1. The maximum number of training step is set as 100, and the minimum target error is set as 0.001. The input vector is fully connected to the hidden neurons by a tan-sigmoid transfer function, and the neurons of the hidden layer are fully connected to the output layer via a linear function.

The accuracy of the established model
In the process of training, three indexes are used to evaluate the model performance, including the coefficient of correlation R , the value of R should be close to one for a good model. (Samui et al. 2015), the root-mean-squared error (RMSE), and mean absolute error (MAE); they are calculated as follows: where x m i and x p i represents the actual value and predicted value of the capacity energy; n represents the number of case in the training set or the testing set.
For the three datasets, the variation of fitness value of two neural networks during the model training is illustrated in Figs. 4 and 5b; it can be seen that the number of iterations of BPNN optimized by GA is obviously less than that of FNN, and the initial loss function of BPNN is lower than that of FNN.
The fluctuation of the prediction error during training is illustrated in Figs. 6, 7, 8. It can be seen that the process of GABP is more stable than that of FNN even when the other two parameters were excluded; besides, the amplitude of error of GABP was less than FNN. It can also be known that after excluding D 50 and C C , the variation of FNN is more intensive than GABP, suggesting that the GABP is more stable during training. As for the GABP model, it can be seen the C C shows a negative influence on the training progress while the D 50 contributes to the training and validating process. By contrast, in the FNN model, these two parameters both show the positive effect and the influence of C C is higher.
After getting the prediction model, the comparison between the actual and the predicted log(W) of the test set and the training set is shown in Figs. 9, 10, 11. It can be seen that the value of R 2 for all models are very close to 1; for the GABP model, the distribution of points more concentrate to the middle line and there is nearly no point locating in the area between 10 and 20%, indicating the GABP model performed better than the FNN model. Besides, the red points of the training set and blue points of validation set in GABP model are closer to the middle line, suggesting its learning ability is better than FNN. After excluding the parameter of C C , for GABP model, the distribution become more concentration while that of the FNN become more disperse, indicating that the presence of C C disturbed the GABP model learning and the impact of this parameter can not be reflected; by contrast, the FNN can distinct the role of C C from other two parameters representing the particle Fig. 3 The flowchart of the GA-BP neural network size. After excluding the D 50 , the point distribution of both FNN and GABP became more disperse, indicating that the relationship of D 50 with the capacity energy can be leaned by these two models. It is indicated that the GABP can better fit this weak nonlinear relationship. It needs to be noticed that the variation in Fig. 11a is more intense than that in Fig. 10a, indicating in FNN model, the effect of D 50 is stronger; In terms of the training set and testing set, the performance of these two models was compared to previous research, as shown in Table. 2. It can be seen that during model training, for the FNN model and the GABP model, although the number of cases is smaller, the value of RMSE and MAE was mostly less than that of previous models, indicating that these two methods can more welly learn the relationship between the capacity energy and influencing factors. It should be noted that for FNN, after excluding the D 50 and C C , the training effect was improved and reduced, respectively, while it was improved for GABP, indicating that the influence of D 50 and C C was different while using a different method and furtherly proving the importance of selecting a reasonable index. According to the value of RMSE and MAE of testing set, For models trained by all parameters, including the MLR, ANFIS, FNN and GABP, the GABP behaved best; after excluding the parameter of C C , although the value of MAE of the GABP was slightly less more than the LGP, MEP, its RMSE and number of cases were less, indicating the greater learning ability of GABP; after excluding the D 50 . It can be indicated that the accuracy of the FNN model and GABP model were both reduced, but the GABP-D 50 still behaved better, suggesting that the GABP absolutely is the most appropriate method of machine learning to fit this kind of weak nonlinear problem.
Therefore, combining the performance of the training set and the testing set, the optimal prediction model was determined to be GABP-C C , in which only two macroscopic parameters p ′ D r , and three mesoscopic parameters FC , C u , D 50 need to be tested. while using it to predict the soil liquefaction, parameters p ′ D r can be directly obtained from the field test, and FC , C u , D 50 can be obtained through the indoor test on the soil sample without needing to consider the disturbing effect; while improving the model by adding more training data, only the FC , C u , D 50 need to be controlled to make different soil samples, which improving the test efficiency.

Conclusions
In this paper, the prediction model for the capacity energy which can be used to predict the potential of soil liquefaction, was studied. Different from the previous research, the weak nonlinear relationship was noticed and focused on while training the FNN and GABP model. At the time, referring to the author's finding in previous work, their response to the exclusion of parameter of C C and D 50 was studied, their prediction performance was also compared to the available models. Finally, the following conclusions can be drawn.
(1) Different from those stress-based prediction models, the relationship of capacity energy to the parameter is weak nonlinear, which cannot be regarded as a similar nonlinear relationship between the CSR and tested insite parameters.
(2) Comparing with the FNN model, the GABP model does not have to consider the effect of C C . (3) According to the prediction error, compared to those previously established models, the FNN and GABP model all fitted a good weakly nonlinear relationship for the capacity energy. (4) For different models, the effect of C C is different.
Author contributions All authors contributed to the study conception and design. Material preparation, data collection and the first draft were finished by YZ; W-HC proposed the idea of this research and calibrated the model parameter; Mahmood Ahmad provided the advice about the AI technology, and revised the first draft. All authors read and approved the final manuscript.
Funding There is no funding to support this research.

Availability of data and material Not applicable.
Code availability Not applicable.

Conflict of interest
The authors declare that they have no conflict of interest.