Optimized ANN-based approach for estimation of shear strength of soil

The shear strength of soil (SSS) is one of the attributes used most frequently in the design phase of construction projects. The conventional laboratory determination of shear strength (SS) is both costly and time-consuming, so geotechnical professionals have a practical need to predict the SSS accurately without laborious and expensive laboratory testing. In this paper, an attempt has been made to develop a common methodology for predicting the SSS using optimized models. For this purpose, three optimization algorithms (GA, MPA, and PSO) were used to tune the learning parameters (weights and biases) of an ANN, and three optimized ANNs (ANN-GA, ANN-MPA, and ANN-PSO) were developed. All the optimized models were validated using the RMSE, R², RSR, WI, and NSE indices. The validation showed that, of the three, ANN-GA produces good modelling outcomes in both the training and the testing phase and outperforms the other models. The GA was thus found to yield the most trustworthy ANN, which was also confirmed by the rank analysis of the developed models. The feature significance and feature selection plots show that the liquidity index (LI) is the key factor in predicting the SSS. After the LI, the water content (wc) is the second most important input variable affecting the SSS. In a broad sense, the factors associated with water were the primary characteristics that influenced the prediction of SSS.


Introduction
The long-term growth and economic success of any country depend heavily on the availability and progress of its infrastructure services, since these resources are basic factors that control the country's economy while contributing to the well-being of individuals. Rapid industrialization and urbanization have resulted in a constant requirement for technological advancement and research throughout the construction and engineering sectors. In the past few decades, there has been a tremendous transformation in the growth of infrastructure services; these amenities and buildings are erected on the ground surface and necessitate significant expenditure throughout the pre- and post-construction phases. The SSS is frequently linked to the failure of the ground and the collapse of buildings. Under various loading cases, the SSS depends mainly on the friction, cohesion, and interlocking between particles (Das & Sobhan, 2013). The SS of any geotechnical material may be expressed using the Mohr-Coulomb theory. According to this theory, the SS has two components, cohesion (c) and the angle of internal friction (φ), and varies with the normal stress (Zhang et al., 2010):

\[ \tau = c + \sigma \tan\varphi \tag{1} \]

where τ is the SS, c is the cohesion, σ is the normal stress, and φ is the angle of internal friction. Various factors affect the SSS, including IP, clay content (CC), and others (Das & Sobhan, 2013). Three different types of stress always exist in slope materials, i.e. major, intermediate, and minor principal stress, which act on three principal axes at right angles to one another. The SSS is a function of the normal stress on the slip surface, the angle of internal friction, and the cohesion. The angle of rupture, the stability condition of slope materials, the safety factor, and the SS are all affected by the soil's key physical properties of cohesion and angle of internal friction. For any soil, the values of these parameters depend on a variety of variables, including the soil's textural characteristics, its past development, its initial condition, its permeability characteristics, and the drainage conditions permitted during the test (Murthy, 2008).
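As a small numerical illustration of the Mohr-Coulomb relation, τ = c + σ tan φ with the symbols defined above, the following sketch evaluates the SS for a given normal stress. The function name, the units (kPa and degrees), and the input values are assumptions chosen only for illustration and are not taken from the study's dataset.

```python
import math


def mohr_coulomb_shear_strength(cohesion_kpa, normal_stress_kpa, friction_angle_deg):
    """Return tau = c + sigma * tan(phi), with phi supplied in degrees."""
    phi = math.radians(friction_angle_deg)
    return cohesion_kpa + normal_stress_kpa * math.tan(phi)


# Hypothetical soil: c = 10 kPa, sigma = 100 kPa, phi = 45 degrees.
tau = mohr_coulomb_shear_strength(10.0, 100.0, 45.0)
print(f"tau = {tau:.2f} kPa")
```

With φ = 0 the expression reduces to τ = c, which is a quick sanity check on the implementation.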
The SSS is influenced by its constituents: specific gravity (G), void ratio (e), water content (wc), plastic limit (PL), liquid limit (LL), CC, stress history, and relative density (Pham et al., 2020). Because soil frequently has varied particle sizes, a higher wc, and larger voids, its physical characteristics are complex (Das & Sobhan, 2013). The accurate assessment of both SS parameters should be a primary priority in the design of any structure founded on soil. These parameters can be determined either in the laboratory or in the field. In the laboratory, the parameters of interest (c, φ) may be measured with a direct shear test, a triaxial test, or an unconfined compressive strength test, whereas in the field they may be measured with vane shear test apparatus or by an indirect method of soil testing (Mollahasani et al., 2011; Murthy, 2008; Pham et al., 2018).
Computing is the procedure of transforming one type of information into another desirable form of outcome via the use of commands and operations. As a result of technological improvements, computer-aided techniques are among the most promising options for simulating a variety of structural engineering-related issues (Nguyen et al., 2020).
Companies and developers all around the world are discussing the incorporation of artificial intelligence (AI), deep learning (DL), and machine learning (ML) in this new era of technology. In the world of technology, such abbreviations are frequently used inappropriately. Since the 1980s, many search techniques have been established under the aegis of AI. Some approaches imitate biological processes such as neurological connections, the development of species through natural selection, and the behaviour of social groups of organisms like ants, birds, and bees. Other search methods, by contrast, rely on mathematical, logical, and statistical processes rather than natural processes to reach the best answer. AI is a broad field that includes all aspects of computational intelligence.
ML techniques are increasingly used to make accurate predictions for real-world problems such as clustering, correlation, regression, and classification (Kaveh & Iranmanesh, 1998; Kaveh & Servati, 2001; Kaveh et al., 2008; Varma et al., 2023; Xie et al., 2020; Xu et al., 2021). Geotechnical engineering is one of the engineering fields where AI is routinely used for mapping the nonlinear correlation between output and input parameters (Bardhan et al., 2022a; Wu et al., 2021; Zhang et al., 2021, 2022). In recent years, neural networks have also been used successfully as an alternative approach to a variety of geotechnical challenges (Pham et al., 2017; Shahin et al., 2009). An ANN depends on two critical phases, training and testing. In the training phase, large amounts of data must be labelled along with their corresponding characteristics, while in the testing phase, conclusions are drawn from prior experience and new, unseen data are labelled (Dargan et al., 2019). During training, an input dataset is given to the ANN, and the weights of the interconnections are adjusted so that the network reproduces the output associated with that input. Among the various ANN algorithms used for classification and regression problems, the multilayer perceptron (MLP) is regarded as one of the most effective.
Instead of being measured directly in the laboratory or field, the design parameters of geotechnical structures are often estimated with empirical correlations, which are created by fitting regression equations to an established database (Zhang et al., 2020). Nineteen different empirical methods for predicting SSS in unsaturated conditions were investigated by Garven and Vanapalli (2006). Several potential soil factors were evaluated for correlation with SSS in the method that was employed. Kiran et al. (2016) employed a PNN approach for the prediction of SSS on the basis of input parameters such as dry density (ρd), wc, IP, silt proportion, sand proportion, gravel proportion, and clay proportion (Pham et al., 2018). The researchers found that the algorithm falls short of precisely describing the complicated behaviour of soil. The reason could be that such algorithms were used to map curvilinear relationships between the input and response variables in spite of the availability of some empirical equations (Farrokhzad et al., 2012).
There has been a rise in the use of soft computing (SC) techniques such as ANN, ANFIS, and SVM for studying parameters that have nonlinear relationships with their important components (Gao et al., 2018). Prediction of soil properties is often performed using ANN, the adaptive neuro-fuzzy inference system (ANFIS), and gene expression programming (GEP) (Kayadelen et al., 2009). As a result of technological progress and the rising accuracy requirements of geotechnical engineering, researchers have attempted to improve AI models and have developed associated optimization techniques (Chou & Ngo, 2018; Kaveh, 2017a, 2017b). In general, one way to enhance the performance of a geotechnical design is to use meta-heuristic optimization algorithms (MOAs). Such techniques are effective and robust methods for tackling stochastic optimization problems. The predictive performance of ML approaches may be improved by optimizing the model parameters, which include biases, weights, and kernel function penalties.
Many MOAs have been developed in recent work on evolutionary computation and optimization (Abbasi et al., 2021; Kaveh, 2017a, 2017b). Recent studies have demonstrated that meta-heuristic-based optimization methods can improve the efficiency of ML-based algorithms (Bardhan et al., 2021; Kardani et al., 2021; Raja et al., 2022). MOAs are employed with ML systems for two reasons: (1) improving and predicting the model parameters during development; and (2) optimizing the hyperparameters linked to the network structure (Alkabbani et al., 2021; Yang & Shami, 2020). This type of optimization not only increases the ANN's prediction capacity but also helps to reduce the typical "local minima trap" problem by adjusting the learning parameters (biases and weights), giving notable benefits (Bui et al., 2019; Raja et al., 2022; Xie et al., 2021; Zhang et al., 2020). Heuristic methods, especially GAs and swarm intelligence methods, have gained a strong reputation (Biswas & Biswas, 2015). For example, the PSO method is inspired by the migration of birds in the sky (Kaveh, 2017a; Kennedy & Eberhart, 1995).
Several meta-heuristic optimization algorithms (MOAs) have been developed and employed to optimize the configuration of traditional machine learning (TML) techniques. They offer a balanced approach to exploration and exploitation (E&E), which improves the search performance and capabilities of traditional ML algorithms. This implies that optimizing ML algorithms with MOAs can locate the global optimum rather than a local minimum by producing optimal structures and optimal learning parameters. Additionally, several performance parameters, advanced visualization methods, sensitivity analysis, uncertainty analysis, and feature importance analysis have been used to compare the effectiveness of the suggested models. The main objective of this research work is to optimize the ANN model for the prediction of SSS using three different MOAs (GA, MPA, and PSO), giving the ANN-GA, ANN-MPA, and ANN-PSO models, and to identify the most efficient computational model.

Research significance
Before they can be used to solve important problems, ANNs have computational shortcomings that must be addressed, such as risks arising from their dimensionality (Chen et al., 2017) and local minima (Akkurt et al., 2003). Additionally, their use can be constrained by over-fitting, which is considered the primary weakness of TML techniques (Mohammadzadeh et al., 2014). Dang et al. (2019) and Rokach (2010) have shown that optimized algorithms enhance the accuracy of predictions while reducing over-fitting. Combining TML with MOAs results in a multidimensional structure that optimizes the E&E phases during optimization, providing an effective method for resolving difficult problems.

Methodology
When it comes to the design of geotechnical constructions, one of the most important aspects is the precise evaluation of the SS characteristics of the soil. In this paper, the methodology is divided into four parts: (a) data acquisition; (b) database pre-processing; (c) training and testing the model; and (d) model evaluation.
Data acquisition: The soil data used to train and evaluate the optimized models come from the Le Trong Tan Geleximco Project (Vietnam), based on the soil survey completed at the site by Cao et al. (2020). The development covers around 135 ha and was used to create a multi-storey building, restaurants, parking facilities, and theatres for entertainment. Soil samples were taken from depths of 1.20-39.5 m (Table 1). As in other studies, a total of 12 influencing factors were considered in this research: X1, X2, X3, X4, X5, X6, X7, X8, X9, X10, X11, and X12 (Bui et al., 2019; Cao et al., 2020; Moayedi et al., 2019b, 2020; Nhu et al., 2019).
Database pre-processing: Pre-processing of the data is required to guarantee that all variables are given equal attention throughout training and, hence, to speed up the learning process. In this research work, the data were divided into training and testing sets: the training database is used to construct the models, whereas the test database is used to assess the final tuned models so that the best one can be selected. As in previous work (Moayedi et al., 2019a; Rabbani et al., 2022; Xie et al., 2019), 80% of the data (i.e. 199 samples) were randomly selected for training the model and the remaining 20% (i.e. 50 samples) for testing it.
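A random 80/20 partition of the 249 samples can be sketched as follows. The function name, the fixed seed, and the use of index lists are illustrative assumptions; the text does not state how the random selection was actually implemented.

```python
import random


def split_indices(n_samples, train_fraction=0.8, seed=0):
    """Shuffle sample indices and split them into training and testing parts."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # seeded only to make the sketch reproducible
    n_train = int(n_samples * train_fraction)
    return indices[:n_train], indices[n_train:]


train_idx, test_idx = split_indices(249)
print(len(train_idx), len(test_idx))
```

For 249 samples, this rounding convention yields the 199 training and 50 testing samples quoted above.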
Train and test the model: For the best model, the input factors were trained and tuned, and then the model's performance was checked. To improve the performance of the model in such learning techniques, it is crucial to normalize the input as well as the output variables. All of the variables in this dataset have been normalized between −1 and 1. The normalization commonly used in ML algorithms is defined in terms of x_actual, the actual value in the dataset used in the study, x_min, its lowest value, and x_max, its highest value.
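The normalization formula itself is not reproduced in the text. A minimal sketch, assuming the usual min-max mapping onto the stated range of −1 to 1, is given below; the function name and the sample values are hypothetical.

```python
def scale_to_range(values, low=-1.0, high=1.0):
    """Min-max scale a list of numbers linearly onto [low, high]."""
    x_min, x_max = min(values), max(values)
    if x_max == x_min:
        raise ValueError("cannot scale a constant variable")
    span = x_max - x_min
    return [low + (high - low) * (x - x_min) / span for x in values]


# Hypothetical values of one input variable.
print(scale_to_range([0.0, 5.0, 10.0]))
```

Under this assumption the smallest value maps to −1, the largest to 1, and the scaling parameters should be taken from the training data only so that no information leaks from the test set.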
Model evaluation: Apart from R² and RMSE, additional performance parameters, namely NSE, RSR, and WI, were employed as comparative measures to assess the effectiveness of each of the optimized models in the present investigation (Bardhan et al., 2022b; Bui et al., 2019; Malik et al., 2020; Raja & Shukla, 2021). In general, a greater value of R² denotes a better predictive model, whereas a greater RMSE denotes a less effective one (Asteris et al., 2019; Huang et al., 2019). A WI equal to 1 indicates perfect agreement between observed and predicted values, whereas a WI equal to 0 indicates no match at all. An NSE equal to 1 likewise indicates close agreement between observed and predicted values (Hammed et al., 2021). These quantities (R², RMSE, NSE, RSR, and WI) may be determined from their defining equations.
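The defining equations are not reproduced in the text, so the sketch below uses commonly quoted forms of the five indices. These forms are assumptions: in particular, R² is computed here as the squared Pearson correlation, NSE as one minus the ratio of residual to total sum of squares, RSR as RMSE divided by the standard deviation of the observations, and WI as Willmott's index of agreement. The authors' exact formulations may differ.

```python
import math


def evaluate(observed, predicted):
    """Return RMSE, R2, NSE, RSR and WI for paired observed/predicted values."""
    n = len(observed)
    o_mean = sum(observed) / n
    p_mean = sum(predicted) / n
    pairs = list(zip(observed, predicted))
    ss_res = sum((o - p) ** 2 for o, p in pairs)          # residual sum of squares
    ss_tot = sum((o - o_mean) ** 2 for o in observed)     # spread of observations
    ss_pred = sum((p - p_mean) ** 2 for p in predicted)   # spread of predictions
    cov = sum((o - o_mean) * (p - p_mean) for o, p in pairs)
    wi_den = sum((abs(p - o_mean) + abs(o - o_mean)) ** 2 for o, p in pairs)
    return {
        "RMSE": math.sqrt(ss_res / n),
        "R2": cov ** 2 / (ss_tot * ss_pred),
        "NSE": 1.0 - ss_res / ss_tot,
        "RSR": math.sqrt(ss_res) / math.sqrt(ss_tot),
        "WI": 1.0 - ss_res / wi_den,
    }


# Hypothetical observed and predicted values, for illustration only.
print(evaluate([1.0, 2.0, 3.0, 4.0], [1.0, 2.0, 3.0, 5.0]))
```

The sketch assumes that neither the observations nor the predictions are constant; otherwise the denominators vanish.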

Artificial neural network (ANN)
A neural network is a set of algorithms designed to find connections between different datasets and to imitate the workings of the human brain (Kaveh & Servati, 2001). It is used for different purposes, including classification, regression, image recognition, and many others. ANNs are very good at finding a solution to a difficult problem that is not easy to define analytically. The ANN algorithm is frequently used as an approximator in the nonlinear analysis of the variables used in model making (Xie et al., 2019; Wang et al., 2018). These techniques are frequently applied to engineering problems because of their suitability for optimization tasks (Gao et al., 2018). Compared with more conventional methods of numerical analysis, such as regression analysis, a properly trained ANN model may produce more reliable findings with a significant reduction in the amount of processing work required (Kaveh & Khalegi, 2000). A network structure is the arrangement of nodes and connections in a network. The information is first received by the input layer and then directed to the hidden layer. The connections between these layers initially assign a random weight to every input neuron, and a bias is then added to every input neuron (Kaveh & Iranmanesh, 1998).

Particle swarm optimization (PSO)
This algorithm was first developed by two researchers in 1995 (Kennedy & Eberhart, 1995). The basic idea behind this strategy is to partition the solution space into zones and assign a particular creature to every zone (Ebid, 2021; Kaveh, 2017a). PSO uses the parameters of location (x) and velocity (v) to efficiently find optimal solutions to engineering, optimization, and classification problems (Kuntoji et al., 2018). The basic PSO method assigns random location and velocity vectors to each particle in an n-dimensional search space (Fig. 1).
Each particle is assigned a fitness value based on its initial location in the search space using the chosen fitness function. In every generation, the two best positions are updated. Pbest_id is the best position an individual has reached to date, and Gbest_id is the best position found by the entire population, which each individual also follows. That is, every member of the population moves to find its own best fit within the swarm's ideal structure (Zhou et al., 2011). If the conditions at position x_id are not satisfied, the next location is produced with a different particle velocity (Pham et al., 2018).
In their basic form, the update formulas are as follows:

\[ v_{id}^{t+1} = v_{id}^{t} + c_1 r_1 \left( pb_{id} - x_{id}^{t} \right) + c_2 r_2 \left( gb_{id} - x_{id}^{t} \right) \]

\[ x_{id}^{t+1} = x_{id}^{t} + v_{id}^{t+1} \]

where x_id^t is the location of individual i at generation t, v_id^t is the velocity of individual i at generation t, x_id^(t+1) is the location of individual i at generation t + 1, v_id^(t+1) is the velocity of individual i at generation t + 1, pb is the personal best position of individual i so far, and gb is the global best position of all the individuals so far (Pham et al., 2018); c1 and c2 are the cognitive and social parameters, and r1 and r2 are random numbers ranging from 0 to 1.
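The PSO update can be sketched in a few lines using the quantities defined above (x, v, pb, gb, c1, c2, r1, r2). This is a minimal illustration rather than the implementation used in the study: the inertia weight `w`, the parameter values, the search bounds, and the sphere test function are all assumptions introduced here. Setting `w = 1` gives the update without inertia.

```python
import random


def pso_minimize(objective, dim, n_particles=20, n_iter=100,
                 c1=1.5, c2=1.5, w=0.7, bounds=(-1.0, 1.0), seed=0):
    """Minimise `objective` over `dim` variables with a basic particle swarm."""
    rng = random.Random(seed)
    lo, hi = bounds
    x = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    v = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in x]                      # personal best positions
    pbest_val = [objective(p) for p in x]
    g = min(range(n_particles), key=pbest_val.__getitem__)
    gbest, gbest_val = pbest[g][:], pbest_val[g]   # global best position
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v[i][d] = (w * v[i][d]
                           + c1 * r1 * (pbest[i][d] - x[i][d])
                           + c2 * r2 * (gbest[d] - x[i][d]))
                x[i][d] += v[i][d]
            val = objective(x[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = x[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = x[i][:], val
    return gbest, gbest_val


def sphere(p):
    """Simple test function with its minimum of 0 at the origin."""
    return sum(c * c for c in p)


best, best_val = pso_minimize(sphere, dim=2)
print(best, best_val)
```

When PSO is used to train an ANN, each particle position would hold one candidate set of weights and biases, and the objective would be the prediction error.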

Genetic algorithm (GA)
GA is an artificial intelligence method that models the reproductive process in accordance with Darwin's famous "survival of the fittest" principle. GAs maintain a "population" of potential solutions to a particular problem. Elements are randomly drawn from this population and can "reproduce" by combining aspects of two parent solutions (Kaveh et al., 2008). Importantly, the likelihood that an element will be selected for reproduction is based on its fitness, which is the value of the objective function associated with the solution. If the parents are fit, their children are more likely to surpass them and to survive. This procedure is repeated until an entire generation of fit individuals is obtained. The same idea can be applied to search problems. The fitness function allows the GA to assess the quality of every chromosome in the population, and each individual receives a fitness value. Once a chromosome is chosen, the selection probabilities are renormalized without the chosen chromosome, and the second parent is chosen from among the remaining chromosomes. Individuals with high fitness have a better probability of being chosen for reproduction.
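The selection step described above, in which a first parent is drawn in proportion to fitness and the second is drawn from the remaining chromosomes after renormalization, can be sketched as follows. The one-point crossover operator, the function names, and the toy population are assumptions for illustration; the text does not specify which crossover or mutation operators were used.

```python
import random


def select_parents(population, fitness, rng):
    """Pick two distinct parents with probability proportional to fitness."""
    indices = list(range(len(population)))
    first = rng.choices(indices, weights=fitness, k=1)[0]
    # Remove the chosen chromosome; rng.choices renormalises the remaining weights.
    rest = [i for i in indices if i != first]
    second = rng.choices(rest, weights=[fitness[i] for i in rest], k=1)[0]
    return population[first], population[second]


def one_point_crossover(parent_a, parent_b, rng):
    """Child takes the head of parent_a and the tail of parent_b."""
    cut = rng.randrange(1, len(parent_a))
    return parent_a[:cut] + parent_b[cut:]


rng = random.Random(0)
population = [[0, 0, 0], [1, 1, 1], [2, 2, 2]]   # hypothetical chromosomes
fitness = [1.0, 2.0, 3.0]                        # hypothetical fitness values
a, b = select_parents(population, fitness, rng)
child = one_point_crossover(a, b, rng)
print(a, b, child)
```

Fitness-proportionate selection needs non-negative weights where larger is better, so an error measure such as RMSE would first have to be transformed, for example to 1 / (1 + RMSE); that transformation is also an assumption.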

Marine predators algorithm (MPA)
This algorithm is a simple yet powerful meta-heuristic optimization technique. MPA, like most meta-heuristics, takes a population-based strategy, scattering the initial solutions randomly over the search space. The MPA optimization procedure is broken down into three primary stages (Fig. 2) that model the stages of life of a predator and its prey: (1) high-velocity ratio (i.e. the prey is travelling more rapidly than the predator), (2) unit velocity ratio (i.e. the prey and the predator are travelling at nearly the same velocity), and (3) low-velocity ratio (i.e. the predator is travelling more rapidly than the prey). Since the prey flees the predator as quickly as possible and travels across the search space at high speed, the first phase essentially corresponds to the exploration phase. Since the prey population advances more slowly while the predator population moves more quickly, the subsequent phase serves as a transition to the exploitation phase. As a result, one population is exploring in one area while the other population is exploiting in another, which means that the predator is in charge of exploitation while the prey is in charge of exploration (Vankadara et al., 2022). Since only the predator population evolves slowly and converges on the optimal option, the third phase corresponds to the exploitation phase.
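The text does not state how the algorithm switches between the three stages. A minimal sketch, assuming the run is split into equal thirds of the iteration budget (the way MPA is commonly described), is shown below; the function name and the phase labels are hypothetical.

```python
def mpa_phase(iteration, max_iterations):
    """Map an iteration counter onto one of the three MPA stages."""
    fraction = iteration / max_iterations
    if fraction < 1 / 3:
        return "exploration"    # high-velocity ratio: prey faster than predator
    if fraction < 2 / 3:
        return "transition"     # unit velocity ratio: similar speeds
    return "exploitation"       # low-velocity ratio: predator faster than prey


for it in (0, 250, 499):
    print(it, mpa_phase(it, 500))
```

The position updates within each stage (the random-walk steps of predator and prey) are omitted here because they are not described in the text.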

Results and discussion
This section discusses how well the optimized ANNs and conventional models built to predict SSS performed. All of the models were developed and evaluated using the same set of training and test data. Various performance measures have been investigated, analysed, and discussed for use in determining a model's predictive efficacy.

Descriptive analysis
All the input variables used in this paper were compared with SSS (Y) and represented in graphical form (Fig. 3).
From the graphical plot, we can clearly see the variation in the range of all the variables with respect to SSS. The correlation coefficient matrix of the dataset, including the label, was established and is shown in Fig. 4. This plot can be used to summarize the strength of the linear relationship between two data samples. From this map, the rank correlation coefficients (rs) were determined. The parameters were found to have a reasonably good correlation. Using Fig. 4, we can see that SSS (Y) has a strong correlation with LI (X12) and a moderate correlation with LL (X9), but very little correlation with sand proportion (X2), loam proportion (X3), clay proportion (X4), and IP (X11).

Fig. 1 Concept of updating location in PSO algorithm (Ellahi & Abbas, 2020)

Importance of the variable used and its interpretation
The results of the feature importance tests conducted over a total of 1000 simulations are presented in Fig. 5, which also depicts the relative importance of the variables (Khan et al., 2022). These outcomes agree with the correlation matrices of the parameters, which contain the maximum coefficient for the same variables, and this supports the validity of the findings derived from both sets of results. When trying to predict SSS, it is readily apparent that the LI is the single most important input variable. Next to LI, the important parameter is wc, which has an effect on the predicted outcome of the SSS, and those two variables are followed by e and ρb, after which the depth of the sample comes into play. The following order of significance of the factors that influenced the outcome may be drawn: LI > ρd > ρb > e > wc > PL > LL > sample depth > IP > proportion of loam > proportion of sand. It was found that the factors associated with water were the most important elements in forecasting SSS (Rabbani et al., 2023). This is reasonable, since water reduces the friction and bonding among soil particles, and therefore the SS of soil with a lower water content will be greater than that of soil with a higher water content (Ly and Pham, 2020).

Performance analysis
In this section, the performance of each model is analysed. It is difficult to fully reflect the models' performance in terms of generalization because the data are limited (there are only 249 data points).
A maximum of 500 iterations was used to ensure that the mathematical models in this investigation achieve an adequate level of precision. The algorithms first build the objective function and then search the data to identify suitable values for the parameters. After the ANN has been built with the inputs provided, its output may be evaluated. The root mean square error (RMSE) was employed as the objective function in order to determine the error after every iteration. On the training dataset, the RMSE values for ANN-GA, ANN-MPA, and ANN-PSO were 0.1068, 0.1088, and 0.1256, respectively; on the testing dataset, the corresponding values were 0.1172, 0.1060, and 0.1178 (Table 2), which indicates that GA did better than the other algorithms at optimizing the ANN in the training phase. In addition to being tabulated for both the training and the testing datasets (Table 2), the performance parameters of all of the optimized models are also shown graphically (Fig. 6).

Parametric configuration of developed optimized models
Several MOAs were utilized to enhance the effectiveness of traditional ANNs in a range of technical fields over the past decade (Raja & Shukla, 2021; Skentou et al., 2023).
As an example, consider an optimized model with IP inputs, NH hidden neurons, and O outputs; the total number of biases and weights to be optimized is then IP × NH + NH + NH × O + O. The methodology for developing the optimized ANNs may be summarized as follows: (i) initialize the ANN model; (ii) set the hyper-parameters used to generate the models, such as the number of hidden layers and neurons and the activation function; (iii) initialize the MOAs and randomly generate the biases and weights; (iv) after initialization of the MOAs, select the various factors that affect them. Weights and biases are used to define the computational connections among the components of an ANN. The optimal number of hidden neurons was determined by trial and error. In the course of this inquiry, 500 iterations were carried out to ensure that the simulations had a suitable level of reliability. The convergence behaviour of the models may be evaluated (Fig. 7) by assessing the error at every iteration of the process.
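To make the link between the optimizer and the network concrete, the sketch below counts the learning parameters with the expression IP × NH + NH + NH × O + O, decodes a flat parameter vector into a single-hidden-layer network with one output, and evaluates the RMSE that an MOA would minimise. The tanh activation, the ordering of parameters in the vector, and the choice of five hidden neurons are assumptions; the text gives neither the activation function nor the final number of hidden neurons.

```python
import math


def n_parameters(n_inputs, n_hidden, n_outputs):
    """Total number of weights and biases: IP*NH + NH + NH*O + O."""
    return n_inputs * n_hidden + n_hidden + n_hidden * n_outputs + n_outputs


def predict(params, x, n_inputs, n_hidden):
    """Forward pass of a one-hidden-layer, single-output network."""
    n_w1 = n_inputs * n_hidden
    w1 = params[:n_w1]                          # input-to-hidden weights
    b1 = params[n_w1:n_w1 + n_hidden]           # hidden biases
    w2 = params[n_w1 + n_hidden:n_w1 + 2 * n_hidden]  # hidden-to-output weights
    b2 = params[n_w1 + 2 * n_hidden]            # output bias
    hidden = [
        math.tanh(sum(w1[j * n_inputs + i] * x[i] for i in range(n_inputs)) + b1[j])
        for j in range(n_hidden)
    ]
    return sum(w2[j] * hidden[j] for j in range(n_hidden)) + b2


def rmse_objective(params, inputs, targets, n_inputs, n_hidden):
    """RMSE over a dataset; this is the value an MOA would try to minimise."""
    errors = [(predict(params, x, n_inputs, n_hidden) - y) ** 2
              for x, y in zip(inputs, targets)]
    return math.sqrt(sum(errors) / len(errors))


# 12 inputs as in the study; 5 hidden neurons is a hypothetical choice.
print(n_parameters(12, 5, 1))
```

Each candidate solution handled by GA, MPA, or PSO would then be one such flat vector, and `rmse_objective` would serve as its fitness evaluation.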

Actual vs predicted SSS of optimized models
The results of both phases can be seen in the form of scatter plots (Figs. 8 and 9). One indicator of how well the prediction was made is the correlation between the actual and predicted SSS values. Table 3 displays the range of the actual SSS values as well as the results produced by the machine learning models for both phases. Table 2 contains the R² values for both the training and the testing phases. According to the outcomes, ANN-GA generates results that are more trustworthy than those produced by the rest of the models.
Because of the model's acceptable accuracy, it is possible to utilize it as a suitable alternative for the prediction of SSS in any project that may be undertaken in the future.

Rank analysis of optimized models
After all the performance indices have been calculated for the training and testing phases, the models are ranked accordingly. Two distinct models that produce identical results may receive equal ranking scores. The ideal value of R², NSE, and WI is 1, whereas for RMSE and RSR it is 0. For each index, the top model received a maximum of three points (as three models were used in this study), while the worst model received one point (Xue et al., 2023). All of the individual scores are then summed to provide a total rank (Mustafa et al., 2022). On the basis of the calculated performance parameters, the rank analysis of all the developed optimized models was computed and is shown in tabular form (Table 4) to pick the best model (Xue et al., 2023). The highest score (15) is reached by ANN-GA in the training phase, followed by ANN-MPA (10) and ANN-PSO (5), whereas in the testing phase the highest score (15) is reached by ANN-GA, followed by ANN-MPA (9) and ANN-PSO (6), as in Table 4. ANN-GA's overall score for both phases together is 30 (15 + 15), considerably higher than ANN-MPA's 19 (10 + 9) and ANN-PSO's 11 (5 + 6). This gives an in-depth evaluation of the models' predictive ability (Mustafa et al., 2022). The ranked values indicate that optimization has a considerable influence on the effectiveness of the ANN. As a result, ANN-GA outperforms the other developed models in predicting SSS. In simple words, GA outperforms MPA and PSO in improving the performance of the parent model (ANN).
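The scoring scheme can be illustrated with the training-phase RMSE and R² values quoted in this paper. The sketch below covers only these two indices, so its totals differ from the five-index scores in Table 4; the function name is hypothetical and ties are not handled.

```python
def rank_scores(metrics, higher_is_better):
    """Sum per-index scores: best model gets n points, worst gets 1."""
    models = list(next(iter(metrics.values())))
    totals = {m: 0 for m in models}
    for name, values in metrics.items():
        # Order models from worst to best for this index.
        ordered = sorted(models, key=lambda m: values[m],
                         reverse=not higher_is_better[name])
        for points, model in enumerate(ordered, start=1):
            totals[model] += points
    return totals


# Training-phase values reported in the text.
training = {
    "RMSE": {"ANN-GA": 0.1068, "ANN-MPA": 0.1088, "ANN-PSO": 0.1256},
    "R2": {"ANN-GA": 0.7925, "ANN-MPA": 0.7857, "ANN-PSO": 0.7161},
}
direction = {"RMSE": False, "R2": True}
print(rank_scores(training, direction))
```

With all five indices included, a model that is best on every index would reach the maximum score of 15 per phase.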

Limitations and future scope
There are certain limitations associated with the use of optimized models for expressing the link between the SSS and the several factors that contribute to it. These restrictions are highlighted here in preparation for future studies. The first constraint of this study is that the employed variables are considered only between their minimum and maximum values. In other words, the model may not be able to make accurate predictions for inputs whose values lie outside these limits.
The second limitation is that limited data (only 249 samples) are used to train and test the model. Because the data were collected from an actual building project, the soil in the region studied has distinctive qualities. This led to lower values for the critical elements, which in turn led to lower SSS values. Consequently, when putting the generated models into practice, it is important to take into consideration the whole quantity of data that is accessible. To improve the precision and dependability of the predictive analysis, a larger dataset is the best alternative.
On the other hand, the limitations of the parent algorithm, i.e. ANN, are that it is complicated and requires a large amount of data to train, which may be costly and extremely time-consuming. ANNs are also prone to overfitting, which means they might discover patterns in the data that do not exist, and this might produce incorrect results.

Concluding remarks
At the design stage of construction projects, the SSS is used most of the time because it is a significant component. The usual approach to determining this parameter in a laboratory is laborious and costly. The ability to predict the SSS accurately without laborious and expensive laboratory testing is one of the practical demands of geotechnical professionals. In the present study, three optimization algorithms (GA, MPA, and PSO) were used to tune the biases and weights of the ANN. As a result, three optimized ANNs (ANN-GA, ANN-MPA, and ANN-PSO) were constructed. The indices R², WI, NSE, RMSE, and RSR were used to evaluate all the optimized models. The correlation chart of the input variables indicates that SSS (Y) correlates strongly with LI (X12) and only moderately with LL (X9). On the other hand, it is practically independent of the sand proportion (X2), loam proportion (X3), clay proportion (X4), and IP (X11).
Based on the performance indices (R², RMSE, NSE, WI, and RSR), the outcomes of the optimized ANN models were compared with the experimental values by scatter plot. RMSE is used as the objective function to find the error at different iterations. The best performance of a model is judged in terms of a low RMSE and a high R². The RMSE for the optimized models ANN-GA, ANN-MPA, and ANN-PSO was 0.1068, 0.1088, and 0.1256, respectively, for the training dataset, whereas it was 0.1172, 0.1060, and 0.1178 for the testing dataset. R² for ANN-GA, ANN-MPA, and ANN-PSO was 0.7925, 0.7857, and 0.7161, respectively, for the training dataset, whereas it was 0.7891, 0.8172, and 0.7937 for the testing dataset. It was shown that GA generates the most reliable ANN by virtue of the consistency of the results produced by the corresponding optimized model.
To assess the precision of the predicted SSS, the predicted values of the optimized models are plotted against the observed experimental values in the form of a scatter plot. The scatter plot makes it evident that the results produced by ANN-GA are more trustworthy than those produced by the other optimized models. That means that ANN-GA captures the connection between SSS and the many factors that might influence it more accurately, which ultimately results in a more reliable prediction of SSS. The rank analysis score of the ANN-GA optimized model in both phases together is 30, considerably higher than ANN-MPA's 19 and ANN-PSO's 11. As a result, ANN-GA outperforms the other optimized models in predicting SSS.

Fig. 4 Correlation plot of the dataset used

Fig. 6 Performance parameter plot for an ANN optimized model (for training and testing datasets)

Fig. 9 Scatter plot of optimized models (for testing data)

Table 3 Range of SSS in the real and optimized models (for testing and training data)