Evaluation of total dissolved solids in rivers by improved neuro fuzzy approaches using metaheuristic algorithms

Substantial deterioration of surface water quality, mainly caused by human activities and climate change, makes the assessment of water quality a global priority. Thus, in this study, four metaheuristic algorithms, namely particle swarm optimization (PSO), differential evolution (DE), ant colony optimization algorithm (ACOR), and genetic algorithm (GA), were employed to improve the performance of the adaptive neuro-fuzzy inference system (ANFIS) in the evaluation of surface water total dissolved solids (TDS). Monthly and annual TDS were considered as target variables in the analysis. To evaluate and compare the accuracy of the models, an economic factor (execution time) and the statistical indices of the coefficient of determination (R²), Kling-Gupta efficiency (KGE), root mean squared error (RMSE), mean absolute error (MAE), and Nash-Sutcliffe efficiency (NSE) were utilized. The results revealed that the hybrid methods used in this study could enhance the classical ANFIS performance in the analysis of monthly and annual TDS of both stations. For further clarification, the models were ranked using the TOPSIS approach by simultaneously applying the effects of statistical parameters, temporal and spatial change factors, and execution time. This approach significantly facilitated decision-making in ranking models. The ANFIS-ACOR annual model considering discharge had the best performance in the Vanyar Station; furthermore, the ANFIS-ACOR monthly model ignoring discharge was outstanding in the Gotvand Station. In total, after utilizing the two defined and proposed temporal and spatial change factors, the ANFIS-ACOR and ANFIS-DE hybrid models had the best and worst performance in TDS prediction, respectively.

Because of the growing demand for fresh water and restricted access to water resources (Khataee et al., 2013), observation and control of river water quality have an incontrovertible role in environmental conservation and sustainable distribution of natural resources (Varol et al., 2022; Zhang et al., 2021; Deng et al., 2021; Jamei et al., 2020).
One of the widely accepted factors that has been effectively used for studying water quality is total dissolved solids (TDS) (Kabolizadeh et al., 2022; Salmani and Salmani Jajaei, 2016; Ghfolamreza et al., 2016). TDS comprises dissolved organic matter and a variety of inorganic salts, e.g., sodium (Na⁺), magnesium (Mg²⁺), calcium (Ca²⁺), and potassium (K⁺) as cations, as well as chloride (Cl⁻), sulfate (SO₄²⁻), nitrate (NO₃⁻), and bicarbonate (HCO₃⁻) as anions (Sun et al., 2021). Based on the World Health Organization (WHO) standards, the acceptable TDS range for agricultural uses is 450-2000 mg/l (Gawande & Sarode, 2021). As high concentrations of this parameter can harm human health and the environment, it is important to focus on it in the evaluation of water resources, especially surface waters (Chellaiah et al., 2021). The large number of parameters involved makes on-site investigations and laboratory examinations of water quality costly and time-consuming (Asadollah et al., 2021). Numerical methods used for water quality modeling have been shown to be time-consuming and to have weak optimization performance. Artificial intelligence-based models overcome these limitations and also have the benefit of being less sensitive to missing values and being able to perform sophisticated mathematical calculations with a huge quantity of data and nonlinear structures (Najafabadipour et al., 2022). Various studies from different fields have also used artificial intelligence-based techniques like artificial neural networks (ANNs), ANFIS, and gene expression programming (GEP) to estimate and predict water quality (Al-Mukhtar & Al-Yaseen, 2019; Antanasijević et al., 2014; Hu et al., 2020; Khalil et al., 2011; Palani et al., 2008; Yoosefdoost et al., 2022). Among all these methods, ANFIS was introduced as a practical technique for evaluating water quality (Shah et al., 2021). The ANFIS is a neural network based on the Takagi-Sugeno fuzzy inference system (Abd El-Mageed et al., 2022). By taking advantage of the mathematical characteristics of general function estimators, fuzzy systems can effectively process and integrate logical information. This method is transparent to the user and results in fewer memorization errors (Tutmez et al., 2006). In addition, this method combines the advantages of ANN and fuzzy approaches into a single framework by integrating the fuzzy inference system into the adaptive network framework (Ying & Pan, 2008). Therefore, the neuro-fuzzy method has been widely applied in hydrology and water quality investigations (Gavili et al., 2018).
In neuro-fuzzy networks, choosing the proper parameters and determining the appropriate network structure are essential. The more accurate and efficient the training algorithm, the more successful the model will be. Most of the practical training algorithms for fuzzy networks are gradient-based, especially backpropagation and Levenberg-Marquardt. Although these algorithms are prevalent in the literature, they have weaknesses that need to be considered. For instance, some of them can only locate local optima (Azad et al., 2018). In addition, some training algorithms, such as Levenberg-Marquardt, are methodologically complicated and need a large amount of memory for calculations that are less likely to be optimized. Thus, new approaches are required to avoid the weaknesses of gradient-based algorithms. Metaheuristic algorithms can search for solutions globally instead of settling exclusively on local optima, and they can also handle an intricate problem with a high calculation volume, provided a proper algorithm is used.
Many studies have been devoted to estimating surface water quality using modified AI methods. Jalalkamali (2015) applied the ANFIS improved with the ant colony optimization algorithm (ACOR), particle swarm optimization (PSO), differential evolution (DE), and genetic algorithm (GA) to evaluate the groundwater quality of Kerman Province, Iran. ANFIS demonstrated appropriate performance in evaluating all water quality parameters during the training phase. The findings indicated that the applied metaheuristic algorithms significantly improve the accuracy of ANFIS in predicting river water quality parameters. Banadkooki et al. (2020) employed artificial neural network (ANN), ANFIS, and support vector machine (SVM) models for the estimation of the total dissolved solids of aquifers. The grey wolf optimization (GWO), gravitational search algorithm (GSA), moth flame optimization (MFO), particle swarm optimization (PSO), shark algorithm (SA), and cat swarm optimization (CSO) were used to train the models. The results showed that the ANFIS-CSO and ANFIS-MFO models outperformed the other ones. Aghel et al. (2019) evaluated water quality parameters using a hybrid particle swarm optimization-neuro-fuzzy (PSO-ANFIS) approach. The results showed that both methods were highly satisfactory for estimating inorganic water quality factors, and that the ANFIS-PSO method was more flexible in modeling than the ANFIS approach. Azad et al. (2018) predicted the water quality parameters of the Gorganrood River using ANFIS optimized by Genetic Algorithm (GA), Ant Colony Optimization for Continuous Domains (ACOR), and Differential Evolution (DE) algorithms. The results showed that, for predicting electrical conductivity (EC) and total hardness (TH) in the test stage, ANFIS-DE was the most appropriate model. Liu et al. (2022) used support vector machine particle swarm optimization with deep learning to investigate the causes of water pollution and control countermeasures in the Liaohe estuary. The results showed that the SVM-PSO approach has excellent predictive performance.
In recent decades, various pollution loads, including agricultural drainage and urban and industrial wastewaters, have damaged the water quality of many surface water resources (Wang et al., 2020). The Karun River is the largest and only navigable river in southwest Iran, and its water quality is a matter of concern (Diagomanolin et al., 2004). An environmental catastrophe has emerged from the building of the massive Upper Gotvand Dam on the Karun River. This is because of the accumulation of 66.5 million metric tonnes of dissolved salt in the reservoir and a dramatic increase in the reservoir's water salinity to 200 g/L (Jalali et al., 2019). The Aji Chay River, in northwest Iran, is another very important river of the country with problematic water quality. This river drains into Lake Urmia (LU), the world's second-largest hyper-saline lake, and its water quality influences LU's water quality (Andaryani et al., 2021).
Because of the salt formations located upstream of the Vanyar Station in the Aji Chay River basin, the water of this river is salty. These two rivers can serve as excellent representatives in evaluating the capability of optimized artificial intelligence methods in studying surface water quality. Therefore, the Vanyar and Gotvand stations were selected as study areas in this research.
As yet, no published research has compared different metaheuristic algorithms for enhancing the precision of the ANFIS method in the simultaneous evaluation of monthly and annual TDS for the Aji Chay and Karun rivers. Thus, in this study, four metaheuristic algorithms, namely genetic algorithm (GA), ant colony optimization (ACOR), particle swarm optimization (PSO), and differential evolution (DE), were used to improve the accuracy of the ANFIS model in the evaluation of monthly and annual TDS at the Vanyar Station (the Aji Chay River) and the Gotvand Station (the Karun River) in Iran. The effect of omitting the discharge parameter from the input dataset was also assessed in all calculations. Finally, the TOPSIS method was utilized to rank the hybrid models and select the most precise one. Given the expected proximity in performance between these models under specific spatial and temporal conditions, two factors relating to spatial and temporal changes were proposed and employed in the TOPSIS method to facilitate the evaluation and decision-making procedures. The outline of this research is shown in Fig. 1.

Study area and data description
To accomplish the objective of this study, two distinct areas in Iran were selected based on their different climatic conditions and exposure to salt domes and water pollution: the Karun River in a hot and dry climate (Eskandari and Mahmoudi Sarab, 2022), and the Aji Chay River in a cold and mountainous climate (Sharafkhani et al., 2018). Fig. 2 shows the maps of the Gotvand and Vanyar Stations.
At around 950 kilometers, the Karun River is the longest river situated in the southwest area of Iran and is the only navigable river in the country, providing water to several cities and villages (Fakouri et al., 2019; Golshan et al., 2020). It also meets the water demands of various industries, including steel, oil, petrochemical, sugarcane, paper, and cement industries, as well as farmlands along the river. Since agricultural, domestic, and industrial wastewater discharges into the Karun River, its salinity has been increasing (Karamouz et al., 2009). One of the most significant hydrometric stations on the Karun River is Gotvand Station, which is situated at 48° 49' east longitude and 32° 14' north latitude. The elevation of this station is 243 meters (Radmanesh et al., 2013). The water quality of the Karun River has been severely impacted by the Gachsaran salt domes in this area. Downstream agricultural fields have also been degraded and lost their productivity as a result of water salinization in the dam reservoir. Saline water destroys soil structure and prevents crops from absorbing irrigation water (Gutiérrez & Lizaga, 2016). This research used 45 years (1971-2015) of water quality data from Gotvand Station.

The Aji Chay River, which has an average discharge of 16.4 m³ s⁻¹ and is recognized as one of the most saline rivers of northwest Iran, was also subject to evaluation in this study. This river originates from the Qusheh-Dagh and Sabalan elevations and eventually ends up in the Urmia Lake Basin, the largest lake in the Middle East. The water quality of the Aji Chay River has a direct impact on the drinking water of West Azerbaijan Province as well as the water quality of Urmia Lake. The salinity of the river water is also significantly affected by the salt domes that lie under the reservoir region behind the Vanyar dam (Hossein and Moghaddam, 2006). Hence, Vanyar Station, one of the major hydrometric stations of the Aji Chay River, was selected as the representative of this river. It is located at 38° 07′ 00′′ N and 46° 24′ 18′′ E, with an elevation of 3882 meters above sea level. A 45-year water quality dataset (1966-2011) from the Vanyar Station was used in this research. The monthly water quality (WQ) data of the Gotvand and Vanyar Stations utilized in this research were obtained from the Iran Water Resources Management Company (wrm.ir). The annual data used in the analysis were derived by averaging the monthly data of each station. The selection of the studied stations was based on the availability of long-term data, lack of outliers, and water pollution vulnerability.
Among the chemical parameters, the ones that had a high correlation with TDS, including Cl⁻, SO₄²⁻, Ca²⁺, Mg²⁺, and Na⁺, were selected as input parameters. In parallel, among the physical parameters, discharge, which has a great impact on the quality of rivers, was taken into account (Gao et al., 2018; Seiler et al., 2020). The maximum, average, and minimum monthly TDS of both stations are shown in Table 1. For the Vanyar station, due to the limited access to the data of recent years, the analysis was performed using the data from 1975 to 2012. The data must be preprocessed to prepare them for analysis using ANFIS and metaheuristic approaches. There are various WQ parameters that can be employed in the evaluation of TDS. However, only a limited number of inputs are used for modeling because of constraints like time, memory space, and computational volume. Therefore, candidate parameters need to be evaluated in terms of correlation. For this purpose, in this research, linear regression analysis was performed on various water quality parameters of the Gotvand and Vanyar Stations using the SPSS software. The findings showed that Cl⁻, SO₄²⁻, Ca²⁺, Mg²⁺, Na⁺, and discharge (Q) had the strongest linear relationship with TDS (Azad et al., 2019). After choosing the best input parameters, three data partitions were tested for both study areas: 30-70 (i.e., 30% of the data for testing and 70% for training), 20-80, and 40-60. The 30-70 partition performed the best of the three (Jannatkhah et al., 2021). In addition, the datasets of both stations were normalized between 0 and 1 using Eq. 1:

$$x_{new} = \frac{x - x_{min}}{x_{max} - x_{min}} \qquad (1)$$

where xnew, x, xmin, and xmax are the normalized value, the original value, and the minimum and maximum values in the data set, respectively (Khoi et al., 2022).
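The preprocessing described above can be illustrated with a minimal Python sketch of min-max normalization followed by a 70/30 train/test partition. The TDS values below are hypothetical, the function names are illustrative, and the chronological (unshuffled) split is an assumption, since the text does not state how records were assigned to each subset.

```python
def min_max_normalize(values):
    """Scale values to [0, 1] via (x - x_min) / (x_max - x_min)."""
    x_min, x_max = min(values), max(values)
    span = x_max - x_min
    if span == 0:
        # A constant series carries no scale information; map it to zeros.
        return [0.0 for _ in values]
    return [(x - x_min) / span for x in values]


def train_test_split(records, train_fraction=0.7):
    """Assumed chronological split: first part for training, rest for testing."""
    cut = int(round(len(records) * train_fraction))
    return records[:cut], records[cut:]


# Hypothetical monthly TDS values in mg/l, not taken from the study.
tds = [420.0, 610.0, 980.0, 1510.0, 760.0, 2300.0, 540.0, 890.0, 1200.0, 675.0]
scaled = min_max_normalize(tds)
train, test = train_test_split(scaled)
```

In practice the minimum and maximum would usually be taken from the training subset only, so that no information from the test period leaks into the scaling.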

Adaptive Neuro Fuzzy Inference System (ANFIS)
The Adaptive Neuro-Fuzzy Inference System (ANFIS) is an AI-based technique derived from Takagi-Sugeno (TS) fuzzy systems. The ANFIS development procedure consists of identifying the inputs that are most strongly correlated with the targeted output. The optimal rules, and the types and number of the associated membership functions (MFs), should be determined to select the ANFIS structure with the smallest resulting errors. Two TS fuzzy "if-then" rules in a typical ANFIS structure are shown in Eqs. 2 and 3. In these rules, Ai and Bj indicate the linguistic labels, while p1, q1, p2, and q2 are the ANFIS consequent parameters. The ANFIS structure consists of five layers (Ying & Pan, 2008). The role of each layer is summarized below (Al-Mukhtar and Al-Yaseen, 2019).
Layer 1, the fuzzification layer, receives the input values and identifies the MFs.
Layer 2, or the rule layer, develops the firing strengths for the rules.
Layer 3, or the standardization layer, normalizes the calculated strengths.
Layer 4 takes the standardized values and then generates parameter sets.
Layer 5, or the defuzzification layer, returns the values to the ultimate result (Shah et al., 2021).
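The five layers listed above can be traced in a short Python sketch of a forward pass through a two-input, two-rule first-order TS system. This is an illustration under stated assumptions, not the configuration used in the study: Gaussian membership functions, the product as the rule operator, a constant term r in each rule output (f = p·x + q·y + r), and all parameter values are hypothetical.

```python
import math


def gaussian_mf(x, center, sigma):
    """Gaussian membership degree of x (assumed MF shape)."""
    return math.exp(-0.5 * ((x - center) / sigma) ** 2)


def anfis_forward(x, y, premise, consequent):
    # Layer 1 (fuzzification): membership degrees of each input.
    mu_a = [gaussian_mf(x, c, s) for c, s in premise["A"]]
    mu_b = [gaussian_mf(y, c, s) for c, s in premise["B"]]
    # Layer 2 (rule layer): firing strength of rule i as a product.
    w = [a * b for a, b in zip(mu_a, mu_b)]
    # Layer 3 (normalization): each strength divided by the total.
    total = sum(w)
    w_bar = [wi / total for wi in w]
    # Layer 4: normalized strength times the linear rule output.
    rule_out = [wb * (p * x + q * y + r)
                for wb, (p, q, r) in zip(w_bar, consequent)]
    # Layer 5 (defuzzification): sum of the weighted rule outputs.
    return sum(rule_out), w_bar


# Hypothetical premise (center, sigma) and consequent (p, q, r) parameters.
premise = {"A": [(0.0, 1.0), (1.0, 1.0)], "B": [(0.0, 1.0), (1.0, 1.0)]}
consequent = [(1.0, 2.0, 0.5), (1.0, 2.0, 0.5)]
output, strengths = anfis_forward(0.3, 0.6, premise, consequent)
```

Training, whether gradient-based or metaheuristic, amounts to adjusting the premise and consequent parameters so that the output of this forward pass matches the observed target.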
The number of epochs in ANFIS was set at 500, since increasing it beyond this value had no noticeable influence on the accuracy of the models. The best values of step-size decrement, initial step-size, epochs, number of clusters, and initial-increment were 0.9, 0.01, 500, 10, and 1.1, respectively. The approaches used in this study were all coded in MATLAB R2021b and were executed on a computer equipped with an Intel® Core™ i7 processor.

Particle swarm algorithm (PSO)
PSO is a metaheuristic algorithm formulated using the stochastic optimization approach. In the PSO method, there is a particular search interval in which the solutions of the algorithm are marked as particles (Banadkooki et al., 2020; Samanataray and Sahoo, 2021). The fitness function assigns a fitness value to every particle. The PSO approach is characterized by exploring the solution domain for particles in better condition. When the position of particles is changed, they try to find the optimum place according to their previous experiences. Particles are aware of their own location and of the position of other particles as well. As an example, in a d-dimensional problem space, particle i has a location that can be calculated by equation 3, in which t shows the iteration of the particle. In addition, the particle has a velocity V leading the particle to the new location (Eq. 4). Particles also have a memory to remember their position and the location of other particles (Eq. 5). After every iteration, particles are updated with their two best values: the "best solution" pbest and the "global best value" gbest. After determination of pbest and gbest, the particle velocity is updated based on Eqs. 4 and 5 (Alqaness et al., 2021).
The learning coefficients are denoted by C1 and C2. These two factors govern the movement of each member after each iteration. Generally, C1 = C2 = 2. r1 and r2 are two random numbers between 0 and 1. The inertia weight is denoted by W, and its initial value is normally between 0 and 1. In this research, 200 iterations were used, since increasing the number of iterations beyond 200 produced no improvement in the outputs. Furthermore, the initial population size was set to 100, the velocity bounds were set to -2 and +2, the inertia weight was selected between 0.5 and 1.5, and the personal and global best learning coefficients were determined between 1 and 5.
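The update rules described above can be sketched in Python as follows. The sketch assumes the commonly used form of the velocity update, v ← W·v + C1·r1·(pbest − x) + C2·r2·(gbest − x), followed by x ← x + v; the sphere test function, the reduced swarm size, and the fixed inertia weight are illustrative choices, not the settings of the study.

```python
import random


def pso_minimize(objective, dim, bounds, n_particles=20, n_iter=50,
                 w=0.7, c1=2.0, c2=2.0, v_max=2.0, seed=0):
    rng = random.Random(seed)
    lo, hi = bounds
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]                  # personal best positions
    pbest_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]  # global best so far
    history = [gbest_val]
    for _ in range(n_iter):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                v = (w * vel[i][d]
                     + c1 * r1 * (pbest[i][d] - pos[i][d])
                     + c2 * r2 * (gbest[d] - pos[i][d]))
                # Clamp velocity to the bounds, then move and clip position.
                vel[i][d] = max(-v_max, min(v_max, v))
                pos[i][d] = max(lo, min(hi, pos[i][d] + vel[i][d]))
            val = objective(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
        history.append(gbest_val)
    return gbest, gbest_val, history


def sphere(x):
    """Illustrative objective with its minimum at the origin."""
    return sum(xi * xi for xi in x)


best, best_val, history = pso_minimize(sphere, dim=3, bounds=(-5.0, 5.0))
```

In an ANFIS-PSO hybrid, each particle position would hold the ANFIS parameters and the objective would be the training error, rather than the toy function used here.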

Differential evolution (DE)
DE is a stochastic optimization algorithm inspired by biological processes, in which survival of the fittest drives adaptation through inherited genetic characteristics (Wang and Zhao, 2013). If the target optimization function is denoted by f, then in the following equation R is related to the observed dataset, and D indicates the parameters of the objective function f(V).
The purpose of a DE approach is to minimize the objective function by utilizing the values of optimized parameters: The vector is denoted by V. It includes the D parameters of the objective function. The objective function is the mean squared error between the observed and predicted TDS. The objective function parameters are indicated as follows: The lower and upper boundaries set for the objective function parameters are denoted by vi L and vi U, respectively. DE operates not on single solutions but on a population PG of candidate solutions; the generation of a population is denoted by G. The population produced by the DE approach can be shown as follows: In Eq. 12, Gmax is the maximum number of generations, which usually serves as the stopping criterion of DE. Each vector contains D real-valued parameters and is treated as a single chromosome.
An initial population must be generated to provide a starting point for the search. In general, nothing is known about the best solutions apart from the limits of the problem variables. Accordingly, one approach to specifying the initial population, PG with G = 0, is to select it at random within these limits, as follows: randj [0, 1] denotes a uniformly distributed random value between 0 and 1, which is drawn anew for every parameter j. The procedure of DE is different from other metaheuristic approaches. After initialization, vectors are sampled randomly from the population PG and combined in order to generate candidate vectors for the subsequent generation, PG+1.
A candidate population of trial vectors is determined as follows: In each run, the values of r1, r2, and r3 vary. The value of parameter i should be determined.
The values of r1, r2, and r3 are selected randomly for each value of i. The next-generation population PG+1 is selected from the current population PG and the child population according to this equation (Ebtehaj and Bonakdari, 2017). In this research, the population size, initial mutation factor, crossover factor, and maximum number of iterations were 80, 0.4, 0.9, and 2000, respectively.
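As an illustration of the procedure outlined above, the following Python sketch implements the classic DE/rand/1/bin scheme, which is assumed here because the description (random initialization within bounds, three random indices r1, r2, r3, greedy selection into PG+1) matches it; the source does not name the variant. The mutation factor 0.4 and crossover factor 0.9 follow the text, while the small population, the linear toy model, and its data are hypothetical.

```python
import random


def de_minimize(objective, bounds, pop_size=20, f=0.4, cr=0.9,
                g_max=100, seed=0):
    rng = random.Random(seed)
    dim = len(bounds)
    # Initial population P_0: uniform random values inside the bounds.
    pop = [[lo + rng.random() * (hi - lo) for lo, hi in bounds]
           for _ in range(pop_size)]
    cost = [objective(v) for v in pop]
    history = [min(cost)]
    for _ in range(g_max):
        new_pop, new_cost = [], []
        for i in range(pop_size):
            # Three distinct random members, all different from i.
            r1, r2, r3 = rng.sample([k for k in range(pop_size) if k != i], 3)
            j_rand = rng.randrange(dim)  # guarantees one mutated parameter
            trial = []
            for j, (lo, hi) in enumerate(bounds):
                if rng.random() < cr or j == j_rand:
                    u = pop[r1][j] + f * (pop[r2][j] - pop[r3][j])
                    u = max(lo, min(hi, u))
                else:
                    u = pop[i][j]
                trial.append(u)
            trial_cost = objective(trial)
            # Greedy selection: the better of parent and trial survives.
            if trial_cost <= cost[i]:
                new_pop.append(trial)
                new_cost.append(trial_cost)
            else:
                new_pop.append(pop[i])
                new_cost.append(cost[i])
        pop, cost = new_pop, new_cost
        history.append(min(cost))
    best = min(range(pop_size), key=lambda k: cost[k])
    return pop[best], cost[best], history


# Hypothetical data for a toy model y = a*x + b; the objective is the MSE.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [1.0, 3.0, 5.0, 7.0, 9.0]


def mse(params):
    a, b = params
    return sum((a * x + b - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


best_params, best_cost, history = de_minimize(mse, [(-5.0, 5.0), (-5.0, 5.0)])
```

Because selection is greedy, the best cost in the population can never increase from one generation to the next, which is the property the `history` list makes visible.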

Ant Colony Optimization for Continuous Domains (ACOR)
Studies and observations of ant colonies, first made by Marco Dorigo in 1992, served as the basis for the ACOR model. The propensity of ants to search for the shortest route between their nest and food is one of the most significant and interesting aspects of their behavior. In the real world, ants first travel randomly to and from food sources. The next step is that they go back to their nest while leaving pheromone trails behind them (Dorigo and Blum, 2005). Starting from a primary population and refining earlier solutions, the ACOR algorithm is intended to produce the best possible solution. The primary population is selected at random.
The identified paths are assigned a value in the third step according to the values of target functions like pheromones, and some of them disappear. The fourth step involves creating new paths using a rotary cycle based on the distance between customers and pheromone following the pheromone setting before proceeding to the third stage (Mullen et al., 2009). Each ant, once stationed at a point, utilizes the following Eq. 16 to decide where to move to next: where ηij equals 1/dij, and dij represents the distance between points i and j. τij indicates the pheromone value, and N_i^k denotes the set of neighbors of point i available to ant k. After forming the primary population, pheromone is evaporated, and the algorithm is ultimately evaluated to be continued or terminated: m is the number of ants, ρ is the coefficient of evaporation, and Δτij is the concentration of pheromone between the i and j vertices. Using Eq. 18, pheromone route updates are performed (Dai et al., 2019),
where Δτij^best represents the optimal pheromone path between the i and j vertices. In this study, the number of ACOR iterations was set to 200, and further iterations had no discernible effect on the quality of evaluations. Furthermore, the initial population size was set to 100.
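The two rules described above, the move probability built from pheromone τij and visibility ηij = 1/dij, and the evaporation-plus-deposit update, can be sketched in Python as below. The exponents `alpha` and `beta`, the function names, and all numeric values are assumptions for illustration. Note that this is the graph-based form that the description follows; ACOR for continuous domains replaces the discrete choice with sampling from a solution archive, which is not shown here.

```python
def transition_probabilities(i, candidates, tau, dist, alpha=1.0, beta=2.0):
    """Probability of moving from point i to each candidate neighbor j."""
    weights = {j: (tau[i][j] ** alpha) * ((1.0 / dist[i][j]) ** beta)
               for j in candidates}
    total = sum(weights.values())
    return {j: w / total for j, w in weights.items()}


def evaporate_and_deposit(tau, rho, deposits):
    """Apply tau_ij <- (1 - rho) * tau_ij + delta_tau_ij to every edge."""
    n = len(tau)
    return [[(1.0 - rho) * tau[i][j] + deposits[i][j] for j in range(n)]
            for i in range(n)]


# Hypothetical 3-point example with uniform initial pheromone.
dist = [[0.0, 1.0, 2.0],
        [1.0, 0.0, 1.5],
        [2.0, 1.5, 0.0]]
tau = [[1.0] * 3 for _ in range(3)]

probs = transition_probabilities(0, [1, 2], tau, dist)

deposits = [[0.0] * 3 for _ in range(3)]
deposits[0][1] = 0.5  # pheromone laid on the edge used by the best path
tau_next = evaporate_and_deposit(tau, rho=0.1, deposits=deposits)
```

With equal pheromone on both edges, the nearer point receives the larger probability, and after the update only the reinforced edge ends up above its starting pheromone level.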

Genetic algorithm (GA)
A genetic algorithm (GA) is a metaheuristic approach that uses the principles of natural selection and genetics (Chau, 2006). Although random in some characteristics, GA is not completely random because it makes use of past data to determine where to explore next. GAs have been applied successfully to optimization problems in several fields of knowledge (Ahmad, 2012). The steps of the modified GA are as follows:

1. A random collection of chromosomes is generated to form the current candidate solutions for the given problem.

2. The fitness of each chromosome is determined. The fitness values are utilized in the genetic algorithm to help the simulations find the best possible design options.

3. The chromosomes are sorted by fitness; the lowest fitness values correspond to the best results.

4. The appropriate chromosomes are cloned to improve the positive attributes of the population.

5. A crossover operation is performed, in which new offspring are derived from the chromosomes of the parents.

6. The chromosomes are mutated at a low, predefined rate.

7. New fitness values for each chromosome are assessed before the analysis continues.

8. Steps 3 through 7 are repeated until the error requirement is met.
The evolution of individuals within a population throughout generations typically starts from a random population. Then, during each generation, the fitness of each member of the population is evaluated, members of the current population are selected on the basis of their fitness, and the selected members are recombined and possibly mutated to form a new population (Fu & Li, 2021; Hassan et al., 2020). In this study, the number of iterations and the population size were 200 and 100, respectively.
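The GA steps listed in this section can be sketched as a small real-coded GA in Python. The operators chosen here (truncation selection from the better half, elitist cloning, arithmetic crossover, uniform random-reset mutation) and all rates are assumptions for illustration; the source does not specify which operators were used.

```python
import random


def ga_minimize(objective, bounds, pop_size=20, n_gen=50, elite=2,
                crossover_rate=0.8, mutation_rate=0.1, seed=0):
    rng = random.Random(seed)
    # Step 1: random initial chromosomes inside the bounds.
    pop = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(pop_size)]
    history = []
    for _ in range(n_gen):
        # Steps 2-3: evaluate fitness and sort, lowest error first.
        pop.sort(key=objective)
        history.append(objective(pop[0]))
        # Step 4: clone the best chromosomes unchanged (elitism).
        next_pop = [ind[:] for ind in pop[:elite]]
        parents = pop[:pop_size // 2]
        while len(next_pop) < pop_size:
            p1, p2 = rng.sample(parents, 2)
            # Step 5: arithmetic crossover produces one offspring.
            if rng.random() < crossover_rate:
                a = rng.random()
                child = [a * x + (1.0 - a) * y for x, y in zip(p1, p2)]
            else:
                child = p1[:]
            # Step 6: mutate each gene with a low, predefined probability.
            for j, (lo, hi) in enumerate(bounds):
                if rng.random() < mutation_rate:
                    child[j] = rng.uniform(lo, hi)
            next_pop.append(child)
        # Steps 7-8: the new population is evaluated on the next pass.
        pop = next_pop
    pop.sort(key=objective)
    history.append(objective(pop[0]))
    return pop[0], history


def shifted_sphere(v):
    """Illustrative objective with its minimum at (1, -2)."""
    return (v[0] - 1.0) ** 2 + (v[1] + 2.0) ** 2


best, history = ga_minimize(shifted_sphere, [(-5.0, 5.0), (-5.0, 5.0)])
```

The loop here stops after a fixed number of generations; a stopping rule based on an error threshold, as in step 8, could replace the fixed count.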

Statistical indicators
The execution time of the models was one of the performance criteria addressed in this study. In addition, statistical parameters such as the coefficient of determination (R²), root mean square error (RMSE), mean absolute error (MAE), and Nash-Sutcliffe efficiency (NSE) were used. The RMSE index indicates the goodness of fit for high values. The MAE gives a more balanced view of the goodness of fit for moderate values (Karunanithi et al., 1994; Kisi et al., 2014; Montaseri et al., 2018). NSE is a reliable assessment criterion and is widely used for investigating the goodness of fit of hydrologic models (McCuen et al., 2006).
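For reference, the indices named in this section can be computed as in the Python sketch below. The formulas are the commonly used definitions and are assumptions with respect to this study: R² is taken as the squared Pearson correlation, and KGE uses the 2009 form with correlation r, variability ratio α = σ_pred/σ_obs, and bias ratio β = μ_pred/μ_obs. The sample values are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def rmse(obs, pred):
    return math.sqrt(_mean([(o - p) ** 2 for o, p in zip(obs, pred)]))


def mae(obs, pred):
    return _mean([abs(o - p) for o, p in zip(obs, pred)])


def nse(obs, pred):
    mo = _mean(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((o - mo) ** 2 for o in obs)
    return 1.0 - num / den


def pearson_r(obs, pred):
    mo, mp = _mean(obs), _mean(pred)
    cov = sum((o - mo) * (p - mp) for o, p in zip(obs, pred))
    var_o = sum((o - mo) ** 2 for o in obs)
    var_p = sum((p - mp) ** 2 for p in pred)
    return cov / math.sqrt(var_o * var_p)


def r2(obs, pred):
    """Coefficient of determination as squared Pearson correlation."""
    return pearson_r(obs, pred) ** 2


def kge(obs, pred):
    mo, mp = _mean(obs), _mean(pred)
    so = math.sqrt(_mean([(o - mo) ** 2 for o in obs]))
    sp = math.sqrt(_mean([(p - mp) ** 2 for p in pred]))
    r = pearson_r(obs, pred)
    alpha = sp / so  # variability ratio
    beta = mp / mo   # bias ratio
    return 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


# Hypothetical values: predictions offset from observations by +1.
obs = [1.0, 2.0, 3.0, 4.0]
pred = [2.0, 3.0, 4.0, 5.0]
scores = {"R2": r2(obs, pred), "RMSE": rmse(obs, pred), "MAE": mae(obs, pred),
          "NSE": nse(obs, pred), "KGE": kge(obs, pred)}
```

The example shows why several indices are reported together: a constant offset leaves R² at 1 while NSE and KGE drop, because only the latter two penalize bias.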

Decision making
It is challenging to assess the performance of the models and select the optimal algorithms for diverse circumstances due to the large number of models and variety of analyses. To choose the optimal models accurately, multi-criteria decision-making techniques must be used.
Therefore, the TOPSIS technique was employed in this study to choose the best models, since it is among the most widely used multiple-criteria decision-making (MCDM) methods (Mahmood and Ali, 2021; Rezaee et al., 2021). In this method, the best choice is the one that is the closest to the positive ideal solution (PIS) and the farthest from the negative ideal solution (NIS) (Karabašević et al., 2020). The decision matrix is as follows: where m, n, and A represent the number of criteria, number of alternatives, and decision matrix, respectively. Then, the decision matrix is normalized as follows: Then, the weighted normalized decision matrix Z is formed: PIS and NIS are computed following the formation of the weighted normalized decision matrix: Four influential criteria were considered in order to rank the models: statistical parameters (R², RMSE, MAE, NSE), model run time, spatial change factor (SCF), and temporal change factor (TCF), where SCF shows the effect of changing stations on the analysis, and TCF indicates the impact of changing the time scale of the data used. The KGE coefficient was taken into account in TOPSIS for calculating the last two factors. SCF and TCF were included in the TOPSIS method because statistical parameters alone do not provide a satisfactory prioritization of hybrid models and do not allow results to be generalized to other studies with varying conditions. These two factors are described below: where KGE V and KGE G correspond to the Vanyar and Gotvand datasets, respectively.
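The TOPSIS steps named above (normalization, weighting, PIS and NIS, ranking by closeness) can be sketched in Python as follows. The sketch assumes vector (Euclidean) normalization and the usual closeness coefficient d⁻ / (d⁺ + d⁻); the weights, the criteria values, and the three model rows are hypothetical and are not results from this study, and the SCF and TCF criteria are omitted because their definitions are not reproduced here.

```python
import math


def topsis(matrix, weights, benefit):
    """Return the closeness coefficient of each alternative (row).

    matrix:  one row per alternative, one column per criterion.
    benefit: True where larger is better, False where smaller is better.
    """
    n_crit = len(weights)
    # Vector normalization of each criterion column, then weighting.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix))
             for j in range(n_crit)]
    z = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
         for row in matrix]
    columns = list(zip(*z))
    # Positive and negative ideal solutions per criterion.
    pis = [max(col) if benefit[j] else min(col)
           for j, col in enumerate(columns)]
    nis = [min(col) if benefit[j] else max(col)
           for j, col in enumerate(columns)]
    scores = []
    for row in z:
        d_pos = math.sqrt(sum((v - p) ** 2 for v, p in zip(row, pis)))
        d_neg = math.sqrt(sum((v - n) ** 2 for v, n in zip(row, nis)))
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical alternatives: columns are NSE, RMSE, execution time (s).
models = ["model A", "model B", "model C"]
matrix = [[0.95, 0.02, 300.0],
          [0.90, 0.03, 500.0],
          [0.80, 0.05, 900.0]]
scores = topsis(matrix, weights=[0.5, 0.3, 0.2], benefit=[True, False, False])
ranking = [name for _, name in sorted(zip(scores, models), reverse=True)]
```

A higher closeness coefficient means the alternative lies nearer to the PIS and farther from the NIS. The function assumes the alternatives are not all identical, since the two distances would then both be zero.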

Results and discussion
Based on accuracy, error, and economic (execution time) criteria, the standalone ANFIS approach and its hybrid models were assessed in estimating monthly and annual TDS for both the Gotvand and Vanyar stations. The findings of the analysis are presented in Tables 2 and 3 for the Gotvand and Vanyar Stations, respectively. Furthermore, Figures 4 and 5 show the scatter plots of the observed versus simulated data for the ANFIS and the strongest hybrid method of each station.

Considering discharge
Based on the statistical assessment criteria of the Gotvand Station given in Table 2, it was determined that the ANFIS approach had the weakest performance compared to the other approaches: in train (R² = 0.93, KGE = 0.92, RMSE = 0.041, MAE = 0.016, NSE = 0.92) and in test (R² = 0.76, KGE = 0.77, RMSE = 0.077, MAE = 0.04, NSE = 0.7). This might be because it failed to find the global optimum due to getting confined to local optima (Azad et al., 2019).
With respect to the statistical assessment criteria reported in Table 3, the analysis conducted on the Vanyar Station dataset using the standalone ANFIS method had poor results because of the significant difference between the test and train outputs: in train (R² = 0.92, KGE = 0.99, RMSE = 0.044, MAE = 0.013, NSE = 0.917) and in test (R² = 0.61, KGE = 0.66, RMSE = 0.13, MAE = 0.106, NSE = 0.38). The results of all optimized models were quite close to one another, with only minor differences in performance criteria values. By combining ANFIS with metaheuristic methods, it was found that all the applied algorithms improved the performance of ANFIS in the evaluation of the monthly TDS dataset of the Vanyar station when discharge was included. Analysis revealed that ANFIS-PSO has the highest potential: in train (R² = 0.98, KGE = 0.98, RMSE = 0.018, MAE = 0.001, NSE = 0.983) and in test (R² = 0.97, KGE = 0.98, RMSE = 0.023, MAE = 0.006, NSE = 0.96). An investigation of the running times of the models revealed that the ANFIS method was the fastest, taking 250 seconds, while the ANFIS-DE was the slowest, with 3100 seconds of execution time.
From the results obtained for the methods used in the assessment of the Gotvand Station, it can be inferred that when there were sufficient and appropriate amounts of data and parameters for training, the ANFIS-PSO was superior. Previous investigations have also underscored the crucial role played by both the quantity and quality of data employed in the modeling process (e.g., Alwosheel et al., 2018). This was also true for the analysis of Vanyar data. Compared to the Vanyar Station analysis, however, the models of the Gotvand Station required more execution time, but they delivered more precise results. This is probably due to the nature of the data. One explanation can be that the dataset of the Gotvand Station is larger than that of the Vanyar Station. Additionally, it is possible that the monthly data from Vanyar are of lower quality than the data from the Gotvand Station as a result of challenges like discharge measurement error (Potash and Steinschneider, 2022) or a stronger influence on TDS from inputs other than the discharge and ions employed in this part of the calculations. Moreover, the lower accuracy of the Vanyar Station results compared to the Gotvand Station may be because of its higher TDS fluctuations and a broader range between maximum and minimum TDS values (Kouadri et al., 2021). The longer execution time at the Gotvand Station might be attributed to its longer data series.

Ignoring discharge
In the Gotvand Station, eliminating the discharge factor improved the modeling results in both ANFIS and hybrid methods. The results of the ANFIS analysis were (R² = 0.98, KGE = 0.98, RMSE = 0.0198, MAE = 0.0017, and NSE = 0.97 in train), and (R² = 0.82, KGE = 0.87, RMSE [...] ANFIS approach was the quickest, with 200 seconds. The ANFIS-DE approach required the most time to run, with 2600 seconds. The analysis showed that the assessment of Vanyar Station's monthly TDS when omitting discharge required a shorter running time and had more reliable results than the monthly models including discharge. When discharge was disregarded as an input (as in the condition where discharge was factored into the computations), the applied methodologies performed better on the monthly data of the Gotvand Station than on those of the Vanyar Station (the possible reasons are mentioned in section 3.1.1). Furthermore, the analysis of the monthly data of the Gotvand Station without discharge required more execution time than the analysis of the monthly data of the Vanyar Station without discharge.
Based on the results, the ANFIS-PSO method performed better than the other methods in all cases of monthly data analysis, except for the monthly models of the Gotvand Station disregarding discharge, in which the ANFIS-GA model was the most appropriate approach. This outcome suggests that the performance of the employed models varies temporally and spatially and can also be influenced by a variety of factors, including the complexity of the problem, the size of the search space, the number of variables and constraints, and the quality and quantity of the available data (Djebedjian et al., 2021; Kim et al., 2016).
The results indicated that, while the annual models of the Gotvand Station required less time to run, they were less accurate than its monthly models. The findings of the annual data analysis at the Gotvand Station were superior to those at the Vanyar Station both with and without discharge. Researchers often use average monthly data to estimate annual data when annual data are not available (e.g., Yu et al., 2014). Thus, this study utilized average monthly water quality data to create annual data for both of the studied rivers. The aim was to conduct the analysis for cases in which comprehensive data indicating temporal changes are not available.
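The aggregation step described above, in which monthly values are averaged to obtain one annual value per year, can be sketched as follows. This is only a minimal illustration in Python, not the authors' implementation: the function name `annual_means` and the `(year, month, value)` record layout are assumptions introduced here, and the sample numbers are placeholders rather than measurements from either station.

```python
from collections import defaultdict


def annual_means(monthly_records):
    """Average monthly values into one value per year.

    monthly_records: iterable of (year, month, value) tuples (hypothetical layout).
    Returns a dict {year: mean of that year's monthly values}, ordered by year.
    """
    by_year = defaultdict(list)
    for year, _month, value in monthly_records:
        by_year[year].append(value)
    # Years with missing months are averaged over the months that are present.
    return {year: sum(vals) / len(vals) for year, vals in sorted(by_year.items())}


# Placeholder TDS values (made-up numbers, not data from this study).
sample = [(2000, 1, 100.0), (2000, 2, 200.0), (2001, 1, 300.0)]
annual = annual_means(sample)
```

A year with few monthly observations is averaged over whatever months exist, so in practice one may want to require a minimum number of months per year before accepting the annual value.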
The results indicated that the models used for processing the annual data demonstrated satisfactory performance, though not as good as that obtained with the monthly data.
Through an annual dataset analysis of the Vanyar Station using the discharge variable, it was found that ANFIS was the least effective when compared to the other models. There was a significant gap between the test and train results of this approach: (R² = 0.99, KGE = 0.99, [...] ANFIS-ACOR. However, some studies have shown that the outputs obtained from modeling monthly data do not significantly differ from those obtained from modeling annual data (Yu et al., 2014).
According to the results, the annual models of the Gotvand Station ignoring discharge were faster but less accurate than this station's monthly models without discharge. In addition, Gotvand's annual models ignoring discharge were less precise and required more time than Vanyar's annual analysis ignoring discharge. According to the outputs of the total analyses, the ANFIS-GA and ANFIS-PSO approaches were more accurate in the assessment of monthly data, whereas ANFIS-ACOR gave higher performance in the annual data evaluations. In the case of omitting the discharge, the assessment of the annual data of the Vanyar Station yielded weaker results than the annual data modeling of the Gotvand Station; the annual analysis of Vanyar also produced poorer outputs than the monthly assessment of this station. Furthermore, when discharge was included, the annual models of Vanyar were weaker than this station's annual models without discharge.
The box plot of errors for the approaches used in this study is shown in Fig. 6. It is evident that the hybrid models had a narrower error range than ANFIS, and all of them outperformed ANFIS. ANFIS-ACOR showed the best performance, while ANFIS-DE was the least efficient. The disparity between this chart and the TOPSIS findings arises because TOPSIS took into account the spatial-temporal sensitivity and execution time factors.

Decision making
It is challenging to make decisions and distinguish poor from superior models because of the abundance of available models and the variety of modeling situations. As a result, this research employed the TOPSIS technique for multi-criteria decision-making. In previous studies, the ranking of models mostly focused on statistical indicators (R², RMSE, etc.) (e.g., Kadkhodazadeh and Farzin, 2022). Our study used additional factors (spatial and temporal change factors) to make the results more realistic and informative and to address practical water management needs, rather than relying solely on statistical considerations. This strengthens the practical usefulness and actionable value of our proposed monitoring station selection process.
The matrix of experts employed in TOPSIS includes an impact coefficient of 0.2 assigned to the matrix elements. It should be emphasized that the Kling-Gupta efficiency (KGE) served as the ranking criterion for assessing the effects of applying and removing discharge, the type of data interval for the station, and the effects of river stations.
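For readers unfamiliar with TOPSIS, the ranking step can be sketched as below. This is a generic textbook version of the method (vector normalization, weighting, distances to the ideal and anti-ideal solutions), not the authors' code. The function name `topsis`, the choice of five criteria with equal weights of 0.2, and every number in the example matrix are assumptions made for illustration; none of them are results from this study.

```python
import math


def topsis(matrix, weights, benefit):
    """Return the TOPSIS closeness score (0 to 1, higher is better) per alternative.

    matrix  : rows are alternatives (models), columns are criteria
    weights : one weight per criterion
    benefit : True if larger is better for that criterion, False if smaller is better
    Note: a column of all zeros, or identical alternatives, causes a division by zero.
    """
    n_crit = len(weights)
    # Vector normalization: divide each column by its Euclidean norm, then weight it.
    norms = [math.sqrt(sum(row[j] ** 2 for row in matrix)) for j in range(n_crit)]
    weighted = [[weights[j] * row[j] / norms[j] for j in range(n_crit)]
                for row in matrix]
    # Ideal and anti-ideal value of each criterion.
    columns = list(zip(*weighted))
    ideal = [max(c) if benefit[j] else min(c) for j, c in enumerate(columns)]
    anti = [min(c) if benefit[j] else max(c) for j, c in enumerate(columns)]
    scores = []
    for row in weighted:
        d_pos = math.sqrt(sum((v - i) ** 2 for v, i in zip(row, ideal)))
        d_neg = math.sqrt(sum((v - a) ** 2 for v, a in zip(row, anti)))
        scores.append(d_neg / (d_pos + d_neg))
    return scores


# Hypothetical criteria: R2, KGE (larger is better); RMSE, MAE, run time (smaller is better).
criteria_benefit = [True, True, False, False, False]
weights = [0.2] * 5  # assumed equal weights; the study's actual criteria set may differ
matrix = [
    [0.95, 0.92, 0.03, 0.02, 1500.0],  # placeholder "model A" (made-up values)
    [0.80, 0.78, 0.09, 0.05, 200.0],   # placeholder "model B" (made-up values)
]
scores = topsis(matrix, weights, criteria_benefit)
# Indices of the alternatives, best first.
ranking = sorted(range(len(scores)), key=lambda k: scores[k], reverse=True)
```

Because execution time enters as a cost criterion alongside the accuracy measures, a slower but more accurate model can rank below a faster one, which is one way a TOPSIS ranking can differ from a ranking based on error statistics alone.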
An analysis of the results of the different methods showed that the ANFIS-ACOR algorithm has superior performance in predicting water quality. Likewise, Azad et al. (2019) demonstrated that the prediction of precipitation, a crucial parameter influencing water quantity, performed best with the same model (Kitan & Nang, 2020; Lintern et al., 2018). Thus, this model can be considered reliable for both quantitative and qualitative evaluations. On the other hand, TOPSIS showed that ANFIS-DE was the weakest model in this study when all criteria were considered (spatial and temporal variation factors, statistical factors, and model execution time). However, a closer examination of the algorithms used in quantitative and qualitative research on water resources reveals that, in most cases, the performance of hybrid models in predicting water quality has been suitable. This makes selecting the most effective model for problems that use hybrid models challenging (Vazquezl et al., 2021).

Suggestions for future studies
The methods used in this research can contribute to the proper management of ecological consequences, reduce the high costs of water quality data collection, and save time in water quality assessment (Zhu et al., 2022). However, future studies can focus on the following recommendations:
• Explore the performance of ANFIS and its hybrid models (ANFIS-PSO, ANFIS-DE, ANFIS-ACOR, and ANFIS-GA) in modeling TDS in other rivers and hydrometric stations to assess the generalizability of the results.
• Investigate the use of alternative metaheuristic algorithms to further enhance the performance of ANFIS models.
• Consider the impact of other factors, such as temperature, rainfall, and land use, on TDS modeling to provide a more comprehensive understanding of the factors that affect water quality.
• Evaluate the effectiveness of other evaluation factors to rank the models when using the TOPSIS method.
• Investigate the effect of different temporal and spatial scales on the performance of ANFIS models in modeling TDS.
• Compare the performance of ANFIS hybrid models with the hybrid of other machine learning techniques, such as artificial neural networks and support vector machines.

Conclusions
The performance of the ANFIS approach in the modeling of TDS was assessed and improved by using four different metaheuristic algorithms: PSO, DE, ACOR, and GA. This study focused on two key hydrometric stations, Vanyar on the Aji Chay River and Gotvand on the Karun River. Linear regression analysis was used for parameter selection.
In addition, the results revealed that the metaheuristic techniques employed in this study enhanced ANFIS performance in all models. The running time of the models assessed using ANFIS was significantly less than that required for the hybrid models. The monthly data calculations produced better findings than the annual data in the majority of the models. However, the results of the annual data were likewise reliable, and it appears that they can be utilized in analyses when monthly data are unavailable. In terms of running time, the annual models were faster than the monthly models. Investigation of the effect of discharge on modeling showed that removing the discharge from the analysis improved the results in both accuracy and running time. Although the Gotvand Station dataset required more time to process than the Vanyar Station dataset, more precise outcomes were obtained for the Gotvand Station than for Vanyar. In the final step, the ranks of the models were obtained using the TOPSIS approach. To increase the reliability of TOPSIS performance, the temporal and spatial change factors were used in the computations. The comprehensive findings of the study revealed the overall superiority of the ANFIS-ACOR model and the overall weakness of the ANFIS-DE model.
Consequently, the findings of this research support the hypothesis that hybrid ANFIS models can be used to evaluate the TDS with high accuracy, which is of vital importance in water quality management.

References
The research process followed in this study.
Fig. 3 represents the TDS and discharge time series of study areas, and scatter plots of TDS versus discharge of the Gotvand and Vanyar Stations.
Sutcliffe model efficiency coefficient (NSE) were utilized in this study. Eqs. 19-23 are the respective equations of these criteria, where x_i and y_i denote the observed and estimated values, x̄ and ȳ represent the mean observed and estimated values obtained from ANFIS and its hybrid models, and n is the number of data points. The R² index measures the strength of the linear relationship among variables. The Kling-Gupta efficiency (KGE) combines the three components of the Nash-Sutcliffe efficiency (NSE) of model errors more effectively: correlation, bias, and the ratio of variances or coefficients of variation. In recent years, this index has been frequently utilized for calibrating and evaluating hydrological models. In this index, CC is the Pearson correlation coefficient, rm is the average of the observed values, cm is the average of the forecast values, rd is the standard deviation of the observed values, and cd is the standard deviation of the forecast values.
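Eqs. 19-23 themselves are not reproduced in this part of the text. As a hedged illustration, the sketch below implements the commonly used textbook forms of the five criteria in Python, reusing the symbols defined above (CC, rm, cm, rd, cd). It assumes that R² is the squared Pearson correlation and that KGE takes the widely used form 1 - sqrt((CC - 1)² + (cd/rd - 1)² + (cm/rm - 1)²); the exact expressions used in the study may differ. All function names are hypothetical.

```python
import math


def _mean(values):
    return sum(values) / len(values)


def rmse(obs, est):
    """Root mean squared error."""
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))


def mae(obs, est):
    """Mean absolute error."""
    return sum(abs(o - e) for o, e in zip(obs, est)) / len(obs)


def nse(obs, est):
    """Nash-Sutcliffe efficiency: 1 - SSE / variance of observations about their mean."""
    mean_obs = _mean(obs)
    sse = sum((o - e) ** 2 for o, e in zip(obs, est))
    return 1.0 - sse / sum((o - mean_obs) ** 2 for o in obs)


def pearson(obs, est):
    """Pearson correlation coefficient (CC)."""
    mo, me = _mean(obs), _mean(est)
    cov = sum((o - mo) * (e - me) for o, e in zip(obs, est))
    so = math.sqrt(sum((o - mo) ** 2 for o in obs))
    se = math.sqrt(sum((e - me) ** 2 for e in est))
    return cov / (so * se)


def r2(obs, est):
    """Coefficient of determination, assumed here to be CC squared."""
    return pearson(obs, est) ** 2


def kge(obs, est):
    """Kling-Gupta efficiency from correlation, variability ratio, and bias ratio."""
    cc = pearson(obs, est)
    rm, cm = _mean(obs), _mean(est)
    # Population standard deviations; only their ratio cd / rd is used.
    rd = math.sqrt(sum((o - rm) ** 2 for o in obs) / len(obs))
    cd = math.sqrt(sum((e - cm) ** 2 for e in est) / len(est))
    return 1.0 - math.sqrt((cc - 1) ** 2 + (cd / rd - 1) ** 2 + (cm / rm - 1) ** 2)
```

A perfect prediction gives RMSE = MAE = 0 and R² = NSE = KGE = 1. Unlike R², both NSE and KGE penalize a constant offset between observed and estimated values, which is one reason the study reports several indices together.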
KGE max represents the highest KGE value achieved by each algorithm, taking into account all associated conditions (considering and ignoring discharge, and monthly and annual time scales) at both hydrometric stations. KGE a are utilized to assess the monthly and annual datasets, respectively.
= 0.0035, MAE = 0.0078, and NSE = 0.99 for train), and (R² = 0.66, KGE = 0.77, MSE = 0.128, MAE = 0.09, and NSE = 0.569 for test). After applying the metaheuristic algorithms, the statistical factors yielded by ANFIS improved to appropriate values. The ANFIS-ACOR method (R² = 0.97, KGE = 0.98, RMSE = 0.02, MAE = 0.007, NSE = 0.97 in train) and (R² = 0.94, KGE = 0.9, RMSE = 0.035, MAE = 0.08, NSE = 0.93 for test) showed the best results compared with the other three approaches. ANFIS-DE, ANFIS-PSO, and ANFIS-GA ranked second, third, and fourth, respectively. A comparison of the execution times of the models demonstrated that the ANFIS method was the fastest, with 95 seconds, whereas the ANFIS-DE method demanded the most time (3000 seconds) to tune the ANFIS parameters. The annual models of the Gotvand Station considering discharge were more reliable than the annual models of the Vanyar Station considering discharge but required more time to execute. Moreover, the results for the monthly dataset of Vanyar including discharge were more accurate and had less error than the annual models of Vanyar considering discharge. In our study, the modeling of monthly data yielded better results except in the following cases: Gotvand Station without discharge modeled with ANFIS-ACOR, Vanyar Station considering discharge analyzed with ANFIS, Vanyar Station considering discharge assessed with ANFIS-ACOR, and Vanyar Station ignoring discharge evaluated with [...]
In the next step, the accuracy of the assessments was raised by leaving the discharge factor out of the input parameters for the Aji Chay River's annual water quality. The ANFIS model yielded statistical parameters of R² = 0.99, KGE = 0.99, RMSE = 0.0005, MAE = 0.0001, NSE = 0.9 in train and R² = 0.8, KGE = 0.793, RMSE = 0.086, MAE = 0.044, NSE = 0.677 in test. Compared with the other models, the optimizations revealed that ANFIS-ACOR had the highest performance, with R² = 0.973, KGE = 0.96, RMSE = 0.04, MAE = 0.0059, NSE = 0.971 in train and R² = 0.96, KGE = 0.91, RMSE = 0.04, MAE = 0.007, NSE = 0.962 in test.
