Solving the pan evaporation process complexity using the development of multiple mode of neurocomputing models

An accurate computational method for estimating monthly pan evaporation (EPm) can support the development of sustainable agricultural systems and water resources management. In the present study, a hybrid method called multiple model-support vector machine (MM-SVM), which takes as its inputs the outputs of two artificial intelligence (AI) models (i.e., artificial neural network (ANN) and support vector machine (SVM)), was evaluated. The aims were to show the increasing, decreasing, and constant accuracy behavior of this hybrid model and to improve EPm estimates relative to the ANN and SVM models on a monthly scale at four meteorological stations located in semi-arid regions: Ardabil, Khalkhal, and Manjil (Iran) and Grand Island (USA). The results of the intelligent models were compared using several statistical indices (i.e., root mean square error (RMSE), mean absolute error (MAE), Kling-Gupta efficiency (KGE), and coefficient of determination (R2)) together with visual indicators. According to the evaluation indicators in the test phase, MM-SVM-6, ANN-5, MM-SVM-3, and MM-SVM-7 with RMSE = 1.088, 0.761, 0.829, and 0.134 mm/day; MAE = 0.79, 0.54, 0.589, and 0.105 mm/day; KGE = 0.819, 0.903, 0.972, and 0.981; and R2 = 0.939, 0.962, 0.967, and 0.996 and with four input variables were introduced as the best models at Ardabil, Khalkhal, Manjil, and Grand Island stations, respectively. Through its multi-model strategy, with inputs estimated by independent models, the proposed hybrid model (MM-SVM) was able to demonstrate its power to estimate EPm in scenarios where its components are highly correlated with EPm. The incremental, constant, and decreasing modes in the EPm estimation accuracy of this hybrid model under the semi-arid climatic conditions of the studied areas were quite clear.
Therefore, the results of the proposed and superior models in the present study can help local stakeholders in discussing water resources management.


Introduction
Evaporation is an important hydrological process that converts liquid water to vapor and, along with evapotranspiration, leads to the loss of 60% of global rainfall (Ghaemi et al. 2019; Kisi 2015; Malik et al. 2020b). Despite this, evaporation is one of the less well understood components of the hydrological cycle (Jing et al. 2019). It is therefore important to estimate the rate of evaporation accurately, especially in arid and semi-arid regions, where it has a critical impact on agriculture and water resources management (Ashrafzadeh et al. 2020; Ghorbani et al. 2018). Evaporation is important in arid and semi-arid regions because the level of evaporation is higher compared to other elements of the hydrological cycle such as estimation and groundwater flow. The evaporation rate depends mainly on meteorological factors such as air temperature, wind speed, relative humidity, and vapor pressure.
Evaporation can be measured by direct and indirect methods. Pan evaporation is one of the direct measurement methods (Eslamian et al. 2008; Kisi et al. 2016). The Class A evaporation pan (US Weather Bureau), which is 4 ft (122 cm) in diameter and 10 inches (25 cm) deep and is located about 6 inches (15 cm) above the soil surface, is used worldwide to measure free-surface evaporation as well as to estimate reference evapotranspiration (ETo) (Khosravi et al. 2019; Rahimikhoob 2009). Although pan evaporation is very useful for measuring evaporation directly, it faces limitations such as the limited number of stations equipped with evaporation pans, the high cost of setting up and installing safety tools, and its maintenance in research projects. In addition, it is difficult to derive a precise formula covering all physical evaporation processes because of their instability, non-linearity, and complexity.
To overcome these difficulties, many researchers in recent years have tried to use indirect methods to estimate EPm from meteorological parameters. Indirect methods include artificial intelligence (AI) algorithms such as artificial neural networks (ANNs) and SVM, which have been widely used to extract the relationship between meteorological data and EPm (Bruton et al. 2000; Ghorbani et al. 2013; Kişi 2006; Malik et al. 2020a; Seifi and Soroush 2020). According to the Scopus database, 158 articles have been published on EPm estimation using soft computing models. The major keywords of these 158 articles are presented in Fig. 1. Eslamian et al. (2008) developed ANN and SVM models to estimate monthly pan evaporation. Their results showed the high ability of the ANN and SVM models in estimating EPm and, in general, the SVM method was superior to the ANN method. To estimate pan evaporation in a hot and dry area, Piri et al. (2009) developed an ANN model and an integrated model based on ANN and autoregression with exogenous inputs.
Fig. 1 The reported major keywords on pan evaporation estimation using soft computing models and the adopted regions based on the Scopus database
Their results indicated the high ability of the integrated ANN model compared to the single ANN model. Both models also gave better results than two empirical methods. Chen et al. (2019) estimated monthly evaporation in the Three Gorges Reservoir Area in China using an SVM model. The results indicated that the SVM method would be a promising alternative to the traditional approaches for estimating pan evaporation from measured meteorological variables. Tezel and Buyukyildiz (2016) investigated the usability of ANNs (multilayer perceptron (MLP) and radial basis function network (RBFN)) and ε-SVM models for estimating monthly pan evaporation at the Beysehir meteorology station in Turkey. The ANNs and ε-SVM methods were found to perform better than the Romanenko and Meyer methods. Ghorbani et al. (2018) used a hybrid estimation model (multilayer perceptron-firefly algorithm (MLP-FFA)), in which the FFA optimizer is embedded in the MLP method, to estimate pan evaporation and compared it with traditional MLP and SVM models. Their results showed that the optimal MLP-FFA model estimates EPm better than the MLP and SVM models, which shows the importance of the firefly algorithm in improving the performance of the MLP-FFA model. Kisi et al. (2016) examined classification and regression tree (C&RT), chi-squared automatic interaction detector (CHAID), and ANN methods for estimating daily EPm in arid regions. Sudheer et al. (2002) used a multilayer perceptron with the error backpropagation learning algorithm to estimate daily EPm and found that the ANN method is better than the Stephens and Stewart models.
In another study, daily EPm was estimated in the semi-arid region of Iran, and ANN and multivariate non-linear regression (MNLR) were examined with different combinations of climatic variables (Tabari et al. 2010). The results showed higher accuracy for the ANN model than for MNLR. Malik et al. (2019) evaluated the capability of three models, ANN, co-active neuro-fuzzy inference system (CANFIS), and multiple linear regression (MLR), for estimating pan evaporation at a number of Indian stations. Their results showed the high accuracy of the ANN model compared to CANFIS and MLR. Kisi (2015) evaluated the accuracy of least square support vector machine (LSSVM), multivariate adaptive regression splines (MARS), and M5 model tree (M5Tree) in estimating EPm. The results showed that the LSSVM model was superior to the other models in cases where station inputs and outputs were considered.
In the present study, the multiple model-support vector machine (MM-SVM) is evaluated under several input combinations (i.e., scenarios) with the aim of determining the behavioral pattern (i.e., the increasing, decreasing, and constant accuracy behavior) of the combined MM-SVM model and improving EPm estimates relative to the ANN and SVM models. In general, the strategy of multiple models has shown great potential in various areas of hydrological engineering. Multiple models have been used effectively to estimate and detect river flow, pan evaporation, cation exchange capacity, hydraulic conductivity of saturated soil, and oblique weir discharge coefficient (Ghorbani et al. 2018; Kashani et al. 2020; Khatibi et al. 2018b; Khatibi et al. 2017; Malik et al. 2020a; Norouzi et al. 2020). In this research, the strategy of multiple models is presented as a new perspective for improving the estimation of the pan evaporation process, which is random and non-linear.

Study region and datasets
In this study, observed monthly meteorological data were used from the Ardabil, Khalkhal, and Manjil meteorological stations, located in the semi-arid region of Northern Iran, and from the Grand Island meteorological station, located in the semi-arid region of the USA. The data comprise average temperature (°C), relative humidity (%), average wind speed (m/s), total hours of sunshine (hours) (or solar radiation (RS) (MJ/m2/day) for Grand Island), and average pan evaporation. Table 1 shows the statistical characteristics of the parameters used at the stations. In the present study, 70% of the data were used for training and 30% for validation or testing of the results. Figure 3 shows the time series of monthly pan evaporation at all stations over the study periods. The boxplot of the normalized input and output parameters (i.e., EPm) of the stations is shown in Fig. 4. All modeling steps were performed in WEKA software version 3.9.4, developed by the University of Waikato in New Zealand (Garner 1995). Three methods were used in the present study: ANN, SVM, and the integration strategy of multiple models based on the support vector machine (MM-SVM).
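The 70/30 division described above can be sketched as follows. This is a minimal illustration that assumes the split is chronological (the test periods reported later are the final years of each record); the function name `chronological_split` and the 120-month example are hypothetical and are not taken from the WEKA workflow used in the study.

```python
def chronological_split(records, train_fraction=0.7):
    """Split time-ordered records into train and test parts without shuffling."""
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be between 0 and 1")
    cut = int(round(len(records) * train_fraction))
    return records[:cut], records[cut:]


# Hypothetical example: 10 years of monthly values (120 records).
months = list(range(120))
train, test = chronological_split(months)
print(len(train), len(test))  # 84 36
```

With a 10-year monthly record, this gives 7 years for training and 3 years for testing.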

Artificial neural network
For years, the ANN method has been recognized as a reliable data-processing tool whose mathematical structure mimics the biological processes and neural power of the human brain (Bhagat et al. 2020). Artificial neural networks were first introduced by Rosenblatt (1958) as perceptron networks. The method is based on the complex internal structure and parallel processing of biological nervous systems, and it can relate the inputs and outputs of a process without full knowledge of its physics. During training, the network is adjusted so that its output approaches the real (observed) output data, which serve as the target (Tao et al. 2019). In general, an ANN model consists of three layers: input, hidden, and output. The input layer contains the studied variables and the output layer contains the expected results. The hidden layer, the number of which is determined using a trial-and-error process, includes transfer functions to train and process the input variables in its nodes (Aytek et al. 2008). Each layer also contains a number of neurons, which are the main building blocks of the network. Neural networks are divided into four categories according to the direction of information entry and processing: feedforward networks, feedback networks, radial basis function networks, and time reversal propagation networks. The most common type used to approximate functions, and the one used in the present study, is the feedforward neural network trained with the backpropagation algorithm based on gradient descent; more than 90% of existing research on the application of ANN in water resources management has used it (Maier and Dandy 2000). In the present study, with a single hidden layer held fixed, trial and error over 1 to 10 hidden neurons was performed to find the number of neurons that gives the minimum error.
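The trial-and-error search over 1 to 10 hidden neurons can be illustrated with a small self-contained sketch. It is not the WEKA implementation used in the study: the network below is a plain one-hidden-layer tanh network trained by stochastic gradient descent, the data are synthetic, and the learning rate, epoch count, and the choice of test RMSE as the selection criterion are all assumptions made for illustration.

```python
import math
import random


def train_mlp(X, y, hidden, epochs=100, lr=0.05, seed=0):
    """Train a one-hidden-layer tanh network by stochastic gradient descent."""
    rng = random.Random(seed)
    n_in = len(X[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = [0.0]  # single-element list so the nested function sees updates

    def forward(x):
        h = [math.tanh(sum(w * v for w, v in zip(w1[j], x)) + b1[j])
             for j in range(hidden)]
        return h, sum(w2[j] * h[j] for j in range(hidden)) + b2[0]

    for _ in range(epochs):
        for x, target in zip(X, y):
            h, out = forward(x)
            err = out - target
            for j in range(hidden):
                # Backpropagate through the output weight before updating it.
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                b1[j] -= lr * grad_h
                for k in range(n_in):
                    w1[j][k] -= lr * grad_h * x[k]
            b2[0] -= lr * err

    return lambda x: forward(x)[1]


def rmse(obs, est):
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))


# Hypothetical normalized monthly data: two inputs and one target in [0, 1].
X = [[0.5 + 0.5 * math.sin(2 * math.pi * i / 12),
      0.5 + 0.5 * math.sin(2 * math.pi * i / 12 + 0.8)] for i in range(60)]
y = [0.6 * a + 0.4 * b for a, b in X]
X_train, y_train, X_test, y_test = X[:42], y[:42], X[42:], y[42:]

# Trial and error over 1 to 10 hidden neurons, keeping the lowest test RMSE.
errors = {}
for hidden in range(1, 11):
    model = train_mlp(X_train, y_train, hidden)
    errors[hidden] = rmse(y_test, [model(x) for x in X_test])
best_hidden = min(errors, key=errors.get)
print(best_hidden, round(errors[best_hidden], 4))
```

The same loop applies unchanged if the placeholder `train_mlp` is swapped for any other single-hidden-layer trainer.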

Support vector machine (SVM)
SVM is known as a non-statistical binary classifier that has attracted the attention of many researchers in recent years (Mantero et al. 2005; Naganna et al. 2019). SVM models are divided into two main groups: support vector classification (SVC) and support vector regression (SVR). SVC models are used to solve classification problems in which the data fall into different classes, and SVR models are used to solve estimation problems (Vapnik 1995). The properties of SVM models include the following: (i) it applies maximum generalization in the design of classifiers; (ii) it can find the optimal solution of the function; (iii) in solving classification problems, it automatically prepares the optimal structure and mechanism; and (iv) using non-linear kernels and inner products in Hilbert spaces, it can model non-linear functions.
SVM is an algorithm that searches for a unique linear model and uses it to determine the maximum-margin hyperplane. Maximizing the margin of the hyperplane maximizes the separation between classes and increases the accuracy of the modeling process. Support vectors are the training points closest to the hyperplane and are used to define the boundary between classes (Shin et al. 2005). If the data are linearly separable, the SVM uses linear machines to train an optimal separating surface with the least error and the maximum distance between the hyperplane and the nearest training points (support vectors) (Shin et al. 2005).
If the training points are in the form (x_i, y_i) with input vector x_i ∈ R^n, then the class value of each point is y_i ∈ {−1, 1}. The decision rule defined by the optimal hyperplane that separates the two decision classes is expressed as Eq. (1). In Eq. (1), Y is the output; y_i is the class value of the sample; and X_i, a_i, and b are the parameters that determine the hyperplane. If linear separation is not possible, Eq. (1) is changed to Eq. (2). In Eq. (2), K(X, X_i) is a kernel function that generates inner products to create SVM models with different non-linear decision surfaces in the data space, and for this purpose, it is necessary to define the line equation. The line equation in 2D space is given by Eq. (3), the plane equation by Eq. (4), and the hyperplane equation by Eq. (5) (Chen et al. 2002). According to Fig. 5, the continuous bold line with the equation w^T x + b = 0 separates the data on the plane into two categories, A and B. This line forms a space in which the data belonging to category A take a positive value and the data belonging to category B take a negative value. In SVM models, however, a confidence margin is used for classification in addition to the separating line (Fig. 5).
In this case, no data point is allowed to lie inside the margin. Taking the line w^T x + b = 0 as the zero boundary, the data in classes A and B satisfy w^T x + b ≥ 1 and w^T x + b ≤ −1, respectively. The thickness of the separating band in the SVM makes the classification process more robust to the risk of misclassification (Ehteram et al. 2020).
One of the common methods for solving non-linear problems is to use kernel functions. With a non-linear transformation of the input space into a higher-dimensional space, non-linear problems can be separated linearly. The choice of kernel function is very important in SVM models, and different functions can be considered depending on the nature of the problem. Therefore, no single function can be definitively introduced as the best kernel (Table 2).
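For reference, the decision rules and separating surfaces referred to as Eqs. (1) to (5) can be written in the standard textbook form below. This is an assumed reconstruction based on the symbols defined in the text (Y, y_i, a_i, X_i, b, K, w), not a copy of the original equations; N, the number of training points, is a symbol introduced here.

```latex
\begin{align}
Y &= \operatorname{sign}\!\left( \sum_{i=1}^{N} y_i \, a_i \, (X \cdot X_i) + b \right) \tag{1} \\
Y &= \operatorname{sign}\!\left( \sum_{i=1}^{N} y_i \, a_i \, K(X, X_i) + b \right) \tag{2} \\
w_1 x_1 + w_2 x_2 + b &= 0 \tag{3} \\
w_1 x_1 + w_2 x_2 + w_3 x_3 + b &= 0 \tag{4} \\
\sum_{k=1}^{n} w_k x_k + b = w^{T} x + b &= 0 \tag{5}
\end{align}
```

Eq. (2) reduces to Eq. (1) when the kernel is the plain inner product, and Eq. (5) reduces to Eqs. (3) and (4) for n = 2 and n = 3.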
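The effect of the kernel choice can be illustrated with a small sketch. The three kernels below (linear, polynomial, and radial basis function) are common choices and are an assumption here, since the kernels listed in Table 2 are not reproduced in this text. The coefficients in the example are hand-picked for illustration rather than obtained by solving the SVM optimization problem.

```python
import math


def linear_kernel(x, z):
    return sum(a * b for a, b in zip(x, z))


def polynomial_kernel(x, z, degree=2, coef0=1.0):
    return (linear_kernel(x, z) + coef0) ** degree


def rbf_kernel(x, z, gamma=2.0):
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, z)))


def decide(x, support_vectors, labels, alphas, b, kernel):
    """Kernel decision rule: sign of sum_i y_i * a_i * K(x, x_i) + b."""
    score = sum(y * a * kernel(x, sv)
                for sv, y, a in zip(support_vectors, labels, alphas)) + b
    return 1 if score >= 0 else -1


# XOR-like points, which no straight line can separate.
points = [(0, 0), (0, 1), (1, 0), (1, 1)]
labels = [1, -1, -1, 1]
alphas = [1.0, 1.0, 1.0, 1.0]  # hand-picked, not fitted

rbf_out = [decide(p, points, labels, alphas, 0.0, rbf_kernel) for p in points]
lin_out = [decide(p, points, labels, alphas, 0.0, linear_kernel) for p in points]
print(rbf_out)  # [1, -1, -1, 1]
print(lin_out)  # [1, 1, 1, 1]
```

With the same coefficients, the radial basis function kernel reproduces the labels while the linear kernel does not, which is the behavior the non-linear transformation is meant to provide.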

Multiple model-support vector machine
Multiple models based on the support vector machine (MM-SVM), a new hybrid model, were used in the present study with the aim of increasing the accuracy of EPm estimation, showing the increasing, decreasing, and constant accuracy behavior, and comparing the results with the other models. This multiple model strategy (MM-SVM) was first introduced implicitly by Khatibi et al. (2018a) to estimate soil cation exchange capacity and has not so far been studied with the aim of increasing the accuracy of pan evaporation (EP) estimation. The MM-SVM hybrid model is built in two stages. In the first stage, the artificial intelligence models (i.e., ANN and SVM) are trained with meteorological data as input and EPm data as output. The estimation process in the second stage then starts from the results obtained in the first stage. In other words, a two-stage learning process is carried out under the machine learning modeling strategy. The only element common to both stages is the use of the EPm data as the output of the models. Note that the second stage depends on the results of the first stage, which must therefore be fully completed beforehand. According to Fig. 6, the input data to the SVM model in the second stage are the outputs of the learning process in the first stage (i.e., the outputs of the two models ANN and SVM).
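The two-stage data flow described above can be sketched as follows. To keep the example self-contained, simple least-squares fits stand in for the learners: `fit_line` is a placeholder for the stage-1 ANN and SVM models, and `fit_blend` is a placeholder for the stage-2 SVM. These stand-ins, the function names, and the synthetic data are all assumptions for illustration; only the wiring between the stages follows the text.

```python
import math


def fit_line(x, y):
    """Least-squares line y = a*x + c, a placeholder stage-1 learner."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxx = sum((v - mean_x) ** 2 for v in x)
    a = sum((v - mean_x) * (t - mean_y) for v, t in zip(x, y)) / sxx
    c = mean_y - a * mean_x
    return lambda v: a * v + c


def fit_blend(p1, p2, y):
    """Least-squares weights for w1*p1 + w2*p2, a placeholder stage-2 learner."""
    s11 = sum(a * a for a in p1)
    s22 = sum(b * b for b in p2)
    s12 = sum(a * b for a, b in zip(p1, p2))
    t1 = sum(a * t for a, t in zip(p1, y))
    t2 = sum(b * t for b, t in zip(p2, y))
    det = s11 * s22 - s12 * s12
    w1 = (t1 * s22 - t2 * s12) / det
    w2 = (t2 * s11 - t1 * s12) / det
    return lambda a, b: w1 * a + w2 * b


def rmse(obs, est):
    return math.sqrt(sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs))


# Hypothetical monthly series: temperature, sunshine hours, pan evaporation.
n = 60
T = [10 + 8 * math.sin(2 * math.pi * i / 12) for i in range(n)]
S = [7 + 3 * math.sin(2 * math.pi * i / 12 + 0.8) for i in range(n)]
EP = [0.3 * t + 0.4 * s + 0.5 * math.cos(i) for i, (t, s) in enumerate(zip(T, S))]
cut = 42  # 70% for training, 30% for testing

# Stage 1: two independent models map meteorological inputs to EPm.
model_a = fit_line(T[:cut], EP[:cut])
model_b = fit_line(S[:cut], EP[:cut])
pa = [model_a(t) for t in T]
pb = [model_b(s) for s in S]

# Stage 2: the stage-1 outputs become the inputs; the target is still EPm.
hybrid = fit_blend(pa[:cut], pb[:cut], EP[:cut])
ph = [hybrid(a, b) for a, b in zip(pa, pb)]

train_rmse = {name: rmse(EP[:cut], p[:cut]) for name, p in
              (("model_a", pa), ("model_b", pb), ("hybrid", ph))}
test_rmse = {name: rmse(EP[cut:], p[cut:]) for name, p in
             (("model_a", pa), ("model_b", pb), ("hybrid", ph))}
print(train_rmse)
print(test_rmse)
```

With these least-squares stand-ins, the stage-2 model cannot be worse than either stage-1 model on the training data, but nothing guarantees the same on the test data, which is consistent with the mixed incremental, constant, and decreasing behavior reported for the hybrid model.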

Model performance evaluation indicators
The accuracy of the ANN, SVM, and MM-SVM models was evaluated according to Eqs. (6) to (9). In Eqs. (6) to (9), x_i and y_i are the real and estimated values, n is the number of data evaluated, cc is the linear correlation coefficient between x_i and y_i, α is the ratio of the standard deviation of y_i to the standard deviation of x_i, and β is the ratio of the average of y_i to the average of x_i.
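The four indices can be computed as in the sketch below, which uses the symbols defined above (cc, α, β). The formulas are the standard definitions of RMSE, MAE, and the Kling-Gupta efficiency; taking R2 as the squared linear correlation coefficient is an assumption, since the original equations are not reproduced in this text. The sample values are hypothetical.

```python
import math


def mean(values):
    return sum(values) / len(values)


def std(values):
    m = mean(values)
    return math.sqrt(sum((v - m) ** 2 for v in values) / len(values))


def rmse(obs, est):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(obs, est)) / len(obs))


def mae(obs, est):
    return sum(abs(x - y) for x, y in zip(obs, est)) / len(obs)


def correlation(obs, est):
    mx, my = mean(obs), mean(est)
    cov = sum((x - mx) * (y - my) for x, y in zip(obs, est)) / len(obs)
    return cov / (std(obs) * std(est))


def kge(obs, est):
    cc = correlation(obs, est)
    alpha = std(est) / std(obs)    # ratio of standard deviations
    beta = mean(est) / mean(obs)   # ratio of means
    return 1.0 - math.sqrt((cc - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


def r2(obs, est):
    # Assumed definition: squared linear correlation coefficient.
    return correlation(obs, est) ** 2


# Hypothetical observed and estimated EPm values (mm/day).
observed = [2.0, 4.0, 6.0, 8.0]
estimated = [2.5, 3.5, 6.5, 7.5]
print(rmse(observed, estimated), mae(observed, estimated))
print(kge(observed, estimated), r2(observed, estimated))
```

A perfect estimate gives RMSE = 0, MAE = 0, KGE = 1, and R2 = 1.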

Determining the most effective input combinations for the models
Based on the statistical characteristics of the meteorological variables in Table 1, seven different input scenarios were considered for the models in the present study (Table 3).
Because the T variable had the highest correlation with the EPm variable, it was used in all input combinations with the aim of increasing the EPm estimation accuracy.
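One way to generate seven scenarios that always contain T is to combine T with every non-empty subset of the remaining three variables, as sketched below. This construction is an assumption, since Table 3 is not reproduced in this text. The resulting numbering agrees with the scenarios described in the results (scenario 3: T and S; scenario 5: T, RH, and S; scenario 6: T, W, and S; scenario 7: T, RH, W, and S), while the contents of scenarios 1, 2, and 4 are inferred.

```python
from itertools import combinations


def build_scenarios(fixed, optional):
    """Combine one fixed input with every non-empty subset of the optional inputs."""
    scenarios = []
    for size in range(1, len(optional) + 1):
        for combo in combinations(optional, size):
            scenarios.append((fixed,) + combo)
    return scenarios


scenarios = build_scenarios("T", ["RH", "W", "S"])
for number, inputs in enumerate(scenarios, start=1):
    print(number, inputs)
```

For Grand Island, S would be replaced by solar radiation (RS).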

Estimation of EPm at Ardabil station
EPm values at Ardabil station were estimated under 7 scenarios in the training and testing stages using the ANN, SVM, and MM-SVM methods, and the evaluation results are given in Table 4. According to the evaluation criteria in the test phase (Table 4), the best result of the ANN model was obtained in scenario 7 (ANN-7) with RMSE = 1.168 mm/day, MAE = 0.930 mm/day, KGE = 0.862, and R2 = 0.902. The value of the KGE statistic is above 0.7, which is a relatively good result. The best result of the SVM model was, like that of the ANN model, recorded in scenario 7 (SVM-7), with RMSE = 1.136 mm/day, MAE = 0.872 mm/day, KGE = 0.853, and R2 = 0.907. The new hybrid model MM-SVM ranked first, ahead of the ANN and SVM models, in scenario 6 (MM-SVM-6) with RMSE = 1.088 mm/day, MAE = 0.790 mm/day, KGE = 0.819, and R2 = 0.939. Therefore, MM-SVM-6, SVM-7, and ANN-7 were determined, in that order, as the best models for estimating EPm at Ardabil station. Table 4 shows that the ANN, SVM, and MM-SVM models are sensitive to the correlation between the input variables and the output (EPm), and the best results occur when the variables with the highest correlation with EPm are used as model inputs (i.e., scenario 6 (T, W, and S) and scenario 7 (T, RH, W, and S)). Moreover, because scenario 6 has fewer input variables than scenario 7, the MM-SVM-6 model can be introduced as an optimal model with high accuracy in estimating EPm at Ardabil station. The core of the present study is to observe the behavioral pattern of the new MM-SVM hybrid model in EPm estimation based on the evaluation criteria of the different scenarios. Accordingly, a close look at Table 4 reveals incremental, constant, and decreasing accuracy behavior in the RMSE and R2 criteria of the MM-SVM scenarios relative to those of the ANN and SVM models in the test phase.
The largest incremental changes in estimation accuracy occurred in scenarios 2, 6, 4, and 7, in that order. Almost constant accuracy of the MM-SVM model relative to the ANN and SVM models is seen in scenario 5. The largest decreases in the estimation accuracy of the new model are seen in scenarios 3 and 1, in that order.
Fig. 7 (left side) compares the EPm time series estimated by the superior models (ANN-7, SVM-7, and MM-SVM-6) with the observed EPm in the test phase (2017-2019). The scatter plots of observed versus estimated values for the best ANN, SVM, and MM-SVM models in the test phase are shown in Fig. 7 (right side). The scatter plots indicate that the estimates of the MM-SVM-6 model at Ardabil station are relatively more consistent with the observed EPm values than those of the other models. In other words, EPm values at Ardabil station can be estimated with acceptable accuracy from meteorological data on T, W, and S. The points are also less dispersed around the bisector line in Fig. 7 for the MM-SVM-6 model, which indicates relatively higher accuracy. An important feature of the scatter plots is that all three models underestimate EPm, especially at high evaporation values, and this is most evident in the MM-SVM-6 model: unlike the ANN-7 and SVM-7 models, the MM-SVM-6 model underestimates EPm considerably. The estimation errors of the superior models are shown in Fig. 8. According to this figure, in some months the MM-SVM-6 model performed better than the ANN-7 and SVM-7 models and reported values close to the observations.
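The incremental, constant, and decreasing labels can be made explicit with a simple rule, sketched below. The rule compares the test RMSE of the hybrid model with the better of the two single models in the same scenario; both this baseline and the tolerance that defines "almost constant" are assumptions for illustration, and the RMSE values in the example are hypothetical rather than taken from Table 4.

```python
def accuracy_behavior(rmse_hybrid, rmse_ann, rmse_svm, tolerance=0.01):
    """Label the hybrid model's accuracy change relative to the best single model."""
    best_single = min(rmse_ann, rmse_svm)
    change = best_single - rmse_hybrid  # positive means the hybrid is more accurate
    if change > tolerance:
        return "incremental"
    if change < -tolerance:
        return "decreasing"
    return "constant"


# Hypothetical test RMSE values (mm/day) for three scenarios.
print(accuracy_behavior(1.00, 1.20, 1.10))   # incremental
print(accuracy_behavior(1.105, 1.20, 1.10))  # constant
print(accuracy_behavior(1.30, 1.20, 1.10))   # decreasing
```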

Estimation of EPm at Khalkhal station
According to Table 5, the ANN, SVM, and MM-SVM models gave different results at Khalkhal station than at Ardabil station. The best result in the test phase, with RMSE = 0.761 mm/day, MAE = 0.54 mm/day, KGE = 0.903, and R2 = 0.962, was obtained in scenario 5 by the ANN model (ANN-5). The MM-SVM model in scenario 5 (MM-SVM-5) was selected as the second-best model at Khalkhal station, with RMSE = 0.833 mm/day, MAE = 0.568 mm/day, KGE = 0.908, and R2 = 0.953. There is very little difference in accuracy between the ANN-5 and MM-SVM-5 models, and the KGE statistic of the MM-SVM-5 model is even slightly higher than that of ANN-5. Unlike Ardabil station, where the combined MM-SVM model mostly showed incremental behavior in accuracy, at Khalkhal station this behavior was observed in only one scenario. In other words, an increasing trend in the accuracy of the MM-SVM model was observed only in the fifth scenario, and the other scenarios showed a decreasing trend in accuracy compared to the ANN and SVM models. The best result for the SVM model also occurred in scenario 5 (SVM-5), with RMSE = 0.859 mm/day, MAE = 0.57 mm/day, KGE = 0.9, and R2 = 0.952. According to Table 1, the variables T, RH, and S showed the highest correlation with EPm, with CC values of 0.955, −0.868, and 0.931, respectively, compared to the other input variables. Therefore, the clear superiority of scenario 5 over the other scenarios at Khalkhal station is explained by the presence of the T, RH, and S variables in this scenario. Fig. 9 (left side) compares the EPm time series estimated by the superior models (ANN-5, SVM-5, and MM-SVM-5) with the observed EPm data in the test phase (2017-2019) at Khalkhal station. The scatter plots of observed versus estimated values for the best ANN, SVM, and MM-SVM models in the test phase are shown in Fig. 9 (right side).
As shown in the time series plots, the ANN-5 and MM-SVM-5 models were more consistent with the observational data, with relatively higher accuracy, than the SVM-5 model. In other words, the ANN-5 and MM-SVM-5 models can estimate EPm with higher accuracy using the temperature (T), relative humidity (RH), and sunshine hours (S) variables. A careful look at the scatter plots of the top models also shows that the points are less dispersed around the bisector line for the ANN-5 and MM-SVM-5 models than for the SVM-5 model, so these models are more accurate. In addition, all three top models estimate high EPm values with less accuracy; in other words, they slightly underestimate high EPm values, and the underestimation is slightly larger for the SVM-5 model. Figure 10 shows that there is not much difference between the superior models in terms of errors in EPm estimation.

Estimation of EPm at Manjil station
According to Table 6, the ANN, SVM, and MM-SVM models gave different results at Manjil station in the test phase than at Ardabil and Khalkhal stations. The best result in the test phase, with RMSE = 0.829 mm/day, MAE = 0.589 mm/day, KGE = 0.972, and R2 = 0.967, was obtained in scenario 3 by the MM-SVM model (MM-SVM-3). The ANN model in scenario 3 (ANN-3) was selected as the second-best model for Manjil station, with RMSE = 1.081 mm/day, MAE = 0.836 mm/day, KGE = 0.911, and R2 = 0.959. Table 6 shows that the MM-SVM approach increased accuracy relative to the ANN and SVM models in all scenarios (except scenario 6, where the change relative to ANN is almost constant), which is similar to the Ardabil station results and leads to higher accuracy. The SVM model in scenario 3 (SVM-3), with RMSE = 1.191 mm/day, MAE = 0.940 mm/day, KGE = 0.864, and R2 = 0.946, was less accurate than the MM-SVM and ANN models. According to Table 1, the temperature and sunshine hour parameters showed the highest correlation with monthly pan evaporation (EPm), with CC values of 0.93 and 0.94, respectively. Therefore, the superiority of scenario 3 at Manjil station over the other scenarios, and its difference from the other stations, is explained by the difference in the correlation coefficients between the input parameters and EPm.
Fig. 11 (left side) compares the EPm time series estimated by the superior models (ANN-3, SVM-3, and MM-SVM-3) with the observed EPm data in the test phase (2012-2014) at Manjil station. The scatter plots of observed versus estimated values for the best ANN, SVM, and MM-SVM models in the test phase are shown in Fig. 11 (right side). As shown in the time series plots, the MM-SVM-3 and ANN-3 models can estimate the monthly EPm data with higher accuracy using temperature and sunshine hour inputs. The scatter plots of the best models also show that the points are less dispersed around the bisector line for the MM-SVM-3 and ANN-3 models than for the SVM-3 model, so these models are more accurate. However, MM-SVM-3 is more accurate and visually superior to ANN-3 because of its smaller under- and overestimation. In addition, the ANN-3 and SVM-3 models estimate high EPm values with less accuracy. As shown in Fig. 12, there is a large difference between the superior models in terms of errors in estimating EPm in the test phase, and in most months the MM-SVM-3 model made smaller errors than the ANN-3 and SVM-3 models. According to Figs. 8, 10, and 12, the difference in accuracy among the best models is generally greater at Manjil station than at Khalkhal and Ardabil stations, and the MM-SVM model improved the accuracy of monthly pan evaporation estimates to a greater extent.
Table 6 Values of the RMSE (mm/day), MAE (mm/day), KGE, and R2 criteria for the developed models during training (2005-2011) and testing (2012-2014)

Estimation of EPm at Grand Island station
According to Table 7, the models estimated monthly pan evaporation (EPm) more accurately at Grand Island station than at Ardabil, Khalkhal, and Manjil stations. Although both the ANN and SVM models showed excellent accuracy in estimating EPm values at Grand Island station, the MM-SVM model under optimal conditions was also able to significantly improve on the accuracy of both ANN and SVM. According to Table 7, increasing and decreasing trends of the MM-SVM model were observed at Grand Island station, as at the other stations studied. Among the 7 scenarios, the MM-SVM model showed an increasing trend in four scenarios and a decreasing trend in three scenarios in the accuracy of monthly EPm estimates. These results are similar to the trend at Ardabil and Manjil stations. According to Table 1 for Grand Island station, the parameters of temperature, solar radiation, wind speed, and relative humidity in scenario 7 showed correlation coefficients with the monthly EPm values of 0.9, 0.94, −0.2, and −0.49, respectively. Therefore, because the T and RS parameters are strongly correlated with EPm, the MM-SVM model was able to produce its best results for estimating evaporation in the presence of these parameters.
Fig. 13 (left side) compares the EPm time series estimated by the superior models (ANN-7, SVM-7, and MM-SVM-7) with the observed EPm data in the test phase (2011-2013) at Grand Island station. The scatter plots of observed versus estimated values for the best ANN, SVM, and MM-SVM models in the test phase are shown in Fig. 13 (right side). As shown in the time series diagrams, the MM-SVM-7 and ANN-7 models can estimate EPm data with higher accuracy using T, RS, W, and RH inputs. In general, a comparison of the time series and scatter plots of this station with those of the other stations shows that the models were more accurate at this station. A careful look at the scatter plots in Fig. 13 also shows that the points are less dispersed around the bisector line for the ANN-7 and MM-SVM-7 models than for the SVM-7 model, so these models were more accurate. The SVM-7 model estimates high values of EPm with less accuracy; in other words, it underestimates high values of EPm. According to Fig. 14, the error of the SVM-7 model is larger than that of the ANN-7 and MM-SVM-7 models in estimating the monthly pan evaporation at Grand Island station. It is also clear that in most months, MM-SVM-7 recorded smaller errors than ANN-7 and SVM-7. Comparing the results of all 4 stations studied, the ANN and MM-SVM models were always superior to the single SVM model and gave more reliable EPm estimates. The superiority of the ANN and MM-SVM models over each other was relative at all stations. In most cases, however, the MM-SVM model increased accuracy, and this was observed at almost all stations in different scenarios.

Conclusion
Detection of the evaporation process, which occurs in the hydrological cycle under the influence of random phenomena, is very complex. In this paper, the ability of the intelligent ANN and SVM models and of the proposed multiple hybrid model (MM-SVM) to estimate EPm values was evaluated at four meteorological stations: Ardabil, Khalkhal, and Manjil (in Iran) and Grand Island (in the USA). All possible hybrid scenarios were considered in determining the inputs of the ANN and SVM intelligent models to estimate EPm. The outputs of the models were evaluated and presented using numerical and visual performance evaluation criteria. In general, the results obtained in the present study can be classified and expressed as follows:
• In all meteorological stations, scenarios were selected as superior scenarios in which there were three and four input variables (i.e., T, S (or RS for Grand Island), W, and RH). In other words, the EPm process in the study area is very random and more climatic information is needed to achieve the desired result in estimating this process.
• At Ardabil station, the MM-SVM-6 model gave better results than the ANN-7 and SVM-7 models, while the ANN-5 model was selected as the best model at Khalkhal station. At Manjil and Grand Island stations, MM-SVM-3 and MM-SVM-7, related to scenarios 3 and 7, respectively, were selected as the best models for estimating EPm values. Given the actual mechanisms between the independent and dependent variables, it is natural for intelligent models to give different results in different regions.
• The proposed MM-SVM hybrid model for EPm estimation in the specific climatic conditions of the studied areas was able to increase the accuracy of estimates in scenarios with high correlation between its components and EPm by applying the strategy of multiple models with inputs estimated by independent models. The incremental, constant, and decreasing modes in the accuracy of EPm estimation by this hybrid model under these conditions were quite clear.
• In future studies, different combinations of other climatic parameters affecting the pan evaporation (EP) process can be added as input in the first stage of the multiple models. The proposed hybrid intelligent model has a strong modeling strategy for estimating EPm values in the specific climatic conditions of the studied areas and can be used in the engineering and management of water resources.

Author contribution
All authors have contributed to different sections of the article.

Data availability
The data used were obtained from the Iran and US Meteorological Organization. The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.

Code availability
The Weka software user interface was used.

Declarations
Ethics approval This article does not contain any studies with human participants performed by any of the authors.