High-resolution reference evapotranspiration for arid Egypt: Comparative analysis and evaluation of empirical and artificial intelligence models

Accurate estimation of evapotranspiration is crucially important in arid regions like Egypt, which suffers from scarce precipitation and water shortages. This study investigates the performance of 31 widely used empirical equations and 20 models developed using five artificial intelligence (AI) algorithms in estimating reference evapotranspiration (ET 0 ), with the aim of generating gridded high-resolution daily ET 0 estimates over Egypt. The AI algorithms include support vector machine-radial basis function (SVM-RBF), random forest (RF), group method of data handling neural network (GMDH-NN), multivariate adaptive regression splines (MARS), and dynamic evolving neural fuzzy inference system (DENFIS). Daily observation records from 41 stations distributed over Egypt were used to calculate ET 0 using the FAO56 Penman-Monteith (FAO56-PM) equation as a reference estimate. The multi-parameter Kling-Gupta efficiency (KGE) was used as the evaluation metric for its robustness in representing different statistical error/agreement characteristics in a single value. By category, the radiation-based empirical equations performed best in replicating FAO56-PM, followed by the temperature- and mass-transfer-based ones. The Ritchie equation was found to be the best overall in Egypt (median KGE 0.76) followed by Caprio (median KGE 0.64), and


1 | INTRODUCTION
One of the most crucial issues in water management over the past few decades has been the scarcity of water in some regions of the world and the difficulty of accurately assessing agricultural demand (Shahid, 2011). To overcome these issues and improve water use efficiency, an accurate estimation of evapotranspiration (ET) is required. ET is the combined process of water transfer from the soil-plant system to the atmosphere by evaporation and transpiration. As a key element of Earth's water cycle, it plays an important role in irrigation management (Adnan et al., 2021), agricultural planning (Ismail et al., 2017; Mehdizadeh et al., 2017), water stress assessment (Mohsenipour et al., 2018; Sanikhani et al., 2018), and climate change studies (Roudier et al., 2014; Yu et al., 2020).
The most accepted method of estimating ET is through the reference evapotranspiration (ET 0 ). ET 0 is the rate of evapotranspiration from a hypothetical reference crop with predefined characteristics (Allen et al., 1998). The concept was introduced to avoid the variability of evaporation parameters among crops and at different growth stages (del Cerro et al., 2021). However, measuring ET 0 in situ is not only expensive and time-consuming but also subject to different degrees of uncertainty. Thus, several empirical equations for estimating ET 0 have been developed. Among them, the Food and Agriculture Organization (FAO) recommended the Penman-Monteith equation (FAO56-PM) as a standard method for reliable estimation of ET 0 (Allen et al., 1998). The FAO56-PM equation is universally accepted as the most skilful equation for estimating ET 0 . It requires four main climatic variables (temperature, relative humidity, wind speed, and solar radiation). These four variables are rarely all available in data-scarce regions such as Egypt, which limits the use of the FAO56-PM. Thus, several alternative empirical equations that require fewer climate records are used. The performance of these alternative equations depends mainly on the climate characteristics of the regions for which they were developed, which have a significant impact on the estimates (Adnan et al., 2021). Furthermore, differences in the modelling assumptions and input data requirements of empirical equations yield conflicting values (Muniandy et al., 2016). Soft computing techniques based on artificial intelligence (AI) algorithms have been widely used to reliably estimate ET 0 from available recorded meteorological variables (Alizamir et al., 2020). This has been driven by the improvement of computer-aided hydrology and water resources models over the previous few years (Khosravinia et al., 2020).
In the literature, several studies have examined the performance of empirical equations and AI models to select the most robust in estimating ET 0 in different parts of the globe (Gavili et al., 2018; Muhammad et al., 2019). Mehdizadeh et al. (2017) evaluated 16 empirical equations and four AI algorithms (namely gene expression programming [GEP], two different types of support vector machine [SVM], and multivariate adaptive regression splines [MARS]) in estimating monthly ET 0 in Iran. They found that MARS performed better than the others. Salam et al. (2020) used 13 empirical equations and the random forest (RF) algorithm to estimate monthly ET 0 in Bangladesh and found that the Abtew (1996) model has the closest estimates and the best linear correlation with FAO56-PM. Recently, Adnan et al. (2021) demonstrated the feasibility of the group method of data handling neural network (GMDH-NN), which used temperature records as predictors, and found that it performed well in Turkey compared to MARS and empirical equations. Mattar (2018) tested the performance of GEP models in estimating monthly ET 0 in Egypt and reported the importance of including mean relative humidity (RH) and wind speed at 2 m height (U 2 ) as predictors in modelling ET 0 in Egypt. The review of the literature revealed that few studies have either established accurate maps of ET 0 in Egypt or selected an accurate method for its estimation.
Egypt, as an arid country, suffers from severe water scarcity. Precipitation can reach 200 mm annually in a narrow strip along the Mediterranean shores, but drops drastically to below 5 mm in inland areas (Hamed et al., 2021c). The Nile supplies Egypt with 98% of its renewable water resources, which are mainly consumed in agricultural activities (Hamed, 2019). Meanwhile, Egypt is facing external pressure on its perceived Nile water share due to the construction of the Grand Ethiopian Renaissance Dam far upstream (Sharaky et al., 2019). Furthermore, it faces internal pressure on water due to rapid growth in population and economic status in recent years (Nikiel and Eltahir, 2021). This has spurred the country to control and manage its water resources prudently to improve water (re)usage and especially agricultural productivity and food security (AbuZeid, 2020; Nikiel and Eltahir, 2021). However, accurate measurements of ET 0 are lacking in Egypt. This may be due to several factors, including the absence of a dense network of measuring stations, especially outside urbanized areas. The lack of an adequate network has limited the ability to capture the variability of ET 0 in those vast areas. Therefore, accurate management of water resources is challenging.
This study aims to develop a high-resolution (0.10° × 0.10°) estimate of ET 0 for arid Egypt. To obtain an accurate estimation of ET 0 , 31 empirical equations and 20 AI models were evaluated against the FAO56-PM estimates at 41 station locations. The multiparameter Kling-Gupta efficiency (KGE) was used as the evaluation metric for its robustness in representing different statistical error/agreement characteristics in a single value. Then, the best method found was used to develop the high-resolution 0.10° × 0.10° ET 0 data for Egypt. Finally, the rate of change in historical ET 0 was calculated and tested for significance to assess the spatial pattern of trends in ET 0 . This study is among the leading efforts in evaluating several empirical and AI models for ET 0 estimation in Egypt. Its findings may help improve crop water estimation and reduce internal pressure on water supply in general.
2 | STUDY AREA AND DATA SOURCES

2.1 | Study area
Egypt covers a land area of nearly 1 million km². Most of the land is flat; however, the elevation varies from −139 m in the Qattara Depression in the north to about 2,614 m in the elevated mountains in Sinai and 1,356 m in the Red Sea Mountains chain, as can be seen in Figure 1. The Nile divides Egypt's land into the Eastern and Western Deserts, which represent most of Egypt's area. Egypt is an arid country with mildly cold winters, hot summers, and rare rainfall. The annual rainfall is about 200 mm·year⁻¹ in the northern parts and less than 5 mm·year⁻¹ in the far south (Yassen et al., 2020; Hamed et al., 2021a; 2021b). Due to the shortage of water, most Egyptians live along the Nile in nearly 4% of Egypt's total land. Agricultural land represents about 3% of the total area of the nation and is located along the Nile valley and its delta. Cultivated land in Egypt occupies 85% of the total area of arable land (Omran and Negm, 2020).

2.2 | Data and sources
This study employed two types of data: gauge records and reanalysis gridded data. Daily meteorological observation records at 2.00 m above ground level of maximum, minimum, mean, and dewpoint temperatures (T max , T min , T mean , and T dew , respectively) and wind speed (U 2 ) were collected at 41 gauge stations distributed across Egypt. The data were obtained from the Global Surface Summary of the Day (GSOD) dataset of the National Climatic Data Center (NCDC) of the National Oceanic and Atmospheric Administration (NOAA). The gauges' locations are shown in Figure 1 and descriptive statistics of the records are listed in Table 1. All stations are land-based automatic weather stations that report in SYNOP/METAR codes, and their data were exchanged under the World Meteorological Organization (WMO) World Weather Watch Program. The reported codes undergo 57 extensive automated quality control tests at NCDC facilities to eliminate errors (Lott et al., 2001). Further checks were done to ensure the quality of the observations. The Student's t test was employed for the objective analysis of data homogeneity, and double-mass curves were plotted for the subjective analysis of nonhomogeneity. The analysis showed that the variation among different samples of climatic records was not statistically significant at the 95% confidence level, and the double-mass curves were nearly straight without breakpoints.
Reanalysis, as a scientific process, provides an accurate representation of the weather and climate at regular intervals over a long period. European Reanalysis v.5 (ERA5) data are a combination of a numerical prediction model with satellite observation data and other available gauge records. The results of a reanalysis are meteorological fields on a uniform grid with a reasonable temporal resolution over an extended time period that can provide knowledge on climate conditions (Lompar et al., 2019). ERA5 is the newest edition,

FIGURE 1 Topography and meteorological stations' locations in Egypt

TABLE 1 Descriptive statistics of the station records used in the study
No.   WMO ID   Data range   T max (°C)   T min (°C)   T dew (°C)   U 2 (m·s⁻¹)

temperature data compared to observation data. The results show that over northern and southern Africa, the correlation is greater than 0.95. Several surface parameters found in ERA5-Land were used in this study, such as temperature, dew point temperature, and wind speed from 1981 to 2020. The daily ERA5 data have a high-resolution grid spacing of 0.10° × 0.10° (~9 km). Data from ERA5 are accessible from 1981 up to 2-3 months before the current time. The wind speed components (i.e., u and v) provided in ERA5 are at the 10.0 m level above ground. As most ET 0 estimation methods require wind speed at the 2.00 m level, the ERA5 wind speed components were converted into wind speed at the 2.00 m level using the wind power law (Cook, 1986; Pryor et al., 2005). Despite the progress that has occurred in model expansion, the potential evaporation field provided in ERA5 is largely underestimated over deserts due to an error in the code, which prevents transpiration when there is little or no vegetation (C3S, 2022). This limits the use of ERA5 potential evaporation in Egypt and creates the need to develop an accurate high-resolution estimate of ET 0 over Egypt.

Figure 2 presents a flowchart of the study methodology. First, the FAO56-PM equation was used to calculate ET 0 at different locations from the available observation records. Then, the resulting ET 0 was used as a reference to evaluate the performance of 31 empirical equations and 20 models developed using five AI algorithms. For each AI algorithm, four models were developed. The first model employed T max and T min as input variables to estimate reference ET 0 . The second model employed T max , T min , and U 2 , whereas the third model employed RH instead of U 2 . Finally, the fourth model employed T max , T min , U 2 , and RH.
Random samples representing 70% of the available records of all stations were used to train the different models and the remaining 30% were used for testing model performance. The multiparameter Kling-Gupta efficiency (KGE) was used as the evaluation metric for its robustness in representing different statistical error/agreement characteristics in a single value. In parallel, the performance of ERA5 in replicating the available observation records was verified. Afterward, the best performing equation/model was selected and used to estimate ET 0 at ungauged locations over Egypt using ERA5 data at 0.10° × 0.10° spatial resolution. Finally, the rate of change in annual ET 0 for 1981-2020 was calculated using Sen's slope, and the significance of the change was tested using the modified Mann-Kendall test.
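The first step of this workflow (deriving 2 m wind speed and computing the FAO56-PM reference ET 0 ) can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function names are hypothetical, the power-law exponent of 1/7 is an assumed value that the text does not state, net radiation is taken as a given input, and the auxiliary formulas for vapour pressure, Δ, and γ are the standard ones from Allen et al. (1998).

```python
import math

def wind_speed_2m(u10, v10, alpha=1.0 / 7.0):
    """Scalar wind speed at 2 m from 10 m u/v components via the power law.

    alpha is an ASSUMED shear exponent (1/7); the paper does not report it.
    """
    return math.hypot(u10, v10) * (2.0 / 10.0) ** alpha

def sat_vapour_pressure(temp_c):
    """Saturation vapour pressure (kPa) at air temperature temp_c (deg C)."""
    return 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))

def fao56_pm(t_max, t_min, t_dew, u2, r_n, g=0.0, elevation=0.0):
    """Daily FAO56 Penman-Monteith ET0 (mm/day).

    r_n and g are in MJ m-2 day-1, temperatures in deg C, u2 in m/s,
    elevation in m. Actual vapour pressure is taken from dewpoint.
    """
    t_mean = (t_max + t_min) / 2.0
    e_s = (sat_vapour_pressure(t_max) + sat_vapour_pressure(t_min)) / 2.0
    e_a = sat_vapour_pressure(t_dew)
    # Slope of the saturation vapour pressure curve (kPa per deg C)
    delta = 4098.0 * sat_vapour_pressure(t_mean) / (t_mean + 237.3) ** 2
    # Atmospheric pressure (kPa) and psychrometric constant (kPa per deg C)
    pressure = 101.3 * ((293.0 - 0.0065 * elevation) / 293.0) ** 5.26
    gamma = 0.000665 * pressure
    numerator = (0.408 * delta * (r_n - g)
                 + gamma * 900.0 / (t_mean + 273.0) * u2 * (e_s - e_a))
    denominator = delta + gamma * (1.0 + 0.34 * u2)
    return numerator / denominator

# Illustrative (made-up) inputs for one day at one location
u2 = wind_speed_2m(u10=2.0, v10=1.5)
et0 = fao56_pm(t_max=35.0, t_min=20.0, t_dew=10.0, u2=u2, r_n=15.0, elevation=100.0)
print(f"U2 = {u2:.2f} m/s, ET0 = {et0:.2f} mm/day")
```

Keeping the wind conversion separate from the ET 0 function means the same `fao56_pm` call can be fed either station observations (already at 2 m) or converted reanalysis winds.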

3.1 | Penman-Monteith equation
The Penman-Monteith equation (FAO56-PM) is a physically based energy-balance equation for estimating ET 0 , presented by Allen et al. (1998) and recommended by the Food and Agriculture Organization (FAO). The FAO56-PM equation is universally accepted as the most skilful equation for estimating ET 0 . Because experimental ET 0 measurements are limited or absent in Egypt, the FAO56-PM equation was used as a reference to evaluate the empirical and AI models. However, it requires numerous climatic parameters which are hard to obtain in data-scarce regions such as Egypt. The FAO56-PM formula is given in Equation (1),

$$ET_0 = \frac{0.408\,\Delta\,(R_n - G) + \gamma\,\dfrac{900}{T_{mean} + 273}\,U_2\,(e_s - e_a)}{\Delta + \gamma\,(1 + 0.34\,U_2)}, \qquad (1)$$

where ET 0 is in mm·day⁻¹, Δ is the slope of the saturation vapour pressure-temperature curve (kPa·°C⁻¹), R n is net radiation (MJ·m⁻²·day⁻¹), G is the soil heat flux (MJ·m⁻²·day⁻¹), γ is the psychrometric constant (kPa·°C⁻¹), T mean is the mean air temperature at 2 m height (°C), U 2 is the wind speed at 2 m height (m·s⁻¹), and e s and e a are the saturation and actual vapour pressure (kPa). This study followed the guidelines presented in Allen et al. (1998) for calculating ET 0 except for the following:

• The actual sunshine hours (n) were derived using the linear regression proposed by Abd el-wahed and Snyder (2015) (Equation (2)), which was developed for reliable estimation of n in arid climates and desert locations,

$$n = 4.352 + 0.232\,T_{mean}. \qquad (2)$$

• The Angstrom-Prescott solar radiation model (Ångström, 1924; Prescott, 1940) uses sunshine duration to determine daily global irradiance on a horizontal surface (Equation (3)), as recommended by Allen et al. (1998). This study, however, used a satellite-based calibration of the Angstrom-Prescott coefficients (i.e., a and b) to obtain a more accurate estimation of solar radiation over Egypt, leading to a better estimation of FAO56-PM ET 0 . These site-specific coefficients were developed by Bojanowski et al. (2013), which led to better precision in estimating solar radiation compared to interpolated ground-based model coefficients,

$$R_s = \left(a + b\,\frac{n}{N}\right) R_a, \qquad (3)$$

where N is the maximum possible sunshine duration (hr) and R a is the extraterrestrial radiation.

3.2 | Empirical estimation of ET 0
A set of 31 empirical equations for estimating ET 0 was evaluated in this study. They are categorized by their basis, namely, temperature-, radiation-, and mass-transfer-based equations. They are listed in Table 2, where Equations (4)-(14), (15)-(24), and (25)-(34) are temperature-, radiation-, and mass-transfer-based equations, respectively. These equations are widely used to estimate ET 0 empirically for different climates and regions (Sharafi and Ghaleni, 2021). ET 0 is the reference evapotranspiration in mm·day⁻¹, except for the McGuinness and Bordne, and Ritchie equations, where ET 0 is in cm·day⁻¹. R n is net radiation (MJ·m⁻²·day⁻¹). T max , T min , T mean , and T d are the maximum, minimum, mean, and dewpoint daily air temperature in °C, respectively, except T mean in the McGuinness and Bordne equation, which is in °F. G is the soil heat flux (MJ·m⁻²·day⁻¹). γ is the psychrometric constant (kPa·°C⁻¹). U 2 is the mean daily wind speed at 2 m height. Z is the elevation in m. L is the local latitude in degrees. R a is extraterrestrial radiation (MJ·m⁻²·day⁻¹). TD is the difference between T max and T min in °C. R s is solar radiation in MJ·m⁻²·day⁻¹, except in the Ritchie, Makkink, Turc, and McGuinness and Bordne equations, where R s is in Cal·m⁻². e s and e a are the saturation and actual vapour pressure (hPa), except for Rohwer, Papadakis, and Penman, where they are in kPa. e ma is the saturation vapour pressure at the monthly mean daily maximum temperature (kPa). λ is the latent heat of evaporation (MJ·kg⁻¹). Δ is the slope of the saturation vapour pressure-temperature curve (kPa·°C⁻¹). RH is the average relative humidity percentage. p is the daily percent of annual daytime hours for each day of the year. L d is the daytime length in multiples of 12 hr. F(U 2 ) is a function of wind speed. RHOSAT is the saturated vapour density (g·m⁻³). ESAT is the saturated vapour pressure (mbar). KPEC is a calibration coefficient (1.2) and α is a constant (1.26). In the Ritchie equation, α 1 is a coefficient that depends on T max and is calculated as follows: If

TABLE 2 Empirical equations used in this study

3.3 | AI models

3.3.1 | Support vector machine
SVM is a machine learning technique that can be used for both classification and regression. SVM's core concept, which is extended to support vector regression (SVR), is to separate two or more classes linearly with a hyperplane by selecting support vector points. SVR is a powerful forecasting tool for solving regression problems using a kernel function. ET 0 prediction using this technique is obtained using the following formulation,

$$\hat{y}_t = \sum_{i} \left(\alpha_i - \alpha_i^{*}\right) K(\chi_i, \chi) + b_0,$$

where ŷ t is the predicted target variable, b 0 is the bias, α i and α i * are the dual variables, and K(χ i , χ) is the kernel function. The radial basis function (RBF) kernel was used in this study.
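To make the kernel formulation concrete, the sketch below evaluates an RBF kernel and an SVR-style prediction from already-fitted dual coefficients. It does not train a model; the function names, the `gamma` width parameter, and all numbers in the usage lines are hypothetical placeholders, not values from this study.

```python
import math

def rbf_kernel(x, z, gamma):
    """RBF kernel K(x, z) = exp(-gamma * ||x - z||^2)."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, z))
    return math.exp(-gamma * sq_dist)

def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """SVR prediction: bias + sum_i (alpha_i - alpha_i*) * K(sv_i, x).

    dual_coefs[i] holds the difference (alpha_i - alpha_i*) for support
    vector i, as produced by some earlier training step (not shown).
    """
    return bias + sum(
        coef * rbf_kernel(sv, x, gamma)
        for sv, coef in zip(support_vectors, dual_coefs)
    )

# Made-up support vectors in a (Tmax, Tmin) input space, for illustration only
svs = [(30.0, 18.0), (38.0, 24.0)]
coefs = [-0.8, 1.1]
print(svr_predict((34.0, 21.0), svs, coefs, bias=5.0, gamma=0.01))
```

Because the kernel decays with distance, a query point far from every support vector receives a prediction close to the bias term alone.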

3.3.2 | Random forest
RF, which is extensively used in different research fields, is an ensemble tree-based method in which each tree is trained on a random subset of the data. It can be used with bagging or pasting; bagging is the more common choice. As the number of trees grows, more randomness is introduced, which improves generalization. Rather than searching all features for the best one to split a node, it looks for the best feature among a random subset of features. The number of features randomly selected at each split node (m try ) and the number of regression trees (n tree ) are the parameters used by RF. The prediction obtained by the model after training is represented as

$$\hat{y} = \frac{1}{M} \sum_{m=1}^{M} T_m(f_t),$$

where M is the number of regression trees, T m denotes a single decision tree, and f t is a vector of predictors. Root-mean-square error was used to select the optimal model. Two-fold cross-validation was used to calibrate m try and the final value used was 3. The number of regression trees used in this study was n tree = 500.
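The ingredients described above (bootstrap samples, a random subset of m try features per split, and averaging over n tree trees) can be illustrated with a deliberately tiny forest of one-split regression trees ("stumps"). This is a teaching sketch under that simplification, not the implementation used in the study; a full RF grows deep trees and redraws the feature subset at every node, and all function names here are hypothetical.

```python
import random

def fit_stump(rows, targets, feature_ids):
    """Best single split (lowest squared error) among the given features."""
    best = None
    for f in feature_ids:
        for threshold in sorted({r[f] for r in rows}):
            left = [t for r, t in zip(rows, targets) if r[f] <= threshold]
            right = [t for r, t in zip(rows, targets) if r[f] > threshold]
            if not left or not right:
                continue  # a split must leave samples on both sides
            mean_l = sum(left) / len(left)
            mean_r = sum(right) / len(right)
            sse = (sum((t - mean_l) ** 2 for t in left)
                   + sum((t - mean_r) ** 2 for t in right))
            if best is None or sse < best[0]:
                best = (sse, f, threshold, mean_l, mean_r)
    if best is None:  # no valid split: always predict the sample mean
        mean = sum(targets) / len(targets)
        return (0, float("inf"), mean, mean)
    return best[1:]

def fit_forest(rows, targets, n_tree=500, m_try=3, seed=0):
    """Train n_tree stumps, each on a bootstrap sample and m_try random features."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    forest = []
    for _ in range(n_tree):
        idx = [rng.randrange(len(rows)) for _ in rows]  # sample with replacement
        feats = rng.sample(range(n_features), min(m_try, n_features))
        forest.append(fit_stump([rows[i] for i in idx],
                                [targets[i] for i in idx], feats))
    return forest

def predict_forest(forest, row):
    """Average the individual tree predictions (the 1/M sum over T_m)."""
    preds = [mean_l if row[f] <= thr else mean_r
             for f, thr, mean_l, mean_r in forest]
    return sum(preds) / len(preds)
```

The defaults mirror the n tree = 500 and m try = 3 reported above; with stumps there is only one node per tree, so drawing the feature subset once per tree is equivalent to drawing it per split.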

3.3.3 | Group method of data handling neural network
The group method of data handling neural network (GMDH-NN) is a nonlinear machine learning model with a statistical analysis approach, used for classification and regression problems by identifying the complex relationship between the input and output of a system. The GMDH-NN model can handle several input variables and predict a single output using different layers in the model. The output of each layer serves as an input to the next layer (Adnan et al., 2021). The GMDH-NN's layer structure can be presented as where x i is the ith input variable, M is the total number of input variables, and a i is the weight associated with x i . To minimize the difference between the observed values and the output values generated by GMDH-NN, the weights are optimized using the least-squares method.

3.3.4 | Multivariate adaptive regression splines
Multivariate adaptive regression splines (MARS) is a flexible nonparametric regression model used to predict continuous numerical target variables. In this algorithm, the input variables are separated into intervals, and each interval is fitted with a basis function. The information about the independent variables is represented by the basis functions. The initial and final points of a basis function are known as knots, and they are defined in a specific range. The MARS model is built in two stages. In the first stage (forward step), the model starts from a constant value, the mean of the target variable, and then adds basis functions; an overfitted and complex model with a huge number of knots is constructed at this stage. In the second stage (backward step), all basis functions added in the first stage are checked and those that contribute least are removed, yielding the final model (Mehdizadeh et al., 2017).
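The basis functions used by MARS are commonly hinge functions anchored at a knot. The sketch below shows how a fitted MARS model of this additive form would be evaluated; the forward and backward steps that choose the knots and coefficients are not shown, and the function names and every number in the usage lines are hypothetical.

```python
def hinge(x, knot, direction=1):
    """Hinge basis: max(0, x - knot) for direction=+1, max(0, knot - x) for -1."""
    return max(0.0, direction * (x - knot))

def mars_predict(x, intercept, terms):
    """Evaluate an additive MARS model.

    terms is a list of (coefficient, variable_index, knot, direction) tuples,
    one per retained basis function.
    """
    return intercept + sum(
        coef * hinge(x[j], knot, direction)
        for coef, j, knot, direction in terms
    )

# Made-up model: ET0 rises with Tmax above 25 degC and with U2 above 1 m/s
terms = [(0.20, 0, 25.0, 1), (0.50, 1, 1.0, 1)]
print(mars_predict((35.0, 3.0), intercept=2.0, terms=terms))
```

Each hinge is zero on one side of its knot, so a term only influences the prediction within the interval where its variable has crossed the knot.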

3.3.5 | Dynamic evolving neural fuzzy inference system
The dynamic evolving neural fuzzy inference system (DENFIS) is an AI learning algorithm based on adaptive neural fuzzy inference systems (ANFIS), introduced by Kasabov and Qun (2002). DENFIS uses an evolving clustering algorithm that takes into account the greatest possible distance between the points and the centre of the cluster. Thus, it can handle noise in the data (Ye et al., 2022). Furthermore, the DENFIS learning approach determines the position of the input vectors in the feature space, and a dynamic process then forms a fuzzy inference system to predict the output based on the nearest fuzzy rules. In comparison to other well-known models, DENFIS has been shown to learn complex temporal sequences in an adaptable manner (Ye et al., 2022).

3.4 | Evaluation metric
To evaluate the robustness of empirical equations and AI models, several statistical metrics, such as Pearson's correlation, percentage of bias, and coefficient of determination, are usually used individually or in combination (Mehdizadeh et al., 2017; Muhammad Adnan et al., 2020). These metrics have been developed to measure pair-wise statistical characteristics such as accuracy, under- or overestimation, presence of outliers, and so forth. However, the use of different metrics often leads to contradictory conclusions since each metric presents its own viewpoint of the pair-wise evaluation, making decision-making difficult (Radcliffe and Mukundan, 2017; Nashwan et al., 2019; Salman et al., 2019b). To avoid this confusion, this study statistically assessed the performance of the empirical equations and the AI models using the robust KGE metric (Gupta et al., 2009). The KGE integrates several statistics into one metric and gives an overall score of the skill of each method in estimating ET 0 compared to the FAO56-PM estimates. As shown in Equation (40), it incorporates three components of observed and simulated data: Pearson's linear correlation (r), the variability ratio (α), and the bias ratio (β),

$$KGE = 1 - \sqrt{(r - 1)^2 + (\alpha - 1)^2 + (\beta - 1)^2}, \qquad (40)$$

$$\alpha = \frac{\sigma_{sim}}{\sigma_{obs}}, \qquad \beta = \frac{\mu_{sim}}{\mu_{obs}},$$

where σ sim is the standard deviation of the simulation, σ obs is the standard deviation of FAO56-PM, μ sim is the mean of the simulation, and μ obs is the mean of the FAO56-PM estimates. If there is a perfect match between the pairs of ET 0 time series, the KGE value is 1 (Gupta et al., 2009). Otherwise, the KGE value is lower than 1 and can decrease to −∞. A positive KGE value indicates a "good" overall skill, whereas a negative KGE value indicates a "weak" model skill (Knoben et al., 2019).
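As a worked illustration of the three components, the following sketch computes r, α, β, and the resulting KGE for two short series using only the Python standard library. The function name and the sample values are hypothetical, and the sketch assumes neither series is constant (a zero standard deviation would make r and α undefined).

```python
import math
from statistics import mean, pstdev

def kge(sim, obs):
    """Return (KGE, r, alpha, beta) for simulated vs. reference series."""
    mu_sim, mu_obs = mean(sim), mean(obs)
    sd_sim, sd_obs = pstdev(sim), pstdev(obs)
    cov = sum((s - mu_sim) * (o - mu_obs) for s, o in zip(sim, obs)) / len(obs)
    r = cov / (sd_sim * sd_obs)   # Pearson's linear correlation
    alpha = sd_sim / sd_obs       # variability ratio
    beta = mu_sim / mu_obs        # bias ratio
    score = 1.0 - math.sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)
    return score, r, alpha, beta

# Made-up daily ET0 values (mm/day): reference vs. an overestimating equation
reference = [3.1, 4.0, 5.2, 6.1, 4.8]
simulated = [3.6, 4.4, 5.9, 6.8, 5.1]
score, r, alpha, beta = kge(simulated, reference)
print(f"KGE={score:.2f}, r={r:.2f}, alpha={alpha:.2f}, beta={beta:.2f}")
```

Returning the components alongside the score makes it easy to see which of correlation, variability, or bias is pulling a given KGE value down.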

3.5 | Sen's slope estimator
Sen's slope method (Sen, 1968) is a nonparametric statistical method for fitting a line using the median of pairwise slopes. It has proved to be an effective tool for constructing linear relationships. The slopes between pairs of data points are estimated as follows,

$$Q_i = \frac{\chi_j - \chi_k}{j - k},$$

where Q i is the slope between a pair of data points, and χ j and χ k are the values measured at times j and k, respectively (where j > k). Sen's slope is the median of all Q i values and clearly shows the trend over the entire period.
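A direct, unoptimized way to compute this estimator is to enumerate every pair of points and take the median of their slopes, as in the sketch below. The function name and the sample series are hypothetical; the quadratic number of pairs is unproblematic for a 40-year annual series.

```python
from statistics import median

def sens_slope(values, times=None):
    """Median of all pairwise slopes (x_j - x_k) / (t_j - t_k) with j > k."""
    times = list(range(len(values))) if times is None else list(times)
    slopes = [
        (values[j] - values[k]) / (times[j] - times[k])
        for k in range(len(values))
        for j in range(k + 1, len(values))
    ]
    return median(slopes)

# Made-up annual mean ET0 (mm/day) with one anomalous year
years = [2016, 2017, 2018, 2019, 2020]
et0 = [4.50, 4.52, 4.54, 5.90, 4.58]
print(sens_slope(et0, years))
```

Because the median is used rather than the mean, a single anomalous year has little influence on the estimated rate of change.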

| Evaluation of empirical equations
Heat-scatter plots of the FAO56-PM ET 0 against the ET 0 simulated by each empirical equation are shown in Figure 3. As observed in Figure 3, most temperature-, radiation-, and mass-transfer-based equations tend to overestimate ET 0 , except Baier and Robertson, and Penman. Radiation-based equations generally show a smaller overestimation than temperature-based equations. The Ritchie equation was the best among the radiation-based equations, having median r, α, β, and KGE of 0.91, 0.80, 1.14, and 0.76, respectively. On the other hand, the mean median scores of the KGE components r, α, and β, expressed as |r − 1|, |α − 1|, and |β − 1|, are 0.11, 0.28, and 0.01, respectively. The mean median KGE of all equations is 0.26, or 0.53 when the mass-transfer-based equations are excluded. The empirical equations thus performed markedly worse in terms of α than in terms of r or β. In turn, the α value was the dominant component in the final KGE scores due to the squaring of the three components of KGE (refer to Equation (41)).

| Performance evaluation
KGE scores were ranked at each gauge station location to identify the best empirical equation per location. Out of the 31 equations used, 12 equations dominated the top station-wise ranks, namely Ritchie, Penman, Caprio, Irmak (Rn), Ivanov, Hargreaves-Samani, McGuinness, Jensen, Kharrufa, Priestley, Papadakis, and Szasz; their spatial distribution is presented in Figure 5. The Ritchie equation was found the best at nine locations, followed by Penman, Caprio, and Irmak (Rn) each at five locations, then Ivanov at four locations, Hargreaves-Samani and McGuinness each at three locations, Jensen and Kharrufa each at two locations, and finally Priestley, Papadakis, and Szasz each at one location. The figure also shows the type of the best equation at each location. It can be seen that there is a spatial pattern in equation type over Egypt. Radiation-based equations dominated the majority of Egypt (25 out of 41 locations: 61%). Temperature-based equations dominated the northwestern edge of the country, where solar radiation is the lowest, and mass-transfer-based equations dominated the interior of the Western Desert. Most of the coastal locations had radiation-based equations as the best. At a few locations in the middle and east regions of Egypt, temperature-based equations were found to be the best, whereas radiation-based equations were the best at nearby locations. However, the second-best equation for most of these locations was a radiation-based equation. For example, station no. 27 on the Red Sea coast had Papadakis (KGE 0.86) as the best equation, followed closely by the radiation-based Caprio equation (KGE 0.85).

FIGURE 5 Spatial distribution of the best performing empirical equation at each station location

Figure 6 presents box-and-whiskers plots showing the distribution of the available records from the 41 meteorological stations for the four climatic variables (i.e., T max , T min , U 2 , and RH). Furthermore, the figure presents the distributions of the 70% of the available records that were chosen randomly and used in the training stage, as well as the remaining 30% that were used in the testing stage of the 20 AI models. The training set contains 70% (301,824 daily records) of the collected records and the testing set contains the remaining 30% (128,864 daily records). In this study, K-fold cross-validation was used to ensure that the developed models do not overfit the training data (Berrar, 2019). The K-fold method divides the training data into a number (K) of folds. Each fold contains a different random sample from the training data (bagging), to build multiple weak models which vote together to give the resulting prediction through smoother decision boundaries. It is clear that the three data groups of each climate variable had nearly the same distribution, which is crucial for data-driven models that are highly affected by the distribution of the training and testing samples.
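The random 70/30 partition and the subsequent division of the training portion into K folds can be sketched as index bookkeeping, independent of any particular model. The function names, the seed, and the choice of K in the usage lines are assumptions for illustration; the text does not report the seed used.

```python
import random

def train_test_split(n_records, train_fraction=0.7, seed=42):
    """Randomly partition record indices into training and testing sets."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    cut = int(round(train_fraction * n_records))
    return indices[:cut], indices[cut:]

def k_fold(indices, k):
    """Yield (train, validation) index lists for each of the k folds."""
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        validation = folds[i]
        train = [j for m, fold in enumerate(folds) if m != i for j in fold]
        yield train, validation

train_idx, test_idx = train_test_split(1000)
for fold_train, fold_val in k_fold(train_idx, k=2):
    print(len(fold_train), len(fold_val))
```

Every training index appears in exactly one validation fold, so each record contributes once to the out-of-fold error used for tuning.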

| Evaluation of AI models
In this study, 20 models were developed using five AI algorithms, each with four different combinations of model inputs, to estimate the FAO56-PM ET 0 . Figure 7 compares the FAO56-PM ET 0 with the ET 0 simulated by the AI models; each row represents one algorithm with the four different combinations of model inputs.
The combinations were used to estimate daily ET 0 with SVM, RF, GMDH-NN, MARS, and DENFIS, giving twenty models for evaluating the performance of the various climate data combinations. The first combination used two climatic variables, T max and T min . The second and third combinations were generated by adding U 2 and RH, respectively, to the first combination. The fourth combination was created by adding both U 2 and RH to the first combination. The scatter plots in Figure 7 clearly show the overall better performance of the fourth combination compared to the others. The ET 0 simulated by RF4 was closely aligned with the 1:1 diagonal line, followed closely by the estimates of SVM4. Table 3 compares the models against FAO56-PM in terms of KGE for the training and testing stages. Notably, the testing KGEs were greater than or equal to those of the training stage, indicating good generalization of the models' performance. Furthermore, the RF models presented the best results across the four combinations of input parameters. It was also found that combining the U 2 data with T max and T min as input parameters improved the estimation of ET 0 compared to using only T max and T min . Furthermore, the improvement in the second combination is better than in the third, which used RH, except in MARS and DENFIS. DENFIS in the first combination (DENFIS1) recorded a testing KGE of 0.83, which was equal to that of RF1. SVM had the second-best testing KGEs after RF in the second, third, and fourth combinations, which were 0.93, 0.87, and 0.98, respectively. DENFIS4 showed weak results compared to the other corresponding models.
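The 20-model experimental grid described above is simply the product of five algorithms and four input sets. A compact way to enumerate it is sketched below; the variable names are hypothetical, while the model labels follow the naming used in the text (for example, RF4 and DENFIS1).

```python
from itertools import product

ALGORITHMS = ["SVM", "RF", "GMDH-NN", "MARS", "DENFIS"]

# Input combinations 1-4 as described in the text
INPUT_SETS = {
    1: ("Tmax", "Tmin"),
    2: ("Tmax", "Tmin", "U2"),
    3: ("Tmax", "Tmin", "RH"),
    4: ("Tmax", "Tmin", "U2", "RH"),
}

# Map each model label (algorithm + combination number) to its predictors
MODELS = {
    f"{algo}{combo}": INPUT_SETS[combo]
    for algo, combo in product(ALGORITHMS, INPUT_SETS)
}

for name, inputs in MODELS.items():
    print(name, inputs)
```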
FIGURE 6 Data ranges for all records from 41 stations, and for data used in training and testing stages for (a) maximum temperature (°C), (b) minimum temperature (°C), (c) wind speed (m·s⁻¹), and (d) relative humidity (%)

| Estimating ET 0 in ungauged locations
Reanalysis climatic data are frequently used as a substitute for observational weather and climate data because of their complete temporal and spatial coverage, especially in data-scarce areas like Africa (Gleixner et al., 2020). Prior to estimating ET 0 at ungauged locations, it is important to properly assess both the strengths and limitations of such data. The performance of ERA5 was verified against the available observations for the common period (1981-2020) using KGE. Figure 8 presents box-and-whiskers plots of the KGE scores for the ERA5 data. T max scored the best among the four variables, followed closely by T min , and then by RH and U 2 (median KGE 0.92, 0.90, 0.77, and 0.28, respectively). Overall, the performance of ERA5 was acceptable and thus it was used to estimate ET 0 at ungauged locations.
As presented in the previous results, RF4 had the top performance among the AI models, and its KGEs were also higher than those of the empirical equations. Thus, RF4 was adopted to estimate ET 0 at ungauged locations in Egypt. The four input parameters required for RF4 were obtained from ERA5 to estimate daily ET 0 all over Egypt at 0.10° × 0.10° horizontal resolution. Figure 9 shows the spatial distribution of mean daily ET 0 for the period 1981-2020. ET 0 ranges from 2.30 to 6.72 mm·day⁻¹ and increases from the north of Egypt toward the southeast, except over the Red Sea mountains chain. ET 0 rates in central Egypt range from 4.30 to 4.92 mm·day⁻¹.
The daily ET 0 values at each grid point were accumulated and averaged over each year from 1981 to 2020 to derive annual averages. The resulting time series were then used to estimate the rate of change and determine the significance of the shifts in the data. The colour ramps in Figure 10 describe the rate of change obtained by Sen's slope estimator for ET 0 (Figure 10e), as well as for the four predictors (Figure 10a-d), to examine the driving factors. A dot in the middle of a grid point indicates a significant change according to the modified MK test at the 95% confidence level.
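The per-grid-point trend calculation can be sketched as follows, assuming a series of (year, daily value) pairs for one grid point. Sen's slope is the median of all pairwise slopes of the annual series. The function names are illustrative, and the Mann-Kendall helper computes only the classical S statistic; the modified MK test used in the study is not reproduced here.

```python
from itertools import combinations
from statistics import mean, median


def annual_means(daily):
    """Average daily values by year.

    daily: iterable of (year, value) pairs for one grid point.
    Returns a dict mapping year -> mean value, ordered by year.
    """
    by_year = {}
    for year, value in daily:
        by_year.setdefault(year, []).append(value)
    return {year: mean(values) for year, values in sorted(by_year.items())}


def sens_slope(years, values):
    """Sen's slope: median of the slopes between all pairs of points.

    years must be distinct; the result is in value units per year.
    """
    slopes = [
        (values[j] - values[i]) / (years[j] - years[i])
        for i, j in combinations(range(len(years)), 2)
    ]
    return median(slopes)


def mann_kendall_s(values):
    """Classical Mann-Kendall S statistic (sum of signs of all later-minus-earlier
    differences). Assumption: shown for orientation only, not the modified test."""
    def sign(d):
        return (d > 0) - (d < 0)
    return sum(
        sign(values[j] - values[i])
        for i, j in combinations(range(len(values)), 2)
    )


if __name__ == "__main__":
    # Made-up annual series for illustration only.
    years = [1981, 1982, 1983, 1984, 1985]
    et0 = [4.50, 4.52, 4.51, 4.55, 4.57]
    print(sens_slope(years, et0) * 10)  # rate per decade
    print(mann_kendall_s(et0))
```

Multiplying the per-year slope by 10 gives the per-decade rates reported in the text. Because the estimator takes a median, it is insensitive to a few anomalous years, which is one reason it is commonly paired with the MK test.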
Sen's slope estimator revealed an increase in the annual average T max and T min across all of Egypt, at rates from 0.26 to 0.53 and 0.20 to 0.60 °C·decade⁻¹, respectively. The largest change in T max occurred in the north and east of the Nile Delta, whereas for T min it occurred in the far south of Egypt. The modified MK test showed a significant decrease in RH in the northeast of Egypt. The changes in RH were in the range of −1.13 to −0.05%·decade⁻¹. The highest rate of decrease (−1.13%·decade⁻¹) was observed east of the Nile Delta and north of Sinai. The U 2 trend shows a mix of increases and decreases. However, the changes were only significant  Figure 10e, the changes in ET 0 were found to be significantly increasing all over Egypt, with the highest change of 0.16-0.18 mm·decade⁻¹ observed in the far southeast of Egypt.

| DISCUSSION
Most previous studies adopted several statistical metrics to evaluate either empirical equations or AI models in estimating the FAO56-PM ET 0 . Using several metrics can generally produce a more reliable assessment of modelling techniques than using a single metric. However, several metrics may also give contradictory results because each metric reflects a specific statistical characteristic (e.g., error, association, etc.) (Nashwan and Shahid, 2019b; Hamed et al., 2022). This study used only the KGE, a composite metric that integrates estimates of bias, correlation, and variability, to gain the advantages of multiple metrics while avoiding the complexities of interpreting them separately. Based on median KGE values, the radiation-based Ritchie empirical equation was found to be the best (median KGE 0.76) at replicating the FAO56-PM estimates of daily ET 0 . It was followed by the temperature-based Hargreaves-Samani and the radiation-based Priestley et al. (median KGE 0.72 and 0.71, respectively). Ritchie was also the best based on station-wise ranking, followed by Penman, Caprio, and Irmak (-Rn). The results also showed the better overall performance of the RF models compared to the other AI algorithms. Thus, this study recommends the Ritchie equation as a substitute for the FAO56-PM when the required observations are limited.
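To make the composite nature of the KGE concrete, the sketch below computes it from its three components: the linear correlation r, the variability ratio α (ratio of standard deviations), and the bias ratio β (ratio of means). It assumes the original Gupta et al. (2009) formulation, KGE = 1 − √((r − 1)² + (α − 1)² + (β − 1)²); the study may have used a different variant, so this is an illustration rather than the authors' exact calculation.

```python
from math import sqrt
from statistics import mean, pstdev


def kge(sim, obs):
    """Kling-Gupta efficiency of simulated vs. reference values.

    Assumes the 2009 formulation: correlation (r), variability ratio
    (alpha = sd_sim / sd_obs), and bias ratio (beta = mean_sim / mean_obs).
    Undefined when either series is constant or the reference mean is zero.
    """
    if len(sim) != len(obs) or len(obs) < 2:
        raise ValueError("sim and obs must have equal length of at least 2")
    mean_sim, mean_obs = mean(sim), mean(obs)
    sd_sim, sd_obs = pstdev(sim), pstdev(obs)
    covariance = sum(
        (s - mean_sim) * (o - mean_obs) for s, o in zip(sim, obs)
    ) / len(obs)
    r = covariance / (sd_sim * sd_obs)
    alpha = sd_sim / sd_obs
    beta = mean_sim / mean_obs
    return 1 - sqrt((r - 1) ** 2 + (alpha - 1) ** 2 + (beta - 1) ** 2)


if __name__ == "__main__":
    reference = [3.1, 4.0, 5.2, 6.1, 5.5]  # made-up FAO56-PM values, mm/day
    estimate = [3.0, 4.2, 5.0, 6.4, 5.3]   # made-up model output, mm/day
    print(round(kge(estimate, reference), 3))
```

A perfect estimate gives KGE = 1. An estimate that is perfectly correlated but twice as large in both mean and spread gives r = 1, α = 2, and β = 2, hence KGE = 1 − √2 ≈ −0.41, which shows how bias and variability errors lower the score even when correlation is perfect.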
The present study revealed that the Ritchie equation, which is a radiation-based method, had a superior performance in modelling ET 0 in the arid climate of Egypt. Furthermore, the results illustrated the overall better performance of radiation-based equations compared with temperature- and mass-transfer-based equations. These findings agree with the results of previous studies. del Cerro et al. (2021) evaluated the performance of various radiation- and temperature-based empirical equations in the semi-arid region of south India and found that Ritchie performed the best. Furthermore, they concluded that radiation-based equations are best suited to estimate ET 0 in the study region. Also, Tabari et al. (2013) and Sharafi and Ghaleni (2021) concluded that the radiation-based models were the best-suited equations, while mass-transfer-based equations had the worst performances in Iran. Tabari et al. (2012) found that the Ritchie equation gives an acceptable performance in modelling crop ET in Tabriz, Iran. Farzanpour et al. (2018) showed that a calibrated version of the Ritchie equation was among the best in semi-arid regions in Iran. Tikhamarine et al. (2020) also confirmed that the Ritchie equation gave a reliable estimate of ET 0 in Algeria. On the contrary, the Ritchie equation was found to be the worst at estimating ET 0 in climate types other than arid, as in Muhammad et al. (2019) and Feng et al. (2016).
The findings of the present study revealed an overall good performance of AI models compared to empirical equations in simulating daily mean ET 0 in arid Egypt. In addition, this study found that the models with U 2 and RH as input parameters alongside T max and T min performed better. This agrees with the finding of Mattar (2018) in Egypt. Furthermore, this study found that including U 2 gave better performance than including RH along with T max and T min .
Among AI models, the RF models were the most accurate, closely followed by SVM. These findings are supported by previous studies. Wen et al. (2015), Mehdizadeh et al. (2017), and Adnan et al. (2021) confirm that, overall, the soft computing approach performs better than empirical equations, which is in agreement with the present study. Furthermore, Mehdizadeh et al. (2017) reported the strong ability of SVM-polynomial in estimating monthly ET 0 in Iran. GMDH-NN also had acceptable performance, as found by Adnan et al. (2021) in Turkey. Finally, this study found that DENFIS models were the worst among the applied algorithms, as was similarly reported by Muhammad Adnan et al. (2020) when modelling monthly ET 0 in China.
The main advantage of RF models is their flexibility: as the dimensionality of the characteristic parameters increases, the RF method does not overfit and still achieves better results. The RF algorithm outperforms SVM when tested with identical characteristic parameters (Jia et al., 2013). Han et al. (2018) conducted a comparative analysis of the behaviour of RF and showed that it has outstanding properties in terms of classification accuracy, stability, and robustness to characteristics. In particular, they found across a number of analyses that RF is significantly better when the training samples are limited. These may be the reasons why the RF models outperformed other algorithms.
FIGURE 9 Mean daily ET 0 calculated over the period 1981-2020
FIGURE 10 Spatial distributions of the rate of change in (a) T max , (b) T min , (c) U 2 , (d) RH, and (e) ET 0 as estimated using the RF4 model. Colour ramps indicate the rate of change obtained using Sen's slope estimator. The significant changes (p < .05) are marked with a dot
The high-resolution estimation of daily ET 0 and the trend analysis showed that locations of relatively low ET 0 (north of Egypt) are experiencing a higher increase, while locations of high ET 0 (south and east of Egypt) are experiencing an equal or lower rate of increase. This indicates that ET 0 is likely to be higher overall under future climate states, which may have severe consequences for agriculture and water resources in Egypt. The high rate of evaporation has already had a significant impact on Lake Nasser (Egypt's national reserve of fresh water) and thus on the Egyptian water budget (Salih et al., 2019).

| CONCLUSIONS
Accurate estimation of ET 0 using the standard FAO56-PM is challenging when the required climate data are not available, as in most regions around the globe. In this study, the abilities of five AI algorithms (i.e., SVM, RF, MARS, GMDH-NN, and DENFIS) and 31 empirical equations to model daily ET 0 were evaluated against the FAO56-PM equation. First, at 41 locations covering Egypt, the KGE evaluation results showed that Ritchie had the best overall performance (median KGE 0.76) and was the best at the highest number of stations based on a station-wise ranking. In contrast to the several meteorological variables required for the FAO56-PM computation of ET 0 , the Ritchie equation requires only T max , T min , and R s . The results illustrate that the Ritchie equation can be used as a substitute for FAO56-PM when data are scarce. Second, different combinations of T max , T min , U 2 , and RH were used as inputs to train AI models to target the FAO56-PM ET 0 . The RF model was generally the most accurate in predicting daily mean ET 0 , closely followed by SVM, whereas DENFIS provided the worst prediction for the third and fourth combinations of input variables. Thus, this study recommends using the RF model to predict ET 0 . According to the findings of this study, the RF1 model, which requires only maximum and minimum temperature, can be employed in worst-case scenarios. Using the RF model, irrigation schedulers can get an accurate estimate of ET 0 in Egypt without the need for the full set of climatic data. Third, the RF4 model was used with ERA5 data to generate ET 0 at 0.1° resolution all over Egypt. Finally, Sen's slope estimator and the modified Mann-Kendall test showed a significant change in ET 0 in most locations in Egypt over the past 40 years.
Considering the lack of reliable observation records for the required inputs of the FAO56-PM in most developing countries, the present study should be useful for determining ET 0 in arid climates, such as Egypt, using simpler and more accurate methods. Furthermore, this study enables peers to produce better estimations of drought and aridity status in Egypt, which employ common methods such as the standardized precipitation evapotranspiration index and the aridity index. Thus, it can support enhanced strategies for the adaptation to and mitigation of such events. Locally, irrigation schedulers can get an accurate estimate of ET 0 without the need for the full set of climatic data.
The developed high-resolution (0.10° × 0.10°) daily estimation of ET 0 is freely available online (https://doi.org/10.6084/m9.figshare.18551084) in comma-separated values (CSV) format. The daily records cover the land of Egypt and temporally span from 1981 to 2020.
This study was limited by the availability of measured solar radiation. However, it employed the recommendations presented in Abd el-wahed and Snyder (2015) and Bojanowski et al. (2013) for better estimation of solar radiation by the Angstrom-Prescott solar radiation model.
Future work may investigate the future change in ET 0 regionally in Egypt, using the state-of-the-art Coupled Model Intercomparison Project phase 6 (CMIP6) models together with the RF models of this study. Also, the high-resolution data can be used in downscaling and bias correction of CMIP6 model estimations. Furthermore, peers can provide a locally calibrated version of the Ritchie equation for Egypt and determine which factor most influences ET 0 regionally.