In the present study, the capability of four regression-based machine learning methods (SVM, RF, BoT and BaT) was investigated for ET0 estimation. Several input scenarios combining the climatic variables Tmax, Tmin, Tmean, SR, WS and RH were used as model inputs, and daily data were collected from five stations in Iraq: Qarah-Tapah, Mandali, Kalar, Iran-Iraq Border and Adhim. The input scenarios used for daily ET0 estimation are presented in Table 2. Scenario M1 uses all six variables as inputs, while M7 uses only two, Tmean and SR.
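The scenarios whose composition is stated in the text can be written as a simple lookup table. The sketch below is illustrative only: M3, M4 and M6 are defined in Table 2 and are left out rather than guessed, and the names `SCENARIOS`, `select_inputs` and the sample record are hypothetical, not part of the study's code.

```python
# Input scenarios whose variables are stated in the text.
# M3, M4 and M6 are listed in Table 2 and deliberately omitted here.
SCENARIOS = {
    "M1": ["Tmax", "Tmin", "Tmean", "SR", "WS", "RH"],
    "M2": ["Tmax", "Tmin", "Tmean", "SR"],
    "M5": ["RH", "WS", "Tmax"],
    "M7": ["Tmean", "SR"],
}


def select_inputs(record, scenario):
    """Return the feature vector for one daily record under a scenario."""
    return [record[name] for name in SCENARIOS[scenario]]


# Hypothetical daily record; the values are illustrative, not station data.
day = {"Tmax": 38.0, "Tmin": 22.0, "Tmean": 30.0,
       "SR": 25.0, "WS": 2.5, "RH": 30.0}

features_m1 = select_inputs(day, "M1")
features_m7 = select_inputs(day, "M7")
```

Keeping the scenarios in one table makes it straightforward to train every method on every input combination with the same loop.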
Table 3 summarizes the performance metrics of the employed methods in estimating daily ET0 at Qarah-Tapah Station. For the RF method, R2, MSE, RMSE and MAE range from 0.86 (RF-5) to 1 (RF-1), from 0.05 (RF-2) to 0.414 (RF-5), from 0.074 (RF-2) to 0.643 (RF-5) and from 0.055 (RF-2) to 0.487 (RF-5), respectively. The corresponding ranges for SVM are from 0.87 (SVM-5) to 0.97, from 0.077 (SVM-1) to 0.367 (SVM-5), from 0.28 (SVM-1) to 0.606 (SVM-5) and from 0.23 (SVM-1) to 0.488 (SVM-5). For BoT, the ranges are from 0.90 (BoT-5) to 0.99, from 0.029 to 0.297 (BoT-5), from 0.172 to 0.545 (BoT-5) and from 0.135 to 0.439 (BoT-5); for BaT, they are from 0.90 (BaT-5) to 1 (BaT-2), from 0.008 (BaT-2) to 0.297 (BaT-5), from 0.091 (BaT-2) to 0.545 (BaT-5) and from 0.052 (BaT-2) to 0.428 (BaT-5). It is evident from these ranges that the RF method is generally the most successful in estimating daily ET0 at Qarah-Tapah Station. For the RF method, the difference between the 1st and 2nd scenarios is small, and the 2nd one produces the best accuracy. The other methods behave differently: the 1st and 2nd input combinations provide the same accuracy for the BoT method, M1 is slightly more accurate than M2 for the SVM method, and the 1st scenario performs worse than the 2nd for the BaT method. These differences can be explained by the different working principles of the four methods. The best estimates belong to the RF method, followed by the BaT method, while the SVM generally produces the worst ET0 estimates.
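For reference, the four metrics reported in the tables can be computed as in the minimal sketch below. It assumes R2 is the coefficient of determination, 1 - SS_res/SS_tot; the study may instead define it as the squared correlation coefficient. The sample series are made up for illustration.

```python
import math


def mse(obs, est):
    """Mean squared error between observed and estimated values."""
    return sum((o - e) ** 2 for o, e in zip(obs, est)) / len(obs)


def rmse(obs, est):
    """Root mean squared error; by definition the square root of MSE."""
    return math.sqrt(mse(obs, est))


def mae(obs, est):
    """Mean absolute error."""
    return sum(abs(o - e) for o, e in zip(obs, est)) / len(obs)


def r2(obs, est):
    """Coefficient of determination (assumed definition of R2)."""
    mean_obs = sum(obs) / len(obs)
    ss_res = sum((o - e) ** 2 for o, e in zip(obs, est))
    ss_tot = sum((o - mean_obs) ** 2 for o in obs)
    return 1.0 - ss_res / ss_tot


# Illustrative values only, not station data.
observed = [2.0, 4.0, 6.0, 8.0]
estimated = [2.5, 3.5, 6.5, 7.5]

scores = {
    "MSE": mse(observed, estimated),
    "RMSE": rmse(observed, estimated),
    "MAE": mae(observed, estimated),
    "R2": r2(observed, estimated),
}
```

Because RMSE is the square root of MSE, tabulated pairs can be cross-checked; for example, the SVM-1 values of 0.077 and 0.28 are consistent, since the square root of 0.077 is about 0.28.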
Table 4 reports the test performance of the RF, SVM, BoT and BaT methods in estimating ET0 at Mandali Station. Here too, the RF-2 model has the lowest MSE (0.024), RMSE (0.156) and MAE (0.059), followed by the RF-1 and BaT-2 models. As at the previous station, SVM produces the worst estimates. The model ranks from best to worst are RF-2 > RF-1 > RF-3 > RF-7 > RF-4 > RF-6 > RF-5; SVM-1 > SVM-2 > SVM-3 > SVM-7 > SVM-4 > SVM-6 > SVM-5; BoT-1 > BoT-2 > BoT-3 > BoT-7 > BoT-4 > BoT-6 > BoT-5 and BaT-2 > BaT-1 > BaT-7 > BaT-4 > BaT-3 > BaT-6 > BaT-5. It follows that the 1st scenario (Tmax, Tmin, Tmean, SR, WS and RH) or the 2nd scenario (Tmax, Tmin, Tmean and SR) generally provides the best estimates, while the 5th scenario (RH, WS and Tmax) gives the worst ET0 estimates. The main reason might be that SR, which strongly affects ET0, is not included in the 5th combination, and that including WS may worsen the estimation accuracy. In machine learning modeling, adding input variables may increase model variance and thereby reduce accuracy. Here, adding WS appears to degrade model performance, as can be observed by comparing the 1st and 2nd input cases.
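Rank orderings of this kind can be produced by sorting the models on an error metric. The sketch below assumes RMSE as the ranking criterion, which the text does not state explicitly, and uses only the Qarah-Tapah RMSE values quoted in the discussion of Table 3; the function name `rank_models` is hypothetical.

```python
# RMSE values quoted in the text for Qarah-Tapah Station (Table 3).
rmse_by_model = {
    "RF-2": 0.074, "RF-5": 0.643,
    "SVM-1": 0.28, "SVM-5": 0.606,
    "BaT-2": 0.091, "BaT-5": 0.545,
}


def rank_models(scores):
    """Order model names from best to worst, assuming lower is better."""
    return sorted(scores, key=scores.get)


# Join the ordered names with the same ">" notation used in the text.
ranking = " > ".join(rank_models(rmse_by_model))
```

Ranking on a different metric, such as MAE or R2 (sorted in descending order), could change the order of models with similar scores.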
The test performance of the four methods in estimating ET0 at Kalar Station is reported in Table 5. As at the Qarah-Tapah and Mandali stations, the RF-2 model provides the best performance, with the lowest MSE (0.022), RMSE (0.148) and MAE (0.056) and the highest R2 (0.998), followed by the RF-1 and BaT-2 models. At this station, the BaT and BoT methods rank 2nd and 3rd, while the SVM is the least accurate. The ranks of the models with respect to the different input scenarios are RF-2 > RF-1 > RF-3 > RF-7 > RF-4 > RF-6 > RF-5; SVM-1 > SVM-2 > SVM-3 > SVM-7 > SVM-4 > SVM-6 > SVM-5; BoT-2 > BoT-1 > BoT-3 > BoT-7 > BoT-4 > BoT-6 > BoT-5 and BaT-2 > BaT-1 > BaT-7 > BaT-4 > BaT-3 > BaT-6 > BaT-5. Here again, the 1st and 2nd scenarios produce the best estimates, whereas the 5th scenario gives the worst results.
Table 6 gives the performance metrics of the four methods in the test period at Iraq-Iran Border Station. At this station too, RF-2 ranks first, with the lowest MSE (0.020), RMSE (0.143) and MAE (0.055) and the highest R2 (0.998), followed by the RF-1 and BaT-2 models. The BoT and BaT methods again outperform the SVM method. The accuracy ranks of the implemented models are RF-2 > RF-1 > RF-3 > RF-7 > RF-4 > RF-6 > RF-5; SVM-1 > SVM-2 > SVM-3 > SVM-7 > SVM-4 > SVM-6 > SVM-5; BoT-1 > BoT-2 > BoT-3 > BoT-7 > BoT-4 > BoT-6 > BoT-5 and BaT-2 > BaT-1 > BaT-7 > BaT-4 > BaT-3 > BaT-6 > BaT-5. Once more, the 5th scenario provides the worst results, while the 1st and 2nd input scenarios give the best ET0 estimates.
The performance measures of the RF, SVM, BoT and BaT methods for Adhim Station are presented in Table 7. Here too, the RF-2 model has the lowest MSE (0.006), RMSE (0.078) and MAE (0.058), followed by the RF-1 and BaT-2 models. SVM generally produces the worst estimates, while the BaT and BoT methods follow the RF in accuracy for estimating daily ET0. The accuracy ranks of the models are RF-2 > RF-1 > RF-3 > RF-7 > RF-4 > RF-6 > RF-5; SVM-1 > SVM-2 > SVM-3 > SVM-7 > SVM-4 > SVM-6 > SVM-5; BoT-1 > BoT-2 > BoT-3 > BoT-7 > BoT-4 > BoT-6 > BoT-5 and BaT-2 > BaT-4 > BaT-1 > BaT-7 > BaT-3 > BaT-6 > BaT-5. Again, the models using the 1st or 2nd scenario perform best, while the models using the 5th combination have the worst results for all methods. At this station, BoT-1 is more accurate than BoT-2, although the difference is marginal.
Overall, the RF method, especially with the Tmax, Tmin, Tmean and SR inputs, provides the best accuracy in estimating daily ET0 at all stations. It is followed by the BaT and BoT methods, while the SVM has the worst accuracy. In most cases, the 2nd input scenario provides the best accuracy in estimating daily ET0. It is also worth noting that the 7th input scenario, which has only Tmean and SR as inputs, performs better than the 4th, 5th and 6th input scenarios.
Figures 6 and 7 illustrate the time variation and scatter plots of the estimates of the best model (RF-2). It is clear from Figure 6 that the ET0 estimates of RF-2 closely follow the observed values. Figure 7 shows that the fit line of RF-2 overlaps the ideal (1:1) line and that the correlation is high for all stations. These results strongly recommend the RF method for estimating daily ET0.
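The comparison of a fit line with the 1:1 line can be quantified with an ordinary least-squares fit of estimated on observed values: a slope near 1, an intercept near 0 and a correlation near 1 indicate close agreement. The following is a minimal sketch under that assumption; the function name `fit_line` and the sample values are hypothetical and do not come from the station records.

```python
import math


def fit_line(obs, est):
    """Least-squares line est = slope * obs + intercept, plus Pearson r."""
    n = len(obs)
    mean_x = sum(obs) / n
    mean_y = sum(est) / n
    sxx = sum((x - mean_x) ** 2 for x in obs)
    syy = sum((y - mean_y) ** 2 for y in est)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(obs, est))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r


# Illustrative values only, not station data.
observed = [1.0, 2.0, 3.0, 4.0, 5.0]
estimated = [1.1, 1.9, 3.2, 3.9, 5.1]

slope, intercept, r = fit_line(observed, estimated)
```

For these sample values the fitted line has a slope of about 1 and a small positive intercept, which is the pattern described for the RF-2 scatter plots.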