PROJECTING ARIDITY FROM STATISTICALLY DOWNSCALED AND BIAS CORRECTED VARIABLES FOR GEDIZ BASIN/TURKEY

In recent years, significant temperature increase all across of the world draws attention and climate regimes are shifting in different parts of the world. By the reason of these climatological changes, a study was conducted in Gediz Basin/Turkey, where agricultural production has an important place. In the study prepared, twelve GCMs were utilized under RCP4.5, RCP6.0 and RCP8.5 scenarios of 5th Assesment Report of IPCC for the period 2015-2050. The statistical downscaling methods were employed and the projections were derived right after applying weighted-averaged ensemble mean by Bayesian Model Averaging method and bias correction by equidistant quantile mapping method. The temperature-based potential evapotranspiration formulas were modified in accordance with Penman-Monteith method and the aridity indexes were calculated by UNEP (1992). According to the projections, the mean annual temperature increases between 1.5℃ and 2.2℃ and mean total annual potential evapotranspiration increases between 5% and 8% are foreseen in Gediz Basin for the near future. It is foreseen that semi-arid climate regime may predominate over the region for all of the RCP scenarios under the increasing dryness in basin climate. In addition, it was obtained in the study that sub-humid climate state occurance for all of the regions included by the basin may be unexpected in the future for ORCID RCP8.5 scenario. The presence of semi-arid climate conditions may be more potent with the increasing trend of radiative forcing over time.


Introduction
Earth climate is reacting to human activities conspicuously and becoming the world's most important issue to be dealt with by all countries. Climate patterns are changing day by day in consequence of anthropogenic effects which are originated from especially energy requirements. These requirements cause the greenhouse gas (GHG) emmissions, hence, the temperature is gradually increasing in various parts of the earth. Moreover, there are several up-to-date researches which capture the increasing temperature. Additionally, National Oceanic and Atmospheric Administration (2019) released that the increasing temperature trend kept going in 2018 and it was ranked as fourth warmest year among the annual records. In the light of these notifications, especially the peak temperature events recently experienced consecutively across the world, led the authors to investigate how the climate regime might change in the future.
In the climate change studies, the mathematical models which capture the Earth's physical climate system, the general circulation models (GCMs), are the most frequently utilized tools to predict the past and future climate. The GCM data with coarse resolution are not enough to represent the local climate of a specific region, hence, the downscaling techniques are highly recommended by climate scientist so as to derive precise estimates. It is important to monitor climate change so that the relevant authorities may take measures against natural hazards such as droughts, floods, tornadoes, and etc. which are induced by unbalanced climatological shifts.
One of them, droughts, can be defined as a normal climatological event which can occur in every type of climate regime, however, it can be more severe in arid regions (Smakhtin and Schipper, 2008). In the literature, those ideas can sometimes intermingle with one another, drought and aridity. Drought is considered as a notion of temporary events, this is why it is called as a natural hazard but undergone more slowly than others, whereas aridity is a longterm climate condition which generally specifies the climate state of a region. Hence, the reason of aridity is significant within the scope of climate change (Maliva and Missimer, 2012). The aridity can be indexed by means of several methods which are mainly the functions of precipitation (P), temperature (T) and temperature-related potential evapotranspiration (PET).
In the related literature, it is possible to encounter many downscaling implementations relevant to P and T variables. However, the number of studies concentrating on the derivation of the aridity indices from the downscaled P and T projections is rather limited (e.g  In this study, it is desired to investigate possible future AIs over Gediz Basin, an important basin where agriculture has the biggest allocation in the production sector. The up-to-date climate models derived under IPCC's last assesment report, AR5, was taken into account. Three scenarios related to future GHG stabilization named as Representative Concentration Pathways (RCPs) were utilized. Twelve GCMs under two mid-stabilization scenarios RCP4.5, RCP6.0 and high-stabilization scenario RCP8.5 were employed. A novel methodological strategy was applied by means of downscaling/multi model ensemble/bias correction strategy which will be discussed in Methodology section and the projections were obtained for the period 2015-2050.
Projections which were prepared for this study has three processes: a) Simulation of precipitation and temperature, b) simulation of potential evapotranspiration and c) simulation of AI for the term 2015-2050.
In fact, any aridity projection work under RCP scenarios has not been exercised for the basin of Turkey. This is one of the outstanding aspects of the paper. Another novel aspect is the development of a temperature-based PET approach that simulates the Penman-Monteith predictions, and thus converting the RCP scenario-based T projections into PET and aridity projections. The proposed methodological strategy is thought to provide a practical way for researchers interested in the subject. In remaining part of the paper, the results which were achieved under different RCP scenarios were evaluated and finally, conclusion part with a thorough discussion was given.

Study area and data
The study area is Gediz Basin which is located in the western part of Turkey. Gediz Basin is named by the Gediz River with 401 km long river reach and having drainage area of 17125 km 2 . The river rises from Murat Mountain of Kutahya region in the east and flows into Aegean Sea in the west. The study area has a typical Mediterrenean climate. The most of the population existing in the basin lives off agricultural sector. Agricultural productions are made in irrigated fields with the area of 110000 ha which extends from south to west of the basin. In the study, the data of 20 meteorological stations which observe both P and T were utilized for the time period between January 1980 and December 2005 and the data archive are provided from Turkish Meteorological Service (TMS) (Table 1, Figure 1). According to the data, the mean areal P is observed as 550 mm per year and mean annual T is observed as 13℃.
In the climate change studies, the reanalysis data which represent the atmospheric, oceanic and land surface state of the past time are utilized frequently. In this work, ERA-Interim reanalysis data were used so as to determine predictor variables of the Gediz Basin T and utilized as inputs in downscaling applications. The reanalysis data set covers Gediz Basin between 37.875 o -39.375 o N latitudes and 26.625 o -29.625 o E longitudes with the resolution of 0.75 ○ in both axis.
The selected time period of the reanalysis data is same as T data obtained from TMS.
In this study, the climate scenarios referred to in IPCC's last assessment report (AR5) was taken into account. In AR5, Representative Concentration Pathways (RCPs) in which the radiative forcing forecasts are made for 21st century and after are constituted by means of different emmission scenarios. In the work, the mid-stabilization pathways RCP4.5 and RCP6.0 and high stabilization pathway RCP8.5 scenarios were taken into consideration for this study. The optimistic scenario RCP2.6 where it is foreseen radiative force of 2.6 W/m 2 by the end of 2100 was not evaluated because it is desired to seek the climate in the conditions of increasing radiative forcing and greenhouse gas emmissions. The Coupled Model of Intercomparison Phase 5 (CMIP5) data set of twelve GCMs were utilized for the related RCP scenarios. Though more than 30 GCMs were present in the CMIP5 archive, some GCMs were eliminated through minding the compatibility between the variables in the reanalysis and GCM data sets. Anandhi et al. (2008) and Okkan and Inan (2015) recommended the usage of Kendall's Tau nonparametric rank correlation statistics to detect the relationship between reanalysis predictors and historical scenario results of GCMs, and uttered that this statistics has expediencies comparable with other indices. In the study, the GCMs, in which the rank correlation statistics were less than or equal to 0.3, were ruled out and as a result, it was decided to utilize 12 GCMs that were also listed in Okkan and Kirdemir (2016). The computational details about the GCM based rank indicators were not shared in the paper due to the limitation of word number.
Although it was possible to access the data of whole of 21st century, the projection period was decided as the years between 2015 and 2050 such that it was considered that predictions are less uncertain for the near future. Moreover, in addition to data of RCP scenarios, historical period data were utilized as the reference scenario (REF) which represents past climate over the region. By the reason of reference period for all of GCMs is commonly provided as in between 1980 and 2005, the same time period was chosen for both ERA-Interim reanalysis and observed TMS data. In GCMs, the atmospheric variables at different atmospheric levels are selected same as those of ERA-Interim reanalysis data set so that selected predictors via reanalysis data could be used analogously at the stage of downscaling GCMs. The related atmospheric variables exist in both reanalysis data and GCMs are mean air temperature (air), sea level pressure (slp), large-scale precipitation (pr) at surface level, mean air temperature and geopotential height at atmospheric levels of 200 hPa, 500hPa and 850 hPa (air200, air500, air850, hgt200, hgt500, hgt850) and relative humidity at atmospheric levels of 500 hPa and 850 hPa (rhum500, rhum850).
The purpose of the study is to determine the AIs of Gediz Basin in the future. The methodology set up for this purpose requires projection of the P for the related time period and these data were extracted from Okkan and Kirdemir (2016) (hereinafter referred to as OK16). In OK16, the projection of P over Gediz Basin was derived for RCP4.5, RCP6.0 and RCP8.5 scenarios.
The information about precipation projections prepared over Gediz Basin will be given in Results section. Moreover, the details of GCMs used in this study can be accessed from OK16.

Downscaling of GCM data
General circulation models that are relatively coarse in terms of resolution are not efficient to capture the meteorological effects of climate change in the station scale. For this reason, high-resolution results are needed to interpret the effect of coarse-resolution atmospheric models in the station scale. Due to this requirement, it is possible to achieve a data set that more reliably represents the climatic characteristics of the study area by downscaling the coarse-resolution GCM data to the local scale using the downscaling method.
The downscaling models are divided into two as dynamical and statistical. Among them, the dynamical downscaling method involves applications based on physical climate models operating at regional scale. These models can be employed at higher resolutions, which also reflect topography features. Thus, orographic precipitation, extreme processes, and climatic anomalies in regional measurements can be taken into account with regional climate models.
However, these models require too much input, contain too many parameters, and use coarse output of global-scaled models as initial-boundary conditions. With uncertainties from globalscaled models and high quantity in their own degrees of freedom, bias and uncertainty in predictions are increasing. In this respect, the implementation, setup and performance validation of these models are troublesome, which limits the applicability of these models. compared and it was concluded that it was more logical to use statistical downscaling models as it needs less computational effort although both models had close results to each other. By virtue of these reasons, statistical downscaling methods were preferred to dynamical downscaling methods for the projections in this study. However, it should be noted that a qualified statistical downscaling model requires that various assumptions are met. These assumptions, which were earlier mentioned by Schoof (2013), can be briefly listed as follows: a. There is a robust relationship between the predictors and the predictand (i.e., the variable being downscaled).
b. The predictors are sufficiently simulated by the GCMs.
c. The predictors incorporate the signal pertaining to climatic change.
d. The relationship between the predictors and predictand show stationary behavior.
The readers can look through the study of Schoof (2013) for the caveats of statistical downscaling methods and the detailed explanations about the assumptions listed above.
Artificial neural networks (ANN) and least squares support vector machines (LSSVM) methods were used for statistical downscaling to derive T projections. P projections were already obtained from OK16, hence it will be elucidated how to get T projections in this study. In both models, re-analysis variables were introduced as model inputs so that coarse-resolution GCM data could be downscaled to station scale. During the application of the downscaling models, it is aimed to define predictor data sets representing the observed T in the study area in order to reduce the computational cost. To apply this, all possible regressions (APREG) method was used and 12 ERA-Interim re-analysis variables as predictors and T observations of 20 meteorological stations as predictands were utilized in the related method. For the predictor selection, 2 12 -2 = 4094 regression models were established and the most suitable data set to be used as input in the downscaling model was selected according to the two performance criteria: one of them is root mean square error which indicates a derivation of average error stemmed from each submodels and the other one is Mallow's Cp that its relatively smaller values represents the smaller amount of bias in the predicted responses yielded by submodels. It was obtained when APREG method was employed that the regression model with only large-scale air temperature (air) variable showed a high correlated result with observed T variable for each meteorological station (approximately represents 99% of dependent variable variance for each station). As a result, it is sufficient to use the air variable of the re-analysis data as the T predictor of the stations. As with this study, only prate variable was selected as potential predictor of P in OK16¸ so it can be evaluated that prate and air variables of ERA-Interim data set are prominent variables in representing of P and T for Gediz Basin.
In the employing of ANN model, z standardization were applied for time series and Levenberg-Marquardt algorithm were used for optimizing network's weights. In the employing of LSSVM model, the same standardization procedure were applied and, regularization parameter and radial basis function width of the model were calibrated.
Performance of downscaling models were evaluated by means of statistical performance criteria such as determination coefficient (R 2 ), RMSE, RSR which is proportion of RMSE to standard deviation of observed data and percent of bias (PBIAS) which is the metric of overestimation

Multi-model ensemble and bias correction strategy
When the climate models are taken into consideration, it can be stated that they have effective where p(fk|D) is the posterior probability of k th model forecast for given observed data and defined as weight of the k th model (wk). Sum of the related model weights adds up to 1. The conditional PDF pk(y|fk, D) is assumed to be normally distrubuted and centered at linear regression function of ak + bk.fk whose response variables are the observed ones. Expected value of BMA forecasts are given in Eq. (2).

Raftery et al. (2005) recommended maximization of log-likelihood function as objective
function to define the parameters which are weights (wk) and variances of model forecasts (σk 2 ).
The log-likelihood function is where g(y|fk, σk) is the conditional PDF of y for given model forecasts and predicted variance.
The term g(y|fk, σk) is similar to pk(y|fk, D) such that it is normally distributed as N(ak + bk.fk , σk where f(x;θscn,i) is the PDF of corresponding meteorological variable for given distribution parameters. Also, F -1 (.) represents the inverse CDF and θhistorical,i and θobserved,i denotes the value of the distribution parameters of corresponding meteorological variable for the historical scenario and observed data, respectively.
The above-mentioned corresponding meteorological variable is T for this study. The bias corrected time series of P was already obtained from OK16. EDQM method requires fitting distribution functions for the time series, hence, the probability plot-correlation coefficient (PPCC) tests were applied to observed and downscaled T time series. As a result of the PPCC tests, it was determined that T time series were fitted to normal distribution, so the distribution parameters were mean and standard deviation for this application. For the sake of brevity, the results of the PPCC tests will not be given in the study.

Simulation of potential evapotranspiration and aridity index
In the literature, there are several methods used for determination of the AI. Most of them determine the AI by using P, PET or T data. One of them is P and T data user Köppen and  (2007)). In the decision phase of model selection, test results of better performing models were selected and utilized in the next process.

Multi-model ensemble, bias correction and ultimate results
In the downscaling application, the raw GCM data were integrated with calibrated and validated downscaling models which were selected individually for each station and it was obtained that statistical behavior of T series belonging to 12 GCMs were different from each other. As stated before, this situation entails uncertainty in model selection and puts into trouble in decision.
Hence, BMA method was utilized to produce weighted-average projections in which 12 GCMs were combined. For each station, observed data and historical projections of 12 GCMs were utilized and the time series were divided into wet and dry periods. In this process, each GCM had its own weight for wet and dry periods respectively. At the end, the expected values of multi-model ensembles for related wet and dry periods were derived and the long term monthly time series were reproduced by combining wet and dry periods. Same weights were used for RCP4.5, RCP6.0 and RCP8.5 scenarios respectively. The related weights were shown in Table   4 for each meteorological station. According to  (Figure 2a and b). In order to capture the climatological conditions of the study area, the model biases were corrected and it was obtained that the corrected historical T series were more compatible to the observed series ( Figure 2c). Similar biased results were produced for other related meteorological stations and the related bias correction procedure were applied for all of them.
Upon evaluating the T projections for all meteorological stations, it is foreseen that mean annual T changes between +1.26℃ and +2.80℃ may occur over different regions of Gediz Basin. The  it has not been identified how uncertainties are propagated as the stages proceed in produced projections.

Possible changes in potential evapotranspiration
The similar assessments for T projections were made for PET forecasts which were projected  Table 5. Then, the T-based PET function consisting of these coefficients mentioned in Table 5  As is the case with T projections, the PET changes were evaluated in terms of time periods for the entire basin. The mean total annual PET which is 1141.8 mm in REF, is expected to increase 5.5%, 5.6% and 7.7% for RCP4.5, RCP6.0 and RCP8.5, respectively. For the seasonal periods, the mean total PET in autumn season whose REF value is 219 mm is expected to have maximum increase as 11.7%, 12.1% and 15.2% for RCP4.5, RCP6.0 and RCP8.5, respectively. The increase rate of 5.6% in winter and summer seasons and 0.81% in spring season for RCP4.5, the increase rate of 6.8%, 1%, 5.4% and 9%, 2.6%, 7.2% for RCP6.0 and RCP8.5 scenarios of winter, spring and summer seasons, respectively are foreseen in the scenario period. The maximum increase rate of PET is expected to occur on November for all of RCPs as in between 25% and 31% (see Figure 4). In order to analyze whether the possible PET changes are statistically significant, two sample t-tests were applied (the significance level is .05) as tested in the phase of T simulation. According to the t-tests, possible annual changes for all RCPs are statistically significant. The changes of winter, summer and autumn seasons are expected to be as statistically significant for all RCPs but only in spring season, PET is foreseen not to deviate significantly throughout the basin. The statistically significant changes are expected for the months between May and December in all RCPs, for April in RCP8.5 scenario. It is not predicted any statistically significant changes for the rest of the months.

Future aridity index
In the study, the PET simulations for 2015-2050 scenario period were derived for estimating future AI. As known, UNEP AI is the function of mean annual P and mean annual PET. Prior giving results about AI, it will be informed briefly from OK16 about the mean annual P simulations over Gediz Basin. According to data utilized from OK16; it is predicted 8.2%, Due to the fact that it is expected in the future that there is no region where mean annual P may increase, mean annual T may decrease and mean annual PET may decrease, the AI over Gediz Basin is expected to tend to decrease in the scenario period. In

Discussion and Conclusions
In this work, a projection algorithm was set up to analyze how AI over Gediz Basin may change until the end of the mid-21st century. In order to do this implementation, a statistical downscaling procedure was applied so that coarse-resolution twelve GCMs data of three RCPs Individual GCMs have uncertainty originated from different aspects such as initial and boundary condition uncertainities, problems in theoretical definition of the global climate system and parameter uncertainties (Knutti, 2008 of the CV is the indicator of larger variability. Instead of checking S/N ratios, inter-annual CV statistics for T, P and AI were examined. Since the study is on the subject of projected aridity, it was investigated the extent to which the variability in the AI was affected by the variations in P and T. At this stage, the inter-annual CV values calculated for the AI were analyzed by means of scattering diagrams (not presented here) in response to that of P and T, respectively.
The obtained results demonstrated that inter-annual variability in the AI is more directed by P variability. Although it is monitored that the increase in inter-annual variability of T has an inverse effect on AI's variability, the correlations between them are not statistically significant.
The main reason for this is that the trend in annual average T indicates a significant increase during the existed period, while the inter-annual irregularities in the P point out both notrending and the large CV. In our future works, we have in view to deal the methodologies, which were effectively applied by Tabari and Willems (2018b), to decompose total uncertainty into uncertainty originated from various stages of climate change projections.
Another noticable finding obtained in the study is that the results of RCP4.5 and RCP6.0 are very close to each other. As known, PET is the function of T and AI is that of P and PET.
Hence, it was decided by the authors to check out how raw GCM data of P and T vary for RCP4.5 and RCP6.0 scenarios. When examined the raw data of twelve GCM data, the similar close values draw attention for the scenario period. Overall, approximately the difference of -18 mm which corresponds to 0.035% of the raw historical data in mean annual P and of +0.1℃ in mean annual T are projected by the climate models in the grids of stuyd area. The range increases between the years 2051-2100 with the difference of -20 mm (0.04% of raw historical data) and +0.4℃ are projected in mean annual P and mean annual T, respectively. When the emission scenarios of AR5 is taken into consideration, the difference of greenhouse gas emmision between the scenarios RCP4. should be made about the how the agricultural productivity may change in the future with repsect to aridity for the sustainability of agricultural sector.

Funding
The research leading to these results received funding from The Scientific and Technological