2.1 Study area
The Sari-Neka region, a basin on the Caspian Sea coast in Mazandaran province (35° 56' to 36° 52' N, 52° 43' to 54° 44' E), was selected as the case study (Fig. 1). Geomorphologically, it is divided into mountains, hills, and flat plains and is mainly covered by alluvial sediments. According to the De Martonne method, the climate of Sari-Neka is generally Mediterranean and semi-humid. The region covers an area of about 6938.5 km², of which 977.87 km² is plains, while the rest is highlands (5877.8 km²). Elevation ranges from −27 m at the lowest point to 3836 m at the highest (Nasiri et al. 2021). Annual average rainfall ranged between 400 and 1000 mm from 2000 to 2019, and the wettest months are October and November. The mean maximum and minimum temperatures are 23 °C and 13 °C, respectively.
2.2 Data collection
The following datasets were prepared for this study:
1- Historical and future maximum temperature, minimum temperature, and precipitation data from three CMIP6 models (ACCESS-CM2, HadGEM3-GC31-LL, and NESM3) under the SSP2-4.5 and SSP5-8.5 scenarios, obtained for downscaling from https://climate4impact.eu.
2- Daily observed maximum temperature, minimum temperature, and precipitation, provided by the meteorological organization of Mazandaran province.
3- Groundwater level fluctuations in 68 piezometers, provided by the regional water organization of Mazandaran province.
2.3 Climate models and emissions scenarios
In this study, the outputs of three CMIP6 climate models (ACCESS-CM2, HadGEM3-GC31-LL, and NESM3) were obtained from the database mentioned above, and the data for the region were extracted using ArcGIS 10.8. Precipitation, maximum temperature, and minimum temperature are available for these models, which are listed in Table 1. The SSP2-4.5 and SSP5-8.5 scenarios, known as "Middle of the road" and "Fossil-fueled development: Taking the highway", respectively, were used in this study. Their specifications are summarized in Table 2. These scenarios were chosen for two reasons: (1) SSP2-4.5 and SSP5-8.5 are used to assess vulnerability to climate change and its consequences (Warnatzsch and Reay 2019); (2) SSP1-2.6, which is an update of RCP2.6, is absent in some models, which makes a comparison between the models problematic.
Table 1
List of CMIP6 models that have been used in this study (Priestley et al., 2020)
| Model name | Institution | Resolution (lon × lat) | Variant label | Simulated scenarios |
|---|---|---|---|---|
| ACCESS-CM2 | CSIRO-ARCCSS; Commonwealth Scientific and Industrial Research Organization, and Bureau of Meteorology (Australia) | 192 × 145 | r1i1p1f1 | SSP2-4.5, SSP5-8.5 |
| HadGEM3-GC31-LL | MOHC; Met Office Hadley Center, United Kingdom | 192 × 144 | r1i1p1f1 | SSP2-4.5, SSP5-8.5 |
| NESM3 | NUIST; Nanjing University of Information Science and Technology, China | 192 × 96 | r1i1p1f1 | SSP2-4.5, SSP5-8.5 |
Table 2
Summary of assumptions regarding demographic and human development elements of SSP2 and SSP5 (O’Neill et al., 2017)
| SSP element | SSP2 | SSP5 |
|---|---|---|
| Technology | | |
| Development | Medium, uneven | Rapid |
| Carbon intensity | Medium | High |
| Energy tech change | Some investment in renewables but continued reliance on fossil fuels | Directed toward fossil fuels; alternative sources not actively pursued |
| Economy & lifestyle | | |
| Growth (per capita) | Medium, uneven | High |
| Globalization | Semi-open globalized economy | Strongly globalized, increasingly connected |
| Consumption & diet | Material-intensive consumption, medium meat consumption | Materialism, status consumption, tourism, mobility, meat-rich diets |
| Policies & institutions | | |
| International cooperation | Relatively weak | Effective in pursuit of development goals, more limited for environmental goals |
| Environmental policy | Concern for local pollutants but only moderate success in implementation | Focus on local environment with obvious benefits to well-being, little concern with global problems |
| Policy orientation | Weak focus on sustainability | Toward development, free markets, human capital |
2.4 Downscaling
Presently, GCM outputs are not directly employed in hydrological models because of their coarse resolution and insufficient spatial and temporal certainty (Semenov and Barrow 1997). This study used LARS-WG6. The initial version of the model was introduced by Racsko et al. (1991) to address the shortcomings of the Markov chain, which was frequently used to model precipitation, and it was later upgraded by Semenov and Barrow. As a weather generator, LARS-WG produces daily climatic parameters, such as maximum temperature, minimum temperature, precipitation, and solar radiation, for any period of time according to a set of semi-empirical distributions. It must be noted that this model is not a weather forecasting tool but a means of generating an artificial weather time series that statistically resembles observational data.
The delta change factor (DCF) technique was used to generate the Atmosphere-Ocean General Circulation Model (AOGCM) climate change scenarios. For each month, this method calculates the differences in maximum and minimum temperature and the ratio of precipitation between the prospective and base periods simulated by the model over the study region, according to Eqs. (1)–(3). In this research, the base and prospective periods were taken to be 2000–2019 and 2021–2040, respectively (Semenov and Barrow 1997).
\(\Delta P_{i}=\bar{P}_{GCM,FUT,i}/\bar{P}_{GCM,Base,i}\) (1)
\(\Delta T_{i,Min}=\bar{T}_{Min(GCM,FUT,i)}-\bar{T}_{Min(GCM,Base,i)}\) (2)
\(\Delta T_{i,Max}=\bar{T}_{Max(GCM,FUT,i)}-\bar{T}_{Max(GCM,Base,i)}\) (3)
In the above equations, \(\Delta P_{i}\), \(\Delta T_{i,Min}\), and \(\Delta T_{i,Max}\) represent the climate change scenarios of precipitation, minimum temperature, and maximum temperature, respectively, for month \(i\). In addition, \(\bar{P}_{GCM,FUT,i}\) denotes the 20-year average precipitation simulated by the AOGCM for the prospective period, and \(\bar{P}_{GCM,Base,i}\) is the same for the base period (2000–2019 in this study). The same definitions apply to the minimum and maximum temperature terms.
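As an illustration of Eqs. (1)–(3), the following minimal Python sketch computes monthly change factors from two sets of GCM values. The function name `delta_change_factors`, the dictionary layout, and all numbers are hypothetical and chosen only for illustration; they are not the study's data or the LARS-WG implementation.

```python
from statistics import mean


def delta_change_factors(base, future, kind):
    """Monthly change factors between a base and a prospective GCM period.

    base, future: dict mapping month (1-12) to a list of values,
                  one value per year of the period.
    kind: "ratio" for precipitation (Eq. 1),
          "difference" for minimum or maximum temperature (Eqs. 2-3).
    """
    factors = {}
    for month in sorted(base):
        base_mean = mean(base[month])      # long-term mean, base period
        future_mean = mean(future[month])  # long-term mean, prospective period
        if kind == "ratio":
            factors[month] = future_mean / base_mean
        elif kind == "difference":
            factors[month] = future_mean - base_mean
        else:
            raise ValueError("kind must be 'ratio' or 'difference'")
    return factors


# Hypothetical January values for three years per period (not study data).
precip_base = {1: [50.0, 60.0, 70.0]}    # mm
precip_future = {1: [55.0, 66.0, 77.0]}  # mm
tmax_base = {1: [10.0, 11.0, 12.0]}      # degrees Celsius
tmax_future = {1: [11.5, 12.5, 13.5]}    # degrees Celsius

delta_p = delta_change_factors(precip_base, precip_future, "ratio")
delta_tmax = delta_change_factors(tmax_base, tmax_future, "difference")
print(delta_p, delta_tmax)
```

Precipitation is treated as a ratio and temperature as a difference, which mirrors the form of the three equations.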
Two files were created to generate climatic data in the LARS-WG model and to downscale the GCM data for future periods. The first file describes past climatic behavior, while the second contains the climate change scenarios. The model was first calibrated and then verified using statistical tests and a graphical comparison.
2.5 Clustering of observation wells
Clustering is an unsupervised learning technique in which samples are grouped according to the similarity of their features. A common clustering method is K-means, introduced by MacQueen (1967). In this method, the number of clusters is predetermined. The number of clusters was validated using the common Elbow index, which is based on Eq. (4) (Brusco and Steinley 2007). Because the basin contains 68 piezometers, clustering was performed to avoid over-complicating the model. The geographical coordinates and groundwater levels of the piezometers were used for clustering.
\(WCSS=\sum_{k=1}^{K}\sum_{i\in C_{k}}\sum_{v=1}^{V}\left(x_{iv}-\bar{x}_{vk}\right)^{2}\) (4)
where \(C_{k}\) is the set of observations in the \(k\)th cluster and \(\bar{x}_{vk}\) is the mean of variable \(v\) in cluster \(k\). The Within-Cluster Sum of Squares (WCSS) is the sum of squared distances between each point and the centroid of its cluster, and its value depends directly on the number of clusters. In the elbow plot, the vertical axis represents WCSS, while the horizontal axis represents the number of clusters. The number of clusters K starts at 1 and is increased until the value of WCSS remains almost constant. It is worth noting that WCSS is largest at K = 1.
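The elbow procedure can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not the implementation used in the study: the `kmeans` and `wcss` helpers, the random initialization, and the six wells with (x, y, groundwater level) features are all hypothetical, and the features are used without scaling for brevity.

```python
import random


def squared_distance(p, q):
    """Squared Euclidean distance summed over all V variables."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, iterations=100, seed=0):
    """Basic K-means; returns the final centroids."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # k distinct observations as starting centroids
    for _ in range(iterations):
        # Assignment step: each point joins its nearest centroid.
        clusters = [[] for _ in range(k)]
        for p in points:
            distances = [squared_distance(p, c) for c in centroids]
            clusters[distances.index(min(distances))].append(p)
        # Update step: each centroid becomes the mean of its cluster
        # (an empty cluster keeps its previous centroid).
        updated = [
            tuple(sum(col) / len(cluster) for col in zip(*cluster)) if cluster else old
            for cluster, old in zip(clusters, centroids)
        ]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def wcss(points, centroids):
    """Sum of squared distances from each point to its nearest centroid.

    This matches Eq. (4) once K-means has converged.
    """
    return sum(min(squared_distance(p, c) for c in centroids) for p in points)


# Hypothetical wells: (x in km, y in km, groundwater level in m).
wells = [
    (0.0, 0.0, 0.0), (0.0, 1.0, 0.0), (1.0, 0.0, 0.0),
    (10.0, 10.0, 5.0), (10.0, 11.0, 5.0), (11.0, 10.0, 5.0),
]

wcss_by_k = {k: wcss(wells, kmeans(wells, k)) for k in range(1, len(wells) + 1)}
for k, value in wcss_by_k.items():
    print(k, round(value, 3))
```

Plotting `wcss_by_k` against K gives the elbow curve; the chosen K is the point beyond which WCSS stops decreasing noticeably.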
2.6 Artificial neural network (ANN)
Artificial neural networks (ANNs) are inspired by the human nervous system. In hydrological contexts, these heuristics are particularly appropriate for predicting and forecasting variables because they are capable of modeling nonlinear, nonstationary, and non-Gaussian processes. NNs train themselves to recognize patterns in data so that they can forecast the output of a future set of similar data. NNs are generally divided into three types of layers. The input layer receives the data, the output layer contains the forecast information, and the middle (hidden) layers perform the necessary calculations (Maier and Dandy 1997; Daliakopoulos et al. 2005; Moghaddam et al. 2019). The inputs are multiplied by synaptic weights and delivered to the first hidden layer. In the hidden units, the weighted sum of inputs is transformed by a nonlinear activation function (Taormina et al. 2012).
NNs are either feedforward or recurrent. The present research used feedforward NNs with the sigmoid activation function, and the Levenberg-Marquardt algorithm was employed to train the NN (Daliakopoulos et al. 2005; Derbela and Nouiri 2020). In feedforward networks, information flows in one direction only, from the input layer to the output layer. The numbers of input and output neurons are determined by the numbers of input and output variables, while the number of hidden neurons is determined by trial and error. The hidden layer links the input and output layers. Using this layer, the NN can extract nonlinear relationships from the data input to the model.
Training aims to reach a state in which the network responds correctly both to the training data and to similar data not used in training. NN learning is either supervised or unsupervised; the former was used in the present study. In this approach, training is performed using pairs of sample vectors, such that a specific output vector is assigned to each input vector. As these pairs are presented to the network, the weights are corrected according to the learning algorithm. This research used minimum temperature, maximum temperature, and precipitation in the current month, together with the groundwater level in the previous month, as the input data and the groundwater level in the current month as the output data (Coppola et al. 2005; Chitsazan et al. 2015; Moghaddam et al. 2019).
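The forward pass of such a network can be illustrated with a short Python sketch. It assumes a single hidden layer with sigmoid units and a linear output unit; the output activation, the two hidden neurons, and all weights and input values are hypothetical choices made for illustration, and training by the Levenberg-Marquardt algorithm is not shown.

```python
import math


def sigmoid(z):
    """Logistic sigmoid activation."""
    return 1.0 / (1.0 + math.exp(-z))


def forward(inputs, hidden_weights, hidden_biases, output_weights, output_bias):
    """One forward pass through a single-hidden-layer feedforward network.

    hidden_weights holds one list of synaptic weights per hidden neuron.
    The output unit is linear, which is an assumption of this sketch.
    """
    hidden = [
        sigmoid(sum(w * x for w, x in zip(weights, inputs)) + bias)
        for weights, bias in zip(hidden_weights, hidden_biases)
    ]
    return sum(w * h for w, h in zip(output_weights, hidden)) + output_bias


# Hypothetical normalized inputs: Tmin, Tmax, precipitation, previous groundwater level.
x = [0.2, 0.6, 0.1, 0.5]
# Hypothetical weights for two hidden neurons (untrained, for illustration only).
hidden_weights = [[0.4, -0.3, 0.8, 0.1], [-0.5, 0.2, 0.3, 0.9]]
hidden_biases = [0.0, 0.1]
output_weights = [0.7, -0.2]
output_bias = 0.05

estimate = forward(x, hidden_weights, hidden_biases, output_weights, output_bias)
print(estimate)
```

In practice the number of hidden neurons would be varied by trial and error, as described above, and the weights would come from training rather than being set by hand.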
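The input and output pairs described above can be assembled from monthly series as in the following sketch. The function name `build_training_pairs` and the sample values are hypothetical; the sketch only shows how the one-month lag of the groundwater level enters the input vector.

```python
def build_training_pairs(tmin, tmax, precip, gwl):
    """Build supervised pairs from aligned monthly series.

    Input vector for month t:  (Tmin_t, Tmax_t, P_t, GWL_{t-1})
    Target for month t:        GWL_t
    The first month is dropped because it has no previous groundwater level.
    """
    n = len(gwl)
    if not (len(tmin) == len(tmax) == len(precip) == n):
        raise ValueError("all series must have the same length")
    inputs, targets = [], []
    for t in range(1, n):
        inputs.append((tmin[t], tmax[t], precip[t], gwl[t - 1]))
        targets.append(gwl[t])
    return inputs, targets


# Hypothetical three-month series (not study data).
tmin = [5.0, 6.0, 9.0]        # degrees Celsius
tmax = [12.0, 14.0, 18.0]     # degrees Celsius
precip = [80.0, 60.0, 45.0]   # mm
gwl = [10.2, 10.4, 10.1]      # m

inputs, targets = build_training_pairs(tmin, tmax, precip, gwl)
print(inputs)
print(targets)
```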
2.7 Performance criteria
The error between observed and predicted values was assessed using the correlation coefficient (r), the coefficient of determination (R²), the Mean Absolute Error (MAE), and the Root Mean Squared Error (RMSE).
\(r=\frac{n\sum_{i=1}^{n}O_{i}P_{i}-\left(\sum_{i=1}^{n}O_{i}\right)\left(\sum_{i=1}^{n}P_{i}\right)}{\sqrt{\left(n\sum_{i=1}^{n}O_{i}^{2}-\left(\sum_{i=1}^{n}O_{i}\right)^{2}\right)\left(n\sum_{i=1}^{n}P_{i}^{2}-\left(\sum_{i=1}^{n}P_{i}\right)^{2}\right)}}\) (5)
\(R^{2}=\frac{\left[\sum_{i=1}^{n}\left(O_{i}-\bar{O}\right)\left(P_{i}-\bar{P}\right)\right]^{2}}{\sum_{i=1}^{n}\left(O_{i}-\bar{O}\right)^{2}\sum_{i=1}^{n}\left(P_{i}-\bar{P}\right)^{2}}\) (6)
\(RMSE=\sqrt{\frac{\sum_{i=1}^{n}\left(P_{i}-O_{i}\right)^{2}}{n}}\) (7)
\(MAE=\frac{\sum_{i=1}^{n}\left|P_{i}-O_{i}\right|}{n}\) (8)
where n is the total number of data points, \(P_{i}\) and \(O_{i}\) are the predicted and observed values, respectively, and \(\bar{O}\) and \(\bar{P}\) are the means of the observed and predicted values.
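For reference, the four criteria can be computed with a few lines of Python. This is a minimal sketch with a hypothetical function name and hypothetical data; r is computed in its mean-centered form, which is algebraically equivalent to the raw-sum form of Eq. (5).

```python
import math


def performance_metrics(observed, predicted):
    """Return r, R2, RMSE, and MAE for paired observed and predicted values."""
    n = len(observed)
    if n == 0 or n != len(predicted):
        raise ValueError("series must be non-empty and of equal length")
    mean_o = sum(observed) / n
    mean_p = sum(predicted) / n
    covariance = sum((o - mean_o) * (p - mean_p) for o, p in zip(observed, predicted))
    ss_o = sum((o - mean_o) ** 2 for o in observed)
    ss_p = sum((p - mean_p) ** 2 for p in predicted)
    r = covariance / math.sqrt(ss_o * ss_p)  # Eq. (5), centered form
    r2 = covariance ** 2 / (ss_o * ss_p)     # Eq. (6)
    rmse = math.sqrt(sum((p - o) ** 2 for o, p in zip(observed, predicted)) / n)  # Eq. (7)
    mae = sum(abs(p - o) for o, p in zip(observed, predicted)) / n                # Eq. (8)
    return {"r": r, "R2": r2, "RMSE": rmse, "MAE": mae}


# Hypothetical groundwater levels in m (not study data).
observed = [1.0, 2.0, 3.0, 4.0]
predicted = [2.0, 3.0, 4.0, 5.0]
metrics = performance_metrics(observed, predicted)
print(metrics)
```

The example uses predictions that are offset from the observations by a constant, which shows that r and R² can be perfect while RMSE and MAE still report a systematic error.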