Examining countries in terms of water resources under the effect of climate change using time series clustering

Today, the problem of climate change is addressed in many ways. The most important effect of climate change is the disruption of the water cycle; as a result, the location and timing of the world's water resources are changing. Water is an indispensable resource that connects all living things and directly affects their lives. It is not only a biological requirement for humans but also essential to economic, social, and cultural life. However, this vital resource exists in only a limited amount on earth. This study aims to identify countries that are similar in terms of water resources by taking into account the changes in precipitation amounts that accompany climate change. For this purpose, a time series clustering analysis was applied to the precipitation amounts of 21 countries for the period 2009-2017 using the hierarchical clustering and Partitioning Around Medoids (PAM) methods. The countries were divided into two clusters and evaluated according to their water risk levels and populations. Countries with low precipitation, high water risk, and a high population should take urgent measures against climate change.


Introduction
Water has been a very important natural resource for all civilizations for centuries (Akkaya, Efeoğlu, Yeşil, 2006;195). With the advancement of technology, water use has increased.
Water resources are used for many purposes, such as drinking, irrigation, and energy production. Therefore, water plays an indispensable role in the economic development of countries.
The earth has a surface area of 510 million km² and holds a total of 1.2 billion km³ of water.
Approximately 97.5% of this water is saltwater in the seas and oceans. The remaining 2.5% is freshwater. However, only a very small portion of this freshwater can be used: 79% of freshwater (2.39% of all water) is in glaciers, 20% (0.6% of all water) is groundwater, and 1% (0.03% of all water) is surface and atmospheric water (Bahadır, 2011).
This shows that the accessible freshwater in the world's water resources is very small and even insufficient. Approximately 1.5 billion people in the world lack sufficient drinking water, 2.5 billion people are in need of healthy water, and about 7 million people die from water-related diseases annually (Kanber et al., 2010).
Today, the three main threats to water resources around the world are population growth and urbanization, our consumption habits, and climate change. Because climate change alters the hydrological cycle, the management and distribution of water resources at the local, regional, and global levels are becoming more and more critical. Global climate change affects the hydrological system, namely the cycling of water through evaporation and precipitation, and can cause floods in some places and drought in others.
For this reason, unprecedented changes in the spatial pattern and temporal behavior of water resources are observed in different parts of the world (Şen, 2005). The availability of water is significantly affected, which makes access to water difficult in terms of both time and location.
In addition, the world population is growing rapidly. Over the next 40 years, an additional 2.5 billion people are expected to be added to the world population (Godfray et al., 2010).
Therefore, natural resources are being depleted day by day. In addition to the increasing population, rising incomes and consumption levels, as well as increasing demand for food products, can create additional pressure on water resources (Faures et al., 2007).
Economic development, on the one hand, increases the demand for water and, on the other, threatens water reserves that seem to be approaching their limits.
The rapidly increasing water demand in the twenty-first century, together with the gradual decrease of usable water resources as a result of global warming and misuse, has brought water to the forefront of the international agenda (Kılıç, 2008). The Global Risk Report 2014, prepared for the World Economic Forum held in Davos, lists a possible water crisis among the top three risks of concern for the world economy. Although the effects on water resources are generally experienced locally, water security is now defined as a global issue.

Literature Review
Many studies have examined the impact of climate change on hydrology arising from human and environmental water use, focusing on the water cycle and water resources. Some of these are given below. Abadi et al. (2017) modeled the dynamics of water consumption behavior and predicted future consumption behavior with a new predictive approach based on non-homogeneous Markov models. This predictive classification method was applied to a real dataset provided by a water supply enterprise in France, and the results show that it can help water supply enterprises to manage water resources better and respond to consumer needs. According to the results obtained, there is high variability between these regions in terms of water resources. Sohoulande et al. (2019) defined precipitation regions for an area covering South Carolina, North Carolina, and Georgia using a spatial regionalization approach. A total of 208 precipitation stations were selected in the study. For the data set, the time series of seasonal precipitation totals and the seasonal numbers of precipitation events greater than 5 mm during the period 1960-2017 were obtained. A regionalization method combining principal components and cluster analysis was applied to the data, and three precipitation regions were defined according to statistics and similarity criteria. Pathak and Dodamani (2019) evaluated regional groundwater drought characteristics in the Ghataprabha river basin and focused on understanding trends in groundwater levels. They classified the wells by cluster analysis using long-term monthly groundwater levels. The results supply valuable information about the long-term behavior of regional groundwater levels and help to create an effective groundwater management strategy for upcoming droughts. Fukushima et al. (2019) examined the long-term trends and interannual variations (IAVs) of seasonal precipitation and their relationship to atmospheric circulation during two different periods in India. Using hierarchical cluster analysis, they identified homogeneous regions based on pentad precipitation seasonality. They focused on the relationship between the regional characteristics of precipitation and the large-scale circulation system.
A review of existing works in the literature implies that time-series clustering essentially has four components: dimensionality reduction or representation method, distance measurement, clustering algorithm, prototype definition, and evaluation. Xiong and Yeung (2004) proposed a model-based approach using mixtures of autoregressive moving average (ARMA) models to cluster data represented as time series. They used the Bayesian information criterion (BIC) for model selection and to determine the number of clusters. Alonso et al. (2006) proposed a new clustering method for time series based on the full probability density of the forecasts. First, a resampling method combined with a nonparametric kernel estimator provides estimates of the forecast densities. A measure of discrepancy is then defined between these estimates, and the resulting dissimilarity matrix is used to carry out the cluster analysis. Applications of this method to both simulated and real-life data sets were discussed. Zhang et al. (2006) recommended an unsupervised feature extraction algorithm that selects the feature dimensionality by balancing two conflicting requirements, i.e., lower dimensionality and a lower sum of squared errors between the features and the original time series. Experimental results were obtained on several synthetic and real-world time series datasets. Aghabozorgi et al. (2015) reviewed the clustering of time series data. Their study presented the trend of improvement in the efficiency, quality, and complexity of time series clustering approaches, as well as new directions for future studies. Ferreira and Zhao (2016) recommended a method for time series clustering using community detection in complex networks. First, they presented a technique to transform a set of time series into a network using different distance functions. Then, they applied community detection algorithms to identify groups of strongly connected vertices and thereby identify time series clusters. Li and Prakash (2011) proposed a method of time series clustering that can learn joint temporal dynamics in the data, handle time lags, and produce interpretable features. They achieved this by developing complex-valued linear dynamical systems, which include real-valued Kalman filters as a special case.
Rinderer et al. (2019) presented a data-driven approach composed of time series clustering and topography-based upscaling of shallow, perched groundwater dynamics, using groundwater data from 51 monitoring sites in a 20-ha pre-alpine headwater catchment in Switzerland. Mutti et al. (2020) proposed a comprehensive approach for characterizing precipitation climatology in semiarid watersheds, using monthly precipitation time series with up to 30% gaps measured at 56 rain gauges in the Piranhas watershed in the semi-arid region of Brazil. Using principal component analysis and cluster analysis, they identified two homogeneous precipitation sub-regions in the basin: C1 in the upper part and C2 in the middle and lower parts.
Many of the studies in the literature have focused on regional changes in the amount of rainfall within countries. This study aims to determine the similarities of countries in terms of water resources by taking into account the change in their precipitation amounts due to climate change. Based on the findings, the water risk levels and populations of these countries are evaluated. In this way, guidance can be provided to similar countries for adopting similar laws and policies on water consumption.

Climate Change and Its Effect on Water Resources
Climate change is a general term for changes in many climatological factors, such as temperature and precipitation, on both global and local scales. Changes in climate systems not only harm the ecological balance but also disrupt the hydrological cycle and contribute to an increase in unwanted hydrological events. According to scientists, the most important effects of climate change are the deterioration of the water cycle and the change in water quality. The total amount of water in the world can be said to remain constant through the water cycle, but due to climate change the location and timing of the world's water resources will change. Therefore, in many places, the management of water resources will become difficult in terms of both quantity and quality, and significant changes will occur in the supply and quality of water resources (Çapar, 2019). Considering that water is related to all sectors, from agriculture to tourism and from health to energy, the negative consequences that arise will be of vital importance.
Water resources are directly related to weather events. The potential of surface water and groundwater increases with the amount of precipitation falling into the basins that feed them, while excessive evaporation as a result of global warming and below-normal precipitation cause drought. Unlike other natural disasters, drought does not emerge suddenly; it is the result of an accumulation of factors. It is impossible to estimate its start and end points. In addition, it cannot be evaluated by its direct effects alone, because it simultaneously has negative effects on many resources in nature. As seen in Figure 1, the warming of global surface temperatures that started in the late 19th century became more evident from the 1980s onward; global temperature records have been broken, with almost every year being warmer than the previous year. The global average surface temperature has increased by approximately 0.7 °C since the early 20th century (Türkeş, 2008). Elements of climate change such as changing temperature, sudden severe weather events (flood, drought, tornado, etc.), and an increase in solar radiation cause physicochemical changes in water resources.
Given the data on precipitation and temperature changes, it is clear that any change in climate will change the amount of precipitation, evaporation, surface runoff, and usable water in the soil. Changes in seasonal and annual precipitation are very important for both the storage of water resources and the regulation of the soil moisture regime (Aksay et al., 2005). The most important factor affecting a country's water potential is precipitation. With climate change, land and sea temperatures increase and precipitation patterns change. In general, rainy regions become rainier, especially in winter, and arid regions become drier, especially in summer. Factors such as air mass and frontal systems, landforms, and geographical location affect the amount and distribution of precipitation (Çiçek and Ataol, 2009).
A hierarchical clustering algorithm presented by Maharaj (2000) takes as its starting point the $m \times m$ matrix $P = (p_{i,j})$, whose $(i, j)$-th entry $p_{i,j}$, for $i \neq j$, is the p-value obtained by testing whether or not the series $X_T^{(i)}$ and $X_T^{(j)}$ come from the same generating model. The algorithm then proceeds in a similar way to an agglomerative hierarchical clustering based on $P$, although in this case it only groups together those series whose associated p-values are greater than a significance level $\alpha$ previously specified by the user. In other words, the $i$-th series $X_T^{(i)}$ will merge into a specific cluster $C_k$, formed by the series $\{X_T^{(j_1)}, \ldots\}$, only if the associated p-values are greater than $\alpha$.
There are many different approaches to the k-medoids algorithm. The most common realization of the k-medoids clustering method is the Partitioning Around Medoids (PAM) algorithm (Kaufman and Rousseeuw, 1990). The k-medoids method aims to find k representative objects that capture the various properties of the data. In the PAM algorithm, these k representative objects are called "medoids". A medoid is the central object of its cluster, that is, the object that minimizes the average distance to the other objects in the cluster (Kaufman and Rousseeuw, 1987). The method therefore works by minimizing the total distance (dissimilarity) between each object and its reference point: within a cluster, the point i that minimizes the sum below is selected.

$$\sum_{j \in C_i} d(i, j)$$
where $C_i$ is the cluster containing point $i$, and $d(i, j)$ is the distance between points $i$ and $j$.
PAM first takes k randomly selected objects as the cluster centers, as in the k-means algorithm. Whenever a new element is added to a cluster, the algorithm tries the elements of that cluster as candidate centers; when it finds the point that contributes most to improving the cluster, it performs a swap, so that this point becomes the new center and the old center becomes an ordinary cluster element.
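The swap logic described above can be illustrated with a minimal sketch. It is not the implementation used in this study: it assumes a precomputed distance matrix, uses a simple first-improvement swap rule, and the names `total_cost`, `pam`, and the toy data are hypothetical.

```python
import random

def total_cost(dist, medoids):
    """Sum over all objects of the distance to their nearest medoid."""
    return sum(min(dist[i][m] for m in medoids) for i in range(len(dist)))

def pam(dist, k, seed=0):
    """PAM-style k-medoids on a precomputed distance matrix (illustrative)."""
    n = len(dist)
    rng = random.Random(seed)
    medoids = rng.sample(range(n), k)      # random initial medoids
    best = total_cost(dist, medoids)
    improved = True
    while improved:                        # stop when no swap lowers the cost
        improved = False
        for pos in range(k):
            for cand in range(n):
                if cand in medoids:
                    continue
                # Try replacing one medoid with a non-medoid object.
                trial = medoids[:pos] + [cand] + medoids[pos + 1:]
                cost = total_cost(dist, trial)
                if cost < best:            # keep the swap if the cost drops
                    medoids, best = trial, cost
                    improved = True
    # Assign every object to the cluster of its nearest medoid.
    labels = [min(range(k), key=lambda c: dist[i][medoids[c]]) for i in range(n)]
    return medoids, labels, best

# Toy one-dimensional data (assumed values, not the study's data).
points = [1.0, 1.2, 0.9, 8.0, 8.3, 7.9]
dist = [[abs(a - b) for b in points] for a in points]
medoids, labels, cost = pam(dist, k=2)
print(medoids, labels, cost)
```

Working on a distance matrix rather than raw coordinates is what allows the same procedure to be combined with any time series dissimilarity measure.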
Casado de Lucas (2010) considered a distance measure based on the cumulative versions of the periodograms, i.e., the integrated periodograms. The normalized version gives more weight to the shape of the curves, while the non-normalized version considers the scale (Montero and Vilar, 2014).
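As a rough illustration of this distance, the sketch below computes the periodogram at the Fourier frequencies, accumulates it into an integrated periodogram, and approximates the integral of the absolute difference between two such curves by a Riemann sum. The discretization, the scaling constants, the function names, and the sample values are assumptions made for illustration; the R routine used in the study may differ in these details.

```python
import cmath
import math

def periodogram(x):
    """Periodogram at the Fourier frequencies 2*pi*j/n, j = 1..n//2."""
    n = len(x)
    mean = sum(x) / n
    centered = [v - mean for v in x]
    values = []
    for j in range(1, n // 2 + 1):
        w = 2 * math.pi * j / n
        dft = sum(c * cmath.exp(-1j * w * t) for t, c in enumerate(centered))
        values.append(abs(dft) ** 2 / n)
    return values

def integrated_periodogram(x, normalize=True):
    """Cumulative sum of the periodogram, optionally scaled to end at 1."""
    cum, total = [], 0.0
    for p in periodogram(x):
        total += p
        cum.append(total)
    if normalize and total > 0:
        cum = [c / total for c in cum]
    return cum

def int_per_distance(x, y, normalize=True):
    """Riemann-sum approximation of the integral of |F_x - F_y|.

    Assumes both series have the same length.
    """
    fx = integrated_periodogram(x, normalize)
    fy = integrated_periodogram(y, normalize)
    step = 2 * math.pi / len(x)            # spacing of the Fourier frequencies
    return step * sum(abs(a - b) for a, b in zip(fx, fy))

# Hypothetical annual precipitation values for two series of nine years.
a = [820.0, 760.0, 905.0, 640.0, 700.0, 880.0, 790.0, 660.0, 730.0]
b = [510.0, 530.0, 495.0, 560.0, 540.0, 520.0, 580.0, 600.0, 590.0]
a_scaled = [3 * v for v in a]

d_ab = int_per_distance(a, b)
d_shape = int_per_distance(a, a_scaled, normalize=True)
d_scale = int_per_distance(a, a_scaled, normalize=False)
print(d_ab, d_shape, d_scale)
```

Rescaling a series leaves the normalized distance essentially at zero but not the non-normalized one, which matches the remark that the normalized version emphasizes shape while the non-normalized version reflects scale.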

Data and Results
Precipitation data are the most important input for calculating the water potential of an area. The total precipitation can be calculated when the average annual precipitation is known for the whole country or for a precipitation basin. However, the topography of an area plays a very important role in the precipitation that falls on it, because precipitation is known to increase with altitude in many regions (Çiçek and Ataol, 2009; Napoli et al., 2019). Therefore, knowing the distribution of precipitation in the area and the annual total rainfall that depends on this distribution is indispensable for assessing the water potential correctly. The aim of this study is to determine the similarities of countries in terms of water resources over the years, to evaluate the water risk levels and populations of the countries according to the results obtained, and to provide guidance to similar countries for adopting similar laws and policies on water consumption. For this purpose, the annual precipitation amounts of 21 countries (Belgium, Czech R, Ireland, France, Cyprus, Latvia, Lithuania, Hungary, Malta, Netherlands, Poland, Portugal, Romania, Slovenia, Slovakia, Finland, Sweden, Albania, Serbia, Turkey, and Bosnia and Herzegovina) were obtained from Eurostat for the period 2009-2017. Figure 2 shows the precipitation amount of each country per year. Time series clustering analysis with the hierarchical clustering and PAM methods, using distances based on the integrated periodogram, was applied to the data. The R program was used to produce the dendrogram with the hierarchical clustering method, and the result is given in Figure 3.
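The hierarchical step can be sketched as a small agglomerative procedure on a distance matrix. The linkage rule is not stated in the text, so average linkage is assumed here; the function name `agglomerative`, the labels, and the toy values are hypothetical, and in the study the input would be the integrated-periodogram distances between the country series.

```python
def agglomerative(dist, k):
    """Average-linkage agglomerative clustering, stopped at k clusters."""
    clusters = [[i] for i in range(len(dist))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average distance over all cross-cluster pairs.
                d = sum(dist[i][j] for i in clusters[a] for j in clusters[b])
                d /= len(clusters[a]) * len(clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]   # merge the closest pair
        del clusters[b]
    return clusters

# Hypothetical labels and one-dimensional values standing in for the series.
names = ["A", "B", "C", "D", "E", "F"]
values = [1.0, 1.2, 0.9, 8.0, 8.3, 7.9]
dist = [[abs(a - b) for b in values] for a in values]

groups = agglomerative(dist, k=2)
named = [sorted(names[i] for i in g) for g in groups]
print(named)
```

Recording the order and height of the merges, instead of stopping at k clusters, would give the information that a dendrogram displays.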
The Dunn and Connectivity indexes, two cluster validity indexes, were used to find the appropriate number of clusters, and the results are given in Table 1. The Dunn index takes values between zero and ∞ and should be maximized. The Connectivity index takes values between zero and ∞ and should be minimized (Brock et al., 2011). As can be seen from Table 1, according to both the Dunn and Connectivity indexes, the appropriate number of clusters was found to be 2 for both methods. The countries were therefore divided into 2 clusters with the hierarchical and PAM methods. The assignment of countries to clusters is the same for the hierarchical clustering and PAM methods. The first cluster includes Finland, Lithuania, Turkey, France, and Poland, and the second cluster includes Latvia, Netherlands, Belgium, Albania, Malta, Cyprus, Sweden, Ireland, Portugal, Romania, Bosnia and Herzegovina, Czech R, Hungary, Slovakia, Slovenia, and Serbia. The first cluster contains countries with more precipitation, although the amount of precipitation has decreased in these countries in recent years. The second cluster contains countries with low precipitation. While Sweden was expected to be in the first cluster, it was included in the second cluster. Table 2 shows the water risk levels and populations of the countries. As can be seen from Figure 2, precipitation in Turkey and France has been declining in recent years. In addition, Table 2 shows that the water risk levels and populations of these two countries are higher than those of the other countries. Lithuania, Poland, and Finland, on the other hand, show an increase in the amount of precipitation over the years, and their risk levels are low.
In Belgium and Albania, which are in the second cluster, the amount of precipitation has decreased and the water risk level is high. On the other hand, the amount of precipitation is increasing and the risk level is low in Latvia and the Netherlands. In Malta, Cyprus, and Portugal, the amount of precipitation has decreased over the years and the risk level is high.
Malta, in particular, has a very high population density. However, the risk levels of the other countries in this cluster are low.
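The two validity indexes used above to choose the number of clusters can be sketched as follows. The Dunn index is taken as the smallest between-cluster distance divided by the largest within-cluster distance, and connectivity as a penalty of 1/rank for every nearest neighbour assigned to a different cluster. The neighbourhood size `n_neighbors`, the function names, and the toy data are assumptions; the sketch also assumes at least two clusters and at least one cluster with two or more members.

```python
def dunn_index(dist, labels):
    """Smallest between-cluster distance over largest within-cluster distance."""
    n = len(dist)
    pairs = [(i, j) for i in range(n) for j in range(i + 1, n)]
    between = min(dist[i][j] for i, j in pairs if labels[i] != labels[j])
    within = max(dist[i][j] for i, j in pairs if labels[i] == labels[j])
    return between / within

def connectivity(dist, labels, n_neighbors=2):
    """Add 1/rank for each of the nearest neighbours found in another cluster."""
    n = len(dist)
    total = 0.0
    for i in range(n):
        others = sorted((j for j in range(n) if j != i), key=lambda j: dist[i][j])
        for rank, j in enumerate(others[:n_neighbors], start=1):
            if labels[j] != labels[i]:
                total += 1.0 / rank
    return total

# Toy data: two well-separated groups (assumed values).
values = [1.0, 1.2, 0.9, 8.0, 8.3, 7.9]
dist = [[abs(a - b) for b in values] for a in values]
good = [0, 0, 0, 1, 1, 1]      # partition matching the two groups
bad = [0, 1, 0, 1, 0, 1]       # partition mixing the two groups

dunn_good, dunn_bad = dunn_index(dist, good), dunn_index(dist, bad)
conn_good, conn_bad = connectivity(dist, good), connectivity(dist, bad)
print(dunn_good, dunn_bad, conn_good, conn_bad)
```

In practice both indexes would be computed for each candidate number of clusters, and the number with the largest Dunn value and the smallest connectivity would be preferred.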

Conclusions
The world is undergoing climate change, the effects of which are felt as unstable weather events. Climate change causes problems that negatively affect the development of countries, such as desertification, drought, land degradation, severe storms, and floods, and that threaten their national security and assets. Developments such as population growth, urbanization, and industrialization aggravate climate change. Its effects on water, the most important natural resource, are beginning to be seen. A country's water availability can be determined by knowing the average precipitation amount and the spatial distribution of precipitation. Whether on a global or a regional scale, climate change causes changes in the frequency, severity, spatial distribution, length, and timing of extreme weather and climate events. One of the most important indicators is the change in the amount of precipitation.
These changes in the amount of precipitation are not distributed proportionally across the earth. The zonal and temporal variations of precipitation are high, and both decreasing (drought) and increasing tendencies have been observed across very large regions and continents. Due to this change in precipitation amounts, changes have also been observed in water resources. This disparity in the amount of precipitation between regions is expected to continue in the coming period.
Studies on the impact of climate change on water resources have generally focused on human factors. In addition, the majority of studies in the literature have focused on regional rainfall within individual countries. In this study, the precipitation amounts of different countries are considered. Taking into account the changes in the amount of precipitation that accompany climate change, the similarities between the countries in terms of water resources were examined. In this context, cluster analysis was applied to the precipitation data of 21 countries for the period 2009-2017. According to the cluster validity indexes, the appropriate number of clusters was determined as 2, and the hierarchical clustering and PAM methods placed Finland, Lithuania, Turkey, France, and Poland in the first cluster and Latvia, Netherlands, Belgium, Albania, Malta, Cyprus, Sweden, Ireland, Portugal, Romania, Bosnia and Herzegovina, Czech R, Hungary, Slovakia, Slovenia, and Serbia in the second cluster. The first cluster contains the countries with more precipitation and the second cluster contains the countries with low precipitation. The water risk levels and populations of the countries in these clusters were evaluated. The water risk level and population in Turkey and France are higher than in the other countries in the first cluster.
For Belgium, Albania, Malta, Cyprus, and Portugal, which are in the second cluster, the amount of precipitation has decreased over the years and the risk level is high. Malta, in particular, has a very high population density, which increases the risk even more. On the other hand, in Lithuania, Poland, and Finland, which are in the first cluster, the amount of precipitation has increased over the years and the risk levels are low. For Latvia and the Netherlands in the second cluster, the amount of precipitation has increased and the risk level is low. The risk levels and populations of the other countries in the cluster are low.
Turkey, France, Belgium, Albania, Malta, Cyprus, and Portugal, where precipitation has been low in recent years, the level of water risk is high, and the population is also high, must take urgent measures against climate change. They should pay particular attention to water, a natural resource on which negative effects are beginning to be seen, since water is important not only for the continuity of life but also for the economic development of countries. In addition, ensuring food, energy, and ecosystem security depends on water.
In order to ensure water security throughout the world, policies that increase water supply and reduce water demand should be emphasized up to a determined balance point.
Investments to increase the supply of water reserves depend on the development of credit mechanisms, finding new resources, and utilizing technological opportunities. On the demand side, one measure that will reduce demand is raising citizens' awareness of water use in order to increase water conservation in the short term. The false belief that water, which is seen as a free resource, is unlimited should be abandoned, and the reality of resource scarcity should be embraced by society. Scientific and technological studies should be encouraged to increase water saving at home, at school, in industry, and especially in agriculture. For this, many habits of daily life should be abandoned. In the long term, policies to prevent population growth should be implemented. Otherwise, the measures taken are bound to become useless from the outset.

Figures
Figure 1. Global Land-Ocean temperature index (Source: https://climate.nasa.gov/vital-signs/global-temperature/)
Figure 2. The amount of precipitation of countries by years
Figure 3. The dendrogram obtained with the hierarchical clustering method
Santos et al. (2019) grouped sites with similar trends by hierarchical clustering, using precipitation data from the Tropical Rainfall Measuring Mission (TRMM) satellite for the period from January 1998 to December 2015. The results show an uneven spatiotemporal precipitation distribution in all mesoregions of the state and considerable monthly precipitation variation per location. Doulah and Islam (2019) identified the climate regions of Bangladesh by cluster analysis using monthly precipitation data from 34 climate stations from 1991 to 2013. They initially applied five agglomerative hierarchical clustering measures, K-means, fuzzy, and density-based clustering techniques to decide on the most suitable method for identifying homogeneous regions. Based on nine validity indices, they decided to use the Ward method based on Euclidean distance, K-means, and fuzzy clustering. Seven different climate zones were found. Ullah et al. (2019) determined homogeneous climatic regions (HCRs) based on droughts identified by the Reconnaissance Drought Index, applying cluster analysis to climate data from 55 meteorological stations in Pakistan. Statistical measures supported the validity of five homogeneous climatic zones for Pakistan.
Unconscious consumption of resources, which are rapidly declining as a result of global warming and changing weather events, can have serious consequences, leading to desertification and threatening the lives of future generations (Albayrak, 2017).

Time series data mining has received a lot of attention in recent years due to the ubiquity of this kind of data. Time-series clustering is a special type of clustering. Just like static data clustering, time series clustering requires a clustering algorithm or procedure to form clusters from a set of unlabeled data objects, and the choice of clustering algorithm depends both on the type of data available and on the particular purpose and application. Time series clustering algorithms can be broadly classified into two approaches: data adaptation and algorithm adaptation. The former extracts feature arrays from each time series and then applies a conventional clustering algorithm. The latter modifies traditional clustering algorithms so that they can handle time series directly. Many dissimilarity measures between time series have been proposed in the literature. These have been grouped into four categories: complexity-based, model-free, prediction-based, and model-based measures. Additionally, the majority of existing time series clustering techniques in the literature use k-means, k-medoids, or hierarchical clustering algorithms in their original forms or in modified versions.


Table 1. Validity indexes for hierarchical clustering and PAM methods by the

Table 2. Populations and risk levels of water resources according to countries