On the Reduction of Inaccuracies in Drought Monitoring: A Novel Blended Procedure for Standardized Type Drought Indicators

Climate change and rising temperatures are making drought more prevalent in several parts of the globe, and drought recurs in various climatic zones around the world. Drought monitoring is therefore a challenging task in hydrology and water management research. The literature offers several drought monitoring indicators. Regardless of their individual pros and cons, their abundance complicates analysis and reanalysis at a given gauge station. This research aims to improve drought monitoring by providing a comprehensive data mining approach based on principal component analysis (PCA). To that end, we propose a new index, the Seasonal Mixture Standardized Drought Index (SMSDI). In our preliminary analysis, we include three multiscalar Standardized Drought Indices (SDIs). In application, we have applied


Introduction
Due to climate change and rising temperatures, drought events recur with increasing frequency in several parts of the world. Compared with other hazards, the effects of drought on humans, agriculture, livestock, and industries are more disastrous and longer lasting (Vásquez-León et al., 2003). Drought can be defined as "a certain period of time (usually lasting over several months or longer than usual) during which a particular region receives comparative less precipitation (in terms of rain or snowfall)" (Van et al., 2016). According to the characteristics of drought monitoring and its assessment, drought has been divided into four major types. Details on each type can be found in Sun et al. (2019).

Every year, around 55 million people worldwide are affected by drought, directly or indirectly (WHO, 2020). In addition, the continuous increase in temperature and global warming threatens the soil fertility of agricultural land (Thadshayini et al., 2019). Further, the catastrophic consequences of drought include reduced availability of drinking water and groundwater, death of inhabitants and livestock, deterioration of food quality, serious diseases, desertification, economic inflation, social disruption, soil erosion, depletion of freshwater resources, and a weakened economy (Garcia et al., 2013).

suggested various drought indices by including additional meteorological variables under the same standardized procedure. For instance, the Standardized Precipitation Evapotranspiration Index (SPEI) accounts for evaporation before standardization (Vicente-Serrano et al., 2010). Ali et al. (2017) suggested the Standardized Precipitation Temperature Index (SPTI). In SPTI, a time series vector of average temperature is used together with the precipitation data before the standardization phase.
Because SPI, SPEI and SPTI share a homogeneous computational procedure, we refer to them collectively as Standardized Drought Indices (SDIs). More details on SDIs are available in Erhardt and Czado (2015).

In this paper, instead of using a single probability model, we propose the K-component Gaussian mixture distribution (KCGMD) for modeling the time series data of SPI. The rest of the procedure is the same as in Ali et al. (2017).

SPEI
After SPI, Vicente-Serrano et al. (2010) used temperature as an additional variable within the same procedure and developed a simple but effective standardized drought index, the Standardized Precipitation Evapotranspiration Index (SPEI). In SPEI, the time series of the difference between the rainfall amount and the estimated amount of evaporation, e.g., potential evapotranspiration (PET), is used to quantify drought (see Eq. 1).

Different equations can be used to estimate PET according to the nature of the data. The Thornthwaite equation (Thornthwaite et al., 1948), the Penman equation, Blaney-Criddle (Allen and Pruitt, 1986) and (Allen et al., 1998) are the most widely used methods for estimating PET.

Vicente-Serrano et al. (2010) utilized the same drought characterization criterion as described by (McKee et al., 1993

SPTI
The main limitation of SPI is that it uses only one variable and ignores other climatic parameters. In SPEI, the major problem is the underestimation or overestimation of PET for arid and semi-arid regions. To resolve these problems, Ali et al. (2017) proposed a new drought index, the Standardized Precipitation Temperature Index (SPTI). The main benefit of SPTI is that it can be used for any type of region. More detail on SPTI is available in Ali et al. (2017).

In this research, SPI, SPEI and SPTI are standardized using a mixture distribution instead of a single distribution. A detailed explanation of the mixture distribution is given in Section 2.3, and the standardization procedure is provided in Appendix-B.

K-Component Gaussian Mixture Distribution
To model data with multiple modes and various forms of skewness, a K-component mixture model is used for hydrological data, which are strictly positive. Evin et al. (2011)

Therefore, the complete density for one measurement is as follows,

The key goal of the current study is to establish a new drought indicator that integrates the mixture distribution for SPI, SPEI and SPTI with the most comprehensive knowledge at the seasonal level, and that resolves the issue of choosing one index among several. To accomplish these objectives, this section focuses on the step-by-step procedure of the proposed index, i.e., SMSDI. Detailed descriptions of the SDIs, the K-component mixture distribution and PCA are given in Section 2 above. Prior to implementing the proposed technique, we identified three major points that are of considerable importance for an appropriate index.

Following the identification of the three points defined above, the step-by-step execution of the proposed paradigm comprises five phases. The following subsections give a thorough overview of each phase.

Phase 1: Estimation of drought indices under K-component Gaussian distribution
As explained in the introductory part, the choice of an appropriate distribution for fitting the data in the analysis of SPI, SPEI and SPTI is the subject of intense debate. Therefore, a mixture distribution approach (see Section 2.3) is applied to Pi (the precipitation series), Di (the series for SPEI) and Ei (the series for SPTI) to calculate these indices. The fitted models for Pi, Di and Ei are then standardized (see Appendix-B) to generate the SPI, SPEI and SPTI series, respectively. Figs. 2-4 show that, for the data of almost all stations, a K-component model is suitable, judging by the K-modality observed in the graphs. We fitted the mixture models to the selected station data using the R package mixtools, which provides a set of functions for analyzing various finite mixture models.
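As a minimal sketch of this phase, the Python code below fits a one-dimensional K-component Gaussian mixture with a plain EM algorithm and then maps each observation to a standardized index value through the fitted mixture CDF and the inverse standard normal CDF. It is not the mixtools routine used in this study, and Appendix-B is not reproduced here, so the transformation itself, the function names (fit_gaussian_mixture, mixture_cdf, standardize), the clipping constant eps and the synthetic series are all assumptions made for illustration.

```python
import math
import random
from statistics import NormalDist


def fit_gaussian_mixture(data, k, iters=200, seed=0):
    """Fit a 1-D K-component Gaussian mixture with plain EM (illustrative)."""
    rng = random.Random(seed)
    n = len(data)
    mean_all = sum(data) / n
    sd_all = math.sqrt(sum((x - mean_all) ** 2 for x in data) / n) or 1.0
    weights = [1.0 / k] * k
    means = rng.sample(list(data), k)  # start from k randomly chosen observations
    sds = [sd_all] * k
    floor = 1e-3 * sd_all  # lower bound that keeps a component from collapsing
    for _ in range(iters):
        comps = [NormalDist(m, s) for m, s in zip(means, sds)]
        # E-step: responsibility of each component for each observation
        resp = []
        for x in data:
            dens = [w * c.pdf(x) for w, c in zip(weights, comps)]
            total = sum(dens) or 1e-300
            resp.append([d / total for d in dens])
        # M-step: update weights, means and standard deviations
        for j in range(k):
            nj = sum(r[j] for r in resp) or 1e-300
            weights[j] = nj / n
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2 for r, x in zip(resp, data)) / nj
            sds[j] = max(math.sqrt(var), floor)
    return weights, means, sds


def mixture_cdf(x, weights, means, sds):
    """CDF of the fitted mixture: weighted sum of the component CDFs."""
    return sum(w * NormalDist(m, s).cdf(x) for w, m, s in zip(weights, means, sds))


def standardize(data, weights, means, sds, eps=1e-6):
    """Assumed standardization: index = inverse standard normal CDF of mixture CDF."""
    std_normal = NormalDist()
    out = []
    for x in data:
        # Clip probabilities away from 0 and 1 so inv_cdf stays finite
        p = min(max(mixture_cdf(x, weights, means, sds), eps), 1.0 - eps)
        out.append(std_normal.inv_cdf(p))
    return out


if __name__ == "__main__":
    rng = random.Random(42)
    # Synthetic bimodal series, for illustration only (not station data)
    series = [rng.gauss(20, 5) for _ in range(120)] + [rng.gauss(80, 10) for _ in range(120)]
    w, m, s = fit_gaussian_mixture(series, k=2)
    index = standardize(series, w, m, s)
    print("weights:", [round(v, 3) for v in w])
    print("first index values:", [round(v, 2) for v in index[:3]])
```

In practice the same two steps would be applied separately to Pi, Di and Ei for each station and timescale, with K chosen from the modality seen in the data.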

Phase 2: Seasonal Segregation
In this phase, we classify SPI, SPEI and SPTI by month. For instance, the SPI, SPEI and SPTI values calculated for January at Badin station are combined; similarly, the February values for Badin station are combined. The same procedure is repeated for all months, all selected stations and all timescales in the current study.

Let S1, S2, S3, …, S12 be the monthly indexed time series, in which each month is treated as a season. In the subsequent steps, each indexed time series for each station is treated as an individual time series.
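The grouping described above can be sketched in Python as follows. The record layout (year, month, SPI, SPEI, SPTI), the function name segregate_by_month and the toy values are hypothetical; the sketch only illustrates how the series of one station at one timescale splits into S1, ..., S12.

```python
from collections import defaultdict


def segregate_by_month(records):
    """Split (year, month, spi, spei, spti) records into 12 monthly series.

    Returns a dict mapping month number (1..12) to a chronologically ordered
    list of (year, spi, spei, spti) rows, i.e. one "season" per calendar month.
    """
    seasons = defaultdict(list)
    for year, month, spi, spei, spti in sorted(records):
        seasons[month].append((year, spi, spei, spti))
    return dict(seasons)


if __name__ == "__main__":
    # Toy values for two years; real input would be the Phase 1 index series
    toy = [
        (year, month, 0.1 * month, -0.1 * month, 0.05 * month)
        for year in (2001, 2002)
        for month in range(1, 13)
    ]
    by_month = segregate_by_month(toy)
    print("January rows:", by_month[1])
```

Each entry of the returned dictionary then plays the role of one Sj and is passed to the next phase as an individual time series.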

Phase 3: PCA on each segregated data
Ideally, SPI, SPEI and SPTI would be calculated at several timescales and used simultaneously with the seasonally segregated data to identify drought classes. However, as mentioned above, multiple indices can cause confusion and make the results difficult to interpret. Different researchers have demonstrated the use of different indices, for example SPI, SPEI and SPTI, but it remains unclear which of these indices is best.

To resolve this issue, we reduce the number of SPI, SPEI and SPTI aggregation time series using a multivariate technique, PCA. A detailed description of the standardization of the PCs is given in the following section.
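A minimal sketch of this reduction is given below, assuming that PCA is applied to the correlation matrix of the three indices for one month, station and timescale. Power iteration is used here only as a simple stand-in for a full eigen-decomposition. The function names, the sign convention for the loadings, the final z-scoring of the first-component scores and the synthetic input are all assumptions made for illustration, not the exact procedure of this study.

```python
import math
import random


def zscore(col):
    """Center a series and scale it by its sample standard deviation."""
    n = len(col)
    mean = sum(col) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in col) / (n - 1))
    return [(v - mean) / sd for v in col]


def correlation_matrix(cols):
    """Return the correlation matrix of the columns and their z-scores."""
    z = [zscore(c) for c in cols]
    n = len(z[0])
    corr = [[sum(a * b for a, b in zip(zi, zj)) / (n - 1) for zj in z] for zi in z]
    return corr, z


def first_component(corr, iters=500):
    """Dominant eigenvalue and eigenvector by power iteration.

    Starting from the uniform vector is adequate when the indices are
    positively correlated, which is the situation assumed here.
    """
    p = len(corr)
    v = [1.0 / math.sqrt(p)] * p
    for _ in range(iters):
        w = [sum(corr[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigval = sum(v[i] * sum(corr[i][j] * v[j] for j in range(p)) for i in range(p))
    return eigval, v


def blended_index(spi, spei, spti):
    """Standardized first-component scores and the share of variance explained."""
    corr, z = correlation_matrix([spi, spei, spti])
    eigval, load = first_component(corr)
    if sum(load) < 0:  # assumed convention: positive loadings, so wet stays positive
        load = [-x for x in load]
    scores = [sum(l * zc[t] for l, zc in zip(load, z)) for t in range(len(spi))]
    # The trace of a 3 x 3 correlation matrix is 3, so the share is eigval / 3
    return zscore(scores), eigval / len(corr)


if __name__ == "__main__":
    rng = random.Random(7)
    base = [rng.gauss(0, 1) for _ in range(60)]
    # Three synthetic, strongly correlated series standing in for SPI, SPEI, SPTI
    spi, spei, spti = ([b + 0.3 * rng.gauss(0, 1) for b in base] for _ in range(3))
    index, share = blended_index(spi, spei, spti)
    print("share of variance on PC1:", round(share, 3))
```

Because each input is standardized, the eigenvalues of the correlation matrix sum to the number of indices, which is why the explained share is the first eigenvalue divided by 3.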

Phase 4: Standardized PCA1
In our case study, we calculate SMSDI for each set of SPI, SPEI and SPTI timescales (i.e., 1, 3, 6, 9, 12 and 24 months) for each station. SMSDI is based on the first principal component

Estimation of SPI, SPEI and SPTI under mixture distribution settings
This section presents the results of the K-component mixture distribution based standardization of the drought indicators. Here, we applied the 12-CGMD mixture distribution to model the data of all indicators at all three selected stations. Table 2

provide evidence that, for each data set, the mixture distribution is more appropriate than a single distribution. Some more results are archived in the author's gallery.

Principal Components Analysis
In the next part of the research, we use the seasonal SPI, SPEI and SPTI time series for PCA, which reduces the three-dimensional data to one dimension. Section 2.2 gives an overview of SPI, SPEI and SPTI, and Section 3 gives a detailed explanation of the proposed index, i.e., the Seasonal Mixture Standardized Drought Index (SMSDI). We calculate SPI, SPEI and SPTI for 1- to 24-month timescales; the calculation procedure and methodology are explained in that section. The SMSDI is based on 3 × 12 × 6 × 3 sets of the chosen indices (three indices, 12 months, six timescales and three stations).

There are seven categories in total for the values of SMSDI, as shown in the graphs. These categories were defined by McKee et al. (1993); their ranges are given in Table 3.

The results show that the eigenvalues of the first PCs are large and those of the subsequent PCs are small (see Tables 4-5). That is, the first PCs correspond to the directions with the greatest amount of variation in the data. The eigenvalues sum to a total variance of 3 for each month (Jan, Feb, …, Dec) and each timescale (1, 3, 6, 9, 12 and 24). Similar results for all stations, months and timescales can be seen in Tables 4-5.

with the figures, the SMSDIs are seen to follow the fluctuations of SPI, SPEI and SPTI appropriately, particularly during extended wet and dry periods. Likewise, SMSDI avoids the dramatic volatility of the actual SPI, SPEI and SPTI time series (especially for timescales of less than 9 months), resulting in fewer wet and dry occurrences than in SPI, SPEI and SPTI. This signifies that the SMSDI can remove the minor wet and dry spells within extreme and prolonged wet or dry periods.
Further analysis was conducted to examine the agreement between the SMSDI and each chosen SPI, SPEI and SPTI timescale series at the stations of interest for the various months of the year.

Fig. 9 shows the seven categories of drought for all selected stations at timescales 1, 3, 6, 9, 12 and 24. For instance, Fig. 9(a) shows the drought characterization for Badin station at these timescales. The Normal category for Badin occurred 469, 353, 469 and 458 times under SPI, SPEI, SPTI and SMSDI, respectively, for timescale 1. Similar interpretations can be made for ED, SD, MD, MW, SW and EW for Badin from Fig. 9(a) and

(9c) respectively.
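Category counts of the kind reported for Fig. 9 can be obtained with a short routine such as the Python sketch below. Table 3 is not reproduced in this section, so the cut-offs used here are the commonly cited SPI class limits and should be read as an assumption; the function names and the toy series are likewise hypothetical.

```python
from collections import Counter


def classify(value):
    """Map a standardized index value to one of seven categories.

    Cut-offs are assumed (commonly cited SPI class limits), not taken
    from Table 3 of this study.
    """
    if value >= 2.0:
        return "EW"
    if value >= 1.5:
        return "SW"
    if value >= 1.0:
        return "MW"
    if value > -1.0:
        return "Normal"
    if value > -1.5:
        return "MD"
    if value > -2.0:
        return "SD"
    return "ED"


def category_counts(series):
    """Count how often each category occurs in an index time series."""
    return Counter(classify(v) for v in series)


if __name__ == "__main__":
    toy_series = [-2.3, -1.7, -1.2, -0.4, 0.0, 0.6, 1.1, 1.6, 2.4]  # made-up values
    print(category_counts(toy_series))
```

Applying category_counts to the SPI, SPEI, SPTI and SMSDI series of one station and timescale gives four sets of counts that can be compared category by category.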

Concluding Remarks
The main purpose of this research is to monitor drought more accurately and efficiently. This article proposes a new drought index, the SMSDI, which accounts for the characteristics of SPI, SPEI and SPTI. In SMSDI, a novel blending procedure for SDIs is presented: the estimation of SPI, SPEI and SPTI is based on the K-component Gaussian mixture distribution, and their aggregation is based on PCA. In application, SMSDI is estimated for three gauge stations in Pakistan. The important features of the proposed index are summarized as follows: 1) a mixture distribution is applied to evaluate SPI, SPEI and SPTI at timescales 1-24 for different months. We found that the data at all selected stations contain K underlying classes, each defined by different parameters, which are correctly estimated, so the SMSDI procedure is free from the problem of relying on just one probability distribution; 2) seasonality is also considered in SMSDI; 3) the problem of the existence of multiple drought indices is resolved in the SMSDI procedure; 4) the drought categories defined by SMSDI accord closely with those defined by the SPI, SPEI and SPTI time series.

Hence, to avoid the computational burden and the confusion in interpreting SPI, SPEI and SPTI, our proposal provides the best solution to date.

Acknowledgement
The authors are very grateful to the China Huaneng Group Co., Ltd., for financial support through the Project (HNKJ17-H20).

Competing Interests
The authors declare no competing interests.

Data Availability Statement
The data that support the findings of this study are available from the corresponding author upon reasonable request.

Author Contribution
All authors contributed equally.