Evaluating The Sensitivity and Applicability of Precipitation-Based and Precipitation-Evapotranspiration-Based Drought Indices To Different Record Periods

Because drought indices are generally calculated from multi-year historical data spanning at least 30 years, the index value obtained for a given time depends on the record length used, making it difficult to accurately define dry or wet periods in a studied region or at a station. This investigation assessed the sensitivity and applicability of precipitation-based and precipitation-evapotranspiration-based drought indices, namely the Generalized extreme value drought index (GEVI), Homogeneity index of precipitation and temperature (HI), the K index (K), Precipitation anomaly percentage (Pa), Standardized precipitation evapotranspiration index (SPEI), Standardized precipitation index (SPI), and the China Z index (CZI), to different record lengths on monthly, seasonal and annual time scales. Using monthly, seasonal and annual precipitation and evapotranspiration data from research stations over the period 1961-2017, data over periods of 55, 50, 45, 40, 35 and 30 years were extracted. Analysis of the correlation coefficients of all indices, match and non-match, and the actual drought and no-drought recognition rates of the indices indicated that the K, Pa and SPEI indices had better time stability than the other indices at all time scales across the different climatic zones of the study region; the GEVI index had the lowest time stability. Results also indicated that the majority of optimal lengths for all stations, defined as those having the lowest non-match, were 41-45 years, with some indices at different time scales having optimal lengths of 36-40 years and 46-50 years. In addition, the HI index had the highest actual drought and no-drought recognition rate in almost all climate zones, followed by the Pa and SPEI indices. Results from this study indicate that more priority should be given to precipitation-evapotranspiration-based indices when studying a large region; indices with concrete results should be selected when analyzing relatively small regions.

Introduction
As drought episodes can occur in high as well as low rainfall areas over an extended period of time, this natural disaster is one of the most complex hydroclimatic disasters occurring in the world. Almost all climatic regions in the world have suffered from drought episodes, with drought effects being more serious in arid and semi-arid regions (Valverde-Arias et al., 2017). It is therefore vital in these areas that planners define the characteristics of droughts and wet periods for water resource management. The arid area of Northwest China is one of the most vulnerable arid and semi-arid regions of the world. This area is characterized by relatively low precipitation levels, high variability in the rate of precipitation, and uneven spatial and temporal distribution of precipitation (Geng et al., 2014). In recent years, severe regional drought episodes have become more frequent under a changing climate, and such episodes are likely to increase in frequency for the foreseeable future (Jia et al., 2018). Precipitation anomalies have also increased in arid areas of Northwest China due to global climate change, resulting in more complicated temporal-spatial properties in these arid areas.
Results from one such investigation indicated that the Effective Drought Index (EDI) was more sensitive to monthly rainfall changes in terms of multi-monthly cumulative rainfall changes; this index also had the best correlation with other drought indices. Mercado et al. (2016) also compared the variation and performance of seven drought indices to identify droughts using Non-Contiguous Drought Analysis. They concluded that to identify drought events and drought spatio-temporal evolution, it was important to combine different drought indices, including meteorological, hydrological and agricultural drought indices, by analyzing drought evolution, severity and trends in mainland China using four drought indices. Yao et al. (2017) revealed that all indices were regional-or
In this investigation, therefore, we assess the sensitivity and applicability of precipitation-based and precipitation-evapotranspiration-based drought indices (GEVI, HI, K, Pa, SPEI, SPI, and CZI) to different record lengths in order to select a regionally applicable drought index.

Study area and data
The arid region of Northwest China was selected as the study area for this investigation. This region can be divided into ten climatic regions (Figure 1), characterized by accumulated temperature and moisture index. The whole area was initially divided into temperature zones by accumulated temperature: mid-temperate zone (1700-3500℃), warm temperate zone (3500-4500℃), North subtropical zone (4500-5300℃), plateau temperate zone (1500-3000℃), plateau sub-cold zone (500-1500℃) and plateau cold zone (0-500℃). The area was then divided into humidity zones by annual precipitation (P) and humidity degree (the relationship between precipitation and evaporation (E)): wet zone (P>800 mm, P>E), semi-humid zone (P>400 mm, P>E), semi-arid zone (P<400 mm, P<E) and arid zone (P<200 mm, P<E).
Meteorological data used in this study were obtained from the China Meteorological Data Service Centre (http://data.cma.cn/en). One station was selected to represent each climate region, except for two zones in which few meteorological stations are sited. Eight meteorological stations were therefore selected to investigate specific characteristics in the different regions.
In this study, monthly, seasonal and annual precipitation data were used from a 57-year period (1961-2017). The names, geographical coordinates, mean annual temperatures, mean annual precipitation totals, establishment years and types of the stations used are given in Table 1. Record lengths of 55, 50, 45, 40, 35 and 30 years were extracted from the main period (1961-2017) for monthly, seasonal and annual time scales.
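The extraction of shorter records from the main period can be sketched as follows. This is a minimal illustration, not the procedure used in the study: it assumes that every sub-record ends in 2017 (so that a 30-year record covers 1988-2017), and the names `extract_record` and `annual_precip` and the data values are hypothetical.

```python
# Sketch: extract sub-records of different lengths from an annual series.
# ASSUMPTION (not stated explicitly in the text): every sub-record ends in
# the final year of the main period, so a 30-year record covers 1988-2017.

START_YEAR, END_YEAR = 1961, 2017
RECORD_LENGTHS = [55, 50, 45, 40, 35, 30]


def extract_record(series_by_year, length, end_year=END_YEAR):
    """Return the last `length` years of a {year: value} mapping."""
    first_year = end_year - length + 1
    if first_year < min(series_by_year):
        raise ValueError("requested record is longer than the available data")
    return {y: series_by_year[y] for y in range(first_year, end_year + 1)}


# Hypothetical annual precipitation totals (mm), for illustration only.
annual_precip = {y: 100.0 + (y % 7) * 10.0 for y in range(START_YEAR, END_YEAR + 1)}

# One sub-record per record length, keyed by the length in years.
sub_records = {n: extract_record(annual_precip, n) for n in RECORD_LENGTHS}
```

Each drought index would then be computed separately on every entry of `sub_records`, so that index values for the same years can be compared across record lengths.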

Methodology
The method used in this study was derived from research undertaken by Wu et al. (2005), where the impact of record length was examined using SPI values. In this investigation, drought indices were calculated using Python software at the set time scales for all selected record lengths. The indices used in this study are briefly introduced below.

Generalized extreme value drought index (GEVI)
GEVI assumes that the precipitation series follows a generalized extreme value (GEV) distribution, on the basis that precipitation-related hydrological variables are skewed to the right (Wang et al., 2013). The index is built on the probability density function of the GEV distribution, its cumulative distribution function $F(x)$, and the corresponding inverse function solved for a given frequency $F$, where $x$ is precipitation at a certain period; and $u$, $v$, $w$ are the location, scale and shape parameters of the GEV probability distribution, respectively. Values were estimated using maximum likelihood, linear moments and maximum product of spacing, respectively. GEVI is therefore defined as a complex negative logarithm of $F(x)$, where GEVI is the drought index; and $x_i$ is precipitation at a certain timescale.
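As an illustration of the distribution functions referred to above, the sketch below evaluates a GEV cumulative distribution function and its inverse. It assumes one common parameterization, $F(x) = \exp\{-[1 + w(x-u)/v]^{-1/w}\}$; the sign convention for the shape parameter $w$ differs between references and may not match the one used by Wang et al. (2013). The function names are hypothetical, and the GEVI transformation itself is not reproduced here.

```python
import math


def gev_cdf(x, u, v, w):
    """GEV cumulative probability F(x) for location u, scale v > 0, shape w.

    ASSUMPTION: F(x) = exp(-(1 + w * (x - u) / v) ** (-1 / w)).
    """
    z = (x - u) / v
    if abs(w) < 1e-12:                 # Gumbel limit as w -> 0
        return math.exp(-math.exp(-z))
    t = 1.0 + w * z
    if t <= 0.0:                       # x lies outside the support
        return 0.0 if w > 0 else 1.0
    return math.exp(-t ** (-1.0 / w))


def gev_quantile(F, u, v, w):
    """Inverse of gev_cdf: the value x with cumulative frequency F (0 < F < 1)."""
    if not 0.0 < F < 1.0:
        raise ValueError("F must lie strictly between 0 and 1")
    y = -math.log(F)
    if abs(w) < 1e-12:                 # Gumbel limit as w -> 0
        return u - v * math.log(y)
    return u + v * (y ** (-w) - 1.0) / w
```

Under this parameterization, `gev_quantile(gev_cdf(x, u, v, w), u, v, w)` returns `x` for any `x` inside the support, which is a quick consistency check on fitted parameters.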

Homogeneity index of precipitation and temperature (HI)
The HI of precipitation and temperature takes precipitation and temperature into consideration, providing a quick response to precipitation and temperature changes. This index is defined (Wu et al., 2011) from the following quantities: HI is the Homogeneity index of precipitation and temperature; $P$ is precipitation at a certain timescale; $\bar{P}$ is mean precipitation at a certain period; $\sigma_P$ is the mean square error of precipitation; $T$ is temperature at a certain timescale; $\bar{T}$ is mean temperature at a certain period; and $\sigma_T$ is the mean square error of temperature.

The K index (K)
The K drought index takes precipitation and ET0 into consideration and is defined following Wu et al. (2012), where $K_{ij}$ is the K drought index at a certain time; $P'_{ij}$ is the relative change rate of precipitation at a certain period; $E'_{ij}$ is the relative change rate of ET0 at a certain period; $P_{ij}$ is precipitation at a certain time; $\bar{P}_i$ is mean precipitation at a certain period; $E_{ij}$ is ET0 at a certain time period; $\bar{E}_i$ is mean ET0 at a certain period; $i = 1, 2, …, n$ is the timescale (month); and $j = 1, 2, …, m$ is the station number.
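The quantities listed above can be combined as in the following sketch. It rests on an assumption that should be checked against Wu et al. (2012): that each relative change rate is the value divided by its period mean, and that K is the ratio of the precipitation rate to the ET0 rate. The function names are hypothetical.

```python
def relative_rate(value, mean):
    """Relative change rate, taken here as the value divided by its period mean."""
    if mean == 0:
        raise ValueError("period mean must be non-zero")
    return value / mean


def k_index(precip, precip_mean, et0, et0_mean):
    """K index for one time step at one station.

    ASSUMPTION: K = P' / E', with P' = P / mean(P) and E' = ET0 / mean(ET0).
    """
    e_rate = relative_rate(et0, et0_mean)
    if e_rate == 0:
        raise ValueError("ET0 relative change rate must be non-zero")
    return relative_rate(precip, precip_mean) / e_rate
```

Under this assumed form, a month with average precipitation and average ET0 gives K = 1, while below-average precipitation combined with above-average ET0 gives K < 1.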

Precipitation anomaly percentage (Pa)
Precipitation anomaly percentage reflects the degree of deviation between precipitation in a certain period and the contemporaneous mean state, defined as (Wei and Ma, 2003):
$$P_a = \frac{P - \bar{P}}{\bar{P}} \times 100\%$$
where $P$ is precipitation in a certain period; and $\bar{P}$ is contemporaneous mean precipitation, calculated as:
$$\bar{P} = \frac{1}{n} \sum_{i=1}^{n} P_i$$
where $n$ is the number of years in the record and $P_i$ is precipitation in the same period of year $i$.

Standardized precipitation evapotranspiration index (SPEI)
In the distribution fitted for SPEI, $\alpha$, $\beta$ and $\gamma$ are the scale, shape and location parameters, respectively. These parameters are calculated using the gamma function $\Gamma$ and $W_0$, $W_1$, $W_2$, the probability weighted moments of the original sequence $D_i$, where $N$ is the number of $D_i$ values and $i$ is the ordinal of $D_i$ in ascending order. SPEI is then obtained by transforming $F(x)$ into a standardized value, using the constant coefficients $c_0$, $c_1$, $c_2$, $d_1$, $d_2$, $d_3$.
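The final standardization step can be illustrated with the classical rational approximation to the inverse of the standard normal distribution. The coefficient values and the tail handling below are the ones commonly used for this approximation and are assumed here; they are not taken from the text above. The function name `standardize` is hypothetical.

```python
import math

# ASSUMPTION: classical rational-approximation coefficients; the values are
# not given in the text above.
C0, C1, C2 = 2.515517, 0.802853, 0.010328
D1, D2, D3 = 1.432788, 0.189269, 0.001308


def standardize(F):
    """Map a cumulative probability F (0 < F < 1) to a standardized index value."""
    if not 0.0 < F < 1.0:
        raise ValueError("F must lie strictly between 0 and 1")
    p = 1.0 - F                        # exceedance probability
    sign = 1.0
    if p > 0.5:                        # lower tail: mirror it and flip the sign
        p = 1.0 - p
        sign = -1.0
    w = math.sqrt(-2.0 * math.log(p))
    value = w - (C0 + C1 * w + C2 * w ** 2) / (1.0 + D1 * w + D2 * w ** 2 + D3 * w ** 3)
    return sign * value
```

With these coefficients the result stays within about 0.001 of the exact standard normal quantile, so a cumulative probability of 0.5 maps to a value close to 0 and probabilities below 0.5 map to negative (dry) values.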

Standardized precipitation index (SPI)
Because the gamma function is not defined for x=0, while a precipitation series may contain zero values, the cumulative probability H(x) is calculated with an adjustment for zero precipitation (Equation 27). In Equation 27, q is the probability of zero precipitation, while m is the number of zeros in the precipitation time series. q is estimated as m/n, and H(x) is transformed into the variable Z using an approximation, so that precipitation is normalized into a standardized normal distribution.
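The zero-precipitation adjustment can be sketched as below, assuming the mixed form commonly used for SPI, $H(x) = q + (1 - q)\,G(x)$, where $G(x)$ denotes the gamma cumulative probability fitted to the non-zero values and $n$ is taken to be the length of the series. The symbol $G$, this reading of $n$ and the function names are assumptions of this sketch.

```python
def zero_probability(precip_series):
    """Estimate q = m / n, where m is the number of zeros among n values."""
    n = len(precip_series)
    if n == 0:
        raise ValueError("precipitation series is empty")
    m = sum(1 for p in precip_series if p == 0)
    return m / n


def zero_adjusted_cdf(gamma_cdf_value, q):
    """Cumulative probability H(x) = q + (1 - q) * G(x).

    ASSUMPTION: gamma_cdf_value is G(x), the gamma cumulative probability
    fitted to the non-zero precipitation values only.
    """
    return q + (1.0 - q) * gamma_cdf_value


# Hypothetical monthly totals (mm) with two dry months, for illustration only.
series = [0.0, 12.5, 3.1, 0.0, 8.4, 20.2, 5.5, 1.0]
q = zero_probability(series)
```

With this form, a zero-precipitation month receives H = q rather than an undefined value, and H still reaches 1 as G(x) approaches 1.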

Comparison of correlation coefficients
Spearman (rank) and Pearson correlation coefficients were obtained for all stations on monthly, seasonal and annual time scales; 2-3 cases from each time scale were included in the study. As the results of the two correlation coefficients were very close to each other, the results of the Spearman correlation coefficient were included in our analysis.
Firstly, as the record length of the indices increased, the overall correlation coefficient between record lengths initially increased before decreasing. This variation followed a quadratic polynomial trend, while the overall linear trend was increasing (Figure 2), with the highest correlation coefficient being recorded at a certain record length. This result indicates that a most stable record length for monitoring drought could be identified.
In all studied stations, correlation coefficients between records were greater than 0.99, and correlation coefficients showed an increasing trend as the time scale increased.
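The two coefficients compared in this section can be computed as in the following sketch, which uses only the standard library. The function names are hypothetical; the inputs would be the index values obtained for the same time steps from two different record lengths.

```python
import math


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def ranks(values):
    """Average ranks (1-based); tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: the Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))
```

Because Spearman's coefficient depends only on the ordering of values, it equals 1 for any strictly increasing relationship, whereas Pearson's coefficient equals 1 only for a linear one; close agreement between the two suggests the relationship between record lengths is nearly linear.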

Comparison of match and non-match in the studied indices
Investigating the match and non-match of different drought classes for all time scales derived from all record periods enabled the applicability of the indices in a region to be assessed. Match and non-match were determined using the following criteria: if a drought class derived from a record of length a (30-57 years) agreed with the class derived from a record of length b (also 30-57 years), it was termed a match; otherwise it was termed a non-match. In this investigation, data for a and b were selected from 1988-2017, spanning the last 30 years. The percentage of non-match was obtained by dividing the number of "non-matches" by the sum of the number of "matches" and "non-matches". Although drought was divided into four classes, a number of extreme values were recorded for each class during this process, so the per-class analysis did not fully reflect general characteristics. We therefore mainly discuss the total non-match of the four classes, with the total percentage of non-match obtained by dividing the sum of "non-matches" of the four classes by the sum of the number of "matches" and "non-matches" of the classes. For each station, this provided a 28×28 non-match matrix for a drought index at a given time scale.
Based on our results, we identified the record length with the lowest average percentage of non-match, termed the optimal record length. We then calculated the optimal lengths at all stations and their frequencies of occurrence for all indices. Results from this analysis indicated that the majority of optimal lengths from all stations for calculating drought indices were 41-45 years, with some indices at different time scales having optimal lengths of 36-40 years and 46-50 years. However, the K, Pa and SPI indices showed relatively large differences among the frequencies of optimal length. Results also indicate that the frequency of optimal length initially increased before decreasing with an increase in record length for all indices at all time scales (Figure 5).
By calculating R from all record lengths, results indicated that the HI index had the highest frequency of max R at all record lengths, with the SPEI and Pa indices also having high frequencies (Figure 6). The lowest frequency of the highest R value was recorded by the K index, with the CZI and GEVI indices also having low frequencies. The frequencies of max R calculated at different record lengths therefore indicated that the HI, SPEI and Pa indices had better applicability for different regions; the K index had the lowest applicability. The indices having the highest R in the different climate zones are shown in Figure 6 and Table 3.
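The non-match percentage, the non-match matrix and the optimal record length described in this section can be sketched as follows. The sketch makes assumptions that the text does not settle: drought classes are given as labels aligned over the common comparison period, time steps where neither record indicates drought are skipped, and the average for each record length is taken over its comparisons with all other lengths. All names and class labels are hypothetical.

```python
def non_match_percentage(classes_a, classes_b):
    """Total non-match (%) over all drought classes for two aligned series.

    Entries are drought class labels, or None for no drought.
    ASSUMPTION: steps where neither record indicates drought are skipped.
    """
    matches = non_matches = 0
    for a, b in zip(classes_a, classes_b):
        if a is None and b is None:
            continue
        if a == b:
            matches += 1
        else:
            non_matches += 1
    total = matches + non_matches
    return 100.0 * non_matches / total if total else 0.0


def non_match_matrix(classes_by_length):
    """Non-match percentage for every pair of record lengths."""
    lengths = sorted(classes_by_length)
    return {
        a: {b: non_match_percentage(classes_by_length[a], classes_by_length[b])
            for b in lengths}
        for a in lengths
    }


def optimal_length(matrix):
    """Record length with the lowest average non-match against the others."""
    def mean_non_match(a):
        others = [v for b, v in matrix[a].items() if b != a]
        return sum(others) / len(others) if others else 0.0
    return min(matrix, key=mean_non_match)


# Hypothetical drought classes for four time steps and three record lengths.
classes = {
    30: ["mild", None, "severe", None],
    40: ["mild", None, "severe", "mild"],
    57: ["mild", None, "moderate", "mild"],
}
matrix = non_match_matrix(classes)
best = optimal_length(matrix)
```

Supplying one class series for every record length from 30 to 57 years gives 28 keys, and `non_match_matrix` then returns a 28×28 structure of the kind described above.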

Conclusions
To analyze drought events in different countries, it is important to examine long-term data spanning at least 30 years. Data collection can also vary between meteorological stations and between countries. It is therefore difficult to select a sufficiently long record for calculation; the record used may range from 30 years of data to the full period from the beginning of observations to the present. The selected record length cannot be ignored, because drought index values change with record length. In this investigation, we evaluated the sensitivity of drought indices to precipitation and evapotranspiration record lengths to identify the lengths with the lowest impact. This method can account for the selection of weak data, as well as for the applicability of different drought indices in different regions.
On the three examined time scales, the K, Pa and SPEI indices had better time stability than the other indices. As the time scale increased, the correlation coefficients of these indices also increased. These indices are very stable at different record lengths for the different climatic zones of the study region. The GEVI index had the lowest time stability, with a significant downward trend as the time scale increased, indicating that it had relatively low applicability. This indicates that indices derived only from precipitation may have lower stability than precipitation-evapotranspiration-based indices. The generalized extreme value distribution function of GEVI also had lower applicability for drought monitoring than the gamma distribution function of SPI and the Pearson Type Ⅲ distribution function of CZI. An appropriate distribution function should therefore be selected to describe regional precipitation if only precipitation-based indices are used.
In addition, we found that the majority of optimal record lengths for all stations, that is, those with the lowest non-match, were 41-45 years; for some indices at different time scales, the optimal length was 36-40 years or 46-50 years. The K, Pa and SPI indices showed relatively large differences among the frequencies of optimal length. The percentage of non-match also initially decreased before increasing as the record length increased, suggesting a kind of periodicity within the 57-year record, whereby the non-match percentage falls to a minimum at a certain record length.
As it was unknown whether analyzing the characteristics of drought indices could identify actual drought events, actual drought and no-drought recognition rates of the different indices on the seasonal time scale were calculated. Results indicated that the HI index had the highest actual drought recognition rate in almost all climate zones, followed by the Pa and SPEI indices. According to results from this study, more priority can be given to precipitation-evapotranspiration-based indices for regional drought monitoring.