Application of a hybrid ARIMA-LSTM model based on the SPEI for drought forecasting

Drought forecasting can effectively reduce the risk of drought. We proposed a hybrid model based on deep learning methods that integrates an autoregressive integrated moving average (ARIMA) model and a long short-term memory (LSTM) model to improve the accuracy of short-term drought prediction. Taking China as an example, this paper compares and analyzes the prediction accuracy of six drought prediction models, namely, ARIMA, support vector regression (SVR), LSTM, ARIMA-SVR, least square-SVR (LS-SVR), and ARIMA-LSTM, for standardized precipitation evapotranspiration index (SPEI). The performance of all the models was compared using measures of persistence, such as the Nash-Sutcliffe efficiency (NSE). The results show that all three hybrid models (ARIMA-SVR, LS-SVR, and ARIMA-LSTM) had higher prediction accuracy than the single model, for a given lead time, at different scales. The NSEs of the hybrid models for the predicted SPEI1 are 0.043, 0.168, and 0.368, respectively, and the NSEs of SPEI24 is 0.781, 0.543, and 0.93, respectively. This finding indicates that when the lead time remains unchanged, the hybrid model has high prediction accuracy for SPEI on long time scales and low prediction accuracy for SPEI on short time scales, and the prediction accuracy of the model with a 1-month lead time is higher than that of the model with a 2-month lead time. In addition, the ARIMA-LSTM model has the highest prediction accuracy at the 6-, 12-, and 24-month scales, indicating that the model is more suitable for the forecasting of long-term drought in China.


Introduction
From a global perspective, drought is one of the most serious natural disasters in the world, with the widest impact and incurring greatest economic losses (Tian et al. 2018;Miao et al. 2019Miao et al. , 2020Sun et al. 2019Sun et al. , 2020aSun et al. , 2020bGou et al. 2020;AghaKouchak et al. 2021;Fan et al. 2021). In recent years, global warming, excessive carbon emissions, and other issues have led to the continued increase of agricultural droughts, which seriously threatens China's domestic and global food production. Therefore, quantitative research on drought is conducive to improving drought monitoring, strengthening the development of drought prediction, and mitigation strategies. Generally, easy-to-calculate drought indicators are used to monitor and evaluate the drought intensity, duration, and disaster area (Zhang et al. 2019;Dai et al. 2020;Wu et al. 2020b). Due to the wide application range and variety of drought indicators and the different understanding of drought in different majors and disciplines, a variety of drought indicators have been proposed (Yao et al. 2018a;Wu et al. 2020a;Fan et al. 2021;Li et al. 2021a).
Common drought indices include the meteorologicalrelated Palmer drought severity index (PDSI) (Fung et al. 2020;Han et al. 2021;Jiang et al. 2021;Li et al. 2021b), standardized precipitation index (SPI) (Mckee et al. 1993;Omer et al. 2020;Huang et al. 2021), standardized precipitation evapotranspiration index (SPEI) (Vicente-Serrano et al. 2010;AghaKouchak et al. 2021;Jiang et al. 2021), soil water content-related soil moisture anomaly index (SMAI) (Gao et al. 2016), evapotranspiration deficit index (ETDI) (Narasimhan and Srinivasan 2005), hydrological-related Palmer hydrological drought index (PHDI) (Tatli and Türkeş 2011), and standardized water-level index (SWI) (Bhuiyan et al. 2006). In addition, the remote sensing-based normalized differential vegetation index (NDVI) (Gu et al. 2007) is also frequently used. Both the PDSI and SPEI require temperature and precipitation data, and the SMAI and ETDI are more complicated to calculate than the SPI. The SWI only considers groundwater, the NDVI requires long-term data, and satellite data have no long-term history. SPI is multiscale and computationally simple because rainfall is the only input data (Chen and Sun 2015;Wang et al. 2015Wang et al. , 2016. The SPEI is based on the SPI but adds a temperature component that allows the SPEI to account for the effect of temperature on the development of drought through basic water balance calculations (Soh et al. 2018;Yao et al. 2018b). Li et al. (2020) elucidated drought characteristics of China during 1980-2015 using two commonly used meteorological drought indices: SPI and SPEI. Given the fact that potential evapotranspiration increases in a warming climate, Li et al. (2020) show that SPEI may be more suitable than SPI in monitoring drought under climate change. Therefore, the SPEI was chosen to monitor drought in this study.
In recent years, data-driven models have been commonly used for drought forecasting (Rafiei-Sardooi et al. 2018;Fung et al. 2019b;Omer et al. 2020;Dikshit et al. 2021a;Ko et al. 2021;Zhu et al. 2021). There are many models used to forecast drought, such as the ARIMA model Desai 2005, 2006;Mishra et al. 2007;Han et al. 2010), which is the most widely adopted stochastic approach. Tian et al. (2016) used the ARIMA model to predict the agricultural drought in the Guanzhong Plain, mainly on the time series of the VTCI drought monitoring results; the results showed that the absolute error of the AR(1) model was lower than that of the seasonal ARIMA (SARIMA) model, both in frequency distribution and statistical results. It shows that the ARIMA model can better predict the type and extent of drought and can be used for drought prediction in the plains. However, the author did not conduct the analysis and prediction from multiple time scales, so the results obtained were limited and the model was not necessarily applicable to all time scales. And this paper employs multiple time scales in addressing this problem to compensate for the previous. Mossad and Alazba (2015) developed several ARIMA models for drought forecasting using multiscale SPEI in hyperarid climates, and showed that ARIMA models can be very useful tools for drought forecasting that can help water resource managers and planners to take precautions considering the severity of drought in advance. However, stochastic models are linear models with limited ability to predict nonlinear data; therefore, this paper combines the advantages of linear and nonlinear models for prediction. In recent years, ANN models have been used by researchers to improve the prediction accuracy of nonlinear data, as traditional data-driven models do not predict nonlinear data well (Sigaroodi et al. 2013;Djerbouai and Souag-Gamane 2016;Rezaeianzadeh et al. 2016;Kousari et al. 2017;Seibert et al. 2017;Vidyarthi and Jain 2020). ANNs have been used as drought prediction tools in many studies (Belayneh and Adamowski 2012;Deo and Şahin 2015;Belayneh et al. 2016;Borji et al. 2016;Chen et al. 2017;Seibert et al. 2017) and have achieved good results. There are many kinds of ANNs, including recurrent neural networks (RNNs) and long short-term memory (LSTM) networks (Poornima and Pushpalatha 2019). RNNs and traditional ANNs have been compared in terms of time series model prediction, and it has been found that RNNs achieve a higher prediction accuracy than traditional ANNs. RNNs have shown significant effects on the forecasting of data sequences. However, when the sequence length is too large, there is a dramatic increase in the training time of the RNN. Based on the above problems, the LSTM network (Hochreiter 1997) is proposed, which improves the hidden layer of the RNN, expands the memory function of the network, enables the model to obtain more persistent information, and slows down the information decay rate (Tran Anh et al. 2019). LSTM is a kind of deep learning method and has made great progress in time series prediction (Zhang et al. 2018;Jeong and Park 2019;Qi et al. 2019;Tran Anh et al. 2019;Dikshit et al. 2021a, b;Ko et al. 2021;Sarker et al. 2021). However, it is difficult to use LSTM architecture in short-term time series prediction or with meteorological drought indices. Poornima and Pushpalatha (2019) used LSTM to predict the multitime scale SPI and SPEI and found that the model could better process real-time nonlinear data and achieve good long-term prediction. However, because the prediction accuracy is not high or cannot improve the short-term prediction accuracy, a variety of hybrid models are used to predict the drought index. This paper uses a linear model combined with a nonlinear model to greatly improve the prediction accuracy.
The objective of this paper is to propose a hybrid model with the advantages of a linear and nonlinear models to improve the short-term prediction accuracy. The aim of this paper is to use the 6 models to predict the multiscale SPEI computed from 613 sites in China, and to explore the prediction accuracy of different models at each scale with the help of evaluation and validation metrics. Because the kriging interpolation method is commonly used in drought index analyses (Manatsa et al. 2008;Karavitis et al. 2012;Afzali et al. 2016;Jain and Flannigan 2017;Cai et al. 2019), this paper will describe the use of the improved kriging interpolation method, empirical Bayesian kriging (EBK), in ArcGIS to conduct a visual analysis of SPEI12 observed values and fitted values of the three hybrid models from 2014 to 2015. The performance of all the models is compared using measures of persistence, namely, the mean squared error (MSE), NSE, RMSE, and MAE.

Study area and data
China has a descending 3-terrace topography from the west to the east with much diversified topography (Fig. 1). Considering the multi-year average temperature, precipitation, and moisture conditions, China is divided into seven natural sub-regions (Yao et al. 2018b), which include the temperate and warm-temperate desert of northwest China (sub-region 1), the temperate grassland of Inner Mongolia (sub-region 2), the temperate humid and sub-humid northeast China (sub-region 3), the warm-temperate humid and sub-humid north China (sub-region 4), the subtropical humid central and south China (sub-region 5), the Qinghai-Tibetan Plateau (sub-region 6), and the tropic humid south China (sub-region 7). The original data were derived from the National Meteorological Information Center (http://data.cma.cn).

SPEI
In 2009, the World Meteorological Organization (WMO) recommended the SPEI as the main meteorological drought index, which countries can use to monitor and track drought conditions (Hayes et al. 1996). By defining the SPEI as a widely used index, the WMO provides guidance to countries seeking to establish early warning levels for droughts. The probability density function (PDF) that has been used for the SPEI is the log-logistic function, as suggested by Mckee et al. (1993) and Vicente-Serrano et al. (2010), respectively. Since both Yao et al. (2018b) and Li et al. (2020) used the SPEI to monitor the drought situation in China, the log-logistic PDFs are available for meteorological sites in China.
The Penman-Monteith (PM) method has been adopted by the International Commission on Irrigation and Drainage (ICID) as the standard procedure for computing PET (Yao et al. 2018b). SPEI is calculated using precipitation and PET series for each station, each sub-region. The computation procedure of SPEI is as follows: P is the probability of a definite D value: For P ≤0.5, For P >0.5, where c 0 =2.515517, c 1 =0.802853, c 2 =0.010328,-d1=1.432788, d 2 =0.189269, d 3 =0.001308. The classification of dry and wet spells resulting from the values of the SPEI is shown in Table 1.

ARIMA modeling steps
ARIMA is a linear model that has been widely used in drought prediction in recent years (Kisi et al. 2015;Choubin and Malekian 2017). The formula for ARIMA and the model ordering process can be found in detail in Xu et al. (2020). ARIMA models were developed based on the Box and Jenkins approach. The formula for a general non-seasonal ARIMA model can be abbreviated as follows : And where Z t is the observed time series and B is a back shift operator. ϕB and θB are polynomials of orders p and q, respectively. ∇ d describes the differencing operation to data series to make the data series stationary and d is the number of regular differencing.
In this paper, ACF and PACF are used to order the model, and AIC, BIC, and HQIC parametrics were combined to select the optimal model parameters. Finally, 80% of the data were selected as the training set and 20% as the test set using a cross-validation approach. The results of the SPEI with 1 month of lead time are shown in Fig. 2 and Fig. 3, respectively, and the results of the 1-to-2-month lead time are shown in Table 4 and  Table 5.

LS-SVR model
Support vector machine regression is a support vector machine algorithm used to solve regression problems, which is an extension and application of support vector machines to regression estimation problems. If we extend the conclusions obtained by the support vector machine for classification problems to the regression function, it becomes support vector regression, which is commonly used for time series prediction, nonlinear modeling and prediction, and optimization control (Modaresi et al. 2018). Details and formulas about the SVR can be seen in the article by Xu et al. (2020).
LS-SVMR, or least squares support vector regression, was proposed by J.A.K Suykens (Suykens et al. 2002) as an improvement to the SVM to facilitate its solution. Compared with the standard SVM, LS-SVMR replaces the inequality constraint in the SVM with an equation constraint, and the solution process becomes a set of equations, which avoids the time-consuming quadratic planning problem of solution and speeds up the solution process relatively fast. Details and formulas about the SVR can be seen in the article by Suykens et al. (2002).
The LS-SVR modeling steps are as follows: First of all, the data is standardized processing, because this article used radial basis function (RBF), so to calculate the distance of two points in high-dimensional space, it is necessary to standardize the data first, in order to avoid the effect of the scale on the calculation distance. Secondly, set the parameters σ and γ, where σ represents the degree of dispersion of the point distribution in high-dimensional space, γ balances the two items of the target function, and this paper uses crossvalidation to give the appropriate σ value and γ value.

LSTM model
LSTM is a variant of the RNN network. Compared to traditional RNNs, its hidden layer module structure has been greatly redesigned. Although RNNs can effectively handle nonlinear time series, there are still two problems: (1) Due to gradient vanishing and gradient explosion, RNNs cannot handle time series with excessive delays; (2) training the RNN model requires a predetermined delay window length, but it is difficult to automatically obtain the optimal value of this parameter. Thus, the LSTM model emerged. The LSTM model replaces the hidden layer of RNN cells with LSTM cells to achieve long-term memory. The special design of the LSTM model makes it able to learn long-term dependencies and is particularly suitable for time series analyses.
The hidden layer of the LSTM module is also called a memory module. The specific structure is shown in Fig. 4. For the sake of understanding, memory modules are often compared to computer memory. The memory module consists of a storage unit and three computing components. The three computing components are called input gates, output gates, and forget gates, which control the reading, writing, and resetting of memory cell data, respectively. As seen from the figure, LSTM is a fourlayer structure, and each structure is connected to and interacts with the others. The yellow part of the figure shows the various activation functions inside the memory module, and the red part represents the basic operation of the vector, for example, vector addition and product operations. The arrows represent the direction in which the vector is transmitted in the network. The confluence of two solid lines in the figure indicates the vector combination of two parts. A solid line is divided into multiple lines to indicate that the vector is copied and then flows to different parts of the network.
The calculation of the LSTM hidden layer is shown in Fig.  5, and its forward calculation method can be expressed as follows: where i, f, C, and o denote the input gate, forget gate, the cell state vector, and output gate, respectively. W denotes the matrix of weights, b denotes the bias vectors, and U denotes the matrix of weights to the hidden; σ and tanh are sigmoid and hyperbolic tangent activation functions, respectively. The elementwise multiplication of two vectors is denoted by ⊗.
In this paper, multitime scale SPEI values were used in the prediction because the data are a time series related to the previous sequence value. Since the number of hidden layers determines the model's fitting ability, to prevent overfitting, this paper proposes an early stopping technique, which stops training when the loss function no longer drops. The loss function is defined below: where y i is the measured value at time i and b y i is the predicted value at time i.
The LSTM modeling steps are as follows:

Preprocessing of the input data
The original time series should not be directly input into the model as input data. To obtain better training results, the original time series needs to be preprocessed. Usually, the data processed by a neural network model are normalized data because the training effect of the model is affected by the amount of the original data. The normalized data limit the data to a range of [−1, 1], which can eliminate the impact on the neural network due to differences in the dimensions of each dataset, facilitate the analysis of data, and improve the training speed of the model.

Model construction
The LSTM network in this paper was built based on the Keras framework of the Python 3.7 platform. Therefore, before modeling the training data, the methods and parameters need to be configured in the LSTM model based on the Keras framework. The hidden layer of the LSTM in this paper was composed of 20 storage units, the number of iterations was 300, the batch size was set to 1, the activation function was set to ReLU, the loss function was the MSE, and the optimizer was stochastic gradient descent (SGD). However, the number of hidden layers directly determines the quality of the model's fitting ability. To avoid overfitting, the MSE is adopted to describe and select the performance of the model corresponding to the number of hidden layers of the LSTM network.

Fitting model and parameter tuning
The golden section method was used in this paper to select the optimal number of hidden neurons. The golden section method involves first finding the ideal number of hidden layer nodes in the interval [a, b], expanding the search interval according to the golden section principle, that is, obtaining the interval [b, c] (where b=0.619*(ca) + a), and searching for the best value in [b, c]. The batch size is the number of samples of model weight updates. The weight was updated after each sample. Therefore, the batch size was set to 1, and the process is called random gradient descent. Common activation functions are sigmoid (σ), hyperbolic tangent (tanh), and rectified linear unit (ReLU) (Weiss et al. 2018;Vijayaprabakaran and Sathiyamurthy 2020;Ko et al. 2021). Because the timebased back propagation algorithm was used for network training, the sigmoid function is prone to gradient vanishing during backpropagation and was unable to complete deep network training; the SGD obtained by ReLU converged much faster than that obtained by the sigmoid and tanh activation functions, so ReLU was used as the activation function in this paper. The number of iterations was set as high as possible. The early stopping method was used in this article to prevent overfitting. Since the number of training cycles made the process very timeconsuming, the loss function is set to check the performance of the model on the training and verification dataset. Once overlearning starts, training stops, so early stopping was used to suppress overfitting; that is, after each epoch, the test results were obtained on the verification set. When the error increases, the training was stopped, and the weight after stopping was taken as the final parameter of the network.

Training output and anti-normalization
The training data were programmed into the network training model, and when the number of iterations exceeded the set threshold, the training ended. The output value of the network was not the final forecast result of the time series. The actual output of the model could be obtained by performing reverse normalization on the output value of the network. Usually, this method is called inverse normalization.

Hybrid ARIMA-LSTM model
In the time series forecasting problem, the characteristics of the linear model and the nonlinear model determine that the former can only recognize the linear pattern of the time series, and the latter has the advantage of being able to mine the nonlinear relationship of the time series. A large number of experiments and applications have shown that using a single model can have a good effect when dealing with a single time series of components (Choubin et al. 2016;Soh et al. 2018;Fung et al. 2019a, c). However, in the face of complex problems, a single model had certain limitations. Because the time series of the research object contains linear components and nonlinear components, a single linear model or a single nonlinear model is not suitable for use.
A hybrid model was established using the advantages of both the ARIMA and LSTM models, as same as hybrid ARIMA-SVR model ). The formulas are as follows: where Y t is a combination of linear (L t ) and nonlinear (N t ) parts.
The modeling flow chart is shown in Fig. 6.

Evaluation indices
In this study, MSE, RMSE, NSE (Nash JE 1970), and MAE were used as indicators of model evaluation by the following formulas (Belayneh et al. 2016; Deo and Şahin 2016): where the SSE is the sum of squared errors and N is the number of samples. The SSE is given by: with the variables already having been defined.
where y i is the observed value at time i (i = 1, ⋯, N), y is the mean value taken over N, N is the total data size of y i , (i = 1, ⋯, N), and b y i is the forecasted value at time i.

Results and discussion
Calculation results of SPEI The matplotlib visualization library in Python 3.6 was used to visualize the multiscale SPEI calculation results, and the calculation steps are shown in the "SPEI" section. The results of the multiscale SPEI calculations in the 7 regions are shown in Fig. 7. As can be seen from Fig. 7, the SPEI of the 7 sub-regions in China is on the rise, especially the severity of the drought in sub-regions 2, 4, and 6. But on the 12-month scale, the drought in regions 1, 2, and 6 is relatively serious in the recent 5 year.

ARIMA, SVR, and LSTM model predicts results
In this paper, the ADF test was conducted in the 7 regions of China, and the P values of the multitime scale SPEI were all less than 0.05 (Table 2), so it was judged that SPEI1, SPEI3, SPEI6, SPEI12, and SPEI24 are stationary time series. The optimal ARIMA parameters of the multiscale SPEI of the 7 sub-regions are selected through the AIC, BIC, and HQIC (as shown in Table 3). As the drought situation in sub-region 1 is more serious than that in other regions, sub-region 1 is selected as an example for time series prediction. The prediction results in other regions can be seen in Fig.  9. The results of the 1-month lead time of the SPEI are shown in Fig. 3, and the results of the 1-to-2-month lead time are shown in Table 4 and Table 5. As seen from Fig. 3, the prediction accuracy of the ARIMA model is the highest for SPEI24 and the lowest for SPEI1. As the time scale increases, the prediction accuracy gradually increases because the ARIMA model is essentially an overall linear autoregressive model that predicts a trend that tends to stabilize as the test set grows (Yurekli et al. 2005;Hu et al. 2007). This is the same conclusion as that of Hossain Bagmar and Khudri (2020). The use of ARIMA to predict drought indices is also found by Karimi et al. (2019) and Chen et al. (2020); they found that ARIMA, a linear model, had some advantages in predicting drought indices. In this paper, monthly precipitation and temperature data from 613 national meteorological stations from 1980 to 2019 are used to conduct LSTM and SVR modeling with multitime scale SPEI values. Due to the large number of regions, only the same regions by the ARIMA model were selected for result presentation for comparative study, and the same dataset predicted by the ARIMA model after cross-validation was also selected. The parameters of the LSTM model are shown in Fig. 8.
It can be seen from Fig. 2 that the prediction results of the SVR and LSTM models have the lowest accuracy  at SPEI1 and the highest accuracy at SPEI24, and the prediction accuracy increases gradually with increasing time scale. And we found that the prediction accuracy of SVR is higher than that of the ARIMA model at SPEI1 and SPEI3, but it is lower than that of the ARIMA model at SPEI6, SPEI12, and SPEI24. This shows that SVR is more suitable for long-term drought prediction than the ARIMA model. This is the same conclusion as that of Poornima and Pushpalatha (2019) and LSTM has recently been used in many applications, including drought prediction (Dikshit et al. 2021b;Mokhtar et al. 2021), weather prediction (Salman et al. 2018), and water table depth prediction (Zhang et al. 2018), and has achieved good results because it can describe nonlinear relationships. Figure 8 shows that the box diagram of the LSTM model predicting the number of hidden layer neurons of the SPEI at various scales shows that with the increase in the number of hidden layers, the MSE gradually decreases, indicating that the more hidden layers there are, the higher the prediction accuracy of the LSTM. However, to prevent overfitting, this paper adopts a regularization method, namely, the early stopping method (described in the "LS-SVR model" section). At present, most studies on LSTM prediction models do not explain the problems of LSTM hidden layers and the prediction accuracy or the relationship between them.
Hybrid ARIMA-SVR, LS-SVR, and ARIMA-LSTM model prediction results The flow of the ARIMA, SVR, and LSTM modeling processes is described in the "ARIMA modeling steps," "LS-SVR model," and "LSTM model" sections, respectively, and the SPEI values were predicted separately. This paper proposes the ARIMA-LSTM hybrid model. Specifically, the ARIMA model was used to extract the linear features of the multitime scale SPEI. The residual between the predicted value and actual value of the ARIMA model was entered into the LSTM model for prediction, and the advantages of the deep learning method for time series were used to extract the nonlinear features. The linear part and nonlinear part were combined to obtain the prediction results of the hybrid ARIMA-LSTM model, and the flow chart is shown in Fig. 6. The results are shown in Fig. 3. At the same time, four kinds of experimental evaluation indices were selected in this paper. The hybrid ARIMA-LSTM model proposed in this paper was compared with the ARIMA, SVR, LSTM, ARIMA-SVR, and LS-SVR models. The feasibility of the hybrid ARIMA-LSTM model for SPEI forecasting was verified. The effectiveness is shown in Table 4 and Table 5. Figures 2 and 3 show that the predicted accuracy of the SPEI of the hybrid ARIMA-LSTM model at all scales is higher than that of the single ARIMA, SVR, and LSTM and hybrid ARIMA-SVR and LS-SVR models. In recent years, several scholars used hybrid models to forecast drought index for drought prediction, and found that hybrid models, especially the combined models of linear and nonlinear, have better results than single models (Ali et al. 2019;Das et al. 2020;Khan et al. 2020; Mohammed Salisu and Shabri 2020). Khan With the increase in the time scale, the precision of the hybrid ARIMA-LSTM model increases gradually because the precision values of the single models increase gradually. Xu et al. (2020) combined the ARIMA model and SVR model and predicted multiscale SPI by using a hybrid model and a single ARIMA model. This paper is similar in that the hybrid linear model and nonlinear model are used for multiscale prediction, and the difference is that this paper is mainly aimed at improving the short-term prediction accuracy and avoiding model overfitting by using a regularization method, so it is more applicable.
It can be seen from Fig. 7 that the severe drought in the past 10 years in the 7 sub-regions of China occurred in 2014 and 2015. The EBK method was used to demonstrate the spatial distribution of the model prediction results of SPEI12 in China in 2014-2015, as shown in Fig. 9. It was found that the prediction accuracy of the three hybrid models (ARIMA-SVR, LS-SVR, and ARIMA-LSTM) is similar in spatial distribution. It can be seen from Fig. 3 and Table 5 that the prediction accuracy of SPEI6, SPEI12, and SPEI24 of the ARIMA-LSTM model is highest, indicating that the ARIMA-LSTM model is suitable for long-term drought prediction. The LSTM itself has a unique "gate" (in the "LSTM model" section) that allows it to remember longer sequences, and the accuracy of the LSTM in predicting SPEI1 and SPEI3 is higher than that of the SVR model, indicating that the LSTM model is better than the SVR model in capturing nonlinear information. And the ARIMA model was suitable for predicting SPEI on longer time scales, so the hybrid model ARIMA-LSTM has the highest accuracy, especially for SPEI24.

Conclusions
In this paper, the six prediction models (ARIMA, SVR, LSTM, ARIMA-SVR, LS-SVR, ARIMA-LSTM) are selected based on their advantages of predicting multiscale SPEI in the 7 sub-regions of China. Four evaluation indices were selected to evaluate the prediction accuracy of the three models in the 1-2-month lead time, and the spatial distribution of the prediction results of the three models was displayed in combination with the EBK method.
Based on the key statistical parameters, the performance of the three hybrid models (ARIMA-SVR, LS-SVR, ARIMA-LSTM) increases gradually with increasing time Fig. 6 Hybrid ARIMA-LSTM model flow chart Fig. 7 The variations of SPEI at the 1-, 3-, 6-, 12-, and 24-month timescales over 1980-2013 in the 7 sub-regions of China scale. And the prediction accuracy of the three hybrid models is higher than that of the three single models (ARIMA, SVR, LSTM). According to Table 4, we can see that the prediction accuracy of the hybrid ARIMA-LSTM model is the highest at each timescale, especially at 6, 12, and 24 months. When the lead time was 1 month, the prediction accuracy of the hybrid ARIMA-LSTM model was lowest at SPEI1 and highest at SPI24 and SPEI24. When the prediction area remains unchanged, we find that the prediction accuracy of the three hybrid models decreases at all time scales with the increases in the prediction advance period; that is, the prediction accuracy of the model with a 1-month lead time is higher than that of the model with a 2-month lead time. From the spatial distribution diagram of the prediction model, the hybrid ARIMA-LSTM model has higher prediction accuracy than the other models, indicating that the hybrid model can contribute to improving the short-term prediction accuracy and more suitable of long-term scale drought in China of meteorological drought.