NDVI Forecasting Model Based on the Combination of Time Series Decomposition and CNN – LSTM

Normalized difference vegetation index (NDVI) is the most widely used factor in the growth status of vegetation, and improving the prediction of NDVI is crucial to the advancement of regional ecology. In this study, a novel NDVI forecasting model was developed by combining time series decomposition (TSD), convolutional neural networks (CNN) and long short-term memory (LSTM). Two forecasting models of climatic factors and four NDVI forecasting models were developed to validate the performance of the TSD-CNN-LSTM model and investigate the NDVI's response to climatic factors. Results indicate that the TSD-CNN-LSTM model has the best prediction performance across all series, with the RMSE, NSE and MAE of NDVI prediction being 0.0573, 0.9617 and 0.0447, respectively. Furthermore, the TP-N (Temperature & Precipitation-NDVI) model has a greater effect than the T-N (Temperature-NDVI) and P-N (Precipitation-NDVI) models, according to the climatic factors-based NDVI forecasting model. Based on the results of the correlation analysis, it can be concluded that changes in NDVI are driven by a combination of temperature and precipitation, with temperature playing the most significant role. The preceding findings serve as a helpful reference and guide for studying vegetation growth in response to climate changes.


Introduction
Vegetation growth status as a critical indicator of regional ecosystem evaluation is affected by several factors, particularly in arid and semiarid regions (Jaber 2019;Herrmann et al. 2005;Liu et al. 2016). The response between vegetation and climatic factors is close, and climate change frequently causes differential changes in surface vegetation's temporal and spatial distribution (Jong et al. 2012;Pepin et al. 2015). According to previous research, temperature and precipitation are significant factors in arid and semi-arid areas (Na et al. 2021;Parizi et al. 2020;Cañón et al. 2011). Normalized difference vegetation index (NDVI), vegetation condition index (VCI), and vegetation health index (VHI) are currently the most widely used indices for vegetation drought studies (Alamdarloo et al. 2018). Where VCI and VHI are the new indices derived from NDVI that are used to study the effect of weather, moisture and temperature on vegetation (Kogan 1990;Pei et al. 2018;Tran et al. 2017). Meanwhile, NDVI is more commonly used as the preferred indicator of vegetation growth status in the study of vegetation growth in response climate changes (Bianchi et al. 2019;Wu et al. 2019).
In recent years, time series analysis has proliferated and plays a vital role in numerous application fields. Autoregressive Moving Average (ARMA) (Zarei and Mahmoudi 2020a), support vector regression (SVR) (Aghelpour et al. 2019), extreme learning machine (ELM) (Kuila et al. 2022), and artificial neural network (ANN) (Hu et al. 2021;Panwar et al. 2021) are popularly employed models. With the rapid advancement of artificial intelligence, convolutional neural network (CNN) and recurrent neural network (RNN) have become the most widely used deep neural networks, each with their advantages (Pérez-Alarcón et al. 2022). Due to its ability to efficiently extract input features, 1D CNN has become one of the most popular time series prediction techniques (Qiao et al. 2021;Qayyum et al. 2022). Long Short-Term Memory (LSTM) as a variant of RNN, is widely used in time series forecasting tasks due to its superior nonlinear fitting capability and low computational load (Tulensalo et al. 2020). However, there have been fewer studies on NDVI forecasting in the past two decades. Wang et al. (2015) simulated NDVI based on climatic factors using multiple linear regression. Huang et al. (2017) proposed an ANN and SVM-based combination forecasting model to improve the performance of NDVI prediction in the Yellow River Basin. Wu et al. (2019) utilized a time-delay neural network to predict NDVI in arid and semiarid grassland. Although the preceding model yielded superior results, these models are not suited for solving complex nonlinear problems and cannot mine internal time series features.
Detailed analysis of the internal characteristics of time series helps capture the precise variation trend, as seasonal and trend variables positively influence the model's prediction outcomes (Liu et al. 2020;Zarei and Mahmoudi 2020b;Zhang and Qi 2005). To extract seasonal and trend features from the original data, this paper introduces the X-12-Autoregressive integrated moving average (X-12-ARIMA) model (Hou 2021;Guo et al. 2018). Compared to other decomposition methods such as Seasonal-ARIMA (Parviz 2020), X-11-ARIMA (Pfeffermann and Sverchkov 2014), seasonal-trend decomposition procedures based on loess (STL) (Jiao et al. 2021), and empirical mode decomposition (EMD) (Malik et al. 2022), the X-12-ARIMA model improves the pre-processing based on the X-11-ARIMA model and is more extensively used in the seasonal adjustment analysis of time series (Su et al. 2021). Meanwhile, since deep learning relies primarily on historical data to recover the changing time series pattern to predict future variables, a reasonable selection of historical data can improve the model's prediction accuracy. Based on time series decomposition, R/S analysis was developed to determine the long-term correlation and the average cycle of trend components and to provide a rationale for selecting historical data Nikolopoulos et al. 2021).
For complex nonlinear processes caused by multiple elements, direct analysis of the original can't adequately reflect the patterns formed by the superposition of various features. On the other hand, time series decomposition of non-stationary NDVI series can effectively extract periodic and trend information caused by the deterministic factors and fluctuation information caused by random factors. CNN-LSTM eliminates the effects of random perturbations through filtering and feature extraction from TSD, allowing effective information to be transmitted downward and fitted. Based on TSD-CNN-LSTM, this study established two forecasting models of climatic factors (temperature and precipitation) and four forecasting models of NDVI. The primary goals of this paper are as follows: (1) validate the positive effect of internal feature analysis of time series on model's performance.
(2) validate the universality of the TSD-CNN-LSTM algorithm for all series. (3) discuss the effect of temperature and precipitation on the NDVI.

Study Area
Ningtiaota coal mining region is located in the middle of Shenmu County, Shaanxi Province. The construction of the coal mine began in September 2005, the underground production equipment was put into operation in March 2009, and the first phase of mining in 2013. Most the region's landform units are wind-sand, loess hilly, and gully areas with a typical mid-temperate semiarid continental climate characterized by a cold winter, windy spring, hot summer, cool autumn, varying hot and cold seasons, a significant temperature difference between day and night, drought, little rainfall, and high evaporation. A thick layer of sand covers the area's surface, making it suitable for precipitation infiltration and conducive to groundwater recharge. Precipitation is most concentrated in July, August, and September, and precipitation is the primary source of recharge to the unconfined aquifer. The range of the study area is 110°10′30″E ~ 110°13′20″E, 38°57′20″N ~ 39°0′30″N, with a size of approximately 20 square kilometers (Fig. 1).

Data Processing
Landsat-5, landsat-7 and landsat-8 remote sensing data with a spatial resolution of 30 m × 30 m were downloaded from the United States Geological Survey (USGS) website. The National Earth System Science Data Center, National Science and Technology Infrastructure of China (http:// www. geoda ta. cn). makes available climate data sets that include monthly temperature and cumulative precipitation with a global resolution of 2.5'. This study utilized monthly average temperature and precipitation data from 1961 to 2018 and monthly NDVI data from 2000 ~ 2018. NDVI values were calculated from GEE cloud platform, and finally get the long time series (Fig. 2 shows the variation curves from 2000 to 2018 of NDVI, temperature and precipitation series, which have high consistency). The calculation process is as follows: (1) where NIR is the reflection value of the near-infrared channel, Red is the reflection value of the visible channel, n is the number of pixels of remote sensing image of the study area; vi represents the different pixels, NDVI vi represents the value of each pixel, and NDVI mean represents the mean value of all pixels.

X-12-ARIMA Algorithm
In 1998, Findley et al. added the pre-adjustment module to the X-11 method and introduced the X-12-ARIMA algorithm and accompanying procedures, which are widely used in the seasonal adjustment analysis of socioeconomic and other time series (Guo et al. 2018;Findley et al. 1998). In this study, the temperature, precipitation and NDVI series exhibit evident seasonal characteristics and an annual cycle, whereas the trend changes are not readily apparent. For time series decomposition, the additive model with three components was used (Su et al. 2021): where, Y t represents the original series, T t , S t and R t represent the trend components, seasonal components and residual components of time series at time t , respectively.

Rescaled Range Analysis
The time window of the input feature is a crucial parameter for network training because it reflects each subsequence feature's influence over the past months on the feature of the + 1 month. If the time window is too large, it is possible to extract all pertinent Location of the study area information from the time series, but it also introduces more random noise. On the other hand, if the time window is too small, the correlation between the response variables and the historical feature cannot be fully revealed. Therefore, R/S analysis is frequently used to describe the long-term correlation and average cycles of time series (Kalinin et al. 2018). In this study, R/S analysis is primarily used to determine the average cycles of the trend components and as a calculation basis for selecting time series feature windows He et al. 2015).

CNN
Generally, 1D Convolutional Neural Networks (CNN) are employed for series processing (Barzegar et al. 2020). The typical structure of a CNN model primarily includes convolutional layer to extract the local data features; different convolution kernels are equivalent to different feature extractors. Batch Normalization makes the data pass down more effectively, while the pooling layer performs feature selection; the fully connected output layer is the final layer. Figure 3a illustrates the input time series of 1D CNN as an n*3 matrix; the red box represents the convolution size of a filter is 3*3, and the 128 filters are convolved from top to bottom with a stride of 1. The extracted feature dimension after filter convolution is a (n-3 + 1)*3 matrix C 1 ~ C n-2 , eventually obtaining a 128*1 matrix after feature selection of the pooling process. The dimension of the feature is related to the input data dimension, the size of filter, and the convolution step.

LSTM
Long Short-Term Memory (LSTM) neural network can effectively solve the gradient explosion and gradient disappearance issues that plague simple recurrent neural networks (Roy 2021;Rajesh et al. 2022). In Fig. 3b, c t is the internal cell status introduced by the LSTM network for linear cyclic information transfer, which also outputs information nonlinearly to the external status h t of the hidden layer. c t is the candidate cell status obtained by the nonlinear function. The forget gate f t controls how much information of internal cell status c t−1 must be forgotten at the previous moment: The input gate i t controls how much information must be saved for the candidate cell status , where W and U represent the weight, b is the bias weight, and is the sigmoid function.

TSD-CNN-LSTM
CNN and LSTM models rely heavily on the historical information of the time window to restore the change rule of time series over time to predict the future response variables. TSD can extract the component features of the original time series using CNN to re-extract and filter the component features, which are then send to the LSTM network and output by the fully connected layer. Figure 3c depicts the structure and the calculation process of TSD-CNN-LSTM: 1. The original time series was decomposed into a group of feature vectors that included residual components, seasonal components, and trend components: {R N , S N , T N }. 2. The mapping group of input feature of the convolution layer is a two-dimensional tensor F ∈ ℝ ×3 , where is the width of the mapping window of the input features of CNN.
To calculate the feature map y p i of the output of the ith sample, the input feature maps F 1 i , F 2 i and F 3 i are convolutionally filtered with convolution kernels W p,d .
where T i− , S i− and R i− represent the trend, seasonal and residual components of the past months at the time corresponding to the ith sample, respectively; ⊗ represents the crosscorrelation operation (the convolution without flipping); p = 1, … , P represents the number of convolution kernels. b p is the bias weight. f (⋅) is the Relu function, which can solve the gradient disappearance problem and has a faster computational speed and convergence rate than tanh function and the sigmoid function, defined as Eq. (6): max(0, 1) 3. y p i is resampled by the pooling layer and output ỹ p after the convolution operation. This study utilizes average pooling for feature selection. 4. ỹ p enters the LSTM network to fit the time series relationship and outputs the predicted value from the fully connected layer. Dropout is an effective method to reduce model overfitting. During model training, neurons are removed from the neural network with a predetermined probability and reconnected when used. The mean square error is the loss function of parameter updating in a neural network.
where x i is the measured value corresponding to the ith training sample,x p is the predicted value of the model, and n is the number of training samples. where n is the number of all test samples; x i is the ith measured value; x pi is the ith predicted value, and x i is the mean of all test samples.

Time Series Decomposition
Temperature, precipitation, and NDVI series of the study area exhibit a clear annual cycle of seasonal characteristics. Figure 4 depicts the decomposition of temperature, precipitation and NDVI series into a seasonal component with an annual cycle, a trend component reflecting the change trend, and a residual component reflecting random fluctuations.
According to the results of time series decomposition, the fluctuation ranges of residual component of the temperature series, precipitation series, and NDVI series are -3 ~ 3, -60 ~ 100 and -0.4 ~ 0.4, respectively, while the fluctuation ranges of the trend component are 5 ~ 10, -20 ~ 100 and 0.3 ~ 0.8, respectively. In summary, the trend component of temperature series exhibits more constant fluctuations than the precipitation and NDVI series.

Feature Window
Figure 5a-c show that the Hurst exponent of the trend components were H t = 0.95 > 0.5 , H p = 0.59 > 0.5 , and H NDVI = 0.84 > 0.5 , respectively. The results also indicate that alltime series have long-term correlations, which reflect exceptional memory in time series prediction. Figure 5d-f indicate that the average cycles of the trend components of temperature, precipitation and NDVI were obtained as T t = 33, T p = 48 , and T NDVI = 16 , respectively. Since the seasonal cycles of temperature, precipitation and NDVI are all 12 months, which are entirely contained in the cycles of the trend components. Therefore, the feature windows of temperature, precipitation, and NDVI for network training are determined as t = 33 , p = 48 , NDVI = 16 , respectively.

Parameter Calibration
Depending on the size of the feature window, six models were established with an approximate 8:2 ratio of calibration sets to validation sets (Table 1 displays Table 1 provides that the input size of T-T model is a 33*3 matrix, while that of the P-P model, the T-N model, the P-N model, the TP-N model and the N-N model is a 48*3 matrix, 33*3 matrix, 48*3 matrix, 48*6 matrix and 16*3 matrix, respectively.

Prediction Results
The effects of TSD-CNN-LSTM (TCL), TSD-CNN (TC), TSD-LSTM (TL), and single LSTM models on forecasting are depicted in Fig. 6. The T-T model has a marginally better fitting effect than the P-P and N-N models. In contrast, the N-N model fits significantly better than the T-N, P-N, and T&P-N models. Table 2 shows that the TSD-CNN-LSTM model of T-T and P-P models has the best forecasting performance with the smallest RMSE, MAE, and NSE, whereas the LSTM model has the worst forecasting performance.
The T-N model's single LSTM model has the best prediction performance with the lowest RMSE and the highest NSE, while the P-N model's TSD-CNN-LSTM model has the best prediction performance with the lowest RMSE, MAE, and NSE. The TSD-CNN-LSTM model of T&P-N and N-N models has the best prediction performance with the smallest RMSE, MAE, and highest NSE, followed by the TSD-CNN and TSD-LSTM models, and the single LSTM model has the worst prediction performance.

Validity of Data Pre-processing
In this paper, TSD-CNN-LSTM model demonstrated superior forecasting performance for all series compared to the TSD-CNN, TSD-LSTM and LSTM models. However, the single LSTM model had the worst performance among most models, possibly because a single RNN cannot effectively capture trend or seasonality in a time series (Bandara et al. 2020). The previous prediction has demonstrated that data pre-processing can improve the predictive accuracy of neural network models (Ni et al. 2020). Cui et al. (2020) have demonstrated that the prediction accuracy can be enhanced by detrending and de-seasonalizing the original time series. Zhang et al. (2022) established a CEEM-DAN-LSTM model for water quality prediction based on the complete set of empirical mode decomposition (CEEMDAN) methods, which improved the precision of a single LSTM model. Wei and You (2022) improved the precipitation prediction accuracy of CNN and LSTM by capturing the time series details using the wavelet transform method. In light of the findings of this study, it is evident that data pre-processing is required prior to building a neural network model.

Effect of Series Volatility on Prediction Results
According to the results of time series decomposition, the trend change of temperature, precipitation and NDVI series fluctuates with a small range of 5 ~ 10, significantly with a large range of -20 ~ 100, and irregularly with a small range of 0.3 ~ 0.8, respectively. Compared with the temperature series, the precipitation and NDVI series exhibit strong oscillations, and the characteristics of trend component are highly variable, which may explain why the prediction effects of the P-P and N-N models are marginally inferior to those of the T-T model (Yan 2012). The poor prediction performance of TSD-CNN-LSTM for the T-N model is likely attributable to the loss of information in the smoothly varying trend component, which the hybrid model may have caused. Additionally, the imbalance in training data significantly negatively impacts CNN's overall performance (Bandara et al. 2020;Hensman and Masko 2015).

Response of NDVI to Temperature and Precipitation
Correlation analysis of NDVI and climatic factors indicated that the multiple correlation coefficient between NDVI and temperature & precipitation is 0.833 (P < 0.01). The partial correlation coefficient between NDVI and temperature is 0.714 (P < 0.01), while that between NDVI and precipitation is 0.01. The Pearson correlation between NDVI and temperature is 0.833 (P < 0.01), while that between NDVI and precipitation is 0.611 (P < 0.01). Previous research has demonstrated that vegetation growth in arid regions is more sensitive to climate change (Liu et al. 2016). Precipitation was primarily responsible for improving local vegetation in the arid Sahel region of Africa (Herrmann et al. 2005). While Lamchin et al. (2018) found that air temperature was the most influential factor in NDVI change in Asia from 1982 to 2014, Eamus et al. (2006) discovered that NDVI was negatively correlated with temperature and positively correlated with precipitation in Mongolia over the past three decades. Cui and Shi (2010) discovered that in eastern China, the response of vegetation to temperature change was more pronounced than its response to precipitation change. The factors that influence the NDVI in different regions vary. It is obvious that the driving factors of NDVI are different in different regions. Consequently, based on the results of the correlation analysis and prediction effects of NDVI models (N-N model > T&P-N model > T-N model > P-N model), it can be concluded that NDVI changes in the Ningtiaota coal mining area are most significantly driven by the combination of temperature and precipitation, with temperature playing the most critical role.

Conclusions
Changes in the NDVI are a complex nonlinear process caused by multiple factors. Direct analysis of the original series cannot fully reflect the patterns created by the superposition of multiple features. Decomposition of the non-stationary NDVI series can effectively extract the periodic and trend information from deterministic factors and the fluctuation information from stochastic factors. This paper proposes a new NDVI forecasting model based on TSD-CNN-LSTM that considers the nonlinear variation characteristic of the NDVI. In the meantime, two forecasting models of climatic factors and four forecasting models of NDVI were developed to test the model's generalizability and discuss the response of NDVI to climatic factors. Using RMSE, NSE and MAE to evaluate the performance of models, the results demonstrated that TSD could extract the residual, seasonal and trend change characteristics of the NDVI series, whereas R/S analysis can capture the trend characteristics based on TSD. The TSD-CNN-LSTM model can extract the most valuable historical information and perform optimal fitting with superior performance to the TSD-CNN, TSD-LSTM, and single LSTM models when combined with the CNN and LSTM models. The climatic factors-based NDVI forecasting models revealed that the T&P-N model had greater predictive ability than the T-N model, while the P-N model had the worst predictive ability. Using the results of the correlation analysis between NDVI and climatic factors, it is possible to conclude that the variation of NDVI was driven by both temperature and precipitation, with temperature playing a more significant role.

Availability of Data and Materials
The data used to support the findings of this study are available from the corresponding author upon request.

Competing Interests
The authors declare no conflicts of interest.