Forecasting Hourly Intermittent Rainfall by Deep Belief Networks with Simple Exponential Smoothing

Accurate rainfall forecasting is essential for planning and managing water resource systems efficiently. However, intermittent rainfall patterns make rainfall values harder to forecast accurately. Deep learning techniques have recently become popular and powerful forecasting tools. Thus, this study employed deep belief networks with a simple exponential smoothing procedure (DBNSES) to forecast hourly intermittent rainfall values in Taiwan. Weather factors were used as independent variables to forecast rainfall volume. The simple exponential smoothing data preprocessing procedure was used to deal with the intermittent data patterns. Three other forecasting models, namely the least squares support vector regression (LSSVR), the generalized regression neural network (GRNN), and the backpropagation neural network (BPNN), were employed to forecast rainfall using the same data sets. In addition, genetic algorithms were utilized to determine the parameters of the four forecasting models. The empirical results indicate that the developed DBNSES models are superior to the other forecasting models in terms of forecasting accuracy. In addition, the DBNSES obtained smaller root mean square error (RMSE) values than those reported in previous studies. Therefore, the DBNSES model is a suitable and effective way of forecasting rainfall with intermittent data patterns.


Introduction
In many countries, accurately forecasting intermittent rainfall is a major challenge, particularly in Taiwan, whose topography and geographic position make forecasting difficult (Wu and Lin 2017). Because of Taiwan's mountainous terrain and island setting, most rainfall occurs in the plum rain season, in May and June, and the typhoon season, from July to October. Heavy downpours frequently result in catastrophic repercussions such as flooding, landslides, or debris flows that cause property damage and fatalities (Saito et al. 2017; Song and Park 2020). Extensive studies have therefore offered valuable insight into building reliable warning and forecasting systems to alleviate the disasters caused by excessive rainfall, which threaten security and the economy. Through proper preventive measures supported by early warning systems, such as evacuating people living in the most endangered areas, the influence of disasters resulting from heavy rainfall could be alleviated. Thus, accurate rainfall forecasting plays a crucial role in disaster prevention and mitigation in Taiwan.
In hydrological research, precisely predicting rainfall is a crucial topic, particularly in countries with intermittent weather patterns (Wu and Lin 2017; Singh 2018; Haidar and Verma 2018), because accurate rainfall forecasts can moderate the impacts of heavy rain and help prevent disasters in a timely manner. Hourly rainfall forecasts in hydrology are obtained from radar measurements or from various statistical methods, including machine learning, singular spectrum analysis, support vector machines, and time series models. Yu et al. (2015) developed a hybrid blending system that collects and integrates information from radar data and weather data to improve the forecasting accuracy of floods and rainfall. It was reported that the designed model could generate more accurate and effective forecasting results than the other radar-based forecasting methods. Yu et al. (2017) employed random forest techniques and support vector machine models with radar-derived data to forecast real-time rainfall. Previous grid positions, including longitude and latitude, and three reservoirs (Feitsui, Deji, and Zengwen) served as independent variables to forecast rainfall 1 to 3 h ahead. This investigation reported that the support vector machine approach is superior to the random forest in terms of forecasting accuracy. Bagirov et al. (2017) used clusterwise linear regression to forecast rainfall. Data employed in that study were gathered from eight atmospheric stations. Five weather attributes (maximum temperature, minimum temperature, evaporation, vapor pressure, and solar radiation) were treated as independent variables of the clusterwise linear regression to forecast monthly rainfall. This investigation reported that the clusterwise linear regression outperformed the other four prediction models. Unnikrishnan and Jothiprakash (2018a) performed a singular spectrum analysis to forecast daily rainfall time series. Various components of the rainfall series, such as trend, periodic component, noise, and cyclic component, were extracted. Numerical results indicated that the presented model could effectively forecast daily rainfall time series over long durations. Further, Ghamariadyan and Imteaz (2021) developed a wavelet artificial neural network model to forecast seasonal rainfall in Queensland, Australia. This study reported that the presented model is superior to the other five forecasting models by more than 35% in terms of forecasting accuracy. Islam and Imteaz (2022) designed a hybrid autoregressive integrated moving average model with exogenous input to predict seasonal rainfall in Western Australia. Numerical results revealed that the proposed model outperformed the other traditional linear and non-linear models in terms of forecasting accuracy.
Deep learning is an emerging technology applied in many fields, including rainfall forecasting. Venkata Ramana et al. (2013) employed an artificial neural network with the wavelet technique to forecast monthly rainfall in Darjeeling. Three variables were selected as independent variables, namely rainfall, minimum temperature, and maximum temperature. Numerical results revealed that the proposed wavelet neural network outperformed the artificial neural network. Azad et al. (2015) investigated the predictability of monsoon rainfall in India using monthly and annual data. Wavelets and artificial neural networks were applied to extract the periodic structure and train forecasting models, respectively. The results indicated that forecasts are more accurate when periodic and random components are considered separately. Salman et al. (2015) compared the performances of three deep learning methods, namely recurrent neural networks, conditional restricted Boltzmann machines, and convolutional neural networks, in rainfall forecasting. Numerical results revealed that the recurrent neural networks could obtain more accurate rainfall forecasting results than the other two methods. Shi et al. (2015) developed a convolutional long short-term memory network model to nowcast rainfall. The designed method can effectively forecast rainfall intensity over a relatively short period of time using spatiotemporal data. Hernández et al. (2016) designed a deep learning architecture composed of two networks, the autoencoder network and the multilayer perceptron network, to forecast the accumulated rainfall for the following day. The results indicated that the proposed forecasting model outperformed multilayer perceptron neural networks. Ha et al. (2016) used deep belief networks with eight-fold cross-validation in their study. Data were collected from an automatic weather station, including rainfall, temperature, and sun and moon positions. 
The numerical results showed that the proposed model outperformed the multilayer perceptron neural networks in terms of forecasting accuracy. Dabral and Murry (2017) employed seasonal autoregressive integrated moving average (SARIMA) models on three types of data sets, namely monthly, weekly, and daily time series data, to forecast monsoon rainfall. The autocorrelation function, the partial autocorrelation function, and the minimum values of the Akaike information criterion and the Schwarz Bayesian information criterion were used to select the best SARIMA models. Further, Qiu et al. (2017) designed a multi-task convolutional neural network to forecast short-time rainfall. Because correlations between different sites help forecast rainfall under various weather conditions, this investigation used weather datasets from multiple stations for training. The proposed models were significantly superior to the forecasts of the European Centre for Medium-Range Weather Forecasts. Liu et al. (2022) proposed a convolutional long short-term memory network model to nowcast the rainfall of four stations in Helan Mountain, namely Dianjiang, Shihuiyao, Linkuang, and Suyukou, using rainfall data and radar data. Combined reflectance and the retrieved wind field served as the main determinants for forecasting. Compared with convolutional neural networks and long short-term memory networks, the proposed model greatly improved rainfall forecasting performance through its ability to capture spatial information and timing memory. Wei and You (2022) presented a hybrid model that integrates the discrete wavelet transform with two deep learning models, namely the dilated causal convolutional neural network and long short-term memory, to forecast the monthly rainfall of four cities in China, specifically Beijing, Tianjin, Chongqing, and Guangzhou. The experimental results indicated that the proposed model outperformed both long short-term memory and the dilated causal convolutional neural network. Further, it was confirmed to be more satisfactory for long-term rainfall forecasting.
The rest of this paper is organized as follows: Sect. 2 introduces deep belief networks and the framework of the DBNSES model in rainfall forecasting; Sect. 3 presents the numerical results; Sect. 4 presents the conclusions and directions for future study.

Deep Belief Networks
A deep belief network (DBN), proposed by Hinton et al. (2006a), consists of several stacked restricted Boltzmann machines (RBMs), which are the main components of a deep belief network, and a backpropagation neural network (BPNN). The RBM is a revised form of the original Boltzmann machine (BM) (Ackley et al. 1985), which has node connections both within layers and between layers. The difference is that the RBM only allows links between nodes in adjacent layers. The learning strategy of a DBN can be split into two phases: unsupervised learning in the pre-training phase and supervised learning in the fine-tuning phase. In the pre-training phase, the unsupervised training uses only the independent variables to select apposite initial parameters, namely weights and biases, for the supervised learning. To maximize the likelihood, the pre-training phase reconstructs the training samples by adjusting the parameters. Starting from the initial parameters determined by the pre-training phase, supervised learning in the fine-tuning phase further adjusts the network parameters efficiently. The outputs of the units in a visible layer serve as inputs of the units in the hidden layer. Equation (1) gives the energy function of a joint configuration $(v, \varsigma)$ of visible and hidden units,
where $W_{ij}$ is the weight connecting unit $i$ in the visible layer, with bias $\alpha_i$, and unit $j$ in the hidden layer, with bias $\beta_j$. The states $v_i$ and $\varsigma_j$ are binary, and their distribution is given by Eq. (2), where $z = \sum_{v,\varsigma} \exp(-\mathrm{Engy}(v, \varsigma))$ is the normalizing partition function. The activation probabilities used to produce the state vector $v$ of the visible layer or $\varsigma$ of the hidden layer are expressed as Eq. (3) and Eq. (4).
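For reference, the relations that the definitions above point to can be written in the standard binary RBM form. This is a sketch under stated assumptions rather than a verbatim copy of Eqs. (1)-(4): the symbol $v$ for the visible state vector is assumed here, and the expressions are the textbook ones that are consistent with $W_{ij}$, $\alpha_i$, $\beta_j$, and $z$ as defined in the text.

```latex
\begin{align*}
\mathrm{Engy}(v,\varsigma) &= -\sum_{i}\alpha_i v_i \;-\; \sum_{j}\beta_j \varsigma_j \;-\; \sum_{i}\sum_{j} v_i W_{ij}\,\varsigma_j \\
P(v,\varsigma) &= \frac{1}{z}\exp\bigl(-\mathrm{Engy}(v,\varsigma)\bigr),
  \qquad z=\sum_{v,\varsigma}\exp\bigl(-\mathrm{Engy}(v,\varsigma)\bigr) \\
P(\varsigma_j = 1 \mid v) &= \mathrm{sigmoid}\Bigl(\beta_j + \sum_{i} W_{ij}\, v_i\Bigr) \\
P(v_i = 1 \mid \varsigma) &= \mathrm{sigmoid}\Bigl(\alpha_i + \sum_{j} W_{ij}\,\varsigma_j\Bigr),
  \qquad \mathrm{sigmoid}(x)=\frac{1}{1+e^{-x}}
\end{align*}
```

Because an RBM has no links within a layer, each conditional probability factorizes over units, which is what makes the layer-wise sampling used in pre-training cheap.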
Here, sigmoid(·) is the sigmoid function given in Eq. (5). The three parameters $\alpha$, $\beta$, and $W$ form the parameter set $\theta$. The objective of the RBM is to find an appropriate parameter set. The outputs of the unsupervised RBM are treated as inputs of the multilayer perceptron (MLP), which forms the supervised learning phase of deep belief networks. When the distributions of the visible and the hidden layers reach a steady state, the likelihood function is maximized and the features of the training data set can be represented well using the contrastive divergence algorithm (Hinton 2002b; Roux and Bengio 2008). Equation (6) gives the likelihood function, where $s$ is the size of the training data. The gradient of Eq. (6) with respect to the parameter set is given in Eq. (7).
Equation (8) and Eq. (9) give the derivative of $\ln P(v^{s})$ with respect to $\theta$ and the gradient of the likelihood function, respectively.
Thus, the parameter update of the RBM, which involves the learning rate and the momentum, can be represented as Eq. (10),
where $\lambda_p$ and $m_p$ denote the learning rate and the momentum, respectively. Then, the k-step contrastive divergence algorithm (Hinton 2002b) was used to carry out the computations of Eq. (10). In this study, the value of k was set to one. The update rules for the RBM parameters are given in Eqs. (11)-(13).
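The pre-training update can be illustrated with a minimal sketch of one-step contrastive divergence (k = 1) with momentum. It is not the study's MATLAB implementation: the function names, the toy binary patterns, the per-sample update order, and the velocity form of the momentum term are all assumptions chosen for illustration, and only the Python standard library is used.

```python
import math
import random


def sigmoid(x):
    """Logistic function used for the RBM activation probabilities."""
    return 1.0 / (1.0 + math.exp(-x))


def hidden_probs(v, W, beta):
    """P(hidden_j = 1 | v) for every hidden unit j."""
    return [sigmoid(beta[j] + sum(v[i] * W[i][j] for i in range(len(v))))
            for j in range(len(beta))]


def visible_probs(h, W, alpha):
    """P(visible_i = 1 | h) for every visible unit i."""
    return [sigmoid(alpha[i] + sum(W[i][j] * h[j] for j in range(len(h))))
            for i in range(len(alpha))]


def sample(probs, rng):
    """Draw binary states from Bernoulli probabilities."""
    return [1.0 if rng.random() < p else 0.0 for p in probs]


def cd1_step(v0, W, alpha, beta, vel, lr, momentum, rng):
    """One CD-1 update on a single sample; mutates W, alpha, beta, and vel."""
    h0 = hidden_probs(v0, W, beta)                  # positive phase
    v1 = visible_probs(sample(h0, rng), W, alpha)   # one Gibbs step back
    h1 = hidden_probs(v1, W, beta)                  # negative phase
    for i in range(len(v0)):
        for j in range(len(h0)):
            grad = v0[i] * h0[j] - v1[i] * h1[j]
            # Assumed momentum form: velocity = m * velocity + lr * gradient.
            vel["W"][i][j] = momentum * vel["W"][i][j] + lr * grad
            W[i][j] += vel["W"][i][j]
    for i in range(len(v0)):
        vel["alpha"][i] = momentum * vel["alpha"][i] + lr * (v0[i] - v1[i])
        alpha[i] += vel["alpha"][i]
    for j in range(len(h0)):
        vel["beta"][j] = momentum * vel["beta"][j] + lr * (h0[j] - h1[j])
        beta[j] += vel["beta"][j]
    return sum((a - b) ** 2 for a, b in zip(v0, v1))  # reconstruction error


def train_rbm(data, n_hidden, lr=0.1, momentum=0.5, epochs=200, seed=0):
    """Train a tiny RBM and return (W, alpha, beta, per-epoch mean errors)."""
    rng = random.Random(seed)
    n_visible = len(data[0])
    W = [[rng.gauss(0.0, 0.01) for _ in range(n_hidden)] for _ in range(n_visible)]
    alpha, beta = [0.0] * n_visible, [0.0] * n_hidden
    vel = {"W": [[0.0] * n_hidden for _ in range(n_visible)],
           "alpha": [0.0] * n_visible, "beta": [0.0] * n_hidden}
    errors = []
    for _ in range(epochs):
        total = sum(cd1_step(v, W, alpha, beta, vel, lr, momentum, rng) for v in data)
        errors.append(total / len(data))
    return W, alpha, beta, errors


# Toy binary data (hypothetical): two complementary patterns, repeated.
patterns = [[1.0, 1.0, 0.0, 0.0], [0.0, 0.0, 1.0, 1.0]] * 5
W, alpha, beta, errors = train_rbm(patterns, n_hidden=2)
print(round(errors[0], 3), round(errors[-1], 3))
```

The reconstruction error printed at the end is only a convenient progress signal; contrastive divergence approximates the likelihood gradient and does not minimize this error directly.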
The well-trained parameters provided by the RBM served as the initial parameters of the subsequent supervised learning stage of deep belief networks. Backpropagation algorithms were used to train the weights and biases of the multilayer perceptron networks in the supervised learning stage. The output of the hidden layer can be expressed as Eq. (14), where $n$ and $z$ are the numbers of neurons in the input layer and hidden layer, respectively, and $l$ is the number of the hidden layer. Here, $w^{l}_{ij}$ is the weight between the input layer and the hidden layer, $x_i$ is the input variable, namely the weather factors, and $b$ is the bias. The output of the output layer is given in Eq. (15),
where $Y_k$ is the forecast value, namely rainfall, in this study. The mean square error, given in Eq. (16), was used to measure the error between actual values and forecast values.
The errors between the hidden layer and the output layer were calculated by the delta function expressed as Eq. (18).
Therefore, the learning rules of the weights and the biases in the supervised learning phase are given in Eqs. (19)-(21), where $\eta$ is the learning rate of the backpropagation (Huang 2019).
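The supervised fine-tuning step can be sketched as plain gradient descent on the squared error for a network with one sigmoid hidden layer and a linear output. This is a minimal illustration under assumptions: the layer sizes, the random initial weights (in a DBN they would come from the pre-trained RBM), the linear output unit, and the synthetic target are hypothetical and are not taken from the study.

```python
import math
import random


def sigmoid(x):
    """Logistic activation for the hidden layer."""
    return 1.0 / (1.0 + math.exp(-x))


def forward(x, p):
    """Return (hidden activations, scalar forecast) for one input vector."""
    hidden = [sigmoid(p["b1"][j] + sum(xi * p["W1"][i][j] for i, xi in enumerate(x)))
              for j in range(len(p["b1"]))]
    output = p["b2"] + sum(h * w for h, w in zip(hidden, p["w2"]))
    return hidden, output


def backprop_step(x, target, p, eta):
    """One gradient-descent step on 0.5 * (target - output)^2 for one sample."""
    hidden, output = forward(x, p)
    delta_out = target - output                              # linear output unit
    for j, h in enumerate(hidden):
        delta_hid = delta_out * p["w2"][j] * h * (1.0 - h)   # uses w2 before its update
        p["w2"][j] += eta * delta_out * h
        p["b1"][j] += eta * delta_hid
        for i, xi in enumerate(x):
            p["W1"][i][j] += eta * delta_hid * xi
    p["b2"] += eta * delta_out


def mse(samples, p):
    """Mean square error of the network over (input, target) pairs."""
    return sum((t - forward(x, p)[1]) ** 2 for x, t in samples) / len(samples)


rng = random.Random(0)
n_inputs, n_hidden = 2, 3
params = {
    # Random start is an assumption; a DBN would start from RBM weights.
    "W1": [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden)] for _ in range(n_inputs)],
    "b1": [0.0] * n_hidden,
    "w2": [rng.uniform(-0.5, 0.5) for _ in range(n_hidden)],
    "b2": 0.0,
}
# Toy data (hypothetical): two normalized inputs and a synthetic target.
samples = [((a, b), 0.3 * a + 0.6 * b)
           for a, b in ((rng.random(), rng.random()) for _ in range(50))]

mse_before = mse(samples, params)
for _ in range(200):
    for x, t in samples:
        backprop_step(x, t, params, eta=0.1)
mse_after = mse(samples, params)
print(mse_before, mse_after)
```

The hidden-layer delta is computed before the output weight is changed, so every update within a step uses the gradient evaluated at the same parameter values.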

The Framework of DBNSES Model in Rainfall Forecasting
In this study, a deep belief network with a simple exponential smoothing model was designed to forecast hourly intermittent rainfall. Three other forecasting models, namely the least squares support vector regression, generalized regression neural networks, and backpropagation neural networks, were also employed to forecast rainfall with the same data sets to demonstrate the performance of the presented DBNSES model. The flowchart of this study is illustrated in Fig. 1.
Intermittent rainfall frequently leads to floods and inundations, which result in property damage and personal injury. New Taipei City is Taiwan's most populous county, with Banqiao District having the highest population. To ensure public safety and lessen the likelihood of flooding, drainage systems and early warning systems have been improved. In addition, effective water resource management systems and accurate rainfall forecasting are considered essential in preventing drought and flood disasters. First, the weather attributes and hourly rainfall data at Banqiao District, New Taipei City were collected from the Central Weather Bureau in Taiwan (https://e-service.cwb.gov.tw/HistoryDataQuery). Figure 2 illustrates the geographic region of this study. In total, ten weather attributes were used: station pressure, sea pressure, temperature, dew-point temperature, relative humidity, wind speed, wind direction, maximum gust, the direction of the maximum gust, and rainfall hours. Rainfall in Taiwan is intermittent. Thus, simple exponential smoothing was used to preprocess the intermittent rainfall data. Previous studies (Waller 2015; Kourentzes 2013; Li and Lim 2018) pointed out that simple exponential smoothing, represented by Eq. (22), can smooth intermittent data. Hence, the simple exponential smoothing method served as a preprocessing procedure in this study. In Eq. (22), $\alpha$ is a smoothing constant with values between 0 and 1, and $R_t$ and $F_t$ represent the actual value and the forecast value at time $t$, respectively. The weather attributes and rainfall data served as the independent variables and the dependent variable, respectively. After the simple exponential smoothing method was applied to the rainfall data, the ten weather attributes and rainfall were normalized before being fed into the deep belief networks. The data were divided into training data and testing data for model learning and forecasting performance evaluation. 
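The preprocessing chain described here (smooth, then normalize, and later undo both) can be sketched in a few lines. The sketch assumes the standard recursion F[t+1] = alpha * R[t] + (1 - alpha) * F[t] for simple exponential smoothing, an initial value F[0] = R[0], and min-max scaling for the normalization; the text does not state these details, so they are labeled assumptions, and the rainfall values are made up.

```python
def ses_smooth(rain, alpha):
    """Simple exponential smoothing: F[t+1] = alpha * R[t] + (1 - alpha) * F[t].

    F[0] is initialized to R[0] (an assumption).
    """
    smoothed = [rain[0]]
    for actual in rain[:-1]:
        smoothed.append(alpha * actual + (1.0 - alpha) * smoothed[-1])
    return smoothed


def ses_restore(smoothed, alpha):
    """Invert the recursion: R[t] = (F[t+1] - (1 - alpha) * F[t]) / alpha."""
    return [(nxt - (1.0 - alpha) * cur) / alpha
            for cur, nxt in zip(smoothed, smoothed[1:])]


def min_max(values):
    """Scale values to [0, 1]; return the scaled list and the (lo, hi) bounds."""
    lo, hi = min(values), max(values)
    span = (hi - lo) or 1.0          # avoid division by zero for constant series
    return [(v - lo) / span for v in values], lo, hi


def min_max_inverse(scaled, lo, hi):
    """Map scaled values back to the original range."""
    return [lo + s * (hi - lo) for s in scaled]


# Hypothetical intermittent hourly rainfall (mm): mostly zeros with bursts.
rain = [0.0, 0.0, 5.0, 0.0, 0.0, 12.5, 0.0, 0.0]
alpha = 0.1

smoothed = ses_smooth(rain, alpha)       # zeros after a burst become small positives
scaled, lo, hi = min_max(smoothed)       # network-ready values in [0, 1]
restored = ses_restore(min_max_inverse(scaled, lo, hi), alpha)
print(smoothed)
print(restored)                          # matches rain[:-1] up to rounding
```

With a small smoothing constant the bursts decay slowly, so long runs of exact zeros are replaced by a smooth positive tail, which is the property that makes the series easier for a network to fit.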
Fig. 1 The flowchart of the DBNSES model for hourly intermittent rainfall forecasting
Both 24 h-ahead and 168 h-ahead predictions were conducted in this study to examine the performance of the proposed DBNSES model in forecasting hourly intermittent rainfall. For the 24 h-ahead forecasts, the training and testing data sets contain 8736 and 8784 observations, respectively. For the 168 h-ahead forecasts, the training and testing data sets contain 8592 and 8784 observations, respectively. Furthermore, this study used genetic algorithms (Holland 1975) to determine the parameters of the four forecasting models, namely DBNSES, LSSVR, GRNN, and BPNN. Based on the study of Lin et al. (2019), the proposed DBNSES model employed genetic algorithms to select six parameters: the learning rate and the momentum of both the unsupervised and supervised training phases, and the dropout and the number of neurons in the hidden layer in the supervised learning phase. Table 1 lists the parameters and search ranges for the forecasting models. For the unsupervised and supervised learning phases, the energy function of restricted Boltzmann machines and the root mean square error of the rainfall forecasts were treated as the fitness functions of genetic algorithms, respectively. Both the energy function and the root mean square error are to be minimized in the learning process of deep belief networks. Thus, the negated values of both measurements were employed as the fitness functions, which the genetic algorithms maximize. The gene number of a chromosome, the number of generations, the population size, the crossover rate, and the mutation rate of genetic algorithms are 40, 30, 30, 0.7, and 0.7, respectively. Further, this study applied a single-point crossover with binary coding for genetic algorithms. When the search for appropriate parameters of the DBNSES models was completed, the finalized DBNSES models were obtained. 
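The genetic search can be sketched as a binary-coded algorithm with single-point crossover, using the gene number, population size, generation count, and rates reported in the text. Everything else is an assumption made to keep the example runnable: the equal-length bit slices per parameter, tournament selection, elitism, the one-bit-flip mutation applied per chromosome, and the toy fitness function that stands in for the negated RMSE of a trained model.

```python
import random

GENES, POP_SIZE, GENERATIONS = 40, 30, 30      # values reported in the text
CROSSOVER_RATE, MUTATION_RATE = 0.7, 0.7       # values reported in the text


def decode(chromosome, ranges):
    """Map equal-length bit slices onto real parameters (slicing is an assumption)."""
    bits = len(chromosome) // len(ranges)
    params = []
    for k, (low, high) in enumerate(ranges):
        chunk = chromosome[k * bits:(k + 1) * bits]
        fraction = int("".join(map(str, chunk)), 2) / (2 ** bits - 1)
        params.append(low + fraction * (high - low))
    return params


def single_point_crossover(a, b, rng):
    """Swap the tails of two parents at one random cut point."""
    point = rng.randrange(1, len(a))
    return a[:point] + b[point:], b[:point] + a[point:]


def mutate(chromosome, rng):
    """Flip one randomly chosen bit (per-chromosome mutation is an assumption)."""
    child = chromosome[:]
    child[rng.randrange(len(child))] ^= 1
    return child


def tournament(population, scores, rng):
    """Pick the fitter of two random individuals (selection scheme is an assumption)."""
    i, j = rng.randrange(len(population)), rng.randrange(len(population))
    return population[i] if scores[i] >= scores[j] else population[j]


def run_ga(fitness, ranges, seed=0):
    """Maximize `fitness`; return (best parameters, best score, best-so-far history)."""
    rng = random.Random(seed)
    population = [[rng.randint(0, 1) for _ in range(GENES)] for _ in range(POP_SIZE)]
    best, best_score, history = None, float("-inf"), []
    for _ in range(GENERATIONS):
        scores = [fitness(decode(c, ranges)) for c in population]
        top = max(range(POP_SIZE), key=scores.__getitem__)
        if scores[top] > best_score:
            best, best_score = population[top][:], scores[top]
        history.append(best_score)
        children = [best[:]]                     # elitism (assumption)
        while len(children) < POP_SIZE:
            a = tournament(population, scores, rng)
            b = tournament(population, scores, rng)
            if rng.random() < CROSSOVER_RATE:
                a, b = single_point_crossover(a, b, rng)
            children += [mutate(c, rng) if rng.random() < MUTATION_RATE else c[:]
                         for c in (a, b)]
        population = children[:POP_SIZE]
    return decode(best, ranges), best_score, history


def toy_fitness(params):
    """Hypothetical stand-in for -RMSE; its maximum (0) is at (0.3, 0.6)."""
    learning_rate, momentum = params
    return -(((learning_rate - 0.3) ** 2 + (momentum - 0.6) ** 2) ** 0.5)


best_params, best_score, history = run_ga(toy_fitness, [(0.0, 1.0), (0.0, 1.0)])
print(best_params, best_score)
```

In a real run, `toy_fitness` would be replaced by a function that trains a model with the decoded parameters and returns the negated error, which is by far the most expensive part of the search.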
The finalized forecasting models were then used to predict rainfall. Then, the forecast rainfall values were restored by the inverse functions of the normalization and the simple exponential smoothing. Finally, measurements of rainfall forecasting accuracy were generated. Algorithm 1 shows the process of the proposed DBNSES models with the genetic algorithm. In this study, the proposed forecasting models were implemented in MATLAB, and the data were collected and arranged in spreadsheets. The designed forecasting system reads weather and rainfall data and then makes a forecast.
Algorithm 1: The proposed DBNSES with genetic algorithms
1. Input: dataset with 10 weather attributes and rainfall;
2. Preprocess rainfall data by simple exponential smoothing;
3. Normalize weather attributes and rainfall;
4. Divide data into training datasets and testing datasets;
5. Initialize a DBNSES parameter set with six parameters in a form of population: POPU i , (i = 1, 2, …, n);
6. While end condition (i = n) of genetic algorithms is not reached do
< The unsupervised training phase >
7. Repeat until the number of epochs is reached for training RBM;

Numerical Results
Four forecasting models, namely deep belief networks, the least squares support vector regression, generalized regression neural networks, and backpropagation neural networks, were employed to forecast intermittent rainfall with the same data sets. Tables 2 and 3 list the parameters of the four models for the 24 h-ahead and 168 h-ahead forecasts. In this study, the smoothing constants ranged from 0.1 to 0.9 in steps of 0.1, and the smoothing constants resulting in the smallest RMSE were selected. For the 24 h-ahead forecasts, the smoothing constants leading to the smallest RMSE values for the DBN, LSSVR, GRNN, and BPNN models were 0.1, 0.9, 0.9, and 0.1, respectively. For the 168 h-ahead forecasts, the smoothing constants leading to the smallest RMSE values for the DBN, LSSVR, GRNN, and BPNN models were 0.9, 0.9, 0.9, and 0.8, respectively. Two indices, the root mean square error (RMSE) and the mean absolute error (MAE), expressed as Eqs. (23) and (24), were used to measure the performance of the forecasting models. In Eqs. (23) and (24), $N$ is the number of forecasting periods, $F_t$ is the forecast value at time $t$, and $A_t$ is the actual value at time $t$. Table 4 lists the forecasting measurements with the best $\alpha$ for the four models under the two forecasting strategies. The forecasting measurements without simple exponential smoothing for the four models under the two forecasting strategies are shown in Table 5. Tables 4 and 5 show that hourly intermittent rainfall forecasting performance is improved in most cases by the simple exponential smoothing preprocessing approach, particularly for the 24 h-ahead forecasts. The DBNSES models were superior to the other three models for both the 24 h-ahead and 168 h-ahead forecasts in terms of RMSE. It has been pointed out that RMSE is more sensitive to outliers than MAE (Hyndman and Koehler 2006). 
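The two accuracy indices and the selection of the smoothing constant can be sketched as follows. The definitions used are the standard ones, RMSE as the square root of the mean squared difference and MAE as the mean absolute difference between $F_t$ and $A_t$; the `evaluate` callable and the toy error curve are hypothetical stand-ins for training a model with a given smoothing constant and measuring its RMSE.

```python
import math


def rmse(actual, forecast):
    """Root mean square error over N forecasting periods."""
    return math.sqrt(sum((f - a) ** 2 for a, f in zip(actual, forecast)) / len(actual))


def mae(actual, forecast):
    """Mean absolute error over N forecasting periods."""
    return sum(abs(f - a) for a, f in zip(actual, forecast)) / len(actual)


def select_alpha(evaluate):
    """Try alpha = 0.1, 0.2, ..., 0.9 and return the one with the smallest RMSE.

    `evaluate` is a hypothetical callable mapping a smoothing constant to the
    RMSE obtained with that constant.
    """
    candidates = [k / 10 for k in range(1, 10)]
    return min(candidates, key=evaluate)


# Same total absolute error, spread evenly versus concentrated in one outlier.
actual = [0.0, 0.0, 0.0, 0.0]
even_errors = [1.0, 1.0, 1.0, 1.0]
one_outlier = [0.0, 0.0, 0.0, 4.0]

print(mae(actual, even_errors), mae(actual, one_outlier))    # 1.0 1.0
print(rmse(actual, even_errors), rmse(actual, one_outlier))  # 1.0 2.0

# Toy error curve (hypothetical) whose minimum is at alpha = 0.1.
print(select_alpha(lambda alpha: 1.0 + abs(alpha - 0.1)))    # 0.1
```

The two forecasts have the same MAE, but the one with a single large miss has twice the RMSE, which illustrates the outlier sensitivity noted by Hyndman and Koehler (2006).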
In addition, the proposed DBNSES model outperformed previous studies of rainfall forecasting in terms of RMSE (Wu and Lin 2017; Singh 2018; Unnikrishnan and Jothiprakash 2018b). The numerical results revealed that the proposed DBNSES is a feasible and stable method for forecasting hourly intermittent rainfall. Additionally, the simple exponential smoothing technique can yield satisfactory results when coping with data with intermittent patterns. Figures 3 and 4 plot the absolute error values of rainfall provided by the forecasting models for the two forecasting strategies, respectively. The proposed DBNSES delivers more satisfactory results than the other three forecasting models. Figure 5 illustrates, as boxplots, the absolute error values of rainfall generated by the forecasting models with and without simple exponential smoothing. A boxplot summarizes error values and shows the data distribution clearly when a large number of values must be represented visually (Tukey 1977). It can be observed from Fig. 5 that the simple exponential smoothing technique reduced the median values and outliers in most forecasting models, especially for the deep belief networks.

Conclusion
Although rainfall resources are quite abundant in Taiwan, its economic progress is hampered by frequent and heavy rains. In recent years, extreme weather has had a devastating impact on people and caused a variety of calamities globally. Due to its topographical features and geographic position, Taiwan has a convergent and intermittently heavy rainfall characteristic. Thus, effective rainfall forecasting can help the public and commercial sectors take the right precautions to lessen the risk of disasters. Therefore, this study developed a deep belief network with a simple exponential smoothing model for forecasting hourly rainfall in Taiwan. This study only used weather attributes for rainfall forecasting. For future study, other factors, such as radar maps (Kim et al. 2017), could be included in the DBNSES to increase the accuracy of rainfall forecasting. In addition, the effectiveness of the proposed DBNSES models was examined with weather and rainfall data collected from one weather station; data gathered from other regions or countries could be used to investigate the feasibility of the designed DBNSES models more broadly. The findings of this study are as follows. First, weather attributes are essential in forecasting intermittent rainfall when deep belief networks are used. Second, simple exponential smoothing is an effective way to improve forecasting results for deep belief networks, and the genetic algorithm is a feasible and promising way to determine the parameters of deep belief networks. Finally, with the advantages of simple exponential smoothing preprocessing, genetic algorithms, and deep learning capabilities, the developed DBNSES models can be appropriately used for contexts with intermittent rainfall.
Table 4 The measurements of forecasting models with the simple exponential smoothing generating the smallest RMSE value for each model
Data Availability Data used in this study will be made available upon request.

Declarations
Ethical Approval Not applicable.

Consent to Participate All authors have consented.

Consent to Publish All authors have consented.
Competing Interests All authors declare that they have no conflict of interest.