Streamflow forecasting plays an important role in water resources management and hydrological modeling (Yaseen et al. 2015). Numerous recent studies have tried to improve the accuracy of forecasting models (Fahimi et al. 2017; Fang et al. 2019). Meanwhile, climate change, urban development, human activities, and geographical characteristics have made streamflow forecasting increasingly complicated (Huang et al. 2014). Given these complexities, data-driven models (DDMs) have recently been used more often than physical models in hydrological studies, largely because they are easy to apply to a wide range of problems (Liu et al. 2014; Shoaib et al. 2016). Moreover, some patterns in hydrological time series cannot be revealed without signal decomposition methods (Quilty and Adamowski 2018).
Numerous data-driven models have been used for time series forecasting, such as Multiple Linear Regression (MLR) (Abbasi et al. 2021), K-Nearest Neighbors (KNN) (Modaresi et al. 2018), and Artificial Neural Networks (ANN) (Khazaee Poul et al. 2019). These models constitute the classic Data-Driven Forecasting Frameworks (DDFF). Recently, Wavelet-based Data-Driven Forecasting Frameworks (WDDFF) have improved the accuracy of forecasting models (Afan et al. 2016; Dixit et al. 2016; Fahimi et al. 2017; Sang 2013; Yaseen et al. 2015). However, applying WDDFF to real-world case studies raises problems that originate from the misinterpretation of parameters and lead to invalid results (Du et al. 2017; Quilty and Adamowski 2018; Zhang et al. 2015). These issues concern the selection of the appropriate decomposition level, wavelet filter, data partitioning, and boundary condition (Quilty and Adamowski 2018). Quilty and Adamowski (2018) addressed these issues and solved part of them. For instance, they showed that the à trous algorithm (AT) and the Maximal Overlap Discrete Wavelet Transform (MODWT) are the only preprocessing methods that do not use future data in the decomposition process. Other methods, such as the Discrete Wavelet Transform (DWT), use future data in the decomposition and therefore cannot be used for real-world forecasting problems (Quilty and Adamowski 2018). When the DWT calculates the wavelet and scaling coefficients at time step t, it uses data recorded at time step t + a, where a varies with the filter type, filter length, and decomposition level.
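This non-causality is easy to verify numerically. The following minimal sketch, written with the PyWavelets package on a synthetic series (both illustrative assumptions, not part of our case study), shows that the detail coefficients an offline DWT assigns near time step t change once later observations become available:

```python
import numpy as np
import pywt

rng = np.random.default_rng(42)
x = rng.standard_normal(128)  # synthetic stand-in for a streamflow series

t = 100
# Coefficients computable from the data available up to time step t only
_, cD_past = pywt.dwt(x[:t], "db4")
# Coefficients from an offline decomposition that also sees x[t:]
_, cD_full = pywt.dwt(x, "db4")

k = len(cD_past)
print(np.allclose(cD_past, cD_full[:k]))
# -> False: the trailing coefficients near time step t differ, because the
#    offline DWT mixes observations from time steps t + a into them
```

Coefficients far from time step t agree exactly; only those within a filter-dependent distance a of the series end are contaminated by future data.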
The boundary condition (BC) is the main source of error in WDDFF applications (Aussem 1998; Bakshi 1999; Maheswaran and Khosa 2012; Quilty and Adamowski 2018). When the MODWT calculates the wavelet and scaling coefficients of a time series at time step t, it uses data recorded back to time step t − a. For the first a time steps, however, no previous records exist. The coefficients calculated for these initial a time steps are therefore incorrect, and we call them boundary-affected data. To obtain accurate results, the boundary-affected time steps should be removed (Quilty and Adamowski 2018).
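The number of time steps to remove grows with both the filter length and the decomposition level. A minimal sketch, using the standard MODWT effective filter width L_j = (2^j − 1)(L − 1) + 1 for a length-L filter at level j (the length-8 'db4' filter below is only an example):

```python
def n_boundary_affected(filter_length: int, level: int) -> int:
    """Number of boundary-affected MODWT coefficients at a given level.

    The level-j filter has effective width L_j = (2**j - 1) * (L - 1) + 1,
    so the first L_j - 1 coefficients depend on records from before the
    series begins and should be discarded.
    """
    L_j = (2**level - 1) * (filter_length - 1) + 1
    return L_j - 1

# Example: a length-8 Daubechies filter ('db4'), levels 1 through 4
for j in range(1, 5):
    print(j, n_boundary_affected(8, j))
# -> 7, 21, 49, and 105 initial time steps to trim, respectively
```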
Another problem addressed by Quilty and Adamowski (2018) is the selection of a suitable decomposition level and wavelet filter. As the main objective of this study, we propose a novel solution to this problem based on the entropy concept. Entropy has been applied to many different problems; here, for the first time, we use it to determine the appropriate decomposition level and the most suitable filter, drawing on the concept of predictability. Predictability is an index of how well the future time steps of a time series can be predicted (Lorenz 1969). It can be measured with indices such as Lyapunov exponents (Palmer and Hagedorn 2006), recurrence measures (Marwan et al. 2002; Pospelov et al. 2019), and information-entropic measures (Garland et al. 2014; Guntu et al. 2020). In this paper, we calculate the entropy of the decomposed wavelet and scaling coefficients to find the decomposition level and filter with the best predictability.
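As a rough illustration of this idea only (not the exact criterion developed later in the paper), each (filter, level) candidate can be scored by the entropy of its subband coefficients. In the sketch below, pywt.swt stands in for the MODWT (the two differ essentially in filter normalization), and the histogram-based Shannon entropy, the bin count, and the lowest-mean-entropy selection rule are all illustrative assumptions:

```python
import numpy as np
import pywt

def shannon_entropy(c, bins=32):
    """Histogram-based Shannon entropy (in bits) of a coefficient series."""
    counts, _ = np.histogram(c, bins=bins)
    p = counts[counts > 0] / counts.sum()
    return -np.sum(p * np.log2(p))

def mean_subband_entropy(x, wavelet, level):
    """Average entropy over all wavelet and scaling subbands."""
    coeffs = pywt.swt(x, wavelet, level=level)   # undecimated transform
    bands = [c for pair in coeffs for c in pair]
    return float(np.mean([shannon_entropy(b) for b in bands]))

rng = np.random.default_rng(1)
x = rng.standard_normal(512)                     # placeholder series
for wavelet in ("haar", "db2", "db4"):
    for level in (1, 2, 3):
        print(wavelet, level, round(mean_subband_entropy(x, wavelet, level), 3))
# One could then prefer the (wavelet, level) pair with the lowest mean
# entropy as a proxy for the highest predictability.
```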
In the following sections we discuss: 1) how entropy helps determine a suitable decomposition level, as a novel contribution; 2) how entropy helps determine a suitable wavelet filter, likewise novel; 3) how the findings of Quilty and Adamowski (2018), such as the elimination of boundary-affected data, are incorporated into a correct WDDFF built on our new Maximal Overlap Discrete Wavelet Entropy Transform (MODWET); 4) the implementation of MODWET in a WDDFF for a real-world one-month-ahead streamflow forecasting case study on the CAMELS data set (Addor et al. 2017), the overall shape of which is sketched below; and 5) a comparison of the WDDFF and DDFF results to determine which is more accurate.
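For orientation, the skeleton below contrasts the two frameworks compared in this study: a DDFF fitted on lagged raw flows versus a WDDFF fitted on lagged subband coefficients with boundary-affected steps trimmed. It is a structural sketch only: the series is synthetic, linear regression replaces the forecasting models evaluated later, and pywt.swt again stands in for the causal MODWT-based decomposition used in the actual framework:

```python
import numpy as np
import pywt
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error

def lagged(series, n_lags):
    """Inputs of n_lags previous values paired with one-step-ahead targets."""
    X = np.column_stack([series[i:len(series) - n_lags + i]
                         for i in range(n_lags)])
    return X, series[n_lags:]

rng = np.random.default_rng(7)
q = np.cumsum(rng.standard_normal(512))     # synthetic monthly-flow stand-in

# DDFF: lagged values of the raw series as inputs
X, y = lagged(q, n_lags=3)

# WDDFF: lagged subband coefficients as inputs, after trimming the
# boundary-affected steps (21 for a length-8 filter at level 2)
trim = 21
bands = [c for pair in pywt.swt(q, "db4", level=2) for c in pair]
Xw = np.column_stack([lagged(b[trim:], 3)[0] for b in bands])
yw = q[trim + 3:]

for name, (A, b) in {"DDFF": (X, y), "WDDFF": (Xw, yw)}.items():
    split = int(0.8 * len(b))               # chronological train/test split
    model = LinearRegression().fit(A[:split], b[:split])
    print(name, mean_squared_error(b[split:], model.predict(A[split:])))
```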