A Novel Integration of Hodrick-Prescott Filter and Harmonic Analysis with Machine Learning Methods to Enhance Time Series Prediction Accuracy of Daily and Monthly Wind Speeds

In this work, a new hybrid algorithm for modelling time series of daily and monthly wind speed is proposed. The method utilizes Hodrick-Prescott Filter (HPF) to decompose raw wind speed data into trend and cyclic components, and harmonic analysis (HA) is thereafter used to decompose the cyclic component into the periodic and stochastic sub-components. Machine learning (ML) methods are then used to model the time series of both the trend and stochastic components. The predicted wind speeds are finally summed from the individual predictions of the ML methods and harmonic analyses. To highlight the considerably higher predictive accuracy that results from the introduced data pretreatments with HPF and HA, the proposed hybrid algorithm is compared against the traditional ML methods that are not subjected to the pretreatments. The proposed hybrid algorithms are highly accurate relative to the traditional ML methods, reflecting much higher coefficients of determination and correlation coefficients, and much lower error indices. Artificial neural networks (ANNs), linear regression with interactions (LRI), support vector machine (SVM), rational quadratic Gaussian process regression (RQGPR), fine regression trees (FRTs) and boosted ensembles of trees (BETs) are used as the illustrative machine learning methods. To guarantee both versatility and robustness, the methods are tested on example data drawn from both temperate and tropical conditions.


Introduction
Wind energy derives from the kinetic energy of atmospheric air molecules. The heating effects of solar energy absorbed by the earth's surface and atmosphere create atmospheric thermal differentials that drive wind. Energy is tapped from wind when the moving air exerts an impulsive torque on a rotor via the rotor blades. In modern times rotors drive electric generators, while historically they drove grain grinders, water pumps, etc. [1]. Though wind energy exploitation dates back five thousand years, the majority of the technological advancement was witnessed in the last thirty years [2] due to growing issues of energy security and environmental protection. The concerns about energy security and environmental protection stimulated research interest in wind energy, which has succeeded in shrinking the limitations on wind speed forecasting, wind turbine technology, wind energy policy, etc. [3]. Wind energy systems perform best when located in prominent or unobstructed areas like hill-tops and offshore, as long as they do not compromise tourism value [4]. According to Keivanpour et al. [5], who summarized the results of recent works focusing on potential offshore wind energy around the world, offshore holds stronger and larger wind energy resources than land. Wind energy is becoming an important renewable resource for power generation globally, and its use is increasing annually [6]. The cost of wind energy has continued to fall as the technology, manufacture, location and maintenance of wind turbines become more standardized and efficient. As a result, the global wind power capacity grew from 74 GW to 487 GW between 2006 and 2016, while the annual rate of global wind power addition grew from 15% in 2005 to a peak of 64% in 2015 [7]. There are, therefore, prospects of surmounting the challenges against achieving the global wind power target of 1000 GW by 2030 [2].
Since terrestrial solar energy is intermittent, the resulting wind speed is also intermittent and thus difficult to model and predict, which makes the decision about optimal wind turbine location a difficult task. The problem of intermittency limits the deployment of wind energy systems in large-scale electricity utilities, hence the need for reliable modelling of wind speed. Therefore, research effort has been devoted to improving the accuracy of wind speed prediction. The several ways to predict wind speed include (1) using geographic location and seasonality parameters as predictors, see Fadare [8] who used latitude, longitude and month number as the predictors of wind speed, (2) using meteorological parameters as predictors, see [9,10] in which air temperature, relative humidity and vapor pressure were used to predict monthly mean daily wind speed, and (3) using historical wind speed data as the predictors, see [11]. Multiple linear regression methods [12,13], stochastic regression methods [11] and artificial intelligence methods [10] have been used to correlate these classes of predictors with wind speed. The following review is tailored to the scope of this work by focusing on the application of artificial intelligence in time series modelling and forecasting of wind speed.
Neural networks based on adaptive linear element, back propagation, and radial basis function were found to be comparatively effective in 1-h-ahead wind speed forecasting [14]. Liu et al. [11] found that Wavelet Packet-ANN performed best amongst the three hybrid models compared for prediction of half-hourly wind speed. Two methods hybridized from five-three-Hanning weighted average smoothing, ensemble empirical mode decomposition, and nonlinear autoregressive neural networks were shown in [15] to predict ten-minute wind speed data better than traditional methods like ARIMA and persistence methods. Ensemble empirical mode decomposition was used to preprocess wind speed data for a component-by-component adaptive prediction using wavelet neural networks in [16].
A secondary decomposition algorithm, which combined wavelet packet decomposition and the fast ensemble empirical mode decomposition, was used to improve the capacity of Elman neural networks for multi-step wind speed prediction [17]. Four decomposition algorithms (wavelet decomposition, wavelet packet decomposition, empirical mode decomposition and fast ensemble empirical mode decomposition) were jointly used to improve the accuracy of extreme learning machines in multi-step wind speed forecasting [18]. Fast ensemble empirical mode decomposition was used for data preprocessing in a method that trained a regularized extreme learning machine by a backtracking search algorithm for forecasting of wind speed in short-term horizons [19]. A combination of non-positive constraint theory, neural networks, and nonlinear and linear statistical models, enhanced with a modified cuckoo search algorithm, was proposed in [20] and found to be more effective for forecasting wind speed than the hybrid nonlinear or linear single models. A hybrid of support vector regression, stacked de-noising auto-encoder and unscented Kalman filter, which embodied a novel filtering approach that combines statistical and numerical weather prediction models, was proposed and validated in [21] to accurately predict wind speed on short- and long-term horizons. In [22], variational mode decomposition was used to decompose wind speed time series into different intrinsic mode functions to reduce nonstationary behaviour. Then, ANN was used to build sub-models from which predicted wind speeds were integrated. The method was seen to perform better than the ANNs based on wavelet decomposition and empirical mode decomposition used earlier in [23,24].
In a similar study [25], a combination of the complementary ensemble empirical mode decomposition with adaptive noise and the variational mode decomposition was used to decompose original wind speed series into intrinsic mode functions of different frequencies, and an improved AdaBoost.RT algorithm coupled with extreme learning machine was used to forecast the decomposed modes. Another study based on complementary ensemble empirical mode decomposition with adaptive noise improved the accuracy of short-term wind speed forecasting by using ARIMA in selecting the best input variables and implementing an error correction [26]. In [27], complete ensemble empirical mode decomposition was used to reinforce wavelet packet decomposition of wind speed time series before the application of different ANNs for modelling of the components from which the forecasted wind speeds were integrated. Wind speed decomposition with complete ensemble empirical mode decomposition was reinforced with the empirical wavelet transform before the application of an ANN improved by the flower-pollination algorithm for wind speed forecasting [28]. Nonlinear autoregressive ANN with exogenous inputs was demonstrated in [29] to be superior to the nonlinear autoregressive model in the prediction of 1-year hourly wind speed data. Back propagation and support vector machine methods, their hybrids with empirical mode decomposition and wavelet transform, and an ensemble of the methods were used to predict wind speed in [30], where the ensemble approaches were found to predict better than the individual methods. A method which used analysis of variance to classify wind data into different categories, used stacked de-noising auto-encoder for training the classified data and finally used extreme learning machine to fine-tune and forecast from the trained model was proposed in [31]. This method predicted better than the adaptive neuro-fuzzy inference system.
Other decomposition-based hybrid forecasting methods that are implemented with extreme learning machine were verified experimentally in [32]. In the work [33], a method which first decomposed wind speed data using the wavelet technique, followed by the modelling of the decomposed wind speed data sets using a recurrent wavelet neural network, was seen to out-perform a conventional recurrent neural network. Hybrid methods based on wavelet decomposition and wavelet neural network optimized with the Cuckoo search algorithm were used to predict wind speed data collected at two wind farms in China [34]. A novel method which applies echo state network to combine forecasts of several hybrid models was proposed in [35]. A two-part scheme consisting of point prediction based on nonlinear combination and interval prediction based on fuzzy clustering was successfully used to predict wind speed in [36]. Other works on forecasting wind speed time series using artificial intelligence can be found in [37][38][39][40].
The reviewed works indicate that successful prediction of wind speed relies on making an appropriate choice of filtering, decomposition or classification techniques and applying appropriate modelling approaches to the sub-models. Based on the above review, the popular decomposition techniques in wind speed forecasting are empirical mode decomposition, ensemble empirical mode decomposition, fast ensemble empirical mode decomposition, wavelet decomposition, wavelet packet decomposition and their modifications and hybrids. In this work, HPF is applied in decomposing wind speed time series, followed by a hybrid use of HA and ML for predicting wind speed from the decomposition components. Hodrick-Prescott Filter is a popular tool in macroeconomics for separating short-run fluctuations from long-run trend [41], and it is innovatively applied here to wind speed time series prediction. The sequential application of HPF and HA to enhance the accuracy of ML methods for wind speed time series prediction is the major contribution of this work. Section 2 describes the elements of the proposed methodology including the goodness of fit metrics, while Sect. 3 presents the results and discussion, which highlight the capacity of the proposed sequential application of HPF and HA for enhancing the accuracy of ML methods. Section 4 summarizes the conclusions drawn from the study.

Hodrick-Prescott Filter
Hodrick-Prescott Filter was developed in macroeconomic analysis for decomposition of observed time series into trend and cyclic parts [42]. The filter is based on minimizing the objective function
$$\sum_{j=1}^{J} \left(d_j - d_{j,T}\right)^2 + \lambda \sum_{j=2}^{J-1} \left[\left(d_{j+1,T} - d_{j,T}\right) - \left(d_{j,T} - d_{j-1,T}\right)\right]^2$$
with respect to the trend values $d_{j,T}$, where $d_j$ is the observation at the j-th time step, J is the sample size, and λ is the smoothing parameter, which takes the values 100, 1600 and 14400 for yearly, quarterly and monthly data periodicities, respectively. The first sum captures the minimization of the cyclic component $d_j - d_{j,T}$, while the second sum captures the minimization of the second-order difference of the trend component. Hodrick-Prescott Filter is introduced in this work for decomposition of wind speed time series as the first step of seasonality adjustment.
At every j-th time step for j = 1, 2, …, J, HPF decomposes the observation $d_j$ as follows:
$$d_j = d_{j,T} + d_{j,C}$$
where $d_{j,T}$ is the trend component and $d_{j,C}$ is the cyclic component. It is known that seasonality compromises the predictive capacity of ANNs in many time series applications [43][44][45]; this motivates the separation of $d_{j,C}$ from the raw data for further treatment.
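The decomposition above can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes the standard closed-form solution of the HPF objective, in which the trend solves the linear system $(I + \lambda D^{\mathsf{T}} D)\,\tau = d$ with $D$ the second-order difference operator. The function name `hp_filter` and the synthetic series are hypothetical and used for illustration only.

```python
import numpy as np

def hp_filter(d, lam=1600.0):
    """Split a series d into (trend, cyclic) parts with the Hodrick-Prescott filter."""
    d = np.asarray(d, dtype=float)
    J = d.size
    # Second-order difference operator D of shape (J-2, J):
    # (D @ tau)[j] = tau[j] - 2*tau[j+1] + tau[j+2]
    D = np.zeros((J - 2, J))
    for j in range(J - 2):
        D[j, j:j + 3] = [1.0, -2.0, 1.0]
    # Setting the gradient of the objective to zero gives (I + lam * D^T D) tau = d
    trend = np.linalg.solve(np.eye(J) + lam * D.T @ D, d)
    return trend, d - trend

# Synthetic example (not data from the study): slow drift plus a 12-step cycle
t = np.arange(120)
d = 5.0 + 0.01 * t + np.sin(2 * np.pi * t / 12)
trend, cyclic = hp_filter(d, lam=14400.0)
```

The dense solve keeps the sketch short; for long daily records a sparse banded solver would likely be preferable.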

Harmonic Analysis
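Harmonic analysis is based on the Fourier series of a function that satisfies the Dirichlet conditions. The Fourier series for a periodic function of time $y(t)$ is
$$y(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} \left[a_n \cos(n\omega t) + b_n \sin(n\omega t)\right]$$
where
$$a_n = \frac{2}{T}\int_0^T y(t)\cos(n\omega t)\,dt, \qquad b_n = \frac{2}{T}\int_0^T y(t)\sin(n\omega t)\,dt$$
and $T = 2\pi/\omega$ is the period. The idea here is to consider the cyclic component $d_{j,C}$ from the HPF as a linear combination of a periodic function and a stochastic function. This is conceived to be analogous to considering an experimental measurement of a periodic system to have been compromised by random error. Therefore,
$$d_{j,C} = y_{CP}(t_j) + d_{j,CS}$$
where $y_{CP}(t_j)$ is the periodic sub-component and $d_{j,CS}$ is the stochastic sub-component. Integral quadrature is used to evaluate the coefficient integrals from the discrete data, which gives the weighted sums in Equations (8) and (9), where $C_j$ are coefficients that depend on the limits of integration, the number of integration intervals (J − 1) and the order or degree of the integrand. The Newton-Cotes methods are used here for numerical integration. The stochastic component, therefore, becomes
$$d_{j,CS} = d_{j,C} - y_{CP}(t_j) \qquad (10)$$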
The daily and monthly data are considered to have yearly periodicity, meaning that $t_j$ is the day or month number for the daily and monthly data, respectively. The seasonality adjusted data (the data resulting from the exclusion of the periodic component from the raw data), given as
$$d_{j,A} = d_j - y_{CP}(t_j) = d_{j,T} + d_{j,CS}$$
can then be modeled with various ML methods.
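As an illustration of the harmonic analysis step, the following Python sketch estimates the Fourier coefficients of a cyclic component by numerical quadrature and subtracts the fitted periodic part. It rests on several assumptions that are not taken from the source: the composite trapezoidal rule (the simplest Newton-Cotes formula) supplies the quadrature coefficients, the record spans a whole number of periods, and the number of retained harmonics is a free choice. The names `trapezoid_weights` and `harmonic_fit` and the synthetic data are hypothetical.

```python
import numpy as np

def trapezoid_weights(t):
    """Quadrature coefficients C_j of the composite trapezoidal rule on nodes t."""
    h = np.diff(t)
    C = np.zeros_like(t, dtype=float)
    C[:-1] += h / 2.0
    C[1:] += h / 2.0
    return C

def harmonic_fit(t, d_C, period, n_harmonics):
    """Return the periodic component y_CP(t_j) fitted to the cyclic component d_C."""
    t = np.asarray(t, dtype=float)
    d_C = np.asarray(d_C, dtype=float)
    omega = 2.0 * np.pi / period
    span = t[-1] - t[0]                 # assumed to cover a whole number of periods
    C = trapezoid_weights(t)
    y_CP = np.full_like(t, np.sum(C * d_C) / span)   # mean term a_0 / 2
    for n in range(1, n_harmonics + 1):
        a_n = 2.0 / span * np.sum(C * d_C * np.cos(n * omega * t))
        b_n = 2.0 / span * np.sum(C * d_C * np.sin(n * omega * t))
        y_CP += a_n * np.cos(n * omega * t) + b_n * np.sin(n * omega * t)
    return y_CP

# Synthetic monthly example (not data from the study): ten full years
rng = np.random.default_rng(0)
t = np.arange(0, 12 * 10 + 1)
periodic = 2.0 * np.cos(2 * np.pi * t / 12) + 0.5 * np.sin(4 * np.pi * t / 12)
d_C = periodic + 0.1 * rng.standard_normal(t.size)   # cyclic component with noise
y_CP = harmonic_fit(t, d_C, period=12.0, n_harmonics=2)
d_CS = d_C - y_CP                                    # stochastic sub-component
```

Adding `d_CS` to the HPF trend would then give the seasonality adjusted series that is passed to the ML methods.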

The ANNs
Multilayer perceptron (MLP), based on the back-propagation algorithm, is the ANN architecture adopted here. The details of the back-propagation algorithm can be found in [46]. A parallel set of nodes or neurons constitutes a layer of an MLP, while the various layers are connected in series to form a functional MLP network. For the ith node in the pth layer, where p = 1, 2, …, P (see Fig. 1), the nodal inputs $x_i$ are combined into a weighted sum to which the transfer function of the layer is applied. In MLP, the nodal outputs of the pth layer are channeled into the nodes of the (p + 1)th layer as inputs, and so on. For example, a single hidden layer network depicted in Fig. 2(a) (that is, the shallow network with P = 1) has the single output
$$y_{j,A} = f\left(W g\left(W^{(1)} x_j\right)\right) \qquad (14)$$
where the input is $x_j \in \mathbb{R}^K$, K is the number of variables in the input layer of the ANN (the number of lags), $W^{(1)}$ is the weight matrix of the hidden layer and $W$ is the weight matrix of the output layer. A two-layer network represented in Fig. 2(b) (that is, the deep network with P = 2) has the single output
$$y_{j,A} = f\left(W g\left(W^{(2)} g\left(W^{(1)} x_j\right)\right)\right)$$
where the weight matrix $W^{(1)}$ is the same as above and $W^{(2)}$ is the weight matrix of the second hidden layer.
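The forward pass of the shallow and deep networks can be sketched in a few lines of Python. This is an illustrative sketch only: bias terms are omitted, the layer sizes and random weights are arbitrary, and `np.tanh` stands in for the tangent sigmoid transfer function (the two are mathematically identical) with an identity output function. The names `shallow_mlp` and `deep_mlp` are hypothetical.

```python
import numpy as np

def shallow_mlp(x, W1, W):
    """Single hidden layer: y = f(W g(W1 x)) with g = tanh and f = identity."""
    return W @ np.tanh(W1 @ x)

def deep_mlp(x, W1, W2, W):
    """Two hidden layers: outputs of the first hidden layer feed the second."""
    return W @ np.tanh(W2 @ np.tanh(W1 @ x))

rng = np.random.default_rng(1)
K, H = 4, 6                         # K lagged inputs, H hidden nodes (illustrative sizes)
x_j = rng.standard_normal(K)        # one input vector of K lags
W1 = rng.standard_normal((H, K))    # input layer -> first hidden layer
W2 = rng.standard_normal((H, H))    # first hidden layer -> second hidden layer
W = rng.standard_normal((1, H))     # last hidden layer -> single output
y_shallow = shallow_mlp(x_j, W1, W)
y_deep = deep_mlp(x_j, W1, W2, W)
```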

The network error of Equation (20), which measures the deviation of the network outputs $y_{j,A}$ from the targets $d_{j,A}$, where $d_{j,A}$ is the target at the j-th time step, is minimized in terms of the weight parameters $w^{(p)}_{ki}$ and $w_{ki}$ using a weights-adjustment algorithm that is initiated with a numerical choice of the weight matrices. A properly specified learning rate η is used to increment the weight parameters from the q-th iteration to the (q + 1)-th iteration along the negative gradient of the error. The iterations are implemented using the chain rule and the back-propagation algorithm [46]. The Levenberg-Marquardt algorithm, as adopted in this work, is typically used to train MLPs. Here, the functions g and f are based on tangent sigmoid (tansig) and linear (purelin) transfer functions, respectively.
In the regression model, ϵ is a normally distributed error with zero mean and fixed variance $\sigma^2$, and $\boldsymbol{\beta}$ is a vector of basis function coefficients. To explain the response in Gaussian process regression, the latent variables $f(x_j)$ are introduced for j = 1, 2, …, J to form a Gaussian process. The details on Gaussian process regression can be found in [48]. In addition to the above-described regression methods, fine regression trees and boosted ensembles of trees as available in MATLAB are also used for illustration in this work.
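The role of the learning rate in the weight update can be illustrated with plain gradient descent on a shallow network. This sketch is deliberately simpler than the Levenberg-Marquardt training adopted in the work, and it assumes a half sum-of-squares error, no bias terms, a tanh hidden layer and a linear output. The function names, layer sizes, learning rate and synthetic data are all hypothetical.

```python
import numpy as np

def error(W1, w, X, d):
    """Half the sum of squared differences between targets and network outputs."""
    return 0.5 * np.sum((d - np.tanh(X @ W1.T) @ w) ** 2)

def gradient_step(W1, w, X, d, eta):
    """One update of the form w(q+1) = w(q) - eta * dE/dw for both weight sets."""
    A = np.tanh(X @ W1.T)                  # hidden-layer outputs, shape (J, H)
    e = A @ w - d                          # output errors, shape (J,)
    grad_w = A.T @ e                       # dE/dw for the output weights
    # Chain rule: propagate the output error back through the tanh hidden layer
    delta = (e[:, None] * w[None, :]) * (1.0 - A ** 2)
    grad_W1 = delta.T @ X                  # dE/dW1 for the hidden weights
    return W1 - eta * grad_W1, w - eta * grad_w

rng = np.random.default_rng(2)
X = rng.standard_normal((50, 3))           # 50 synthetic input vectors of K = 3 lags
d = np.sin(X.sum(axis=1))                  # synthetic targets
W1_0 = 0.5 * rng.standard_normal((5, 3))   # initial numerical choice of the weights
w_0 = 0.5 * rng.standard_normal(5)

W1_q, w_q = W1_0, w_0
for q in range(500):
    W1_q, w_q = gradient_step(W1_q, w_q, X, d, eta=5e-4)
```

Comparing `error(W1_q, w_q, X, d)` with `error(W1_0, w_0, X, d)` shows how far the iterations have reduced the error; too large a value of `eta` would make the iterations diverge.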

Goodness of Fit Indices
The accuracy of the models is assessed and compared using the following goodness of fit (GOF) indices: the coefficient of determination $R^2$, root mean square error (RMSE), bias (MBE), absolute bias (MABE), mean percentage error (MPE), and correlation coefficient (CC). They are expressed in terms of the predictions $y_j$ and the observations $d_j$, where $y_j = y_{j,A} + y_{CP}(t_j)$ and $d_j = d_{j,A} + y_{CP}(t_j)$.
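For orientation, the indices can be computed as in the following Python sketch. The formulas are the commonly used textbook definitions and are assumptions here, since the exact expressions of the source are not reproduced; in particular, MPE is taken as the mean absolute percentage deviation, which may differ from the authors' definition. The function name `gof_indices` and the sample values are hypothetical.

```python
import numpy as np

def gof_indices(d, y):
    """Common goodness-of-fit indices for observations d and predictions y."""
    d = np.asarray(d, dtype=float)
    y = np.asarray(y, dtype=float)
    err = y - d
    return {
        "R2": 1.0 - np.sum(err ** 2) / np.sum((d - d.mean()) ** 2),
        "RMSE": np.sqrt(np.mean(err ** 2)),
        "MBE": np.mean(err),                       # signed mean bias
        "MABE": np.mean(np.abs(err)),              # mean absolute bias
        "MPE": 100.0 * np.mean(np.abs(err / d)),   # assumed form; needs nonzero d
        "CC": np.corrcoef(d, y)[0, 1],
    }

# Hypothetical observed and predicted wind speeds in m/s
d_obs = [4.0, 5.0, 6.0, 5.0]
y_pred = [4.2, 4.8, 6.1, 5.0]
scores = gof_indices(d_obs, y_pred)
```

Evaluating the same function separately on the training and testing sets makes any devaluation of the indices between the two directly comparable.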

The Analytical Procedure
The proposed analytical procedure is illustrated in Fig. 3. Firstly, the raw wind speed data is decomposed into trend […]
Daily and monthly mean wind speed data for Enugu (6. […]
[…] significantly devalued indices for testing compared to training, meaning that the trained models cannot be reliably used for predicting from independent data. The improvement is visible in the figures as less horizontal scatter, which indicates the significantly better capacity of the enhanced ML methods to model the variation in the target data. This means that the presence of seasonality in the raw data weakens the predictive capacity of the non-enhanced methods. In terms of the performance index MPE, the enhancement of ANN with HPF and HA reduced the training error from 23.04- […]
[…] methods by comparing the corresponding values in the tables of GOF indices. The need to watch against over-fitting arises with the use of BET (and FRT to a lesser degree), especially in the prediction of daily wind speed. This can be seen in Tables 1, 3, 5 and 7 as a more pronounced devaluation of the GOF indices (see $R^2$ for example) of testing compared to training than can be seen for the other ML methods.

Conclusions
In this work, the capacities of various machine learning methods for time series modelling and forecasting of daily and monthly wind speed are enhanced by introducing two-step filtration of the target data using HPF and HA, in that order.
The key results are itemized as follows: Predictive accuracy is improved when the following four analytical steps are integrated in the so-called enhanced ML methods. Firstly, the raw wind speed data is decomposed into trend and cyclic components using HPF. Secondly, the cyclic component is decomposed into the periodic and stochastic sub-components using HA. Thirdly, ML methods are applied in modelling the sum of the trend and stochastic time series. Finally, wind speed predictions are integrated from the outputs of the ML methods and harmonic analyses.
Compared to the non-enhanced ML methods, which are applied directly on the raw data, the enhanced ML methods are found to have markedly improved prediction accuracy of daily and monthly wind speed time series. For example, mean percentage error typically reduced from 44.85% to 2.15% and from 46.80% to 6.81% for monthly and daily predictions, respectively, on testing the ANN-trained models. The poor performance of the non-enhanced ML methods results from the effects of seasonality in the raw data.
Amongst the six ML methods adopted for illustrative purposes, ANN, LRI, SVM and RQGPR are recommended because of the high predictive accuracy of the trained models on new data, while the tree-based methods (BET and FRT) are not recommended because of overfitting issues.

Figure 2 The MLP ANNs with (a) a single hidden layer (b) two hidden layers
Figure 3 The proposed wind speed prediction procedure.
Plots of monthly wind speed predictions for Stuttgart (a) with HPF and HA pretreatments (b) without HPF and HA pretreatments
Figure 8 Plots of daily wind speed predictions for Enugu (a) with HPF and HA pretreatments (b) without HPF and HA pretreatments