Novel Residual Hybrid Machine Learning for Solar Activity Prediction in Smart Cities

doi:10.21203/rs.3.rs-3141445/v1


Research Article


This work is licensed under a CC BY 4.0 License

Journal Publication

Published 01 Nov, 2023 in Earth Science Informatics. This is the latest preprint version.

Predicting global solar activity is crucial for smart cities, especially for space activities, the communication industry, and climate change monitoring. Recently developed models for predicting solar activity, whether stand-alone machine learning and deep learning models or hybrids of them, are promising. Yet they may not be effective at capturing simpler linear patterns in the data, and they often fail to provide reliable predictions because they are computationally expensive and complex. This article proposes a novel residual hybrid machine learning method that integrates linear regression machine learning with a deep learning neural network to improve on the predictive accuracy of individual machine learning models while reducing complexity. The residual hybrid model leverages the capacities of the support vector machine (SVM) and the long short-term memory (LSTM) neural network to form a hybrid SVM-LSTM model. The performance of the model is evaluated using the correlation coefficient, determination coefficient, root-mean-squared error (RMSE) and mean absolute error (MAE). The simulation results indicate that, compared to the stand-alone LSTM, the SVM-LSTM reduces the training and testing RMSE by 76.62% and 71.18%, respectively. Compared to the stand-alone SVM, it reduces the training and testing RMSE by 77.06% and 71.81%, respectively. The proposed model can be implemented as a reliable solution for accurately predicting solar activity in smart cities.

Smart city

Artificial Intelligence

Deep learning

hybrid machine learning

prediction

solar activity

sunspots number

Solar activity, in the form of solar wind and electromagnetic radiation, has a substantial impact on space and weather events and could seriously affect smart cities, especially radio communication, satellite longevity, cable networks, navigation systems, and climate change [1] [2]. Global solar activity can be quantified in terms of the sunspot number (SSN) [3]. Sunspots are regions of the sun that appear darker than the surrounding area. Their diameters vary between tens of thousands and hundreds of thousands of kilometers [4]. Sunspots are a natural phenomenon that occurs in cycles lasting around 11 years. The ionosphere of the Earth, which is the layer of the atmosphere that is ionized by solar and cosmic radiation, is susceptible to their effects. Sunspot-induced ionosphere disruptions can interfere with the radio frequencies used by satellites to communicate with ground stations. This can be problematic for several sectors that depend on satellite communication, such as remote sensing, navigation, and television and radio broadcasting [5].

Hence, these sectors must be aware of any potential impacts of solar activity on their operations. Monitoring solar activity and taking the necessary precautions to prevent disruptions can help lessen the impact on these industries. Solar activity can be predicted in terms of the SSN using statistical, machine learning and deep learning models [6].

A. Motivation of the Study

Predicting solar activity is a daunting task due to the many factors that influence it, including magnetic fields and solar flares. Moreover, solar activity is known to undergo cycles of varied amplitudes and lengths, making accurate prediction of the timing and intensity of the cycles difficult. Solar activity exhibits both linear and nonlinear trends. The 11-year solar cycle, for instance, which is the most observable variation in solar activity, is a comparatively linear pattern, whereas the solar activity within a cycle involves complicated nonlinear variations.

Several models have been proposed in the literature to predict solar activity via sunspot numbers (SSN). The recently developed models are mainly based on machine learning (ML) and deep learning (DL). ML and DL models are designed to learn intricate, nonlinear correlations in a dataset. They are successful in capturing complex patterns within the data and have demonstrated promising performance in the prediction of solar activity using SSN. However, whether used stand-alone or as a hybrid, these models may not be as effective at capturing simpler linear patterns in the data, and they often provide unsatisfactory predictions. Additionally, combining two DL models, or a DL model with a nonlinear ML model, increases computational complexity and expense.

Linear regression ML models such as SVM can be a cost-effective approach for predicting the solar activity, with good prediction performance. The combinations of linear regression ML and DL can improve the prediction accuracy by leveraging the potentials of both methods and provide a less expensive and computationally effective hybrid model. The linear regression ML model would capture the linear relationships in the data, while the DL model, on the other hand, can capture more complicated and nonlinear patterns that the linear regression may leave out.

In this context, the current study proposes a novel residual hybrid machine learning model that integrates linear regression SVM and deep learning LSTM to improve on the prediction accuracy of stand-alone ML and DL models while avoiding the complexity of a hybrid of two DL models.

B. Contributions

The main contributions of this study are:

The study employs the linear regression support vector machine (SVM) learning method and the deep learning long short-term memory (LSTM) neural network.

It develops a hybrid SVM-LSTM model by integrating the SVM and the LSTM.

Key performance metrics, namely RMSE, MAE, R, and R², are derived and assessed to verify the accuracy of the proposed model.

The proposed model improves on the performance of existing state-of-the-art stand-alone models.

The proposed model offers high accuracy with reduced computational cost and complexity.

Examples of applications in microgrid energy system scenarios

The solar activity can be reliably predicted via SSN using different modelling techniques [6]. In the literature, several models have been proposed to predict the future dynamics of solar activity using SSN. The vast majority of the models employ the daily and monthly average SSN time series data. However, the SSN time series data reflect seasonality because sunspots have an 11-year solar cycle, making the modelling process extremely complex and cumbersome [7].

Statistical modelling applies statistical analysis to datasets to discover correlations between variables and to project future values of the target variable. A statistical model that predicts future values based on past values is classified as autoregressive. Autoregressive models work on the concept that past values influence current values, which makes the technique popular for analyzing natural phenomena and other time-varying processes. Multiple regression methods predict a variable using a linear relationship between the predictors, whereas autoregressive models use a combination of previous values of the variable itself [6]. Yule [7] proposed what is widely regarded as the first statistical model to predict solar activity using SSN; the model is based on the auto-regressive (AR) approach. Xu et al. [8] used a two-step modelling technique to forecast solar cycle 23 using SSN data from 1848 to 1992. Empirical Mode Decomposition (EMD) is first used to extract the frequency components of the data, and the extracted information is then used to create the AR model. The proposed model was compared to the classic precursor approach and the solar dynamo approach and was found to be more accurate in predicting solar cycle 23. Similarly, curve fitting and AR are used in [9] to model the SSN for solar cycle 23. The autoregressive integrated moving average (ARIMA) method is employed by Abdel-Rahman and Marzouk [2] to predict solar cycle 24, while Liu et al. [10] employed a Gaussian function to estimate the solar cycle 24 activity. Du [11] used the value 39 months before the solar minimum to predict the peak of solar cycle 25. While statistical models can offer valuable predictions of sunspot activity, they tend not to capture the full complexity of the processes involved. This is owing in part to the fact that sunspot activity is affected by a variety of factors, some of which may be difficult to quantify. Additionally, the vast majority of statistical models are linear and perform poorly when applied to highly nonlinear and fuzzy systems.

Machine learning (ML) regression is a black-box modelling method used to create an intelligent model that is trained to capture the relationship between data variables. The model can then be used to predict future values of the dependent variable [12]–[14]. Many studies leverage the power of ML to improve the accuracy of solar activity prediction. The artificial neural network (ANN) has demonstrated high prediction accuracy in comparison with AR and ARIMA [15], [16]. Dani and Sulistiani [17] developed and compared the performance of four ML-based models to forecast solar cycle 25, using monthly average SSN data from 1856 to 2018. The proposed ML models, namely support vector machine (SVM), random forest (RF), radial basis function (RBF), and linear regression (LR), projected that solar cycle 25 would occur in late 2019 or early 2020. The adaptive neuro-fuzzy inference system (ANFIS) integrates the learning capabilities of the ANN and the ability of the fuzzy inference system (FIS) to handle process uncertainties [18]. Parsapoor et al. [19] trained and validated an ANFIS model for the prediction of solar cycle 25. The model is developed using monthly average SSN data from cycles 16–24. In a similar fashion, Novitasari et al. [20] combined a Markov chain algorithm and FIS for the prediction of SSN.

Deep learning (DL) is a subclass of ML that uses multiple layers to improve on the performance of classical ML. DL algorithms, unlike statistical methods and classical ML, can handle data directly without the need for preprocessing (such as feature extraction). They are also capable of handling unstructured data [21]. DL algorithms have been successfully applied in different time series prediction applications, including solar activity prediction. The LSTM, a recurrent neural network (RNN) that can feed back not only the errors but also the weights from previous learning stages during training, is a popular choice for time series applications. Depending on the network and system complexity, the LSTM network can learn over more than 1000 time steps [22]. The LSTM neural network has been employed by many studies to model the monthly average SSN [23]–[25]. Pala and Atici [26] compared the performance of LSTM and neural-network auto-regression (NNAR) for prediction of monthly SSN and demonstrated the superiority of LSTM.

Recent studies have considered the integration of multiple ML and DL models to improve on the accuracy of stand-alone models for the prediction of solar activity. Panigrahi et al. [27] integrated two statistical models with SVM. A hybrid of PSO with extreme learning machine (ELM), with feature extraction, is presented in [28]. Khan et al. [29] combined ANN with LSTM. Benson et al. proposed a hybrid of two DL models: LSTM and WaveNet [30].

The study proposes a residual hybrid machine learning model for the prediction of solar activity in terms of sunspot numbers (SSN). The proposed method is illustrated in Fig. 1. The process begins with data preprocessing to make the data appropriate for the application and to eliminate outliers. Two stand-alone models are then developed: the support vector machine (SVM), which is a linear regression machine learning algorithm, and the LSTM, which is a deep learning algorithm. The individual models are then integrated to produce the residual hybrid model (SVM-LSTM). The performance of the individual models and the hybrid model is evaluated using statistical metrics.

3.1. Data Pre-processing

The incidence of solar activity can be identified through the international sunspot number (SSN). The monthly SSN is widely used to develop SSN forecasting models. In this study, the data consisting of the monthly average SSN over the period of January 1749 to December 2020 is used. The data is obtained from Sunspot Index and Long-term Solar Observation (SILSO) [31]. The sample of the data is presented in Table 1.

Table 1

Sample of monthly averaged sunspot number (SSN)
Year	Month	SSN	Year	Month	SSN
2019	1	7.7	2020	1	6.2
2019	2	0.8	2020	2	0.2
2019	3	9.4	2020	3	1.5
2019	4	9.1	2020	4	5.2
2019	5	9.9	2020	5	0.2
2019	6	1.2	2020	6	5.8
2019	7	0.9	2020	7	6.1
2019	8	0.5	2020	8	7.5
2019	9	1.1	2020	9	0.6
2019	10	0.4	2020	10	14.6
2019	11	0.5	2020	11	34.5
2019	12	1.5	2020	12	23.1

The collected data is preprocessed and formatted to make it suitable for modelling. In this context, the data is set up so that the SSN value from the previous month can be used to predict the SSN for the current month (one step ahead). This can be expressed mathematically as in Eq. (1).

$${x}_{t}=f\left({x}_{t-1}\right) \tag{1}$$

Following that, the pre-processed data is split in a ratio of 70:30. The 70% is used for training and the remaining 30% is used for testing. The training dataset is used during the learning stage to obtain the model parameters, and the testing dataset is used to test the models' accuracy.
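The one-step-ahead framing and the 70:30 split described above can be sketched in a few lines. This is a minimal illustration, not the authors' MATLAB code: the function names are hypothetical, and the split is assumed to be chronological (no shuffling), which the text does not state explicitly. The sample values are the 2019 rows of Table 1.

```python
def make_lagged_pairs(series):
    """Pair each value x[t-1] (input) with x[t] (target), as in x_t = f(x_{t-1})."""
    inputs = series[:-1]
    targets = series[1:]
    return inputs, targets


def chronological_split(inputs, targets, train_fraction=0.7):
    """Split without shuffling (assumption), so test data follows training data in time."""
    cut = int(len(inputs) * train_fraction)
    return (inputs[:cut], targets[:cut]), (inputs[cut:], targets[cut:])


# Monthly SSN values for 2019, taken from Table 1.
ssn_2019 = [7.7, 0.8, 9.4, 9.1, 9.9, 1.2, 0.9, 0.5, 1.1, 0.4, 0.5, 1.5]

x, y = make_lagged_pairs(ssn_2019)
(train_x, train_y), (test_x, test_y) = chronological_split(x, y)
```

With twelve monthly values there are eleven input/target pairs, of which the first seven go to training and the last four to testing.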

3.2. Support Vector Machine (SVM)

The Support Vector Machine (SVM) is one of the most popular machine learning methods, employed in myriad data-driven modelling applications including classification, regression and pattern recognition. It has a simple structure and generalizes well in many implementations. The architecture of the SVM was first introduced by Vapnik in 1995 [32]. SVM models are characterized by high learning capacity, a low error rate and reduced complexity. Depending on the application, SVM learning functions can learn both linear and nonlinear relationships in a given dataset [33]. In the case of SVM linear regression, a nonlinear mapping function maps the input vector into an m-dimensional feature space, and a linear model is then constructed in that space [34].

The training dataset is denoted $\left\{\left({x}_{i},{y}_{i}\right)\right\}_{i=1}^{m}$, in which ${x}_{i}$ is the vector of input variables, ${y}_{i}$ is the corresponding observed (experimental) value, and $m$ is the total number of data points. The SVM function can therefore be defined as in Eq. (2).

$$f\left(x\right)=\omega \cdot \theta \left(x\right)+b \tag{2}$$

where $\theta \left(x\right)$ is the nonlinear mapping of the input vector $x$ into the feature space, $\omega$ is the weight vector and $b$ is the bias.

The parameters $\omega$ and $b$ are obtained by solving the optimization problem in Eq. (3), subject to the constraints in Eq. (4):

$$\text{Minimize: } \frac{1}{2}{\Vert \omega \Vert }^{2}+C\sum_{i=1}^{m}\left({\upsilon }_{i}+{\upsilon }_{i}^{*}\right) \tag{3}$$

$$\text{Subject to } \left\{\begin{array}{l}{y}_{i}-\omega \cdot \theta \left({x}_{i}\right)-b\le \epsilon +{\upsilon }_{i}\\ \omega \cdot \theta \left({x}_{i}\right)+b-{y}_{i}\le \epsilon +{\upsilon }_{i}^{*}\\ {\upsilon }_{i},{\upsilon }_{i}^{*}\ge 0\end{array}\right. \quad \text{for } i=1,2,\dots ,m \tag{4}$$

where $\frac{1}{2}{\Vert \omega \Vert }^{2}$ is the regularization term on the weight vector $\omega$, ${\upsilon }_{i}$ and ${\upsilon }_{i}^{*}$ are slack variables, $\epsilon$ is the width of the insensitive margin, and $C$ is a positive regularization constant.

Figure 2 illustrates the simplified SVM framework. The variables ${\alpha }_{i}$ and ${\alpha }_{i}^{*}$ are the Lagrange multipliers, and the weight vector $\omega$ can be obtained from Eq. (5):

$${\omega }^{*}=\sum_{i=1}^{m}\left({\alpha }_{i}-{\alpha }_{i}^{*}\right)\theta \left({x}_{i}\right) \tag{5}$$

Hence, the general form of the SVM can be presented as in Eq. (6):

$$f\left(x,{\alpha }_{i},{\alpha }_{i}^{*}\right)=\sum_{i=1}^{m}\left({\alpha }_{i}-{\alpha }_{i}^{*}\right)K\left({x}_{i},x\right)+b \tag{6}$$

where $K\left({x}_{i},x\right)=\theta \left({x}_{i}\right)\cdot \theta \left(x\right)$ is the kernel function and $b$ is the bias.
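To make the general form in Eq. (6) concrete, the following sketch evaluates a trained support vector regressor from its dual coefficients, that is, the differences between each pair of Lagrange multipliers. It is an illustration under stated assumptions: the function names, the scalar inputs, the default linear kernel and the toy coefficient values are all hypothetical, and no training (solving for the multipliers) is shown. The epsilon-insensitive loss is included because the SVM's epsilon parameter sets the band within which errors are ignored.

```python
def linear_kernel(a, b):
    """Kernel for scalar inputs with the identity feature map (assumption)."""
    return a * b


def svr_predict(x, support_points, dual_coefs, bias, kernel=linear_kernel):
    """Evaluate f(x) = sum_i (alpha_i - alpha_i^*) K(x_i, x) + b.

    dual_coefs[i] holds the difference (alpha_i - alpha_i^*) for support_points[i].
    """
    return sum(c * kernel(x_i, x) for x_i, c in zip(support_points, dual_coefs)) + bias


def eps_insensitive_loss(y, y_hat, epsilon):
    """Errors smaller than epsilon cost nothing; larger errors cost the excess."""
    return max(0.0, abs(y - y_hat) - epsilon)


# Hypothetical toy values, chosen only to exercise the formula.
prediction = svr_predict(2.0, support_points=[1.0, 3.0], dual_coefs=[0.5, -0.25], bias=1.0)
```

Swapping `linear_kernel` for another kernel function changes the feature space without changing `svr_predict`.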

3.3. Long Short-Term Memory Neural Network (LSTM)

Feed-forward neural networks (FFNNs) are fully connected, such that every neuron feeds its output to the neurons in the next layer. They are loop-free, and none of the neurons feed back into previous layers. Hence, they are limited to static classification tasks. An FFNN can be modified to achieve dynamic classification by feeding signals from the previous time step back into the network. Such networks are referred to as Recurrent Neural Networks (RNNs). Figure 3 shows the basic architecture of RNNs.

However, ordinary RNNs cannot look back in time for more than 5–10 time steps. This is because the fed-back signal either grows exponentially or vanishes with time. A vanishing fed-back signal results in unacceptably long training times, while an exploding signal causes the weights to oscillate. Long short-term memory (LSTM) RNNs were introduced to address these problems [36]. Depending on the network and system complexity, an LSTM network can learn over more than 1000 time steps [37]. LSTMs employ a constant error carousel strategy to confine the error within special cells.

Figure 4 illustrates the basic architecture of an LSTM unit, consisting of a cell and input, forget and output gates. The plus and multiplication signs represent element-wise addition and multiplication, respectively. The input, forget and output gates are denoted by ${i}_{t}$, ${f}_{t}$, and ${o}_{t}$, respectively. Similarly, the input node, which supplies the candidate values for the cell, is represented by ${g}_{t}$. The cell captures the relationships between the various input sequences. The values flowing into the cell are controlled by the input gate. The values that stay in the cell are controlled by the forget gate. The activation of the LSTM output is calculated based on the values in the cell, and the output gate controls the extent of that activation.

Hence, the three gates regulate the flow of information into and out of the cell, and the state ${s}_{t}$ of the cell recalls past values across arbitrary time intervals. As a result, prediction problems based on a time sequence are especially well suited to the LSTM network. Based on the LSTM structure shown in Fig. 4, the following formulae can be derived:

$${i}_{t}=\sigma \left(\left({x}_{t}+{h}_{t-1}\right){W}_{i}+{b}_{i}\right) \tag{7}$$

$${f}_{t}=\sigma \left(\left({x}_{t}+{h}_{t-1}\right){W}_{f}+{b}_{f}\right) \tag{8}$$

$${o}_{t}=\sigma \left(\left({x}_{t}+{h}_{t-1}\right){W}_{o}+{b}_{o}\right) \tag{9}$$

$${g}_{t}=\tanh \left(\left({x}_{t}+{h}_{t-1}\right){W}_{g}+{b}_{g}\right) \tag{10}$$

Subsequently, the state of the cell can be obtained as

$${s}_{t}={s}_{t-1}\odot {f}_{t}+{g}_{t}\odot {i}_{t} \tag{11}$$

Where:

${W}_{i}$, ${W}_{f}$, ${W}_{o}$ and ${W}_{g}$ represent the weight matrices connecting the corresponding input signals,

$\odot$ is element-wise multiplication,

$\tanh \left(\cdot \right)$ represents the hyperbolic tangent function, and

$\sigma \left(\cdot \right)$ is the sigmoid activation function.
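The gate and cell-state equations above can be traced with a single forward step written in plain Python. This is a sketch under assumptions rather than the network used in the study. The combination of the input and the previous hidden state is read as vector concatenation. The hidden output is taken as the output gate times the tanh of the cell state, which is the usual LSTM convention and is not spelled out in the text. The helper names and the all-zero toy parameters are hypothetical.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def affine(vector, weights, bias):
    """Return one pre-activation per hidden unit: weights[j] . vector + bias[j]."""
    return [sum(w * v for w, v in zip(row, vector)) + b
            for row, b in zip(weights, bias)]


def lstm_step(x_t, h_prev, s_prev, params):
    """One LSTM time step; x_t, h_prev and s_prev are lists of floats."""
    # Assumption: the input and the previous hidden state are concatenated.
    z = list(x_t) + list(h_prev)
    i_t = [sigmoid(a) for a in affine(z, params["W_i"], params["b_i"])]    # input gate
    f_t = [sigmoid(a) for a in affine(z, params["W_f"], params["b_f"])]    # forget gate
    o_t = [sigmoid(a) for a in affine(z, params["W_o"], params["b_o"])]    # output gate
    g_t = [math.tanh(a) for a in affine(z, params["W_g"], params["b_g"])]  # input node
    # Cell state: keep part of the old state, add gated candidate values.
    s_t = [sp * f + g * i for sp, f, g, i in zip(s_prev, f_t, g_t, i_t)]
    # Assumption: standard LSTM hidden output, not given explicitly in the text.
    h_t = [o * math.tanh(s) for o, s in zip(o_t, s_t)]
    return h_t, s_t


# Toy parameters: one input feature, one hidden unit, all weights and biases zero.
params = {
    "W_i": [[0.0, 0.0]], "b_i": [0.0],
    "W_f": [[0.0, 0.0]], "b_f": [0.0],
    "W_o": [[0.0, 0.0]], "b_o": [0.0],
    "W_g": [[0.0, 0.0]], "b_g": [0.0],
}
h_t, s_t = lstm_step([7.7], [0.0], [2.0], params)
```

With zero weights every gate evaluates to 0.5 and the input node to 0, so the cell state is simply halved; this makes the role of the forget gate easy to see.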

Using backpropagation with gradient descent, the network training seeks to minimize the standard squared error objective function [39]. The weights and biases are updated using their gradients once the network has processed a single batch of training data.

3.4. Hybrid SVM-LSTM

To boost the performance of individual models, many studies have introduced hybrid modelling techniques and applied them successfully to complex data science applications [40], [41]. In this regard, the current study implements a residual hybrid machine learning model that combines the advantages of the linear SVM and the nonlinear deep learning neural network (LSTM). The hybrid model is based on the structure proposed in [42], [43].

As illustrated in Fig. 5, the implementation of the hybrid model is done in two phases. First, SVM is used to produce the initial forecast by extracting the relationships within the original data. In the second phase, the LSTM is used to produce a residual forecast by capturing the relationships within the residuals from the first model. The two forecasts are summed up to obtain the final forecast.
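The two-phase residual scheme can be sketched independently of the specific learners. In the illustration below, a least-squares line stands in for the SVM and a nearest-neighbour lookup stands in for the LSTM; both are deliberate stand-ins chosen to keep the example self-contained, not the models used in the study. The function names and toy data are hypothetical, and feeding the second stage the same inputs as the first is an assumption, since the text does not specify the inputs of the residual model.

```python
def fit_line(xs, ys):
    """Stage-1 stand-in: ordinary least-squares line (used here in place of the SVM)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / sxx
    intercept = mean_y - slope * mean_x
    return lambda x: slope * x + intercept


def fit_nearest_neighbour(xs, ys):
    """Stage-2 stand-in: return the target of the closest training input."""
    pairs = list(zip(xs, ys))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


def fit_residual_hybrid(xs, ys, fit_stage1, fit_stage2):
    """Fit stage 1 on the data, stage 2 on stage-1 residuals, and sum the forecasts."""
    stage1 = fit_stage1(xs, ys)
    residuals = [y - stage1(x) for x, y in zip(xs, ys)]
    stage2 = fit_stage2(xs, residuals)
    return stage1, (lambda x: stage1(x) + stage2(x))


# Toy data: a linear trend plus a nonlinear component.
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [x + 0.5 * (x - 2.0) ** 2 for x in xs]

stage1, hybrid = fit_residual_hybrid(xs, ys, fit_line, fit_nearest_neighbour)
stage1_errors = [abs(y - stage1(x)) for x, y in zip(xs, ys)]
hybrid_errors = [abs(y - hybrid(x)) for x, y in zip(xs, ys)]
```

On this toy data the line captures the trend, the second stage absorbs what the line leaves behind, and the summed forecast fits the training points more closely than the first stage alone.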

3.5. Performance Evaluation Metrics

The performance of the models during both the training and testing is evaluated by using four widely used statistical metrics: the correlation coefficient (R), determination-coefficient $\left({R}^{2}\right)$, the root-mean-squared-error (RMSE) and mean absolute error (MAE). They can be computed using Eq. (12) to Eq. (15).

$$R=\frac{\sum \left({x}_{i}-\bar{x}\right)\left({\widehat{x}}_{i}-\bar{\widehat{x}}\right)}{\sqrt{\sum {\left({x}_{i}-\bar{x}\right)}^{2}\sum {\left({\widehat{x}}_{i}-\bar{\widehat{x}}\right)}^{2}}} \tag{12}$$

$${R}^{2}=1-\frac{\sum {\left({x}_{i}-{\widehat{x}}_{i}\right)}^{2}}{\sum {\left({x}_{i}-\bar{x}\right)}^{2}} \tag{13}$$

$$RMSE=\sqrt{\frac{\sum {\left({x}_{i}-{\widehat{x}}_{i}\right)}^{2}}{n}} \tag{14}$$

$$MAE=\frac{1}{n}\sum \left|{x}_{i}-{\widehat{x}}_{i}\right| \tag{15}$$

for $i=1,2,3,\dots ,n$,

where ${x}_{i}$, ${\widehat{x}}_{i}$, $\bar{x}$, $\bar{\widehat{x}}$ and $n$ denote the original values, the predicted values, the mean of the original values, the mean of the predicted values and the total number of data instances, respectively. For a model that tracks the data, $R$ and ${R}^{2}$ lie between 0 and 1, with 1 indicating a perfect prediction. Smaller values of RMSE and MAE indicate better model performance.
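As an illustration, the four metrics can be computed together from a list of original and predicted values. The sketch below is hypothetical helper code, not the evaluation script used in the study, and it takes R to be the Pearson correlation coefficient between original and predicted values, which is an assumption about the intended definition.

```python
import math


def evaluate(actual, predicted):
    """Return R (Pearson, by assumption), R^2, RMSE and MAE for two equal-length lists."""
    n = len(actual)
    mean_a = sum(actual) / n
    mean_p = sum(predicted) / n
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))  # residual sum of squares
    ss_tot = sum((a - mean_a) ** 2 for a in actual)                # total sum of squares
    ss_pred = sum((p - mean_p) ** 2 for p in predicted)
    cov = sum((a - mean_a) * (p - mean_p) for a, p in zip(actual, predicted))
    return {
        "R": cov / math.sqrt(ss_tot * ss_pred),
        "R2": 1.0 - ss_res / ss_tot,
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(a - p) for a, p in zip(actual, predicted)) / n,
    }


# Toy example: every prediction is off by a constant 0.5.
scores = evaluate([1.0, 2.0, 3.0, 4.0], [1.5, 2.5, 3.5, 4.5])
```

The toy example shows why correlation and error metrics are reported together: a constant offset leaves R at 1 while R², RMSE and MAE all register the error.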

4.1 Descriptive Statistics

Table 2 shows the descriptive statistics of the dataset, which summarize the main characteristics of the data. The average SSN is 81.80, and the standard deviation (67.89) is smaller than the mean, so the SSN pattern is fairly stable. The skewness of 0.93 and kurtosis of 0.34 indicate that the data is only approximately normally distributed, with a moderate right skew.

Table 2

Descriptive statistics of the SSN data
Descriptive	Values
Mean	81.80
Median	67.20
Mode	0
Standard Deviation	67.89
Sample Variance	4608.73
Kurtosis	0.34
Skewness	0.93
Minimum	0
Maximum	398.20
Count	3264
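For readers who want to reproduce this kind of summary on their own copy of the series, the sketch below computes the main quantities from a list of values. It is an assumption-laden illustration: skewness and kurtosis are computed from simple population moments (with kurtosis reported as excess kurtosis), which may differ slightly from the bias-adjusted estimators that spreadsheet tools report, and the function name is hypothetical.

```python
import math


def describe(values):
    """Mean, sample variance, standard deviation, skewness and excess kurtosis."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n  # second central moment
    m3 = sum((v - mean) ** 3 for v in values) / n  # third central moment
    m4 = sum((v - mean) ** 4 for v in values) / n  # fourth central moment
    sample_variance = m2 * n / (n - 1)
    return {
        "count": n,
        "mean": mean,
        "sample_variance": sample_variance,
        "std": math.sqrt(sample_variance),
        "skewness": m3 / m2 ** 1.5,
        "excess_kurtosis": m4 / m2 ** 2 - 3.0,
        "minimum": min(values),
        "maximum": max(values),
    }


# Toy symmetric data: skewness is zero and the tails are lighter than a normal curve.
summary = describe([1.0, 2.0, 3.0, 4.0, 5.0])
```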

4.2. Results of the SVM, LSTM and SVM-LSTM Models

This section describes the results obtained from the simulation of the stand-alone machine learning (SVM), deep learning (LSTM) and hybrid (SVM-LSTM) techniques for the prediction of solar activity in terms of SSN. The three models were simulated using MATLAB 2019b on a PC with a Core i7 processor. When developing a predictive machine learning model, the choice of network structure and learning parameters has a great influence on prediction accuracy. The training data is used to obtain the optimal model parameters, and the testing data is used on the trained models to test the prediction accuracy [44].

For the SVM, the key parameters include the epsilon $\left(\epsilon \right)$, the bias, mu $\left(\mu \right)$ and the number of iterations. For this study, the parameter values that produced the best results are $\epsilon =7.31$, 1619 iterations, a bias of 80.09, and $\mu =81.80$. Similarly, for the deep learning LSTM model, the selection of network parameters such as the number of layers, number of cells, learning rate and momentum influences the model's performance. After several attempts, the optimal network structure was obtained with two LSTM layers of 32 cells each, one convolutional layer for feature extraction, and one output layer, with the rectified linear unit (ReLU) activation function. The model is trained for 400 epochs with a momentum of 0.9 and a learning rate of ${10}^{-5}$. The hybrid SVM-LSTM model is simulated based on the optimal structures of the stand-alone models.

According to Yassen et al. [45], to construct a credible prediction model, the model's performance should be evaluated using error metrics such as MAE or RMSE, as well as goodness of fit criteria such as R and R². Therefore, the proposed techniques are evaluated using $R$, ${R}^{2}$, RMSE and MAE to provide a sense of the models’ performances and efficiencies in terms of the error criteria and goodness of fit.

The performance of the three models is shown in Table 3. From the training and testing results it can be observed that the LSTM outperformed the linear regression SVM. This finding is corroborated by recently published studies on the prediction of solar activity [26], [30]. Meanwhile, the results in Table 3 show that the proposed hybrid SVM-LSTM model, with $R=0.995$, ${R}^{2}=0.989$, $RMSE=7.778$, and $MAE=4.857$ in the testing step, outperforms the stand-alone SVM and LSTM. This could be attributed to the fact that the hybrid model combines the capacities of the two single models.

Table 3

Performance comparison
Training
Models	R	R²	RMSE	MAE
SVM	0.937	0.879	26.284	19.168
LSTM	0.940	0.883	25.787	18.476
SVM-LSTM	0.996	0.991	6.030	4.579
Testing
Models	R	R²	RMSE	MAE
SVM	0.902	0.812	27.593	19.840
LSTM	0.906	0.821	26.989	18.985
SVM-LSTM	0.995	0.989	7.778	4.857

Figure 6 depicts the performance of the three models based on the RMSE and MAE metrics in a bar chart. The SVM-LSTM is superior on both error metrics. For instance, the SVM-LSTM decreases the training and testing RMSE by about 76.62% and 71.18%, respectively, compared to the LSTM, and by 77.06% and 71.81%, respectively, compared to the SVM.

Moreover, the comparison of the models' performance in terms of goodness of fit, measured by the determination coefficient (DC), is shown graphically using a radar chart in Fig. 7. Models with DC values closer to 1 are generally more accurate. Hence, the models can be ranked in terms of accuracy from highest to lowest as $SVM\text{-}LSTM>LSTM>SVM$. However, since the DC values of the three models are all greater than 70%, all three developed models can provide a good forecast of the SSN [46].

Figure 8 shows scatter plots of the predicted SSN versus the original SSN. In a scatter plot, the prediction performance is considered good when most of the points lie close to a straight line with positive slope. It can be seen from these figures that all three models performed relatively well. However, in the case of the hybrid SVM-LSTM, more points lie close to the line, which indicates the superiority of the hybrid model over the stand-alone models.

Figure 9 compares the predicted SSN against the original SSN using time series plots, to further indicate the performance of the models in predicting the SSN time series. As shown in Fig. 9, the data has seasonality, with some peaks higher than others, and a bit of noise. It is clear from Fig. 9 that although the stand-alone models can capture the dynamics of the SSN, there are some errors, especially in predicting the peak values. By contrast, the SSN predicted by the SVM-LSTM is more consistent with the original data, including during the peaks.
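The quoted percentages follow directly from the RMSE values in Table 3, using the relative reduction (baseline minus hybrid, divided by baseline). The short check below reproduces the arithmetic; the helper name and the dictionary layout are illustrative only.

```python
def percent_reduction(baseline, improved):
    """Relative decrease of `improved` with respect to `baseline`, in percent."""
    return 100.0 * (baseline - improved) / baseline


# RMSE values copied from Table 3.
rmse = {
    "training": {"SVM": 26.284, "LSTM": 25.787, "SVM-LSTM": 6.030},
    "testing": {"SVM": 27.593, "LSTM": 26.989, "SVM-LSTM": 7.778},
}

reductions = {
    (phase, model): round(percent_reduction(values[model], values["SVM-LSTM"]), 2)
    for phase, values in rmse.items()
    for model in ("LSTM", "SVM")
}
# reductions[("training", "LSTM")] -> 76.62
# reductions[("testing", "LSTM")]  -> 71.18
# reductions[("training", "SVM")]  -> 77.06
# reductions[("testing", "SVM")]   -> 71.81
```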

The box plot, displayed in Fig. 10, provides additional visualization of the models' performance. The various whiskers and quartiles can be used to quantify the cumulative distribution of values in the predicted SSN using the three models and the original SSN. As shown in Fig. 10, the distribution of the data predicted by the hybrid SVM-LSTM is closer to the real data, making it the best of the three models. Furthermore, the SVM and LSTM are unable to match the original data's median and whiskers, indicating their inability to capture the maximum SSN values.

The accuracy of the proposed SVM-LSTM model for the prediction of solar activity is further demonstrated through a quantitative comparison with existing models in the literature. Pala and Atici [26] employed two stand-alone deep learning methods, the LSTM and NNAR, for the prediction of solar activity. Although the LSTM with $RMSE=35.9$ outperformed the NNAR in their study, the LSTM developed in our study performs better, with $RMSE=26.989$ in the validation stage. Moreover, the proposed hybrid SVM-LSTM proved best, with a validation $RMSE=7.778$. A hybrid of three models comprising Autoregressive Integrated Moving Average (ARIMA), Exponential Smoothing with Error, Trend and Seasonality (ETS), and SVM is proposed in [27]. That model, with validation $RMSE=22.726$, $MAE=16.549$ and ${R}^{2}=0.97$, is less accurate than our proposed hybrid model with $RMSE=7.778$, $MAE=4.857$ and ${R}^{2}=0.99$, both in terms of goodness of fit and error criteria. Notably, Zhu et al. [25] employed an optimized LSTM to predict solar activity in terms of monthly sunspot area (SSA). Unlike their method, our study relies on the SSN directly, without the need to first predict the SSA. Benson et al. [30] combined two deep learning methods, LSTM and WaveNet, to predict solar activity. The WaveNet-LSTM hybrid incurs more computational cost and complexity. Moreover, their proposed LSTM has 132 cells, compared to 32 cells in our proposed LSTM. Nghiem et al. [47] applied Bayesian inference in a hybrid of LSTM with a Convolutional Neural Network (CNN) to predict the SSN. Their proposed hybrid model attained $RMSE=26.10$ and $MAE=18.74$. Comparatively, a hybrid deep neural network with LSTM (DNN-LSTM) proposed in [48] achieved a validation $RMSE=20.34$ and $MAE=13.75$. In both cases our proposed hybrid model shows higher performance, with validation $RMSE=7.778$ and $MAE=4.857$.

4.3. Future scenarios in energy microgrid systems

Solar activity affects space, weather, and technology such as communication and navigation systems. The resulting disruptions can interfere with satellite communication, which is used for example in remote sensing and data communication in decentralized systems. Decentralization refers to the shift from a centralized energy system, where power is generated in large-scale power plants and distributed across long distances, to a more localized and distributed system. Complex patterns in the data create uncertainties. There is a growing need for the adoption and implementation of microgrid energy systems with decentralized systems, for example for rural development. Increasingly, predictions need to be more accurate while being less computationally expensive, so that local users can afford to actually use the data. Such systems would offer numerous benefits, including resilience, renewable energy integration, energy efficiency, local empowerment, and grid flexibility, contributing to a more sustainable and reliable energy future.

The future will also include shared data to create new business models. For example, an energy surplus at one component in the energy system can be sold to a neighbor in the local energy community, either at the time it occurs or when it is predicted to occur in the future. With more players in the system, complexity increases because both present and future energy production and use must be included. Predictions of behavior will affect new or anticipated business. More reliable data, computations and simulations with digital learning models are necessary to predict such scenarios. Ultimately, this will align with local aims, such as CO2 neutrality and net zero carbon, of local place-based networks, municipalities and regions that seek to meet climate needs.

This study proposed a residual hybrid machine learning technique for the prediction of solar activity using SSN. The hybrid model integrates the strengths of linear regression SVM and deep learning LSTM. The modelling is performed in two stages. The first stage employs the SVM to generate the initial SSN prediction. In the second stage, the LSTM is used to model the residuals from the first stage, giving a second prediction. The predictions from the two stages are summed to obtain the final prediction. Based on the simulation results and comparison with previous studies in the literature, it can be inferred that combining linear regression machine learning with deep learning delivers good accuracy for solar activity prediction. This could be ascribed to the fact that the proposed SVM-LSTM captures both the linear and nonlinear patterns exhibited by the SSN. The model can be applied as an alternative tool for predicting solar activity and can be used efficiently by researchers, communication industries, and aviation agencies. In the future, studies will be conducted to examine the relationship between the SSN and meteorological parameters such as temperature and humidity, and predictive models will be constructed based on those findings. Future research will also investigate the use of explainable AI (XAI) to provide interpretability and insight into how the model makes its predictions. This will allow for a better understanding of the fundamental processes that drive solar activity.

Authors’ contribution

Rabiu Aliyu Abdulkadir contributed the concept, data preprocessing, simulation, and writing of the original draft. Mohammad Kamrul Hasan contributed the concept, supervision, methodology, guidance on data preprocessing, and writing (review and editing). Shayla contributed simulation guidance and result analysis. Thippa Reddy Gadekallu contributed the literature research and guidance on methodology. Bishwajeet Pandey contributed guidance on simulation, figures, and writing of the original draft. Nurhizam Safie contributed the discussion of results and article formatting. Mikael Syväjärvi contributed guidance on conceptualization, identification of future scenarios, and writing (review and editing).

Funding

This work is supported by Universiti Kebangsaan Malaysia under research grant number DIP 2022-021. Mikael Syväjärvi acknowledges financing from the Swedish Energy Agency and support from the European Commission through the European Union’s Horizon 2020 research and innovation programme under grant agreement no. 775970.

Availability of data and material

Data are available from the corresponding author on formal request.

Conflict of interest

The authors declare no conflict of interest.

References

Liu Z, Zhang T, Wang H (2021) Predicting Sunspot Numbers Based on Inverse Number and Intelligent Fixed Point. Sol Phys 296(5). 10.1007/s11207-021-01835-z
Abdel-Rahman HI, Marzouk BA (2018) Statistical method to predict the sunspots number. NRIAG J Astron Geophys 7(2):175–179. 10.1016/j.nrjag.2018.08.001
De Jager C (2005) Solar forcing of climate. 1: Solar variability. Space Sci Rev 120:3–4. 10.1007/s11214-005-7046-5
Kirov B, Asenovski S, Georgieva K, Obridko VN, Maris-Muntean G (2018) Forecasting the sunspot maximum through an analysis of geomagnetic activity. J Atmos Sol Terr Phys 176:42–50. 10.1016/j.jastp.2017.12.016
Ahluwalia HS (2022) Forecast for sunspot cycle 25 activity. Adv Space Res 69(1):794–797. 10.1016/j.asr.2021.09.035
ArunKumar KE, Kalaga DV, Mohan Sai Kumar C, Kawaji M, Brenza TM (2022) Comparative analysis of Gated Recurrent Units (GRU), long Short-Term memory (LSTM) cells, autoregressive Integrated moving average (ARIMA), seasonal autoregressive Integrated moving average (SARIMA) for forecasting COVID-19 trends. Alexandria Engineering Journal 61(10):7585–7603. 10.1016/j.aej.2022.01.011
Yule GU (1927) On a method of investigating periodicities in disturbed series, with special reference to Wolfer’s sunspot numbers. Philosophical Trans Royal Soc A. 10.1098/rsta.1927.0007
Xu T, Wu J, Sen Wu Z, Li Q (2008) Long-term sunspot number prediction based on EMD analysis and AR model. Chin J Astron Astrophys 8(3):337–342. 10.1088/1009-9271/8/3/10
Hathaway DH (2010) The solar cycle. Living Rev Sol Phys 7(1):57–75. 10.12942/lrsp-2010-1
Liu J, Zhao J, Lin H (2019) Prediction of the Sunspot Number with a New Model Based on the Revised Data. Sol Phys 294(11). 10.1007/s11207-019-1536-1
Du Z (2022) Predicting the Maximum Amplitude of Solar Cycle 25 Using the Early Value of the Rising Phase. Sol Phys 297(5):1–18. 10.1007/s11207-022-01991-w
Ghazal TM et al (2021) IoT for smart cities: Machine learning approaches in smart healthcare—A review. Future Internet 13(8). 10.3390/fi13080218
Ghazal TM et al (2021) Hep-pred: Hepatitis C staging prediction using fine gaussian SVM. Computers Mater Continua 69(1):191–203. 10.32604/cmc.2021.015436
Alaghbari KA, Mohamad MH, Hussain A, Alam MR (2022) Activities Recognition, Anomaly Detection and Next Activity Prediction Based on Neural Networks in Smart Homes. IEEE Access 10:28219–28232. 10.1109/ACCESS.2022.3157726
Gkana A, Zachilas L (2015) Sunspot numbers: Data analysis, predictions and economic impacts. J Eng Sci Technol Rev 8(1):79–85. 10.25103/jestr.081.14
Safiullin N, Porshnev S, Kleeorin N (2018) Monthly sunspot numbers forecast with artificial neural network combined with dynamo model: Comparison with modern methods. In: Proceedings of the 2018 Ural Symposium on Biomedical Engineering, Radioelectronics and Information Technology (USBEREIT 2018), pp 199–202. 10.1109/USBEREIT.2018.8384584
Dani T, Sulistiani S (2019) Prediction of maximum amplitude of solar cycle 25 using machine learning. J Phys Conf Ser 1231(1). 10.1088/1742-6596/1231/1/012022
Memon I, Shaikh RA, Hasan MK, Hassan R, Haq AU, Zainol KA (2020) Protect Mobile Travelers Information in Sensitive Region Based on Fuzzy Logic in IoT Technology. Security and Communication Networks 2020. 10.1155/2020/8897098
Parsapoor M, Bilstrup U, Svensson B (2018) Forecasting Solar Activity with Computational Intelligence Models. IEEE Access 6:70902–70909. 10.1109/ACCESS.2018.2867516
Novitasari DCR, Ardhiyah N, Widodo N (2019) Flare Identification by Forecasting Sunspot Numbers Using Fuzzy Time Series Markov Chain Model. In: Proceedings of the 2019 International Seminar on Intelligent Technology and Its Application (ISITIA 2019), pp 387–392. 10.1109/ISITIA.2019.8937242
Hossain Lipu MS et al (2022) Deep learning enabled state of charge, state of health and remaining useful life estimation for smart battery management system: Methods, implementations, issues and prospects. Journal of Energy Storage 55. 10.1016/j.est.2022.105752
Lipu MSH et al (2021) Artificial Intelligence Based Hybrid Forecasting Approaches for Wind Power Generation: Progress, Challenges and Prospects. IEEE Access 9:102460–102489. 10.1109/ACCESS.2021.3097102
Lee T (2020) Hybrid Deep Learning Model for Predicting Sunspot Number Time Series with a Cyclic Pattern. Sol Phys 295(6). 10.1007/s11207-020-01653-9
Arfianti UI, Novitasari DCR, Widodo N, Hafiyusholeh M, Utami WD (2021) Sunspot Number Prediction Using Gated Recurrent Unit (GRU) Algorithm. IJCCS (Indonesian Journal of Computing and Cybernetics Systems) 15(2):141. 10.22146/ijccs.63676
Zhu H, Chen H, Zhu W, He M (2023) Predicting Solar cycle 25 using an optimized long short-term memory model based on sunspot area data. Advances in Space Research. 10.1016/j.asr.2023.01.042
Pala Z, Atici R (2019) Forecasting Sunspot Time Series Using Deep Learning Methods. Sol Phys 294(5). 10.1007/s11207-019-1434-6
Panigrahi S, Pattanayak RM, Sethy PK, Behera SK (2021) Forecasting of Sunspot Time Series Using a Hybridization of ARIMA, ETS and SVM Methods. Sol Phys 296(1). 10.1007/s11207-020-01757-2
Zhang B, Sun L, Wang W (2022) Two Stage Prediction Model of Sunspots Monthly Value Based on CEEMDAN and Particle Swarm Optimization ELM. IEEE Access 10:102981–102991. 10.1109/ACCESS.2022.3206542
Khan T, Arafat F, Mojumdar MU, Rajbongshi A, Siddiquee SMT, Chakraborty NR (2020) A Machine Learning Approach for Predicting the Sunspot of Solar Cycle. In: 11th International Conference on Computing, Communication and Networking Technologies (ICCCNT 2020). 10.1109/ICCCNT49239.2020.9225427
Benson B, Pan WD, Prasad A, Gary GA, Hu Q (2020) Forecasting Solar Cycle 25 Using Deep Neural Networks. Sol Phys 295(5). 10.1007/s11207-020-01634-y
SILSO World Data Center (2021) The International Sunspot Number. Int Sunspot Number Monthly Bull online catalogue, pp. 1749–2020
Bannani FK, Sharif TA, Ben-Khalifa AOR (2006) Estimation of monthly average solar radiation in Libya. Theor Appl Climatol 83:1–4. 10.1007/s00704-005-0157-9
Lauret P, Voyant C, Soubdhan T, David M, Poggi P (2015) A benchmarking of machine learning techniques for solar radiation forecasting in an insular context. Sol Energy 112:446–457. 10.1016/j.solener.2014.12.014
Wu J, Yang H (2015) Linear Regression-Based Efficient SVM Learning for Large-Scale Classification. IEEE Trans Neural Netw Learn Syst 26(10):2357–2369. 10.1109/TNNLS.2014.2382123
Staudemeyer RC, Morris ER (2019) Understanding LSTM: a tutorial into Long Short-Term Memory Recurrent Neural Networks. arXiv preprint arXiv:1909.09586, pp 1–42
Wang Z, Xu Z, He J, Delingette H, Fan J (2023) Long Short-Term Memory Neural Equalizer. IEEE Transactions on Signal and Power Integrity 2:13–22. 10.1109/tsipi.2023.3242855
Hochreiter S, Schmidhuber J (1997) Long Short-Term Memory. Neural Comput 9(8):1735–1780
Liu Y, Guan L, Hou C, Han H, Liu Z, Sun Y (2019) Wind Power Short-Term Prediction Based on LSTM and Discrete Wavelet Transform. Appl Sci 9(6):1108. 10.3390/app9061108
Qing X, Niu Y (2018) Hourly day-ahead solar irradiance prediction using weather forecasts by LSTM. Energy 148:461–468. 10.1016/j.energy.2018.01.177
Shao YE (2014) Body fat percentage prediction using intelligent hybrid approaches. The Scientific World Journal 2014:1–8. 10.1155/2014/383910
Abba SI et al (2020) Hybrid machine learning ensemble techniques for modeling dissolved oxygen concentration. IEEE Access 8:157218–157237. 10.1109/ACCESS.2020.3017743
Yu G, Feng H, Feng S, Zhao J, Xu J (2021) Forecasting hand-foot-and-mouth disease cases using wavelet-based SARIMA – NNAR hybrid model. PLoS ONE 16(2):1–12. 10.1371/journal.pone.0246673
Aravazhi A (2021) Hybrid Machine Learning Models for Forecasting Surgical Case Volumes at a Hospital. AI 2:512–526
Sun Y, Gilbert A, Tewari A (2018) But How Does It Work in Theory? Linear SVM with Random Features
Yaseen ZM, Ramal MM, Diop L, Jaafar O, Demir V, Kisi O (2018) Hybrid Adaptive Neuro-Fuzzy Models for Water Quality Index Estimation. Water Resour Manage 32(7):2227–2245. 10.1007/s11269-018-1915-7
Abdulkadir RA, Ali SIA, Abba SI, Esmaili P (2020) Forecasting of daily rainfall at ercan airport northern Cyprus: A comparison of linear and non-linear models. Desalination Water Treat 177:297–305. 10.5004/dwt.2020.25321
Nghiem TL, Le VD, Le TL, Marechal P, Delahaye D, Vidosavljevic A (2022) Applying Bayesian inference in a hybrid CNN-LSTM model for time-series prediction. In: International Conference on Multimedia Analysis and Pattern Recognition (MAPR 2022), Proceedings. Institute of Electrical and Electronics Engineers Inc. 10.1109/MAPR56351.2022.9924783
Hasoon SO, Al-Hashimi MM (2022) Hybrid Deep Neural network and Long Short term Memory Network for Predicting of Sunspot Time Series. Available: http://ijmcs.future-in-tech.net

No competing interests reported.


Editorial decision: Major revision
18 Sep, 2023
Reviews received at journal
10 Sep, 2023
Reviewers agreed at journal
21 Aug, 2023
Reviews received at journal
06 Aug, 2023
Reviewers agreed at journal
17 Jul, 2023
Reviewers invited by journal
17 Jul, 2023
Editor assigned by journal
17 Jul, 2023
Submission checks completed at journal
10 Jul, 2023
First submitted to journal
05 Jul, 2023
