Modeling and Forecasting of Sunspots Cycles: An Application of ARMA (p, q)-GARCH (1, 1) Model

The influence of oscillations in solar activity on the Earth's climate is measurable only over the long run. Modeling sunspots is therefore a first step toward putting that knowledge to use, because solar activity influences the Earth's climate. Time series analysis and modeling have proved to stand out among statistical tools for estimating and predicting solar activity. This study examines the appropriateness of generalized autoregressive conditional heteroskedasticity (GARCH) models, with an autoregressive moving average ARMA (p, q) specification, in terms of their performance in delivering volatility forecasts for sunspot cycles. Individual sunspot cycles from the 1st to the 24th (1755–2019) are considered. The Lagrange Multiplier test is used to detect the autoregressive conditional heteroscedastic (ARCH) effect in the sunspot cycle data. The ARMA (p, q)-GARCH (1, 1) process is leptokurtic, that is, it has fat and heavy tails (values are strongly correlated to each other). The ARMA (p, q)-GARCH (1, 1) process of the sunspot cycles shows positive skewness, except for cycles 4 and 19. Most of the sunspot cycles (1st, 4th, 12th, 13th, 14th, 15th, 16th, 19th, 20th, 23rd and 24th) follow an autoregressive moving average ARMA (2, 2)-GARCH (1, 1) model. Sunspot cycles 5, 6, 7 and 15 follow an ARMA (3, 3)-GARCH model, whereas for cycles 2 and 11 the appropriate model is the ARMA (5, 1)-GARCH (1, 1) process. The ARMA (5, 3)-GARCH (1, 1) process describes cycles 18 and 19. The stationary ARMA (2, 2)-GARCH (1, 1) volatility model is the finest forecasting model compared with the other models. Thus, ARMA (2, 2)-GARCH (1, 1) is an adequate model for estimating and forecasting most of the sunspot cycles. The results obtained in this study are useful for observing the influence of solar activity on the Earth's climate.


Introduction
Different layers of the Sun spin at different rates, creating a magnetic field for the solar sphere. Convection currents create local magnetic fields in hot gas bubbles. Larger local magnetic fields and bubbles rise to the surface. At the surface, north and south polarity is split into pairs of disturbances. Large pairs usually create sunspots. Large sunspot groups often create flares and coronal mass ejections. Solar activity is manifested through dark spots on the Sun's surface, which are called sunspots. The number of sunspots changes over time. Sunspots follow cycles of approximately 11 years (Muraközy and Ludmány 2012). The solar cycle affects changes in the Sun's activity, the ejection of solar material and the level of solar radiation. The appearance of the solar cycle depends on variations in sunspot numbers, flares, and other manifestations.
Time series are essential in various solar physics disciplines, and studies of climate change fall in the same area. After trend and periodicities are eliminated from a time series, the stochastic components remain. Long-range correlation implies the presence of positive autocorrelations that remain significantly high over large time lags, so that the autocorrelation function of the series shows a slow asymptotic decay. The persistence, or strength, of the long-range correlations contained in experimental time series can be evaluated by various well-known methods (Box and Jenkins 1994). Time series in solar physics frequently reveal persistence, where sequential values are positively correlated. In statistical analysis, large data sets make it possible to relate the trends of subsets of the data to the data set as a whole. To study solar activity, we have selected the individual sunspot cycle data (from cycle 1 to cycle 24) and the total sunspot cycle data (1755 to 2019). Each sunspot cycle's data has a long-term trend. Prediction and correlation for large time series data show long-term trend behavior, whereas small data sets show short-term trend behavior.
In the conditional heteroscedastic process, an autoregressive model is used. Heteroscedasticity can arise because of the presence of outliers (very small or very large values). The GARCH model (Goh and Khor C 2016) is one of the most advanced statistical techniques applied to volatility, and it is used to analyze and forecast volatility. The GARCH model is a variance model: it forecasts the variance of the forthcoming period as a weighted average of the long-term average variance, the variance predicted for the current period, and the most recent squared residual. Although the GARCH model forecasts just a single period ahead, a two-period forecast can be built from the one-period forecast (Bollerslev and Engle 1994). The GARCH model is mean reverting and conditionally heteroskedastic, but has a constant unconditional variance (Engle 2001). The ARMA model is strongly relevant to volatility modeling. ARMA methods are among the most frequently used and popular time series models, compared with alternatives such as Markov chains, artificial neural network models, fuzzy networks, etc. (McKenzie 1984). ARMA models are flexible, so they can be applied to many types of time series with different orders. The approach proceeds systematically through individual phases (identification, estimation, and diagnostic checking) to arrive at an appropriate model. One of the main difficulties with these models is the need for large data sets (W. Ji and K. Chee 2011). A large body of literature has used the GARCH (1, 1) model (Engle 2001, Salisu and Fasanya 2013, Epaphra 2017, Pham and Yang 2010). These studies report that GARCH (1, 1) is more appropriate for analyzing time series data. It is the simplest and most robust of the volatility models (Engle 1982) and fits various data series well (Hill, Griffiths, and Lim 2011). GARCH (1, 1) is adequate to capture the volatility clustering in the data (Brooks 2014). Moreover, Olson and Wu (2017) showed that an analysis can be sufficient with only one lag for each variable.
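The step from a one-period variance forecast to forecasts further ahead can be illustrated with a minimal sketch for GARCH (1, 1). It assumes the usual recursion in which the expected variance h steps ahead equals omega + (a + b) times the expected variance h - 1 steps ahead, so the forecasts revert toward the long-run variance omega / (1 - a - b). The function name `garch11_variance_forecast` and the parameter names `omega`, `a` and `b` are hypothetical choices for this illustration, and the numbers are made up rather than taken from the sunspot fits.

```python
def garch11_variance_forecast(omega, a, b, sigma2_next, horizon):
    """Iterate GARCH(1,1) variance forecasts for 1..horizon steps ahead.

    omega, a, b  : hypothetical names for the constant, ARCH and GARCH terms
    sigma2_next  : the one-step-ahead conditional variance (already known)
    Returns (long_run_variance, list_of_forecasts).
    """
    persistence = a + b
    if omega <= 0 or a < 0 or b < 0 or persistence >= 1:
        raise ValueError("need omega > 0, a >= 0, b >= 0 and a + b < 1")
    if horizon < 1:
        raise ValueError("horizon must be at least 1")

    long_run = omega / (1.0 - persistence)
    forecasts = [sigma2_next]
    # Each further step is built from the previous forecast, which is how a
    # two-period forecast follows from the one-period forecast.
    for _ in range(horizon - 1):
        forecasts.append(omega + persistence * forecasts[-1])
    return long_run, forecasts


# Illustrative values only: the forecasts decay from 2.0 toward the long-run level.
long_run, path = garch11_variance_forecast(0.1, 0.1, 0.8, 2.0, 5)
print(long_run, path)
```

The closer a + b is to one, the more slowly the forecasts return to the long-run variance, which is the mean-reverting behavior described above.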
Furthermore, GARCH (1, 1) is leptokurtic (a process having a kurtosis value greater than 3). Generalized autoregressive conditional heteroscedastic (GARCH) models can be related to ARMA models. Residual diagnostic checks, namely the ARCH LM test, the normality test and the correlogram of squared residuals, guide the selection of an adequate model. Forecast evaluation is carried out using the Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE). Furthermore, the Akaike information criterion (AIC), Bayesian Schwarz information criterion (BIC) and Hannan Quinn information criterion (HIC) values are also calculated. The best-fitted model is selected by diagnostic checking of its residuals. Each sunspot cycle is also examined with a normality test based on the skewness, kurtosis and Jarque-Bera statistics. The ARMA (p, q)-GARCH (1, 1) models of the sunspot cycles are also found to be leptokurtic, except for cycles 2, 7, 18 and 19, which are platykurtic (kurtosis value less than 3). The GARCH (1, 1) models of the sunspot cycles show positive skewness, except for cycles 4 and 19, which show negative skewness.

Data Description And Methodology
The data under consideration are the monthly mean sunspot numbers for cycles 1-24, from 1755 to 2019. The data are collected from the World Data Centre (WDC). The main emphasis is on the Box-Jenkins method for the stationary ARMA-GARCH process. Adequate ARMA (p, q)-GARCH (1, 1) models are selected by the Akaike information criterion (AIC), Bayesian Schwarz information criterion (BIC) and Hannan Quinn information criterion (HIC). The forecasting ability of each sunspot cycle model is judged by accuracy measures such as the Root Mean Squared Error (RMSE), Mean Absolute Error (MAE), and Mean Absolute Percentage Error (MAPE). Maximum likelihood estimation is used to estimate the ARMA (p, q)-GARCH (1, 1) model. EViews version 9.0 statistical software is used for calculation and analysis of the ARMA (p, q)-GARCH (1, 1) model and the respective graphs, for instance time series plots and fitted, residual and forecasted plots for the total sunspot cycles. This section consists of two subsections.

2.1: Basic equations of statistical analysis
This section gives a short account of the statistical measures used.

2.1.1: Diagnostic Test
The Lagrange multiplier (LM) test is used to check for an ARCH effect in the data. The correlogram of squared residuals is also used to confirm the ARCH effect in the time series data. In addition, the usual normality test is carried out to verify that the use of GARCH is warranted. Root mean square error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE) are calculated to verify the accuracy of the forecasts. Gaussian quasi-maximum likelihood estimation (GQMLE) is used to fit the models. The GQMLE is commonly used in GARCH models to accommodate heavy-tailed returns (Bollerslev, Engle and Nelson 1994, Thomas and Denial 2002). The Gaussian quasi-maximum likelihood estimator is asymptotically normally distributed, with a variance at least as small as those of other asymptotically normally distributed estimators. GQMLE yields consistent estimates of the parameters of a correctly specified conditional mean. The adequacy of the selected models is verified by the Akaike information criterion (AIC), Bayesian Schwarz information criterion (BIC) and Hannan Quinn information criterion (HIC). Forecasts from the best-fitted model of each sunspot cycle were tested for accuracy using the Root mean square error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE). These terms are described below.
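As an aside before those definitions, the Lagrange multiplier check for an ARCH effect mentioned at the start of this subsection can be sketched in a few lines. The sketch assumes Engle's form of the test with a single lag: the squared residuals are regressed on their first lag, and the statistic is the number of observations in that regression times its R-squared, compared with a chi-squared distribution with one degree of freedom. The function name `arch_lm_one_lag` and the sample residuals are hypothetical; the study itself relied on EViews, and the lag order used there is not stated here.

```python
import math


def arch_lm_one_lag(residuals):
    """ARCH-LM test with one lag (an illustrative sketch).

    Regresses e_t^2 on e_{t-1}^2 and returns (LM statistic, p-value).
    With a single regressor, R^2 equals the squared sample correlation.
    """
    squared = [e * e for e in residuals]
    y = squared[1:]    # e_t^2
    x = squared[:-1]   # e_{t-1}^2
    n = len(y)
    if n < 3:
        raise ValueError("need at least four residuals")

    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0 or syy == 0:
        raise ValueError("squared residuals are constant")

    r_squared = sxy * sxy / (sxx * syy)
    lm = n * r_squared
    # Survival function of a chi-squared variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(lm / 2.0))
    return lm, p_value


# Made-up residuals with a burst of large values in the middle.
sample = [0.1, -0.2, 0.1, -0.1, 2.0, -2.2, 1.9, -2.1, 0.2, -0.1, 0.1, -0.2]
print(arch_lm_one_lag(sample))
```

A small p-value suggests that large squared residuals tend to follow large squared residuals, which is the volatility clustering that motivates fitting a GARCH model.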
Akaike information criterion: The AIC was introduced by Hirotugu Akaike in 1973. It is an extension of the maximum likelihood principle. The model with the smallest AIC value is selected.

$$\mathrm{AIC} = -2\ln(\text{Likelihood}) + 2S \tag{1}$$
Where S is the number of model parameters. The likelihood is a measure of the fit of the model; larger likelihood values indicate a better fit.
Schwarz criterion: The SIC is used to select the most appropriate model among a finite set of models. The model with the smallest SIC value is selected. The Schwarz criterion (SIC) was developed by Gideon E. Schwarz. It is closely related to the AIC.

$$\mathrm{SIC} = -2\ln(\text{Likelihood}) + S\ln(N) \tag{2}$$
Where S is the number of model parameters and N is the number of observations.
Hannan-Quinn criterion: The HQC is a criterion for model selection. It is an alternative to the AIC and SIC.

$$\mathrm{HQC} = -2\ln(\text{Likelihood}) + 2S\ln(\ln(N)) \tag{3}$$
Where S is the number of model parameters and N is the number of observations.
Durbin-Watson test: The DW statistic measures the linear association between adjacent residuals from a regression model. The Durbin-Watson test examines the hypothesis ρ = 0 in the specification

$$u_t = \rho\, u_{t-1} + v_t \tag{4}$$
Where u_t is the residual at time t and v_t is an uncorrelated error term. A DW value equal to 2 shows that there is no serial correlation. A DW value less than 2 indicates positive correlation, and a value in the range from 2 to 4 indicates negative correlation. The series is strongly positively correlated if the value approaches zero.
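The selection criteria and the Durbin-Watson statistic are simple enough to compute by hand, and a minimal sketch may make the definitions concrete. It assumes the standard textbook forms AIC = -2 ln L + 2S, SIC = -2 ln L + S ln N and HQC = -2 ln L + 2S ln(ln N), and the usual DW statistic, the sum of squared differences of adjacent residuals divided by the sum of squared residuals. The function names and the input numbers are hypothetical. Some software reports these criteria divided by the number of observations, which does not change the ranking of models fitted to the same data.

```python
import math


def information_criteria(log_likelihood, n_params, n_obs):
    """Return AIC, SIC and HQC in their standard (unscaled) forms."""
    if n_obs <= 1:
        raise ValueError("n_obs must be greater than 1")
    base = -2.0 * log_likelihood
    return {
        "AIC": base + 2.0 * n_params,
        "SIC": base + n_params * math.log(n_obs),
        "HQC": base + 2.0 * n_params * math.log(math.log(n_obs)),
    }


def durbin_watson(residuals):
    """DW = sum (e_t - e_{t-1})^2 / sum e_t^2, which lies between 0 and 4."""
    denominator = sum(e * e for e in residuals)
    if denominator == 0:
        raise ValueError("residuals are all zero")
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    return numerator / denominator


# Illustrative values only (not taken from the sunspot models).
print(information_criteria(log_likelihood=-100.0, n_params=3, n_obs=100))
print(durbin_watson([1.0, 0.8, 0.9, 0.7, -0.2, -0.4]))  # slowly varying residuals
```

Among candidate models for the same cycle, the one with the smallest criterion value would be preferred, and slowly varying residuals such as those in the last line give a DW value well below 2.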
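Mean Absolute Error: The mean absolute error is expressed mathematically as

$$\mathrm{MAE} = \frac{1}{n}\sum_{t=1}^{n}\left|\epsilon_t\right| \tag{5}$$
Where n is the number of observations and ε_t is the forecast error at time t, that is, the difference between the actual and the forecasted value.
Mean Absolute Percentage Error: The MAPE expresses the average absolute error as a percentage of the actual values X_t.

$$\mathrm{MAPE} = \frac{100}{n}\sum_{t=1}^{n}\left|\frac{\epsilon_t}{X_t}\right| \tag{6}$$
Root Mean Squared Error: The RMSE is the square root of the average squared deviation of the forecasted values.

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{t=1}^{n}\epsilon_t^{2}} \tag{7}$$
Because the errors are squared, errors of opposite sign do not offset one another. RMSE gives an overall idea of the error that occurred during forecasting. With these accuracy measures, small errors such as an RMSE of 0.1 and a MAPE of 1% can often be achieved. In RMSE, the total forecast error is strongly affected by large individual errors; for example, one large error is much more costly than several small errors. RMSE does not reveal the direction of the overall error. It is affected by data transformations and changes of scale. RMSE is a good measure of overall forecast error (Adhikari R. 2013).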
Theil's U-Statistics (U): Theil's U-statistic compares f_t, the forecasted value, with X_t, the actual value. U is a normalized measure of the total forecast error. U equal to 0 indicates a perfect fit.
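The three accuracy measures used throughout the study can be computed directly from a pair of actual and forecasted series. The sketch below assumes the standard definitions: MAE is the mean of the absolute errors, RMSE is the square root of the mean squared error, and MAPE is the mean absolute error relative to the actual value, in percent. The function name `forecast_accuracy` and the sample numbers are hypothetical. Theil's U is left out because its exact form is not recoverable from the text.

```python
import math


def forecast_accuracy(actual, forecast):
    """Return MAE, RMSE and MAPE (in percent) for two equal-length series."""
    if len(actual) != len(forecast) or not actual:
        raise ValueError("series must be non-empty and of equal length")
    # MAPE divides by the actual value, so it is undefined when an actual
    # value is zero (which can happen for monthly sunspot numbers).
    if any(x == 0 for x in actual):
        raise ValueError("MAPE is undefined when an actual value is zero")

    errors = [x - f for x, f in zip(actual, forecast)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    mape = 100.0 / n * sum(abs(e / x) for e, x in zip(errors, actual))
    return {"MAE": mae, "RMSE": rmse, "MAPE": mape}


# Made-up values for illustration.
print(forecast_accuracy(actual=[100.0, 200.0], forecast=[110.0, 180.0]))
```

Because RMSE squares the errors before averaging, it is never smaller than MAE on the same data, and the gap widens when a few errors are much larger than the rest.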

2.1.2: Tests for Normality
The normality test is carried out to test whether the data under consideration are normally distributed. These tests are based on two numerical measures of shape, the skewness and the excess kurtosis. The data are consistent with a normal distribution if both measures are close to zero. The Jarque-Bera test is likewise based on skewness and kurtosis. Hence, the test of normality consists of checking the skewness and kurtosis on which the Jarque-Bera test is based.
Skewness: The skewness measures the degree of asymmetry of the data.

$$\text{Skewness} = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{X_i - \bar{X}}{S}\right)^{3} \tag{9}$$
Where $\bar{X}$ is the mean, S is the standard deviation and n is the number of values (Christian and Jean-Michel 2004). The skewness of the normal distribution is zero, so normally distributed data are symmetric. The distribution is positively skewed if the skewness is greater than zero and negatively skewed if it is less than zero.
Kurtosis: The kurtosis measures the degree of peakedness of the data. Kurtosis is estimated as

$$\text{Kurtosis} = \frac{1}{n}\sum_{i=1}^{n}\left(\frac{X_i - \bar{X}}{S}\right)^{4} \tag{10}$$
Where $\bar{X}$ is the mean, S is the standard deviation and n is the number of values of the time series data. A distribution is called mesokurtic if its kurtosis is equal to 3, as is the case for the normal distribution. It is leptokurtic if the value is greater than 3 and platykurtic if the value is less than 3.
Jarque-Bera Statistics Test (JBS): The JBS is consistent with normality of the data when the skewness is equal to zero and the excess kurtosis is also equal to zero. The Jarque-Bera test statistic is defined as follows.

$$\mathrm{JB} = \frac{n}{6}\left(\text{Skewness}^{2} + \frac{(\text{Kurtosis} - 3)^{2}}{4}\right) \tag{11}$$
The Jarque-Bera test statistic asymptotically follows a chi-squared distribution with two degrees of freedom. The null hypothesis (H_0) is that the data are normally distributed, with skewness zero and excess kurtosis zero (which is the same as a kurtosis of 3). The alternative hypothesis (H_A) is that the data are not normally distributed.
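The normality check described in this subsection can be sketched as follows. The sketch assumes moment-based skewness and kurtosis with the standard deviation computed by dividing by n, and the usual Jarque-Bera statistic n/6 times (skewness squared plus one quarter of the squared excess kurtosis), whose p-value under a chi-squared distribution with two degrees of freedom is exp(-JB/2). The function name `jarque_bera` and the sample values are hypothetical.

```python
import math


def jarque_bera(values):
    """Return (skewness, kurtosis, JB statistic, p-value) for a sample."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two values")
    mean = sum(values) / n
    # Central moments, each divided by n (biased estimators).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    if m2 == 0:
        raise ValueError("values are constant")

    skewness = m3 / m2 ** 1.5
    kurtosis = m4 / m2 ** 2          # equals 3 for a normal distribution
    jb = n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)
    # Survival function of a chi-squared variable with 2 degrees of freedom.
    p_value = math.exp(-jb / 2.0)
    return skewness, kurtosis, jb, p_value


# Made-up sample with one large value, giving a long right tail.
print(jarque_bera([1.0, 2.0, 2.0, 3.0, 3.0, 3.0, 4.0, 12.0]))
```

A positive skewness together with a kurtosis above 3, as in this made-up sample, corresponds to the positively skewed, leptokurtic pattern reported for most of the sunspot cycles.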

2.2: Methodology of the model
This section describes the ARMA-GARCH model.

2.2.1: ARMA MODEL
A statistical approach to forecasting uses stochastic models to predict the values of sunspot cycles from previous ones. In linear time series analysis, two processes are frequently used in the literature, viz. the autoregressive AR (p) and moving average MA (q) processes (Jenkins et al. 1970 and Hipal et al. 1994). ARMA models were developed by Jenkins et al. (1994). An ARMA model combines the ideas of the autoregressive AR (p) and moving average MA (q) processes. The concept of the ARMA process is strongly relevant in volatility modeling. The ARMA model is widely used for forecasting future values. The autoregressive process (AR) was developed by Yule (1927). As a stochastic process, the autoregressive process AR (p) can be expressed as a weighted sum of its previous values and a white noise term. The general autoregressive process AR (p) of lag p is as follows.

$$X_t = \alpha_1 X_{t-1} + \alpha_2 X_{t-2} + \dots + \alpha_p X_{t-p} + \varepsilon_t \tag{12}$$
Here ε_t is white noise with mean E(ε_t) = 0, variance Var(ε_t) = σ² and Cov(ε_{t−s}, ε_t) = 0 if s ≠ 0. For every t, suppose that ε_t is independent of X_{t−1}, X_{t−2}, …, so that ε_t is uncorrelated with X_s for each s < t. AR (p) models regress the series on its own past values, whereas the MA (q) model uses past error terms as explanatory variables (Hipal et al. 1994). The general moving average process MA (q) of lag q is as follows.

$$X_t = \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q} \tag{13}$$
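To make the roles of the autoregressive and moving average parts concrete, the following sketch builds an ARMA (p, q) series from a given noise sequence. It assumes the convention that X_t equals the current noise term plus the AR coefficients times past values of X plus the MA coefficients times past noise terms, with all pre-sample values set to zero. The function name `arma_from_noise`, the coefficient values and the random seed are hypothetical choices for illustration and are not the fitted sunspot models.

```python
import random


def arma_from_noise(ar, ma, noise):
    """Build an ARMA(p, q) series from a noise sequence.

    ar    : AR coefficients applied to X_{t-1}, X_{t-2}, ...
    ma    : MA coefficients applied to eps_{t-1}, eps_{t-2}, ...
    noise : the white-noise terms eps_0, eps_1, ...
    Values before the start of the sample are treated as zero.
    """
    series = []
    for t, eps in enumerate(noise):
        value = eps
        for i, coef in enumerate(ar, start=1):
            if t - i >= 0:
                value += coef * series[t - i]   # autoregressive part
        for j, coef in enumerate(ma, start=1):
            if t - j >= 0:
                value += coef * noise[t - j]    # moving-average part
        series.append(value)
    return series


# A unit shock at time 0 shows the difference between the two parts:
print(arma_from_noise([0.5], [], [1.0, 0.0, 0.0, 0.0]))  # AR(1): decays geometrically
print(arma_from_noise([], [0.5], [1.0, 0.0, 0.0, 0.0]))  # MA(1): vanishes after one lag

# Simulated ARMA(2, 2) path driven by Gaussian noise (illustrative coefficients).
rng = random.Random(0)
noise = [rng.gauss(0.0, 1.0) for _ in range(200)]
simulated = arma_from_noise([0.5, -0.3], [0.4, 0.2], noise)
```

The two impulse responses show why the orders matter: an AR term carries a shock forward indefinitely with shrinking weight, while an MA term of order q forgets it after q periods.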

2.2.2: GARCH MODEL
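The generalized autoregressive conditional heteroskedasticity (GARCH) model is used to evaluate the volatility of an asset. It expresses the current volatility as depending on past observations and past volatilities (Christian and Jean-Michel 2010). The time series ε_t can be modeled by a GARCH (p, q) process as

$$\varepsilon_t = \sigma_t z_t \tag{15}$$
Where z_t is a sequence of independent, identically distributed variables with zero mean and unit variance. The GARCH model is used to estimate the conditional variance

$$\sigma_t^{2} = \omega + \sum_{i=1}^{q} a_i\, \varepsilon_{t-i}^{2} + \sum_{j=1}^{p} b_j\, \sigma_{t-j}^{2} \tag{16}$$
The GARCH (p, q) model is strictly stationary with finite variance when the conditions $\omega > 0$ and $\sum_{i=1}^{q} a_i + \sum_{j=1}^{p} b_j < 1$ hold. The GARCH model has a form similar to that of the ARMA model. Moreover, the GARCH process can be derived by using theory and methods similar to those for ARMA.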
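For the GARCH (1, 1) case used in this study, the conditional variance can be traced through time with a short recursion. The sketch below assumes the standard form in which the variance at time t equals a constant plus a coefficient times the previous squared residual plus a coefficient times the previous variance, started at the unconditional variance. The function name `garch11_variance` and the parameter names `omega`, `a` and `b` are hypothetical, and the numbers are illustrative.

```python
def garch11_variance(residuals, omega, a, b):
    """Conditional variances of a GARCH(1,1) process for given residuals.

    sigma2[t] = omega + a * residuals[t-1]**2 + b * sigma2[t-1],
    with sigma2[0] set to the unconditional variance omega / (1 - a - b).
    """
    if omega <= 0 or a < 0 or b < 0 or a + b >= 1:
        raise ValueError("need omega > 0, a >= 0, b >= 0 and a + b < 1")
    sigma2 = [omega / (1.0 - a - b)]
    for t in range(1, len(residuals)):
        sigma2.append(omega + a * residuals[t - 1] ** 2 + b * sigma2[t - 1])
    return sigma2


# A single large residual at time 1 raises the variance at time 2,
# after which the variance decays back toward its long-run level of 1.0.
path = garch11_variance([0.0, 3.0, 0.0, 0.0, 0.0], omega=0.1, a=0.1, b=0.8)
print([round(v, 4) for v in path])
```

The stationarity condition a + b < 1 is what guarantees that the starting value, the unconditional variance, is finite and positive.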

2.2.3: ARMA (p, q) -GARCH (1, 1) METHODS OF SUNSPOT CYCLES
The concept of ARMA models is strongly relevant in volatility modeling. Generalized autoregressive conditional heteroscedastic (GARCH) models can be linked to ARMA models: the squared series of a GARCH process satisfies an ARMA equation with white noise. In time series analysis, the GARCH model assumes that the conditional mean is zero; more generally, the conditional mean can be given an ARMA structure. Identification of the GARCH process is based on the squared residuals from the appropriate ARMA model. Moreover, the quasi-maximum likelihood estimates of the ARMA part are nearly independent of those of the GARCH part, whereas the ARMA and GARCH estimates are strongly correlated if the ARMA-GARCH process has a skewed distribution (Csyer et al. 2008). The ARMA process and the GARCH process behave similarly in forecasting. The ARMA-GARCH process provides good estimates for time series data. Moreover, the Box-Jenkins methodology with the GARCH approach is used to develop the models, to estimate them and to forecast the sunspot cycle data.
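The way the mean and variance equations are combined in Gaussian quasi-maximum likelihood estimation can be sketched for the simplest case, ARMA (1, 1)-GARCH (1, 1). The sketch only evaluates the Gaussian log-likelihood for given parameter values; in practice a numerical optimizer would search for the values that maximize it, which is what statistical packages do internally. The function name, the parameter names and the start-up choices (pre-sample values set to zero, initial variance set to the unconditional variance) are assumptions made for this illustration, and the study itself used higher ARMA orders.

```python
import math


def arma11_garch11_loglik(x, alpha, theta, omega, a, b):
    """Gaussian quasi log-likelihood of an ARMA(1,1)-GARCH(1,1) model.

    Mean equation:     e_t = x_t - alpha * x_{t-1} - theta * e_{t-1}
    Variance equation: s2_t = omega + a * e_{t-1}^2 + b * s2_{t-1}
    """
    if omega <= 0 or a < 0 or b < 0 or a + b >= 1:
        raise ValueError("need omega > 0, a >= 0, b >= 0 and a + b < 1")

    prev_x = 0.0                        # pre-sample observation (assumed zero)
    prev_e = 0.0                        # pre-sample residual (assumed zero)
    sigma2 = omega / (1.0 - a - b)      # start at the unconditional variance
    loglik = 0.0
    for value in x:
        e = value - alpha * prev_x - theta * prev_e
        loglik += -0.5 * (math.log(2.0 * math.pi) + math.log(sigma2) + e * e / sigma2)
        # Update the conditional variance for the next observation.
        sigma2 = omega + a * e * e + b * sigma2
        prev_x, prev_e = value, e
    return loglik


# Compare two candidate parameter sets on a short made-up series.
data = [0.5, 0.9, 0.4, -0.3, -1.2, -0.8, 0.1, 1.5, 1.1, 0.2]
print(arma11_garch11_loglik(data, 0.0, 0.0, 1.0, 0.0, 0.0))
print(arma11_garch11_loglik(data, 0.6, 0.2, 0.2, 0.1, 0.6))
```

The parameter set with the larger log-likelihood fits the series better under the Gaussian working assumption, and that maximized value is the quantity that enters the AIC, SIC and HQC.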

Result And Discussion
This study aims to estimate and forecast future sunspots with Box-Jenkins ARMA (p, q)-GARCH models. Lags with second differences are used to make the data series stationary. The generalized autoregressive conditional heteroskedastic (GARCH) model typically has only three parameters, yet it allows an unlimited number of past squared residuals to influence the present conditional variance. GARCH (p, q) is frequently used for modeling because it is parsimonious in its parameters and still produces good estimates. The conditional variance estimated through GARCH is a weighted average of past squared residuals. The weights decline but never reach zero. An essential feature of GARCH is that it allows the conditional variance to depend on its own previous values [6]. The novelty of this study is the analysis of the ARMA (p, q)-GARCH (1, 1) process of sunspot cycles. The ARMA (p, q)-GARCH (1, 1) model is chosen based on the smallest value of the Durbin-Watson (DW) statistic. A small DW value (< 2) shows that the values within each cycle are strongly correlated and persistent. AIC, SIC, HQC and the log likelihood are also estimated for each cycle. Tables 1, 2 and 3 present the GARCH (1, 1) model equations with ARMA (p, q) specification for the sunspot cycles, together with the diagnostic checking tests, forecast evaluation and normality tests. Gaussian quasi-maximum likelihood estimation is used to analyze the ARMA (p, q)-GARCH (1, 1) model. The Lagrange multiplier test is used to verify the ARCH effect in the time series data. The Ljung-Box test is used to check for serial correlation in each sunspot cycle. A further novelty of this research is the analysis of the effects of the conditional mean and the conditional variance on each sunspot cycle.
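The remark that the GARCH conditional variance is a weighted average of past squared residuals, with weights that shrink but never vanish, can be checked numerically for GARCH (1, 1). Substituting the variance equation into itself repeatedly gives a weight of a times b to the power k on the squared residual k + 1 periods back, where a and b are the ARCH and GARCH coefficients. The function name `garch11_lag_weights` and the coefficient values below are hypothetical.

```python
def garch11_lag_weights(a, b, n_lags):
    """Weights on past squared residuals implied by a GARCH(1,1) model.

    The weight on the squared residual (k + 1) periods back is a * b**k,
    for k = 0, 1, ..., n_lags - 1.
    """
    if a <= 0 or not 0 < b < 1:
        raise ValueError("need a > 0 and 0 < b < 1")
    return [a * b ** k for k in range(n_lags)]


weights = garch11_lag_weights(a=0.1, b=0.8, n_lags=10)
print([round(w, 4) for w in weights])
# The weights over all lags add up to a / (1 - b), here 0.5.
print(sum(garch11_lag_weights(0.1, 0.8, 200)))
```

Three parameters are therefore enough to let every past squared residual influence the present variance, which is the parsimony argument made above.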
The diagnostic checking test is chosen by comparing these techniques with the help of Root mean square error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE). Figure 1 displays the GARCH graphs of the sunspot cycles (cycles 1-24, the complete time series from 1755 to 2019) with conditional variance. In the Fig. 2

Conclusion
The performance of the ARCH model and its modification, the generalized autoregressive conditional heteroskedasticity (GARCH) model, has been studied using sunspot cycles (1st-24th). The sunspot cycles (1st-24th) have been modelled and forecasted using the GARCH volatility model with an autoregressive moving average ARMA (p, q) specification. The stationary GARCH volatility model is the finest forecasting model compared with the other models. Gaussian quasi-maximum likelihood estimation is used to analyze the ARMA (p, q)-GARCH (1, 1) process. The appropriate model is selected by residual diagnostic checking (the Lagrange multiplier (LM) test for the ARCH effect, the Ljung-Box test for autocorrelation or ARCH effect in the data, and finally the normality test). The ARMA (p, q)-GARCH (1, 1) process is leptokurtic, that is, it has fat and heavy tails (values are strongly correlated to each other). The ARMA (p, q)-GARCH (1, 1) process of the sunspot cycles shows positive skewness, except for cycles 4 and 19. In this study, the sunspot cycles follow the GARCH specification with an ARMA (2, 2) model for cycles 1, 4, 12, 13, 14, 15, 16, 19, 20, 23 and 24. Sunspot cycles 5, 6, 7 and 15 follow an ARMA (3, 3)-GARCH model, whereas for cycles 2 and 11 the appropriate model is the ARMA (5, 1)-GARCH (1, 1) process. The ARMA (5, 3)-GARCH (1, 1) process describes cycles 18 and 19. The Durbin-Watson (DW) statistics of all sunspot cycles are less than 2, which indicates that the sunspot observations are correlated with each other. The Akaike information criterion (AIC), Bayesian Schwarz information criterion (BIC) and Hannan Quinn information criterion (HIC) indicate that the best-fitting model is that of the 5th sunspot cycle. The Jarque-Bera test rejects normality for the ARMA (p, q)-GARCH (1, 1) processes of the sunspot cycles.
Forecasting of each sunspot cycle was evaluated based on the Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE). For each cycle, the MAE takes the smallest value among the three measures. Sunspot cycle 6 has the smallest values of RMSE, MAE, and MAPE, which are 25.17780, 17.81173 and 79.03446 respectively. On the basis of the ARMA-GARCH results, cycles 1 to 24 are stationary and linear. On the basis of this study, we can predict that cycle 25 will also be stationary and linear. The results show that the ARMA (p, q)-GARCH (1, 1) process is the finest volatility model for solar activity. Based on the implications of the results, the scope of future research directions will be expanded.