The data under consideration are the monthly mean sunspot numbers for cycles 1–24 (1755–2019), collected from the World Data Centre (WDC). The main emphasis is on the Box-Jenkins method for the stationary ARMA-GARCH process. Adequate ARMA (p, q)-GARCH (1, 1) models are selected by the Akaike information criterion (AIC), the Bayesian Schwarz information criterion (BIC) and the Hannan-Quinn information criterion (HQC). The forecasting ability of each model of the sunspot cycles is judged by accuracy measures such as the Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE). Maximum likelihood estimation is used to estimate the ARMA (p, q)-GARCH (1, 1) model. The statistical software EViews version 9.0 is used for the calculation and analysis of the ARMA (p, q)-GARCH (1, 1) model and the respective graphs, for instance, time series plots and fitted, residual and forecasted plots for the total sunspot cycles. This section consists of two subsections.

## 2.1: Basic equations of statistical analysis

This section presents a brief account of the statistical tools used in the analysis.

## 2.1.1: Diagnostic Test

The Lagrange multiplier (LM) test is used to check for the ARCH effect in the data, and the correlogram of squared residuals is also used to confirm the ARCH effect in the time series. In addition, the usual normality tests are executed to justify the use of GARCH. The Root Mean Squared Error (RMSE), Mean Absolute Error (MAE) and Mean Absolute Percentage Error (MAPE) are calculated to verify the accuracy of the forecasts. Gaussian quasi-maximum likelihood estimation (GQMLE) is used to fit the models; it is commonly employed in GARCH models to accommodate heavy-tailed returns (Bollerslev, Engle and Nelson 1994; Thomas and Denial 2002). The Gaussian quasi-maximum likelihood estimator is asymptotically normally distributed with a variance at least as small as those of other asymptotically normally distributed estimators, and it produces consistent estimates of the parameters of a correctly specified conditional mean. The adequacy of the selected models is verified by the Akaike information criterion (AIC), the Bayesian Schwarz information criterion (BIC) and the Hannan-Quinn information criterion (HQC). Forecasts from the best-fitted model of the sunspot cycles are tested for accuracy with the RMSE, MAE and MAPE. A description of these terminologies is given in the following.
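Although the computations in this study are performed in EViews, the single-lag version of the ARCH LM test can be sketched in Python by regressing the squared residuals on their first lag and forming n·R². The function name, the one-lag restriction, and the sample residuals below are illustrative assumptions, not the exact EViews routine.

```python
def arch_lm_stat(residuals):
    # Regress e_t^2 on a constant and e_{t-1}^2 by ordinary least squares;
    # the LM statistic n * R^2 is asymptotically chi-squared(1) under
    # the null hypothesis of no ARCH effect.
    y = [e ** 2 for e in residuals[1:]]
    x = [e ** 2 for e in residuals[:-1]]
    n = len(y)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    slope = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y)) / sxx
    fitted = [my + slope * (xi - mx) for xi in x]
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return n * (1.0 - ss_res / ss_tot)

# A large statistic (relative to the chi-squared(1) critical value 3.84)
# suggests an ARCH effect; these residual values are hypothetical.
stat = arch_lm_stat([1.0, -2.0, 0.5, 3.0, -1.0, 2.0, -0.5, 1.5])
```

In practice more lags are included in the auxiliary regression, but the n·R² form of the statistic is the same.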

**Akaike information criterion:** The AIC was introduced by Hirotugu Akaike in 1973 as an extension of the maximum likelihood principle. The selection criterion is the least value of AIC.

AIC = \(-2\ln\left(\text{Likelihood}\right) + 2S\) (1)

Where S is the number of model parameters. The likelihood measures the fit of the model; larger likelihood values exhibit a better fit.

**Schwarz criterion:** The SIC is used to select the most appropriate model among a finite set of models; the appropriate model is the one with the least value of SIC. The Schwarz criterion was developed by Gideon E. Schwarz and is closely related to the AIC.

SIC = \(-2\ln\left(\text{Likelihood}\right) + S\ln\left(N\right)\) (2)

Where S is the number of model parameters and N is the number of observations.

**Hannan-Quinn criterion:** The HQC is an alternative model selection criterion to the AIC and SIC.

HQC = \(-2\ln\left(\text{Likelihood}\right) + 2S\ln\left(\ln\left(N\right)\right)\) (3)

Where S is the number of model parameters and N is the number of observations.
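Although the criteria are computed by EViews in this study, they are easy to reproduce from the log-likelihood; a minimal Python sketch using the standard definitions (the log-likelihood values and model labels below are hypothetical):

```python
import math

def aic(log_lik, s):
    # Akaike information criterion: smaller is better.
    return -2.0 * log_lik + 2 * s

def sic(log_lik, s, n):
    # Schwarz (Bayesian) information criterion.
    return -2.0 * log_lik + s * math.log(n)

def hqc(log_lik, s, n):
    # Hannan-Quinn criterion.
    return -2.0 * log_lik + 2 * s * math.log(math.log(n))

# Compare two hypothetical fitted models on n = 200 observations;
# the model with the smallest criterion value is preferred.
n = 200
models = {"ARMA(1,1)-GARCH(1,1)": (-512.3, 5),
          "ARMA(2,2)-GARCH(1,1)": (-510.9, 7)}
for name, (ll, s) in models.items():
    print(name, round(aic(ll, s), 2), round(sic(ll, s, n), 2),
          round(hqc(ll, s, n), 2))
```

Because ln(N) > 2 for N > 7, the SIC penalizes extra parameters more heavily than the AIC, with the HQC in between.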

**Durbin-Watson Test:** The DW statistic tests for linear association between adjacent residuals from a regression model. The null hypothesis of the Durbin-Watson test is \(\tau = 0\) in the specification

\({U}_{t} = \tau {U}_{t-1} + {\epsilon }_{t}\) (4)

A Durbin-Watson (DW) value equal to 2 shows that there is no serial correlation. A DW value less than 2 indicates positive correlation, and a value in the range from 2 to 4 indicates negative correlation. The series is strongly positively correlated if the value approaches zero.
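The DW statistic is the ratio of the sum of squared successive residual differences to the sum of squared residuals; a minimal sketch (the sample residuals are illustrative):

```python
def durbin_watson(residuals):
    # DW = sum_t (e_t - e_{t-1})^2 / sum_t e_t^2, ranging from 0 to 4.
    num = sum((residuals[t] - residuals[t - 1]) ** 2
              for t in range(1, len(residuals)))
    den = sum(e ** 2 for e in residuals)
    return num / den

# Alternating residuals are negatively correlated, so DW is well above 2;
# identical residuals are perfectly positively correlated, so DW is 0.
dw_neg = durbin_watson([1, -1, 1, -1, 1, -1])
dw_pos = durbin_watson([1, 1, 1, 1])
```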

**Mean Absolute Error:** The mean absolute error is expressed mathematically as

MAE = \(\frac{1}{n}\sum _{t=1}^{n}\left|{\epsilon }_{t}\right|\) (5)

Where n is the number of observations. The Mean Absolute Error (MAE) measures the absolute deviation of the forecasted values from the real ones; it is also called the Mean Absolute Deviation (MAD). It expresses the magnitude of the overall error caused by forecasting. Because the errors enter in absolute value, positive and negative errors do not cancel out, but the MAE does not indicate the direction of the errors. It should be as small as possible for good forecasting. The MAE depends on data transformations and the scale of measurement, and it does not heavily penalize extreme forecast errors (Adhikari R. 2013).

**Mean Absolute Percentage Error (MAPE):** The Mean Absolute Percentage Error (MAPE) is defined as

MAPE = \(\frac{1}{n}\sum _{t=1}^{n}\left|\frac{{\epsilon }_{t}}{{X}_{t}}\right|\times 100\) (6)

The Mean Absolute Percentage Error (MAPE) gives the average absolute error as a percentage. It is independent of the scale of measurement, although it is affected by data transformations. The MAPE does not indicate the direction of the error and does not penalize extreme deviations, and opposite-signed errors do not offset each other [1]. Because of its scale independence and ease of interpretation, the MAPE is one of the most extensively used measures of prediction accuracy (Schwabe H. 1844).

**Root Mean Squared Error (RMSE):** The root mean squared error (RMSE) is defined as

RMSE = \(\sqrt{\frac{1}{n}\sum _{t=1}^{n}{\epsilon }_{t}^{2}}\) (7)

The RMSE calculates the average squared deviation of the forecasted values from the actual ones. Opposite-signed errors do not offset one another, so the RMSE provides a complete picture of the error incurred during forecasting. Smaller values of these accuracy measures, such as an RMSE of 0.1 or a MAPE of 1%, indicate better forecasts. In the RMSE, the total forecast error is strongly affected by large individual errors: a large error is much more expensive than several small errors. It does not reveal the direction of the overall error. The RMSE is affected by data transformations and changes of scale, but it is a good measure of the overall forecast error (Adhikari R. 2013).
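The three accuracy measures (Eqs. 5–7) can be computed together from the forecast errors; a minimal Python sketch with hypothetical actual and forecasted values:

```python
import math

def forecast_errors(actual, forecast):
    # Pointwise errors epsilon_t = X_t - f_t, then MAE, MAPE (%), RMSE.
    errors = [x - f for x, f in zip(actual, forecast)]
    n = len(errors)
    mae = sum(abs(e) for e in errors) / n
    mape = 100.0 * sum(abs(e / x) for e, x in zip(errors, actual)) / n
    rmse = math.sqrt(sum(e ** 2 for e in errors) / n)
    return mae, mape, rmse

# Hypothetical actual and forecasted values.
actual = [100.0, 120.0, 80.0, 90.0]
forecast = [110.0, 115.0, 85.0, 88.0]
mae, mape, rmse = forecast_errors(actual, forecast)
```

Note that RMSE is never smaller than MAE on the same errors, because squaring weights the large errors more heavily.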

**Theil’s U-Statistics (U):** Theil’s U-statistic is defined as

U = \(\frac{\sqrt{\frac{1}{n}\sum _{t=1}^{n}{\epsilon }_{t}^{2}}}{\sqrt{\frac{1}{n}\sum _{t=1}^{n}{f}_{t}^{2}}+\sqrt{\frac{1}{n}\sum _{t=1}^{n}{X}_{t}^{2}}}\), \(0 \le U \le 1\) (8)

Where \({f}_{t}\) represents the forecasted value and \({X}_{t}\) the actual value. U is a normalized measure of the total forecast error; U equal to 0 indicates a perfect fit.
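A sketch of this statistic in Python, using the U1 form in which the denominator adds the root-mean-squares of the forecasted and actual series (this is what keeps U between 0 and 1):

```python
import math

def rms(xs):
    # Root-mean-square of a sequence.
    return math.sqrt(sum(x ** 2 for x in xs) / len(xs))

def theil_u(actual, forecast):
    # Theil's U1: RMS of the errors, normalized by the sum of the
    # RMS of the forecasts and the RMS of the actual values.
    errors = [x - f for x, f in zip(actual, forecast)]
    return rms(errors) / (rms(forecast) + rms(actual))

# A perfect forecast gives U = 0; the values here are hypothetical.
u_perfect = theil_u([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
u_bad = theil_u([1.0, 2.0, 3.0], [3.0, 1.0, 2.0])
```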

## 2.1.2: Tests for Normality

The normality test examines whether the data under consideration are normally distributed. These tests are based on the analysis of two numerical measures of shape, the skewness and the excess kurtosis; the data are approximately normally distributed if both measures are close to zero. The Jarque-Bera test is likewise based on skewness and kurtosis. Hence, the test of normality consists of checking the skewness and the kurtosis, on which the Jarque-Bera test is based.

**Skewness:** The skewness measures the degree of asymmetry of the data.

Skewness = \(\frac{\sum _{i=1}^{n}{({X}_{i}-\stackrel{-}{X})}^{3}}{(n-1){S}^{3}}\) (9)

Where \(\stackrel{-}{X}\) is the mean, S is the standard deviation and n is the number of values (Christian and Jean-Michel 2004). A normal distribution is symmetric, so its skewness equals zero. The distribution is positively skewed if the skewness is greater than zero and negatively skewed if it is less than zero.

**Kurtosis:** The kurtosis measures the degree of peakedness of the data. It is estimated as

Kurtosis = \(\frac{\sum _{i=1}^{n}{({X}_{i}-\stackrel{-}{X})}^{4}}{(n-1){S}^{4}}\) (10)

Where \(\stackrel{-}{X}\) is the mean, *S* is the standard deviation and *n* is the number of values of the time series data. A distribution with kurtosis equal to 3, like the normal distribution, is called mesokurtic; it is leptokurtic if the value is greater than 3 and platykurtic if the value is less than 3.

**Jarque-Bera Statistics Test (JBS):** The JBS test accepts the normality of the data when the skewness equals zero and the excess kurtosis equals zero. The Jarque-Bera test is defined as follows.

Jarque-Bera test = \(\frac{n{\left(Skewness\right)}^{2}}{6}\) + \(\frac{n{(Kurtosis-3)}^{2}}{24}\) (11)

The Jarque-Bera test statistic follows a Chi-squared distribution with two degrees of freedom. The null hypothesis (H0) is that the data are normally distributed, with skewness zero and excess kurtosis zero (which is the same as a kurtosis of 3); the alternative hypothesis (HA) is that the data are not normally distributed.
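The three normality measures (Eqs. 9–11) can be sketched directly in Python; the sample data below are illustrative:

```python
import math

def moments(xs):
    # Sample skewness and kurtosis as in Eqs. (9) and (10),
    # with (n - 1) in the denominator and S the sample standard deviation.
    n = len(xs)
    mean = sum(xs) / n
    s = math.sqrt(sum((x - mean) ** 2 for x in xs) / (n - 1))
    skew = sum((x - mean) ** 3 for x in xs) / ((n - 1) * s ** 3)
    kurt = sum((x - mean) ** 4 for x in xs) / ((n - 1) * s ** 4)
    return skew, kurt

def jarque_bera(xs):
    # Eq. (11): n * skew^2 / 6 + n * (kurt - 3)^2 / 24.
    n = len(xs)
    skew, kurt = moments(xs)
    return n * skew ** 2 / 6.0 + n * (kurt - 3.0) ** 2 / 24.0

# A symmetric sample has skewness zero; its kurtosis still differs from 3,
# so the Jarque-Bera statistic is driven by the kurtosis term alone.
skew, kurt = moments([-2.0, -1.0, 0.0, 1.0, 2.0])
jb = jarque_bera([-2.0, -1.0, 0.0, 1.0, 2.0])
```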

## 2.2: Methodology of the model

This section describes the ARMA-GARCH model.

## 2.2.1: ARMA MODEL

A statistical approach to forecasting involves stochastic models that predict the values of the sunspot cycles from previous ones. For linear time series, two methods are frequently used in the literature, viz. the Autoregressive AR (p) and the Moving Average MA (q) processes (Jenkins et al. 1970, Hipal et al. 1994). The ARMA model, developed by (Jenkins et al. 1994), combines the Autoregressive AR (p) and Moving Average MA (q) processes. The concept of the ARMA process is strongly relevant in volatility modeling, and the ARMA model is widely used for forecasting future values. The autoregressive process was developed by (Yule, 1927). In a stochastic process, the autoregressive process AR (p) can be expressed as a weighted sum of its previous values plus a white noise term. The generalized autoregressive process AR (p) of lag p is as follows.

\({X}_{t} = {\alpha }_{1}{X}_{t-1} + {\alpha }_{2}{X}_{t-2} + \dots + {\alpha }_{p}{X}_{t-p} + {\epsilon }_{t}\) (12)

Here \({\epsilon }_{t}\) is white noise with mean E(\({\epsilon }_{t}\)) = 0, variance Var(\({\epsilon }_{t}\)) = \({\sigma }^{2}\) and Cov(\({\epsilon }_{t-s}, {\epsilon }_{t}\)) = 0 if s ≠ 0. For every t, suppose that \({\epsilon }_{t}\) is independent of \({X}_{t-1}, {X}_{t-2}, \dots\), so that \({\epsilon }_{t}\) is uncorrelated with \({X}_{s}\) for each *s* < *t*. AR (p) models regress on past values of the data set, whereas the MA (q) model uses the error terms as descriptive variables (Hipal et al. 1994). The generalized moving average process MA (q) of lag q is as follows.

\({X}_{t} = {\epsilon }_{t} + {\beta }_{1}{\epsilon }_{t-1} + {\beta }_{2}{\epsilon }_{t-2} + \dots + {\beta }_{q}{\epsilon }_{t-q}\) (13)

The process \({X}_{t}\) defined by the ARMA model is

\({X}_{t} = {\alpha }_{1}{X}_{t-1} + {\alpha }_{2}{X}_{t-2} + \dots + {\alpha }_{p}{X}_{t-p} + {\epsilon }_{t} + {\beta }_{1}{\epsilon }_{t-1} + {\beta }_{2}{\epsilon }_{t-2} + \dots + {\beta }_{q}{\epsilon }_{t-q}\) (14)

where \({\epsilon }_{t}\) is an uncorrelated process with mean zero. The autocorrelations of an ARMA (p, q) process decay exponentially or sinusoidally to zero.
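A short simulation of Eq. (14) illustrates how an ARMA (p, q) path is generated from its own past values and lagged white-noise shocks; the coefficient values are hypothetical:

```python
import random

def simulate_arma(alphas, betas, n, seed=0):
    # X_t = sum_i alphas[i] * X_{t-1-i} + eps_t + sum_j betas[j] * eps_{t-1-j},
    # with eps_t standard Gaussian white noise (Eq. 14).
    rng = random.Random(seed)
    xs, eps = [], []
    for t in range(n):
        e = rng.gauss(0.0, 1.0)
        x = e
        x += sum(a * xs[t - 1 - i] for i, a in enumerate(alphas) if t - 1 - i >= 0)
        x += sum(b * eps[t - 1 - j] for j, b in enumerate(betas) if t - 1 - j >= 0)
        xs.append(x)
        eps.append(e)
    return xs

# Hypothetical ARMA(1, 1) with alpha_1 = 0.6 and beta_1 = 0.3.
series = simulate_arma([0.6], [0.3], 500)
```

The AR coefficients must keep the process stationary (here |alpha_1| < 1) for the simulated path to fluctuate around zero rather than drift off.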

## 2.2.2: GARCH MODEL

The generalized autoregressive conditional heteroskedasticity (GARCH) model is used to evaluate the volatility of an asset. It expresses the fact that the present volatility depends on past observations and volatilities (Christian and Jean-Michel 2010). The time series \({\tau }_{t}\) can be modeled by

\({\tau }_{t} = {\sigma }_{t}{\epsilon }_{t}\) with \({\epsilon }_{t} \sim IID(0, 1)\) (15)

The GARCH model is used to estimate the conditional variance \({\sigma }_{t}^{2}\):

\({\sigma }_{t}^{2} = \delta + \sum _{i=1}^{q}{\alpha }_{i}{\tau }_{t-i}^{2} + \sum _{j=1}^{p}{\beta }_{j}{\sigma }_{t-j}^{2}\) (16)

The GARCH (p, q) model is strictly stationary with finite variance when the conditions \(\delta > 0\) and \(\sum _{i=1}^{q}{\alpha }_{i} + \sum _{j=1}^{p}{\beta }_{j} < 1\) hold. The GARCH model has a similar form to the ARMA model; moreover, the GARCH process can be derived by using a theory and method similar to those of ARMA.
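The GARCH (1, 1) special case of Eq. (16) can be simulated with a simple variance recursion; the parameter values below are hypothetical and chosen to satisfy the stationarity conditions:

```python
import random

def simulate_garch11(delta, alpha1, beta1, n, seed=1):
    # sigma_t^2 = delta + alpha1 * tau_{t-1}^2 + beta1 * sigma_{t-1}^2
    # (Eq. 16 with p = q = 1), with tau_t = sigma_t * eps_t.
    assert delta > 0 and alpha1 + beta1 < 1  # stationarity conditions
    rng = random.Random(seed)
    var = delta / (1.0 - alpha1 - beta1)  # start at the unconditional variance
    taus, sigmas = [], []
    for _ in range(n):
        sigmas.append(var ** 0.5)
        tau = rng.gauss(0.0, 1.0) * var ** 0.5
        taus.append(tau)
        var = delta + alpha1 * tau ** 2 + beta1 * var
    return taus, sigmas

# Hypothetical parameters: delta = 0.1, alpha_1 = 0.1, beta_1 = 0.8.
taus, sigmas = simulate_garch11(0.1, 0.1, 0.8, 300)
```

Because the recursion adds only positive terms to a positive start value, the conditional variance stays strictly positive along the whole path.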

## 2.2.3: ARMA (p, q) – GARCH (1, 1) METHODS OF SUNSPOT CYCLES

The concept of ARMA models is strongly relevant in volatility modeling, and the generalized autoregressive conditional heteroscedastic (GARCH) models can be linked to ARMA models: GARCH models satisfy an ARMA equation with white noise. In time series, the GARCH model assumes that the conditional mean is zero; more generally, the conditional mean can be structured as an ARMA model. Identification of the GARCH process is focused on the squared residuals from the appropriate ARMA models. Moreover, in the ARMA process the quasi-maximum likelihood estimation is nearly independent of the GARCH process, whereas the ARMA and GARCH estimates are strongly correlated if the ARMA-GARCH process has a skewed distribution (Csyer et al. 2008). The ARMA process and the GARCH process behave similarly in forecasting, and the ARMA-GARCH process provides a good estimation for time series data.

The GARCH (1, 1) process specification with ARMA (p, q) is defined as follows (t = 0, \(\pm 1, \pm 2, \dots\)).

\({X}_{t} = {\alpha }_{1}{X}_{t-1} + {\alpha }_{2}{X}_{t-2} + \dots + {\alpha }_{p}{X}_{t-p} + {\epsilon }_{t} + {\beta }_{1}{\epsilon }_{t-1} + {\beta }_{2}{\epsilon }_{t-2} + \dots + {\beta }_{q}{\epsilon }_{t-q}\) (17)

\({\tau }_{t} = {\sigma }_{t}{\epsilon }_{t}\) with \({\epsilon }_{t} \sim IID(0, 1)\) (18)

\({\sigma }_{t}^{2} = \delta + {\beta }_{1}{\tau }_{t-1}^{2} + \gamma {\sigma }_{t-1}^{2}\) (19)

Where E(\({\tau }_{t}\)) = 0, Var(\({\tau }_{t} \mid {\tau }_{t-1}, {\tau }_{t-2}, \dots\)) = \({\sigma }_{t}^{2}\) and Cov(\({\tau }_{t-s}, {\tau }_{t}\)) = 0 if s ≠ 0.
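Putting the pieces together, a hypothetical ARMA (1, 1)-GARCH (1, 1) path following the specification above can be simulated as follows (all parameter values are illustrative, not fitted):

```python
import random

def simulate_arma11_garch11(a1, b1, delta, beta1, gamma, n, seed=2):
    # Mean equation:     X_t = a1 * X_{t-1} + tau_t + b1 * tau_{t-1}
    # Innovation:        tau_t = sigma_t * eps_t, eps_t ~ IID(0, 1)
    # Variance equation: sigma_t^2 = delta + beta1 * tau_{t-1}^2 + gamma * sigma_{t-1}^2
    assert delta > 0 and beta1 + gamma < 1  # stationary conditional variance
    rng = random.Random(seed)
    var = delta / (1.0 - beta1 - gamma)  # unconditional variance as start value
    x_prev = tau_prev = 0.0
    xs = []
    for _ in range(n):
        tau = rng.gauss(0.0, 1.0) * var ** 0.5
        x = a1 * x_prev + tau + b1 * tau_prev
        xs.append(x)
        var = delta + beta1 * tau ** 2 + gamma * var
        x_prev, tau_prev = x, tau
    return xs

# Hypothetical coefficients for the mean and variance equations.
path = simulate_arma11_garch11(0.6, 0.3, 0.1, 0.1, 0.8, 300)
```

In practice the coefficients are estimated jointly by (quasi-)maximum likelihood, as done in EViews in this study; the sketch only shows how the two recursions interact.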

Moreover, the Box-Jenkins methodology with the GARCH approach is used to develop the models, to estimate them and to forecast the sunspot cycle data.