2.1 A model for monthly trends
Figure 1 shows monthly rainfall for the entire U.K. from 1836 to 2021. The plot shows the familiar variability of rainfall over time, with no clear trend and with the monthly pattern swamped by the noisiness of the data. The sidebar depicts the histogram of the data, which reveals asymmetry with a long right-hand tail containing high rainfall figures and with the left-hand tail being bounded by zero. This suggests that a transformation of the series could be useful in inducing symmetry, and perhaps normality, in the data before formal modelling is undertaken.
These general features suggest using the approach introduced by Mills (2005) and subsequently employed in Mills (2015, 2017, 2019a) to model monthly rainfall. If \({x}_{t}\) denotes rainfall in month \(t, t=\text{1,2}, \dots , T\), where \(t=1\)represents January 1836 and \(T=2232\) represents December 2021, then this model takes the form
$${x}_{t}^{\left(\lambda \right)}={\sum }_{i=1}^{12}\left({\alpha }_{i}{s}_{i,t}+{\beta }_{i}{s}_{i,t}t\right)+{u}_{t}$$
1
where the Box and Cox (1964) power transformation
\({x}_{t}^{\left(\lambda \right)}=\frac{{x}_{t}^{\lambda }-1}{\lambda }\) \({x}_{t}^{\left(0\right)}=\text{log}{x}_{t}\)
is employed to reduce the asymmetry in the rainfall series. Such a transformation will also help to induce normality, linearity, and constancy of variance into the model: see, for example, Mills (2019b, Chap. 2). The \({s}_{i,t}\), \(t=\text{1,2},\dots ,T\), are ‘dummy’ variables defined to take the value 1 in month \(i\) and 0 elsewhere (where \(i=1\) signifies January, \(i=2\) February, etc.). Their inclusion allows a deterministic monthly seasonal pattern to be modelled. The presence of the \({s}_{i,t}t\) ‘interaction’ variables allow the possibility of different monthly linear time trends. The \({\alpha }_{i}\) and \({\beta }_{i}\) parameters measure the intercept and slope of these trends, so that if \({\beta }_{i}\ne 0\) then the seasonal pattern for month i evolves linearly over time.
The error \({u}_{t}\) can, in general, follow a seasonal autoregressive-moving average (ARMA) process: see, for example, Mills (2014, 2019b, Chap. 9):
$$\phi \left(B\right)\varPhi \left({B}^{12}\right)\hspace{0.33em}{u}_{t}=\theta \left(B\right)\varTheta \left({B}^{12}\right)\hspace{0.33em}{a}_{t}$$
2
where
$$\phi \left(B\right)=1-{\phi }_{1}B-\dots -{\phi }_{p}{B}^{p}$$
$$\theta \left(B\right)=1+{\theta }_{1}B+\dots \hspace{0.33em}+{\theta }_{q}{B}^{q}$$
are ‘non-seasonal’ polynomials of orders p and q in the lag operator B, defined such that \({B}^{j}{z}_{t}\equiv {z}_{t-j}\). The ‘innovation’ \({a}_{t}\) is assumed to be zero mean white noise (\(E\left({a}_{t}\right)=0\), \(E\left({a}_{t}{a}_{t-j}\right)=0\) for all \(j\ne t\)) with variance \(E\left({a}_{t}^{2}\right)={\sigma }_{a}^{2}\). The ‘seasonal’ polynomials
$$\varPhi \left({B}^{12}\right)=1-{\varPhi }_{1}{B}^{12}-\hspace{0.33em}\dots \hspace{0.33em}-{\varPhi }_{P}{B}^{12P}$$
$$\varTheta \left({B}^{12}\right)=1+{\varTheta }_{1}{B}^{12}+\hspace{0.33em}\dots \hspace{0.33em}+{\varTheta }_{Q}{B}^{12Q}$$
are of orders P and Q and their presence allows the error to be autocorrelated at annual lags, such as 12, 24, …, as well as being autocorrelated at non-seasonal lags.
More general models may be obtained if unit roots are allowed in the \(\phi \left(B\right)\) and \(\varPhi \left({B}^{12}\right)\) polynomials. If the non-seasonal autoregressive polynomial contains a unit root, i.e., the characteristic equation associated with \(\phi \left(B\right)\) contains a root of unity, then \(\phi \left(B\right)\) can be factorised as
$$\phi \left(B\right)=\left(1-B\right)\left(1-{\phi }_{1}^{*}B-\dots -{\phi }_{p-1}^{*}{B}^{p-1}\right)=\left(1-B\right){\phi }^{*}\left(B\right)$$
,
where \({\phi }^{*}\left(B\right)\) is a polynomial of order \(p-1\). Eq. (1) then becomes, with \(\nabla =1-B\) signifying the first-difference operator and \({u}_{t}^{*}=\nabla {u}_{t}\),
$$\nabla {x}_{t}^{\left(\lambda \right)}={\sum }_{i=1}^{12}\left({\alpha }_{i}\nabla {s}_{i,t}+{\beta }_{i}\nabla {s}_{i,t}t\right)+{u}_{t}^{*}$$
$${\phi }^{*}\left(B\right)\varPhi \left({B}^{12}\right){u}_{t}^{*}=\theta \left(B\right)\varTheta \left({B}^{12}\right)\hspace{0.33em}{a}_{t}$$
3
Noting that \(\nabla {s}_{i,t}={s}_{i,t}-{s}_{i+1,t}\) and \(\nabla {s}_{i,t}t=\left(12+i\right)\left({s}_{i,t}-{s}_{i+1,t}\right)\), where it is taken that \(\nabla {s}_{12,t}={s}_{12,t}-{s}_{1,t+1}\), (3) in turn becomes
$$\nabla {x}_{t}^{\left(\lambda \right)}={\sum }_{i=1}^{12}\left({\alpha }_{i}+\left(12+i\right){\beta }_{i}\right)\left({s}_{i,t}-{s}_{i+1,t}\right)+{u}_{t}^{*}$$
4
Noting that \(\nabla {x}_{t}^{\left(\lambda \right)}=\alpha +{u}_{t}^{*}\) would depict a random walk with drift \(\alpha\), Eq. (4) may be interpreted as implying that \({x}_{t}^{\left(\lambda \right)}\) contains a stochastic, random walk, trend with differing seasonal drifts, i.e., each month evolves as a random walk with its own drift.
Alternatively, suppose that the seasonal autoregressive polynomial contains a unit root:
$$\varPhi \left({B}^{12}\right)=\left(1-{B}^{12}\right)\left(1-{\varPhi }_{1}^{*}{B}^{12}-\dots -{\varPhi }_{P-1}^{*}{B}^{12\left(P-1\right)}\right)=\left(1-{B}^{12}\right){\varPhi }^{*}\left({B}^{12}\right)$$
.
Equation (1) then becomes, with \({\nabla }_{12}=\left(1-{B}^{12}\right)\) and\({u}_{t}^{†}={\nabla }_{12}{u}_{t}\)
$${\nabla }_{12}{x}_{t}^{\left(\lambda \right)}={\sum }_{i=1}^{12}\left({\alpha }_{i}{\nabla }_{12}{s}_{i,t}+{\beta }_{i}{\nabla }_{12}{s}_{i,t}t\right)+{u}_{t}^{†}$$
$$\phi \left(B\right){\varPhi }^{*}\left({B}^{12}\right){u}_{t}^{†}=\theta \left(B\right)\varTheta \left({B}^{12}\right)\hspace{0.33em}{a}_{t}$$
5
Now \({\nabla }_{12}{s}_{i,t}=0\) and \({\nabla }_{12}{s}_{i,t}t=12{s}_{i,t}\), so that (5) becomes
$${\nabla }_{12}{x}_{t}^{\left(\lambda \right)}=12{\sum }_{i=1}^{12}{\beta }_{i}{s}_{i,t}+{u}_{t}^{†}$$
6
and \({x}_{t}^{\left(\lambda \right)}\) now contains a stochastic seasonal random walk with differing seasonal drifts. If \(\varPhi \left({B}^{12}\right)=\varTheta \left({B}^{12}\right)\) then there is only deterministic seasonality (see Pierce, 1978, and Mills and Mills, 1992, for similar set-ups and additional analysis).
The transformation parameter \(\lambda\) may be estimated by maximum likelihood (ML) using the model (1) with the error assumed to be white noise, i.e., with \({u}_{t}={a}_{t}\): equivalently, by setting all lag polynomial orders to zero, \(p=q=P=Q=0\).[2] The ML estimate \(\widehat{\lambda }\) is obtained by maximising over \(\lambda\) the concentrated log-likelihood function
$$\mathcal{l}\left(\lambda \right)=-\left(\frac{T}{2}\right)\sum _{t=1}^{T}{\widehat{a}}_{t}^{2}\left(\lambda \right)+\left(\lambda -1\right){\sum }_{t=1}^{T}\text{l}\text{o}\text{g}{ x}_{t}$$
where \({\widehat{a}}_{t}^{2}\left(\lambda \right)\) is the residual from estimating (1). A confidence interval for \(\lambda\) can then be constructed using the standard result that \(2\left(\mathcal{l}\left(\widehat{\lambda }\right)-\mathcal{l}\left(\lambda \right)\right)\) is asymptotically distributed as \({\chi }^{2}\left(1\right)\), where \(\mathcal{l}\left(\widehat{\lambda }\right)\) is the maximised likelihood. A \(95\%\) confidence interval, for example, is then given by those values for which \(\mathcal{l}\left(\widehat{\lambda }\right)-\mathcal{l}\left(\lambda \right)<0.5{\chi }_{0.05}^{2}\left(1\right)=1.92\).
The ML estimate was found to be \(0.62\) with a \(95\%\) confidence interval of \(\left(0.54, 0.70\right)\).[3] Both \(\lambda =0\), the logarithmic transformation, and \(\lambda =1\), effectively no transformation, lie well outside this interval. The square root transformation, \(\lambda =0.5\), although it lies just outside this interval, has two advantages over the ML estimate. First, it has a much simpler interpretation and, second, it allows a closed form expression for the optimal forecast of rainfall itself, \({\widehat{x}}_{t}\), to be used, which is not the case for other values of \(\lambda\) such as \(0.62\). Granger and Newbold (1976) provide results that can be used to establish that, if \({y}_{t}\) is the predicted value of \({\sqrt{x}}_{t}\) from the regression (1) with \(\lambda =0.5\), and \({s}_{y}^{2}=T\sum _{t=1}^{T}{\widehat{a}}_{t}^{2}\left(0.5\right)\) is the accompanying residual variance, then the optimal rainfall forecast is given by
$${\widehat{x}}_{t}={y}_{t}^{2}+\frac{1}{2}{s}_{y}^{2}$$
Figure 2 shows the square root transformed U.K. rainfall series and the asymmetry appears to have been eradicated: indeed, standard tests for normality cannot reject this hypothesis.
Salient statistics from the regression of (1) with \(\lambda =0.5\) are shown in Table 1. Both hypotheses \({\alpha }_{1}=\cdots ={\alpha }_{12}=0\) and \({\beta }_{1}=\cdots ={\beta }_{12}=0\) are conclusively rejected, so that a seasonal deterministic (linear) trends representation is implied. A test of whether the set of quadratic trends \({s}_{1,t}{t}^{2},\cdots ,{s}_{12,t}{t}^{2}\) should also be included in the model fails to reject the hypothesis that their coefficients are jointly zero at the 0.26 marginal significance level. Furthermore, tests of residual autocorrelation cannot reject the assumption made above that \({u}_{t}={a}_{t}\), i.e., that the residuals are white noise so that there is no evidence of stochastic seasonality or, indeed, of any autocorrelated noise whatsoever. For example, a Lagrange Multiplier (LM) test for up to 12th order residual autocorrelation cannot reject the hypothesis of white noise at the marginal significance level of \(0.66\)(recall footnote 2).
Table 1
Summary statistics for estimation of (1) with \(\lambda =0.5\).
\({\alpha }_{1}=\cdots ={\alpha }_{12}=0\)
|
\({\beta }_{1}=\cdots ={\beta }_{12}=0\)
|
\({s}_{y}^{2}\)
|
\({R}^{2}\)
|
\({\chi }^{2}\left(12\right)=4642.7\)
|
\({\chi }^{2}\left(12\right)=52.9\)
|
\(2.894\)
|
\(0.237\)
|
Figure 3 shows the actual rainfall series with the predicted series \({\widehat{x}}_{t}\) superimposed. There is certainly a very stable seasonal pattern but any trends in this pattern are difficult to discern at this resolution of the figure. Table 2 enables a more detailed assessment of these trends to be made. As well as reporting the estimates of the seasonal intercept and trend coefficients \({\alpha }_{i}\) and \({\beta }_{i}\), the monthly predictions for 1836 and 2021, along with their differences, are also provided. The standard error attached to each of the seasonal intercept estimates is 0.25 so, not surprisingly, each of these are highly significant and precisely estimated. The standard error attached to each of the trend coefficients is 0.0002, so only the coefficients for January, February, March, May, November, and December are significantly different from zero (all being positive) at the 0.05 significance level (equivalently, we require a difference in predictions to be greater than 13.7 to be significant at this level). Figure 4depicts these predicted seasonal patterns graphically. Average annual predicted rainfall has increased from 83.2 mm in 1836 to 95.7 mm in 2021, but this increase of 12.5 mm has been concentrated in the winter months with little or no change in the mid-months of the year. The amplitude of the seasonal pattern, given by the difference between the maximum and minimum monthly predicted rainfall in a given year, has increased by 32% over the sample period, from 44.5 mm in 1836 to 58.6 mm in 2021. This increase in amplitude has coincided with a shift in the wettest month from October to December over the 185 years of the rainfall record.
Table 2
Estimated coefficients and predictions from Eq. (1); * denotes significant at 0.05 level.
|
\({\alpha }_{i}\)
|
\({\beta }_{i}\)
|
1836,\({\widehat{x}}_{t}\)
|
2021,\({\widehat{x}}_{t}\)
|
Difference
|
January
|
9.440
|
0.00066*
|
92.00
|
121.90
|
29.90*
|
February
|
8.364
|
0.00040*
|
72.78
|
88.54
|
15.76*
|
March
|
8.036
|
0.00055*
|
67.47
|
88.72
|
21.25*
|
April
|
7.636
|
0.00026
|
61.19
|
70.19
|
9.00
|
May
|
7.554
|
0.00042*
|
59.96
|
75.09
|
15.13*
|
June
|
8.285
|
0.00000
|
71.50
|
71.32
|
0.19
|
July
|
9.226
|
0.00020
|
87.96
|
80.00
|
7.96
|
August
|
9.639
|
0.00007
|
95.76
|
92.69
|
3.07
|
September
|
9.123
|
0.00020
|
86.12
|
94.24
|
8.12
|
October
|
10.322
|
0.00016
|
109.44
|
116.87
|
7.43
|
November
|
9.761
|
0.00048*
|
98.25
|
120.41
|
22.16*
|
December
|
9.657
|
0.00070*
|
96.28
|
128.79
|
32.51*
|
Average
|
|
|
83.23
|
95.74
|
12.51
|
Amplitude
|
|
|
44.48
|
58.60
|
|
Table 3 Regional estimates and statistics. Marginal probabilities in \(\left[.\right]\).
|
East Anglia
|
East,
North East
|
Midlands
|
\(\widehat{\lambda }\)
|
\(0.52\)
|
\(0.50\)
|
\(0.52\)
|
\({\alpha }_{i}=0\) for all \(i\)
|
\(3554.1 \left[0.000\right]\)
|
\(3850.9 \left[0.000\right]\)
|
\(3655.7 \left[0.000\right]\)
|
\({\beta }_{i}=0\) for all \(i\)
|
\(22.27 \left[0.035\right]\)
|
\(24.33 \left[0.018\right]\)
|
\(32.30 \left[0.001\right]\)
|
\({s}_{y}^{2}\)
|
\(3.113\)
|
\(3.223\)
|
\(3.582\)
|
\({R}^{2}\)
|
\(0.098\)
|
\(0.111\)
|
\(0.094\)
|
Quadratic trends
|
\(12.81 \left[0.383\right]\)
|
\(16.43 \left[0.172\right]\)
|
\(12.15 \left[0.433\right]\)
|
\(LM\)
|
\(11.73 \left[0.468\right]\)
|
\(13.09 \left[0.362\right]\)
|
\(10.85 \left[0.542\right]\)
|
|
North Wales,
North East
|
South East, Central
|
South Wales,
South West
|
\(\widehat{\lambda }\)
|
\(0.58\)
|
\(0.48\)
|
\(0.53\)
|
\({\alpha }_{i}=0\) for all \(i\)
|
\(3877.4 \left[0.000\right]\)
|
\(3187.8 \left[0.000\right]\)
|
\(3535.8 \left[0.000\right]\)
|
\({\beta }_{i}=0\) for all \(i\)
|
\(43.20 \left[0.000\right]\)
|
\(28.52 \left[0.005\right]\)
|
\(29 98 \left[0.003\right]\)
|
\({s}_{y}^{2}\)
|
\(4.975\)
|
\(4.564\)
|
\(5.742\)
|
\({R}^{2}\)
|
\(0.197\)
|
\(0.121\)
|
\(0.200\)
|
Quadratic trends
|
\(12.05 \left[0.442\right]\)
|
8.95\(\left[0.707\right]\)
|
9.28\(\left[0.678\right]\)
|
\(LM\)
|
\(11.09 \left[0.521\right]\)
|
\(12.19 \left[0.430\right]\)
|
7.77\(\left[0.803\right]\)
|
|
Northern Ireland
|
\(\widehat{\lambda }\)
|
\(0.66\)
|
\({\alpha }_{i}=0\) for all \(i\)
|
\(3096.6 \left[0.000\right]\)
|
\({\beta }_{i}=0\) for all \(i\)
|
\(39.05 \left[0.001\right]\)
|
\({s}_{y}^{2}\)
|
\(29.328\)
|
\({R}^{2}\)
|
\(0.167\)
|
Quadratic trends
|
\(13.24 \left[0.352\right]\)
|
\(LM\)
|
\(13.58 \left[0.328\right]\)
|
|
East Scotland
|
North Scotland
|
West Scotland
|
\(\widehat{\lambda }\)
|
\(0.53\)
|
\(0.50\)
|
\(0.57\)
|
\({\alpha }_{i}=0\) for all \(i\)
|
\(4164.9 \left[0.000\right]\)
|
\(4227.5 \left[0.000\right]\)
|
\(3888.7 \left[0.000\right]\)
|
\({\beta }_{i}=0\) for all \(i\)
|
\(19.36 \left[0.080\right]\)
|
\(78.24 \left[0.000\right]\)
|
\(90.35 \left[0.001\right]\)
|
\({s}_{y}^{2}\)
|
\(3.878\)
|
\(4.815\)
|
\(5.861\)
|
\({R}^{2}\)
|
\(0.178\)
|
\(0.291\)
|
\(0.263\)
|
Quadratic trends
|
23.5\(1 \left[0.024\right]\)
|
\(15.15 \left[0.233\right]\)
|
\(18.05 \left[0.114\right]\)
|
\(LM\)
|
\(10.58 \left[0.565\right]\)
|
24.04\(\left[0.020\right]\)
|
25.03\(\left[0.015\right]\)
|
2.2 Are there discernible cycles in rainfall?
Although ‘portmanteau’ tests detect no evidence of significant autocorrelation in the residuals from (1), a more detailed analysis of the autocorrelation structure of the prediction errors \({v}_{t}={x}_{t}-{\widehat{x}}_{t}\) may provide information on whether there is any evidence of cycles in rainfall greater than the seasonal period of 12 months. Figure 5 plots the sample autocorrelations of \({v}_{t}\) up to 72 lags (six years). While few are statistically significant, there might be an indication of some weak cyclical pattern in this correlogram.
To investigate further, a smoothed periodogram of \({v}_{t}\) was computed and is shown in Fig. 6, for frequencies up to the seasonal frequency of \(1/12=0.0833\) radians (a Bartlett, 1950, kernel was employed with a bandwidth of 50 in the computation). This shows weak evidence of a broad peak in the range of roughly 0.3–0.4 radians, or approximately 24 to 36 months, but no evidence of longer cycles, such as those associated with the 11-year sunspot cycle, for example. This is consistent with a cycle of 2.1–2.4 years found using spectral and filter analysis by Tabony (1979) for an earlier long rainfall record for England and Wales.
A band-pass filter that passes all cycles between these lower and upper bounds was used to extract a cyclical component from \({v}_{t}\). The filter used was the full sample asymmetric filter of Christiano and Fitzgerald (2003), which has the important advantages here that a cycle may be extracted for all observations, rather than some being lost at the start and, more relevantly, at the end of the rainfall record, and that the moving average weights used in constructing the filter are not fixed but depend on \(t\), so that the cycle adjusts to local data conditions, again important here given the length of the series, 185 years.
Figure 7 shows \({v}_{t}\) and the extracted cycle \({c}_{t}\). This cycle has a period of, on average, 28 months and a fluctuating amplitude ranging from a minimum of 1.4 mm to a maximum of 34.1 mm. However, these cyclical fluctuations are very small compared to the fluctuations in the prediction error, the standard deviations of the two series being 5.5 mm and 31.2 mm respectively. Consequently, the rainfall ‘pattern’, \({\widehat{x}}_{t}+{c}_{t}\), shown with observed rainfall, \({x}_{t}\), in Fig. 8, is essentially just a ‘jittered’ version of \({\widehat{x}}_{t}\).