Improved complete ensemble empirical mode decompositions with adaptive noise of global, hemispherical and tropical temperature anomalies, 1850–2021

ICEEMDAN, a variant of Empirical Mode Decomposition (EMD), is used to extract temperature cycles with periods from half a year to multiple decades from the HadCRUT5 global temperature anomaly data. The residual indicates an overall warming trend. The analysis is repeated for the Southern and Northern Hemispheres as well as the Tropics, defined as areas lying at or below 30 degrees of latitude. Multiannual cycles explain the apparently anomalous pause in global warming starting around 2000. The previously identified multidecadal cycle is found to be the most energetic and to account for recent global warming acceleration, beginning around 1993. This cycle’s amplitude is found to be more variable than by previous work. Moreover, this variability varies by latitude. Sea ice loss acceleration is proposed as an explanation for global warming acceleration.


Introduction
Long-term variation in global temperatures is a well-known phenomenon. Improved Complete Ensemble Empirical Mode Decomposition with Adaptive Noise (ICEEMDAN, Colominas et al. 2014), a variant of the Empirical Mode Decomposition (EMD, Huang et al. 1998) decomposes time series of temperature anomalies into Intrinsic Mode Functions (IMFs) representing noise, cyles of different and possibly nonconstant frequencies and amplitudes and a residual. The last estimates the trend in temperature anomalies. Colominas et al. (2014) developed ICEEMDAN as an improvement on prior EMD variants to more accurately reproduce the input signal and to reduce the remaining residual noise. They show that ICEEMDAN extracts signals more faithfully and with less residual noise than Ensemble Empirical Mode Decomposition (EEMD) . 1 The temperature anomaly data come from the Met Office Hadley Centre HadCRUT5 infilled observation datasets (Morice et al. 2021). The data are for months between January 1850 and December 2019. For each 5 • by 5 • cell of the Earth's surface, the average temperature, 1961-1990, is computed. A monthly time series of temperature anomalies is constructed for each cell by subtracting the 1961-1990 average from the monthly estimated values. The averaged series are the averages of the anomalies for the area of interest, weighted by surface area. All available averaged series are used: Global and Northern and Southern Hemispheres. In addition, a tabulation was obtained from the Met Office Hadley Centre for the Tropics, defined as latitudes between 30 • N and 30 • S. The mean for each series is used as the measure of temperature.

Previous research
Previous research has followed four approaches. The first has concentrated on removing noise by smoothing data. The popularly displayed annual averages of Lindsey and Dahlman (2020) and others are simply the averages of all months within a calendar year. Not only the choice of periods to smooth is arbitrary, the smoothed data prevent identification of biennial, annual and multiannual cycles. Moreover, identification of multiannual cycles is hampered by preventing the contributions of individual months from being identified. Variations in the timing of cycles with periods of a few years can result in their nonidentification. Hansen et al. (2006) avoided this by focusing on the changes over time and not attempting any decompositions or forecasts. Hansen, Sato and Ruedy (2013) similarly use annual averages in an analysis of climatic forcing. Hansen and Sato (2021) use a linear trend to identify putative recent global warming acceleration. A 21-year weighted moving average has happily been discontinued from the Internet.
The second approach is regression analysis. Foster and Rahmstorf (2011) and Zhou and Tung (2013) use linear regression on global data to obtain linear trends after controlling for forcing variables. The obvious criticisms are that the trends are not necessarily linear, the forcing variables may not have linear effects, missing variables may be present and that the time series structure is not used in any way. Lean and Rind (2008) partition the Earth's surface into cells, then run regressions within each cell and combine results. This approach suffers from not explicitly incorporating the spatial structure in what is really a spatial panel model. An additional weakness of regression is that statistical significance does not imply practical significance (Ziliak and McCloskey 2004;McCloskey and Ziliak 2008). When a variable lacks practical significance, controlling for it has no practical effect on regression fit. Turner et al. (2005) use regression analysis on monthly Antarctic temperature and wind speed data to obtain linear trends. This approach has the defects of assuming linearity and not accounting for time series structure. Thus, their results can, at best, only be interpreted qualitatively.
The third approach uses wavelet analysis. A full description of wavelet analysis is beyond the scope of the present paper. A short description is that a basis function is chosen that, in turn, generates other basis functions that are used to form a wavelet representation of a signal. The exact representation depends on the choice of initial basis function. Recovering amplitude and frequency information is mathematically complicated. Lau and Weng (1999), Silva et al. (2018) and Yang et al. (2015) are examples of applying wavelet analysis to monthly temperature data. The starting dates of these analyses, 1884 (Silva et al. 2018) and 1955(Lau and Weng 1999Yang et al. (2015), show an important limitation: these analyses do not use the full time series because of the initial noise, as described in Section 4. Moreover, the trends are linear, a constraint that EMD lacks.
Empirical Orthogonal Functions (EOFs) appear to have fallen out of favor in climatic research. EOFs are the principal components of spatiotemporal data (Björnsson and Venegas 2000). The components are called "modes of variability." Their problem lies in their being "primarily data modes, and not necessarily physical modes" (Björnsson and Venegas 2000, p. 5, original emphasis). Without physical knowledge, they provide little information about physical phenomena. Examples of their application to climatic data, including global warming, include Björnsson and Venegas (2000), Bretherton et al. (1999), Feldstein (2002) and Wang and Mehta (2008).
The last approach uses the data-driven Empirical Mode Decomposition (EMD) or Ensemble Empirical Mode Decomposition (EEMD) ) and seems to be the most popular recently. Section 3 delves into the technical details. Huang et al. (2009) appear to have been the first. They use EMD to remove noise from monthly global temperature anomaly series to derive annual series. Wu et al. (2011) essentially repeat the analysis using EEMD and identify a nearly regular multidecadal cycle. Franzke (2010) uses EEMD to remove noise from Antarctic temperature series to identify trends. Shi et al. (2011) and Xing et al. (2006) apply EEMD to tree ring records. Qian (2015) uses EEMD to remove noise from Shanghai, China temperature extreme series to identify the effects of urbanization on them. Yang et al. (2011) apply EMD to air temperature observations at Nanjing, China to find no detectable solar-driven variability, which the present paper confirms globally. Mukherjee et al. (2014) apply EEMD to daily Indian monsoon rain totals. Similarly, Sabzehee et al. (2019) analyze Caspian Sea catchment rain totals. Huang et al. (1998) introduced the Empirical Mode Decomposition (EMD) as an adaptive, data-driven method to completely decompose time series using the Hilbert Transform. The Hilbert Transform is a more general version of the Fourier Transform, decomposing a time series into the sum of series of the form where a j is the amplitude of j , j is its possibly time-varying period and is its phase shift. The true j are the modes of the input signal. The estimates ̂j are Intrinsic Mode Functions (IMFs) which should satisfy the condition that the number of extrema should differ from the number of zero-crossings by 0 or 1. Given a residue r j , with r i 0 = r 0 0 = r 0 , the initial residue equal to the input signal, EMD proceeds to sift r j to produce IMF j+1 and r j+1 by first constructing upper and lower envelopes by interpolating the local maxima and minima, respectively, then

Empirical mode decomposition
subtracting their local means from r i j to obtain h i j . If h i j is an IMF, then it is output as IMF j+1 and r j+1 is set equal to r i+1 j = r i j − h i j . Otherwise, the algorithm repeats, incrementing i to use r i+1 j and iterated until an IMF is produced or a stopping criterion is reached, in which case a defective IMF is output. IMFs are subsequently output until r j+1 has 2 or 3 extrema, in which case it is output as the residual trend. EMD ideally outputs IMFs in order of increasing period or, equivalently, decreasing frequency.
EMD is well-known for mode-mixing (outputting a single IMF for multiple j ), mode-splitting (outputting multiple IMFs for a single j ) and producing spurious IMFs. Several methods have been developed to remedy this, including EEMD, CEEMDAN (Torres et al. 2011), ICEEMDAN and MAEMD (Deering and Kaiser 2005). These methods are all ensemble methods that add a function to multiple copies of the input, run EMD on each of the new series, then average the outputs. This enables them to reduce EMD's modemixing and mode-splitting (Huang et al. 1998). The first two add white noise, ICEEMDAN adds IMFs derived from white noise and MAEMD adds and subtracts a masking sinusoid. All average the results of their decompositions. Ensemble methods have the further advantage of being able to separate the noise which EMD lacks (Kim et al. 2012). Ensemble methods are not guaranteed to produce proper IMFs because the average of IMFs is not necessarily an IMF (Steven Sandoval personal communication). All EMD methods are subject to outputting residual IMFs due to possible nonorthogonality of the IMFs that represent the input signal. The summed IMFs are subtracted from the temperature anomaly input to obtain the temperature trend, with the obvious interpretation. Colominas et al. (2014) showed that ICEEMDAN outputs IMFs in decreasing order of frequency with fewer residual IMFs than EMD, EEMD and CEEMDAN and does not output residual IMFs before outputting all informative IMFs. ICEEMDAN is run with 10,000 ensemble members and the default SNR of 0.2. The number of ensemble members was empirically determined to provide stable decompositions. The number of IMFs was set to 8 to avoid residual IMFs and to include the residual ninth IMF in the residual trend.
EMD has the further advantage of being applicable to any type of time series. Fourier series have the tightest restrictions: linearity and stationarity. Wavelets permit nonstationarity but require linearity. Fourier series and wavelets require a priori bases, while EMD is adaptive. EMD is chosen to minimize assumptions.

Results
This Section presents selected graphs illustrating the temperature anomaly decompositions and provides some interpretations. The decompositions were performed for all downloaded series. R (2020) codes and undisplayed graphs are in the Supplemental materials. Additionally, for each decomposition, the Hilbert spectrum, the time-frequency-amplitude spectrum associated with each IMF j , H j ( , t) , is defined as Frequencies and amplitudes are displayed separately. Negative frequencies appear occasionally as a result of a violation of the IMF condition. For example, a trough may occur at the expected time but not cross zero (that is, remain positive). In the absence of a well-founded interpretation, these should be ignored. They are only reported for IMF 1 for completeness. In other cases, the signal phase is adjusted to prevent negative frequencies, resulting in sharp frequency peaks. As necessary, the Marginal Hilbert Spectrum for an IMF is calculated as where I is the indicator function and the limits have the format year:numeric month. The final, presented h j ( ) is then obtained by applying the Epannechnikov kernel smoother to h j ( ) for all > 0 . The smoothing turns the discontinuous h j ( ) into a continuous, more interpretable function. The final result displays amplitude as a function of frequency, similar to a Fourier Spectrum. Only the modes are analyzed, as they are invariant to expression by frequency or period, unlike averages. They also have the interpretation of being spectral peaks.
A final concept used is the energy or power of a signal. For a signal y t , its energy is the time-averaged integral of its squared amplitude: where T is the length of y t . Since an IMF is centered, its energy is equal to its variance.
A full analysis is provided only for the global data. Decompositions and select series are provided for the other datasets. Figure 1 displays the global mean temperature anomalies for 1850-2021. Several things are readily apparent. Average temperature is rising throughout the period, with sustained declines during the approximate periods 1880-1910, 1940-1970 and 2000-2010. The series is particularly noisy before 1900. The reduction of noise over

Decompositions
(3) time, especially during the satellite era, reflects better measurements.

Global
Figure 2 displays the ICEEMDAN decomposition of median global temperature anomalies. IMF 1 estimates the noise. IMFs 2-8 estimate the respective j , in descending order of frequency. IMF 1's amplitude is particularly high before around 1890. Figure 3 shows that IMF 1's amplitude rose to a sustained peak around the 1870s. The additional noise in IMF 1 spills over into the other IMFs, especially IMFs 2 and 3, which show amplitude peaks coinciding with IMF 1's early peak. Figure 4 displays the frequencies.
IMF 1 is clearly the noise mode with its greatest variation in frequency, ranging from near 0 to near 6, the Nyquist frequency. IMFs 2 and 3 show clustering around 2 and 1 cycle(s) per year: these are the semiannual and annual IMFs. IMFs 4-8 have frequencies of less than 1 per year. To better understand the frequencies, Fig. 5 shows the periods, the reciprocals of the frequencies, for IMFs 4-8. IMF 4 is dominated by 2 year periods. IMF 5's period fluctuates between 1 and over 20 years. IMF 6 shows a peak period of over 3300 years in 2008 in the middle of a surge from 2006 to 2011. This corresponds to a period of slowing, than decreasing decline in IMF 6. IMF 7's period generally varies between 10 and 20 with increases to around 60 years and declines below 5 years. IMF 8's period generally lies between 50 and 90 years, with an increase to almost 500 in 1992. While the subsequent decline is largely explained by global warming acceleration, discussed below, its onset before acceleration is difficult to understand. Figure 6 shows the smoothed Marginal Hilbert Spectra for IMFs 5-8. While this figure confirms that frequencies are decreasing, it is otherwise hard to interpret. Figure 7 displays the inverted smooothed Marginal Hilbert Spectra. The horizontal axis has the natural interpretation of being the period. IMF 5 shows a modal period of 7 years, with a positively skewed spread of 3-17 years. IMF 6's mode is 9 years, with a wider, similarly skewed spread of 7-25 years.
Its uppermost year is well to the right. IMF 7 shows the most variability, with its main mode at 16-17 years, a nearly equal mode at 22 years, an additional mode at 37 years and a minor mode at around 58 years. IMF 8's period peaks at 71 years, with a small skewness of -0.13 for the periods displayed. No IMF corresponds to the 11-year solar cycle. Its energy is too low to distinguish it from the noise. Table 1 shows that the multidecadal IMF 8 has the greatest energy. This, and its timing, is consistent with Wu et al. (2011) with the exception that its amplitude is even more variable. Moreover, IMF 8 accelerates beginning in 1993, which will be explored more in Section 4.2.

Regional
Figures 8, 9 and 10 display the monthly average temperature anomalies and their decompositions for the Northern Hemisphere, Southern Hemisphere and Tropics, respectively. With the exception of IMF 8's amplitudes, as explained in Section 4.2, they are generally similar to the Global decomposition. Table 2 shows the trend increases in the temperature anomalies, globally and regionally. It shows two effects. First, warming is greater at higher latitudes as shown by the greater temperature increase globally compared to the Tropics when they have approximately the same share of land area: 29.2% globally and 28.6% in the Tropics. 2 Excluding ice-covered surfaces from the calculation only increases this effect. Second, greater surface land area increases warming. Land covers 29.3% of the Northern Hemisphere compared to 19.1% of the Southern Hemisphere. Again, excluding ice-covered surfaces increases this latter effect.

Global warming acceleration and hiatus
By decomposing temperatures into their constituent modes we can obtain insights into observed phenomena and  We find that this warming began around 1993, which is not apparent from their graphs that show a recent, possibly temporary, increase above a linear trend. We posit that accelerating sea ice decline is the cause of global warming acceleration. We also find that the Global Warming Hiatus that first appeared in the media and Internet (Easterling and Wehner 2009) and was subsequently analyzed by many is at least mostly a mirage caused by the confluence of multiannual cycles.   Figure 11 displays IMF 8 globally and for each region studied. Close examination shows a recent acceleration in the global cycle, which is harder to discern regionally. Figure 12 shows the derivative of each corresponding IMF 8 with respect to time. In each geography, IMF 8 is sinusoidal with varying amplitudes and a roughly 70-year modal period. However, the first peak corresponds to a flatter, declining period in the Tropics. The derivative of global IMF 8 presents a point of inflection in 1993, which leads to a slowing of the rate of temperature increase, then accelerating increase. The regional graphs are more subtle. Each derivative of IMF 8 is sinusoidal (except, of course, the Tropics before around 1900) until 1993 when, outside of the Southern Hemisphere, they turn into roughly straight lines above the expected continuations of the sinusoids. The Northern Hemisphere has an accelerating temperature increase, while the Tropics has a decelerating decrease. These are all equivalent to global temperature warming acceleration. The weakness or absence of the acceleration in the Southern Hemisphere may indicate more specifically that it is the decline in Northern sea ice that is driving acceleration. Hansen and Sato (2021) extrapolate a linear approximation of a 132 running mean of global temperatures, 1970-2015, to find that temperatures afterwards are above expectation. They conclude that global warming acceleration began in 2015 based on Loeb et al.'s (2021) interpretation of CERES (Clouds and the Earth's Radiant Energy System) satellite data. Instead, we propose that increasing loss of sea ice has been causing global warming acceleration. Sea ice is an excellent candidate because ice reflects far more sunlight than seawater. Thus, its loss increases global warming beyond that caused by the initial forcing causing loss (Dai et al. 2019). According to National Snow and Ice Data Center (2020) Figure "Mean sea ice anomalies, 1953-2018,"  Northern Hemispheric sea ice started declining around 1988 with evidence of acceleration. Figure 13 shows the sum of the multiannual median IMFs 5-8, and residual trend. The cooling period during the 2000s now appears. It is clear that this global warming hiatus, originally analyzed by Easterling and Wehner (2009), occurred as a result of the multiannual cycles corresponding to IMFs 5-7 being in declining phases, even while IMF 8, the most energetic IMF, was increasing. When these cycles resumed increasing, the cooling period ended. IMF 7 peaks around 2000, though it alone is not enough to account for the pause due to its low amplitude.

Discussion
We have used ICEEMDAN to decompose the Met Office Hadley Centre's median monthly temperature anomaly series into noise, cycles and a residual trend. Our most important finding is global warming acceleration beginning around 1993, 3 The global cooling hiatus of the early 2000s is coincidental, being the result of cyclic downturns. Global warming acceleration is found in the multidecadal cycle. We hypothesize that this is due to accelerating sea ice loss, which is documented to have begun around 1988, two decades earlier than the start of Hansen and Sato's (2021) claim. This is supported by Hugonnet et al. (2021), who find that global glacier ice mass loss accelerated during 2000-2019, providing confirmation of global warming acceleration during this period. 4 These three accelerations suggest some sort of linkage. Hansen and Sato (2021) use linear regression to produce a smooth trend for the global average temperature anomaly, 1970-2015. 5 This is based on the 132-month running mean, which appears close to linear, during 1970-2015. They then find that the global temperature anomaly after 2015 is completely above this trend. They hypothesize that this is due to increased atmospheric aerosols increasing temperature forcing, leading to accelerated global warming. However, lacking and destroying information about temperature cycles, they cannot determine whether they are truly observing acceleration, a temporary or permanent change in trend or an anticipable peak in an underlying cycle. Their inability to precisely identify what they found precludes policy prescriptions. Instead, our evidence provides actionable information: Sea ice restoration should be a part of global warming mitigation. IMF 8's early Tropical nonappearance and early weak appearance at other latitudes are worthy of investigation.
The global cooling pause has a simple explanation in decreases in global multiannual cycles, which can be seen individually in IMFs 5-8 in Fig. 2 and in their sum plus the residual trend in Fig. 13. Each of these IMFs has a particularly low trough after 2000, consistent with sporadic negative forcings. Ridley et al. (2014) provide evidence for these forcings in the form of increased and variable stratospheric volcanic aerosols. Coincidence with a modal trough can lower that trough. Militating against this is the accelerating global glacier ice loss during this period. The increased CO 2 uptake of Keenan et al. (2016) and the similar increased photosythesis hypothesis of Leggett and Ball (2015) have to be rejected because these would have been reflected in sustained decreases in low frequency IMFs in the ICEEMDAN decompositions.
The first three cycles have clear physical interpretations. The semiannual and annual cycles (IMFs 2 and 3) capture the seasonal variations in average temperature driven by changes in absorbed insolation. The biennal cycle (IMF 4) reflects the Quasi-Biennial Oscillation possibly interacted with other phenomena with approximately biennal cycles.
Pooling observations, in this case, 5 • by 5 • latitude-longitude cells, reduces the relative noise, thus increasing identifiability of cycles and trends. As is visible in the early decades of the series, noise can spillover into high frequency cycles. As the noise increases, increasingly lower frequency cycles can become unidentifiable. Moreover, except for the very lowest frequency cycle, the lowest frequencies have the least energy per Table 1. Their low energies are additional impediments to their identification in the presence of noise. Thus, EMD and its derivatives have to be used on temperature series that have pooled enough observations to reduce noise to a manageable level. It may be possible to use a spatial generalization of EMD on a grid, provided that the spatial structure enables canceling enough noise to improve identifiability. Coarser grids may accomplish this at the potential risk of producing too little spatial detail. An improved version of the spatial EMD of Fauchereau et al. (2008) may be able to do this. A requirement is the ability to draw strength across space to reduce noise.

Conclusions and extensions
Understanding multiannual temperature cycles can shed new light on the climate in general. The global warming acceleration that began in 1993 is only visible in the 4 Glacier ice mass loss is linearly proportionate to local temperature increase (Hugonnet et al. 2021). Accelerating loss can only be caused by accelerating increase. multidecadal cycle. This acceleration is proposed to be due to accelerating sea ice loss beginning around 1988, particularly in the Northern Hemisphere. This improves on Hansen and Sato's (2021) claim of recent global warming acceleration based on extrapolating a trend. By using cyclical information, we have avoided the biases from not accounting for cyclical information when forecasting. Moreover, we have pinpointed, within estimation variation, the beginning of this acceleration. The strongest evidence of global warming acceleration lies in accelerating global glacial ice mass loss during 2000-2019. Confirmatory research is needed to verify that sea ice loss has indeed been accelerating and to fully incorporate Southern Hemispheric sea ice loss into explanations of global temperature warming acceleration. The policy implication is clear: sea ice restoration is a necessary part of global warming mitigation. The Global Warming Hiatus has been shown to be, at most, the effect of volcanic forcings. It may very well have been a mirage. Again, this was only made possible by decomposing global temperature changes into their underlying trend and cycles. The Hiatus had minimal, if any, effect on global glacial ice mass loss. Accelerating global glacial ice loss provides evidence against the hiatus.
ICEEMDAN, the technique we used, is superior to EMD and EEMD, the previous EMD-based methods to analyze global temperature changes due to its ability to handle noise, output informative IMFs in decreasing order of frequency and reduction of residual IMFs. All of these techniques share the advantages of being data-driven, having minimal assumptions and being applicable to almost any kind of time series. In particular, they do not require assuming particular functional forms for the cycles or trends. They can be used to improve climate models by identifying temperature and other cycles with variable amplitudes and frequencies. Even the estimated noise can inform these models. It may be possible to develop a spatial method that accounts for spatial correlations and draws strength across space to provide local EMD-style decompositions. 6 These can provide local information to better improve climatic understanding and inform climate models.