Real-Time ENSO Forecast Skill Evaluated Over the Last Two Decades, with Focus on Onset of ENSO Events

doi:10.21203/rs.3.rs-3588191/v1

This work is licensed under a CC BY 4.0 License

This paper provides an updated assessment of the International Research Institute for Climate and Society's (IRI) El Niño Southern Oscillation (ENSO) Predictions Plume. We evaluate 247 real-time forecasts of the Niño 3.4 index from February 2002 to August 2022 and examine the multi-model means of dynamical (DYN) and statistical (STAT) models separately. Forecast skill diminishes as lead time increases in both DYN and STAT forecasts, with peak accuracy occurring after the northern hemisphere spring predictability barrier and in the preceding seasons. The DYN forecasts outperform the STAT forecasts, with a pronounced advantage in forecasts initiated from late boreal winter through spring. The analysis uncovers an asymmetry in predicting the onset of cold and warm ENSO episodes: warm episode onsets are forecast better than cold onsets in both DYN and STAT models. The DYN forecasts are found to be valuable for predicting warm and cold ENSO episode onsets several months in advance, while the STAT forecasts are less informative about ENSO phase transitions.

Keywords: IRI, ENSO, El Niño, La Niña, Onset, Triple-Dip

Since February 2002, the International Research Institute for Climate and Society (IRI) has been issuing monthly El Niño Southern Oscillation (ENSO) forecasts in both deterministic and probabilistic formats. These forecasts, produced by global producing centers using various dynamical and statistical models, extend over the next nine overlapping 3-month seasons (collectively known as the IRI ENSO Predictions Plume) and provide valuable insight into the state of ENSO. The identification of the ENSO state relies on the Niño3.4 index, which tracks the progression of sea surface temperature (SST) anomalies in the central-eastern equatorial Pacific region spanning from 5°S to 5°N and 170°W to 120°W. The objective of this paper is to reassess the IRI ENSO Predictions Plume over the period from February 2002 to August 2022, encompassing a total of 247 forecasts. The paper reevaluates the performance of the multi-model means of dynamical and statistical real-time predictions issued over 20 years, updating earlier analyses that were limited to the 2002–2011 period^1,2, with a focus on the ability of dynamical and statistical models to predict the onset of ENSO.

Figure 1 illustrates the IRI ENSO Predictions Plume released in mid-August 2022 (the last issued forecast considered in this study). The plume encompasses the forecast period from Aug-Oct 2022 through Apr-Jun 2023, showing predictions from various dynamical and statistical models. It also includes the multi-model mean (thick solid lines), which combines the predictions from both types of models. The IRI ENSO Predictions Plume is distinctive in two key respects. First, it provides real-time ENSO forecasts issued every month from many dynamical and statistical models. Second, it keeps a track record of the models and forecasts, together with a forecast discussion issued on the 19th of each month. The IRI website (https://iri.columbia.edu/our-expertise/climate/forecasts/enso/current/) hosts archived forecast data and discussions, enabling users to review past forecasts retrospectively. This extensive compilation of real-time forecasts from diverse models, along with the accompanying discussions, serves as a crucial resource, reinforcing the reliability of the plume for real-time ENSO predictions.

The real-time prediction data from the IRI ENSO plume have been assessed in two previous studies^1,2. Barnston et al.¹ assessed the real-time prediction capabilities of 20 models individually. The evaluation used a dataset spanning 9 years, from February 2002 to March 2011. They found that the dynamical models exhibited a statistically significant advantage over the statistical models when predicting ENSO events during the months of March to May. This period is particularly challenging due to the presence of the northern hemisphere spring predictability barrier^3,4,5,6,7,8. The superior performance of the dynamical models was attributed to several factors. First, these models employed fully coupled ocean-atmosphere prediction systems, which allowed for a comprehensive representation of the complex interactions between the two components. In addition, most dynamical models utilized high spatial resolution, advanced physical parameterizations, and data assimilation systems for model initialization. In contrast, the statistical models relied on a longer historical data record to establish predictor-predictand relationships. The temporal resolution is coarser in statistical models than in dynamical models, and many statistical models use monthly or seasonally averaged predictors, meaning they cannot take into account a westerly wind burst that occurs over the span of a week or two. Specifically, the crucial subsurface ocean information from the tropical Pacific Ocean, which was not readily available before the 1990s, was missing in statistical models⁹. Tippett et al.², in turn, undertook a verification analysis using the same dataset, but evaluated the multi-model mean predictions of the 12 dynamical models and of the 8 statistical models, each group averaged together. They found that the mean forecast of dynamical models exhibited a slight, statistically insignificant advantage over the mean forecast of statistical models. 
Additionally, they reported that the mean forecast of dynamical models displayed overall larger anomalies but demonstrated comparable errors to those of statistical models. These two studies provide a thorough evaluation of real-time prediction data from the IRI ENSO predictions plume, but their findings are limited due to the dataset's short duration. El Niño (warm) and La Niña (cold) episodes represent the two phases of the ENSO, with an ENSO-neutral condition (neither El Niño nor La Niña) occurring between them. The ENSO is the dominant force behind year-to-year climate fluctuations, with the capacity to influence weather patterns worldwide, potentially leading to extreme climatic events and hazardous conditions in various regions^{10,11,12,13,14,15,16,17}. Given that users often rely on real-time forecasts for decision-making and planning purposes, a reevaluation of their performance using a longer data record is desirable.

This study capitalizes on an expanded dataset of 247 forecasts over two decades, more than double the 110 forecasts used in the earlier studies^1,2. Given the much larger sample size, this study also examines the plume's capacity to predict the initiation of the seven cold and seven warm ENSO events that occurred, based on the multi-model mean of dynamical and statistical models.
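The forecast counts quoted above follow directly from the monthly issuance schedule. The short sketch below reproduces them, assuming one forecast per calendar month with both end months included; the helper name `months_inclusive` is hypothetical and used only for illustration.

```python
def months_inclusive(start, end):
    """Number of monthly issues from start to end, both given as (year, month), inclusive."""
    (y0, m0), (y1, m1) = start, end
    return (y1 - y0) * 12 + (m1 - m0) + 1

# Period analysed in this study: February 2002 to August 2022.
current_study = months_inclusive((2002, 2), (2022, 8))
# Period analysed in the two earlier studies: February 2002 to March 2011.
earlier_studies = months_inclusive((2002, 2), (2011, 3))

print(current_study, earlier_studies)  # 247 110
```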

The rest of the manuscript is arranged as follows. The "Data, Models and Methods" section outlines the observed and model datasets used and the methods employed in the analysis. The "Results" section reviews the observed characteristics of ENSO during the period (May-Jul 2001 to Apr-Jun 2023), followed by the skill and error analysis for the real-time forecast period (Feb 2002 to Aug 2022), and the performance of the dynamical and statistical multi-model means for the onset of the ENSO events (7 warm and 7 cold episodes) during the 20-year period. The "Summary and Discussion" section provides a concise overview and analysis of the significant findings and challenges encountered in this study.

a. ENSO Characteristics during the Last Two Decades and General Comparison of DYN and STAT ENSO Forecasts

Figure 2 illustrates the evolution of ENSO over the last two decades (May-Jul 2001 to Apr-Jun 2023, a total of 264 overlapping 3-month seasons). Indicators of the start and end of an event are shown in green and red (in large bold font), respectively. Over the past two decades, La Niña, the cold phase of ENSO, has become more prevalent (32% of the total). The increasing frequency of La Niña events in the last two decades may be attributed to a persistent negative phase of the Pacific Decadal Oscillation (PDO) since early 2000^18,19, the anomalous Indo-Pacific Warm Pool warming^20,21, and a stronger Walker Circulation²². A total of seven cold events (85 seasons) were observed during this period, including the double-dip La Niña of 2010-11 and 2011-12, and the triple-dip La Niña that began in Jul-to-Sep 2020 and continued until Dec-to-Feb (DJF) 2022-23, with a brief break during May-to-Jul and Jun-to-Aug of 2021. During a double- or triple-dip La Niña/El Niño, the event persists through two or three successive boreal winters (see the DJF seasons marked with asterisks in Fig. 2). Additionally, the past two decades have been marked by seven warm ENSO episodes (23% of the total, or 61 seasons), including the back-to-back El Niño of 2014-15 and 2015-16, and the El Niño of 2023, ongoing at the time of writing.
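As a quick arithmetic check of the season counts and percentages quoted above, the sketch below counts overlapping 3-month seasons by their center month. The helper name and the center-month convention are assumptions made for illustration only.

```python
def count_overlapping_seasons(first_center, last_center):
    """Count overlapping 3-month seasons, each identified by the (year, month) of its center month."""
    (y0, m0), (y1, m1) = first_center, last_center
    return (y1 - y0) * 12 + (m1 - m0) + 1

# May-Jul 2001 is centered on June 2001; Apr-Jun 2023 is centered on May 2023.
total = count_overlapping_seasons((2001, 6), (2023, 5))

cold_share = 85 / total  # seasons within the seven cold events
warm_share = 61 / total  # seasons within the seven warm events

print(total, f"{cold_share:.0%}", f"{warm_share:.0%}")  # 264 32% 23%
```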

Supplementary Fig. S1 illustrates time series comparing the forecasts derived from the multi-model mean of dynamical (DYN) and statistical (STAT) models to the observed data. This figure serves as an illustrative representation of the core observational and forecasts datasets analyzed in this study. The predictions are depicted as 9-month trajectories starting from each calendar month (starting in Feb, 2002) and extending to the maximum lead time (nine 3-month overlapping seasons). The trajectory plots demonstrate that the predicted anomalies generally (though there are few exceptions) follow the subsequently observed ones, but this agreement weakens as the lead time increases. Both model types demonstrated reasonably accurate predictions for certain ENSO events, particularly the stronger episodes, even when considering longer lead times. Both DYN and STAT models exhibited a tendency to underestimate the amplitudes of the warm events in 2002-03, 2009-10, and 2015-16, especially when the forecasting began in earlier years. This pattern of underprediction is in line with findings from previous studies²³. Moreover, the DYN models were much too cold in late 2003, as they forecasted colder conditions in the May, June, and July 2003 initialized forecasts following the El Niño of 2002-03 (Fig. S1a). Overall, STAT forecasts exhibit an intrinsic characteristic of relatively diminished amplitude as these models aim to minimize squared errors, resulting in a dampening effect on the intensity of sea surface temperature warming and cooling during El Niño and La Niña episodes (Fig. S1c and S1d). Additionally, their reliance on seasonal mean predictor data restricts their ability to detect very recent changes in observed conditions. 
In contrast, DYN forecasts initiate their computation by incorporating the most up-to-date observations, encompassing surface and subsurface oceanic variables as well as atmospheric variables that extend into the upper atmosphere. This enables DYN forecasts to capture more immediate changes in both the ocean and the atmosphere, which are especially important during the ENSO transition phase.

b. Skill and Error during Two Decades

Figure 3 offers a detailed illustration of the skill that varies for each calendar start month and 9 leads in both DYN (Fig. 3a) and STAT (Fig. 3b) forecasts. The forecasts generated at longer lead times generally exhibit lower levels of skill compared to forecasts at shorter leads in both DYN and STAT. Specifically, upon analyzing the STAT (Fig. 3b) forecasts, it becomes evident that the level of skill is significantly low and experiences a rapid decline for forecasts initiated in the early calendar months (January to May) as compared to DYN (Fig. 3a) forecasts. In STAT forecasts, this decrease in skill is particularly noticeable even in the shortest lead time for the initial conditions starting in April and May, as compared to the other months. The initial (Lead-1) skill value for forecasts initiated in April and May is notably lower than that of the other months, with correlation coefficients of 0.76 and 0.73, respectively in STAT forecasts (Fig. 3b). This period is particularly challenging due to the presence of the northern hemisphere spring predictability barrier^24,25,26. Furthermore, this period coincides with the ENSO transitioning, which means that these diminished skill levels may also have consequences for the crucial information pertaining to the ENSO transition phase^27,28,29,30. However, as we progress towards forecasts initiated in the middle calendar months (July, August, and September), the skill remains relatively higher and stable in both DYN and STAT forecasts^31,32. Furthermore, this higher skill level persists even as the lead times become longer. On the other hand, for forecasts initiated in the late calendar months (October, November, and December), correlation values are notably higher during the shorter lead times (up to lead 5). 
Nevertheless, the correlation experiences a swift decline for longer lead forecasts (navigating through the spring predictability barrier), reaching values as low as 0.3 or even lower in STAT forecasts, which indicates that such forecasts have very limited practical utility.

Supplementary Fig. S2 compares the DYN and STAT forecasts, highlighting the advantage of one tool over the other. The z-scores are significantly high when the forecasts are initialized in February, March, April, and May, in line with the analysis presented in Fig. 3. This analysis highlights the rapid degradation of skill in STAT forecasts as the forecast lead time increases for the earlier calendar start months, in contrast to the performance of DYN forecasts. The findings provide a clear indication of the superiority¹ of DYN forecasts over STAT forecasts, particularly during the challenging period of ENSO forecasting known as the boreal spring ENSO predictability barrier. The STAT forecasts show a slight advantage over the DYN forecasts for the middle calendar month starts, specifically June, July, and August; however, the amplitude of the z-score is relatively small. Forecasts generated during the remaining calendar months exhibit minimal differences and do not suggest a significant benefit of using one tool over the other².

Figure 4 presents the relationship between the forecast errors of the DYN and STAT forecasts. The figure encompasses errors across all forecast start months and lead times from February 2002 to August 2022. Notably, the errors in the DYN and STAT forecasts are strongly correlated, with a linear correlation coefficient of 0.91, which is consistent with the analysis reported by Tippett et al.². Supplementary Fig. S3 offers a detailed illustration of the absolute values of the forecast errors for each calendar start month and the 9 leads in both DYN and STAT forecasts. Even though we analyzed a dataset with more than twice the sample size used in previous studies, this analysis closely echoes the findings of the two preceding studies^1,2, which showed that dynamical and statistical models displayed a similar level of error in predicting ENSO. There is, however, an asymmetry for negative errors: the STAT forecasts exhibit greater discrepancies than the DYN forecasts (bottom left quadrant of Fig. 4). This holds especially true for errors exceeding 1°C in magnitude.

c. Onset of Warm and Cold ENSO Episodes

In this section, we examine the performance of DYN and STAT forecasts in predicting the onset of the seven warm and seven cold ENSO episodes that occurred during the analysis period. Figure 5 illustrates the onset of warm (Fig. 5a) and cold (Fig. 5b) episodes in the observations and forecasts. The observed onsets (denoted by both the season and year) are indicated by the thick red and blue lines. The DYN and STAT forecasts for the specified target (onset) seasons are denoted by yellow and green lines, with varying line thicknesses representing Lead-1, 2, and 3, respectively. Supplementary Fig. S4 displays the DYN and STAT forecasts for lead times ranging from Lead 4 to 9. The target season is defined here as the season in which the Niño-3.4 index reaches a value of +/-0.5, as demonstrated in Fig. 2. Generally, the onset analysis demonstrates that both DYN and STAT forecasts perform better for warm episodes (Fig. 5a) than for cold episodes (Fig. 5b). Furthermore, the DYN forecasts demonstrate superior performance in the prediction of both warm and cold ENSO episodes, though they still show substantial bias in both cases. To illustrate, during the El Niño event of 2009, the STAT forecast failed to anticipate the event until it had already occurred. It consistently predicted SST anomalies approximately a quarter degree colder than average, as evidenced at lead time 3. Similarly, in the case of the ongoing El Niño event in 2023, the STAT forecasts only predicted anomalies close to +0.43 in the forecasts initialized in April 2023. In contrast, the STAT forecasts initialized in March and February 2023 consistently showed anomalies close to zero (see Fig. 5a). At longer lead times (Supplementary Fig. S4a), the STAT forecasts remained within the range of -0.25 to 0.2. 
It is worth emphasizing that the onsets of the 2009 and 2023 El Niños took place during spring and summer. As previously discussed, STAT forecasts show reduced skill when initiated during the late winter and spring months. This is particularly relevant for the 2009 and 2023 El Niño events, as it highlights the potential limitations of STAT forecasts during that specific time period. However, this conclusion should not be generalized based solely on these two events, as forecasting performance can vary across different El Niño episodes. On the other hand, the DYN forecasts indicated anomalies that were quite close to the observed values at shorter leads for both the 2009 and 2023 El Niños (Fig. 5a). For the El Niño of 2023, the DYN forecasts show anomalies within the range of 0.15 to 0.36 at longer leads (Supplementary Fig. S4a). An exception is observed during the back-to-back El Niño events of 2014-15 and 2015-16, where the DYN forecast indicated a notably strong event at longer leads (Supplementary Fig. S4a), while the STAT forecast suggested a weaker warming signal. Overall, the onset trajectory analysis highlights that STAT forecasts tend to show subdued warming values compared to DYN forecasts. This characteristic aligns with the typically conservative nature of STAT forecasts, which aim to minimize squared errors. This muted response is also evident in the case of cold episodes (Fig. 5b). In addition, for the cold episodes (Fig. 5b), both DYN and STAT forecasts show larger differences between the forecasted anomalies and the observed values. Notable examples include the La Niña events of 2017, 2010, and 2007, which display substantial disparities in both DYN and STAT forecasts relative to the observed values (Fig. 5b and Supplementary Fig. S4b). 
The findings of this analysis are captivating, and delving into the underlying reasons and causes of these significant errors would necessitate a separate study that employs more comprehensive analysis and additional variables for a more profound understanding.

Figure 6 illustrates the square error skill score (SESS) of individual forecasts from both DYN and STAT models for seven warm and seven cold events for all lead times (green lines: Lead 1 to 3, red lines: Lead 4 to 6, yellow lines: Lead 7 to 9). This examination unveils several noteworthy observations that complement the prior analysis:

In comparison to cold episodes (Fig. 6c and 6d), both the DYN and STAT forecasts exhibit higher skill levels when it comes to warm episodes (Fig. 6a and 6b).

DYN forecasts have demonstrated greater value in providing information regarding the onset of ENSO episodes when compared to STAT forecasts.

When considering warm episodes across all lead times, DYN forecasts show higher SESS values as compared to the STAT forecasts.

During cold ENSO episodes, the SESS exhibits a low value, possibly even turning negative, in both DYN and STAT forecasts. The superiority of DYN over STAT is not consistent, especially notable in the cases of cold episodes in 2007, 2010, and 2017, where the DYN SESS displays notably negative values.

Figure 7 summarizes the accuracy achieved by DYN and STAT forecasts in predicting the onset of ENSO episodes at different lead times. Accuracy is defined here as the ratio of correctly forecasted ENSO episodes to the total number of episodes. A forecast at a given lead is considered a "hit" if the forecasted anomaly meets or exceeds the threshold of +0.45 for warm episodes or -0.45 for cold episodes. The DYN forecasts exhibit a high level of accuracy in predicting the onset of warm episodes, maintaining their accuracy up to a lead time of 3 (Fig. 7a). However, in the case of cold episodes (Fig. 7b), their accuracy is merely moderate at lead 1 and lead 2, diminishing to less than 20% beyond lead 3. At subsequent leads, the DYN forecasts offer no valuable insights. The STAT forecasts, on the other hand, demonstrate a moderate level of accuracy, particularly at Lead 1, for the onset of warm ENSO episodes (Fig. 7a). In stark contrast, the STAT forecasts fail to provide significant information regarding the initiation of cold ENSO episodes, even at shorter lead times (Fig. 7b). These findings are derived from real-time ENSO forecast data and shed light on the current capabilities of ENSO prediction models in the context of ENSO phase transitions, emphasizing the need for further research in this area.
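The hit and accuracy definitions above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the function names are hypothetical, the forecast anomalies are made-up values, and only the ±0.45 threshold and the hits-over-episodes ratio come from the text.

```python
def is_hit(forecast_anomaly, phase, threshold=0.45):
    """Return True if the forecast anomaly meets or exceeds the onset threshold for the given phase."""
    if phase == "warm":
        return forecast_anomaly >= threshold
    if phase == "cold":
        return forecast_anomaly <= -threshold
    raise ValueError("phase must be 'warm' or 'cold'")

def onset_accuracy(forecast_anomalies, phase, threshold=0.45):
    """Ratio of correctly forecasted onsets (hits) to the total number of episodes."""
    hits = sum(is_hit(f, phase, threshold) for f in forecast_anomalies)
    return hits / len(forecast_anomalies)

# Hypothetical forecast anomalies (degrees C) at one lead for seven onsets of each type.
warm_forecasts = [0.62, 0.48, 0.30, 0.55, 0.45, 0.71, 0.41]
cold_forecasts = [-0.50, -0.20, -0.44, -0.47, 0.05, -0.10, -0.60]

warm_accuracy = onset_accuracy(warm_forecasts, "warm")  # 5 hits out of 7
cold_accuracy = onset_accuracy(cold_forecasts, "cold")  # 3 hits out of 7
print(f"{warm_accuracy:.2f} {cold_accuracy:.2f}")
```

Repeating the calculation for each lead and each forecast type would give one accuracy value per lead, which is the kind of summary Fig. 7 presents.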

In this study, we conducted an updated evaluation of real-time ENSO forecast data comprising 247 forecasts from February 2002 to August 2022. During this period, seven El Niño events and seven La Niña events occurred. The main findings can be summarized as follows:

The skill levels of the DYN and STAT forecasts exhibit significant fluctuations depending on the month of forecast initiation and the targeted seasons.
The DYN forecasts excel in forecasting ENSO when the forecast begins in the late boreal winter to spring months, especially February to May.
The STAT’s predictions are comparable in predicting ENSO during the boreal summer and fall, indicating their utility once the event is established, but with lower accuracy during the transitional months.
Both the DYN and STAT forecasts tend to have larger errors when predicting the onset of cold ENSO episodes compared to warm episodes.
The analysis emphasizes that DYN forecasts provide valuable insights several months in advance for both warm and cold ENSO episode onsets, while STAT forecasts offer limited information.

Previous research suggests that, in the past, STAT models outperformed DYN models in ENSO prediction. However, subsequent studies have noted a shift, with DYN models exhibiting a slight edge over STAT models. The current study aligns with the latter trend, highlighting the superior performance of DYN models over STAT models. One plausible explanation for this shift is the limited progress in STAT models over the last two decades, while DYN models have evolved rapidly¹ thanks to advancements in computer resources, enhanced observational data, and a better grasp of the underlying physical processes governing the nonlinear dynamics of Earth's climate. Nevertheless, STAT models serve as a benchmark upon which more complex models can be constructed and operationalized. Another advantage of STAT models is their cost-effectiveness, as they entail significantly lower initial and ongoing maintenance investments than resource-intensive DYN models.

Recently, machine learning (ML) techniques have found applications in ENSO predictions, and numerous studies have demonstrated their effectiveness in hindcast settings, extending predictions up to 18 months in advance. Some studies have even achieved successful predictions of ENSO event onsets using these ML methods^33,34,35. The progress made in ML-based ENSO prediction methods is of potentially great significance as it enhances our understanding of ENSO forecasts. Therefore, we would like to invite researchers and forecasting centers utilizing such methods to contribute their real-time ENSO forecasts to the IRI ENSO Predictions Plume.

a. Data and Models

The IRI ENSO plume forecasts are a series of values denoted by ${\left({F}_{SL}^{{\prime }}\right)}_{i}$, where "${F}^{{\prime }}$" is the 3-month average Niño 3.4 index, "S" is the forecast start time, "L" is the forecast lead, and the index "i" identifies the individual model in the plume. For a given start and lead time, the multi-model mean forecasts from dynamical and statistical models are denoted as ${\left({F}_{SL}^{{\prime }}\right)}_{DYN}$ and ${\left({F}_{SL}^{{\prime }}\right)}_{STAT}$, respectively. These forecasts cover a range of lead times, commencing from the 3-month period immediately after the latest observed data and extending up to nine consecutive overlapping 3-month periods, depending on the specific model used. For instance, in the forecast issued in August 2022, the period from Aug to Oct 2022 is designated as the Lead-1 forecast, while the period from Sep to Nov 2022 is the Lead-2 forecast. This naming convention continues for subsequent periods until the final target period of Apr-Jun 2023, which is designated as Lead-9. Most models (particularly dynamical models) produce ensemble ENSO forecasts; however, the plume displays only the ensemble average provided by the respective institution.
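The lead naming convention described above can be made concrete with a short sketch. The function name `target_seasons` and the label format for seasons that cross a calendar year (for example "Dec-Feb 2022-23") are assumptions made for illustration.

```python
MONTHS = ["Jan", "Feb", "Mar", "Apr", "May", "Jun",
          "Jul", "Aug", "Sep", "Oct", "Nov", "Dec"]

def target_seasons(issue_year, issue_month, n_leads=9):
    """Return the labels of the overlapping 3-month target seasons, Lead-1 first."""
    labels = []
    for lead in range(1, n_leads + 1):
        first = (issue_month - 1) + (lead - 1)  # zero-based month offset of the season's first month
        last = first + 2                        # zero-based month offset of the season's last month
        first_year = issue_year + first // 12
        last_year = issue_year + last // 12
        name = f"{MONTHS[first % 12]}-{MONTHS[last % 12]}"
        if first_year == last_year:
            labels.append(f"{name} {first_year}")
        else:
            labels.append(f"{name} {first_year}-{last_year % 100:02d}")
    return labels

# Forecast issued in August 2022, the last issue considered in this study.
seasons = target_seasons(2022, 8)
for lead, label in enumerate(seasons, start=1):
    print(f"Lead-{lead}: {label}")
```

For the August 2022 issue this lists Aug-Oct 2022 as Lead-1 and Apr-Jun 2023 as Lead-9, matching the example in the text.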

Forecast length varied among models, leading to a wide range of forecast lengths. The statistical models consistently provided forecasts for the upcoming nine 3-month average seasons. However, the forecast lengths differed among different dynamical models, with some encompassing the full nine seasons while others had shorter durations. This discrepancy highlights that a greater number of models contribute to the average during the initial target seasons, resulting in fewer models available for the latter part of the forecast. It is worth noting that we do not attempt to address or rectify this discrepancy arising from the limited number of models available towards the end of the forecast period. Table 1 provides an overview (type of model, length of forecasts, start and end date of forecast contribution to plume) of the models employed in the IRI ENSO plume, starting from its inception to May 2023. Over the entire duration, a collective of thirty dynamical models and thirteen statistical models contributed their predictions to the IRI ENSO plume. Additional insight is provided in Supplementary Fig. S5, illustrating the count of dynamical and statistical models that have contributed to the plume since its initiation back in February 2002. Particularly noteworthy is the progressive increase in the count of dynamical models since the plume's inception, marking a significant trend in the use of dynamical models. It is also important to note that the cohort of the models contributing to the plume also changed over time due to the introduction of new models, replacement of existing ones, and the discontinuation of certain models. Many dynamical models underwent substantial upgrades during the study period to enhance and refine ENSO prediction capabilities, while statistical models remained relatively constant. Additional explanation regarding this aspect and its influence on the quality and utility of forecasts is presented in the summary section.
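Because forecast lengths differ among models, the number of models entering the multi-model mean shrinks toward the longer leads. The sketch below shows a simple unweighted average over whichever models are available at each lead, with no correction for the changing sample, in line with the statement above. The model names and anomaly values are hypothetical.

```python
def multi_model_mean(model_forecasts):
    """Average the available models at each lead.

    model_forecasts maps a model name to its list of anomalies by lead
    (index 0 is Lead-1); lists may be shorter than nine leads.
    Returns (means, counts), one entry per lead.
    """
    max_len = max(len(values) for values in model_forecasts.values())
    means, counts = [], []
    for lead in range(max_len):
        available = [v[lead] for v in model_forecasts.values() if len(v) > lead]
        means.append(sum(available) / len(available))
        counts.append(len(available))
    return means, counts

# Hypothetical dynamical models with forecast lengths of 9, 5, and 4 seasons.
dyn_models = {
    "model_a": [-0.9, -0.9, -0.8, -0.7, -0.5, -0.3, -0.2, -0.1, 0.0],
    "model_b": [-1.0, -1.1, -1.0, -0.8, -0.6],
    "model_c": [-0.8, -0.7, -0.6, -0.6],
}

means, counts = multi_model_mean(dyn_models)
print(counts)  # [3, 3, 3, 3, 2, 1, 1, 1, 1]
```

In this toy case the Lead-6 to Lead-9 means rest on a single model, which illustrates why the later target seasons are less well sampled than the earlier ones.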

Table 1

List of models (dynamical: DYN and statistical: STAT) that have contributed to the ENSO plume from its start in February 2002 until May 2023. "Start", "End or continued", and "Length" show the first forecast contribution, the last forecast contribution, and the length (in 3-month overlapping seasons) of the forecasts, respectively. Over the course of time, a total of thirty dynamical models and thirteen statistical models contributed their predictions to the IRI ENSO plume.
Sr #	Model	Type	Start	End or continued	Length
1	NCEP CFS	DYN	200201	201208	7
2	SCRIPPS	DYN	200201	201705	9
3	AUS/POAMA	DYN	200201	201905	6
4	KMA SNU	DYN	200201	202104	9
5	NASA GMAO	DYN	200201	202305	7
6	JMA	DYN	200201	202305	5
7	LDEO	DYN	200201	202305	8
8	ECMWF	DYN	200201	202305	5
9	ESSIC ICM	DYN	200305	201506	9
10	ECHAM/MOM	DYN	200308	201206	9
11	COLA ANOM	DYN	200310	201308	9
12	UKMO	DYN	200501	202305	4
13	MetFRANCE	DYN	200702	202305	5
14	JPN-FRCGC	DYN	200703	201401	9
15	COLA CCSM3	DYN	200803	201601	9
16	NCEP CFSv2	DYN	201106	202305	7
17	CSI-IRI-MM	DYN	201108	202305	6
18	GFDL CM2.1	DYN	201110	202101	9
19	CMC CANSIP	DYN	201201	202305	9
20	SINTEX-F	DYN	201402	202305	9
21	GFDL CM2.5	DYN	201404	201501	9
22	GFDL FLOR	DYN	201501	202101	9
23	IOCAS ICM	DYN	201507	202305	9
24	COLA CCSM4	DYN	201601	202305	9
25	SAUDI-KAU	DYN	201702	202210	9
26	BCC_CSM11m	DYN	201704	202305	9
27	AUS-ACCESS	DYN	201906	202305	4
28	GFDL SPEAR	DYN	202101	202305	9
29	KMA	DYN	202105	202305	4
30	DWD	DYN	202301	202305	4
1	CDC LIM	STAT	200201	201701	9
2	CPC CCA	STAT	200201	201701	9
3	UBC NNET	STAT	200201	201908	9
4	CPC MRKOV	STAT	200201	202305	9
5	CPC CA	STAT	200201	202305	9
6	CSU CLIPR	STAT	200201	202305	9
7	FSU REGR	STAT	200302	202107	9
8	UCLA-TCD	STAT	200311	202305	9
9	UNB/CWC	STAT	201405	201609	9
10	PSD-CU LIM	STAT	201709	201906	9
11	NTU CODA	STAT	201803	202305	9
12	BCC_RZDM	STAT	201901	202305	9
13	IAP-NN	STAT	201905	202305	9
The verifying observed data are from the Extended Reconstructed Sea Surface Temperature (ERSSTv5)³⁵ dataset. Observations are expressed as anomalies with respect to the 1991–2020 climatology. Models used different base periods for calculating anomalies. For forecasts generated prior to January 2021, the climatological base period was 1971–2000, while some models used a different climatological base period, 1982–2000. Starting from January 2021, the recent climatological period 1991–2020 was recommended. In this paper, we do not attempt to adjust for such discrepancies, though we tested variable climatology values and found almost no impact on the skill (see Supplementary Fig. S6).

b. Methods

Several metrics are used in this research to evaluate the performance of the models for ENSO forecasts. For a given start and lead time, the multi-model mean forecasts from dynamical and statistical models are denoted as ${\left({F}_{SL}^{{\prime }}\right)}_{DYN}$ and ${\left({F}_{SL}^{{\prime }}\right)}_{STAT}$, and the observed anomalies by ${O}^{{\prime }}$. The start- and lead-dependent Forecast Error (FE) in the DYN and STAT forecasts can then be calculated as follows:

$${\left({FE}_{SL}\right)}_{DYN}={\left({F}_{SL}^{{\prime }}\right)}_{DYN}-{O}^{{\prime }}$$

$${\left({FE}_{SL}\right)}_{STAT}={\left({F}_{SL}^{{\prime }}\right)}_{STAT}-{O}^{{\prime }}$$

where ${O}^{{\prime }}$ is the observed anomaly for the target season implied by the start and lead time of the forecast, so that it aligns with the forecast anomaly.
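The forecast error defined above can be illustrated with a minimal Python sketch. The function name `forecast_error` and the anomaly values are hypothetical and are used only for illustration; they are not taken from the IRI plume data.

```python
def forecast_error(forecast_anoms, observed_anoms):
    """Return FE = F' - O' for each matched forecast/observation pair."""
    if len(forecast_anoms) != len(observed_anoms):
        raise ValueError("forecast and observation series must have equal length")
    return [f - o for f, o in zip(forecast_anoms, observed_anoms)]


# Hypothetical Nino 3.4 anomalies (deg C) for one start month and one lead,
# already aligned so that each observation matches the forecast target season.
dyn_mean = [0.8, -0.4, 1.2]    # multi-model mean of DYN forecasts (assumed values)
stat_mean = [0.5, -0.1, 0.9]   # multi-model mean of STAT forecasts (assumed values)
observed = [1.0, -0.6, 1.0]    # verifying observed anomalies (assumed values)

fe_dyn = forecast_error(dyn_mean, observed)
fe_stat = forecast_error(stat_mean, observed)
```

A negative error means the forecast anomaly was lower than the observed anomaly for that target season, and a positive error means it was higher.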

The anomaly correlation coefficient, used for skill evaluation of the DYN and STAT forecasts, is calculated as follows:

$${r}_{DYN}=\frac{\sum _{k=1}^{n1}\left({\left({F}_{SL}^{{\prime }}\right)}_{DYN}\right){(O}^{{\prime }})}{\sqrt{\sum _{k=1}^{n1}{\left({\left({F}_{SL}^{{\prime }}\right)}_{DYN}\right)}^{2}}\sqrt{\sum _{k=1}^{n1}{\left({O}^{{\prime }}\right)}^{2}}}$$

$${r}_{STAT}=\frac{\sum _{k=1}^{n2}\left({\left({F}_{SL}^{{\prime }}\right)}_{STAT}\right){(O}^{{\prime }})}{\sqrt{\sum _{k=1}^{n2}{\left({\left({F}_{SL}^{{\prime }}\right)}_{STAT}\right)}^{2}}\sqrt{\sum _{k=1}^{n2}{\left({O}^{{\prime }}\right)}^{2}}}$$

where ${r}_{DYN}$ and ${r}_{STAT}$ denote the correlation coefficients of the DYN and STAT forecasts for sample sizes n1 and n2, respectively. The statistical significance of the correlation is assessed using a two-tailed t-test³⁹. At p = 0.05 with 20 degrees of freedom, the critical value that the calculated correlation coefficient must exceed to be considered significant is 0.45.
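As a sketch of how the correlation above might be computed, the following Python function follows the formula as written: sums of products of anomalies, with no additional mean removal. The helper `t_statistic` uses the standard textbook relation between a correlation and a t value with n − 2 degrees of freedom; whether the paper's test used exactly this form is an assumption. Function names are hypothetical.

```python
import math


def anomaly_correlation(forecast_anoms, observed_anoms):
    """Anomaly correlation: sum(F'O') / (sqrt(sum(F'^2)) * sqrt(sum(O'^2)))."""
    num = sum(f * o for f, o in zip(forecast_anoms, observed_anoms))
    den = math.sqrt(sum(f * f for f in forecast_anoms)) * math.sqrt(
        sum(o * o for o in observed_anoms)
    )
    if den == 0.0:
        raise ValueError("correlation is undefined when a series is all zeros")
    return num / den


def t_statistic(r, n):
    """t value for a correlation r from n pairs (n - 2 degrees of freedom).

    Assumption: the standard relation t = r * sqrt((n - 2) / (1 - r^2)).
    """
    return r * math.sqrt((n - 2) / (1.0 - r * r))
```

The resulting t value would then be compared with a tabulated two-tailed critical value for the chosen significance level.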

We employed z-score³⁶ analysis to compare the correlation coefficients obtained from the two forecast types for each start and lead time. This analysis can indicate whether one forecast is superior to the other. Although the DYN and STAT forecasts are not independent of each other, this approach yielded a clear and informative perspective on the relative performance³⁷ of the two forecast products. In this case, the correlation coefficients were computed using different observational datasets: ERSSTv5 for the DYN forecasts and COBE³⁸ for the STAT forecasts. Fisher’s z-transformation for DYN and STAT can then be written as follows:

$${z}_{DYN}=0.5 \times ln\left(\left(1+{r}_{DYN}\right)/\left(1-{r}_{DYN}\right)\right)$$

$${z}_{STAT}=0.5 \times ln\left(\left(1+{r}_{STAT}\right)/\left(1-{r}_{STAT}\right)\right)$$

$${z}_{diff}=\frac{{z}_{DYN}-{z}_{STAT}}{\sqrt{\frac{1}{n1-3}+\frac{1}{n2-3}}}$$

Statistical significance is determined by comparing ${z}_{diff}$ with the critical value. For instance, for April starts at Lead-2, ${z}_{diff}=1.81$; at the 0.05 level of significance the critical value is $\pm 1.72$, so ${z}_{diff}$ falls in the rejection region because it exceeds the critical value. We can therefore reject the null hypothesis that the two correlations ${r}_{DYN}$ and ${r}_{STAT}$ are equal, which indicates a statistically significant difference between them.
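The Fisher transformation and the difference statistic can be sketched in a few lines of Python. The function names and the input values in the usage line are hypothetical; they are not the correlations reported in this paper.

```python
import math


def fisher_z(r):
    """Fisher z-transformation: 0.5 * ln((1 + r) / (1 - r))."""
    return 0.5 * math.log((1.0 + r) / (1.0 - r))


def z_difference(r_dyn, n1, r_stat, n2):
    """Standardized difference between two Fisher-transformed correlations."""
    standard_error = math.sqrt(1.0 / (n1 - 3) + 1.0 / (n2 - 3))
    return (fisher_z(r_dyn) - fisher_z(r_stat)) / standard_error


# Hypothetical example: r_DYN = 0.9 and r_STAT = 0.6, each from 21 forecasts.
z_diff = z_difference(0.9, 21, 0.6, 21)
```

A positive `z_diff` favors the DYN correlation, and its magnitude is then compared with the chosen critical value.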

We also used the squared error skill score¹ (SESS), which is defined as one minus the mean squared error divided by the observed climatological variance. Applied here to individual forecasts, it is given by:

$$SESS=1-\frac{{\left({\left({F}_{SL}^{{\prime }}\right)}_{DYN\ \text{or}\ STAT}-{O}^{{\prime }}\right)}^{2}}{{\left({O}^{{\prime }}\right)}^{2}}$$

The SESS ranges from +1 to $-\infty$. SESS = 1 indicates a perfect forecast, in which the forecast equals the observed value. SESS = 0 implies that the magnitude of the forecast error equals that of the observed anomaly, while SESS < 0 means that the forecast error exceeds the magnitude of the observed anomaly. One advantage of this score is that, unlike correlation, it does not require averaging over a sample of forecasts. This allows a direct comparison of the skill of individual forecasts with that of a climatological forecast. We therefore used this score to evaluate skill during the onset of ENSO episodes.
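A minimal Python sketch of the per-forecast score follows. The function name `sess` is hypothetical. The guard against a zero observed anomaly is an added assumption, because the score is undefined in that case.

```python
def sess(forecast_anom, observed_anom):
    """Squared error skill score for one forecast: 1 - (F' - O')^2 / O'^2."""
    if observed_anom == 0.0:
        raise ValueError("SESS is undefined when the observed anomaly is zero")
    return 1.0 - (forecast_anom - observed_anom) ** 2 / observed_anom ** 2


perfect = sess(1.0, 1.0)         # forecast equals observation
climatology = sess(0.0, 1.0)     # zero-anomaly (climatological) forecast
wrong_sign = sess(-1.0, 1.0)     # forecast of the opposite sign
```

With these inputs a zero-anomaly forecast scores exactly 0, which is why positive values can be read as skill relative to climatology.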

ENSO evolution is quantified in terms of the 3-month running mean of SST anomalies in the Niño-3.4 region (5°N–5°S, 170°–120°W), based on a recent 30-year (1991–2020) climatology period. A threshold of ±0.5 °C is used to define warm and cold ENSO episodes. An ENSO event is considered to have occurred when these conditions persist for five consecutive overlapping 3-month seasons (Fig. 2).
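The episode definition can be sketched as a simple run-length search over a series of overlapping 3-month mean anomalies. The function name, the treatment of the threshold as inclusive (at or beyond ±0.5 °C), and the sample values are assumptions made for illustration only.

```python
def find_enso_episodes(seasonal_anoms, threshold=0.5, min_seasons=5):
    """Return (first_index, last_index, phase) for each qualifying episode.

    An episode is a run of at least `min_seasons` consecutive seasons at or
    above +threshold ("warm") or at or below -threshold ("cold").
    """
    episodes = []
    run_start = 0
    run_phase = None
    # A neutral sentinel value is appended so that a run reaching the end
    # of the series is still closed and evaluated.
    for i, value in enumerate(list(seasonal_anoms) + [0.0]):
        if value >= threshold:
            phase = "warm"
        elif value <= -threshold:
            phase = "cold"
        else:
            phase = None
        if phase != run_phase:
            if run_phase is not None and i - run_start >= min_seasons:
                episodes.append((run_start, i - 1, run_phase))
            run_start = i
            run_phase = phase
    return episodes


# Hypothetical anomalies (deg C): five warm seasons, then only three cold ones.
sample = [0.2, 0.6, 0.9, 1.1, 1.0, 0.7, 0.3, -0.5, -0.8, -0.6, -0.2]
episodes = find_enso_episodes(sample)
```

The first index of each returned tuple marks the onset season of that episode; the three-season cold run in the sample is too short to qualify.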

Acknowledgments

We acknowledge all the institutions that provide their ENSO forecast data.

Contributions

MAE collected data, analyzed the results, and prepared the initial draft of the manuscript. Discussions among MAE, MKT, AWR, MH, and JT contributed to the final manuscript.

Ethics Declarations

The authors declare no competing interests.

Data Availability

The datasets used in this study are freely available at the following links:

Monthly ONI:

https://www.cpc.ncep.noaa.gov/data/indices/ersst5.nino.mth.91-20.ascii

Warm/Cold ENSO Episodes: https://origin.cpc.ncep.noaa.gov/products/analysis_monitoring/ensostuff/ONI_v5.php

IRI ENSO Forecast:

https://iri.columbia.edu/our-expertise/climate/forecasts/enso/current/

All raw ENSO plume data are available on request:

https://forms.gle/7PgLuf7rwET5GBxH7

Code Availability

The analysis and figures were produced with the free software Python; the code is available on request.

Barnston, A.G., Tippett, M.K., L’Heureux, M.L., Li, S., & DeWitt, D.G. Skill of Real-Time Seasonal ENSO Model Predictions during 2002–11: Is Our Capability Increasing? Bull Amer Meteor Soc 93 631–651 (2012).
Tippett, M.K., Barnston, A.G. & Li, S. Performance of recent multi-model ENSO forecasts. J Appl Meteor Climatol 51 637–654 (2012).
Webster, P.J., & Yang, S. Monsoon and ENSO: Selectively Interactive Systems. Quart J Roy Meteor Soc 118 877–926 (1992).
Webster, P.J. The annual cycle and the predictability of the tropical coupled ocean-atmosphere system. Meteor Atmos Phys 56 33-55 (1995).
Torrence, C. & Webster, P.J. The Annual Cycle of Persistence in the El Niño-Southern Oscillation. Quart J Roy Meteor Soc 124 1985-2004 (1998).
McPhaden, M.J. Tropical Pacific Ocean heat content variations and ENSO persistence barriers. Geophys Res Lett 30 (9) 1480 doi:10.1029/2003GL016872 (2003).
Duan, W., & Wei, C. The ‘spring predictability barrier’ for ENSO predictions and its possible mechanism: results from a fully coupled model. Int J Climatol 33 1280–1292 doi: 10.1002/joc.3513 (2013).
Tippett, M.K., L’Heureux, M.L. Low-dimensional representations of Niño 3.4 evolution and the spring persistence barrier. npj Clim Atmos Sci 3(24). https://doi.org/10.1038/s41612-020-0128-y (2020).
Barnston, A.G., Glantz, M.H., & He, Y. Predictive skill of statistical and dynamical climate models in SST forecasts during the 1997–98 El Niño episode and the 1998 La Niña onset. Bull Amer Meteor Soc 80 217–243 (1999).
Ropelewski, C.F., & Halpert, M.S. Global and regional scale precipitation patterns associated with the El Niño/Southern Oscillation. Mon Wea Rev 115 1606–1626 https://doi.org/10.1175/1520-0493(1987)115<1606:GARSPP>2.0.CO;2 (1987).
Mason, S.J., & Goddard, L. Probabilistic precipitation anomalies associated with ENSO. Bull Am Meteorol Soc 82 619–638 (2001).
Hoell, A., Funk, C., Magadzire, T., Zinke, J., & Husak, G. El Niño–Southern Oscillation diversity and southern Africa teleconnections during austral summer. Clim Dyn 45 1583–1599, https://doi.org/10.1007/s00382-014-2414-z (2015).
Smith, S.C., & Ubilava, D. The El Niño Southern Oscillation and economic growth in the developing world. Global Environ Change 45 151–164 https://doi.org/10.1016/j.gloenvcha.2017.05.007 (2017).
Ehsan, M.A., Tippett, M.K., Robertson, A.W. et al. The ENSO Fingerprint on Bangladesh Summer Monsoon Rainfall. Earth Syst Environ https://doi.org/10.1007/s41748-023-00347-z (2023).
Chiew, F.H.S., Piechota, T.C., Dracup, J.A., Mcmahon, T.A. El Niño/Southern oscillation and Australian rainfall, streamflow and drought: links and potential for forecasting. J Hydrol 204 138–149. https://doi.org/10.1016/S0022-1694(97)00121-2 (1998).
Fraedrich, K. An ENSO impact on Europe? Tellus A: Dyn Meteorol Oceanogr 46 541–552. https://doi.org/10.3402/tellusa.v46i4.15643 (1994).
Attada, R., Ehsan, M.A., & Pillai, P.A. Evaluation of potential predictability of Indian summer monsoon rainfall in ECMWF's fifth-generation seasonal forecast system (SEAS5). Pure Appl Geophys 179 4639–4655 https://doi.org/10.1007/s00024-022-03184-9 (2022).
Hu, Z.Z., Kumar, A., Huang, B., Zhu, J., L'Heureux, M., McPhaden, M.J. & Yu, J.Y. The interdecadal shift of ENSO properties in 1999/2000: a review. Journal of Climate 33(11), 4441–4462 (2020).
Wang, S., Huang, J., He, Y. et al. Combined effects of the Pacific Decadal Oscillation and El Niño-Southern Oscillation on Global Land Dry–Wet Changes. Sci Rep 4 6651 https://doi.org/10.1038/srep06651 (2014).
Wills, R.C.J., Dong, Y., Proistosecu, C., Armour, K.C., & Battisti, D.S. Systematic Climate Model Biases in the Large-Scale Patterns of Recent Sea-Surface Temperature and Sea-Level Pressure Change. Geophys Res Lett 49 e2022GL100011 https://doi.org/10.1029/2022GL100011 (2022).
Lee, S., L’Heureux, M., Wittenberg, A.T. et al. On the future zonal contrasts of equatorial Pacific climate: Perspectives from Observations, Simulations, and Theories. npj Clim Atmos Sci 5 82 https://doi.org/10.1038/s41612-022-00301-2 (2022).
Heede, U.K. & Fedorov, A.V. Colder eastern equatorial Pacific and stronger Walker circulation in the early 21st century: separating the forced response to global warming from natural variability. Geophys Res Lett 50 e2022GL101020 (2023).
L’Heureux, M.L., Takahashi, K., Watkins, A.B., Barnston, A.G., Becker, E.J., Di Liberto, T.E., Gamble, F., Gottschalck, J., Halpert, M.S., Huang, B., Mosquera-Vásquez, K. and Wittenberg, A.T. Observing and Predicting the 2015/16 El Niño. Bulletin of the American Meteorological Society 98(7) 1363–1382 https://doi.org/10.1175/bams-d-16-0009.1 (2017).
Kumar, A., Hu, Z.-Z., Jha, B., & Peng, P. Estimating ENSO predictability based on multi-model hindcasts. Clim Dyn 48, 39–51, https://doi.org/10.1007/s00382-016-3060-4 (2017).
Hu, Z.-Z., Kumar, A., Zhu, J., Peng, P., Huang, B. On the Challenge for ENSO Cycle Prediction: An Example from NCEP Climate Forecast System, Version 2. J Clim 32, 183–194 https://doi.org/10.1175/JCLI-D-18-0285.1 (2019).
Chen, H.-C., Tseng, Y.-H., Hu, Z.-Z., Ding, R. Enhancing the ENSO Predictability beyond the Spring Barrier. Sci Rep 10:984 doi: 10.1038/s41598-020-57853-7 (2020)
Yang, X., Bao, Y., Song, Z. et al. Key to ENSO phase-locking simulation: effects of sea surface temperature diurnal amplitude. npj Clim Atmos Sci 6, 159 https://doi.org/10.1038/s41612-023-00483-3 (2023).
Tziperman, E., Cane, M. A., Zebiak, S. E., Xue, Y. & Blumenthal, B. Locking of El Niño’s peak time to the end of the calendar year in the delayed oscillator picture of ENSO. J Clim 11, 2191–2199 (1998).
Li, T. Phase Transition of the El Niño–Southern Oscillation: a stationary SST mode. J Atmos Sci 54, 2872–2887 (1997).
Chen, H.-C. & Jin, F.-F. Fundamental behavior of ENSO phase locking. J Clim 33, 1953–1968 (2020).
Almazroui, M., Ehsan, M.A., Tippett, M.K. et al. Skill of the Saudi-KAU CGCM in Forecasting ENSO and its Comparison with NMME and C3S Models. Earth Syst Environ 6, 327–341 https://doi.org/10.1007/s41748-022-00311-3 (2022).
Barnston, A.G., Tippett, M.K., Ranganathan, M. et al. Deterministic skill of ENSO predictions from the North American Multimodel Ensemble. Clim Dyn 53, 7215-7234 (2019)
Yan, J., Mu, L., Wang, L. et al. Temporal Convolutional Networks for the Advance Prediction of ENSO. Sci Rep 10, 8055 https://doi.org/10.1038/s41598-020-65070-5 (2020)
Ham, Y.-G., Kim, J.-H. & Luo, J.-J. Deep learning for multi-year ENSO forecasts. Nature 573, 568–572 https://doi.org/10.1038/s41586-019-1559-7 (2019).
Ham, Y.-G., Kim, J.-H., Kim, E.-S., & On, K.-W. Unified deep learning model for El Niño/Southern Oscillation forecasts by incorporating seasonality in climate data. Science Bulletin 66(13) 1358–1366 https://doi.org/10.1016/j.scib.2021.03.009 (2021).
DelSole, T. & Tippett, M.K. Comparing forecast skill. Mon Wea Rev 142 4658–4678 doi:10.1175/MWR-D-14-00045.1 (2014).
Ishii, M., Shouji, A., Sugimoto, S., & Matsumoto, T. Objective Analyses of Sea-Surface Temperature and Marine Meteorological Variables for the 20th Century using ICOADS and the Kobe Collection. Int J Climatol 25 865–879 (2005).
Hinkle, D.E., Wiersma, W., & Jurs, S.G. Applied statistics for the behavioral sciences, 2nd edn. Houghton Mifflin Company, Boston (1988).
Huang, B., Thorne, P.W., et al. Extended Reconstructed Sea Surface Temperature version 5 (ERSSTv5), Upgrades, validations, and intercomparisons. J Clim 30 8179-8205 doi:10.1175/JCLI-D-16-0836.1 (2017)
Wilks, D.S. Statistical methods in the atmospheric sciences, 2nd edn. Elsevier Publishers, New York (2006).


Supplementarymaterial.docx


Editorial decision: revise
20 Dec, 2023
Review #1 received at journal
28 Nov, 2023
Review #2 received at journal
22 Nov, 2023
Reviewer #2 agreed at journal
16 Nov, 2023
Reviewer #1 agreed at journal
13 Nov, 2023
Reviewers invited by journal
13 Nov, 2023
Editor assigned by journal
10 Nov, 2023
Submission checks completed at journal
10 Nov, 2023
First submitted to journal
09 Nov, 2023
