a. ENSO Characteristics during the Last Two Decades and General Comparison of DYN and STAT ENSO Forecasts
Figure 2 illustrates the evolution of ENSO over the last two decades (May-Jul 2001 to Apr-Jun 2023; 264 overlapping 3-month seasons in total). The start and end of each event are marked in green and red, respectively (in bold, enlarged font). Over the past two decades, La Niña, the cold phase of ENSO, has become more prevalent (32% of all seasons). The increasing frequency of La Niña events in the last two decades may be attributed to a persistent negative phase of the Pacific Decadal Oscillation (PDO) since the early 2000s18,19, anomalous warming of the Indo-Pacific Warm Pool20,21, and a stronger Walker Circulation22. A total of seven cold events (85 seasons) were observed during this period, including the double-dip La Niña of 2010-11 and 2011-12 and the triple-dip La Niña that began in Jul-Sep 2020 and continued until Dec-Feb (DJF) 2022-23, with a brief break during May-Jul and Jun-Aug 2021. During a double- or triple-dip La Niña or El Niño, the event persists through two or three successive boreal winters (see the DJF seasons marked with asterisks in Fig. 2). Additionally, the past two decades have seen seven warm ENSO episodes (61 seasons, or 23% of the total), including the back-to-back El Niño of 2014-15 and 2015-16 and the El Niño of 2023, which was ongoing at the time of writing.
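The season counts above can be illustrated with a simple classification of a Niño-3.4 anomaly series. The following is a minimal sketch, not the procedure used in this study: the function name `episode_mask` and the toy series are hypothetical, the ±0.5 threshold follows the target-season definition used with Fig. 2, and the requirement of at least five consecutive overlapping seasons is an assumption borrowed from the NOAA ONI convention.

```python
import numpy as np

def episode_mask(nino34, threshold=0.5, min_run=5, warm=True):
    """Flag seasons in runs of >= min_run consecutive seasons beyond the threshold.

    min_run=5 is an assumed persistence criterion (NOAA ONI convention).
    """
    x = np.asarray(nino34, dtype=float)
    beyond = x >= threshold if warm else x <= -threshold
    mask = np.zeros(x.size, dtype=bool)
    start = None
    # A trailing False sentinel closes a run that reaches the end of the series.
    for i, flag in enumerate(np.append(beyond, False)):
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_run:
                mask[start:i] = True
            start = None
    return mask

# Hypothetical toy series: one 6-season cold run and one 3-season warm blip.
toy = np.array([0.1, -0.6, -0.8, -1.0, -0.9, -0.7, -0.5, -0.2, 0.6, 0.7, 0.5, 0.2])
cold = episode_mask(toy, warm=False)
warm = episode_mask(toy, warm=True)
cold_share = cold.mean()  # fraction of seasons classified as La Niña
```

In this toy series the 3-season warm run is too short to count as an episode. Applied to the counts reported above, the same share calculation gives 85/264 ≈ 32% for cold seasons and 61/264 ≈ 23% for warm seasons.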
Supplementary Fig. S1 shows time series comparing the multi-model mean forecasts of the dynamical (DYN) and statistical (STAT) models with the observed data. The figure is representative of the core observational and forecast datasets analyzed in this study. The predictions are shown as 9-month trajectories starting from each calendar month (beginning in Feb 2002) and extending to the maximum lead time (nine overlapping 3-month seasons). The trajectories show that the predicted anomalies generally follow the subsequently observed ones, with a few exceptions, but this agreement weakens as the lead time increases. Both model types predicted certain ENSO events reasonably accurately, particularly the stronger episodes, even at longer lead times. Both DYN and STAT models tended to underestimate the amplitudes of the warm events of 2002-03, 2009-10, and 2015-16, especially when the forecasts began in earlier years. This pattern of underprediction is in line with findings from previous studies23. Moreover, the DYN models were much too cold in late 2003: the forecasts initialized in May, June, and July 2003, following the El Niño of 2002-03, predicted colder conditions (Fig. S1a). Overall, STAT forecasts have an intrinsically reduced amplitude because these models aim to minimize squared errors, which dampens the intensity of sea surface temperature warming and cooling during El Niño and La Niña episodes (Fig. S1c and S1d). Additionally, their reliance on seasonal mean predictor data restricts their ability to detect very recent changes in observed conditions.
In contrast, DYN forecasts are initialized with the most up-to-date observations, including surface and subsurface oceanic variables as well as atmospheric variables that extend deep into the upper atmosphere. This allows DYN forecasts to capture more immediate changes in both the ocean and atmosphere, which is especially important during the ENSO transition phase.
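The reduced amplitude of STAT forecasts described above is a general property of least-squares fitting: for a simple linear regression, the standard deviation of the fitted values equals |r| times that of the predictand. The sketch below illustrates this with synthetic data. The single-predictor setup, the coefficients, and all variable names are assumptions made for illustration and do not represent any model in the STAT ensemble.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic predictor and "observed" anomaly with imperfect correlation.
predictor = rng.standard_normal(500)
observed = 0.8 * predictor + 0.6 * rng.standard_normal(500)

# Least-squares linear fit: np.polyfit returns [slope, intercept] for degree 1.
slope, intercept = np.polyfit(predictor, observed, 1)
fitted = slope * predictor + intercept

r = np.corrcoef(predictor, observed)[0, 1]
# Ratio of forecast amplitude to observed amplitude; equals |r| for this fit.
amplitude_ratio = fitted.std() / observed.std()
```

Because the ratio equals |r|, the damping grows as skill drops, which is consistent with weaker predicted amplitudes at longer leads.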
b. Skill and Error during Two Decades
Figure 3 shows how skill varies with calendar start month and across the 9 leads for both DYN (Fig. 3a) and STAT (Fig. 3b) forecasts. Forecasts at longer lead times generally exhibit lower skill than those at shorter leads in both DYN and STAT. In the STAT forecasts (Fig. 3b), skill is markedly lower and declines rapidly for forecasts initiated in the early calendar months (January to May) compared with the DYN forecasts (Fig. 3a). In STAT forecasts, this decrease is noticeable even at the shortest lead for forecasts starting in April and May: the Lead-1 correlation coefficients for these start months are 0.76 and 0.73, respectively, notably lower than for the other months (Fig. 3b). This period is particularly challenging because of the Northern Hemisphere spring predictability barrier24,25,26. It also coincides with the ENSO transition, so the diminished skill may also affect crucial information about the ENSO transition phase27,28,29,30. For forecasts initiated in the middle calendar months (July, August, and September), skill remains relatively high and stable in both DYN and STAT forecasts31,32, and this higher skill persists even as lead times become longer. For forecasts initiated in the late calendar months (October, November, and December), correlation values are notably higher at the shorter lead times (up to lead 5). However, the correlation declines swiftly at longer leads, which pass through the spring predictability barrier, reaching values of 0.3 or lower in STAT forecasts; such forecasts have very limited practical utility.
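A skill table of the kind shown in Fig. 3 can be sketched as a correlation computed separately for each calendar start month and lead. The example below is a minimal illustration under stated assumptions: the array layout (one row per forecast start, one column per lead), the function name `correlation_skill`, and the synthetic data with noise growing with lead are all hypothetical and are not the study's data or code.

```python
import numpy as np

def correlation_skill(forecast, observed, start_month):
    """Correlation between forecast and verifying observation per (start month, lead).

    forecast, observed: shape (n_starts, n_leads); observed holds the value that
    verifies each forecast. start_month: integers 1..12, length n_starts.
    """
    forecast = np.asarray(forecast, dtype=float)
    observed = np.asarray(observed, dtype=float)
    start_month = np.asarray(start_month)
    skill = np.full((12, forecast.shape[1]), np.nan)
    for m in range(1, 13):
        rows = start_month == m
        if rows.sum() < 3:  # too few cases for a meaningful correlation
            continue
        for lead in range(forecast.shape[1]):
            skill[m - 1, lead] = np.corrcoef(forecast[rows, lead],
                                             observed[rows, lead])[0, 1]
    return skill

# Synthetic example: 20 years of monthly starts, 9 leads, error growing with lead.
rng = np.random.default_rng(1)
n_years, n_leads = 20, 9
start_month = np.tile(np.arange(1, 13), n_years)
observed = rng.standard_normal((start_month.size, n_leads))
noise_sd = 0.2 * np.arange(1, n_leads + 1)
forecast = observed + noise_sd * rng.standard_normal(observed.shape)

skill = correlation_skill(forecast, observed, start_month)  # shape (12, 9)
```

With only about 20 cases per start month, each entry of such a table carries considerable sampling uncertainty, which is worth keeping in mind when comparing individual cells.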
Supplementary Fig. S2 compares the DYN and STAT forecasts, highlighting where one tool has an advantage over the other. The z-scores are significantly high for forecasts initialized in February, March, April, and May, in line with the analysis presented in Fig. 3. This reflects the rapid degradation of skill in STAT forecasts as lead time increases for the earlier calendar start months, in contrast to the performance of DYN forecasts. The findings clearly indicate the superiority1 of DYN forecasts over STAT forecasts, particularly during the challenging period known as the boreal spring ENSO predictability barrier. The STAT forecasts show a slight advantage over the DYN forecasts for the middle calendar month starts, specifically June, July, and August; however, the amplitude of the z-score is relatively small. Forecasts initialized in the remaining calendar months show minimal differences and do not suggest a significant benefit of using one tool over the other2.
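The text does not spell out how the z-scores are computed. One common way to compare two correlation coefficients is the Fisher r-to-z transformation, sketched below as an assumption rather than as the method behind Fig. S2. The function name, the DYN value of 0.90, and the sample size of 21 are hypothetical; only the STAT value of 0.73 is taken from the text (May starts, Lead-1). This form also treats the two samples as independent, which is not strictly true here because both forecasts verify against the same observations.

```python
import math

def fisher_z_difference(r_a, r_b, n_a, n_b):
    """z-score for the difference of two correlations (independent-sample form)."""
    standard_error = math.sqrt(1.0 / (n_a - 3) + 1.0 / (n_b - 3))
    return (math.atanh(r_a) - math.atanh(r_b)) / standard_error

# r_dyn and n are hypothetical; r_stat = 0.73 is the STAT Lead-1 value for May starts.
r_dyn, r_stat, n = 0.90, 0.73, 21
z = fisher_z_difference(r_dyn, r_stat, n, n)  # positive values favor DYN
```

A positive z favors the first argument, so the sign convention must match the one used in the figure being reproduced.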
Figure 4 shows the relationship between the forecast errors of the DYN and STAT forecasts. The figure includes errors for all forecast start months and lead times from February 2002 to August 2022. The errors in the DYN and STAT forecasts are strongly correlated, with a linear correlation coefficient of 0.91, consistent with the analysis reported by Tippett et al.1. Supplementary Fig. S3 shows the absolute values of the forecast errors for each calendar start month and the 9 leads in both DYN and STAT forecasts. Although our dataset has more than twice the sample size used in previous studies, the analysis still closely echoes the findings of the two preceding studies1,2, which showed that dynamical and statistical models have a similar level of error in predicting ENSO. The errors are asymmetric for negative values: STAT exhibits larger discrepancies than DYN (bottom-left quadrant of Fig. 4), especially for errors exceeding 1°C.
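The error comparison in Fig. 4 amounts to correlating two error series, each defined as forecast minus observation. The sketch below uses synthetic data in which both forecasts share a common error component. The function name `error_correlation`, the noise levels, and the data are hypothetical and are intended only to show how a high error correlation can arise.

```python
import numpy as np

def error_correlation(dyn, stat, obs):
    """Return the correlation of DYN and STAT errors, plus the two error series."""
    obs = np.asarray(obs, dtype=float)
    dyn_err = np.asarray(dyn, dtype=float) - obs
    stat_err = np.asarray(stat, dtype=float) - obs
    r = np.corrcoef(dyn_err, stat_err)[0, 1]
    return r, dyn_err, stat_err

# Synthetic example: both forecasts miss the same unpredictable component.
rng = np.random.default_rng(2)
obs = rng.standard_normal(300)
shared = 0.5 * rng.standard_normal(300)          # error common to both tools
dyn = obs + shared + 0.1 * rng.standard_normal(300)
stat = obs + shared + 0.1 * rng.standard_normal(300)

r, dyn_err, stat_err = error_correlation(dyn, stat, obs)
```

In this construction the correlation is high because most of the error is shared, not because either tool is inaccurate in a way specific to it.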
c. Onset of Warm and Cold ENSO Episodes
In this section, we examine the performance of DYN and STAT forecasts in predicting the onset of the seven warm and seven cold ENSO episodes that occurred during the analysis period. Figure 5 illustrates the onset of warm (Fig. 5a) and cold (Fig. 5b) episodes in the observations and forecasts. The observed onsets (denoted by both season and year) are indicated by the thick red and blue lines. The DYN and STAT forecasts for the specified target (onset) seasons are denoted by yellow and green lines, with varying line thicknesses representing Lead-1, 2, and 3, respectively. Supplementary Fig. S4 displays the DYN and STAT forecasts for Lead 4 to 9. The target season is defined here as the season in which the Niño-3.4 index reaches a value of ±0.5, as shown in Fig. 2. In general, the onset analysis shows that both DYN and STAT forecasts perform better for warm episodes (Fig. 5a) than for cold episodes (Fig. 5b). Furthermore, the DYN forecasts perform better in predicting both warm and cold ENSO episodes, though they still show substantial bias in both cases. For example, for the El Niño of 2009, the STAT forecasts failed to anticipate the event until it had already occurred, consistently predicting SST anomalies approximately a quarter of a degree colder than average, as seen at lead 3. Similarly, for the ongoing El Niño of 2023, the STAT forecasts predicted anomalies close to +0.43 only in the forecasts initialized in April 2023, whereas the forecasts initialized in February and March 2023 consistently showed anomalies close to zero (see Fig. 5a). At longer lead times (Supplementary Fig. S4a), the STAT forecasts remained within the range of -0.25 to 0.2.
It is worth emphasizing that the onsets of the 2009 and 2023 El Niños took place during spring and summer. As discussed above, STAT forecasts show reduced skill when initiated during the late winter and spring months. This is particularly relevant to the 2009 and 2023 El Niño events, as it highlights the potential limitations of STAT forecasts during that time of year. However, this conclusion should not be generalized from these two events alone, as forecast performance can vary across El Niño episodes. The DYN forecasts, on the other hand, indicated anomalies quite close to the observed values at shorter leads for both the 2009 and 2023 El Niños (Fig. 5a). For the El Niño of 2023, the DYN forecasts show anomalies in the range of 0.15 to 0.36 at longer leads (Supplementary Fig. S4a). An exception is the back-to-back El Niño of 2014-15 and 2015-16, for which the DYN forecast indicated a notably strong event at longer leads (Supplementary Fig. S4a), while the STAT forecast suggested a weaker warming signal. Overall, the onset trajectory analysis shows that STAT forecasts tend to produce more subdued warming than DYN forecasts. This is consistent with the typically conservative nature of STAT forecasts, which aim to minimize squared errors. This muted response is also evident for cold episodes (Fig. 5b). Moreover, for the cold episodes, both DYN and STAT forecasts differ more from the observed values. Notable examples include the La Niña events of 2007, 2010, and 2017, which show substantial disparities between forecast and observed values in both DYN and STAT (Fig. 5b and Supplementary Fig. S4b).
These findings are striking. Identifying the causes of these large errors would require a separate study with a more comprehensive analysis and additional variables.
Figure 6 illustrates the square error skill score (SESS) of individual forecasts from both DYN and STAT models for seven warm and seven cold events for all lead times (green lines: Lead 1 to 3, red lines: Lead 4 to 6, yellow lines: Lead 7 to 9). This examination unveils several noteworthy observations that complement the prior analysis:
- In comparison to cold episodes (Fig. 6c and 6d), both the DYN and STAT forecasts exhibit higher skill levels when it comes to warm episodes (Fig. 6a and 6b).
- DYN forecasts have demonstrated greater value in providing information regarding the onset of ENSO episodes when compared to STAT forecasts.
- When considering warm episodes across all lead times, DYN forecasts show higher SESS values as compared to the STAT forecasts.
- During cold ENSO episodes, the SESS exhibits a low value, possibly even turning negative, in both DYN and STAT forecasts. The superiority of DYN over STAT is not consistent, especially notable in the cases of cold episodes in 2007, 2010, and 2017, where the DYN SESS displays notably negative values.
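The observations above rest on the SESS of individual forecasts, whose exact formula is not restated in this section. A minimal sketch of one standard form is given below, assuming a reference forecast of climatology (zero anomaly). That reference, the function name `sess`, and the example values are assumptions and may differ from the definition used for Fig. 6.

```python
def sess(forecast, observed, reference=0.0):
    """Squared error skill score of a single forecast against a reference forecast.

    Assumes the reference is climatology (zero anomaly) unless stated otherwise.
    1 is a perfect forecast, 0 matches the reference, negative is worse than it.
    Undefined when the reference equals the observation (zero reference error).
    """
    forecast_error = (forecast - observed) ** 2
    reference_error = (reference - observed) ** 2
    return 1.0 - forecast_error / reference_error

# Hypothetical onset cases (anomalies in degrees C).
warm_case = sess(forecast=0.4, observed=0.5)    # close to observed warm onset
cold_case = sess(forecast=0.2, observed=-0.5)   # wrong sign at a cold onset
perfect = sess(forecast=-0.5, observed=-0.5)
```

Under this definition, a forecast of the wrong sign at onset scores below zero, which matches the way negative SESS values are read in the list above.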
Figure 7 summarizes the accuracy of DYN and STAT forecasts in predicting the onset of ENSO episodes at different lead times. Accuracy is defined here as the ratio of correctly forecasted ENSO episodes to the total number of episodes. A forecast at a given lead is considered a “hit” if the forecasted anomaly meets or exceeds the threshold of +0.45 for warm episodes or -0.45 for cold episodes. The DYN forecasts are highly accurate in predicting the onset of warm episodes, maintaining their accuracy up to lead 3 (Fig. 7a). For cold episodes (Fig. 7b), however, their accuracy is only moderate at leads 1 and 2 and falls below 20% beyond lead 3; at subsequent leads, the DYN forecasts offer no useful information. The STAT forecasts, on the other hand, are moderately accurate for the onset of warm ENSO episodes, particularly at Lead 1 (Fig. 7a). In stark contrast, they fail to provide significant information on the onset of cold ENSO episodes, even at shorter lead times (Fig. 7b). These findings are derived from real-time ENSO forecast data and shed light on the current capabilities of ENSO prediction models in the context of ENSO phase transitions, emphasizing the need for further research in this area.
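The accuracy measure defined above can be written out directly. The sketch below applies the stated ±0.45 hit threshold to one lead at a time. The function name `onset_accuracy` and the forecast values are hypothetical and are not taken from Fig. 7.

```python
def onset_accuracy(forecast_anomalies, warm, threshold=0.45):
    """Fraction of episodes whose onset forecast meets or exceeds the hit threshold.

    warm=True counts forecasts >= +threshold; warm=False counts forecasts <= -threshold.
    """
    if warm:
        hits = sum(1 for f in forecast_anomalies if f >= threshold)
    else:
        hits = sum(1 for f in forecast_anomalies if f <= -threshold)
    return hits / len(forecast_anomalies)

# Hypothetical forecasts for seven warm and seven cold onsets at a single lead.
warm_forecasts = [0.60, 0.45, 0.43, 0.10, 0.50, 0.70, 0.30]
cold_forecasts = [-0.50, -0.20, 0.10, -0.44, -0.30, 0.00, -0.10]

warm_accuracy = onset_accuracy(warm_forecasts, warm=True)    # 4 hits out of 7
cold_accuracy = onset_accuracy(cold_forecasts, warm=False)   # 1 hit out of 7
```

With only seven episodes per phase, each hit changes accuracy by about 14 percentage points, so small differences between leads or tools should be read with caution.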