Atmosphere-driven cold SST biases over the western North Pacific in the GloSea5 seasonal forecast system

The predictability of the sea surface temperature (SST) in seasonal forecast systems is crucial for accurate seasonal predictions. In this study, we evaluated the prediction of SST in the Global Seasonal forecast system version 5 (GloSea5) hindcast, particularly focusing on the western North Pacific (WNP), where the SST can modify atmospheric convection and the East Asian weather. GloSea5 has a cold SST bias in the WNP that grows over at least 7 months. The bias originates from the surface net heat flux. At the beginning of model integration, the ocean receives excessive heat from the atmosphere because of the predominant positive bias in the downward shortwave radiation (SW), which rapidly decreases within a few days as cloud cover builds. Then, the negative bias in the latent heat (LH) flux increases over time and induces a negative bias in the surface net heat flux. Although the magnitude of the negative bias in LH flux gradually decreases, it remains the most significant contributor to the negative net heat flux bias for more than 250 days. Uncoupled ocean model experiments showed that the ocean model is unlikely to be the primary source of the SST bias.


Introduction
Seasonal forecast systems (SFSs) aim to provide an overview of the atmospheric and oceanic conditions a few months in advance (Shukla et al. 2000; Troccoli 2010). SFSs have evolved to become more complex over the last couple of decades by including various components to more accurately represent the climate system (Wang et al. 2009). Consequently, the performance of SFSs has improved for predicting important variables such as temperature and precipitation for the next season, potentially allowing us to mitigate socioeconomic damage caused by extreme weather events. As the demand for accurate seasonal forecasting has increased, several institutions worldwide have been developing and operating SFSs (e.g., Troccoli 2010).
SFSs focus on predictions on a timescale of months, wherein the interactions of the components of the Earth system cannot be neglected (Neumann et al. 2019), and sea surface temperature (SST) plays a key role in the interactions between the atmosphere and ocean. Changes in the SST affect upper ocean stratification, vertical mixing, and eventually, the exchange of heat, momentum, and tracers between the surface and subsurface ocean. SST modulates the surface heat fluxes that affect the atmospheric boundary layer and subsequently change atmospheric circulation and precipitation both locally and remotely (Bayr et al. 2019; Garfinkel et al. 2020; Ashfaq et al. 2011; Jong et al. 2018; Keeley et al. 2012).
SST biases are common in SFSs, as they can be accumulated during integration when air-sea interactions are not well represented. For example, some coupled general circulation models (CGCMs) suffer from warm SST biases in the tropical southeastern Atlantic, and cold biases in the equatorial cold tongue (Li and Xie 2012; Wang et al. 2014; Toniazzo and Woolnough 2014; Richter 2015; Zuidema et al. 2016; Voldoire et al. 2019). Warm SST biases in the tropical Atlantic Ocean are attributed to errors in the surface heat flux (Toniazzo and Woolnough 2014), whereas incorrect zonal wind can cause cold SST biases in the Pacific cold tongue by inducing strong upwelling (Vannière et al. 2013). Another example is cold biases in the climate models over the North Pacific, which are attributed to either incorrect local surface heat fluxes or the remote influence from North Atlantic biases (Zhang and Zhao 2015; Wang et al. 2018).
The western North Pacific (WNP; 120-160°E and 0-30°N in the present study) is a key region that displays local and remote influences on atmospheric circulation. It hosts the western Pacific Warm Pool, which has the highest SST across the globe (Yan et al. 1992), and is the location where deep convection develops and distributes energy to remote regions (Nitta 1987; Park and An 2014; Park et al. 2017). Major ocean currents, such as the Kuroshio Current and Indonesian Throughflow, transport large amounts of heat out of this region (Dunxin and Maochang 1991; Jo et al. 2014). Furthermore, atmospheric or oceanic variabilities in the WNP can affect other tropical regions via the Walker circulation and extra-tropical regions via the Hadley circulation (Park and An 2014). For example, the ocean-atmosphere interactions in the WNP play a key role in the Pacific-East Asian teleconnection (Wang et al. 2000). Additionally, SST variabilities in the western Pacific Warm Pool influence the East Asian summer monsoon and related rainfall (Huang et al. 2003), in addition to the geopotential height in the North Pacific and North America via atmospheric teleconnections (Park et al. 2017). Hence, the performance of SFSs in forecasting SSTs in the WNP must be carefully evaluated to improve the seasonal prediction for both local and remote regions.
In the present study, we evaluated the ability of the Korea Meteorological Administration (KMA) Global Seasonal forecast system version 5 (GloSea5; MacLachlan et al. 2015) to predict the SST in the WNP at a seasonal time scale. GloSea5 provides seasonal forecasts for several institutions, including the U.K. Meteorological Office and KMA. This operational system produced a hindcast from 1991 to 2016 by integrating the coupled system for 7 months starting from reanalysis or data-assimilated states. The SST in the GloSea5 hindcast is anticipated to be consistent with observations within the time scale over which the ocean state still remembers the initial condition. However, the extent and growth rates of SST biases beyond this time scale remain unknown, especially in the WNP. For this reason, we first evaluated the SST biases in the WNP for the GloSea5 hindcast. Then, we used the surface heat flux dataset to explore how air-sea interactions create and develop SST biases in the WNP. Finally, we used ocean-only simulations to identify the main cause of SST bias development.
The remainder of this paper is structured as follows: After the description of the data and method in Sect. 2, the SST biases in the GloSea5 hindcast are identified in Sect. 3. The causes of the SST biases are explored in Sect. 4, followed by a description of the processes of rapid error development in Sect. 5. The paper ends with a discussion and conclusions in Sect. 6.

Data and methods
We examined the SST predictability using the 1991-2016 hindcast data of GloSea5 (MacLachlan et al. 2015; Williams et al. 2015; Rae et al. 2015). The GloSea5 hindcast is a mean of three ensemble members that are initialized on the 1st, 9th, 17th, and 25th of each month, and then integrated for 7 months. The initial conditions of the atmosphere and land surface were obtained from the European Centre for Medium-Range Weather Forecasts ERA-Interim project (ERA-Interim; Dee et al. 2011). Those for the ocean and sea ice were created by NEMOVAR (Mogensen et al. 2009).
We evaluated the SST in GloSea5 against the U.K. Met Office Hadley Centre's sea ice and SST dataset (HadISST), using the monthly mean SST with a spatial resolution of 1° (Rayner et al. 2003). The surface fluxes in GloSea5 were also evaluated as they are potential sources of SST biases. The surface heat and momentum fluxes in GloSea5 were compared to those in the ERA-Interim reanalysis dataset, which has been previously assessed and matched with observations in various places in the global ocean (Brunke et al. 2011; Bentamy et al. 2017; Bharti et al. 2019; Yu 2019). Moreover, ERA-Interim was a natural choice for the evaluation because it provides the initial conditions for the atmospheric fields of the GloSea5 hindcast.
To assist with the source identification of the SST bias in GloSea5, we compared the GloSea5 hindcast to the simulations using the ocean model component of GloSea5, with the same horizontal/vertical resolution. We launched 12 cases, each for a total period of 7 months using the atmospheric conditions from the Drakkar Forcing Sets (Dussin et al. 2016), which is based on ERA-Interim. These simulations do not consider active air-sea interactions but are forced by consistent global forcing that suppresses the introduction of error from surface fluxes.
We analyzed the SST biases through the lead time, defined as the time difference between the prediction and initialization. For example, the SST bias with a lead time of one month represents the mean SST difference between the first month of each hindcast integration and the HadISST data corresponding to that month. As no data were available for several months of 1991 in cases when the lead time was greater than one month, we conducted the analysis from 1992 to 2015 (288 months). The surface heat and momentum fluxes in GloSea5 were compared to ERA-Interim data in the same manner.
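The lead-time convention described above can be illustrated with a minimal sketch. It assumes a simplified layout that is not part of the original analysis: one ensemble-mean run per start month, area-averaged monthly SST stored in a hypothetical array `hindcast[start, lead]`, and a matching observed series `obs`. The data are synthetic and serve only to show the indexing.

```python
import numpy as np

def lead_time_bias(hindcast, obs):
    """Mean bias (model minus observation) for each lead time.

    hindcast[s, k]: monthly mean SST of the run started in month s, in its
                    (k+1)-th month of integration (lead time of k+1 months).
    obs[m]:         observed monthly mean SST for month m.
    """
    n_starts, n_leads = hindcast.shape
    starts = np.arange(n_starts)
    bias = np.full(n_leads, np.nan)
    for k in range(n_leads):
        # The k-th forecast month of a run started in month s verifies at month s + k.
        valid = starts[starts + k < obs.size]
        bias[k] = np.mean(hindcast[valid, k] - obs[valid + k])
    return bias

# Synthetic data (not GloSea5 output): a drift of -0.1 degC per month of lead time.
rng = np.random.default_rng(0)
obs = 28.0 + rng.normal(0.0, 0.3, size=24)
hindcast = np.full((24, 7), np.nan)
for s in range(24):
    for k in range(7):
        if s + k < obs.size:
            hindcast[s, k] = obs[s + k] - 0.1 * (k + 1)

print(np.round(lead_time_bias(hindcast, obs), 2))
```

Start months whose target month falls outside the observed record are skipped, which mirrors the reason given above for restricting the analysis period.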

Prediction skill of SST
First, we evaluated the SST difference between GloSea5 and HadISST at a lead time of one month immediately after model initialization. Apart from the near coastal regions, the magnitude of the SST differences between GloSea5 and HadISST averaged over 288 months was less than 0.5 °C (Fig. 1a). This was expected as the SST was updated using observations through data assimilation when starting the model. As the integration continued, the SST biases in the North Pacific grew and became organized. The SST at a lead time of six months, which is used for seasonal predictions, had an overall cold bias in the WNP and a warm bias in the northeast Pacific, and the magnitude of the bias exceeded 1.0 °C in some areas (Fig. 1b). A pattern with strong cold biases appeared in the equatorial cold tongue, which has also been observed in other CGCMs (Vannière et al. 2013).
The SST biases in the WNP (black box in Fig. 2a) were initially negative in some areas and became more negative over the 7 months of integration (Fig. 2a-g). The largest negative SST bias was found near 20°N, where its magnitude exceeded 1 °C (Fig. 2g). The box plots of the SST bias over the WNP showed an interquartile range entirely below zero after a lead time of 2 months, indicating that cold biases occurred in 75% or more of the entire 288-month period (Fig. 3a). The growth in the negative SST bias meant that almost all months had colder SSTs than the observations at a lead time of 7 months.
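The link between a box lying below zero and the share of cold months can be checked with a short sketch. The sample below is synthetic (a hypothetical normal distribution, not GloSea5 data); only the sample size of 288 months is taken from the text.

```python
import numpy as np

def quartiles_and_cold_fraction(bias):
    """Return (q25, median, q75, fraction of negative values) of a bias sample."""
    bias = np.asarray(bias, dtype=float)
    q25, q50, q75 = np.percentile(bias, [25, 50, 75])
    return q25, q50, q75, np.mean(bias < 0.0)

# Synthetic stand-in for 288 monthly WNP-mean SST biases at one lead time.
rng = np.random.default_rng(1)
sample = rng.normal(loc=-0.5, scale=0.3, size=288)

q25, q50, q75, cold_fraction = quartiles_and_cold_fraction(sample)
print(f"IQR: [{q25:.2f}, {q75:.2f}] degC, cold months: {100 * cold_fraction:.0f}%")
```

When the upper quartile is negative, at least three quarters of the months in the sample have a cold bias, which is the reading applied to Fig. 3a above.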
This cold bias was similar to that in climate models. The ensemble mean of historical runs of CMIP5 models has lower SSTs in the North Pacific than the observations (Wang et al. 2014; Richter 2015; Wang et al. 2018). UKESM, an Earth system model based on HadGEM3, also shows a similar long-term SST bias. The temporal averages of SST bias from 1992 to 2014 against the HadISST dataset were generally negative over a broad area of the WNP, with the pattern and size being comparable to the SST bias with a lead time of 7 months in GloSea5 (Fig. 2g, h). Since UKESM and GloSea5 share the same model framework, one can argue, based on the growth in the SST bias in GloSea5, that the SST bias in the UKESM historical run develops rather quickly over several months. Furthermore, the processes responsible for the SST bias in GloSea5 may also apply to UKESM.
The development of SST bias differed depending on the starting month. The monthly averaged SST bias with a lead time of 1 month fluctuated around zero, but that with a lead time of 6 months was negative and varied with the season (Fig. 3b). The negative SST bias with a lead time of 6 months was greater during the boreal autumn and winter than during spring and summer, i.e., its magnitude was greater than 0.6 °C from September to December but lower than 0.5 °C from February to June. This indicated a rapid development of overall negative biases in the hindcasts started in June, in contrast with the relatively slow bias development in those started in December (Fig. 3c, d).

Possible causes of errors
We attempted to identify the cause of the cold SST bias in the WNP from the surface heat flux in the GloSea5 hindcast. The comparison of the surface net heat flux with ERA-Interim showed both weak negative and positive values in the WNP with a lead time of 1 month. The bias then became negative overall in the second month (Fig. 4a, b). The daily surface net heat flux ($Q_{net}$) was partitioned into shortwave radiation (SW), longwave radiation (LW), sensible heat (SH), and latent heat (LH) fluxes, and we examined each of them against ERA-Interim data (Fig. 4c). Except for the initial few days, when the ocean received more heat from the atmosphere because of a positive SW bias, the $Q_{net}$ bias averaged over the WNP was negative for most of the hindcast period, indicating that the ocean in GloSea5 receives less heat from the atmosphere than that suggested by ERA-Interim. The excessive heat loss by LH was the primary reason for the negative $Q_{net}$ bias. The biases of the LW and SH were relatively small; thus, these were not the main source of the negative SST bias. The SW bias gradually increased and the magnitude of the LH bias gradually decreased, which reduced the magnitude of the negative $Q_{net}$ bias at lead times beyond 60 days. However, the persistently negative $Q_{net}$ bias led to an increasing cold SST bias in the WNP over time.
Fig. 3 a Box plots for the annual mean SST biases (°C) averaged in the WNP (black box in Fig. 2a) with a lead time of 1-7 months. b Monthly mean SST biases (°C) averaged in the WNP, with a lead time from 1 (red) to 6 (blue) months. c, d Box plots for WNP SST biases of the June (December)-start models (°C) with a lead time of 1-7 months. Box plots denote median biases (orange lines) and interquartile ranges (IQR; boxes). The whiskers denote the range of biases, except outliers. Open circles denote the outliers.
Caption fragment: Positive and negative signs indicate fluxes into and out of the ocean, respectively.
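The partition of the net heat flux bias into its four components can be sketched as simple bookkeeping. The function name and the numbers below are hypothetical and chosen only for illustration; the sign convention (positive into the ocean) follows the figure caption.

```python
def net_heat_flux_bias(sw, lw, sh, lh):
    """Sum component biases (W/m^2, positive into the ocean) into a Q_net bias
    and report which component contributes most in the direction of that bias."""
    components = {"SW": sw, "LW": lw, "SH": sh, "LH": lh}
    q_net = sum(components.values())
    sign = 1.0 if q_net >= 0.0 else -1.0
    # The dominant term is the one with the largest contribution of the same sign as Q_net.
    dominant = max(components, key=lambda name: sign * components[name])
    return q_net, dominant

# Hypothetical component biases, not values from the hindcast.
q_net, dominant = net_heat_flux_bias(sw=5.0, lw=1.0, sh=-1.0, lh=-20.0)
print(q_net, dominant)  # -15.0 LH
```

In this made-up case a positive SW bias partly offsets the LH bias, yet the net bias stays negative and LH is identified as the leading term, which is the situation described for most of the hindcast period.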
The excessive heat loss through LH accounted for most of the $Q_{net}$ bias, as shown above. LH was calculated by the following equation:
$$\mathrm{LH} = \rho_{air} L_e C_L u_{10} \left( q_{air} - q^{*}(\mathrm{SST}) \right) \qquad (1)$$
where $\rho_{air}$ is the air density at the surface, $L_e$ is the latent heat of evaporation, $C_L$ is a stability-dependent bulk transfer coefficient for water vapor, $u_{10}$ is the wind speed at the height of 10 m, and $q_{air} - q^{*}(\mathrm{SST})$ is the difference between the specific humidity in the atmosphere and that at saturation for a given SST. Eq. (1) suggests that a negative LH bias can originate from an overestimated 10 m wind speed ($u_{10}$), underestimated specific humidity of the atmosphere ($q_{air}$), and/or overestimated SST, leading to erroneously high saturation specific humidity at the sea surface ($q^{*}(\mathrm{SST})$). Quantifying the contributions of these terms at the daily timescale would be beneficial but was not feasible because only monthly averaged wind fields were available. The LH bias was further analyzed for the hindcasts started in December and June, as the SST bias differed depending on the starting month (Fig. 3b).

Mean LH bias in the hindcast started in December
In winter, the underestimated $q_{air}$ had a relatively larger impact on the negative LH bias in the northern part of the WNP (Fig. 5). The LH and specific humidity biases were both negative in the WNP in the first month of the hindcast (Fig. 5a, b). The negative SST bias lowered the saturation specific humidity ($q^{*}(\mathrm{SST})$), potentially suppressing the excessive LH. This feedback, however, was not visible in the first 2 months because the magnitude of the LH bias increased even with a negative SST bias in the following month (Fig. 5d, e). The southerly wind anomaly in the northern part of the WNP (Fig. 5c, f) may weaken the northeasterly wind in winter (Fig. S1(a,b) in Supplementary Information), suggesting that it is unlikely that the negative LH bias stemmed from a wind speed bias. Hence, the dry atmosphere in GloSea5 plays a key role in the negative LH bias in winter.
The wind speed bias in winter drives the growth of the negative LH bias in the southern part of the WNP. There, where the prevailing wind blows toward the southwest (Fig. S1(a,b) in Supplementary Information), the northeasterly wind anomalies strengthen the wind (Fig. 5c, f), causing excessive LH release from the surface in combination with the effect of the $q_{air}$ bias. Considering that the $q_{air}$ bias does not grow in the southern part of the WNP in the second month, the growth of the LH bias can be attributed to the increased wind speed.
In the December-start hindcasts, the largest negative LH bias occurs in the third month (Fig. S2 in the Supplementary Information). This may be related to the SST bias, which grows fastest at lead times of 3-5 months within the 7-month integration (Fig. 3d). The persistently negative $Q_{net}$ and LH biases in the following months force the negative SST bias to grow, although their magnitudes decrease over time.
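The sensitivity of Eq. (1) to wind speed, air humidity, and SST can be explored with a minimal sketch. Everything numerical here is an assumption made for illustration and not taken from GloSea5: a constant transfer coefficient (the model's coefficient is stability dependent), fixed air density and surface pressure, the Bolton (1980) approximation for saturation vapor pressure, and a 0.98 factor for salinity.

```python
import math

RHO_AIR = 1.2    # kg m^-3, assumed near-surface air density
L_E = 2.5e6      # J kg^-1, latent heat of evaporation
C_L = 1.2e-3     # assumed constant bulk transfer coefficient
P_SURF = 1013.0  # hPa, assumed surface pressure

def q_sat(sst_c, p_hpa=P_SURF):
    """Saturation specific humidity (kg/kg) over seawater for SST in degC."""
    e_s = 6.112 * math.exp(17.67 * sst_c / (sst_c + 243.5))  # hPa, Bolton (1980)
    e_s *= 0.98  # commonly used reduction for salinity
    return 0.622 * e_s / (p_hpa - 0.378 * e_s)

def latent_heat_flux(u10, q_air, sst_c):
    """Eq. (1) in W/m^2; negative values mean heat lost by the ocean."""
    return RHO_AIR * L_E * C_L * u10 * (q_air - q_sat(sst_c))

# Hypothetical reference state and one-at-a-time perturbations.
base = latent_heat_flux(u10=6.0, q_air=0.018, sst_c=28.0)
windier = latent_heat_flux(u10=7.0, q_air=0.018, sst_c=28.0)
drier = latent_heat_flux(u10=6.0, q_air=0.017, sst_c=28.0)
cooler_sst = latent_heat_flux(u10=6.0, q_air=0.018, sst_c=27.5)

for name, value in [("base", base), ("windier", windier),
                    ("drier", drier), ("cooler SST", cooler_sst)]:
    print(f"{name:>10s}: {value:7.1f} W/m^2")
```

Stronger wind and drier air both make LH more negative, whereas a cooler sea surface makes it less negative, consistent with the sign arguments made for Eq. (1).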

Mean LH bias in the hindcast started in June
In summer, the bias in the wind speed rather than the specific humidity explained the growth in the LH bias. The hindcast started in June exhibited a rapid change in the SST bias from positive to negative in the first two months (Fig. 3c). In this period, the negative $q_{air}$ bias subsided and even became positive (Fig. 6b, e), especially in the eastern part of the study domain (black box in Fig. 6a). On its own, this change could result in a positive LH bias. On the other hand, the anomalous southerly 10-m wind in the second month (Fig. 6f) enhanced the seasonal southwesterly, which could enhance the negative LH bias. As the GloSea5 hindcast had a negative LH bias that grew over time, it became clear that the wind speed bias played a more critical role in introducing the negative LH and SST biases in summer.
According to Wang et al. (2000) and Tao et al. (2017), the strong southwesterly wind anomaly in the WNP may be associated with a remote response to a cold SST anomaly in the equatorial East Pacific (EP). When the EP becomes colder, similar to that under La Niña conditions, precipitation in the Central Pacific (CP) is suppressed, which drives an anticyclonic circulation in the CP and a cyclonic circulation in the WNP. These circulation changes eventually develop a southwesterly wind anomaly in the equatorial western Pacific. These processes seem to explain the bias patterns in GloSea5, particularly for the hindcast started in June (Fig. 7). In the second month, the GloSea5 hindcast showed a cold SST bias in the EP region and a cyclonic 10-m wind bias in the WNP (Fig. 7a), which is consistent with the observed patterns. The negative bias of the total precipitation rate in the CP and the positive bias related to the cyclonic circulation anomaly in the WNP are also in line with the anomaly patterns suggested in previous studies (Fig. 7b).

Ocean model bias
To confirm that surface forcing was the main source of the SST bias in the WNP in GloSea5, we evaluated the SST of the uncoupled ocean model, NEMO, integrated from the same initial conditions as for GloSea5, but forced by reanalysis at the surface. We ran 12 cases that started on the first day of each month in 2003; this year showed a similar bias growth to that averaged over the entire period (Fig. 8). The horizontal distribution of the monthly mean SST bias with a lead time of one month was similar in both the hindcast and the ocean model (Fig. 9a, c). After six months of integration, a negative SST bias appeared near the equatorial Pacific in NEMO, similar to that in GloSea5, although the negative bias areas were limited to the eastern part (Fig. 9b, d). In the WNP, however, NEMO showed a weak positive SST bias (< 0.5 °C), in contrast with the negative SST biases in GloSea5. These results suggested that the ocean dynamics and thermodynamics in the ocean model are unlikely to cause a cold SST bias when the atmospheric reanalysis forces the ocean model. Thus, it is reasonable to argue that the bias in the surface heat flux is the primary source of the cold SST bias in the WNP in GloSea5.

Rapid development of the surface heat flux bias
In Sect. 4, we confirmed that the negative $Q_{net}$ bias, with the most significant contribution from the LH bias, was the primary source of the SST bias in the WNP. However, the initial $Q_{net}$ bias was positive before becoming negative shortly thereafter, and the SW bias significantly contributed to this transition (red curve in Fig. 4c). This transition can be confirmed by the spatial maps of biases on the 1st and 31st days of integration (Fig. 10). The positive SW bias largely explained the positive $Q_{net}$ bias on the 1st day of integration (Fig. 10a-c). On the 31st day, the $Q_{net}$ bias became negative in most regions, especially in the Philippine Sea, where the negative bias exceeded 30 W/m² in magnitude. This transition in $Q_{net}$ bias can be attributed to both the reduction in SW bias and the growth of LH bias; the mean change in the $Q_{net}$ bias was −39 W/m² over 30 days of lead time, and those in the SW and LH were −23 and −18 W/m², respectively, in our study area. The rapid reduction in SW was associated with clouds in the model. Clouds with a high albedo reflect solar radiation and reduce the SW incident on the sea surface. The cloud development was estimated using the cloud radiative effect (CRE), defined as the difference in upward radiation between all-sky (with clouds) and clear-sky (without clouds) conditions at the top of the atmosphere (TOA):
$$\mathrm{CRE}_{SW} = \mathrm{SW}_{all\text{-}sky}(\mathrm{TOA}) - \mathrm{SW}_{clear\text{-}sky}(\mathrm{TOA}) \qquad (2)$$
As a positive SW(TOA) represents an upward flux, clouds with high albedo result in a more positive $\mathrm{CRE}_{SW}$. The outgoing longwave radiation is affected by the existence of clouds as well, which can be quantified by $\mathrm{CRE}_{LW}$, defined analogously from the all-sky and clear-sky upward longwave radiation at the TOA. Clouds can result in a negative $\mathrm{CRE}_{LW}$ as they suppress the outgoing longwave radiation.
In GloSea5, both $\mathrm{CRE}_{SW}$ and $\mathrm{CRE}_{LW}$ increased in magnitude during the first 31 days (Fig. 11). The difference in $\mathrm{CRE}_{SW}$ between the 31st and 1st days ($\Delta\mathrm{CRE}_{SW}$) was positive in the WNP, indicating that the amount of clouds increased during the first month of simulation. The clouds were concentrated near the equator, where active convection is anticipated, as seen in the maximum positive $\Delta\mathrm{CRE}_{SW}$ (Fig. 11a). This region coincides with that in which a negative SW bias developed in Fig. 10e. The $\mathrm{CRE}_{LW}$ also suggested the accumulation of clouds in the WNP within a month after initialization (Fig. 11b). A closer investigation revealed that $\mathrm{CRE}_{SW}$ and $\mathrm{CRE}_{LW}$ developed quickly within a few days (Fig. 11c), suggesting that the GloSea5 hindcast undergoes an adjustment after initialization. The GloSea5 hindcast obtains its initial conditions from independent sources; the atmospheric initial conditions are obtained from ERA-Interim, whereas NEMOVAR provides the initial conditions of the ocean and sea ice. We expect that erroneous air-sea fluxes can occur and induce errors in both the atmospheric and oceanic components while the coupled atmosphere-ocean state adjusts toward a balanced state. Hence, starting the hindcast from balanced atmospheric and oceanic states may reduce the rapid growth in the $Q_{net}$ bias and, eventually, the SST bias.
Caption fragment (Fig. 5): Same as a-c, but with a lead time of 2 months. The biases were calculated with respect to ERA-Interim. Dotted regions indicate the significance of the correlation coefficients between LH bias and $q_{air}$ bias at the 95% confidence level. The wind vectors indicate the differences between GloSea5 and ERA-Interim data.
Fig. 6 Same as Fig. 5, except that the data were from the start of June (0601).
Fig. 7 a July-mean SST bias (shading; °C) and 10-m wind bias (stream line) of GloSea5 from the start of June (0601) with a lead time of 2 months. b July-mean total precipitation rate bias (kg m⁻² s⁻¹) for the same data. The bias of the SST was calculated with respect to HadISST. The biases of 10-m wind and total precipitation rate were calculated with respect to ERA-Interim.
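The CRE diagnostic used in this section can be sketched in a few lines. The function below applies the all-sky minus clear-sky definition of Eq. (2) to upward TOA fluxes, and the same form is assumed here for the longwave component. The flux values are hypothetical and only illustrate the sign behaviour.

```python
import numpy as np

def cloud_radiative_effect(up_all_sky, up_clear_sky):
    """CRE at the TOA: upward all-sky flux minus upward clear-sky flux (W/m^2)."""
    return np.asarray(up_all_sky, dtype=float) - np.asarray(up_clear_sky, dtype=float)

# Hypothetical upward TOA fluxes (W/m^2) on day 1 and day 31 of a run.
sw_all, sw_clear = [100.0, 120.0], [80.0, 80.0]
lw_all, lw_clear = [250.0, 235.0], [280.0, 280.0]

cre_sw = cloud_radiative_effect(sw_all, sw_clear)  # clouds reflect more SW: positive
cre_lw = cloud_radiative_effect(lw_all, lw_clear)  # clouds trap LW: negative

# Change between the last and first day, as in the Delta CRE maps.
d_cre_sw = cre_sw[-1] - cre_sw[0]
d_cre_lw = cre_lw[-1] - cre_lw[0]
print(d_cre_sw, d_cre_lw)  # 20.0 -15.0
```

A positive change in the shortwave CRE together with a negative change in the longwave CRE means that both grow in magnitude, which is the signature interpreted above as increasing cloud amount.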

Discussion and conclusion
GloSea5 is a seasonal forecast system used by many operational centers, and the diagnostics of the SST bias and possible errors in this system are critical as they can influence air-sea interactions and seasonal predictions. In this study, we found a negative SST bias in the western North Pacific (WNP) in the GloSea5 hindcast compared to satellite-based observations. The initial SST bias was not significant, as the initial condition of the ocean is the product of data assimilation. However, a cold bias developed over the 7 months of integration, after which the SST bias became comparable to that in the multiyear simulation of the climate model. This mainly occurred because of the lower net heat uptake of the ocean, with the greatest contribution from excessive latent heat (LH) loss. The ocean initially received anomalously high shortwave radiation (SW), followed by a sharp reduction with the accumulation of clouds. Within a few days, the magnitude of the LH bias became larger than that of the SW bias, and the sign of the net heat flux ($Q_{net}$) bias changed. Although the $Q_{net}$ bias tapered slowly after 31 days because of the increase in positive SW bias and the decrease in negative LH bias, it was still negative; thus, the SST cold bias continued to develop.
In winter, the negative LH bias was caused by the lower atmospheric specific humidity in GloSea5, whereas in summer it was related to anomalous wind. For the hindcast started in June, we identified a southwesterly wind anomaly at a lead time of 2 months, which enhances the southerlies in GloSea5. Strong wind results in excessive LH being released at the sea surface. When more latent heat is released than that observed, water vapor in the atmosphere is expected to increase. In summer, the bias of the specific humidity became positive in the second month of the integration, particularly in the northeastern part of the domain of interest, as southwesterly winds transported the water vapor. This combination of excessive latent heat loss and increased water vapor was not evident in winter. In that season, the wind anomaly tended to bring relatively dry air into the domain of interest, promoting anomalously strong LH loss.
The rapid change in the errors in the SW and $Q_{net}$ suggested that the initial shock could have introduced errors in the hindcast. The imbalance in the atmospheric and oceanic states could have induced unwanted processes in either the atmosphere or ocean, leading to erroneous fluxes between them (Rosati et al. 1997; Zhang et al. 2020). In the GloSea5 hindcast, the amount of clouds quickly increased within a couple of days after initialization, thereby reducing the excessive incoming solar radiation to the ocean. The accumulation of clouds reduced the outgoing longwave radiation, which warmed the sea surface. However, this had a smaller impact than the change in the incoming SW radiation. This error may be resolved if coupled data assimilation is applied to the GloSea5 hindcast (Sugiura et al. 2008; Mulholland et al. 2015).
The evolution of the SST bias in the WNP highlights the interactions between the atmosphere and ocean. Initially, the ocean responded to the surface heat flux; the accumulation of clouds and increasing LH loss drove the negative SST bias. The growth in the negative LH bias stagnated after 1 month and then gradually decreased after approximately 60 days of integration, which suggests that negative feedback occurred. As the negative SST bias grew, the saturation water vapor pressure decreased, resulting in a gradual decrease in the negative LH bias. Although this negative feedback slowed the growth of the negative SST bias, it was not strong enough to reverse the bias within 7 months. The uncoupled ocean model experiments, which did not show cold SST biases in the WNP, indicate that surface forcing plays an important role in the occurrence of an SST bias. However, we could not completely rule out ocean dynamics and thermodynamics as sources of the SST bias. As biases in the surface forcing can alter ocean circulation, the circulation in the GloSea5 hindcast and that in the ocean model are not identical; hence, it is possible that the biases of the surface fluxes were passed to the ocean, eventually contributing to the development of the SST bias.
The WNP overlaps the Pacific Warm Pool, where strong convection occurs. Through the Walker and Hadley circulations, the conditions of the WNP can affect a wide area. The East Asian monsoon, which is vital for inducing precipitation in East Asia, is also closely related to the SSTs in this region (Huang et al. 2003). As a cold SST bias exists in the WNP, the model may have underestimated the convection in this region and might have further reduced the atmospheric overturning circulation, thereby degrading the seasonal predictability in several regions, including East Asia.
Through a correction of the SST bias, we can expect an improvement in seasonal forecasting performance. For example, many CGCMs have systematic SST biases in the tropical Atlantic, and surface heat flux is the cause of this bias (Toniazzo and Woolnough 2014). After diagnosing SST biases, Dippe et al. (2019) corrected the surface heat flux and obtained improved hindcast skill. The SST diagnostics presented in this study can thus be useful for improving the seasonal prediction accuracy over the WNP in GloSea5.
The results of this study may be applied to other CGCMs with cold SST biases in the WNP. The magnitude of SST biases in GloSea5 is comparable to that in the multiyear simulations of UKESM, which shares the same model framework. The multi-model mean of historical runs of CMIP5 models also showed a cold bias in the WNP (Wang et al. 2014; Richter 2015; Wang et al. 2018). If the results of this study hold for climate models, their biases may develop during the early stage (seasonal timescale) of the integration and could be alleviated by a better treatment of the surface heat flux.

Supplementary Information
The online version contains supplementary material available at https://doi.org/10.1007/s00382-022-06228-x.
Fig. 11 a, b Change in the cloud radiative effect (CRE; W/m²) on SW (LW) from lead times of 1-31 days. Contour lines mark 10 W/m² intervals. c CRE growth (W/m²) with respect to the lead time for SW and LW in the WNP region (black boxes in a, b)