The representation of dry-season low-level clouds over Western Equatorial Africa in reanalyses and historical CMIP6 simulations

Within the equatorial zone, Western Equatorial Africa (WEA) has a record low sunshine duration during the June–September dry season due to the persistence of low clouds. This study examines the ability of two reanalysis products (ERA5 and MERRA-2) and eight CMIP6 models (both coupled and atmosphere-only historical simulations) to reproduce the climatology of these low clouds, by comparing it with ground observations and a satellite product. All datasets show a reasonable representation of the regional distribution of low clouds over the Tropical Atlantic and the neighbouring African continent. However, CMIP6 models tend to underestimate the low cloud fraction, especially over WEA in the coupled simulations. This underestimation is partly due to an insufficient seasonal sea-surface temperature (SST) cooling over the Eastern Equatorial Atlantic from April to July in most models, which reduces the lower-tropospheric stability (LTS). However, the inability to reproduce the JJAS low cloud fraction does not necessarily scale with the SST biases of the CMIP6 models. Observed interannual variations of WEA low-cloud fraction are strongly controlled by LTS, itself mostly related to Atlantic SST. The strong dependence of low clouds on interannual SST variations is captured by most, but not all the CMIP6 models. Additional drivers of interannual variations identified in this study, such as mid-tropospheric temperatures over WEA and Bight of Bonny surface winds, emerge inconsistently in CMIP6. Further analyses are needed to disentangle the roles played by SST and independent atmospheric forcings on WEA low cloud formation.


Introduction
Much of climate research focusing on tropical land areas is dedicated to precipitation variability and its driving mechanisms. Dry seasons generally attract less attention, since they are considered as periods of relatively steady, stable and cloudless weather. In Western Equatorial Africa (WEA), especially in Gabon and southern Congo-Brazzaville, the austral winter season from June to September is rainless but is unexpectedly also the time of the year showing the least sunshine and the highest cloud amounts (Bush et al. 2019;Philippon et al. 2021). Over the Mayombe range in Congo-Brazzaville, mean dry season sunshine duration is under one hour per day (Bouka-Biona et al. 1993). The extensive low cloud cover found during this season is associated with relatively low daytime temperatures, which reduces evapotranspiration and is likely to play a key role in the survival of an evergreen forest in much of the region, despite the long dry season of up to 4 months (Clairac et al. 1989;Couralet et al. 2010;Philippon et al. 2019;Réjou-Méchain et al. 2021).
Dry season low-level clouds in WEA tend to be highly persistent, with occurrences above 80% in Gabon based on synoptic observations (Dommo et al. 2018). They usually take the form of a stratocumulus cloud deck with maximum cloud fraction near 850-900 hPa (Dommo et al. 2018). This cloud deck occurs within the southwesterly monsoon flow, which along the coast is about 1000-2000 m deep during this season (Lacaux et al. 1992). The monsoon flow is topped by a temperature inversion (Berruex 1958;Tschirhart 1959;Trewartha 1981). While stratocumulus cloud decks are recurrent features over the eastern flanks of subtropical anticyclones covering oceanic areas (Klein and Hartmann 1993;Wood 2012;Eastman and Warren 2014), they are less common over equatorial land areas. They are found in Western Africa along the Gulf of Guinea coastal zone (Knippertz et al. 2011;Schrage and Fink 2012;Schuster et al. 2013;Dione et al. 2019;Hannak et al. 2017;Danso et al. 2020), also from July to September, but their frequency is not as high as in WEA. A stratocumulus cloud cover is also a recurrent feature in austral winter at Nairobi (Kenya), on the east-facing (windward) slopes of the East African Highlands, suggesting a key role of orography (Camberlin 2018).
Although the WEA low clouds are not strictly a coastal phenomenon, since they extend to about 300-400 km inland (Dommo et al. 2018;Philippon et al. 2019), their location east of the Equatorial Atlantic Ocean at a time when the seasonal coastal and equatorial upwelling reaches its maximum intensity suggests that sea surface temperature (SST), combined with the onshore low-level flow of relatively cool and moist air, is an important control. However, the mechanisms of the low cloud formation over this region and its representation in models are not well known. Further south over the Southeast Atlantic Ocean, there is still some uncertainty on the exact role played by the cool marine surface, given the complex feedbacks between clouds and surface temperature (Philander et al. 1996;Trzaska et al. 2007;Bellomo et al. 2015). Adebiyi and Zuidema (2018) found that both meteorological factors (including atmospheric stability and large-scale subsidence) and aerosol forcing (biomass-burning aerosols from Southern Africa) contribute to the development of the persistent low-level cloud deck over this area, while Koseki and Imbol Koungue (2021) showed that March-April low-level clouds were enhanced in years of cold SST anomalies along the Angolan coast (so-called 'Benguela Niña' events, Shannon et al. 1986). Even further south, for the coastal Namib Desert, Andersen et al. (2020) showed that fog and low-cloud occurrence is controlled by variations in cloud-top longwave cooling and the strength of the onshore flow, both modulated by synoptic-scale disturbances.
Uncertainty also revolves around the ability of global circulation models (GCMs) and even reanalysis products to reproduce this low cloud cover. Low clouds are known to be difficult features to model (Dufresne and Bony 2008;Lauer and Hamilton 2013;Zelinka et al. 2017;Myers et al. 2021). Over the tropical and subtropical zones, simulated low clouds tend to be too few and too bright, i.e. have an overestimated optical thickness, likely due to a vertical overlap of cloud layers (Nam et al. 2012;Cesana and Walliser 2016). Within the GCMs participating in the CMIP5 experiment, there is also a large inter-model diversity in the relationship between marine low clouds and SST (Cesana et al. 2012). The poor simulation of marine low clouds should be considered along with the warm SST bias often found over these regions (Richter 2015). Hu et al. (2008) suggested that the underestimation of low clouds was by itself a potential cause of the warm SST bias over the southeastern Atlantic Ocean. Xu et al. (2014), however, found that the overestimation of shortwave radiation associated with the insufficient marine low clouds in CMIP5 models was not the leading cause of the warm bias, and hypothesized a role of eddy-induced ocean heat transport. For southern West Africa (Guinea zone), Hill et al. (2016) found a strong underestimation of June-July low cloud cover in both ERA-Interim reanalysis data and CMIP5 atmospheric models forced by prescribed SSTs. Hannak et al. (2017), for the same region, analysed a larger number of CMIP5 models and showed that their simulation of low-level clouds was only marginally improved from that of previous generation models (CMIP3). The underestimation of low-cloud cover appears to be related to the subgrid cloud schemes, and to too low near-surface relative humidity at daytime, associated with abundant solar radiation and too high daytime maximum temperatures. 
No study has yet examined the skill of climate models in simulating low cloud cover over WEA, despite its temporal persistence during the dry season and its major feedbacks on the local climate.
This study will make use of newly available CMIP6 simulations, with the aim of addressing the following questions: (i) are global climate models and two recent reanalysis products (i.e. ERA5 and MERRA2) capable of realistically representing the low clouds present in WEA? (ii) are the low clouds of this region a mere extension of the large-scale low cloud deck found over the Southeast Atlantic Ocean? (iii) do CMIP6 GCMs underestimate low clouds as in the previous model generations, and is there any difference between atmospheric and coupled simulations? (iv) how does the low cloud cover relate to oceanic and atmospheric forcings, in models and reanalyses? (v) do these relationships help to understand the models' systematic biases in their reproduction of low clouds? To answer these questions, the low cloud cover will be compared between CMIP6 historical simulations (both coupled and atmospheric only), ERA5 and MERRA2 reanalysis data, Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) satellite data and in situ synoptic observations for WEA. The focus will be on the austral winter dry period (June-September: JJAS) during which most of the low clouds develop over the region, but insights will also be given into the annual cycle. Besides the examination of mean fields, interannual variability will be considered, in particular the relationship between low cloud variability and SST over the Equatorial Atlantic Ocean, which is suspected to play a significant role in (or at least interact with) low cloud occurrence.
Section 2 presents the different types of data and the methods used in the study. The climatology of low-level clouds is then examined (Sect. 3.1), followed by an analysis of the interannual variations of low clouds (Sect. 3.2). Finally, their relationships with oceanic and atmospheric fields are studied in Sects. 3.3 and 3.4, respectively. A discussion of the main results is provided in Sect. 4, followed by the conclusions.

Data and methods
Four types of data sets are used (Table 1): satellite observations (CALIOP), surface observations (ISD and EECRA), reanalyses (ERA5 and MERRA-2) and simulations from climate models (CMIP6). All the data (with some exceptions as noted below) are extracted over the period 1979-2014, which is common to most data sets, at the monthly timescale.
Low cloud definitions vary between these data sets and are presented at the end of this section.

CALIOP
Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) is the lidar device aboard the CALIPSO satellite. Cloud Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) is a Franco-American mission launched in 2006 with the objective of studying the radiative impacts of clouds and aerosols on climate. CALIOP is a two-wavelength backscattering lidar (532 and 1064 nm) that provides high-resolution vertical profiles of clouds and aerosols (Chepfer et al. 2008). The lidar detects all types of clouds but low clouds can only be detected when they are not covered by optically thick higher clouds. The data used in this study are not those obtained directly from the lidar but come from the GCM-Oriented CALIPSO Cloud Product (GOCCP) (see below). The spatial resolution of the GOCCP low cloud data is 2° in latitude and 2° in longitude, i.e. approximately a 200 km grid; hence they are used to depict large-scale patterns only. The data are available from June 2006 and extracted up to December 2019.

ISD
Integrated Surface Database (ISD) is a global database comprising hourly synoptic surface observations made at meteorological stations (Smith et al. 2011). It is a database (Fig. 1a). Only the synoptic hours (i.e. GMT 0000, 0300, 0600 etc.) are retained. Unfortunately, the records are rather incomplete: only 15 stations have more than 25% of non-missing 3-hourly records over the period of study 1979-2014.

EECRA
Extended Edited Cloud Report Archive (EECRA) is a global cloud database (Hahn and Warren 1999). These data are surface observations from various sources, processed and edited after quality control, based on reports made by observers on ships or on land, day and night. Li et al. (2015) found an excellent agreement at global scale between EECRA cloud data and various satellite cloud products. Data from land stations cover the period 1971-2009, from which the sub-period starting in 1979 has been extracted. Low cloud fraction is available at three-hourly resolution. There are only 25 stations in the study area (Fig. 1b). EECRA data have been supplemented by additional databases, mostly SYNOP reports, which made it possible to fill some gaps and to extend the time-series to 2014 in the present study (for details, see Champagne et al. 2022). The overall amount of missing data is slightly smaller than in the ISD data set.

ERA5
ERA5 is the fifth generation of climate reanalysis data from the European Centre for Medium-Range Weather Forecasts (ECMWF). ERA5 replaces ERA-Interim, which ceased to be produced at the end of August 2019 (C3S 2017), and includes considerable improvements and higher horizontal, temporal and vertical resolutions (Hersbach et al. 2020). The data are at a native horizontal resolution of 31 km, made available on a 0.25-degree grid, and resolve the atmosphere at 137 levels from the surface to a height of 80 km, on an hourly time scale. ERA5 uses the Tiedtke cloud scheme (Tiedtke 1993) to forecast a three-dimensional cloud fraction for each grid box. Monthly means of three-hourly data are extracted over the period 1979-2014. In addition to low cloud fraction, the following variables have been used: SST, relative humidity, temperature, zonal and meridional winds at 1000, 925, 850, 700, 600 and 200 hPa. For the atmospheric fields, the ERA5 reanalyses will serve as the reference. For SST, ERA5 will be supplemented by data from the NOAA Optimum Interpolation Sea Surface Temperature v2 data set from 1982 to 2014 (OISSTv2; Reynolds et al. 2002). We verified that ERA5 SST means for 1982-2014 and 1979-2014 were almost identical.

MERRA2
Modern-Era Retrospective analysis for Research and Applications (MERRA) produced by NASA (Rienecker et al. 2011) is a second set of reanalyses used in this study. The second generation of these reanalyses (MERRA-2), starting from 1980, is used in this study. It is based on GEOS-5 for the assimilation of the most recent sources of satellite data, in particular new microwave sounders and hyperspectral infrared radiation instruments (Bosilovich et al. 2016). The resolution is 0.625° × 0.5° or approximately a 50 km grid. Only low cloud data are extracted from MERRA-2 over the period 1980-2014. The initial time step is 30 min, and data are integrated as monthly averages.
[Figure caption fragment: low cloud fraction (b), in percentages. The red box encloses the area over which the Gabon LC regional index is computed.]

CMIP6
Coupled Model Intercomparison Project (CMIP) is an intercomparison project of coupled ocean-atmosphere climate models the objective of which is to better understand past, present and future climate change. The project falls under the World Climate Research Program (WCRP) of the World Meteorological Organization (Eyring et al. 2016) and is currently in its sixth phase (CMIP6). Data cover the period 1850-2100 depending on the type of experiment, i.e. natural variability or response to changes in radiative forcing. The present work seeks to evaluate the models' performance to simulate the recent climate through historical simulations, and projections are therefore not examined. Two types of experiments are selected: coupled simulations and atmosphere-only simulations (AMIP: Atmospheric Model Intercomparison Project). The historical coupled simulations are only forced by observed features such as solar variability, volcanic aerosols, modification of the atmospheric composition (greenhouse gases and aerosols) caused by human activities. These simulations go from 1850 and extend to the near present (2014 in CMIP6). The AMIP-type simulations (hereafter referred to as "atmosphere-only simulations"), on the other hand, can be used to assess the atmospheric response to forcings by surface conditions (especially SSTs) in addition to changes in the composition of the atmosphere. The period of availability of the atmosphere-only simulations is from January 1979 to December 2014.
A total of eight models are used in this study (Table 1). This selection reflects the availability and accessibility of the data at the time of research. All ensemble members available for a given model and experiment are averaged together when analysing mean patterns, but are considered individually when computing correlations.

Definition of low clouds
Surface synoptic reports (from which the ISD and EECRA archives derive) are based on human observers' estimates of the cloud fraction of low clouds (LC) as the part of the sky covered by clouds the base height of which is typically below an altitude of 2 km. However, this category includes both stratiform clouds (stratus and stratocumulus) and vertically extensive clouds (cumulus and cumulonimbus). Since the present study is interested in stratiform clouds only, vertically extensive clouds (LC codes 1, 2, 3 and 9) have been excluded from LC observations (i.e., the corresponding cloud fraction is set to zero). They account for less than 17% of all dry season low cloud occurrences in the cloudiest area, centred on Gabon. We included LC code 8 (cumulus and stratocumulus with bases at different levels, 20% of cloud occurrences), thus contrary to Schrage and Fink (2012), our LC statistics do not purely refer to stratiform clouds (codes 4 to 7), although this is the overwhelming cloud type for the region and the season (63%, out of which 62% are stratocumulus). In EECRA, the two extended codes for obscured sky conditions available in this data set (codes 10 and 11, i.e. fog and thunderstorms) have not been considered (low cloud cover similarly set to zero), since they are not available in the other databases, and they occur very rarely in the season and region under study. The cloud fraction initially expressed in octas was converted to percent, and the arithmetic mean was computed to obtain monthly and seasonal averages. In the ISD data set, some records are based on METAR reports, which classify cloud cover in 5 broad categories. For ISD we therefore considered the occurrence of stratiform low clouds by considering the presence of an extensive low cloud cover at the time of observation, defined as an LC fraction greater than or equal to 5 octas (i.e. "broken" or "overcast" in the METAR cloud deck reports). 
The LC frequency is then simply obtained as the number of such low cloud observations expressed as a percentage of the total number of synoptic reports at the station. This definition was retained for the ISD dataset only.
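As an illustration of the two station-based measures described above, the following minimal sketch converts oktas to percent while zeroing the excluded LC codes (1, 2, 3 and 9), and computes the ISD-style frequency of reports with at least 5 oktas. The function names and the input layout are hypothetical and are not taken from the study; the EECRA extended codes 10 and 11 are not handled here.

```python
# Illustrative sketch only: names and input layout are assumptions.
CONVECTIVE_LC_CODES = {1, 2, 3, 9}  # vertically extensive clouds, excluded


def stratiform_lc_percent(oktas, lc_code):
    """Return low cloud fraction in percent, zero for excluded LC codes."""
    if lc_code in CONVECTIVE_LC_CODES:
        return 0.0
    return 100.0 * oktas / 8.0


def mean_lc_percent(reports):
    """Arithmetic mean of LC fraction over (oktas, lc_code) reports."""
    values = [stratiform_lc_percent(o, c) for o, c in reports]
    return sum(values) / len(values)


def lc_frequency_percent(oktas_reports, threshold=5):
    """ISD-style frequency: share of reports with LC fraction >= 5 oktas."""
    hits = sum(1 for o in oktas_reports if o >= threshold)
    return 100.0 * hits / len(oktas_reports)
```

For instance, three reports of 8 oktas (code 5), 4 oktas (code 6) and 6 oktas (code 2) give fractions of 100, 50 and 0%, hence a mean of 50%.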
For CALIOP satellite data, the low cloud definition is the one retained in the GCM-Oriented CALIPSO Cloud Product (GOCCP) (Chepfer et al. 2008, 2010). GOCCP was built from an algorithm designed to evaluate the representation of clouds in climate models. Three categories of clouds are differentiated based on pressure level (P): low level (P > 680 hPa), medium level (440 < P < 680 hPa) and high level (P < 440 hPa). The retrieval steps are described by Chepfer et al. (2013). In this product, low clouds are characterized by the cloud fraction at elevations below 3.2 km (P > 680 hPa). The LC fraction is calculated from the CALIPSO simulation included in the second version of the Cloud Feedback Model Intercomparison Project Observational Simulator Package (COSP). Although these data can handle multi-layered clouds, the presence of optically thick high clouds and those associated with deep convection within the Intertropical Convergence Zone may partly mask low clouds (Chepfer et al. 2010;Vaughan et al. 2009). Note that this is not a major issue in WEA since the ITCZ, and associated convective clouds, is located farther north in JJAS. For this reason, the difference with the definitions retained in synoptic observations lies not only in the altitude of the top of the lower level, but also in the fact that in one case the clouds are somehow observed "from below" and in the other case "from above". CALIOP data can be directly compared to CMIP6 simulations, which provide "GCMlidar-simulated" CALIPSO LC fractions for the same three cloud categories.
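The three GOCCP pressure categories can be written as a small lookup, sketched below. The function name is hypothetical, and the assignment of the exact boundary values (680 and 440 hPa) to the mid and high categories is an assumption, since the text only gives strict inequalities.

```python
# Illustrative sketch only: boundary handling at exactly 680/440 hPa is assumed.
LOW_MID_BOUNDARY_HPA = 680.0
MID_HIGH_BOUNDARY_HPA = 440.0


def goccp_cloud_category(pressure_hpa):
    """Classify a cloud layer as 'low', 'mid' or 'high' from its pressure."""
    if pressure_hpa > LOW_MID_BOUNDARY_HPA:
        return "low"
    if pressure_hpa > MID_HIGH_BOUNDARY_HPA:
        return "mid"
    return "high"
```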
In the reanalysis products, the tropospheric depth over which low clouds are considered differs between ERA5 and MERRA2. In MERRA2, the LC fraction ("CLDLOW" variable) is described as the maximum of the total cloud fraction (convective and large scale) of any layer at or below 700 hPa (about 3 km in a standard atmosphere), which is fairly similar to CALIOP and CMIP6 data. In ERA5, low clouds are defined as the integration of all clouds found at pressure levels greater than 0.8 times the local surface pressure, i.e. below about 2 km assuming a standard atmosphere, which is closer to the definition of low clouds in synoptic reports than to that adopted in CALIOP and CMIP6. In order to evaluate the sensitivity of the results to this criterion, ERA5 hourly low cloud fraction at pressure levels is extracted for the year 2008, selected as a test year for numerical simulations in a separate component of the project. The low cloud fraction is then computed first from the surface to 800 hPa and second from the surface to 650 hPa, using a random-overlap assumption (close to the one retained in the original reanalysis). The comparison indicates that the inclusion of mid-level clouds up to 650 hPa has only a marginal effect: there is no obvious change in the spatial and temporal patterns of low cloud fraction, and for the WEA dry season an increase of less than 5% for 650 hPa as compared to 800 hPa is observed (slightly more inland; not shown).
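The sensitivity test described above can be sketched as follows, assuming a pure random-overlap rule in which the combined cover of independent layers is one minus the product of the clear-sky fractions of each layer. The function names, the profile values and the use of fractions in the 0-1 range are illustrative assumptions; the overlap scheme actually used in the reanalysis is only said to be close to this one.

```python
# Illustrative sketch only: pure random overlap, fractions expressed in 0-1.
def random_overlap_cover(layer_fractions):
    """Combined cloud cover of layers assumed to overlap randomly."""
    clear_sky = 1.0
    for fraction in layer_fractions:
        clear_sky *= 1.0 - fraction
    return 1.0 - clear_sky


def low_cloud_cover(pressures_hpa, layer_fractions, top_hpa):
    """Cover from the surface up to (and including) the level `top_hpa`."""
    selected = [f for p, f in zip(pressures_hpa, layer_fractions) if p >= top_hpa]
    return random_overlap_cover(selected)


# Hypothetical profile: two 50% layers below 800 hPa, one 20% layer at 700 hPa.
levels = [1000, 925, 850, 800, 700, 650]
fractions = [0.0, 0.5, 0.5, 0.0, 0.2, 0.0]
cover_to_800 = low_cloud_cover(levels, fractions, 800)
cover_to_650 = low_cloud_cover(levels, fractions, 650)
```

With this made-up profile the cover rises from 0.75 (surface to 800 hPa) to 0.80 (surface to 650 hPa), which shows how a thin mid-level layer adds little once the lower layers are already extensive.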
Importantly, ERA5 LC fraction does not distinguish stratiform from vertically developed clouds, hence it is expected that in seasons when convection is active (outside the JJAS season in the core study area) ERA5 LC fraction is positively biased relative to the station estimate as described above. In order to assess to what extent this affects the results, a stratiform-only low cloud fraction is computed following Moron et al. (2023). Hourly low cloud cover is screened for simultaneous convective precipitation (> 0.1 mm). In such cases, the stratiform low cloud fraction is set to zero. Comparisons are made between the seasonal averages of stratiform low cloud cover and the total (original) low cloud cover, over the whole period of study. The main outcomes will be discussed below.
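The screening step can be sketched as below: any hour with convective precipitation above 0.1 mm has its low cloud fraction set to zero, and all other hours are kept unchanged. The function and variable names are hypothetical, and the inputs are assumed to be two hourly series of equal length at one grid point.

```python
# Illustrative sketch only: names and input layout are assumptions.
CONVECTIVE_PRECIP_THRESHOLD_MM = 0.1


def stratiform_low_cloud(lc_hourly, conv_precip_hourly_mm):
    """Zero the low cloud fraction in hours with convective precipitation."""
    if len(lc_hourly) != len(conv_precip_hourly_mm):
        raise ValueError("both hourly series must have the same length")
    return [
        0.0 if precip > CONVECTIVE_PRECIP_THRESHOLD_MM else lc
        for lc, precip in zip(lc_hourly, conv_precip_hourly_mm)
    ]
```

Seasonal means of the screened series can then be compared with those of the original series to gauge the convective contribution.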
The data from CALIOP, ISD and EECRA will serve as initial references for the evaluation of CMIP6 and reanalyses. Each of these reference datasets has its own flaws: ISD and EECRA records have gaps and are only available at a limited number of stations, and in CALIOP, besides the very short period of available data, stratiform low clouds can sometimes be undetectable since they can be hidden by opaque higher clouds of convective origin.

Methods
Most analyses focus on the JJAS season, which is the dry season in most of WEA and shows the highest frequency of stratiform low clouds (Dommo et al. 2018). JJAS long-term means at individual stations are computed from all available observations (using only stations with at least 300 three-hourly records). Both night- and daytime observations are used, since the CMIP6 data do not allow the diurnal cycle to be investigated. Although low clouds are more frequent at night, we checked that both night- and daytime observations were present at the stations we used (i.e., the stations did not close during the whole night), and that the general results of our study were not affected by mixing night- and daytime records. However, observed ISD and EECRA low cloud monthly and seasonal means may be slightly negatively biased because more observations are generally available at daytime (63% of the records are within the time-span 0600-1500 UTC), and low cloud fractions tend to decrease in the afternoon (Champagne et al. 2022).
The inter-comparison of the data sets is first carried out over a broad area covering an extended Gulf of Guinea area and the neighbouring African land areas (15 °W-30 °E, 20 °S-20 °N, Fig. 2). Then more restricted analyses are produced focusing on WEA, using a regional index which will be defined below. This part of the analysis has some limitations in CMIP6, since the relatively coarse spatial resolution of the simulations (100-250 km) does not enable a detailed spatial analysis.
Comparisons between the data sets are made using standard methods. To assess the agreement in the spatial distribution of low clouds, pattern correlation coefficients are computed as well as bias and root-mean-square error (RMSE). To that end, all products are linearly interpolated to the same resolution, i.e. that of ERA5 (about 30 km). A nearest-neighbour interpolation was also tested, which yielded very similar pattern correlations. The agreement in temporal variations is examined by plotting the mean annual cycle of low clouds and (for atmosphere-only experiments, reanalyses and observations) by analysing interannual variations of the JJAS average using Pearson's correlation coefficient. The latter is also used to relate low clouds and SST variations. In addition, Taylor diagrams are plotted, which make it possible to compare, at the interannual time-scale, the skills of different data sets (CMIP6, ERA5 and MERRA2) with respect to a reference (observed) time-series, by combining correlations, centred RMSE and standard deviations.
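The scores listed above can be sketched for two fields already interpolated to a common grid and flattened to lists of equal length. The function name is hypothetical, and the statistics are unweighted, which is an assumption since no area weighting is specified in the text. The same quantities applied to two time-series give the three ingredients of a Taylor diagram, linked by the identity crmse² = σ_m² + σ_r² − 2 σ_m σ_r r.

```python
import math


# Illustrative sketch only: unweighted statistics on flattened fields.
def pattern_statistics(model, reference):
    """Bias, RMSE, centred RMSE, correlation and standard deviations."""
    if len(model) != len(reference) or len(model) < 2:
        raise ValueError("need two fields of equal length (>= 2 points)")
    n = len(model)
    mean_m = sum(model) / n
    mean_r = sum(reference) / n
    anom_m = [m - mean_m for m in model]
    anom_r = [r - mean_r for r in reference]
    std_m = math.sqrt(sum(a * a for a in anom_m) / n)
    std_r = math.sqrt(sum(a * a for a in anom_r) / n)
    if std_m == 0.0 or std_r == 0.0:
        raise ValueError("correlation undefined for a constant field")
    cov = sum(a * b for a, b in zip(anom_m, anom_r)) / n
    return {
        "bias": mean_m - mean_r,
        "rmse": math.sqrt(sum((m - r) ** 2 for m, r in zip(model, reference)) / n),
        "crmse": math.sqrt(sum((a - b) ** 2 for a, b in zip(anom_m, anom_r)) / n),
        "corr": cov / (std_m * std_r),
        "std_model": std_m,
        "std_ref": std_r,
    }
```

A field that reproduces the reference pattern with a uniform offset has a correlation of 1 and a centred RMSE of 0, while its bias and RMSE both equal the offset in magnitude; this is why a high pattern correlation can coexist with a large RMSE.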
A regional low cloud index is computed covering most of Gabon and the south-westernmost part of the Republic of Congo (9-13 °E, 5 °S-1.5 °N, Fig. 1), the area over which the dry season low cloud cover is the most persistent (Dommo et al. 2018;Champagne et al. 2022). This index includes twelve stations from the ISD and EECRA databases. In order to account for possible changes in diurnal as well as station sampling, data for each 3-hourly period at each station are standardised. This standardisation prevents a time-slot and/or station with a high mean LC fraction from having a disproportionate influence on the LC regional average of a given season if other time-slots/stations are unavailable. The JJAS mean for each season is then computed using all 3-hourly station data. If in a given season less than 20% of the data (8 time-slots, 122 days and 12 stations) are available, then the seasonal average is set to missing. This happens in three years for ISD and one year for EECRA. A regional index covering the same area is then defined from the reanalysis products and CMIP6 models, by spatially averaging the interpolated LC fraction at all the grid-points within the domain, with the exception of oceanic areas, which are masked out.
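The construction of the station-based index can be sketched as follows. Each station/time-slot series is standardised over the full record, the standardised values are averaged by season, and a season is flagged as missing when fewer than 20% of the 8 × 122 × 12 possible records are present. The function name, the tuple layout of the input and the choice to skip constant series are assumptions made for illustration.

```python
import math
from collections import defaultdict

# Sampling constants quoted in the text: 8 time-slots, 122 days, 12 stations.
EXPECTED_RECORDS = 8 * 122 * 12
MIN_AVAILABLE_SHARE = 0.20


def regional_lc_index(records, expected=EXPECTED_RECORDS):
    """Seasonal mean of standardised LC values, or None when too sparse.

    `records` holds (year, station, time_slot, lc_value) tuples for JJAS.
    """
    by_series = defaultdict(list)
    for _, station, slot, value in records:
        by_series[(station, slot)].append(value)

    # Mean and standard deviation of each station/time-slot series.
    stats = {}
    for key, values in by_series.items():
        mean = sum(values) / len(values)
        std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
        stats[key] = (mean, std)

    by_year = defaultdict(list)
    for year, station, slot, value in records:
        mean, std = stats[(station, slot)]
        if std > 0.0:  # constant series cannot be standardised; skipped here
            by_year[year].append((value - mean) / std)

    index = {}
    for year, zscores in by_year.items():
        enough = len(zscores) >= MIN_AVAILABLE_SHARE * expected
        index[year] = sum(zscores) / len(zscores) if enough else None
    return index
```

Because each series is centred and scaled before averaging, a season in which only the cloudiest station or time-slot reports does not appear artificially cloudy.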
An SST index is also computed over the region 5-13 °E and 2-9 °S, i.e. off the coasts of Gabon and Congo-Brazzaville, south of Cape Lopez. This corresponds to the part of the equatorial Atlantic upwelling area which shows the largest amplitude (> 3.5 °C) in the SST annual cycle (Merle et al. 1980). Given the strong SST covariations in the equatorial Atlantic Ocean in austral winter, this index is representative of JJAS SST variations over a much larger area, including the "cold tongue" (equatorial upwelling) area (Servain and Merle 1993;Lutz et al. 2013). The index is strongly correlated (r = 0.82 in the OISST data set between 1982 and 2014) with the ATL3 SST index (20 °W-0 °E, 3 °S-3 °N; Zebiak 1993), which is commonly used to depict the large-scale SST variations in the equatorial Atlantic and associated warm events ("Atlantic Niños"; e.g., Lübecke et al. 2018;Vallès-Casanova et al. 2020).
In order to describe atmospheric stability, a key feature in the development of low clouds (Slingo 1987;Klein and Hartmann 1993;Hu et al. 2008;Sun et al. 2011), the Lower-Tropospheric Stability index (LTS) defined by Klein and Hartmann (1993) is used. It is computed over Gabon (10-12 °E, 3 °S-1 °N) as the difference between the potential temperature at 700 and 925 hPa. The 925 hPa pressure level was used instead of 1000 hPa because over land, 1000 hPa temperature is not defined in all the models.
[Figure caption fragment: In the boxes, r is the pattern correlation with CALIOP, b is the mean bias and rmse the root-mean-square error (in % cloud fraction).]
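A minimal sketch of the LTS computation is given below, using the standard dry-air definition of potential temperature, θ = T (1000/p)^κ with κ = R_d/c_p ≈ 0.2857. The constant, the function names and the example temperatures are assumptions added for illustration and are not values taken from the study.

```python
# Illustrative sketch only: standard dry-air constants, hypothetical names.
KAPPA = 0.2857  # R_d / c_p for dry air
P_REF_HPA = 1000.0


def potential_temperature(temp_k, pressure_hpa):
    """Potential temperature (K) of air at temp_k (K) and pressure_hpa (hPa)."""
    return temp_k * (P_REF_HPA / pressure_hpa) ** KAPPA


def lower_tropospheric_stability(temp_700_k, temp_925_k):
    """LTS (K) as theta at 700 hPa minus theta at 925 hPa."""
    theta_700 = potential_temperature(temp_700_k, 700.0)
    theta_925 = potential_temperature(temp_925_k, 925.0)
    return theta_700 - theta_925


# Made-up example: 283 K at 700 hPa and 295 K at 925 hPa.
example_lts = lower_tropospheric_stability(283.0, 295.0)
```

A cooler 925 hPa layer, for example above a colder sea surface, lowers the 925 hPa potential temperature and therefore raises the LTS for an unchanged 700 hPa temperature.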

Climatology of low level clouds
The JJAS mean low cloud fraction is first examined over the larger region (Gulf of Guinea and nearby African landmass) (Fig. 2). CALIOP data show a major contrast between the oceanic areas, where a high to very high LC fraction is observed (40-85%), especially off the coast of Southern Angola, and the land areas where LC fraction is generally below 20%. As an exception to the land areas, the coastal regions from Sierra Leone to Northern Angola show a higher LC fraction, with a distinct maximum in Gabon (30-70%). This maximum decreases inland at a distance of about 300-500 km from the coast. The two reanalysis products and the CMIP6 coupled simulations display relatively high pattern correlations with CALIOP (Fig. 2, r values in the lower right corners). This merely reflects the regional-scale land-ocean contrast, always well replicated, including the corona of relatively high cloud fractions over the coastal areas around the Gulf of Guinea. However, it is easy to see that both the oceanic and land patterns often notably differ from the CALIOP satellite estimates, as reflected by relatively high RMSE values (exceeding 16% for most datasets). In ERA5 for instance (Fig. 2b), while the oceanic LC fraction agrees relatively well with satellite data [as found by Koseki and Imbol Koungue (2021)], on the land areas bordering the Gulf of Guinea, especially over Western Africa, the LC fraction is very high (40-80%). Even over the inland northern part of the Congo Basin, it often exceeds 40%. Over these regions, a substantial share of low cloud occurrences in ERA5 is associated with convective clouds, therefore the purely-stratiform LC fraction is lower and better agrees with CALIOP (not shown). However, relatively high values are found inland over the northern Democratic Republic of the Congo (30-40% for purely-stratiform clouds), a feature absent from CALIOP observations. MIROC (Fig. 2g) also portrays a high LC fraction over the Congo Basin, with almost no difference with the coastal regions. Its overall mean bias is small but its pattern correlation quite low. Larger negative biases are found in most other products, especially MERRA2 reanalysis (− 15%) and CNRM (− 19%). The IPSL simulations are among the most skilful ones, in terms of bias and spatial patterns, as denoted by a very high correlation and a low RMSE (Fig. 2e). On the whole however, most CMIP6 simulations tend to underestimate LC fractions over both coastal areas north of 10°S and the SE tropical Atlantic. An examination of JJAS mean precipitation in the models (not shown) reveals that in the Gulf of Guinea coastal areas the low cloud underestimation is always accompanied by a rainfall overestimation. This indirectly suggests that the low clouds captured by the simulations over these regions are mostly stratiform, and that an abnormally strong convective instability is the cause for their underestimation.
When using the atmosphere-only simulations instead of the coupled ones (Fig. 3), most models still underestimate LC fractions. However, there is a substantial increase in oceanic and coastal LC fraction in the IPSL, E3SM, MOHC and CNRM models. There is also a marginal increase in many correlation coefficients, and a decrease in most RMSE values, but the geographical patterns are not strongly altered. Only MRI shows a significant correlation increase (from 0.74 to 0.87), which suggests that in this model the coupling critically affects the simulation of the basic state of the LC cover.
In order to document LC fraction over WEA, the JJAS spatial average over the area shown in Fig. 1 (Gabon and south-western Congo) is plotted (Fig. 4). Synoptic observations are now added and serve as reference for the other datasets. The observed LC frequency (ISD) and LC fraction (EECRA) are both very high (69-70%). CALIOP satellite estimates are slightly lower (61%), possibly because on some occasions low clouds may be obscured by an upper dense cloud layer, though high opaque clouds are quite infrequent (< 25%), except at the northern margins (Dommo et al. 2018). The two reanalyses strongly disagree with each other. ERA5 (60%) is far closer to observations than MERRA2, which shows an unrealistically low LC fraction (21%) in agreement with the large bias seen in Fig. 3. All the CMIP6 coupled simulations (Fig. 4, light blue bars) underestimate the LC fraction. The range is wide however, with CNRM and GFDL cloud fraction being as low as 28-29% and MIROC reaching 52%. It is noteworthy that all the atmosphere-only simulations (Fig. 4, dark blue bars), with the exceptions of MIROC and MRI, yield smaller LC fraction biases. The IPSL model shows a regional mean of 61%, close to observations, and both MOHC and E3SM have LC fractions near 60%, compared to about 30-35% in their respective coupled simulations. This suggests that the ocean-atmosphere interactions are important components of LC simulation, even over this continental area, and that incorrect feedbacks of clouds or atmospheric dynamics on the state of the upper ocean layers are the potential cause of the LC underestimation over WEA.
The LC annual cycle for the Gabon regional index is plotted in Fig. 5, for CALIOP, synoptic observations and reanalyses, compared to CMIP6 coupled simulations (panel a) and atmosphere-only simulations (panel b). Synoptic observations (ISD and EECRA) display a strong annual cycle. The JJAS season clearly stands out as the period with the highest stratiform LC fraction. Low clouds are less frequent in the wet months, i.e. from October to May. The CALIOP annual cycle agrees well with synoptic observations, although the LC fraction is generally lower by 5-15 percentage points, as expected from the fact that opaque high clouds sometimes overlap low clouds. The peak is also shifted to July in CALIOP, instead of August in both ISD and EECRA observations. Reanalyses deviate markedly, but in ERA5 this is partly the result of an absence of discrimination between stratiform and convective clouds in the original LC field. From October to May, the LC fraction is high (40-60%, thin dashed orange line) compared to synoptic records (generally 30-40%). This is due to the fact that convective clouds with low cloud bases are included in the ERA5 LC fraction. When corrected for convective occurrences (thick dashed orange line, same averaging period), ERA5 LC fraction during these wet months becomes very close to synoptic observations. In JJAS, the ERA5 low cloud fraction is much less impacted by convective clouds. By contrast, the MERRA2 reanalysis shows an underestimation of LC all year round, especially in JJAS as documented above.
[Figure caption fragment: see Table 1 for the definition of low cloud cover and the period used for computation in each dataset]
CMIP6 coupled simulations (Fig. 5a) display highly dissimilar patterns. MIROC fails to show any realistic annual cycle, with a minimum in August instead of a maximum, and very high values in the wetter months. This is not due to any major error in the latitudinal shift of the ITCZ, since JJAS precipitation, although overestimated over WEA as in most models, is not particularly high (not shown). The rest of the models are able to reproduce an austral winter LC maximum, but it is systematically too weak, sometimes too early (July in CNRM) or too late (September in E3SM) compared to synoptic reports. For most models, it is the onset of the cloudy season (i.e. the strong increase in LC fraction in June) which is critical, with low clouds developing too slowly in the models. Wet season LC fractions are much better reproduced by most models, except IPSL, but this model, on the contrary, has its dry season maximum closest to observations, though delayed. It is unlikely that the strong dry season LC underestimation in CMIP6 relates to cases of undetected low clouds, being overlaid by optically thick clouds. CMIP6 LC fraction is actually much lower than in the CALIOP satellite product, which undergoes the same bias but shows only slightly lower values than the ground observations. Moreover, in most CMIP6 simulations, upper layer cloud fractions over the region are similar or lower than in CALIOP (not shown).
Compared to coupled simulations, CMIP6 atmosphere-only simulations (Fig. 5b) better agree with observations, both in terms of timing and amplitude of the annual cycle. The quick increase of LC fraction in May and June is much better reproduced by the atmospheric models, although still weak in some of them. However, the LC peak remains too late in three models (E3SM, NCAR, MRI), and is often underestimated, especially with reference to synoptic observations. Notably, MIROC's annual cycle is not improved (even worsened) when compared to the coupled version. Hence, this model will not be further considered below given its inability to display the dry season LC increase. On the whole, Fig. 5 confirms that an accurate representation of atmosphere-ocean feedbacks (or at a minimum an accurate representation of SSTs) is key to getting LC correct in coupled models in the region, especially around the start of the dry season.

Interannual variability of low clouds
The interannual variations of the JJAS low clouds in the period 1979-2014 are examined for the Gabon regional index, in the observations and in the ERA5 reanalysis (raw and stratiform-only low cloud fractions) (Fig. 6). The MERRA2 reanalysis is not used given its strong underestimation of LC fraction during this season. ISD cloud frequency and EECRA cloud fraction are strongly correlated (r = 0.98). ERA5 and observations display variations that are largely in phase. The correlations between EECRA and ERA5, for both raw and stratiform-only cloud cover, are highly significant (0.76 and 0.81 respectively-p < 0.001). Most years with an abnormally high LC fraction (e.g. 1982, 1983, 1990-1994, 2005, 2009-2011) and those with a low LC fraction (e.g., 1984, 1988, 1989, 1996, 1998, 1999) show a reasonable agreement between ERA5 and synoptic observations. The year 1984 is known as a highly singular dry season, with out-of-season heavy convective rains in Gabon associated with an anomalously southern location of the ITCZ and a severe drought across the Sahelian and Sudanian belts, which is consistent with the lower frequency of stratiform low clouds (Buisson 1984). CALIOP data are not displayed in Fig. 6 because at most 9 years are in common with the other data sets. Over these years, CALIOP correlates significantly with ERA5 (r = 0.71, p < 0.01) but not with EECRA/SYNOP observations (r = 0.52).
These results show that ERA5 performs well at reproducing observed interannual variations of low clouds over Gabon, despite the fact that this variable is only indirectly affected by assimilated observations (e.g. radiances that affect column humidity) and is hence strongly determined by model's physics. Note that years with more disagreement between the two observation data sets and/or ERA5 are near the end of the time-series (apart from 1986-1987), which have more missing synoptic records. In the next step, we examine the ability of the CMIP6 models to reproduce the observed interannual variations of the LC over Western Equatorial Africa. Solely the atmosphere-only simulations can be analysed since the coupled simulations all have their own internal variability and thus do not match the temporal phase of observed ocean-atmosphere anomalies. The EECRA Gabon regional index is used as the reference in a Taylor diagram (Fig. 7). The MERRA2 data have been included in the assessment, but not CALIOP data because of its short period of data availability. The CMIP6 models show moderate to low skills. Five models (GFDL, E3SM, CNRM, MOHC and IPSL) show correlations close to 0.6, all significant at p < 0.01, but the standard-deviation is lower than that observed (particularly for CNRM and MOHC), reflecting their underestimation of the low cloud fraction. Their performance is lower than that of ERA5 (either raw or stratiform-only cloud cover, orange triangles in Fig. 7). Much lower correlations are found for NCAR (0.39) and MRI (0.27). MIROC (not shown) has a strongly negative correlation with observed and ERA5 interannual time-series, and as mentioned above this model will not be kept in the subsequent analyses. The MERRA2 reanalysis is characterised by a very small standard-deviation, reflecting the large negative bias as discussed above, and its correlations with both EECRA (0.39) and ERA5 (0.45) are modest.
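The three quantities summarised by a Taylor diagram can be sketched as follows. This is a minimal illustration, not the processing chain used for Fig. 7: the function name `taylor_stats` and the synthetic series `obs_lc` and `model_lc` are hypothetical stand-ins for JJAS regional means.

```python
import numpy as np

def taylor_stats(reference, simulated):
    """Return (correlation, std ratio, centred RMSE) of simulated vs reference."""
    ref = np.asarray(reference, dtype=float)
    sim = np.asarray(simulated, dtype=float)
    ref_anom = ref - ref.mean()
    sim_anom = sim - sim.mean()
    corr = np.corrcoef(ref, sim)[0, 1]
    # Ratio below 1 means the simulated variability is too weak.
    std_ratio = sim.std() / ref.std()
    # Centred RMSE ignores the mean bias and only measures pattern error.
    crmse = np.sqrt(np.mean((sim_anom - ref_anom) ** 2))
    return corr, std_ratio, crmse

# Synthetic stand-ins for 36 JJAS means (assumed values, not real data).
rng = np.random.default_rng(0)
obs_lc = 70.0 + 5.0 * rng.standard_normal(36)
model_lc = 45.0 + 0.4 * (obs_lc - 70.0) + 1.5 * rng.standard_normal(36)

corr, std_ratio, crmse = taylor_stats(obs_lc, model_lc)
print(f"r = {corr:.2f}, std ratio = {std_ratio:.2f}, centred RMSE = {crmse:.2f}")
```

Because the centred RMSE is computed on anomalies, a model with a large mean bias can still sit close to the reference point on the diagram if its interannual variations are well phased and of the right amplitude.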

Relationships with sea-surface temperature
The inferior performance of the coupled simulations compared to atmosphere-only simulations points to a key role of oceanic dynamics in the adequate simulation of low clouds. Dommo et al. (2018) also suggest that the seasonal occurrence of cool SST off Gabon is instrumental in low cloud genesis. This section thus analyses SST patterns and variability over the South Atlantic Ocean. In Fig. 8, the JJAS mean SST is mapped for the seven CMIP6 coupled models (eight initial models, minus MIROC; panels c to i) and for OISSTv2 ("Reynolds") as well as ERA5, as references (panels a-b). The latter two data sets agree by showing a distinct northward SST gradient, with cold water (< 20 °C) along the coast of Namibia and southern Angola (ca. 15-20 °S), associated with the Benguela Current and coastal upwelling, and warm water north of the equator (> 26 °C). A zonal band of relatively cool water (23-25 °C) is found slightly south of the equator, denoting the Equatorial Atlantic upwelling zone (cold tongue area). The temperature does not exceed 23 °C along the coasts from northern Angola to Cape Lopez in Gabon. In all the CMIP6 models, SSTs display mostly positive biases, with large errors (2-5 °C) in the eastern Atlantic Ocean, including along the coasts of Congo-Brazzaville and Gabon. Likewise, the equatorial cold tongue is either missing or too weak in most models, as demonstrated by a wedge of positive biases along the equator. Of all the CMIP6 models, MRI and to some extent IPSL seem to best reproduce SSTs over the Atlantic and along the Gulf of Guinea coasts during JJAS, although temperatures are overestimated and the warm bias is stronger around Namibia. In MRI, the equatorial upwelling extends well towards the coasts of Gabon and Congo-Brazzaville, where the ocean temperature drops to 23 °C as in the observations. GFDL also has smaller positive biases than many other models, but negative biases are found in the western part of the basin, resulting in a too strong east-west gradient.
Whereas the direction of surface winds in the South Atlantic Ocean is generally well reproduced in the CMIP6 models, a noteworthy feature is the fact that the intensity of the south-easterlies south of 10 °S is often too strong compared to ERA5, while north of the equator the monsoon is too weak in nearly all the 7 models (not shown). The anomalous speed convergence near the equator (as well as along the eastern ocean boundary in some models) may help explain the positive SST biases, at least in the subequatorial part of the basin, in addition to other possible causes such as equatorial westerly wind biases and a southward ITCZ shift in the preceding boreal spring (Richter and Tokinaga 2020). The lack of stratocumulus clouds itself was also found to explain part of the SST biases through a positive shortwave radiation bias (Wahl et al. 2011; Voldoire et al. 2014). In CMIP6, Farneti et al. (2022) noted that the positive SST biases in the Southeast Atlantic were not improved compared to CMIP5. They highlighted two reasons for the biases: a local one, related to the location of the Angola-Benguela Front (improved in high-resolution simulations), and equatorial biases associated with the role played by coupled feedbacks in the representation of the equatorial cold tongue. However, it is beyond the scope of the present article to assess the exact causes of the SST biases.
In order to better describe the SST along the Gabonese coast, a regional index is extracted for the area boxed on Fig. 8a. The annual cycle is plotted in Fig. 9 to find out whether the poor replication of the JJAS SST in CMIP6 is a result of a flawed seasonal cooling or more permanent biases. All the models overestimate SST off the southern coast of Gabon, especially from May to December. The austral winter cooling, peaking in August, is reproduced by all models, but it is noteworthy that it starts too late, consistently with the difficulties to correctly reproduce the onset of the cloudy season, and is too weak. While ERA5 and OISST ("Reynolds") data have their warmest water in March (close to 29 °C), with a strong cooling starting after April, in most models the SST drops too late (May or even June) and too slowly. The lowest temperature is generally found in August, in agreement with observations, but it is not low enough (positive bias of 2-4 °C). The models that are best able to reproduce the equatorial upwelling (IPSL and especially MRI) are the only ones to show a quick SST drop from May to July which almost matches observations, although the minimum austral winter temperature is still strongly overestimated.
The above systematic biases in both low cloud cover and SST in CMIP6 simulations do not imply that the models are unable to reproduce interannual variations in the cloud-SST coupling. In order to document this issue, Fig. 10 shows the correlation between interannual variations of LC fraction (JJAS average) in Gabon and SST at each grid point, within each data set. For consistency in the period of computation, observed LC fraction is correlated to ERA5 SST since the Reynolds SST data do not start until 1982. For ERA5 and observations (Fig. 10a, b), the correlation is strongly negative over the Equatorial Atlantic Ocean (-0.6 to -0.8, significant at p < 0.001), which indicates that the years showing a high LC fraction in Gabon coincide with a cooler than usual Atlantic Ocean. Absolute values are highest in the cold tongue area and to the southwest of WEA. They decline to the north and the southwest in both data sets. Reciprocally, anomalously warm years such as 1984, 1988, 1995, 1998, 1999, 2003 and 2007, which denote "Atlantic Niño" events  (Lübbecke et al. 2018;Vallès-Casanova et al. 2020), generally coincide with less low clouds in Gabon (Fig. 6).
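A point-wise correlation map of this kind can be sketched as below, assuming a regional index of shape (time,) and a gridded field of shape (time, lat, lon). The function name `correlation_map` and the random arrays are hypothetical; significance testing and any masking of missing values are left out.

```python
import numpy as np

def correlation_map(index, field):
    """Pearson correlation between a 1-D index and each grid point of a 3-D field."""
    index = np.asarray(index, dtype=float)
    field = np.asarray(field, dtype=float)
    index_anom = index - index.mean()
    field_anom = field - field.mean(axis=0)
    # Covariance at every grid point, contracting over the time axis.
    cov = np.tensordot(index_anom, field_anom, axes=(0, 0)) / index.size
    return cov / (index_anom.std() * field_anom.std(axis=0))

# Synthetic example: 36 seasons on a 3 x 4 grid (assumed values, not real data).
rng = np.random.default_rng(0)
lc_index = rng.standard_normal(36)
sst_field = rng.standard_normal((36, 3, 4))
sst_field[:, 0, 0] = -2.0 * lc_index + 1.0  # perfectly anticorrelated point
sst_field[:, 1, 1] = lc_index               # perfectly correlated point

corr_map = correlation_map(lc_index, sst_field)
print(corr_map.round(2))
```

Grid points with zero variance (for instance land points filled with a constant) would produce a division by zero here and would need to be masked beforehand.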
CMIP6 coupled simulations (Fig. 10c-i) display correlation patterns with Gabon LC fraction that differ largely between models. While one model (NCAR) shows unrealistic weakly positive correlations, all the others display negative correlations, as in the observations, but the patterns and intensity are sometimes quite different from those observed. In most models (IPSL, E3SM, MOHC, GFDL), the highest negative correlations (below -0.6, significant at p < 0.001) are found in the equatorial upwelling area south of the equator. They more or less expand eastward and southward to the African coast from Gabon to Namibia. This pattern is relatively close to that obtained from ERA5 and observed data. E3SM and GFDL ( Fig. 10e and h) show the strongest negative correlations (< -0.75), though some inaccurate positive correlations are found along the West African coast and in the southwestern Atlantic. MRI exhibits correlations which are quite uniform but weak, and CNRM displays an unrealistic pattern.
Over the period 1979-2014, there is a significant warming trend over the South Atlantic Ocean in all the coupled models but GFDL (not shown). When the linear trend is removed, the correlations are generally slightly enhanced for the models which were already showing an SST-low clouds relationship. Although it may be difficult to transpose these relationships to climate change issues, this suggests that a warming trend may not necessarily result in less low clouds over Gabon.
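The effect of removing a linear trend before correlating two series can be illustrated with the sketch below. It assumes an ordinary least-squares linear fit for the detrending; the helper names and the synthetic `sst` and `lc` series are hypothetical and only mimic a case where a shared warming trend partly masks a negative interannual relationship.

```python
import numpy as np

def remove_linear_trend(series, time):
    """Subtract the least-squares linear fit of series against time."""
    series = np.asarray(series, dtype=float)
    time = np.asarray(time, dtype=float)
    centred_time = time - time.mean()  # centring keeps the fit well conditioned
    slope, intercept = np.polyfit(centred_time, series, 1)
    return series - (slope * centred_time + intercept)

def detrended_correlation(x, y, time):
    """Pearson correlation between two linearly detrended series."""
    return np.corrcoef(remove_linear_trend(x, time),
                       remove_linear_trend(y, time))[0, 1]

# Synthetic series: both variables carry an upward trend, while their
# year-to-year anomalies are of opposite sign (assumed values, not real data).
years = np.arange(1979, 2015)
signal = np.sin(years.astype(float))
sst = 26.0 + 0.02 * (years - 1979) + 0.5 * signal
lc = 40.0 + 0.10 * (years - 1979) - 4.0 * signal

raw_r = np.corrcoef(sst, lc)[0, 1]
detrended_r = detrended_correlation(sst, lc, years)
print(f"raw r = {raw_r:.2f}, detrended r = {detrended_r:.2f}")
```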
In order to better describe the relationship between SST and low clouds, the SST index extracted off the coasts of southern Gabon and Congo-Brazzaville is plotted against Gabon LC fraction (Fig. 11). Each dot is the JJAS average of a given year (and a given simulation member if applicable), with colours and symbols referring to the different data sets (observations, ERA5 and coupled models). Regression lines are plotted only for the data sets that show a statistically significant linear relationship between interannual variations of LC and SST. The scatter plot well shows the negative correlation found in several models, replicating the strong relationship obtained in the observations (r = -0.77) and in ERA5 (r = -0.72). However, the biases in both SST and LC fraction in all the models are also conspicuous. The smallest LC fraction biases are often associated with smaller SST biases (for instance, IPSL has moderate biases in both variables, while CNRM exhibits large biases for both variables), but the LC bias does not necessarily scale with the SST bias. For instance, MRI has the smallest SST bias (Figs. 10 and 11) but it severely underestimates the cloud cover. These biases do not necessarily relate to the interannual correlations between LC and SST either. For instance, despite the large cloud biases found in GFDL and MOHC, these models are able to reproduce a significant relationship with interannual SST variations (r = -0.63 and -0.65, respectively). And while MRI has small SST biases, its negative correlation with the cloud fraction is weak. An important feature is that the models that exhibit unsuitably low interannual SST variations tend to miss the relationship with the LC fraction (e.g., CNRM and NCAR). E3SM, which has the strongest negative correlation with SST (-0.94) also displays the largest interannual variability, with SSTs varying between 25 and 28.5 °C, a range similar to that of ERA5, although (like in other models) SSTs are much too high.
As suggested by Fig. 10, the SST forcing onto WEA low clouds is not a local one, since high correlations extend over much of the equatorial Atlantic Ocean. The substitution of the ATL3 SST index (Zebiak 1993) to the Gabon/Congo index actually yields results that are very close to those shown on Fig. 11. For instance, ERA5 and observed LC fraction correlates to ATL3 at -0.72 and -0.79, respectively. Generally lower values, but very similar to those obtained using the Gabon/Congo SST index, are obtained in CMIP6.
To further understand the origin of the poor performance of several models, the same correlation maps as in Fig. 10 are produced using the atmosphere-only simulations instead of the coupled ones (Fig. 12). All the models now reproduce a negative correlation between LC fraction and SST in the Equatorial Atlantic. Compared to ERA5 and observation, the spatial pattern and amplitude of the correlation are accurate in the CNRM, IPSL, MOHC and GFDL models. Correlations are too low in NCAR and MRI. A clear improvement upon the coupled simulations is found for CNRM and to some extent NCAR. These results show that much of the flaws in the relationship between SST and cloud fraction in the coupled simulations arise from faulty ocean-atmosphere feedbacks, which project onto a too weak variability of the oceanic temperature.
Fig. 11 Scatter plot of LC fraction over Gabon versus SST over the Atlantic Ocean (Gabon/Congo area, 9-2 °S, 5-13 °E), in the observations (OBS), in ERA5 and in each of the CMIP6 coupled models. Each marker is the JJAS mean of a given year (all available members are considered for the CMIP6 models). Regression lines are shown when the cloud fraction significantly correlates (p < 0.05) with SST

Relationships with atmospheric fields
In order to determine how much the SST-cloud relationship involves variations in lower-tropospheric stability, an LTS index is extracted over Gabon and plotted against Gabon LC fraction (Fig. 13). A strong interannual correlation is obtained between ERA5 LTS and both observed and ERA5 LC fractions (r = 0.73 and 0.86, respectively). As expected, increased stability is key to the development of a stratiform low cloud cover. This relationship is reproduced by a majority of models. Four of them (IPSL, E3SM, NCAR and MRI) show correlations similar to or even higher than ERA5 and observations. MOHC has a weaker but significant (p < 0.05) correlation, although the slope of the relationship is distinctly too small, resulting in dampened variations of the LC fraction. No relationship between LC fraction and LTS is found for CNRM and GFDL. Interestingly, although stability is supposed to be largely constrained by SST, the models which skilfully reproduce the relationship between LC fraction and stability are not always the same as those which are the best ones at simulating the relationship with SST. E3SM and IPSL reproduce both relationships. By contrast, NCAR shows a good fit between low clouds and stability, but not with SST. This suggests that in such models climatic forcings other than SST (e.g., mid-tropospheric air temperature) have a dominant control on stability and then on LC fraction. Conversely, low cloud variability in GFDL is related to SST, but it fails to show any clear relationship with LTS.
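For readers who wish to reproduce an LTS index, the sketch below uses one common definition, the difference in potential temperature between 700 hPa and the surface. This definition, the constant `KAPPA` and the function names are assumptions made for illustration, since the exact formulation used for Fig. 13 is not restated in this section.

```python
import numpy as np

KAPPA = 0.2857  # R_d / c_p for dry air (assumed standard value)

def potential_temperature(temp_k, pressure_hpa, ref_pressure_hpa=1000.0):
    """Potential temperature (K) from temperature (K) and pressure (hPa)."""
    temp_k = np.asarray(temp_k, dtype=float)
    return temp_k * (ref_pressure_hpa / pressure_hpa) ** KAPPA

def lower_tropospheric_stability(t700_k, t_sfc_k, p_sfc_hpa=1000.0):
    """LTS (K) as theta(700 hPa) minus theta(surface); an assumed definition."""
    theta_700 = potential_temperature(t700_k, 700.0)
    theta_sfc = potential_temperature(t_sfc_k, p_sfc_hpa)
    return theta_700 - theta_sfc

# Illustrative values only: a 283 K mid-troposphere over a 296 K surface,
# then the same column with the surface 3 K warmer.
lts_cool = lower_tropospheric_stability(283.0, 296.0)
lts_warm = lower_tropospheric_stability(283.0, 299.0)
print(f"LTS cool surface: {float(lts_cool):.1f} K, warm surface: {float(lts_warm):.1f} K")
```

Under this definition a surface warm bias of a few degrees lowers LTS by about the same amount, which is the sense in which too-warm SSTs reduce stability.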
Overall, these results indicate that the poor simulation of low clouds in the CMIP6 models may be due to a variety of reasons. While the incorrect simulation of SST and its association with low clouds is a key feature, purely atmospheric issues are also at play in many models. Remarkably, the correlations with LTS are often higher than those obtained with SST. In ERA5 for instance, the correlation of LC fraction with LTS stands at 0.86, as against -0.72 with SST. In the IPSL, MRI and NCAR models, LTS much better explains cloudiness variations than SST alone does. This points to the potential role of purely atmospheric dynamics (especially in the mid-troposphere) on LC fraction variability. To shed light on this aspect, the relationship between key atmospheric variables and LC fraction is now examined. Although a detailed assessment for each model is out of scope, this will also help to point to possible causes of the CMIP6 models' deficiencies in their simulation of LC fraction and its variability. Figure 14 highlights some ERA5 atmospheric variables which show a significant relationship with observed LC fraction over Gabon (top panels). Near-surface winds display positive correlations in the Gulf of Guinea area north of the equator, particularly the Bight of Bonny (both U and V), indicative of an enhanced southwesterly monsoon flow in years with an increased cloud fraction. A stronger monsoon flow north of the equator was found to result from an SST drop in the cold tongue area and an increasing meridional SST gradient across the north front of the upwelling, with warm waters trapped in the Bight of Bonny, around the start of the first Guinean rainy season (Leduc-Leballeur et al. 2013; Meynadier et al. 2016). Therefore, the positive correlations between LC fraction and U1000 and V1000 reflect the role of SST on both winds and cloudiness. However, at 850 hPa the correlations with the meridional wind flow turn negative (Fig. 14), suggesting that the monsoon depth is reduced when the LC fraction is high. This pattern, together with the stronger near-surface southerly winds, is indicative of a stronger but shallower monsoon flow. This combination is only possible in an environment of higher vertical stability, where turbulent mixing is suppressed and where moisture is concentrated in a shallower layer to allow a more persistent stratiform cloud cover. Potentially, increased surface fluxes from the ocean due to the higher wind speeds further support this process.
At 700 hPa (above the monsoon flow), a weak positive correlation between LC and temperature is found over Gabon and neighbouring regions (Fig. 14, top right panel). Together with the negative correlation with surface temperature and SST, this is in line with the strong relationship found above between LTS and low clouds, since a warmer mid-troposphere (colder lower-troposphere) increases low-level stability.
By extracting regionally-averaged indices representative of these relationships (boxes on top panels of Fig. 14), it is found that the models unevenly reproduce the relationships found in the observations (central panels). In particular, the connection with meridional wind over the Bight of Bonny and the Gulf of Guinea (V1000 and V850) is not well represented in 3 models. E3SM and MRI perform best in this respect, and MOHC and NCAR reasonably well. The IPSL model, good at reproducing the relationship between SST and the cloud fraction, fails to show any association with the wind flow in the Bight of Bonny, but correctly simulates a link with the monsoon flow further west (not shown). In the CNRM model, variations in the cloud cover are strongly associated with the zonal (landward) component of the wind in the Bight of Bonny, but not with meridional winds, suggesting that the link with the monsoon is missed. Another issue is the fact that the (weak) positive correlation with 700-hPa temperature (T700 index over Gabon, right panels) is not seen properly by most models.
Fig. 14 [beginning of caption missing] of JJAS LC fraction in Gabon and different atmospheric fields. Top panels: correlation maps between observed LC fraction and ERA5 variables (thick white lines: 0.05 significance level; thin white lines: -0.6 and 0.6 correlations). Central panels: total correlations between Gabon LC fraction and regional indices computed over the boxes shown on the maps (stars indicate correlations significant at p < 0.05, and big stars at p < 0.01), for observations, ERA5 and CMIP6 coupled simulations. Bottom panels: same as central panels but for partial correlations independently of SST variations over the Gabon/Congo area
In order to find out whether these atmospheric dynamics are related to or independent from SST variations, partial correlations are computed between low clouds and the same atmospheric indices, after removing the effect of Gabon/Congo SST (Fig. 14, bottom panels). In the observations, the correlation with near-surface winds (U1000 and V1000) becomes very weak, indicating that the above wind signal was strongly associated with SST-induced north-south temperature gradients. The correlation with U1000 and V1000 becomes generally weaker in the models as well, though still highly significant in E3SM, MRI and NCAR (V1000), which for the latter two is no surprise given their poor skills at reproducing SST variations. The same applies to CNRM for U1000. The fact that several of these partial correlations remain significant while they no longer are in the observations is also due to the generally underestimated relationship with SST in several models. To some extent, the same remarks apply to V850.
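A partial correlation of this type can be sketched by regressing the controlling variable out of both series and correlating the residuals. The names `regress_out` and `partial_correlation` and the synthetic data are hypothetical; the example is built so that the wind and cloud series are related only through their common dependence on SST.

```python
import numpy as np

def regress_out(y, z):
    """Residuals of y after a least-squares linear regression on z."""
    y = np.asarray(y, dtype=float)
    z = np.asarray(z, dtype=float)
    design = np.column_stack([np.ones_like(z), z])
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ coef

def partial_correlation(x, y, z):
    """Correlation between x and y once the linear effect of z is removed."""
    return np.corrcoef(regress_out(x, z), regress_out(y, z))[0, 1]

# Synthetic case (assumed values, not real data): wind and low clouds both
# respond to SST but have independent noise.
rng = np.random.default_rng(1)
n = 2000
sst = rng.standard_normal(n)
wind = -0.8 * sst + 0.6 * rng.standard_normal(n)
lc = -0.8 * sst + 0.6 * rng.standard_normal(n)

total_r = np.corrcoef(wind, lc)[0, 1]
partial_r = partial_correlation(wind, lc, sst)
print(f"total r = {total_r:.2f}, partial r (SST removed) = {partial_r:.2f}")
```

In this constructed case the total correlation is substantial while the partial correlation is close to zero, which is the behaviour described for the observed near-surface winds.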
Interestingly, in the observations, the positive partial correlations with T700 (Fig. 14, bottom-right panel) are still significant, and even stronger than the total correlations. The T700 index is actually fully decorrelated to surface temperature and SST (correlation with the Gabon/Congo SST index: -0.11). This shows that mid-tropospheric warming has a distinct effect on the occurrence of low clouds. It is conceivable that this variable is influenced by processes such as advection or longwave cooling associated with variations in free-tropospheric water vapour or high-level clouds (see e.g. discussions in Andersen et al. 2020) but may also reflect large-scale teleconnections with ENSO events (i.e. increased subsidence and anomalous warming over equatorial Africa during warm events; Moron et al. 2023). Several of the CMIP6 models also show a higher correlation with T700 when partialling out the effect of SST.
Given the absence of correlation between SST and T700, a linear model was defined which aimed at explaining LC fraction variability by the two predictors combined (Table 2). In the observations, it is confirmed that both SST and mid-tropospheric temperature separately and significantly contribute to cloud cover variations over Gabon. As much as 67% of the LC fraction variance is explained by the two predictors. Similar results are obtained for ERA5 LC fraction. Three CMIP6 models (IPSL, E3SM and MOHC) replicate this joint effect of SST and T700 (the former being dominant), with significant p-values attached to each of the two predictors. The resulting r-square is high (0.54-0.86). Three other models (CNRM, MRI and GFDL) reproduce the effect of SST variations only, though it is weak. For NCAR, cloud fraction variations are not explained by any of the two predictors. On the whole, besides the strong systematic biases found in all the models, a fair reproduction of the mechanisms driving LC variability over WEA is found in several models, which suggests that they can be useful for projecting future changes in the LC cover over the region.
[Table 2 caption fragment: ... to SST (Gabon/Congo index, see Fig. 8 for the location) and 700-hPa temperature (Gabon index, see Fig. 14]
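A two-predictor linear model of this kind can be sketched with an ordinary least-squares fit, as below. The function name `fit_two_predictors`, the coefficients and the synthetic series are hypothetical, and the p-values attached to each predictor in Table 2 are not computed here.

```python
import numpy as np

def fit_two_predictors(lc, sst, t700):
    """Least-squares fit lc = a + b*sst + c*t700; returns (coefficients, R^2)."""
    lc = np.asarray(lc, dtype=float)
    design = np.column_stack([np.ones(lc.size), sst, t700])
    coef, *_ = np.linalg.lstsq(design, lc, rcond=None)
    fitted = design @ coef
    ss_res = np.sum((lc - fitted) ** 2)
    ss_tot = np.sum((lc - lc.mean()) ** 2)
    return coef, 1.0 - ss_res / ss_tot

# Synthetic anomalies for 36 seasons (assumed values, not real data):
# cooler SST and a warmer 700-hPa level both increase the LC fraction.
rng = np.random.default_rng(2)
sst_anom = rng.standard_normal(36)
t700_anom = 0.5 * rng.standard_normal(36)
lc = 60.0 - 5.0 * sst_anom + 3.0 * t700_anom + 2.0 * rng.standard_normal(36)

coef, r2 = fit_two_predictors(lc, sst_anom, t700_anom)
print(f"intercept = {coef[0]:.1f}, b_sst = {coef[1]:.2f}, "
      f"b_t700 = {coef[2]:.2f}, R^2 = {r2:.2f}")
```

Since the two predictors are nearly uncorrelated by construction, each coefficient can be read as the separate contribution of that predictor, which mirrors the rationale given above for combining SST and T700.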

Discussion
The above results showed that low clouds are unevenly reproduced in CMIP6 models and in reanalyses. Over the broader region (Equatorial Atlantic and Tropical Africa), the spatial patterns of mean low cloud occurrence as depicted by CALIOP data are reasonably well reproduced by CMIP6 models, ERA5 and MERRA2 reanalyses. However, the mean cloud fraction is severely underestimated in MERRA-2 and in several models, as earlier shown for CNRM for instance (Voldoire et al. 2019a). At the global scale, although Jiang et al. (2021) showed an improvement upon CMIP5, Vignesh et al. (2020) and Konsta et al. (2022) noticed that CMIP6 models still show a negative bias in low-altitude cloud fractions in the tropics. The exception is the IPSL model, as also noted by Konsta et al. (2022).
In WEA, all data sets struggle to correctly reproduce the amplitude of the annual cycle and underestimate the JJAS low cloud fraction, with respect to synoptic reports at stations in Gabon and southern Congo-Brazzaville.
Although MERRA-2 reanalysis shows very good skill in reproducing precipitation and winds in Central Equatorial Africa (Hua et al. 2019), it severely underestimates low clouds. An underestimation of the total cloud fraction in MERRA2 was found for China by Feng and Wang (2019), leading to an overestimation of solar radiation. Miao et al. (2019) noted a dramatic underestimation of low clouds in MERRA2 at global scale. By contrast, ERA5 reanalysis is fairly accurate in simulating low clouds in WEA during the austral winter, with a slight underestimation. However, it produces too many low clouds during the rest of the year, mostly because convective clouds are not discriminated from stratiform clouds in ERA5. This bias vanishes when a filter is applied to exclude convective clouds.
The underestimation of low clouds is generally reduced in atmospheric models forced by observed SST compared to coupled ocean-atmosphere models. This suggests that a dominant part of the inaccuracies in low cloud simulations is related to unrealistic simulated SSTs. The seasonal development of low clouds in WEA is primarily associated with the surface cooling of the Equatorial Atlantic Ocean between April and July. This cooling is delayed and too weak (2-4 °C in most models, instead of 5-6 °C) in the CMIP6 coupled models. Consistently, the underestimation of LC fraction is especially strong around the start of the cloudy season. Richter and Tokinaga (2020) noted the warm bias in the Equatorial Atlantic in CMIP6, with little improvement upon CMIP5 simulations (except in some models), and attributed it to the response to wind forcing being too weak. Voldoire et al. (2019b) also described the leading role of wind stress biases in driving the equatorial SST bias. This warm bias reduces the stability of the lower troposphere, which may account for the underestimation of stratiform low clouds. This is in agreement with the observations by Cesana et al. (2012) on the links between SST and low clouds. The problem is complex as the lack of stratocumulus over the ocean is also a cause of the positive SST bias. However, Voldoire et al. (2014) demonstrated that in the CNRM-CM5 coupled model the solar heat flux biases alone, resulting from the cloud biases, are not sufficient to account for the biased cold tongue intensity. Though focused on the continent, the above results showed that the inter-model variations in the low cloud biases over WEA do not necessarily scale with their respective SST biases, which also suggests that the misrepresentation of low clouds in the models does not simply reflect incorrect cloud-SST feedbacks.
At the interannual timescale, ERA5 variations of low cloud cover match well those of observed data. These variations are inconsistently reproduced by the atmosphere-only simulations forced by observed SSTs. The GFDL, E3SM and CNRM AMIP-type models show the best correlations between interannual variations of the observed and simulated LC fraction (r ~ 0.60-0.65 for Gabon over the period . Observations indicate that these interannual variations in low cloud cover are strongly related to SST over a large part of the Equatorial Atlantic Ocean, with correlations exceeding 0.7 in the equatorial upwelling area and off the coasts of Gabon and Congo-Brazzaville. Some models underestimate this relationship, or completely miss it. The accuracy of the correlation between SST and low cloud variations is only loosely related to the mean biases of the models. This is in line with Richter and Tokinaga (2020) who examined Tropical Atlantic SST as simulated by CMIP6 models and found a relatively weak link between mean state biases and the quality of the simulated variability. Several models display a mean JJAS SST off Gabon exceeding 26 °C, an often used threshold for deep convection (Zhang 1993), whereas in the observation SST over the same region is close to 23.5 °C. Yet some of these models, especially E3SM, adequately simulate a strong relationship between low clouds and SST. However, models whose SSTs are always above 26 °C (CNRM, NCAR) fail to show any strong relationship with low clouds (regardless of the magnitude of their LC fraction mean biases), suggesting that above this threshold low cloud formation obeys to other forcings in these models. ERA5 hourly data actually confirm that 26 °C is a very effective SST threshold for convective rains to develop over WEA (not shown).
Interannual variations of lower-tropospheric stability (LTS) over Gabon are strongly correlated with low cloud fraction (r = 0.73 and 0.86 in the observations and ERA5, respectively). Mean LTS is markedly underestimated in all coupled CMIP6 models, mainly as a result of excessively high SSTs.
While the relationship of LTS with low clouds is correctly simulated by a majority of the models, two fail to show any relationship. For CNRM, this reflects the poorly simulated SST-low cloud relationship. In GFDL, however, while the SST forcing on low clouds is well reproduced, the relationship with LTS is not. This is explained by a decorrelation between low-level air temperature and SST in this model, but it also points to a deficiency in the basic effect of stability on low cloud variability.
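To make the link between near-surface warming and stability concrete, the sketch below assumes a commonly used definition of LTS, namely the difference in potential temperature between 700 hPa and the surface. The exact definition used in this study is not restated in this section, so the formula, the function names and the numerical values are illustrative assumptions only.

```python
import numpy as np

KAPPA = 0.2857   # R_d / c_p for dry air (approximate)
P0 = 1000.0      # reference pressure in hPa

def potential_temperature(temp_k, pressure_hpa):
    """Potential temperature (K) from temperature (K) and pressure (hPa)."""
    temp_k = np.asarray(temp_k, dtype=float)
    pressure_hpa = np.asarray(pressure_hpa, dtype=float)
    return temp_k * (P0 / pressure_hpa) ** KAPPA

def lower_tropospheric_stability(t700_k, t_sfc_k, p_sfc_hpa):
    """Assumed definition: LTS = theta(700 hPa) - theta(surface), in K."""
    return (potential_temperature(t700_k, 700.0)
            - potential_temperature(t_sfc_k, p_sfc_hpa))

# Hypothetical values, chosen only to show the sign of the response.
lts_ref = lower_tropospheric_stability(283.0, 297.0, 1010.0)
# Same profile with a near-surface temperature 2 K warmer.
lts_warm = lower_tropospheric_stability(283.0, 299.0, 1010.0)
print(float(lts_ref), float(lts_warm))
```

Under this definition, a warmer surface with unchanged T700 lowers LTS, which is the sense in which a warm SST bias would be expected to weaken stability in the coupled models.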
ERA5 data show that WEA low cloud cover is not controlled by SST alone. Low-level winds and 700 hPa temperature (T700), partly independent of SST, exert secondary controls on interannual variations of low cloud occurrence. A higher T700 over WEA, which is independent of SST variations in the Atlantic, distinctly contributes to increasing LC fraction through enhanced LTS. Fewer than half of the CMIP6 models reproduce this control. Low-level wind signals over the Gulf of Guinea north of the equator are found to be associated with LC variations, indicating more low clouds over Gabon in years with a stronger monsoon. These signals are strongly related to the cold tongue intensity, and reflect its coupling to the large-scale dynamics of the African monsoon (Okumura and Xie 2004; Caniaux et al. 2011). The relationship between winds and LC mostly vanishes after removing the effect of SST. A small residual signal in the wind may denote non-linearities or the effect of independent drivers, such as zonal advection of moist air towards WEA. Neupane (2016) and Longandjo and Rouault (2020) indicate that the strength of the zonal circulation over WEA is controlled by west-east temperature gradients, hence by Congo Basin temperatures in addition to Equatorial Atlantic SST, both of which may therefore play a role in LC variability. Only some models reproduce the link between Gulf of Guinea winds and LC and, as in the observations, they simulate it as strongly (but not fully) driven by the Equatorial Atlantic SST variations. In the coupled ocean-atmosphere system, which involves Atlantic SST, monsoon and zonal circulations, any over- or underestimation of one component has cascading effects on the others, including low cloud cover (Tompkins and Feudale 2010). Richter et al. (2012) underlined that biases in simulated equatorial SST are dependent on wind biases, themselves related to precipitation biases over the continent.
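One simple way of removing the effect of SST before relating winds to LC is a partial correlation, in which both series are first regressed linearly on an SST index and only the residuals are correlated. The sketch below illustrates that idea under stated assumptions. It is not necessarily the procedure applied in this study, and the function names, coefficients and synthetic series are hypothetical.

```python
import numpy as np

def remove_linear_effect(y, x):
    """Residual of y after least-squares regression on x (with intercept)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    slope, intercept = np.polyfit(x, y, 1)
    return y - (slope * x + intercept)

def partial_correlation(a, b, control):
    """Correlation between a and b once the linear effect of control is removed."""
    res_a = remove_linear_effect(a, control)
    res_b = remove_linear_effect(b, control)
    return np.corrcoef(res_a, res_b)[0, 1]

# Synthetic illustration: wind and LC both depend on SST, not on each other.
# The series are longer than an observed record, to keep sampling noise small.
rng = np.random.default_rng(1)
n = 200
sst = rng.normal(size=n)
wind = -0.8 * sst + 0.3 * rng.normal(size=n)
lc = -0.9 * sst + 0.3 * rng.normal(size=n)

r_raw = np.corrcoef(wind, lc)[0, 1]
r_partial = partial_correlation(wind, lc, sst)
print(round(float(r_raw), 2), round(float(r_partial), 2))
```

In this constructed case the raw wind-LC correlation is high only because both series share the SST signal, and it largely disappears in the residuals. A clearly non-zero partial correlation in real data would instead point to a driver that is at least partly independent of SST.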
However, there is no systematic association across the CMIP6 models between their capacity to replicate the LC-SST relationship and their capacity to replicate the LC-wind (or mid-tropospheric temperature) relationship. Together with the observed correlation patterns themselves, this shows that the drivers of low cloud variability are manifold. Among other parameters, biomass-burning aerosols are suspected to play a role in low cloud fraction over the South-East Atlantic Ocean, likely through additional radiative heating generated by smoke above the low-level clouds, which increases the stability of the low-level inversion, a feedback that is not well represented in many CMIP6 models (Lu et al. 2018; Mallet et al. 2021). However, over WEA, this effect seems only moderate (3-5% increase of simulated low cloud fraction over southern Congo-Brazzaville when aerosols are activated in the models, according to Mallet et al. 2020).

Conclusions
Stratiform low clouds are a recurrent feature of the Southeast Atlantic Ocean. In austral winter, they extend over southern West Africa and Western Equatorial Africa, and mean cloud fractions of more than 60-70% are found over Gabon and southern Congo-Brazzaville. This extensive low cloud cover during the dry season is suspected to play a key role in the presence of a rainforest over WEA, by reducing solar radiation, daytime temperatures and hence water demand. The present paper aimed to assess the ability of reanalysis products (MERRA-2 and ERA5) and eight CMIP6 models to simulate these low clouds and their drivers, by focusing on mean patterns and interannual variability. CALIOP satellite data and ground observations at synoptic stations were used as reference.
Over the Tropical Atlantic and the neighbouring African continent, the CMIP6 simulations and reanalyses all show a reasonable representation of the regional distribution of low clouds, but the coupled simulations underestimate the low cloud fraction. The MERRA-2 reanalysis also strongly underestimates low cloud fractions. Over WEA itself, CALIOP and ERA5 slightly underestimate the low cloud fraction (ca. 60% instead of ca. 70% at synoptic stations). The underestimation is stronger in MERRA-2 and in the coupled simulations, although with some differences between the models. When forced by observed SST, the models generally perform much better, suggesting that the cloud biases in the coupled models are partly due to an inaccurate representation of ocean-atmosphere feedbacks. South Atlantic SST fields in the coupled simulations show a strong warm bias. The annual cycle displays an insufficient seasonal cooling over the Eastern Equatorial Atlantic from April to July in most models. This reduces the lower-tropospheric stability (LTS), which is a key feature for the maintenance of a stratocumulus cloud cover over WEA. For some models, however, the simulation of the low clouds is not improved when forced by observed SST, and the inter-model variations in the WEA cloud biases do not always scale with their respective SST biases, indicating that additional factors play a role.
LTS strongly controls the observed interannual variations of low-cloud fraction over WEA, and the relationship is well reproduced by ERA5 and many models, but not all. However, while in the observations and ERA5 SST over the equatorial Atlantic Ocean exerts a major control on both LTS and low clouds, only some models capture this link. Importantly, the performance of the models is only loosely related to their mean biases. Besides SST, a preliminary analysis of atmospheric patterns associated with interannual low cloud variations has been carried out in order to identify secondary drivers and to point to other possible explanations for the poor low cloud simulations. In the observations, higher mid-tropospheric (700-hPa) temperatures over WEA contribute to increasing LTS and enhancing the cloud cover. Monsoon dynamics, especially surface winds over the Bight of Bonny, also relate to the interannual variations of WEA low clouds. All these atmospheric drivers are inconsistently reproduced in CMIP6.
On the whole, the above results show that the models' imperfect simulation of WEA low clouds has different origins, though realistic SSTs are a prerequisite. Even if biases in mean patterns do not always preclude a fair simulation of interannual variations and associated forcing mechanisms, an improved simulation of the coupled ocean-atmosphere dynamics in the region is key for reliably assessing future low cloud trends over WEA as a result of global warming. Further work also needs to be carried out on the within-season changes in low cloud occurrence and their forcing mechanisms.