Description of the deep-learning-based global oceanic DA system. The procedure in the DeepDA is as follows (Fig. 1a). When the global analysis states at a previous time step (i.e., background states, Fig. 1b) and the observational states (Fig. 1c) are given as inputs, the generator in the GAN produces the global analysis states (Fig. 1d). While the key feature in the background states is extracted through the three-dimensional (3D) convolution process, those in the observations goes through the 3D partial convolution process (see Methods and Materials for the details). The generator, and discriminator are designed to evaluate the similarity, and reality of the analysis states by comparing the true states (Fig. 1e), respectively. Once the analysis fields are generated, they are recurrently used as the background states, which is an input map of the generator for the next assimilation cycle. We updated the generator in the similar manner by setting the batch size as one. Therefore, both the input and coefficients of the generator are updated sequentially.
As the analysis map is produced by combining the observations and short-term forecasts, the analysis distribution of the DeepDA can be expressed as a form of joint probability: \(p\left({\mathbf{x}}_{t}|{{y}}_{1:t}\right)=p\left({\mathbf{x}}_{t}|{{y}}_{t},{{y}}_{1:t-1}\right)\propto p\left({\mathbf{y}}_{t}|{\mathbf{x}}_{t}\right) p\left({\mathbf{x}}_{t}|{{y}}_{1:t-1}\right)\), where \({\mathbf{x}}_{t}\) and \({{y}}_{t}\) are the target and observational states at time \(t\), respectively, and \({{y}}_{1:t}\stackrel{\scriptscriptstyle\text{def}}{=}\{{{y}}_{1}, {{y}}_{2}, \dots ,{{y}}_{t}\}\). \(p\left({\mathbf{x}}_{t}|{{y}}_{1:t-1}\right)\), and \(p\left({\mathbf{y}}_{t}|{\mathbf{x}}_{t}\right)\) denote the probability of the short-term forecast states at time \(t\) by the given observational states up to time \(t-1\) (Fig. 1b), and that of the observational states by the given true states at time \(t\) (Fig. 1c), respectively. That is, the DeepDA utilizes the observations over multiple previous time frames to estimate the state at a target time \(t\). In contrast, K20 only utilizes simultaneous observations for the state estimation (i.e., \(p\left({\mathbf{x}}_{t}|{{y}}_{t}\right)\)), and therefore, the analysis accuracy of the DeepDA is expected to be higher than that of K20.
For the training dataset, the global 3D pentad-averaged temperature from the surface to a depth of 700 m was obtained from the large-ensemble historical simulation of the Community Earth System Model, Version 2 (CESM2)28. The period for the training dataset spans from 1892 to 1932, and the total number of training sample is 2,993 (i.e., 73 pentads \(\times\) 41 years). A separate period of the dataset, from 1933 to 1973, is utilized for the validation.
The superiority of the DeepDA revealed by the observing system simulation experiments. For testing the performance of the DeepDA system, first, the observing system simulation experiments (OSSEs) are performed by utilizing the CESM2 dataset from 1974 to 201429. That is, the true oceanic states are defined from the simulation results of CESM2 during 1974–2014, and the observational states are generated by adding random errors to the true states. These observed values are then masked out, except the observed location for the global in situ temperature and salinity observations obtained from the Met Office Hadley Centre Integrated Ocean Database (HadIOD) version 1.2.0.030. The standard deviation of the observational random error is set to 0.2. The initial background states for the generator model is given as the randomly selected states from the training dataset. Therefore, initially, only the information of the true states is provided in the observational data.
The global maps presented in Fig. 1a show the snapshot of the DeepDA results obtained from the OSSEs at the first cycle of the assimilation (i.e., first pentad at 1974). The ground truth exhibited an El Nino state over the equatorial Pacific, positive Indian Ocean Dipole, tropical Atlantic warming, and north Pacific cooling (Fig. 1e). The observations exhibited a sparse spatial distribution similar to the true states as designed (Fig. 1c). The background state, which is the input of the generator, showed a significantly different spatial distribution of the surface temperature anomalies, as it is randomly selected in the first assimilation cycle (Fig. 1b). By assimilating the observation into the background states, the analysis state becomes closer to the true states, particularly in the vicinity of the observed location (Fig. 1d). For example, the negative SST anomalies over the equatorial Pacific in the background state are corrected to positive values for over most parts of the equatorial Pacific in the analysis field. With a sufficient number of observations over the north Pacific, the sign and spatial distribution of the analysis SST anomalies over these regions become smilar to those of the true state. The state correction by inserting observations is also shown at the subsurface levels (Extended Data Fig. 1), even though the degree of the correction is less than that of the surface layer owing to a smaller number of observations.
In the analysis field, the SST anomalies resemble the observed values not only over the observed locations, but also over the adjacent regions. For example, the basin-wide warming over the north tropical Atlantic, which is shown in the grouth truth, is also prominent in the analysis field, even though these regions are not fully covered during the observations. In addition, even though the observations are along an arbitrary straight line due to the ship-based observations in many regions, the analysis field does not exhibit these features at all. This is due to the spread of observatoional information to adjacent grid points; sensitivity tests with a single point observation in each ocean basin (Extended Data Figs. 2 and 3) confirmed this feature. More importantly, the spatial scale and detailed pattern of the analysis increment is different at different times and in different regions; this is the state-dependent feature of the DeepDA.
By repeating the assimilation cycle in time, the analysis fields gradually adjust to the true states with the aid of the accumulated observed information. The root-mean-squared error (RMSE) between the analysis fields and the true states is significantly reduced in time, and saturates roughly after three months (Extended Data Fig. 4). After these few months of a spin-up time, even the background states become similar to the true states (Extended Data Fig. 5). This demonstrates that the estimated states in the GAN model are closer to the true states, as the observational information of the previous time steps is accumulated through the sequential DeepDA process.
The RMSE of the DeepDA-based 3D temperature analysis values obtained for the testing period (i.e., from April 1974 to December 2014 after excluding three months of a spin-up time) is smaller than those of the observations or the background states (left bars in Fig. 1f). Over the observed locations, the RMSE of the analysis values is reduced to 0.12, which is smaller than that of the background (i.e., 0.25), or the observed values (i.e., 0.2). This confirms that the DeepDA properly maximizes a posterior probability (i.e., analysis value distribution) by combining two possible priori probability (i.e., background and observational distribution)31 The global-averaged RMSE of the analysis states is also systematically reduced to that of the background states, demonstrating the systematic error reduction by the DA process (right bars in Fig. 1f).
We also compared the quality of the DeepDA and K20 analyses. The RMSE values of the output obtained using K20 are 0.27 and 0.44 over the observed location and globe, respectively, which are almost twice that of the DeepDA analyzed values. The RMSE of K20 is even larger than that of the background value of the DeepDA. This clearly demonstrates that the DeepDA exhibits a superior performance compared to the previous partial convolution-based DA systems.
DeepDA-based global ocean reanalysis using real in-situ observaitons. Next, the real in situ ocean observations spanning 41 years, from 1980 to 2020, are assimilated to generate the oceanic reanalysis data for the observed period. For the real case, the satellite-based gridded SST output (i.e., OISST V232,33) is also assimilated in addition to the in situ profile observations. Similar to the OSSEs, the initial background states are randomly selected among possible initial states.
The RMSE of the pentad-averaged DeepDA reanalysis from the observed temperature profile clearly exhibited the smallest values among those of the state-of-the-art ocean reanalysis systems in most vertical levels of most ocean basins (Fig. 2a-2d). The global-averaged RMSE of the DeepDA is less than 1 oC for all the vertical levels and the smallest among those of all the other reanalysis products, from the surface to 500 m (Fig. 2a). The RMSE of the DeepDA is also lower than those of the other reanalysis products over the Indian Ocean (Fig. 2b), tropical Pacific (Fig. 2c), and northern Atlantic ocean (Fig. 2d). This confirms that the DeepDA is well fitted to the observation profiles. However, the global RMSE increases below 500 m, where the number of in situ observations becomes relatively smaller. The increase in the RMSE of the DeepDA below 500 m is mainly attributed to that over the Atlantic (Fig. 2d), which might be caused by the systematic errors in simulating the Atlantic variability (e.g., Atlantic meridional overturning circulation) in the training samples (i.e., CESM2)34.
The global-averaged SST in the reference satellite observation (i.e., OISST V2) is between 19 \(\text{℃}\) and 20 \(\text{℃}\) with the gradual warming trend (Fig. 2e). In the DeepDA, amplitude of the global-averaged SST and its yearly variation is generally well reconstructed, confirming the successful injection of the observed SST into the reanalysis fields. Therefore, the global-mean SST time-series in the DeepDA belongs to a group that exhibits the highest consistency in terms of correlation and variability (Fig. 2f). In the DeepDA, the global-averaged oceanic heat content anomalies, which are defined by the vertically integrated density-weighted temperatures up to 500 m (will be referred to as OHC500), exhibits a marked increase after 1990, while the warming trend is less clear in the 1980s. This temporal variations are consistent with the reference data (i.e., EN4.2.2) to some extent9, even though the OHC500 in the DeepDA during the 1990s and 2010s is systematically overestimated and underestimated, respectively, compared to the EN4.2.2. The temporal correlation between the DeepDA-based global-mean OHC500 anomalies and the EN4.2.2 is over 0.9, which is the second highest among those of all the reanalysis datasets (Fig. 2h).
To examine the spatial distribution of the reanalysis fields, the annual mean SST and T300 (i.e., the vertically averaged temperature from the surface to 300 m) (Fig. 3a-3d) is shown. The annual-mean SST exhibited an ignorable amplitude of the bias (Fig. 3a), which is smaller than other oceanic reanalysis products (Extended Data Fig. 6). The pattern correlation of the annual-mean SST in the DeepDA and the OISST V2 reaches up to 0.99, which is the largest value among all analyzed reanalysis products (Fig. 3b).
The annual-mean T300 exhibited negative biases over the north Pacific and north Atlantic, with a maximum amplitude of up to 1 oC (Fig. 2c). By contrast, the bias in the equatorial regions is less than 0.5 oC, and smaller than those in other reanalysis products (Extended Data Fig. 7). In addition, even though the positive bias is observed over the southern ocean, its amplitude is generally smaller than those in other reanalysis products. As a result, the pattern correlation between the T300 in the DeepDA and those in EN4.2.2 is 0.8, which is similar to that shown by the other reanalysis products (Fig. 3d). The global-averaged standard deviation of the T300 climatology deviated from the global-mean value is slightly smaller than those of the EN4.2.2, however, this feature is quite common in many reanalysis products.
In addition to the annual-mean fields, the seasonal cycle of the SST and T300 is also well simulated in the DeepDA (Fig. 3e-3h). Similar to the annual-mean fields, the seasonal cycle of the SST, which is represented by the difference of the climatological SST during June-July-August (JJA) from that during December-January-February (DJF) season, is almost identical to the reference data (Fig. 3e and 3f). The seasonal difference in the T300 climatology is successfully reconstructed using the DeepDA, as the bias amplitude in most regions is less than 0.4 oC (Fig. 3g). The corresponding pattern correlation reaches 0.9, which is a similar degree to other reanalysis products (Fig. 3h). Note that the amplitude of the bias in T300 is generally larger than that in the SST in the DeepDA, basically due to the availability of a small number of subsurface observations to constrain the reanalysis fields. However, the other reanalysis products do not exhibit a similar issue (Extended Data Fig. 6 and Fig. 7). This is possibly due to the lack of dynamical processes to propagate the observation-sufficient surface signal into the subsurface in the DeepDA (e.g., vertical advection or diffusion processes).
To demonstrate the reanalysis quality in simulating the observed interannual variability, we calculated the temporal anomaly correlation of the monthly SST anomalies between the DeepDA reanalysis and the OISST V2 (Fig. 4a). The temporal correlation coefficient shows a maximum value of 0.9 over most of the globe, except for the southern ocean, where the number of availiable in situ observations is relatively low; this feature is similarly seen in the other reanalysis products (Extended Data Fig. 8). The realistic simulation of the SST variability in the DeepDA is confirmed by the eigenvector of the first Empirical Orthogonal Function (EOF) for each ocean basin (Fig. 4b and 4c), whose spatial distribution is similar to that of the reference data.
The anomaly correlation coefficients of the T300 anomalies between the DeepDA and the ENN4.2.2 are also calculated (Fig. 4d). The subsurface temperature anomaly variation in the DeepDA is generally well reproduced with a relatively higher correlation coefficient for the tropical and north Pacific regions, while a reduced correlation is visible over the tropical Atlantic and southern oceans. These spatial distributions of the correlation coefficients are common in the other reanalysis products (Extended Data Fig. 9)9 as well. Even though the systematic difference is shown over the South Pacific and Atlantic, the first EOF eigenvector of the T300 anomalies in the DeepDA is generally similar to the reference data (Fig. 4e and 4f). In particular, both the dominant variability and the detailed time-evolution of the T300 anomalies is almost idential over the equtorial Pacific (Fig. 4g and Extended Data Fig. 10 and Fig. 11), confirming that the equatorial Pacific variability is well simulated by the DeepDA.