Inter-comparison of the Impact of INSAT-3D Atmospheric Motion Vectors in 3DVAR and Hybrid Ensemble-3DVAR Data Assimilation Systems during Indian Summer Monsoon

The impact of observations may depend on various factors, such as the background error covariance in the data assimilation (DA) system. The present study compares the impact of INSAT-3D Atmospheric Motion Vector (AMV) observations in the traditional three-dimensional variational (3DVAR) DA system and the hybrid Ensemble Transform Kalman Filter (ETKF)-3DVAR DA system (HYBRID) available in the Weather Research and Forecasting (WRF) modeling system. The objective of the study is to understand how the impact of INSAT-3D AMV observations differs when they are assimilated using the 3DVAR and HYBRID DA systems. The DA experiments are conducted over a ~4-week period of the Indian summer monsoon in July 2016. Four sets of experiments are performed, with and without INSAT-3D AMV, in both DA systems. Domain-wide verification against radiosonde observations reveals that forecasts from the HYBRID experiments are, in general, more accurate than those from the 3DVAR experiments. The geographical distribution shows positive impacts of INSAT-3D AMV observations across the domain in both the 3DVAR and HYBRID DA systems. The AMV observations show a larger relative impact in HYBRID than in 3DVAR. The relative improvement in HYBRID with AMV DA compared to 3DVAR is 77% and 71% for wind and tropical temperature, respectively. The skill scores for the quantitative evaluation of the precipitation forecast indicate a modest improvement in rainfall for the HYBRID run, and incorporating the AMV observations does not considerably enhance the skill of the 24 h and 48 h rainfall forecasts.


Introduction
The importance of early warnings from Numerical Weather Prediction (NWP) models has increased greatly in the last few decades, as they play an important role in mitigating the damage caused by natural disasters such as floods, thunderstorms, heavy rains, and tropical cyclones. NWP is an initial value problem, and its ability to represent the future state of the atmosphere depends primarily on the initial conditions of the atmosphere. The initial conditions for limited area models are mostly derived from coarser-resolution global forecast models, which lack information about region-specific local conditions. Data assimilation (DA) is a scientific method for obtaining an accurate initial state by combining the background forecast and the observations (Daley 1991). Some of the widely used DA techniques, such as the three- and four-dimensional variational methods (3DVAR and 4DVAR), estimate the true state of the system by minimizing a cost function (e.g., Courtier et al. 1998; Wu et al. 2002; Bouttier and Kelly 2001). DA methods such as the ensemble Kalman filter (EnKF), in contrast, estimate the optimal weights to obtain accurate initial conditions (Kalnay 2003). To preserve the accuracy of the state of the system, the distance between the background state and the observations is scaled by the background error covariance (BEC) and the observational error covariance. In the traditional variational approach, the BEC is generated using climatological data and is assumed to be static and isotropic, while advanced DA methods such as EnKF utilize an ensemble of model forecasts to compute a BEC that evolves with time. However, the implementation of EnKF in an operational forecast system is computationally intensive, as its accuracy depends upon the ensemble size (Houtekamer et al. 2005). In addition, the BEC is prone to sampling error due to the limited number of ensemble members.
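The way the background and observation error covariances scale the analysis can be illustrated with a minimal sketch of a generic variational analysis. The two-point state, the single observation, and all numerical values below are hypothetical and are not taken from the experiments of this study; the functions simply evaluate the standard quadratic cost function and its closed-form minimizer for a linear observation operator.

```python
import numpy as np

def cost(x, xb, B, y, H, R):
    """Quadratic 3DVAR-type cost: background term plus observation term."""
    db = x - xb          # departure from the background
    do = y - H @ x       # departure from the observations
    return 0.5 * db @ np.linalg.solve(B, db) + 0.5 * do @ np.linalg.solve(R, do)

def analysis(xb, B, y, H, R):
    """Closed-form minimizer of the cost for a linear observation operator H."""
    innovation = y - H @ xb
    gain = B @ H.T @ np.linalg.inv(H @ B @ H.T + R)
    return xb + gain @ innovation

# Hypothetical example: wind (m/s) at two grid points, only the first one observed.
xb = np.array([10.0, 12.0])               # background state
B = np.array([[4.0, 2.0], [2.0, 4.0]])    # background error covariance
H = np.array([[1.0, 0.0]])                # observation operator
R = np.array([[1.0]])                     # observation error covariance
y = np.array([13.0])                      # observation

xa = analysis(xb, B, y, H, R)
print(xa, cost(xb, xb, B, y, H, R), cost(xa, xb, B, y, H, R))
```

The off-diagonal element of B spreads the single observation to the unobserved point, which is the role that the static or flow-dependent BEC plays in the DA systems discussed here.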
Recently, a sophisticated DA system has been developed by incorporating the flow-dependent BEC information provided by the ensemble members into the 3DVAR DA system; it is popularly known as the hybrid ensemble-3DVAR DA system (hereafter HYBRID; Wang et al. 2007). The improved performance of HYBRID over 3DVAR has been documented by many authors (e.g., Hamill and Snyder 2000; Wang et al. 2008; Prasad et al. 2016; Kutty and Wang 2015; Kutty et al. 2018).
Atmospheric motion vectors (AMVs) are satellite-derived wind observations obtained by continuously tracking regions of clouds or water vapor in satellite images. AMVs provide wind information with good areal coverage, particularly over data-sparse oceanic regions.
Several studies have shown the benefit of assimilating AMVs for improving weather forecasts over the tropics (e.g., Velden et al. 1992; Leslie et al. 1998; Soden et al. 2001; Rani and Gupta 2014; Mounika et al. 2018). A study by Kaur et al. (2015) during the Indian summer monsoon (ISM) reported a positive impact of Kalpana-1 AMV assimilation over the tropical region in the 3DVAR DA system. Deb et al. (2016) demonstrated a reduction in the track forecast errors of the cyclonic storm NANAUK over the Arabian Sea when INSAT-3D AMV observations were assimilated. Kumar et al. (2017) reported a positive impact of INSAT-3D AMV observations in the 3DVAR DA system during the ISM.
The impact of observations may vary depending on many factors in the DA system, such as data quality control, preprocessing, and the specification of the BEC. Previous studies have shown that, in the presence of the flow-evolving BEC of the HYBRID DA system, observations are assimilated more effectively than in traditional 3DVAR DA systems; hence, the impact of INSAT-3D AMV observations is expected to vary between DA systems. Recent studies have established the forecast improvements due to AMV observations in the HYBRID DA system (e.g., Sawada et al. 2019; Zhang et al. 2018). The present study attempts to quantify the impact of assimilating INSAT-3D AMV in the advanced HYBRID DA system over a ~4-week period in July 2016 using a limited area model. The specific objective of the study is to understand how the impact of INSAT-3D AMV observations assimilated by 3DVAR differs from, or resembles, the impact of the same observations assimilated by the HYBRID DA system. This paper is organized as follows. Section 2 gives a detailed description of the DA systems, the model, and the experimental design. Section 3 describes the major results and Section 4 concludes the paper.

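Data assimilation systems
In this study, experiments are conducted using two DA systems, namely 3DVAR and HYBRID. Following Wang et al. (2008), the HYBRID analysis increment is obtained by minimizing the cost function

\[ J(\mathbf{x}'_1, \mathbf{a}) = \beta_1 \frac{1}{2} (\mathbf{x}'_1)^{T} \mathbf{B}^{-1} \mathbf{x}'_1 + \beta_2 \frac{1}{2} \mathbf{a}^{T} \mathbf{A}^{-1} \mathbf{a} + \frac{1}{2} (\mathbf{y}^{o\prime} - \mathbf{H}\mathbf{x}')^{T} \mathbf{R}^{-1} (\mathbf{y}^{o\prime} - \mathbf{H}\mathbf{x}') \qquad (1) \]

where $\mathbf{x}'$ is the analysis increment that has the contribution of both the static and the ensemble error covariances available in HYBRID, $\mathbf{B}$ is the static BEC matrix, $\mathbf{x}'_1$ is the model space increment that has the contribution of the static BEC of 3DVAR, and $\beta_1$ is used to assign a weight to the static BEC.
Here, $\mathbf{a}$ consists of the spatially varying extended control variables (Lorenc 2003), and its variations are controlled by the localization matrix $\mathbf{A}$. The term $\beta_2 \frac{1}{2} \mathbf{a}^{T} \mathbf{A}^{-1} \mathbf{a}$ incorporates the flow-dependent ensemble BEC information, and $\beta_2$ is used to assign a weight to the ensemble covariance (Wang et al. 2008). The present study assigns equal weight (50%) to the static and the flow-dependent ensemble BEC in the HYBRID cost function. The static BEC matrix used in this study is generated using the National Meteorological Center (NMC; Parrish and Derber 1992) method from one month of WRF model forecasts using the CV5 option. In equation (1), $\mathbf{y}^{o\prime} = \mathbf{y}^{o} - \mathcal{H}(\mathbf{x}^{b})$ is the innovation vector, where $\mathbf{y}^{o}$ is the observation, $\mathbf{x}^{b}$ is the background forecast, and $\mathcal{H}$ is the nonlinear observation operator. Here, $\mathbf{H}$ is the linearized observation operator and $\mathbf{R}$ is the observation error covariance matrix.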
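The effect of weighting the static and ensemble BEC equally can be sketched with a toy one-dimensional example. The sketch assumes the commonly used equivalent form of the hybrid covariance, in which the static covariance and the localized ensemble covariance are linearly combined with weights 1/β1 and 1/β2 that sum to one. The grid size, ensemble size, length scales, and function names are hypothetical and do not correspond to the WRF configuration of this study.

```python
import numpy as np

def gaussian_corr(n, length):
    """Gaussian correlation matrix on a 1-D grid with n points."""
    idx = np.arange(n)
    dist = idx[:, None] - idx[None, :]
    return np.exp(-0.5 * (dist / length) ** 2)

def hybrid_covariance(B_static, ens_perts, loc, w_static=0.5):
    """Blend static and localized ensemble covariances.

    w_static plays the role of 1/beta_1 and (1 - w_static) that of 1/beta_2.
    ens_perts is an (n, K) array of ensemble perturbations (mean removed).
    """
    K = ens_perts.shape[1]
    P_ens = ens_perts @ ens_perts.T / (K - 1)      # sample covariance
    return w_static * B_static + (1.0 - w_static) * (loc * P_ens)  # Schur product

def single_obs_increment(B, obs_index, innovation, obs_var):
    """Analysis increment produced by one observation of one grid point."""
    h = np.zeros(B.shape[0])
    h[obs_index] = 1.0
    gain = B @ h / (h @ B @ h + obs_var)
    return gain * innovation

rng = np.random.default_rng(0)
n, K = 40, 20                                      # hypothetical grid and ensemble size
B_static = gaussian_corr(n, 4.0)
ens_perts = rng.standard_normal((n, K))
ens_perts -= ens_perts.mean(axis=1, keepdims=True)
loc = gaussian_corr(n, 6.0)                        # localization matrix

B_hyb = hybrid_covariance(B_static, ens_perts, loc, w_static=0.5)
inc = single_obs_increment(B_hyb, obs_index=20, innovation=1.0, obs_var=1.0)
print(inc[18:23])
```

With `w_static=1.0` the blend reduces to the static covariance, so the same function reproduces a 3DVAR-like increment for comparison.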
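The forecast ensemble perturbations $\mathbf{X}^{e}$ are updated by the ETKF using the following transformation matrix $\mathbf{T}$:

\[ \mathbf{X}^{a} = \mathbf{X}^{e}\mathbf{T}, \qquad \mathbf{T} = \mathbf{C}(\boldsymbol{\Gamma} + \mathbf{I})^{-1/2}\mathbf{C}^{T} \qquad (2) \]

where $\boldsymbol{\Gamma}$ represents the eigenvalues and $\mathbf{C}$ represents the eigenvectors obtained by the singular value decomposition of $(\mathbf{X}^{e})^{T}\mathbf{H}^{T}\mathbf{R}^{-1}\mathbf{H}\mathbf{X}^{e}$. The under-sampling problem associated with the ETKF due to the small ensemble size is dealt with by adding inflation factors $\Pi$ and $\rho$ that increase the analysis error variance (Wang et al. 2008):

\[ \mathbf{T} = \Pi\,\mathbf{C}(\rho\boldsymbol{\Gamma} + \mathbf{I})^{-1/2}\mathbf{C}^{T} \qquad (3) \]

Since covariance localization has not been applied to the ETKF formulation, large inflation factors are used to correct the systematic underestimation of the error variance. The initial perturbations for the ensemble are obtained as random draws from the static BEC of 3DVAR.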
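The ETKF update of the perturbations can be sketched numerically as follows. This is an illustrative implementation under stated assumptions, not the WRF code: the perturbations are assumed to be already divided by the square root of the ensemble size minus one, the observation errors are assumed uncorrelated, every state variable is observed (identity observation operator), and the function name, the scalar form of the two inflation factors, and all numbers are hypothetical.

```python
import numpy as np

def etkf_transform(obs_space_perts, obs_error_var, rho=1.0, inflation=1.0):
    """Return the K x K ETKF transformation matrix T.

    obs_space_perts: (p, K) ensemble perturbations mapped to observation space
                     (H X^e), already divided by sqrt(K - 1).
    obs_error_var:   (p,) diagonal of R (uncorrelated errors assumed).
    rho, inflation:  scalar factors standing in for rho and Pi (assumed form).
    """
    scaled = obs_space_perts / np.sqrt(obs_error_var)[:, None]  # R^{-1/2} H X^e
    gram = scaled.T @ scaled                                    # (H X^e)^T R^-1 (H X^e)
    gamma, C = np.linalg.eigh(gram)                             # eigenvalues, eigenvectors
    gamma = np.clip(gamma, 0.0, None)                           # drop round-off negatives
    return inflation * (C * (rho * gamma + 1.0) ** -0.5) @ C.T

rng = np.random.default_rng(1)
n, K = 6, 4                                                     # hypothetical sizes
members = rng.standard_normal((n, K))
X_e = (members - members.mean(axis=1, keepdims=True)) / np.sqrt(K - 1)
R_diag = np.full(n, 0.5)

T = etkf_transform(X_e, R_diag)      # no inflation: rho = 1, inflation = 1
X_a = X_e @ T                        # analysis perturbations
P_f = X_e @ X_e.T                    # forecast error covariance of the ensemble
P_a = X_a @ X_a.T                    # analysis error covariance of the ensemble
print(np.trace(P_f), np.trace(P_a))
```

Without inflation, the analysis covariance implied by the transformed perturbations matches the Kalman filter analysis covariance computed from the same ensemble covariance, which is a convenient check on the transform. Setting `inflation` above one enlarges the analysis spread.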
Further details on the configurations of the DA systems can be found in Gogoi et al. (2020).

Model Description and Configurations
The model simulations are performed using the Advanced Research Weather Research and Forecasting (ARW-WRF) model version 3.8.1 (Skamarock et al. 2008). The model is fully compressible and non-hydrostatic, and it uses sophisticated parameterization schemes to represent unresolved atmospheric processes. The configuration of the model domain and the various schemes used in this study are shown in Table 1. Figure 1 shows the simulation domain used in this study, which covers the monsoon-prevailing region in and around the Indian subcontinent. The initial and boundary conditions for the WRF simulations are obtained from the National Centers for Environmental Prediction (NCEP) Global Forecast System (GFS) data.

Data used for
Three consecutive INSAT-3D images at 30-minute intervals are used to determine the AMVs, a procedure that consists of the following steps: 1) image registration, thresholding, and filtering; 2) feature/tracer selection and tracking; 3) quality control; and 4) height assignment (Sankhala et al. 2020). This study uses AMVs retrieved from the low-level MIR and visible channels, which extend from 600 hPa to 950 hPa, and from the upper-level WV channel, which ranges from 100 hPa to 500 hPa. A recent study shows that INSAT-3D AMV is useful in understanding the monsoon intraseasonal variability of the ISM (Sankhala et al. 2019). AMV data for this study are obtained from https://www.mosdac.gov.in/. Conventional in situ observations and satellite-derived wind observations available from the Global Telecommunication System (GTS) are also assimilated.

Experimental design and validation
Observation System Experiments (OSEs) are conducted to understand the impact of INSAT-3D AMV in the 3DVAR and HYBRID DA systems, and the details of the experiments are provided in Table 2. Four experiments are conducted, and the DA system is continuously cycled at 12-hour intervals for a ~4-week period from 0000 UTC 1 July 2016 to 0000 UTC 30 July 2016. A 48 h free forecast is launched from each 0000 and 1200 UTC DA analysis during the month of July 2016. The improvement percentage η of each experiment relative to 3DVAR is computed using equation (4).

\[ \eta = \left[ \frac{\mathrm{RMSE}_{\mathrm{3DVAR}} - \mathrm{RMSE}_{\mathrm{EXP}}}{\mathrm{RMSE}_{\mathrm{3DVAR}}} \right] \times 100 \qquad (4) \]

Positive η (%) values depict improvement in 3DVAR_AMV, HYBRID, and HYBRID_AMV in comparison to 3DVAR. The rainfall forecast is further validated using the Bias Score (BS) and the Equitable Threat Score (ETS) (Hamill and Juras 2006). In general, the fit of the analysis to the observations is closely tied to the configurations of the 3DVAR and HYBRID algorithms and their BEC settings. Previous studies such as Wang (2008) show that, for smaller background error variances or larger correlation length scales, the analysis may not fit the observations well. In addition, Zhang et al. (2011) have shown that a closer fit to observations does not necessarily lead to a better forecast. To further assess the impact of observations on the model variables, the vertical profile of the analysis increment at the radiosonde locations is evaluated (Fig. 3). The validation is done with respect to radiosonde observations, which are absent over the oceans. Therefore, the actual impact of AMV observations may not be reflected when verified over a limited number of radiosonde profiles.
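As an illustration of how the improvement percentage η can be evaluated, the following minimal sketch assumes that η is the relative reduction in root-mean-square error (RMSE) of an experiment with respect to the 3DVAR control. The function names and the sample values are hypothetical and are not results from this study.

```python
import numpy as np

def rmse(forecast, observed):
    """Root-mean-square error between forecast and verifying observations."""
    forecast = np.asarray(forecast, dtype=float)
    observed = np.asarray(observed, dtype=float)
    return float(np.sqrt(np.mean((forecast - observed) ** 2)))

def improvement_percent(rmse_control, rmse_experiment):
    """Relative RMSE reduction (%) with respect to the control; positive is better."""
    return (rmse_control - rmse_experiment) / rmse_control * 100.0

# Hypothetical wind values (m/s) at four verification points.
observed = [5.0, 7.0, 9.0, 6.0]
control_forecast = [7.0, 9.0, 11.0, 8.0]      # stands in for 3DVAR
experiment_forecast = [6.0, 8.0, 10.0, 7.0]   # stands in for, e.g., HYBRID_AMV

eta = improvement_percent(rmse(control_forecast, observed),
                          rmse(experiment_forecast, observed))
print(eta)
```

A negative value returned by `improvement_percent` would indicate that the experiment has a larger error than the control.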

Spatial Forecast verification
To further assess the impact of the observations as well as the DA system on the model forecast, the zones of significant forecast difference between 3DVAR and the rest of the experiments are evaluated for the model variables, namely wind at the 850 hPa level, tropospheric temperature (TT) averaged over the 200 hPa to 700 hPa layer, and relative humidity (RH) at the 850 hPa level (figure not shown). The spatial distribution of the forecast difference shows overall larger zones of significant difference in the HYBRID experiments than in the 3DVAR_AMV simulations. A larger impact on the TT field is observed over the Bay of Bengal (BoB) and the southern oceanic region. ... in Figure 5 b-d. The low-level jet (LLJ), also known as the Findlater Jet (Findlater 1978), is an important lower-level circulation feature of the Indian summer monsoon, which carries extensive moisture from the Indian Ocean and produces rainfall over the Indian subcontinent. Therefore, an improvement in the forecast of the LLJ is expected to improve the rainfall forecast over the Indian landmass.
Apart from wind, a substantial improvement of 70% positive η values for the TT variable is observed in the HYBRID_AMV experiment (Figure 5h). The improvement percentage for RH is small compared to that for wind and TT. However, a marginal improvement due to AMV DA is observed in both 3DVAR and HYBRID. To further explore the impact of observations closer to the surface, the 10-m surface wind forecast is evaluated against ASCAT wind observations over the Arabian Sea (AS) region. Figure 7 shows the time series plot of monthly averaged RMSE of the 24 h and 48 h near-surface zonal and meridional wind forecasts over the AS. Similar to the results obtained in the ERA-Interim validation, the assimilation of AMV observations has, in general, produced substantial improvements in the wind forecasts in both the 3DVAR and HYBRID DA systems. However, the 3DVAR_AMV experiment shows a larger reduction in forecast errors than the other experiments for the zonal wind.
The improvement in the meridional wind component is more pronounced in the HYBRID_AMV run than in the other experiments for both the 24 h and 48 h forecasts. To quantitatively evaluate the precipitation forecasts, skill scores such as the ETS and the Bias Score are calculated for the various experiments. Note that the skill scores are calculated for two phases of the experiments, Phase-1 (2-16 July 2016) and Phase-2 (17-31 July 2016), which are shown in Figure 9. The skill of the 24 h precipitation forecast for the HYBRID experiments is found to be higher than that for the 3DVAR experiments in Phase-2 towards higher rainfall thresholds, which is evident from the ETS values (Figure 9c). The HYBRID_AMV experiment shows improved skill for the precipitation forecast at higher rainfall thresholds compared to the 3DVAR_AMV experiment in Phase-2. The Bias Score indicates that all the experiments in Phase-1 and Phase-2 overestimate rainfall, and the overestimation is more pronounced in the 3DVAR experiments in Phase-2. Similarly, the 48 h rainfall forecasts do not show substantial improvement in either of the HYBRID experiments in Phase-1 (Figure 10a). However, in Phase-2, the skill scores indicate modest improvements in the rainfall forecast at moderate to high rainfall thresholds for the HYBRID experiment compared to 3DVAR (Figure 10c). Furthermore, the assimilation of AMV observations does not significantly improve the precipitation forecast in either the 3DVAR or the HYBRID experiments.
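The two precipitation skill scores can be computed from a 2 x 2 contingency table of forecast and observed threshold exceedances. The sketch below uses the standard textbook definitions of the Bias Score and the ETS; the rainfall amounts, the 10 mm threshold, and the function names are hypothetical and serve only to illustrate the calculation.

```python
import numpy as np

def contingency(forecast, observed, threshold):
    """Count hits, false alarms, misses, and correct negatives at a threshold."""
    f = np.asarray(forecast) >= threshold
    o = np.asarray(observed) >= threshold
    hits = int(np.sum(f & o))
    false_alarms = int(np.sum(f & ~o))
    misses = int(np.sum(~f & o))
    correct_negatives = int(np.sum(~f & ~o))
    return hits, false_alarms, misses, correct_negatives

def bias_score(hits, false_alarms, misses):
    """Ratio of forecast events to observed events; > 1 means overestimation."""
    return (hits + false_alarms) / (hits + misses)

def equitable_threat_score(hits, false_alarms, misses, correct_negatives):
    """Threat score adjusted for hits expected by random chance."""
    total = hits + false_alarms + misses + correct_negatives
    hits_random = (hits + false_alarms) * (hits + misses) / total
    return (hits - hits_random) / (hits + false_alarms + misses - hits_random)

# Hypothetical 24 h rainfall (mm) at eight grid points.
observed = [0, 12, 30, 5, 60, 0, 18, 2]
forecast = [3, 20, 25, 15, 40, 0, 4, 1]

a, b, c, d = contingency(forecast, observed, threshold=10)
print(bias_score(a, b, c), equitable_threat_score(a, b, c, d))
```

The sketch does not guard against thresholds at which no event is observed or forecast, for which the scores are undefined.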

Conclusion
In this study, the impact of assimilating INSAT-3D AMV in two DA systems on short-range forecasts during the Indian summer monsoon season is evaluated. The DA systems used in this study are 3DVAR and HYBRID, available in the WRF modeling system. The present study attempts to quantify the impact of INSAT-3D AMV observations in the 3DVAR and the hybrid ETKF-3DVAR DA systems. The HYBRID DA system incorporates a flow-dependent ensemble BEC that generates an optimal analysis through increments that are consistent with the background flow and respond adaptively to changes in the observing system. Therefore, it is expected that the impact of the observing system may vary depending on the DA system used. Indeed, the results of the study indicate that the impact of INSAT-3D AMV observations differs between the 3DVAR and HYBRID DA systems. Further, the new observing system adds more value to advanced DA systems such as HYBRID than to the traditional 3DVAR approach. However, the ensemble system needs to be properly configured for the DA system to perform optimally. Further, the study has not assimilated satellite radiance observations. The impact of INSAT-3D AMV may differ considerably in both DA systems when satellite radiances are incorporated. Future studies in this direction are warranted.

Figure 1 Model configuration deployed in this study.