An improved method for estimating soil moisture over cropland using SAR and optical data

This paper aims to construct a simple soil moisture (SM) retrieval model using Sentinel-1 synthetic aperture radar (SAR) data. The water cloud model (WCM) was used to remove the contribution of vegetation to the radar backscattering coefficient, yielding an estimate of the soil backscattering coefficient. Based on an established SM retrieval model that requires no soil roughness parameters, the SM in farmland and forest land was retrieved using radar VV-VH dual-polarization data. To account for the interference of uneven surfaces with the radar signal, the radar local incidence angle was added as a parameter, giving an improved semi-empirical SM retrieval model. Validation gave a root mean square error (RMSE) of 0.04 and a Pearson correlation coefficient (r) of 0.80, showing that an SM retrieval model without soil roughness parameters can estimate soil moisture with reasonable accuracy. The influence of topographic factors (elevation, slope, and aspect) on the retrieval results was also analyzed; areas with steep slopes, where the radar signal is blocked, were found to be unfavorable for estimating SM. The SM retrieval method constructed in this paper offers advantages for research and practical applications, and its application to other SAR data remains to be studied.


Introduction
Soil moisture (SM) is closely related to many basic agricultural activities and hydrological processes, and it is of great significance in agricultural applications such as crop growth monitoring, crop yield estimation, and irrigation (Koster et al. 2004; Leenhardt et al. 2004; Saux-Picart et al. 2009; Balenzano et al. 2011). The temporal variability of SM provides good information on crop water demand, which helps in formulating agricultural water management policies (Champagne et al. 2012; Trudel et al. 2012; Sekertekin et al. 2020). Because SM is spatially heterogeneous and SM observation equipment is costly, it is difficult to establish a high-resolution regional observation network (Korres et al. 2013; Hajj et al. 2016).
The rapid development of microwave (MW) remote sensing (RS), especially spaceborne synthetic aperture radar (SAR), overcomes the limitations of traditional point-based SM measurement and makes it possible to monitor SM over large areas in real time (Mattia et al. 2009; Wei et al. 2014; Shi et al. 2021). SAR can provide monitoring data with high temporal and spatial resolution under any weather conditions (Gherboudj et al. 2011; Bauer-Marschallinger et al. 2018). The surface backscattering coefficient provided by SAR is directly related to the dielectric constant, so surface SM information can be extracted from it effectively (Zribi et al. 2019).
Methods for retrieving soil moisture from SAR data can be divided into theoretical, empirical, and semi-empirical models. Theoretical models involve a wide range of parameters, which makes them difficult to implement (Gorrab et al. 2015). Empirical models are established under specific conditions, so their applicability to other regions is uncertain when the surface and radar parameters differ (Chen et al. 2003; Baghdadi et al. 2012). Semi-empirical models relate the backscattering coefficient, surface parameters, and radar parameters, which gives them a wide application range (Shakya et al. 2021; Stuurop et al. 2021). Because the interaction between electromagnetic waves and the surface is complex, the radar backscattering coefficient is affected by the surface dielectric constant, soil roughness, radar parameters (incidence angle, frequency, and polarization), and vegetation cover (Balenzano et al. 2013). The water cloud model (WCM) uses simulated or measured surface roughness and vegetation coverage datasets to simplify the theoretical backscattering models (El Hajj et al. 2017; Wang et al. 2020). In vegetated areas, the WCM can effectively remove the contribution of vegetation to the backscattering coefficient. Soil roughness is one of the crucial factors affecting retrieval accuracy and one of the most challenging to monitor over large areas (Aubert et al. 2011; Balenzano et al. 2011; Zheng et al. 2021). Accurate soil roughness is difficult to measure over large areas, particularly on cultivated land, where it changes over time (Balenzano et al. 2013; Zheng et al. 2021). Eliminating the soil roughness parameters therefore simplifies the SM retrieval model. Multi-polarization, multi-angle, and multi-band radar data can be used to eliminate the influence of soil roughness and simplify the relationship between soil moisture and radar data (Dey et al. 2020).
Where the terrain relief is pronounced, the influence of the local terrain on the radar beam should be considered (Rahman et al. 2008; Ouellette et al. 2017; Zhu et al. 2019). The radar local incidence angle accounts for this influence of local terrain on the radar signal.
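A minimal sketch of how a local incidence angle can be computed from terrain slope and aspect, assuming the usual illumination geometry in which the angle is measured between the line of sight to the sensor and the local surface normal. The function name, the argument conventions (aspect as the azimuth the slope faces, sensor azimuth measured from the target towards the sensor), and the example values are assumptions for illustration and are not taken from the paper.

```python
import math


def local_incidence_angle(theta_deg, slope_deg, aspect_deg, sensor_az_deg):
    """Angle (degrees) between the line of sight to the sensor and the surface normal.

    theta_deg     : incidence angle over flat ground, measured from the vertical
    slope_deg     : terrain slope
    aspect_deg    : azimuth the slope faces, clockwise from north (assumed convention)
    sensor_az_deg : azimuth from the target towards the sensor (assumed convention)
    """
    theta = math.radians(theta_deg)
    slope = math.radians(slope_deg)
    delta_az = math.radians(sensor_az_deg - aspect_deg)
    # Dot product of the unit vector towards the sensor and the unit surface normal.
    cos_loc = (math.cos(theta) * math.cos(slope)
               + math.sin(theta) * math.sin(slope) * math.cos(delta_az))
    cos_loc = max(-1.0, min(1.0, cos_loc))  # guard against rounding outside [-1, 1]
    return math.degrees(math.acos(cos_loc))


if __name__ == "__main__":
    # Hypothetical geometry: sensor due west of the target (azimuth 270 degrees).
    for label, aspect in (("west-facing", 270.0), ("east-facing", 90.0)):
        print(label, round(local_incidence_angle(39.0, 10.0, aspect, 270.0), 2))
```

Under these assumed conventions, a slope facing the sensor has a smaller local incidence angle than flat ground and a slope facing away has a larger one, which is the geometric reason slopes on opposite aspects respond differently to the same radar beam.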
Based on the established SM retrieval model that removes soil roughness parameters, a semi-empirical retrieval model was constructed using Sentinel-1 SAR VV-VH dual-polarization data and measured SM data. The influence of topographic factors (elevation, slope, and aspect) on the retrieval results was analyzed, and data from areas unfavorable for estimating SM were removed. The radar local incidence angle was then added as a parameter to improve the model and the accuracy of SM estimation. The semi-empirical model can effectively remove the effects of soil roughness and topography on the radar backscattering coefficient. The model is not limited by surface conditions and can estimate soil moisture well.

Study area
The study area is in southern Henan Province, China. Figure 1 shows its geographical location and the SM observation stations. It lies in the transition zone between the subtropical and temperate zones. Interannual precipitation varies considerably and is unevenly distributed in time and space. Generally, precipitation is lower in winter and spring and higher in summer and autumn, and is concentrated mainly in July and August.
The southwest of the study area is the Nanyang Basin, the middle is the Tongbai hills, and the northeast is the Huaihe River alluvial plain. As the area focuses on agricultural development, the basin and plain are cultivated land growing many crops, mainly wheat and corn. Dense broad-leaved forests grow in the hilly areas. Crop water demand differs between periods, so timely monitoring of soil drought is valuable for agricultural production.

In situ measurements
The in situ measurement data used in this work were collected by the China National Meteorological Science Data Center (available at https://data.cma.cn/). The meteorological stations are equipped with different types of sensors to measure parameters such as soil relative moisture and soil temperature at different depths, precipitation, wind speed, and temperature. The meteorological bureau uses the GStar-I (DZN2) automatic SM observation instrument to collect hourly soil relative moisture data at depths of 10 cm to 100 cm. The hourly SM curve of each station can be used to check the integrity of that station's data, and abnormal values can be eliminated based on the trend of the data to ensure the accuracy of the measured SM. The in situ measurements at a depth of 10 cm were used to build the model and to verify the accuracy of the estimated SM.

Sentinel-1
Sentinel-1 is an Earth observation mission comprising two polar-orbiting satellites, A and B. Each carries a C-band SAR sensor with an operating frequency of 5.4 GHz and multi-polarization imaging capability. The revisit cycle of a single satellite is 12 days, which shortens to 6 days when both satellites are used. Depending on the acquisition mode, images with a resolution of 5-40 m can be captured under all weather conditions. The SAR images used in this study are single look complex (SLC) data acquired in the interferometric wide swath (IW) mode, which provides VV and VH polarization data. The Sentinel-1 data were acquired on 7 July and 16 November 2019.

Landsat-8
Landsat-8 data are provided at no cost to global users through the data distribution website (https://glovis.usgs.gov/) of the United States Geological Survey (USGS). Landsat-8 carries the operational land imager (OLI) and the thermal infrared sensor (TIRS), and its revisit cycle is 16 days. The OLI instrument has nine spectral bands, eight at a resolution of 30 m and a panchromatic band at 15 m, and the TIRS has two thermal infrared bands at a resolution of 100 m. Landsat-8 images from 7 July and 12 November 2019 were acquired to keep the acquisition times close to those of the Sentinel-1 data.

Methodology
This paper used the WCM to estimate the soil backscattering coefficient. The established SM retrieval model was then used to estimate soil relative moisture in the vegetated study area. The effects of elevation, slope, and aspect on the retrieval results were examined so that data unfavorable for estimating SM could be removed. To account for the influence of terrain on the results, the radar local incidence angle was added to the model as a parameter. The workflow for SM retrieval is illustrated in Fig. 2.

Water cloud model
In vegetated areas, the vegetation layer contributes to the radar backscattering coefficient. This contribution must be subtracted when retrieving soil moisture from SAR data. The WCM formula is as follows:

$$\sigma^0 = \sigma^0_{veg} + \tau^2(\theta)\,\sigma^0_{soil} \qquad (1)$$

$$\sigma^0_{veg} = A \cdot \mathrm{VWC} \cdot \cos\theta \cdot \left(1 - \tau^2(\theta)\right) \qquad (2)$$

$$\tau^2(\theta) = \exp\left(-2B \cdot \mathrm{VWC} \cdot \sec\theta\right) \qquad (3)$$

where $\sigma^0$ is the total backscattering coefficient received by the radar; $\sigma^0_{veg}$ and $\sigma^0_{soil}$ are the vegetation and soil backscattering coefficients, respectively; $\theta$ is the radar incidence angle; $\tau^2(\theta)$ is the two-way attenuation factor for radar waves passing through the vegetation; VWC is the vegetation water content; and the values of A and B depend on the vegetation type.
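The vegetation correction can be sketched in a few lines. The sketch assumes the commonly used form of the WCM, in which the total backscatter equals the vegetation term A·VWC·cos θ·(1 − τ²) plus the soil term attenuated by τ² = exp(−2B·VWC/cos θ), with all backscatter values in linear power units rather than dB. The function names and the values of A, B, and VWC are hypothetical placeholders, not the parameters used in this study.

```python
import math


def attenuation(vwc, theta_deg, b):
    """Two-way vegetation attenuation tau^2 = exp(-2 * B * VWC / cos(theta))."""
    return math.exp(-2.0 * b * vwc / math.cos(math.radians(theta_deg)))


def total_backscatter(sigma_soil, vwc, theta_deg, a, b):
    """Forward WCM in linear units: vegetation term plus attenuated soil term."""
    tau2 = attenuation(vwc, theta_deg, b)
    sigma_veg = a * vwc * math.cos(math.radians(theta_deg)) * (1.0 - tau2)
    return sigma_veg + tau2 * sigma_soil


def soil_backscatter(sigma_total, vwc, theta_deg, a, b):
    """Inverse WCM: subtract the vegetation term, then undo the attenuation."""
    tau2 = attenuation(vwc, theta_deg, b)
    sigma_veg = a * vwc * math.cos(math.radians(theta_deg)) * (1.0 - tau2)
    return (sigma_total - sigma_veg) / tau2


def db_to_linear(value_db):
    return 10.0 ** (value_db / 10.0)


def linear_to_db(value):
    return 10.0 * math.log10(value)


if __name__ == "__main__":
    A, B = 0.0012, 0.091   # hypothetical vegetation parameters
    vwc = 1.0              # hypothetical vegetation water content (kg/m^2)
    sigma_total = db_to_linear(-10.0)  # hypothetical observed backscatter
    sigma_soil = soil_backscatter(sigma_total, vwc, 39.0, A, B)
    print(round(linear_to_db(sigma_soil), 2), "dB")
```

Working in linear units keeps the subtraction of the vegetation term physically meaningful; converting to dB is left to the final step.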

Improved SM retrieval model
Previous studies have shown a close relationship between the soil backscattering coefficient, SM, and soil roughness (Ulaby et al. 1981), in which the backscattering coefficient $\sigma^0_{pq}$ is expressed as a function of $M_v$, $l$, and $s$, where pq denotes the polarization (H or V), $M_v$ is soil moisture, $l$ is the correlation length, and $s$ is the root mean square height. Correlation length and root mean square height are the essential components of surface roughness. Zribi and Dechambre (2002) proposed the combined soil roughness parameter $Z_s = s^2/l$, so the empirical formula can be abbreviated to a relationship between the backscattering coefficient, $M_v$, and $Z_s$. Based on data simulation, Kim and Van (2009) expressed the relationship between the backscattering coefficient, SM, the combined soil roughness parameter, and the incidence angle (formula 6). In formula 6, the combined soil roughness parameter can be eliminated by combining data from two polarizations, which yields a relationship between the backscattering coefficient and SM for each polarization combination. Wang et al. (2018) gave the SM retrieval formula for VV-VH dual-polarization data, in which $A_{VVVH}$, $B_{VVVH}$, and $C_{VVVH}$ are functions of the radar incidence angle. Wang et al. (2018) fed simulated radar and surface parameters into the AIEM model (Wu and Chen) to fit the $A_{VVVH}$, $B_{VVVH}$, and $C_{VVVH}$ parameters for each radar incidence angle. Note that the AIEM model applies only to HH and VV co-polarization data, whereas HV and VH cross-polarization data are described jointly by the AIEM and Oh models (Oh et al. 2002).
In this study, 75% of the data were used to fit the formula and the remaining 25% to verify the accuracy of the model. The radar incidence angle changes under the influence of terrain, as shown in Fig. 3. The radar local incidence angle was therefore added to the formula as a parameter, giving the improved SM retrieval formula whose fitted form is reported in the Results.
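The fitting step can be illustrated with ordinary least squares. The sketch below assumes the model form suggested by the fitted formula in the Results, namely that log10 of SM is linear in cos(θ)·σ⁰_soil(VV), cos(θ)·σ⁰_soil(VH), and cos(θ), plus a constant. The function names, the rounding rule for the 75% split, and the synthetic data are assumptions for illustration; the paper does not state how the fit or the split was carried out.

```python
import numpy as np


def design_matrix(sigma_vv, sigma_vh, theta_loc_deg):
    """Columns: cos(theta)*sigma_VV, cos(theta)*sigma_VH, cos(theta), 1."""
    c = np.cos(np.radians(np.asarray(theta_loc_deg, dtype=float)))
    vv = np.asarray(sigma_vv, dtype=float)
    vh = np.asarray(sigma_vh, dtype=float)
    return np.column_stack([c * vv, c * vh, c, np.ones_like(c)])


def fit_coefficients(sigma_vv, sigma_vh, theta_loc_deg, mv):
    """Least-squares fit of log10(Mv) against the four predictors."""
    x = design_matrix(sigma_vv, sigma_vh, theta_loc_deg)
    y = np.log10(np.asarray(mv, dtype=float))
    coef, *_ = np.linalg.lstsq(x, y, rcond=None)
    return coef


def predict(coef, sigma_vv, sigma_vh, theta_loc_deg):
    """Estimated Mv = 10 ** (linear combination of the predictors)."""
    return 10.0 ** (design_matrix(sigma_vv, sigma_vh, theta_loc_deg) @ coef)


def split_indices(n, train_fraction=0.75, seed=0):
    """Random split; the training size is rounded up (an assumed rule)."""
    order = np.random.default_rng(seed).permutation(n)
    n_train = int(np.ceil(n * train_fraction))
    return order[:n_train], order[n_train:]


if __name__ == "__main__":
    rng = np.random.default_rng(42)
    n = 138
    vv = rng.uniform(-15.0, -5.0, n)      # synthetic soil backscatter values
    vh = rng.uniform(-22.0, -12.0, n)
    theta = rng.uniform(30.0, 45.0, n)    # synthetic local incidence angles
    hypothetical = np.array([0.08, -0.01, 0.34, -0.72])
    mv = predict(hypothetical, vv, vh, theta) * 10.0 ** rng.normal(0.0, 0.02, n)
    train, test = split_indices(n)
    coef = fit_coefficients(vv[train], vh[train], theta[train], mv[train])
    est = predict(coef, vv[test], vh[test], theta[test])
    print(len(train), len(test), coef.round(4))
```

Fitting in log space turns the exponential model into a linear regression, so no iterative optimizer is needed; with 138 samples, rounding the 75% share up gives 104 training and 34 validation samples.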

Results
The soil backscattering coefficient was calculated with the WCM from the Sentinel-1 and Landsat-8 RS images. There are 138 groups of data, of which 104 were put into formula 7 as training samples. The fitted retrieval formula is as follows:

$$M_v = 10^{\,0.07694\cdot\cos(\theta)\cdot\sigma^0_{soil(VV)} \;-\; 0.00934\cdot\cos(\theta)\cdot\sigma^0_{soil(VH)} \;+\; 0.33924\cdot\cos(\theta) \;-\; 0.7194} \qquad (9)$$

The fitted formula 9 was used to estimate SM for the remaining 34 samples, and a scatter diagram was constructed against the 34 measured values (Fig. 4). The root mean square error (RMSE) is 0.04, and the Pearson correlation coefficient (r) between the estimated and measured soil moisture is 0.80. Even without a soil roughness parameter, the SM retrieval model can therefore estimate SM with reasonable accuracy.

Fig. 2 The illustration of the SM retrieval model

To analyze the influence of topographic factors (elevation, slope, and aspect) on the model, soil moisture was estimated with formula 9. All estimated SM data were divided into four groups according to their elevation (DEM), slope, and aspect values, respectively. The correlation coefficients between estimated and measured SM in each group are listed in Table 1. In the study area, the basin and plain areas have low elevation, gentle slopes, and no pronounced aspect distribution, whereas the hilly areas show the opposite. Table 1 supports two conclusions. (1) In areas with low elevation (below 130 m), gentle slope (below 2°), and north or south aspects, the correlation coefficients between observed and estimated SM differ little, and topographic factors hardly affect the model estimates. (2) The r between retrieved and observed SM in areas with high elevation, steep slope, and east aspect is significantly lower than in other areas. In areas with sizeable topographic relief (steep slopes), the radar signal to and from east-facing slopes is blocked by the terrain, so it cannot represent the surface information.
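As a sketch of the validation step, the fitted formula reported above can be evaluated and scored with RMSE and the Pearson correlation coefficient. The formula is read here as M_v = 10^(0.07694·cos θ·σ⁰_soil(VV) − 0.00934·cos θ·σ⁰_soil(VH) + 0.33924·cos θ − 0.7194). Treating the backscatter inputs as dB values and θ as the local incidence angle in degrees is an assumption, as are the function names and the sample values, which are invented for illustration only.

```python
import math

# Coefficients as printed in the fitted retrieval formula.
COEF_VV, COEF_VH, COEF_COS, INTERCEPT = 0.07694, -0.00934, 0.33924, -0.7194


def estimate_sm(sigma_vv, sigma_vh, theta_deg):
    """Evaluate the fitted formula for one sample."""
    c = math.cos(math.radians(theta_deg))
    exponent = COEF_VV * c * sigma_vv + COEF_VH * c * sigma_vh + COEF_COS * c + INTERCEPT
    return 10.0 ** exponent


def rmse(estimated, measured):
    """Root mean square error between two equally long sequences."""
    if len(estimated) != len(measured) or not estimated:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(e - m) ** 2 for e, m in zip(estimated, measured)]
    return math.sqrt(sum(squared) / len(squared))


def pearson_r(estimated, measured):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(estimated)
    mean_e = sum(estimated) / n
    mean_m = sum(measured) / n
    cov = sum((e - mean_e) * (m - mean_m) for e, m in zip(estimated, measured))
    var_e = sum((e - mean_e) ** 2 for e in estimated)
    var_m = sum((m - mean_m) ** 2 for m in measured)
    return cov / math.sqrt(var_e * var_m)


if __name__ == "__main__":
    # Hypothetical samples: (sigma_VV, sigma_VH, incidence angle, measured SM).
    samples = [(-9.5, -16.0, 38.0, 0.14), (-7.0, -15.0, 40.0, 0.21),
               (-11.0, -18.5, 36.0, 0.10), (-6.0, -14.0, 41.0, 0.25)]
    est = [estimate_sm(vv, vh, th) for vv, vh, th, _ in samples]
    obs = [m for *_, m in samples]
    print("RMSE:", round(rmse(est, obs), 3), "r:", round(pearson_r(est, obs), 3))
```

The same two functions can be applied per group (for example, per slope or aspect class) to reproduce the kind of grouped comparison summarized in Table 1.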
We compared the radar backscattering coefficients in the study area and found that the values on north-facing and south-facing slopes were not significantly different, while the values on east-facing slopes were low. The radar signal on west-facing slopes was affected by foreshortening and layover, and the backscattering coefficient there was significantly higher than in other areas.
Combining the VV-VH dual-polarization data of Sentinel-1 with the optical data of Landsat-8, the improved model was used to retrieve SM in vegetated areas. Taking the data of 7 July 2019 as an example, Fig. 5 shows the spatial distribution of the estimated SM (Fig. 5a) and the normalized difference vegetation index (NDVI) (Fig. 5b). The blank parts of the figure are clouds, shadows, water bodies, and buildings. The estimated SM agrees well with the spatial distribution of NDVI; in particular, SM is relatively low where vegetation coverage is low. In the hilly areas, the slope is steep and the aspect distribution is pronounced, and high and low values of estimated SM are interleaved. Because the radar signal was affected by the sizeable topographic relief, the soil backscattering coefficient was too large on west-facing slopes and too low on east-facing slopes.

Fig. 4 The scatter plot between measured and estimated soil relative moisture
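For reference, NDVI is the normalized difference of near-infrared and red reflectance. The sketch below assumes surface reflectance arrays taken from the Landsat-8 OLI near-infrared band (band 5) and red band (band 4); the function name and the sample values are hypothetical, and the paper does not describe its own NDVI processing chain.

```python
import numpy as np


def ndvi(red, nir):
    """NDVI = (NIR - red) / (NIR + red); pixels with a zero denominator become NaN."""
    red = np.asarray(red, dtype=float)
    nir = np.asarray(nir, dtype=float)
    total = nir + red
    out = np.full(total.shape, np.nan)
    # Divide only where the denominator is non-zero; other pixels keep NaN.
    np.divide(nir - red, total, out=out, where=total != 0)
    return out


if __name__ == "__main__":
    red_band = np.array([[0.05, 0.10], [0.20, 0.00]])   # hypothetical reflectance
    nir_band = np.array([[0.45, 0.30], [0.25, 0.00]])
    print(ndvi(red_band, nir_band))
```

Leaving invalid pixels as NaN makes it straightforward to mask clouds, shadows, water bodies, and buildings before comparing the NDVI map with the estimated SM.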

Conclusion and discussion
SAR polarization data provide useful information about SM. The SM in the study area was estimated from Sentinel-1 VV-VH dual-polarization data using the SM retrieval model without soil roughness parameters. The SM data were grouped by elevation, slope, and aspect, and the correlation between measured and estimated SM in each group was analyzed. For the Sentinel-1 data used in this study, in areas with sizeable topographic relief (steep slopes and pronounced aspect distribution) the radar signal was blocked by the terrain, which produced small radar backscattering coefficients on east-facing slopes and a poor correlation between estimated and measured SM there.
Because the actual radar incidence angle changes under the influence of terrain, the radar local incidence angle was added to the model as a parameter. The SM retrieved by the improved model has the highest estimation accuracy, which shows that accounting for the radar local incidence angle in the model is reasonable.
It is worth noting that the finding that steep slopes and east-facing areas are unfavorable for SM estimation applies to the data used here and does not necessarily carry over to other SAR data: as the SAR imaging mode changes, the area in which the radar signal is blocked will change. This study found that the radar signal on west-facing slopes was affected by foreshortening and layover, and that the radar backscattering coefficient there was too large. However, no SM observation stations in the study area are located on steep, west-facing slopes, so the correlation between observed and estimated SM in such areas could not be verified. In general, the improved SM retrieval model does not need measured surface parameters; it only requires radar polarization data and measured SM data to build the model. Because the model is built on the relationship between the soil backscattering coefficient and SM, it is also applicable to bare land. The SM retrieval method constructed in this paper offers advantages for agricultural irrigation, drought monitoring, and other practical applications, but its application in these areas needs further study. In addition, this paper only discusses the ability of VV-VH polarization combination data to estimate SM; the performance of other polarization combinations remains to be confirmed.