Land subsidence prediction model based on its influencing factors and machine learning methods

Land subsidence has caused huge economic losses in the Beijing plains (BP) since 1980s. Building land subsidence prediction models that can predict the development of land subsidence is of great significance for improving the safety of cities and reducing economic losses in Eastern Beijing plains. The pattern of evolution of land subsidence is affected by many factors including groundwater level in different aquifers, thicknesses of compressible layers, and static and dynamic loads caused by urban construction. First, we used the small baseline subset Interferometric Synthetic Aperture Radar (SBAS‐InSAR) technology on 47 ENVISAT ASAR images and 48 RADARSAT‐2 images and used Persistent Scatterers Interferometric Aperture Radar (PS-InSAR) technology on 27 Sentinel-1 images to obtain the land subsidence monitoring results from June 2003 to September 2018. Second, the accuracy of the InSAR monitoring results was validated by using leveling benchmark land subsidence monitoring results. Finally, we built land subsidence rate prediction models and land subsidence gradient prediction models by combining land subsidence influencing factors and four machine learning methods including support vector machine (SVM), Gradient Boosting Decision Tree (GBDT), Random forest (RF) and Extremely Randomized Trees (ERT). The findings show: (1) The InSAR monitoring results revealed that the Beijing Plain has experienced serious land subsidence during the 2003–2010, 2011–2015 and 2016–2018 periods. (2): The InSAR monitoring results agreed well with the leveling benchmark monitoring results with the Pearson correlation coefficients of two monitoring results were all greater than 0.95 during the 2003–2010, 2011–2015 and 2016–2018 periods, respectively. (3): We found that the land subsidence prediction based on ERT method is the optimal model among four land subsidence prediction models and that the prediction performance of land subsidence prediction model based on ERT method will be greatly improved when apply this prediction model in sub study areas where the land subsidence mechanism is similar owning to the similar hydrogeological parameters.


Introduction
With the loss of surface elevation, land subsidence has become one of the most common geological disasters (Castellazzi et al. 2017;Galloway et al. 1998). Land subsidence, especially for uneven land subsidence, could causes huge economic losses and damages infrastructure, including rupturing underground pipelines and sinking foundations (Strozzi et al. 2017;Ng et al. 2012). By the end of 2018, more than 150 countries around the world, including Italy (Samsonov al. 2014), China (Luo et al. 2019;Li et al. 2020), Mexico (Castellazzi et al. 2016(Castellazzi et al. , 2017, and USA (Galloway al. 1998), had recorded the occurrence of land subsidence. The cities experienced land subsidence in Chinese mainland are mainly distributed in the North China Plain (NCP) (Zhu et al. 2015;Chen et al. 2016Chen et al. , 2020, the Fenwei Basin (Zhao et al. 2011;Qu et al. 2014) and the Yangtze River Delta (Chen et al. 2003;Hu et al. 2014;Ye et al. 2016). Among these areas, the middle and northern regions of the NCP, where Beijing is located, occur the most serious land subsidence .
As the capital city of China, Beijing is a megacity and rated as the world's first tier city (Alpha +) by the globalization and World Cities Study Group and network (GAWC) in 2018. Beijing has a Gross Domestic Product (GDP) of over USD 452 billion and total population of over 21 million by the end of 2018. Domestic water and economic development consume a large amount of groundwater , which lead to the dramatic decrease in groundwater level and the consolidation of aquifer systems; then causing serious land subsidence . Beijing Plain (BP) is prone to subsidence owing to the existence of Quaternary loose sediments, especially for the Eastern Beijing Plain where the thickness of Quaternary loose sediments exceeds 100 m and experienced serious land subsidence (Zuo et al. 2019;Li et al. 2020). The static and dynamic loads caused by urban construction have accelerated the process of land subsidence development in the BP (Yang et al. 2018). Previous studies have shown that the land subsidence in BP was mainly affected by groundwater level, the thickness of compressible layers, and urban static and dynamic loads (Yang et al. 2018;Zhou et al. 2019;Li et al. 2020).
As a new quantitative microwave remote sensing technology, Interferometry synthetic aperture radar (InSAR) could obtain land subsidence monitoring information with millimeter accuracy (Berardino et al. 2002). However, its application scope is limited by many factors, including atmospheric delay, spatial and temporal decorrelation, topographic error and thermal noise. In order to solve those obvious deficiencies and expand the application scope of this technology, temporal sequential InSAR (TS-InSAR) technology was proposed. As a typical TS-InSAR technology, Persistent Scatterers Interferometric Aperture Radar (PS-InSAR) could detect points with stable backscattering characteristics as permanent scatterers (PSs) and acquire land subsidence monitoring results with millimeter-scale accuracy (Ferretti et al. 2001). Previous studies have shown that this technology has been widely used in the field of obtaining land subsidence monitoring information (Thapaa et al. 2016). Small baseline subset InSAR (SBAS-InSAR) technology is another expansion and improvement of InSAR technology. Relevant research has proved that this technology can efficiently obtain larger-scale regional land subsidence monitoring results with millimeterscale accuracy .
The traditional land subsidence prediction models can be divided into two categories, including physical prediction model and mathematical prediction model. The physical prediction model simulates the process of land subsidence by combining the influencing factors of land subsidence (Nie et al. 2015). This model can explain the mechanism of land subsidence well and obtain reliable prediction results . However, physical prediction model needs many hydrological parameters that are difficult to obtain, which greatly limits the scope of its application . The mathematical prediction model uses mathematical functions to express the statistical characteristics of historical land subsidence . The mathematical prediction model has a wider range of applications than physical prediction model. However, mathematical prediction model has a poor prediction performance when the dispersion of land subsidence value is large (Li et al. 2021). This phenomenon should be attributed to the fact that the mathematical prediction model cannot explain the mechanism of land subsidence. The land subsidence prediction model based on machine learning method can integrate the advantages of the physical prediction model and mathematical prediction model and overcome the shortcomings of the physical prediction model and mathematical prediction model. Previous studies have shown that the land subsidence prediction model based on machine learning method can obtain reliable prediction results (Shi et al. 2020;Li et al. 2021).
This paper was organized as follows. First, we illustrated the geographical and geological conditions of the study area in Sect. 2. In Sect. 3, we describe the InSAR technology and machine learning methods in detail. Then, we obtain the land subsidence monitoring results and assess the accuracy of the InSAR monitoring results by using leveling benchmarks in Sect. 4. We built the land subsidence rate prediction models and land subsidence gradient prediction models by using machine learning methods in Sect. 4. Finally, we introduced our conclusions in Sect. 5.

Study area
Beijing locates in the northwest fringe of NCP (Fig. 1). Geographically, Beijing can be divided into three geographical units, which namely western mountains, northern mountains, and southeastern plains , respectively. Beijing covers an area approximately 16,400 km 2 , of which plains areas and mountainous areas account for 38% and 62% (Shi et al. 2020), respectively.
The BP lies in the southeast part Beijing city and has typical temperate continental monsoon climate. With an annual average precipitation of 598 mm from 1949 to 2020, BP experienced severe uneven distribution of precipitation across the year, of which 80% concentrated from June to September (Fig. 2). Groundwater has been extracted extremely to meet the need of inhabitant survival and economic development, which greatly exceeded its natural replenishment capacity (Gao et al.2018). Excessive groundwater exploitation has led to obvious decline of the groundwater level in BP, which resulted in an increase in effective stress, thus causing consolidation of aquifer systems and serious land subsidence (Galloway et al. 1998;Chaussard et al. 2014).
BP is a typical piedmont alluvial-proluvial plain, which composed by loose sediments carried by Ji Canal, Yongding River, Chaobai River, Daqing River, and Wenyu River (Shi et al. 2020). Different thickness of Quaternary sediments have been deposited in BP during the long geological period, which provide necessary geological conditions for land subsidence ). According to recharge conditions of groundwater and characteristics of Quaternary loose sediments, the aquifer system in BP can be divided into four groups, including a phreatic aquifer, a first confined aquifer, a second confined aquifer, and a third confined aquifer. The hydrogeological parameters of each aquifer are listed in Table 1. Previous studies have shown that BP has experienced serious land subsidence since the 1970s . Five land subsidence funnels have appeared in the Beijing plain, including Shunyi-Pinggezhuang land subsidence funnel, Shahe-Baxianzhuang land subsidence funnel, Laiguangying land subsidence funnel, Dongbalizhuang-Dajiaoting land subsidence funnel and Yulong-Lixian land subsidence funnel. Dongbalizhuang-Dajiaoting land subsidence funnel is the earliest and most typical 1 3 among those land subsidence funnels and shows a tendency to merge with Laiguangying land subsidence funnel. Based on those findings, the Laiguangying-Dongbalizhuang-Dajiaoting (LDD) land subsidence funnel was selected as the study area in this study.

Datasets
In this study, three radar images sets, including ENVISAT ASAR (EA), RADARSAT-2 (R2) and Sentinel-1 (S1) were used to obtain land subsidence monitoring results. EA images set contained 47 descending radar images collected from June 2003 to August 2010. R2 images set contained 48 descending radar images obtained from November 2010 to November 2015. S1 images set contained 27 ascending radar images acquired from January 2016 to September 2018. The EA satellite, R2 satellite and S1 satellite all operate in the 5.6 cm C-band. However, those three satellites have different revisiting time and incident angles. The detailed parameters of each images set are listed in Table 2. Shuttle Radar Topography Mission Digital Elevation Model (SRTM DEM) with a 90 m resolution was used to remove the topographic phase in interferograms. SRTM DEM data supplied by the Shuttle Radar Topography Mission cover more than 80% of the land area of the world (detailed information can be found at http:// www. gsclo ud. cn/# page2).
Forty-nine leveling benchmarks-17, 20 and 20 leveling benchmarks which measured precise land subsidence monitoring results during the 2003-2010, 2013-2015 and 2016-2018 periods, respectively-were used to assess the accuracy of the InSAR monitoring results. Their locations are shown in Fig. 1a. The groundwater level in different aquifers during three different periods was used to present the degree of groundwater extraction; the lower the groundwater level indicates more severity of groundwater exploitation. The thickness of Quaternary loose sediments was used to present the strength of Quaternary sedimentation; the greater the thickness of Quaternary sediments, the more conducive to the development of land subsidence. Building loads were used to present static load information and subway lines, expressways, and urban roads were used to present the dynamic load information. Figure 3 displays the flowchart of this study. Firstly, we used the SBAS-InSAR method on EA images set and R2 images set to obtain the land subsidence information during the 2003-2010 and 2013-2015 periods; at the same time, we used PS-InSAR method on S1 images set to obtain the land subsidence information during the 2016-2018 period. Secondly, the accuracy of InSAR land subsidence monitoring results was assess by using the leveling benchmark monitoring results. Thirdly, we chose machine learning methods to build the land subsidence rate prediction models and land subsidence gradient prediction models.

Persistent Scatterers Interferometric Aperture Radar (PS-InSAR) technology
PS-InSAR technology was proposed firstly in 2000 to eliminate the problem of low correlation of interference phase caused by atmospheric delay, orbit error, terrain error and other factors which greatly limit the application range of InSAR technology (Ferretti et al. 2000a, Ferretti et al. 2000b. Permanent Scatterers (PS) points can maintain stable backscattering characteristics for a long time (Ferretti et al. 2001). These points correspond to urban roads, buildings and exposed rocks in the wild. PS-InSAR technology has been widely used in the field of land subsidence monitoring and has obtained credible results (Shi et al. 2020). The phase caused by deformation can be separated from the following equation according to the imaging geometric relationship of PS InSAR technology: where is differential interferometric phase of two SAR images, deformation is the deformation phase in line-of-sight (LOS) direction, topographic is the residual topographic phase, orbit is the phase affected by orbit inaccuracy, atmospheric is the atmospheric delay phase, noise is the thermal noise of radar system. After remove other phases, deformation can be obtained; then the surface deformation rate along the LOS direction can be calculated. Considering that the deformation along horizontal in BP can be ignored , land subsidence rate can be estimated by using the following equation: where V is land subsidence rate (deformation rate in the vertical), V LOS is the deformation rate in LOS direction and is incidence angle of the radar sensor.
In this study, we applied PS-InSAR technology which has been integrated in SARProz software on S1 images set to obtain the land subsidence monitoring results. According to the principle of minimizing the sum of temporal baseline, spatial baseline and Doppler centers frequency, the image collected in July 2017 was selected as master image. The main image and other slave images form 26 interferograms. The points which amplitude stability coefficient greater than 0.7 were selected as PS points.

Small baseline subset Interferometric synthetic aperture radar (SBAS-InSAR) technology
SBAS-InSAR technology was proposed firstly in 2002 (Berardinoet al. 2002). SBAS-InSAR technology form interferograms by setting temporal baseline threshold and spatial baseline threshold to further eliminate the influence of temporal and spatial decoherence and atmospheric noise, and obtain more accurate surface deformation information. This technology divides SAR images which spatial baseline is less than the setting threshold into one group, called a subset. According to the setting threshold, all SAR images are divided into several subsets; SAR images within the same subsets can form interferograms, and different subsets are connected by common SAR images. The phase caused by surface deformation can be obtained after remove other phases from interferograms. Previous study has shown that the deformation along horizontal in BP can be ignored . Based on the above findings, the land subsidence rate can be estimated by using Eq. (2).
In this study, SBAS-InSAR technology which has been integrated in GAMMA software (detailed information can be found at https:// www. gamma-rs. ch/ softw are) was used on EA images set to and R2 images set obtains the land subsidence monitoring results. We generated 267 interferograms for EA images set when temporal baseline threshold and perpendicular baseline threshold were set as 500 days and 500 m, respectively. We generated 163 interferograms for R2 images set when temporal baseline threshold and perpendicular baseline threshold were set as 500 days and 500 m, respectively.

Machine learning methods
Machine learning is an interdisciplinary covering many fields including probability theory, statistics, pattern recognition, approximation theory and artificial intelligence. Machine learning methods simulate the process of human thinking and learning by using computers. Machine learning has experienced four stages of development since it was proposed in the 1950s. Machine learning methods have been used in analyzing of land subsidence influencing factors Li et al. 2020) and building land subsidence prediction model (Shi et al. 2020). In this study, we used SVM method, GBDT method, RF method and ERT method to build land subsidence rate prediction models and land subsidence gradient prediction models. These four algorithms have been programmed in Python which is an efficient computer programming language. The results of those four prediction models were cross-validated.

Support vector machine
SVM is a machine learning method based on statistics theory. The core of this algorithm is to find the segmentation hyperplane that can correctly divide the training set and has the largest geometric distance (Vapnik et al. 1998). SVM is a supervised learning method, which has been widely used in statistical classification, regression analysis and pattern recognition (Qin et al. 2005, Sun et al. 2002. Owing to excellent classification performance and strong generalization ability, SVM has been widely used in machine learning cases. However, SVM also has some disadvantages, such as low efficiency when used in large size training sample and sensitive to missing data.

Gradient boosting decision tree
GBDT is an effective supervised machine learning method which has been widely used in regression analysis and prediction model cases (Friedman al. 2001). This algorithm obtains the optimal prediction model by adjusting the weight of the basis learner to obtain the minimum value of the loss function. This algorithm was composed by decision tree and gradient boosting. The decision tree method is used to determine the location of the optimal segmentation by iterative training. Gradient boosting was used to reduce the error of current model in each iteration. The optimal GBDT model can be obtained when the value of the loss function reaches its minimum.

Random forest
RF is one of the most representative algorithm of Bagging in ensemble learning (Breiman al. 1996, Breiman et al. 2001. To improve the accuracy and generalization performance of the whole model, RF obtain the final model result by combining multiple learners and taking the mean value of the combined learners. RF has many advantages, including high accuracy, processing the training set with high-dimensional features without dimensionality reduction, evaluating the importance of each feature, and insensitive to loss values. RF also has the many disadvantages including large computational burden in feature collection, taking more time in model training, prone to fitting and poor interpretation.

Extremely Randomized Trees
ERT algorithm is an efficient supervised machine learning method expanded from RF algorithm. The ERT model could get more credible than RF mode by using all sample sets in training process. ERT model has stronger robustness owing to its base learner has more representative when compared with RF model. ERT model could obtain a better prediction model and a wider range of applications owing to this algorithm can overcome effectively some shortcomings existing in RF model. ERT model is an efficient and concise method, especially for large datasets which need more time to train the all of sample sets, to get a credible model.

Land subsidence monitoring results measured by InSAR method during three time periods
Based on the InSAR land subsidence monitoring results, we obtain three land subsidence rate maps of BP during the 2003-2010, 2011-2015 and 2016-2018 periods. Figure 4 shows that land subsidence rate varied greatly in BP and uneven land subsidence was obvious in BP. The areas where have experienced severe land subsidence mainly located in the east part of Chaoyang District, the northwest part of Tongzhou District, the middle and the south part of Changping District, the west part of Shunyi District and the south part of Daxing District. A total of 272,264 pixels were detected in BP from 2003 to 2010, with a density of 50 pixel/km 2 . The maximum land subsidence rate was -110.7 mm/year during this period. A total of 290,197 pixels were detected in BP from 2010 to 2015, with a density of 46 pixel/km 2 . The maximum land subsidence rate reached -144.4 mm/year during this period. A total of 531,087 pixels were detected in BP from 2016 to 2018, with a density of 83 pixel/km 2 . The maximum land subsidence rate was -136.8 mm/year during this period. We divide the land subsidence rate maps in the three periods into five categories to better study the evolution law of land subsidence in BP. Figure 5 and Table 3, 4 and 5 show the classification results. Table 3 shows that the area with land subsidence rate varied from − 150 to − 120 mm/year, − 120 to − 90 mm/year, − 90 to − 60 mm/year, − 60 to − 30 mm/year and − 30 to 15 mm/year account for 0%, 0.3%, 2.6%, 12.2% and 84.9%, respectively. Table 4 shows that the area with land subsidence rate varied from − 150 to − 120 mm/year, − 120 to − 90 mm/year, − 90 to − 60 mm/year, − 60 to -30 mm/year and − 30 to 15 mm/year account for 0.2%, 1.6%, 3.1%, 11.3% and 83.8%, respectively. Table 5 shows that the area with land subsidence rate varied from − 150 to − 120 mm/year, − 120 to − 90 mm/year, − 90 to − 60 mm/year, − 60 to − 30 mm/ year and -30 to 15 mm/year account for 0.2%, 1.2%, 1.1%, 9.4% and 88.1%, respectively. We found that the area with land subsidence rate varied from − 150 to -120 mm/ year, − 120 to -90 mm/year and − 90 to − 60 mm/year has increased and that the area with land subsidence rate varied from -60 to − 30 mm/year and − 30 to 15 mm/ year has reduced by comparing Table 3 and Table 4. Those findings indicate that the development of land subsidence in BP showed an accelerate trend from 2010 to 2015. We found that the area with land subsidence rate varied from -120 to -90 mm/year, − 90 to -60 mm/year and − 60 to − 30 mm/year has reduced and that the area with land subsidence rate varied from − 30 to 15 mm/year has increased by comparing Table 4 and Table 5. Those findings indicate that the development of land subsidence in BP showed a slowing trend from 2016 to 2018. Figure 6 shows that land subsidence has experienced a rapid development stage in BP from 2003 to 2018 and the spatial discrepancy of land subsidence has increased gradually. By the end of 2018, the maximum cumulative land subsidence in BP has reached 1716.3 mm; at the same time, the area with cumulative land subsidence exceed 300 mm has reached 1524.9 km 2 , accounting for 23.8% of the total area of the BP. Several land subsidence funnels have formed in BP. LDD subsidence funnels have experienced the most serious land subsidence among them. Therefore, we selected LDD land subsidence funnel as the study area and build land subsidence rate prediction models and land subsidence gradient prediction models for LDD land subsidence funnel by using four machine learning methods.  1 3

Accuracy verification of InSAR land subsidence monitoring results
A total of 49 leveling benchmarks were used to verify the reliability of InSAR land subsidence monitoring results. We selected leveling benchmarks as datum points and extracted the InSAR monitoring points within 200m of each benchmark. We compared land subsidence monitoring rate of each benchmark with average land subsidence rate of the extracted points during three time periods. Figure 7 shows that the Pearson correlation coefficient between two land subsidence monitoring results during the 2003-2010, 2011-2015 and 2016-2018 periods were 0.97, 0.96 and 0.95, respectively. In order to further verify the accuracy of InSAR land subsidence results, we counted the maximum error, minimum error and root mean square error (RMSE) between the two land subsidence monitoring results. Table 6 presents that the RMSE of two land subsidence monitoring results were less than 10 mm/year in three time periods. Those findings indicated that InSAR land subsidence monitoring results have high accuracy, which provides reliability for subsequent research.

Land subsidence rate prediction models based on machine learning methods
In this study, 54,636 pixels detected by the InSAR technology were selected as research data to build land subsidence rate prediction models by using four machine learning methods including SVM method, GBDT method, RF method and ERT method. Among these data, 70% are selected as training data and the remaining 30% are selected as verification data. Firstly, four land subsidence rate prediction models were built by using four machine learning methods. Secondly, the optimal land subsidence rate prediction model was chosen by comparing the accuracy and RSME of the four models; at the same time, the study area was divided into 7 sub study areas according to hydrogeological parameters (Luo al. 2019). Figure 8 shows the classification results of the study area. 1 3 Table 7 shows the hydrogeological parameters of 7 sub study areas. Finally, the land subsidence rate prediction models were built for each sub study area by using the optimal land subsidence rate prediction model.

Comparison of land subsidence rate prediction models based on machine learning methods
We selected the land subsidence rate during the 2011-2015 period as the dependent variable, while groundwater level in the phreatic aquifer, groundwater level in the first confined aquifer, groundwater level in the second confined aquifer, groundwater level in the third confined aquifer, the thickness of compressible layers and additional stress engendered from static and dynamic loads in same time period and land subsidence rate during the 2003-2010 period were the explanatory variables. We built four land subsidence rate  prediction models by using SVM method, GBDT method, RF method and ERT method, respectively. Figure 9 shows the results of four land subsidence rate prediction models during the 2011-2015. We found that the land subsidence rate prediction model based on ERT method has the highest accuracy and the lowest RSME among those four prediction models. We selected the land subsidence rate during the 2016-2018 period as the dependent variable, while groundwater level in the phreatic aquifer, groundwater level in the first  confined aquifer, groundwater level in the second confined aquifer, groundwater level in the third confined aquifer, the thickness of compressible layers and additional stress engendered from static and dynamic loads in same time period and land subsidence rate during the 2011-2015 period were the explanatory variables. We built four land subsidence rate prediction models by using SVM method, GBDT method, RF method and ERT method, respectively. Figure 10 shows the results of four land subsidence rate prediction models during the 2016-2018. We also found that the land subsidence rate prediction model based on ERT method has the highest accuracy and the lowest RSME among those four prediction models. Based on the above findings, we choose the land subsidence rate prediction model based on ERT model as the optimal land subsidence prediction model in this study.

Land subsidence rate prediction models based on ERT method in sub study areas.
We built land subsidence rate prediction models by using ERT method during the 2011-2015 and 2016-2018 periods in 7 sub study areas. The parameter settings in this section are the same as those in Sect. 4.3.1. Figure 11 and Fig. 12 show the results of land subsidence rate prediction models based on ERT method in 7 sub study areas during the 2011-2015 and 2016-2018 periods, respectively. We found that the accuracy of the land subsidence rate prediction model based on ERT mothed in each sub study area increased and the RSME has decreased when compared with land subsidence rate prediction model based on ERT method applied on the whole study area, except for the sub study

Land subsidence gradient prediction models derived from machine learning methods
In this study, we also selected 54,636 pixels detected by the InSAR technology as research data to build land subsidence gradient prediction models by using four machine learning methods including SVM method, GBDT method, RF method and ERT method. Of these data, 70% are selected as training data and the remaining 30% are selected as verification data. Firstly, four land subsidence gradient prediction models were built by using four machine learning methods. Secondly, the optimal land subsidence gradient prediction model was chosen by comparing the accuracy and RSME of the four models. Finally, the land subsidence gradient prediction models was built for each sub study area by using the optimal land subsidence gradient prediction model.

Comparison of land subsidence gradient prediction models based on machine learning methods
We selected the land subsidence gradient during the 2011-2015 period as the dependent variable, while groundwater level gradient in the phreatic aquifer, groundwater level gradient in the first confined aquifer, groundwater level gradient in the second confined aquifer, groundwater level gradient in the third confined aquifer, the gradient of thickness of compressible layers and the gradient of additional stress in same time period and land subsidence gradient during the 2003-2010 period were the explanatory variables. Four land subsidence gradient prediction models were built by using SVM method, GBDT method, RF method and ERT method, respectively. Figure 13 shows the results of four land subsidence gradient prediction models during the 2011-2015. We found that the land subsidence gradient prediction model based on ERT method has the highest accuracy and the lowest RSME among those four prediction models. We selected the land subsidence gradient during the 2016-2018 period as the dependent variable, while groundwater level gradient in the phreatic aquifer, groundwater level gradient in the first confined aquifer, groundwater level gradient in the second confined aquifer, groundwater level gradient in the third confined aquifer, the gradient of thickness of compressible layers and the gradient of additional stress in same time period and land subsidence gradient during the 2011-2015 period were the explanatory variables. Four land subsidence gradient prediction models were built by using SVM method, GBDT method, RF method and ERT method, respectively. Figure 14 shows the results of four land subsidence gradient prediction models during the 2016-2018. We also found that the land subsidence gradient prediction model based on ERT method has the highest accuracy and the lowest RSME among those four prediction models. Based on the above findings, we choose the land subsidence gradient prediction model based on ERT model as the optimal land subsidence prediction model in this study.

Land subsidence gradient prediction models based on ERT method in sub study areas
We built land subsidence gradient prediction models by using ERT method during the 2011-2015 and 2016-2018 periods in 7 sub study areas. The parameter settings in this section are the same as those in Sect. 4.4.1. Figure 15 and Fig. 16 show the results of land subsidence gradient prediction models based on ERT method in 7 sub study areas during the 2011-2015 and 2016-2018 periods, respectively. We found that the accuracy of the land subsidence gradient prediction model based on ERT mothed in each sub study area increased and the RSME has decreased when compared with land subsidence gradient prediction model based on ERT method applied on the whole study area, except for the sub study area labeled B. This is mainly due to the similar hydrogeological conditions in each sub study area.

Conclusion
In this research, we obtained the land subsidence monitoring results from June 2003 to September 2018 covering Beijing Plain by using SBAS-InSAR method on 47 EA images and 48 R2 images and using PS-InSAR method on 27 S1 images; at the same time, we assess the accuracy of InSAR land subsidence monitoring results. Based on the land subsidence monitoring results, we analyzed the evolution law of land subsidence in Beijing Plain. Land subsidence rate prediction models and land subsidence gradient models in LDD land subsidence funnel were built by using machine learning methods. The following conclusions can be derived: (  2003-2010, 2011-2015 and 2016-2018 periods, respectively. The area with the most serious subsidence occurs in the east part of Chaoyang District and the northwest part of Tongzhou District, where LDD land subsidence funnel located. The InSAR land subsidence monitoring results agree well with the that of leveling benchmark, and the Pearson correlation coefficient between two land subsidence monitoring results all greater than 0.95 during three study period. (2) With the highest accuracy and the lowest RSME, the land subsidence rate prediction model based on ERT method was the optimal prediction model among four land subsidence rate prediction models based on SVM method, SVM method, GBDT method, RF method and ERT method. The accuracy of the land subsidence rate prediction model based on ERT method during the 2011-2015 and 2016-2018 periods was 94% and 93.7%, respectively. Because the land subsidence mechanism is similar in areas with similar hydrogeological parameters, the land subsidence rate prediction models in the sub study area have better prediction performance when compared with the performance of land subsidence rate prediction model based on ERT method applied to the whole study area. (3) We found that land subsidence gradient prediction model based on ERT method was the optimal prediction model among four land subsidence gradient prediction models. The prediction performance of land subsidence gradient prediction model will be greatly improved when land subsidence gradient prediction model based on ERT method was applied to sub study areas with similar hydrological parameters. This finding provides a new method for land subsidence gradient prediction.

Conflict of interest
We do not have any conflict of interest, financial or non-financial that are directly or indirectly related to the work submitted here for publication.

Ethical approval
The study followed the accepted principles of ethical and professional conduct throughout the research work. No animals or human participants are involved in this research.