Impervious surface Mapping and its spatial–temporal evolution analysis in the Yellow River Delta over the last three decades using Google Earth Engine

The unique geographical location of the land-sea transition makes the ecological environment of the Yellow River Delta very fragile and vulnerable to human activities. As one of the characteristics of anthropogenic activities, monitoring the spatiotemporal changes of impervious surface area (ISA) is of great significance to the protection of the ecological environment in the Yellow River Delta (YRD). Based on the Landsat historical images and computing resources provided by Google Earth Engine (GEE), an ISA mapping method was developed through combining spectral, texture features and random forest algorithm, and subsequently was applied to generate the spatiotemporal distribution data of ISA of the YRD for 1992, 1998, 2004, 2010, 2016 and 2021. The experimental results demonstrated that the proposed method achieved satisfactory accuracy, with an average overall accuracy of 92.23% and an average Kappa coefficient of 0.9090. Through further time-series analysis of ISA, it found that the area of ISA in the YRD increased from the initial 394.87 km2 to 1081.74 km2 during study periods, and the annual growth rate broke through new highs, ranging from the initial 1.01 km2/year to 67.87 km2/year. According to the research results, urban development activities in the region should be strictly restricted in order to protect the ecological environment of the YRD.


Introduction
In recent decades, China has experienced rapid urbanization and industrialization, which has led to significant changes in land use/cover (Qiao et al. 2016). As one of the most important indicators for studying urbanization, impervious surface area (ISA) directly reflected ecological and environmental issues from local to global (Ayalew et al. 2022). The research on the spatiotemporal dynamic distribution of ISA contributed to understanding the impact of human activities on the natural world (Seto et al. 2012). In general, ISA refers to man-made surface materials that impeded water infiltration into underground, such as asphalt, concrete roads, roofs, or car parks (Chester 1996). The Chinese government attached great importance to the sustainable development of the Yellow River Delta (YRD) region due to its unique geographical location and significant strategic position. For example, a specialized regional plan was approved by Chinese State Council for the reasonable development, called "Development Plan for the Yellow River Delta Efficient Ecological Economic Zone". Therefore, it is of great significance to study the spatiotemporal evolution of ISA in the YRD region for deeply understanding the urbanization process of the region, planning sustainable urban development, protecting the ecological environment as well as promoting harmonious co-existence between human and nature .
With the continuous development of computer technique and sensor, numerous approaches have been developed for the ISA extraction from remotely sensed data with several spatial resolution over the past decades (Wu et al. 2015). For example, Wang et al. generated time series ISA data over Huai'an central urban area and subsequently explored the impact of ISA spatiotemporal changes on urban heat islands (Wang et al. 2022a, b); Wang et al. adopted a neighborhoodbased spatiotemporal filter for extracting ISA continuous change information from Landsat images over the Qinhuai River basin from 1988 to 2017, aiming to provide input for future urban planning in this basin . Different from the traditional field surveys, remote sensing has the advantages of wide coverage, short revisit and low cost, which has been considered as the primary means to obtain ISA distribution (Duan et al. 2022). The main differences among researches on ISA using remote sensing images included: (i) selection and extraction of classification features, such as different combinations of spectral and texture features (Fu et al. 2021;Liu and Li 2016;Li et al. 2021;Zhao and Wang 2012); (ii) selection and design of classifiers, the used classification methods for extracting ISA mainly involved: random forest (Liu et al. 2020a, b), support vector machine method (Xue et al. 2009), decision tree (Li et al. 2002) and neural network (Cots-Folch et al. 2007), etc. Gu et al. compared and analyzed the accuracy of the four classification methods mentioned above, and the results showed that the random forest algorithm had great advantages over others (Gu et al. 2019). The random forest classification method has the advantages of higher accuracy, less parameters to set, and more convenient operation and estimation of feature variables' importance (Breiman 2001), leading to its wide application in remote sensing image classification for the extraction of land cover/utilization information.
For the ISA of the YRD, some scholars have conducted relevant studies. Song et al. took Binzhou City, the hinterland of the YRD, as the study area and investigated the impact of ISA on rainwater resources (Song 2018); Shen et al. extracted ISA distribution information of Dongying City in time series based on MNF transform, PPI index and LSMA, and used root mean square error to verify the accuracy (Shen et al. 2021); Liu et al. used random forest method based on Sentinel-1/2 image method to extract the impervious layer of Dongying City in 2019, and the final overall accuracy reached 93.37% . Although the previous process of ISA extraction based on the traditional desktop side can obtain highly accurate ISA information, the image data pre-processing process is tedious and at the same time, high-performance processing equipment is also essential to obtain the resultant data quickly and accurately, which will undoubtedly increase the cost expenditure.
In the twenty-first century, China has entered a new stage in the field of Earth observation, demanding higher performance of the automated processing and analysis of massive remote sensing data, and there is an urgent need to make innovative breakthroughs in intelligent, efficient and lowcost processing of massive remote sensing data. With the rapid development of space information technology, diverse and integrated remote sensing cloud computing platforms have emerged. During the development of the platforms, the Google Earth Engine (GEE) cloud computing platform was born, presenting the remote sensing data processing and application to the general public (Tamiminia et al. 2020). GEE is a cloud platform provided by Google to achieve efficient visualization and processing of massive remote sensing data (Tamiminia et al. 2020). The GEE platform, which has been applied in many fields, simplifies the process of downloading and processing of remote sensing data by using globally open-source datasets capable of processing large amounts of remote sensing data efficiently, such as Landsat and Sentinel (Samadi Todar et al. 2021;Yang et al. 2022).
In this study, we aim to generate a time series of consistent ISA data in the YRD from 1992YRD from , 1998YRD from , 2004YRD from , 2010YRD from , 2016 and 2021 using a random forest classifier and GEE, and subsequently analyze the spatiotemporal changes of ISA in the last three decades. More specifically, this study aimed (i) to determine whether ISA mapping accuracy was improved when spectral and GLCM texture features were synergistically used, (ii) to understand what the ISA spatial-temporal evolution characteristics were in the YRD spanning almost three decades, (iii) to attempt to provide beneficial suggestions for the protection of ecological environment and sustainable development of the YRD.

Study area overview
As one of the three major estuarine deltas in China, the YRD harbors the most extensive and well-preserved wetland ecosystem in the warm temperate region of China (Lu et al. 2018). Due to its unique geographical location and geography, climate change and disturbance from human activities are highly likely to lead to changes in the ecological environment of the YRD . Meanwhile, its unique strategic national status (Chester 1996) also makes the YRD region of great research value in terms of land use/ cover (Zhao et al. 2013). The YRD region is mainly located within the area of present-day Binzhou and Dongying cities in Shandong province, with Dongying City containing 93% of the entire YRD area.
In this paper, the spatiotemporal evolution of ISA of the YRD is studied by selecting the time-series Landsat images of Dongying City as the study area (Fig. 1). Dongying City is located in the mid-latitude region from 36°55′ to 38°10′ N and 118°07′ to 119°10′ E. The eastern and northern regions are near the Bohai Sea. Its climate belongs to a typical warm-temperate continental monsoon with large differences in total precipitation in different years, which makes it prone to suffer from droughts and floods. At present, the total land area of Dongying City is 8243km 2 , with construction land accounting for 14.52% of the total area. The Yellow River runs through Dongying City from southwest to northeast, flowing into the Bohai Sea in the Kenli district. The topography of Dongying City slopes along the direction of the Yellow River, with a maximum difference in elevation of about 27 m.

Data sources and pre-processing
Google Earth Engine is a cloud computing storage platform that stores open-source datasets such as Landsat and Sentinel on a global scale (Tamiminia et al. 2020), and retrieves timeseries Landsat 5/8 cloud less than 30% remote sensing image data based on the GEE cloud platform. The multi-temporal data used for the study are shown in Table 1. The Landsat 5/8 remote sensing images retrieved through the GEE platform have been geometrically and atmospherically corrected, and only the following two aspects of data pre-processing are required: (i) Based on the already screened remote sensing images with less than 30% clouds, image de-clouding is carried out by using the cloud mask method; and (ii) Image cropping and mosaic are carried out by using the uploaded Dongying City boundary data.

Workflow
The technical process of extracting ISA information based on the GEE cloud computing platform is shown in Fig. 2, the main steps include: (i) Data pre-processing includes the screening of remote sensing dataset based on time and cloud (cloud cover threshold of 30% was applied as filter condition for deriving high-quality images), de-clouding of dataset, and cropping of images in the classification area; (ii) Feature calculation mainly includes the calculation of 3 spectral features of NDVI, NDBI and MNDWI, and the calculation of 6 texture features based on Gray Level Co-generation Matrix (GLCM) (iii) The training set and test set are produced based on the GEE cloud platform, and the selected samples were randomly divided into training samples and test samples in the ratio of 7:3; (iv) Classification and accuracy verification are performed based on the random forest classification method for classification, and confusion matrix is calculated using test samples for accuracy validation. This paper also compares and analyses the effects of three common classifiers, namely random forest, decision tree and support vector machine, on the classification results. (v) Mapping and spatiotemporal evolution analysis were conducted to produce temporal

Google Earth Engine (GEE) platform
GEE is a high-performance cloud computing platform applied to remote sensing data processing, and it is an all-inone platform that brings together massive amounts of remote sensing data, data visualization and data analysis and computation (Tamiminia et al. 2020). Its existence has saved a lot of effort in downloading and managing remote sensing data. Google's millions of servers makes it possible to visualize and calculate large amounts of remote sensing data without extremely high computer configurations, greatly reducing the cost. The GEE cloud platform stores open-source datasets such as Landsat and Sentinel on a global scale, allowing users to retrieve data for visualization and computation according to their needs (Tamiminia et al. 2020). With its stored open-source datasets and efficient computational capabilities, GEE has become one of the major cloud platforms for remote sensing classification today (Eckert et al. 2017;Liu et al. 2020a, b;Phalke et al. 2020;Wang et al. 2020;Zhang et al. 2020), and has been used in many situations (Dong et al. 2016;Noel et al. 2017).
Of course, apart from the GEE cloud platform, there exist many other cloud platforms. Although they can also provide effective cloud computing support in processing massive remote sensing data, they all have more or less deficiencies compared to the GEE platform, such as the low openness of remote sensing data, the lack of flexible algorithms for intelligent data processing, and the low efficiency of data processing (Wang et al. 2022a, b;Kaufman and Tanre 1992).

Spectral features
In order to highlight the differences of different features in the study area and improve the reliability of the ISA extraction results, three spectral features were extracted in this paper, which are: (i) normalized difference vegetation index (NDVI), which is beneficial for distinguishing vegetation from other areas (Kaufman and Tanre 1992); (ii) normalized difference built up index (NDBI), which is more effective for detecting urban building sites (Cha et al. 2003); and (iii) modified normalized difference water index (MNDWI) (Xu 2005), which is more accurate in water extraction (Xu 2005). The three spectral features are calculated as follows: where: ρ NIR , ρ RED , ρ SWIR , and ρ GREEN represent the band reflectance of the NIR, red, shortwave IR, and green bands in Landsat imagery, respectively.

Textural features
Since the topography of Dongying has distinct geometric and textural features, relevant studies have shown that the results of extracting land use information by combining textural features can achieve higher accuracy (Pei et al. 2018), so this paper collaboratively uses textural features to extract spatial and temporal distribution information of ISA in Dongying City. According to Zoltan et al.'s study (Zoltan et al. 2013), six second-order statistics with the least correlation, mean (MEA), variance (VAR), entropy (ENT), angular second-order moment (ASM), homogeneity (HOM) and dissimilarity (DIS), were selected from the grey-scale co-occurrence matrix in this paper, which were calculated as follows: where n is the number of grey levels; x and y are the number of rows and columns; R is the n × n dimensional normalized co-occurrence matrix; and R(x, y) denotes the normalized grey-level value in cell x, y of the co-occurrence matrix. (4) Due to the strong correlation between the spectral bands of remote sensing images, so a single red band was selected for texture feature extraction in this paper, which can also greatly reduce the computational effort. In addition, in order to determine the optimal moving window size for calculating texture features, six common moving windows of 3 × 3, 5 × 5, 7 × 7, 9 × 9, 11 × 11 and 13 × 13, were selected for texture feature extraction based on the same environment configuration in this paper, which were used as auxiliary bands for classification, and the final classification accuracy was compared and analyzed. The results showed that the best moving window size was 5 × 5.
In this paper, seven spectral bands of Landsat images were used. In order to enhance the discrimination between different features and improve the accuracy of the ISA extraction results in Dongying City, three spectral features of normalized vegetation index (NDVI), normalized building index (NDBI) and modified normalized difference water index (MNDWI), mean (MEA), variance (VAR), entropy (ENT), angular second order moment (ASM), homogeneity (HOM) and phase dissimilarity (DIS) of the six texture features that are least correlated with each other were extracted.. A total of the above 16 features were selected to form the classification feature space in this study.

Sample selection
In the GEE visualization platform, samples were selected based on visual interpretation and Google Earth historical images, and the land use/cover types in the study area were classified into seven categories: ISA, water, fallow land, arable land, tidal flat, woodland, and grassland (woodland, tamarisk community, reed community) and unused land. Based on this, 800 ISA samples were selected for better extraction of ISA, and 700 samples were selected for each of the remaining six categories, for a total of 5,000 samples. The number and criteria for selecting test and training samples are shown in Table 2. The selected test samples and training samples should be independent of each other. In this paper, based on the GEE cloud platform, 70% of all samples were randomly selected as training samples and the remaining 30% as test samples.

Random forest
The Random Forest (RF) algorithm was proposed by Breiman Leo and Adele Cutler in 2001 (Breiman 2001). Compared with machine learning methods such as support vector machine (SVM) and neural networks, the process of RF classification is simple and requires fewer parameters to be set, and the principle of RF classification is shown in Fig. 3.  (Breiman 2001), and the accuracy of the final classification can reach or even exceed the level of SVM classification (Daniele and Daniel 2013). In general, the more random decision trees that form a forest, the better the final classification model converges and the more accurate the classification results are, but overfitting may occur as the number of decision trees increases (Guo et al. 2016).
The key step that enables the implementation of RF classification lies in the creation of each decision tree that makes up the forest (Breiman 2001). There are two steps involving random selection in this step: (i) 2/3 of the training samples are randomly selected with replacement to form each decision tree, and the remaining 1/3 of the training samples are called out-of-bag estimation (OOB) data, which are used to verify the accuracy of the formed decision trees (Breiman 2001); (ii) Each node of the decision tree randomly selects M features from the set of classification features to grow with maximum splitting, i.e. the subset of predictor variables is chosen randomly. The advantage of using only a randomly selected subset of predictor variables (categorical features) for split growth is that it enables the correlations between decision trees to be kept small and the generalization ability to be high. Finally, the results of each decision tree test are voted on and the category with the most votes for each entry is used as the final classification result (Saeid et al. 2022;Geng et al. 2019;Jan et al. 2022).
The simplicity of the RF operation is mainly reflected by the fact that only two parameters need to be set: (i) ntree, the number of decision trees (Rodriguez-Galiano et al. 2011);and (ii) mty, the number of randomly selected predictor variables (Rodriguez-Galiano et al. 2011). These two parameters correspond to the two key steps described above. In general, increasing the number of ntree is accompanied by a decrease in OOB error, and when the decision tree is increased to a certain number of queues, the OOB error will stabilize (Rodriguez-Galiano et al. 2011), i.e., the OOB error converges. The necessity of determining whether the OOB error converges or not is that the adequacy of the number of decision trees in the RF can be verified (Rodriguez-Galiano et al. 2011). As for the parameter mty, the number of parameters mty is positively correlated with the predictive power of each decision tree and the correlation between different decision trees, respectively, and a lower value of mty can reduce the normalization error (Rodriguez-Galiano et al. 2011).

Accuracy evaluation
In order to evaluate the accuracy of the ISA extraction results, the confusion matrix was calculated using the test samples, which is a common method to evaluate the accuracy of remote sensing image classification results. Using the test samples selected by visual interpretation method and historical images as reference data, the overall classification accuracy OA (Overall Accuracy, OA) and Kappa coefficient were calculated to evaluate the accuracy of ISA extraction results, and the equations are as follows: where t is the number of land types classified; m nn is the number of principal diagonals in confusion matrix; N is the total number of pixels for accuracy calculation; m x is the total number of each row in confusion matrix; and m y is the total number of each column in confusion matrix.
The Kappa coefficient is a combination of two accuracy metrics: user accuracy and mapping accuracy. Finally, the accuracy of the classification results using different classifiers is also compared and analyzed based on the same test samples.

Classification results and accuracy validation
In this paper, the confusion matrix was used to verify the accuracy of the classification results. The overall accuracy and kappa coefficients of the classification results are shown in Table 3.
It can be seen that the average overall accuracy of the extraction results reached 92.23% and the average Kappa coefficient reached 0.9090. The reliability of the results of ISA extraction was verified.
The confusion matrix of the classification results was analyzed for 2021 as an example (Table 4). Firstly, between unused land and ISA, eight unused land samples were misclassified into ISA while eight ISA samples were misclassified into unused land, which is due to the fact that the unused land have similar spectral signature with ISA. Secondly, seven unused land samples were misclassified as fallow lands, eleven fallow land samples were misclassified as unused lands, and seven unused land samples were misclassified as tidal flat. This confusion can also be attributed to the similar spectral signature among fallow land, unused land, and tidal flats. Additionally, eight arable land samples were misclassified as woodland and grassland, and nine woodland and grassland samples were misclassified as arable land. Finally, six water samples were misclassified as woodland and grassland.
For comparison of ISA changes during study period, time-series binary map only involving ISA and pervious surfaces were produced through combining other types except ISA into pervious surfaces. The distribution of time-series ISA in Dongying City is shown in Fig. 4. Through visual inspection, the extraction results of ISA were found to be in good agreement with the corresponding remote sensing images, which further justified the effectiveness of the proposed ISA mapping approach.

Comparative analysis based on different classification algorithms
In order to further verify the advantages of the random forest algorithm in the extraction of ISA of the YRD, we further selected classification and regression decision trees (CART) and SVM to carry out comparative experiments in this paper. The kernel function type selected for the SVM was Gaussian kernel function with gamma coefficient of 1/16 (the reciprocal of the number of feature bands). The extraction results of the three methods are shown in Fig. 5, and the accuracy evaluation results are shown in Table 5.
From the data shown in Table 5 and Fig. 5, compared with the CART, the accuracy of the ISA mapping results of RF has an improvement of about 3 percentage points. And from the visualization effect, a considerable number of pervious surfaces are misclassified as ISA in the extraction results of the decision tree method. For example, a great number of other surfaces in the eastern coastal region are misclassified as ISA, and the reason for this is that the decision tree model can easily lead to the phenomenon of overfit, which indicates that the decision tree algorithm has certain limitations in extracting ISA of the YRD. The difference between the extraction accuracy of the RF and that of the

Spatiotemporal evolution analysis
In 1992, the establishment of DongYing Economic and Technological Development, the first large national economic development zone in the YRD, accelerated urbanization of Dongying City. In this paper, we studied the evolution of ISA in the YRD region, mainly represented by Dongying City. Figure 6 illustrated the spatiotemporal evolution of ISA in Dongying City from 1992 to 2021. Figure 7 showed area percentage of each land use/land cover type during study period. It can be seen that the change of ISA is small from 1992 to 1998, and the density of ISA in the central urban area of Dongying City increased slightly. Several satellite towns have begun to take shape, however, the relationship Fig. 6 Spatial and temporal evolution of ISA between the central urban area and the satellite towns is not close. This is also due to the initial establishment of the economic development zone, so the urbanization process is relatively slow.
From 1998 to 2004, the change of ISA has been relatively obvious with an increase in ISA of 74.16 km 2 . The central urban area of Dongying City has gradually developed to the eastern coastal area, and a development pattern with the central urban area of Dongying City as the core and the surrounding satellite cities as the assistance has been initially formed, and the connection with the surrounding satellite cities has been continuously strengthened. From 2004 to 2016, in the context of closely linked development, the urbanization process has entered a stage of rapid development, with satellite expanding in size and scale. The central urban area of Dongying City continues to extend to the coastal areas, and the density of ISA in the urban area expands. The ISA increases by 268.30 km 2 from the central urban area of Dongying City to the surrounding areas. Finally, in 2021, the urbanization progress in Dongying City has reached its peak in recent years, and the ISA growth rate has reached 67.87 km 2 per year with the economy and society flourishing.
The classification result of fallow land and arable land was merged into cropland, and the classification results of each of the six categories of ISA, water, cropland, tidal flat, woodland and grassland and unused land were used as a percentage of the total area to produce, as shown in Fig. 7, and a line graph was produced based on the true area of each category, as shown in Fig. 8. From the data in the two sets of graphs, we can learn that (i)The ISA is gradually increasing, which also reflects the deepening of urbanization; (ii) With the acceleration of urbanization and the increase of population, the increase of inland reservoirs has led to an increase in the proportion of water areas year by year, which represents the increasing demand for water resources, and thus reflects the relationship between ISA and water areas; (iii)From 1992 to 2004, cropland was abandoned, and the area where the original cropland was located was gradually converted into woodland and grassland and unused land. From 2004 to 2021, the economic and social development of Dongying City was at a rapid stage, and the rapid growth of population was accompanied by an increase in demand for various resources. Therefore, the cropland area at this stage is increasing year by year, and the unused land is also being developed and utilized continuously to meet the increasing demand caused by population growth.; (iv)In addition, it can be seen from the figure that although the area of woodland and grassland has increased at first, with the acceleration of urbanization, woodland and grassland are losing a lot, the ecological environment is damaged, and the balance between human and nature is unbalanced. Therefore, while the economic and social development and urbanization process are deepening, more attention should be paid to the protection of the ecological environment.

Conclusions
This paper takes Dongying, a major city in the Yellow River Delta, as the study area, and extracts the spatiotemporal ISA distribution data through combining random forest algorithm with spectral features, texture features, Landsat data provide by GEE at an interval of six years from 1992. Subsequently, we explored the spatiotemporal evolution characteristics of ISA in the study area. The main findings of the study are as follows.
(i) The overall precision for ISA extraction ranged from 89.39% to 93.55%, with an average overall precision of 92.23%. And the kappa coefficient ranged from 0.8761 to 0.9156, with an average kappa coefficient of 0.9090. The extraction method in this paper achieved a more satisfactory classification precision. (ii) Since 1992, the area of ISA in Dongying City has increased nearly threefold from 394.87 km 2 at the beginning to 1081.74 km 2 , and the growth rate also climbed from 1.01 km 2 /year to 67.87 km 2 /year. The ISA of Dongying City has grown by leaps and bounds, and the level of urbanization is changing rapidly. (iii) In the past 30 years, Dongying City strengthened the linkage between the central city area and satellite cities. With the ongoing expansion of the central city towards the eastern coastal area, an overall urban development pattern was formed, with the central city area near the eastern coast as the core and radiating outward to drive development.
The results of the spatiotemporal evolution analysis of the ISA of the YRD in this study can provide informative suggestion for the urbanization process, urban development planning and ecological environmental protection in the YRD. In our future studies, we will attempt to introduce street-view images into the process of ISA mapping for its accuracy improvement due to its merits and availability .