Real‐time prediction and ponding process early warning method at urban flood points based on different deep learning methods

Accurate prediction of urban floods is regarded as one of the critical means to prevent urban floods and reduce the losses caused by floods. In this study, a refined prediction and early warning method system for urban flood and waterlogging processes based on deep learning methods is proposed. The spatial autocorrelation of rain and ponding points is analyzed by Moran's I (a commonly used statistic for spatial autocorrelation). For each ponding point, the relationship model between the rainfall process and ponding process is constructed based on different deep learning methods, and the results are analyzed and verified by mean absolute error (MAE), root mean square error (RMSE), Nash efficiency coefficient (NSE) and correlation coefficient (CC). The results show that the gradient boosting decision tree algorithm has the highest accuracy and efficiency (with a 0.001 m RMSE of the predicted and measured ponding depth) for ponding process prediction and is regarded as the most suitable method for ponding process prediction. Finally, the real‐time prediction and early warning of urban floods and waterlogging processes driven by rainfall forecast data are realized, and the results are verified by the measured data. The research results can provide theoretical support for urban flood prevention and control.

In 2021, the "7.20 rainstorms" in Zhengzhou, China, caused 292 deaths and a direct economic loss of 53.2 billion yuan. With the aggravation of climate change, the frequency and intensity of extreme rainfall events may increase in the future, and cities may face a greater flood disaster threat (Ntelekos et al., 2010; Schreider et al., 2000; Xu et al., 2018). Therefore, how to forecast the inundation process at each ponding point more accurately during urban flooding and waterlogging, so as to minimize the loss of life and property and the economic losses caused by floods, has become a scientific problem that urgently needs to be solved.
To prevent floods and implement timely countermeasures to reduce flood losses, city authorities usually need to predict urban flooding and waterlogging (Ke et al., 2020). Urban flood simulation models based on physical mechanisms are the most popular method for predicting the urban flooding process and have attracted great attention from scientists (Mignot et al., 2018). So far, scholars have developed a variety of models (1D hydrological models, 2D hydrodynamic models, and 1D-2D dual drainage models) to simulate urban floods and waterlogging (Babaei et al., 2018; Bermudez et al., 2018; Hou et al., 2020). However, due to the lack of sufficient calibration and verification data (Macchione et al., 2019; Wu et al., 2020) and the long running time of some models (Hou et al., 2020; Suwit & Parinda, 2016), these numerical models are limited to a certain extent in real-time urban flood prediction and early warning. In contrast, deep learning (or machine learning) technology can mine knowledge and laws that conventional data processing methods cannot, and it can identify highly complex nonlinear relationships between characteristic and predictive variables (Panahi et al., 2020) without requiring an understanding of the underlying physical process (Mosavi et al., 2019). Therefore, deep learning technology provides a new idea for rapid and accurate urban flood prediction, which is particularly useful in large- and medium-sized cities with sufficient data.
In recent years, many deep learning methods have been used in urban flood research. Wu et al. (2020) constructed a water accumulation process prediction model using gradient boosting decision trees (GBDTs). The results demonstrated that the GBDT model has a mean relative error of 19.77%, which verifies the validity of the model for urban flood prediction. Wang and Song (2019) established a machine learning model based on support vector machines (SVMs) to predict the water level of a rainwater pipe network. The proposed model has been applied in Fuzhou, China, and can predict the water level with good accuracy and running speed. Lei et al. (2021) evaluated the ability of a convolutional neural network (NNET_C) and a recurrent neural network (NNET_R) in drawing flood hazard maps. The results showed that the prediction performance of the NNET_C model was slightly better than that of the NNET_R model. However, not all deep learning methods are suitable for urban flood forecasting, and the prediction performance of each method may vary greatly depending on the algorithm and the specific prediction requirements. In addition, these studies rarely pay attention to the correlation among ponding points. In fact, if some ponding points are strongly correlated, this may lead to significant uncertainty in the proposed model. To the authors' knowledge, few studies have considered the spatial correlation of ponding points and the rationality of method selection. Therefore, it is necessary to develop a relatively complete set of deep learning-based methods for predicting the urban flood ponding process, so as to improve urban flood prediction and guide urban flood control.
Based on the aforementioned literature, this study analyzed the feasibility of the research method, compared the applicability of different deep learning technologies in urban flood prediction, and applied a deep learning method to propose a method system for urban flood ponding process prediction. The specific objectives of the study were to (i) analyze the feasibility of urban flood ponding process prediction for each ponding point based on the spatial autocorrelation method, (ii) use GBDT, SVM and backpropagation neural network (BPNN) to construct the relationship model between the rainfall process and the ponding process and use statistical evaluation methods to analyze the accuracy of the different deep learning methods, with the aim of proposing a method suitable for predicting the urban flood ponding process, and (iii) construct an urban flood ponding process prediction and early warning method driven by rainfall forecast data and test the accuracy of the early warning results against measured data. The novelty of this work lies in proposing a rapid, accurate and refined prediction method system for the urban flood ponding process from three aspects: research feasibility, method applicability, and practical application. The research results can provide technical references for urban flood early warning and urban flood control.

| Study area
Zhengzhou is the capital city of Henan Province and an important transportation hub in central China (Figure 1). It has a temperate continental monsoon climate, with an average annual precipitation of 639.5 mm. However, the precipitation distribution in Zhengzhou is very uneven. The precipitation in the flood season (June to September) accounts for approximately 60% of the annual precipitation. In response to urban floods and waterlogging, Zhengzhou City has taken various measures in recent years, such as dredging rivers and cleaning up sewers. However, flooding and waterlogging still occur frequently, which seriously threatens the safety of people's lives and property and the normal operation of the city.

| Rainfall observation data
Rainfall observation data refer to the time-history distribution data of rainfall observed and recorded by 16 rainfall stations, which come from the Zhengzhou City Meteorological Department (Figure 1). The time resolution of the data is 10 min. In this study, 21 historical rainfall events from 2016 to 2018 were selected as the sample data of the model: June 4, 2016, June 23, 2016, July 19, 2016, August 4, 2016, August 6, 2016, August 25, 2016, September 12, 2016, May 22, 2017, June 5, 2017, June 22, 2017, July 18, 2017, August 12, 2017, August 25, 2017, August 30, 2017, May 15, 2018, July 4, 2018, July 13, 2018, July 27, 2018, August 10, 2018, September 15, 2018, and September 25, 2018. To obtain the rainfall spatial distribution data, the kriging interpolation method was applied to the rainfall time-history distribution data of each rainfall station to obtain the rainfall time-history distribution data at each ponding point.

| Ponding observation data
Ponding observation data refer to the ponding depth of ponding points in historical ponding events. In this study, the ponding depth data of 48 typical ponding points were collected, which were obtained from Zhengzhou Municipal Department. The time resolution of the data was 1 min. The ponding depth data of these ponding points were from the ponding detection equipment at each road intersection.

| Rainfall forecast data
The rainfall forecast data in the next 2 h were obtained by calling the application program interface (API) of Caiyun technology. The rainfall forecast data have a temporal resolution of 1 min and a spatial resolution of 1 km. It should be noted that since the update period of the observational rainfall data of the rainfall station is 10 min, the update period of rainfall forecast data is also taken as 10 min to ensure the data comparison consistency. Therefore, in this study, a total of 6 updated rainfall forecast data points on August 1, 2019, with forecast periods of 10 min, 20 min, and 60 min, were used as the sample forecast and early warning data, and each update of the sample data included the rainfall forecast data of the 48 ponding points.

FIGURE 1 The location of study area.

| Training dataset
Six different rainfall sensitivity indicators (rainfall (rainfall volume), rainfall duration, rainfall peak (maximum value of rainfall intensity), position coefficient (time of rainfall peak), rain intensity variance and peak multiplier (ratio of rainfall peak to total rainfall)) were used as the input variables of the model (Wu et al., 2020), and the ponding depth in the ponding process was used as the output variable of the model. Eighteen randomly selected rainfall events were used as the training samples of the model, and 3 rainfall events with different rainfall types were selected to verify the prediction performance of the model. The rainfall event on August 1, 2019 was selected as the sample data for prediction and early warning demonstration.
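As an illustration of how the six indicators might be derived from a 10-min rainfall series, the following minimal sketch computes them from a list of rainfall depths per time step. The function name `rainfall_indicators`, the dictionary keys, and the exact definitions of the position coefficient (time of the peak relative to the event duration) and the peak multiplier (peak step depth divided by total depth) are assumptions made for this example; the study itself follows Wu et al. (2020), whose definitions may differ in detail.

```python
from statistics import pvariance


def rainfall_indicators(depths_mm, step_min=10):
    """Compute six rainfall indicators from rainfall depths (mm) per time step.

    Leading and trailing dry steps are trimmed so that the duration covers
    only the span from the first to the last wet step (an assumption).
    """
    wet = [i for i, d in enumerate(depths_mm) if d > 0]
    if not wet:
        raise ValueError("no rainfall in the series")
    event = depths_mm[wet[0]:wet[-1] + 1]

    total = sum(event)                               # rainfall volume (mm)
    duration = len(event) * step_min                 # rainfall duration (min)
    peak_depth = max(event)
    peak_index = event.index(peak_depth)
    intensity = [d * 60 / step_min for d in event]   # intensity in mm/h

    return {
        "rainfall_mm": total,
        "duration_min": duration,
        "peak_mm_per_h": max(intensity),
        # Assumed definition: end time of the peak step relative to duration.
        "position_coefficient": (peak_index + 1) * step_min / duration,
        "intensity_variance": pvariance(intensity),
        # Assumed definition: peak step depth divided by total depth.
        "peak_multiplier": peak_depth / total,
    }


# Hypothetical hyetograph, for illustration only.
example = rainfall_indicators([0, 1.0, 3.0, 6.0, 2.0, 0])
```

Each rainfall event then yields one six-element feature vector per ponding point, which is the input format assumed by the model sketches in this section.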

| Spatial autocorrelation analysis
Spatial autocorrelation is proposed based on the first law of geography, which reflects the potential interdependence of adjacent elements (Tobler, 1970). Rainfall is a phenomenon in which the water vapor in the atmosphere falls to the surface in the form of liquid water after condensation. Therefore, the spatial distribution of rainfall usually has strong spatial autocorrelation. The spatial autocorrelation of ponding points describes whether the ponding process of each ponding point is independent: if the spatial autocorrelation is high, the ponding process of one ponding point is affected by the ponding processes of adjacent ponding points. In contrast, low spatial autocorrelation indicates that the ponding process of each ponding point is independent and is little affected by the ponding processes of the surrounding ponding points. In this paper, the spatial autocorrelation of rainfall and ponding points is analyzed based on a rainfall and ponding process.
Moran's I index (Wang et al., 2023) and Geary's C ratio (Hiroshi, 2021) (Geary's contiguity ratio, which indicates whether adjacent observations of the same phenomenon are related) are selected to describe the spatial autocorrelation. Moran's I index characterizes the spatial distribution relationship of a certain attribute of adjacent objects and takes values from −1 to 1. A positive value indicates that the spatial distribution of a certain attribute of adjacent objects has a positive spatial correlation, and a negative value indicates a negative spatial correlation. The closer the value is to 0, the weaker the spatial autocorrelation of the attribute between adjacent objects. A value of zero indicates that there is no spatial correlation. The calculation formula of Moran's I index is as follows:

$$I = \frac{n}{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}} \cdot \frac{\sum_{i=1}^{n}\sum_{j=1}^{n} w_{ij}\,(y_i - \bar{y})(y_j - \bar{y})}{\sum_{i=1}^{n} (y_i - \bar{y})^2}$$

where $n$ is the number of sample points or grids, $y_i$ and $y_j$ are the attribute values of point (area) $i$ and $j$, $\bar{y}$ is the mean of the attribute values, and $w_{ij}$ is the element of the weight matrix that measures the relationship between the spatial objects $i$ and $j$.
In addition, Moran's I index is tested against the null hypothesis that the spatial objects are randomly distributed, using the Z score. At the 95% confidence level, when the absolute value of the standard normal statistic (Z score) is less than 1.96, the null hypothesis is accepted, that is, there is no significant correlation between spatial objects.
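The calculation of Moran's I and its Z score can be sketched in a few lines. The example below is a minimal illustration, not the GeoDa implementation used in the study: the Z score is estimated from random permutations of the attribute values (an assumed testing approach), and the helper names `morans_i`, `permutation_z` and `chain_weights`, as well as the toy binary weight matrix, are hypothetical.

```python
import random
from statistics import mean, pstdev


def morans_i(values, weights):
    """Moran's I for attribute values and an n x n spatial weight matrix."""
    n = len(values)
    y_bar = mean(values)
    dev = [v - y_bar for v in values]
    s0 = sum(sum(row) for row in weights)            # sum of all weights
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    return (n / s0) * cross / sum(d * d for d in dev)


def permutation_z(values, weights, n_perm=999, seed=0):
    """Z score of the observed I against a random-permutation reference."""
    rng = random.Random(seed)
    observed = morans_i(values, weights)
    shuffled = list(values)
    sims = []
    for _ in range(n_perm):
        rng.shuffle(shuffled)                        # break any spatial pattern
        sims.append(morans_i(shuffled, weights))
    return (observed - mean(sims)) / pstdev(sims)


def chain_weights(n):
    """Toy binary weights: points on a line, neighbours are adjacent points."""
    return [[1 if abs(i - j) == 1 else 0 for j in range(n)] for i in range(n)]


w = chain_weights(8)
clustered = [1, 1, 1, 1, 5, 5, 5, 5]      # similar values sit together
alternating = [1, 5, 1, 5, 1, 5, 1, 5]    # neighbours always differ
i_clustered = morans_i(clustered, w)       # positive autocorrelation
i_alternating = morans_i(alternating, w)   # negative autocorrelation
```

In this toy setting the clustered pattern gives a positive index and the alternating pattern gives a negative one, which mirrors the interpretation of the sign of Moran's I described above.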

| Model construction of ponding process at ponding point
In this study, the gradient boosting decision tree (GBDT), support vector machine (SVM) and backpropagation neural network (BPNN) algorithms are used to build the rainfall process and ponding process relationship model. The rainfall process is reflected by rainfall characteristic indicators, and the ponding process refers to the depth of ponding at various times. It should be noted that the prediction model of the ponding process is built for each ponding point. For each ponding point, a ponding process prediction model constructed by the GBDT, SVM and BPNN algorithms is required.

| GBDT
GBDT is an ensemble learning algorithm that combines decision trees and gradient boosting (Friedman, 2001). The core idea of GBDT is iterative learning based on the residuals of the preceding decision trees: in each iteration, the gradient boosting algorithm fits a new decision tree to the residual of the previous trees along the direction of steepest descent (the negative gradient of the loss), and finally, the outputs of the weak learners from all iterations are accumulated and output (Deng et al., 2019). The classification and regression tree (CART) was selected as the base learner in GBDT training because the CART structure is simple, easy to understand, and robust.
The number of weak learners, maximum depth and learning rate are the main parameters of GBDT (Wu et al., 2020). The number of weak learners equals the number of boosting iterations, since one weak learner is added in each iteration. Max depth refers to the maximum depth of the decision tree, which is "none" by default. However, due to the large quantity of data in this study, it is necessary to set the maximum tree depth reasonably to prevent overfitting. The learning rate is a parameter between 0 and 1 that represents the shrinkage step in the update process. The above parameters are optimized by the grid search algorithm. A complete mathematical and technical description of the GBDT model can be found in Friedman (2001) and Wu et al. (2020).
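The residual-fitting idea can be made concrete with a deliberately small sketch. It is not the model trained in this study: to stay short it uses depth-1 regression trees (stumps) instead of full CART trees, a squared-error loss, and no grid search. All names (`fit_stump`, `fit_gbdt`, `gbdt_predict`, `training_mse`) and the toy data are hypothetical.

```python
def fit_stump(X, residual):
    """Find the single split (feature, threshold) that best fits the residual."""
    best = None
    for f in range(len(X[0])):
        # Every distinct value except the largest leaves both sides non-empty.
        for threshold in sorted({row[f] for row in X})[:-1]:
            left = [r for row, r in zip(X, residual) if row[f] <= threshold]
            right = [r for row, r in zip(X, residual) if row[f] > threshold]
            lv, rv = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lv) ** 2 for r in left)
                   + sum((r - rv) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, f, threshold, lv, rv)
    if best is None:
        raise ValueError("need at least two distinct feature values")
    return best[1:]


def stump_predict(stump, row):
    f, threshold, lv, rv = stump
    return lv if row[f] <= threshold else rv


def fit_gbdt(X, y, n_estimators=50, learning_rate=0.1):
    """Gradient boosting for squared error: each stump fits current residuals."""
    base = sum(y) / len(y)                  # initial constant prediction
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_estimators):
        # For squared error the negative gradient equals the residual.
        residual = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residual)
        stumps.append(stump)
        pred = [pi + learning_rate * stump_predict(stump, row)
                for pi, row in zip(pred, X)]
    return base, learning_rate, stumps


def gbdt_predict(model, row):
    base, learning_rate, stumps = model
    return base + learning_rate * sum(stump_predict(s, row) for s in stumps)


def training_mse(model, X, y):
    return sum((gbdt_predict(model, r) - t) ** 2 for r, t in zip(X, y)) / len(y)


# Hypothetical samples: [rainfall (mm), rainfall peak (mm/h)] -> depth (m).
X_toy = [[5, 2], [10, 4], [20, 9], [30, 12], [40, 20], [50, 22]]
y_toy = [0.00, 0.02, 0.08, 0.15, 0.25, 0.30]
model = fit_gbdt(X_toy, y_toy, n_estimators=50, learning_rate=0.1)
```

The three parameters discussed above appear directly in the sketch: `n_estimators` is the number of weak learners, `learning_rate` is the shrinkage step, and the tree depth is fixed at 1 here, whereas a practical model would tune it together with the other two.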

| SVM
The SVM algorithm is a supervised machine learning algorithm proposed by Cortes and Vapnik in 1995 (Cortes & Vapnik, 1995). It improves the generalization ability of the learning machine by minimizing the empirical risk and structural risk so that a good statistical law can be obtained even when the number of samples is small (Zhou et al., 2020). SVM can solve both classification and regression problems. In this paper, the regression method is used to predict the ponding process. Assume the training sample $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$, where $y_i$ is the observation value of the objective function corresponding to $x_i$. The ultimate goal of the regression support vector machine is to find the regression fitting function $f(x) = \omega\varphi(x) + b$, where $\varphi(x)$ is the mapping function, which is used to map the sample to a high-dimensional space in which a linear fit is possible. The deviation between the estimated value and the observed value of the sample data is required to be no greater than $\varepsilon$. To find the optimal $\omega$ and $b$, the task is transformed into the following optimization problem, written here in its standard soft-margin form (Zhou et al., 2020):

$$\min_{\omega,\, b,\, \xi,\, \xi^*} \; \frac{1}{2}\lVert\omega\rVert^2 + C\sum_{i=1}^{n}\left(\xi_i + \xi_i^*\right)$$

$$\text{s.t.}\quad y_i - \omega\varphi(x_i) - b \le \varepsilon + \xi_i,\qquad \omega\varphi(x_i) + b - y_i \le \varepsilon + \xi_i^*,\qquad \xi_i,\ \xi_i^* \ge 0$$

where s.t. refers to the constraint conditions of the objective function, $C$ is the penalty parameter, and $\xi_i$ and $\xi_i^*$ are slack variables.
SVM provides a variety of kernel functions, such as the linear kernel function, polynomial kernel function, radial basis kernel function and sigmoid kernel function. Among them, the radial basis kernel function is widely used and has higher efficiency for nonlinear data mapping (Xiao et al., 2019). Therefore, the radial basis function was used as the kernel function of the support vector machine in this study. The penalty parameter (C) and the kernel function parameter (K) are the key parameters to be optimized for SVM. In this study, a grid search algorithm was used to optimize C and K. A complete mathematical and technical description of the SVM model can be found in Nayak and Ghosh (2013) and Wang and Song (2019).
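The building blocks of support vector regression with a radial basis kernel can be illustrated as follows. This is only a sketch of the ingredients (kernel, ε-insensitive loss, and kernel-expansion prediction), not a solver for the optimization problem; the name `gamma` for the kernel parameter, the function names, and the numeric values are assumptions made for the example.

```python
import math


def rbf_kernel(a, b, gamma):
    """Radial basis kernel exp(-gamma * ||a - b||^2) for two feature vectors."""
    return math.exp(-gamma * sum((ai - bi) ** 2 for ai, bi in zip(a, b)))


def epsilon_insensitive(y_true, y_pred, epsilon):
    """Loss that ignores deviations smaller than epsilon."""
    return max(0.0, abs(y_true - y_pred) - epsilon)


def svr_predict(x, support_vectors, dual_coefs, bias, gamma):
    """Kernel expansion f(x) = sum_i coef_i * K(sv_i, x) + b of a fitted SVR."""
    return bias + sum(c * rbf_kernel(sv, x, gamma)
                      for sv, c in zip(support_vectors, dual_coefs))


# Hypothetical values: a deviation of 0.02 m is inside a 0.05 m tube (no loss),
# while a deviation of 0.10 m is penalised only for the part outside the tube.
inside = epsilon_insensitive(0.10, 0.12, epsilon=0.05)
outside = epsilon_insensitive(0.10, 0.20, epsilon=0.05)
```

A larger penalty parameter C makes deviations outside the ε tube more costly, while the kernel parameter controls how quickly the influence of a training sample decays with distance, which is why both are tuned jointly by grid search.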

| BPNN
A BPNN is a backpropagation neural network connected by multiple neurons, which can be divided into an input layer, hidden layer and output layer. BPNN adopts a full interconnection mode between layers, and there is no connection between neurons in the same layer. The BPNN transmission is divided into the forward propagation stage and the backpropagation stage. In the forward propagation stage, the signal starts to propagate from the input layer, and the deviation is calculated in the output layer. If the deviation meets the requirements, the program is terminated. If the deviation does not meet the requirements, the program enters the backpropagation stage. In the backpropagation stage, the weight of each layer is modified by calculating the local gradient of the network, and then the forward propagation stage is entered again after the network weights are updated. The program is terminated when the deviation meets the requirements. The grid search algorithm was also used to optimize the main parameters of BPNN (learning rate, number of hidden layers, number of nodes in hidden layers). A complete description of the BPNN model can be found in Jiang and Hong (2013) and Li et al. (2019).
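The alternation between forward propagation and backpropagation can be sketched with a very small network. The example below assumes one hidden layer with tanh activation, a single linear output, a squared-error deviation and sample-by-sample weight updates; these choices, the function names and the toy data are illustrative assumptions rather than the configuration used in this study.

```python
import math
import random


def _forward(params, row):
    """Forward propagation: returns hidden activations and the output."""
    w1, b1, w2, b2 = params
    h = [math.tanh(sum(w * v for w, v in zip(wj, row)) + bj)
         for wj, bj in zip(w1, b1)]
    return h, sum(w * hj for w, hj in zip(w2, h)) + b2


def bpnn_predict(params, row):
    return _forward(params, row)[1]


def train_bpnn(X, y, hidden=4, learning_rate=0.1, tolerance=1e-4,
               max_epochs=2000, seed=0):
    rng = random.Random(seed)
    n_in = len(X[0])
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    history = []                                   # mean squared error per epoch
    for _ in range(max_epochs):
        sq_err = 0.0
        for row, target in zip(X, y):
            h, out = _forward((w1, b1, w2, b2), row)
            err = out - target                     # deviation at the output layer
            sq_err += err * err
            # Backpropagation: local gradients of 0.5 * err**2.
            for j in range(hidden):
                grad_h = err * w2[j] * (1 - h[j] ** 2)   # uses w2 before update
                w2[j] -= learning_rate * err * h[j]
                for k in range(n_in):
                    w1[j][k] -= learning_rate * grad_h * row[k]
                b1[j] -= learning_rate * grad_h
            b2 -= learning_rate * err
        history.append(sq_err / len(X))
        if history[-1] < tolerance:                # deviation meets requirement
            break
    return (w1, b1, w2, b2), history


# Hypothetical scaled inputs (e.g. normalised rainfall) and ponding depths (m).
X_toy = [[0.1], [0.3], [0.5], [0.7], [0.9]]
y_toy = [0.02, 0.05, 0.12, 0.20, 0.28]
params, history = train_bpnn(X_toy, y_toy)
```

The `tolerance` check plays the role of the termination condition described above, and `hidden` and `learning_rate` correspond to two of the parameters that the grid search would tune.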

| Model performance analysis
The mean absolute error (MAE), root mean square error (RMSE), Nash efficiency coefficient (NSE) and correlation coefficient (CC) were used to evaluate and compare the performance of the models in this study (Table 1). MAE and RMSE reflect the overall error level of the prediction result. CC reflects the degree of correlation between the predicted results and the measured results. NSE reflects the simulation quality of the model. NSE ranges from negative infinity to 1; the closer it is to 1, the better the model quality and the higher the model credibility. Precision and recall were used to evaluate the performance of early warning results (Table 1). Precision refers to the proportion of true positive samples among the samples predicted as positive (Faceli et al., 2011), which reflects how many prediction results are true. Recall refers to the proportion of true positive samples among all positive samples (Faceli et al., 2011).
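The four regression indicators can be written directly from their standard definitions. The sketch below is illustrative; the function names and the sample depths are hypothetical, and predicted and measured values are assumed to be equally long lists of ponding depths in metres.

```python
import math


def mae(pred, obs):
    """Mean absolute error."""
    return sum(abs(p - o) for p, o in zip(pred, obs)) / len(obs)


def rmse(pred, obs):
    """Root mean square error."""
    return math.sqrt(sum((p - o) ** 2 for p, o in zip(pred, obs)) / len(obs))


def nse(pred, obs):
    """Nash efficiency coefficient: 1 is perfect, 0 matches the observed mean."""
    mean_obs = sum(obs) / len(obs)
    return 1 - (sum((o - p) ** 2 for p, o in zip(pred, obs))
                / sum((o - mean_obs) ** 2 for o in obs))


def cc(pred, obs):
    """Pearson correlation coefficient between predictions and observations."""
    mp, mo = sum(pred) / len(pred), sum(obs) / len(obs)
    cov = sum((p - mp) * (o - mo) for p, o in zip(pred, obs))
    return cov / math.sqrt(sum((p - mp) ** 2 for p in pred)
                           * sum((o - mo) ** 2 for o in obs))


# Hypothetical measured and predicted ponding depths (m).
measured = [0.0, 0.1, 0.2, 0.3]
predicted = [0.0, 0.12, 0.19, 0.31]
scores = {name: fn(predicted, measured)
          for name, fn in [("MAE", mae), ("RMSE", rmse), ("NSE", nse), ("CC", cc)]}
```

Because RMSE squares the errors before averaging, it penalizes occasional large deviations more heavily than MAE, which is why the two are reported together.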

| Spatial autocorrelation results
The spatial statistical analysis software GeoDa developed by the University of Chicago Spatial Data Center was used to analyze the spatial autocorrelation of rainfall and ponding points in this paper. The spatial autocorrelation results of rainfall are shown in Figure 2a. The value of Moran's I index is 0.184, and its standard normal statistic (Z value) is 4.23, which is greater than 1.96. Therefore, rainfall has a significant positive spatial correlation at the 95% confidence level, which verifies the spatial autocorrelation of rainfall.
Moran's scatter diagram of ponding is shown in Figure 2b. The value of Moran's I is −0.00077, which is close to 0, indicating that the spatial distribution of ponding has a low correlation. In addition, the test score (Z value) of the normal statistic is 0.8648, which is less than 1.96, indicating that the spatial distribution of ponding shows no significant autocorrelation at the 95% confidence level. This result shows that there is no obvious hydraulic connection among the ponding points and that they are independent of each other in space. Therefore, it is theoretically feasible to construct a rainfall and ponding process relationship model for each ponding point.

| Analysis and comparison of prediction model results
In this study, Python 3.7 was used to train and verify the model. As shown in Table 2, the mean absolute error of GBDT, SVM and BPNN for the ponding depth prediction is not greater than 0.03 m, and the CC between the prediction results and the measured results is greater than 0.97, which shows the effectiveness of these three algorithms in ponding depth prediction to a certain extent. However, the RMSE of the GBDT prediction model is significantly lower than that of SVM and BPNN, indicating that the GBDT model is more stable and more robust for predicting ponding depth. The stability of the model prediction results is often very important for the prediction of urban flood depth. Therefore, from the perspective of the above statistical evaluation indicators, the GBDT prediction model is more suitable for urban flood depth prediction.
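TABLE 1 Model evaluation index explanation.

| Indicator | Formula | Remarks |
| --- | --- | --- |
| MAE | $\mathrm{MAE} = \frac{1}{n}\sum_{i=1}^{n}\lvert x_i - y_i\rvert$ | where $x_i$ is the predicted value of the sample, $y_i$ is the measured value, $\bar{x}$ is the mean value of the predicted sample, $\bar{y}$ is the mean value of the measured sample, and $n$ is the total number of samples. |
| RMSE | $\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}(x_i - y_i)^2}$ | |
| NSE | $\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n}(y_i - x_i)^2}{\sum_{i=1}^{n}(y_i - \bar{y})^2}$ | |
| CC | $\mathrm{CC} = \frac{\sum_{i=1}^{n}(x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n}(x_i - \bar{x})^2 \sum_{i=1}^{n}(y_i - \bar{y})^2}}$ | |
| Precision | $\mathrm{Precision} = \frac{TP}{TP + FP}$ | where TP is the number of samples correctly classified as positive (Faceli et al., 2011). TN is the number of samples correctly classified as negative (Faceli et al., 2011). FP is the number of samples incorrectly classified as positive because the right category is negative (Faceli et al., 2011). FN is the number of samples incorrectly classified as negative because the right class is positive (Faceli et al., 2011). |
| Recall | $\mathrm{Recall} = \frac{TP}{TP + FN}$ | |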
In addition, Figures 3-5 show the prediction performance of the SVM, BPNN and GBDT prediction models in different validation events and at different ponding points. In the first verification event, the absolute error (AE) range and MAE of the GBDT prediction model (0–0.0405 m and 0.0122 m) are significantly lower than those of the SVM (with 0.0001–0.0922 m AE and 0.025 m MAE) and BPNN (with 0.0014–0.0973 m AE and 0.024 m MAE). Similarly, in the second verification event, the AE range and MAE of the GBDT prediction model are 0–0.0450 m and 0.0090 m, respectively, which are also lower than those of SVM (with 0.0001–0.0992 m AE and 0.0166 m MAE) and BPNN (with 0–0.0823 m AE and 0.0203 m MAE). In the third verification event, the AE range and MAE of the GBDT prediction model (0–0.4667 m and 0.0110 m) are again lower than those of SVM (with 0.0001–0.0988 m AE and 0.0248 m MAE) and BPNN (with 0.0001–0.0912 m AE and 0.0218 m MAE). Therefore, the performance of the GBDT prediction model is superior to that of SVM and BPNN in the different validation events. At different ponding points, as shown in Figures 3-5, the AE range and MAE of the prediction results are quite different. This phenomenon is a common feature of the SVM, BPNN and GBDT models. However, the MAE of the GBDT prediction model fluctuates less across ponding points than that of SVM and BPNN, which demonstrates that the prediction performance of the GBDT model is more stable.
To more clearly compare the difference between GBDT, SVM and BPNN in the prediction of ponding depth, the prediction results of the ponding depth of four ponding points (# 1, # 5, # 9, # 13) in a rainfall event were extracted by a random sampling method. As shown in Figure 6, the ponding hydrograph at ponding point No. 1 was relatively flat, and the ponding depth fluctuated slightly from 8 to 20 min, which was obviously different from the other three ponding points. The reason for this phenomenon may be that the catchment area of ponding point No. 1 is larger than that of the other three ponding points. After rainwater reaches the surface, rainwater near the ponding point quickly collects at the ponding point to form ponds. However, the rainwater far away from the ponding point takes longer to converge to the ponding point after reaching the surface, so considerable rainwater is still converging to the ponding point after the ponding reaches its peak value, which extends the duration of the ponding peak. The other three ponding points have similar ponding formation and dissipation processes, and the ponding subsides soon after reaching the maximum depth. It is worth noting that the prediction results of the GBDT model for the ponding peak are very close (with 0.009 m MAE) to the measured results, and the prediction accuracy is significantly better than that of the SVM and BPNN models. This result shows that GBDT has obvious advantages in predicting the ponding peak.

| Feature importance analysis
The information gain ratio (IGR) is an effective feature importance selection method. The larger the IGR of a condition factor is, the greater the influence of that factor on the prediction ability of the model. Therefore, in this study, the importance of each feature variable (rainfall, rainfall duration, rainfall peak, position coefficient, peak multiplier, and rain intensity variance) for the ponding depth was calculated based on the IGR. As shown in Figure 7, the result of feature selection shows that rainfall (0.2893) is the most important factor affecting the ponding depth. The reason is that rainfall is the driving factor of ponding, and only sufficient rainfall will produce ponding. In addition, the rainfall peak (0.2070) and position coefficient (0.1864) are also important factors affecting the ponding depth. The reason is that the rainfall peak determines the size of rainfall events and directly affects ponding severity. The position coefficient determines the rainfall pattern characteristics, and different types of rainfall patterns also have different degrees of impact on ponding depth. In contrast, the peak multiplier (0.1148) and rain intensity variance (0.0511) have little influence on the ponding prediction, indicating that small fluctuations in the rainfall pattern have little or even negligible influence on ponding. These results indicate that the ponding depth is more sensitive to rainfall, rainfall peak and position coefficient. Among them, rainfall and rainfall peak are indicators that characterize rainfall intensity, indicating that the rainfall intensity of rainfall events is the most important factor affecting the ponding depth. Specifically, larger rainfall and higher rainfall intensity are the main reasons for the formation of waterlogging. For example, in the rainfall event of July 20, 2021 in Zhengzhou, the cumulative rainfall was 624.1 mm and the rainfall peak reached 201.9 mm/h, causing more than half of the residential areas in Zhengzhou City to be flooded. Therefore, urban flood control should focus on heavy rainfall events, especially those with high rainfall and a high rainfall peak.

| Prediction and early warning of the ponding process driven by rainfall forecast data
It can be seen in section 4.2 that the prediction accuracy and stability of the GBDT prediction model are better than those of SVM and BPNN. Therefore, the GBDT prediction model was used to predict ponding depth in this study. The rainfall forecast data are the basis for ponding depth prediction. In this study, the rainfall event in Zhengzhou city on August 1, 2019, was taken as the forecast event. By calling the API of Caiyun technology, the rainfall forecast data of the 48 ponding points in this event were obtained 60 min in advance. The ponding process data of the 48 ponding points were obtained by inputting the sensitivity indicators of the rainfall forecast data into the GBDT prediction model. On this basis, four ponding points (# 9, # 19, # 29, # 39) were selected by the equidistant sampling method to draw the ponding process curve. As shown in Figure 8, there is a slight hysteresis between the ponding process curve and the rainfall process curve of these ponding points. The reason is that when the rainfall reaches the surface, it flows into the pipe network. In the beginning, the rainwater does not exceed the pipeline drainage capacity, so it will not form ponds. However, when the rainwater collection speed exceeds the pipe network drainage capacity, the drainage pipe network will fill with rainwater, which will form accumulated water near the rainwater collection center. Therefore, the ponding hydrograph at the ponding point has a slight lag compared with the rainfall hydrograph.
To more intuitively display the ponding depth prediction results of ponding points, the ponding depth and ponding duration early warning classification method (Table 3) is used to classify the ponding depth of ponding points. The reason is that the early warning classification method based on ponding depth and ponding duration can not only directly reflect the ponding depth but also consider the continuous impact of long-term ponding.
As shown in Figure 9, the areas with serious ponding are mainly concentrated in the central and eastern parts of Zhengzhou city, especially in Longhai Road, Hanghai Road and the old urban area in the central area. The reason is that there are many underpass tunnels on Hanghai Road and Longhai Road. These underpass tunnels often have the characteristics of low elevation, large terrain slope and wide catchment area. After rainwater reaches the surface, it quickly collects and flows to form ponds at the bottom of the tunnel. Therefore, ponding points with serious ponding mostly appear near these underpass tunnels. In contrast, the degree of ponding in the northern part of Zhengzhou city is obviously lower. The reason may be that the northern part of Zhengzhou is adjacent to the Yellow River wetland, which has a relatively flat terrain and a large proportion of permeable surface area. Therefore, the amount of infiltration and interception of rainwater after reaching the ground is large, resulting in a small confluence volume and a low confluence velocity.
FIGURE 7 Feature importance analysis results.

To further verify the effectiveness of real-time early warning results of ponding points using rainfall forecast data, the ponding depth of each ponding point on August 1, 2019, was obtained by means of ponding monitoring equipment, electronic water gauges and actual measurements, and the early warning results and measured results of ponding points were analyzed by the accuracy evaluation indices of precision and recall. As shown in Table 4, the overall precision and recall of the real-time ponding point early warning results are more than 80%, indicating that the overall precision of the early warning results can meet the requirements of urban flood forecasting and early warning. Moreover, although the prediction precision fluctuates, it shows an overall trend of improvement as the forecast period shortens. For example, the prediction precision in the next 10 min is 14.7% higher than that in the next 60 min. Moreover, from the perspective of the local accuracy of the early warning results, the prediction precision of ponding level 4 (i.e., serious ponding) is obviously higher than that of other levels in the early warning results of different ponding point levels (Table 4), which indicates that the GBDT algorithm has obvious advantages for predicting more serious ponding.
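The per-level precision and recall used in this comparison can be sketched as follows. The function name `precision_recall` and the sample warning levels are hypothetical; each warning level is treated in turn as the positive class, and all other levels as negative.

```python
def precision_recall(predicted, observed, level):
    """Precision and recall for one warning level, treated as the positive class."""
    pairs = list(zip(predicted, observed))
    tp = sum(1 for p, o in pairs if p == level and o == level)
    fp = sum(1 for p, o in pairs if p == level and o != level)
    fn = sum(1 for p, o in pairs if p != level and o == level)
    # The ratio is undefined when the denominator is zero; NaN marks that case.
    precision = tp / (tp + fp) if tp + fp else float("nan")
    recall = tp / (tp + fn) if tp + fn else float("nan")
    return precision, recall


# Hypothetical warning levels for six ponding points (4 = serious ponding).
predicted_levels = [4, 4, 3, 2, 4, 1]
observed_levels = [4, 3, 3, 2, 4, 4]
p4, r4 = precision_recall(predicted_levels, observed_levels, level=4)
```

In this toy case two of the three level-4 warnings are correct and two of the three observed level-4 points are caught, so both precision and recall equal 2/3.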

| DISCUSSION
As seen in Figures 2 and 9, although the ponding points do not have spatial correlation, the ponding points with the same ponding grade still show a certain spatial aggregation. Specifically, ponding in the central region is more serious, which is mainly due to the combined influence of the spatial correlation of rainfall (Wang, Loo, et al., 2020), the urban heat island effect and the distribution of urban functional areas (Min et al., 2018). The urban heat island effect makes the rainfall in the center of the city larger. The city center is a densely populated old city with lower pipe network design standards, aging and severely damaged pipe networks and a higher proportion of impervious area. After rainwater reaches the surface, there is less infiltration, faster water collection, and limited drainage capacity, which easily causes more serious ponding in the central part of the city. In addition, Figure 7 shows that urban ponding is most sensitive to rainfall, rainfall peak and position coefficient. Therefore, to reduce losses caused by floods, urban management departments should not only increase the area of permeable ground in the central region of urban areas and transform, repair and dredge the drainage pipe network but also take timely measures such as drainage and road closures when dealing with heavy rainfall.
In terms of the accuracy of the early warning results, as shown in Table 4, the prediction accuracy shows an overall improvement trend as the forecast period shortens. The main reason is that the accuracy of the rainfall forecast data, and therefore of the input variables, gradually improves as the forecast period shortens, which leads to an improvement in the accuracy of the predicted variable. It is generally believed that there is a trade-off between the forecast period and the prediction accuracy: a shorter forecast period results in higher prediction accuracy, and a longer forecast period reduces the prediction accuracy. Fortunately, the prediction precision and recall of the model proposed in this study still exceed 80% when the forecast period is 60 min. The model thus maintains a certain forecast accuracy while providing a longer forecast period, which helps to address the trade-off between the forecast period and the prediction accuracy. The research results can provide theoretical and technical references for improving prediction and early warning methods and preventing flood disasters.

| CONCLUSION
In this study, the construction of ponding process prediction and early warning methods for ponding points is systematically explained from three aspects: research feasibility, method applicability, and practical application. The conclusions are as follows:
1. The feasibility of this study is demonstrated by spatial autocorrelation analysis: the ponding points are spatially independent of each other, with no significant correlation. This indicates that the ponding points have little hydraulic connection. Therefore, an urban flood inundation process model based on individual ponding points is theoretically feasible.
2. Based on different deep learning methods (GBDT, SVM, and BPNN), a relationship model between the rainfall process and the ponding process is constructed for each ponding point. Statistical evaluation is used to analyze the applicability of the different methods to ponding depth prediction. The results show that the GBDT algorithm has the highest accuracy, which indicates that it is the most suitable method for predicting ponding depth.
3. Taking the rainfall event in Zhengzhou on August 1, 2019, as an example, a refined real-time prediction model for the urban flood ponding process driven by rainfall forecast data is constructed. Real-time early warning for ponding points is realized using the warning level standard and GIS. The results show that the overall accuracy of the early warning results is more than 80% and that the accuracy increases as the forecast period shortens, which can meet urban flood control requirements.
However, owing to limitations in rainfall and ponding data, the ponding process prediction model in this study could be built only for ponding points with detailed data. With the gradual enrichment of data and the advancement of numerical simulation technology, future research can attempt to expand the research scope by combining the approach with numerical models.

FIGURE 2 Moran scatter diagram of rainfall and ponding.
TABLE 2 The prediction performance of GBDT, SVM and BPNN.
FIGURE 3 AE of prediction results of different ponding points in the first verification event.
FIGURE 4 AE of prediction results of different ponding points in the second verification event.
FIGURE 5 AE of prediction results of different ponding points in the third verification event.
FIGURE 6 Prediction results of ponding process (part).
FIGURE 8 Prediction results of ponding process at ponding points under the prediction period of 60 min in advance (part).
TABLE 3 Classification standard of ponding early warning level.
Early warning level map of ponding point (10-60 min early).
TABLE 4 Statistics of early warning results of ponding points.