Comparative Study of Flood Coincidence Risk Estimation Methods in the Mainstream and its Tributaries

The coincidence of floods in the mainstream and its tributaries may cause severe flooding in the downstream confluence area, so flood coincidence risk analysis is important for flood prevention and disaster reduction. In this study, a multiple regression model was used to establish the functional relationship among flood magnitudes in the mainstream and its tributaries. The mixed von Mises distribution and the Pearson Type III distribution were selected to fit the probability distributions of the annual maximum flood occurrence dates and magnitudes, respectively. The joint distributions of the annual maximum flood occurrence dates and of the flood magnitudes were each established using copula functions. The Fuhe River in the Poyang Lake region was selected as a case study. The joint probability, co-occurrence probability and conditional probability of flood magnitudes were quantitatively estimated and compared with the predicted flood coincidence risks. The results show that the selected marginal and joint distributions fit the observed flood dataset very well. Coincidences of flood occurrence dates in the upper mainstream and its tributary mainly occur from May to early July. The conditional probability is the most consistent with the predicted flood coincidence risks in the mainstream and its tributaries, and is therefore the most reliable and rational estimate in practice.


Introduction
Floods have become increasingly prominent and account for a large share of all natural hazards worldwide (Kvočka et al. 2016). Under the influence of human activities, climate change, environmental degradation and El Niño, extreme hydrological events occur frequently around the world, and the frequency and intensity of floods continue to increase (Alfieri et al. 2016). Large floods are often caused by the combination of floods in the mainstream and its tributaries (Prohaska and Ilic 2010). When floods occur simultaneously, the flood peaks and volumes superimpose into large floods, threatening the safety of the downstream river (Chen et al. 2012; Wang 2016). It is therefore of great significance to study the laws of flood coincidence in the mainstream and its tributaries, which can provide both a theoretical basis for formulating flood control and dispatching plans in the basin and a reference for the construction of downstream flood control facilities.
Traditional flood coincidence studies focus only on the statistical analysis of flood events based on synchronized flood data, and cannot quantitatively estimate the coincidence probabilities of design floods at a specific site. Essentially, flood coincidence is a multivariate frequency combination event, which can be studied by multivariate hydrological analysis methods (Feng et al. 2020). Copula functions can connect arbitrary marginal distributions through a correlation structure and have been widely used in multivariate hydrological analysis, for example in flood frequency analysis (Tsakiris et al. 2015; Zhong et al. 2018; Moftakhari et al. 2019; Karahacane et al. 2020; Dodangeh et al. 2020), rainfall frequency analysis (Ashkar and Aucoin 2011; Zhang and Singh 2012), drought frequency analysis (Montaseri et al. 2018), and multivariate simulation (Chen et al. 2016; Jane et al. 2020).
Copula functions have also been applied in flood coincidence analysis (Bing et al. 2018; Muthuvel and Mahesha 2021). For example, Schulte and Schumann (2016) developed multivariate copula approaches to analyze the coincidence risk of flood peaks in adjoining catchments. Bender et al. (2016) introduced a copula-based multivariate design framework to evaluate flood occurrence at a confluence. Peng et al. (2017) estimated the flood risk at a confluence downstream of a flood control reservoir using a copula Monte Carlo method. Gilja et al. (2018) analyzed flood hazard at river confluences based on bivariate copulas. However, all of these studies considered only flood magnitudes and ignored flood occurrence time.
In fact, flood coincidence means the simultaneous occurrence of large floods in different rivers, which requires two conditions: the flood occurrence times must fall within a certain range, and the flood magnitudes must exceed a certain level. Both flood occurrence time and flood magnitude should therefore be taken into consideration. Assuming that the flood occurrence dates and magnitudes were independent, Chen et al. (2012) selected multi-dimensional asymmetric Archimedean copula functions to analyze the flood coincidence risk of the upstream Yangtze River and its tributaries. Huang et al. (2018) took the flood magnitudes of two rivers and the interval between flood occurrence dates as three reference variables to further explore the flood hydrograph coincidence risk using copulas. Peng et al. (2019) employed multivariate copulas to estimate flood coincidence probabilities, considering flood occurrence dates and magnitudes simultaneously.
All of the above work has revealed the characteristics of flood coincidence risk from different angles and has advanced flood coincidence analysis. However, some studies focused only on the coincidence risk of flood magnitudes and neglected the flood occurrence time, while others ignored the correlation between flood magnitudes and time. Furthermore, several flood coincidence risk estimation methods exist, and it remains unclear which one is the most rational and reliable in practice.
The objective of this study is to estimate flood coincidence risks quantitatively in the mainstream and its tributaries by considering the correlation between the upstream and downstream flood variables, and to evaluate the reliability and applicability of different coincidence probability estimation methods. The main steps of this study are as follows. First, a multiple regression model is used to establish the functional relationship among the flood variables in the mainstream and its tributaries. Second, the mixed von Mises and Pearson Type III marginal distributions are selected to describe the annual maximum flood occurrence dates and flood magnitudes, respectively. Third, the joint distributions of flood magnitudes and of occurrence dates are constructed based on copula functions. Fourth, the coincidence probabilities of flood occurrence dates and magnitudes are quantitatively estimated. Finally, the flood coincidence probabilities obtained by the different estimation methods are compared with the predicted values.

Copula Functions
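According to Sklar's theorem, if the marginal distribution functions of random variables X and Y are $F_X(x)$ and $F_Y(y)$, respectively, and $F(x, y)$ is their joint distribution, then the joint distribution can be written in terms of a copula function as:

$$F(x, y) = C_\theta\left(F_X(x), F_Y(y)\right) = C_\theta(u, v) \tag{1}$$

where $C(\cdot)$ is a copula function; $\theta$ is a parameter of the copula function, which captures the dependency between the random variables; and $u = F_X(x)$ and $v = F_Y(y)$ are the marginal distribution functions.

A copula function is a multi-dimensional joint distribution function whose marginal distributions are uniform on [0, 1]; it integrates marginal distributions and a correlation structure to construct multi-dimensional joint distributions (Nelsen 2006). The Clayton (Clayton 1978), Gumbel-Hougaard (Hougaard 1986) and Frank (Frank 1979) copulas have only one parameter to be estimated and are widely used in practice (Yin et al. 2018). The mathematical expressions of these copula functions are:

$$C_\theta(u, v) = \left(u^{-\theta} + v^{-\theta} - 1\right)^{-1/\theta}, \quad \theta > 0 \quad \text{(Clayton)} \tag{2}$$

$$C_\theta(u, v) = \exp\left\{-\left[(-\ln u)^{\theta} + (-\ln v)^{\theta}\right]^{1/\theta}\right\}, \quad \theta \ge 1 \quad \text{(Gumbel-Hougaard)} \tag{3}$$

$$C_\theta(u, v) = -\frac{1}{\theta}\ln\left[1 + \frac{\left(e^{-\theta u} - 1\right)\left(e^{-\theta v} - 1\right)}{e^{-\theta} - 1}\right], \quad \theta \ne 0 \quad \text{(Frank)} \tag{4}$$

Based on the relationship between the copula parameter $\theta$ and the Kendall correlation coefficient $\tau$ of the two variables, the parameters of the copula functions can be estimated by:

$$\tau = \begin{cases} \dfrac{\theta}{\theta + 2} & \text{(Clayton)} \\[2mm] 1 - \dfrac{1}{\theta} & \text{(Gumbel-Hougaard)} \\[2mm] 1 + \dfrac{4}{\theta}\left[D_1(\theta) - 1\right] & \text{(Frank)} \end{cases} \tag{5}$$

where $\tau$ is the Kendall correlation coefficient of the two variables; $\theta$ is the parameter of the copula function; and $D_1(\cdot)$ is the first-order Debye function, $D_1(\theta) = \frac{1}{\theta}\int_0^{\theta} \frac{t}{e^{t} - 1}\,dt$.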
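The three one-parameter copulas and the moment-type estimation of their parameter from the Kendall coefficient can be sketched in a few lines of Python. This is only an illustrative sketch using the textbook forms of the Clayton, Gumbel-Hougaard and Frank copulas; the function names, the midpoint-rule Debye integral, the bisection bounds and the value tau = 0.5 are assumptions made for the example and are not taken from the study. The Frank inversion below handles positive dependence only.

```python
import math


def clayton(u, v, theta):
    """Clayton copula C(u, v), theta > 0, for u, v in (0, 1]."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def gumbel_hougaard(u, v, theta):
    """Gumbel-Hougaard copula C(u, v), theta >= 1, for u, v in (0, 1]."""
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def frank(u, v, theta):
    """Frank copula C(u, v), theta != 0."""
    num = math.expm1(-theta * u) * math.expm1(-theta * v)
    return -math.log1p(num / math.expm1(-theta)) / theta


def debye1(theta, steps=2000):
    """First-order Debye function, integrated with the midpoint rule."""
    h = theta / steps
    total = sum((i + 0.5) * h / math.expm1((i + 0.5) * h) for i in range(steps))
    return total * h / theta


def frank_tau(theta):
    """Kendall's tau implied by a Frank copula with parameter theta > 0."""
    return 1.0 + 4.0 * (debye1(theta) - 1.0) / theta


def theta_from_tau(family, tau):
    """Invert the tau-theta relationship (0 < tau < 1 assumed)."""
    if family == "clayton":
        return 2.0 * tau / (1.0 - tau)
    if family == "gumbel":
        return 1.0 / (1.0 - tau)
    if family == "frank":
        lo, hi = 1e-6, 100.0  # assumed search bounds; tau increases with theta
        for _ in range(80):
            mid = 0.5 * (lo + hi)
            if frank_tau(mid) < tau:
                lo = mid
            else:
                hi = mid
        return 0.5 * (lo + hi)
    raise ValueError("unknown copula family: " + family)


if __name__ == "__main__":
    tau = 0.5  # hypothetical Kendall coefficient, not a value from the study
    for family in ("clayton", "gumbel", "frank"):
        print(family, round(theta_from_tau(family, tau), 4))
```

The closed-form inversions for the Clayton and Gumbel-Hougaard families follow directly from solving the tau relationship for the parameter; only the Frank family needs a numerical root search.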

Marginal Distributions of Flood Occurrence Dates and Magnitudes
A mixed von Mises distribution is used to fit the flood occurrence date series, which is often periodic and multi-peaked. The probability density function of the mixed von Mises distribution can be written as:

$$f(x) = \sum_{i=1}^{m} \frac{p_i}{2\pi I_0(k_i)} \exp\left[k_i \cos\left(x - u_i\right)\right], \quad 0 \le x \le 2\pi \tag{6}$$

where $p_i$ is the mixing proportion of the $i$th component; $k_i$ is the scale parameter; $u_i$ is the position parameter; $I_0(k_i)$ is the modified Bessel function of order 0; and $m$ is the number of components of the finite mixed von Mises distribution.
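As an illustration, the mixed von Mises density can be evaluated with the standard library alone. The sketch below assumes the usual textbook form of the density (a weighted sum of von Mises components on the circle) and the common convention of mapping day D of a season of length L to the angle 2πD/L; that mapping, the helper names and the parameter values are assumptions for the example, not details reported in the study.

```python
import math


def bessel_i0(k, terms=60):
    """Modified Bessel function of order 0 via its power series.

    60 terms are ample for moderate k (roughly k < 30).
    """
    total, term = 0.0, 1.0
    for j in range(terms):
        total += term
        term *= (k / 2.0) ** 2 / ((j + 1) ** 2)
    return total


def mixed_von_mises_pdf(x, p, k, u):
    """Density at angle x (radians) of a finite von Mises mixture.

    p: mixing proportions (sum to 1), k: scale parameters, u: positions.
    """
    return sum(
        pi / (2.0 * math.pi * bessel_i0(ki)) * math.exp(ki * math.cos(x - ui))
        for pi, ki, ui in zip(p, k, u)
    )


def day_to_angle(day, season_length):
    """Assumed convention: map day number to an angle in [0, 2*pi]."""
    return 2.0 * math.pi * day / season_length


if __name__ == "__main__":
    # Hypothetical two-component mixture (values invented for illustration).
    p, k, u = [0.6, 0.4], [4.0, 2.0], [2.0, 3.5]
    x = day_to_angle(150, 365)
    print(round(mixed_von_mises_pdf(x, p, k, u), 4))
```

Because each component integrates to one over a full circle, the mixture is a valid density whenever the mixing proportions sum to one.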
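The Pearson Type III (P3) distribution, recommended by the Ministry of Water Resources of China as the uniform procedure for flood frequency analysis (MWR 2006), is used to fit the annual maximum flood magnitudes. The probability density function of the P3 distribution is expressed as:

$$f(x) = \frac{\beta^{\alpha}}{\Gamma(\alpha)}\left(x - \delta\right)^{\alpha - 1} \exp\left[-\beta\left(x - \delta\right)\right], \quad \alpha > 0,\ \beta > 0,\ x \ge \delta \tag{7}$$

where $\alpha$, $\beta$ and $\delta$ are the shape, scale, and position parameters of the P3 distribution, respectively; and $\Gamma(\cdot)$ is the gamma function.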
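A minimal sketch of the P3 density follows, assuming the three-parameter gamma form with shape, rate-type scale and position parameters named `alpha`, `beta` and `delta` here. The conversion from mean, coefficient of variation (Cv) and skewness (Cs) uses the standard moment relations for this parameterization; it is included only for illustration and is not the L-moment procedure applied in the study. All numerical values are hypothetical.

```python
import math


def p3_pdf(x, alpha, beta, delta):
    """Pearson Type III density with shape alpha, scale beta, position delta."""
    if x <= delta:
        return 0.0
    z = x - delta
    return beta ** alpha / math.gamma(alpha) * z ** (alpha - 1.0) * math.exp(-beta * z)


def p3_params_from_moments(mean, cv, cs):
    """Moment relations (assumed parameterization, positive skew only)."""
    alpha = 4.0 / cs ** 2
    beta = 2.0 / (mean * cv * cs)
    delta = mean * (1.0 - 2.0 * cv / cs)
    return alpha, beta, delta


if __name__ == "__main__":
    # Hypothetical statistics of an annual maximum flood series (m^3/s).
    alpha, beta, delta = p3_params_from_moments(3000.0, 0.4, 1.0)
    print(alpha, beta, delta)
    print(p3_pdf(3000.0, alpha, beta, delta))
```

With these relations the fitted distribution reproduces the requested mean (`delta + alpha / beta`) and standard deviation (`sqrt(alpha) / beta`).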

Goodness-of-Fit Evaluation
The Root Mean Square Error (RMSE), Kolmogorov-Smirnov (K-S) and Chi-square ($\chi^2$) test methods are selected to test the goodness-of-fit of the marginal and joint distributions. For the univariate series, the empirical probabilities of each flood variable are calculated by the Gringorten plotting position formula (Gringorten 1963), which is written as:

$$P_e(x_i) = \frac{i - 0.44}{n + 0.12} \tag{8}$$

where $x_i$ is the observation of rank $i$ in the sample sorted in descending order; $n$ is the sample length; and $P_e(x_i)$ is the empirical exceedance probability.
For the bivariate series, the empirical probabilities of the joint distribution are estimated by Eq. (9), in which $(x_i, y_i)$ is a pair of observed data; $n$ is the sample length; and $P_e(x_i, y_i)$ is the empirical joint distribution probability.
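The RMSE is selected to measure the difference between the theoretical and empirical probabilities and is calculated by:

$$\mathrm{RMSE} = \sqrt{\frac{1}{n}\sum_{i=1}^{n}\left(P_i - P_{ei}\right)^2} \tag{10}$$

where $n$ is the sample size; $P_i$ is the theoretical probability obtained from the fitted distribution; and $P_{ei}$ is the empirical frequency from the observed data.

For $n$ observations sorted in increasing order, the K-S test statistic is expressed as:

$$D = \max_{1 \le i \le n}\left\{F(x_i) - \frac{i - 1}{n},\ \frac{i}{n} - F(x_i)\right\} \tag{11}$$

where $F(x_i)$ is the theoretical cumulative probability of the $i$th ordered observation.

The Chi-square ($\chi^2$) test measures the degree of deviation between the observed and predicted values, and its statistic is defined as:

$$\chi^2 = \sum_{i=1}^{k}\frac{\left(M_i - n p_i\right)^2}{n p_i} \tag{12}$$

where $p_1, p_2, \ldots, p_k$ are the hypothesized probabilities for $k$ possible outcomes; and $M_1, M_2, \ldots, M_k$ are the observed counts of each outcome, to be compared with the expected counts $np_1, np_2, \ldots, np_k$ in $n$ independent trials.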
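The goodness-of-fit measures described above can be sketched as follows. The functions use the standard textbook definitions of the Gringorten plotting position, the RMSE, the one-sample K-S statistic and the Pearson chi-square statistic; the function names and the small input vectors are invented for illustration.

```python
import math


def gringorten(n):
    """Plotting positions (i - 0.44) / (n + 0.12) for ranks i = 1..n.

    Applied to data ranked in descending order these are exceedance
    probabilities; for ascending order they are non-exceedance probabilities.
    """
    return [(i - 0.44) / (n + 0.12) for i in range(1, n + 1)]


def rmse(theoretical, empirical):
    """Root mean square error between two equally long probability lists."""
    n = len(theoretical)
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(theoretical, empirical)) / n)


def ks_statistic(cdf_sorted):
    """K-S statistic from theoretical CDF values of data in ascending order."""
    n = len(cdf_sorted)
    return max(
        max(f - (i - 1) / n, i / n - f) for i, f in enumerate(cdf_sorted, start=1)
    )


def chi_square(observed_counts, probs):
    """Pearson chi-square statistic for counts against hypothesized probs."""
    n = sum(observed_counts)
    return sum((m - n * p) ** 2 / (n * p) for m, p in zip(observed_counts, probs))


if __name__ == "__main__":
    print(gringorten(5))
    print(rmse([0.1, 0.5], [0.2, 0.3]))
    print(ks_statistic([0.1, 0.4, 0.7]))
    print(chi_square([10, 20, 30], [1 / 3, 1 / 3, 1 / 3]))
```

The statistics alone do not give a test decision; the K-S statistic still has to be compared with a critical value, and the chi-square statistic with the appropriate quantile or p-value.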

Coincidence Risk of Flood Occurrence Dates
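The coincidence of flood occurrence dates means that the annual maximum floods in the mainstream and its tributaries occur on the same day. Therefore, the coincidence probability of the annual maximum flood occurrence dates of two rivers on the $k$th day can be defined as:

$$P_{ij}(t_k) = P\left(t_{k-1} < T_i \le t_k,\ t_{k-1} < T_j \le t_k\right) = F(t_k, t_k) - F(t_{k-1}, t_k) - F(t_k, t_{k-1}) + F(t_{k-1}, t_{k-1}) \tag{13}$$

where $i$ and $j$ are the hydrological stations on the mainstream and its tributary; $T_i$ and $T_j$ represent the occurrence dates of the annual maximum flood at the two stations, expressed as a certain day of the flood season; $t_k$ represents the $k$th day of the flood season; and $F(\cdot,\cdot)$ is the joint distribution function of the flood occurrence dates.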
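One way to compute the daily coincidence probabilities is to take, for each day, the probability mass that the joint date distribution assigns to the square covering that day at both stations. The sketch below assumes this rectangle formula, a Clayton copula with a hypothetical parameter, and uniform date marginals over a 100-day season chosen purely so the result is easy to check; none of these settings come from the study.

```python
def clayton(u, v, theta):
    """Clayton copula; returns 0 on the lower boundary to avoid 0 ** negative."""
    if u <= 0.0 or v <= 0.0:
        return 0.0
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def independence(u, v, theta=None):
    """Independence copula, used here only as a reference case."""
    return u * v


def daily_coincidence(cdf_i, cdf_j, copula, theta, n_days):
    """Probability that both annual maxima fall on day k, for k = 1..n_days."""
    probs = []
    for k in range(1, n_days + 1):
        a0, a1 = cdf_i(k - 1), cdf_i(k)
        b0, b1 = cdf_j(k - 1), cdf_j(k)
        p = (copula(a1, b1, theta) - copula(a0, b1, theta)
             - copula(a1, b0, theta) + copula(a0, b0, theta))
        probs.append(p)
    return probs


if __name__ == "__main__":
    n_days = 100  # hypothetical season length

    def uniform_cdf(day):
        # Hypothetical marginal: dates uniformly distributed over the season.
        return day / n_days

    indep = daily_coincidence(uniform_cdf, uniform_cdf, independence, None, n_days)
    dep = daily_coincidence(uniform_cdf, uniform_cdf, clayton, 2.0, n_days)
    print("annual risk, independent dates:", sum(indep))
    print("annual risk, Clayton (theta = 2):", sum(dep))
```

Summing the daily values gives the annual coincidence risk of the occurrence dates; in the independent reference case each day contributes (1/100)², so the annual total is 1/100.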

Coincidence Risk of Flood Magnitudes
The joint distributions of the annual maximum flood magnitudes in the mainstream and its tributaries are first established based on copula functions. The joint probabilities, co-occurrence probabilities and conditional probabilities can then be quantitatively estimated by the following equations.
The joint probability of flood magnitude coincidence is the probability that a flood surpassing a certain value occurs in at least one of the two rivers, and can be expressed as:

$$P(X > x \cup Y > y) = 1 - F(x, y) \tag{14}$$

where $x$ and $y$ are the flood magnitudes in rivers $i$ and $j$, respectively; and $F(x, y)$ is the joint distribution function of the flood magnitudes in the two rivers.
The co-occurrence probability of flood magnitude coincidence is the probability that floods surpassing certain values occur in both rivers simultaneously, and can be expressed as:

$$P(X > x \cap Y > y) = 1 - F_X(x) - F_Y(y) + F(x, y) \tag{15}$$

where $F_X(x)$ and $F_Y(y)$ are the marginal distribution functions of the flood magnitudes in the two rivers.

The conditional probability is the probability that one variable falls within a specified range when the other variable is in a given range. In this study, it is the probability that a flood surpassing a certain value occurs in one river, given that a flood surpassing a certain value has occurred in the other river, and can be expressed as:

$$P(Y > y \mid X > x) = \frac{1 - F_X(x) - F_Y(y) + F(x, y)}{1 - F_X(x)} \tag{16}$$
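Given the two marginal non-exceedance probabilities and the copula value at that point, the three coincidence measures follow directly from the verbal definitions above ("at least one river", "both rivers", and "one river given the other"). The sketch below assumes those standard set-probability forms and a Clayton copula with a hypothetical parameter; the function names and numbers are illustrative only.

```python
def clayton(u, v, theta):
    """Clayton copula C(u, v), theta > 0, for u, v in (0, 1]."""
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def coincidence_probabilities(u, v, c_uv):
    """Return (joint, co-occurrence, conditional) exceedance probabilities.

    u, v : marginal non-exceedance probabilities F_X(x), F_Y(y)
    c_uv : joint non-exceedance probability F(x, y) = C(u, v)
    """
    joint = 1.0 - c_uv                   # at least one river exceeds
    co_occurrence = 1.0 - u - v + c_uv   # both rivers exceed
    conditional = co_occurrence / (1.0 - u)  # Y exceeds given X exceeds
    return joint, co_occurrence, conditional


if __name__ == "__main__":
    T = 10                 # return period in years
    u = v = 1.0 - 1.0 / T  # same-frequency design floods in both rivers
    theta = 2.0            # hypothetical copula parameter
    joint, co, cond = coincidence_probabilities(u, v, clayton(u, v, theta))
    print("joint:", joint, "co-occurrence:", co, "conditional:", cond)
```

The joint probability is never smaller than the co-occurrence probability, because "at least one river floods" contains "both rivers flood" as a special case.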

Study Area and Data
The Fuhe River, located in the east of Jiangxi Province, China, and feeding into Poyang Lake, was selected as a case study. Figure 1 shows a sketch map of the Fuhe River basin, including the mainstream, the tributaries and the hydrological stations. Under the subtropical humid monsoon climate, the flood season in this region generally lasts from April to early July.
The Fuhe River is 348 km long and has a drainage area of 16,493 km². The control basin areas of the Liaojiawan and Loujiacun hydrological stations, located on the upper mainstream and the tributary, are 8723 km² and 4969 km², respectively. The Lijiadu hydrological station, located on the lower mainstream, has a catchment area of 15,812 km², accounting for more than 95% of the entire drainage area. The daily average discharge series at these three hydrological stations from 1953 to 2016 were collected from Jiangxi Province, and the annual maximum flood magnitudes and corresponding occurrence dates were sampled and are shown in Fig. 2.
Fig. 1 Sketch map of the Fuhe River basin and the location of hydrological stations

Flood Correlation Analysis
The Pearson correlation coefficient was used to describe the correlation between the annual maximum flood magnitude series of the three hydrological stations on the Fuhe River. Figure 3a shows the correlation of the annual maximum flood magnitudes between each upstream station (Liaojiawan and Loujiacun) and the Lijiadu station. The Pearson correlation coefficient between the Liaojiawan and Lijiadu stations is 0.93, and that between the Loujiacun and Lijiadu stations is 0.90. Since the control basin area of the Loujiacun station on the tributary is smaller than that of the Liaojiawan station on the upper mainstream, its correlation with the Lijiadu station is slightly weaker. Figure 3b shows the correlation of the annual maximum flood magnitudes between the Liaojiawan and Loujiacun stations in the upper reaches, for which the Pearson correlation coefficient is 0.80. Since their control basins are very close to each other, the key factors for flood generation, including climatic conditions, geographical environment and topographical position, are very similar (Deidda et al. 2021), which indicates that the regularity of flood occurrence in the upper reaches is highly consistent.
The multiple regression model is employed to analyze the quantitative relationship among the annual maximum flood magnitude series. The functional relationship among the annual maximum flood magnitudes at the three hydrological stations is given by Eq. (17), where $X_1$ and $X_2$ represent the observed flood magnitudes at the upstream Liaojiawan and Loujiacun stations, respectively; $\hat{Y}$ represents the predicted flood magnitude at the downstream Lijiadu station; and $\varepsilon$ is the random error, which follows a normal distribution with a mean of 0, i.e., $\varepsilon \sim N(0, \sigma^2)$.
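A two-predictor linear regression of this kind can be fitted by ordinary least squares. The sketch below is an assumption-laden illustration: it supposes a model with an intercept, solves the normal equations with a small Gaussian elimination routine, and uses synthetic data in units of 10³ m³/s. The fitted coefficients of the study are not reproduced here, and all names and numbers are hypothetical.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):  # back substitution
        s = sum(m[i][c] * x[c] for c in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x


def fit_two_predictor_regression(x1, x2, y):
    """Least-squares fit of y = b0 + b1 * x1 + b2 * x2 (intercept assumed)."""
    rows = [[1.0, a, b] for a, b in zip(x1, x2)]
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(3)]
    return solve_linear(xtx, xty)


if __name__ == "__main__":
    # Synthetic upstream floods in 10^3 m^3/s (invented for illustration).
    x1 = [1.0, 2.0, 1.5, 3.0, 2.5]
    x2 = [0.5, 0.9, 1.2, 1.6, 0.8]
    y = [0.1 + 1.2 * a + 0.8 * b for a, b in zip(x1, x2)]
    b0, b1, b2 = fit_two_predictor_regression(x1, x2, y)
    print(round(b0, 6), round(b1, 6), round(b2, 6))
```

With real observations the residuals would not vanish, and their sample variance would estimate the variance of the normally distributed error term.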
The relationship between the predicted and observed flood magnitudes at the Lijiadu station is plotted in Fig. 3c. The scattered points lie close to the 45° diagonal line, indicating that the predicted and observed flood values are very close. The simultaneous occurrence of upstream floods is therefore prone to cause large floods in the downstream confluence area.

Estimation of Marginal Distribution Parameters
The parameters of the mixed von Mises distributions were estimated by the maximum likelihood method, and the K-S test and the RMSE were applied for the goodness-of-fit evaluation. Table 1 shows that the K-S test statistics do not exceed the critical values at a significance level of 0.05, implying that the hypothesis that the annual maximum flood occurrence dates follow the mixed von Mises marginal distribution cannot be rejected. Meanwhile, the RMSE values between the theoretical and empirical probabilities are very small. The parameters of the P3 distribution were estimated by the L-moment method. The estimated statistics, the Chi-square test results and the RMSE values are also listed in Table 1. The p-values of the $\chi^2$ test are all larger than 0.05, implying that the hypothesis that the flood magnitudes follow the P3 marginal distribution is not rejected.
The mixed von Mises distributions and the corresponding empirical probabilities were calculated by Eqs. (6) and (8), while the P3 distributions and the corresponding empirical probabilities were calculated by Eqs. (7) and (8), respectively. Figure 4 shows that the theoretical probability curves fit the empirical frequency points very well for both the annual maximum flood occurrence dates and the magnitudes.

Selection of Copula Function
The bivariate joint distribution of the annual maximum flood occurrence dates at the Liaojiawan and Loujiacun stations was constructed with each of the Clayton, Gumbel-Hougaard (GH) and Frank copula functions. Similarly, the bivariate joint distributions of the annual maximum flood magnitudes between the Liaojiawan, Loujiacun and Lijiadu stations were constructed with each copula. The Kendall rank correlation coefficient was employed to estimate the parameters of the copula functions, and the empirical frequencies of the joint distributions were calculated by Eq. (9). Table 2 shows that, for both the flood occurrence dates and the magnitudes, the RMSE values of the Clayton copula function are the lowest, indicating that the Clayton copula is the most appropriate for modeling the joint distributions of flood variables at these stations. The Clayton copula function is therefore selected to establish the joint probability distributions of flood dates and of flood magnitudes. The theoretical and observed non-exceedance joint probabilities are plotted in Fig. 5, in which the x-axis is sorted in ascending order of the theoretical non-exceedance joint probabilities. The theoretical frequency curves fit the observed flood dataset well.
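The selection procedure (estimate Kendall's coefficient, convert it to a copula parameter, then compare theoretical and empirical joint probabilities by RMSE) can be sketched as below. Several ingredients are assumptions made for this illustration rather than details confirmed by the text: the count-based Gringorten-type empirical joint probability, the rank-based pseudo-observations used in place of fitted marginals, the restriction to the Clayton and Gumbel-Hougaard families, and the paired sample itself, which is invented.

```python
import math


def kendall_tau(x, y):
    """Kendall rank correlation (tau-a), assuming no tied values."""
    n = len(x)
    s = 0
    for i in range(n - 1):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            s += (prod > 0) - (prod < 0)
    return 2.0 * s / (n * (n - 1))


def pseudo_obs(values):
    """Rank-based non-exceedance probabilities (assumed Gringorten form)."""
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    u = [0.0] * n
    for rank, idx in enumerate(order, start=1):
        u[idx] = (rank - 0.44) / (n + 0.12)
    return u


def empirical_joint(x, y):
    """Assumed count-based empirical joint non-exceedance probability."""
    n = len(x)
    return [
        (sum(1 for j in range(n) if x[j] <= x[i] and y[j] <= y[i]) - 0.44) / (n + 0.12)
        for i in range(n)
    ]


def clayton(u, v, theta):
    return (u ** -theta + v ** -theta - 1.0) ** (-1.0 / theta)


def gumbel_hougaard(u, v, theta):
    s = (-math.log(u)) ** theta + (-math.log(v)) ** theta
    return math.exp(-(s ** (1.0 / theta)))


def select_copula(x, y):
    """Return (best family, RMSE per family); requires 0 < tau < 1."""
    tau = kendall_tau(x, y)
    u, v = pseudo_obs(x), pseudo_obs(y)
    emp = empirical_joint(x, y)
    candidates = {
        "Clayton": (clayton, 2.0 * tau / (1.0 - tau)),
        "Gumbel-Hougaard": (gumbel_hougaard, 1.0 / (1.0 - tau)),
    }
    scores = {}
    for name, (copula, theta) in candidates.items():
        theo = [copula(a, b, theta) for a, b in zip(u, v)]
        scores[name] = math.sqrt(
            sum((t - e) ** 2 for t, e in zip(theo, emp)) / len(emp)
        )
    return min(scores, key=scores.get), scores


if __name__ == "__main__":
    # Invented paired flood magnitudes (10^3 m^3/s), for illustration only.
    x = [3.1, 1.2, 4.5, 2.2, 5.0, 2.9, 3.8, 1.9]
    y = [2.8, 1.5, 3.9, 2.5, 4.1, 2.0, 4.4, 1.1]
    best, scores = select_copula(x, y)
    print(best, scores)
```

Which family wins depends entirely on the sample, so the toy data above say nothing about the ranking reported in Table 2.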

Coincidence Risk of Flood Occurrence Dates
The coincidence probabilities of the flood occurrence dates at the Liaojiawan and Loujiacun stations were calculated by Eq. (13) and are plotted in Fig. 6. The coincidence probabilities of flood occurrence dates are multi-peaked. Before March or after August, the coincidence probabilities are close to zero. The period from early May to early July has a higher coincidence risk and includes two peaks, which occur on May 12 and June 21 with probabilities of 0.026% and 0.057%, respectively. Summing all the daily coincidence probabilities gives an annual coincidence risk of the flood occurrence dates in the mainstream and tributaries of 2.87%.

Coincidence Risk of Flood Magnitudes
The coincidence probabilities of the annual maximum flood magnitudes for different design floods in the mainstream and its tributaries were estimated. The joint probabilities and co-occurrence probabilities of design floods at the Liaojiawan and Loujiacun stations were calculated by Eqs. (14) and (15), respectively. Given the occurrence of floods at the Liaojiawan or Loujiacun station, the conditional probabilities of T-year design floods at the Lijiadu station were calculated by Eq. (16). Table 3 lists the flood coincidence probabilities, including the joint probabilities, co-occurrence probabilities, and conditional probabilities, for different design floods.
For the 5, 10, 20, 50 and 100-year design floods, the joint probabilities are 30.48%, 17.14%, 9.20%, 3.86% and 1.96%, and the co-occurrence probabilities are 9.52%, 2.86%, 0.80%, 0.14% and 0.04%, respectively. The joint probabilities are greater than the co-occurrence probabilities. Moreover, both the joint probabilities and the co-occurrence probabilities decrease as the return period increases. That is, small and medium floods are more likely to occur simultaneously, which conforms to general observations in practice.
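The reported values can be cross-checked with a simple identity. If the joint probability is 1 − C(u, v) and the co-occurrence probability is 1 − u − v + C(u, v), which is the standard reading of the two definitions and is assumed here, then for same-frequency floods with u = v = 1 − 1/T the copula term cancels and the two probabilities must sum to 2/T, whatever copula is used. The short check below applies this to the figures quoted above; the function name is illustrative.

```python
# Values quoted in the text, in percent.
RETURN_PERIODS = [5, 10, 20, 50, 100]
JOINT = [30.48, 17.14, 9.20, 3.86, 1.96]
CO_OCCURRENCE = [9.52, 2.86, 0.80, 0.14, 0.04]


def identity_residuals(return_periods, joint, co_occurrence):
    """Residuals of joint + co-occurrence - 2/T, all in percent."""
    return [
        pj + pc - 200.0 / t
        for t, pj, pc in zip(return_periods, joint, co_occurrence)
    ]


if __name__ == "__main__":
    for t, r in zip(RETURN_PERIODS,
                    identity_residuals(RETURN_PERIODS, JOINT, CO_OCCURRENCE)):
        print(f"T = {t:>3} years: residual = {r:+.4f} %")
```

All five residuals vanish to within rounding, so the quoted joint and co-occurrence probabilities are mutually consistent under the assumed definitions.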
Given 5, 10, 20, 50 and 100-year design floods at the Liaojiawan station, the conditional probabilities of floods of the same frequency occurring at the Lijiadu station are 16.41%, 5.01%, 1.46%, 0.26% and 0.07%, respectively. Floods with lower return periods thus have higher coincidence probabilities. The conditional probabilities between the Liaojiawan and Lijiadu stations are higher than those between the Loujiacun and Lijiadu stations, which indicates that the floods at the upstream Liaojiawan station have a more significant impact on the downstream floods.

Comparison of Different Estimation Methods
In this study, we assumed that floods with the same frequency occur simultaneously at the Liaojiawan and Loujiacun stations. The design flood at the Lijiadu station can then be predicted by Eq. (17). Since the flood magnitude series follow the P3 distribution, the corresponding design frequencies and return periods are calculated by Eq. (7). For given return periods of 5, 10, 20, 50, and 100 years, the design floods at the Liaojiawan and Loujiacun stations and the predicted floods at the Lijiadu station are listed in Table 4.
To compare the above three flood coincidence probability estimation methods, the theoretical coincidence probabilities were calculated and compared with the design frequencies of the predicted floods at the Lijiadu station. For each return period, the design frequency of the predicted flood at the Lijiadu station was estimated, together with the joint probability and the co-occurrence probability of floods with the same frequency occurring at the Liaojiawan and Loujiacun stations. The conditional probabilities $P_{c1}$ (or $P_{c2}$) of floods of the same frequency occurring at the Lijiadu station, given the occurrence of T-year floods at the Liaojiawan (or Loujiacun) station, were also calculated. All of these estimated flood probabilities are listed in Table 4.
Compared with the design frequencies of the predicted floods at the Lijiadu station, the joint probabilities of floods occurring simultaneously in the upper mainstream and its tributary are generally larger, while the co-occurrence probabilities and conditional probabilities are smaller. Among these flood probabilities, the conditional probabilities ($P_{c1}$) are the closest to the design frequencies, especially for small return periods. For example, the joint and co-occurrence probabilities of 10-year floods at the Liaojiawan and Loujiacun stations are 17.14% and 2.86%, respectively, while the design frequency of the predicted flood at the Lijiadu station is 9.41%. The conditional probability of a 10-year flood occurring at the Lijiadu station given a 10-year flood at the Liaojiawan station is 5.01%, and that given a 10-year flood at the Loujiacun station is 4.47%. Combined with the previous analysis, floods with the same frequency occurring simultaneously at the Liaojiawan and Loujiacun stations can lead to large floods at the Lijiadu station. The corresponding design frequencies of the predicted floods are most consistent with the conditional probabilities of same-frequency floods occurring at the Lijiadu station given T-year floods at the Liaojiawan station. In general, the floods at the Lijiadu station are formed by the superposition of the floods in the upper reaches. More importantly, the floods at the upstream Liaojiawan station account for a larger proportion of the floods at the Lijiadu station.
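The comparison for the 10-year case can be made explicit by ranking the estimates by their absolute deviation from the design frequency of the predicted flood. The snippet below uses only the 10-year values quoted above; the dictionary keys and the function name are illustrative labels.

```python
DESIGN_FREQUENCY_10YR = 9.41  # predicted flood at Lijiadu, in percent

ESTIMATES_10YR = {
    "joint probability": 17.14,
    "co-occurrence probability": 2.86,
    "conditional probability (given Liaojiawan)": 5.01,
    "conditional probability (given Loujiacun)": 4.47,
}


def rank_by_deviation(reference, estimates):
    """Sort estimates by absolute deviation from the reference value."""
    deviations = {name: abs(value - reference) for name, value in estimates.items()}
    return sorted(deviations.items(), key=lambda item: item[1])


if __name__ == "__main__":
    for name, dev in rank_by_deviation(DESIGN_FREQUENCY_10YR, ESTIMATES_10YR):
        print(f"{name}: deviation = {dev:.2f} percentage points")
```

For this return period the conditional probability given a flood at the Liaojiawan station deviates least (4.40 percentage points), followed by the conditional probability given the Loujiacun station (4.94), the co-occurrence probability (6.55) and the joint probability (7.73).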

Conclusions
Flood coincidence risk analysis plays an important role in reservoir operation and flood management. In this study, the marginal and joint distributions of the annual maximum flood magnitude and occurrence date series were established. The coincidence risk of flood occurrence dates, and the joint, co-occurrence and conditional probabilities for different flood magnitudes, were calculated and compared with the predicted flood coincidence risks. The main conclusions are summarized as follows:
1. There is a strong consistency and significant correlation between the upstream and downstream floods, with Pearson correlation coefficients of 0.93 and 0.90. The floods at the Lijiadu station are mainly formed by the superimposition of the upstream floods. The mixed von Mises (or P3) marginal distribution fits the flood occurrence dates (or magnitudes) very well, and the Clayton copula is the best one for constructing the joint distributions of flood variables.
2. The coincidence events of the annual maximum flood occurrence dates at the Liaojiawan and Loujiacun stations mainly occur from May to early July. Two flood coincidence peaks occur on May 12 and June 21, and the annual coincidence risk in the Fuhe River is 2.87%.
3. The joint probability and co-occurrence probability of 50-year design floods at the Liaojiawan and Loujiacun stations are 3.86% and 0.14%, respectively. Given the occurrence of a 50-year flood at the Liaojiawan or Loujiacun station, the corresponding probability of a flood with the same frequency occurring at the Lijiadu station is 0.26% or 0.22%, respectively.
4. Floods with the same frequency occurring simultaneously at the Liaojiawan and Loujiacun stations are likely to superimpose into large floods at the Lijiadu station. Among the three coincidence probability estimation methods, the conditional probability is the most consistent with the flood coincidence risk in the mainstream and its tributaries, and is therefore the most reliable and rational in practice.
In this study, the copula-based quantitative analysis of flood coincidence risk reveals the spatiotemporal characteristics of floods in the Fuhe River. The feasibility and rationality of the flood coincidence probability estimation methods have been verified by considering the connection between the mainstream and its tributaries. This comparative study may provide a more intuitive understanding of the different flood coincidence probabilities. To improve the accuracy of flood coincidence risk estimation, other factors, including climate, environment, and topography, should be considered as variables in addition to flood magnitudes and time in future studies. A comprehensive and systematic flood coincidence assessment method will provide a scientific basis and effective support for crucial decision-making and flood management.