Nonparametric Approach to Copula Estimation in Compounding The Joint Impact of Storm Surge and Rainfall Events in Coastal Flood Analysis

The joint probability modelling of storm surges and rainfall events is the main task in assessing compound flood risk in low-lying coastal areas. These extreme or non-extreme events may not be dangerous if considered individually but can intensify flooding impact if they occur simultaneously or successively. Recently, the copula approach has been widely accepted in compound flooding but is often limited to parametric or semiparametric distribution settings in a limited number of cases. However, both parametric and semiparametric approaches assume the prior distribution type for univariate marginals and copula joint density. In that case, there is a high risk of misspecification if the underlying assumption is violated. In addition, both approaches suffer from a lack of flexibility. This study uses bivariate copula density in the nonparametric distribution setting. The joint copula structure is approximated nonparametrically by employing the Bernstein copula estimator and Beta kernel copula density, and their performances are also compared. The proposed model is tested with 46 years of rainfall and storm surge observations collected on Canada's west coast. The marginal distribution of the selected flood variables is modelled using nonparametric kernel density estimation (KDE). Based on the different model compatibility tests, the Bernstein copula with normal KDE margins defined the joint dependence structure well. The selected nonparametric copula model is further employed to estimate joint and conditional return periods. It is found that flood hazard characteristics occurrence simultaneously is less frequent in AND-joint cases than in OR-joint cases. Also, the derived model is further used to estimate failure probability (FP) statistics to assess the variation of bivariate hydrologic risk during the project lifetime. It is found that FP statistics could be underestimated when neglecting the compound effect of storm surge and rainfall in the coastal flood risk.


Introduction
Flooding in low-lying coastal zones is becoming one of the most devastating hazards worldwide. Compound flooding (CF) events can be defined by a simultaneous or successive combination of multiple extreme or non-extreme factors such as oceanographic, hydrological, or meteorological (Zscheischler et al. 2018;Hendry et al. 2019;Jane et al. 2020;Naseri and Hummel 2022). These events may not be disastrous if they occur individually but have an intensive and devastating coastal impact if they coincide (Archetti et al. 2011;Zheng et al. 2013;Wahl et al. 2015;Santos et al. 2021). In reality, storm surge can be considered the primary driver of coastal flooding (Resio and Westerink 2008). It can become an extreme flooding event when combined with other meteorological factors, such as rainfall, even though mutual concurrency might not be much more substantial. The storm surge and rainfall can be naturally correlated in the coastal regions because of a common forcing mechanism that drives both events, such as low atmospheric pressure and tropical or extra-tropical cyclone (Zheng et al. 2014). In such circumstances, the probability of their joint occurrence will be higher and more appropriate for assessing hydrologic risk than separately considering each flood's extreme contributing variable. In compounding the joint impact of rainfall and storm surge on the flood risk estimate, earlier studies proposed different statistical models, which usually account for the number of joint extremes or the existence of a joint dependence structure between the variables of interest that define flood hazard types (Coles et al. 1999;Coles 2001).
The multivariate analysis can model the multiple uncertain characteristics using a justifiable multivariate probability framework. The copula approach has been recognized as a highly flexible multivariate statistical tool in joint probability analysis that can eliminate several statistical restrictions of traditional multivariate distributions (Nelsen 2006;Zhang 2005; Shahid and Firuza 2021 and references therein). One of the most significant flexibilities is that the choice of best-fitted univariate marginals and their joint dependence structure do not necessarily need to be from the same distribution family. Recent studies frequently incorporated a copula-based multivariate framework in the joint analysis of compound events by targeting different flood-driving mechanisms (Xu et al. 2014(Xu et al. , 2019Wahl et al. 2015;Masina et al. 2015;Paprotny et al. 2018;Ghanbari et al. 2021;Naseri and Hummel 2022;and references therein). All these studies often introduced parametric class copulas (such as Archimedean or Extreme value (EV)) together with parametric marginals probability density functions or PDFs (such as GEV, Lognormal or Logistic) in the parametric distribution settings.
No evidence from the previous literature can be found that modelling the univariate marginals of any hydrologic extreme should be done either using the same distribution family or any fixed predefined distribution (Silverman 1986;Adamowski 1989;Sharma et al. 1998). For instance, the applicability of parametric distributions in modelling univariate marginals could be challenging and questionable if the hydrologic characteristics exhibit multimodal (or skewed) behaviour. Previous studies, such as Lall (1993, 1994), Wand and Jones (1995); Tarboton et al. (1998), Kim et al. (2006) and references therein, inferred that the implementation of a nonparametric approach via kernel density estimation (KDE) results in more appropriate density function for skewed data and does not require a prior assumption of the distribution type. Dooge (1986) already stated that no amount of statistical refinement could overcome the consequences due to the lack of prior probability distribution information of the observed random samples.
Similarly, the same statistical facts can apply to the copula function; for instance, it does not require any predefined joint PDF assumption in describing mutual dependency (Joe 1997;Nelsen 2006). The parametric copula approach frequently appears in the literature, as cited above, because of its simplicity. However, it is questionable how accurately selected parametric functions approximate joint density structure and/or univariate marginals in such an approach. To overcome this, Genest and Rivest (1995), Shih and Louis (1995), Liebscher (2005), and Bouezmarni and Rombouts (2008) pointed out the possibility of using a parametric model for the density copula and a nonparametric model for univariate marginals, called the semiparametric copula distribution settings. Applications of the semiparametric copula model are primarily in economics or financial studies and infrequent in hydro-meteorological studies. Only a few studies, such as those by Karmakar and Simonovic (2009), Rauf and Zeephongsekul (2014), Reddy and Ganguli (2012), and Shahid and Firuza (2021), are examples of modelling flood peak discharge, volume, and duration series.
However, the semiparametric framework still uses the parametric class copulas, which might contribute to underestimating the actual joint probability density (Charpentier et al. 2006). The parametric model's parameter estimations are time-consuming using standard statistical procedures. Also, if the underlying assumption of the fitted parametric model is violated, it could be challenging or have spurious inferences (Rauf and Zeephongsekul 2014). Unless a wide variety of parametric copula classes are available, this approach is not flexible and might bear the risk of uncertainty in their estimated joint exceedance due to a lack of appropriate dependence structure. Different parametric classes, such as Archimedean copulas, have a different extent of dependency measuring capability, thus demanding extra attention and caution when applying copula to given historical observations. The nonparametric copula density can be a much more appropriate approach to adapt to any kind of dependence structure without considering specific PDF form for copula functions or their univariate marginals. Deheuvels and Hominal (1979) significantly contributed by incorporating a nonparametric approach via empirical copula with univariate marginals. Later on, Gijbels and Mielniczuk (1990) study incorporated a 2-D copula using the smoothing kernel methods. This approach suggested a reflection method to alleviate the boundary bias problem of the kernel methods. Chen and Huang (2007) studies highlighted the importance of bivariate kernel copula joint density based on the local linear estimator. In hydro-meteorological studies, Rauf and Zeephongsekul (2014) study incorporated a nonparametric copula framework in the joint modelling of rainfall severity and duration characteristics in Victoria, Australia. Later, Latif and Mustafa (2020) introduced nonparametric copula joint density in the bivariate modelling of flood peak flow, volume and duration series for the Kelantan River in Malaysia. In modelling of multivariate copula density nonparametrically, Beta kernel estimators can be a good choice (Harrell and Davis 1982;Chen 1999). One of the most significant statistical advantages is that they are naturally free of boundary bias and can generate estimates with a minimum variance (Charpentier et al. 2006). The Bernstein estimator is another way to approximate copula density nonparametrically (Pfeifer et al. 2009). Sancetta and Satchell (2004) found an upper bound of its asymptotic bias and variance in developing a Bernstein polynomial estimator of the copula function. However, this study also establishes the asymptotic normality of the Bernstein density copula estimator without an explicit expression of the variance. The Bernstein copula can provide a higher rate of consistency (Kulpa 1999;Weiss and Scheffer 2012) and do not suffer from boundary bias problems; in other words, it can improve the estimation of the underlying dependence structure compared to nonparametric empirical copulas. The Bernstein copula density approach gained attraction in economics (or insurance modelling or financial risk management) (e.g., Dieres et al. 2012;Pfeifer et al. 2009 and references therein) but has never been used in modelling extreme hydrologic events, to the best of our knowledge.
The present study's novelty is introducing a nonparametric copula joint density via the Bernstein copula estimator in modelling compounding impacts of rainfall and storm surge events on flood risk in the coastal zones. The performance of the Beta kernel copula estimator is also investigated and compared with the Bernstein copula in the joint dependency simulation. Establishing flood dependence via Bernstein and Beta kernel copula estimator can facilitate a much more realistic modelling approach without considering any prior assumption about their density structure. The Bernstein estimator has never been used to model hydrologic or meteorological observations. The proposed nonparametric joint framework is applied to 46 years of historical observations collected on the west coast of Canada. The derived best-fitted copula joint density is employed further to estimate bivariate joint and conditional return periods. The best-fitted model is used to estimate failure probability (FP) to analyze the bivariate hydrologic risk of rainfall and storm surge. The FP statistics can provide practical input in compound flood risk assessments. The motivation of the present study is the lack of an efficient probabilistic framework for modelling coastal flooding on the western coast of Canada.
The low-lying regions of Canada's east and west coasts are experiencing a significant threat from coastal flooding, especially under sea level rise (SLR) (Atkinson et al. 2016). The west coast of British Columbia (B.C.), is expected to see an increase in SLR between ½ and 1 m by the end of 2050 and 2100 (British Columbia Ministry of Environment 2013). Pirani and Najafi (2020) study already claimed a higher risk of compound flooding on the Pacific (west) Canada's coast due to the combination of precipitation and extreme water level. The high risk of coastal water levels can further increase the risk of storm surge events. The complex interplay between storm surges and rainfall can significantly impact flood risk. This paper's proposed nonparametric extreme modelling can be applied to any coastal region worldwide.
The remainder of this manuscript is organized as follows. The brief mathematical approach to the nonparametric copula density using the Beta kernel and Bernstein copula estimator is discussed in Sect. 2. This section also discusses the nonparametric distribution approach via 1-D Kernel density estimation (KDE) in modelling univariate flood marginals. The third section presents a case study application of nonparametric copula density with KDE marginals to the joint analysis of rainfall and storm surge events. The model adequacy of the Bernstein copula is compared with the Beta kernel copula in the nonparametric joint density of flood structure. Subsequently, the bivariate return periods and failure probability statistics are estimated to explore the bivariate compound flood risk caused by rainfall and storm surge events. The last section of the paper provides the research summary and conclusions that can be used to reference coastal flood control and mitigation practices.

Nonparametric Estimation of 2-D Copula Joint Density
The copula function begins from Saklar's theorem (Saklar 1959), which states that any multivariate joint probability distribution can be separated into its univariate distributions and copula functions, which capture the mutual concurrency between them independently from marginal approximation (Nelsen 2006). The copula function eliminates the restriction in selecting the best-fitted univariate marginals and their joint structure from the same family distribution functions. Incorporating the multivariate copula in the parametric and semiparametric settings requires distributional assumptions for their univariate marginals PDFs and/or joint density (refer to Sect. 1). It could be a risk of misspecification if the underlying statistical assumptions of the selected predefined marginals PDF and/or copula density are violated and can lack flexibility. A nonparametric copula density framework is a flexible alternative to previously used parametric and semi-parametric approaches. This study introduces a nonparametric approach via kernel density estimation (KDE) in modelling univariate flood marginal distributions. The best-fitted marginals are then employed with two different nonparametric estimates of the joint copula density, the Bernstein copula estimator and the Beta kernel copula estimator (refer to Fig. 1).

Beta Kernel Copula Estimator
In this study, a nonparametric via the Beta kernel estimator is introduced and tested in the joint probability analysis of storm surge and rainfall characteristics. The earlier study, such as Behnen et al. (1985) and Gijbels and Mielniczuk (1990), pointed out that the nonparametric approach procedures for the density of a copula function often rely on symmetric kernels. Unfortunately, such procedures often suffer from boundary bias problems, depending on bandwidth. Different statistical approaches are highlighted and discussed in the estimation of joint copula density via nonparametric settings, such as Mirror image modification (Schuster 1985), transformed kernels (Wand et al. 1991), and boundary kernels (Müller 1991). The development of the nonparametric copula framework used in this study is based on the Beta kernel function, which was discussed in literature by Brown and Chen (1999), Harrell and Davis (1982), and Chen (1999Chen ( , 2000. So-called boundary kernels achieve the Beta kernel. This nonparametric density approach can alleviate severe boundary bias problems (or is naturally free of boundary bias) often encountered in the standard kernel estimators and can perform better than other estimators (Jones 1993;Müller 1991;Renault and Scaillet 2004). The Beta kernel estimators are consistent if the actual density is unbounded at the boundary (Bouezmarni and Rolin 2003). In other words, it can be selected as a candidate to simulate a justifiable nonparametric estimator of the copula density.
Mathematically, the univariate Beta-kernel density function based on the sample of univariate variable M 1 , M 2 , … … .M p with known compact support [0,1] is given as: In Eq. (1), K (., a and b) represent the density of the Beta distribution function with associated parameters a and b, which can be estimated by , m ∈ [0, 1] In Eq. (1), h is the kernel smoothing parameter called kernel bandwidth. Mathematically, the bivariate copula joint density structure can be derived from the product of Beta kernel densities called Beta kernel copula at a point (m, n) (Charpentier et al. 2006) given below: In Eq. (5), the Beta kernel copula density's smoothing parameter or bandwidth h is estimated by the rule of thumb (ROT) method by minimizing the asymptotic mean integrated squared error or AMISE using the Frank copula as the reference copula (Nagler 2014). Mathematically, ROT for the Beta kernel estimator c h (m, n) is In Eq. (6), c can be taken as Frank copula.

Bernstein Copula Estimator
Besides the applicability of the above-discussed nonparametric framework, this study firstly introduced and tested the efficacy of the Bernstein estimator in the compound flood probability analysis. In the earlier study, Lorentz (1953) demonstrated that any continuous function within a range of [0, 1] could be approximated using the Bernstein polynomials. A later study by Vitale (1975) used this polynomial for density function, and Tenbusch (1994) examined the Bernstein estimator for bivariate density function. According to Sancetta and Satchell (2004) and Diers et al. (2012), using Bernstein copulas provides a higher consistency and does not suffer from boundary bias. Compared to empirical copula, this nonparametric copula density can better estimate underlying mutual dependence and have a good estimator at an extreme and asymmetric dependence structure (Bouezmarni et al. 2013).
The d degree Bernstein polynomials can be mathematically estimated as where a = 0, 1, … … , d ∈ ℕ and 0 ≤ b ≤ 1. Again, if U = (U 1 , U 2 ) represents bivariate random observation with uniform marginals over P i = {0, 1, 2, … … … … , d i } and having grid size d i ∈ ℕ and i = 1, 2. Then, Hence, the Bernstein copula density for a bivariate joint case is estimated by (Sancetta and Satchell 2004;Bouezmarni et al. 2013) Also, the 2-D Bernstein copula of Eq. (9) can be used in the estimation of the empirical copula (Deheuvels and Hominal 1979) as given below where

Nonparametric Kernel Density in Flood Marginals
Parametric families-based approximations of univariate flood marginals are based on the prior distribution assumption of their marginal pdf. In reality, each flood characteristic can exhibit different distribution behaviour and needs to be modelled separately without imposing any fixed or predefined density functions. The kernel density estimation (KDE) is a data-driven model having no prior assumption of the probability density type, especially suited for skewed distributions. The kernel estimator concept was introduced by Rosenblatt (1956). It can be employed to approximate the probability density function or PDF of the hydrologic variables. The univariate kernel density function can be employed in the estimation of the probability density of targeted univariate random observation having the following statistical property In Eq. (12), K(x) is the univariate kernel function, and the PDF represented here can be used as a kernel function. According to Härdle (1991), the kernel function can be described using a general equation In Eq. (13), 't' is the kernel smoothing parameter called kernel bandwidth which can help to control the degree of roughness and smoothness of the kernel density function's shape Lall 1993, 1994). Table 1 illustrates some standard univariate kernel functions (Silverman 1986;Karmakar and Simonovic 2008) introduced and tested in this study for the modelling of flood margins. The univariate kernel density estimator f t (x) of a pdf can be estimated by taking the mean of Eq. (13) of the given random observations as given below: In Eq. (14), n is the observation count and X i is the i th observation. The value of 't' (kernel bandwidth) is a critical concern. Over-smoothing can bypass essential contents, and undersmoothing (or insufficient smoothing) can.
lead to rough density. Readers are advised to follow Jones et al. (1996) and Sharma et al. (1998) for a comprehensive overview of the kernel bandwidth selection procedure. In the present study, the direct plug-in (DPI) method is used to select the bandwidth of a kernel density estimate (Sheather and Jones 1991). The Plug-in kernel bandwidth approach is simple and performs well (Wand and Jones 1995;Chen 2015). For extended mathematical details about the bandwidth estimation approach, readers are advised to follow Sheather and Jones (1991) and Chen (2015). Our recent study (Shahid and Simonovic 2022) selected parametric GEV and Normal distributions as best-fitted in defining the marginal distributions of annual maximum 24-h rainfall and maximum storm surge (time interval = ± 4 days) characteristics. These estimated results are compared with this study's most parsimonious nonparametric KDE.

Data Preparation
This study examined the joint dependency analysis of storm surge and rainfall events in the coastal flood risk using the proposed nonparametric copula distribution. The complex interplay between storm surge and rainfall can be devastating and exacerbate the impact of flooding on the coastal region. Therefore, to investigate the dependency, at first, we search for the 24-h rainfall events and their associated highest (or maximum) storm surge within a time lag of ± 0 (simultaneous events), ± 1, ± 2, ± 3, ± 4 and ± 5 days during annual maximum 24-h rainfall events. In the previous attempts, such as Xu et al. (2014) and Xu et al. (2019), the highest value of the second variable was selected (e.g., storm tide or storm surge) within a time lag of ± 0 day (simultaneous) from the date of the highest value of the first variable (rainfall). A few studies, for instance, Wahl et al. (2015), searched for the dependency of precipitation within a time lag of ± 1 day from the highest storm surge events date. It may be possible that the dependency between rainfall and storm surge is highest or more significant at ± 2 or ± 3 or at ± 5 days. In other words, flooding may be devastating if the second event (e.g., rainfall) occurs in close succession or a few days apart before or after the first event (e.g., storm surge).
The tidal gauge collects the observed coastal water level (CWL) data at the New Westminster station (British Columbia, Canada) with the geographical location of 49.2 • N Lat and 122.91 • W Lon, from 1970 to 2018. The data was provided by the Department of Table 1 The mathematical expressions of univariate kernel density functions SI no.
Kernel functions Fisheries and Oceans, Canada. The predicted astronomical high tide observations are collected by and obtained from the Canadian Hydrographic Service (CHS). It should be noted that west Canada's coast exhibited two high peaks each day. The storm surge data are estimated by the difference with proper time synchronization or matching between high tide and CWL data, which can be either positive or negative.
Similarly, the rainfall gauge station, Haney UBC RF, with a geographical location 49 • 15 � 52.100 �� N Lat and 122 • 34 � 400 �� W Lon is selected within 50 km distance from the selected tidal gauge station. The daily rainfall data is converted into the block (annual) maximum form, called the annual maximum 24-h rainfall observations. The storm surge values are selected within the above-specified time lags of the annual maximum rainfall. Their strength of dependency is measured using the parametric Pearson (r) and nonparametric Kendall tau ( ) and Spearman rho ( ) correlation coefficients. The investigation confirms that the maximum dependency is exhibited when the maximum value of storm surge is observed within a time lag of ±4 days (refer to Supplementary Fig. SF1). Finally, our study selected annual maximum 24-h rainfall and maximum storm surge (time interval = ±4 days) to define flood-paired observations.
Our previous study (Shahid and Simonovic 2022) confirms that the storm surge observations (time interval = ± 4 days) exhibited monotonic time trend behaviour (via Mann-Kendall or MK test) but no serial correlation (via Ljung-Box test using Q-statistics). Therefore, the pre-whitening procedure is adopted to detrend storm surge time series before introducing them into the univariate and multivariate probability framework. Conversely, the annual maximum 24-h rainfall series did not have any time-trend and autocorrelation. Supplementary Table ST1 lists essential descriptive statistics of the selected compound flood driving factors.

Nonparametric Marginal Distributions for Compound Flood Characteristics
Selecting the best-fitted univariate probability function in describing their marginal behaviour is a mandatory pre-requisites before introducing them into the multivariate distribution framework. In this regard, parametric family distributions' applicability is frequently tested in previous studies but can be problematic due to prior distributional assumptions about their marginal pdf (refer to Sect. 1). The nonparametric kernel density estimation, KDE, does not require any distributional type assumption and can be directly derived from the given observations. Therefore, it is especially suited for unsymmetrical data.
In this study, KDE is employed in estimating univariate flood marginals and whose performance is compared using the best-fitted parametric distributions, such as GEV and Normal distributions which are selected from our previous study (Shahid and Simonovic 2022). The bandwidth of the candidate kernel functions fitted to both flood characteristics are estimated using a direct plug-in (DPI) estimation (refer to Sect. 2.2). Tables 2 and 3 list the fitted kernel functions and their estimated bandwidth. The cumulative distribution functions (CDFs) of the KDE are estimated using an empirical approach that is based on numerical integration due to the lack of a closed-form of probability density function (PDF) and cumulative distribution function (CDF) (Kim and Heo 2002;Kim et al. 2006).
Selecting the most parsimonious univariate distribution can demand a better selection procedure by adopting a series of analytical and graphical visual inspections. The performance measures of the candidate function fitted to observed flood characteristics are undertaken, such as error indices statistics (Mean square error (MSE) and Root mean square error (RMSE)) (Moriasi et al. 2007), and information criterion statistics (such as Akaike information criterion (AIC), Bayesian information criterion (BIC) and Hannan-Quinn information   (Akaike 1974;Schwarz 1978;Hannan and Quinn 1979). The AIC statistics included the lack of the model's fit on the one hand and the model's unreliability due to the number of model parameters on the other hand (Zhang and Singh 2007). On the other side, HQC statistics is another alternative to AIC and BIC statistics and can exhibit a higher level of consistency in determining the model's performance (Claeskens andHjort 2008, Haggag 2014). Besides this, Mean absolute error (MAE) and Nash-Sutcliffe model efficiency coefficient (NSE) test statistics are also taken to measure model compatibility levels to observed flood characteristics (Willmott and Kenji 2005;Nash and Sutcliffe 1970). The MSE, RMSE and MAE usually define error statistics in the units of interest such that a value closer or zero often indicates consistency or optimum model performance (Singh et al. 2004;Moriasi et al. 2007). Also, when the value of NSE statistics is nearer to 1, indicating higher predictive skill. Similarly, the NSE statistics compared the model performance by comparing the data and residual variance structure. In hydro-climate studies, the NSE test can significantly evaluate model performance (Sevat and Dezetter 1991;Legates and McCabe 1999). Similarly, the vector of unknown statistical parameters of the fitted parametric distributions in our previous study is estimated via the maximum likelihood estimation (MLE) (refer to Tables 2 and 3). The empirical non-exceedance probabilities are estimated using the Gringorten plotting positioning approach (Gringorten 1963). For a better model performance, all the estimated statistics must be lower, except the value of NSE statistics must be nearer to 1. Based on the above investigation, it is found that the Normal KDE is identified as best fitted to define the univariate marginal distribution of both flood variables (refer to Tables 2 and 3). It exhibits the minimum value of MSE, RMSE, MAE, AIC, BIC and HQC statistics and a higher value of NSE statistics closer to 1. Also, the performance of selected Normal KDE dominates and performs much better than parametric GEV and Normal distribution selected in our previous study.
The graphical visual inspections are carried out using the cumulative distribution plots and probability-probability (P-P) plots to cross-check the fitted distributions' data reproducing capabilities (refer to Supplementary Figs. SF2 and SF3). This investigation also supports the results obtained via analytical fitness measures. In conclusion, the best-fitted Normal KDE is selected for both flood-contributing variables and used with a nonparametric copula density estimator in flood dependency analysis.

Joint Dependency Modelling Using Nonparametric Copula Estimators
Section 3.1 and Supplementary Fig. SF1 already confirm a positive and significant (at a 5% significance level and 95% confidence interval) correlation between the flood contributing variables, annual maximum 24-h and the highest storm surge value at the time lag of ± 4 days. The correlation statistics are measured using both parametric via the Pearson correlation (r) coefficient and nonparametric, via Kendall tau ( ) and Spearman coefficient ( ) (Favre et al. 2004;Klein et al. 2011). This investigation confirms that the copula function can be employed to model joint dependence between them. In other words, compounding the combined impact of storm surge and rainfall can significantly impact the flooding effect in our study area due to the observed significant positive correlation. The parametric copula's efficacy could be attributed to underestimating actual joint density and lacking flexibility (refer to the introduction in Sect. 1). It can be demanding for the prior subjective assumption about the copula joint pdf or having a parametric class function. The nonparametric copula joint density can adapt to any type of dependence structure to the given observed flood characteristics without having a distributional assumption.
As discussed in Sect. 2 of the paper, the Beta kernel copula can alleviate the problem of boundary bias. On the other side, Bernstein copula-based joint approximation can provide higher consistency, lack boundary bias problems, and provides a better estimate of joint dependence structure than empirical copula. The best-fitted KDE flood marginals in the previous section are now introduced in the bivariate joint distribution analysis using the Bernstein and Beta kernel copula estimators (referring to Fig. 1). The bandwidth of the fitted Beta kernel estimator is estimated using the rule of thumb (ROT), followed by Eq. (6) (refer to Sect. 2.1.2). In fitting the Bernstein copula estimator with Normal KDE marginals (refer to Sect. 3.1 and Tables 3 and 4), the coefficients are adjusted using the procedure of Weiss and Scheffer (2012).
The model adequacy of both fitted nonparametric joint densities is assessed using the good-of-fit or GOF test statistics, such as MSE, RMSE, MAE and NSE, as shown in Table 4. The Mean absolute error or MAE between theoretical observation (obtained from the fitted 2-D copula estimators) and empirical observations (obtained from the corresponding empirical copula (Deheuvels and Hominal 1979) is also estimated. It is found that the Bernstein copula estimator with Normal KDE flood marginals outperformed the Beta kernel copula with the same marginal structure. The performance of both nonparametric copula densities is also evaluated graphically using the overlapped scatterplots between a set of (sample size N = 1000) generated random flood observations from the fitted model and observed samples-refer to Fig. 2. It is found that both nonparametric copulas adequately capture the observed mutual concurrency of storm surge and rainfall characteristics. The performance of the Bernstein copula estimator is slightly better than the Beta kernel copula in compounding the joint impact of both flood variables. The generated random observations (shown by light blue circles in Fig. 2) coincide with the natural dependence of historical flood characteristics (shown in red) much better. The reliability of the developed Bernstein copula is compared with Beta copula density using the theoretical Kendall's tau ( ) correlation estimated from the generated sample (sample size, N = 1000) of selected models with the empirical Kendall's value obtained from observed samples. From Table 4, it is found that, for the Bernstein copula, the minimum gap is observed between the empirical and theoretical Kendall's tau ( ). This result confirms that the selected model comprehensively regenerates the mutual dependence of historical flood characteristics. Supplementary  Fig. SF4a, b also confirm that Bernstein copula's performance is better in the bivariate flood dependency modelling unless both nonparametric joint density performs well. Figure 3 illustrates the derived bivariate Bernstein copula density of annual maximum 24-h rainfall and maximum storm surge (time interval = ± 4 days). Supplementary  Fig. SF5 illustrates the contour plot derived from 2-D Bernstein copula density fitted to bivariate flood dependence of rainfall and storm surge events.
The performed investigation shows how effectively the nonparametric copula framework captures flood dependence structure between storm surge and rainfall without having in advance PDF in their univariate marginals and copula density. In conclusion, the selected copula is employed further in estimating exceedance probabilities and associated joint and conditional return periods. The case study results clearly show that there is merit in using a nonparametric copula joint distribution framework for west coast compound flooding analyses.

Table 4
Nonparametric copula density estimation in the bivariate flood dependence modelling and goodness-of-fit test statistics Bernstein kernel copula density (highlighted in bold with an asterisk) exhibited a minimum value of MSE, RMSE, MAE and high-value of NSE statistics. Bernstein copula also exhibits a minimum gap between theoretical and empirical Kendall's tau, which indicates that the selected copula captures or regenerates the historical flood dependence structure much better than Beta kernel copula density Nonparametric copula density estimators

Return Periods of Flood Events
Coastal flooding is an extreme multidimensional phenomenon characterized by multiple flood drivers. This multidimensional behaviour limits the applicability of single variable frequency analysis or return periods. The return period defines an average inter-arrival time between two successive occurrences of hydrologic events (Salvadori 2004).
The low-lying coastal regions can experience a devastating situation when either of the targeted compound flooding variables, such as storm surge or rainfall, exceeds a specific threshold value. This hydrologic situation can be analyzed using the OR-joint probability distribution relationship and their associated return period (Shiau 2003;Brunner et al. 2016).
where C(F(Rainfall), F(Storm surge)) is the copula-based bivariate joint cumulative distribution functions (CDFs) derived from the best-fitted nonparametric 2-D copula joint density. F(Rain) and F(Storm surge) are the univariate marginal CDFs derived from the bestfitted Normal KDE.
The situation may be worst when both flood characteristics simultaneously exceed a specific threshold value. In order to handle this hydrologic situation, the analysis of the period for any bivariate flood event is higher than the OR-joint return period. This result further confirms that the occurrence of bivariate flood variables simultaneously is more frequent in the case of OR-joint and less frequent in the case of AND-joint. For instance, refer to Table 5, 100-yr flood events having the following characteristics, rainfall = 147.540 mm and storm surge = 0.337 m, the bivariate OR-and AND-joint return periods is 50.92 years and 2744.23 years. Similarly, for a 200-year flood event with the following properties, rainfall = 149.799 mm, and storm surge = 0.353 m, the OR-and AND-joint return periods are 101.06 years and 9514.74, respectively. The same table shows that the OR return period is lower than the univariate return period.
From the case study outputs, it is clear that the use of single variable probability distributions in analyzing the coastal flooding phenomena is not justified. In other words, flooding events in coastal regions may not be severe if they occur alone but can be devastating if combined. Table 5 shows that a better understanding of coastal compound flooding can be achieved by combining storm surge and rainfall instead of independently visualizing the probabilistic behaviour of these variables. Both OR and AND-joint distribution relationships and their return periods are essential. In practice, the appropriate notation of return will be decided based on the nature of the water resources problem undertaken.
For water infrastructure design, the conditional joint probability distribution relationship is necessary (Zhang and Singh 2007;Sraj et al. 2014;Brunner et al. 2016 and references therein). The nonparametric Bernstein copula joint density is used further to estimate the conditional joint return periods. Two different conditional joint distribution cases are considered for the bivariate compound events.
Case 1: The conditional return periods of rainfall given different percentile values of storm surge, i.e., Storm surge > y(threshold) is given by, where 'y' is the threshold value when considering storm surge as a conditioning variable, which is defined for different percentile values such as 25 th , 50 th , 75 th , 90 th , 95 th , and 99 th . Case 2: The conditional return periods of rainfall given different percentile values of storm surge events, i.e., Storm surge ≤ y(threshold) is estimated by  The conditional return periods for both cases (Case 1 & Case 2) are estimated using Eqs. (17) and (18), where we considered different percentile values, for instance, 25th, 50th, 75th, 90th, 95th and 99th, of storm surge events as a conditioning variable, refer to Figs. 4 and 5. It is found that the conditional return period of rainfall events decreases with an increase in the percentile value of storm surge, refer to Fig. 4. For instance, in a flood event with rainfall = 146 mm, the return period is 77.29 years (when considering the 25th percentile value of storm surge ≤ 0.187m ), 73.56 years (at the 50th percentile value of Storm surge ≤ 0.301m ), and 72.58 years (when considering the 90th percentile of Storm surge ≤ 0.556m ). In other words, higher return periods are obtained at a low percentile value of storm surge than higher percentile values for the same rainfall events. Figure 5 (conditional joint Case 1) shows that higher return periods are obtained at higher percentile values of storm surge events than lower return periods for the same rainfall event. For instance, in a flood event with rainfall = 100.8 mm, the return period is 180.39 years (when considering the 25th percentile of Storm surge > 0.187m) and 4139.09 years (when considering the 50th percentile of Storm surge > 0.301m) . After considering both joint and conditional return periods, it is concluded that the knowledge of both return periods is essential for water resources management decision-making. For example, in most hydraulic or engineering-based flood defence infrastructure designs, it could be required to consider an extreme hydrologic situation by giving importance to one design variable over another. This requirement can be easily satisfied using a conditional joint probability relationship.

Failure Probability in the Evaluation of Compound Flood Risk
The traditional definition of return periods, which considers the average inter-arrival time between two successive occurrences of extreme events, would be ineffective in characterizing the risk of potential flooding hazards during the entire project design lifetime of the infrastructure (Salvadori et al. 2011;Huang and Chen 2015). In this study, the developed nonparametric joint density is applied further in the estimation of failure probability (FP) statistics, which is considered a practical approach in hydrologic risk assessments by Serinaldi (2015) and Salvadori et al. (2016). For the bivariate flood hazard scenarios, if Rain 1 , Rain 2 … … Rain T and Storm surge 1 , Storm surge 2 … … , Storm surge T are the paired flood observations, and T is the arbitrary project design lifetime; the FP statistics are estimated for different joint cases: Case 1: In the OR-joint distribution case is given by And Case 2: In the AND-joint distribution is given by The failure probability (FP) statistics are estimated for bivariate OR flood hazard scenarios (indicated by red colour in Fig. 6a-g) using Eq. (19). The same Figure also shows the comparison with the value of univariate hazard scenarios. These figures illustrate the variation in the bivariate (and univariate) hydrologic risk or hazard according to the variation in the service design lifetime (different return periods, for instance, 500 years, 200 years, 100 years, 50 years, 20 years, 10 years and 5 years). It is found that bivariate events produce higher FP statistics when compounding the collective impact of rainfall and storm surge than when considering these flood drivers individually. For example, at the return period (RP = 200 years), the value of FP statistics for the bivariate and univariate hazard scenarios is 0.328 and 0.181 (when the design lifetime is 100 years). Nevertheless, when the return periods are reduced to (RP = 100 years), the value of FP statistics for bivariate and univariate scenarios are 0.630 and 0.394. These results show that with the increase in the value of return periods, the bivariate (and univariate) flood risk decreases, at the same time, bivariate (and univariate) hydrologic risk increases with an increase in the value of the service design lifetime of the infrastructure. Thus, ignoring the compound effect of rainfall and storm surge can result in an underestimation of FP in the coastal flood risk.
This study also examined the variation of bivariate compound flood risk with changes in the annual maximum 24-h rainfall events for different synthetic maximum storm surge It is found that for the higher return periods of the storm surge value, the bivariate hydrologic risk (FP statistics) decreases. At the same time, the FP statistics increase with an increase in the service design lifetime of the infrastructure under consideration. It is also found that when the rainfall is above 80 mm, the bivariate flood risk decreases rapidly. When the rainfall exceeds 140 mm, the flood risk value would attain a minimum value and decreases slowly.
In conclusion, the simultaneous accounting of the joint impact of rainfall and storm surge enables a better understanding of the extreme compound phenomena and provides needed information for coastal flood risk assessments. All the above analytical and graphical investigations are vital for effective and sustainable flood management strategies in coastal regions.

Results and Discussions
Compounding the joint probability distribution of storm surge and rainfall events is essential in assessing the hydrologic risk of coastal flooding. The complex interplay between them can exacerbate flooding events, as both drivers are usually interconnected through a common forcing mechanism (tropical or extra-tropical cyclone events). The coastal inundation can be devastating if the storm surge and rainfall occur in close succession or simultaneously. Therefore, the potential for flooding in the case of their joint occurrence is higher than considering each driver individually. The copula-based methodology is frequently used under the parametric distribution settings in the current joint distribution modelling of extreme hydro-climatological events. The parametric approach relies on prior knowledge of 'marginals' distribution assumptions (PDFs) and joint copula density in the parametric fitting procedure. Limited literature highlighted the significance of semiparametric-based joint simulation, introducing nonparametric marginal PDFs co-joined using a parametric class copula density in the parametric fitting process. A parametric fitting approach may contribute to underestimating the actual joint density. Also, it may be misleading or result in spurious inference if the underlying distribution assumption is wrong. No evidence from the existing literature points to modelling either univariate marginal distributions or copula dependence structure by employing any fixed or predefined structure.
In conclusion, parametric and semiparametric copula distribution approaches can have statistical constraints that can not be neglected. To alleviate those limitations, the nonparametric density framework is proposed that can adjust any dependence structure to the given observed flood characteristics without distributional assumption. This paper presents a nonparametric approach to the copula joint estimation by incorporating and comparing the performance of the Bernstein copula and Beta kernel copula estimator in compounding the joint impact of 46 years of storm surge and rainfall data for the west coast of Canada. The main findings of the present study are given below: 1. The paired compound flood observations are obtained by observing the annual maximum 24-h rainfall events and the highest value of storm surge events within a time lag of ± 0 (simultaneous occurrence of both events), ± 1 days, ± 2 days, ± 3 days, ± 4 days and ± 5 days. At first, the dependence between rainfall and the maximum value of storm surge extracted under different time lags is examined. It is found that the strongest correlation is observed at ± 4 days from the date of annual maximum 24-h rainfall events. Finally, this study investigates the dependency between annual maximum 24-h rainfall and maximum storm surge (Time interval = ± 4 days) in developing a joint probability distribution framework nonparametrically. 2. The maximum storm surge (Time interval = ± 4 days) observations exhibited monotonic time trend behaviour with zero serial correlation (or autocorrelation) observed in our previous study. Therefore, a pre-whitening procedure is adopted to detrend given the time series before introducing them into a univariate or multivariate distribution framework. The annual maximum 24-h rainfall series exhibited zero time trend behaviour with no autocorrelation. 3. The nonparametric 1-D kernel density estimations (KDE) are introduced in the modelling of the univariate marginal distribution. Their performance is tested and compared using different analytical and graphical goodness-of-fit (GOF) test statistics. Results confirm that the Normal KDE best describes univariate marginals of both targeted flood variables, annual maximum 24-h rainfall and maximum storm surge (Time interval = ± 4 days). The performance of the best-fitted parametric probability distribution obtained from our recent study (Shahid and Simonovic 2022) is also included in the comparison. It is observed that Normal KDE captures the marginal behaviour of both flood driving variables much better than GEV and Normal parametric function. The obtained results further reveal: how effectively one can approximate marginal behaviour without knowing the prior distribution assumption of marginal PDFs. In other words, the adequacy of the nonparametric approach is much more practical, especially when hydrologic characteristics exhibit unsymmetrical (or multimodal) distribution behaviour. 4. The nonparametric Bernstein copula and Beta kernel copula density are employed separately in the joint density approximation with the Normal KDE, which defines marginal behaviour of annual maximum 24-h rainfall and maximum storm surge (Time interval = ± 4 days) events For instance, a series of analytical model compatibility tests are employed via MSE, RMSE, MAE and NSE statistics. It is found that the Bernstein copula estimator's performance with Normal KDE margins describes flood dependence structure better than the Beta kernel copula joint density with the same marginal structure. The reliability and model adequacy of the developed Bernstein copula density is further examined and compared analytically with Beta copula density, comparing the theoretical Kendall's tau ( ) correlation measures from the simulated random sample (sample size, N = 1000) with the empirical Kendall's tau value obtained from the observed samples. It is found that the Bernstein copula framework regenerates the joint dependence structure of historical flood characteristics much better than the Beta kernel copula density. Besides this, performance between both developed nonparametric copula densities is investigated through visual inspection, using the scatterplots between a set of (sample size N = 1000) simulated flood samples from the fitted copula density and observed data. It is confirmed that the simulated flood characteristics using the Bernstein copula density better match the historical flood characteristics' natural dependence. 5. In conclusion, both nonparametric joint density frameworks adequately capture the observed mutual dependence of flood events, with the performance of Bernstein density being slightly better. This investigation reveals that the nonparametric approach can provide better outcomes without subjective distributional assumptions about joint copula density and their univariate marginal PDFs. There is much scope in incorporating this nonparametric framework in modelling compound extremes such as floods, droughts and rainfall characteristics worldwide. 6. Finally, the selected Bernstein copula with a normal KDE margin is employed to estimate joint and conditional joint return periods. First, the joint return periods for ORand AND-joint cases are considered. It is found that the OR-joint return period for any bivariate flood event is lower than the AND-joint return period. In other words, the occurrence of bivariate flood hazard characteristics simultaneously is less frequent in the case of AND-joint and more frequent in the case of OR-joint. 7. The conditional joint distribution relationships for bivariate events are also examined. It is found that the conditional return period of rainfall events decreases with an increase in the percentile value of storm surge. Higher return periods are obtained at a low percentile value of storm surge than higher percentile values for the same rainfall event.
Both joint and conditional return periods are essential for water management practice. In the design of hydraulic infrastructure, conditional return periods often play a crucial role in assessing hydrologic risk. 8. The selected bivariate joint density function is employed further in the estimation of failure probability or FP for the bivariate OR hazard scenario. The FP statistics are used to assess the risk of failure associated with the bivariate return periods due to the joint occurrence or combined effect of storm surge and rainfall events. The traditional definition of return periods, for instance, OR-or AND-joint cases, would be ineffective in demonstrating the risk of potential food events occurring during the entire project design lifetime of the infrastructure. The variation in the FP statistics is observed by changes in the service design lifetime under different return periods. It is observed that bivariate flood hazard scenarios produce a higher FP than considering each flood variable independently. Also, the bivariate (and univariate) hydrologic risk increases with an increase in the design lifetime of the infrastructure and decreases with an increase in return periods. In conclusion, it could be an underestimation of failure probability in assessing coastal flood risk if the compound effect of storm surge and rainfall events is ignored. 9. Similarly, the variation of bivariate flood risk with changes in rainfall value in differently designed storm surges (i.e., 200, 100, 50, 20 and 10 years return periods) is also examined. It is found that the failure probability (or bivariate hydrologic risk) decreases when considering a designed storm surge at a higher return period. At the same instant, FP statistics increase with an increase in the service design lifetime of the infrastructure under consideration. Overall, it must be revealed that the joint simulation of rainfall and storm surge facilitates a better understanding of the coastal region's critical hydrological behaviour during the flooding.
Using the vine copula methodology, the nonparametric bivariate copula density estimator incorporated in the present study can be projected into a higher dimensional or trivariate distribution. Shahid and Simonovic (2022) developed trivariate distribution under parametric settings in the recent study. It is confirmed that compared to bivariate analysis, trivariate flood events explain the risk of coastal flooding much more comprehensively by integrating river discharge observations' impact in conjunction with rainfall and storm surge events. This study also confirms that neglecting the trivariate joint probability analysis can significantly underestimate the failure probability or hydrologic risk. Our present study demonstrated that a nonparametric framework could better approximate joint density without having a particular form of univariate marginals and copula joint density. Future work will consider introducing nonparametric copula methodology in the trivariate compound flood distribution analysis using the concept of paircopula constructions.