A Statistical Analysis of the Occurrences of Critical Waves and Water Levels for the Management of the Operativity of the MoSE System in the Venice Lagoon

The particular structure and configuration of the Venice lagoon represents a paramount case study concerning coastal flooding which affects natural, historical/cultural properties, together with industrial, commercial, economical and port activities. In order to defend Venice (and other sites) within the lagoon from severe floods, the Italian Government promoted the construction of a complex hydraulic/maritime system, including a movable storm surge barrier named Experimental Electromechanical Module (MoSE), to be activated when specific water levels occur. When the MoSE barriers are raised, the only access to the lagoon for commercial and cruise ships is represented by the Malamocco lock gate, provided that suitable safety conditions (involving the significant wave height) are satisfied. In addition, the Italian Government has recently established that, in the near future, large ships will always have to enter/exit the lagoon only through the Malamocco entrance. In turn, the navigation within the Venice lagoon is (will be) controlled by the combined MoSE-Malamocco system, ruled by both univariate and bivariate paradigms/guidelines. As a novelty, in the present work, for the first time, the statistics of significant wave heights and water levels in the Venice lagoon (both univariate and bivariate ones) are investigated: in particular, these variables turn out to be dependent, and their joint occurrence (statistically modeled via Copulas) can determine the stop of ship navigation, yielding significant economic losses. Here, univariate and bivariate Return Periods and Failure Probabilities are used to thoroughly model the statistical behavior of significant wave heights and water levels, in order to provide useful quantitative indications for the management of the tricky hydraulic, maritime and economical system of the Venice lagoon.


Introduction
The Venice lagoon is located in the northern Adriatic Sea and is characterized by a surface area of about 550 km 2 , a length of about 52 km, and a width ranging from 8 to 14 km (Fig. 1). The lagoon's surface area is composed by 8% of land, including the city of Venice and other smaller islands, and by 92% of dredged channels, mud flats and salt marshes. The lagoon is separated from the Adriatic Sea by a strip of land made up of Pellestrina, Malamocco, and Lido islands, and presents three entrances: Chioggia, Malamocco, and Lido. Beside famous historical and residential sites, several of which are UNESCO World Heritage sites (http://www.veniceandlagoon.net/web/en/), an international airport, a cruise port in Venice, and a large industrial area with an important commercial port (Marghera) are located in the lagoon.
Venice lagoon is a fragile environment with natural, cultural and historical properties often subjected to flooding; in addition, there are industrial, commercial, touristic (e.g., cruise market) and port activities/economy to be preserved. Over the past 60 years, the Venice lagoon has experienced several flooding events. On November 4th, 1966, the Venice lagoon suffered from the largest observed water level (WL) with ? 1.94 m with respect to the ''Punta della Salute'' Mareographic Zero (MZPS). 1 As a further recent example (Fig. 2), the ''Acqua Alta 2 '' event that happened on November 13th, 2019 was characterized by a maximum water level of ? 1.87 m MZPS at ''Punta della Salute'', and occurred with a delay of about one hour after the peak of ? 1.82 m observed at the Italian National Research Council (CNR) platform, also named ''Acqua Alta'' (Fig. 1).
Storm surge events in the Venice lagoon are more frequent and characterized by higher water levels than in other parts of the Mediterranean basin due to the presence of shallow waters with a large continental shelf, and to the effect of the Sirocco and Bora winds (Robinson et al. 1973;Pirazzoli 2002;Trigo and Davis 2002;Lionello 2005). In addition, flooding events are exacerbated by the lowering of ground surface due to local anthropogenic subsidence (e.g., extraction of gas from the subsoil), sediment compaction, glacial isostatic adjustment (small) and long-term tectonic vertical motion.
To defend the Venice lagoon from severe floods, the Italian Government promoted the construction of a complex hydraulic/maritime system which includes a movable storm surge barrier named Experimental Electromechanical Module (MoSE), to be closed during specific sea state conditions (see later) to protect the low areas behind it against flooding (Fig. 3). The MoSE barrier is composed of bottom-hinged floating gates able to temporarily isolate the Venice lagoon from the Adriatic Sea during critical storm surge events, thus protecting the low areas of Venice against flooding, ensuring acceptable safeguarding water levels in Venice. Specifically, together with other measures (e.g., coastal reinforcement, raising of quaysides, lock gates for ships transit during the surge barrier closure, rubble mound breakwaters), the MoSE is expected to protect Venice from floods of up to 3.0 m MZPS, although UNESCO notices that this threshold is likely to be overcome more frequently and for longer periods of time in the future (UNESCO, 2020).
The raise of the MoSE barriers is regulated by a complex normative, based on so-called ''risk classes' ' (Eprim et al. 2005;Cavallaro et al. 2017). In the present work we consider, as a reference, the ''B1 CV'' class criterion, the most extreme one concerning the so-called ''frequent events'' (i.e., with a Return Period less than 10 years, viz. the ''normal'' situation), corresponding to a Water Level of 0.75 m measured at the ''Punta della Salute/Canal Grande'' tide gauge, which would trigger the rise of the MoSE barriers. A further access to the lagoon is represented by the Malamocco lock gate (Figs. 1, 3). However, it can be used by ships to enter/exit the lagoon only if a port pilot is able to get on/off board, and this is possible only if the Significant Wave Height (H s ) measured at the CNR platform ''Acqua Alta'' ( Fig. 1) is less than 2.7 m (Deltares 2016). As a consequence, the analysis of the conditions that may prevent the navigation within the lagoon involves both univariate and bivariate paradigms, and require different methodologies. More precisely: • it will be a univariate approach if the interest is in the rise of the MoSE barriers only, or in the impossibility of ship transit thorough (in/out) the Malamocco lock gate due to the lack of a port pilot on board; • it will be a bivariate approach if the interest is in the ''sealing'' of the whole lagoon to the maritime traffic (i.e., both the MoSE barriers raised and the Malamocco lock gate useless due to the impossibility of a port pilot to get on/off board).
As explained below, the latter analysis will naturally involve a bivariate ''AND'' Hazard Scenario approach, corresponding to sea state conditions such that both the significant wave height and the water level exceed the respective thresholds mentioned above, thus causing the closure of the lagoon to the maritime traffic. Concerning the Malamocco lock gate, closures due to accidental disruptions or scheduled maintenance activities will not be considered. Incidentally, notice that the Italian Government has recently established that, in the near future, cruise (or other large) ships will always have to enter/exit the lagoon only through the Malamocco entrance 3 : in turn, the (statistical) study of its possible ''disruption'' (univariate and/ or bivariate) has become more and more fundamental. It is interesting to note that recent research has focused on the analysis of Compound Events (CE), i.e., multivariate occurrences in which the contributing variables may not be extreme themselves, but their joint instances may nevertheless cause severe impacts: traditional univariate statistical analyses cannot give information regarding the multivariate nature of CE's and the hazard/risks associated with them (Schölzel and Friederichs 2008;Bevacqua et al. 2017). Attention to CE's has also increased at an international policy-makers level. In fact, the Intergovernmental Panel on Climate Change (IPCC) notices that, in the coastal and inter-tidal zones, tides, waves and tidal surge events can occur simultaneously, leading to increased flood severity, duration or frequency (IPCC 2012). Also at a European level, specific EU bottom-up networks have been created to carry on research activities aimed at gaining a better understanding of CE's and at increasing awareness in coastal communities about the inherent modelling challenges associated with CE's (e.g., the ''DAMOCLES'' European COST Action #CA17109). In a broad perspective, the present work deals with Compound Events (via the Copula approach outlined below). Indeed, there are clear physical/economical motivations for investigating the joint dynamics of wave heights and water levels, as they can generate Compound Events critically impacting the Venice lagoon industrial, economical and port activities.
In the present work, the following topic concerning the operativity of the Venice lagoon port and industrial activities is addressed: ''To determine, in a probabilistically well-based and consistent way, the Return Periods and the Failure Probabilities of the occurrences of wave heights and water levels that could negatively affect the commercial activities in the Venice lagoon.'' Note that, here Failure Probabilities are intended in a mathematical sense, i.e., as the probability of exceeding a limit sea state within a given reference time period. As explained below, a Hazard Scenario simply represents the situation in which the occurrences of wave heights and/or water levels could negatively impact industrial and port activities in the lagoon, and the Failure Probability is simply the probability For this purpose, recent advances in Mathematics show how Copulas (Nelsen 2006; represent an efficient tool to statistically investigate the joint behavior of dependent variables. A thorough copula-based analysis (such as the one carried out in the present work) may provide valuable estimates of the marginal and joint distributions useful for (multivariate) hazard assessment. For a theoretical introduction to copulas see Nelsen (2006), Joe (2014) and Durante and Sempi (2015); for a practical approach see , Genest and Favre (2007), De Michele (2007), De Michele et al. (2007) and Orcel et al. (2021).
Copulas have been already successfully applied in Maritime Engineering. In particular, Salvadori et al. (2014Salvadori et al. ( , 2015Salvadori et al. ( , 2020 and references therein, proposed a multivariate approach to the assessment of the hazard based on copulas that can be applied to the design of coastal and offshore engineering infrastructures. The copula-based approach may be alternative and complementary to those presented in several papers available in Literature and is most suitable for modeling Compound Events involving wave heights and water levels. The (statistical) analyses mentioned above, and carried out in the present work, represent a novelty: actually, to the best of our knowledge, no similar studies are present in Literature concerning the Venice lagoon.

Materials
In this section, some basic information about the data used is presented.

Wave Height Data Set
The available significant wave height (H s ) data have been hourly observed at the CNR offshore platform ''Acqua Alta'' ( Fig. 1) from 16/10/1987-12:00 to 31/12/ 2014-23:30. The research platform was installed on January 1970 off the lagoon and is situated in 16 m deep water (ISMAR 2020).

Water Level Data Set
The water level (WL) data set, covering the years 1987-2014, is part of the historical archive of the ''Centro Previsioni e Segnalazioni Maree'' (Center for Tide Reports and Forecasts) of the city of Venice. 4 Data were measured, at an hourly frequency, with respect to MZPS.

Selection of the Sea Storms
As explained above, historical available data sets provide information about the wave height and the water level. In order to avoid serial dependencies and correlations, in the present work an event-based approach is used by adopting a maritime version of the Run Method outlined in Yevjevich (1967). Practically, a sea storm starts when the significant wave height H s crosses upwards a given threshold H 0 , and ends when H s persists below H 0 for at least I 0 hours. Figure 4 shows, as an example, a standard Equivalent Triangular Wave Model used to identify sea storms (Boccotti 2000).
According to Davies et al. (2017) and Martzikos et al. (2021), here the threshold H 0 is taken as the 95th percentile of the empirical distribution of wave heights. Also, a minimum inter-storm temporal distance I 0 = 48 h is used to temporally separate successive sea storms: in fact, as indicated in Boccotti (2000) and Inghilesi (2000), this guarantees the physical meteo/marine independence of the events of interest. Once over-threshold wave heights have been identified, the associated sea storm is characterized by means of the maximum value of H s observed during the event, as well as the contemporaneous observed water level. In the present case, it turns out that a threshold H 0 = 1.5 m should be used. The corresponding size of the sample of sea storms extracted, i.e., N = 520, is large enough for carrying out valuable statistical analyses. The extracted pairs (H s , WL)'s are presented in Figs. 5 and 6.

Methods
In the following, statistical information is extracted from sea storm data sets in terms of Return Periods (RP) and Failure Probabilities (FP), both in a univariate and in a bivariate framework. In particular, the analyses carried out in the present work exploit the notion of Hazard Scenario defined as follows (Salvadori et al. 2016).
Þbe a d-dimensional vector modeling the phenomenon of interest. A Hazard Scenario (hereinafter, HS) of probability level a 2 0; 1 ð Þ is any Upper Set S ¼ S a R d such that the following relation holds: where S is an Upper Set if x 2 S and y ! x component-wise imply that also y belongs to S.
In practice, here a HS is simply a set containing occurrences x's that may negatively affect the commercial activities in the Venice lagoon. By the very definition of Upper Set, also y could be considered as a dangerous occurrence, since it ''exceeds'' x in a component-wise sense.
In the univariate case, the HS of interest here is fX ! x Ã g, where a single critical threshold x Ã is sufficient to individuate a hazardous region on the Real line. Instead, in the multivariate case, a number of different HS's can be constructed (Salvadori et al. 2014(Salvadori et al. , 2016. In the following, an ''AND'' approach is used, for which the HS is given by fX [ x Ã AND Y [ y Ã g: this HS contains bivariate (H s , WL)'s occurrences such that both H s and WL exceed the respective critical thresholds mentioned above, corresponding to sea state occurrences that could prevent the navigation within the lagoon.

Methods: Return Period
In the following, the general notion of Return Period introduced in Salvadori et al. (2004Salvadori et al. ( , 2011) is adopted.
Definition 2: (Return Period). Given an event E, the associated RP T E is.
where l is the mean inter-arrival time between successive events E's in the considered time series.
Note that l provides the time unit (e.g., years) in which T E should be expressed, and that a Hazard Scenario corresponds to a specific event E: in turn, it is possible to speak of Return Periods of HS's. The above definition of T E is quite a general one, and can be used both for univariate and multivariate events. More specifically: • In case a univariate approach only based on the Wave Height X is adopted, the corresponding RP is ' threshold for X (e.g., the one prescribed by the Malamocco lock gate guidelines and operating rules).
• In case a univariate approach only based on the Water Level Y is adopted, the corresponding RP is Here the HS is E ¼ y Ã ; 1 ð Þ, where y Ã is a ''critical'' threshold for Y (e.g., the one prescribed by the MoSE guidelines and operating rules).
• In case a bivariate approach is adopted, based on X and Y, the ''AND'' RP T AND of a bivariate event can be computed as: where F XY is the joint distribution of the random vector X; Y ð Þ. Here the HS is E ¼ fX [ x Ã AND Y [ y Ã g. Using the Theory of Copulas, according to Sklar's Theorem representation (Nelsen, 2006;, the joint distribution function F XY can be written as where C XY : 0; 1 ½ Â 0; 1 ½ ! 0; 1 ½ is the bivariate copula of X; Y ð Þ, i.e., the dependence structure of the random vector X; Y ð Þ, and F X ; F Y are the corresponding marginal laws. In case X and Y were independent, then the joint distribution would become Note that Eq. (4) can be extended to a general multi-dimensional case.

Methods: Failure Probability
The computation of Return Periods represents a traditional source of information for design and hazard assessment of maritime structures (as well as of terrestrial ones). For the same purposes, a further quantity of interest is represented by the notion of Failure Probability: this represents an alternative and complementary way to achieve useful information from the available data. As already explained above, the Failure Probability does not represent the probability of a ''structural'' collapse, rather it is simply the probability of observing an occurrence of the phenomenon of interest in a prescribed Hazard Scenario at least once in a given temporal horizon.
In the present framework, let T [ 0 be an arbitrary temporal horizon: for the sake of simplicity, and without loss of generality, T is measured in years. Given T, and using as Hazard Scenario the ''AND'' one introduced above, the corresponding FP's p T can be computed as in Salvadori et al. (2016) and references therein: where C XY is the bivariate copula of the pair (H s , WL). Note that univariate FP's can be computed in a similar way, simply using as a second term in the formula the probability that the variable considered does not belong to the HS of interest.

Results
The main statistical features of the observed H s and WL data characterizing the sea storms of interest are shown in Table 1.

Univariate fits
As a result, the Weibull distribution well fits the H s data, and the GEV law the WL ones (Fig. 7): here, the Maximum Likelihood technique is used to estimate the parameters of the distributions. The p values of the Goodness-of-Fit (GoF) tests of Kolmogorov-Smirnov type are estimated via suitable Monte Carlo procedures, since the distributions are fitted on the data (Stephens 1974). The null hypothesis is that the observed values of Hs and WL, respectively, are drawn from the Weibull and the GEV distributions. Since the corresponding p-values are all larger than 10%, the null hypotheses cannot be rejected at standard levels (i.e., 1%, 5%, and 10%), and thus the fitted distributions can be accepted for modeling the statistics of Hs and WL. Actually, the fits look excellent also from a visual point of view.

Copula fit
The first step in any multivariate analysis consists in evaluating whether, and to what extent, two variables are dependent. Traditionally, this is carried out by estimating the (non-parametric) Kendall s and the Spearman q statistics, as well as by performing the corresponding independence tests (Nelsen 2006;). Should these parameters take a value statistically significantly different from zero, then the variables would be dependent. Table 2 shows that H s and WL are statistically dependent (more precisely, concordant, for s and q are positive), since the p-values of the independence tests are negligible: in turn, a multivariate (copula) approach is mandatory, in order to correctly statistically model the joint behavior of the pairs (H s , WL)'s in the Venice lagoon. The available pairs (H s , WL)'s presented in Fig. 6 cannot provide any information about the dependence structure (i.e., the Copula) that rules the joint random behavior of the variables. A correct way to get an idea of the copula at play (Nelsen 2006;) is to plot the so-called pseudo-observations, i.e., the ranks of the data normalized in the unit square (the domain of bivariate copulas): these are shown in Fig. 8.
The bivariate dependence analysis has been carried out testing a dozen of copula families usually adopted in the applications (as well as the corresponding Survival versions). These are used to fit the dependence structure of the pair (H s , WL): namely, amounting to about twenty different dependence structures. In order to check the admissibility of a copula, a GoF test of Cramér-von Mises type is used, being more robust than the Kolmogorov-Smirnov one in the multivariate case ): here the p-values are corrected for multiple comparisons (Hochberg and Tamhane 1987). Then, among the dependence structures that pass the GoF test, a ''best copula'' is chosen via a traditional corrected Akaike Information Criterion (Akaike 1974).
As a result, the Husler-Reiss family statistically fits the (H s , WL)'s pairs, as shown in Fig. 8. Furthermore, it is interesting to note that the Husler-Reiss copula is an Extreme Value one, and thus offers a possible paradigm for describing the extreme bivariate behavior of a phenomenon: in turn, the statistical model constructed in the present work may provide useful hazard indications concerning the instances of extreme bivariate (H s , WL)'s occurrences taking place at the lagoon, potentially impacting economical activities. In addition, notice that the Independence copula shown in Fig. 8 is quite different from the empirical and the fitted ones, entailing that H s and WL are definitely dependent.

Calculation of return periods
Once a critical bivariate reference occurrence x Ã ; y Ã ð Þ has been fixed (which defines the Hazard Scenario of interest), it is possible to estimate the associated RP's, both univariate and bivariate ones, by using Eqs. (2)-(4). As prescribed by the guidelines and operating rules already mentioned above, in the following, x Ã ; y Ã ð Þ will be the vector (H s = 2.7 m, WL = 0.75 m), corresponding to critical sea state conditions possibly entailing economical losses. The fits shown in previous Sections provide the estimates of the probabilities of interest, i.e., Þ : these are reported in Table 3. Table 4 shows the estimates of the RP's of closures, both univariate and bivariate ones. The label ''AND'' refers to the sealing of the Venice lagoon, the label ''Gate'' refers to the closure of the Malamocco lock gate due to the impossibility of a port pilot to get on/off board, and the label ''MoSE'' refers to the raise of the MoSE barriers. Figure 9 shows the results of a Monte Carlo investigation concerning the uncertainties associated with the RP's estimates: the procedure is simple and is briefly outlined in the following. (1) A statistical model is fitted over the available data: here it is the bivariate distribution of the pair (H s , WL), as described above.
(2) Synthetic independent (H s , WL)'s samples, having the same size as the observed data, are generated from the model: here, K = 1000 runs were used. (3) Each sample yields estimates of the RP's of interest (as well as of the Failure Probabilities-see below): thus, a set of K independent RP estimates is made available. (4) Finally, using such synthetic estimates, boxplots of the RP distribution can be drawn (see Fig. 9), as well as suitable non-parametric Confidence Intervals, of prescribed level a 2 ð0; 1Þ, by exploiting the Order Statistics associated with the Monte Carlo samples (e.g., in the present case, the 25th and the 975th Order Statistics provide a 95% approximate Confidence Interval).
As a result, on average, the raise of the MoSE barriers is expected to be more frequent than the no-transit circumstances through the Malamocco lock gate (about every 100 days against 130). Also, both the univariate closure  RP's turn out to be smaller than the ''AND'' one. To the best of our knowledge, such quantitative outcomes are not available in Literature, and provide new (statistical) information concerning possible disruptions of the economic activities in the Venice lagoon. All of these practical estimates may be of great interest for the management and/ or scheduling of the maritime traffic, also considering the recent directives of the Italian Government previously mentioned.

Calculation of failure probabilities
Given the critical pair x Ã ; y Ã ð Þ, the associated FP's can be computed by using Eq. (5): here, a temporal horizon T = 50 years is used. Table 5 shows the estimates of the FP's of closures, both univariate and bivariate ones, for selected temporal horizons.
As a quantitative result, fully consistent and coherent with the Return Period analysis, the ''sealing'' of the Venice lagoon (i.e., the ''AND'' case) is always less probable than the occasional partial closures due to the rise of the MoSE barriers or to the non-transit through the Malamocco lock gate. Figure 10 shows the results of a Monte Carlo investigation concerning the uncertainties associated with the FP's estimates over the whole temporal horizon T = 50 years: again, K = 1000 independent Monte Carlo runs have been used (see the explanation outlined in the previous section). Here, the confidence bands are obtained by joining 500 point-wise non-parametric interval estimates over the period 1-50 years, thus approximating a continuous stripe across the temporal range of interest. Overall, the confidence bands are narrow, and the Table 3 Estimates of the probabilities u*, v*, and c*   univariate FP's quickly converge to one, even for small T's. Evidently, for any given T, the probability of partial closures (univariate perspective) is much larger than a global ''sealing'' of the Venice lagoon. Again, these quantitative estimates may provide useful information concerning occurrences that could negatively affect the commercial activities in the lagoon.

Conclusions
The present work provides quantitative information about sea state conditions (both under univariate and bivariate perspectives) that could negatively impact the navigation within the Venice lagoon, potentially yielding huge economical losses. In particular, the following conclusions can be drawn.
• On average, the raise of the MoSE barriers is expected to be more frequent than the no-transit condition through the Malamocco lock gate (about every 100 days against 130). In addition, according to the the present bivariate analysis, univariate closure RP's are smaller than the ''AND'' ones. This latter quantitative calculation is only possible via a multivariate paradigm/approach. • For any given temporal horizon T, the probability of partial closures (univariate perspective) is much larger than a global ''sealing'' of the Venice lagoon. For instance, for T = 20 years, the estimated (closure) Failure Probabilities in the ''AND'' case is 71%, for the Malamocco lock gate to be unusable is 95.6% and for the MoSE to be raised is 98.7% (see Table 5). • The rates of partial and total closure of the Venice lagoon to the maritime traffic have been estimated (see Table 4), providing quantitative information about what should be expected concerning disruptions of the commercial activities in the lagoon.
The results obtained estimate (and quantitatively show) the non-negligible probability that the Venice lagoon could be practically isolated because both the MoSE barriers are raised (WL [ 0.75 m) and the Malamocco lock gate is unusable because the port pilot cannot get on/off board of incoming/outcoming ships (H s [ 2.7 m). For the first time in Literature, the methodology outlined in the present paper is able to quantify such probabilities concerning the Venice lagoon case. This can help planning, in a more efficient and optimized way, all the activities related to the harbor/touristic ports, in turn improving the management of the operativity of the Venice lagoon.
Moreover, in a more general perspective, the novel proposed approach, developed to contribute to the engineering management of the complex system formed by the MoSE barriers and the Malamocco lock gate in the Venice lagoon, can straightforwardly be extended to the study of the operativity and the management of other tricky engineering systems.