The method described herein is applicable to an ensemble of extremal data records that may be produced by different mechanisms from different geographical regions as it involves only the maximum value of each record. In addition, it applies whether the data be processed by the BM or the POT method. For derivation, we will concentrate on the POT method. If the occurrence of extremes exceeding a high threshold follows a Poisson process, the relationship between return period \(R\) and average recurrence interval \(A\) can be shown to be (Wang and Holmes 2020)

$$1/R=1-{e}^{-1/A} \left(1\right)$$

in which \(R\) is conventionally defined as the inverse of the probability of exceedance. Assuming that the probability distribution of annual extreme \(Y\) is \({F}_{Y}\left(y\right)=P\{Y\le y\}\), then the probability of \(Y\) less than \({y}_{a}\), the \(y\) value at \(a\)-year ARI, is

$${F}_{Y}\left(y\right)=P\{Y\le {y}_{a}\}={e}^{-1/a} \left(2\right)$$

Let \({Y}_{n}\) be the extreme value of \(Y\) in an \(n\)-year time interval, then

$$P\{{Y}_{n}\le {y}_{a}\}={\left[P\{Y\le {y}_{a}\}\right]}^{n}={e}^{-n/a} \left(3\right)$$

If \(Y\) is continuous, for every \(y\) in \(Y\), there is a unique \(a\) in \(A\), a bijective function \(g:Y\to A\) exists that maps \(Y\) to \(A\). Therefore \(A\) is also a random variable. With bijection, the function \(g\left(Y\right)\) first maps \(Y\) through the function \({F}_{Y}\) to the uniform random variable \(U\) on [0, 1], then maps \(U\) to \(A\) by

$$A=-{\left(\text{l}\text{n}U\right)}^{-1} \left(4\right)$$

If \({F}_{A}\left(a\right)\) is the distribution function of \(A\), we have

$${F}_{A}\left(a\right)=P\{A\le a\}={e}^{-1/a} \left(5\right)$$

which is an inverted exponential distribution (Lin, Duran, and Lewis 1989), the same as the right-hand side of Eq. 2. This is not surprising as both \(Y\) and \(A\) are mapped bijectively to \(U\) through \({F}_{Y}\left(y\right)\) and \({F}_{A}\left(a\right)\), respectively.

Let \({A}_{n}\) be the corresponding ARI (in a reference time interval of \(n\) years) of \({Y}_{n}\), and write

$$P\{{A}_{n}\le a\}={\left[P\{A\le a\}\right]}^{n}={e}^{-n/a}={e}^{-{e}^{-\left(\text{l}\text{n}a-\text{l}\text{n}n\right)}} \left(6\right)$$

If we further define

$$\varDelta {L}_{A}=\text{l}\text{n}A-\text{l}\text{n}n \left(7\right)$$

then \(\varDelta {L}_{A}\) is also a random variable with the probability distribution,

$$P\{\varDelta {L}_{A}\le \varDelta {\widehat{L}}_{A}\}={e}^{-{e}^{-\varDelta {\widehat{L}}_{A}}} \left(8\right)$$

which is the standard Gumbel distribution.

In theory, if \({F}_{Y}\left(y\right)\) is perfectly known, any given \({y}_{a}\) and the corresponding \(a\) will be known. In this case, if we choose, e.g. \(a=n\), \(\varDelta {\widehat{L}}_{A}\) will be zero. In practical situations, however, \({F}_{Y}\left(y\right)\) is not known, an observed quantity \({y}_{a}\) would correspond to an unknown \(a\), which effectively constitutes a random sampling problem with interest of obtaining an estimated \(a\) or equivalently \(\varDelta {\widehat{L}}_{A}\).

Note that the distribution of \(\varDelta {L}_{A}\) depends on neither the underlying distribution \({F}_{Y}\left(y\right)\) nor its distribution parameters. Suppose that there is a sample \(\varDelta {\widehat{L}}_{{A}_{i}}\) of size \(m\), \(i=1,...,m\), from \(m\) stations, where \(\varDelta {\widehat{L}}_{{A}_{i}}\) is the maximum among the \({N}_{i}\) values recorded at station \(i\), then provided that the \(\varDelta {\widehat{L}}_{{A}_{i}}\)’s are independent, the extreme-value distribution functions \({F}_{{Y}_{{n}_{i}}}\left({y}_{{a}_{i}}\right)\) used to model each of the \(m\) records are allowed to be different. Consequently, records from different climatological regions may be combined to gauge the conformance of \(\varDelta {\widehat{L}}_{{A}_{i}}\)’s to the standard Gumbel distribution. In other words, this property allows different records of extremes to ‘learn’ from the experience of others by pooling the \(\varDelta {\widehat{L}}_{{A}_{i}}\)’s as if they were a sample drawn from a standard Gumbel variate.

Even though \(\varDelta {L}_{A}\) is independent of the underlying \({F}_{{Y}_{i}}\left({y}_{i}\right)\)’s, in practical situations for extreme hazard analysis, \({F}_{{Y}_{i}}\left({y}_{i}\right)\)’s are typically unavailable and still needs to be substituted by an empirically determined distribution \({\tilde{F}}_{{Y}_{i}}\left({y}_{i}\right)\) with the distribution parameters being estimated from observational data, which are invariably plagued by sampling errors. That is, if the \({N}_{i}\) extreme values of \({Y}_{i}\) are arranged in ascending order, \({y}_{\left({1}_{i}\right)}\le {y}_{\left({2}_{i}\right)}\le ...\le {y}_{\left({N}_{i}\right)}\), then \(\varDelta {\widehat{L}}_{{A}_{i}}\) based on \({\tilde{F}}_{{Y}_{i}}\left({y}_{\left({N}_{i}\right)}\right)\) is obtained. As \(\varDelta {\widehat{L}}_{{A}_{i}}\) is determined by the largest value from station \(i\), special attention should be paid to ensure that all the \(m\) largest values are contributed by independent extreme events. If one event contributes to multiple \(\varDelta {\widehat{L}}_{{A}_{i}}\)’s, the \(\varDelta {\widehat{L}}_{{A}_{i}}\) value which represents the highest ARI among them is kept in the analysis but all others triggered by the same event should be discarded.

The duality of the GEV and GPD ensures that they exhibit the same tail behaviour and have the same shape parameter (Wang and Holmes 2020) when applied to the same set of data. The GEV is used in this study as it does not depend on the rate of exceedance, even though the wind gust data used in this study (described in the next section) were chosen by the POT method. Because of the duality, the outcomes and conclusion drawn for the GEV should be equally applicable to the GPD.

The GEV may be expressed as

$$P\{Y\le {y}_{a}\}={e}^{-{\left[1-k\left(\frac{{y}_{a}-\eta }{\sigma }\right)\right]}^{1/k}} \left(9\right)$$

where \(\eta\), \(\sigma\), and \(k\) are the location, scale, and shape parameters, respectively, of the distribution. \({y}_{a}\) can be related to its corresponding \(a\) as follows,

$${y}_{a}=\left\{\begin{array}{ll}\eta +\frac{\sigma }{k}\left(1-{a}^{-k}\right),& \text{if }k\ne 0;\\ \eta +\sigma \text{l}\text{n}a,& \text{otherwise}.\end{array}\right. \left(10\right)$$

For extreme hazard analysis, the analysts exercise their own decisions for the type of extreme value distributions. The distribution parameters are then estimated by a model-fitting method such as the least-squares regression (Press et al. 2007), method of moments (Ang and Tang 2007), probability weighted moments, maximum likelihood method, principle of maximum entropy, elemental quantile method, or Bayesian approaches (de Zea Bermudez and Kotz 2010). Except the method of moments and the maximum likelihood method, all other methods require an estimate of empirical cumulative distribution function (ECDF) for parameter estimation. Since the data to be analysed herein were extracted by the POT method, as to be described in the next section, the rate of exceedance was used to obtain the ECDF.

For a given wind type (e.g. non-synoptic) at a station, suppose there are \(N\) extreme gust speeds exceeding a specified threshold in \(n\) years and the occurrence of exceedance obeys a Poisson process, then an unbiased estimate of the rate of exceedance, \({\lambda }_{j},j=1,...,N\), with respect to the *j*-th smallest gust speed may be estimated by (Ang and Tang 2007)

$${\lambda }_{j}=\frac{N-j+1}{n} \left(11\right)$$

Because the ARI \({a}_{j}=1/{\lambda }_{j}\), Eq. 11 can be used to obtain the ECDF for hazard model fitting.

For simplicity and without loss of generality, in the following the least-squares linear (for cases with fixed shape parameter) and nonlinear (for cases with free shape parameter) regression techniques for the wind gust speed on ARI were used for model parameter estimation.