Bayesian validation of the seismic source models using the Wylfa Newydd site in the UK as a case study


 In probabilistic seismic hazard assessment, the development of the seismic source characterization, especially the geometry of the seismic source models (SSMs), is controversial because it often relies on expert judgment with different interpretations of the available data from seismology, tectonics, and geology. Based on the same input datasets, different teams of experts may derive different SSMs. In this context, the verification of the models through the comparison against a set of observations is a crucial step. We present a statistical tool to compare the SSMs with the observed seismicity and rank these SSMs based on their ability to replicate the past seismicity. We simulate many synthetic catalogues derived from candidate SSMs and compare them with the observed catalogue of mainshocks using the Metropolis-Hastings Algorithm to select those that fit the observed catalogue. The candidate SSMs are then expressed by a probability density function (pdf) using the set of synthetic catalogues accepted by the Metropolis-Hastings Algorithm and the Bayesian inference. To help practitioners in earthquake and civil engineering understand how this tool works in practice, the proposed approach is applied to a proposed new nuclear site in the United Kingdom, Wylfa Newydd.

this context, the verification of the models through the comparison against a set of observations is a 23 crucial step. We present a statistical tool to compare the SSMs with the observed seismicity and rank 24 these SSMs based on their ability to replicate the past seismicity. We simulate many synthetic 25 catalogues derived from candidate SSMs and compare them with the observed catalogue of 26 mainshocks using the Metropolis-Hastings Algorithm to select those that fit the observed catalogue. 27 The candidate SSMs are then expressed by a probability density function (pdf) using the set of centre of the distribution is the best estimate of the resulting interpretations, the body describes the 49 shape of the distribution around the best estimate, and the range captures the tails of the distribution 50 (USNRC 2012). The likelihood of fully capturing the epistemic uncertainty in the seismic hazard 51 model is achieved by including alternative models and parameters in the logic tree where weights are 52 assigned to each branch using expert judgment and elicitation that reflect the relative confidence in 53 the models and parameters (Coppersmith and Bommer 2012). However, experts may interpret the 54 observations differently and propose different weighting schemes for the input models. In this context, 55 testing not only the hazard results but also each component of the seismic hazard model against 56 observations is crucial to validate the model (e.g. Gerstenberger et al. 2020 and references therein). 57 Although the role of testing and the use of Bayesian methods are increasing to capture the epistemic 58 uncertainty in PSHA, most published studies aim to test either the GMC or the hazard estimates 59 In this work, we present and test a Bayesian approach, referred to as the Bayesian  Hastings (BMH) approach, to validate the proposed seismic source models against the observed 87 seismicity using statistical metrics and to rank which models better represent the past seismicity to 88 forecast the future seismicity. The BMH approach explores the parameter space and generates 89 synthetic catalogues that are compatible with the observed catalogue using the  Algorithm. Then, the catalogue parameters are expressed in terms of probability density functions 91 (pdfs) using the set of synthetic catalogues and the Bayesian inference. We test the BMH approach 92 using the SSMs developed for the PSHA for a proposed nuclear site in Wylfa Newydd (North Wales, 93 UK). The UK is an intraplate region with low levels of seismicity and the earthquake catalogue in this 94 region is relatively short in duration (a few hundreds of years) and does not contain large (Mw > 6.0) 95 earthquakes (e.g., Mosca et al., 2020). Facilities with such high consequence of failure require an 96 assessment of hazard at very long return periods because seismic hazard estimates, in this case, are to 97 be considered for low (≤ 10 -4 ) annual frequency of exceedance (e.g. USNRC 2012; ONR 2018) or 98 otherwise stated for very long recurrence periods (> 10,000 years) when compared with the relatively 99 short length of the seismic records. 100

101
The SSC aims to describe the location, size, and frequency of future earthquakes through a set of 102 parameters, such as the geometry of the seismic sources, maximum earthquake magnitude, and 103 recurrence parameters drawn from the frequency-magnitude distribution (FMD) in each source (e.g. 104 Budnitz et al. 1997). The study area is divided into a series of seismic sources (zones or faults). 105 Seismic activity is considered to be spatially uniform within each seismic source zone, and 106 earthquakes have an equal chance of occurring at any point in the source zone. 107 The BMH approach is a procedure to validate the seismic source models used in the SSC (Figure 1). 108 This approach explores the full parameter space, which consists of three dimensions (i.e. the geometry 109 of the source model, the activity rate, and the b-value), by generating synthetic catalogues that are 110 derived from candidate seismic source models using the Monte Carlo random sampling. Then, the 111 synthetic catalogues with a close fit to the declustered catalogue of mainshocks, are accepted by the 112 Metropolis-Hastings Algorithm and converted into probabilities using the Bayesian inference. The pdf 113 can be potentially used to adjust the weights in the source model logic tree that are defined by expert 114

judgments. 115
The BMH approach can be divided into six steps that are described in detail below and shown in 116 The observed data used by the BMH consist of the declustered catalogue of mainshocks for the study 119 area within the completeness thresholds. It includes also the uncertainty in the epicentral location of 120 the mainshocks since this information is important in the delineation of the geometry of the SSMs. 121 The regional activity rate a and the b-value for the declustered catalogue of mainshocks are calculated 122 using the study area and the penalized maximum likelihood procedure (PMLP, Johnston et al. 1994). 123 This procedure uses the truncated Gutenberg-Richter recurrence law, different time windows of the 124 catalogue for different magnitude completeness thresholds, the correlation between a and b, and a 125 weighted prior to the b-value when the number of earthquakes is too small for a robust estimate of the 126 b-value. The results are expressed by a 5×5 matrix of possible values for a and b, a ± one and two 127 standard deviations, and b ± one and two standard deviations where the standard deviation of the 128 activity rate and b-value are computed from their covariance matrix (for more details, see Veneziano 129 and Van Dyke 1985;and Johnston et al. 1994). This determines 25 triplets of activity rate, b-value, 130 and weight. The regional estimate of the activity rate and the b-value is the most likely value in the 25 131 triplets, together with their standard deviation. The PMLP allows us to include a correction factor in 132 the activity rate calculations based on the standard error of individual earthquake magnitudes, as 133 proposed by Rhoades and Dowrick (2000). 134

STEP 2: SELECTION OF THE CANDIDATE SEISMIC SOURCE MODELS 135
Step 2 of the BMH approach defines the candidate seismic source models (SSMs) to generate 136 synthetic catalogues. The spatial distribution of earthquakes within each source zone of the candidate 137 source models is expected to have a uniform probability, i.e. earthquakes are equally likely anywhere 138 in the zone. If this requirement, which can be tested using the nearest neighbour analysis (Davis 139 1986), is not satisfied, the source model cannot be considered as a potential candidate. Since this 140 approach is an excellent tool to retrospectively test the SSC used in PSHA, one can consider only the 141 source models in the SSC as candidates. Alternatively, it may be interesting to include all the 142 published source models for the same site and potentially unpublished source models, as we do here 143 (see Section 3). 144

STEP 3: GENERATION OF SYNTHETIC CATALOGUES 145
Assuming that the occurrence of the synthetic earthquakes follows a Poissonian process within each 146 source zone, many synthetic catalogues are derived from the candidate source models using Monte 147 of the source model, and not on a regional basis. Then, the regional a and b-values for the synthetic 159 catalogues are computed and compared with those from the observed catalogue. The sampling range 160 of the b-value can be selected within a unimodal probability distribution (e.g. Gaussian distribution) 161 or a uniform distribution. The former is centred on the regional b-value within one or more standard 162 deviations of the regional b-value. For the latter, the sampling range may correspond to the regional b-163 value within one or more standard deviations and all the values in the range have equal probability. 164 Defining the sampling range for the activity rate is not straightforward and must be done by trial and 165 error. The range of the activity rate can also be shaped within any probability distribution in principle. 166 However, we recommend using a uniform distribution for the range of this parameter for the 167 following reason. Unlike the b-value whose regional estimate represents a prior value for the 168 individual zones, the regional activity rate is roughly the sum of the activity rates of the zones, and a 169 prior value for a does not exist for each zone whose geometry is an unknown. Hence, it is not 170 straightforward to centre the probability distribution for a on a specific value and a uniform 171 distribution is a better choice not to favour or penalize some source models against others. In Section 172 4, we will show the results of the sensitivity analysis to set the ranges of the recurrence parameters 173 using the testing ground of the Wylfa Newydd site. 174 The magnitude of the synthetic events is chosen based on the FMD that is defined by the synthetic 175 values of Mmax, a and b. The corresponding year of occurrence is selected within the time period from 176 which the catalogue is assumed to be complete above the completeness magnitude (Mc) according to 177 the completeness analysis of the earthquake catalogue. This ensures that the synthetic catalogues do 178 not contain events of magnitude smaller than Mc. Finally, the epicentre of the synthetic earthquake is 179 drawn within the zone using random sampling. 180 The procedure to generate a synthetic catalogue from a candidate SSM is the following. For each 181 source zone of the source model, the activity rate and b-value are chosen within their sampling ranges 182 using random sampling assuming that the activity rate and the b-value are independent. The number 183 of synthetic events per year is computed using the Poisson distribution for each year of the 184 catalogue's duration that is between the completeness period of the largest magnitude and the ending 185 of the observed catalogue. Then, Mw and the epicentral coordinates for each synthetic event are 186 selected within their sampling range using random sampling. 187

STEP 4: IMPLEMENTING THE METROPOLIS-HASTINGS ALGORITHM 188
The Metropolis-Hastings Algorithm is a method to generate random walks (or candidate samples) 189 from an unknown target probability distribution for which direct sampling is difficult (e.g. Metropolis 190 et al. 1953;Tarantola 1987). For a random walk x j , the arbitrary, conditional pdf g(x i |x j ) is defined for 191 where P(x) is the target distribution. Assuming a uniform random number u between 0 and 1, if u ≤ α, 195 the transition x j → x i is accepted; otherwise it is rejected. Then, the distribution of the accepted 196 random walks is re-sampled to have a closer fit to the target distribution. 197 In the BMH approach, the target distribution is the regional estimate of the activity rate and b-value, 198 together with their uncertainties, using the declustered catalogue of mainshocks and the PMLP of 199 Johnston et al. (1994; see Subsection 2.1). The candidate samples are the regional estimates of the 200 recurrence parameters of the synthetic catalogues for the study area computed using the PMLP. The 201 Metropolis-Hastings Algorithm simultaneously fits the activity rate and the b-value in the synthetic 202 catalogues with the target distribution assuming therefore a correlation between the recurrence 203 parameters. Only the candidate samples that fall within the observed pdf are accepted by the 204 Metropolis-Hastings Algorithm. This is described by Figure 2 where the dark and light grey 205 histograms represent the target and synthetic distributions, respectively, before (left-hand column in 206 Figure 2) and after applying the Metropolis-Hastings Algorithm (right-hand column in Figure 2). 207 Although the activity rate and the b-value are chosen within their sampling ranges for each source 208 zone of the SSM, the comparison between the synthetic and observed catalogues is carried out in 209 terms of the regional recurrence parameters for the entire study area, and not in terms of the 210 recurrence parameters for the individual zones in the SSM. This is because there are no observations 211 to compare the synthetic data with for each zone and the geometry of the SSM is one of the model 212 parameters. 213

STEP 5: DEFINITION OF THE MISFIT FUNCTION 214
The misfit function (or cost function) χ 2 is a measure of the discrepancy between the observed and 215 synthetic data. We define this function as the sum of two terms. The first term accounts for the 216 location variability of the earthquakes in a grid cell and the second term is based on the mean 217 magnitude in the grid cellː 218 (2) 219 and ℎ are the number of earthquakes in the i-th grid cell for the observed and synthetic 220 catalogue, respectively; σ (C obs ) i is the standard deviation of ; ����� and ����� ℎ are the mean 221 magnitude in the i-th grid cell for the observed and synthetic catalogue, respectively; σ(Mw obs ) i is the 222 standard deviation of the mean magnitude ����� ; and N is the number of grid cells. The standard 223 deviation σ (C obs ) depends on the uncertainty in the epicentral location of the observed earthquakes 224 and is defined as the square root of the absolute difference of the number of earthquakes with and 225 without the epicentral uncertainty in the i-th grid cell: 226 where N eq is the number of mainshocks within the completeness thresholds; and ′ = * 8 is the 228 number of mainshocks within the completeness thresholds accounting for the uncertainty in the 229 epicentral location. When the uncertainty in latitude and longitude for the epicentral location is taken 230 into account, each earthquake in the observed catalogue can be located at eight different potential 231 locations. decluster the earthquake catalogue and assess its completeness because this is beyond the purpose of 250 this paper that aims to present not a site-specific PSHA but a new tool to test which candidate source 251 models are most appropriate to represent the future seismicity. 252

STEP 1 253
This subsection describes the data used by the BMH approach for our testing ground. We used the 254 that is relatively short in duration (a few hundreds of years) and does not contain large (Mw > 6.0) 261

earthquakes. 262
The uncertainty in the epicentral location for historical events is defined based on four classes (Table  263 2) as described in Musson (1994). For each class, we define a reference value ( Table 2) To determine the regional recurrence parameters a and b for the seismicity of the study area, we recognize that care should be taken when the magnitude uncertainties are accounted for in the recurrence statistics, especially when an earthquake catalogue contains more than one original 277 magnitude scale to avoid counter-balancing bias in the estimation of the seismicity rates. Using the 278 completeness thresholds in Table 1, we found that the best-fit values are N (Mw ≥ 2.1/yr) = 7.12 ± 279 0.51 and b =1.025 ± 0.041. Note that we provided no prior b-value for the regional estimate of this 280 parameter. 281

STEP 2 282
We selected 14 different SSMs as potential candidates, most of which were developed for other 283 regional PSHA for the UK or site-specific PSHA for Wylfa Newydd (Figure 4 and Table 3). SSM3 284 and SSM4 are based on political and administration criteria, respectively. These two SSMs were 285 included to check how source models based on non-scientific criteria compare against source models 286 based on seismic and tectonic information. We added a background zone for SSM2, SSM5, and 287 SSM6 to cover the entire study area and avoid any potential bias in the simulation of the synthetic 288 catalogues from source models that cover the entire study area. 289 To define the sampling range of the recurrence parameters used to generate the synthetic catalogues 290 from the 14 SSMs, we carried out preliminary recurrence calculations for the individual zones of the 291 14 SSMs using the PMLP, the earthquake catalogue in Figure 3, together with its completeness 292 analysis (Table 1), and the best regional estimate of b = 1.025 as a weighted prior for each of the 293 zones. We found that overall the b-value varies between 0.65 and 1.43, and N (≥ 2.1Mw/yr) is 294 between 0.01 and 5.00 considering all source zones in the 14 SSMs (see the last two columns in Table  295 3). 296 Table 4 shows the sampling ranges of the catalogue parameters used for the simulations of the 298 synthetic catalogues. Mmax is fixed to 7.1 Mw that is the largest maximum magnitude in the region 299 (see Subsection 3.1). The range of the b-value is chosen to be normally distributed around the regional 300 estimate within one standard deviation (Table 4). To set the range of the activity rate, we estimated 301 this parameter for each SSM as the fraction of the regional activity rate (i.e. 7.12) within four standard 302 deviations divided by the number of zones in the SSM because the regional activity rate is roughly the 303 sum of the activity rates of the individual source zones. The four standard deviations account for the 304 large variability in this parameter among the zones of the SSM. Then, we grouped the ranges into four 305 categories depending on the number of zones (see Table 4). Note that we cannot set a unimodal 306 distribution for the sampling range of the activity rate because this parameter changes significantly 307 from zone to zone within the same source model (see Table 3). This implies that no value of the 308 activity rate, around which to centre the unimodal distribution, can be chosen. By trial and error, we 309 have found that the ranges in Table 4 are the best choice to have a reasonably large range of activity 310 rates for the synthetic events but to avoid unrealistic values that would be rejected by the  Hastings Algorithm. Section 4 describes the influence of the range of the recurrence parameters on the 312 results computed by the BMH approach.

STEPS 5-6 318
To find the optimal grid size for the misfit function, we tested different cell sizes and plotted the 319 accepted catalogues by the Metropolis-Hastings Algorithm in terms of the misfit function ( Figure 5). 320 We have chosen the 0.1° by 0.1° grid size because it realistically simulates the sparse seismicity in the 321

UK. 322
We applied Equation 4 to convert the 1,052,929 synthetic catalogues accepted by the Metropolis-323 Hastings Algorithm into a 1-D posterior pdf for the SSM (Figure 6). The pdf is multimodal suggesting 324 that more than one model fits the data equally well. The smallest probability is associated with 325 SSM12, which consists of a single zone, and thus does not fit the catalogue of mainshocks. Also, the 326 performance of SSM3-SSM5 is relatively poor, which gives some confidence that the method can identify unrealistic models. Specifically, the low probability of SSM3 and SSM4 is because these 328 models are based on non-scientific criteria and therefore are not able to replicate the past seismicity. 329 SSM5, and to less extent SSM6, are penalized by the background zone that was added to cover the 330 same study area as for the other SSMs (see Subsection 3.2). Although SSM2 has also a background 331 zone, this is small and does not influence the performance of SSM2 against the observed catalogue. 332 At the time of the publication of SSM5 and SSM6, the principal study area was considered to be a 100 333 km radius circle around this site to build the source model for a site-specific PSHA. Nowadays, the 334 recommendation from IAEA is to extend the source model to 300-km from the site (e.g. IAEA 2010). In this section, we describe the impact of the prior distribution of the recurrence parameters, 346 earthquake catalogue, and the size of the study area on the 1-D posterior pdf for the SSM. This 347 sensitivity analysis allows us to determine whether the results from the BMH approach are strongly 348 dependent on the parameter ranges, the earthquake catalogue, and the area under investigation. We 349 compared the 1-D pdfs for the SSM obtained from the sensitivity analysis with that shown in Figure 6  350 (referred to as "Reference" in Figures 8 and 10).

INFLUENCE OF THE SAMPLING RANGE OF THE RECURRENCE PARAMETERS 352
To carefully assess the impact of the sampling range of the recurrence parameters, together with its 353 prior distribution, on the pdf for the SSM and how to set these ranges, we carried out a sensitivity 354 analysis by running the BMH approach for different combinations of sampling range for these 355 parameters (see Table 5). We considered three different ranges of the activity rate and b-values. In 356 Test 1, the sampling range of the recurrence parameters is chosen to be the regional values within two 357 standard deviations using a uniform distribution. In Test 2, the range for the b-value is that in 358 Reference (i.e. the sampling ranges in Table 4); whereas, the range for the activity rate is broader and 359 grouped into three categories, rather than four. In Test 3, the distribution of the range for the b-value 360 is set within a uniform distribution centred on the regional estimate within four standard deviations; 361 and the sampling ranges of the activity rate is that in "Reference". 362 The main observations from this analysis are the following (see Figure 8): 363 • The sampling range of the recurrence parameters, especially the activity rate, affects significantly 364 the pdf for the SSM, as shown by the number of accepted catalogues. 365 • The larger the parameter range, the lower the number of accepted catalogues from the Metropolis-366 Hastings algorithm. 367 • The pdf for Test 1 is almost uniform between SSM1-SSM11 suggesting that the algorithm cannot 368 tell which source models produce synthetic catalogues that fit better the observations. This is 369 because the narrow range of the recurrence parameters excludes the extreme cases, i.e. where the 370 zones have a few or many earthquakes. 371 • In Test 2 where the range of the activity rate is large, the search tends to favour some models and 372 penalize others. For example, the range of 0.0-1.0 for the activity rate and the source model with 373 more than 10 zones penalizes SSM2 and SSM9-SSM11 and favours SSM4-SSM5 and SSM7-374 SSM8. This is because the activity rate of the individual zones tends to be chosen such that the 375 regional synthetic activity rate for the entire study area is larger than the regional observed value. 376 • The result from Test 3 confirms that the influence of the b-value on the 1-D pdf is less strong than 377 the activity rate and the results are less sensitive to the choice of the prior distribution for the b-378 value. If the uniform distribution is chosen rather than the Gaussian distribution, the number of 379 synthetic catalogues that fit the observed catalogue of mainshocks decreases. 380

INFLUENCE OF THE EARTHQUAKE CATALOGUE AND STUDY AREA 381
We carried out two tests to check the influence of the earthquake catalogue and the size of the study 382 area on the pdf for the SSM and the consistency and repeatability of the results computed by the BMH 383 approach. In Test A, we used the instrumental catalogue for the period 1970-2014 that consists of 172 384 mainshocks of ≥ Mw 2.1 within 300 km from the site using the completeness thresholds in Table 1. 385 We estimated the seismicity rate and the b-value for the instrumental catalogue using the PMLP. The 386 target probability distribution for the instrumental catalogue is described by N (Mw ≥ 2.1/yr) = 7.33 ± 387 0.55 and b=1.050 ± 0.067. In Test B, we focused on the region within 60-km of the site because it has 388 the strongest impact on the estimated seismic hazard (Villani et al. 2020). The entire catalogue 389 includes 17 mainshocks of Mw ≥ 2.1 using the completeness thresholds in Table 1. The activity rate 390 and the b-value for this test are N (Mw ≥ 2.1/yr) = 0.49 ±0.12 and b=0.88±0.11. We used only 13 of 391 the SSMs because SSM3 and SSM4 are identical for this study area (Figure 9). 392 The results from these tests are shown in Figure 10. The trend of the pdfs for the SSM for Reference 393 and Test A is very similar. Test B provides relatively similar results to the reference values except for 394 SSM5 that gets the highest probability in the pdf for the SSM. The performance of this model is 395 significantly better in Test B than in Reference because the large background zone of SSM5 is 396 excluded from the 60-km study area (see Figure 4). A similar effect can be seen also for the large 397 probability of SSM6; whereas, the small background zone of SSM2 does not influence the results by 398 changing the size of the study area. As explained in Sections 3.2 and 3.4 of the Wylfa Newydd case 399 study, the background zone was added to SSM2, SSM5, and SSM6 to cover the same study area. 400

401
The goal of this article was to present a statistical tool for the objective assessment of the quality of 402 the SSMs in the SSC against the observations using the Bayesian inference. The output of the BMH 403 approach is expressed in terms of posterior pdfs for the SSMs and can be used to quantitatively 404 compare various models with the observed seismicity in a transparent and reproducible way. 405 The application of the BMH approach to the Wylfa Newydd case study helps understand how to use 406 this tool in practice. Although we provide the guidelines to apply the BMH approach, its 407 implementation requires setting up some input elements that are specific for the site under 408 investigation, such as the selection of the candidate SSMs, the range of the catalogue parameters to 409 generate the synthetic catalogues and the optimal grid size to compute the misfit function. The 410 sensitivity analysis for the Wylfa Newydd case study illustrates the influence of the input parameters 411 on the posterior pdf for the SSM. The sampling ranges of the recurrence parameters must be tailored 412 to the site under investigation by trial and error. If this is not carried out well, there is the risk to 413 favour some models and penalize others by the Metropolis-Hastings Algorithm. The correct setting of 414 the ranges for a and b parameters limits also the tendency of the BMH approach to favour SSMs 415 based on past seismicity and to penalize those based on the regional geology and tectonics. This is a 416 consequence of the fact that the Metropolis-Hastings Algorithm accepted and rejects SSMs based on 417 the ability of the models to replicate the observed seismicity. 418 The retrospective validation of the SSMs developed for the PSHA of the Wylfa Newydd site shows 419 that the source models based on political and administration criteria and the single-zone source model 420 are not able to reproduce the past seismicity ( Figure 6) suggesting that the BMH approach can 421 identify unrealistic source models. The results also show that many source models can replicate the 422 and Bommer 2006). The development of the SSC must satisfy the following two criteria. Firstly, the 447 construction of the SSC should be based on data from seismology, tectonics, geology, geophysics, and 448 paleoseismology to account for the short length of the seismic observations compared to the seismic 449 cycle. The use of source zones, where the seismicity is assumed to be uniform across it, allows us to 450 reduce the likelihood of neglecting important but unknown tectonic structures that are not been 451 associated with recorded seismicity but may generate future earthquakes. Secondly, the SSC should 452 reproduce well the past seismicity. In intraplate regions with low levels of seismicity and low 453 deformation rates, such as most of the Eastern United States, Canada, and the UK, the observed 454 catalogue is the only option to extrapolate the rates of large, rare earthquakes from the occurrence 455 rates of small-to-moderate earthquakes (Budnitz et al. 1997;Anderson and Biasi 2016). In this 456 context, it seems reasonable to develop a robust methodology to statistically validate the ability of the 457 source models in the SSC to replicate the observations and therefore fully satisfy the second criterion 458 in the development of the SSC. 459

460
This work highlights the importance of model testing and validation against the observations. The 461 main finding of the paper is the presentation, together with its first application, of the Bayesian 462 Metropolis-Hastings (BMH) approach to test to what degree the seismic records support or reject the 463 proposed seismic source models in the SSC in light of the observed seismicity, i.e. one of two criteria 464 to construct realistic source models. This is a valuable tool in regions of high seismicity with a long 465 catalogue of seismic historical records but also in regions of moderate seismicity where the processes 466 controlling the occurrence of earthquakes are not fully understood, the rate of deformation is low, and 467 therefore the occurrence rate of large earthquakes is extrapolated from low-to-moderate events. 468 The BMH approach is applied for the evaluation of the seismic source models for the PSHA at the 469 Wylfa Newydd site in the UK. Due to the poor correlation between British earthquakes and tectonic 470 structures in the UK, the SSMs for the Wylfa Newydd case study did not include fault sources.      Distribution of the mainshocks in the study area within the completeness threshold in Table 1. Events 643 with unknown depth are coloured in white. The black circles describe the area within 300 and 60 km 644 from the site, respectively. The star describes the site of Wylfa Newydd. 645   Table 5. 654       Table 1. Events with unknown depth are coloured in white. The black circles describe the area within 300 and 60 km from the site, respectively. The star describes the site of Wylfa Newydd. Note: The designations employed and the presentation of the material on this map do not imply the expression of any opinion whatsoever on the part of Research Square concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. This map has been provided by the authors. Figure 4 14 candidate seismic source models (SSMs) for the study area. The star describes the site of Wylfa Newydd. Note: The designations employed and the presentation of the material on this map do not imply the expression of any opinion whatsoever on the part of Research Square concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. This map has been provided by the authors.   Distribution of the accepted synthetic catalogues by the BMH approach as a function of the SSMs on the left-hand side and the mis t function on the right-hand side. Figure 8 1-D posterior pdf as a function of SSMs for the three tests in Table 5. 14 candidate seismic source models (SSMs) for the study area of 60 km within the site. The star describes the site of Wylfa Newydd. Note: The designations employed and the presentation of the material on this map do not imply the expression of any opinion whatsoever on the part of Research Square concerning the legal status of any country, territory, city or area or of its authorities, or concerning the delimitation of its frontiers or boundaries. This map has been provided by the authors. Figure 10 1-D posterior pdf as a function of SSMs for three cases the entire catalogue and the 300-km study area (Reference), the instrumental catalogue and the 300-km study area (Test A), and the entire catalogue and the 60-km study area (Test B).