A regionally adaptable ground-motion model for fourier amplitude spectra of shallow crustal earthquakes in Europe

Typical seismic ground-motion models predict the response spectral ordinates (GMM-SA), which are the damped responses of a suite of single-degree-of-freedom oscillators. Response spectra represent the response of an idealized structure to input ground-motion, but not the physics of the actual ground-motion. To complement the regionally adaptable GMM-SA of Kotha et al. (2020), we introduce a model capable of predicting Fourier amplitudes (GMM-FA); developed from the Engineering Strong Motion (ESM) dataset for pan-Europe. This GMM-FA reveals the very high variability of high frequency ground-motions, which are completely masked in a GMM-SA. By maintaining the development strategies of GMM-FA identical to that of the GMM-SA, we are able to evaluate the physical meaning of the spatial variability of anelastic attenuation and source characteristics. We find that a fully data-driven geospatial index, Activity Index (AIx), correlates well with the spatial variability of these physical effects. AIx is a fuzzy combination of seismicity and crustal parameters, and can be used to adapt the attenuation and source non-ergodicity of the GMM-FA to regions and tectonic localities sparsely sampled in ESM. While AIx, and a few other parameters we touch upon, may help understand the spatial variability of high frequency attenuation and source effects, the high frequency site-response variability—dominating the overall aleatory variance—is yet unresolvable. With the rapid increase in quantity and quality of ground-motion datasets, our work demonstrates the need to upgrade regionalization techniques, site-characterisation, and a paradigm shift towards Fourier ground-motion models to complement the traditional response spectra prediction models.

physically the quantified random-effects at T < 0.5s . Towards this end, in this study we introduce a regionally adaptable GMM-FA, correlate the regional adjustments to an independently derived geospatial index, and suggest using these correlation to adapt the model to regions with little to no data in the ESM dataset.
Our best candidate geospatial index is the Activeness or Activity Index developed by Chen et al. (2018). Although we have tried a suite of geological and geophysical parameters (e.g. Mohorovičić or Moho depth, 1 Hz coda Quality factor maps, etc.) to serve as proxy for our regional adjustments (the random-effects), the spatial coverage and resolution, and the epistemic uncertainty of these parameters made them less viable than the globally available Activity Index. More about this parameter will be discussed in relevant sections.

Ground-motion dataset
The ground-motion dataset we used is the Fourier amplitude version of the ESM dataset. The Fourier version of ESM does not contain data from Iran, and is therefore a few dozen records smaller than that used to derive the  GMM-SA. The groundmotion data selection criteria, regionalisation models, functional form, and regression methods are identical to that of the  GMM-SA; except for a few minor changes noted subsequently. To recall the criteria: (1) We select only those events classified as non-subduction events by Weatherill et al. (2020d). The selection removes in-slab, interface, outer-rise, and upper-mantle events from the regression dataset. The resulting 923 events with hypocentral depth 0 < D ≤ 39 km are located in regions with 14 ≤ Mohodepth ≤ 49 km , as per the Moho map of Grad et al. (2009). Note that the same selection criteria resulted in 927 events in case of GMM-SA (2) Only those events with ≥ 3 records in the dataset are used in regression. Also, wherever available, the harmonised moment-magnitude M W estimates according to the European Mediterranean Earthquake Catalogue [EMEC by Grünthal and Wahlström (2012)] are preferred over the ESM default values. As with GMM-SA, preference of EMEC M W is to maintain consistency with the seismic source models developed for the new European Seismic Hazard Model 2020 (ESHM20, www. efehr. org) (3) We keep data from all 1622 sites in the dataset irrespective of whether their V S30 measured from geotechnical investigations is provided or not in ESM. This is to estimate the site-specific terms ( S2S s ) at as many sites as possible, and then explore various site-response proxies to characterize them (e.g. Kotha et al. 2018;Weatherill et al. 2020b). Note that the GMM-SA used data from 1829 sites, including 197 sites from Iran whose data is unavailable in the Fourier version of ESM (4) Choice of distance metric is R JB where available, otherwise the epicentral distance R epi -but only for events with M W ≤ 6.2 . The distance range is not truncated and extends up to R JB = 477 km . In comparison, the GMM-SA was derived from data extending up to R JB = 545 km (5) The only additional record selection criterion relevant to the GMM-FA is the low-pass frequency filter limit. For the GMM-SA regression of SA(T) , we chose only the records with a high-pass filter frequency of both horizontal components f hp ≤ 0.8∕T . While 1 3 for the GMM-FA regression of FA(f ) , we chose the records with high-pass frequency f hp ≤ 0.8f and low-pass frequency f lp ≥ f ∕0.8 , of both horizontal components. (6) The GMM-FA is developed to predict the Effective Amplitude Spectra [ EAS as defined by Pacific Earthquake Engineering Research Centre, Goulet et al. (2018)] of the two horizontal FA(f ) components, for 21 values of f in the range 0.186Hz − 15.051Hz . EAS is an orientation independent horizontal-component FA(f ) of ground-motion that can be used with RVT framework to estimate SA(T)  Following the above criteria, the number of records from 3.1 ≤ M w ≤ 7.4 events at distances 0 ≤ R JB < 477 km available for GMM-FA regression is shown in Fig. 1

Regionalisation datasets
The GMM-SA of  is a regionally adaptable model, wherein the event and path effects are regionalised. It is capable of predicting SA(T) accounting the regional differences in attenuation of high frequency (short-period) ground-motions, tectonic locality dependent average of event effects, and site-specific effects through random-effect adjustments to the generic GMM-SA median. We use the same regionalisation models for GMM-FA, and therefore, regionalise the data identically. Regional differences in localityspecific event and region-specific anelastic attenuation characteristics are estimated as random-effects in a mixed-effects regression. The GMM-FA random-effect values can be used in three ways: 1) in region-and locality-specific predictions accounting their epistemic uncertainties, 2) ignored while reintroducing their random-variances into total aleatory variability of ergodic GMM-FA predictions-i.e. no region-and locality-specific adaption, 3) investigated for correlation with geophysical parameters or proxies in order to extend their usage to regions with no data in ESM.  demonstrated the first two applications, while also evaluating the robustness of region-and locality-specific GMM-SA In this study, we discuss the third approach, i.e. evaluating the physical meaning and adaptability of the GMM-FA random-effects.

Attenuation regionalisation
In a drive towards regionalising ground-motion predictions, a few recent GMM-SAs identified and quantified between-region variability of apparent anelastic attenuation, and attributed it to spatial variability of crustal characteristics (e.g. Abrahamson et al. 2014;Boore et al. 2014b;Campbell and Bozorgnia 2014;Chiou and Youngs 2014;Kotha et al. 2016;Kuehn and Scherbaum 2016). These GMM-SAs were regionalised based on geopolitical boundaries in absence of a more relevant geophysical regionalisation. However, spatial variability of attenuation captured by the 1 Hz coda Q maps of smaller regions within geopolitical boundaries is quite high; as shown for Italy and Turkey by Cong and Mitchell (1998), for mainland France by Mayor et al. (2018), and for Europe in general by .
In this study, as in , we chose instead of geopolitical boundaries, a geological-geophysical regionalisation model developed by Basili et al. (2019) in the TSUMAPS-NEAM project. Figure 2 shows the TSUMAPS-NEAM regionalisation and the number of records in each attenuating region, as determined by the location of the recording sites. Assuming that anelastic attenuation is a dominant phenomenon at far-source distances (e.g. R JB ≥ 80 km ) and in near-surface crustal layers, we let the location of the receiving site decide to which attenuating region a particular ground-motion record should be allotted to.
The 45 regions in this map (Fig. 2)  In mixed-effects regression terminology, the random-variance of the group of attenuation regions quantifies the observed variability of anelastic attenuation between regions, where regions are levels of the group. If the random variance of this group is non-zero, it means that the chosen regionalisation scheme is able to capture the spatial variability of anelastic attenuation. Complementing the , in this study, we evaluate physical meaning of the region-specific random-effects.

Event localisation
Alongside the traditional earthquake-to-earthquake variability captured by the betweenevent random-effects group ,  introduced an additional random-effect to quantify locality-to-locality variability of earthquake characteristics; analogues to the location-to-location variability of Al Atik et al. (2010). We assign the data from 923 shallow crustal events in the ESM dataset to 55 seismotectonic zones defined in the European Seismic Hazard Model 2020 (ESHM20). The seismotectonic source zonation (called "TECTO") was designed to distinguish regional tectonic features influencing the generation of crustal earthquakes, but not the very localised features. 'TECTO' is thus a better model for earthquake source localisation than the Basili et al. (2019) used for regionalisation of anelastic attenuation.
As in , this random-effects group is the between-locality , wherein the levels are tectonic localities containing their assigned ESM crustal events and records. Figure 3 shows the distribution of records from active shallow crustal events within the event localisation polygons. As with the attenuating region group, this group is introduced to quantify the earthquake locality-to-locality variability of ground-motion in the dataset through the between-locality variance, which if zero indicates that the regionalisation model is not able to identify any significant spatial variability of earthquake characteristics. Chen et al. (2018) introduced a fully data-driven global tectonic regionalisation model for selection of ground-motion models in seismic hazard applications. Based on a fuzzy logic workflow, they have rendered a regular grid with a spacing of 0.5°, wherein each cell is Fig. 3 Distribution of records from shallow crustal events in each of the 55 event localisation polygons (tectonic zones) assigned a probability of being an active tectonic region-the Activeness or Activity Index (AIx), as shown in Fig. 4. AIx in a grid cell is calculated from the following fuzzy rules:

Activity index
( AIx is derived as a of combination of seismic moment rate density (Weatherill et al. 2016), 1 Hz Lg coda Q (Mitchell et al. 2008), and shear-wave velocity variation at 175 km (Ritsema et al. 2011). We use this dataset to evaluate the various regional variabilities quantified as the GMM random-effects.

Functional form
A mixed-effects GMM is composed of fixed-effects and random-effects. Fixed-effects part of the GMM is the continuous function built as a combination of predictor variables, which in this case are event magnitude M W and event-to-site distance metric R JB (in km). As suggested by Fukushima (1996), and later in Bora et al. (2017) and Bora et al. (2019), the fixed-effects component of our GMM-FA remains the same as GMM-SA presented in . However, although recent GMM-FA models from NGA-West2 dataset (Ancheta et al. 2014) by Bayless and Abrahamson (2019b) and Bora et al. (2019) retained a magnitude-dependent geometrical spreading component, following the suggestion by Cotton et al. (2008) we have dropped this term from our GMM-FA.
(1) ln( ) = e 1 + f R,g R JB + f R,a R JB + f M M W Fig.4 Activity Index map of pan-European region by Chen et al. (2018) In the mixed-effects formulation of Eq.
Regarding the random-effects components: (1) Alongside a generic region-independent c 3 , region-specific adjustments c 3,r are estimated as random-effects for the 45 attenuation regions in Fig. 2. Therefore, for regionspecific GMM-FA predictions at sites located in region r of Fig. 2, we replace the apparent anelastic attenuation coefficient c 3 in Eq.
(2) With records from events allotted to different tectonic localities l in shown in Fig. 3, locality-specific adjustments L2L l are estimated for the 55 localities. For an event in locality l , its L2L l can be used to make tectonic locality-specific ground-motion predictions by replacing the generic offset e 1 in Eq. (1) with e 1,l = e 1 + L2L l . Betweenlocality random-effects L2L l also follow a Gaussian distribution N(0, L2L ) , where L2L quantifies the observed variability of ground-motions of events distributed across the 55 tectonic localities shown in Fig. 3 (3) Event-to-event variability in this GMM is the traditional between-event random-effects group N(0, e ) filtered for between-locality variability N(0, L2L ) , and is now captured by the N(0, 0 ) ; where, for an event e located in tectonic locality l , the event-specific term can be seen as B 0 e,l ≈ B e − L2L l . 0 is the generic event-to-event variability corrected for locality-to-locality variability, and does not vary with locality l (4) Site-specific GMM-FA adjustments S2S s , estimated for the 1622 sites s in the dataset can be used to make site-specific ground-motion predictions by replacing the offset e 1 in Eq. (1) with e 1,s = e 1 + S2S s . Site-to-site response variability is captured by the between-site random-effects group N(0, S2S ) . The potential of S2S s in site-specific GMM-SAs is well-known, and are useful in studying regional differences in siteresponse scaling with V S30 (time-averaged shear-wave velocity in 30 m top-soil) as in Kotha et al. (2016), topographic slope and geology as in Weatherill et al. (2020a) or a variety of site parameters (Kotha et al. 2018;Zhu et al. 2020), and even in site-specific seismic risk assessment (Kohrangi et al. 2020).
(2) f R,g = c 1 .ln 2 The left-over residuals = N(0, ) contain the unexplained natural variability of ground-motion observations, and thus represent the apparent aleatory variability of the model. These residuals can be investigated for less dominant phenomenon, such as the anisotropic shear-wave radiation pattern at near-source distances , anelastic attenuation effects at long distances (Sahakian et al. 2019), non-linear soil response (Loviknes et al. 2021), etc.
In all, this GMM has four random-effect groups. With this configuration of mixedeffects GMM, we run a robust linear mixed-effects regressions (rlmm by Koller 2016) independently for the 21 EAS IMs of f = 0.186Hz − 15.051Hz . Although robust mixed-effects regression is computationally quite intensive compared to ordinary least-square mixedeffects regression (e.g. lmer by Bates et al. 2014), it is better adapted to large datasets; where statistical outlier (or physically anomalous) data need to be down weighted-and not deleted-to prevent them from biasing the fixed-effects and mixed-effects variance estimates. In this study, random-effects and residuals with values beyond ± 1.345 standard deviations of their respective Gaussian distributions are assigned lower weights (< 1). It is possible to tune the regression parameters of the Huber loss function (Huber 1992) used in our rlmm regressions. But we maintained the default values optimised for efficiency and robustness in detecting outliers, as suggested by Koller (2016). Along with the fixed-effect regression coefficients and random-effect adjustments, we produced also the fixed-effects variance-covariance matrices needed to estimate the GMM epistemic uncertainty (Atik and Youngs 2014;Bindi et al. 2017a) and to update the GMM coefficients in a Bayesian framework (Kowsari et al. 2019). These resources can be disseminated on request.

Fixed-effects
The regression results comprise of fixed-effect coefficients and their covariance matrices, random-effect values and their rlmm weights and standard errors, residuals, and variances. Similar to the GMM-SA, outlier events, localities, regions, sites, and recordings are flagged as well. Figure 5 presents the predicted Effective Amplitude Spectra for a few scenarios of interest. In this plot, we show fixed-effects prediction of FAS for: (3) with c 3 = c 3 , c 3 + c3 , c 3 − c3 , respectively; where c3 is standard-deviation of apparent anelastic attenuation random-effects group.
In the left panel, showing the near-source predictions, we notice that the depthdependence has a non-negligible effect on the amplitudes at distances R JB ≤ 5 km . For the (M7, 5 km) scenario, the epistemic uncertainty (orange ribbon) on the median is wide enough to cover the variation with depth for f < 0.5Hz . A large part of this epistemic uncertainty is from the lack of near-source data from large magnitude events, which is necessary to constrain the magnitude-scaling f M M W component of the GMM at M W ≥ M h = 5.7 (see Fig.S1 in supplement).
In the right panel, FAS predictions at far-source distances are shown. The c 3 = c 3 , c 3 + c3 , c 3 − c3 adjustments become active at f ≥ 2Hz in far-source predictions. At lower frequencies, the differences are much smaller-which is to be expected since coefficient c 3 is meant to capture the (apparent) anelastic attenuation of moderate-to-high frequency ground-motions.
Along with the depth and anelastic attenuation dependencies, we notice that with increasing magnitude the Fourier spectra become flatter in low-moderate frequency range, as the apparent corner-frequency shifts towards lower frequencies for larger magnitude events. At near-source distances (left panel), the spectra decay rapidly beyond f ≥ 10Hz, while at far source distances (right panel) this behavior is observed earlier at f ≥ 5Hz ; most likely from the enhanced anelastic decay of high frequency ground-motions.

Random-effects and residuals and
Prior to discussing the random-effects and residuals in following sections, there are few inferences derived from the random-effect and residual variances in Fig. 6. In this plot, S2S , L2L , 0 , c3 are the random-effect standard deviations of between-site, between-locality (tectonic locality of events), between-event (after between-locality correction), and between-region (anelastic attenuation) random-variables, respectively.
is the residual standard deviation. The total-sigma for the ergodic version of GMM-FA, = √ 2 S2S + 2 L2L + 2 0 + 2 , does not include c3 because it is intended for use as an epistemic uncertainty on the anelastic attenuation term c 3 of the GMM (Douglas 2018;Weatherill et al. 2020c), as shown in Fig. 5. All of the random-variances are comparable in size, which means the random-effect groups are meaningful for this dataset. The largest variability, however, is the site-tosite response variability, which is captured by the between-site standard deviation S2S .
Increasing monotonically at f ≥ 3Hz , S2S suggests that site-response (in the dataset) is highly variable at moderate-high frequencies. Such rapid increase of S2S towards higher frequencies has been reported for a various datasets (Bayless and Abrahamson 2019b;Bindi et al. 2017bBindi et al. , 2021. For instance, ground-motion amplification at a site with S2S s (f = 10Hz) = 1.5 * S2S (f = 10Hz) is 20 times larger than that at a site with S2S s (f = 10Hz) = −1.5 * S2S (f = 10Hz) . The large variability in site-response, and the consequently large S2S (the largest contributor to ), suggests that partially non-ergodic site-specific ground-motion predictions may soon become indispensable (Faccioli et al. 2015;Kotha et al. 2017;Rodriguez-Marek et al. 2013).
The next largest random variance is that of between-event variability quantified as 0 . Note that a part of the overall between-event (spatiotemporal) variability e is quantified in to the between-locality variability L2L ; as in e = √ 2 0 + 2 L2L . Apparently, variability of event-specific effects is the highest at f ≤ 0.5Hz . Seismic moment and moment-magnitude are the event-specific parameters estimated at these frequencies. For a GMM with M W (from EMEC catalog) as an explanatory variable, such large between-event variability at f ≤ 0.5Hz suggests large differences in observed ground-motions between events of identical M W . The most likely cause, to our understanding, is errors in M W in the dataset. A few studies (e.g. Holmgren and Atkinson 2018; Kuehn and Abrahamson 2017) demonstrated that M W uncertainty is a contributor to the between-event variability at long period spectral accelerations, which are analogues to low frequency Fourier amplitudes. Beyond f ≥ 1Hz the 0 values are lower and remain almost constant.
The counter-part of 0 is the between-locality variability L2L , which captures the average event-to-event variability when events are localized into seismotectonic zones ('TECTO'). L2L values are much smaller than 0 at f < 1Hz , and increase monotonically above 0 at f ≥ 5Hz . Assuming the errors in M W are contained in 0 , we discuss the physical meaning of L2L in subsequent sections.
Region-to-region variability of anelastic attenuation is quantified into c3 . Only the high frequency ground-motions are attenuated exponentially with distance. Therefore, c3 increases towards high frequencies in Fig. 6. The residual standard deviation , corrected for all parametric fixed-effects, and non-parametric region, locality and site-specific random-effects, remains almost constant across the frequency range. Figure 6 illustrates the significance of the chosen random-effect groups, and the frequency dependence of their random-variances. It is necessary to validate the randomeffects by correlating them to a physical parameter or an index. Since the random-effects are estimated only for the regions, localities, and sites (levels-as explained in Regionalisation Datasets) whose ground-motion data is used in GMM regression, new levels with no ground-motion data cannot benefit from the non-ergodic level-specific GMM predictions. However, correlating random-effects to a physical parameter or a geospatial index may allow, in a limited way, predicting region, locality, and site-specific adjustments for new levels outside the regressed dataset through the correlated parameter-with appropriate epistemic uncertainty.

Anelastic attenuation variability
Apparent anelastic attenuation of high frequency ground-motions, due to intrinsic scattering and absorption in the propagation medium, comes into play at intermediate to far source distances (e.g.R ≥ 80 km ). The coefficient c 3 in GMM median captures the average rate of exponential decay of ground-motion, while c 1 captures the linear decay. Substantial correlation between c 3 and c 1 estimates are to be expected because they together model the decay of ground-motions with distance (e.g. Abrahamson et al. 2014;Boore et al. 2014b;Campbell and Bozorgnia 2014). Therefore, it is more appropriate to refer to c 3 as a coefficient for apparent anelastic attenuation. δ c 3,r is meant to capture regional variability of this exponential decay. Figure 7 shows the frequency dependence of δ c 3,r for the 45 regions identified with subscript r . The region-to-region variability is largest at f ≥ 5Hz , reflecting the large c3 in the Fig. 6.
We note that, in this and all subsequent figures showing random-effect estimates and their rlmm weights, the brighter (yellow) colour identifies with unit-weight, and progressively darker colours have lower weights. Although the colour scale ranges from 0 to 1, there are seldom any regions, localities, events, and sites, with null weight (blue); which Regions with c 3,r (f ) beyond ±1.345 c3 (f ) are given a lower than a unit weight means, none of the data is discarded from the regression, but only down weighted from influencing the mixed-effect estimates.
Figure 8 maps the regional variability of δ c 3,r at f ≈ 0.3, 1, 3, 10Hz in the pan-European region; wherein the red coloured polygons ( δc 3,r > 0 ) represent regions with anelastic attenuation slower/weaker than the pan-European average, and the blue coloured polygons with δc 3,r < 0 are those regions attenuating faster/stronger than average. Sparsely sampled regions with large epistemic uncertainty on their δc 3,r are faintly coloured. A few interesting remarks on these maps: (1) Weatherill et al. (2020c) showed that regions with similar attenuation characteristics, as estimated in , are spatial clustered. Apparently, high seismicity regions (e.g. Italy and Greece), perhaps with their highly fractured top crustal layers, attenuate the high frequency ground-motions faster/stronger than lower seismicity regions (e.g. central Europe) the difference in δ c 3,r between these two regions is 0.2; which roughly translates to 10% larger ground-motion predictions at R = 80 km towards east than to west. In addition to attenuating the high frequency ground-motions faster than the pan-European average, there appears to be a frequency dependent contrast between these adjacent regions.
(3) Recent 3D shear wave velocity ( V s ) maps produced by Lu et al. (2018) reveal a relatively higher V s in the Appenines (west) compared to the Adriatic basin (east) at 10 km and 30 km crustal depths. If we assume that a higher V s means a higher crustal quality factor, these V s maps would imply a more efficient propagation in the crust, which translates to slower decay towards east as compared to the west. However, the spatial trends of δc 3,r indicate the contrary in this region. (4) Coincidentally, Apennines and region towards its west were reported to have shallower Moho (Grad et al. 2009) and significantly higher crustal temperatures due to submarine volcanic activity in the Tyrrhenian sea (Diaferia et al. 2019), compared to the Adriatic sea (east of Apennines); the latter phenomenon presumably better correlated to stronger attenuation of mechanical waves than shear-wave velocity alone. Inopportunely, the large uncertainties in estimates of these parameters (Moho depth and thermal gradient) at pan-European scale did not motivate a more quantitative evaluation. Therefore, we preferred a globally available geospatial index that is designed specifically for regionalisation of GMMs, i.e. the Activity Index in this study (5)  We note that changing the resolution or geometry of the regions may change the estimated spatial variability, and values of δc 3,r as well. However, given the current configuration, we seek physical features that may correlate (qualitatively) with δc 3,r . A recent study by Sahakian et al. (2019) using a large data set of small-magnitude earthquakes in Southern California suggested that crustal shear-wave velocity V s is only weakly correlated to anelastic attenuation, which also seems to be the case here with Central Italy (observation #2 above). Regional variability of anelastic attenuation may in fact be a combination of regional variability of crustal shear-wave velocity, crustal thermal gradient influencing the rigidity modulus of the crust, Moho depth, and other parameters that, at the time of this study, were not uniformly mappable across the pan-European region.
Activity Index (AIx) is a unique composite parameter that we preferred in this study. Activity Index is a data-driven continuous parameter inferred from a fuzzy combination of shear-wave velocity, seismic moment rate density, and crustal quality factors across the globe. A 0.5 0 gridded map of AIx was generated by Chen et al. (2018) for the sole purpose of regionalising GMMs or selecting suitable GMMs for a region with no region-specific ground-motion data. In this study, we extracted the AIx for every site location in the ESM dataset. A region with n sites will therefore have n values of AIx, which can serve as an epistemic uncertainty on the region-specific AIx. Figure 9 shows the loess fit [non-parametric moving average by Jacoby (2000)] between the δ c 3,r of the 45 regions and the AIx of sites located within each region. A strong negative correlation is evident at moderatehigh frequencies (bottom panels), where the regional variability estimated c3 is the largest.
The negative correlation between δ c 3,r and AIx in Fig. 9 suggests that regions with high seismic rate density, low shear-wave velocity, and low 1 Hz coda Q (therefore, high AIx) attenuate significantly faster than regions more likely to be cratonic (low AIx). Chen et al. (2018) indicate that the regional variability of AIx is dominated by regional variability of seismic moment rate density in active crustal regions ( AI ≥ 0.7 ), and to that of shear-wave velocity and 1 Hz code Q in relatively stable regions ( AI < 0.7 ). ESM dataset contains sites located in regions with 0.4 ≤ AI as seen in Fig. 9. The smooth transition of δ c 3,r between stable cratonic ( 0.4 ≤ AI < 0.7 ) to the more seismically active regions ( 0.7 ≤ AI ≤ 1 ) is an indication that it is a physically meaningful random-effect. However, these regionalisation models are different in nature: Activity Index is fully data-driven and gridded, while the regionalisation used in this GMM regression is based on expert elicitation and polygonised. In that sense, although there is a decent agreement, neither of the models may sufficiently replace the other. Figure 10 shows the ranges of AIx within each of the attenuation regions. The regions are ordered by decreasing δc 3,r (f ≈ 10Hz) from top to bottom. This figure illustrates the exclusivity of the two GMM regionalisation models. For instance, the two best-sampled regions, Northern and central Apennines W (West) and Northern and central Apennines E (East), despite their δ c 3,r (f ≈ 10Hz) values differing by 0.2 still have significant overlap of AIx ranges i.e. 0.68 ≤ AI ≤ 0.88 and 0.62 ≤ AI ≤ 0.84 , respectively. Meaning, datadriven AIx by itself may not resolve the differences between these two adjacent regions with contrasting attenuation characteristics as efficiently as the more subjective Basili et al.
(2019) regionalisation model. In lieu of more refined and unified regionalisation models, we foresee using both the models in tandem to explain and predict attenuation characteristics ( δc 3,r ) for regions outside the ESM dataset. A study in this direction is anticipated with the recently published RESIF-RAP dataset of ground-motions recorded by the French Fig. 9 c 3,r of the 45 regions versus Activity Index at site locations within each region, for f ≈ 0.3, 1, 3, 10Hz (clockwise from top-left to bottom-left). The blue lines are loess fits between the two parameters. Marker colors indicate the weight assigned to c 3,r of each region in the rlmm regression accelerometric network (Péquegnat et al. 2008) by Traversa et al. (2020), and other lowmoderate seismicity regions whose data are not integrated into ESM.

Tectonic locality variability
Seismic source variability is divided into two components in this GMM-FA: variability across tectonic localities producing the earthquakes N(0, L2L ) , and the locality corrected between-event variability N(0, 0 ) . Since events are exclusively nested in their respective tectonic localities, L2L l quantifies the average of the nested events' ground-motion  Figure 6 illustrates the frequency dependence of L2L and 0 . It appears that the two random-variances capture disjoint frequency dependent earthquake characteristics; where 0 > L2L at low-moderate frequencies, and vice versa at high frequencies. Figure 11 shows the L2L l (f = 0.186 − 15.051Hz) of the 55 tectonic localities in ESM dataset. Resembling Fig. 6, the scatter of L2L l significantly increases at f ≥ 5Hz in Fig. 11. The epistemic uncertainty of L2L l , i.e. the standard error SE( L2L l ) , are generally smaller than L2L . In this regard, dropping L2L from the aleatoric variability ( ) and using instead the L2L l ± SE( L2L l ) adjustments to regionalise the GMM predictions is recommended. A database of L2L l , SE( L2L l ) , and their rlmm weights indicating outliers can be provided on request for analyses and applications. For instance, Fig. 11 suggests that the number of detected outliers increases towards higher frequencies, along with L2L . A few of these outliers are also well-sampled localities with a low SE( L2L l ) ; implying a more source specific study could be worthwhile.  in observed ground-motions as governed by the predominant focal mechanisms in these localities. However, the within-locality diversity of focal mechanisms dissuaded us from this hypothesis.
It is important to note that the colour scale ranges in Fig. 12 are not frequency dependent. Despite, an interesting feature in Fig. 12 is the inversion of L2L l in the central Apennines from positive values at f ≈ 0.3Hz to negative values at f ≈ 10Hz (red to blue colour). However, it is inconclusive if the events in this region produced low frequency ground-motions stronger than pan-European average or if it is the inhomogeneity of M W estimates across the pan-European region that is being captured by the L2L l at f ≤ 0.3Hz .
We have already seen in Fig. 9 the clear correlation between c 3,r and the AIx values at station locations. Figure 13 is similar in description to Fig. 9, but instead of AIx at each station location within an attenuating region, we use the AIx at event locations within a tectonic locality. Although not presented here, we found no correlation between B 0 e,l vs AIx at any frequency (see supplement Fig.S9). This means to say that when B 0 e,l variability is large within a tectonic locality, the much larger localities composed of several 0.5 o grid cells with similar AIx values are unlikely to resolve event-specific differences. On the other hand, the size of tectonic localities is comparable to the size of regions with distinguishable AIx values (Fig. 4). As a result, Fig. 13 shows an interestingly strong relationship between L2L l and AIx (at event locations).
Up to f ≈ 1Hz , we observed no resolvable trends between L2L l and AIx. Moving towards higher frequencies, as L2L gains relevance, a significant negative correlation is observed. Essentially, the loess fits for f ≥ 2Hz suggest that the tectonic localities coinciding with regions with AI ≥ 0.7 are more likely to produce, on average, weaker high frequency ground-motions than those with AI < 0.7 . However, since AIx is a fuzzy combination of three physical parameters, it is not obvious which one is responsible for the negative correlation with L2L l (see supplement Fig.S5 for correlation with seismic moment rate density).

Fig.12
Tectonic localities and the L2L l for these locality polygons. Rred colored localities produced events capable of higher GM(f = 0.3,1, 3,10Hz) than the GMM median predictions, blue colored localities produced earthquakes weaker than dataset average, and grey colored localities are close to the dataset average. Localities with fewer ground-motion observations, thereby larger epistemic error on L2L l , are more transparent and appear fainter. Note that the color scale is the same in all the maps in order to emphasize the frequency dependence of spatial variability of event characteristics A classical hypothesis has been that events in stable continental regions produce stronger high frequency ground-motions than those in active crustal regions by virtue of their larger stress-drops, e.g. Bommer et al. (2015). In this case, stable continental regions are those with lower AIx and appear to coincide with tectonic localities with higher L2L l values in Fig. 13. Alternatively, the large L2L value at high frequencies could be from regional variability of a high frequency source parameter, e.g. the source ; which is the high frequency decay parameter of Brune's 2 source model (Bindi et al. 2019b). Through spectral decomposition of ground-motion observations, Bindi and Kotha (2020) estimated the source of several events in the ESM dataset, but the large variability of, and uncertainty in, deter us from associating these estimates with the already uncertain L2L l . Although not reported here, we do observe a reasonable positive correlation between source and AIx (see supplementary figure Fig.S6).
Through an exploratory analysis, we deduced a third hypothesis by comparing the three best-sampled localities in: central Apennines of Italy [(12 o E, 43  3 ). These three regions have similar seismic moment rate density, 1 Hz coda Q, and shear wave velocity variation at 175 km depth, and therefore, similar AIx values (see Fig. 4). These tectonic localities produced several M W ≥ 6 earthquakes as well. Bindi and Kotha (2020) showed that the stress-drop estimates for the large events in north-west Anatolia are a magnitude higher than those of the central Apennines; while those of north-east Anatolia were unconstrained due to lack of sufficient station coverage. Perrin et al. (2016) showed that the structurally immature north-west Anatolia produced large earthquakes with systematically lower rupture speed than those originating in the structurally mature north-east Anatolian segment. Chounet et al. (2018) followed up with a  Radiguet et al. (2009) connected the lower near-field ground-motions (e.g. peak ground acceleration) to structural maturity of fault systems. Based on these studies, we infer that the structurally immature fault segments produced earthquakes capable of stronger high frequency ground-motions. In this study, we concur with the hypothesis that the younger, westward growing north-west Anatolian fault system produced stronger earthquakes than the older, north-east Anatolian and central Apennine fault systems.
In summary, L2L l is introduced to capture partially, in a predictable way, the spatial variability of event dependent ground-motion variability. While the location corrected event-specific δ B 0 e,l retains a large party of event-to-event stress-drop variability as shown in Bindi et al. (2019b) (and in supplementary figure Fig.S8); L2L l captures the average regional trends, which in-turn appear to be controlled by macroscopic fault system characteristics-and not necessarily by the stress-drop (see supplementary Fig.S7). Given its strong correlation with Activity Index, we hypothesize that the trends shown in Fig. 13 may assist in guessing an appropriate L2L l for low-moderate seismicity regions (e.g. France and Germany) whose ground-motion data is absent in ESM; i.e. more positive L2L l in regions with lower AIx values (in Fig. 4).

Event variability
Spectral decomposition of ESM data by Bindi and Kotha (2020) and a few earlier studies demonstrated a strong correlation between the traditional between-event random-effect δ B e and stress-drop. We found a similarly strong correlation between the L2L l corrected between-event random-effect δ B 0 e,l and stress-drop here as well (supplement Fig.S8)-but no correlation with AIx (supplement Fig.S9). Stress-drop is not quite a spatiotemporally predictable parameter, and can be quite variable within any tectonic locality. However, it is commonly acknowledged that ground-motion variability among large magnitude events is smaller than small events. This has been verified by Youngs et al. (1995) as a magnitude dependence of ground-motion variability-irrespective of sample size. Several subsequent GMM-SAs have since featured the so-called heteroscedastic between-event variance ( e ) models.
Magnitude-dependent heteroscedasticity of e makes physical sense because larger ruptures tend to occur more-or-less on the same area of the fault plane, and periodically release a similar amount of accumulated elastic energy or stress. However, if the large events of similar magnitude originate in fault systems (or tectonic localities) with very different stress accumulation and release rates, the variability of stress-drop (and the δ B e ) could be as large as that of smaller events. This means that, heteroscedasticity of e in a global dataset consisting of events from Italy, Turkey, Japan, China, etc., (e.g. NGA-West2) may not be as significant as that of events within a tectonic locality within any of these geopolitical boundaries. In our case, we remove the locality-to-locality variability of event characteristics through the random-effect N 0, L2L , leaving us a large sample of N 0, 0 that can be examined for a generic heteroscedasticity with M W , independent of events' localities. Based on this understanding, we have modelled a frequency and M W dependent heteroscedastic 0,M W , as illustrated in Fig. 14 In the top panel of Fig. 14

Site-response variability
The next, and by far the largest, random variability is the site-response component N 0, S2S . Figure 6 shows that S2S is consistently the largest random-variance at all frequencies, when measured as site-to-site ground-motion variability across the 1622 sites in ESM dataset. Site-specific ground-motion predictions are more accurate and precise (smaller aleatory variability) than ergodic or region-specific predictions, but are only possible when site-specific ground-motion data are available. In absence of site-specific observations, site-response proxies are necessary to extrapolate spatially the site-specific terms S2S s (Kotha et al. 2018;Weatherill et al. 2020b). For such studies, we disseminate a database of S2S s (f = 0.186 − 15.051Hz) derived from the ESM dataset. Figure 15 shows the relation of S2S s (f ≈ 0.3,1, 3,10Hz) with measured V s30 (left column), and topographic slope (right column). While only 400 sites are provided with measured V s30 in the ESM dataset, topographic slope derived from digital elevation models provided by Shuttle Radar Topography Mission (Jarvis et al. 2008) is available at all site locations. In Fig. 15, each marker corresponds to a site with an estimated S2S s (irrespective of number of records), color coded to their rlmm weight. The error-bars (red) illustrate the mean and MAD within each Eurocode 8 site-class. The blue curve represents the proposed linear soil-response model, derived as a quadratic function of V s30 or slope.
Looking at the δS2S s trends with V s30 , it is evident that sites in EC8 class D ( V s30 ≤ 180m∕s) and class C ( 180 < V s30 ≤ 360m∕s) significantly amplify low frequency ground-motions compared to the average of the dataset. In addition, the within class site-to-site variability (error-bar) is apparently larger than that of EC8 class B ( 360 < V s30 < 800m∕s) and A ( 800m∕s < V s30 ) . The site-response of class A sites does not appear to scale with V s30 at low frequencies. The converse is observed at high frequencies, wherein, class A and B sites exhibit higher site-to-site variability, and scale steeply with V s30 . Interestingly, the flattening of f ≈ 3, 10Hz site-response function (blue curve) towards lower V s30 suggests that class C and D sites exhibit high frequency amplifications lower than that a linear V s30 scaling function would predict. Although this is likely from nonlinear behavior of soft soils when subjected to strong input ground-motion, which is not accounted for in our GMM-FA, a further record-to-record investigation is necessary. Alternatively, one approach could be to analyse the left-over residuals at these sites for nonlinear soil response, as in Loviknes et al. (2021).
For now, we only provide the site-response ( SR V s30 , SR slope ) as quadratic functions of V s30 and slope, along with a database of site-response terms (Eqs. 6 and 7). Here, the measured V S30 is in m∕s and slope in m∕m , the regression coefficients g 0 , g 1 , g 2 are different for SR V s30 and SR slope , and change with frequency. Robust linear fits using an M estimator (Venables and Ripley 2002), at each of 21 frequencies between f = 0.186 − 15.051Hz , PGA(T = 0s) and PGV , are derived for δ S2S s ∼ V S30 correlation of 400 sites with measured V S30 available, and S2S s ∼ slope of the 1622 sites with slope.The residuals from S2S s ∼ V S30 (measured) and Slope is a poorer explanatory parameter of site-response than measured V s30 , but is relatively easier to obtain at any site location. The left panel of Fig. 16 shows the reduction in S2S from using V s30 and slope as site-response proxies in the GMM-FA. Although the (6) SR V s30 = g 0 + g 1 .ln V S30 800 + g 2 . ln V S30 800 2 (7) SR slope = g 0 + g 1 .ln slope 0.1 + g 2 . ln slope 0.1 reduction at lower frequencies is substantial, an explanation for the large variability of high frequency site-response is still evasive. A few studies, e.g. Hollender et al. (2020), Mucciarelli et al. (2017), and Stewart (2000), discussed the impact of instrument housing and soil-structure interaction on ground-motion recordings; which might explain the relatively large site-to-site response variability at high frequencies, even among sites with similar V s30 , compared to that at lower frequencies. Also shown in this figure, in the right panel, is the reduction in S2S from using V s30 and slope as site-response proxies in the GMM-SA of . While the variances at f ≤ 3Hz and T ≥ 0.3s are comparable, the ground-motion variability at short periods (e.g. T < 0.3s ) captured by GMM-SA is a severe underestimate of the actual variability of high frequency Fourier amplitudes (e.g. f > 3Hz ). Differences as such were also observed while comparing the between-locality and between-region random-variances L2L and c3 , respectively. This observation reinforces the importance of GMM-FAs over GMM-SAs in capturing the true variability of shaking for the various practical applications we highlighted in the Introduction section, and also our motivation to evaluate the physical meaning of between-region δ c 3,r (f ) and between-locality δ L2L l (f ) random-effects in Fourier domain instead of in response spectral domains-especially at high frequencies, where their variances are largest (see Fig. 6). In case of the between-locality and between-region random-effect groups, the correlations L2L l ∼ AI in Fig. 13 and c 3,r ∼ AI in Fig. 9 explain to a good extent the large L2L and c3 values at high frequencies. But the large S2S at high frequencies could not be resolved using Activity Index. As stated above, the variability of site-response at high frequencies could be from instrument housing type or more physical; such as the influence of deeper soil layers, plasticity of soil column, weathering of bedrock, seasonal changes in shear-wave velocity in shallow layers (Alexis et al. 2021;Roumelioti et al. 2020), etc. However, in-lieu of resolving such issues, and given the practical importance of high frequency site-response, the most efficient solution yet is to collect more site-specific data (Bard et al. 2019).
Apart from the random-effects analyses presented in the previous section, customary checks for magnitude and distance dependencies showed no peculiarities-therefore, we skip presenting them here. Finally, the aleatory 'left-over' residuals ( defined in Functional Form section) can be examined for evidences of some secondary phenomenon which are record-specific and are not captured by the mixed-effects such as: shear-wave radiation pattern in near-source distances , SmS reflections at intermediateand far-source distances (Bindi and Kotha 2020;Bindi et al. 2006), nonlinear soil response (Loviknes et al. 2021), etc. However, each of these phenomena require information that are not currently available in the ESM dataset (e.g. centroid-moment-tensor solutions and crustal shear-wave velocity profiles). For sake of brevity of the manuscript, we anticipate in-depth residual analyses to follow-up studies.

Conclusion
We presented a ground-motion model capable of predicting the Effective Amplitude Spectra of horizontal Fourier ground-motions in the frequency range f = 0.186 − 15.051Hz . This model is developed as complementary to the regionally adaptable ground-motion model for response spectra predictions by . Hence the applicability range and application strategy are the same as those described in  and Weatherill et al. (2020c). However, unlike Bayless and Abrahamson (2019b), we have not smoothed the fixed-effects coefficients to extend the applicability range of our model. Therefore, we advise against extrapolating the model to higher or lower frequencies, and higher or lower M W . The random-effect and residual variances of our model are large, but are still comparable to the recent models by Stafford (2017) and Bora et al. (2019). Interestingly, all these models' variances show striking similarity in the rapid increase of estimated ground-motion variability at the high frequencies-a feature that is completely masked in the variance estimates of response spectra ground-motion model.
The three types of partial non-ergodicity are captured by three random-effect groups: between-locality, between-region, and between-site. Random-variances of all three groups increase rapidly at f > 3Hz . This intriguing feature was not apparent in the random-variances of  at periods T < 0.3s . Therefore, we found it more appropriate to evaluate the physical meaning of between-locality and between-region random-effects in Fourier domain, which is more closely related to actual physics of seismic ground-motion than the response spectral domain. We have attempted correlating these random-effects to a suite geological and geophysical parameters, but have not reported them here for various reasons-and for the sake of brevity. The most convincing correlation, however, was with Activeness or Activity Index of Chen et al. (2018). This fuzzy logic based data-driven geospatial index is able to resolve the high between-locality and between-region randomvariances at high frequencies. Our interpretation is that regions with high Activity Index (high seismic moment rate density, low 1 Hz coda Q, low crustal shear-wave velocity at 175 km) attenuate the high frequencies much faster/stronger than regions with low Activity Index (low seismic moment rate density, high 1 Hz coda Q, high crustal shear-wave velocity at 175 km). Similarly, tectonic localities characterised by low Activity Index (e.g. moderate seismicity regions like France) produced more energetic earthquakes, on average, compared to localities with high Activity Index (e.g. high seismicity regions like central Italy). Activity Index as a composite parameter correlates better to the spatial variability of between-locality and between-region random-effects, than its components individually.
Nevertheless, we find a strong potential in Activity Index as regionalisation parameter in our future ground-motion models. There are several directions we foresee: (1) using Activity Index as regionalising fixed-effects in ground-motion models, (2) using Activity Index to adapt the between-locality and between-region random-effects to sparsely sampled low seismicity areas, and (3) upgrading the Activity Index with new criteria based on more recent Moho depth, lithospheric temperature, and crustal shear-wave velocity, etc. The direction (2) is already on course using the recently published ground-motion dataset for France ) and a weak motion dataset for central and north Europe .
We strongly encourage the various ground-motion seismology communities to collaborate in testing our inferences, and in adapting our model to their datasets; especially those not integrated into ESM.
Despite our efforts however, the very high site-response variability at high frequencies remains a challenge. Site-specific ground-motion prediction using the site-specific randomeffects seems to be the most reasonable approach yet-if we were to avoid the very large ergodic aleatory variability in seismic hazard and risk assessments. It is however quite encouraging to see several research groups attempting to investigate and parametrize site-specific characteristics (e.g. shear-wave velocity profiles, sediment thicknesses, topography, etc.) for use in ground-motion models. To aid these efforts and many other practical applications, we consider ground-motion prediction models in Fourier domain are extremely useful; especially in quantifying the high frequency ground-motion variabilities.
One important aspect we have skipped discussing in this study, for the sake of brevity, is the analyses of aleatory residuals. These residuals are record-specific and also contain useful information on secondary phenomena, such as the shear-wave radiation pattern at near-source distances, arrival of secondary phases at intermediate-far source distances, nonlinear response of soft soils, etc. All of these phenomena can be included in a ground-motion model for more realistic (e.g. anisotropic) predictions. Of course, to quantify these effects would require much more information (e.g. centroid moment tensor solutions and incidence angles) in groundmotion datasets. Currently, we are attempting to populate the ESM dataset with as much of relevant information as possible, in order to facilitate further realism of ground-motion models. Based on the experience of developing, and the utility, of this Fourier amplitude groundmotion model, we plan to pursue such improvements in the Fourier domain rather than in the response spectra domain.