A Non-ergodic Effective Amplitude Ground-Motion Model for California

A new non-ergodic ground-motion model (GMM) for effective amplitude spectral ($EAS$) values for California is presented in this study. $EAS$, which is defined in Goulet et al. (2018), is a smoothed rotation-independent Fourier amplitude spectrum of the two horizontal components of an acceleration time history. The main motivation for developing a non-ergodic $EAS$ GMM, rather than a spectral acceleration GMM, is that the scaling of $EAS$ does not depend on spectral shape, and therefore, the more frequent small magnitude events can be used in the estimation of the non-ergodic terms. The model is developed using the California subset of the NGAWest2 dataset Ancheta et al. (2013). The Bayless and Abrahamson (2019b) (BA18) ergodic $EAS$ GMM was used as backbone to constrain the average source, path, and site scaling. The non-ergodic GMM is formulated as a Bayesian hierarchical model: the non-ergodic source and site terms are modeled as spatially varying coefficients following the approach of Landwehr et al. (2016), and the non-ergodic path effects are captured by the cell-specific anelastic attenuation attenuation following the approach of Dawood and Rodriguez-Marek (2013). Close to stations and past events, the mean values of the non-ergodic terms deviate from zero to capture the systematic effects and their epistemic uncertainty is small. In areas with sparse data, the epistemic uncertainty of the non-ergodic terms is large, as the systematic effects cannot be determined. The non-ergodic total aleatory standard deviation is approximately $30$ to $40\%$ smaller than the total aleatory standard deviation of BA18. This reduction in the aleatory variability has a significant impact on hazard calculations at large return periods. The epistemic uncertainty of the ground motion predictions is small in areas close to stations and past events.


Introduction
Probabilistic seismic hazard analysis (PSHA) estimates the annual rate of exceeding a groundmotion parameter at a site of interest. It typically breaks the problem in two parts: the seismic which, when used with a Brune (1970Brune ( , 1971 omega-squared FAS model, gives predictions that are consistent with the NGAWest2 data set. Converting the non-ergodic EAS GMM into an equivalent P SA GMM is not in the scope of this paper; it is covered in the second part of this study. In this approach, the median estimate of the non-ergodic P SA GMM is based on the median estimate of the EAS GMM and RVT, while the non-ergodic aleatory variability is estimated empirically using the ground-motion observations.

Ground-Motion Data
The NGAWest2 data-base (Ancheta et al., 2014) includes more than 21000 recordings covering a magnitude range from 3 to 7.9 and a closest distance range (R rup ) from 0.05 to 1500 km. For this study, a subset of this data-base was used which included the earthquakes and stations located in California, western Nevada, and northern Mexico. The recordings that were identified as questionable in Abrahamson et al. (2014) were not used in the regressions. The final data-set contains 8916 recordings from 188 earthquakes recorded at 1497 stations. Figure 1 shows the magnitude-distance distribution of the data and the number of recordings per frequency. The earthquake magnitudes in the selected data range from 3.1 to 7.3 and the distances range from 0.1 to 300 km. The usable frequency range of the majority of the recordings spans from 1 and 10Hz. Figure 2 shows the spatial distribution of the data: most of the stations are located in Los Angeles, Bay Area, and San Diego metropolitan areas, whereas in less populated areas, such as northern-eastern California the spatial density of the stations is lower. This difference in the density of stations has a large impact on the distribution of epistemic uncertainty of the systematic site effects: the epistemic uncertainty is higher is areas with lower station density making a case for expanding the strong-motion networks in these regions near critical infrastructure. In this study, the location of the earthquakes and stations is defined in kilometers in UTM coordinates; the longitude/latitude coordinates were transformed to UTM coordinates using the WGS84 reference ellipsoid and 11S UTM zone.

Non-ergodic Model Development
Rather than developing the non-ergodic GMM from scratch, the Bayless and Abrahamson (2019b) (BA18) ergodic EAS GMM was used as a backbone model to describe the average ground-motion scaling. The main reasons for that decision were that: i) the local data may non be adequate to estimate the scaling of complex terms, and ii) the adoption of the constraints built into BA18 ensures that it extrapolates properly outside the range of data.

Functional Form
Typically, a GMM (Equation (1)) is composed of the median model (y med ) and the aleatory variability. The median model describes the average scaling of a ground-motion parameter with magnitude, distance, site conditions, etc. The aleatory variability describes the misfit between a ground-motion observation and y med which is related to the true or apparent (due to simplified modeling) stochastic behavior of the source, site, and path. It is typically expressed as the sum of the between-event (δB e ) and within-event (δW e,s ) terms. δB e,s describes average shift of the ground motion for an earthquake, e, from y med , and δW e,s describes the variability of the ground motion at site, s from the median ground motion of that earthquake. ln(EAS) = y med + δB e + δW e,s The median EAS model of Bayless and Abrahamson (2019b) is formulated as: where f M is the magnitude scaling, f R is the path scaling, f S is the site scaling, f ztor is the top of rupture scaling, f N M is the normal-fault scaling, and f Z1 is the basin thickness scaling. The f M , f R and f S terms were modified to include the additional non-ergodic terms, whereas f ztor , f N M and f Z1 were kept fixed. The different terms and coefficients are described in detail in Bayless and Abrahamson (2019b) and Chiou and Youngs (2014). The f M in the non-ergodic model is: in which c 1 is the intercept of the model, c 2 controls the magnitude scaling for large magnitudes where the corner frequency is smaller than the frequency of interest, (c 2 − c 3 )/c n describes the magnitude scaling at small magnitudes, where the corner-frequency is larger than the frequency of interest, c n controls the width of the transition zone between small and large magnitude earthquakes, and c M is the magnitude at the center of the transition zone. A sketch of the magnitude scaling is provided in the electronic supplement. The additional coefficients in the non-ergodic model are δc 0 , δc 0,e , and δc 1,e . δc 0 is added to allow a small constant shift in the non-ergodic model due the difference in the weighting of residuals between the ergodic and non-ergodic GMMs. δc 1,e is a function of the earthquake coordinates, x e , and is intended to capture repeatable non-ergodic effects related to the source location. For instance, regions with a higher than average median stress drop will have higher than average median ground-motions resulting in a positive δc 1,e . δc 0,e is a regional term that corrects a potential bias in the magnitude estimation of small earthquakes between northern and southern California. δc 0,e is applied to earthquakes less than magnitude 5 and frequencies less than 5Hz. The vertices which define the polygons for the northern and southern CA subregions are summarized in Table  1, the border between northern and southern CA corresponds approximately to the boundary between the Northern California Seismic Network (NCSN) and Southern California Seismic Network (SCSN).
The δc 0,e term addresses potential inconsistencies in the reported magnitudes of small earthquakes between northern and southern CA because the NCSN/SCSN boundary was also used in the NGAWest2 dataset Ancheta et al. (2013) for the selecting the catalog to use for the source parameters (magnitude, hypocenter location, etc.) for small-to-moderate (less than M 5) earthquakes. If a small earthquake was located north of the boundary, the NCSN catalog was used for the source parameters, whereas if a small earthquake was located south of the boundary, the SCSN/CIT catalog was used for the source parameters. Preliminary regressions that did not include the δc 0,e term showed significant differences in δc 1,e between northern and southern CA at small frequencies. It was found that these differences were caused by a noticeable bias in the total residuals of BA18 between northern and southern CA for small magnitude events. Chiou et al. (2010) made a similar observation for the total residuals of CY08: they found a regional difference in median ground-motion amplitude between north and south CA earthquakes which was more noticeable at small magnitude events. The results in section 4 show that the difference in the median ground-motion of small events between northern and southern CA is approximately 0.4 in natural-log units at frequencies between 0.2 and 5.0 Hz. At frequencies well below the corner frequency, a unit change in magnitude leads to a factor of 32 change in the amplitude of the ground motion; thus, a 0.4 natural-log difference in ground motion can be caused by a 0.11 bias in the magnitude estimation between the NCSN and SCSN networks, which could be due to different assumptions in the velocity models or other input parameters used to determine the magnitude of an event. Given the assumption of magnitude bias, the δc 0,e term is not applied to events less than M 5. The source parameters of the moderate-to-large events in NGAWest2 were estimated with fault inversions using data from global networks, and therefore, the potential magnitude bias between the NCSN and SCSN networks is not applicable to events larger than M 5. However,  this issue should be further investigated in future studies to find the exact cause of this regional difference.

Path Scaling
The functional form for f P is: whereR = R 2 rup + 50 2 . The coefficient c 4 , which corresponds to the geometrical attenuation, manages how the ground motion attenuates at short distances. The coefficient c 5 describes the short-distance saturation, this term increases the effective distance for large magnitudes to capture the finite-fault effects (i.e. as the earthquake magnitude increases, the size of the rupture increases resulting in more seismic waves coming from more distant segments of the rupture leading to a larger effective rupture distance). Coefficients c 4 and c 6 control the magnitude saturation at short distances. Full saturation at zero distance is achieved when c2 = −c 4 c 6 is satisfied, non full saturation (i.e. positive magnitude scaling) is achieved when c 2 > −c 4 c 6 . At distances greater than 50km, the term (−0.5 − c 4 ) cancels the empirically estimated geometrical attenuation and fixes it to 0.5 which is the theoretical geometrical attenuation of surface waves. To maintain proper distance scaling and magnitude saturation in the non-ergodic model, all the aforementioned coefficients were fixed to their BA18 values. The non-ergodic distance scaling is captured with the cell-specific anelastic attenuation.
The anelastic attenuation is modeled with the cell-specific anelastic approach, first proposed by Dawood and Rodriguez-Marek (2013) and then extended by Kuehn et al. (2019) and Abrahamson et al. (2019). In this method, the states of California and Nevada are broken into 25 × 25km cells and, for each record, the ray path which connects the site ( x s ) to the closest point on the rupture ( x cls ) is broken into cell-path segments (∆R i ) which are lengths of the ray within each cell. For each record, the sum of cell-path segment lengths Nc i=0 ∆R i , is equal to R rup . The cell-specific anelastic attenuation is modeled by ∆ c ca,p · ∆ R where c ca,p is vector containing the attenuation coefficients of all the cells. In this GMM, c ca,p is modeled so that, in areas away from past paths it reverts to c 7 which is the anelastic attenuation coefficient in BA18, making the anelastic attenuation of the non-ergodic model equal to the anelastic attenuation of BA18 ( c ca,p · ∆ R = c 7 R rup ); while in areas that are covered by the paths in NGAWest2 dataset, c ca,p deviates from c 7 to capture the regional attenuation. Figure 3 shows the cells and the path coverage in the selected subset of the NGA-West2 dataset, as well as, the number of paths per cell.

Site Scaling
The functional form of the f S model is: The ergodic components of the site term are: c 8 which controls the V S30 scaling of the ground motion, and f N L which is the non-linear site amplification term. The non-ergodic effects related to the site are expressed by the δc 1a,s and δc 1b,s coefficients. The station constant, δc 1a,s , which has a finite correlation length, describes the broader adjustments to the backbone model to express the regional site effects. δc 1b,s has a zero correlation length and acts on top of δc 1a,s to describe the site specific adjustments. Coefficients with a finite correlation length vary continuously across the domain of interest, whereas coefficients with zero correlation length vary independently from location to location.
The remaining terms f ztor , f N M and f Z1 where kept as they are in the ergodic GM M .

Formulation of spatially varying coefficient model
The non-ergodic terms, cell-specific coefficients, and aleatory terms, hereafter collectively called model parameters ( θ), were estimated by describing the GMM as a hierarchical Bayesian model using the computer software STAN (Stan Development Team, 2019). In Bayesian statistics, the posterior distribution of the parameters is proportional to the likelihood times the prior distribution of the parameters: The prior distributions are the distributions that the model parameters are assumed to follow in the absence of data; the likelihood function, in general terms, is the probability of observing  the data given the model parameters; and the posterior distributions are the model-parameter distributions informed by the data. The likelihood can be estimated from the density function of the ground motion: where f (x, θ) is the functional form for the median non-ergodic ground-motion: It is equal to the ergodic backbone model without the effect of anelastic attenuation f erg (M, R rup , V S30 , ...)− c 7 R rup ), plus the non-ergodic spatially varying constants that have been described the previous section (δc i ), and the cell-specific anelastic attenuation c ca,p · ∆ R. The model is called hierarchical because the prior distributions are defined in multiple levels. At the lower level, θ follow some prior distributions, which are defined in terms of a different set of parameters, hereafter called hyper-parameters θ hyp , which, in turn, either follow some other prior distributions, or they are fixed. In this study, the non-ergodic ergodic regression was performed in two phases: in the first phase, which included a smaller number of frequencies, θ hyp were defined by their own prior distributions that are described later in this section, whereas, in the second phase, most of θ hyp were fixed to their smoothed values, estimated from first phase, and the remaining hyperparameters were left free to follow the same prior distributions as in the first phase. The main reasons for fixing θ hyp in the second phase were to ensure that there are no abrupt changes in the non-ergodic terms between frequencies, to constrain the model to a more physical behavior, and to reduce the computational cost of the second phase which included more frequencies. Table  2 summarizes the parameters that were classified as θ and θ hyp ; the parameters composing θ have been defined in Section 3.1, the hyper-parameters composing θ hyp have defined later in this section. Table 3 summarizes the θ hyp that were free at each phase. If θ hyp is free, the prior distribution of a model parameter, θ i , can be explicitly defined in terms of θ hyp as follows: More specifically, the δc 0 constant has a normal prior distribution with a zero mean and a 0.1 standard deviation: The mean is set to zero because in absence of data, there should be no shift between the ergodic and non-ergodic GMM. The standard deviation is small because any constant shift informed by the regional data or the re-weighting of the residuals is expected to be small. For earthquakes with magnitudes less than M 5 and frequencies less than 5 Hz, δc 0,e follow a normal prior distribution with a zero mean and a 0.2 standard deviation. Preliminary analyses, which did not include δc 0,e , had a 0.2 to 0.4 regional difference in δc 1,e between northern and southern California, which have a 16 and 3% probability of being exceeded with the selected standard deviation. Therefore, the posterior distribution of δc 0,e will deviate from zero to reach a similar range only if there is significant support by the data; otherwise, δc 0,e will stay close to zero implying no systematic difference between northern and southern CA at small magnitude events.
The non-ergodic constants δc 1,e ( x e ) and δc 1a,s ( x s ) follow multivariate normal prior distributions with zero mean and Matern (negative exponential) covariance functions (κ). κ imposes the spatial correlation on δc 1,e and δc 1a,s : it ensures that the values of δc 1,e and δc 1a,s will be similar for earthquakes or sites in close proximity and that δc 1,e and δc 1a,s would vary continuously from location to location.
The covariance function between a pair of earthquakes for κ( x e , x e ) or between a pair of stations for κ( x s , x s ) is defined in equation (13); x and x are the coordinates of the two earthquakes or sites depending on the coefficient, ω i is the standard deviation, and i is the correlation length. ω i controls the variability of the non-ergodic coefficients, that is, how much the values of the coefficients could vary between locations that are far from each other. i governs the length scale of the spatial variation of δc i ; increasing i makes δc i to vary more gradually with distance.
Both 1,e and 1a,s have inverse gamma prior distributions with distribution parameters α and β equal to 2 and 50 which corresponds to a mode and mean of 16.7 and 50 km, respectively. E a r t h q u a k e E n g i n e e r i n g physical meaning. The 1,e and 1a,s correlation lengths are expected to be around 10 to 50 km, where the inverse gamma distribution has most of its mass, but larger values are also possible, if they are supported by the data, due to its exponential tail.
The prior distribution of ω 1,e and ω 1a,s is an exponential distribution with a rate of 20 These prior distributions were chosen for ω 1,e and ω 1a,s to penalize unnecessary model complexity (Simpson et al., 2017). A ω i equal to zero implies no variability for δc i , meaning that there are no systematic effects related to that parameter. In an exponential distribution, most of the mass is near zero, which allows δc i to deviate from zero to capture systematic effects related to that parameter only if there is significant support by the ground-motion observations. For the same reason, exponential prior distributions were used in Kuehn et al. (2020) to model the standard deviations of the regional terms in the KBCG20 partially non-ergodic subduction-zone GMM. The site-specific adjustment, δc 1b,s follows a normal distribution with a zero mean and a ω 1b,s standard deviation: δc 1b,s is a function of the site location, the same adjustment is applied to all ground motions recorded at the same station.
The prior distribution for ω 1b,s is a log-normal distribution with a logmean of −0.8 and a standard deviation of 0.3 natural log units: This prior distribution has a median value of 0.45, and a 16 th and 84 th percentile of 0.33 and 0.6, respectively. Bayless and Abrahamson (2019b) found φ S2S to range from 0.4 to 0.6 which is consistent with the prior distribution for ω 1a,s . The prior distribution of c ca,p is a multivariate normal distribution with an upper truncation limit at zero: where µ ca,p is the mean of the distribution, and κ ca,p is the covariance function. To ensure the physical extrapolation of the GMM, c ca,p is limited to be less or equal to zero, which is satisfied by setting the upper limit of the normal distribution at zero (T (, 0)). Two key differences from the Abrahamson et al. (2019) approach when modeling the cells are the different mean and the different covariance function of the prior distribution. In this model, the mean of the prior is equal to the value of the anelastic-attenuation coefficient in Bayless and Abrahamson (2019b) (µ ca,p = c 7 BA18 ); thus, in areas with sparse data, the non-ergodic attenuation goes back to the ergodic attenuation to ensure reasonable extrapolation at large distances. This decision was made because the local data may not be sufficient to estimate both the median shift and the spatial variability of the anelastic attenuation. The covariance function (equation (19)) is the sum of a Matern kernel scaled by ω 2 ca1,p and a diagonal kernel scaled by ω 2 ca2,p . The Matern kernel controls the underlining continuous variation of anelastic attenuation over large areas, whereas, the diagonal kernel allows for some independence in the attenuation from cell to cell. ω ca1,p controls the size of the underling variability of c ca,p over large distances, the correlation length ca1,p controls how fast the underling component of c ca,p varies with distance, and ω ca2,p controls the size of the independent variability. The location of each cell is defined in terms of its midpoint coordinates, c c ; the distance between the cells is calculated by the L2 norm of the vector between midpoints of the cells.
ca1,p has an inverse gamma distribution with the same parameters as the prior distributions for 1e and 1a,s . ω ca1,p and ω ca2,p have an exponential prior distribution with the same parameters as the prior distributions for ω 1,e and ω 1a,s .
The non-ergodic within-event residuals, δW 0 e,s , follow a normal distribution with a zero mean and φ 0 standard deviation.
The prior distribution of φ 0 is a log-normal distribution with a logmean of −1.3 and a standard deviation of 0.3 natural log units (equation (22)). This set of parameters leads to a prior mean of 0.27 and a 16 th to 84 th percentile range of 0.2 to 0.37. BA18 is an ergodic model, and so an estimate of φ 0 is not available to inform the prior distribution of φ 0 of this model. However, φ SS , which is available in BA18, is about 0.4 for most frequencies, and because φ 0 is smaller than φ SS by definition (Al Atik et al., 2010), the range of the prior distribution is reasonable.
The non-ergodic between-event residuals, δB 0 e , follow a normal distribution: with a zero mean and τ 0 standard deviation. δB 0 e is a function of the earthquake id, e; that is, the same δB 0 e is applied to all recordings of the same earthquake. The prior distribution of τ 0 is a log-normal distribution with a −1.0 logmean and 0.3 log-standard deviation.
The τ 0 distribution parameters are judged to be reasonable because the mean and 16 th and 84 th percentiles (0.38, 0.27 and 0.50) are in agreement with other non-ergodic studies where the total non-ergodic standard deviation ( φ 2 0 + τ 2 0 ) ranges from 0.36 to 0.55.

Predictive distributions of coefficients at new locations
The non-ergodic coefficients can be estimated at new locations ( x * ) by conditioning them on the non-ergodic coefficients at the existing locations ( x); that is, the location of stations or past events depending on the coefficient. Since all non-ergodic coefficients follow multivariate normal distributions, for known values of the non-ergodic coefficients ( δc i ) at the x, the non-ergodic coefficients at x * also follow multivariate normal distributions with the mean and covariance matrix (Rasmussen and Williams, 2006;Landwehr et al., 2016): where µ δc * i |δc i mean of non-ergodic ergodic coefficients at x conditioned on δc i , Ψ δc * i |δc i is the covariance of non-ergodic coefficients at x conditioned on δc i , K is the covariance between the non-ergodic coefficients at the existing locations (K i = κ i ( x, x)), k is the covariance between the non-ergodic coefficients at the existing and new locations (k i = κ i ( x, x * )), and K * is the covariance between the non ergodic coefficients at the new locations (K * i = κ i ( x * , x * )). However, the non-ergodic coefficients at x are not perfectly known. There is some epistemic uncertainty associated with δc i which is quantified by their posterior distribution. To simplify the calculations and obtain a closed-form solution, δc i is assumed to follow a multivariate normal posterior distribution ( δc i ∼ N ( µ δc i , Ψ δc i )); µ δc i is the posterior mean, and Ψ δc i is a diagonal matrix with the posterior variances across the diagonal. We consider this assumption to be reasonable because all δc i have multivariate normal prior distributions. If all of the hyper-parameters were fixed, the posterior distributions of δc i would indeed be multivariate normal distributions, but because some of the hyper-parameters are free, the posterior distributions of δc i may slightly deviate from the assumption. The epistemic uncertainty of δc i can be included in the δc * i predictions by using the marginal distribution of δc * i : Due to the previous assumption, p(δc * i ) is also a multivariate normal distribution with mean and covariance matrix given in Equations (28) and (29), respectively (Bishop, 2006).

Inter-frequency Correlation
The main motivation behind the development of this non-ergodic EAS GMM is to use it with RVT to create an equivalent non-ergodic P SA GMM; in doing this, it is important to capture the inter-frequency correlation of the non-ergodic terms, otherwise, as it was demonstrated by Bayless and Abrahamson (2018), the variability of the P SA values is underestimated. The correlation coefficient (ρ) is a measure of the linear relationship of two random variables X 1 and X 2 . A ρ that is equal to one implies that X 2 can be perfectly defined as a linear function of X 1 , and vice versa; a zero ρ implies that the two random variables are uncorrelated. In ground-motion studies, the inter-frequency correlation coefficient is a measure of the width of the peaks and troughs of a P SA or EAS spectrum: the stronger the correlation of the amplitudes between frequencies, the wider the peaks and troughs of the spectra will be. The correlation coefficient for a non-ergodic term δc i , at frequencies f 1 and f 2 , is defined as: where cov is the covariance of δc i at the two frequencies, and σ i is the standard deviation of δc i . ρ can be determined from the data using the maximum likelihood estimator (Kutner et al., 2005): where n is the number of observations, δc i,j is the j th sample of δc i , and δc i is the mean value of δc i . For a large number of samples (n > 25), ρ can be transformed into a random variable z that follows a normal distribution with equation (32) (Kutner et al., 2005); the standard deviation of z is given in equation (33).
The same functional form that was used to model the correlation of the total EAS residuals in Bayless and Abrahamson (2019a) was used here to fit the empirical correlations of the non ergodic terms: A, B, C, D are the model parameters, and f r is the absolute value of the natural log of the ratio of the two frequencies. This functional form allows for a two-term exponential decay as a function of the logarithm of frequency; this behavior is required because both the correlation of the total residuals in Bayless and Abrahamson (2019a) and the correlation of the epistemic uncertainty terms presented in section 4 exhibit a steep decay at frequencies near the conditioning frequency which then flattens at frequencies that are further from the conditioning frequency. The model parameters were estimated with a non-linear least-squares regression on z using the MINPACK.LM package (Elzhov et al., 2016) in the statistical software R (R Core Team, 2020); σ(z) was used as weights in the least-squares regression emphasising the fit to the higher correlation values which have more samples.
In this study, one difference from the Bayless and Abrahamson (2019a) is it that the interfrequency correlation of all epistemic uncertainty terms was modeled as frequency independent (i.e. A, B, C, D are constants). This was done, because at it can be seen in section 4, δc 1a,s and δc 1b,s , which are the biggest contributors to the total non-ergodic effects, have an almost frequency independent inter-frequency correlation. c ca,p show the most noticeable frequency dependence in inter-frequency correlation, but it becomes more stable at intermediate and large frequencies which is the frequency range where it has the biggest impact. The assumption of frequency independence should be re-examined in future studies with a larger dataset.  Figure 4 presents the mean, 5 th and 95 th percentiles of the posterior distribution of the hyperparameters of the non-ergodic terms; the proposed smoothed values are also presented in the same figure. As mentioned in section 3.2, the regression for this model was performed in two phases; in the first phase, all model hyperparameter were free and estimated based on the data and prior distributions, whereas in the second phase, the hyper-parameters of the non-ergodic terms were fixed to their smoothed values, and τ 0 and φ 0 were reestimated for the new set of smoothed hyper-parameters. Figure 4a shows the variation of the correlation length for δc 1,e with frequency. The posterior distribution of 1,e is wide due to the small number of earthquakes. Overall, the mean estimate of 1,e is around 40km except at low frequencies where at 0.3Hz it jumps up to 85km. Because there is no reason for 1,e to increase at low frequencies, it was fixed to the average 1,e over all frequencies, as it the simplest model. Furthermore, there are less data at low and high frequencies making the estimates of the hyperparameters at these frequency ranges less stable. Figure 4b shows how ω 1,e changes with frequency. In this case, the posterior mean of ω 1,e stays constant at low and intermediate frequencies, and it exhibits an increase at high frequencies.

Hyperparameters
Similarly to 1,e , there is no reason for ω 1,e to increase at high frequencies so it was fixed to the average value over all frequencies. One possible cause of the apparent increase of ω 1,e at high frequencies is that some of the non-ergodic site effects could have been mapped into non-ergodic source effects in the regression. It is expected that the non-ergodic site effects will increase at high frequencies because the regional differences in site amplification tend to have a larger impact on the high frequencies. This assumption is consistent with the behavior of ω 1a,s and ω 1b,s , which both show an increase with frequency. For this reason, the difference between the estimated and smoothed ω 1,e was moved to ω 1a,s .
In smoothing ω 1a,s , up to the frequency of 15Hz, a piece-wise linear model was fit to the estimated mean values; whereas beyond 15Hz, it was fit to the square root of the sum of squares of the estimated mean ω 1a,s and the difference between the estimated mean and smoothed ω 1,e . Minimal smoothing was applied to 1a,s and ω 1b,s as they show a relatively small variation between neighbouring frequencies.
The smoothed ca1,p was fixed to the average of the mean estimates that are less than 75km; this upper limit was imposed because the ca1,p with large mean estimates also had wide posterior distributions, meaning that ca1,p could not be reliably estimated at those frequencies.
ω ca1,p and ω ca2,p exhibit similar characteristics in their variation with frequency: they are very small at low frequencies, they show an approximately linear increase with the log of frequency at intermediate frequencies, and they reach a plateau at high frequencies. This happens because the effects of anelastic attenuation are more noticeable at high frequencies, and likewise, spatial changes in the anelastic attenuation will have a larger effect on higher frequencies. This behavior was also observed by Kuehn et al. (2019) who found that the standard deviation of the cell-specific attenuation coefficients of their non-ergodic P SA GMM was smaller at long periods . Figure 5 presents the terms that were reestimated in the second step. The coefficient δc 0 corresponds to the shift of the non-ergodic GMM due to the reweighting of the residuals. For most frequencies, the change in the constant from the ergodic model is less than 10%. The regional constant δc 0,e , which corrects for the bias due to the differences in the small magnitude conversion, is about 0.4 for the northern California and zero for the southern California. In the NGAWest2   Figure 6 shows the spatial distribution of the mean estimate and epistemic uncertainty of δc 1,e , δc 1a,s , and δc 1b,s for f = 5Hz. As mentioned in Section 3, the δc 1,e varies as a function of the source coordinates, whereas δc 1a,s and δc 1b,s as a function the site coordinates. In areas with past earthquakes, the mean estimate of δc 1,e deviates from zero and its epistemic uncertainty is small. δc 1,e is positive if the earthquakes in a region have systematically above average source effects and negative if the source effects are below the average. In areas with sparse or no data, the systematic effects related to the source cannot be reliably estimated, thus, δc 1,e approaches zero and its epistemic uncertainty is large. The same behavior is observed in the spatial distribution of δc 1a,s : in large metropolitan areas, were most of the station are located, the δc 1a,s mean estimate deviates from zero, and its epistemic uncertainty is small; in remote areas, the δc 1a,s mean estimate approaches zero and, its epistemic uncertainty is large. δc 1b,s is only plotted at the station locations as it has a zero correlation length, meaning that as we move away from a station it will directly go to zero. The mean estimates of δc 1b,s do not exhibit any spatial correlation (i.e. there are no regions where δc 1b,s are systematically positive or negative) meaning that spatially correlated component of the site effects was properly captured by δc 1a,s Figure 7 illustrates the spatial distribution of the cell-specific anelastic attenuation. The mean of c ca,p deviates from c 7 BA18 in cells that are crossed by many paths, whereas it stays close to c 7 BA18 in cells crossed by few or zero paths. In addition, cells that are crossed by few paths have large epistemic uncertainty in c ca,p . Overall, the epistemic uncertainty is low in Bay Area and Los Angeles and high in the northern part of California and the state of Nevada. The main features  (2016) differ in Nevada because there are no paths that cover that region. The large epistemic uncertainty of c ca,p in Nevada means that the cell-specific anelastic attenuation cannot be estimated in that region with the current data set. This comparison shows that the cell specific anelastic attenuation captures an underlying physical behavior. Figure 9 shows the epistemic uncertainty of the non-ergodic terms as a function of the number of records for δc 1,e , δc 1a,s and δc 1b,s , and as a function of the number of paths for c ca,p . The epistemic uncertainty of δc 1,e and δc 1a,s is not sensitive to the number of records, whereas the epistemic uncertainty of δc 1b,s and c ca,p decreases as the number of records and number of paths increases. This occurs because δc 1b,s is spatially uncorrelated, and c ca,p has a spatially uncorrelated component; δc 1b,s can be estimated more accurately as the number of ground motions recorded at a station increases, and c ca,p can be estimated more accurately as the number of paths crossing a cell increases. δc 1,e and δc 1a,s are spatially correlated and so the location of an event or a station is also important. That is, δc 1a,s can have less epistemic uncertainty near a group of stations with few records at each station than near a remote station with a large number of records, if collectively, the group of stations has more data to constrain δc 1a,s . The same holds true for δc 1,e regarding the spatial distribution of events.

Non-ergodic residuals
The residuals of the non-ergodic model at f = 5Hz are presented in Figure 10: the dots represent the residuals, the solid line corresponds to the moving average, and the error bars correspond to the standard deviation. δB 0 e shows no trend and an approximately constant standard deviation with magnitude; δW S 0 es also shows no trend, but the standard deviation standard deviation reduces with magnitude. Additionally, δW S 0 es shows no trend and a constant standard deviation with R rup and V S30

Standard deviation
In the model development, for simplicity, the aleatory standard deviations, τ 0 and φ 0 , were modeled as magnitude independent. Any magnitude dependence of τ 0 and φ 0 was determined in post processing based on the non-ergodic residuals. τ 0 was modeled as constant, as, for the most part, the δB 0 e residuals did not exhibit any reduction in standard deviation with magnitude ( Figure  10). Because the number of events greater than M 6.5 is small, the model for τ 0 did not follow the reduction of the empirical standard deviation at large magnitudes, but instead it followed the standard deviation of the small events. φ 0 was modeled as a piece-wise linear function (equation (36)), as δW 0 es residuals exhibit some reduction in the standard deviation with increasing magnitude: φ 0M 1 is the within-event standard deviation for magnitudes less than 5, and φ 0M 2 is the within-event standard deviation for magnitudes greater than 6.5.
The aleatory parameters (τ 0 , φ 0M 1 , and φ resulting EAS will have a reasonable shape (Figures 11 and 12). The smoothing was performed by fitting the aleatory parameters with a fourth order polynomial. The value of τ 0 decreases from small to intermediate frequencies and increases again after f = 3Hz, which is consistent with the behaviour of BA18 and other P SA GMM, such as Abrahamson et al. (2014). The magnitude dependence of φ 0 is more pronounced at high frequencies. The higher φ 0 of small events at high frequencies is believed to be due to an increased effect of the radiation patterns. At large events, the effect of radiation patterns is smaller as seismic rays originate from more locations along the fault, which increases the range of azimuthal angles, and leads to destructive interference of the radiation patterns resulting in less ground-motion variability. Figure 13 compares the magnitude relationships for τ 0 and φ 0 with the empirical standard deviations at f = 5Hz. Overall, there is a good fit between the τ 0 and φ 0 relationships and the standard deviations of the binned residuals. The differences at large magnitudes should be reevaluated with a dataset which includes a grater number of large magnitude events.
As a comparison with ergodic aleatory variability, figure 14 shows the total aleatory variability for the non-ergodic GMM for the small and large magnitudes events and the standard deviation of the ergodic residuals used for the derivation of this model. Based on this range, the total aleatory standard deviation of the non-ergodic GMM ranges from 0.50 to 0.65 which is about a 40 to 30% reduction from the standard deviation of the ergodic GMM.

Inter-frequency correlation
The inter-frequency correlation of the non-ergodic terms is presented in Figure 15 and the model parameters are summarized in Table 4. Currently, the correlation of all non-ergodic terms is modeled as frequency independent: that is, the width of the EAS peaks and troughs does not depend their central frequency. A frequency independent correlation would mean that the width of    the correlation ridges in Figure 15 does not change along the diagonal, whereas a non-constant width would mean that the correlation changes with frequency. Out of all the non-ergodic terms, δc 1,e has the widest confidence intervals because the number of unique earthquakes is smaller than both the number of unique stations or the number of anelastic attenuation cells. δc 1,e has a relatively wide correlation, meaning that if δc 1,e is positive at one frequency, it is highly likely that it will also be positive over a wide range of neighbouring frequencies. The correlation of this term also exhibits some frequency dependence similar to the δB e frequency dependence found in Bayless and Abrahamson (2019a): the correlation is the widest at f = 0.5Hz, it narrows at intermediate frequencies, and it widens again at frequencies larger than 8Hz. Both δc 1a,s and δc 1b,s have narrow, mostly frequency independent, inter-frequency correlations, which are similar to the correlation structure of δS2S in Bayless and Abrahamson (2019a). The correlation of c ca,p shows the strongest frequency dependence; there is very little correlation at frequencies less than 1Hz, it gradually increases and reaches the widest point at 5Hz, and then, it narrows again. The narrow frequency correlation at low frequencies is expected as the anelastic attenuation is very weak for that frequency range; however, it is unclear why the inter-frequency correlation narrows at high frequencies. It could be an artifact of poor sampling. For now, it is modeled as frequency independent, but in future studies, this assumption would need to be reevaluated. As a point of comparison, in seismic numerical simulations, a deterministic velocity model would imply a perfect inter-frequency correlation on c ca,p , which is more similar to the width of the correlation of data at f = 5Hz   Figure 16 shows the effect of inter-frequency correlation in sampling the non-ergodic terms for an M 7 earthquake in Hayward fault 10 km from a site in Berkeley, CA. Figure 16a shows the median non-ergodic EAS, the 16 th to 84 th percentile range of epistemic uncertainty, and a representative EAS sample with epistemic uncertainty for zero inter-frequency correlation. Figure 16b shows the same information, but in this case, the ground motions were generated with the inter-frequency correlation model described previously. The median EAS and range of epistemic uncertainty is the same in both cases; what is different are the representative EAS samples. The EAS sample with zero inter-frequency correlation has zero width in the peaks and the troughs, whereas EAS sample with inter-frequency correlation has peaks and troughs that span approximately a quarter of a decade. It should be noted that these samples do not include aleatory variability, the inter-frequency correlation of the aleatory variability will influence the final width of the peaks and troughs. The distance scaling of the model for f = 5Hz for a site in San Jose, CA (SJ) and a site in Northeastern California (N E) is presented in Figure 17; the site in SJ has a station which has recorded ground motions from past earthquakes to constrain δc 1b,s , whereas the site in N E does not have one, so δc 1b,s is unconstrained. In both cases, the earthquakes are located north of sites. North of SJ, c ca,p is less than average (Figure 7a) which causes the non-ergodic GMM to have higher attenuation than BA18. Due to the small number of paths in N E, c ca,p is very close to the mean value which is why the non-ergodic GMM and BA18 have similar distance scaling. The epistemic uncertainty is less in SJ as there are more earthquakes and stations to constrain the non-ergodic terms.

Model Validation
The performance of the non-ergodic GMM was evaluated with a 5-fold cross validation test using the ground-motion data from f = 5hz. In each of the 5 iterations of the cross-validation test, the NGAWest2 CA dataset was randomly split into a training and test datasets, 80% of the earthquakes composed training dataset and the remaining 20% of earthquakes composed the test dataset. The training set was used to estimate the coefficients of the non-ergodic model, and the test dataset was used to evaluate the accuracy of the predictions with the estimated coefficients. The NGAWest2 CA dataset was split based on the earthquakes so that the non-ergodic GMM is evaluated on events that were not used in the parameter estimation. Figure 18 shows the root-mean-square error (rmse) of the non-ergodic GMM and BA18 for all iterations. The average rmse of the non-ergodic GMM and BA18 is 0.66 and 0.87, respectively, which indicates that incorporating the non-ergodic terms improves the ground-motion prediction for events that were not part of the regression dataset.

Conclusions and Discussion
A fully non-ergodic EAS GMM is presented in this study. The non-ergodic source and station effects are captured by spatially varying coefficients; the non-ergodic path effects are captured with the cell-specific anelastic attenuation. A regional term that accounts for the differences in the ground motion of small earthquakes between northern and southern California is also added in the non-ergodic GMM; this term is applied to events less than M 5, and frequencies less than 5 Hz. The exact cause of these differences could not be identified, but it could be related to a potential bias in the magnitude estimation between the NCSN and SCSN networks. Future studies should further investigate the cause of these differences. The proposed non-ergodic GMM has a 30 to 40% smaller total aleatory standard deviation than BA18. Furthermore, the cross-validation test shows that the non-ergodic GMM performs better than BA18 in predicting the ground motion for events that were not part of the regression dataset. The next step is to use this non-ergodic EAS GMM with RVT to develop an equivalent nonergodic P SA GMM. The advantage of this approach is that it is easier to transfer the estimated non-ergodic terms, which are primarily based on small magnitude events, to the non-ergodic terms for the scenarios of interest, which typically are large magnitude events, using RVT than it is to estimate the magnitude dependence during the development of the non-ergodic EAS GMM. For the scenarios of interest, the P SA non-ergodic terms can be estimated by combining the EAS predictions, for the same scenarios, with RVT. In this approach, the magnitude dependence of the non-ergodic P SA terms is captured.
As larger data sets become available, future studies should consider the addition of a spatially varying term for geometrical spreading and test the frequency dependence of the inter-frequency correlation of the non-ergodic terms. A spatially-varying geometrical-spreading coefficient may be able to better capture the non-ergodic path effects at short distances; however, if such a coefficient is added, it should be constrained so that the GMM does not over-saturate at short distances. Currently, the inter-frequency correlation of the non-ergodic terms was assumed to be frequency independent, future studies should reevaluate if this assumption is valid.
Currently, the path for the cell-specific anelastic attenuation connects the site with closest point on the rupture. This path was chosen because it is the same path that is used in the R rup calculation; however, it has not been tested whether a path connecting the site and a different point on the rupture would be more appropriate for the cell-specific anelastic attenuation. A large number of broadband earthquake simulations that include 3D velocity structure effects up to high frequencies (e.g. 5 Hz) would be ideal for solving this problem. This work was partially supported by the PG&E Geosciences Department Long-Term Seismic Program. The authors also thankful to the three anonymous reviewers for constructive comments that helped to improve the final article.