Combining a Locally Restricted Anisotropic Spatial Kernel with an ETAS Incomplete Model for Better Forecasts of the 2019 Ridgecrest Sequence


 Strong earthquakes cause aftershock sequences that are clustered in time according to a power decay law, and in space along their extended rupture, shaping a typically elongate pattern of aftershock locations. A widely used approach to model seismic clustering is the Epidemic Type Aftershock Sequence (ETAS) model, that shows three major biases: First, the conventional ETAS approach assumes isotropic spatial triggering, which stands in conflict with observations and geophysical arguments for strong earthquakes. Second, the spatial kernel has unlimited extent, allowing smaller events to exert disproportionate trigger potential over an unrealistically large area. Third, the ETAS model assumes complete event records and neglects inevitable short-term aftershock incompleteness as a consequence of overlapping coda waves. These three effects can substantially bias the parameter estimation and particularly lead to underestimated cluster sizes. In this article, we combine the approach of Grimm (2021), which introduced a generalized anisotropic and locally restricted spatial kernel, with the ETAS-Incomplete (ETASI) time model of Hainzl (2021), to define an ETASI space-time model with flexible spatial kernel that solves the abovementioned shortcomings. We apply different model versions to a triad of forecasting experiments of the 2019 Ridgecrest sequence, and evaluate the prediction quality with respect to cluster size, largest aftershock magnitude and spatial distribution. The new model provides the potential of more realistic simulations of on-going aftershock activity, e.g.~allowing better predictions of the probability and location of a strong, damaging aftershock, which might be beneficial for short term risk assessment and desaster response.

ian, 2019; Zhuang et al., 2002). Particularly, the aftershock productivity (i.e. expected number of 72 offsprings) for a trigger event with magnitude m is where parameters A > 0 and α > 0 control the exponential growth of the trigger potential and M c 74 is the cut-off magnitude of the analyzed earthquake catalog.

76
Despite generally producing successful and insightful estimation and forecast results, ETAS 77 models are known to be limited by a number of potential biases. In this article, we present an 78 approach that combines solutions for three main short-comings of the conventional ETAS model, 79 (1) the isotropic spatial aftershock distribution, (2) the infinite extent of the spatial kernel and (3) 80 the short-term incompleteness of earthquake records after strong triggering events.

81
Bias 1: Isotropic spatial distribution 82 The common assumption in ETAS models is that spatial aftershock locations are distributed isotrop- reasonably valid for weak earthquakes with small rupture extensions, but becomes problematic for 88 larger magnitudes, e.g. see the spatial pattern of the Ridgecrest sequence in Fig. 1(b). It has been 89 shown that inadequate spatial models can lead to an underestimation of the productivity parame-90 ter α (Equation 2) because the numerous small events take over the role of mimicking the "true" (Harte, 2013(Harte, , 2016Seif et al., 2017), However, they can be neglected in our study because they 112 are either expected to be small in southern California datasets (e.g. location and magnitude uncer-113 tainty) or do not apply in an isolated sequence analysis (background miss-specification).

115
Scope of this article 116 In this article, we combine an ETAS approach accounting for short-term incomplete event records 117 with the application of a generalized, anisotropic spatial model that restricts the spatial kernel to and discussed by Ogata (1998Ogata ( , 2011 and Ogata and Zhuang (2006). The latter approaches suc-128 cessfully modeled spatial aftershock patterns, however, they require a new set of parameters and 129 are therefore not flexibly combinable with the conventional, isotropic functionality. In contrast, the The ETAS model, first introduced by Ogata (1988Ogata ( , 1998, is a branching-tree type model which de-156 scribes clustered earthquake occurrences by consecutive triggering evolving over multiple parent-157 child generations (i.e. allowing secondary aftershocks). The triggered seismicity is overlaying a 158 time-invariant background process.

160
In this section, we will first introduce the conventional, isotropic ETAS model approach. Next, 161 we will extend the model to obtain a time-space version of the ETASI model suggested by Hainzl

162
(2021), which involves STAI into parameter estimation. Mostly, notations are consistent with 163 Hainzl (2021). We will then define the anisotropic generalization of the spatial kernel, which is 164 compatible with both the ETAS and ETASI model, and introduce the local restriction of the kernel. 165 Finally, we explain the fitting algorithm for the strike angle and rupture position of anisotropic 166 trigger events and the methods for spatial integral estimation.
is the "true" probability density function (pdf) of the frequency-magnitude distribution (FMD) with Gutenberg-Richter parameter b = β/ln(10) (Gutenberg and Richter, 1944), and 174 R 0 (t, x, y) = µ u(x, y) + is the "true" occurrence rate of events with magnitude m ≥ M c , at time t and at location (x, y). 175 The "true" event rate is modeled by a superposition of the time-invariant seismic background 176 rate µ u(x, y) with parameter µ > 0 and a sum of the trigger rate contributions of all events i 177 that occurred prior to current time t. k A,α (m i ) and g c,p (t − t i ) denote the aftershock produc-178 tivity and Omori-Utsu Law decay functions as defined in Equations (1) and (2), respectively.

179
h D,γ,q (r i (x, y), m i , l i ) models distribution of aftershock locations triggered by event i, with pa-180 rameters D, γ and q. The precise inputs and shape of the spatial kernel are discussed later.

182
The term "true" means that the (physical) relationships are expected to be observed with per-183 fect earthquake records. The main assumption of the conventional ETAS model is that STAI does 184 not significantly distort the "true" magnitude distribution and the "true" event rates. The concept of rate-dependent earthquake record incompleteness assumes that the "true" rela-189 tionships underlying f 0 (m) and R 0 (t, x, y) are not accurately identifiable in available earthquake 190 catalogs because especially events with small magnitudes are detected with lower probability in 191 periods of high seismic activity. In these periods, the detection ability is limited typically due to 192 overlapping seismic waves (Hainzl, 2016a(Hainzl, , 2021.

193
Fitting the "true" relationships to incomplete data records may therefore lead to significantly bi-194 ased parameter estimates (Hainzl, 2016a,b The working assumption of the ETASI model described here is that an earthquake with magnitude  Let N 0 (t) be the expected number of events occurring within the entire spatial window S during where the approximation holds under the assumption that event rates are approximately constant 217 during the blind time (Hainzl, 2021). According to the "true" FMD (Equation 4), each of the N 0 (t) Following the derivations in Hainzl (2016bHainzl ( , 2021, we obtain the "apparent", incompleteness- and the "apparent" event rate The term "apparent" signalizes that the functions f and R do not represent the "true", but 225 the observable relationships that are possibly distorted by rate-dependent record incompleteness.

226
In periods of high seismic activity, the "apparent" FMD exhibits a larger relative frequency of 227 strong events (because they are more likely to be detected) and an event rate lowered by detection 228 capacity. We obtain the ETASI intensity function The two underlying, simplifying assumptions in the ETASI model are that (1) the blind time T b 230 is magnitude-independent, which Hainzl (2021) justifies by typically shorter source durations than 231 travel times of coda waves, and (2) that the seismic network is equally occupied for blind time T b 232 by any event in the entire investigated spatial window. The second assumption is reasonable for a 233 small spatial window, e.g. when analyzing an isolated sequence. When fitting the ETASI model 234 over a larger region, it needs to be assured that not two relevant clusters evolve at the same time but 235 at distinct locations as they would be assumed to simultaneously occupy the entire seismic network.

236
A reasonable approach to prevent undesired biases is to choose a larger cut-off magnitude.

238
The parameter vector θ = {µ, A, α, c, p, D, γ, q, β, T b } of the ETASI model is estimated by maxi-239 mizing its log-likelihood function LL = LL 1 − LL 2 with where r i (x, y) denotes the point-to-point distance between a potential aftershock location (x, y) 247 and the coordinates (x i , y i ) of the triggering event i, and m i is the magnitude of the event i. The 248 kernel is constrained by the parameters D and γ that control the magnitude-dependent width of the 249 kernel, and parameter q that describes the exponential decay of the function with growing spatial 250 distance.

252
Here we use the anisotropic generalization of the spatial kernel that was first introduced by Grimm where l i denotes the rupture length estimate for the triggering event i. In this spatial model, the 253 distance term r i (x, y) denotes the point-to-line distance between the potential aftershock location 254 (x, y) and the estimated rupture segment of event i. Note that i.e. the anisotropic kernel is a generalization and collapses to the isotropic model if the triggering 256 location is assumed to be a point source with rupture extension l i = 0. Therefore, the generalized 257 spatial model can be used for mixing approaches of both kernels, e.g. applying anisotropy to events 258 i with magnitudes m i ≥ M aniso : The scaling relationship for anisotropic events is taken from the estimate of subsurface rupture 260 lengths for strike-slip faulting events, provided in Wells and Coppersmith (1994).

274
In other words, the spatial kernel for event i is only defined in the restricted area and set to 0 outside of it. Note that the restricted area S i (R i ) can assume isotropic and anisotropic 276 shapes, depending on the point-to-point or point-to-line definition of the distance term r i (x, y). In 277 order to retain the property of a pdf, we need to rescale the kernel within the restricted area by its The integral term holds true for both isotropic (l i = 0) and anisotropic triggers (l i > 0). We obtain 280 the generalized, restricted and anisotropic spatial kernel

282
The restricted, anisotropic spatial kernel in Equation (12) requires a strike angle to define the ori- The forward trigger contribution of event i is computed as In order to avoid that the rupture orientation and position is dominated by single events that oc-304 curred very close to the segment candidate, we applied a minimum distance of 0.2 kilometers.

347
• Evaluation: Analyze simulated sequences and compare to the observation.

348
In the following, we first introduce the basic earthquake event set for California underlying 349 this study, and define the time-space windows used to obtain the event sub-sets applied for pa-      to one rupture length. The "ETASI iso-r" and "ETASI aniso-r" models combine the spatial kernel 403 settings of the latter models with the ETASI approach accounting for STAI.
where k(m) is the aftershock productivity function in Equation (2) and the latter term is the in- to the logarithmic event time-magnitude scatter data of the observed Ridgecrest M6.4 and Ridge-432 crest M7.1 sequences in Fig. 1(c) and (d).
supposedly undetected range below the line of the simulated sequence. In contrast, ETAS models 435 estimate STAI-biased aftershock productivities and therefore lead to predictions of the incomplete, 436 rather than the "true" size of the sequence. Therefore, in forecasts generated by these models we 437 do not delete events. By repeating the same procedure for each simulation run, we obtain 10,000 simulated spatial distributions D ij for each model version. Finally, we average the individual simulated distributions to 455 obtain the predicted probability P j that an event occurs at grid point j.

456
The more events of the observed sequence have occurred at grid points with high predicted proba-457 bilities P j , the better is the forecast. Therefore, we quantify the goodness of the spatial fit with the 458 likelihood L space = grid points j P We compute the information gain of the models' spatial predictions relative to the ETAS conven-    According to the ETAS iso-r and ETAS aniso-r models, the observed number of events would be 501 a possible, but rather unlikely scenario, with approximately 2.4% and 3.7% probability to exceed 502 the observed value. The ETASI models tend to only moderately (ETASI iso-r) or slightly (ETASI 503 aniso-r) underestimate the observed number.

504
To explain this observation, we consider that the size of this relatively short sequence is predomi-  The larger the parameter α, the more direct aftershocks are expected for the M6.4 event.

510
As argued in the Methods section, the local restriction of the spatial kernels prevents a dispropor-  ; see table 2). If b = 1, then each magnitude increment by 1 leads to a 10 times 536 smaller probability of occurrence. Therefore, we would need a sample of the size 100,000 events 537 to obtain on average one M7.1 event.

538
According to the estimation results in table 2, all models estimated significantly smaller values 539 b < 0.8 for the observed Ridgecrest M6.4 sequence, which favors the occurrence of strong events.

540
Note that the b estimates of the three ETAS models are biased, because they are fitted for the "true" 541 FMD using the evidently short-term incomplete M6.4 sequence event record (see Fig. 1(c)). The 542 ETASI models account for the missing smaller magnitudes and therefore lead to corrected, larger 543 b values.

549
The branching ratios ν branch , i.e. the average number of direct aftershocks per event, clearly 550 exceed 1 in each model (see table 2), indicating that the M6.4 sequence was in an unstable state.

551
According to these models, an earthquake would trigger more than one direct aftershock on aver-  To quantify the quality of the spatial forecasts, we computed the information gains relative to the 581 ETAS conventional model as described in the Application section. Fig. 7(c) shows the results for 582 Experiment 1 in the left group of bars. Both anisotropic models lead to a pronounced improvement, 583 which confirms the visual impression in Fig. 6(a) and (b). The ETAS and ETASI iso-r models, 584 which differ from the conventional approach in terms of the local spatial restriction, show a small 585 information gain. As we can see in Fig. 6(a), none of the observed events occurred outside of the 586 spatial restriction. Therefore, the restriction leads to a slightly more pronounced accumulation of 587 simulated event locations closer to the M6.4 trigger, which coincides better with the observation. The ETAS iso-r and aniso-r models reach the observation in about 6.5% and 14.1% of the simula-596 tion runs. Again, the ETASI models provide the best approximations. for Experiment 1, the reason is found in the parameter estimate for α.     Fig. 7(c) shows that the narrower spatial distribution leads to a more pronounced information gain 638 by the local restriction and the anisotropy, relative to the ETAS conventional model.

639
Note that, to some extent, the predicted spatial distributions show traces of late or secondary af-   Fig. 7(a), largest aftershock magnitude in Fig. 7(b) and spatial 646 aftershock distribution in Fig. 7(c). The conventional model scores worst in each category. It 647 confirms the results in Grimm et al. (2021), which argued that the isotropic and unlimited spatial 648 kernel assumes an implausibly far trigger reach and leads to underestimated cluster sizes.

649
According to Fig. 7(a), the ETASI models seem to estimate more realistic aftershock productivties 650 than the ETAS models when fitted over the long-term Californian catalog (see Experiments 1 and 651 2). When fitted over the specific Ridgcrest M6.4 sequence, the bias of an underestimated aftershock 652 productivity seems to be balanced out by not cutting out undetected events. Anisotropic models 653 always lead to larger predicted sequence sizes, in the case of Experiment 3 even to a substantial 654 overestimation.

655
The predictions of the largest aftershock magnitude, shown in Fig. 7(b), are reasonable for all but 656 Experiment 1. Apparently, the short-term incompleteness bias in the ETAS models is of much less 657 consequence for the FMD than for the aftershock productivity.

658
According to Fig. 7(c), as expected, the anisotropic models predict more realistic spatial event 659 distributions. The spatial restriction leads to a much smaller improvement.

661
As a sensitivity study, we forecasted the Ridgecrest M7.1 sequence over a duration of 50 days. In 662 a longer time window, direct aftershock productivity has less dominance, and is being displaced     , gridded into small intervals. Red crosses symbolize the epicenters of the triggering events. In (a), the red circle around the event represents the contour lines of an isotropic spatial kernel and the shaded segments illustrate the radial partitioning method. In (b), the red box and semi-circles symbolize the contour lines of the anisotropic spatial kernel, the thin cicular segment and the shaded segments illustrate the radial and bin partitioning method.