Study area and data preparation. The study area is located in the Ridgecrest region of southern California (Fig. 1), where a damaging earthquake sequence occurred in July 2019, initiated by an Mw 6.4 foreshock and followed by an Mw 7.1 mainshock. Four moderate-to-large earthquakes (Mw > 5.4) in the sequence are selected for this study. We collect the three-component (3-C) seismograms from 16 seismometers deployed by the Southern California Seismic Network (SCSN) around the Ridgecrest area. They are utilized as testing data to examine the validity of the proposed FMNet. Before applying the FMNet model, sufficient training data are vital for ensuring a well-trained neural network. Here, instead of using historical data, we simulate hundreds of thousands of synthetic waveforms as training data, since very few focal mechanisms of historical earthquakes are available in this area.
As shown in Fig. 1, the study area is discretized from 35.4˚ to 36.2˚ in latitude, from -118.0˚ to -117.2˚ in longitude, and from 2 km to 20 km in depth, with intervals of 0.1˚, 0.1˚, and 2 km, respectively. This yields 9 × 9 × 10 = 810 virtual grid locations in 3D space. Assuming a double-couple source model40 and a 1D velocity model of southern California41, we simulate the 3-C waveforms at the 16 seismic stations by adopting the Thompson-Haskell propagator matrix42. For each virtual grid point, we simulate synthetic waveforms for all combinations of the strike, dip, and rake angles in the ranges of 0˚ to 360˚, 0˚ to 90˚, and -90˚ to 90˚37, with intervals of 30˚, 10˚, and 20˚, respectively. Hence, we have 12 × 9 × 9 = 972 focal mechanisms for each virtual grid point and 810 × 972 = 787,320 synthetics in total as training samples, each containing the 3-C waveforms of the 16 seismic stations with a time length of 128 seconds. We use a sampling interval of 1 second in all the simulations, so each training sample has the size of 48 (three components × 16 stations) × 128 (data length). Additionally, we prepare another 1,000 synthetic samples as a validation dataset, which serves as unseen data to evaluate the training performance.
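The sampling scheme above can be sketched as a simple enumeration. This is a minimal illustration, not the authors' code; the dip range is sampled as 10˚–90˚ here, since the paper's stated 0˚–90˚ range with a 10˚ interval must drop one endpoint to give 9 values.

```python
import itertools

# Virtual source grid: latitude, longitude, depth at the stated intervals.
lats = [35.4 + 0.1 * i for i in range(9)]       # 35.4 deg to 36.2 deg, step 0.1 deg
lons = [-118.0 + 0.1 * i for i in range(9)]     # -118.0 deg to -117.2 deg, step 0.1 deg
depths = list(range(2, 21, 2))                  # 2 km to 20 km, step 2 km

# Focal-mechanism angles at the stated intervals.
strikes = list(range(0, 360, 30))               # 12 values
dips = list(range(10, 91, 10))                  # 9 values (endpoint choice assumed)
rakes = list(range(-90, 90, 20))                # 9 values

grids = list(itertools.product(lats, lons, depths))
mechs = list(itertools.product(strikes, dips, rakes))
# 810 grid points x 972 mechanisms = 787,320 training samples
```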
The training samples are processed by filtering between 0.05 Hz and 0.1 Hz, aligning with the theoretical P-wave first arrivals, and normalizing to the maximum amplitude. These preprocessing procedures are important because they remove the effect of other source parameters, such as location and magnitude, and mitigate the dependence on the heterogeneity of the velocity medium. Considering that real data may contain noise and picking errors, we mimic realistic scenarios by adding realistic noise and a random time shift (<10 s) to the synthetics (Supplementary Fig. S1). We process all the synthetic data in the same way and use them to train the network. After the FMNet is well trained, once a real earthquake is identified with existing algorithms of automatic detection and phase picking7-11, we first remove the instrument responses and then perform the bandpass filtering, arrival-time alignment, and amplitude normalization on the data prior to feeding them to the FMNet.
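The preprocessing and augmentation steps above can be sketched per trace as follows. This is an assumed implementation, not the authors' code: the filter order, zero-padding of short windows, and the use of Gaussian noise in place of the "realistic noise" are all simplifications for illustration.

```python
import numpy as np
from scipy.signal import butter, filtfilt

def preprocess(trace, p_arrival_idx, dt=1.0, win_len=128):
    """Bandpass 0.05-0.1 Hz, align on the theoretical P arrival,
    truncate to a fixed 128-s window, and normalize by peak amplitude."""
    nyq = 0.5 / dt                                   # Nyquist frequency for 1-s sampling
    b, a = butter(2, [0.05 / nyq, 0.1 / nyq], btype="band")
    filt = filtfilt(b, a, trace)                     # zero-phase bandpass
    aligned = filt[p_arrival_idx:p_arrival_idx + win_len]
    if len(aligned) < win_len:                       # zero-pad short windows
        aligned = np.pad(aligned, (0, win_len - len(aligned)))
    peak = np.max(np.abs(aligned))
    return aligned / peak if peak > 0 else aligned

def augment(sample, rng, max_shift=10, noise_level=0.1):
    """Add a random time shift (<10 s) and noise to mimic picking errors."""
    shift = int(rng.integers(-max_shift, max_shift + 1))
    shifted = np.roll(sample, shift)
    return shifted + noise_level * rng.standard_normal(sample.shape)
```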
FMNet training and prediction. The framework of the real-time determination of the source focal mechanism is presented in Fig. 2. It consists of two parts: FMNet training and prediction. For the training part, we train the FMNet with the previously prepared synthetic data and the corresponding training labels. We describe the architecture of the FMNet, the training labeling, and the associated training parameters in the Method section. In the training process, the training and validation losses, together with the goodness of fit between true and predicted labels of the validation data, are used as metrics to evaluate the training performance (Supplementary Fig. S2 and Fig. S3). The training and validation loss curves stabilize at sufficiently low values after 50 iterations, and the true and predicted labels fit closely, both indicating that the FMNet has been stably trained. For the prediction part, we can directly feed the processed recordings of a real earthquake into the trained FMNet to predict the source focal mechanism. The training process may take hours to days, depending on the size of the training dataset, the complexity of the neural network, and the available computing resources. However, once well trained, the designed FMNet can output a focal mechanism solution in only 196 milliseconds on a single CPU. Moreover, the trained network model can be permanently deployed to estimate source focal mechanisms in areas of interest.
FMNet prediction results. The source focal mechanisms of four large earthquakes (Mw > 5.4) in the Ridgecrest sequence are estimated with the trained FMNet. We show these results as red beach balls in Fig. 3. The predicted focal mechanisms generally reveal strike-slip faulting on very steeply dipping fault planes. Among them, the three focal mechanisms in the southern region, including those of the Mw 6.4 foreshock and Mw 7.1 mainshock, show pressure axes in the north-south direction and tension axes in the east-west direction. The one in the northernmost region shows a slight rotation in the fault plane azimuth. For comparison, we also plot the focal mechanism results from the SCSN moment tensor catalog as reference solutions (in black) in Fig. 3. The focal mechanisms predicted by the FMNet and the reference focal mechanisms from the SCSN catalog are essentially consistent for the three earthquakes in the southern region, considering the differences in methods, parameterization, velocity model, and the number of recording stations used. The northernmost event is not included in the SCSN catalog for comparison. For this event, we apply the widely used generalized Cut-and-Paste (gCAP) method38 to invert its focal mechanism, shown in grey. The inverted and predicted focal mechanisms match well for this event. Moreover, the slight rotation of the fault azimuth is consistent with the distribution pattern of the aftershock locations (grey dots). Compared with other studies of this earthquake sequence20,23,43, the focal mechanisms predicted by our FMNet are essentially consistent with previous results. All these results demonstrate that the proposed FMNet enables us to determine source focal mechanisms effectively.
Additionally, the trained FMNet takes only 196 milliseconds with minimal requirements on computing resources and memory storage, outperforming both the conventional methods and the fast search method.
The comparison of waveforms is the most straightforward way to evaluate the predicted results. For this purpose, we simulate synthetic waveforms using the source focal mechanisms predicted by our FMNet and analyze the similarity between real and synthetic waveforms (Supplementary Fig. S4). We find that both the amplitude and phase of the waveforms across different seismic stations overlap well, and the computed cross-correlation coefficients reach 0.86, indicating that the FMNet has learned to recognize the waveforms and reliably map them to the corresponding source focal mechanism solution.
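A simple way to quantify the waveform similarity described above is a zero-lag normalized cross-correlation coefficient. The sketch below is one common formulation, assumed for illustration; the paper does not specify the exact definition used.

```python
import numpy as np

def cc_coefficient(obs, syn):
    """Zero-lag normalized cross-correlation coefficient between an
    observed and a synthetic trace; ranges from -1 to 1, with 1 meaning
    identical waveforms up to a positive scale factor."""
    obs = obs - obs.mean()                       # remove means before correlating
    syn = syn - syn.mean()
    denom = np.sqrt((obs ** 2).sum() * (syn ** 2).sum())
    return float((obs * syn).sum() / denom) if denom > 0 else 0.0
```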
Interpreting the FMNet using the encoder. To further investigate the working mechanism of our FMNet, we adopt an idea similar to that used in face recognition, where a network learns a mapping from face images to a compact Euclidean space in which distances directly correspond to a measure of face similarity44-45; here, we output the extracted features to analyze the reliability and robustness. The last layer of the compression part of the FMNet is exported as a by-product, the encoder (see Fig. 6 and the Method section for details). After training, this encoder takes any training input with the size of 1 × 48 × 128 and outputs an extracted feature with the size of 128 × 1 × 1. With the encoder, we verify the hypothesis that a measure of similarity in the feature domain is equivalent to a measure of waveform similarity in the data domain through the following steps: first, we calculate the extracted features with the encoder for the whole training dataset to build an encoded database in the feature domain. Then, we calculate the extracted features of the data recording a real earthquake. Finally, we measure the L2-norm misfits between the encoded database of the training data and the encoded features of the real data in the feature domain. For comparison, we also calculate the L2-norm misfits in the data domain, measuring the waveform differences between the real data and the training database. If the best solution retrieved by the smallest L2-norm misfit in the feature domain corresponds to the best solution retrieved in the data domain, the above hypothesis is validated.
We take the Mw 6.4 foreshock as an example. Following the steps illustrated above, we display the L2-norm misfit distributions calculated in the data domain (in red) and in the feature domain (in black) in Fig. 4a, after ranking them in ascending order. Since the whole training dataset is too large, we plot only the 5,000 smallest misfits for clarity. The L2-norm misfit distributions calculated in the data domain and the feature domain present a similar shape. Meanwhile, Fig. 4b, 4c, and 4d show the corresponding training labels of the strike, dip, and rake angles for the L2-norm misfits in the feature domain (the black curve in Fig. 4a). The best solution of the strike, dip, and rake angles retrieved by the smallest L2-norm misfit in the feature domain is highlighted with magenta circles. We then compare the best solutions retrieved in the data domain and the feature domain, as shown in Fig. 5. The best solution retrieved in the feature domain (in magenta) matches well with that retrieved in the data domain (in red). These analyses and comparisons validate our hypothesis that the extracted features in the feature domain maintain the essential information of the original waveforms in the data domain in the least-squares sense; thus, the extracted features are sufficient to identify the corresponding source focal mechanism. Moreover, the 10 best solutions (in magenta and black) retrieved in the feature domain are generally consistent, with only minor variations, which illustrates the stability of the trained network.
From the above analysis, the compression part of our FMNet (i.e., the encoder) can be interpreted as a sparse transformation of the input waveforms, where the input data are compressed from 1 × 48 × 128 to 128 × 1 × 1 in size, a reduction by a factor of 48, while keeping the key information in the data. The encoder also provides an alternative way to rapidly retrieve the best-matched source focal mechanism by searching the database of encoded features prepared in advance from the training data. The expansion part of the FMNet mainly takes these extracted features to reconstruct a mapping function that yields Gaussian distributions representing the three angles of a focal mechanism. We stress that all the analyses presented in this section serve to explain the working mechanism of the FMNet and to assess its robustness. When the proposed deep learning methodology is applied to a real case, we can directly feed the real data into the well-trained FMNet and output the focal mechanism rapidly. The intermediate output of the extracted feature maps can be used to further evaluate the reliability of the solution.
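A Gaussian-distributed training label for one focal-mechanism angle, of the kind the expansion part is trained to output, can be built as follows. This is a minimal sketch under assumptions: the width `sigma` and the 1˚ label grid are illustrative choices, and the periodic wrap-around of the strike angle is ignored for simplicity.

```python
import numpy as np

def gaussian_label(angle, grid, sigma=10.0):
    """Return a label vector over `grid` that peaks at the true angle,
    normalized so the peak value equals 1."""
    label = np.exp(-0.5 * ((grid - angle) / sigma) ** 2)
    return label / label.max()

strike_grid = np.arange(0, 360, 1.0)       # 1-degree label grid (assumed)
label = gaussian_label(120.0, strike_grid)  # peak sits at the true strike
```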