Tremor data processing
Continuous seismic waveform data were downloaded using ObsPy (Beyreuther et al., 2010) for six broadband stations close to the craters or vents of subject volcanoes (Figure 1; See Table S1). In total, 50 years of data were processed, ensuring that each eruption had at least one month of data prior (Table S2).
The instrument response was removed from the daily waveform data and this was processed to obtain four time-series (data streams): RSAM (real-time seismic amplitude measure), a 10-minute average of the absolute vertical station velocity bandpass filtered between 2 and 5 Hz; MF (medium frequency) and HF (high-frequency), computed the same as RSAM but using the 4.5 to 8, and 8 to 16 Hz frequency bands; and DSAR (displacement seismic amplitude ratio), the ratio between MF and HF bands recalculated for absolute vertical displacement (Caudron et al., 2019; Dempsey et al., 2020).
To isolate continuous signals, outlier detection is used to remove regional and volcano-tectonic (VT) earthquakes before the data streams are calculated. Events with especially long wave-trains, such as large magnitude subduction earthquakes, are removed by computing a two-window (20 minute) moving-minimum of the data stream. For comparison between volcanoes, values in each data stream are transformed so they sit on a unit log-normal distribution (or a unit normal in log10-space). This is called z-score normalization and it has been shown to improve classification modeling (Singh and Singh, 2010). Thus, a normalized value of 1 indicates a datapoint at the mean, and a value of 100 indicates the signal is two standard deviations above the mean. Alternative normalization techniques based on station distance are detailed by Aki and Koyanagi (1981) and Thomson and West (2010).
Feature extraction
Each of the four data streams was subdivided into overlapping 48-hour windows, with each window advancing one data point (10 minutes) beyond the previous one. For each window, 524 time-series features are extracted using the Python package tsfresh (Christ et al., 2018). Features included measures of distribution (parametric: mean, standard deviation; and non-parametric: quantiles, number of peaks), autocorrelation, frequency content, linearity and information content (entropy, energy, nonlinearity scores). Feature extraction yields 524 new time-series for each data stream. The effect of window length was explored by Dempsey et al. (2020) and shown not to affect resolution of short-term eruption precursors at Whakaari.
The feature referred to as nDSAR rate variance is labelled in the tsfresh python library to ‘change_quantiles__f_agg_"var"__isabs_False__qh_0.6__ql_0.4’. It is calculated as the variance of paired differences (the “rate”) for nDSAR values falling in the interquartile range 0.4-0.6. The feature referred to as 75-minute nHF harmonic has the corresponding tsfresh label ‘fft_coefficient__coeff_38__attr_"real"’ and is the real component of Fourier coefficient 38, which approximately corresponds to a frequency with period of 75 minutes.
Cross correlation analysis
For each feature time-series, e.g., nDSAR median, we analyzed four-week subsections prior to each of the 18 eruptions (See Table S2). We calculated standard correlation coefficients (Pearson method) between feature subsections from each eruption pair (see Figure 2a as reference). Once all the feature correlations are calculated (>40,000), we explored the matrices for high values between eruption pairs of eruptions and also clusters of eruptions that present strong correlations (see Figure 2a-b as reference).
Statistical significance tests
We conduct statistical significance analysis to test whether a pre-eruptive feature with high correlations is differentiable from non-eruptive repose periods. This verifies whether a result from data generated by testing or experimentation is not likely to occur randomly, but is instead attributable to a specific cause.
To test statistical significance analysis, we first select a possible 'signature', which is a four-week pre-eruptive feature time-series for one eruption. For example, one signature studied here is the nDSAR median prior to the 2019 Whakaari eruption (Figure 2b). The signature is then correlated with equivalent length records from each volcano (e.g., ~10 years for Whakaari), shifting forward one day at a time to calculate new correlations. This calculation is performed most efficiently as a convolution. Finally, we collect all correlation coefficients (e.g., ~2600 for the Whakaari record) and generate a histogram of their distribution and explore where the pre-eruption values fall (see Figure 2c-d; p-values are shown in Table S3).
To quantify significance, we calculate p-values using a Kolmogorov–Smirnov test (KS test) (see Table S2), which tests whether the pre-eruption correlation coefficients belong to the same distribution as the repose values. For example, a p-value for the archetype nDSAR median Whakaari-2019 (Figures 3a) correlated with the other four Whakaari eruptions generates high correlation coefficients that are very unlikely (p-value of 0.000148) to belong to the background distribution. We have tested a combination of archetype patterns and eruptions groups for different volcanoes (Table S3).
The statistical significance tests were performed over Whakaari, Ruapehu, Tongariro and Veniaminof record given. Records at Pavlof and Bezymiany are too short to perform conclusive statistical analysis (see Table S1).
DATA AVILABILITY
Raw waveform data for Ruapehu, Tongariro and Whakaari volcanoes can be download from GEONET (https://service-nrt.geonet.org.nz); and for Veniaminof, Pavlof and Bezymiany volcanoes from IRIS (http://service.iris.edu). Both are operable as clients though the FDSN webservice (https://www.fdsn.org/networks/). Rainfall data are available in New Zealand's National Climate Database (https://cliflo.niwa.co.nz/).
CODE AVILABILITY
The codes required to replicate the results of this study are available at https://github.com/aardid/volc_forecast_tl/tree/transfer-learning, released under the MIT license.