Data mining method in seismology by applying cellular automaton equivalence of ground vibration fluctuations recorded near the epicenter of the 2011 Mw 9 East Japan earthquake

We propose a data mining method to extract physical phenomena from the ground vibration velocity fluctuation (GVF) which is a daily observed weak signal and has no specific waveform. Our aim is to detect any traces of the return of ground motion to its normal state after earthquakes, and the evolution of ground motion towards earthquakes in the GVF. We focus on the GVF recorded near the epicenter of the magnitude 9 Great East Japan Earthquake (GEJE). Although the GVF can be discussed in the framework of the master equation which represents the thermodynamics of a stochastically fluctuating non-equilibrium system, the relation between the thermodynamic parameters and seismology-related phenomena is hardly known. So, we introduce cellular automata (CA) to analogically interpret the physical phenomena of interest as thermodynamic parameters, and show the thermodynamic equivalence between GVF and CA. These CA-derived thermodynamic parameters are then expected to deliver physical phenomena similar to those of the target when evaluated on the GVF data. In the demonstrated example of data mining, the stress relaxation signal immediately after GEJE is discovered from the recorded GVF data by deriving the thermodynamic parameters representing the stress relaxation of CA and evaluating these thermodynamic parameters for the recorded GVF data. Furthermore, the elastic rigidity and viscosity of the underground structures are estimated from the stress relaxation signal. The novelty of this study lies in drawing attention to the GVF, and combining standard techniques of CA with GVF through the use of the master equation.


Introduction
Strong earthquakes are a major concern in disaster management, and various measures are being taken for strong earthquakes. The Earthquake Early Warning system in Japan warns people when an earthquake with a seismic intensity of 5 or greater is expected on the Japan seismic scale. When an earthquake is detected, the system analyzes the data captured by seismographs near the epicenter to estimate the hypocenter and its depth, the magnitude of the earthquake, and the seismic intensity. The estimated information is quickly Seismic Lab, 307, 12-1, 2-Chome,Wakamiya, 3630022, Saitama, Okegawa, Japan released so that people can move to safe places or evacuate from dangerous places before strong ground motion arrives. Regarding building regulations, the seismic standards of the Building Standard Law in Japan require minor damage in medium-scale earthquakes with a seismic intensity of about upper 5, and no collapses in large-scale earthquakes with a seismic intensity of about 6 upper to 7.
On the other hand, earthquakes generally last less than a minute, and the dominant state of ground motion is seismically silent. The cause of an earthquake occurs and evolves in a silent state until prominent shaking occurs. Therefore, to understand the earthquake process, it is essential to investigate the silent state. Nonvolcanic tremor is one of the notable discoveries regarding the silent state. Obara investigated the seismically silent period in southwest Japan and identified the nonvolcanic tremor which are weak signals with noticeable waveforms and a typical frequency range from 1 Hz to 10 Hz (Obara 2002). In 2003, Rogers and Dragert (2003) related tremors to ground slip events. Tremor activity accompanied by a slip event was observed approximately every 12 months for 6 consecutive years at the Cascadia subduction zone interface (Rogers and Dragert 2003). These researches in the seismically silent period are based on tremors with specific waveforms, and few attempts have been made to investigate ground vibration with no noticeable waveforms.
In contrast, we believe that the ground vibration fluctuation (GVF), which has a weak amplitude but is observed during most of the ground vibration observations and does not have a specific waveform, contains a wealth of information on the evolution of ground dynamics before and after the megathrust earthquake. Note that this study does not cover the earthquake itself. The GVF, a fluctuating system, can be treated stochastically within the framework of the master equation and characterized by thermodynamic parameters. However, it is not known how the thermodynamic characteristics of the GVF are related to physically and practically meaningful data.
On the other hand, stochastic cellular automata (CA) can analogically reproduce complex phenomena such as traffic jams and identify the conditions that cause traffic jams. In other words, CA can build an analogical link between complex phenomena and practically meaningful information. If CA can build a link between complex thermodynamic parameters calculated from the CA data and physically meaningful information, and if GVF and CA are thermodynamically equivalent, physically meaningful information on the GVF can be extracted by examining the CA. The advantage of CA is that it is relatively easy to build links between complex thermodynamic parameters and physically meaningful information. This is because CA has been studied for over 80 years and many insights about it are available.
To show the equivalence of GVF and CA, we binarize the GVF and show the equivalence between GVF and binarized GVF, since CA is usually constructed on a binary basis. Each of the binarized GVF and CA is then represented as a sequence of binary data, and the same procedure is applied to characterize the GVF and CA with thermodynamic parameters.
As an application of the GVF-CA equivalence, we show an example of data mining on GVF using the GVF-CA equivalence. In this example, we explore the viscoelastic properties of CA and derive thermodynamic parameters that describe stress relaxation. We then calculate the derived thermodynamic parameter with respect to the GVF data in hopes of discovering stress relaxation signals in the GVF data.
The use of CA for data mining is significantly different from the traditional usage of CA in seismology. As thoroughly discussed by Jiménez (2013), in seismology, the CA equivalent to a mechanical model of seismic ground motion has been used to reproduce the general behavior of seismicity, such as the Gutenberg-Richter magnitude-frequency distribution, or to reproduce earthquake catalogs. Then, after adjusting the CA with reference to known data, the CA is used to statistically forecast future seismic events.
In this paper, GVF is regarded as a stochastically fluctuating thermodynamic system, and many pages are consumed to show the thermodynamic equivalence of the GVF and the stochastically fluctuating CA. The GVF of interest was recorded during the period before and after the Great East Japan Earthquake (GEJE) of magnitude 9 occurred at the epicenter of latitude 38.10N, longitude 142.86E, and depth 24 km at 14:46 on March 11, 2011. In this paper, the survey period (GEJE era) is from 00:00 on January 1, 2006, to 23:59 on December 30, 2016, and the unit of time is Japan Standard Time (JST).
The rest of this paper is organized as follows: First, the thermodynamic equivalence between the GVF and the binarized GVF is shown, and the binarized GVF during the GEJE era is characterized by thermodynamic parameters. Second, CA is characterized by the same thermodynamic parameters used to characterize the binarized GVF. The equivalence of the CA and the binarized GVF is shown by comparing their thermodynamic parameters. Finally, we present a data mining example to extract stress relaxation signal from the GVF data by evaluating the parameters derived by CA.

Thermodynamic characterization of GVF before and after GEJE
We consider a binarized GVF instead of the recorded GVF since the binary sequence is easier to work with in the thermodynamic framework and comparisons with CA. First, we need to show the equivalence between binarized GVF and GVF. Since waveforms are not of interest, we discuss the equivalence in Fourier space. The GVF in the GEJE era is redefined by observing the anomalous noise with large amplitude in the frequency range of 2.97 Hz to 9.8 Hz in the Fourier spectrum. It is shown that the redefined binarized GVF is indistinguishable from the redefined GVF. The binarized GVF is then thermodynamically characterized. Here, we apply the Fast Fourier Transform (FFT) algorithm without overlapping or filtering.

Anomalous noise in GVF signal
Ground vibration velocity data recorded at the seismic station KSN is downloaded in chronological order from the website of F-net, the broadband seismograph network of the National Research Institute for Earth Science and Disaster Resilience (NIED 2019), using the "BL" option. "B", and "L" indicate Ground vibration velocity (m/s) in UD direction recorded by the strong motion velocity meter. (a2) Magnified plot of (a1). (a3) Piecewise deviation of (a2), or GVF. (a4) Spectrogram of (a3). (b1) GVF in the NS direction. (b2) Spectrogram of (b1). (c1) GVF in EW direction. (c2) Spectrogram of (c1) that the sampling interval is 0.05 seconds, and the instrument type is a strong motion velocity meter.
The instrument response is corrected by multiplying it by a correction factor, As, which is the product of the transfer function, the normalizing constant, and the ratio of sensitivity to gain (SEED 2012. For the 'KSN BL' data from 2006 to 2016, As was calculated from the instrument data provided with the 'KSN BL' data and treated as a real constant (= 6.168853e − 08 m/s/V for 'KSN_BLZ'). This is because the imaginary component of As is small in the frequency range of 1-10Hz, where the GVF signal is most relevant to our analysis.
KSN is located at latitude 38.98N, longitude 141.53E, and altitude 260 m. The downloaded data is converted to piecewise deviation which is the difference between the velocity and the velocity averaged over the subsequent 10 data points. The piecewise deviation, which fluctuates around zero, defines the GVF. The GVF data is divided into blocks of 1024 data, which corresponds to the data acquisition time of 51.2 seconds, and the Fourier amplitude of each block is calculated. The Fast Fourier Transform (FFT) algorithm is applied with no overlap and no filtering. The upper bound of the frequency domain is 10Hz, which is half the data acquisition frequency. The lower bound is 0.02 Hz which is determined by the block size 1024. Figure 1 shows a comparison of the GVFs and spectrograms in the up-down (UD), north-south (NS), and east-west (EW) directions. The velocity data were acquired at KSN every 0.05 seconds from 2011-03-03 12:27 to 2011-03-11 23:59. The period includes the magnitude 9 Great East Japan Earthquake (GEJE) that occurred at 14:46 on March 11, 2011. In the spectrograms Fig. 1(a4), (b2), and (c2), there are noticeable vertical brown distributions, one of which is indicated by the brown-filled arrow in Fig. 1(a4). A sharp vertical distribution indicates that similar FFT spectra occurred over short time intervals, and brown indicates large spectral values in the frequency range indicated by the vertical length of the distribution. Before the M7.3 earthquake, which is a seismically silent period, several sharp vertical brown distributions appear in the upper half of the plots, implying that these distributions correspond to the spectra with large values in the high-frequency ranges. We define "anomalous noise" as these brown distribution corresponding to the Fourier spectrum with large high-frequency components observed in a seismically silent period. Later sections will focus on UD components.
The brown vertical distribution indicated by the filled brown arrow in Fig. 1 (a4), corresponding to the GVF in Zone_A in Fig. 1 (a3), has a finer spectrogram structure that confirms the existence of an anomalous noise. Figure  2 shows the details of the Zone_A of 12500-second duration. The GVF and its spectrogram are shown in Fig. 2 (a) and (b), respectively. The Fourier amplitude spectrum and its 10-point Simple Moving Average are respectively indicated by the black and red lines in the Log10-Log10 plots of Fig.  2 (c1) to (c6). The source data for these Fourier spectra are 6 GVFs of 51.2 seconds duration extracted from the elapsed time in Fig. 2 (a), exhibited in chronological order from left to right ( Fig. 2 (d1) to (d6)). The number below the GVF graph indicates the time interval (seconds × 20). The first and last Fourier spectra show small convex curves in the range of 1 Hz to 10 Hz ( Fig. 2 (c1) and (c6)). The rest of the spectra show anomalous noise with large convex curves in the range from 1 Hz to 10 Hz ( Fig. 2 (c2) to (c5)). To illustrate what we mean by 'convex curve,' we have added red curves above the spectra.

Definition of A-noise
In order to quantify the anomalous noise in Fig. 2 or the convex-curve-like spectrum, we need to define a parameter that takes into account whether the spectrum is convex or concave, as well as the magnitude of the curve. The shape and magnitude of a curve in a flat two-dimensional x-y coordinate system are defined by the 'curvature', which is proportional to the second derivative of the curve (Riley et al. 2006). Therefore, a 'convex' curve (i.e., one that curves outward) has negative curvature, and a 'concave' curve has positive curvature. Referring to this standard definition of curvature, we define the curvature of the Fourier amplitude spectrum for the anomalous noise as sign(P ni − P i ) times the ratio of |P ni − P i | to |P 2 − P 1 |, as shown in Fig. 3 (a), where P i is a point between 2.97 and 9.8 Hz. The frequency of P 1 and P 2 are 2.97 Hz and 9.8 Hz, which correspond to the 152nd and 502nd points on the frequency axis, respectively. The average spectrum value of the nearest 5 points (red circles in Fig.  3 (a)) is assigned as the spectrum value for the P 1 and P 2 . P ni is determined so that the line from P i to P ni is perpendicular to the line connecting P 1 and P 2 . If the spectral value of Pi is greater than that of Pni, then the spectrum is convex, and the curvature is negative. Hence, the definition of curvature presented in Fig. 3 is in line with the standard definition of curvature in two-dimensional space. The curvature so defined is independent of the scale change since the line length in the log10 plot is invariant to the scalar multiplication of the coor- dinate values. The scale invariant makes the curvature robust and consistent against the sudden and accidental amplitude changes in the Fourier spectrum. The curvature is defined to represent the large Fourier component between 2.97-9.8 Hz, and the definition is not unique. We could also define it in another way, such as choosing the spectrum value for P 2 at 9.8Hz equal to the spectrum value for P 1 at 2.97Hz, but it does not affect the later discussion anyway. As long as the calculation results are consistent, either definition is fine. In fact, the definition in Fig. 3 (a) yields consistent and robust calculation results as seen throughout this paper.
We define A-noise as the product of "-1" and the curvature whose absolute value is greater than that of any other curvature in the frequency range of 2.97 to 9.8 Hz ( Fig. 3  (a)). An arbitrary GVF signal is defined as either positive A-noise or non-positive A-noise.
The A-noise for the GVF data recorded at KSN from 2011-03-03 12:27 (JST) to 2011-03-11 23:59 is exhibited in Fig. 3 (b3) along with the velocity plot ( Fig. 3 (b1)) and the spectrogram of the GVF (Fig. 3 (b2)). As expected from the definition of A-noise, the timing of the positive peaks of A-noise is similar to the timing of the vertical brown distributions in the spectrogram, as indicated by the blackdotted double-headed arrow. Figure 4 compares the Fourier amplitude spectrum and spectrogram calculated from the GVF with those calculated from the binarized GVF. Figure 4 (a1) and (b1) show the bina-rization procedure. If each GVF data in Fig. 4 (a1) is greater than the mean of the data set of interest, the GVF data is transformed to "1", otherwise, the GVF data is transformed to "0", and binarized. The GVF data is then expressed as the time sequence of "0" and "1", as shown in Fig. 4 (b1). The clear negative curvature in the frequency range 1 Hz to 10 Hz, shown in both the Fourier spectrum of the GVF data and the binarized GVF data, implies that the A-noise is preserved in the binarization (Fig. 4 (a2) and (b2)). Figure 4 (a3) is the Fourier amplitude spectrogram duplicated from Fig. 2 (b), of which source data is the GVF shown in Fig. 2 (a). This source data is binarized and its spectrogram is calculated as shown in Fig. 4 (b3). The qualitative similarity between the spectrogram of the binarized GVF and the spectrogram of the GVF suggests that the time evolution of the anomalous noise is preserved in the binarization (Fig. 4 (a3) and (b3)). Therefore, the binarized GVF and GVF are both suitable to investigate the evolution of the anomalous noise, thus equivalent.

Thermodynamic description of GVF
For simplicity of representation, we do not distinguish between GVF and binarized GVF below. The thermodynamic state of the binarized GVF is defined by counting the "1" clusters in the 10-digit binary sequence. The binarized GVF sequence ( Fig. 5 (a)) is divided into blocks with 10 data points, and the number of clusters of "1" is counted in each block. To preserve the total number of the cluster, the rule shown in Fig. 5 is applied. In the 10-data block, we scan the The x i ∈ {0, 1} is the binary number of the ith cell from the left of the 10-data block (1 ≤ i ≤ 10). (a) A sequence of binarized GVF and N i , the count of "1" clusters. (b) Blocks with 10 data points divided from the sequence in (a) and the count N i of the "1" cluster. (c1)-(c5) Example thermodynamic states s 1 , s 2 , s 3 , s 4 , and s 5 containing 1, 2, 3, 4, and 5 clusters of "1" respectively cell from left to right and count one if the sequence of "10" is found. At the end of the scan, at the 10th data point, we count one only if the 11th data point is "0" (Fig. 5(b)). The counting rule restricts the maximum number of clusters in a block to five, and defines five thermodynamic states s 1 , s 2 , s 3 , s 4 and s 5 , each containing 1, 2, 3, 4, and 5 clusters (Fig. 5 (c1)-(c5)).
The number of data points 10 per block is determined by examining samples of binarized GVF data. If a block of a particular size is completely occupied by "1"s, then the number of clusters of "1"s in the block is 1. If this is the case for all blocks, no fluctuation in the state can be detected. The block size needs to be increased to detect the characteristics of the state. If the blocks are very large and each block contains all possible patterns of "0"s and "1"s, then all the blocks will look similar and no fluctuation of state will be detected. In this case, the block size must be reduced. After examining a few cases, it is found that the 10 data points per block are adequate to preserve the characteristics of the state change. As shown in later sections associated with CA, the 10 data points per block are sufficient to keep the discussion consistent. The time series of the number of clusters in Fig. 6 (c) shows the history of the state transitions. Since the number of clusters in a state is defined as a state index, the square brackets that make up the pair of two numbers indicate that the state of the number in the lower row has transitioned to the state of the number in the upper row. The transition rate matrix W i j defines the total number of transitions from i-state to j-state so that the W i j count is incremented by 1 when a transition from i-state to j-state occurs ( Fig. 6 (d)). The result of the W i j counting for the 100 data is shown in Fig. 6 (e). Figure 6 (f) shows the unnormalized probability density distribution, whose com- The state of GVF fluctuates over time, implying that the state is non-equilibrium. It is known that the thermodynamics of a fluctuating non-equilibrium system is described by the master (Eq. 1), and the Entropy Production Rate (Eq. 2) (as referred to by Haitao and Jiulin (2014); Ito (2020).
where W i j and p i are coherent with those in Fig. 6. J i j and F i j are called the flow from i-state to j-state and thermodynamic force, respectively. Note that in an equilibrium system where p i is independent of time, E P R(t) = 0. E P R is an important concept in thermodynamics for quantifying the rate of entropy generation in a system. Entropy is a measure of the level of randomness in a system, and it tends to increase as time progresses. Within this manuscript, E P R is regarded as an indicator of system stability: E P R = 0 corresponds to a thermodynamic equilibrium or time-independent system, and the sharp decrease in E P R implies an abrupt change in the system from non-equilibrium toward equilibrium. Figure 7 shows EPR, the Fourier amplitude spectrum, W i j contour plot, and t − t diagram. Figure 7 (a1)-(a6) are copies of Fig. 2 (c1)-(c6). W i j contour plot and t − t diagram are calculated from the GVF in Fig. 2 The EPRs in Fig. 7 (a2) to (a5) are all greater than 0.08. But, the W i j contour of Fig. 7 (b3) with the yellow area in the lower left is distinguished from the W i j contours of Fig.  7 (b2), (b4) and (b5), with the yellow area in the upper right. Therefore, the W i j contour does not correspond to the EPR.
On the other hand, Fig. 7 (a3) shows a moderate negative curvature of the Fourier spectrum, implying a small positive A-noise. Figure 7 (a2), (a4) ad (a5) show a sharp negative curvature implying a large positive A-noise. Thus, for large EPRs above 0.08, the small positive A-noise corresponds to the W i j contour with the yellow area in the lower left, and the large positive A-noise corresponds to the W i j contour with the yellow area in the upper right. Therefore, the W i j contour and therefore W i j correspond to the (EPR, A-noise) pair.
In Eqs. 1 and 2, the independent thermodynamic variables are EPR, W i j , and p i , where W i j corresponds to the EPR-A-noise pair. Therefore, EPR, A-noise, p i , and additionally the density of states are the thermodynamic parameters to be evaluated in the comparison of GVF and CA. Adopting A-noise is advantageous since A-noise is observable and directly related to GVF.    (10days)) with the positive A-noise calculated every 51.2 seconds. The logarithm is applied to A-noise to extract the positive component. A sharp decrease in EPR(10days) is seen at the timing of the strong peak of the positive A-noise ( Fig. 8 (a)). For example, in Fig. 8 (a), along the green vertical line indicated by "(d)", the red-filled circle is located at the very low E P R(10days) level around 0.03, and the black line reaches to the highest level of about 0.5. Since E P R(10days) = 0 corresponds to a thermodynamic equilibrium or time-independent system, the sharp decrease in EPR(10days) implies an abrupt change in the system from non-equilibrium toward equilibrium. The sharp decrease in EPR(10days) is observed at the timing of GEJE (green down arrow in Fig. 8 (a)) and in the period of approximately 3 years before and after GEJE ( Fig. 8 (a)). EPR(10days) takes high, medium, and low values at the time intervals (b), (c), and (d) in Fig. 8 (a). The characteristics of these time intervals are shown in Fig. 8 (b), (c), and (d). In all cases, the density of states fluctuates around a value of 0.5. Figure 8 (b) and (c), which are the cases of large and medium EPR(10days), both show that the frequency of occurrence is concentrated around A-noise= -1 (left and center graphs) and EPR(10days)>0.06. In the positive A-noise region, the frequency of occurrence is very low (center graphs in Fig.  8 (b) and (c)). On the other hand, when the EPR(10days) is small, around 0.03, a relatively strong concentration is shown in the positive A-noise region (red down arrow in the center graph in Fig. 8 (d)). These time intervals (b), (c), and (d) are chosen for use in later sections.

Equivalence of GVF and CA
If we choose a CA that generates a binary sequence, the thermodynamic parameters defined for the binarized GVF can also be evaluated for the binary sequences generated by the CA. We can then compare the thermodynamic parameters of GVF with those of CA to examine their thermodynamic equivalence.

Cellular automaton
We briefly review CA-184 or Rule 184 cellular automaton, define (d, p)-CA184, a stochastic extension of the CA-184, and choose (d, p)-CA184 with periodic boundary conditions for comparison with GVF. CA-184 which is frequently introduced for modeling traffic jams is the one-dimensional array of cells, each containing "1" or "0". The time evolution of the cells, which depends only on the cells on either side of it, obeys the rule-184 (Nishinari and Takahashi 1999): where t and j are natural numbers, denoting discrete time and discrete space, respectively. The denominator U t+1 j , which is the cell value at (time, space) = (t + 1, x j ), is determined by the numerator U t j−1 U t j U t j+1 , which is a sequence of three binary numbers at time t. The rule-184 states that when there is the sequence "10" (one followed by zero) at time t, it will become "01" (zero followed by one) at time t + 1. In other words, assuming "0" and "1" denote an empty cell and the cell occupied by "1", respectively, if there is an empty cell on the right side of the occupied cell at time t, the "1" in the occupied cell moves right and occupies the empty cell at time t + 1. Therefore, the occupied cells move to the right as time passes. There are eight possible time evolution patterns as shown on the right side of Eq. 3. The binary number "10111000" obtained by arranging the denominators in order is 184 in decimal notation, which is the origin of the name rule-184.
Probabilistic fluctuation needs to be introduced in CA for expressing complex phenomena such as ground vibration on the earth. Introducing the moving or hop probability p, the rule-184 (Eq. 3) can be rephrased that if there is a sequence "10" at time t, then with probability p = 1, the sequence will be "01" at time t + 1. Extending the probability p to an arbitrary real number, the rule for p-CA184 is defined as the following: if there is a sequence "10" in the numerator at time t, then with probability p, the sequence becomes "01" at time t + 1. In other words, when p-CA184 advances, it advances with probability p. The CA-184 with the moving probability p is called the Asymmetric Simple Exclusion Process (ASEP). However, in this paper, we refer to ASEP as p-CA184 since p is closely related to thermodynamic parameters in later sections. In addition, since the density of states d affects the basic behavior of CA, we define p-CA184 calculated with the initial density of states d as (d, p)-CA184.
Here, the density of states is the ratio of the number of "1" to the total number of cells in the cell set under consideration.
The (d, p)-CA184 introduced in this paper consists of 10 cells in space, (x i , 1 ≤ i ≤ 10) and has periodic boundary conditions ( Fig. 9 (a)-(b)). The boundary conditions, or the values of boundary cells x 10 and x 1 at time t + 1, are determined by referring the cells on either side of each boundary cell at time t. The boundary conditions are determined by applying the rule-184 to (U t 10 U t and (U t 9 U t 10 U t 1 )/U t+1 10 ( Fig. 9 (b)). The CA data is generated by reading the cell value of x 10 at each time step t j (Fig. 9 (c)).

Thermodynamic comparison of recorded GVF data with CA
The binary sequences generated by the periodic (d, p)-CA184 are expected to obey the thermodynamic equations Eqs. 1 and 2 since they statistically fluctuate between "0" and "1". Similar to the binarized GVF, the thermodynamic parameters of CA are calculated and compared with those of the binarized GVF. The comparisons are made at three 10-day intervals indicated by (b), (c), and (d) in Fig. 8 (a), corresponding to large, medium, and low EPR(10days), respectively. Scales for calculating the thermodynamic parameter are unified in the GVF-CA comparison. For both GVF and CA data, the EPR is calculated for the first 1000 data points of the data block for which A-noise is calculated. Here, one data block for either the GVF data or the CA data is defined as 1024 data points equidistant in chronological order. For the GVF data, the one data block corresponds to 51.2 seconds of data acquisition time. This section compares the 16875 data blocks of the GVF data (corresponding to the data length of 10 days) with the CA data blocks and evaluates the degree of agreement between them. It should be noted that A-noise in CA is notationally identical to the A-noise defined in Fig. 3, but is calculated from the binary sequence generated by CA.
CA data of periodic (d, p)-CA184 is generated by the method shown in Fig. 9. The initial density of states "d" is selected to be 0.5 to be coherent with the density of states of the GVF data which fluctuates around 0.5 as shown in Fig.  8 (b)-(d). 2000 CA data are generated for each hop probability " p" of 0.2, 0.3, 0.5, and 0.87, and the thermodynamic characteristics of the generated data are shown in Fig. 10. Figure 10 shows the frequency of (A-noise, EPR) pairs, frequency of A-noise, and density of states. In all figures in Fig. 10, the frequency of occurrence of A-noise is concentrated around -1, similar to the characteristics of the GVF in Fig. 8 If " p" is greater than or equal to 0.5, a larger positive A-noise is observed (red down arrow in the A-noise frequency plot in Fig. 10 (c) and (d)) than if " p" is less than 0.5. Thus, the large " p" which is the near-equilibrium condition corresponds to a more noticeable positive A-noise distribution. This is coherent with the GVF case. The noticeable positive A-noise distribution is seen in the GVF data in Fig. 8 (d) corresponding to the other near-equilibrium condition, the small EPR(10days). Therefore, for both periodic (d, p)-CA184 and GVF, relatively large positive A-noise corresponds to near-equilibrium conditions. Figure 11 compares the calculated thermodynamic parameters of GVF and CA. Instead of comparing thermodynamic parameters separately and directly, we compare the data blocks consisting of 1024 digits, which are the source data for calculating thermodynamic parameters. There are 16875 data blocks in a 10-days of GVF data. Suppose there are NN data blocks in the CA data generated by the periodic (d = 0.5, p < 1)-CA184. We pick up one data block from the 16785 GVF data blocks and compare it with all NN data blocks of CA to find the "best match" CA data block that is most similar to the GVF data block. The thermodynamic parameters calculated for the GVF data block are assigned as horizontal coordinate values x in the graphs in Fig. 11. The thermodynamic parameters calculated for the "best match"- Thermodynamic 10-day data comparison between periodic (d = 0.5, p < 1)-CA184 and GVF. The four graphs compare the evaluation vector components: EPR, A-noise, the density of states, and p i (i = 1, 2, 3, 4 and 5), from left. (a) Comparison in the time interval "(b)" in Fig. 8, (a). (b) Comparison in the time interval "(c)" in Fig. 8, (a). (c) Comparison in the time interval "(d)" in Fig. 8, (a) CA data block are assigned as vertical coordinate values y in the graphs in Fig. 11. If there is a CA data block which is thermodynamically identical to the selected GVF data block, the calculated thermodynamic parameters of the "best match"-CA data block will coincide with those of the selected GVF data, and the point (x,y) is placed on the black line in the graph in Fig. 11. This comparison process is repeated for all 16785 GVF data blocks.
The comparison of the GVF data blocks and CA data blocks is performed using the evaluation vectors. An evaluation vector containing 8 components: EPR, A-noise, the density of states, p 1 , p 2 , p 3 , p 4 , and p 5 is calculated for each data block of the CA and GVF. We then calculate the Pearson correlation of the two evaluation vectors to assess the degree of agreement between the two. For each GVF data block, the CA data block with the maximum absolute Pearson correlation value is selected as the "best match" with the GVF data block. The values of the thermodynamic parameter or the (x,y) values in the graphs in Fig. 11 are extracted from the evaluation vector. Figure 11 compares the thermodynamic parameters extracted from the evaluation vectors of the "best match" CA and GVF data blocks. In Fig. 11, the p i components are differentiated by colors so that the values of p i can roughly be read. Figure 11 (b) shows good agreements of the evaluation vector components between GVF and CA for the 10-day time interval "(c)" in Fig. 8 (a), corresponding to the medium EPR(10days) value of 0.054. The GVF data contains 16875 data blocks. The CA data consists of NN=10000 data blocks generated for (d, p) of (0.5, 0.2), (0.5, 0.22), (0.5, 0.3), (0.5, 0.5), and (0.5, 0.87). For every (d, p), 2000 data blocks are generated. The evaluation vectors of GVF match well with those of CA: 99.9% of the 16875 pairs of GVF data blocks and "best match" CA data blocks show a Pearson correlation greater than 0.9. Figure 11 (a) shows good agreements of the evaluation vector components between GVF and CA for the 10-day time interval "(b)" in Fig. 8 (a), corresponding to the high EPR(10days) value of 0.06. The GVF data contains 16875 data blocks. To handle the high value in EPR, 4000 data blocks are generated for each of p = 0.2, 0.22, 0.3, 0.5 and 0.87, and 2000 data blocks are additionally generated for each of p = 0.15 and p = 0.17. A total of NN=24000 data blocks are included in the CA data. 99.7% of the 16875 pairs of GVF data blocks and "best match" CA data blocks show a Pearson correlation greater than 0.9. Figure 11 (c) shows good agreements of the evaluation vector components between GVF and CA for the 10-day time interval "(d)" in Fig. 8 (a), corresponding to the low EPR(10days) value of 0.032. The GVF data contains 16875 data blocks. To handle the low value in EPR, 4000 data blocks are generated for each of p = 0.2, 0.22, 0.3, 0.5 and 0.87, and 2000 data blocks are additionally generated for p = 0.7. A total of NN=22000 data blocks are included in the CA data. 99.9% of the 16875 pairs of GVF data blocks and "best match" CA data blocks show a Pearson correlation greater than 0.9.
In all cases of the 10-day comparison between the GVF and the periodic (d = 0.5, p < 1)-CA184, there is always a well-matched CA data block for each GVF data block. The GVF data block is therefore a thermodynamic subset of the set of periodic (d = 0.5, p < 1)-CA184 data blocks and is thermodynamically indistinguishable from the set of periodic (d = 0.5, p < 1)-CA184 data blocks. So the GVF in the GEJE era is thermodynamically equivalent to the periodic (d = 0.5, p < 1)-CA184.
Note that the match between CA and the GVF for the small EPR(10days) in Fig. 11 (c) is achieved by introducing the additional CA data generated for p = 0.7. The small EPR(10days) and p = 0.7 are close to the equilibrium conditions of EPR(10days)=0 and p = 1 respectively. This implies that near-equilibrium GVF corresponds to near-equilibrium CA, providing another evidence that the GVF and the CA are thermodynamically equivalent.
It should also be noted that if the evaluation vector of the GVF data block and the CA data block match, good matches are also obtained for detailed dynamic properties such as the Fourier amplitude spectrum, the contour plot of W i j and the t − t diagram. With respect to the case of large EPR(10days)(=0.06), Fig. 12 (a1) and (a2) compare the detailed characteristics of the best matched CA data block and the corresponding GVF data block, respectively. In both cases, there is no large spectrum component near 10 Hz in the Fourier spectrum, there is a large yellow distribution in the lower left of the transition rate matrix contour, and there are a small number of clusters of blue cells in each row of the t − t diagram.
With respect to the case of low EPR(10days)(=0.032), Fig.  12 (b1) and (b2) compare the detailed characteristics of the best matched CA data block and the corresponding GVF data block, respectively. Both show large frequency components near 10 Hz in the Fourier spectrums, a large yellow distribution in the upper right of the transition rate matrix contour, and a sparse distribution of blue cells in the t − t diagram.

Data mining methodology
As an application of the GVF-CA equivalence discussed in the previous section, we propose a data mining method that extracts information of practical value from the GVF by introducing CA. Figure 13 shows the overview of the data mining protocol. The arrow (a) indicates the first action conducted with CA. By analogical interpretation by CA, we derive a thermodynamic function (ATF) analogous to the target physical phenomenon (TPP). Next, considering the Fig. 12 Comparison of detailed parameters between the GVF data block and its best-matched CA data block. The left is the Fourier amplitude spectrum, the center is the filled contour of the transition rate matrix, and the right is the t − t diagram consisting of the first 100 data points of the binary sequence. (a1) Detailed parameters for the GVF data block of 51.2 seconds starting at 00:34:33 on 2009-04-30. The elapsed time is included in the time interval with a large value of EPR(10days)(=0.06; Fig. 8 (b)). (a2) Detailed parameters of the CA data block that best matched the data block of (a1) and is generated by (d = 0.5, p = 0.5)-CA184. (b1) Detailed parameters for the GVF data block of 51.2 seconds starting at 03:38:01 on 2011-12-24. The elapsed time is included in the time interval with a low value of EPR(10days)(=0.032; Fig. 8 (d)). (b2) Detailed parameters of the CA data block that best matched the data block of (b1) and is generated by (d = 0.5, p = 0.87)-CA184 thermodynamic equivalence of CA and GVF, we evaluate the ATF for the GVF data and discover the pseudo-physical phenomenon which looks similar to the TPP (arrows (b) and (c)). Finally, we show that the pseudo-physical phenomenon is actually TPP by showing physical evidence (arrow (d)).
The discussions on various topics, including the reasons for choosing GVF over ground vibration itself and the rationale behind selecting CA184, can be found in "Appendix": Appendix.

Data mining example and results
This section provides examples of data mining with CA and how this method can contribute to seismology.

Analogous interpretation of stress relaxation by CA
In data mining the GVF data recorded at KSN, we first use the periodic (d = 0.5, p)-CA184, which is thermodynamically equivalent to the GVF, to derive an analogical thermodynamic function corresponding to the target physical phenomenon. Since the hop probability p causes a delay in forwarding movement in the CA, p is expected to be associated with some delay phenomenon in the CA. One such delay is viscoelastic stress relaxation, which is the delay in stress response to a sudden jump in strain.
As shown in Fig. 14, viscoelastic stress relaxation under constant strain is described by the Maxwell model which consists of a series of an elastic spring G and a dashpot of viscosity η. When a constant total strain γ 0 is applied at time t = t 0 , a stress jump occurs at t = t 0 , after which the stress exponentially decays with a relaxation time τ . The τ is the time interval at which the stress drops by e −1 times the stress at t = t 0 and represents the degree of ductility of the material.
After trial and error, it is found that the parameter R+Anoise, calculated from data generated by the periodic (d = 0.5, p)-CA184, decays exponentially with respect to 1/ p (Fig. 15). The R+A-noise on the vertical axis is defined as the ratio of the number of positive A-noise to the total number of A-noise in a data set consisting of 21,499 A-noise. The 288 points of R+A-noise are calculated for all combinations of the 36 cases of p = {0.1, 0.125, · · · , 0.95, 0.975} and 8 initial conditions. The horizontal axis is 1/ p, which is the reciprocal of the hop probability. The solid red line, which is an exponential approximation of R+A-noise in the range (1/0.7) < (1/p) < (1/0.33), well agrees with the data points (solid black circle).
Assuming a system where 1/ p changes in proportion to time, the equation for the solid red line is built as shown in Eq. 4. t si and t ei are the start and end of the time interval of interest, respectively. τ i and b i are relaxation time and constant, respectively, and are determined by linear regression applied to the CA data. The values of these parameters  are shown in Table 1. If R+A-noise and 1/ p of Eq. 4 are regarded as stress and time of Maxwell model of Fig. 14, and if A i , t si , and τ i are regarded as σ 0 , t 0 , and τ , respectively, then, the Eq. 4 corresponds to stress relaxation of Maxwell model. Therefore, the R+A-noise is a candidate parameter for data-mining the GVF recorded at KSN to extract stress relaxation phenomena from the GVF.

Evaluation of analogous stress R+A-noise in GVF
Figure 16 (a) shows the R+A-noise calculated for each 10day long GVF data recorded at KSN from 2006-01-01 00:00 to 2016-12-30 23:59. It is found that R+A-noise(10days) discontinuously increases immediately after the GEJE and then gradually decays. The decay is approximated by three piecewise exponential functions shown by the red, orange, and blue lines in Fig. 16(b). Each exponential function is described by Eq. 4, which describes the stress relaxation of Maxwell's model, and is thus considered a piecewise stress-like relaxation. Details of the stress-like relaxation parameters are shown in Table 2. The time intervals for the three stress-like relaxations are 40, 173, and 1271 days, which gradually increase over time. The corresponding relaxation times are 200.56 days, 685.87 days, and 12547.72 days, respectively, and increase over time. The relaxation trend, or the decay of R+A-noise(10days), is then approximated by the green line in Fig. 16 (b), which is a linear combination of the three exponential functions described in Table 2: where τ 3 =200.56 days, τ 1 = 685.87 days, τ 2 = 12547.72 days, b 3 = 7.11, b 1 = 0.13, and b 2 = −2.82.
On the other hand, the differential equation, ... σ + cσ + aσ + bσ =0, represents the stress relaxation in the M1V2 viscoelastic model defined as one Maxwell model and two Voigt models connected in series (Fig. 17). From the calculated c, a, and b, we try to estimate the three rigidities and three viscosities in the M1V2 model. There are six unknown viscoelastic parameters for the equations for c, a, and b, so we need to introduce three restrictions. First, note that the equations for c, a, and b are invariant under the exchange of the indexes 1 and 3 in Fig. 17. Next, the equations for c, a, and b in Fig. 17 are rewritten as Eq. 5. Define E M2 /E V 1 = E M2 /E V 3 = γ , 1/τ V 1 = β/τ 1 , and 1/τ V 3 = β/τ 3 so that the invariance under the index exchange is conserved. And define 1/τ M2 = α/τ 2 , and E v1 = E v2 = 64G Pa (Sun et al.   Fig. 16 2014). These definitions yield the solvable equations of c, a, and b, including the three unknowns α, β, and γ (Eq. 6). The only real solutions to these equations are α = 1.00485, β = 0.99758, and γ = 0.00102, and the M1V2 viscoelastic parameters estimated from these solutions are shown in Table 3.

Verification of the result
This section validates our claim that the observed relaxation signal is an existential physical phenomenon and indeed the stress relaxation signal, in three ways. One is a comparison with previous research on stress relaxation after the GEJE. The second is the observation of the correspondence between the R+A-noise and external loading due to coastal waves. The third is a discussion of the relationship between the R+Anoise and stress. The stress-like relaxation obtained from the data mining can be validated as an existential physical phenomenon by comparison with the previous study by Sun et al. (2014) on the dominant role of stress relaxation in post-GEJE land and seafloor deformation (Sun et al. 2014). By analyzing seafloor GPS data, they revealed fast landward motion in the trench area which opposes the seaward motion on the land. Using numerical models of viscoelastic mantle rheology, they demonstrated that the land motion is a consequence of stress relaxation induced by asymmetric rupture of the thrust earthquake. Their stress relaxation model that appropriately reproduces the time evolution of the seafloor and land motion is compared with the M1V2 model. Sun et al. (2014) considered three viscoelastic structures, mantle wedge, oceanic mantle, and LAB layer, and introduced the Burgers rheology model consisting of one Maxwell model and one Voigt model in series, which is equivalent to the M1V2 model in Fig.   Fig. 17 M1V2 viscoelastic model and its differential equation. For stress relaxation problems where the strain γ is constant, the differential equation becomes a stress equation   (Table 4) and compared with the stress-like relaxation time of the M1V2 model (Table 3). The comparison is graphically shown in Fig. 18. Figure 18 shows that the stress relaxation times τ V 1 , τ M2 , and τ V 3 observed immediately after GEJE to 1484 days after GEJE in this study are in the range of the stress relaxation times discussed by Sun et al. (2014). Therefore, the stress-like relaxation of R+A-noise(10days) captures an existential physical phenomenon.
Next, the stress-like relaxation obtained from the data mining can be validated as stress relaxation by observation of the correspondence between the R+A-noise and external loading due to coastal waves. Figure 19 Figure 19 (a) is the duplication of Fig. 16 (a). Figure 19 (b) is the 100-point Simple Moving Average of the hourly maximum wave height (m) downloaded from the "Various data and materials: coastal wave meter observations" site of the Japan Meteorological Agency under the Ministry of Land, Infrastructure, Transport and Tourism (JMA 2019). The data before the GEJE (blue line) is recorded by ultrasonic coastal wave gauge at Enoshima (38.40N,141.61E, -57m height), the coast 65km from KSN. The data after the GEJE (black line) is recorded by radar coastal wave gauge at Karakuwa (38.86N,141.67E, 63m height), the coast 18km from KSN. There are no observation points near KSN that have been continuously operated before and after the GEJE. The red vertical lines connect the wave height peak with the R+A-noise and log 10 (A-noise) peaks. The timing of the GEJE is indicated by the green vertical line at the bottom of Fig. 19 (b). Note that no wave data is available for the time interval from the end of the blue line to the start of the black line.
Comparing Figs. 19 (a) and (b), the timing of the R+Anoise peaks is similar to the timing of the wave height peaks. This suggests that the increase in R+A-noise is the crustal response to the wave-generated gravitational force and that R+A-noise corresponds to either strain or stress. Time-dependent material responses to external forces are known to be either creep or stress relaxation. The former and latter involve an increase in strain and a decrease in stress over time, respectively. Since R+A-noise decreases with time after the GEJE, the decrease of R+A-noise should correspond to stress relaxation.
How stress is detected by R+A-noise is rationalized by discussing Fig. 19. Comparing Fig. (a) and Fig. 19 (c), no stress relaxation is found in A-noise which represents the intensity of the negative curvature in the Fourier intensity spectrum of the GVF. This suggests that Fourier spectral details are independent of stress relaxation. On the other hand, the time evolution of R+A-noise, which is essentially a count of positive A-noise, is associated with stress relaxation over time, suggesting that counts must be stressrelated. Assume that there are many small structures at the plate boundary and that one positive A-noise corresponding to pulses with large components at frequencies from 2.97 Hz to 9.8 Hz in the Fourier spectrum of the GVF arises from the destruction of one of the small structures. Then, when the stress at the plate boundary becomes large, many small structures are destroyed and many positive A-noises are generated. Thus, the count of A-noise or the magnitude of R+A-noise reflects the magnitude of the stress.

Conclusions
We claim that the GVF, which has a weak amplitude but is observed during most of the ground vibration observations and does not have a specific waveform, contains a wealth of information on the evolution of ground dynamics before and after the megathrust earthquake. The GVF, a fluctuating system, can be treated stochastically within the framework of the master equation and characterized by thermodynamic parameters such as entropy production rate, transition rate A 2 .13 × 10 3.26 × 10 2 1.70 × 10 2 1.81 × 10 4 6.03 × 10 6.03 × 10 B 9 .04 × 10 1.81 × 10 3 9.04 × 10 1.81 × 10 4 --

Fig. 18
Comparison of the relaxation times observed in the data mining in this study and the relaxation times introduced by Sun et al. (2014). in their numerical models matrix, thermodynamic states, the density of states, and Anoise. However, it is hardly known how the thermodynamic characteristics of the GVF are related to physically and practically meaningful data.
To extract the physically and practically meaningful data from GVF, we propose the data mining method that uses CA to interpret physical phenomena of interest as analog- The data before the GEJE (blue line) is recorded by ultrasonic coastal wave gauge at Enoshima, Miyagi prefecture, the coast 65km from KSN. The data after the GEJE (black line) is recorded by radar coastal wave gauge at Karakuwa, Miyagi prefecture, the coast 18km from KSN. The red vertical lines connect the wave height peaks with the R+A-noise and log 10 (A-noise) peaks. The timing of the GEJE is indicated by the green vertical line at the bottom. (c) A-noise in log 10 scale ical thermodynamic functions. The interpretation requires try-and-error effort, but CA provides a starting point. The analogical thermodynamic function is valid not only for CA but also for GVF, as the thermodynamic equivalence between GVF and CA has been shown. Then, in the data mining method, by evaluating the analogical thermodynamic function for the GVF data, we discover real-world phenomena similar to the physical phenomena of interest.
To show the equivalence of GVF and CA which is usually constructed on a binary basis, the GVF is binarized, and the equivalence of the GVF and the binarized GVF is shown. We regard the GVF as A-noise fluctuation, show that Anoise fluctuation is conserved under the GVF binarization, and have shown that the GVF and the binarized GVF are equivalent. When the GVF is binarized, both the GVF and CA are represented as a set of binary data.
Thermodynamic parameters associated with the master equation and entropy production rate are calculated from the binary data and compared between the binarized GVF and the periodic (d = 0.5, p < 1)-CA184. The thermodynamic parameters of interest are EPR, A-noise, the density of states, and the state vector containing 5 components. These thermodynamic parameters calculated for each of the selected GVF data blocks are simultaneously in good agreement with those calculated for the periodic-(d = 0.5, p < 1)-CA184 data blocks. Over 99.7% of the 16875 pairs of GVF data blocks and CA data blocks show a Pearson correlation of 0.9 or greater. The GVF in the GEJE era and the periodic (d = 0.5, p < 1)-CA184 are shown to be thermodynamically indistinguishable and thus equivalent.
In the first step of the data mining example for GVF in the GEJE era, stress relaxation is analogically interpreted as the ratio of the number of positive A-noise to the total number of A-noise (R+A-noise) calculated for the periodic (d = 0.5, p < 1)-CA184. By the GVF-CA equivalence, the R+A-noise should correspond to a phenomenon similar to stress relaxation in GVF.
Then, by evaluating R+A-noise(10days) for the GVF data, we have discovered the relaxation trend of R+Anoise(10days) that continues for 1484 days immediately after GEJE.
We verify that the discovered relaxation trend is an existential physical phenomenon since it corresponds to a stress relaxation model similar to the stress relaxation model that reproduces the crustal motion observed in the trench area after GEJE. The relaxation trend is then verified as stress relaxation by observing the correspondence between R+Anoise and external loading due to coastal waves. Finally, the stress detection by R+A-noise is rationalized by assuming that there are many small structures at the plate boundary and one pulse from one small structure failure corresponds to one positive A-noise. This implies that the count of A-noise or the magnitude of R+A-noise reflects the magnitude of the stress.
The data mining with CA has discovered the stress relaxation signals which support the idea of Sun et al. (2014) on the viscoelastic dynamics immediately after the thrust earthquake. In addition, examining the stress relaxation signals has triggered discussions on viscoelastic models and material properties of subsurface structures.
The GVF-CA thermodynamic equivalence has been shown only for the specific earthquake GEJE, and for the specific seismic station KSN. Further research is required to expand the scope of the method to other megathrust earthquakes and other seismic stations.