Discrimination between Small Earthquakes and Local Quarry Blasts Using a Committee Machine

A combination of multiple discrimination artificial neural networks using different seismic source parameters is suggested using a committee machine. In this work, a committee machine was used to combine supervised and unsupervised artificial neural networks to discriminate between earthquakes and quarry blasts using data from the Egyptian National Seismological Network (ENSN). The unsupervised networks are used as a measure of accuracy for the results of the supervised neural networks: the unsupervised Self-Organized Map (SOM) and the k-means clustering algorithm are used to estimate support and confidence measures for the results, while the supervised neural networks discriminate between earthquakes and explosions.
The artificial neural networks are trained using different input parameters: the P-wave spectrum corner frequency (Pcf), the S-wave corner frequency (Scf), and the ratio (Rcf) of Pcf to Scf. The combined approach succeeds in discriminating between earthquakes and quarry blasts in Northern Egypt, and it provides the results with a confidence measure that eliminates false discrimination.
The current paper presents an idea for implementing artificial intelligence to assist experts in decision-making situations. The committee machine can identify the nature of a particular event with the aid of several discrimination methods, combining the results of several algorithms and expert opinions into one single output with a confidence measure.


Introduction
Both explosions and earthquakes release a large amount of acoustic energy that ripples through the earth and is recorded by seismic stations; thanks to the difference in source dynamics, the recorded waveforms may look different. Still, discriminating between them is a job that requires trained analysts, and it is critical for cleaning seismic catalogs of possible explosions and for providing monitoring tools to control such blasts over vast areas for security and non-proliferation purposes.
Different discrimination methods have been proposed based on waveform amplitude ratios [1-4], spectral methods [5-13], or coda-based methods [14,15]. Discrimination has also been proposed based on time-of-day seismicity maps, since quarry blasts are usually carried out during the early hours of the day [16,17]. In addition, pattern recognition techniques have been used for seismic discrimination [18,19].
Many attempts have been made to discriminate between earthquakes and man-made seismic sources using neural networks [6,9,20-25]. Tiira [26] used a multilayer perceptron (MLP) to discriminate between nuclear explosions and earthquakes. Del Pezzo et al. [27] used a neural network to discriminate between earthquakes and underwater chemical explosions fired by fishermen in Pozzuoli Bay.
Nowadays, with the expanding use of explosive demolition-based techniques in mining and new infrastructure projects, it has become crucial to distinguish naturally occurring from man-made seismic events. Identification of an event's nature is urgently required by decision-makers. Without ground-truth verification, experts use different published methods for discrimination. However, these methods give different results, raising arguments about confidence, and they depend mainly on the analyst's experience.
Therefore, we developed an automated expert artificial neural network that combines the results of different methods and produces a single output with a confidence measure. This expert artificial neural network is a committee machine with the ability to identify the nature of a particular event with the aid of several discrimination methods. The proposed committee machine combines the results of several algorithms and expert opinions into one single output with a confidence measure. The confidence measure is estimated using unsupervised Self-Organized Maps (SOM) and k-means clustering.

The parameter dependency can be investigated through the correlation matrix listed in Table 1. The corner frequencies of the P- and S-wave spectra are highly correlated (the correlation coefficient is 0.96). Meanwhile, the corner frequencies and their ratio are uncorrelated with the duration magnitude, indicating that the corner frequencies are independent of the duration magnitude.
The distribution of the events over the four parameters is presented in Figure 3. The scatter plot (Figure 3) shows a continuous distribution of events along the range of each parameter. Remarkably, the corner frequencies and their ratio almost separate the earthquakes from the explosion events, with only a small overlap. This may be attributed to the time delays of the ripple-fired quarry blasts in the northern part of Egypt [32]. These ripple-fired explosions have a characteristic spectrum due to the time delay between detonations [6,33-38].

The learning data set was divided randomly into three sets containing 70%, 15%, and 15% of the data. The training set (70% of the data) was used to train the neural networks to achieve the required targets. The validation set (15% of the data) was used to monitor the training progress throughout the training process. Finally, the test set (15% of the data) was used to test the neural networks after training.
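The random 70/15/15 partition described above can be sketched as follows (a minimal version; the function name and seed are illustrative, not from the paper):

```python
import numpy as np

def split_learning_set(n_events, train=0.70, val=0.15, seed=0):
    """Randomly partition event indices into training, validation,
    and test sets (70/15/15 as described in the text)."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_events)          # shuffle all event indices
    n_train = int(round(train * n_events))
    n_val = int(round(val * n_events))
    return (idx[:n_train],                   # training set
            idx[n_train:n_train + n_val],    # validation set
            idx[n_train + n_val:])           # test set

train_idx, val_idx, test_idx = split_learning_set(720)
```

Splitting by shuffled indices rather than by slicing the raw catalog avoids any ordering bias (e.g. events sorted by date or magnitude).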
Four pairs of ANNs were developed to discriminate between earthquakes and explosions using different input data sets. The first three pairs of ANNs have a single parameter (either {Pcf}, {Scf}, or {Rcf}) in the input set, while the last pair of ANNs has all three parameters in the input set {Pcf, Scf, Rcf}.
In supervised training, the neural network is trained to output a specific target set. The pairs of ANNs were trained with two distinct target sets. The first target set is the source depth, where the explosions have zero depth and the earthquakes have deeper depths. The second target set is a binary set formed of ones for earthquakes and zeros for explosions; the networks were trained to produce 1 for earthquakes and 0 for explosions. So, eventually, we end up with eight ANNs.
Each neural network was trained over several epochs to reach the specified target set. During each epoch, the network goes through all the training samples and then updates its coefficients based on the MSE. Then the data of the validation and test sets are applied to the neural network and their MSE errors are computed. To ensure that the neural network is not memorizing the training set, the neural network coefficient set that produces the best validation results is used for discrimination.
Usually, the overall performance of an ANN is measured using the mean square error (MSE), the mean absolute error (MAE), and the correlation coefficient (R) between the estimated (y) and actual (x) values:

MSE = (1/N) Σ_i (y_i − x_i)^2, (1)
MAE = (1/N) Σ_i |y_i − x_i|, (2)
R = Σ_i (x_i − x̄)(y_i − ȳ) / √[ Σ_i (x_i − x̄)^2 Σ_i (y_i − ȳ)^2 ], (3)

where N is the number of samples and x̄ and ȳ are the means of the actual and estimated values.
By considering the ANN as a function of the input and target sets, the eight ANNs can be written in the form ANN(input set, target set). The performance results of the eight ANNs are listed in Table 2.
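These three performance measures can be sketched in a few lines of NumPy (the function name is illustrative):

```python
import numpy as np

def performance(y, x):
    """Mean square error, mean absolute error, and correlation
    coefficient R between estimated values y and actual values x."""
    y, x = np.asarray(y, float), np.asarray(x, float)
    mse = float(np.mean((y - x) ** 2))
    mae = float(np.mean(np.abs(y - x)))
    r = float(np.corrcoef(x, y)[0, 1])
    return mse, mae, r
```

Note that a constant offset between y and x leaves R at 1 while inflating the MSE and MAE, which is why R is the fairer measure when target sets have different ranges and units.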
The MSE and MAE could be misleading when comparing the ANNs that have the depth as a target set with those that have the binary target set, as the two sets have different ranges and different units (the depth is in km, while the binary set is unitless). Therefore, the correlation coefficient R is more suitable for such a comparison.
For the same input set, the performance is enhanced for the binary target set. The best performance was for the ANN with the ratio of the corner frequencies Rcf as the input parameter and the binary target set ({Rcf}, Binary). This indicates that Rcf has more separation capability than the other parameters (as can also be deduced from Figure 3). Despite an MAE of 0.015, several events were still misclassified.
To enhance the results, the outputs of each pair of ANNs that share the same input parameter are combined. The combination is done through a simple mathematical condition. Considering the ANN as a function of the target set, the combined ANN (ANNC) can be defined as:

ANNC = Earthquake if ANN(Depth) > 2 and ANN(Binary) > 0.5; otherwise Explosion. (4)

Therefore, any event is declared an earthquake if the output of the ANN that has the source depth as the target set is greater than 2 and the output of the ANN that has the binary set as the target set is greater than 0.5. Otherwise, the event is declared an explosion.
This simple combination enhances the results significantly. Figure 5 shows the combined results of the ANNs. The outputs of each successive pair of the eight ANNs listed in Table 2 are combined to produce four ANNCs, labeled ANN1 to ANN4, as depicted in Figure 5. The first combined ANN has 83 mistakes and the second has only 6 mistakes, while the third and fourth combined ANNs (Figure 5c & d) achieve almost 100% accurate discrimination (720 and 719 correct discriminations, respectively). However, this may not hold for other events that were not part of the learning data set.
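The combination condition above can be sketched as follows (thresholds taken from the text; the function name is illustrative):

```python
def combined_decision(depth_output, binary_output):
    """Combine one pair of ANNs trained on the same input parameter:
    declare an earthquake only when the depth-target ANN outputs more
    than 2 (km) AND the binary-target ANN outputs more than 0.5;
    otherwise declare an explosion."""
    if depth_output > 2.0 and binary_output > 0.5:
        return "earthquake"
    return "explosion"
```

Requiring both conditions makes the combined output more conservative than either ANN alone: a single network voting "earthquake" is not enough.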
Therefore, ±0.05 percent of random noise was added to the learning data set. This random error could account for mis-picking of the corner frequencies in a real situation. The results are shown in Figure 6. The ANNs are still capable of discriminating with few mistakes: the total mistakes are 123, 8, 13, and 3 for ANN1, ANN2, ANN3, and ANN4, respectively. This indicates that, for future event discrimination, any of the listed ANNs could produce a wrong classification. Therefore, the discrimination process cannot depend on any of them alone.
Moreover, an ANN has no measure of accuracy for any new input that was not part of the learning data set. To deduce such a measure, the Self-Organized Map (SOM) and k-means clustering were used.
Remarkably, some clusters are dominated by a single event type. Meanwhile, Figure 8 shows a 3D plot of the distribution of the events over the 9 clusters, with the estimated SOM weight positions marked within each parameter. It should be noted that the clusters have overlapping ranges over the three parameters.
Each cluster contains a number of events, "hits" (Hc). Some of them represent earthquakes (Ec) and the others represent explosions (Xc).
The support and confidence measures [49,50] for these clusters can be defined as follows.
The cluster support value (Sc) is the ratio of the number of events in that cluster to the total number of events (TE):

Sc = Hc / TE, (5)
The confidence of a certain type of event in a given cluster is the ratio of the number of events of that type in the given cluster to the number of events in that cluster.
The confidence of earthquakes in a given cluster is

CE = Ec / Hc, (6)

and the confidence of explosions in a given cluster is

CX = Xc / Hc. (7)

For simplicity, these ratios can be presented as percentages. The support and confidence measures of the nine clusters are listed in Table 3, while those of the 4 clusters are listed in Table 4.

The trial-and-error technique is commonly practiced with neural networks to find the network structure that produces the best performance. Many different neural networks (with different structures, numbers of layers, and numbers of neurons per layer) are trained, and only the one with the best performance is used. The performance is measured over the training, validation, and test sets, which usually do not cover the entire input space. This technique has two drawbacks. First, the network with the best performance on these sets does not necessarily have the best performance over other sets from the input space, nor is it necessary for one network to have the best performance over all three sets; the ANN with the best overall performance in Table 2 does not have the best performance over the test set. Second, all the effort involved in training the discarded networks is wasted.
The committee machine can overcome these drawbacks. A committee machine can offer better performance than any individual constituent neural network. Although the ANNs have identical configurations and are trained with similar data, they start from different initial conditions and therefore usually converge to different local minima. Committee machines use different combination algorithms to combine the results: the combiner function can be as simple as averaging or as complex as a nonlinear gating function [41].
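The support and confidence measures of Eqs. (5)-(7) can be sketched as follows (the symbol names follow the text; the example counts for cluster 8 are the 94 earthquakes and one explosion mentioned in the discussion, out of 720 events):

```python
def cluster_measures(hits_eq, hits_ex, total_events):
    """Support and confidence (Eqs. 5-7) for a single cluster.
    hits_eq / hits_ex: earthquakes / explosions in the cluster,
    total_events: total number of events over all clusters (TE)."""
    hits = hits_eq + hits_ex            # Hc, events held by the cluster
    support = hits / total_events       # Sc = Hc / TE   (Eq. 5)
    conf_eq = hits_eq / hits            # CE = Ec / Hc   (Eq. 6)
    conf_ex = hits_ex / hits            # CX = Xc / Hc   (Eq. 7)
    return support, conf_eq, conf_ex

# Example: cluster 8 holds 94 earthquakes and 1 explosion of 720 events.
s, ce, cx = cluster_measures(94, 1, 720)
```

A cluster with high support and a confidence near 1 for one event type gives a strong probabilistic backing to the ANN classification; a cluster with mixed confidence flags the event for expert review.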
However, in this work, an ANN was used as the combination function for the results of the different discrimination methods as well as the results of the trained ANNs.

The committee machine will tend to follow the inputs that best match the target, which happens to be ANN4. To overcome this issue, the results of the four combined ANNs (ANN1 to ANN4) were intentionally and randomly manipulated to reach 20% wrong classification, so that the inputs from the four combined ANNs have the same priority during the training process of the committee machine.
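The intentional degradation to 20% wrong classifications can be sketched as follows (a minimal version assuming binary labels; the function name and seed are illustrative):

```python
import numpy as np

def degrade_labels(labels, wrong_fraction=0.20, seed=0):
    """Randomly flip a fraction of binary classifications
    (1 = earthquake, 0 = explosion) so that each combined ANN
    reaches ~20% wrong answers before the committee machine
    is trained, giving the four inputs equal priority."""
    labels = np.asarray(labels).copy()   # leave the caller's array intact
    rng = np.random.default_rng(seed)
    n_wrong = int(round(wrong_fraction * labels.size))
    flip = rng.choice(labels.size, size=n_wrong, replace=False)
    labels[flip] = 1 - labels[flip]      # flip the selected labels
    return labels
```

Without this equalization, the combiner could simply learn to copy the most accurate input and ignore the others.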

Discrimination procedure
The discrimination algorithm consists of four stages.

Stage 1 (ANN)
For any new event of unknown source, the three parameters are estimated using the EQK_SRC_PARA software [31] as indicated earlier. These data are fed to the ANNs presented in Figure 4 and listed in Table 2, and the results are combined using Eq. (4). The output of this stage is four event-type estimates.

Stage 2 (finding the holding cluster)
The inputs of this stage are the event spectral parameters (Pcf, Scf, Rcf) and the event types estimated by the combined networks ANN1 to ANN4. The event parameters are used to find the holding SOM cluster using the k-means centers. In each of the four SOMs presented in Figure 7, the event belongs to the cluster with the closest k-means center:

Holding cluster(e) = arg min_{i ∈ {1, …, m}} ‖e − C_i‖,

where m is the number of clusters, C_i is the k-means center of cluster number i, and e = (Pcf, Scf, Rcf).
Every cluster in the SOM has a support measure, and each event type within that cluster has a confidence value, as listed in Tables 3 & 4. The output of this stage is the support and confidence of the holding clusters from the four SOMs for the designated event type.

Stage 3 (Committee machine)
The inputs of this stage are the event types estimated by the combined networks ANN1 to ANN4 together with their corresponding SOM support and confidence measures. These data are fed to the committee machine ANN combiner to produce the final output. The output of this stage is the event type with a confidence measure.

Stage 4 (Measures update)
After verification and approval of the resulting event type, the number of events in the holding clusters is incremented and their support and confidence measures are recomputed.
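The Stage-2 holding-cluster assignment can be sketched as follows (the centers shown are hypothetical values for illustration, not the paper's):

```python
import numpy as np

def holding_cluster(event, centers):
    """Assign the event e = (Pcf, Scf, Rcf) to the cluster whose
    k-means center Ci is closest in Euclidean distance:
    holding_cluster(e) = argmin_i ||e - Ci||."""
    event = np.asarray(event, float)
    centers = np.asarray(centers, float)
    distances = np.linalg.norm(centers - event, axis=1)
    return int(np.argmin(distances))

# Hypothetical k-means centers (Pcf, Scf, Rcf) for illustration.
centers = [(2.0, 2.0, 1.0), (8.0, 6.0, 1.3), (15.0, 10.0, 1.5)]
```

The index returned is then used to look up the cluster's support and the confidence of the designated event type in Tables 3 & 4.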

The neural networks were not able to estimate the depths of the earthquakes, and they produced relatively low or negative depths for the explosions. Even though the neural networks fail to estimate the depths of the earthquakes, they separate the earthquake events from the explosion events by producing different depth ranges for each (Figure 4c). The neural networks were not able to estimate the depths of the earthquakes because the number of samples representing any single depth value is relatively low.
A simple combination was applied to the results of the ANNs trained with the same parameter but with different outputs, either the depth or the binary output (1 for earthquakes and 0 for explosions). This combination is a simple form of committee machine applied in the first stage.
The simple combination applied to the ANN outputs enhances the results significantly. The combined results of the ANN trained with the corner frequency ratio Rcf and the ANN trained with the three parameters achieve almost 100% correct discrimination. These results indicate that the Rcf parameter significantly distinguishes the earthquakes from the explosions.
The nine-cluster SOM almost separates the earthquakes and explosions into different clusters. All the clusters contain a single event type except cluster 8, which contains 94 earthquakes and only one explosion. To visualize the result of these SOM clusters, the events were posted on a satellite map with the holding cluster number indicated with different shapes (Figure 9). The green-colored asterisks (cluster 5) are almost concentrated in a single location, indicating a different detonation technique there.
The committee machine produces 100% correct results, with confidence measures that represent the probability of event-type occurrence within the holding cluster.
The current paper presents an idea for implementing artificial intelligence to assist experts in decision-making situations. The committee machine can identify the nature of a particular event with the aid of several discrimination methods. The proposed committee machine combines the results of several algorithms and expert opinions into one single output with a confidence measure.

Figure 9: The spatial distribution of the 9-cluster SOM. The explosions are marked by asterisks and earthquakes by other shapes. The green-colored asterisks (cluster 5) are almost concentrated in a single location, indicating that this location has a special detonation characteristic.