4.1. Learning model structure and accuracy
Figure 8 shows the multi-layer structure of the model after training and validation. The trained model was evaluated using a confusion matrix, which summarizes the results of multi-class classification (Eqs. (1) and (2)) and is a standard measure of machine learning model performance. Four standard metrics were used to evaluate the performance. These metrics were calculated from four numbers measured during the test: 1) true positives (TPs, correct detections), 2) false negatives (FNs, missed targets), 3) false positives (FPs, incorrect detections), and 4) true negatives (TNs, correct rejections). Recall expresses the proportion of samples in each category that were detected (e.g., 5811 of 6069 slide (y'0) samples in Table 4). In the best case, recall equals 1 (100%), meaning that the CNN detected every instance of the category labels (y'0, y'1, and y'2) in the test set. Precision describes the percentage of detections for a category label that were classified correctly.
\(Recall=\frac{TP}{TP+FN}\times 100,\quad Precision=\frac{TP}{TP+FP}\times 100\) (1)
\(Accuracy=\frac{TP+TN}{TP+FP+FN+TN}\) (2)
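As a minimal sketch of how Eqs. (1) and (2) apply to a three-class confusion matrix, the following Python snippet computes one-vs-rest recall and precision for each class, together with the overall accuracy. The counts are those listed in Table 5; the function names are hypothetical. Two assumptions are made: precision follows the standard definition TP/(TP+FP), and accuracy is the fraction of correctly classified tiles (diagonal sum divided by the total), which reproduces the tabulated value of 0.797.

```python
# Rows are true classes and columns are predicted classes, in the order
# slide (y'0), non-slide (y'1), outside landslide (y'2).
# Counts copied from Table 5 (two-parameter model).
TABLE5 = [
    [5489, 176, 404],
    [231, 697, 341],
    [755, 348, 2655],
]


def per_class_metrics(cm):
    """Return (recall, precision) lists in percent, one entry per class."""
    n = len(cm)
    recall, precision = [], []
    for k in range(n):
        tp = cm[k][k]                                  # class k predicted as k
        fn = sum(cm[k]) - tp                           # class k predicted as another class
        fp = sum(cm[i][k] for i in range(n)) - tp      # other classes predicted as k
        recall.append(100.0 * tp / (tp + fn))
        precision.append(100.0 * tp / (tp + fp))
    return recall, precision


def overall_accuracy(cm):
    """Fraction of all samples on the diagonal (assumed meaning of Eq. (2))."""
    correct = sum(cm[k][k] for k in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total


recall, precision = per_class_metrics(TABLE5)
print([round(r, 1) for r in recall])        # recall per class, in percent
print([round(p, 1) for p in precision])     # precision is derived here, not reported in the tables
print(round(overall_accuracy(TABLE5), 3))   # overall accuracy
```

The recall values printed by this sketch match the Recall Rate column of Table 5 (90.4%, 54.9%, and 70.6%). The precision values are computed from the cell counts only and are not figures reported in the study.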
As shown in Table 4, the recalls were 96% (yʹ0), 52% (yʹ1), and 81% (yʹ2), and the overall accuracy was 0.856. Good results were obtained for slide (y'0) and outside landslide (y'2). The recall for non-slide (y'1) was low, which may result from the number of learning data available for this class. The results indicate an increase in the recall ratio and accuracy (except for the non-slide (y'1) recall) over those of Sakita et al. (2019), who used two numerical images (Table 5).
Table 4
Correct answer rate from verification results using eight parameters
| Data Type | Slide (yʹ0) | Non-slide (yʹ1) | Outside Landslide (yʹ2) | Recall Rate |
|---|---|---|---|---|
| Slide | 5811 | 178 | 80 | 95.7% |
| Non-slide | 391 | 658 | 220 | 51.9% |
| Outside landslide | 464 | 236 | 3058 | 81.4% |
| Total | 7521 | 1966 | 6636 | |
| Accuracy | 0.856 | | | |
Table 5
Correct answer rate from verification results using two parameters
| Data Type | Slide (yʹ0) | Non-slide (yʹ1) | Outside Landslide (yʹ2) | Recall Rate |
|---|---|---|---|---|
| Slide | 5489 | 176 | 404 | 90.4% |
| Non-slide | 231 | 697 | 341 | 54.9% |
| Outside landslide | 755 | 348 | 2655 | 70.6% |
| Total | 7461 | 2266 | 6396 | |
| Accuracy | 0.797 | | | |
The trained model was then evaluated on unknown data: three of the 38 slide sites (IDs 8, 20, and 23) and non-slide areas in the northern part of the study area (4 × 1.5 km), none of which were used for training or validation. The results are shown in Table 6 and Fig. 9(a) (example showing non-slide areas only). Compared with the training results, the recall for slide (y'0) decreased from 95.7% to 50.0%, and the recall for non-slide (y'1) decreased from 51.9% to 21.2%. However, the recall for outside landslide (y'2) increased from 81.4% to 87.6%. The results indicate that the learning was implemented efficiently for slide (y'0) and non-slide (y'1) areas.
Table 6
Results of analysis from the evaluation phase by eight parameters
| Data Type | Slide (yʹ0) | Non-slide (yʹ1) | Outside Landslide (yʹ2) | Recall Rate |
|---|---|---|---|---|
| Slide | 12 | 4 | 8 | 50.0% |
| Non-slide | 18 | 65 | 223 | 21.2% |
| Outside landslide | 90 | 188 | 1963 | 87.6% |
| Total | 120 | 257 | 2194 | |
| Accuracy | 0.793 | | | |
The results using the two types of numerical analysis images are shown in Table 7 and Fig. 9(b) (example showing non-slide areas only). With the eight types of numerical images (Table 6 and Fig. 9(a)), the recall for slide (y'0) increased from 37.5% to 50.0% relative to the two-image results. The recall for non-slide (y'1) decreased slightly, from 23.8% to 21.2%, but the recall for outside landslide (y'2) increased from 70.4% to 87.6%. Accuracy also increased from 0.639 to 0.793. This indicates that increasing the number of numerical analysis images had a positive effect. Additionally, Fig. 10 compares the recall ratios and accuracies of the slide sites (IDs 8, 20, and 23) and non-slide images. The recall ratio improved in four out of eight items, the accuracy improved in three out of four items, and the total accuracy improved in all items.
Table 7
Results of analysis from the evaluation phase using two parameters
| Data Type | Slide (yʹ0) | Non-slide (yʹ1) | Outside Landslide (yʹ2) | Recall Rate |
|---|---|---|---|---|
| Slide | 9 | 8 | 7 | 37.5% |
| Non-slide | 118 | 81 | 142 | 23.8% |
| Outside landslide | 444 | 209 | 1553 | 70.4% |
| Total | 571 | 298 | 1702 | |
| Accuracy | 0.639 | | | |
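The comparison between the two-image and eight-image models on the unknown data can be reproduced directly from the counts in Tables 6 and 7. The following is a minimal Python sketch under that assumption; the names `TWO_PARAM`, `EIGHT_PARAM`, `recalls`, `accuracy`, and `compare` are hypothetical and are introduced only for illustration.

```python
CLASSES = ["slide (y'0)", "non-slide (y'1)", "outside landslide (y'2)"]

# Rows are true classes, columns are predicted classes (same order as CLASSES).
EIGHT_PARAM = [[12, 4, 8], [18, 65, 223], [90, 188, 1963]]    # Table 6
TWO_PARAM = [[9, 8, 7], [118, 81, 142], [444, 209, 1553]]     # Table 7


def recalls(cm):
    """Recall in percent for each class: diagonal count over the row sum."""
    return [100.0 * row[k] / sum(row) for k, row in enumerate(cm)]


def accuracy(cm):
    """Fraction of all tiles that fall on the diagonal."""
    correct = sum(cm[k][k] for k in range(len(cm)))
    return correct / sum(sum(row) for row in cm)


def compare(two, eight):
    """Return (class, two-image recall, eight-image recall, change) tuples."""
    rows = []
    for name, r2, r8 in zip(CLASSES, recalls(two), recalls(eight)):
        rows.append((name, round(r2, 1), round(r8, 1), round(r8 - r2, 1)))
    return rows


for name, r2, r8, delta in compare(TWO_PARAM, EIGHT_PARAM):
    print(f"{name}: {r2}% -> {r8}% ({delta:+.1f} points)")
print(f"accuracy: {accuracy(TWO_PARAM):.3f} -> {accuracy(EIGHT_PARAM):.3f}")
```

Expressing the change in percentage points per class makes it easy to see that the gain is concentrated in slide (y'0) and outside landslide (y'2), while non-slide (y'1) recall drops slightly.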
Unknown data face generalization issues that result from introducing new elements. Specifically, the results reflect the fact that topography and geology are not identical from site to site, which is an important issue in predicting landslides. In contrast, the outside landslide class (y'2) represents general topography not involved with a landslide and could therefore be trained well.
4.2. Evaluation of slide areas (slide)
First, we selected three of the 38 collapse sites (IDs 8, 20, and 23) and evaluated them as unknown data for the collapse sites (they were not used as learning data). These three sites were selected because they differ in location, scale, and topographic characteristics, as shown in Fig. 2 and Appendix Table A1. Fig. 11 shows an example of ID 8. Fig. 11(a) shows the slope angle diagram before the slide, the approximate location of the slide, the tiles identified as slide areas (y'0), and the tiles judged to be uncollapsed landslide topography (y'1). Colorless tiles were identified as having no connection to landslides (y'2). Fig. 11(b) shows the slope angle diagram after sliding.
In ID 8, the NW–SE ridge slid westward at the top of the head. Besides terminal cliffs, minor scarps were observed within the slide area. These are DGSDs, and the arcuate main scarp is intermittent, suggesting that DL development was ongoing. The tiles identified as slide (y'0) covered most of the collapsed region, so the slide area was considered to have the same properties as the learning data. The northeastern slope of the ridge to the SE of the collapsed site was also selected as slide (y'0). This site did not slide, but because it had the same properties as the learning data, it was considered a candidate for future collapse. In contrast, uncollapsed landslide topography tiles (y'1) were partially selected in the surrounding areas. Thus, the trained model can identify sites in non-slide areas that have the same characteristics as collapse sites and are therefore candidates for future collapse. Fig. 11(c) shows the results of the learning model using slope and wavelet (Sakita et al. 2019). The selection of slide (y'0) tiles was more effective with the eight types of analysis than with the two types (Fig. 10).
Figure 12 shows an example of ID 20. Figure 12(a) shows the slope angle diagram after sliding. Figure 12(b) shows the slope angle diagram before the slide, the approximate location of the collapse, the tiles identified by the model as slide areas (y'0), and the tiles identified as non-slide landslide topography (y'1). Colorless tiles were identified as having no connection to landslides (y'2). At ID 20, the NW–SE ridge slid at the top of the head, trending NE. Terminal cliffs appeared to develop in the slide area. The minor scarp and the arcuate but indistinct main scarp formed intermittently, suggesting that DL development was ongoing or that sliding occurred after development. Tiles identified as slide (y'0) were selected within the slide area, and the upper ridge adjacent to the collapse site was also selected. However, uncollapsed landslide topography tiles (y'1) were partially selected in the surrounding areas. The trained model could select all slide areas in ID 20. Additionally, sites with minor scarps and irregular undulations were selected as slide areas (y'0), indicating that the learning data were effective. Figure 12(c) shows the results of the learning model using slope and wavelet (Sakita et al. 2019). Here, the selection of slide (y'0) and non-slide (y'1) tiles was not appropriate. This indicates that using multiple types of numerical analysis can be effective.
Figure 13 shows an example of ID 23. Figure 13(a) shows the slope angle diagram after sliding. Figure 13(b) shows the slope angle diagram before the slide, the approximate location of the collapse, the tiles identified as slide areas (y'0), and the tiles judged to be uncollapsed landslide topography (y'1). Colorless tiles were identified as having no connection to landslides (y'2). In ID 23, the NE–SW ridge slid from the top of the head to the NE. Irregular undulations, minor scarps, and terminal cliffs were also confirmed in the slide area. These are DGSDs, but an arcuate main scarp did not form, suggesting that this site was in the initial stages of DL development. Tiles identified as slide (y'0) were selected in several locations around the boundary of the slide area. Although the full extent of the slide area was not selected, minor scarps with DGSD characteristics were selected, indicating that the model was effective to some degree. The selected tiles did not form a contiguous area; because these are unknown data with properties that differ from those of the learned tiles, this exposes the generalization problem of the model. The northwestern part of the collapse was represented by a large triangulated irregular network, and the DEM there was of poor quality. Owing to its low accuracy, these data were treated as y'2 before training. Although this is an error in identifying non-slide tiles within the landslide area, it can be regarded as a product of learning. Figure 13(c) shows the results of the learning model using slope and wavelet (Sakita et al. 2019). Compared with that model, the eight-image model improved the selection accuracy for slide (y'0) outside the slide area, and the selection accuracies for non-slide (y'1) and outside landslide (y'2) also improved (Fig. 10).
General micro-topographic interpretations of landslide areas focus on major scarps, terminal cliffs, minor cliffs, and irregular undulations. However, in this study, slopes that experienced actual collapses were used as objective tiles in the learning data. These included terrain with no topographic features. This indicates that the learning data were not complete. However, even before the gravitational deformation affects the whole DL, we can find the same topography as the collapsed area in the 50 × 50 m range.