3.1 Comparison of the two model methods
The landslide hazard maps were generated with the statistical models and the machine learning methods in a GIS environment. Each map was divided into five classes: low, moderate, high, very high, and severe. Validation against the past landslide inventory was used to determine how many past landslide points fall in each class of the hazard zonation map and to calculate the corresponding percentage for each class. Following the basic assumption that future landslides will most likely occur in physiographic settings similar to those of past and present landslides, a hazard map was considered satisfactory for practical use if at least 60% of the past landslide points fall in the high to severe classes. Fig 3 shows the percentage of pixels in each class. The low and moderate classes were treated as non-landslide area, and the high, very high, and severe classes as landslide area. On this basis, 68.49% of landslide points fall in the high to severe classes for FR, 68.79% for AHP, 67.96% for SE, and 63.25% for WOE.
In terms of area, each hazard zonation map was divided into landslide and non-landslide pixels, and the area of each group was calculated in the GIS environment, again treating the high to severe classes as landslide area. For FR, 57.39% of the total study area is landslide area and 42.6% is non-landslide area. The corresponding figures are 62.5% and 37.4% for SE, 51.9% and 48% for AHP, and 46.39% and 53.6% for WOE.
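The point-based validation described above can be illustrated with a minimal Python sketch. It assumes that each inventory point has already been assigned the hazard class of the pixel it falls in; the function names and the sample data are hypothetical and are not taken from the study. Only the five class names and the 60% threshold come from the text.

```python
from collections import Counter

# The five hazard classes used in this section, in increasing order.
CLASSES = ["low", "moderate", "high", "very high", "severe"]
# Classes treated as landslide area in the validation.
LANDSLIDE_CLASSES = ("high", "very high", "severe")


def class_percentages(point_classes):
    """Percentage of inventory points that fall in each hazard class."""
    if not point_classes:
        raise ValueError("the landslide inventory is empty")
    counts = Counter(point_classes)
    total = len(point_classes)
    return {c: 100.0 * counts.get(c, 0) / total for c in CLASSES}


def high_to_severe_share(point_classes):
    """Combined percentage of points in the high to severe classes."""
    pct = class_percentages(point_classes)
    return sum(pct[c] for c in LANDSLIDE_CLASSES)


def is_satisfactory(point_classes, threshold=60.0):
    """Apply the 60% criterion stated in the text."""
    return high_to_severe_share(point_classes) >= threshold


# Illustrative data only: 10 hypothetical inventory points.
sample = (["low"] * 2 + ["moderate"] * 1 + ["high"] * 3
          + ["very high"] * 2 + ["severe"] * 2)
share = high_to_severe_share(sample)
print(f"high to severe: {share:.2f}%  satisfactory: {is_satisfactory(sample)}")
```

The same calculation applies to the area-based comparison if pixel counts per class are used in place of inventory points.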
For the machine learning methods, 73.32% of landslide points fall in the high to severe classes for GBDT and 70.7% for RF. From the per-class pixel percentages in Fig 4, the corresponding figures are 72% for XGB, 70% for GBDT+LR, 67% for RF+LR, and 70% for XGB+LR.
Accuracy, precision, recall, F-measure, area under the ROC curve (AUC), mean absolute error (MAE), root mean square error (RMSE), and the kappa index were used to evaluate the six models (Yuke et al., 2022). Higher values of accuracy, precision, recall, F-measure, AUC, and kappa, and lower values of RMSE and MAE, indicate better model performance (Yuke et al., 2022). The results are given in Table 2 for the five classification metrics and in Table 3 for the error metrics and the kappa index. Table 2 shows that XGB has the highest AUC (0.923), followed by GBDT and RF; all three exceed 0.900, indicating very satisfactory predictive capability. XGB+LR has the highest accuracy, precision, and F-measure, and shares the highest recall (0.999) with four other models. In Table 3, XGB, GBDT+LR, and XGB+LR show the highest kappa index values, and XGB+LR has the lowest MAE and RMSE. The kappa index reflects the compatibility and reliability of the LSM models (Yuke et al., 2022). Overall, the machine learning algorithms perform satisfactorily in landslide estimation. The performance indicators also suggest that the stacking ensemble method is a useful tool for improving the accuracy of model prediction (Yuke et al., 2022).
Overall, the XGB model, with an AUC of 0.923, has the best predictive ability of the six models in terms of AUC.
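For reference, the eight indicators named above can be computed as in the following minimal Python sketch. It assumes binary labels (1 for landslide, 0 for non-landslide) and uses the standard textbook definitions. The source does not state whether MAE and RMSE were computed on class labels or on predicted probabilities, so the sketch accepts either. All function names and the sample data are hypothetical.

```python
import math


def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F-measure and Cohen's kappa for 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    n = tp + tn + fp + fn
    accuracy = (tp + tn) / n
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f_measure = 2 * precision * recall / denom if denom else 0.0
    # Agreement expected by chance, from the row and column totals.
    p_e = ((tp + fp) * (tp + fn) + (tn + fn) * (tn + fp)) / n ** 2
    kappa = (accuracy - p_e) / (1 - p_e) if p_e != 1 else 0.0
    return {"accuracy": accuracy, "precision": precision, "recall": recall,
            "f_measure": f_measure, "kappa": kappa}


def error_metrics(y_true, y_out):
    """MAE and RMSE between true labels and model outputs (labels or probabilities)."""
    errors = [o - t for t, o in zip(y_true, y_out)]
    mae = sum(abs(e) for e in errors) / len(errors)
    rmse = math.sqrt(sum(e * e for e in errors) / len(errors))
    return {"mae": mae, "rmse": rmse}


def auc(y_true, y_score):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(1.0 if p > q else 0.5 if p == q else 0.0 for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


# Illustrative data only: 4 landslide and 4 non-landslide samples.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_prob = [0.9, 0.8, 0.7, 0.4, 0.6, 0.3, 0.2, 0.1]
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]  # 0.5 threshold is an assumption

print(classification_metrics(y_true, y_pred))
print(error_metrics(y_true, y_prob))
print({"auc": auc(y_true, y_prob)})
```

The pairwise AUC is quadratic in the number of samples; it is used here because it is easy to check by hand, not because it is efficient for full raster datasets.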
Fig 5 shows the landslide hazard maps generated from the predictions of the machine learning approaches, together with the historical landslide locations. The ROC curves of the machine learning and statistical models in Fig 6 and Fig 7 indicate that the machine learning models outperform each of the statistical models. Adopting machine learning improves accuracy and predictive capability by more than 15%. Choosing suitable hyperparameters for each machine learning and stacking ensemble algorithm increases reliability and model robustness. However, Yuke et al. (2022) note that the importance of landslide conditioning factors is specific to a region and cannot be extrapolated to other regions. Likewise, statistical methods will not necessarily perform worse than advanced machine learning methods in every region. Model fitness for landslide assessment and prediction is also strongly influenced by the quality of the collected datasets.
Among the machine learning methods, the AUC values of the base classifiers are higher than those of all the stacking ensemble models. Yuke et al. (2022) suggested that simple stacking does not necessarily improve a model's performance, and that a fusion model does not always perform better than a single model. In terms of the error-based assessment, however, XGB+LR has the lowest MAE and RMSE and the highest kappa index of the six models, which shows that it performs best in reducing error.
Table 2: Evaluation of landslide estimation models using machine learning metrics
| MODELS | ACCURACY | PRECISION | RECALL | F-MEASURE | AUC |
| GBDT | 0.978 | 0.979 | 0.998 | 0.988 | 0.9223 |
| RF | 0.976 | 0.976 | 0.999 | 0.988 | 0.9037 |
| XGB | 0.98 | 0.98 | 0.999 | 0.989 | 0.923 |
| GBDT+LR | 0.982 | 0.982 | 0.999 | 0.991 | 0.8613 |
| RF+LR | 0.981 | 0.981 | 0.999 | 0.99 | 0.8721 |
| XGB+LR | 0.984 | 0.9855 | 0.999 | 0.992 | 0.8962 |
Table 3: Evaluation of landslide estimation models using error metrics.
| MODELS | MAE | RMSE | KAPPA INDEX |
| GBDT | 0.021 | 0.147 | 0.233 |
| RF | 0.0232 | 0.152 | 0.071 |
| XGB | 0.0198 | 0.141 | 0.511 |
| GBDT+LR | 0.017 | 0.133 | 0.418 |
| RF+LR | 0.018 | 0.135 | 0.385 |
| XGB+LR | 0.0152 | 0.123 | 0.551 |
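The rankings discussed in this section can be re-derived from Tables 2 and 3 with a short Python sketch. The values are transcribed from the two tables; the dictionary layout and the helper `best_models` are hypothetical and only serve to make the comparison explicit, including ties.

```python
# Values transcribed from Table 2 and Table 3 of this section.
COLUMNS = ("accuracy", "precision", "recall", "f_measure", "auc",
           "mae", "rmse", "kappa")
ROWS = {
    "GBDT":    (0.978, 0.979, 0.998, 0.988, 0.9223, 0.021, 0.147, 0.233),
    "RF":      (0.976, 0.976, 0.999, 0.988, 0.9037, 0.0232, 0.152, 0.071),
    "XGB":     (0.98, 0.98, 0.999, 0.989, 0.923, 0.0198, 0.141, 0.511),
    "GBDT+LR": (0.982, 0.982, 0.999, 0.991, 0.8613, 0.017, 0.133, 0.418),
    "RF+LR":   (0.981, 0.981, 0.999, 0.99, 0.8721, 0.018, 0.135, 0.385),
    "XGB+LR":  (0.984, 0.9855, 0.999, 0.992, 0.8962, 0.0152, 0.123, 0.551),
}
TABLE = {model: dict(zip(COLUMNS, values)) for model, values in ROWS.items()}

# For these two metrics a lower value means better performance.
LOWER_IS_BETTER = {"mae", "rmse"}


def best_models(metric):
    """Return every model that attains the best value for a metric (ties kept)."""
    values = {model: row[metric] for model, row in TABLE.items()}
    pick = min if metric in LOWER_IS_BETTER else max
    best = pick(values.values())
    return [model for model, v in values.items() if v == best]


for metric in COLUMNS:
    print(f"{metric:>9}: {', '.join(best_models(metric))}")

# Base classifiers versus stacking ensembles on AUC.
base_auc = [TABLE[m]["auc"] for m in ("GBDT", "RF", "XGB")]
stacked_auc = [TABLE[m]["auc"] for m in ("GBDT+LR", "RF+LR", "XGB+LR")]
print("every base AUC above every stacked AUC:", min(base_auc) > max(stacked_auc))
```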