Identification of landslide-prone zones using a GIS-based multi-criteria decision analysis and region-growing algorithm in uncertain conditions

Landslides are considered to be one of the most significant natural hazards. Detection of landslide-prone zones is an important phase in landslide hazard assessment and mitigation of landslide-related losses. AHP as one of the most effective methods for GIS-based multi-criteria decision analysis is increasingly being used in susceptibility mapping. However, its weights have some degree of uncertainty that interval comparison matrix (ICM) method can be used to deal with this problem. The importance of this study is to propose an interval number distance-based region-growing (IDRG) method based on ICM for the identification of landslide-prone zones in the Urmia lake basin, Iran. To assess the capability of the proposed IDRG method, a landslide susceptibility map was produced using common AHP, too. To generate the maps, the weights of nine conditioning factors were determined using both traditional pairwise comparison matrices of the AHP method and ICM. The accuracy of the produced maps was assessed through ROC (receiver operating curve) and using a dataset of known landslide occurrences. The results indicate an improvement in accuracy of about 11% by identifying the landslide-prone zones using the IDRG method. This improvement was achieved by minimizing the uncertainty associated with criteria ranking/weighting in a traditional AHP and identifying the prone zones as areas instead of pixels. Finally, the robustness of the proposed method was demonstrated by sensitivity analysis.


Introduction
Landslide is known as a natural hazard that often occurs in mountainous and hilly areas all over the world (Akgun and Erkan 2016). This geological phenomenon is a type of mass movement, and it is the rapid fall of the large volume of rocks and soils from upslope to downslope (Chorley 1985;Malamud et al. 2004). Landslides commonly lead to loss of human life and property, as well as causing critical damage to natural resources . Risk of landslide can be reduced to a certain level by producing hazard zonation maps . Such maps would facilitate the discovery of susceptible areas and manage regional land use (Mohammady et al. 2012). So far, landslide susceptibility has been analyzed utilizing geographical information system (GIS) together with diverse models like frequency ratio (FR) (Aditian et al. 2018;Ding et al. 2017;Khan et al. 2019;Kumar and Anbalagan 2015;Zhang et al. 2016), analytical hierarchical process (AHP) (Althuwaynee et al. 2014;Bahrami et al. 2021;He et al. 2019;Myronidis et al.2016;Shahabi et al. 2014), analytical network process (ANP) (Gheshlaghi and Feizizadeh 2017; Alizadeh et al. 2018), support vector machine (SVM) (Ada and San 2018;Huang and Zhao 2018;Pham et al. 2019;Wang and Brenning 2021;Xing et al. 2021), random forest (RF) (Nhu et al. 2020;Sun et al. 2020), evidence belief function (EBF) (Althuwaynee et al. 2014;Chen et al. 2020;Ding et al. 2017;Feby et al. 2020;Zhang et al. 2016), Dempster-Shafer Feizizadeh and Blaschke 2014;Feizizadeh et al. 2014a, b;Gudiyangada Nachappa et al. 2019;Milaghardan et al. 2020;Mezaal et al. 2018;Mohammady et al. 2012;Pourghasemi et al. 2013;Shirani et al. 2018), decision tree (Wang et al. 2016;Pham et al., 2019, Wu et al. 2020, logistic regression (LR) (Aditian et al. 2018;Althuwaynee et al. 2014;Felicísimo et al. 2013;Pradhan et al. 2008;Wang et al. 2016), fuzzy logic (Feizizadeh et al. 2014a, b, Kumar and Anbalagan 2015, Gheshlaghi and Feizizadeh 2017, Bahrami et al. 2021, artificial neural networks (Gorsevski et al. 2016;Wang et al. 2016;Aditian et al. 2018), and deep learning algorithms (Bui et al. 2020, Van Dao et al. 2020, Nhu et al. 2020, Ngo et al. 2021. All the methods have their particular advantages and disadvantages, and each model's functioning varies based on the input data, the structure of the model, and the accuracy of the model (Nachappa et al. 2020). However, there is no suggestion that a particular model must be utilized for a specific situation or study area (Khosravi et al. 2018). The analytical hierarchy process (AHP) with the aim of GIS is one of the most effective methods for assessing the landslide susceptibility of the area (Althuwaynee et al. 2014;Shahabi et al. 2014;Myronidis et al. 2016;He et al. 2019;Bahrami et al. 2021). The conventional AHP method uses exact experts judgments structured in a pairwise comparison matrix (PCM) by applying Saaty's scale of importance amounts (scale of 1-9) (Boroushaki and Malczewski 2008;Saaty 2008). The relative importance of different conditioning factors is determined as crisp values based on the PCM (Lan et al. 2009). Despite the advantages of the AHP, the uncertainty of weights lies in the subjective expert judgment may have meaningful impacts on the results, which may sometimes lead to inaccurate outcomes (Chen et al. 2011;Feizizadeh et al. 2014a, b;Lan et al. 2009). Several methods have been applied to reduce the amount of uncertainty associated with the AHP method, such as the application of fuzzy logic, interval comparisons, and spatial sensitivity analysis (Boroushaki and Malczewski 2008;Chen et al. 2013;Feizizadeh and Blaschke 2014;Feizizadeh et al. 2014a, b). Due to the complexity involved in real-world decision problems, it is easier for a decision-maker to provide an interval comparison matrix and derive the interval weights than crisp ones (Cabrera- Barona and Ghorbanzadeh 2018;Entani and Tanaka 2007;Feizizadeh and Ghorbanzadeh 2017;Lan 1 3 et al. 2009). On the other hand, most of the previous researches applied the pixel-based classification to produce the final susceptibility landslide map (Feby et al. 2020;. Using this method may lead to the classification of a single pixel in a particular class (e.g., very high risk) which is surrounded by the pixels of the other classes, whereas in reality this rarely happens. In this study, a regionbased method is applied to determine landslide-prone zones, instead of a pixel-based method. The region-based classification includes the segmentation of neighboring pixels into homogenous units or objects. Object-oriented methods have become more popular for image classification in recent years (Trang et al. 2016), but they haven't been applied for the classification of susceptibility maps.
The main objective of this study was to develop a new approach to determine landslideprone zones as regions based on the concept of interval numbers for tackling uncertainty within the weights of criteria in the AHP method. The second objective was to produce landslide susceptibility maps for Urmia lake basin, Iran, as the study area. The third objective was to compare the accuracy of the proposed model with the common AHP method. The fourth objective was to investigate the robustness of the developed method using sensitivity analysis. In following these objectives, we developed a new interval number distance-based region-growing (IDRG) method. This new technique was tested for landslide susceptibility mapping (LSM) in the Urmia lake basin, Iran.

Study area and data
The study area is the Urmia lake basin, which is located in the northwest of Iran (see Fig. 1). The size of the study area is 19,913 km 2 , and its elevation is between 1260 and 3710 m above sea level. The climate of this area is semiarid, and the annual precipitation is about 300 mm. The complexity of the geological structure in the related lithological units in the Urmia lake basin has played an important role in the occurrence of numerous landslides in the region . A landslide inventory database for the East Azerbaijan Province lists 132 known landslide events. In this study, based on fieldwork and expert knowledge, nine conditioning factors were selected including aspect, distance to roads, elevation, distance to stream, distance to faults, slope, land-use, precipitation, and lithology. After that, all datasets were prepared and arranged in ArcGIS software as raster maps with a resolution of 100 m for further analysis.

Methodology
In this study, a region-growing method is proposed based on interval number distance to determine the landslide-prone zones. The input map includes pixels with interval values which are determined using ICM based on the AHP method. The results were compared with the common AHP method (see Fig. 2).

Region growing
The seeded region growing (SRG) is a segmentation method that checks adjacent pixels of initial seed points and determines whether the pixel neighbors should be added to the region depending on a similarity criterion (Adams and Bischof 1994;. Seed points are usually selected based on some user criterion such as pixels in a certain intensity or grayscale range.

Interval analytic hierarchy process (IAHP)
The analytic hierarchy process (AHP) is a popular method for solving multi-criteria decision-making problems. The conventional AHP needs exact judgments and forms crisp comparison matrices to explicit the preference information (Wei et al. 2008). However, the inherent subjective nature of expert judgments is a source of significant uncertainty in the AHP method. To overcome this problem, it is better to arrange an interval comparison matrix for deriving the weights (Feizizadeh and Ghorbanzadeh 2017). In this method, the decision-maker offers his interval judgments x ij = [l ij , u ij ], instead of a crisp number, where x ij indicates that the alternative x i is between l ij and u ij times as important as the alternative x j , and then an interval comparison matrix (ICM) A can be structured as (Lan et al. 2009): l ij ≤ u ij ,…∀i,j=1,2,…,n and l ij ≥ 0, u ij ≥ 0,…∀i, j = 1,2,…,n This matrix is a reciprocal matrix as well as a definite comparison matrix. It is defined as: The approach of generating this matrix and deriving the interval weights is explained in detail by Liu (2009) andGhorbanzadeh et al. (2018). The ICM of this study was compiled by the combination of opinions of 5 experts with proper expertise and scientific knowledge of the study area including geology, MCDM, and GIS (Table 1).
The final interval weight matrix W was calculated according to the ICM using the method proposed by Liu (2009) and then normalized (Table 2).

Fig. 2
The methodological framework In this study, unlike the previous studies, the interval weights were calculated for sub-criteria, too (Table 3).
According to Table 3, for each of the conditioning factors, two maps were generated: The first map was produced based on the lower bound of the interval weight, and the second one was made by the upper bound of the interval weight (Fig. 3).
After that, the weight of all conditioning factors (criteria) was multiplied by the related map concerning the definition of the binary operations for interval numbers as follows: Let x = [x , x ] and y = [y , y ] be real intervals. The binary operations addition (+), subtraction (−), multiplication (.), and division (/) are then defined on R as follows: In this step, all the conditioning factor maps were added together concerning the definition of the sum operator for interval numbers. Now we have a susceptibility map in which each pixel includes an interval value.   Bao et al. (2013) proposed an IND 1 definition based on the median and width of the interval number (Guo et al. 2018, Bao et al. 2013.

Interval value distance
is the width of a.

Interval number distance based region growing (IDRG) (A novel approach)
In this study, a new method was proposed to identify homogenous regions using interval values: interval number distance based region growing (IDRG). The algorithm classifies each pixel of the map as regions (e.g., high-risk zones) or backgrounds. In the proposed algorithm, it is assumed that the values of pixels in the input map are represented by interval numbers, in which the width of interval numbers is related to the uncertainty of the values (the lower interval width for a pixel value shows the lower uncertainty). Two main concerns should be dealt with when executing a region-growing algorithm: where to place the initial seeds and which similarity criterion should be assumed to characterize the regions . The most common way is to select some pixels as seed points based on simple criteria (e.g., color, intensity, or texture). In the input map, the value and the position of pixels by higher median and lower width can be determined. In IDRG, it is proposed to select them as seed points to avoid false detections and to focus on the segmentation itself.
The similarity criterion was defined to identify regions within which there is similarity and uniformity in both quantity and uncertainty. To specify the similarity criterion, the interval distance between the region's mean and a new neighbor pixel is calculated based on the IND definition proposed by Bao et al. (2013). At the first step, each seed is considered as a region, and IND is calculated between seed points and their neighbors. In cases that the smallest IND between the mean of the region and the new pixel becomes lower than a certain threshold (t), the pixel joins to the region, and the region is iteratively grown. At the next steps, the region's mean is calculated according to all pixels of the segmented region. In the end, the pixels that have three or four adjacent pixels belong to one region, joining the same region. The proposed IDRG is as follows (this code is executed for each of seed points):

Landslide-prone zone map generation using IDRG algorithm
In this case study, after initial processes, the weightings derived from ICM were used for data aggregation within a GIS environment. Two aggregated maps were generated: The first one indicates the lower bound, and the second one shows the upper bound. The first map was produced using the lower bound map of the conditioning factors ( Fig. 3. Left) and lower weight of the criteria. The second map was generated using the upper bound map of the conditioning factors ( Fig. 3. Right) and upper weight of the criteria. These maps were imported into the Matlab environment, and a new map was produced by interval numbers in which the width of interval numbers is related to the weights of IAHP. This map was considered as the input of IDRG. Moreover, according to the interval value of pixels, median map (E) and width map (W) were produced (Fig. 4). The value and the position of pixels that are related to the highest risk (higher median) and least uncertainty (lower width) can be determined. These pixels were selected as seed points for IDRG (Fig. 4).
Then, landslide-prone regions were identified for each seed according to the similarity criteria using the IDRG procedure. In this work, 4-connected neighborhoods for seed pixels were checked to determine whether the pixel neighbors should be added to the region. The region is iteratively grown by comparing unallocated neighboring pixels to the region. The pixel with the smallest IND measured this way is allocated to the respective region. This process stops when the IND between the mean of the region and the new pixel becomes larger than a certain threshold (t). After that, in our proposed method, to create homogenous regions, if three or four neighbors of a pixel belonging to a region, that pixel itself joins that region. These regions were then exported from the MATLAB programming environment into the ArcGIS 10.3 software. Then, the regions were converted into a landslide-prone zone map.

Landslide susceptibility map generation using conventional AHP method
To compare the results of the proposed IDRG method with the conventional AHP, a landslide susceptibility map was produced using the same conditioning factors and AHP. The weights of criteria and sub-criteria were calculated based on eigenvalues using a conventional pairwise comparison matrix (PCM) proposed by Saaty (2008). The weights are presented in Table 4 (Feizizadeh and Blaschke 2013).

Results and validation
Determined regions using the IDRG method in Matlab environment were converted into landslide-prone zone map in ArcGIS which is shown in Fig. 5. After that, the area of each region was calculated ( Table 5). The landslide susceptibility map generated using the conventional AHP method was classified into five classes of susceptibility using the natural breaks classification (Jenks optimization), which is an effective method for categorizing the susceptibility maps (see Fig. 6).
The accuracy of maps was evaluated based on the locations of known landslides within the study area. The validation process is a fundamental step to assess the ability of the developed approach for the identification of landslide-prone zones. The locations of observed landslides are represented on the landslide-prone zone map generated by the  Fig. 7. The area of each region determined by the IDRG method and the number of observed landslide events are presented in Table 5.
To compare the accuracy of IDRG and conventional AHP method, the area of each category using AHP and the number of observed landslide events were determined and are represented in Table 6.
According to Table 6, 39.19 percent of the study area is covered by "high"-risk and "very-high"-risk classes (9.47% + 29.72%) in the susceptibility map obtained by the conventional AHP method and 79.55% of the observed landslides were located in these classes. However, the identified regions using IDRG cover 39.79% of the study area, and 97.73% of the known landslides were located inside the determined regions which shows the ability of the proposed method in determining the landslide-prone zones. For a more detailed assessment of the results, a part of the region 2 (adjusting pixels of seed point no. 2 in IDRG) is represented in Fig. 8. Classification of the susceptibility map based on  1 3 Fig. 6 Landslide susceptibility map generated by conventional AHP method Fig. 7 Observed landslides on landslide-prone zone map generated by IDRG method pixels in conventional AHP may cause a single-pixel belonging to a particular class to be surrounded by pixels belonging to another class, while in reality this rarely happens. For example, as shown in Fig. 8, some "medium"-risk class pixels are surrounded by the "very high" and "high"-class pixels. Some of the occurred landslides are observed in this area. This can be due to both considering the crisp weights for criteria and sub-criteria in the conventional AHP method and the pixel-based classification. In the proposed method, the sensitivity to the threshold value between two classes of classification is reduced and the pixels that are close in terms of value and uncertainty are assigned to one region so that the whole area shown in Fig. 8 is inside region no. 2 in the IDRG method. Furthermore, in the proposed method, the pixels that are surrounded by pixels of a region will be joined to the region. Although it has not yet occurred any landslide in region no. 5 on the IDRG map, it is one of the high-risk areas that should be considered in decision making. In this research, the accuracies of the landslide-prone zones map produced by the IDRG method and the landslide susceptibility map generated by the conventional AHP method were evaluated by calculating relative operating characteristics (ROCs) and verifying the number of observed landslides in the various categories of the derived maps. The ROC curve is a plot of the probability of having a truly positive response (a correctly predicted event) versus the probability of a false-positive response (an incorrectly predicted event), for different probability cutoffs (Feizizadeh et al. 2014a, b;Gorsevski et al. 2016). A landslide inventory database for the study area includes 132 observed landslide events which were used to validate the results. In the ROC curve, the ideal model is denoted by a value close to 1.0 (Fawcett 2006;Nandi and Shakoor 2010). The results obtained from the ROC for the map derived from the IDRG method indicated an accuracy of 97% and for the map produced by conventional AHP, and the accuracy was about 87% (Fig. 9). The accuracy was improved by about 10% which shows the advantage of the IDRG method and considering the interval weights for both criteria and sub-criteria. To complete the examination, a sensitivity analysis was performed for investigation of the dependency of model output on the weights of input factors. A common approach for executing sensitivity analysis is to vary input factors one-at-a-time (OAT) (Ilia and Tsangaratos 2016). This study used a 10 percent change from an original criterion interval weight value to perform sensitivity analysis. Either a lower bound change (e.g., plus or minus 10%) or an upper bound change can be applied to all criteria. Therefore, four cases were considered for each criterion, as shown in Fig. 10  To execute the sensitivity analysis, 36 (4 case for 9 criteria) simulations were performed to identify landslide-prone zones where only one bound value of each criterion interval weight was altered at a time (Table 7). The other weight values remained same as values in Table 2 for each simulation.  Then, similar to the original performance, for each of the simulation runs, two aggregated maps (lower bound map and upper bound map) were produced and imported into the Matlab environment. After that, the median map (E) and width map (W) were produced. Finally, landslide-prone zones were identified using the IDRG procedure. Simulated maps were compared with the original map. Changes in the area of landslide-prone zones are presented in Table 8 and Fig. 11.
The average of changes in landslide-prone zone area in a (Low I) and c (High D) simulations, in which both lower and upper bound values are within the original interval weight (Fig. 10), is lower than in two other simulations (Table 8). In these simulations, the area of landslide-prone zones was decreased in comparison with the original map, while in b and d simulations the area was increased.
The results show that the average change in the area of the prone regions from 36 simulations to 0.1% of the area, so the proposed method can be considered relatively robust. Lithology with the highest weight among the other criteria is the most sensitive criterion, which causes 0.52% modification in the area of landslide-prone zones when its upper interval weight change is + 10%. As it was mentioned, the results are more stable when the changed values are within the original interval weights.

Conclusion and future work
In this paper, we proposed a new approach that integrates the region-growing algorithm with a kind of interval number distance as a measure of similarity (IDRG) to determine the landslide-prone zones in the Urmia lake basin, Iran. Using the conventional classification methods to produce landslide susceptibility maps, some sole pixels are surrounded by the pixels of the other classes, which is rarely happening in the real world. It is better to determine the landslide-prone zones instead of high-risk pixels, as identified by the proposed IDRG method. Furthermore, the inherent uncertainty of the weights of criteria and subcriteria determined by the experts may affect the results. Therefore, the IDRG method is proposed based on the ICM to define the weights. Using the IDRG method, 6 regions were identified as landslide-prone zones which cover 39.79% of the study area and 97.73% of the observed landslides were located inside them. To compare the results, a susceptibility map was produced using the conventional AHP. In this map, high and very high landslide susceptibility was detected for 39.19% of the study area. The results achieved from the ROC, using the landslide inventory, indicated an accuracy of 97%, 87% for IDRG and conventional AHP, respectively.
By applying the proposed IDRG method to reduce the uncertainties of weights for both criteria and sub-criteria and determining the landslide-prone zones instead of high-risk pixels, the accuracy was improved by about 10%.
To complete the investigation, a GIS-based sensitivity analysis was implemented for analyzing interval weight sensitivity in landslide-prone zone mapping. Using the proposed method, the amount of changes in the model output due to variations in interval weights of the criteria was quantified. According to the results of the sensitivity analysis, the lithology was the most sensitive as it had the highest change in the area of landslide-prone zones.
Since the results showed that the average change in the area of prone zones, according to 36 simulations, is 0.1% of the area, the proposed method can be considered relatively robust.
The proposed method can be applied to determine safe areas for decision-making in land-use planning. Furthermore, in this research to develop an interval-value-based regiongrowing method, a binary decision map was created. However, a multi-class interval-based region-growing method would be developed to support a landslide susceptibility map with multiple risk regions.