A Hybrid Landslide Susceptibility Model Obtained from Different Models (Raster or Vector, Expert or Semi-Expert) with the Correct Parameter

This study aims to determine how to choose the correct parameters for a specific study area in landslide susceptibility assessment and how these parameters perform in vector- and raster-based models. In the literature, the parameters representing landslide preparatory and triggering conditions are selected either deliberately or randomly in raster- or vector-based models. In this study, the landslide inventory was analyzed together with geological, topographic-morphological, environmental, and triggering parameters, and the parameters appropriate to the study area and its scale were selected. To obtain high efficiency from the models, the parameter data were taken from the landslide depletion zone. Raster-based and vector-based models were created using qualitative and quantitative approaches. The models produced similar ROC curve values, ranging from 0.79 to 0.92. The study area was divided into slope units, and the model outputs were then transferred to these units. To make the result easier to use, the units obtained from each model were combined, so that a single map output was obtained from 5 different raster- and vector-based models. Overall, this study shows 1) the importance of the landslide inventory and how to use it; 2) that parameters should be selected according to field analysis and the scale of the field rather than randomly; and 3) that combining raster- and vector-based landslide susceptibility results into a single output makes them easier to use as a base map in hazard and risk studies.

), deterministic approach (Cotecchia et al. 2009; Gökceoglu and Aksoy 1996; Westen and Terlien 1996). In addition, in recent years there have also been studies in which these analyses are carried out with knowledge-based artificial intelligence methods (Chen et al. 2017; Ercanoglu and Gokceoglu 2002; Juang et al. 1992; Kanungo et al. 2006; Phong et al. 2019; Yesilnacar and Topal 2005). Looking at landslide susceptibility studies from the past to the present, overlay analysis was generally used between 1990 and 1993, while bivariate or multivariate statistical approaches were widely used between 1993 and 2005. In 2005-2008, multivariate and bivariate statistical approaches were used together, and Analytic Hierarchy Process (AHP) and Artificial Neural Network (ANN) models were also applied. In 2008-2010, various models were compared with each other, including ANN, AHP, bivariate and multivariate statistics, artificial intelligence, and machine learning. After 2011, existing models were modified and compared (Cihangir 2018).
When the studies were examined by topic, 2499 addressed landslide susceptibility directly or indirectly, 1832 addressed landslide hazard, and 972 addressed landslide risk. The number and type of factors used to create landslide susceptibility, hazard, and risk maps have changed over time (Cihangir 2018).
In these studies, geological (lithology, structural lineaments, relationship between structural geological elements and slope, groundwater, sediment thickness, discontinuity density, weathering degree, soil structure, layer position, surface material), topographic-morphological (slope, drainage network and density, relief, surface process, altitude, main or detailed geomorphological unit, slope curvature, aspect, slope length, stream erosive power index, topographic permeability index, topographic wetness index, topographic roughness ratio), environmental (land use, vegetation, road density), and triggering (climatic, seismic, human) factors were used. The most frequently used are lithology from the geological group, slope from the topographic group, land use from the environmental group, and climatic data, especially precipitation, from the triggering factors (Fig. 4).
Expert opinion or statistical methods are used to weight the factors chosen in both raster- and vector-based studies. The expert must have the experience, knowledge, and geomorphological perspective needed to relate cause and effect. The opinion of an expert who lacks these qualities may place the model on an incorrect basis. On the other hand, although some factors appear statistically related to landslides, they may not actually contribute to them. Another issue is close similarity between the parameters used. Using a factor together with one or more similar factors derived from it provides no statistical benefit (Costanzo et al. 2012; Kavzoglu et al. 2015). The choice of scale is also important in explaining the factors affecting landslides and in placing these parameters within a process. The number of relevant factors decreases or increases depending on the scale of the study. Using too many factor parameters can reduce the accuracy of a model, because they can dilute the weight of the main factors causing the landslide. Therefore, the landslide type in the model should be uniform, and the study area should be a physically coherent morphological unit, such as a medium-scale geomorphological basin.
Dividing the study area into physical units can reduce some of these problems. For instance, using only raster- or vector-based susceptibility maps may cause problems for risk maps. A building may fall on raster cells with both high and low values, or it may be cut by two vector polygons with different values. Therefore, it is more appropriate to take slopes as the physical units in raster or vector studies. In addition, activity at any point of a slope affects the whole slope. Since the whole slope is affected by the first movement, it makes more sense to present the results by slope unit (Carrara et al. 1991).
The landslide inventory is very important because it provides information about the distribution pattern of existing landslides and helps determine the factors that control landslide formation (Casagli et al. 2004; Cihangir and Gorum 2016; Du et al. 2020; Eeckhaut et al. 2009). Following the principle that the past is the key to the present, an up-to-date landslide inventory makes landslide susceptibility maps more reliable.
In a classical landslide system, the slope, relief, and altitude values are the lowest in the accumulation zone and depletion zone (Gorum et al. 2008). Even in very large landslides, the lithology and groundwater level of this zone change. Many researchers have applied different methods to sample the data (Clerici et al. 2006; Gorum et al. 2008; Nefeslioglu et al. 2008; Tribe 1991; Yilmaz 2010). When using landslide inventories, it makes more sense to take the average topographic and geological data of the depletion zone, where the first movement occurred, rather than of all parts of the landslide (Cihangir and Gorum 2016). In our opinion, a model built on the selected average values and above also creates fewer problems, because it targets the regions where landslide susceptibility is high.
In line with all this information, this study aims 1) to choose the correct parameters; 2) to associate the data obtained from the depletion zone with the appropriate parameters by revising the landslide inventory; and 3) to produce both vector- and raster-based models with quantitative, semi-quantitative, and qualitative approaches. It also aims to make the results easier to use in the hazard and risk stages by combining the model outputs on a single map. All of these were produced for slide-type landslides according to the Cruden (1993) classification.

Study Area
The study area, the Sakarya Basin, covers ~670 km² between the Marmara and Black Sea regions. It is located in northwestern Turkey between 40°16′20″ N and 40°08′54″ N, and 30°23′26″ E and 30°51′14″ E (Fig. 1). The study area has steep topographic relief, and its elevation ranges from 857.5 m to 1486 m above mean sea level (Fig. 3A and C). The slope of the area varies between 13° and 68° (Fig. 3B). According to the gauges of the Yenipazar meteorology station, the annual average precipitation is 38 mm. The highest rainfall occurs in December and January, and the lowest in July and August. Geologically, during the Mesozoic the Tethys Ocean separated the Sakarya continent in the north from the Anatolide-Tauride block in the south. Later, in the Late Cretaceous, the Tethys Ocean narrowed by northward subduction. At the beginning of the Tertiary, the Sakarya continent and the Anatolide-Tauride block collided. This continent-continent collision caused deformation and Alpine orogeny in the region (Okay 2011; Yilmaz and Oezel 2008). Rock types ranging in age from the Paleozoic to the present are exposed in the study area, which is located in the Sakarya Zone and the Tauride-Anatolide tectonic union. From oldest to youngest, these are: Paleozoic Magmatic-Amphibolite-Gneiss-Schist (Pzs) and Granite (Csg); Jurassic Cherty limestone (JKs); Cretaceous Rhyolite (Rhyolite), Clayey limestone (Kyed), and Sandstone-Mudstone-Limestone (Kyet); Paleogene Limestone (Tks), Sandstone-shale-limestone-tuff (Kye), and Conglomerate-sandstone-mudstone (Tk); and Quaternary Alluvium (Qa) and Travertine (Qt). Of these, Sandstone-Mudstone-Limestone (Kyet) covers 39% and Conglomerate-sandstone-mudstone (Tk) covers 29.9% of the study area (Fig. 3D) (MTA 2002).

Factor parameter selection
Selecting the right parameters in a susceptibility model means establishing a sound basis. The first step in choosing the right parameters is to ask which parameters are most often used as landslide controls and how important they are in regional landslide studies in the literature. Accordingly, 200 international publications that received at least 20 citations in the Web of Science between 1990 and 2020 were reviewed (Fig. 4). The use of the conditioning and triggering parameters that control the distribution of landslides was examined in these studies. Geological (lithology and structural lineaments), topographic-morphological (slope, altitude, and aspect), environmental (land use, vegetation, road density), and triggering (climate) factors were used in most studies (Fig. 4). Topographic-morphological data, also known as landslide preparatory conditions, appeared in almost all studies. However, in most of these studies, factors that were not effective in the study area were used randomly. Selecting the model parameters after collecting information about the study area and understanding the processes that control landslides is expected to make the results more reliable. where the rock surfaces are formed (Fig. 2).

Material and Methods
A topographic DEM with 10 m resolution was used for altitude data, and slope values at 10 m resolution were calculated from it. In addition, relief values were calculated from the elevation differences within a rectangular area of 2000 m². Lithology was obtained from the 1:100 000 scale maps of the General Directorate of Mineral Research and Exploration (MTA 2002). The current landslide inventory of the study area was compiled from Landsat satellite images. The depletion zones of these landslides were identified from hillslope profiles. Besides indicating which parameters should be used, swath profiles contributed to the classification of the parameters by revealing the data distribution in the landslide areas. Thus, the spatial relationship between landslides and lithological units, relief, elevation, and slope was revealed through swath profile analysis. The study area was divided into physical slope units. Five models were run on a raster or vector basis.
Raster and vector model outputs were transferred to slope units. Finally, five models were combined to create a single model (Fig. 5).

Frequency Ratio Model (FR)
Frequency Ratio (FR) is a statistical approach based on a probability model that is applied to evaluate landslide susceptibility (Lee and Pradhan 2007). The frequency ratio is defined as the ratio of the probability of an event occurring to the probability of it not occurring (Erener and Düzgün 2010). A landslide susceptibility map is produced by multiplying each parameter subgroup by its calculated "RF" value and each factor parameter by its "PR" value. In this study, the frequency ratio was used to create a raster susceptibility map (Fig. 8E).
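The calculation can be illustrated with a minimal Python sketch. The formulas for FR, "RF", and "PR" are not reproduced in this text, so the sketch assumes the forms commonly used in frequency ratio studies: FR as the share of landslide cells in a class divided by the share of all cells in that class, RF as FR rescaled to sum to 1 within a factor, and PR as the RF range of a factor divided by the smallest RF range among the factors. All function names and the toy rasters are hypothetical.

```python
import numpy as np

def frequency_ratio(class_raster, landslide_mask):
    """FR per class: share of landslide cells in the class / share of all cells in the class."""
    landslide_mask = np.asarray(landslide_mask, dtype=bool)
    total_cells = class_raster.size
    total_slides = landslide_mask.sum()
    fr = {}
    for c in np.unique(class_raster):
        in_class = class_raster == c
        slide_share = (landslide_mask & in_class).sum() / total_slides
        area_share = in_class.sum() / total_cells
        fr[int(c)] = float(slide_share / area_share)
    return fr

def relative_frequency(fr):
    """RF (assumed definition): FR values of one factor rescaled to sum to 1."""
    total = sum(fr.values())
    return {c: v / total for c, v in fr.items()}

def prediction_rate(rf_by_factor):
    """PR (assumed definition): RF range of a factor / smallest RF range among all factors."""
    spans = {f: max(rf.values()) - min(rf.values()) for f, rf in rf_by_factor.items()}
    smallest = min(spans.values())  # must be > 0 for the ratio to be defined
    return {f: s / smallest for f, s in spans.items()}

def susceptibility_index(class_rasters, rf_by_factor, pr):
    """Sum over factors of PR(factor) * RF(class of the cell)."""
    lsi = None
    for f, raster in class_rasters.items():
        rf_map = np.zeros(raster.shape, dtype=float)
        for c, v in rf_by_factor[f].items():
            rf_map[raster == c] = v
        term = pr[f] * rf_map
        lsi = term if lsi is None else lsi + term
    return lsi

# Toy 4x4 example with hypothetical slope classes 1-3 and four landslide cells.
slope = np.array([[1, 1, 2, 2], [1, 1, 2, 2], [1, 1, 3, 3], [1, 1, 3, 3]])
slides = np.array([[1, 0, 1, 0], [0, 0, 0, 0], [0, 0, 1, 1], [0, 0, 0, 0]], dtype=bool)
fr_slope = frequency_ratio(slope, slides)   # class 1: 0.5, class 2: 1.0, class 3: 2.0
rf_slope = relative_frequency(fr_slope)
```

An FR above 1 indicates that a class holds more landslide cells than its share of the area would suggest, which is why class 3 in the toy example would receive the highest weight.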

Weighted Overlay Model (WOM)
Weighted overlay analysis can be interpreted as a multi-layer, multi-criteria evaluation. The result layer is obtained by weighting more than one raster layer, both relative to each other and internally, and then overlaying them (Basharat et al. 2016; Shit et al. 2016). This layer shows the areas that are suitable and not suitable according to the criteria evaluated (Roslee et al. 2017). The analysis consists of the following steps. The values in each input raster are reclassified to a common evaluation scale, such as suitability or a similar unifying scale. The cell values of each input raster are multiplied by the raster's degree of importance. The resulting cell values are added together to generate the output raster. Each raster layer is weighted according to its importance or percent influence, and the assigned weights must sum to 100.
In this study, the selected factors were weighted on a raster basis using expert opinion. The landslide content of the factors was also considered in the expert weighting (Table 9). The weighted overlay model was used to create a raster susceptibility map in this study (Fig. 8G).
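The overlay steps described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rasters are taken to be already reclassified to a common scale, the factor names and the 60/40 weights are hypothetical and are not the expert weights of Table 9, and the output is left unrounded, although GIS tools may round it to integers.

```python
import numpy as np

def weighted_overlay(reclassified, weights_percent):
    """Multiply each reclassified raster by its percent weight and sum the results."""
    if abs(sum(weights_percent.values()) - 100.0) > 1e-9:
        raise ValueError("percent weights must sum to 100")
    out = None
    for name, raster in reclassified.items():
        term = raster.astype(float) * weights_percent[name] / 100.0
        out = term if out is None else out + term
    return out

# Hypothetical 2x2 rasters, already reclassified to a common 1-9 scale.
layers = {
    "slope": np.array([[1, 5], [3, 9]]),
    "lithology": np.array([[9, 1], [3, 5]]),
}
weights = {"slope": 60, "lithology": 40}  # illustrative values only
overlay = weighted_overlay(layers, weights)  # [[4.2, 3.4], [3.0, 7.4]]
```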

Analytical Hierarchy Process (AHP)
The Analytical Hierarchy Process (AHP) is an expert opinion-based method that can be monitored and calibrated at every stage of the analysis (Saaty 1980). Once the decision hierarchy is defined, AHP can be described as a decision-making and estimation method that gives the percentage distributions of the decision points according to the factors affecting the decision (Nefeslioglu et al. 2013; Saaty 1980). The first step is to define the problem, that is, to determine the decision points and the factors affecting them (Nefeslioglu et al. 2013; Saaty 1980). The second step is to create comparison matrices between the factors (Yaralıoğlu 2004) (Eq. 4). The degree of importance of each factor is determined when the factors are compared (Saaty 1980). Comparisons are made for all values above the diagonal of the comparison matrix, whose diagonal elements are 1.
Eq. 5 is used for the components below the diagonal (Nefeslioglu et al. 2013; Saaty 1980). In the third step, the column vectors of the comparison matrix are used to determine the percentage significance distributions of the factors. "n" column vectors [B], each with "n" elements, are calculated (Eq. 6); Eq. 7 is used to calculate the "B" column vectors. The "C" matrix is obtained by combining the B column vectors calculated for the factors in matrix form. From the "C" matrix (Eq. 8), the percentage significance distribution ("W" vector) showing the relative importance of the factors is obtained (Eq. 9).
In the fourth step, the Consistency Ratio (CR) is calculated to measure the consistency of the factor comparisons. The calculation of CR is based on comparing the number of factors with a coefficient called the Basic Value (λ) (Nefeslioglu et al. 2013; Saaty 1980). To calculate "λ", the "A" comparison matrix is first multiplied by the "W" priority vector, which gives the column vector "D" (Eq. 10). Dividing the corresponding elements of the column vectors "D" and "W" gives the "E" value for each evaluation factor. The arithmetic mean of these values gives the Basic Value (λ) for the comparison (Eq. 11). Using "λ", the Consistency Index (CI) is calculated (Eq. 12). To evaluate consistency, CI is divided by a correction value called the Random Index (RI), which gives CR (Eq. 13) (Yaralıoğlu 2004). A calculated CR value below 0.10 indicates that the comparison matrix provided by the expert is consistent (Nefeslioglu et al. 2013; Saaty 1980). In the fifth step, the percentage importance distributions of the factors at the "m" decision points are found. One-to-one comparisons and matrix operations are repeated for each of the "n" factors. At this stage, the number of decision points sets the dimension of the "G" comparison matrix used for each factor (Nefeslioglu et al. 2013; Saaty 1980). After each comparison, an "S" column vector showing the percentage distribution of the evaluated factor over the decision points is obtained (Eq. 14). The sixth step finds the distribution of results over the decision points. In this step, the "K" decision matrix is formed from the "n" "S" column vectors (Eq. 15). When the resulting decision matrix is multiplied by the column vector "W", the "L" column vector with m elements is obtained (Eq. 16). Finally, the coefficient values of the "W" vector are assigned to each factor class for AHP.
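Because the equations themselves are not reproduced here, the following Python sketch shows one common way to carry out steps two to four, assuming the standard Saaty formulation: reciprocal entries below the diagonal, column normalisation to obtain "C", row means to obtain "W", and CI = (λ - n)/(n - 1). The random index table holds the widely quoted Saaty values, and the function names and example weights are hypothetical.

```python
import numpy as np

# Saaty's random index (RI) by matrix size n, as commonly tabulated.
RANDOM_INDEX = {1: 0.0, 2: 0.0, 3: 0.58, 4: 0.90, 5: 1.12,
                6: 1.24, 7: 1.32, 8: 1.41, 9: 1.45, 10: 1.49}

def fill_reciprocals(upper):
    """Complete a comparison matrix: a_ji = 1 / a_ij below the diagonal."""
    A = np.array(upper, dtype=float)
    n = A.shape[0]
    for i in range(n):
        A[i, i] = 1.0
        for j in range(i + 1, n):
            A[j, i] = 1.0 / A[i, j]
    return A

def ahp_weights(A):
    """Return (W, lambda, CI, CR) for a square comparison matrix A."""
    A = np.asarray(A, dtype=float)
    n = A.shape[0]
    C = A / A.sum(axis=0)        # normalise each column (the "B" vectors side by side)
    W = C.mean(axis=1)           # row means give the priority vector "W"
    D = A @ W                    # column vector "D"
    E = D / W                    # element-wise ratio "E"
    lam = float(E.mean())        # Basic Value (lambda)
    CI = (lam - n) / (n - 1)     # Consistency Index
    RI = RANDOM_INDEX[n]
    CR = CI / RI if RI > 0 else 0.0
    return W, lam, CI, CR

# A perfectly consistent matrix built from hypothetical weights: a_ij = w_i / w_j.
w_true = np.array([0.5, 0.3, 0.2])
A_example = w_true[:, None] / w_true[None, :]
W, lam, CI, CR = ahp_weights(A_example)   # W recovers w_true, lambda = n = 3, CR = 0
```

For a perfectly consistent matrix λ equals n and CR is zero; expert matrices usually deviate from this, and the 0.10 threshold quoted above decides whether the deviation is acceptable.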
In this study, AHP was used to create a raster susceptibility map (Fig. 8F). The AHP steps described up to this point are also carried over into the M-AHP used for the vector susceptibility map.

Modified Analytical Hierarchy Process (M-AHP)
M-AHP (Modified Analytical Hierarchy Process) is the method suggested by Nefeslioğlu et al. (2013) in order to eliminate the uncertainty arising from the subjective evaluation of AHP method. There are two differences between AHP and M-AHP (Nefeslioglu et al. 2013).
First, the comparison matrix is not prepared by the expert. The expert only gives the maximum scores for each factor in the system. The factor score difference matrix is then prepared. The factor score difference values are normalized by the maximum factor score in the system. The dimension of the sampling space in the system is determined by the maximum factor score given by the expert. The factor comparison matrix is constructed using the modified importance value scale (Nefeslioglu et al. 2013).
The second group of differences concerns the evaluation of the importance distributions of the conditioning factors over the decision points. At this stage, each factor is normalized by its own maximum score. The linear distances between the normalized factor score and the decision points are measured on a numerical axis over the closed interval [0, 1]. The decision point comparison matrix is constructed using the modified importance value scale.
Since a landslide occurs along a slope, the model was applied to slope units. The study area was therefore divided into 2597 geomorphological slope units. The average values of the factor parameters that cause landslide susceptibility were transferred to these slope units. These parameters were ranked according to their importance, and scores were assigned. In the next step, 2597 M-AHP analyses were performed for each model. In each model, the decision scores of each slope unit (low, medium, and high) were entered.
The following explanation shows the steps for only one slope unit.
First, the factor score difference matrix and the normalized factor score difference matrix were created for the model (Table 1). In the second stage, the table of significance values (Table 3) and the comparison matrix between factors ("A" matrix) are determined (Table 4). In the third stage, the percent significance distributions of the factors are determined ("C" matrix and "W" priority vector) (Table 5). In the fourth stage, the consistency of the factor comparisons is measured. Each element of the vector "D" is the sum of the products of a row of the "A" matrix with the "W" priority vector. "E" is obtained as the ratio of "D" to the "W" priority vector (Table 6). The λ obtained as the mean of the vector "E" is equal to 4.08. The consistency index (see Eq. 12) is 0.03. The random index is 1. The consistency ratio (see Eq. 13) for the comparison matrix and the weight vector [W] was calculated to be 0.03. From this value, it can be concluded that the comparison matrix constructed for the relevant slope unit from the instant factor scores is consistent and rational. The evaluation of the decision points and the resultant distribution constitutes the fifth stage. To evaluate the importance distributions of the conditioning factors over the three decision points (low, moderate, and high) (Fig. 6), the [G] matrices were first constructed, and then the [S] vectors for each conditioning factor were calculated. At this stage, each parameter was normalized by its own maximum score. The linear distances of each parameter to the decision points were evaluated on a normalized number line over the interval [0, 1] (Nefeslioglu et al. 2013).
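As a check on the consistency values reported above, the arithmetic can be written out. The block below assumes that Eq. 12 and Eq. 13 take the standard Saaty forms, which are not reproduced in this text, and that n = 4, inferred from the four parameters C1 to C4; the random index of 1 is taken from the text as stated.

```latex
\begin{align}
CI &= \frac{\lambda - n}{n - 1} = \frac{4.08 - 4}{4 - 1} \approx 0.027 \approx 0.03 \\
CR &= \frac{CI}{RI} = \frac{0.027}{1} \approx 0.03 < 0.10
\end{align}
```

Under these assumptions both rounded values agree with those reported, and the ratio falls well below the 0.10 threshold.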
A sample analysis is shown for the 3 Decision Points (DP) of the landslide susceptibility parameter C1 (Normalized Parameter Score = 1.000). The same analysis must also be carried out for C2, C3, and C4 (Nefeslioglu et al. 2013). The decision points DP-1, DP-2, and DP-3 determined for the parameter "C1" (Table 7) were likewise determined for "C2", "C3", and "C4". After the decision points of "C1", "C2", "C3", and "C4" were determined, the result distribution was obtained. The decision point with the highest result value constitutes the final result of the slope unit.
This process was applied one by one to the 2597 slope units in the study area. The low, medium, and high decision of each model was transferred to the slope unit. Finally, these results were normalized (Fig. 8H).
Table 2: The factor score difference matrix "A" and the normalised factor score difference matrix "B".
(Nefeslioglu et al. 2013, after Saaty 1980).
Table 7: Determination of "C1" parameter decision points and distribution of results.

CBS Matrix Model (CBS MM)
This model, a quantitative approach also referred to as the GIS matrix method, is carried out with a matrix in a GIS environment, based on all possible combinations of the classes of the landslide-causing factors and their correlation with the landslide inventory (DeGraff and Romesburg 2020; Fernández et al. 1999; Irigaray et al. 2007).
The "Landslide Matrix" (Fig. 7a) was created by calculating the surface area affected by landslides in each factor combination, based on the landslide inventory (landslide depletion zones) (Fernández et al. 1999; Irigaray et al. 2007). The "Management Unit Matrix" (Fig. 7b) was created by calculating the total surface area occupied by each factor combination (Fernández et al. 1999; Irigaray et al. 2007). The "Landslide Susceptibility Matrix" (Fig. 7c; Fig. 8I) was obtained by dividing the values of the "Landslide Matrix" by the corresponding values of the "Management Unit Matrix" (Fernández et al. 1999; Irigaray et al. 2007). The values in the landslide susceptibility matrix represent the proportion of the area of each factor combination that is affected by landslides, and thus the relative susceptibility of each combination of factors at each point (Fernández et al. 1999; Irigaray et al. 2007).
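The three matrices can be illustrated with a short Python sketch. It is a simplification under stated assumptions: only two factors are crossed, whereas the model uses every combination of the selected factors; areas are counted in cells, since the cell size cancels in the ratio; and the class codes and toy rasters are hypothetical.

```python
import numpy as np

def matrix_method(factor_a, factor_b, landslide_mask):
    """Return (landslide matrix, management unit matrix, susceptibility matrix).

    Class codes are assumed to be integers starting at 0; rows index the
    classes of factor_a and columns index the classes of factor_b.
    """
    shape = (int(factor_a.max()) + 1, int(factor_b.max()) + 1)
    idx = (factor_a.ravel(), factor_b.ravel())
    unit = np.zeros(shape)
    np.add.at(unit, idx, 1.0)  # total cells per factor combination
    slide = np.zeros(shape)
    np.add.at(slide, idx, landslide_mask.ravel().astype(float))  # landslide cells
    with np.errstate(divide="ignore", invalid="ignore"):
        susceptibility = np.where(unit > 0, slide / unit, 0.0)
    return slide, unit, susceptibility

# Toy 2x4 rasters: two lithology classes crossed with two slope classes.
lithology = np.array([[0, 0, 1, 1], [0, 0, 1, 1]])
slope_cls = np.array([[0, 1, 0, 1], [0, 1, 0, 1]])
depletion = np.array([[0, 1, 0, 0], [0, 1, 0, 1]], dtype=bool)
slide_m, unit_m, susc_m = matrix_method(lithology, slope_cls, depletion)
# susc_m is [[0.0, 1.0], [0.0, 0.5]]

# Map the matrix values back onto the raster, cell by cell.
susc_map = susc_m[lithology, slope_cls]
```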

Fig. 7: Illustration of the determination of landslide susceptibility by the GIS matrix method (modified from Fernández et al. 1999; Irigaray et al. 2007).

Discussion and conclusions
Raster-based models may show sharp transitions between neighboring cells. A pixel with very high susceptibility may be adjacent to a pixel with very low landslide susceptibility. Also, some cells do not show the morphological continuity that defines a landslide. The assumption that a high-value pixel does not affect its neighbor, together with the dependence of these contrasts on the resolution, reduces the reliability of the results. This in turn affects the reliability of the hazard and risk studies built on these landslide susceptibility maps, and the decisions of the administrative bodies that rely on them.
Do natural events such as landslides have such sharp geometric boundaries, especially when landslides with different morphological structures are present? A similar situation is observed in vector-based results: they can produce landslide susceptibility zones with many fragments and sharp forms on a single slope.
In this study, slope units were preferred in order to avoid raster and vector shapes that do not describe the landslide as a whole, and because instability on a slope systematically affects the whole slope. Under the slope unit approach, wide valley floors were excluded from the models, since landslide-related processes take place on slopes. This approach affected the results positively.
Although basic conditions such as slope and lithology are common to most studies, conditions specific to some areas, such as TWI and SPI, also affect landslides. These can also change with the working scale. As the map scale of the study gets smaller, the area covered expands, and the number of factor types affecting landslides therefore increases.
This study suggests that the conditions affecting landslides should be identified through research and analysis specific to the study area rather than by random selection. In addition, we think it is more appropriate to focus on a single landslide type and choose parameters specific to it, because conditions and results may change with the type of landslide.
Accordingly, in this study the parameters affecting landslides were selected through a geomorphological analysis of the study area. Models were created using the basic parameters determined for slide-type landslides. Precipitation, the triggering factor in the models, was treated as uniform over the whole area, since the study area has only one meteorological station. The aim is to predict in which areas a landslide may occur when precipitation exceeds the threshold value for the study area.
For the raster-based models, the susceptibility of each slope unit was determined from the majority value of the cells within that unit. The M-AHP vector-based model was run directly on slope units. The results of the CBS Matrix model, the other vector-based model, were assigned by the majority value in each slope unit, as for the raster-based models.
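The majority rule can be sketched as a zonal operation in Python. This is a minimal illustration: the unit identifiers, class codes, and toy rasters are hypothetical, and ties are resolved toward the lowest class code, a choice the text does not specify.

```python
import numpy as np

def majority_per_unit(unit_ids, class_raster):
    """Assign each slope unit the most frequent susceptibility class of its cells."""
    result = {}
    for uid in np.unique(unit_ids):
        values = class_raster[unit_ids == uid]
        classes, counts = np.unique(values, return_counts=True)
        # argmax returns the first maximum, so ties go to the lowest class code
        result[int(uid)] = int(classes[np.argmax(counts)])
    return result

# Two hypothetical slope units on a 2x3 raster; classes 1 = low, 2 = medium, 3 = high.
units = np.array([[1, 1, 2], [1, 2, 2]])
susceptibility_classes = np.array([[3, 3, 1], [1, 1, 2]])
unit_class = majority_per_unit(units, susceptibility_classes)  # {1: 3, 2: 1}
```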
The results of these different types of models, spanning raster- and vector-based and qualitative and quantitative methods, clearly revealed several findings for the studied region.
The highest receiver operating characteristic (ROC) curve value among the models was obtained for the CBS Matrix model (0.92). In descending order, the remaining models are AHP (0.87), frequency ratio (0.86), weighted overlay (0.86), and M-AHP (0.79) (Fig. 9).
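The text does not state how the ROC values were computed. Assuming they are areas under the ROC curve, one way to obtain such a value from susceptibility scores and observed landslide labels is the rank-based estimate sketched below in Python; the scores and labels are hypothetical, and the pairwise comparison is practical only for small samples.

```python
import numpy as np

def roc_auc(scores, labels):
    """Area under the ROC curve as the probability that a landslide sample
    scores higher than a non-landslide sample (ties count as one half)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos = scores[labels]
    neg = scores[~labels]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return float((greater + 0.5 * ties) / (pos.size * neg.size))

# Hypothetical susceptibility scores for five units and their landslide labels.
example_scores = [0.9, 0.8, 0.4, 0.3, 0.2]
example_labels = [1, 1, 0, 1, 0]
auc = roc_auc(example_scores, example_labels)  # 5 of 6 pairs ranked correctly
```

A value of 0.5 corresponds to a random ranking and 1.0 to a perfect one, so the reported range of 0.79 to 0.92 would indicate that every model ranks landslide areas well above chance.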

Fig. 8: Vector and raster-based model outputs.
Fig. 9: Illustration of receiver operating characteristic (ROC) curve for different models.
The highest and lowest susceptibility areas in both the raster and vector outputs fall in close or shared locations in all models (Fig. 8). In particular, the western and northwestern parts of the study area correspond to low landslide susceptibility. This is thought to result from the correct parameter selection. The transition values between the lowest and highest values differ between the models. The model with the largest difference is the M-AHP output (Fig. 8H). The main reason is that this model gives results according to 3 sharp decision points (low, medium, and high) rather than a smooth transition.
A hybrid model that can identify the areas of highest landslide susceptibility was produced by combining all the models, whose common areas are close to each other (Fig. 10). In this way, the spatial susceptibility in the study area was determined by a joint decision of the different models rather than by a single model. This model was produced to provide practicality and reliability in use. 1) The high success achieved in this study shows that the results benefit when the parameters used in a landslide susceptibility study are chosen specifically for the study area rather than randomly.
2) The study brings together models with different bases, such as raster or vector, and with different approaches, such as qualitative or quantitative, on a common denominator in terms of susceptibility. 3) Since the minimum and maximum landslide susceptibility areas are common to all models, the combined model clearly reveals the safe and unsafe areas. 4) As a hybrid model, it provides more reliable decisions by drawing on the success of all the models.
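The exact rule used to combine the five slope-unit outputs is not stated in the text. Purely as an illustration, the Python sketch below assumes that each model assigns every slope unit an ordinal class (1 = low, 2 = medium, 3 = high) and takes the per-unit median across the models as the joint decision. The rule, the model keys, and the class values are all assumptions, not the procedure behind Fig. 10.

```python
import numpy as np

def combine_models(class_by_model):
    """Joint decision per slope unit: median class across an odd number of models.

    This combination rule is an assumption made for illustration only.
    """
    stacked = np.array(list(class_by_model.values()))  # shape: (models, units)
    if stacked.shape[0] % 2 == 0:
        raise ValueError("use an odd number of models so the median is a class value")
    return np.median(stacked, axis=0).astype(int)

# Hypothetical classes of three slope units from the five models.
outputs = {
    "FR":     [1, 2, 3],
    "WOM":    [1, 3, 3],
    "AHP":    [1, 2, 3],
    "M-AHP":  [2, 1, 3],
    "Matrix": [1, 2, 2],
}
hybrid = combine_models(outputs)  # array([1, 2, 3])
```

A median keeps a unit in the high class only when at least three of the five models agree, which matches the idea of a joint decision but is only one of several plausible rules.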

Acknowledgments
The author would like to thank Hakan Ahmet Nefeslioglu for their teachings and Muhterem Kucukonder for help.