Landslide Susceptibility Mapping Using GIS-based Information Value and Frequency Ratio Methods in Gindeberet area, West Shewa Zone, Oromia Region, Ethiopia

The study area is located in the Gindeberet district of the West Shewa Zone, Oromia Regional State, Ethiopia. The area is highly susceptible to active surface processes because of its rugged morphology, with steep scarps, sharp ridges, cliffs, deep gorges and valleys. This study aimed to identify and evaluate the causative factors and to prepare landslide susceptibility maps (LSMs) of the study area. Two bivariate statistical models, the information value (IV) and the frequency ratio (FR), were used. First, active, reactivated and passive landslides and scarps were identified through Google Earth image interpretation and an extensive field survey to build the landslide inventory. A total of 580 landslides were randomly divided into two datasets: 80% (464 landslides) were used for modeling and 20% (116 landslides) for validation. Eight conditioning factors (slope, aspect, curvature, distance from stream, distance from lineaments, lithology, rainfall and land use) were combined with the training landslide dataset in ArcGIS to generate LSMs, which were divided into very low, low, moderate, high and very high susceptibility zones. The LSMs for the IV and FR models were validated using the area under the receiver operating characteristic (ROC) curve, giving success rates of 0.836 and 0.835 and predictive rates of 0.817 and 0.818, respectively, which indicates good performance of both models. The resulting LSMs can be used for land use planning and management.


Introduction
The Earth's surface is continually changing as a result of mass wasting processes such as landslides, and these changes are most common in mountainous terrain. A landslide is defined as a mass movement of rock, debris or earth down a slope, resulting in a geomorphic change of the Earth's surface (Glade 1997). It can be triggered by various natural external stimuli such as intense and/or prolonged rainfall, volcanic eruptions, earthquakes and snowmelt, and also by human actions that affect drainage or groundwater conditions.
Globally, landslide disasters claim thousands of lives and cause billions of USD in property damage every year. These phenomena cause extensive damage affecting people, organizations, industries and the environment (Glade 1997). Ethiopia is a mountainous country characterized by steep slopes, deep gorges, frequent fault escarpments, highly weathered rocks, and intense and prolonged rainfall. All of these conditions favor the occurrence of slope movements after the rainy seasons in most parts of the country (Woldearegay 2013). According to studies conducted in Ethiopia (Woldearegay 2008; Ibrahim 2011 cited in Meten et al. 2015), landslides have caused loss of human and animal lives and damage to infrastructure and property over the last five decades.
The current study area is located in Gindeberet District, West Shewa Zone, Oromia Region, in central Ethiopia. The area is highly susceptible to landslides: many active landslides in the last few decades have extensively damaged gravel roads, houses and farmland. The magnitude of the problem is alarming, and the vulnerability of people's lives and property to landslides needs special attention. So far, the area has not been studied in detail.
The development of more advanced qualitative and quantitative methods and GIS data-processing techniques has made numerous studies possible, and as a result landslide susceptibility, hazard and risk assessment have attracted the attention of many researchers (van Westen et al. 2006). According to Aleotti and Chowdhury (1999), there are two general approaches to studying landslide susceptibility: field-based qualitative approaches and data-driven quantitative approaches. The field-based qualitative (heuristic) approach depends directly on the researcher's expertise, because all of the decision rules used to prepare the landslide susceptibility maps are set by the researcher. In the data-driven quantitative approach, statistical methods are used to estimate the relative contributions of the factors responsible for slope instability and to make predictions based on these factors (Suzen 2002). The data analysis in statistical approaches can be grouped into bivariate and multivariate methods (Soeters and van Westen 1996). The aim of this paper is to generate landslide susceptibility maps of the study area using GIS-based information value and frequency ratio models.

Study Area
The study area is located in Gindeberet District, West Shewa Zone, Oromia Region, in central Ethiopia (Figure 1). It lies 182 km northwest of Addis Ababa and is bounded by longitudes 37°40ʹ to 37°57ʹE and latitudes 9°30ʹ to 9°40ʹN. The area can be divided into two main physiographic regions, a plateau and rugged terrain, with elevations ranging from 1386 to 2600 m a.s.l. Because of its sharp ridges, cliffs, rugged slope faces, steep scarps, deep gorges and valleys, the area is highly susceptible to active surface processes. The drainage pattern is dendritic, with river channels following the slope of the terrain. As shown in the physiographic map in Figure 2, the slopes of the study area face north and south. The study area is divided into three agro-ecological zones: highland, midland and lowland. The wet season starts in May and extends to October, with the wettest period in July and August. The area receives rainfall twice a year, with heavy precipitation from June to September and light to moderate precipitation from mid-March to mid-April. In July and August, rainfall peaks at between 300 and 400 mm per month, and more than 50% of the annual precipitation falls during these two months. The study area is moderately vegetated, and most of the ridges are covered by eucalyptus trees, wild grass and scattered bushes. Most of the gentle slopes and flat land are intensively used for agriculture and as grazing land for livestock. The dominant crops are "teff", sorghum, barley, wheat, maize and pulses, with small-scale production of vegetables.

Data used and Methodology
To achieve the objective of this research, four main stages were followed: data acquisition and organization, database construction and analysis, preparation of the landslide susceptibility maps, and finally verification and comparison of the landslide susceptibility maps.

Data Acquisition and Organization
The data acquisition part of this research was based on primary data and available secondary data. Secondary data include both published and unpublished papers. The topographic map, regional geological map and meteorological data were collected from the Ethiopian Geospatial Information Agency, the Geological Survey of Ethiopia and the National Meteorological Agency of Ethiopia, respectively. A DEM of the study area was used to generate topographic parameters such as slope, aspect and curvature. Google Earth images were used to identify the land cover and geomorphological features of the study area. After a thorough literature review and a preliminary site investigation, detailed field work was carried out to collect primary data. This included describing the lithological units and their relative degree of weathering; visually inspecting slope steepness; recording the locations of springs; mapping active landslides and landslide scars; identifying the materials involved, failure mechanisms and state of activity (active, dormant etc.); measuring landslide size and shape; and identifying land use and human activities and their role in landslide occurrence in the study area.

Landslide inventory map
The landslide inventory map (Figure 4) was produced through an intensive field survey and interpretation of time-series Google Earth images. Extensive field studies conducted from mid-November to mid-December 2019 were used to map known landslides using GPS locations, to check the size and shape of the landslides, to identify the type of movement and the materials involved, and to determine each landslide's state of activity (active, reactivated, dormant etc.). In total, 580 landslides (24356 pixels) were identified and mapped as vector polygons and then converted to raster format with a pixel size of 30 m by 30 m in ArcGIS. They were randomly subdivided into two datasets: 80% (464 landslides, 20336 pixels) were used for building the susceptibility models and 20% (116 landslides, 4020 pixels) for model validation (Figure 4). Of the 580 landslides, 98 were identified during the field work and the remaining landslide polygons were identified from Google Earth images. Landslides are concentrated in the northern and southern parts of the study area, which are characterized by rugged and steep slopes, deep gorges, and highly fractured and weathered rocks. The inventory comprises debris slides, rotational and translational slides, progressive creep movements, rock slides, rock falls and complex landslides. All landslides are represented by polygon features and together affect a total area of 21.88 km²; the largest landslide polygon covers 0.7576 km² and the smallest 0.000389 km². The landslide classification system of Varnes (1978) was used in this study.
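The random 80/20 subdivision of the inventory can be illustrated with a minimal Python sketch. The paper does not say how the random selection was performed in ArcGIS, so the shuffle-and-cut approach, the function name `split_inventory` and the fixed seed are assumptions made only for illustration.

```python
import random


def split_inventory(landslide_ids, train_fraction=0.8, seed=42):
    """Randomly split landslide IDs into training and validation sets.

    The seed is an arbitrary, hypothetical choice that makes the split repeatable.
    """
    ids = list(landslide_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


# 580 landslides in the inventory, identified here only by an index.
train_ids, valid_ids = split_inventory(range(580))
print(len(train_ids), len(valid_ids))  # 464 116
```

Splitting by landslide rather than by pixel keeps every polygon entirely in one dataset, which is why the pixel shares reported above are not exactly 80% and 20%.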

Landslide Causative Factors
The factors selected for landslide susceptibility assessment in a GIS-based study must be operational, represented over the entire area, variable, fundamental and measurable (Ayalew and Yamagishi 2005). Based on these criteria, eight predisposing factors were selected: slope, aspect, curvature, lithology, land use/land cover, rainfall, distance from stream and distance from lineaments. The eight causative factors were prepared in raster format with the same spatial projection (WGS 1984 UTM Zone 37N) and a cell size of 30 × 30 m in ArcGIS 10.4, which was then used to evaluate the spatial relationship between each factor and the landslides in the study area.
Topographic parameters (slope, aspect and curvature) were extracted from the digital elevation model (DEM). The lithological map was prepared from field data and the existing geological map, while the land use/land cover map was generated from Google Earth image interpretation with field survey and verification. The lineament map was prepared from 3D Google Earth image interpretation with field survey, and the distance from lineaments was derived through GIS-based buffering analysis. The distance from stream map was extracted from the drainage map of the study area using Euclidean distance buffering. The rainfall map was prepared by interpolating the 30-year average rainfall of the five rain gage stations nearest to the study area, using IDW interpolation in the Spatial Analyst tools, to obtain a spatially distributed rainfall raster.

Aspect
Aspect is defined as the slope direction. It has an essential influence on landslide occurrence because it controls the exposure of the slope to sunlight, cold and hot winds, and rainfall (Huang et al. 2015). The aspect map of the study area was derived from the DEM in ArcGIS and classified into nine classes, i.e. Flat, North, Northeast, East, Southeast, South, Southwest, West and Northwest (Figure 5a).
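As an illustration of the nine-class scheme, the sketch below maps an aspect angle to a class name. The class boundaries are not given in the paper; the 45-degree sectors centred on each compass direction and the use of a negative value for flat cells follow the common ArcGIS convention and are assumptions here, as is the function name `aspect_class`.

```python
def aspect_class(aspect_deg):
    """Map an aspect angle (degrees clockwise from north) to one of nine classes.

    Negative values are treated as flat cells (assumed convention, e.g. -1).
    """
    if aspect_deg < 0:
        return "Flat"
    names = ["North", "Northeast", "East", "Southeast",
             "South", "Southwest", "West", "Northwest"]
    # Shift by half a sector (22.5 degrees) so each 45-degree sector
    # is centred on its compass direction, then wrap around at 360.
    index = int(((aspect_deg % 360) + 22.5) // 45) % 8
    return names[index]


for angle in (-1, 10, 100, 200, 350):
    print(angle, aspect_class(angle))
```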

Curvature
Curvature is defined as the rate of change of slope gradient in a particular direction; it affects the hydraulic conditions and the action of gravity on the slope. Curvature describes whether a slope is convex, concave or flat. The curvature of the study area was derived from the DEM in ArcGIS and classified into three classes: convex, concave and flat (Figure 5c).

Distance from Stream
The probability of landslide occurrence increases as distance to the stream decreases (Cellek 2019).

Lithology
Lithological variations often lead to differences in the strength and permeability of rocks and play a significant role in landslide occurrence (Sarkar et al. 2013). In the current study, the lithological map of the study area (Figure 5f) was prepared from the existing regional geological map and a detailed field survey. The study area contains six lithological units: Quaternary superficial sediment, alluvial deposits, residual soil, basalt, limestone and sandstone.

Land use
Land cover is also one of the key factors responsible for landslide occurrence, since barren slopes are more prone to landslides (Gomes et al. 2005). The land use map of the study area (Figure 5g) was prepared from Google Earth image interpretation, and the relationship between this factor and landslides was analyzed using ArcGIS tools. Seven land use types were identified: dense forest, moderate forest, sparse forest, shrubs, bare land, agricultural land and settlement.

Rainfall
Rainfall plays an important role in landslide initiation by reducing shear strength and increasing pore pressure (Yalcin 2007). Data from five rain gage stations surrounding the study area (Kachisi, Yejube, Alge, Hareto and Fincha) were obtained from the Ethiopian National Meteorological Agency. The station data were interpolated using the IDW technique, which estimates rainfall at an unsampled location from the known values at adjacent stations.
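The idea behind IDW interpolation can be sketched in a few lines of Python. This is not the ArcGIS implementation; the power of 2 is an assumed setting because the paper does not report it, and the station coordinates and rainfall values below are hypothetical placeholders rather than data from the five stations.

```python
import math


def idw(x, y, stations, power=2.0):
    """Inverse-distance-weighted estimate at (x, y).

    stations: list of (station_x, station_y, value) tuples.
    power: distance exponent (2.0 is an assumed setting, not taken from the paper).
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for sx, sy, value in stations:
        distance = math.hypot(x - sx, y - sy)
        if distance == 0.0:
            return value  # the point coincides with a station
        weight = 1.0 / distance ** power
        weighted_sum += weight * value
        weight_total += weight
    return weighted_sum / weight_total


# Hypothetical stations: (easting in km, northing in km, mean annual rainfall in mm).
example_stations = [(0.0, 0.0, 1600.0), (10.0, 0.0, 1700.0), (5.0, 8.0, 1650.0)]
print(idw(4.0, 3.0, example_stations))
```

Nearby stations dominate the estimate because each weight falls off with the chosen power of the distance.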

Information Value Method
The information value (IV) model is a bivariate statistical method derived from information theory; it was developed by Yin and Yan (1988) and slightly modified by van Westen (1993). The objective of the model is to find the combination of significant factors by determining the probability of a landslide event from the available information. The method is useful for determining the degree of influence of the individual causative factors responsible for landslide occurrence (Kanungo et al. 2009). The information value of each factor class can be calculated as the natural logarithm of the ratio of the conditional probability to the prior probability:

$$I(H, X_i) = \ln\left(\frac{\text{conditional probability}}{\text{prior probability}}\right)$$

where the conditional probability is the ratio of landslide pixels in a class to the pixels of that class, and the prior probability is the ratio of the total number of landslide pixels to the total number of pixels in the study area. In practice, the information value of each factor class is calculated as:

$$I(H, X_i) = \ln\left(\frac{N_{pix}(S_i) / N_{pix}(N_i)}{\sum N_{pix}(S_i) / \sum N_{pix}(N_i)}\right)$$

where $I(H, X_i)$ is the information value of a factor class; $N_{pix}(S_i)$ is the number of landslide pixels within class i; $N_{pix}(N_i)$ is the number of pixels within class i; $\sum N_{pix}(S_i)$ is the number of landslide pixels within the entire study area; and $\sum N_{pix}(N_i)$ is the number of pixels within the entire study area. The landslide susceptibility index (LSI) of each pixel was then computed by summing the information values of the factor classes present at that pixel, one per causative factor:

$$LSI = \sum I(H, X_i)$$

When LSI < 0, the likelihood of a landslide is less than average; when LSI = 0, it is average; and when LSI > 0, it is greater than average. In other words, the greater the information value, the greater the possibility of a landslide.
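The per-class calculation described above can be sketched in Python as follows. Taking the natural logarithm of the ratio of conditional to prior probability is the usual form of the method and is assumed here; the pixel counts are hypothetical and are not taken from Table 1. Classes without landslide pixels are returned as `None` because the logarithm of zero is undefined, and how such classes are handled is a modelling choice that the paper does not describe.

```python
import math


def information_values(class_pixels, landslide_pixels):
    """Information value per factor class.

    class_pixels: dict mapping class name -> number of pixels in the class.
    landslide_pixels: dict mapping class name -> number of landslide pixels in the class.
    """
    total_pixels = sum(class_pixels.values())
    total_landslide = sum(landslide_pixels.values())
    prior = total_landslide / total_pixels  # prior probability
    values = {}
    for name, n_class in class_pixels.items():
        n_slide = landslide_pixels.get(name, 0)
        if n_slide == 0:
            values[name] = None  # ln(0) is undefined; left to the analyst
        else:
            conditional = n_slide / n_class  # conditional probability
            values[name] = math.log(conditional / prior)
    return values


# Hypothetical slope classes (degrees) with made-up pixel counts.
slope_pixels = {"0-5": 4000, "5-15": 3000, ">15": 3000}
slope_landslides = {"0-5": 40, "5-15": 150, ">15": 310}
iv = information_values(slope_pixels, slope_landslides)
print(iv)
```

In this toy example the "5-15" class has the same landslide density as the whole area, so its information value is zero, while the steepest class receives a positive value.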

Frequency Ratio Method
The frequency ratio approach is based on the observed relationships between the distribution of landslides and each landslide-related factor, and is used to correlate landslide locations with the factors in the study area (Lee and Pradhan 2007). To calculate the frequency ratio, the ratio of landslide occurrence to non-occurrence was calculated for each class or type of each factor, and the ratio of the area of each class or type to the total area was also calculated.

$$\text{Frequency Ratio} = \frac{\text{Landslide pixels in each class} \,/\, \text{Landslide pixels in the whole area}}{\text{Area pixels in each class} \,/\, \text{Area pixels in the whole area}} \qquad (7)$$

The FR value represents the degree of the correlation between landslides and the concerned class of the conditioning factors. A value of 1 means an average value. If the value is > 1, there is a high correlation, and FR < 1 means a lower correlation. Then, the landslide susceptibility index was calculated using

$$LSI = \sum FR = FR_1 + FR_2 + FR_3 + \dots + FR_8$$

The higher the LSI value, the greater will be the risk (Huang et al. 2015).
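A minimal Python sketch of the frequency ratio and the resulting LSI is given below. The function names, the two example factors and all pixel counts are hypothetical; the sketch only mirrors the ratio and the sum of FR values described above.

```python
def frequency_ratios(class_pixels, landslide_pixels):
    """Frequency ratio per factor class.

    FR = (landslide pixels in class / all landslide pixels)
         / (pixels in class / all pixels)
    """
    total_pixels = sum(class_pixels.values())
    total_landslide = sum(landslide_pixels.values())
    ratios = {}
    for name, n_class in class_pixels.items():
        landslide_share = landslide_pixels.get(name, 0) / total_landslide
        area_share = n_class / total_pixels
        ratios[name] = landslide_share / area_share
    return ratios


def lsi_for_pixel(fr_tables, pixel_classes):
    """Sum the FR values of the classes a pixel falls in, one per factor."""
    return sum(fr_tables[factor][cls] for factor, cls in pixel_classes.items())


# Hypothetical counts for two of the eight factors.
fr_tables = {
    "slope": frequency_ratios({"0-5": 4000, "5-15": 3000, ">15": 3000},
                              {"0-5": 40, "5-15": 150, ">15": 310}),
    "curvature": frequency_ratios({"concave": 4000, "flat": 2000, "convex": 4000},
                                  {"concave": 250, "flat": 50, "convex": 200}),
}
print(lsi_for_pixel(fr_tables, {"slope": ">15", "curvature": "concave"}))
```

With all eight factors in `fr_tables`, the same sum gives the LSI used for mapping.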

Relationship between Landslide Occurrence and Causative Factors
To evaluate the contribution of each factor to landslide susceptibility, the landslide distribution layer was compared with each thematic data layer separately. The number of landslide pixels falling in each class of the thematic data layers was recorded, and weights were calculated using both the information value and frequency ratio methods (Tables 1 and 2).

Landslide susceptibility mapping

Information Value Method
The calculated information values of each parameter class were converted into raster maps in ArcGIS, and the landslide susceptibility index was then computed with the raster calculator by summing, for each pixel, the information values of all parameters:

$$LSI = IV_{Slo} + IV_{Asp} + IV_{Curv} + IV_{Dis\text{-}Str} + IV_{Dis\text{-}Lin} + IV_{Litho} + IV_{Rain} + IV_{LU}$$

where $IV_{Slo}$, $IV_{Asp}$, $IV_{Curv}$, $IV_{Dis\text{-}Str}$, $IV_{Dis\text{-}Lin}$, $IV_{Litho}$, $IV_{Rain}$ and $IV_{LU}$ are the information values for slope, aspect, curvature, distance from stream, distance from lineament, lithology, rainfall and land use, respectively.
When LSI < 0, the likelihood of a landslide is less than average; when LSI = 0, it is equal to the average; and when LSI > 0, it is greater than average. A greater information value therefore means a higher probability of landslide occurrence.
For the IV model, the final LSI values of the study area range from -5.29 to 2.67 and were classified into five classes using the natural breaks method: very low (-5.29 to -3.02), low (-3.02 to -1.62), moderate (-1.62 to -0.5), high (-0.5 to 0.53) and very high (0.53 to 2.67). The LSI, landslide percentage, landslide density and area coverage of each susceptibility class are shown in Table 3.
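Once the class limits are known, assigning a pixel to a susceptibility class is a simple lookup. The sketch below uses the IV break values reported above; treating each break as an inclusive upper bound is an assumption about the reclassification, and the names `IV_BREAKS` and `susceptibility_class` are hypothetical. The sketch does not compute natural breaks itself.

```python
from bisect import bisect_left

# Upper limits of the first four classes for the IV model, as reported in the text.
IV_BREAKS = [-3.02, -1.62, -0.5, 0.53]
CLASS_NAMES = ["very low", "low", "moderate", "high", "very high"]


def susceptibility_class(lsi, breaks=IV_BREAKS):
    """Return the susceptibility class of an LSI value.

    Each break is treated as an inclusive upper bound (an assumption).
    """
    return CLASS_NAMES[bisect_left(breaks, lsi)]


for value in (-4.0, -2.0, -1.0, 0.0, 1.0):
    print(value, susceptibility_class(value))
```

The same function can be reused for another model by passing that model's break values through the `breaks` argument.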

Frequency Ratio Method
The fundamental concept of this method is to calculate the ratio between the share of landslides falling in a given class and the share of the study area occupied by that class (Lee and Talib 2005). In the present study, the landslide factors were converted into raster maps with a pixel size of 30 × 30 m in ArcGIS 10.4. The spatial relationship between landslide locations and each factor was analyzed, the number of landslide pixels in each class was counted, and the frequency ratio of each factor class was calculated. The frequency ratio rasters of the factors were then summed with the raster calculator in ArcGIS to obtain the landslide susceptibility index (LSI), i.e. LSI = FR1 + FR2 + ... + FR8 for the eight factors.
A higher LSI means a higher susceptibility to landslides, while a lower one indicates a lower susceptibility (Bui et al. 2012). LSI values range from 2.19 to 20.01 in the FR model. Using the reclassify function and the natural breaks method, the LSI map was reclassified into five classes: very low (2.19 to 5.74), low (5.74 to 7.97), moderate (7.97 to 10.33), high (10.33 to 13.26) and very high (13.26 to 20.01). The LSI, landslide percentage, landslide density and area coverage of each susceptibility class are shown in Table 4.

Verification of the susceptibility maps

Area under the Curve (AUC)
The area under the curve (AUC) measures the accuracy of the landslide susceptibility maps and is obtained from success and prediction rate curves. The resulting area under the curve indicates the probability that more pixels were correctly labeled than incorrectly labeled (Ghorbanzadeh et al. 2018). Greater AUC values therefore indicate a more accurate susceptibility map. The success rate assesses how many of the landslide sites used to build the model are captured by the susceptibility map and thus measures model efficiency (Neuhäuser 2012). The predictive rate is the percentage of independent landslides captured by the susceptibility map, and therefore indicates how many "unknown" landslides could be "predicted" (Neuhäuser 2012). In the ROC method, AUC values ranging from 0.5 to 1.0 are used to evaluate the accuracy of the model. An AUC close to 1 suggests higher model reliability, while a value near or below 0.5 suggests that the model is invalid (Chung and Fabbri 2003).
In this study, to obtain the success and predictive rates, the calculated susceptibility index values were sorted in descending order, divided into 100 classes and combined with the training and validation landslide rasters. The success rate was obtained by comparing the training landslide pixels (80%) with the landslide susceptibility maps of both models, and the predictive rate was obtained by comparing the maps with the validation landslide pixels (20%), which were not used to build the models. The AUC values of the success and predictive rate curves are 0.836 and 0.817 for the IV model and 0.835 and 0.818 for the FR model, respectively (Figure 6). These values lie between 0.8 and 0.9, indicating good performance of both models.
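One way to sketch the rate-curve procedure described above is shown below: pixels are ranked by descending LSI, grouped into equal-sized classes, and the cumulative share of landslide pixels is plotted against the cumulative share of area, with the area obtained by the trapezoidal rule. This is an illustrative reading of the procedure, not the authors' workflow; the function name `rate_curve_auc` and the toy data are hypothetical. Passing training landslides gives a success rate, and passing validation landslides gives a predictive rate.

```python
def rate_curve_auc(lsi_values, is_landslide, n_classes=100):
    """Area under the cumulative landslide-versus-area curve.

    lsi_values: LSI of each pixel.
    is_landslide: 1 if the pixel is a landslide pixel, else 0 (same order).
    """
    # Rank pixel indices from highest to lowest susceptibility.
    order = sorted(range(len(lsi_values)), key=lambda i: lsi_values[i], reverse=True)
    n = len(order)
    total_landslides = sum(is_landslide)
    xs, ys = [0.0], [0.0]
    captured = 0
    start = 0
    for k in range(1, n_classes + 1):
        end = round(k * n / n_classes)  # end of the k-th equal-area class
        captured += sum(is_landslide[i] for i in order[start:end])
        xs.append(end / n)
        ys.append(captured / total_landslides)
        start = end
    # Trapezoidal rule over the cumulative curve.
    return sum((xs[j] - xs[j - 1]) * (ys[j] + ys[j - 1]) / 2 for j in range(1, len(xs)))


# Toy example: ten pixels, the two most susceptible ones are landslides.
toy_lsi = [10, 9, 8, 7, 6, 5, 4, 3, 2, 1]
toy_landslide = [1, 1, 0, 0, 0, 0, 0, 0, 0, 0]
print(rate_curve_auc(toy_lsi, toy_landslide, n_classes=10))
```

A map that ranks pixels no better than chance gives a value near 0.5, which matches the interpretation of AUC values given above.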

Landslide Density Index (LDI)
In this study, the LDI values were obtained by combining the training and validation landslides with the landslide susceptibility classes and were calculated using equation 10. A higher LDI in the high and very high susceptibility classes further confirms that a model is reliable and accurate (Fazye et al. 2018). The LDI values for both models increased from the very low to the very high susceptibility class, as presented in Table 5.

Conclusion
The main aim of this research was to apply, test and compare bivariate statistical models capable of describing the relationship between landslides and a number of causative factors, in order to create reliable susceptibility maps for land use planning and management. To achieve this objective, two bivariate statistical methods, the information value and frequency ratio models, were used. A total of 580 landslides were identified from Google Earth image interpretation and field survey. Of these, 80% (464 landslides) were used as training landslides to build the models, while the remaining 20% (116 landslides) were used as testing landslides to evaluate model performance. Eight predisposing factors (slope, aspect, curvature, distance from stream, distance from lineament, lithology, annual rainfall and land use) were used to analyze and evaluate the spatial relationship between these factors and landslides. The resulting LSMs from both models were subdivided into five susceptibility classes. Positive information values and frequency ratios greater than 1 were found for slopes greater than 15°, for concave and convex curvature, and for southeast-, south-, southwest-, west- and northwest-facing slopes. For distance from stream, the five factor classes between 0 and 300 m, and for distance from lineaments, the six factor classes between 0 and 800 m, showed the highest probability of landslide occurrence. The lithological units basalt, alluvial soil, limestone and sandstone, the two rainfall classes between 1621 and 1670 mm/year, and the land use classes bare land, sparse forest and moderate forest had a high probability of landslide occurrence.
The LSI maps of the study area were prepared with the IV and FR methods in ArcGIS 10.4 using the raster calculator in the Spatial Analyst tools and were reclassified into five susceptibility classes (very low, low, moderate, high and very high) using the natural breaks method to produce the final landslide susceptibility maps. Finally, the models were validated using the area under the ROC curve and the Landslide Density Index (LDI). The fitting performance of the models was assessed using the same landslide data used to build them (training, 80%), and their prediction capability was assessed using independent landslide data not used in model construction (validation, 20%). The information value and frequency ratio models had success rates of 0.836 and 0.835 and predictive rates of 0.817 and 0.818, respectively. These AUC values lie between 0.8 and 0.9, indicating good performance of both models.
In general, for a regional-scale map (1:50,000) of the study area, the bivariate statistical models (information value and frequency ratio) were found to be reliable. In addition, the procedures for producing the LSMs were relatively simple and cost-effective. The findings of this study can help land use planners, geologists and civil engineers to identify areas that are susceptible to landslides. Since the results are regional in scale, the LSMs may be less useful for site-specific development that requires large-scale maps.

Recommendation
Increasing the vegetation cover of the area by planting trees, controlling the drainage system, applying preventive measures such as retaining walls and gabion walls, and raising public awareness of landslide hazards can reduce the risk.

Availability of data and materials
Rainfall data were collected from the National Meteorological Agency of Ethiopia. The topographic map was purchased from the Ethiopian Geospatial Information Agency. DEM data are freely available from http://gdex.cr.usgs.gov/gdex/.

Competing Interests
The authors declare that they have no competing interests.

Funding
No funding was received from any governmental or non-governmental organization, as the first author was a self-sponsored MSc student.

Authors' contributions
AG, as first author, participated in the whole process of this research, including the field work, data collection, database preparation and compiling the results.
MM, as advisor, participated from the inception of the research and commented on each phase for further improvement. AG addressed the advisor's comments on scientific justification, methodology and English before submission, and both authors approved the submission to this journal.