Frequency Ratio Density, Logistic Regression and Weights of Evidence Modelling for Landslide Susceptibility Assessment and Mapping in Yanase and Naka Catchments of Southeast Shikoku, Japan

Landslide susceptibility mapping is an important tool for disaster management and for development activities such as the planning of transportation infrastructure, settlement and agriculture. Shikoku Island, in the southwest of Japan, is one of the most landslide-prone areas because of heavy typhoon rainfall, complex geology and the presence of mountainous areas and low topographic features (valleys). The Yanase and Naka catchments of Shikoku Island were chosen as the study area. Frequency Ratio Density (FRD), Logistic Regression (LR) and Weights of Evidence (WoE) models were applied in a GIS environment to prepare landslide susceptibility maps of this area. Data layers including slope, aspect, profile curvature, plan curvature, lithology, land use, distance from river, distance from fault and annual rainfall were used. For the FR method, two models were attempted, and the FRD model performed slightly better. For the LR method, two models were run, one with equal and the other with unequal proportions of landslide and non-landslide points, and the one with equal proportions was chosen because it performed best. In total, five landslide susceptibility maps (LSMs) were produced: two with FR, two with LR and one with WoE. The best model from each of the FR and LR methods was then selected on the basis of the highest area under the curve (AUC) of the receiver operating characteristic (ROC) curves. This reduced the number of landslide susceptibility maps to three, with success rates of 86.7%, 86.8% and 80.7% for the FRD, LR and WoE models respectively. For validation, all landslides were overlaid on the three landslide susceptibility maps and the percentage of landslides in each susceptibility class was calculated. The high and very high susceptibility classes of the FRD, LR and WoE models contained 82%, 84% and 78% of the landslides respectively. This showed that the LR model with equal proportions of landslide and non-landslide points is slightly better than the FRD and WoE models in predicting the future probability of landslide occurrence.


Introduction
The Japanese archipelago is situated within one of the most active tectonic belts, where four major plates interact: the Pacific Plate, the Eurasian Plate, the Philippine Sea Plate and the North American Plate. Hence, most of the Japanese landmass is highly susceptible to natural disasters such as landslides, debris flows and mudflows due to steep mountains, weak geology and severe weather conditions (Hong et al. 2005). More than 80% of Shikoku Island consists of steep mountain slopes, with a few plains along the coast and elevated peaks in the central part. It is a heavily forested mountainous region of Japan with a mean annual precipitation ranging from 1,000 mm to 3,500 mm (Dahal et al. 2008a). The presence of dense population at the bases of mountains exacerbates the landslide problem.
Owing to the geological and morphological settings, landslides and floods caused by typhoon rainfall are frequent. In 2004, Shikoku experienced extreme typhoon rainfall events and suffered huge losses of life and property. From 1951 to 2005, there were 1468 typhoon events in the northern part of the Pacific Ocean, 163 of which hit the Japanese archipelago. In 2004, Japan was hit by 10 typhoons, the highest annual number within the last 55 years (Dahal et al. 2008a). The isohyetal map of Shikoku shows that the island usually receives more extensive typhoon rainfall in its southern part than in its northern part.
Landslide susceptibility refers to a quantitative or qualitative assessment of landslides which exist or may potentially occur in an area without taking into account the time of landsliding (Fell et al. 2008). Susceptibility, hazard and risk maps are important tools for engineers, earth scientists, planners and decision makers to select appropriate sites for agriculture, construction and other developmental activities (Ercanoglu and Gokceoglu 2002). They also play an important role in efforts to mitigate or prevent the disaster in landslide prone areas by providing preliminary information to decision makers.
Landslide susceptibility methods can be categorized into heuristic (Ruff and Czurda 2008), deterministic (Gerscovich et al. 2006), combination of statistical and deterministic (Yilmaz and Keskin 2009) and statistical methods.
In heuristic methods, field observation and expert knowledge are used to identify landslides, make a priori assumptions about past and future landslide movements at the site, and develop decision rules or assign weighted values to the classes of index maps, which are then overlaid to produce a landslide susceptibility map. Deterministic methods consider the slope angle, the strength of the slope material (cohesion and internal angle of friction), structure (rock discontinuities, rock and soil stratification), moisture content, depth of the groundwater table and pore-water pressure of the slope material in a physical model equation to determine the factor of safety (Regmi et al. 2010a). A significant limitation of deterministic models is the need for geotechnical data (cohesion, internal angle of friction, depth to groundwater table, degree of saturation, etc.), which are difficult to obtain over large areas (Terlien et al. 1995). Moreover, they do not take into account climatic and human-induced factors or the spatial distribution, temporal frequency and magnitude of landslides.
These methods only describe the stability of the slope at the time of data collection (Regmi et al. 2010a) and may work well for site-specific conditions. The statistical approach, which predicts future landslides based on past landslides (Ohlmacher and Davis 2003), can on the other hand be applied when geotechnical data are limited. Studies that compared landslide susceptibility results from statistical and deterministic models have shown that the former give better results than the latter (Cervi et al. 2010; Yilmaz and Keskin 2009). The current study used three statistical models, namely frequency ratio, logistic regression and weights of evidence, to prepare the landslide susceptibility maps of the study area, and the best model was selected. For each of the first two methods, two alternative models were attempted in order to choose the better one, while for the third method only one model was run. The best outputs of the first two methods were then compared with that of the weights of evidence model in order to choose the best predictive model among the three approaches.

The Study Area
The study area is located in Tokushima and Kochi prefectures of Shikoku Island, southwest Japan, and covers an area of 599.724 km² in topographically rugged and mountainous terrain with elevations between 205 m and 1950 m above sea level (Fig. 1). It lies in the southeastern part of Shikoku Island, within the Yanase-Tsurugisan-Kito rainfall belt where rainfall is at its maximum (Figs. 2 and 3).

Effect of Rainfall on Landslide Occurrence in Shikoku
Landslides triggered by rainfall occur in most mountainous landscapes of the world (Dahal et al. 2008a). In Japan, precipitation occurs mostly during typhoon seasons on the Pacific Ocean side and in winter (heavy snow) on the Japan Sea side. Many types of landslides occur after heavy rainfall in tropical and temperate climatic zones. Shikoku Island has a humid climate with a mean annual precipitation of more than 2000 mm. Precipitation is largely brought by seasonal frontal systems and typhoons that generate intense rainfall, after which landslide disasters such as slope failures and debris flows frequently occur (Hong et al. 2005). Seasonal and torrential rainfall is the most important meteorological factor that can accelerate the movement of the sliding mass, with a marked increase in landslide displacement (Hong et al. 2005; Wang et al. 2010).
The climatic conditions of Shikoku Island give rise to seasonal, high-intensity rainfall that has caused landslide reactivation, leading to deformation and collapse of buildings, roads and debris-slide control works in landslide-prone areas. This active triggering factor affects the stability of crystalline schists in Shikoku Island, thereby promoting landslide movement (Hong et al. 2005). A huge number of landslides were also triggered in Tokushima Prefecture by the typhoon event that hit Shikoku from July 30 to August 2, 2004, including some catastrophic landslides in Kisawa village that caused the loss of two lives and damage to roads. The daily precipitation of 1317 mm during this event surpassed the historical daily rainfall record of 1114 mm in Japan. Total precipitation for the storm event exceeded 1500 mm, and landslides occurred within a narrow area in Kisawa Village and Kaminaka town (Wang et al. 2006; Wang et al. 2010).
[Figure caption fragment: ... (1985-2014) showing (a) monthly rainfall and (b) annual rainfall of the study area.]

Effect of Geology on Landslide Occurrence in Shikoku
From north to south, Shikoku Island can be roughly divided into three geological zones: Ryoke, Sambagawa-Chichibu and Shimanto belts. The three zones are bounded by two northerly dipping major faults, the Median Tectonic Line (MTL) and the Butsuzo Tectonic Line (BTL) from north to south respectively (Dahal et al. 2008a).
Four major tectonic lines cut through the central part of the island. These are the Median Tectonic Line, the Mikabu Tectonic Line, the Butsuzo Tectonic Line and the Aki-Sukumo Tectonic Line, which divide Shikoku Island into five major geological belts: the Sanbagawa belt, the Chichibu belt, the Shimanto-north belt, the Shimanto-south belt and the Ryoke belt (Hong et al. 2005). Many landslides have been found along the southern side of the Median Tectonic Line. The geology of these landslide sites mainly comprises crystalline schists including pelitic schist, green schist, psammitic schist and siliceous schist (Hong et al. 2005). The areas where the landslides occurred are characterized by deep river valleys with steep slopes, and many of the mountain slopes have steep chutes.
Most of the settlements are located on gentle slopes formed by past landslides or on narrow streamside terraces.
According to the geological map of Tokushima Prefecture, the area is mainly underlain by Paleozoic greenstone, Paleozoic and Mesozoic pelite and greywacke and serpentinite of the Mesozoic Kurosegawa terrane as well as limestone and chert (Wang et al. 2006). However, the basal sliding surface occurred mainly within weathered serpentinite.
The geology of Niihama on the other hand consists of green schist of the Sambagawa Belt in the south and sandstone and shale of the Izumi Group in the north. This geological belt is primarily made of sedimentary deposits of sandstone with frequent intercalation of shale (Dahal et al. 2008a).

Landslide Sites and Mechanisms in Shikoku
Many landslides were triggered by heavy rainfall (storm); the most catastrophic among them were five giant landslides in Kisawa village of the Nakagawa District in Tokushima Prefecture, namely the Oyochi, Kashu, Azue, Kamagatani and Shiraishi landslides. These landslides destroyed houses, forests and farms and damaged roads. Two people were buried inside the Oyochi landslide and their bodies were never recovered. A catastrophic failure that occurred in the upper part of the Furon Valley in Shiraishi district destroyed more than 10 houses. Its basal sliding surface occurred within weathered serpentinite. After the typhoon, the landslide was still actively moving, endangering the downstream residents.
Hong et al. (2005) investigated four crystalline schist landslides known as Zentoku, Kashio, Nishi-Igawa and Tsubayama in the northwest of Tokushima prefecture that have been monitored for many years because of their continuous threat to the lives and properties of the local community. These landslides are classified as translational movements in weathered rock with deep-seated, continuous and creeping movement. The Zentoku landslide, located at Nishiiya village in Tokushima prefecture, is a large-scale crystalline schist landslide. The sliding mass is mainly composed of weathered and fractured green schist and pelitic schist. The Kashio landslide is located at Higashiiya village in Tokushima prefecture. Since 1988, the Japan Forestry Agency has systematically surveyed this large-scale landslide with the purpose of stabilizing the slope. The sliding mass is pelitic schist and the slip layer was found at depths varying from 20 to 50 m. The Nishiigawa landslide is situated at the northern edge of the Sanbagawa belt and is very close to the Median Tectonic Line. It is smaller than the previous two slides and occupies about 6 ha, with a length of 300 m and a width of 200 m. The sliding mass comprises pelitic schist and siliceous schist. The landslide movement became apparent in 1973 due to an excavation at the toe of the slope. The Tsubayama landslide, which is located near the town of Ikegawa in Kochi prefecture, was identified by the Ministry of Construction of Japan in 1980 and became active in 1999 (Hong et al. 2005). The landslide is 500 m in length and 340 m in width, and the sliding mass mainly consists of psammitic and pelitic schist.
In Kisawa village and its environs in Tokushima prefecture, five landslides were triggered on August 1, 2004 following heavy typhoon rainfall. These include the Oyochi, Kashu, Azue, Kamagatani and Shiraishi landslides (Wang et al. 2006). The Oyochi and Kashu landslides occurred on the same side of a ridge. In the Oyochi area, the bedrock is mainly composed of greenstone and serpentinite with visible cracks. The Kashu landslide occurred on the same slope near the Oyochi landslide and at almost the same time. The Azue landslide is located on the left side of the Sakashu-Kito River, in front of the Kashu landslide. A remarkable scar indicating an ancient landslide was also visible at this site (Wang et al. 2006). The displaced material slid down the slope, crossed the Sakashu-Kito River and rose up the opposite mountain slope to a height of about 30 m (immediately below the Fudono area), destroying and carrying away the Fudono bridge, which was built on the national road. The main failure of the Azue landslide occurred at approximately 23:00 on August 1, a little later than the Oyochi and Kashu landslides. Afterwards, loud crushing noises continued for several hours, probably due to retrogressive failures (Wang et al. 2006). The unstable landslide mass was estimated to be of the order of 10⁶ m³. This landslide mass was still deforming, and the displacements measured by the installed extensometers showed that its movement was very sensitive to rainfall. The Kamagatani landslide is located on the true right slope of the valleys of the Kamagatani River and the upper tributary of the Sakashu-Kito River (Wang et al. 2006). The source area consists of mudstone, which is overlain by sandstone and ancient colluvial deposits. The Shiraishi area was designated as a landslide prevention area in 1962 on the basis of Japanese landslide prevention measures, and the mountain stream was also designated as a debris-flow-risk stream (Wang et al. 2006).
Dahal et al. (2008a) provided a detailed account of the landslide disasters associated with the 2004 typhoon events in Tokushima, Kagawa, Ehime and Kochi prefectures of Shikoku Island. According to this account, from late July to early August, typhoon 10 brought heavy rainfall of more than 2000 mm to the southern part of Shikoku. This rainfall created four huge landslides around Kisawa village of the Nakagawa district in Tokushima prefecture, occurring mainly in the Oyochi, Kashu, Azue and Shiraishi areas (Dahal et al. 2008a).
In 2004, typhoons 4, 6, 10, 11, 15, 18, and 23 hit Kochi Prefecture. Rainfall caused by typhoon 15 (Megi) induced many landslides in the Yoshino River basin of Shikoku on 17-18 August. Reihoku District in Kochi Prefecture was greatly affected by typhoon 15. Okawa and Uwezugawa villages were severely damaged. Many landslides also occurred along the roadside slopes. As a result, Okawa was isolated from the other parts of the prefecture (Dahal et al. 2008a).
The Okawa area mainly consists of crystalline green schist of the Sambagawa Belt, which includes pelitic schist, psammitic schist, and siliceous schist. There are thin to thick deposits of Quaternary colluvium on the mountain slopes of the Yoshino River Valley. The crystalline schist is well known for landslide hazard, and the sliding mass mainly consists of weathered and jointed schist (Hong et al. 2005; Dahal et al. 2008a).
Because of the thick forest in unpopulated hills, many landslides were only recognized long after the event. Researchers were still observing unreported small to medium-sized landslides triggered by the typhoon rainfall of 2004 in the forest of central Shikoku Island. Nevertheless, the International Sabo Association of Japan stated that nearly 600 types of slope failures occurred in Shikoku during the 2004 typhoon events (Dahal et al. 2008a). Field observations of more than 250 slides in all prefectures indicated that both sliding and flowing are critical to the failure process. Various studies have established the nomenclature and classification of landslides in Shikoku, using different terms for such failures, such as debris flows, debris avalanches, soil slips, debris slides, flow-like landslides, slide-flows and so forth (Cruden and Varnes 1996; Dahal et al. 2008a).
In a general sense, landslides in Shikoku Island after the 2004 typhoon events can be divided into translational slides, rotational slides, and a combination of both on the basis of the shape of the failure surface. Translational slides were found to be the predominant failure mode of debris slides (Dahal et al. 2008a). Many of the landslides (e.g., Moriyuki, Monnyu, Toyohama, Okawa, and Kisawa) were translational debris slides that occurred first on a steep zero-order valley or concave slope, after which the debris materials ran down through a first-order stream channel (Dahal et al. 2008a). The flow continued to erode its route and either piled up huge debris at the mouth of the stream or continued traveling through the second-order stream for a considerable distance on the sloping terrain (Dahal et al. 2008a). Shallow failures are usually triggered by comparatively short, intense storms, whereas most of the deep-seated landslides were affected by long-term variation of annual and daily rainfall (Hong et al. 2005).

Landslide Inventory
Landslide inventory maps document the extent of landslide phenomena in a region, and show information that can be exploited to investigate the distribution, types, pattern, recurrence and statistics of slope failures, to determine landslide susceptibility, hazard, vulnerability and risk, and to study the evolution of landscapes dominated by mass-wasting processes (Guzzetti et al. 2012). High-quality landslide inventory data, where quality depends on the accuracy, type and certainty of the information shown in the maps, have a positive effect on the quality of landslide susceptibility, hazard and risk assessments (Guzzetti et al. 2012).
The landslide inventory data in the study area were digitized from Google Earth images (Figure 5) and then converted into a GIS-compatible format using the KML to layer conversion tool. These landslides were then checked against the georeferenced topographic maps to confirm their locations.

Landslide Factor Maps
The landslide factors considered in this study include slope, aspect, profile curvature, plan curvature, lithology, land use, distance from fault, distance from river and annual rainfall. Generally, the steeper the slope, the higher the landslide density, provided that the rock strength is low and the land use ranges from barren land to slightly vegetated bushes and shrubs. Slope orientation (aspect) affects the exposure to sunlight and to winds, which in turn indirectly affects other factors that contribute to landslides such as precipitation, soil moisture, vegetation cover and soil thickness (Clerici et al. 2006). Curvature is the rate of change of slope gradient or aspect in a particular direction (Wilson and Gallant 2000). Curvature controls the hydrological conditions of the soil cover. Curvature is generally divided into plan and profile (Ohlmacher 2007). Profile and plan curvatures in particular affect the susceptibility to landslides. Profile curvature affects the driving and resisting stresses within a landslide in the direction of motion, while plan curvature controls the convergence or divergence of landslide material and water in the direction of landslide motion (Carson and Kirkby 1972). Curvature can be positive or convex (indicating peaks), negative or concave (indicating valleys) or zero (indicating a flat surface or a saddle) (Pradhan 2010; Alkhasawneh et al. 2013).
The land use classes of the study area, which were digitized from Google Earth imagery, include barren land, dense forest, reservoir, river and settlement. The land use pattern of the area has been changing due to the cutting of trees for timber at selected sites.
The presence of faults also contributes significantly to landslide occurrence in the study area. As can be seen from Table 1, most of the landslides are concentrated in close proximity to faults.
River dynamics can erode and incise areas along the channel and river banks. Hence landslides are usually more frequent close to rivers, and in general the probability of landslide occurrence decreases as the distance from the river increases. However, in the current study area, landslides are more frequent in the intermediate river distance classes between 400 and 1500 m (Table 1). This may be due to the presence of dense vegetation and competent rocks along the river courses.
The rain gauge stations inside and in the vicinity of the study area show spatial variability in annual and monthly rainfall patterns (Figure 3a and b), with the maximum rainfall recorded during the months of June, July, August and September.
The thirty-year average annual rainfall records of rain gauge stations on Shikoku Island from the Japan Meteorological Agency (http://www.data.jma.go.jp/gmd/risk/obsdl/index.php) were used to interpolate the rainfall isohyets (Figures 2 and 6i).
[Figure 6: Landslide factor maps, panels (e) to (i); panel (i) shows annual rainfall.]

Frequency Ratio
In order to get the best result for the frequency ratio method, two approaches were applied (Figure 7 and Table 1). The first uses the frequency ratio value of Lee and Sambath (2006) and the second uses the landslide density method of . Of these two options, the latter was found to be more appropriate as it resulted in the higher area under the curve (AUC) value of 0.867, compared with 0.865 for the former. Hence the landslide density method was also used to prepare the input data for the other methods. The landslide susceptibility index is obtained by summing the factor maps:
$$LSI = Fr_1 + Fr_2 + Fr_3 + \dots + Fr_n$$
where $Fr_1, Fr_2, Fr_3, \dots, Fr_n$ are the frequency ratio or landslide density raster maps of each landslide factor, LSI represents the landslide susceptibility index and n is the number of factors. A higher LSI value means a higher susceptibility to landslides and a lower LSI value means a lower susceptibility (Lee and Sambath 2006; Lee et al. 2007).
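The two class statistics and the summation can be illustrated with a minimal sketch. It assumes the definitions density = landslide pixels / class pixels and frequency ratio = % landslide pixels / % class pixels, and that LSI is the plain sum of the class values found at a pixel. The function names and the pixel counts are hypothetical and are not taken from Table 1.

```python
def class_statistics(landslide_pixels, class_pixels):
    """Return a list of (density, frequency_ratio) tuples, one per class.

    landslide_pixels[i] = number of landslide pixels in class i
    class_pixels[i]     = total number of pixels in class i
    """
    total_landslide = sum(landslide_pixels)
    total_pixels = sum(class_pixels)
    stats = []
    for n_ls, n_cls in zip(landslide_pixels, class_pixels):
        density = n_ls / n_cls                 # landslide density of the class
        pct_landslide = n_ls / total_landslide # share of all landslide pixels
        pct_class = n_cls / total_pixels       # share of the study area
        stats.append((density, pct_landslide / pct_class))
    return stats


def susceptibility_index(values_at_pixel):
    """LSI of one pixel: sum of the class values of all factors."""
    return sum(values_at_pixel)


# Hypothetical counts for one factor with three classes.
slope_stats = class_statistics([10, 30, 60], [5000, 3000, 2000])

# Hypothetical density values of three factors at a single pixel.
lsi = susceptibility_index([0.03, 0.01, 0.002])
```

In this example the third class holds 60% of the landslide pixels on 20% of the area, so its frequency ratio is about 3, while the first class has a ratio of about 0.2.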

Logistic Regression
Logistic regression is the most commonly used multivariate method as (1) it can be used to predict a result measured by a binary variable such as the presence or absence of landslides based on a set of one or more independent variables; (2) it does not require the variables to be normally distributed; (3) the independent variables can be non-linear, continuous, categorical or a combination of both continuous and categorical (Menard 1995; Schicker and Moon 2012; Meten et al. 2015b). It helps to form a multivariate regression analysis between a dependent variable and several independent variables (Lee 2005b; Shahabi et al. 2014). It is useful to predict the presence or absence of a characteristic outcome based on values of a set of predictor variables (Lee and Sambath 2006; Lee et al. 2007; Yilmaz 2009; Yalcin et al. 2011).
The purpose of logistic regression is to find the best-fit model to describe the relationship between a dependent variable and a set of independent variables. The dependent variable is coded as "1" and "0", representing the presence and absence of a landslide respectively (Atkinson and Massari 1998). The dependent variable is the landslide area. These areas are rasterized and converted into a point format in GIS. The points are in turn used to extract the values of the individual frequency ratio density maps for each of the nine landslide factors. The extracted points were exported in dBASE (.dbf) format, opened in the SPSS statistical software and saved in an SPSS-compatible format. The next step is to merge the landslide and non-landslide extracted points of the nine landslide factors separately (Meten et al. 2015b). Using logistic regression, the spatial relationship between landslides and landslide factors was established to determine the coefficient of each independent variable (Akgün and Bulut 2007). After this calculation, the coefficients of the landslide factors were obtained as shown in Table 2. The probability of landslide occurrence is given by Eq. 7:
$$P = \frac{1}{1 + e^{-Z}} \tag{7}$$
where P is the probability of landslide occurrence and Z is the linear combination.
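Logistic regression involves fitting an equation of the following form to the data (Eq. 8):
$$Z = b_0 + b_1 x_1 + b_2 x_2 + \dots + b_n x_n \tag{8}$$
where $b_0$ is the intercept of the model and $b_1, \dots, b_n$ are the coefficients of the independent variables $x_1, \dots, x_n$.
In this study, the landslide susceptibility map with the highest prediction accuracy was selected. The prediction accuracy of the model was determined from the area under the curve (AUC) values of the receiver operating characteristic (ROC) curves.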
For validation, the landslide inventory map was overlaid on the landslide susceptibility map and the percentage of landslides falling in each susceptibility class was calculated. If the majority of the landslides fall in the very high and high susceptibility classes, the landslide susceptibility map is acceptable; if not, the quality of the data used in the analysis needs to be checked (Figure 7).
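The overlay check described above amounts to a tally of susceptibility classes at landslide pixels. The sketch below is one possible way to express it. It assumes the class of each landslide pixel has already been read from the map. The three lower class names and the sample data are hypothetical (the study names only the high and very high classes of its five).

```python
from collections import Counter

# Five susceptibility classes; only "high" and "very high" are named in the text.
CLASSES = ["very low", "low", "moderate", "high", "very high"]


def landslide_share_by_class(classes_at_landslide_pixels):
    """Percentage of landslide pixels that fall in each susceptibility class."""
    counts = Counter(classes_at_landslide_pixels)
    total = len(classes_at_landslide_pixels)
    return {name: 100.0 * counts[name] / total for name in CLASSES}


# Hypothetical classes sampled at ten landslide pixels.
sample = ["very high"] * 5 + ["high"] * 3 + ["moderate"] + ["low"]
shares = landslide_share_by_class(sample)

# Share captured by the two highest classes, the figure reported for each model.
high_and_very_high = shares["high"] + shares["very high"]
```

With this sample, 80% of the landslide pixels fall in the high and very high classes.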

Weights of Evidence
The weights of evidence modelling uses the Bayesian probability approach and was originally designed for mineral potential assessment (Bonham-Carter 1988; Bonham-Carter 1994). This method has also been applied in landslide [...]. A negative weight ($W_i^-$) indicates an absence of the causative factor and its magnitude indicates negative correlation. The difference between the two weights is known as the weight contrast, $C = W_i^+ - W_i^-$, and the magnitude of the contrast reflects the overall spatial association between the causative factor and landslides (Dahal et al. 2008b; Regmi et al. 2010b). If the weight contrast is positive, the factor is favorable to the occurrence of landslides, and if it is negative, the factor is unfavorable. If it is close to zero, the factor shows only a minor relation to landslides. In order to calculate the weights of each landslide factor class for landslide susceptibility mapping, Eqs. (9) and (10) were used:
$$W_i^+ = \ln\left(\frac{npix_1 / (npix_1 + npix_2)}{npix_3 / (npix_3 + npix_4)}\right) \tag{9}$$
$$W_i^- = \ln\left(\frac{npix_2 / (npix_1 + npix_2)}{npix_4 / (npix_3 + npix_4)}\right) \tag{10}$$
where nslclass is the number of landslide pixels in a certain factor class, nslide is the number of landslide pixels in the entire area, nclass is the number of pixels in a certain factor class, nmap is the number of pixels in the entire area, npix1 is the number of landslide pixels present in a given factor class, npix2 is the number of landslide pixels not present in a given factor class, npix3 is the number of pixels in a given factor class in which no landslide pixels are present and npix4 is the number of pixels in which neither a landslide nor the given factor class is present (Van Westen 2002; Sharma and Kumar 2008; Dahal et al. 2008b; Regmi et al. 2010b).
The initial stage of WoE modelling consists of preparing the landslide factor and landslide inventory maps and converting them into raster format with the same geographic projection and the same pixel size of 30 m. The following quantities are then calculated using ArcGIS 10 (Table 4): the number of landslide pixels in each factor class (npix1), the total number of landslide pixels in the whole study area (nslide), the number of pixels in each factor class (nclass), the total number of landslide pixels minus the landslide pixels in each factor class (npix2), the number of pixels in the factor class minus the landslide pixels in that class (npix3) and the number of pixels in which neither a landslide nor the factor class is present (npix4).
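The pixel bookkeeping above can be condensed into a short sketch. It assumes the commonly used formulation W+ = ln[(npix1/(npix1+npix2)) / (npix3/(npix3+npix4))] and W- = ln[(npix2/(npix1+npix2)) / (npix4/(npix3+npix4))], with npix4 counted as the pixels that are neither landslide nor in the class. The function name and the counts are hypothetical and not taken from Table 4.

```python
import math


def weights_of_evidence(nslclass, nslide, nclass, nmap):
    """Return (W_plus, W_minus, contrast) for one factor class.

    All four pixel groups must be non-empty, otherwise the logarithm
    is undefined and the class needs special handling.
    """
    npix1 = nslclass                              # landslide, in class
    npix2 = nslide - nslclass                     # landslide, outside class
    npix3 = nclass - nslclass                     # no landslide, in class
    npix4 = nmap - nslide - nclass + nslclass     # no landslide, outside class
    w_plus = math.log((npix1 / (npix1 + npix2)) / (npix3 / (npix3 + npix4)))
    w_minus = math.log((npix2 / (npix1 + npix2)) / (npix4 / (npix3 + npix4)))
    return w_plus, w_minus, w_plus - w_minus


# Hypothetical class: 60 of 100 landslide pixels lie in a class that
# covers 2000 of the 10000 pixels of the map.
w_plus, w_minus, contrast = weights_of_evidence(60, 100, 2000, 10000)
```

Here the class is over-represented among landslide pixels, so the positive weight is above zero, the negative weight is below zero and the contrast is positive.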

Frequency Ratio
From [...] negative values) and profile curvature (the first two negative classes and the last two positive classes).
Note: # = number of; density = (# landslide pixels / # class pixels); Frequency Ratio (FR) = (% landslide pixels / % class pixels).

Logistic Regression
In order to analyze the effects of data sampling on the prediction accuracy of landslide susceptibility maps in the logistic regression method, two cases were considered in which equal and unequal proportions of non-landslide points were combined with all landslide points. This helps to determine the coefficients of each landslide factor and other statistical parameters such as -2 log likelihood, Cox and Snell R square, Chi-square, Nagelkerke R square and statistical significance (Table 2). The prediction accuracies of the two cases were almost the same in this study. Of the two, the combination of equal numbers of landslide and non-landslide points showed the higher prediction accuracy of 86.8% (Table 3). Selecting a sample for a logistic regression model involves consideration of the sample size and the proportion of landslide and non-landslide pixels (Schicker and Moon 2012). If there are many parameters, the regression equation becomes long, which may create numerical problems and may also introduce strong correlations (multicollinearity) among independent variables. The regression was performed among the independent parameters, not among classes of parameters. In a logistic regression model, three approaches can be followed for data sampling (Zhu and Huang 2006). The first uses data from the whole study area, which leads to unequal proportions of landslide and non-landslide pixels (Guzzetti et al. 1999; Ohlmacher and Davis 2003).
The second approach uses all landslide pixels and an equal proportion of non-landslide pixels. This may decrease the amount of data used, but it eliminates the associated bias in the data sampling process (Zhu and Huang 2006). Yesilnacar and Topal (2005) used the total number of landslide pixels and randomly selected cells from landslide-free areas. The third approach is to divide the landslide pixels into two parts, i.e. training and validation data (Zhu and Huang 2006). There are also two cases in this approach. The first is the application of unequal numbers of pixels (Atkinson and Massari 1998) and the second is the use of equal proportions of landslide and non-landslide pixels (Dai and Lee 2002). In order to tackle the drawbacks attributed to unequal proportions of landslide and non-landslide pixels, non-landslide pixels equal in number to the landslide pixels were randomly selected from the landslide-free area and combined with the landslide pixels for the logistic regression model in this study. A forward stepwise logistic regression method was applied in order to establish a relationship between landslides and landslide factors. In order to get the best result from the logistic regression analysis, multicollinearity and Hosmer-Lemeshow tests were considered (Zhu and Huang 2006; Bai et al. 2010). Tolerance (TOL) and variance inflation factor (VIF) are two important indexes for multicollinearity diagnosis.
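The equal-proportion sampling can be sketched in a few lines. The version below assumes points are held in plain lists and uses a fixed seed for repeatability. The function name, the seed and the point identifiers are hypothetical, and the study itself carried out this step in GIS and SPSS rather than in code.

```python
import random


def balanced_sample(landslide_points, non_landslide_points, seed=0):
    """Combine all landslide points with an equal-sized random draw of
    non-landslide points; returns a shuffled list of (point, label) pairs
    where label 1 = landslide and 0 = non-landslide."""
    rng = random.Random(seed)
    # Draw without replacement so no non-landslide point is used twice.
    chosen = rng.sample(non_landslide_points, len(landslide_points))
    labelled = [(p, 1) for p in landslide_points] + [(p, 0) for p in chosen]
    rng.shuffle(labelled)
    return labelled


# Hypothetical point identifiers: 5 landslide points, 50 landslide-free points.
landslides = [f"ls_{i}" for i in range(5)]
stable = [f"nl_{i}" for i in range(50)]
training_set = balanced_sample(landslides, stable)
```

The resulting training set holds ten points, five of each label.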
A tolerance smaller than 0.2 indicates the presence of a multicollinearity problem, and a tolerance smaller than 0.1 indicates serious multicollinearity between independent variables (Menard 1995).
In this study, all the tolerance values are ≥ 0.55 (Table 5) showing that there is no multicollinearity problem among independent landslide factors. Variance inflation factor, which is the reciprocal of tolerance index, is another criterion. Allison (2001) excluded those independent factors from logistic regression analysis if their VIF > 2 and TOL < 0.4.
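The two diagnostics can be illustrated with a short sketch. It assumes the usual definition of tolerance as 1 - R², where R² comes from regressing one factor on all the others, and takes VIF as its reciprocal, as stated above. The function name `tolerance_and_vif` and the synthetic data are assumptions for illustration and are unrelated to the values in Table 5.

```python
import numpy as np


def tolerance_and_vif(X):
    """Return (TOL, VIF) arrays for the columns of X.

    TOL_j = 1 - R_j^2, where R_j^2 comes from regressing column j on
    all remaining columns (with an intercept); VIF_j = 1 / TOL_j.
    """
    X = np.asarray(X, dtype=float)
    n, p = X.shape
    tol = np.empty(p)
    for j in range(p):
        y = X[:, j]
        others = np.delete(X, j, axis=1)
        A = np.column_stack([np.ones(n), others])
        coef, *_ = np.linalg.lstsq(A, y, rcond=None)
        resid = y - A @ coef
        ss_res = float(resid @ resid)
        ss_tot = float(((y - y.mean()) ** 2).sum())
        tol[j] = ss_res / ss_tot  # equals 1 - R_j^2
    return tol, 1.0 / tol


# Synthetic, hypothetical factor values (not the study's data).
rng = np.random.default_rng(0)
a = rng.normal(size=200)
b = rng.normal(size=200)
c = a + 0.1 * rng.normal(size=200)  # nearly a copy of column a
tol, vif = tolerance_and_vif(np.column_stack([a, b, c]))
```

In this synthetic case the first and third columns are nearly collinear, so their tolerance falls well below the 0.1 threshold, while the independent second column keeps a tolerance close to 1.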
In this study, no independent landslide factor was excluded from the analysis, as none of the factors is affected by a multicollinearity problem. The Hosmer-Lemeshow test shows that the goodness of fit of an equation can be accepted if the significance of Chi-square is greater than 0.05 (Table 5). Hence, the logistic regression equation obtained from the SPSS analysis can be expressed as follows.
Z = 10.825 * Distance from river + 6.961 * Slope + 6.955 * Aspect + 6.381 * Distance from fault + 6.194 * Plan curvature + 6.08 * Rainfall + 4.823 * Lithology + 3.242 * Land use - 1.947 * Profile curvature - 6.021 (13)
From this, it can be inferred that distance from river, slope, aspect, distance from fault, plan curvature, rainfall, lithology and land use have positive coefficients, listed here in decreasing order of their importance in causing landslides (Table 5). The landslide susceptibility map was obtained by substituting the above value of Z into equation 7, and the map was then classified into five susceptibility classes (Figure 9b). Receiver operating characteristic (ROC) curves were used to compare the presence or absence of landslides with the landslide susceptibility map. The area under the ROC curve ranges from 0.5 to 1, where 1 indicates a perfect fit and 0.5 represents a random fit.
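As a rough illustration of how equation 13 turns factor values into a probability, the sketch below evaluates Z for one cell and applies the standard logistic function. Equation 7 is not reproduced in this section, so treating it as P = 1 / (1 + e^(-Z)) is an assumption. The factor values of the example cell are hypothetical, and the scaling of the inputs used in the study is not specified here.

```python
import math

# Coefficients and intercept copied from equation 13.
COEFFICIENTS = {
    "distance_from_river": 10.825,
    "slope": 6.961,
    "aspect": 6.955,
    "distance_from_fault": 6.381,
    "plan_curvature": 6.194,
    "rainfall": 6.08,
    "lithology": 4.823,
    "land_use": 3.242,
    "profile_curvature": -1.947,
}
INTERCEPT = -6.021


def linear_predictor(factors):
    """Z of equation 13 for one cell, given a value for every factor."""
    return INTERCEPT + sum(COEFFICIENTS[name] * factors[name] for name in COEFFICIENTS)


def probability(z):
    """Standard logistic function (assumed form of equation 7),
    written so that large |z| does not overflow math.exp."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


# Hypothetical cell: every factor value set to 0.1, for illustration only.
cell = {name: 0.1 for name in COEFFICIENTS}
z = linear_predictor(cell)
p = probability(z)
```

Applying the same two functions to every cell of the study area would give the continuous susceptibility index that is later reclassified into five classes.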
In this study, the entire landslide data set was used as the training dataset, and the success rate was obtained by comparing these training landslides with the landslide susceptibility map (Bui et al. 2012). The result shows an area under the curve (AUC) value of 0.868, corresponding to a success rate of 86.8%.
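The AUC reported above was obtained in SPSS. As a hedged sketch of what the value measures, the area under the ROC curve equals the probability that a randomly chosen landslide cell receives a higher susceptibility score than a randomly chosen non-landslide cell, with ties counted as half. The scores below are hypothetical.

```python
def auc(scores_pos, scores_neg):
    """Area under the ROC curve, computed as the share of
    (landslide, non-landslide) pairs ranked correctly; ties count half."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical susceptibility scores, for illustration only.
landslide_scores = [0.9, 0.8, 0.7, 0.4]
non_landslide_scores = [0.6, 0.3, 0.2, 0.1]
value = auc(landslide_scores, non_landslide_scores)
# 15 of the 16 pairs are ordered correctly, so value == 0.9375
```

The pairwise loop is quadratic in the number of points, so for a full raster a rank-based or library implementation would be preferable.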

Weights of Evidence (WoE)
After the positive weights (W+), negative weights (W-) and weights of contrast values (C) were calculated (Table 2), the correlation matrix (Table 5) of the landslide factors was prepared using the logistic regression model in order to check the dependency of the factors on each other and with respect to landslides. This test showed that the nine landslide factors were either not correlated with each other or exhibited a very insignificant correlation. Hence it is possible to combine all the landslide factors (Fig. 6) in order to produce the landslide susceptibility map (Fig. 8c). The prediction accuracy of the weights of evidence model was evaluated by extracting the values of the landslide susceptibility map at landslide and non-landslide points and analyzing them in the SPSS statistical software. This makes it possible to construct the receiver operating characteristic (ROC) curve and obtain the area under the curve (AUC) value (Fig. 9). The AUC value was found to be 0.807, indicating a prediction accuracy of 80.7%. The weights of evidence method avoids bias (subjectivity) in weighting the factor classes, and it also avoids the use of inter-correlated landslide factors.
It has been recommended that a landslide susceptibility map be prepared for each landslide type. However, the current study considers all landslides together, as the overall susceptibility from all landslides is important for decision making. The landslide factors show either no or very insignificant correlation (Table 5), suggesting that all the factors are independent of each other. As a result, the factor maps can be combined by summation to prepare the landslide susceptibility map.
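The weights for a single factor class can be sketched as below. The formulas are the commonly used pixel-count form of the weights of evidence method and are assumed to match the definitions given earlier in the paper. The argument names and the counts are hypothetical.

```python
import math


def weights_of_evidence(n1, n2, n3, n4):
    """Weights for one factor class from pixel counts.

    n1: landslide pixels inside the class
    n2: landslide pixels outside the class
    n3: non-landslide pixels inside the class
    n4: non-landslide pixels outside the class
    Returns (W+, W-, C) with C = W+ - W-.
    """
    w_plus = math.log((n1 / (n1 + n2)) / (n3 / (n3 + n4)))
    w_minus = math.log((n2 / (n1 + n2)) / (n4 / (n3 + n4)))
    return w_plus, w_minus, w_plus - w_minus


# Hypothetical counts: 60 of 100 landslide pixels fall in a class
# that covers 2000 of the 10000 non-landslide pixels.
w_plus, w_minus, contrast = weights_of_evidence(60, 40, 2000, 8000)
```

A positive contrast, as in this example, indicates a class in which landslides are over-represented. Summing the contrast values of the classes a cell belongs to across all factors gives that cell's susceptibility index.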

Validation
In the frequency ratio density model, overlaying the landslide inventory on the final landslide susceptibility map showed that 1%, 2%, 15%, 38% and 44% of the landslides fall in the very low, low, medium, high and very high landslide susceptibility classes respectively. Similarly, in the logistic regression model, this overlay operation showed that 1%, 2.5%, 12.5%, 39% and 45% of the landslides are distributed in the very low, low, medium, high and very high susceptibility classes respectively. In the weights of evidence model, the landslides used in the analysis were overlaid on the landslide susceptibility map, and this showed that 2%, 3%, 17%, 36% and 42% of the landslides fall in the very low, low, medium, high and very high susceptibility classes respectively. These three validation results are supported by the high area under the curve values of the receiver operating characteristic (ROC) curves (Fig. 9). The fact that 82%, 84% and 78% of the landslides fall in the high and very high susceptibility classes of the FR density, LR and WoE models respectively shows that these models predict the future probability of landslide occurrence with a very good level of accuracy.
Figure 9 ROC Curve for FR Density, LR (equal landslide and non-landslide points) and Weights of Evidence models.
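The overlay figures reported in this section can be checked with a few lines of arithmetic. The sketch below uses the percentages quoted above. The helper `class_percentages`, which derives such percentages from a list of per-landslide class labels, is an assumed illustration of the overlay step and not the GIS workflow used in the study.

```python
from collections import Counter

CLASSES = ["very low", "low", "medium", "high", "very high"]

# Percent of landslides per class, as reported in the Validation section.
reported = {
    "FRD": [1, 2, 15, 38, 44],
    "LR": [1, 2.5, 12.5, 39, 45],
    "WoE": [2, 3, 17, 36, 42],
}


def class_percentages(landslide_classes):
    """Percent of landslides per susceptibility class, given the class
    label found under each landslide after the overlay."""
    counts = Counter(landslide_classes)
    total = len(landslide_classes)
    return [100.0 * counts[c] / total for c in CLASSES]


def high_and_very_high(percentages):
    """Combined share of the two highest susceptibility classes."""
    return percentages[CLASSES.index("high")] + percentages[CLASSES.index("very high")]


summary = {model: high_and_very_high(p) for model, p in reported.items()}
# summary == {"FRD": 82, "LR": 84, "WoE": 78}
```

The three sums reproduce the 82%, 84% and 78% quoted for the FR density, LR and WoE models.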

Conclusion
Landslides have caused loss of human lives and property, damaged roads and affected the environment in the Yanase and Naka Catchments of southeast Shikoku, Japan. To identify the landslide prone areas for further preventive works and development plans, landslide susceptibility mapping should be undertaken. For this purpose, the frequency ratio density, logistic regression and weights of evidence models were selected from among the different GIS-based statistical (probabilistic) approaches. The weights of evidence model involves preparing the landslide inventory and landslide factors, and analyzing and calculating the positive weights, negative weights and weights of contrast values using ArcGIS 10 and Microsoft Excel. After transforming the weights of contrast values (C) into positive integers, these values were assigned to each factor class in each raster map, and subtraction and division operations were then applied in the raster calculator of ArcGIS to recover the original weights of contrast values. In weights of evidence modelling, checking the conditional independence of the landslide factors is an important step before adding the raster maps of factors with weights of contrast values. This was accomplished using the logistic regression method, which can provide a correlation matrix among landslide factors. The correlation matrix indicates that none of the nine landslide factors showed any significant correlation. This made it possible to prepare the landslide susceptibility map of the area using this model, with a reclassification technique based on the natural breaks method.
However, the reliability of the landslide susceptibility map should be checked for its prediction and validation accuracies. The prediction accuracy, which can be evaluated by the area under the curve (AUC) value, was found to be 80.7%. For validation, the landslides used for the analysis were overlaid on the map, and this showed that 78% of the landslides fall in the high and very high susceptibility classes.
When applying the logistic regression model in this study, it was found that using equal proportions of landslide and non-landslide points resulted in a better prediction accuracy than using unequal proportions. The coefficients obtained for distance from river, slope, aspect, distance from fault, plan curvature, rainfall, lithology and land use have positive values in decreasing order, indicating a high degree of influence or control in initiating landslides, while profile curvature has a negative value, indicating the least degree of influence. The choice of reclassification technique determines what the resulting landslide susceptibility map looks like and also influences the validation accuracy. In this study, the values in the landslide susceptibility index map are unevenly distributed, and as a result the natural breaks reclassification method provided a good result, unlike other methods.
In the frequency ratio and frequency ratio density models, factor classes with frequency ratio values greater than 1 and density values greater than 0.0458 showed a good correlation with landslide occurrence, as can be seen from Table 1. Although these two models provide very close predictive results, the frequency ratio density model provided a slightly better estimate for landslide susceptibility mapping and assessment (Table 3).
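The frequency ratio threshold of 1 mentioned above can be illustrated with a small sketch. It assumes the common definition of the frequency ratio as the share of landslide pixels in a class divided by the share of the study area occupied by that class. The class names and counts are hypothetical. The density variant is not sketched because its definition is not given in this section.

```python
def frequency_ratio(landslide_in_class, landslide_total, area_in_class, area_total):
    """Share of landslide pixels in a class divided by the share of
    all pixels in that class; values above 1 indicate over-representation."""
    return (landslide_in_class / landslide_total) / (area_in_class / area_total)


# Hypothetical slope classes: (landslide pixels, class pixels).
classes = {"0-15": (10, 4000), "15-30": (40, 4000), "30-45": (50, 2000)}
landslide_total = sum(l for l, _ in classes.values())
area_total = sum(a for _, a in classes.values())

fr = {
    name: frequency_ratio(l, landslide_total, a, area_total)
    for name, (l, a) in classes.items()
}
```

In this example the steepest class has a frequency ratio of 2.5, the middle class sits at 1, and the gentlest class falls to 0.25, so only the steepest class would be read as positively associated with landslide occurrence.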
Overall, of the five models attempted, the three best (frequency ratio density, logistic regression with equal proportions of landslide and non-landslide points, and weights of evidence) were used to generate the landslide susceptibility maps of the study area. Although the area under the curve (AUC) values of the receiver operating characteristic (ROC) curves from these three models were very close, the AUC for logistic regression with equal proportions of landslide and non-landslide points was slightly greater. In the validation of the landslide susceptibility maps, most of the landslides in the selected susceptibility map fall in the high and very high susceptibility classes, which affirms that the model is acceptable.