## Data type and acquisition

## Landslide inventory

The most critical dataset required for landslide susceptibility mapping (LSM) is an accurate and representative landslide inventory. The inventory is central to prediction because it documents the conditions and processes that influenced past landslide occurrences and therefore provides evidence of their likely future distribution (Ramesh and Anbazhagan, 2015). The importance of a reliable, high-quality landslide dataset has been emphasised by numerous scholars (e.g. Nohan et al., 2019; Daniel et al., 2021). In this study, the landslide inventory was built from existing historical/archived data, field survey mapping and imagery data. Multi-temporal Google Earth images from 1990 to 2020 were analysed for remote identification of landslide scars. The inventory compiled from historical records was validated using field surveys, during which the locations of new landslide scars were mapped using a GPS receiver (Garmin 64sx). Existing inventories, such as the web-based Uganda National Road Authority (UNRA) landslide inventory, which records landslide events and their impacts (Hearn et al., 2019), were also consulted. These inventories were then integrated into a single database for modeling landslide susceptibility. Landslide point data were preferred to polygons because some landslide scars were too small to be mapped as polygons.

The ALOS PALSAR Digital Elevation Model (DEM) and Google satellite imagery (https://mt1.google.com/vt/lyrs=s&x={x}&y={y}&z={z}) were overlaid as base maps to aid the tracking, identification and digitization of past landslide locations and to support field investigation. The resultant maps were converted into GeoTIFF files and uploaded into “Avenza Maps”, a mobile application used to verify and update geological and landslide information in the field. A total of 478 landslides were identified and used in the landslide susceptibility modeling.

## Field survey

Road sections frequently affected by landslides were surveyed to gather more information on the possible causes of instability and the resulting impacts. Data were gathered on landslide locations using GPS and on individual failure characteristics. A buffer of 50–100 m on either side of the selected road sections was created and overlaid with environmental causative conditions (soils, lithology, slope angle and land use).

Information was sought from local government departments, including engineering, on the incidence of landslides affecting roads, drainage conditions, clearance and repair reports after landslide events, and the presence of stabilisation measures applied in various hotspot sections/segments of the roads.

## Predisposition to landslide factors

Landslide susceptibility mapping (LSM) requires knowledge of the preparatory and triggering factors and preparation of the necessary thematic layers (Mallick et al., 2018). This study therefore selected 10 geomorphic and hydrological conditioning factors, based on literature review, field observations and expert knowledge, for computing the fuzzy logic membership. The factors were altitude, slope gradient, slope aspect, plan curvature, profile curvature, stream power index (SPI), sediment transport index (STI), topographic wetness index (TWI), distance to road and distance to stream. The importance of each landslide conditioning factor was evaluated individually by comparing a map of each parameter with the landslide distribution map of the area (Fig. 2).

## Data sources

The topographic wetness index, altitude and stream power index were derived from the Advanced Land Observing Satellite/Phased Array type L-band Synthetic Aperture Radar (ALOS/PALSAR) DEM available at https://search.asf.alaska.edu/#/. According to Persichillo et al. (2017), input data derived solely from DEMs can yield a good level of accuracy and predictive efficiency even where exhaustive field information is lacking. The ALOS/PALSAR DEM therefore suited this study because of its high resolution (12.5 m) and free availability (Alahmadi, 2019). The data are also geometrically and radiometrically terrain corrected (Logan et al., 2014) and thus ready for use in morphological modeling (Albino et al., 2015). Alahmadi (2019) further notes that ALOS/PALSAR DEMs are accurate enough for hydrological and morphological studies.

Road network data were obtained from the UNRA database and supplemented with ICPAC Geoportal road data available at http://geoportal.icpac.net/layers/geonode%3Aigad_roads#more. These data are open source, regularly updated and widely used (Wolff et al., 2021; Laktabai, 2020).

## Data processing, integration and analysis using fuzzy logic model

The fuzzy logic method (Bui et al., 2015) was used to assess landslide susceptibility. The method was selected for its advantage over classical set theory methods such as weighted overlay, in which an object either belongs to a set or does not, and thus has a membership value of 1 or 0 respectively (Gemitzi et al., 2011). The idea of fuzzy logic is to consider the spatial objects on a map as members of a set (Pradhan, 2009). Fuzzy set theory, however, applies fuzzy membership functions whose values range between 0 and 1, reflecting the degree of certainty of membership. There are also no practical constraints on the choice of fuzzy membership values (Pradhan, 2009). Nevertheless, fuzzy set theory does not itself generate fuzzy membership values for landslide conditioning factors and their classes (Bui et al., 2015); instead, expert knowledge or frequency ratios may be applied (Bui et al., 2015; Pradhan, 2009). This is consistent with Kumar and Anbaladan (2015), who indicated that landslide susceptibility mapping requires determining the fuzzy membership functions of causative factors, either subjectively or objectively.

In this study, expert knowledge and grey literature were applied to derive fuzzy membership during fuzzification. Each conditioning factor was reclassified into a manageable number of classes, and each class was assigned a rank reflecting its level of influence on landslide occurrence. The ranks ranged from 1 to 10, where 10 = most influential and 1 = least influential class. The assigned ranks (crisp values) were then normalized by dividing them by 10. The resultant values were then used to assign a membership function, e.g. Fuzzy-Linear, Fuzzy-Large or Fuzzy-Gaussian.
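The rank normalization described above can be sketched as follows. This is a minimal illustration only: the class names and rank values are hypothetical and are not taken from the study.

```python
def normalize_ranks(ranks, scale=10):
    """Divide expert ranks (1..scale) by the scale so values fall in (0, 1]."""
    for cls, rank in ranks.items():
        if not 1 <= rank <= scale:
            raise ValueError(f"rank for {cls!r} must lie between 1 and {scale}")
    return {cls: rank / scale for cls, rank in ranks.items()}


# Hypothetical ranks for three classes of one conditioning factor.
example_ranks = {"class_a": 2, "class_b": 7, "class_c": 10}
normalized = normalize_ranks(example_ranks)
```

The normalized values can then be passed to whichever membership function is chosen for the factor.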

## Fuzzification process of each conditioning factor

**Slope aspect:** This describes the direction of the slope and largely determines exposure to the sun, and hence strongly influences vegetation and the evapotranspiration rate. The slope aspect was divided into nine classes. Analysis revealed that the west-, northwest- and northeast-facing slopes had the largest number of landslides. A Fuzzy-Linear membership function with a positive relationship was adopted, as given by Baalousha et al. (2021):

$$\mu \left(x\right)=\frac{x-\text{min}}{\text{max}-\text{min}}\qquad \left(1\right)$$

where max and min are the maximum and minimum of the crisp values. For a negative linear relationship, the model is denoted as

$$\mu \left(x\right)=1-\frac{x-\text{min}}{\text{max}-\text{min}}\qquad \left(2\right)$$
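As an illustration, Eqs. 1 and 2 can be sketched as a single function. The name `fuzzy_linear` and the clamping of out-of-range inputs to [0, 1] are assumptions made for this sketch, not details stated in the text.

```python
def fuzzy_linear(x, lo, hi, positive=True):
    """Linear membership between lo (min) and hi (max).

    positive=True follows Eq. 1; positive=False follows Eq. 2.
    Inputs outside [lo, hi] are clamped (an assumption of this sketch).
    """
    if hi <= lo:
        raise ValueError("hi must be greater than lo")
    mu = (x - lo) / (hi - lo)
    mu = min(1.0, max(0.0, mu))
    return mu if positive else 1.0 - mu


# Example with crisp values between 0 and 10.
mu_pos = fuzzy_linear(2.5, 0, 10)                   # Eq. 1
mu_neg = fuzzy_linear(2.5, 0, 10, positive=False)   # Eq. 2
```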

**Slope gradient:** This is one of the major topographic factors for investigating slope instability and preparing LSM (Reichenbach et al., 2018; van Westen, 2008). This parameter was divided into eight classes (0–5; 6–10; 11–15; 16–20; 21–25; 26–30; 31–35; >35), and class 26–30 was ranked highest. Previous studies (e.g. Nakileza and Nedala, 2020; Bamutaze, 2019; Nseka et al., 2018; Phama et al., 2018) indicated a high risk of landslides in this slope category. Based on this literature, a Fuzzy-Gaussian membership function, which prioritizes values near the midpoint, was applied to slope with class 26–30 as the midpoint. The function is given by Baalousha et al. (2021), Akter et al. (2019) and Iliadis et al. (2017) as

$$\mu \left(x\right)={e}^{-{f}_{1}{\left(x-{f}_{2}\right)}^{2}}\qquad \left(3\right)$$

where \({f}_{1}\) and \({f}_{2}\) are the spread and midpoint values respectively.
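A minimal sketch of Eq. 3 is given below. The spread (0.01) and midpoint (28) in the example are hypothetical values chosen only to show the shape of the function; the study does not report the parameters it used.

```python
import math


def fuzzy_gaussian(x, spread, midpoint):
    """Gaussian membership (Eq. 3): 1 at the midpoint, decaying on both sides."""
    if spread <= 0:
        raise ValueError("spread must be positive")
    return math.exp(-spread * (x - midpoint) ** 2)


# Hypothetical parameters: midpoint inside the 26-30 slope class.
mu_mid = fuzzy_gaussian(28, spread=0.01, midpoint=28)
mu_low = fuzzy_gaussian(23, spread=0.01, midpoint=28)
mu_high = fuzzy_gaussian(33, spread=0.01, midpoint=28)
```

Membership is highest at the midpoint and symmetric around it, which matches the intent of ranking a middle class highest.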

**Slope curvature:** This is the curvature of a line formed by the intersection of a random plane with the surface. Following Ramesh and Anbazhagan (2015), negative curvature values (< -0.005) were classified as concave, positive values (> 0.005) as convex, and values close to zero (-0.005 to 0.005) as flat. These surfaces influence the accumulation and flow of water. Most slides tend to occur on concave surfaces, which accumulate water and moisture. In the current study, profile curvature was classified into 6 classes (-33.799 to -2.427, -2.426 to -0.597, -0.596 to 0.188, 0.189 to 0.972, 0.973 to 3.848 and 3.849 to 32.867) and plan curvature into 5 classes (-32.14 to -1.107, -1.106 to -0.407, -0.406 to 0.293, 0.294 to 0.993 and 0.994 to 27.36). Fuzzy-Small (Eq. 4), as given by Akter et al. (2019), was applied to profile curvature, while Fuzzy-Linear (Eq. 1) was applied to plan curvature.

where \({f}_{1}\) is the spread and \({f}_{2}\) is the mean of the input variable.
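A Fuzzy-Small membership function can be sketched as below, assuming the commonly used form μ(x) = 1 / (1 + (x / f2)^f1), with f1 the spread and f2 the input value that receives a membership of 0.5. This form is an assumption of the sketch, and the example parameters are hypothetical. The sketch is restricted to non-negative inputs such as distances.

```python
def fuzzy_small(x, spread, midpoint):
    """Fuzzy-Small membership, assuming mu = 1 / (1 + (x / midpoint) ** spread).

    Small inputs map to values near 1, large inputs to values near 0.
    Restricted to non-negative x (an assumption of this sketch).
    """
    if x < 0 or midpoint <= 0 or spread <= 0:
        raise ValueError("x must be >= 0; midpoint and spread must be > 0")
    return 1.0 / (1.0 + (x / midpoint) ** spread)


# Hypothetical parameters: membership of 0.5 at a distance of 100.
mu_near = fuzzy_small(50, spread=5, midpoint=100)
mu_at_mid = fuzzy_small(100, spread=5, midpoint=100)
mu_far = fuzzy_small(200, spread=5, midpoint=100)
```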

**Elevation**

This is another geomorphometric parameter derived from the DEM, and it plays a key role in influencing rainfall. Higher-elevation areas are associated with high rainfall and first-order streams, which affect slopes and thereby accelerate erosion and landslides. However, studies (e.g. Mande et al., 2022; Nakileza & Nedala, 2020) have shown that landslide risk increases with elevation up to a certain point and then decreases, and that classes between 1200 and 2500 are the most problematic. Based on this literature, elevation was fuzzified using a Fuzzy-Gaussian membership function (Eq. 3) with elevation class 1400–1500 as the midpoint.

**Distance to the road**

This similarly affects landslide risk because of excavation and undercutting, which cause reduced stress (Talaei, 2018; Duo et al., 2017; Ramesh & Anbazhagan, 2015), or in some cases overloading due to backfilling and compaction. The distance classes created in this study are 0–50, 51–100, 101–250, 251–500, 501–750, 751–1000, 1001–2500, 2501–5000, 5001–10000 and >10000. Chen et al. (2020) found that landslide distribution decreased with distance from the road, indicating an inverse linear relationship. Ramesh & Anbazhagan (2015) also found high landslide risk at road distances < 2000 m. Based on these findings, a Fuzzy-Small membership function (Eq. 4), which gives priority to small values, was applied to road distance.
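Assigning a distance to one of the ten classes listed above can be sketched with a sorted list of class upper bounds. The function name and the zero-based class index are choices made for this illustration, and each upper bound is treated as inclusive.

```python
from bisect import bisect_left

# Upper bounds of the first nine distance classes; the tenth class is open-ended.
UPPER_BOUNDS = [50, 100, 250, 500, 750, 1000, 2500, 5000, 10000]


def distance_class(distance):
    """Return the zero-based class index (0 = nearest class, 9 = beyond 10000)."""
    if distance < 0:
        raise ValueError("distance must be non-negative")
    return bisect_left(UPPER_BOUNDS, distance)


class_near = distance_class(30)     # falls in the 0-50 class
class_far = distance_class(20000)   # falls in the open-ended last class
```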

**Distance from drainage stream**

This is an important factor influencing landslide occurrence through moisture and undercutting. Undercutting of the slope face by a stream or river induces instability (Thongley & Vansarochana, 2021), as does the contribution of moisture to saturation. This factor has a negative linear relationship with landslides (Chen et al., 2020); that is, landslide risk decreases with increasing distance from streams. Higher landslide occurrences were generally observed within 100 m of streams in the study area, so the factor was expressed as Fuzzy-Small (Eq. 4).

**Stream power index (SPI)**

SPI is a compound topographic attribute that measures the erosive power of flowing water, based on the assumption that discharge is proportional to the catchment area (Panoto et al., 2022; Wang et al., 2016). High SPI values (> 40) are associated with high erosion (Panoto et al., 2022). However, Wang et al. (2016) noted that this is not the case for landslides, since the highest SPI values are located in the stream itself. This study adopted the assumption of Panoto et al. (2022) of a positive linear relationship between SPI and landslide occurrence. Therefore, the Fuzzy-Linear membership function presented in Eq. 1 was adopted.

**Topographic Wetness Index (TWI):** This measures the degree of water accumulation at a site. Liu & Duan (2018) reported a positive correlation between TWI and landslide distribution, and Pourghasemi et al. (2012) presented similar trends. The Fuzzy-Linear membership function given in Eq. 1 was therefore applied.

**Sediment Transport Index**

The index reflects the erosive power of overland flow and is equivalent to the slope length (LS) factor of the RUSLE model (Jaafari et al., 2014). In landslide susceptibility mapping, STI represents the hydrological impacts on slope stability by revealing areas of erosion (high values) and deposition (low values). The assumption is that the river has greater energy to transport material upslope and less energy downslope. Jaafari et al. (2014) found this assumption to hold in their study, where frequency ratio values increased with increasing STI. However, Chen et al. (2020) and Liu & Duan (2018) found contradicting results, with landslides more concentrated in deposition areas. In this study, 21 STI classes were generated and the Fuzzy-Linear membership function in Eq. 1 was applied, following the findings of Jaafari et al. (2014). Their findings reflected characteristics similar to those in the Elgon area, where shallow landslides are common on cliffs and very steep slopes.
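The three hydrological indices can be sketched for a single cell as below. The text does not state which formulations were used; this sketch assumes the widely used definitions SPI = A·tan β, TWI = ln(A / tan β) and STI = (A / 22.13)^0.6 · (sin β / 0.0896)^1.3, where A is the specific catchment area and β the slope angle. The input values are hypothetical.

```python
import math


def terrain_indices(specific_catchment_area, slope_deg):
    """Return (SPI, TWI, STI) for one cell under the assumed standard formulas."""
    if specific_catchment_area <= 0:
        raise ValueError("specific catchment area must be positive")
    if not 0 < slope_deg < 90:
        raise ValueError("slope must lie strictly between 0 and 90 degrees")
    beta = math.radians(slope_deg)
    area = specific_catchment_area
    spi = area * math.tan(beta)
    twi = math.log(area / math.tan(beta))
    sti = (area / 22.13) ** 0.6 * (math.sin(beta) / 0.0896) ** 1.3
    return spi, twi, sti


# Hypothetical cell: specific catchment area of 100 on a 45 degree slope.
spi, twi, sti = terrain_indices(100.0, 45.0)
```

Under these definitions, a steeper slope raises SPI and STI and lowers TWI for the same catchment area.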

## Defuzzification of conditioning factors

All conditioning factors were integrated into the LSM map using the Fuzzy AND operator, given by Çakıt & Karwowski (2018) and Bui et al. (2015) as

$$LSM=\text{min}\left({\mu x}_{1}, {\mu x}_{2}, {\mu x}_{3},\dots \right)\qquad \left(5\right)$$

where \({\mu x}_{1}\), \({\mu x}_{2}\) and \({\mu x}_{3}\) represent the fuzzified conditioning factors (e.g. slope aspect, slope gradient, elevation) and min is the minimum operator.
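The Fuzzy AND overlay in Eq. 5 amounts to a cell-wise minimum across the membership layers. The sketch below uses small nested lists in place of rasters; the layer names and values are hypothetical.

```python
def fuzzy_and(layers):
    """Cell-wise minimum across equally shaped membership grids (Eq. 5)."""
    if not layers:
        raise ValueError("at least one layer is required")
    rows, cols = len(layers[0]), len(layers[0][0])
    for grid in layers:
        if len(grid) != rows or any(len(row) != cols for row in grid):
            raise ValueError("all layers must have the same shape")
    return [
        [min(grid[r][c] for grid in layers) for c in range(cols)]
        for r in range(rows)
    ]


# Hypothetical 2x2 membership grids for three conditioning factors.
slope = [[0.9, 0.4], [0.7, 1.0]]
aspect = [[0.6, 0.8], [0.2, 0.5]]
elevation = [[0.7, 0.5], [0.9, 0.3]]
susceptibility = fuzzy_and([slope, aspect, elevation])
```

Because the minimum is taken, a cell is rated susceptible only if every factor rates it so, which makes Fuzzy AND a conservative combination.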

The generated landslide susceptibility map was then overlaid with a road network map in a GIS environment to map the exposure of roads to landslide hazard in the region (Fig. 3).

## Validation and evaluation

The landslide inventory (Fig. 4) was randomly partitioned into a training dataset containing 70% (337) of the landslides and a validation dataset containing the remaining 30% (143), which was used to validate the LSM map (Fig. 5). The receiver operating characteristic (ROC) curve, which plots sensitivity (true positive rate) against 1 − specificity (false positive rate), was used to evaluate the performance of the model, as described by Razavi-Termeh et al. (2021) and Hong et al. (2015). These parameters were computed from true positives (TP), false positives (FP), true negatives (TN) and false negatives (FN) (Thongley & Vansarochana, 2021) in the ArcSDM 5.03 package, where the area under the curve (AUC) was estimated. The study placed little emphasis on the dating of landslides and road construction, so conditional independence was not computed.
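To illustrate how an ROC curve and its AUC follow from TP, FP, TN and FN counts, a minimal sketch is given below. It is not the ArcSDM procedure; the labels (1 = landslide, 0 = non-landslide) and susceptibility scores are hypothetical.

```python
def roc_points(labels, scores):
    """Return (false positive rate, true positive rate) pairs, starting at (0, 0)."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both classes must be present")
    points = [(0.0, 0.0)]
    # Lower the threshold step by step; cells scoring >= threshold are predicted positive.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 1)
        fp = sum(1 for y, s in zip(labels, scores) if s >= threshold and y == 0)
        points.append((fp / negatives, tp / positives))
    return points


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum(
        (x1 - x0) * (y0 + y1) / 2
        for (x0, y0), (x1, y1) in zip(points, points[1:])
    )


# Hypothetical validation sample.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
area = auc(roc_points(labels, scores))
```

An AUC of 0.5 corresponds to a model no better than chance, and values approaching 1 indicate better separation of landslide and non-landslide locations.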