An alternative soil erodibility estimation approach for data scarce regions: A case study in Ethiopian Rift Valley Lake Basin

Soil erodibility (K) is an essential factor for erosion prediction, conservation planning and the assessment of sediment-related environmental problems. K estimation methods are built into many soil erosion and water quality models, but most were developed for data-rich areas and are difficult to apply where soil data are limited. An exception is the K equation of the erosion productivity impact calculator (EPIC) model, whose required soil parameters can be extracted from the Food and Agriculture Organization (FAO) world soil database. To verify the FAO soil database and to develop an alternative K method (KET) that mimics the EPIC K equation, we collected 203 soil samples from different soil units in the Ethiopian Rift Valley Lake Basin (ERVLB). Unlike the EPIC K, KET is based on physical soil properties that can be easily measured in a laboratory. The results from KET were compared with those from EPIC-K. Statistically, the performance of KET is excellent, while the soil analysis results for ERVLB deviate from the FAO soil database in the lower-altitude areas of the basin. When KET is applied to all soil units of the country, it predicts K for 35.7% of the country's soils with a relative error within ±5%. On average, KET can be applied to the country's soils with a relative error of −9.88% and a standard deviation of 6.4. Using KET, K maps of ERVLB and of the country were produced. We recommend the K map developed for ERVLB for use with confidence, since it is validated using field data.


Introduction
Today, soil erosion by water is described as one of the most critical environmental hazards [1] because of its adverse economic and environmental impacts. Globally, soil erosion by water is diminishing land resources, reducing land productivity and increasing sediment delivery [2,3]. To estimate the risk that soil erosion poses to land resources, an effective soil erosion prediction model has become essential and urgent. Unfortunately, soil erosion is a complex and multifaceted process that is affected by a host of factors, including erosivity and erodibility. Erosivity is the potential ability of rain to cause erosion, and erodibility is the susceptibility of the soil to erosion [4,5].
The soil erodibility (K) index is essential for assessing the susceptibility of soil to erosion and for predicting the rate of soil loss. It is most commonly used as the K term of the universal soil loss equation and is usually regarded as the amount of soil loss per unit of erosive force, e.g., rainfall, surface flow and seepage. In soil erosion prediction models, such as the soil and water assessment tool [6][7][8][9], the agricultural non-point source pollution model and the erosion productivity impact calculator (EPIC) model [10], the K index is likewise used to measure the susceptibility of soil to water erosion. As soil erosion research has received increasing emphasis, erodibility has become an important parameter for estimating soil loss. In practice, it can be obtained directly from a standard plot or indirectly from empirical models.
Römkens et al. [11] noted that direct measurement of K in the field is the most reliable way of estimating K. However, this method requires the establishment and maintenance of natural runoff plots over lengthy observation periods at various locations. Hence, it is a costly and time-consuming technique. To overcome this, numerous attempts have been made to simplify the technique and to establish estimators for calculating K from readily available soil property data and standard [12,13]. To date, the available K calculation models, such as the universal soil loss equation (USLE) [13,14], the revised universal soil loss equation [15], the erosion productivity impact calculator (EPIC) [16] and the geometric mean diameter based (Dg) model [17], have been widely used.
K was originally derived from five variables: the percentage of silt and very fine sand, the percentage of sand, the organic matter content, soil aggregation and the soil permeability index [13]. A sixth variable, rock fragment cover, was added by [14], who also provided the classical K equation. Since the amount of data required by the original model [13,14] is large, [16] derived an alternative K estimation method using only two soil properties: soil organic carbon content and soil particle size distribution. Furthermore, for soils for which the measured properties needed to derive K are unavailable, [17] derived an alternative, yet less accurate, method for estimating K using only one soil parameter, the geometric mean diameter of the soil particles [18].
Numerous researchers have attempted to determine alternative K estimation models for specific conditions (watershed or basin-scale studies) based on the original models [10,19] and to find out which model is the most suitable for K estimation [11,20]. Wang et al. [18] improved USLE to predict K for water erosion areas of China and developed an alternative K estimation model using soil organic matter and the geometric mean diameter of the soil particles. In different regions, [4,21,22] and [23] tested the USLE, EPIC and Dg models for K prediction under specific conditions by applying a non-linear regression fitting technique.
However, it is a big challenge in Ethiopia to use the upgraded and modified models because the input data are insufficient to make the necessary adjustments. Few soil investigations have been carried out in the country so far. One of the most widely used K estimation methods in Ethiopia is the one recommended by [24] and [25]. It is a rough K estimation method based on soil colour, and the approach is still in use in the country [26][27][28].
Accurate prediction of K requires detailed information on key inherent soil properties, such as texture, organic matter, structure and permeability. Hence, an estimate based only on colour may be subjective. Later, [29] developed a model to estimate the K factor for the upper Awash basin of Ethiopia using the soil texture class. However, K depends not only on the soil texture class but is also directly related to other soil properties such as shear strength, infiltration capacity and permeability rate. Hence, the empirical model developed by [29] may need an additional soil parameter that indicates the level of soil strength. As a result, researchers in Ethiopia are using different K estimation methods to identify erosion-prone areas and to estimate the sediment yields of the country. For example, [30] used the method developed by [13] to predict the spatial distribution of K in an agricultural watershed of northern Ethiopia, and [31] used the Wischmeier and Smith (1978) equation to identify erosion-prone areas. To estimate the sediment yield of the Lake Ziway basin (an Ethiopian Rift Valley lake), [9] used the method developed by [16]. Among the various K prediction methods, the equation of [16] for the EPIC model is applicable to data-limited areas [4], and the required soil parameters can be easily extracted from the FAO world soil database. Similarly, investigations conducted by [4,21,22] and [23] confirmed that the K estimation method developed in the EPIC model can predict K better than the others.
However, the K estimation method of the erosion productivity impact calculator (EPIC) model requires the soil texture class and organic carbon content (OCC) as inputs. In practice, soil chemical parameters are more difficult to determine than physical parameters. For example, soil OCC can only be estimated by burning the soil samples, and in less well-equipped laboratories its accuracy is questionable. For a given soil, however, the organic carbon content is directly related to soil bulk density, a physical parameter. As reported by [32] and [33], soil with high organic matter has high pore-size distribution and high soil bulk density. Modifying the model by replacing a chemical soil parameter with an easily measurable physical parameter can reduce the propagation of error during analysis. In a similar manner, the need for a large number of soil input parameters remains a major obstacle to applying the available K estimation methods, and minimizing the required input parameters can alleviate this problem. Therefore, in this study, the EPIC-K method developed by [16] is selected as the basis for an alternative K estimation method that mimics the original K. Moreover, the soil input parameters for [16] can be extracted from the FAO world soil database. However, before the FAO world soil database is used, its reliability needs to be verified against the major soil classes of the study area.
The objectives of this study are: (1) to verify the reliability of the FAO world soil database for the study area; (2) to derive an alternative K estimation approach with simplified input parameters, for use in predicting soil loss with empirical or semi-distributed models, by mimicking the original K values developed for the EPIC model; and (3) to provide a regional K map for water erosion areas in the Ethiopian Rift Valley Lake Basin using the alternative K-factor prediction approach.

Study area
The Ethiopian Rift Valley Lake Basin (8°30′-4°25′N, 36°30′-39°30′E; Fig. 1) is situated in the southern part of the country and is characterized by a chain of lakes. It stretches from northeast to southwest, from just north of Lake Ziway via Lakes Abiyata, Langano, Shala, Hawassa, Abaya, Chamo and Chew Bahir up to the border with Kenya. The basin has a total area of 5.3×10⁴ km².
The geology of the Rift Valley Lakes Basin is divided into four major groups: Mesozoic sedimentary rock, Oligocene to middle Miocene pre-rift volcanic rock, middle Miocene to Holocene syn- and post-rift volcanic rocks, and unconsolidated sediments [34]. The vegetation of the northern central valley (around Lake Ziway) is a mixture of open bush, open woodland and moderately to intensively cultivated land. The central plain around Lakes Shala, Langano and Abiyata is characterised by open woodland and wooded grassland, with intensive cultivation to the south and west. Between Hawassa and Lake Abaya, where the valley narrows, the central part is characterised by open and dense bush land with some erosion. The major soil groups identified in the reconnaissance soil survey of the basin [34] are luvisol, cambisol, nitisol, vertisol, solonchak, arenosol, andosol, fluvisol and leptosol. The soil map of the Ethiopian Rift Valley Lake Basin is given in Figure 2.

K estimation methods
One of the oldest K estimation methods was originally developed by [13]. The authors of [14], [16] and [17] then proposed formulas for K estimation based on specific soil properties, which are widely used today. For a specific study area, the selection of a K estimation method depends on the availability of the required soil input data. For example, the equation of [14] can be applied to areas where texture, organic matter content (OM), structure class and permeability data are available. The algebraic expression derived by [14] to calculate K for soils with a silt fraction of less than 70% is:

$$K = \frac{0.00021\, M^{1.14}\,(12 - OM) + 3.25\,(S - 2) + 2.5\,(P - 3)}{100} \qquad (1)$$

where K is the soil erodibility; M is the product of the percentage of silt and very fine sand and the percentage of all soil fractions other than clay; OM is the soil organic matter content (%); S is the soil structure code used in soil classification; and P is the soil permeability class.
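As a rough illustration, the expression of [14] can be evaluated in a few lines of Python. This is a minimal sketch, not the authors' code: it assumes the commonly cited coefficients (0.00021, 1.14, 3.25 and 2.5), takes M as (silt + very fine sand) × (100 − clay), and uses hypothetical function and argument names. The input values in the usage line are illustrative only and are not data from this study.

```python
def usle_k(silt_vfs_pct, clay_pct, om_pct, structure_code, permeability_class):
    """Sketch of the algebraic K approximation attributed to [14].

    Assumed form (US customary units):
        K = [0.00021 * M**1.14 * (12 - OM) + 3.25*(S - 2) + 2.5*(P - 3)] / 100
    with M = (silt + very fine sand, %) * (100 - clay, %).
    """
    if silt_vfs_pct >= 70.0:
        # The text restricts the expression to silt fractions below 70 %.
        raise ValueError("expression is only stated for silt fractions < 70 %")
    m = silt_vfs_pct * (100.0 - clay_pct)
    texture_om_term = 0.00021 * m ** 1.14 * (12.0 - om_pct)
    structure_term = 3.25 * (structure_code - 2)
    permeability_term = 2.5 * (permeability_class - 3)
    return (texture_om_term + structure_term + permeability_term) / 100.0


# Illustrative (made-up) soil: 50 % silt + very fine sand, 20 % clay, 2 % OM,
# structure code 2, permeability class 3.
example_k = usle_k(50.0, 20.0, 2.0, 2, 3)
```

With structure code 2 and permeability class 3 the last two terms vanish, so the result depends only on texture and organic matter, which makes the relative weight of the five inputs easy to inspect.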
The equation derived by [16] can be applied in areas where sufficient data on soil particle size distribution and soil OCC are available. The K index of [16], as used in EPIC (erosion productivity impact calculator), is calculated as:

$$K = \left\{0.2 + 0.3\exp\left[-0.0256\,SAN\left(1 - \frac{SIL}{100}\right)\right]\right\}\left(\frac{SIL}{CLA + SIL}\right)^{0.3}\left[1 - \frac{0.25\,C}{C + \exp(3.72 - 2.95\,C)}\right]\left[1 - \frac{0.7\,SN1}{SN1 + \exp(-5.51 + 22.9\,SN1)}\right] \qquad (2)$$

where SAN, SIL and CLA are the sand fraction (%), silt fraction (%) and clay fraction (%), respectively; C is the soil organic carbon content (%); and SN1 equals 1 − SAN/100. Similarly, the applicability of the K estimation method developed by [17] depends on the availability of the soil data needed to calculate the geometric grain size. Renard et al. [17] expressed K as a function of the geometric mean diameter (Dg) of the soil particles:

$$K = 7.594\left\{0.0034 + 0.0405\exp\left[-\frac{1}{2}\left(\frac{\log D_g + 1.659}{0.7101}\right)^{2}\right]\right\} \qquad (3)$$

where Dg is the geometric mean diameter of the soil particles (mm), determined by Equation 4:

$$D_g = \exp\left(0.01\sum_{i=1}^{n} f_i \ln m_i\right) \qquad (4)$$

where fi is the primary particle size fraction (%); mi is the arithmetic mean of the particle size limits of that fraction (mm); and n is the number of particle size fractions.

The estimation method developed by [14] requires more soil property data than the method developed by [16], and the method of [16] in turn requires more data than that of [17]. In practice, determining many soil input parameters from a soil map is difficult, and the more input parameters are required, the more error is introduced during sample collection and analysis. The sensitivity of the input parameters used in each method can also affect estimation accuracy. A good example is the parameter used in the K-Dg model, in which the only soil parameter used to calculate the K factor is the geometric mean diameter of the particles. As shown in Equation 3, Dg enters through an exponential term, so the result is highly sensitive to small discrepancies: a small error in the particle size fractions can lead to a large error in K. To minimize this, more samples and a detailed investigation of the soil particle size distribution may be required, which is difficult in regions without access to accurate laboratory analysis where and when it is needed. This is true for most developing countries, including Ethiopia. In most developing countries, researchers use the erodibility estimation method developed by [16] (EPIC-K). Its advantages are the small amount of data it requires and the fact that its input parameters can be extracted easily from the Food and Agriculture Organization (FAO) world soil database.
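The two texture-based estimators discussed above can be sketched in Python to make their data requirements concrete. This is an illustration under stated assumptions rather than the authors' implementation: the coefficients are those usually quoted for the EPIC equation of [16] and the Dg equation of [17], the class mean diameters (0.001, 0.026 and 1.025 mm for clay, silt and sand) and the 0.1317 US-to-SI conversion factor are values commonly used in the literature, and all function names are hypothetical.

```python
import math

# Assumed arithmetic mean diameters (mm) of the clay, silt and sand classes.
CLASS_MEAN_MM = {"clay": 0.001, "silt": 0.026, "sand": 1.025}
# Commonly used factor to convert K from US customary to SI units (assumption).
US_TO_SI = 0.1317


def epic_k(san, sil, cla, c):
    """EPIC-type K from sand, silt, clay (%) and organic carbon (%)."""
    sn1 = 1.0 - san / 100.0
    f_coarse_sand = 0.2 + 0.3 * math.exp(-0.0256 * san * (1.0 - sil / 100.0))
    f_clay_silt = (sil / (cla + sil)) ** 0.3
    f_org_carbon = 1.0 - 0.25 * c / (c + math.exp(3.72 - 2.95 * c))
    f_high_sand = 1.0 - 0.7 * sn1 / (sn1 + math.exp(-5.51 + 22.9 * sn1))
    return f_coarse_sand * f_clay_silt * f_org_carbon * f_high_sand


def geometric_mean_diameter(san, sil, cla):
    """Dg (mm) = exp(0.01 * sum(f_i * ln(m_i))) over the three size classes."""
    weighted_logs = (
        san * math.log(CLASS_MEAN_MM["sand"])
        + sil * math.log(CLASS_MEAN_MM["silt"])
        + cla * math.log(CLASS_MEAN_MM["clay"])
    )
    return math.exp(0.01 * weighted_logs)


def dg_k(dg_mm):
    """Dg-based K; only the geometric mean particle diameter is needed."""
    z = (math.log10(dg_mm) + 1.659) / 0.7101
    return 7.594 * (0.0034 + 0.0405 * math.exp(-0.5 * z ** 2))


# Illustrative (made-up) loam: 40 % sand, 40 % silt, 20 % clay, 1 % organic carbon.
k_epic = epic_k(40.0, 40.0, 20.0, 1.0)
k_dg = dg_k(geometric_mean_diameter(40.0, 40.0, 20.0))
k_epic_si = k_epic * US_TO_SI
```

Perturbing the texture fractions by a few percent and re-running both functions is a quick way to see the sensitivity of the Dg-based estimate that the text describes.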

Dataset
The soil map of the Ethiopian Rift Valley Lake Basin is available from the Ethiopian Ministry of Water, Irrigation and Electricity (MoWIE). To compute K with the EPIC-K method, complete inherent soil properties such as soil texture and soil OCC must be available. Hence, the physical and chemical properties of the soils of the Ethiopian Rift Valley Lake Basin were collected from the field and from MoWIE. In the master plan study of the Ethiopian Rift Valley Lake Basin (ERVLB), soil samples were collected from the major soil types of the basin and analysed; the sampled points are shown in Figure 3. The soil samples were collected from 0.6 m × 0.6 m pits with a depth of 200 cm, and separate samples were collected at depths of 30, 85, 150 and 200 cm. The analyses of soil physical and chemical characteristics were conducted in the Ethiopia Water Works Design and Supervision Enterprise Laboratory. The physical parameters included texture, effective soil depth and soil drainage. The chemical parameters were pH, electrical conductivity, cation exchange capacity, base saturation percentage, exchangeable bases (calcium, magnesium, potassium and sodium), sodium percentage, calcium to magnesium ratio, potassium to magnesium ratio, organic carbon or organic matter, total nitrogen ratio, available phosphorus and calcium carbonate. To minimize error propagation during soil sampling and testing, more than three samples were taken from each soil class. The laboratory results were checked for outliers, and none were detected.

Fig. 3 Soil sampling locations in the Rift Valley Lake Basin
As shown in Figure 3 (2010 sample pit locations), no soil samples were taken from the Leptosol-dominated locations of the basin. In the basin, Leptosol covers around 2.07×10⁵ hm² (3.9% of the total basin). Hence, additional soil tests were carried out in this study to investigate the physical properties of those soils. Soil samples were collected from two Leptosol-dominated parts of the basin (Figure 3, 2018 sample points). The samples were collected from 0.7 m × 0.7 m soil pits at a depth of 150 cm.
To determine their physical parameters, particle size analysis was performed using the hydrometer method, following the laboratory guideline procedures recommended by [35]. Similarly, the bulk density of the soils was calculated from their 24-hr oven-dry weights.

Developing an alternative approach (KET) to estimate K rates of ERVLB
The mathematical expression of KET is based on the natural behaviour of soil erosion in different soil classes. Soils with a high silt content tend to have high erodibility, whereas erodibility is low for clay-rich soils, because clay particles mass together into larger aggregates that resist detachment and transportation. Soils containing large proportions of sand have relatively large pores through which water can drain freely, so they are at less risk of producing runoff. Similarly, soil bulk density depends on soil texture and soil organic matter content. As reported by [36], soil erosion decreases as bulk density increases.
Hence, to develop an alternative K (KET) estimation model for the basin, we considered the relation of K to texture and bulk density (BD): K is directly proportional to the percentage of silt but inversely proportional to the clay and sand contents and to bulk density. From these direct and inverse proportionalities, we developed an algebraic expression for K (Eq. 5). To compare this expression with the original K values, we calculated the K values of the basin using the K factor developed for the EPIC model by [16] and statistically compared them with the ratios of the major soil texture fractions and BD (Table 1). As shown in Table 1, the ratio of silt to sand and clay combined with bulk density gives a higher correlation than the textural ratio alone. Hence, the alternative K estimation method is formulated from the ratio of silt to total sand and clay together with bulk density. Taking this ratio as the independent variable and the K values determined by the EPIC-K method of [16] as the dependent variable, a nonlinear regression equation was established (Eq. 6), where KET is the newly proposed alternative K factor (ton acre hour (hundreds of acre)⁻¹ foot⁻¹ tonf⁻¹ inch⁻¹); % Silt, % Sand and % Clay are the percentages of silt, sand and clay in the soil, respectively; BD is the bulk density of the soil (g cm⁻³); and β and α are coefficients determined through a regression and optimization procedure. K also depends on various other soil properties such as shear strength, infiltration capacity and permeability, but these soil parameters are directly related to soil bulk density. Investigations by [32] and [33] showed that an increase in soil bulk density not only changes the pore-size distribution but also affects the ability of the soil to shrink and to conduct water under unsaturated conditions. Similarly, the shear strength of surface soil is affected by soil bulk density: the authors of [37][38][39] found a direct relationship between soil bulk density and soil shear strength. The newly proposed alternative K factor therefore includes soil bulk density as a parameter that can reflect the effects of these other soil properties.
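The regression step described above can be illustrated with a short Python sketch. Everything specific here is an assumption made for illustration: the predictor is taken as x = %Silt / ((%Sand + %Clay) × BD), the model is assumed to be a power law K = β·x^α, the coefficients are fitted by ordinary least squares on log-transformed values (one simple stand-in for the "regression and optimization procedure"), and the data are synthetic values generated from known coefficients, not measurements or results from this study.

```python
import math


def ket_predictor(silt, sand, clay, bd):
    """Assumed predictor: silt / ((sand + clay) * bulk density)."""
    return silt / ((sand + clay) * bd)


def fit_power_law(xs, ks):
    """Fit K = beta * x**alpha by linear least squares in log-log space."""
    log_x = [math.log(x) for x in xs]
    log_k = [math.log(k) for k in ks]
    n = len(log_x)
    mean_x = sum(log_x) / n
    mean_k = sum(log_k) / n
    sxx = sum((lx - mean_x) ** 2 for lx in log_x)
    sxk = sum((lx - mean_x) * (lk - mean_k) for lx, lk in zip(log_x, log_k))
    alpha = sxk / sxx
    beta = math.exp(mean_k - alpha * mean_x)
    return beta, alpha


# Synthetic check: generate K from known (hypothetical) coefficients and
# confirm that the fit recovers them.
true_beta, true_alpha = 0.3, 0.4
xs = [0.2, 0.5, 1.0, 2.0]
ks = [true_beta * x ** true_alpha for x in xs]
beta, alpha = fit_power_law(xs, ks)
```

In practice the dependent values would be the EPIC-K estimates for each sampled soil, and a nonlinear optimizer could replace the log-log fit if the residuals should be minimized in the original K units.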

Model Evaluation and Statistical Analyses
The newly derived model was evaluated and tested statistically using the coefficient of determination (R²), the Nash-Sutcliffe efficiency (NSE), the ratio of the root mean square error to the standard deviation of the observations (RSR) and the percent bias (PBIAS). The applicability of the newly developed K model was verified for all soils of the country using the digital soil map of Ethiopia (Figure 2) obtained from the Federal Ministry of Water, Irrigation and Electricity (FMoWIE).
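For reference, the evaluation statistics named above can be computed as in the following Python sketch. It uses the definitions that are standard in the hydrological modelling literature, which is an assumption about the exact formulas used in this study; here the "observed" series stands for the EPIC-K values and the "simulated" series for the KET values. The relative RMSE (RMSE divided by the observed mean) is one common definition of RRMSE and is included as an assumption. Function and key names are hypothetical.

```python
import math


def evaluate(obs, sim):
    """Return R2, NSE, RMSE, RRMSE, RSR and PBIAS for paired series."""
    n = len(obs)
    mean_obs = sum(obs) / n
    mean_sim = sum(sim) / n
    ss_res = sum((o - s) ** 2 for o, s in zip(obs, sim))
    ss_obs = sum((o - mean_obs) ** 2 for o in obs)
    ss_sim = sum((s - mean_sim) ** 2 for s in sim)
    cov = sum((o - mean_obs) * (s - mean_sim) for o, s in zip(obs, sim))
    rmse = math.sqrt(ss_res / n)
    return {
        "R2": cov ** 2 / (ss_obs * ss_sim),    # squared Pearson correlation
        "NSE": 1.0 - ss_res / ss_obs,          # 1 is a perfect fit
        "RMSE": rmse,
        "RRMSE": rmse / mean_obs,              # assumed definition
        "RSR": math.sqrt(ss_res) / math.sqrt(ss_obs),
        # Positive PBIAS means the simulated values underestimate on average.
        "PBIAS": 100.0 * sum(o - s for o, s in zip(obs, sim)) / sum(obs),
    }


# Made-up example in which the simulated values are 10 % too low.
stats = evaluate([0.2, 0.3, 0.4], [0.18, 0.27, 0.36])
```

The made-up example shows why R² alone is not sufficient: a systematic 10% underestimate still gives a perfect correlation, while NSE, RSR and PBIAS all register the bias.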

Applicability of FAO world soil database for the study area
Using data from 90 pits, 203 soil samples were used to evaluate the applicability of the FAO world soil map and to generate an alternative K estimation model. The soil fractions measured in the field were compared with the properties in the FAO (1998) soil database (Figure 4). As shown in Figure 4, there is significant variation in soil texture classes for Vitric Andosol (ANz), Eutric Cambisol (Cme), Fluvisol (FL) and Haplic Solonetz (SNh). As shown in Figure 2, within the basin, Vitric Andosol and Fluvisol are found on depressions, alluvial plains, lacustrine plains adjacent to lakes, main streams, meander belts of major rivers and areas subject to annual flooding, which consequently receive fresh sediments from each flood. Hence, for these two soil classes, the sand proportion has decreased and the silt and clay fractions have increased. Similarly, both Cambisol and Solonetz soils are located in low-lying areas of the basin, and their sand fraction is significantly greater than in the FAO database. In low-lying areas, sand is likely to be deposited if the upper catchment is dominated by sandy soils. In general, soils located in the lower sections of the basin deviate from the FAO soil database, which may be due to the deposition of eroded soil. To improve this, samples should be taken from greater depths so that the different soil horizons can be analysed. Although the number of soil samples used for the analysis was small (five for the whole basin), similar results were reported by [29] for soils of the upper Awash basin, Ethiopia.

An alternative equation to determine K values
By mimicking the original K developed for the EPIC model, an alternative K model (KET) was developed for the Ethiopian Rift Valley Lake Basin. The K values obtained with the newly developed method (KET) and with the EPIC-K method are shown in Figure 5. As [40] demonstrated, if the R² value is ≥ 0.9, 0.75 to 0.9, 0.65 to 0.75 or > 0.50, the model performance can be rated as excellent, very good, adequate or satisfactory, respectively. From Figure 5, it can be observed that the performance of the alternative soil erodibility model (KET) is excellent. Additional numerical model performance criteria, namely NSE, RSR, RRMSE and PBIAS, were determined and are summarized in Table 2. According to [40], the high R² and NSE values (above 0.9) and the reasonably low RSR (below 0.5) and PBIAS (below 5) indicate excellent correlation and agreement between the alternative soil erodibility estimation method (KET) and the EPIC method.
Furthermore, the applicability of KET was tested by comparing the soil erodibility values estimated using both methods, KET and EPIC (Figure 6). As shown in Figure 6, the K calculated by KET replicates the entire trend of the values predicted by EPIC. Similarly, the statistical analysis in Table 2 indicates that the alternative erodibility estimation equation (KET) can represent the EPIC-K equation for ERVLB soils. Both the KET and EPIC-K approaches are good at replicating measured K, and it can be concluded that the newly developed KET can be used in ERVLB. The K maps of ERVLB produced by the KET and EPIC-K methods are given in Figure 7.

Evaluation of alternative K estimation method (KET) for soils of Ethiopia
Even though the goal of the study is to develop a K factor for the Ethiopian Rift Valley Lake Basin, its applicability was tested for all soils of the country. The Ethiopian soil map indicates the existence of 56 different soil units in the country. The required soil parameters for these units were extracted from the FAO world soil database. For the evaluation, we excluded the 13 soil units that were used to establish the KET equation. Hence, a total of 43 soil types were considered in the analysis; the newly proposed K estimation model was applied to calculate the K value of each soil type, and the results were compared with the erodibility factors calculated by the EPIC-K method. For these 43 Ethiopian soils, the statistical correlation between the erodibility factors determined using the newly developed K model (KET) and the EPIC-K method is given in Figure 8. Model performance in terms of NSE, RSR, RRMSE and PBIAS is summarized in Table 3. According to the statistical model performance results (Table 3), the performance of the alternative K equation (KET) is excellent for RSR and RRMSE and very good for the coefficient of determination (R²), PBIAS and NSE [40].
Statistically, the alternative erodibility estimation method (KET) is appropriate for all soil units of the country. However, the calculated percent bias (PBIAS) is positive, which indicates that KET underestimates the erodibility value. To identify the groups of soils with a high relative error between the two methods, the relative error of each soil was calculated. Based on the magnitude of the relative error, we classified the soils of Ethiopia (56 soil units) into four classes: class one contains soils with a relative error within ±5%, and classes two, three and four contain soils with relative errors within ±10%, −15% and −20%, respectively.
As shown in Figure 9, the alternative erodibility model predicted K for 35.7% of the country's soils with a relative error within ±5%, for 23.21% with a relative error of ±6 to ±10%, for 19.6% with a relative error of −11 to −15% and for 21.4% with a relative error of −16 to −20%. On average, the alternative erodibility equation (KET) can be applied to more than 58.9% of the country's soils with an average error of ±5.3% and a standard deviation of 3.2. The erodibility factor for the remaining 41.1% of the country's soils was predicted with an error of ±13.31% and a standard deviation of 10.3. Overall, the newly developed K factor has a relative error of −9.88% with a standard deviation of 6.4 for the national soils.
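The grouping by relative error can be reproduced with a few lines of Python. This sketch makes two assumptions that are not stated explicitly in the text: the relative error is computed as 100 × (KET − EPIC-K) / EPIC-K, so that negative values mean KET is lower than EPIC-K, and soils are binned by the magnitude of the error with upper limits of 5, 10, 15 and 20%. The K pairs in the example are made up for illustration, and the function names are hypothetical.

```python
def relative_error(k_ket, k_epic):
    """Relative error (%) of KET against EPIC-K (assumed definition)."""
    return 100.0 * (k_ket - k_epic) / k_epic


def error_class(re_pct):
    """Return class 1-4 by error magnitude, or None above 20 %."""
    magnitude = abs(re_pct)
    for cls, upper in enumerate((5.0, 10.0, 15.0, 20.0), start=1):
        if magnitude <= upper:
            return cls
    return None


def class_shares(pairs):
    """Percentage of soil units falling in each class."""
    counts = {}
    for k_ket, k_epic in pairs:
        cls = error_class(relative_error(k_ket, k_epic))
        counts[cls] = counts.get(cls, 0) + 1
    total = len(pairs)
    return {cls: 100.0 * n / total for cls, n in counts.items()}


# Made-up (KET, EPIC-K) pairs, one per class.
shares = class_shares([(0.29, 0.30), (0.28, 0.30), (0.26, 0.30), (0.25, 0.30)])
```

Applying `class_shares` to the 56 soil units would give the percentages per class directly, and the mean and standard deviation of `relative_error` over the same pairs correspond to the summary figures quoted in the text.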
Fig. 9 Comparison of K factors estimated by the equation of Sharpley and Williams (1990) (EPIC-K) and KET for Ethiopian soils (t ha h/(ha MJ mm)). A) Group One: soils with RE less than ±5%; B) Group Two: soils with RE less than ±10%; C) Group Three: soils with RE less than −15%; D) Group Four: soils with RE less than −20%, where RE is the relative error

The result obtained in this study may reflect the country's K rate. In most erodibility studies, the existing K models overestimate K when compared with field-measured data. For example, [41] reported that the EPIC model overestimates K by 51.1% in the purple soil region and by 18.5% in the black soil regions of China. Authors such as [4], [21] and [22] evaluated the performance of different K models and reported results similar to those of [41]. Therefore, the prediction error obtained from KET is within an acceptable range. To improve its overall performance, it is advisable to check it against observed data from natural runoff plots. Based on the developed alternative erodibility equation (KET), the K map of the country was produced (Figure 10).

Conclusion
The alternative K estimation approach, the KET model, was developed for the different soil classes in Ethiopia, and the new K factor was compared with the values from the EPIC model and with field data. Soil texture ratio and bulk density data were used to find the correlation with K values. Using the field sample data from the soil survey, the equation was developed for the ERVLB soils as well as for other soil types in Ethiopia. The results obtained using KET were compared with those of the EPIC-K method, and the model performance statistics of their relation were found to be excellent.
Moreover, the applicability of the KET model was evaluated for all soil units of Ethiopia by extracting the required soil physical parameters from the FAO world soil database, with the original K estimation equation of the EPIC-K model as the reference. As a result, KET predicts K values for 35.7% of the country's soils with a relative error within ±5%. On average, the KET method can be applied to more than 58.9% of the country's soils with an average error of ±5.3% and a standard deviation of 3.2. For the remaining 41.1% of the country's soils, K values were predicted with an error of ±13.31% and a standard deviation of 10.3. The K factor estimated with the equation proposed in this study is reliable and can be used for ERVLB soils as well as for other soils within the country.
Lastly, compared with the K factor developed for the EPIC model, KET does not require the organic carbon content (OCC) and is based on soil properties that can be assessed easily. The soil properties required for KET can be obtained in a laboratory through dry and wet sieve analysis and oven-drying techniques. Therefore, the newly developed alternative K estimation model (KET) can reduce the time and cost of collecting large soil data sets from the field. Additionally, it minimizes the propagation of error from the input data set, as it relies on a small number of easily measurable soil parameters.

Fig. 1 Location of the Ethiopian Rift Valley Lake Basin (B) within the country's river basins (A)

Fig. 6 K estimated by EPIC-K vs. KET for ERVLB soils

Fig. 7 K map of ERVLB by EPIC-K (a) and KET (b)

Fig. 8 K values determined by the KET and Sharpley and Williams (EPIC-K) methods

Fig. 10 K map of Ethiopia estimated by KET

Table 1
Pearson correlations between ratios of soil parameters and calculated K values
** Correlation is significant at the 0.01 level (2-tailed). BD is bulk density (g cm⁻³)

Table 2
Statistics values for model performance

Table 3
Statistics values for model performance