The methodological framework of this research is illustrated in Fig. 1, which mainly includes the selection strategy of gully-controlling factors, sampling strategy of gully samples, and construction of the data-involvement scenario-based sieve of performance assessment for the selection of best-performing modeling instance.
Study area
The study area is situated in the northeastern part of Iran, in Golestan Province (Fig. 2). The area extends for about 784 km2. The maximum and minimum elevation ranges from 160 to 1490 m.a.s.l. A temperate Mediterranean climate is prevailed in the area, with the minimum and maximum precipitation of 346 and 610 mm. The most dominant land use/cover corresponds to agricultures (45%), followed by rangelands (38%), forests (16%), and residential areas (1%) (Table 1). The formation symbolized as Qsw, formed of grey to black shale and thin layers of siltstone and sandstone and Cenozoic in age, is the main geological outcrop in the area, accounting for 73% portion of the region. Table 2 details the areal extent of the geological formations in the study area.
Table 1
Land use classes in the study area
Land use
|
Area (he)
|
Area (%)
|
Forest
|
12513.04
|
15.95
|
Residential Areas
|
498.6
|
0.64
|
Rangelands
|
29858.8
|
38.07
|
Agricultural
|
35568.2
|
45.35
|
Table 2
Description of the geological formations in the study area
Geo unit
|
Description
|
Age
|
Area (ha)
|
Area (%)
|
Qm
|
Swamp
|
Cenozoic
|
2269.48
|
2.88
|
Qsw
|
Grey to black shale and thin layers of siltstone and sandstone
|
Cenozoic
|
58087.92
|
73.73
|
Ksn
|
Ammonite bearing shale with interaction of limestone
|
Mesozoic
|
9886.63
|
12.55
|
Ksr
|
Grey thick - bedded limestone and dolomite
|
Mesozoic
|
4906.5
|
6.23
|
Jmz
|
Olive - green shale and sandstone
|
Mesozoic
|
1857.68
|
2.36
|
Ekh
|
Swamp
|
Cenozoic
|
1772.73
|
2.25
|
Data construction phase
The main data architecture required for spatial modeling of any natural phenomenon revolves around two main datasets: 1) evidence samples and 2) causative factors. Gully occurrences across the study area signify the evidence samples. Among different gully positioning and sampling techniques, gully headcuts well represent the development activity of gullies, according to the consensus in the literature (e.g., Hosseinalizadeh et al. 2019; Kariminejad et al. 2019). Hence, 803 gully headcuts were considered and put into the modeling process. As a rule of thumb for large samples, a 70:30% partitioning balance was considered to split the gully samples into two sets of training and validation, respectively (Fig. 2). Training samples were put directly into the spatial modeling phase, while the validation samples were kept apart for validating the model's results.
For the causative factors, we strived to involve the most representative gully-controlling factors, among which anthropogenic (road density, distance to roads), hydrological (Height Above the Nearest Drainage, stream density, distance to streams, and Topographical Wetness Index), environmental (soil texture and land use), pedological (bulk density, lithology, clay, sand, and silt), and topographical (elevation, slope, aspect, general curvature, plan curvature, profile curvature, Relative Slope Position, LS factor, Topographical Position Index, and valley depth) drivers were selected. Factor selection strategy consisted of three phases: 1) using each causative category separately for the modeling phase, 2) using the best nominees from different categories, highly contributing to gain the best model performance, and 3) using all factors collectively. Consequently, the best contribution of the factors was identified and used, together with the training gully samples, to build the gully spatial model. The causative role of each controlling factor and their spatial scale are provided in detail in Table 3. Thematic maps of the causative factors are presented in Fig. 3.
Table 3
Causative role and spatial scale of gully-controlling factors
Conditioning Factors
|
Causative role
|
Scale
|
Anthropogenic
|
Human interventions such as unplanned road construction with poor foundations, impeding the hydrological cycle and drainage system, artificial accumulated water alongside the road and predisposing the soil
|
1:25,000
|
Hydrological
|
Soil dissolution and dispersion and retrogressive headcut development due to concentrated flow
|
1:25,000
|
Environmental
|
General soil susceptibility in terms of runoff generation (hydrological groups) and resistivity to gully propagation, land covers and their protective roles on the soil surface
|
1:100,000
|
Pedological
|
Soil erosion resistance and shear stress
|
1:100,000
|
Topographical
|
Elevation and proxy to different hydro-erosional mechanisms
|
1:25,000
|
Varian Inflation Factor (VIF) and its reciprocal index, Tolerance, were calculated to test the multicollinearity among the data and to avoid factors with strong correlations and the emanated bias in the model's result (Miles 2014). According to the VIF values reported in the literature (e.g., Malik et al. 2020), VIF values below 5 are acceptable (i.e., a Tolerance value above 0.2), allowing factors with the acceptable VIF range to be included in the modeling process.
Modeling phase
Different presence-absence and presence-only machine learning models have been used to reveal the hidden patterns of natural phenomena; each simultaneously offers different potentials and suffers from specific shortcomings. Generally, the main pitfall of presence-absence models is attributed to the selection of absence samples. In essence, absence location should identify areas with no evident activity or existence of the phenomenon. Antithetically, different types of gully erosion can inconspicuously develop in subterranean regions due to piping and soil dissolution, which may exhibit no evident symptoms unless the overlying strata are collapsed. Hence, selecting the absence locations may be biased by misrepresenting the actual absence locations and diluting to areas where the gully erosion is well developed but has not yet morphologically emerged. However, presence-absence models can better distinguish the presence pattern from absence, and their results tend to be more precise, leading to more compendious mitigation plans. On the other hand, presence-only models operate only based on the presence samples, although they consider some pseudo-absence locations called background points to obviate their deficiency on differentiating the presence: absence pattern. Overall, according to the phenomenal behavior of gully initiation and development mechanism, we used Maximum Entropy (MaxEnt hereafter). Many studies have been conducted on the application of MaxEnt in natural hazard modeling and attested to its capability and tremendous performance (e.g., Arabameri et al. 2020; Javidan et al. 2021).
MaxEnt is a generalized presence-only model that processes gully occurrence locations and different categorical and continuous thematic maps, treated as gully-controlling factors, irrespective of their diverse spatial scale. In contrast to deterministic models that operate on a phenomenon's prevalence, MaxEnt uses Bayes' theorem to circumvent any assumption on the prevalence value. Commencing with a random walk into the parameter space and assuming that all the pixels have the same probability of occurrence, MaxEnt further tries to find distinct patterns between the presence and pseudo-absence background locations. By doing so, through hundreds of modeling iterations, MaxEnt reaches to a probability distribution as similar as possible to the target probability distribution of the gully occurrence. During this process, all the thematic maps are rendered into their arithmetic replicas, known as features. More details on the mathematical background of MaxEnt can be found in Phillips et al. (2004, 2006, 2009), Elith et al. (2011), and Kornejady et al. (2017). Maximum entropy model was executed in MaxEnt software, considering 10− 5 as convergence threshold, 500 modeling iterations, and 10,000 background points.
Performance assessment
Performance metrics can be employed to congruently test the goodness-of-fit (using training dataset) and prediction power/generalization capacity (using validation dataset). These metrics are calculated from a 2×2 prediction versus reality comparison table called confusion matrix. As for cutoff-independent metrics, the Receiver Operating Characteristic (ROC) curve is a widely used index for the performance assessment of spatial models. It plots the false positive ratio (so-called 1–Specificity, describing the actual non-gully areas that are incorrectly predicted as gully by the model) on the x-axis against the true positive ratio (so-called sensitivity, describing the correct prediction of gully-affected areas as observed in the study area) on the y-axis. The area under the ROC curve (AUROC) represents the degree to which the model has successfully trained and predicted, depending on the input sample dataset (training or validation). The latter ranges from 0.5 to 1, where 0.5 represents a model acting out of random chance while AUROC of 1 signifies a model with perfect performance. Plotting ROC curves were carried out in MaxEnt software in the form of jackknife plots.