2.1. Study Area
Golestan Province, Iran, covers an area of about 2,037,809 ha (Fig. 1). The dense forests and rangelands in this semi-arid area have long been conducive to wildfire. The Province is known to be one of the most wildfire-prone regions of Iran (Golestan Natural Resources Administration 2018).
The dominant tree species in the forests are beech (Fagus orientalis Lipsky), alder (Alnus subcordata C.A.Mey.), Caucasian oak (Quercus castaneifolia C.A.Mey.), eastern hornbeam (Carpinus betulus L.), yew (Taxus baccata L.), common juniper (Juniperus communis L.), cypress (Cupressus sempervirens L.), and ironwood (Parrotia persica (DC.) C.A.Mey.). Dominant herbaceous species include Achillea millefolium, Hypericum androsaemum, Echium amoenum, Ruscus hyrcanus Woron., Rubus sp., and Cyclamen sp. (Mozaffarian 2007).
An accurate fire susceptibility map is vital for fire prevention, mitigation, and response in fire-prone areas (Tehrany et al. 2018; Eskandari et al. 2020). Selecting the factors that are the most important predictors of fire susceptibility is crucial to producing an accurate and reliable map. In this study, a DEM was used to determine elevation, slope angle, topographic wetness index (TWI), and plan curvature, and the distances to roads, villages, and rivers were also determined, at each of the locations of previous fires. These effective topographic and anthropogenic factors for fire susceptibility were identified from the literature (Pourtaghi et al. 2016; Pourghasemi et al. 2016; Eskandari and Miesel 2017; Eskandari et al. 2020) and from conditions in the study area.
To account for climatic factors, data on annual mean rainfall, annual mean temperature, and wind were acquired; these factors have been shown to influence wildfire regimes (Barbero et al. 2014; Jolly 2014; Jolly et al. 2015; Vitolo et al. 2019). High temperature and low precipitation have been reported to generally increase fire danger (van Bellen et al. 2010; Eskandari 2015). The role of wind in promoting and spreading fire has also been demonstrated (Jolly 2014; Field et al. 2015; Pourghasemi et al. 2016).
The DEM of Golestan Province was generated from an ASTER-GDEM (30 m resolution) available from the USGS (https://earthexplorer.usgs.gov) (Fig. 2). Slope angle was calculated from the DEM. TWI is a secondary DEM feature obtained from the 30 m resolution DEM (Beven and Kirkby 1979):
$$\mathrm{TWI} = \ln\left(\frac{\alpha}{\tan\beta}\right)$$
where α is the cumulative upslope area draining through a point and β is the slope angle at that point. TWI was expected to be an important fire-promotion factor. The wind-effect map was constructed in SAGA GIS (http://saga.sourcearchive.com/documentation/2.0.7pluspdfsg2/wind__effect_8cpp_source.html) from three variables: DEM, wind direction (degrees), and wind speed (m/s) (Pourghasemi et al. 2016). The plan curvature map, also a secondary DEM feature, was generated in ArcGIS 10.6.1.
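As an illustration only, the standard Beven and Kirkby form of the index, TWI = ln(α / tan β), can be sketched for a single grid cell as follows. The function name, the degree-based slope input, and the small floor on tan β for flat cells are assumptions made for this sketch, not details taken from the study's GIS workflow.

```python
import math

def twi(upslope_area, slope_deg, min_tan=1e-6):
    """Topographic wetness index ln(alpha / tan(beta)) for one cell.

    upslope_area: cumulative upslope area alpha draining through the cell (> 0)
    slope_deg:    local slope angle beta, in degrees
    min_tan:      hypothetical floor on tan(beta) so flat cells do not divide by zero
    """
    tan_beta = max(math.tan(math.radians(slope_deg)), min_tan)
    return math.log(upslope_area / tan_beta)

# Hypothetical cells with the same upslope area but different slopes
gentle = twi(100.0, 5.0)
steep = twi(100.0, 30.0)
```

For a fixed upslope area, gentler slopes give higher index values, which is why the index is read as a proxy for where water tends to accumulate.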
The locations of roads, rivers, and villages in Golestan Province were extracted from 1:25,000-scale topographical maps. Distances to roads, villages, and rivers were then determined in ArcGIS 10.6.1. Annual mean rainfall and annual mean temperature maps were acquired from the Golestan Meteorological Administration. Maps of each of these effective factors for fire susceptibility are shown in Fig. 3.
2.3.1. Fire Occurrence Detection
For fire susceptibility modeling, data on actual fires in the study area are required. Records of all fires that occurred in Golestan Province from 2002 to 2017 were obtained from a MODIS fire product. MODIS hotspots have been used by many researchers for fire occurrence mapping (Chuvieco et al. 2008; Vadrevu et al. 2010; Eskandari et al. 2015b; Eskandari and Chuvieco 2015; Jolly et al. 2019; Adelabu et al. 2020; Eskandari et al. 2020). In this study, all MODIS fire products for Golestan were obtained from NASA (https://modis.gsfc.nasa.gov/data/). HDFView software (http://hdfeos.org/software/heg.php) was used to detect the fire pixels (HDF-EOS to GeoTIFF Conversion Tool (HEG) 2017). The fire products were imported into HDFView and the positions of the fire pixels were detected. A map of the pixels that represented past fires was constructed in GIS. The fire pixels were divided randomly into two groups: 70% for training and 30% for validation of the fire susceptibility modeling results (Fig. 1b).
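The random 70/30 partition of fire pixels described above could be sketched as follows. The coordinates, the fixed seed, and the function name are hypothetical; the study does not report how the random draw was implemented.

```python
import random

def split_fire_pixels(pixels, train_fraction=0.7, seed=42):
    """Randomly split fire pixels into training and validation subsets."""
    shuffled = list(pixels)                # copy so the input order is untouched
    random.Random(seed).shuffle(shuffled)  # seeded for a reproducible split
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

# Hypothetical fire-pixel coordinates (longitude, latitude)
pixels = [(54.0 + 0.01 * i, 36.5 + 0.01 * i) for i in range(100)]
train, valid = split_fire_pixels(pixels)
```

Fixing the seed is a design choice for reproducibility: the same pixels land in the training and validation sets on every run.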
2.3.2. Importance of the Effective Factors for Fire Susceptibility Mapping
Selecting the variables that are the most important predictors of fire location is essential for producing reliable maps. In this research, the importance of the effective variables for fire susceptibility mapping was determined with the random forest (RF) algorithm (Leuenberger et al. 2013; Guo et al. 2017; Song et al. 2017). A multi-collinearity test was applied to the effective factors to remove highly collinear variables (Hsiao 2014; Daoud 2017). The multi-collinearity test is frequently used to detect strong linear correlation among the independent (predictor) variables used to model the response variable (Daoud 2017), which in this study is fire susceptibility.
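The text does not state which multi-collinearity statistic was used. One common choice is the variance inflation factor (VIF), which equals the diagonal of the inverse of the predictors' correlation matrix. The sketch below assumes that statistic; the sample values and the cutoff of 10 are hypothetical rules of thumb, not figures from this study.

```python
def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5

def invert(matrix):
    """Gauss-Jordan matrix inverse with partial pivoting."""
    n = len(matrix)
    aug = [list(row) + [float(i == j) for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("perfectly collinear predictors")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vif(columns):
    """VIF per predictor: diagonal of the inverse correlation matrix."""
    names = list(columns)
    corr = [[pearson(columns[a], columns[b]) for b in names] for a in names]
    inv = invert(corr)
    return {name: inv[i][i] for i, name in enumerate(names)}

# Hypothetical samples: temperature falls almost linearly with elevation
factors = {
    "elevation": [100, 300, 500, 800, 1200, 1500, 2000, 2500],
    "temperature": [18.1, 17.0, 15.8, 14.3, 11.9, 10.2, 7.4, 4.6],
    "slope": [5, 22, 9, 30, 12, 3, 25, 15],
}
scores = vif(factors)
flagged = [name for name, v in scores.items() if v > 10]  # assumed cutoff
```

A predictor flagged this way is largely explained by the others, so one member of each collinear group would typically be dropped before modeling.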
2.3.3. Fire Susceptibility Mapping by New and Ensemble Data Mining Models
Three individual data mining models (GAM, MARS, and SVM) and one ensemble model (GAM-MARS-SVM) were used to map fire susceptibility. The fire susceptibility maps were created in R 3.3.3. The GAM-MARS-SVM is a new combined model that is used here for fire susceptibility mapping for the first time. The data mining algorithms used in this study are explained below.
2.3.3.1. Generalized additive model (GAM): The GAM algorithm has been used to model danger and susceptibility for different phenomena. In this study, the GAM, a semi-parametric extension of the generalized linear model, is used to relate the fire effective factors to fire occurrence (Hastie and Tibshirani 1990). Because the GAM assesses each predictor's partial response curve with a non-parametric smoothing function instead of a parametric function, it captures the potential statistical relationships between fire occurrence and the effective factors and yields the spatial patterns (Pourghasemi and Rahmati 2018). In the current research, the “GRASP” (Generalized Regression Analysis and Spatial Prediction) package, developed by Lehmann et al. (2002), was used to run the GAM algorithm in R 2.0.7.
2.3.3.2. Multivariate adaptive regression spline (MARS): The MARS algorithm is used to assess the relationships between input variables (effective factors on fire) and output variables (fire susceptibility potential) (Friedman 1991). It merges three techniques: mathematical spline construction, binary recursive partitioning (BRP), and linear regression (LR) (Friedman 1991). The resulting algorithm defines the relationship of the response variable to the effective variables as either linear or non-linear (Hastie et al. 2001; Naghibi et al. 2018). In this study, the MARS algorithm was run with the “Earth” package (Milborrow et al. 2019) in R 3.0.2.
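To make the spline construction concrete, MARS builds its fit from hinge basis functions of the form max(0, x - t) and max(0, t - x) around a knot t (Friedman 1991). The following single-predictor sketch uses a hypothetical knot and hypothetical coefficients; it does not reproduce the model fitted in this study.

```python
def hinge(x, knot, direction=1):
    """Hinge basis: max(0, x - knot) for direction=1, max(0, knot - x) for -1."""
    return max(0.0, direction * (x - knot))

def mars_predict(x, intercept, terms):
    """Evaluate a one-predictor MARS-style model.

    terms: list of (coefficient, knot, direction) tuples, all hypothetical here.
    """
    return intercept + sum(c * hinge(x, k, d) for c, k, d in terms)

# Hypothetical response to slope angle with a single knot at 20 degrees
terms = [(0.03, 20.0, 1), (-0.01, 20.0, -1)]
at_knot = mars_predict(20.0, 0.2, terms)
above = mars_predict(30.0, 0.2, terms)
below = mars_predict(10.0, 0.2, terms)
```

Each hinge is zero on one side of its knot, so the fitted response is piecewise linear and can change slope at every knot, which is how the method represents non-linear relationships.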
2.3.3.3. Support vector machine (SVM): The SVM algorithm is a non-linear, binary classification method that aims to determine the thresholds that divide a training sample into predefined classes. The optimum separation minimizes the misclassifications that usually occur during training (Mountrakis et al. 2011). Traditional machine-learning algorithms usually attempt to limit empirical training errors and tend to overfit (Vapnik and Vapnik 1998; Xie 2006). The main benefit of SVMs is their ability to convert models and solve non-linear classification problems caused by a lack of prior knowledge of the modeling conditions. For this study, fire susceptibility modeling using SVM was performed with the “Kernlab” package (Karatzoglou et al. 2004) in R 3.0.2.
2.3.3.4. GAM-MARS-SVM: This new ensemble machine learning/data mining technique is applied here for the first time. The three well-known algorithms were combined according to their AUC values to map fire susceptibility. This process was carried out in R 3.5.1.
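The text says only that the three models were combined according to their AUC values, without giving the combination rule. One plausible reading is an AUC-weighted average of the three susceptibility predictions, sketched below. The weighting scheme, the AUC values, and the per-pixel predictions are all assumptions for illustration.

```python
def auc_weighted_ensemble(predictions, aucs):
    """Average per-pixel predictions with weights proportional to each model's AUC."""
    total = sum(aucs[m] for m in predictions)
    weights = {m: aucs[m] / total for m in predictions}  # weights sum to 1
    n_pixels = len(next(iter(predictions.values())))
    return [
        sum(weights[m] * predictions[m][i] for m in predictions)
        for i in range(n_pixels)
    ]

# Hypothetical susceptibility scores for three pixels and hypothetical AUCs
predictions = {
    "GAM": [0.2, 0.8, 0.5],
    "MARS": [0.4, 0.6, 0.5],
    "SVM": [0.3, 0.9, 0.7],
}
aucs = {"GAM": 0.80, "MARS": 0.85, "SVM": 0.90}
ensemble = auc_weighted_ensemble(predictions, aucs)
```

Because the weights are normalized, each ensemble value stays between the lowest and highest individual prediction for that pixel, with better-validated models pulling it harder.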
2.3.4. Validation of the Fire Susceptibility Maps
The fire susceptibility maps obtained from the data mining models were validated with AUC values extracted from the receiver operating characteristic (ROC) curve, a technique that is widely used to assess the accuracy of classified maps generated by algorithms (Mas et al. 2013). An area of 1 represents perfect classification, whereas an area of 0.5 or less represents a worthless result (Yesilnacar 2005).
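As a minimal sketch of this metric, the area under the ROC curve equals the probability that a randomly chosen fire location receives a higher predicted susceptibility than a randomly chosen non-fire location, with ties counted as one half. The labels and scores below are hypothetical, and the use of non-fire reference points is an assumption, since the sampling of such points is not described here.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive (1) and negative (0) scores."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one fire and one non-fire sample")
    # A positive scored above a negative counts 1, a tie counts 0.5
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical validation samples: 1 = fire pixel, 0 = non-fire pixel
labels = [1, 0, 1, 0]
scores = [0.9, 0.8, 0.7, 0.1]
auc = roc_auc(labels, scores)
```

The pairwise form is quadratic in the sample size, which is acceptable for a small validation set; a rank-based computation would be the usual choice for large rasters.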