Zoning forest fire risk at a county-level based on geographically weighted logistic regression: a case study of Wuyishan, China

The county is a key spatial unit for local forest resource protection and fire prevention, so identifying fire risk areas at the county level is essential. However, current methods and standards focus on rating risk levels at a national scale and are not necessarily applicable at a local level. Wuyishan is a county-level city in southeastern China that is rich in forest resources and important for biodiversity conservation. We used a binary logistic regression (BLR) model and a geographically weighted logistic regression (GWLR) model to examine the indicators of forest fire occurrence and map the forest fire risk zones in the study area based on historical fire survey data from 1999 to 2013. The BLR model showed that four indicators (daily average relative humidity, daily sunshine hours, elevation, and distance to the closest railway) had a significant impact on the risk of forest fires in Wuyishan City. Daily sunshine hours were positively correlated with forest fire risk, and the other three factors were negatively correlated. The GWLR model incorporated the spatial heterogeneity of indicators into the simulation and further demonstrated that only daily average relative humidity was significantly correlated with fire occurrence over the entire study area. In contrast, daily sunshine hours, elevation, and distance to the closest railway were effective indicators of fire risk only at a local level. The prediction accuracy of the GWLR model (85.3%) was slightly higher than that of the BLR model (84.4%). Around 19.9% of the study area was in a high fire risk zone, 34.0% was in a medium-risk zone, and 46.1% was in a low-risk zone. The high-risk zones were mainly concentrated in the central and southern areas. Our results indicate that, during the fire prevention period, the forest fire management department needs to increase the frequency of daily inspections of the forest edge areas in the high- and medium-risk areas based on the fire risk zoning map. Our approach may improve the identification of forest fire risk and fire prevention and suppression management at the county level in mountainous and hilly areas.


Introduction
Forest fires are becoming more serious and frequent around the world under the influence of global climate change and human disturbance (Harvey, 2016; Yang et al., 2018; Williams et al., 2019). Fire risk zoning is an essential element in planning for the protection of forested areas. Owing to the complexity and uncertainty of the factors influencing forest fires at different scales, the precise zoning of forest fire risk levels plays a key role in the monitoring and effective prevention and control of forest fires.
Some countries rate fire risk zones at a national scale based on their own forest fire science research and the characteristics of their forest resources. For example, the Canadian and US forest fire risk rating systems are applied internationally and are well recognized (Wang, 2018). In China, forest fire risk zoning at the national scale is mainly carried out with reference to a national standard that uses a regionalized rank based on the nationwide forest fire risk. In this standard, six indicators (the tree species or group burning type, population density, monthly average precipitation during the fire prevention period, monthly average temperature during the fire prevention period, monthly average wind speed during the fire prevention period, and density of the road grid) are used to rate the fire risk level (high, moderate, or low) of each county across the country (State Forestry Administration, 2018). However, this method cannot distinguish the spatial heterogeneity of forest fire risk at the sub-county level.
The county is the basic administrative unit in China. As of 2016, China had a total of 2851 county-level administrative divisions, which accounted for more than 90% of the total land area and nearly 80% of the country's total population (Ministry of Civil Affairs, 2017). The ecological processes and ecosystem services at the county level are constrained by large-scale regions. Therefore, the county is an important spatial unit for forest fire risk zoning and for the implementation of forest resource protection, biodiversity conservation, and forest fire prevention at the national scale. A large number of counties in southern China are located in mountainous and hilly areas, where forest fires are mainly characterized by high frequency but low intensity (Ying et al., 2018). However, the six indicators mentioned above in the National Forest Fire Risk Zoning Level do not consider the role of terrain factors in the zoning. In the mountainous and hilly areas of southern China, it is necessary to identify the spatial heterogeneity of fire risk factors for the daily monitoring and prevention of forest fires.
Wuyishan, a county-level city under the jurisdiction of Fujian Province, China, is the main location of Mount Wuyi, a UNESCO World Cultural and Natural Heritage Site, and of Wuyishan National Park (one of China's first ten national park pilot areas). Mount Wuyi contains the largest, most representative examples of mid-subtropical forest and rainforest in southern China. Of enormous importance for biodiversity conservation, this almost intact site acts as a refuge for a number of ancient relictual plant species, many of which are endemic to China, and also contains an extremely rich suite of flora and fauna (Song, 2020). Although Wuyishan is one of the key places for biodiversity conservation in southeastern China (Lin et al., 2018), it is also part of a high forest fire risk area in Fujian Province.
Thus, this study aimed to: (1) examine the spatial heterogeneity of forest fire drivers in Wuyishan; and (2) explore a new approach for fire risk zoning at the county level in mountainous locations. The classification of forest, the identification of indicators, and the mapping of fire risk areas can provide a reference for the precise spatial planning and efficient implementation of biodiversity and forest fire protection in our study site.

Study area
Wuyishan City is located in the north of Fujian Province, China, at longitude 117°24ʹ12ʺ-118°02ʹ50ʺ E and latitude 27°32ʹ36ʺ-27°55ʹ15ʺ N (Fig. 1). The land area is 2798 km² and there are 3 towns, 3 streets, and 4 villages. Tourism and the tea industry are the main economic pillars (You et al., 2017). This area has a typical Danxia landform, surrounded by mountains in the north, west, and east, with a series of peaks and ridges. The central and southern parts are relatively flat, with an average elevation of 100 to 700 m; Huanggang Mountain has the highest elevation at 2158 m (Fig. 1). The regional climate is a subtropical monsoon climate, with distinct seasonal changes (a hot, rainy summer and a mild, drier winter). The annual average temperature is 17.9°C, with an average temperature of 8.3°C in January and 26.7°C in July. The annual precipitation of most of the region is more than 2000 mm, and some areas exceed 3000 mm. The daily relative humidity is up to 78%. Wuyi Mountain has suitable climatic conditions for forest growth and forms an important forest area in southern China and a key area for biodiversity conservation. The forested land in the region is around 236,700 ha (accounting for 84.4% of Wuyishan's land area). The vegetation of Mount Wuyi has obvious zonal distribution characteristics because of the mountainous terrain. From high to low elevation, the zones are: mountaintop meadow, Zhongshan moss dwarf forest, temperate coniferous forest, coniferous and broad-leaved mixed transitional forest, and subtropical evergreen broad-leaved forest (Wang et al., 2010).

Materials and methods

Model variables and processing
The following types of indicators were selected as independent variables: (1) Climate factors, which were obtained from the Wuyishan Meteorological Station (National Standard Weather Station No. 58730) as daily meteorological data corresponding to the occurrence of forest fires. A total of 14 meteorological factors were used: the daily average pressure (hPa), the daily maximum pressure (hPa), the daily minimum pressure (hPa), the daily average temperature (°C), the daily maximum temperature (°C), the daily minimum temperature (°C), daily average relative humidity (%), daily cumulative precipitation (mm), daily evaporation (mm), daily average wind speed (m·s⁻¹), sunshine hours (h), daily mean surface temperature (°C), daily maximum surface temperature (°C), and daily minimum surface temperature (°C). (2) Topographical factors, which were obtained from the geospatial data cloud (http://www.gscloud.cn/). A digital elevation model (DEM) was used to map elevation, slope, and aspect. Based on forestry surveys, aspect was divided into sunny slopes (south, southwest, and west), shady slopes (northwest, north, and northeast), and mixed slopes (east and southeast). Each layer had a resolution of 30 m × 30 m. (3) Forest characteristics, which were sourced from a Forest Resource Inventory Database provided by the Wuyishan Forestry Bureau of China. Six stand-related factors were selected: dominant tree species, canopy density, age group, humus layer thickness, shrub layer height, and herb layer height. (4) Human activity factors, namely the distance to the closest road, the distance to the closest river, the distance to the closest railway, and the distance to the closest settlement, which were identified according to previous research (Bar Massada et al., 2013; You et al., 2017; Yang et al., 2017; Ying et al., 2018). There were two railway lines in our study area: an ordinary train line and a China Railway High-speed line. The former tended to generate artificial ignition sources from trains (such as cigarette butts), while smoking was prohibited on the latter. Therefore, in this paper, we only analyzed the ordinary train line. These independent variables were divided into quantitative variables and categorical variables to fit the models in this paper. There were 28 independent variables in total.
Forest fire data were collected from the Wuyishan Forestry Bureau. Because the preliminary fire survey reports lacked geographic coordinates for forest fires that occurred before 1999, we only selected forest fires with corresponding latitude and longitude coordinates from 1999 to 2013 (70 of the 74 historical fire points were used because 4 points related to deliberate arson were excluded). For the binary logistic regression model, it was necessary to construct a data set in which the fire occurrence points and the random control points (non-fire points) coexisted as the dependent variable. The number of random points is crucial to the construction of the model. When the number of random points is too large or too small, the accuracy of the final prediction model will be affected. To avoid the data being too discrete, the number of random points cannot be less than the number of fire points. Therefore, following Massada et al. (2013), random points were generated at a ratio of 1:3.5 (fire to non-fire), giving a total of 315 spatial point data samples (70 fire points and 245 non-fire points). Of these, 60% were used as the fitting data for the predictive model, and 40% were used as the model verification data.
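The sample construction described above can be illustrated with a minimal sketch. It only reproduces the bookkeeping (70 fire points, a 1:3.5 ratio, a 60/40 split); the function name `build_samples`, the seed, and the use of a simple shuffle are assumptions for illustration, and the spatial generation of random control points in GIS is not shown.

```python
import random

def build_samples(n_fire=70, ratio=3.5, train_share=0.6, seed=0):
    """Label fire and control points, shuffle them, and split into fitting/validation sets."""
    n_control = int(round(n_fire * ratio))           # 70 * 3.5 = 245 control points
    labels = [1] * n_fire + [0] * n_control          # 1 = fire point, 0 = non-fire point
    rng = random.Random(seed)                        # fixed seed so the split is repeatable
    rng.shuffle(labels)
    n_train = int(round(len(labels) * train_share))  # 60% of the 315 samples
    return labels[:n_train], labels[n_train:]

train, test = build_samples()
print(len(train), len(test))
```

Calling `build_samples` with three different seeds would give three random partitions of the same 315 points, similar in spirit to the repeated random sampling used later for the BLR fitting.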

Binary Logistic Regression Model
A binary logistic regression (BLR) model is a generalized regression model that was developed in the medical field to predict the probability of disease occurrence. It has been widely used in forest fire prediction and forecasting (Pourghasemi, 2016; Liang et al., 2017). In the study of the probability of forest fire occurrence, the dependent variable is usually assigned a value of 1 or 0. The occurrence of a forest fire is recorded as Y = 1, the absence of a forest fire is recorded as Y = 0, the probability of a forest fire occurring is P, the probability of no forest fire is (1 − P), and X1…Xm represent the potential driving factors (independent variables) that affect the occurrence of forest fire. The relationship between the probability of fire and the driving factors in the logistic regression model was established as follows:

\[
\ln\left(\frac{P}{1-P}\right) = \beta_0 + \beta_1 X_1 + \beta_2 X_2 + \cdots + \beta_m X_m
\]

where P/(1 − P) represents the ratio of the probability of occurrence of forest fire to the probability of non-occurrence of forest fire (the odds), β0 is the constant term, β1, β2, ..., βm represent the regression coefficients of the respective variables, P ranges over (0, 1), and P/(1 − P) ranges over (0, +∞).
To reduce the effect of collinearity on the accuracy of the model fit, the relationships among the fire risk factors were diagnosed before fitting. The variance inflation factor (VIF) method was used to diagnose the collinearity of all the fire risk factors. According to the principle of multicollinearity diagnosis, if VIF < 10, there was no multicollinearity between the independent variables; if VIF ≥ 10, there was multicollinearity between the independent variables. Based on the results of the VIF method, multicollinear variables were eliminated.
Finally, we identified 19 potential driving factors (daily relative humidity, daily precipitation, daily evaporation, daily average wind speed, sunshine hours, elevation, slope, aspect, landform type, dominant species, canopy density, age group, humus layer thickness, shrub layer height, herb layer height, the distance to the closest road, the distance to the closest settlement, the distance to the closest railway, and the distance to the closest river) as independent variables in the BLR model. To reduce the influence of the data distribution, we created three training and testing datasets using the principle of randomization. Then, the BLR model was fitted to the three sample datasets. Significant variables that appeared at least twice were integrated into the eventual fitting of the full sample data. Relative operating characteristic (ROC) values were used to check the fitting accuracy of the predictive model (Pourtaghi et al., 2016).
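The VIF screening described above can be sketched in a few lines. This is a minimal illustration, not the software used in the study: it relies on the standard identity that the VIF of each predictor equals the corresponding diagonal element of the inverse correlation matrix, and the helper names and the toy columns are hypothetical.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix for a list of equally long numeric columns."""
    n = len(columns[0])
    means = [sum(c) / n for c in columns]
    devs = [[x - m for x in c] for c, m in zip(columns, means)]
    norms = [math.sqrt(sum(d * d for d in dev)) for dev in devs]
    return [[sum(a * b for a, b in zip(di, dj)) / (ni * nj)
             for dj, nj in zip(devs, norms)]
            for di, ni in zip(devs, norms)]

def invert(matrix):
    """Invert a small square matrix by Gauss-Jordan elimination with partial pivoting.
    Raises ZeroDivisionError if the predictors are perfectly collinear."""
    k = len(matrix)
    aug = [row[:] + [float(i == j) for j in range(k)] for i, row in enumerate(matrix)]
    for col in range(k):
        pivot = max(range(col, k), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(k):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[k:] for row in aug]

def vif(columns):
    """VIF of each predictor, read from the diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]

# Hypothetical toy columns (not the study's data).
columns = {
    "relative_humidity": [78.0, 65.0, 82.0, 70.0, 60.0],
    "sunshine_hours": [2.0, 7.5, 1.0, 5.0, 9.0],
    "evaporation": [1.2, 3.9, 0.8, 2.9, 4.6],
}
values = vif(list(columns.values()))
kept = [name for name, v in zip(columns, values) if v < 10]  # VIF < 10 rule from the text
print(dict(zip(columns, values)), kept)
```

In practice the screening is usually iterative: the variable with the largest VIF is dropped and the remaining VIFs are recomputed, but the text does not state which procedure was followed.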

Geographically weighted logistic regression model
The geographically weighted logistic regression (GWLR) model is an extension of the general regression model. By considering the spatial heterogeneity of variables, GWLR can reflect the true characteristics of spatial data better than traditional predictive models (Rodrigues et al., 2014). To further explore the spatial characteristics of fire risk factors on forest fires in Wuyishan City, this paper used GWLR to study the probability of forest fire occurrence based on the variables selected in the BLR model. The model relationship was as follows:

\[
\ln\left(\frac{P(u_i, v_i)}{1 - P(u_i, v_i)}\right) = \beta_0(u_i, v_i) + \sum_{j=1}^{m} \beta_j(u_i, v_i)\, X_{ij}
\]

where (u_i, v_i) represents the coordinates of the i-th point, β_j(u_i, v_i) represents the j-th regression parameter at the i-th coordinate point after the geographical weighting, X_ij is the value of the j-th independent variable at the i-th point, and P(u_i, v_i) represents the probability of forest fire occurrence at the i-th point.
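To make the idea of geographical weighting concrete, the sketch below shows how observations could be weighted around one location and how a local probability follows from local coefficients. The Gaussian kernel, the fixed bandwidth, and both function names are assumptions for illustration; the text does not state which kernel or bandwidth selection the GWLR software used.

```python
import math

def gaussian_weights(target, points, bandwidth):
    """Weight each observation by its distance to the target location (assumed Gaussian kernel)."""
    ux, uy = target
    weights = []
    for x, y in points:
        d = math.hypot(x - ux, y - uy)                      # Euclidean distance to the target
        weights.append(math.exp(-0.5 * (d / bandwidth) ** 2))
    return weights

def local_probability(coeffs, x):
    """Fire probability at one location from its local coefficients (constant term first)."""
    z = coeffs[0] + sum(b * v for b, v in zip(coeffs[1:], x))
    return 1.0 / (1.0 + math.exp(-z))                       # inverse of the logit link

# Hypothetical coordinates in metres: nearby points receive weights close to 1.
print(gaussian_weights((0.0, 0.0), [(0.0, 0.0), (3000.0, 4000.0)], bandwidth=5000.0))
```

A full GWLR fit would repeat a weighted logistic regression at every location using such weights, which is why each location ends up with its own coefficient set.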
The spatial non-stationarity of the fire risk drivers was analyzed in two steps:
(1) To model and analyze the entire study area, we used local polynomial interpolation in ArcGIS to estimate values at non-observed locations. This was necessary because the GWLR model cannot estimate the parameters at non-observed locations.
(2) Spatial interpolation analysis of the t-test value of the coefficient for each variable was carried out to further explore the spatial distribution of the variable's significance. When the t-test value was less than −1.96 or greater than 1.96, the variable was spatially significant and the interpolation result was displayed; when the t-test value was between −1.96 and 1.96, the variable was not spatially significant, and the interpolation result was not displayed.
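The display rule in step (2) amounts to masking local coefficients whose t-value falls inside the ±1.96 band. A minimal sketch, with a hypothetical function name and made-up values:

```python
def mask_insignificant(coefficients, t_values, threshold=1.96):
    """Keep a local coefficient only where |t| exceeds the threshold; otherwise return None."""
    return [c if abs(t) > threshold else None
            for c, t in zip(coefficients, t_values)]

# Hypothetical local estimates at three locations.
print(mask_insignificant([-0.10, 0.02, -0.004], [-2.5, 1.2, 1.96]))
```

Locations returned as `None` correspond to the areas left blank in the interpolated coefficient maps.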

Fire risk classification
The fire risk factor coefficients in the GWLR model were substituted into the probability model to produce the probability of forest fire occurrence in our study area. Then, a kriging interpolation method was used to interpolate the probability of forest fire occurrence over the whole Wuyishan area. According to the cutoff value of the forest fire prediction model and the default classification threshold, the forest fire risk zone was divided into low, medium, and high fire risk levels.
Table 1 shows the significant variables, among the 19 variables retained after multicollinearity screening, for the different random sampling datasets in the BLR simulation. Only the daily average relative humidity, daily sunshine hours, distance to the closest railway, and elevation were significant independent variables in at least two of the three samples. These four independent variables were used to fit the full sample data. The results of the model simulation are shown in Table 2. It can be seen from Table 2 that the daily average relative humidity, elevation, and the distance to the closest railway are negatively correlated with the occurrence of forest fires in our study area, while daily sunshine hours are positively correlated with the occurrence of forest fires. Specifically, for every 1% increase in daily average relative humidity and 1 m increase in elevation, the rate of forest fires here will decrease by 0.095 and 0.004, respectively. For every 1 hour increase in sunshine hours, the rate of forest fires will increase by 0.021. For the railway, the increase in the distance affects the rate of fire very slightly.
Note: Exp(B) is the change in odds from a one-unit change in the independent variable. When Exp(B) > 1, the probability increases with an increase in the value of the independent variable; when Exp(B) < 1, the probability decreases; when Exp(B) = 1, the probability remains unchanged. All variables (driving forces) are significant at p < 0.01.
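The relation between a coefficient B and Exp(B) can be checked with a short calculation. The sketch below assumes that the changes quoted in the text (0.095, 0.004, and 0.021) are the fitted coefficients B on the log-odds scale; this is an assumption, since Table 2 is not reproduced here, and the dictionary keys are illustrative names.

```python
import math

# Assumed coefficients B, taken from the changes quoted in the text (Table 2 not shown).
coefficients = {
    "relative_humidity_pct": -0.095,
    "elevation_m": -0.004,
    "sunshine_hours": 0.021,
}

def odds_ratio(b):
    """Exp(B): multiplicative change in the odds for a one-unit increase in the variable."""
    return math.exp(b)

for name, b in coefficients.items():
    direction = "raises" if odds_ratio(b) > 1 else "lowers"
    print(f"{name}: Exp(B) = {odds_ratio(b):.3f} ({direction} the odds of fire)")
```

Under that assumption, a negative B gives Exp(B) below 1 and a positive B gives Exp(B) above 1, which matches the interpretation rule stated in the note.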

BLR model validation
The significance of the BLR model was P < 0.0001. The validation of the different random sampling datasets is shown in Table 3. The ROC values for the three partial samples and the full sample were 0.842, 0.832, 0.874, and 0.836, respectively, all of which were greater than 0.7. Taking the optimal critical values of the three samples (0.564, 0.528, and 0.574) as the standard and using the remaining 40% of the samples to test the model accuracy, the discriminant accuracy of the predictive model ranged from 77.0% to 85.7%. The accuracy of the full sample model was 84.4% when the optimal critical value of the full sample dataset (0.549) was used as the classification standard. It can be seen that the BLR model was highly accurate and fitted the data well for forest fire classification in our study area.
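The two validation measures used here, the area under the ROC curve and the classification accuracy at a critical value, can be computed as in the following sketch. The labels and scores are hypothetical; only the cutoff 0.549 is taken from the text, and the rank-based AUC formula is a standard equivalent of integrating the ROC curve rather than the authors' stated procedure.

```python
def auc(labels, scores):
    """AUC as the share of (fire, non-fire) pairs in which the fire point scores higher; ties count half."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def accuracy(labels, scores, cutoff):
    """Share of points classified correctly when scores at or above the cutoff are predicted as fire."""
    predicted = [1 if s >= cutoff else 0 for s in scores]
    return sum(p == y for p, y in zip(predicted, labels)) / len(labels)

# Hypothetical validation points: 1 = fire, 0 = non-fire, with predicted probabilities.
labels = [1, 1, 0, 0, 0]
scores = [0.81, 0.42, 0.58, 0.30, 0.12]
print(auc(labels, scores), accuracy(labels, scores, cutoff=0.549))
```

The two measures answer different questions: AUC does not depend on any cutoff, whereas accuracy changes with the critical value chosen.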

Simulation based on GWLR

Spatial heterogeneity of fire risk indicators
To examine the spatial heterogeneity of the impact of fire risk factors on forest fires in the study area, the four fire risk factors identified above by the BLR model were used as the drivers for the GWLR model simulation. The estimated coefficients of the four independent variables are shown in Table 4. Variable coefficients had different intensities in different geographical locations, and the areas in which each variable had an impact on forest fires were also different (see Fig. 2). The daily average relative humidity was significant over the whole study area, and it had a significant negative spatial correlation with the occurrence of forest fires. The areas where its coefficient had the greatest influence were in the southeast of Wuyishan. The estimated coefficient of sunshine hours mainly showed a positive spatial correlation in the southern area. Elevation was significantly negatively correlated with forest fire occurrence in the southeast and central regions. The distance to the closest railway had a significant negative correlation with forest fire occurrence in the southwestern area.

Goodness of fit for BLR and GWLR
The goodness of fit of the two models was tested using the AIC, AICc and AUC. The smaller the AIC and AICc values, the better the model fit; the larger the AUC value, the better the model fit. It can be seen from Table 5 that the AIC and AICc values of the GWLR model were 198.093 and 198.883, respectively, which were lower than those of the BLR model, and the AUC value of the GWLR model was 0.927, which was higher than that of the BLR model. This indicated that the GWLR model fitted the data better than the BLR model. The prediction accuracy of the GWLR model reached 85.3%, which was also slightly higher than that of the BLR model.
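For readers unfamiliar with the two information criteria, the sketch below shows the common textbook definitions. It is an assumption that the software used in the study follows exactly these forms (GWLR implementations typically replace k with an effective number of parameters, which need not be an integer), and the log-likelihood value in the example is made up.

```python
def aic(log_likelihood, k):
    """Akaike information criterion: penalizes the number of parameters k."""
    return 2 * k - 2 * log_likelihood

def aicc(log_likelihood, k, n):
    """Small-sample corrected AIC for n observations."""
    return aic(log_likelihood, k) + (2 * k * (k + 1)) / (n - k - 1)

# Hypothetical model: log-likelihood of -90 with 5 parameters fitted on 189 points.
print(aic(-90.0, 5), aicc(-90.0, 5, 189))
```

Because the correction term grows with k and shrinks with n, AICc is always slightly larger than AIC, as is the case for the GWLR values reported above.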

Fire risk area and distribution pattern
Since the fitting accuracy of the GWLR model was better than that of the BLR model, the GWLR model could better demonstrate the influence of fire risk factors on fire occurrence in different geographic spatial units. As Fig. 3a shows, the forest fire risk zone in Wuyishan City was divided into low, medium, and high fire risk levels, accounting for 46.05%, 34.03% and 19.92% of the study area, respectively. The high-risk areas were relatively concentrated in the central and southern regions, while the low-risk areas were in the western, northern and eastern regions.
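The zoning step can be pictured as thresholding an interpolated probability surface and counting cells per class. In the sketch below the two thresholds (0.3 and 0.55) are purely hypothetical placeholders, since the actual cutoff and default classification threshold are not given in the text, and equal-sized raster cells are assumed so that cell counts translate directly into area shares.

```python
from collections import Counter

def classify(prob, low_cut=0.3, high_cut=0.55):
    """Assign a risk level to one probability value (thresholds are hypothetical)."""
    if prob < low_cut:
        return "low"
    if prob < high_cut:
        return "medium"
    return "high"

def zone_shares(probabilities, low_cut=0.3, high_cut=0.55):
    """Percentage of equal-sized cells falling in each risk zone."""
    counts = Counter(classify(p, low_cut, high_cut) for p in probabilities)
    total = len(probabilities)
    return {zone: 100.0 * counts[zone] / total for zone in ("low", "medium", "high")}

# Hypothetical probabilities for four raster cells.
print(zone_shares([0.1, 0.2, 0.4, 0.6]))
```

Applied per town or street instead of over the whole city, the same counting would yield the kind of administrative breakdown reported in Fig. 3b.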
From the perspective of administrative divisions, the four towns or streets with the highest proportion of high fire risk area relative to their total area were: Xingtian Town (XT) 78.04%, Wuyi Street (WY) 70.93%, Xinfeng Street (XF) 49.03%, and Chong'an Street (CA) 47.64% (see Fig. 3b). Among these, XF had the smallest forest area, but 49.03% of it was located in a high fire risk area and 50.97% in a medium fire risk area. In contrast, Xingcun Town (XC) had the largest forest area but the lowest fire risk level, and its proportion of high-risk area only reached 0.27%. The high-risk areas in Langu Village (LG) accounted for the smallest proportion (0.02%). Therefore, more investment in fire prevention and suppression should be assigned to the central (CA, WY, XF) and southern (XT) regions.
The BLR and GWLR models have been used in previous research to simulate the development of fire occurrence. For example, the logistic forest fire prediction model established by Gudmundsson et al. based on terrain and precipitation data had an accuracy of more than 65% (Lee, 2012; Gudmundsson, 2014). Su et al. constructed a BLR predictive model using the forest fire data for Fujian Province, China; their model accuracy reached 72.3% (Su et al., 2015a). Yang et al. used a BLR model to perform fire insurance prediction in southern Fujian, and the model's accuracy reached 74.0% (Yang et al., 2017). In our study, the BLR model accuracy was 84.4%, and the fitting results demonstrated that daily average relative humidity, elevation, and the distance to the closest railway were significantly negatively correlated with the occurrence of forest fires, while daily sunshine hours were significantly positively correlated with the occurrence of forest fires. The traditional logistic regression model assumes that the spatial variables are all stationary and ignores the spatial heterogeneity of the model variables. The GWLR model demonstrated that only the daily relative humidity was negatively correlated across the whole study area, although this indicator had a different degree of impact on the probability of forest fire occurrence in different areas. However, elevation, daily sunshine hours, and distance to the closest railway had an impact only in local areas. The prediction accuracy of the GWLR model was 85.3% in Wuyishan City, which was a slight improvement of about 1% compared with the BLR model. Because our study was conducted at a smaller spatial scale (the county scale), both BLR and GWLR obtained higher prediction accuracy, and smaller differences in accuracy, than previous research at larger spatial scales. Our results also suggest that forest fire indicators used for model fitting should be treated as non-stationary variables.
ArcGIS software has been widely used in China's forestry and land management departments as a useful spatial management tool. The GWLR module has been developed as an extension module for ArcGIS with free download on the Internet. Therefore, it is feasible to apply our modelling approach to the mapping of fire risk at a county scale. Fire protection managers can fully monitor the spatial heterogeneity of key forest fire factors, integrating topographic and land use maps of the study area. This approach can help to realize the identification and accurate management of fire risk in mountainous and hilly areas at a county scale.

Fire risk drivers at different temporal and spatial scales
Our results showed that daily average relative humidity, sunshine hours, elevation and distance to railway were the most important fire risk factors affecting forest fire occurrence in Wuyishan City. Forest fires here were prone to occur at lower elevations during relatively dry air periods in sites closer to the railway. Longer sunshine time may also increase the risk of forest fires in these places. Comparing the existing research on forest fire indicators in Fujian Province, China, we found that the fire risk indicators were different at different temporal and spatial scales.
Many studies have shown that the indicators affecting forest fires include human, meteorological, and topographical factors (Su et al., 2015b; You et al., 2017; Ying et al., 2018). For example, based on meteorological data (6 years of data from 2000 to 2005), Liang et al. (2017) pointed out that the daily maximum surface temperature, daily minimum surface temperature, daily maximum temperature, sunshine hours and days were all important. The lowest daily relative humidity was the most important meteorological factor affecting the development of forest fires at the provincial level. However, Liang's paper did not consider the potential impact of terrain, vegetation types, human activities and other conditions on forest fires. Our study included 15 years of meteorological data, topography, and human factors in the simulation and found that only daily relative humidity and sunshine hours were important among the meteorological factors. The other meteorological factors mentioned by Liang et al. were not the key driving factors of forest fire occurrence at the county scale. Here, the importance of elevation and the distance to the closest railway demonstrates that the scale effect cannot be ignored in the practical application of fire risk zoning.
Fujian Province is prone to forest fires, and climate, topographical factors and human factors are key influences on fire behavior in the hills around Wuyi Mountain. Each season has different spatial risks of forest fires in Fujian Province. In autumn and winter, most areas of Fujian Province have medium and high fire risks. As the medium-high risk area of Wuyishan City is in northwestern Fujian Province, we should not only consider the climatic conditions for fire risk zoning, but also comprehensively consider the local topographical conditions and human-related factors to accurately identify fire risk prevention and control measures for specific locations. This will allow the limited resources for fire prevention and control (e.g., labor force, equipment, and funds) to be more efficiently assigned to protect the unique and precious Wuyi Mountain forest ecosystems.

Fire management at forest edges under different fire danger zones
The forest edge is the interface between forest and other ecosystem types. The ecological processes of the edge are complex, and forest edges are often a sensitive zone in the forest landscape (Cochrane and Laurance, 2002; Armenteras et al., 2013; Numata et al., 2017). On the basis of the fire risk zoning map obtained in this study, we determined four types of land use that were closely related to forest fire occurrence in our study site (roads, buildings, farmland and tea gardens). The boundary distribution pattern of these four types of forest edge combinations and their proportions in different fire risk areas are shown in Fig. 4. As shown in Fig. 4b, the shared edge of forest and tea gardens had the longest length (5962 m), followed by farmland (4019 m), roads (1577 m) and buildings (828 m). Among these, 37.4% of tea gardens, 28.1% of farmland, 25.1% of roads and 53.9% of buildings were located in the high-risk zone. The shared boundary between tea gardens and forest in the high fire risk area was 2200 m, which was nearly twice that of farmland, 5.6 times that of roads, and 5 times that of buildings. Wuyishan City is the core area of Wuyi Rock Tea production, which is one of the most famous teas in China. Owing to the enormous demand from the tea market, the expansion of tea gardens has resulted in an obvious mosaic pattern of tea gardens and forest patches (Fig. 4a). The cumulative proportion of these four types of shared boundary in the medium- and high-risk areas combined ranged from 64.3% to 84.2%. More than half of the forest-farmland shared boundary areas were in the medium fire danger zone. With the impact of human activities and tea plantation expansion, forest landscapes here have become more fragmented and the length of the edge has increased. Frequent tea garden management, agroforestry and agricultural activities have created more forest edge where forest stands are vulnerable to ignition sources.
Therefore, it is recommended that the forest fire management department increases the frequency of daily inspections of forest edge areas in the high and medium fire risk zones based on our risk map, and promotes education related to safe agroforestry fire use and fire prevention during the fire protection period.
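The ratios quoted for the high-risk shared boundaries can be reproduced from the lengths and percentages given above. The sketch assumes that each percentage applies to the shared edge length of the corresponding land use type, which is how the figures in the text appear to be related; the variable names are illustrative.

```python
# Shared forest edge lengths (m) and high-risk shares, as quoted in the text.
edge_length_m = {"tea gardens": 5962, "farmland": 4019, "roads": 1577, "buildings": 828}
high_risk_share = {"tea gardens": 0.374, "farmland": 0.281, "roads": 0.251, "buildings": 0.539}

# Edge length falling in the high-risk zone for each land use type.
high_risk_length = {k: edge_length_m[k] * high_risk_share[k] for k in edge_length_m}

# How many times longer the tea garden edge is than each other type in the high-risk zone.
tea = high_risk_length["tea gardens"]
ratios = {k: tea / v for k, v in high_risk_length.items() if k != "tea gardens"}

for name, length in high_risk_length.items():
    print(f"{name}: {length:.0f} m in the high-risk zone")
print({k: round(v, 2) for k, v in ratios.items()})
```

Under this assumption the tea garden edge in the high-risk zone comes to roughly 2230 m, close to the rounded 2200 m in the text, and the ratios agree with "nearly twice", "5.6 times", and "5 times".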

Conclusions
This study found that the daily average relative humidity, sunshine hours, elevation, and distance to the closest railway were indicators of forest fire occurrence in Wuyishan City. These indicators had obvious spatial heterogeneity. Daily average relative humidity was negatively correlated with the probability of forest fires over the whole study area, whereas the three remaining driving factors (sunshine hours, elevation, and distance to the closest railway) had an impact on fire occurrence in local areas at the county-level. Forest fires in our study site were prone to occur at lower elevations during relatively dry air periods and/or in places closer to the railway. The longer sunshine time may increase the risk of forest fires as well. The high-risk zones were mainly concentrated in the central and southern areas so more investment of fire prevention and suppression should be assigned to these regions, particularly in the forest edge areas of the four administrative towns or streets. The GWLR model was a suitable approach to perform refined fire risk zoning at the county scale, compared with the method of regionalized ranking of the nationwide forest fire risk. For Mount Wuyi, our approach can help to provide a reference for decision makers to prioritize areas for natural ecosystem conservation and improve practical applications of forest fire prevention and suppression.
Author contribution statement
WBY conceptualized hypotheses, conducted the fieldwork and data analysis, and wrote the final text. WL and YQY collected data and conducted data analysis. DJH conceptualized hypotheses, discussed the design and interpreted the findings.

Declaration of interest statement
We confirm that this manuscript has not been published elsewhere and is not under consideration in whole or in part by another journal. All authors have approved the manuscript and agree with submission to Journal of Environmental Management. The authors have no conflicts of interest to declare.