Geostatistics of Active Fire Remote Sensing Data - Examples from South Asia

Active fires illuminated on the earth surface are caught by the satellite. These fires are created by various sources such as vegetation fires, gas flares, biomass burning, volcanoes, and industrial sites such as steel mills. Near real time active fire data is collected using remote sensing techniques of satellites. Amount of active fires in an area is a proxy indicator of aerosols, green houses gases and trace gases. Here the behavior of active fires over a period of one year in Nepal, Bhutan and Srilanka are studied using spatial statistics. This study is based on data acquired through remote sensing of data acquisition platform, NASA’s MODIS. Spatial statistics is used here to study the incidence of active fires with respect to geographical location. The behavior of parameters of various autoregressive models like Spatial Durban Model, Spatial Lag Model, Spatial Error Model, Manski Model and Kelegian Prucha Model are minutely analyzed. The best model with highest pseudo R 2 is selected. The spatial behavior of the fire radiative power for three countries is also predicted using spatial interpolation and kriging. Thus the burning potential of vegetations in unsampled areas is envisaged by thus predicting FRP. Such studies give a country wise perspective to the behavior of fire; this is with reference to south Asia. They are of great importance for countries of developing world which lack a strong backbone of good quality official records. Through the statistical analyses of data collected by such platforms, important information can be indirectly assessed.

It was used to improve the mapping accuracies of above ground biomass in northern Guangdong Province of China (Su, Shen and Wang 2020). Eskandari, Miesel and Pourghasemi assessed quantitative temporal relationships using correlation and regression analyses. They identified and described statistically significant relationships between using climatic data from the Golestan Meteorological Administration and fire statistics, Iran. They also determined the spatial relationships between climatic variables and fire occurrence (Eskandari, Miesel and Pourghasemi 2020). This paper is unique in the sense that, it conducts geostatistical analysis of active fire using variograms. This also involves estimation and modeling of spatial correlation. In addition to this, this paper also aims to understand and predict the behavior of active fire in south Asia with the help of multivariate statistics. It also delves into in depth structural relationship between variables. Bhutan, Nepal and Sri Lanka are taken as examples from this region. Using active fire one year satellite data, following questions related to vegetation fire trends in south Asia are addressed. What is the country wise behavior of vegetation fires? What is the behavior of brightness and fire radiative power (FRP) of these fires? What is the spatial behavior of brightness with respect to scan, FRP and BRIGHTNESS_T31? Which parametric function best explains the variogram of FRP of one year data? How to predict the FRP of the unsampled areas of the vegetation, if they caught fire?

Material and Methods:
Data: This study is based on active fire satellite data for three countries. These three countries are Bhutan, Nepal and Sri Lanka. These are observations made by NASA's Moderate Resolution Imaging Spectroradiometer (MODIS) satellite, from 18 September 2019 to 17 September 2020. This satellite noted 189, 982 and 441 cases of active fire in Bhutan, Nepal and Sri Lanka, during this one year period.
NASA's FIRMS give the global fire locations (hotspots) easily with varieties of data formats. NASA's FIRMS is an active fire locations data which represents the midpoint pixel measuring 1 km x 1 km which extracts from the MODIS Image using the Thermal Anomalies algorithms. FIRMS are part of NASA's Earth Observing System Data and Information System (EOSDIS). EOSDIS together with twelve Distributed Active Archive Centers (DAACs) provides access to data from NASA's Earth Science Missions (Fithria A and Ani A 2017).
A comparison between data collected through field survey of the Korea Forest Service (KFS) and satellite active fire data of MODIS was done by Lim et al. Examination of the spatial autocorrelation and related factors by fire source indicated that MODIS data had higher spatial autocorrelation, with remarkable distinction found in climate factors. KFS data were collected from post-fire surveys; they resulted in low spatial autocorrelation and reduced model accuracy owing to the wide distribution of data (Lim C H and Kim Y S et al. 2019).
In this paper following statistical methods are used.
Spatial Statistics: An important concern in here is to examine spatial patterning and spatial dependence among variables of interest. This means that values close together in space tend to be more familiar than those that are further apart (Lloyd 2010).

Spatial Autocorrelation:
A key tool for the analysis of spatial autocorrelation, is the Moran' I coefficient. It measures spatial autocorrelation with neighbors of observations; that are classified into various contiguity schemes.
Regression: An assumption of standard ordinary least squares regression is the independence of observations. But this assumption rarely hold true for spatial data. Spatial autoregressive models provide a means for accounting the spatial structure of the data. Under the assumption that semivariogram can be estimated from Nh sample data pairs ( " ), ( " + ℎ) for a number of distances (or distance intervals) ℎ # by 6( $ 7 ) = 9 ( ( ) − ( + )) ( ∀ ∈ $ 7 Interpolation and Ordinary Kriging: Geostatistics deals with the analysis of random fields Z(s), with Z random, and s the non-random spatial index. Typically, at a limited number of sometimes arbitrarily chosen sample locations, measurements on Z are available and prediction (interpolation) is required at non-observed locations + . So, here in variogram modeling, the variogram is often used for spatial prediction (interpolation) or simulation of the observed process based on point observations. The spatial prediction can be done by non geostatistical interpolation method such as Inverse Distance Weighted (IDW) Interpolation. IDW assumes that each measured point has a local influence that diminishes with distance. The influence of the distance can be controlled by an exponent (p) in such a way the lower the exponent, the more uniformly all neighbour values are incorporated into the interpolation. If p = 0, the weights do not decrease with the distance and the estimated values at unsampled locations are equal to the mean of all the measured values; the value p = 2 is typically set by default in most applications, meaning that the importance of each measured location in determining a predicted value diminishes as a function of squared distance (Losada and Santos et al. 2019).
It can also be done using a geostatistical model such as Ordinary Kriging (OK). In OK interpolation, the function determining the weights is called a variogram model. This model is a function fitted to the (empirical) variogram, which in turn describes the spatial auto correlation structure of the observed pattern. The choice of this model may play a significant role in the resultant spatial estimations. A remarkable difference between IDW and OK is that the first yields estimates that are always within the range of the observed values at sampled locations (Bivand, Pebesma, Gomez-Rubio 2013).
Results and Discussion: An overview of incidence of active fire in the three countries of south Asia is given in the Figure 1. Here the entire region is divided into grids. The incidences of active fire in these three countries are shown against this backdrop. A close up map view of the FRP of active fire in the three countries is provided in Figure 2.   The behavior of these fires is described in detail using univariate statistics in Table 1 and The variables closely related to the behavior of active fire are brightness and FRP. The behavior of these variables in three countries is summarized in Table 2, in terms of descriptive statistics. It is seen that although Bhutan had 189 incidences of fire in one year, it had the highest mean FRP among the three countries. The statistics describing FRP such as the quartiles and coefficient of variation (CV) are the highest for Bhutan. This implies that although the incidence of fire is the least, in comparison the Nepal and Sri Lanka, the quantity of biomass burning in these fires is the highest. Nepal has the highest incidence of active fire of 982. But the average FRP is much lower than Bhutan. The Median FRP for Nepal is the lowest. Sri Lanka has the most consistent type of active fire. The coefficient of variation is the lowest amongst these three countries. The spread of the variable brightness of active fire is only 2.7% of the mean. The spread of FRP is 68% of the mean. This indicates that Sri Lanka is most consistent with respect to the intensity of fire. This pattern is also reflected in box plots shown in Figure 3 and Figure 4. It can be seen from these figures that Bhutan has many outlier values on the higher range of the data. This shows that although the number of such fires was the least, the intensity of these fires in terms of brightness and FRP was the highest. The box plot of Sri Lanka is the most consistent with minimum number of outliers. The histogram of the behavior of brightness and FRP is given in Figure 5 and Figure 6. The brightness of active fire takes nearly symmetrical shape in Sri Lanka. The FRP of most of active fires is between 0-20 Mega watt for Nepal and Sri Lanka. This is also validated by the Q3 values of Table 2. But for Bhutan although the incidences of such fires are very low in comparison the Nepal and Sri Lanka, but the intensity in terms of FRP is the highest. indicates that the regression residuals are spatially correlated. This is based on simple non spatial regression model of brightness on scan, FRP and brightness_T31 is constructed. The dependent variable brightness is channel 21/22 brightness temperature of the fire pixel measured in Kelvin. The independent variable scan represents 1km fire pixel and is representative of actual pixel size. The independent variable FRP represents fire radiative power measured in measured in Mega Watts. And brightness_T31 is channel 31 brightness temperature of the fire pixel measured in Kelvin. For autoregressive models, the neighbor file is based on 5 nearest neighbor contiguity scheme. These 5 nearest neighbors are indentified using great circle distances. In these autoregressive models brightness is regressed on scan, FRP and brightness_T31. Among six autoregressive spatial models tested for all the three countries, the one with maximum pseudo R 2 is selected. These auto regressive models are namely a) Spatially Lagged X model b) Spatial Error model c) Spatial Durban Error model d) Spatially Lagged Y model e) Spatial Durban model f) Manski All Inclusive model g) Kelegian Prucha model. Spatially Lagged X model = + + (Local Spatial Model) is the most suitable model for Bhutan. This is also seen in Table 3. This model explains how neighboring explanatory variables behave. The behavior of incidence of active fire in neighboring area affects the behavior of active fire in that area. Here, the X values of neighboring areas affect the incidence of active fire in that area. But there is no global spillover effect.
Active fire data of Bhutan shows just significant spatial correlation at α = 0.05 and the Moran's I standard deviate of the residual of non spatial simple linear model is 1.6883. This implies that the incidence of active fire in the neighboring area significantly affects active fire in that area with a p value of 0.04568. But here, the sign of the independent variables namely scan, brightness_T31 and FRP doesn't change when the coefficients of the lag variables is considered. These lag variables represent neighboring areas. The Spatial Error Model (Global) is also studied as the second best model. It is explained by the equation = + , = + . Here λ is spatially lagged error multiplier. In a global model, the impact of one region spills over to the other, even when they are not specified as neighbors. So the behavior of global model in explaining, the incidence of active fire is also studied. As seen from Table 3, the accuracy of these models, explained by pseudo R 2 take values 0.842 and 0.839 respectively.
As seen from Table 3, for Sri Lanka, Spatial Durban Error model, = + + , = + is the most suitable model. It is a local model. It has lag coefficients that studies the impact of neighboring area on the explanatory variables. But here, the sign of the independent variables namely brightness_T31 and FRP doesn't change when the coefficients of the lag variables is considered. Moran's I statistics standard deviate takes a value of 6.2236, and shows a high significant spatial correlation with a p value of 2.43E-10. This implies that the incidence of active fire in Sri Lanka shows a very high spatial correlation. But Spatial Error Model (Global) is also studied as the second best model. It is explained by the equation, = + , = + . Here λ is spatially lagged error multiplier. In a global model the impact of one region spills over to the other, even when they are not specified as neighbors. The pseudo R 2 values for the first and the second model are 0.808 and 0.806 respectively.
For Nepal, active fire data is highly spatially correlated. Here the Moran's I statistics standard deviate takes a value of 11.239 with a p value < 2.2E-16. Amongst three countries Nepal's data shows the highest spatial correlation. Bhutan's active fire data shows the least spatial correlation. At α = 0.01, the Moran's I statistics is not significant for Bhutan, but they are very highly significant for Sri Lanka and Nepal. Similar to Sri Lanka, Spatial Durban Error model, = + + , = + is the most suitable model. Here, the independent variable scan and brightness_T31 changes the sign for the lag variables. This is related to neighboring areas. But for the other   The results of variogram modelling are given in Table 4. It has been used in quantifying the vegetation FRP variability. Four variogram models (Matern, Spherical, Exponential and Gaussian) were tested. Matern function with smallest root mean square error seems to be the most suitable model in explaining the variogram of sample FRP of the three countries. Matern function has been successful in variogram modeling in other situations. Mianasny and Mc Bratney have highlighted the importance of (semi) variogram as keystone to geostatistics and the flexibility of Matern function in explaining the variogram of soil variability (Minasny and McBratney 2015). Matern function is a special case of exponential function and fits the soil parameters variogram close to the origin (Shaheen and Iqbal 2018). The FRP of the burning potential of the biomass at the unsampled area are predicted using IDW and OK. The results of interpolation FRP using IDW for the unsampled areas of Bhutan, Nepal and Srilanka are given in Figure 7, Figure 8 and Figure 9. Darker the color, lower is the interpolated value of FRP for the unsampled areas. IDW is a non geostatistical model, and here p = 2. The residuals obtained from the OK of the Matern model given in Table 3 are given in Figure 10, Figure 11 and Figure 12. We see that the residuals take the highest value for Bhutan and the lowest value for Sri Lanka. As seen from Table 4, the root mean square error (RMSE) for Sri Lanka is the lowest for predictions made by IDW and OK. This is because of the consistent nature of FRP data. Here RMSE are the prediction errors from IDW and kriging.
The predicted values of FRP for unsampled areas are given in Figure 13, Figure 14 and Figure 15. Here light areas show FRP with low intensity and dark areas show FRP with higher intensity.  Conclusion: Bhutan had minimum incidences of active fire in the one year period of Sept. 2019 to Sept. 2020. This was in contrast to Nepal, which had the highest incidence. But some of these fires were of very high intensity. These fires are mainly vegetation fires. The coefficient of variation of brightness and FRP was highest in Bhutan, in comparison to Nepal and Sri Lanka. The p value of Moran's I standardized variate for the residuals of non spatial simple linear model is 0.04568, in contrast to 2.43E-10 for Sri Lanka and < 2.2E-16 for Nepal. This implies that the behavior of active fire is universal throughout Bhutan irrespective of its geographical coordinates.
The behavior of brightness and FRP was most consistent for Sri Lanka with lowest values of coefficient of variation among the three countries. The distribution of these values for Sri Lanka, is symmetrical, as also reflected by the boxplots.
Amongst several autoregressive spatial models tested, Spatially Lagged X model -local, Spatial Durban Error model -local and Spatial Durban Error model -local were the best in explaining the brightness of these active fires for Bhutan, Nepal and Sri Lanka respectively. The coefficient of determination pseudo R 2 is 0.842, 0.771 and 0.808.
The variogram of FRP was best explained by Matern function for all the three countries. The spatial variability of FRP has been quantified with variogram analysis here. The FRP of burning potential of vegetation in unsampled areas was predicted by IDW and OK. These models have best explained the active fire for Sri Lanka, as the RMSE took minimum values here. Symmetric and consistent nature FRP values are the main reasons.
This study results use geostatistics to offer critical insights in understanding the behavior of vegetation fires. This will provide important guidelines for strengthening management of such fires in South Asia.