To address research question (a), we implemented a gradient boosted classification modelling framework with forest degradation history from Landsat time-series as reference data and textural metrics derived from PlanetScope images as predictors (Fig. 1, box 1). To address research question (b), we estimated biomass distributions of intact, logged and burned forests from airborne lidar data (Fig. 1, box 2). To address question (c), we compared biomass estimates from a Monte Carlo simulation that accounted for biomass uncertainty from lidar data to a simulation that also accounted for land cover classification uncertainty from the model referenced in (a) (Fig. 1, box 3).
Study sites
We conducted our study at three sites in the Brazilian Amazon that covered portions of the Feliz Natal Municipality and Xingu Indigenous Territory in the Brazilian state of Mato Grosso, and Saracá-Taquera National Forest, in the Brazilian state of Pará (Fig. 2). The Feliz Natal site comprises a mixture of land covers, whereas Xingu and Saracá sites were included as mixtures of intact forest with logging and fires only, respectively.
Feliz Natal (Fig. 2, site 1) is located in the Brazilian Arc of Deforestation, a government-defined region that concentrates 70% of deforestation within 100 municipalities (~ 1 million km2). Feliz Natal has a diversity of land uses including pastures and mechanized agriculture that have replaced part of the original forest cover. Most of the remaining forests in this region have been logged and a substantial portion of the forests have burned at least once (26).
The Xingu site (Fig. 2, site 2) lies about 80 km southeast from the Feliz Natal site, but it is located within the Xingu Indigenous Territory. Indigenous lands are well-known barriers for deforestation and other anthropogenic pressures in the Amazon (66, 67), however large fires occurred in 2016 and 2017 in the Xingu area (68). The climate of Feliz Natal and Xingu region is typical of south-eastern Amazonia, with mean annual precipitation of about 1900 mm, an extended 5-month dry season and mean annual temperature of 25°C (69).
The Saracá-Taquera National Forest (Fig. 2, site 3) holds a federal logging concession administered by the Brazilian Forest Service. Along with a long history of bauxite mining in this National Forest, specific areas were assigned for sustainable forest management. Selective logging at this site was conducted between 2015 and 2020 using reduced-impact logging techniques (70) to comply with federal regulations. The climate at this site is tropical humid, with mean annual precipitation of 2000 mm, and dry season extending from July to October (71).
Classification of forest degradation
Reference data for land cover classification
Our forest degradation reference dataset was built from visual interpretation of Landsat TM, ETM + and OLI time-series, from 1984 to 2020. At least one cloud-free image per year was available for each site. For each image of the time-series, we computed the Normalized Burn Ratio (NBR), a spectral index that has been widely used to detect forest disturbances (72, 73). NBR is computed according to the following Eq. (74):
$$NBR= \frac{\rho NIR - \rho SWIR}{\rho NIR + \rho SWIR}$$
where \({\rho }\text{N}\text{I}\text{R}\) and \({\rho }\text{S}\text{W}\text{I}\text{R}\) correspond to surface reflectances of near infra-red and shortwave infra-red bands, respectively. We then manually delineated fire and selective logging polygons and recorded the year of the degradation event. We masked out water and wetlands based on Gumbricht, Roman-Cuesta (75), and bare ground, pasture, and croplands and recent deforestation using Brazilian TerraClass and PRODES classifications (27, 50).
PlanetScope images
We selected one recent multispectral PlanetScope image with four spectral bands (blue, green, red, and near infrared) (76) for each site. For each image we calculated the Enhanced Vegetation Index (EVI), to highlight both recent degradation and subsequent regeneration. EVI is computed following Huete, Didan (77):
\(EVI=2.5 * \frac{(\rho NIR - \rho Red)}{(\rho NIR + 6* \rho Red - 7.5 * \rho Blue + 1)}\)
where \({\rho }\text{N}\text{I}\text{R}\), \({\rho }Red\) and \({\rho }Blue\) correspond to surface reflectances of near infra-red, red and blue bands, respectively. Image dates were selected based on the disturbance occurrence on each site and proportion of cloud cover (Table 1).
Table 1
Overview of PlanetScope and lidar data for each site.
|
PlanetScope data
|
Airborne lidar data
|
Site
|
Image date
|
Area (ha)
|
Year(s) of acquisition
|
Area of intact forests (ha)
|
Area of logged forests (ha)
|
Area of burned forests (ha)
|
Feliz Natal
|
17-Jun-2018
|
235,872
|
2017–2018
|
717
|
1574
|
1751
|
Xingu
|
18-Oct-2017
|
191,823
|
2017
|
740
|
0
|
554
|
Saracá
|
19-Sep-2020
|
45,756
|
2013–2015
|
855
|
2082
|
0
|
Total
|
|
473,451
|
|
2312
|
3656
|
2305
|
The GLCM approach and generation of predictors
We used the Gray-Level Co-Occurrence Matrix (GLCM) textural technique (78) to calculate metrics used to classify degraded forests in our test sites. Texture in images quantifies pixel grey level differences, size of area where change occurs (neighborhood, defined by a window size), and directionality (79). GLCM tabulates how often different combinations of pixel grey levels occur in each image and then derives statistics from this tabulation. The eight GLCM metrics used in this study can be categorized into three groups: (1) descriptive statistics, which include mean, variance, and correlation; (2) contrast, which includes contrast, homogeneity, and dissimilarity; (3) and orderliness or regularity, which includes angular second moment and entropy. Description of the GLCM metrics and practical guidelines for choosing GLCM metrics for classifying remote sensing images can be found in Hall-Beyer (80) and Hall-Beyer (79).
We generated the GLCM metrics for the Planet-derived EVI using the glcm package (81) in R (82). After empirical tests, we selected the following parameters for the glcm function: window size of 45 pixels (140.625 m); and shift as the average across all directions (i.e., no effects of directionality in the observed phenomena). We trimmed the outermost window along the edge of each image to avoid artifacts where there was insufficient information for GLCM to compute the accurate textural values.
Although the textural feature window size of about 141 m captures considerable heterogeneity associated with degraded forest patches, we explored aggregating windows to a coarser spatial resolution for more accurate classification (83). We tested different aggregation resolutions (140.625m, 281.25m, 562.5m, and 1125m; corresponding to 45, 90, 180 and 360 PlanetScope pixels, respectively), and based on model performance, we selected the 562.5m resolution. The following resampling statistics were used to aggregate the GLCM metrics from the native resolution (3.125m) to 562.5m grid cells: average, standard deviation, skewness, root mean square, minimum, first quartile, median, third quartile and maximum. In total, 72 raster layers (9 resampling statistics for each of the 8 GLCM metrics) were used as explanatory variables for the classification model.
Probabilistic classification model
Rather than selecting a single hard classification for each grid cell of our image, we quantified the probability that grid cells would fall into each of three classes: intact, logged, and burned (Fig. 1, box 1). Gradient boosted trees were used to classify grid-level degradation because of their strong predictive performance and flexibility in accommodating typical features of data such as nonlinearities and interactions (84). Our multinomial classification tree model was fitted using the stochastic gradient boosting algorithm implemented in the xgboost R package (85). The multi:softprob objective function was utilized to output a grid-level prediction containing an estimated probability of belonging to each degradation class.
Because degradation classes were unbalanced, a class-weighted loss function was utilized during model training. We specified a weight variable such that the sum of individual observation weights within each class was equal across the three degradation classes. Individual observation weights were subsequently multiplied by the grid cell associated purity (i.e., the percentage of the grid cell that is occupied by the dominant class) to down-weight error contributions from less homogeneous reference data. The full dataset was partitioned into training (50%) and test (50%) sets using stratified random sampling to balance class distributions within each split. The stratification variable consisted of binned purity values, in increments of 0.2, within each disturbance class. GLCM features were centered and scaled within each site to put each feature on a common scale after combining data for the three sites.
Several techniques were utilized to avoid overfitting during training and optimize the model’s bias-variance trade-off. First, we randomly selected (without replacement) 80% of the training data and 70% of the GLCM features to be utilized within each boosting round. This stochasticity decorrelates the decision trees and increases the predictive performance of the ensemble. Second, we used early stopping to halt training once the multiclass error rate for the validation set failed to decrease after five boosting iterations, thus rapidly detecting the inflection point of the learning curve. Finally, we set the learning rate to 0.2.
We determined optimal ranges for each hyperparameter through an iterative grid search. Terminal nodes were allowed to have a minimum of five observations, the maximum tree depth within a boosting round was equal to four, and a reduction of 0.2 in the multiclass error rate was required to further partition a leaf node. To assess model performance while accounting for uncertainty in both partition variability and algorithm stochasticity, we fitted classification models to 100 randomly generated partitions of the data.
Several studies showed that the degradation signal fades from optical images within 3–5 years (34, 86, 87) because of forest regeneration. We excluded from model training the grid cells with logging and fire disturbances that occurred more than five years prior to the date of the image, because preliminary model fits indicated increased confusion for older disturbances.
Adhering to good practice of accuracy assessments as suggested by Olofsson, Foody (59), we report the accuracy of our classification by presenting the confusion or error matrix and the most common accuracy measures. These assessments are presented for the highest predicted probability of land cover for each grid cell as if we had conducted a hard as opposed to a probabilistic classification: Overall accuracy, which is simply the proportion of the area mapped correctly. It provides the user of the map with the probability that a randomly selected location on the map is correctly classified. User's accuracy is the proportion of the area mapped as a particular category that is actually in that category in the reference data. Producer's accuracy is the proportion of the area that is a particular category in the reference data that is also mapped as that category.
Biomass estimates
To estimate grid cell-level biomass, we used high point-density lidar data collected over or adjacent to the test sites (88). In total, 8,723 hectares of airborne lidar over intact, logged, and burned forests were included (Table 1). When classifying the lidar transects over degraded forests, no distinction was made with regard to time since disturbance. Biomass was estimated as aboveground carbon density (ACD) from the lidar-derived top-of-canopy height at 50 m plot resolution (0.25 ha) using the methods detailed in Longo, Keller (23), and then resampled to 500 m resolution (25 ha) using the median values of the pixels in the 10x10 plot-equivalents, to approximately match the resolution of the classification predictors (Fig. 1, box 2). We reported the ACD estimates of Feliz Natal and Xingu sites together, because of their spatial proximity and similar ACD range.
Estimation of uncertainties
We utilized Monte Carlo simulations to quantify uncertainty in grid-level aboveground carbon density estimates (Fig. 1, box 3). We performed two simulations to understand the uncertainty from two sources: (i) ACD estimation and (ii) land cover classification. In the first simulation we accounted for the effects of both land cover classification uncertainty and biomass uncertainty. For a given iteration, individual grid cells were first classified by randomly sampling from the degradation classes with probability equal to the predicted class probabilities from the multinomial classification model (label ACD + LCLUC). For the second simulation we considered only the uncertainty in ACD estimates by selecting the degradation class with the highest class predicted probability (label ACD only). For both simulations, ACD on a per-grid cell basis was determined by randomly sampling from the associated site and degradation class carbon density estimates. We computed site-level carbon density statistics for a total of 10,000 iterations.