Study region
Our study region was located on the eastern side of the Central Cordillera of Colombia, corresponding to a highly fragmented landscape of the humid tropical biome between 150 and 2500 m elevation (Fig. 1). This region covers a mixture of sub-Andean humid forest relicts of different sizes, agroecosystems, shrublands, secondary vegetation in different stages of succession, cattle pastures, and artificialized soils (IDEAM et al., 2017). Also, this region contains three large artificial water bodies corresponding to the San Lorenzo, Playas, and Punchiná hydroelectric dams.
Bird Sampling
We chose six bird species commonly sighted in the study region with different habitat requirements and land-use patterns: Formicarius analis, Oncostoma olivaceum, Cercomacroides tyrannina, Ortalis columbiana, Troglodytes aedon and Tyrannus melancholicus. We generated detection histories for each selected bird species by establishing 70 fixed sampling points along 14 transects (1.25 km each) covering the transition from forest to secondary vegetation or grassland (Fig. 1). A minimum of 215 linear meters separated each observation point. We visited each sampling point during June and July of 2014 and 2015, recording all visual or auditory detections for 15 minutes within a maximum radius of 100 meters. Bird records were taken by at least four experienced observers, visiting each site for two consecutive days to obtain four temporal replicates per year. All sampling events were conducted between 6:00 am, and 10:00 am.
F. analis (Formicariidae) is a medium ground species (body mass: 62.19 g; length: 19 cm) that feeds mainly on invertebrates found on the forest floor and is distributed in Central and South America. O. olivaceum (Tyrannidae) is a small insectivorous species (body mass: 6.6 g; length: 9.1 cm) found in forest understory, secondary vegetation, and shrublands areas; this species is distributed in Panama and northern Colombia. C. tyrannina (Thamnophilidae) is a small insectivorous species (body mass: 9.34 g; length: 16 cm) that feeds mainly on invertebrates, and is commonly observed in forests, forest edges, and secondary vegetation. O. columbiana is a large species endemic to the Andean region of Colombia (body mass: 600 g; length: 53 cm) that feeds mainly on fruits and seeds. This species is common in its distribution range and is associated with forest edges, gallery forests, forest clearings, and pastures with scattered trees; it is occasionally observed in urban and peri-urban areas. T. aedon is a small species (body mass: 10.85 g; length: 11.4 cm) that inhabits semi-open areas, forest clearings, peri-urban and urban areas, and feeds on invertebrates and fruits. Finally, T. melancholicus is one of the most widely distributed species in Colombia, inhabiting pastures, grasslands with scattered trees, urban and peri-urban areas. It is a medium sized species (body mass: 35 g; length: 22 cm) with generalist feeding habits (mostly invertebrates) that stays high in the canopy. All descriptions were obtained from McMullan et al. (2011) and Wilman et al. (2014).
Land cover classification and Patch-based metrics
We obtained a mosaic covering the entire study region from 12 scenes of the MSI (Multispectral Imager) sensor in the Sentinel 2 Earth observation satellite, employing the Sen2Cor and Sen2Three modules of the free Sentinel Application Platform (SNAP) software (Louis et al., 2016; Main-Knorn et al., 2017). The combination algorithm gives greater importance to suitable pixels (i.e., no clouds or shadows) from scenes temporally closer to 2015, using pixels from more distant scenes only if closer scenes had low-quality information. Using the Sentinel 2 mosaic, the continental, coastal and marine ecosystems map of Colombia (IDEAM et al., 2017), and a high-resolution aerial photograph, we performed a supervised land cover classification using the open-source Semi-Automatic Classification Plugin (SCP) for QGIS 3 (Congedo, 2016), identifying five vegetation - land cover types: forest, secondary vegetation, grassland, water and bare or urban soil (Fig. 2). The obtained Kappa index was 0.82, and the overall supervised classification accuracy across the landscape was 88.07%, indicating an acceptable performance (Olofsson et al., 2014) (see supplementary material Table S2).
We calculated 24 patch-based metrics for each sampling unit focusing on forest cover using the landscapemetrics package for R (Hesselbarth et al., 2019). Since many of these metrics are strongly correlated with each other, we retained only those metrics that showed the lowest degree of correlation (i.e., Pearson r < 0.7) with any of the other metrics: total area (Ca), average shape index (Shape), number of patches (Np), clumpiness index (Clumpy), and mean Euclidean distance to nearest neighbor (Enn). The Ca and Shape metrics quantify patch area and shape, while Np, Clumpy, and Enn are aggregation metrics. We implemented circular nested buffers with four different radii (i.e., 70, 130, 270, and 8100 meters) to calculate the metrics and obtain a multiscale average for each sampling point. The equations and descriptions of each patch-based metric used here are widely described in the scientific literature (e.g., McGarigal and Marks, 1995).
Generally, Ca expresses the average area, in hectares, of all forest patches within the assessed radii and is equivalent to the habitat amount according to Fahrig (2013), assuming forest cover as a proxy for habitat of forest specialist species. The average Shape index expresses the ratio between perimeter and area of the forest patches located within the assessed radii. The shape index shows low values for regular-shaped patches (i.e., quadrangular). The number of patches, Np, is one of the simplest ways to quantify fragmentation per se in a landscape; highly fragmented landscapes have many patches. Clumpy indicates the aggregation degree of forest patches and ranges from − 1: maximum disaggregation to 1: maximum aggregation. Finally, Enn measures the average edge-to-edge distance to the nearest neighbor of the same class, in this case, forest. We expect Ca, Clumpy and Enn to be important for habitat specialists such as Formicarius analis, Oncostoma olivaceum and Cercomacroides tyrannina, whereas the number of patches and clumpy to be informative for generalist species such as Tyrannus melancholicus and Ortalis columbiana.
Land cover gradient representation and Surface-based metrics
We obtained the polar principal spectral indices (PPSi) from the polar transformation of the Sentinel 2 mosaic spectral bands for the study region, adapting the methodology developed by Moffiet et al. (2010) for ETM + and TM sensor scenes from the Landsat mission (see supplementary material Fig. S1). The PPSi indices decompose the land cover variation into three orthogonal dimensions: greenness (PPSG), brightness (PPSB), and wetness (PPSW), similar to the Tasseled Cap indices (Kauth & Thomas, 1976). With PPSi indices, it is possible to differentiate the variation in land cover between sites associated with changes in the vegetation itself (i.e., type of cover, amount of foliage, proportion of vegetation) from the variation associated with spectral brightness or the presence of water, unlike unidimensional indices such as NDVI (Moffiet et al., 2010). The PPSG index represents the most significant variation among the vegetation cover types classified for the study region, separating forest and secondary vegetation from pasture and bare or urbanized soil areas. Therefore, using the PPSB index in conjunction with the PPSG index is necessary to separate forest and secondary vegetation cover. Thus, sites or pixels with high PPSG values will have high PPSB values if they correspond to secondary vegetation and low PPSB values if they correspond to a forest (see supplementary material Fig. S1). Combining these three indices allows describing land cover through continuous variables that accurately represent the vegetation gradient present in a region based on information derived from remote sensing (Fig. 2).
We calculated 11 surface metrics for each sampling unit based on the PPSG greenness index using the geodiv package for R (Smith et al., 2021). We calculated surface metrics by implementing circular buffers with the same radii used to calculate patch-based metrics (i.e., 70, 130, 270, and 8100 meters) to obtain a multiscale average per sampling point. Then, we retained five metrics that showed a lower degree of correlation with each other (i.e., Pearson r < 0.7): average roughness (Sa), surface asymmetry (Ssk), surface kurtosis (Sku), surface area ratio (Sdr), and summit density (Sds). Sa, Ssk and Sku are amplitude metrics quantifying the variability and statistical distribution in the surface values (i.e., vertical distribution) without considering the spatial arrangement, location, or distribution of the peaks and valleys formed by the surface values (i.e., horizontal distribution). Sdr and Sds are spatial metrics, i.e., they consider both vertical and horizontal distribution of surface values.
In this case, Sa measures the absolute deviation of the PPSG values from the mean within each radius and can be interpreted as the vegetation cover variation; high Sa values indicate high heterogeneity in vegetation cover. Sa increased in sites composed of contrasting vegetation cover in our study area, such as forest-grassland edge zones with forest fragments of various sizes surrounded by different cover types. Ssk measures the skewness, and Sku measures the kurtosis of the distribution of PPSG values, so they are interpreted as complementary measures of dominance. Low Ssk values indicate dominance of high greenness values (i.e., forest or secondary vegetation), and high Sku values are interpreted as high dominance of greenness values around the mean, i.e., there is a low probability of obtaining values far from the mean. Sku is not a directional metric, so it does not indicate the dominant greenness value specifically in each locality. Sku may increase in relatively homogeneous sites around the average greenness value, which may be low or high. In our study area, Sku correlates moderately with greenness (PPSG; r = 0.48, p < 0.001), summit density (Sds; r = 0.48, p < 0.001) and forest area cover (Ca; r = 0.46, p < 0.001), indicating that higher Sku values tends to occur in forest or secondary vegetation areas predominantly.
The area surface ratio, Sdr, measures the ratio between the observed surface area and the area of a flat surface with the exact x,y dimensions, so it increases with increasing local variability in slopes; Sdr is an indicator of the amount and magnitude of vegetation cover contrasts or edges. Sds measures the number of local peaks of greenness per area and is therefore related to the number of pixels/sites with high greenness values (forests and secondary vegetation) and their spatial distribution. Sds can be interpreted as an indicator of forest fragmentation since the density of greenness peaks (i.e., localities with high PPSG values) corresponds to forest or secondary vegetation cover patches.
Species occupancy analysis
We fit dynamic occupancy models (Kéry & Royle, 2021; MacKenzie et al., 2003) for the six selected bird species using metrics from the PPM and GM landscape models as occupancy covariates (Ψ) and the date and sampling time as detection covariates (p). No covariates were used to predict colonization (γ) or extinction (ε), assuming little or no variation in the dynamics governing the individual’s distribution of the species of interest over this period (i.e., one year) (Betancur et al., 2020). We used the colext function in the unmarked package for R software to fit all models (Fiske & Chandler, 2011). First, we built a global model for each species with all selected landscape metrics plus altitude to test the general model fit and diagnose the presence of overdispersion with the mb.gof.test function from the AICcmodavg package. We tested the goodness-of-fit of the global models by comparing the observed chi-square statistic with the reference distribution generated by 999 simulations. We evaluated overdispersion by calculating the variance inflation factor (cˆ). Global model is theoretically the best-fitting model because all candidate models are special cases of the global model, so we can compute from it an unambiguous cˆ to diagnose the presence of overdispersion (Burnhan & Anderson, 2002).
We modeled the detection parameter for each species while holding occupancy, colonization, and extinction constant. We then modeled the occupancy parameter using the top-ranked detection sub-model and holding colonization and extinction constant. We built a set of 42 candidate models for the occupancy parameter avoiding models with correlated covariates or with more than three covariates to avoid problems of model convergence and interpretation of results (see supplementary material Table S3). Our main objective was to compare the metrics derived from the PMM and GM landscape models, so we did not include models combining patch-based and surface-based variables, except for the global model. Models were evaluated according to the quasi-AIC criterion (QAIC), calculated using the global model variance inflation factor. Selecting the best models from the QAIC and cˆ allows considering the overdispersion in the calculation of the model parameter standard errors caused by possible lack of independence in the predictors or species detection histories (Richards, 2008). We ranked models using ΔQAICc and considered that models with ΔQAICc < 2 had the strongest support (see supplementary material Table S4). For the final selection, we excluded models with poor explanatory power or uninformative variables, i.e., variables whose confidence interval included zero (Arnold, 2010). Finally, we assessed the goodness-of-fit of the final models for each species using the mb.gof.test function on the AICcmodavg package. With the final models for each species, we made inferences about the covariates’ effects and the direction of the relationships.