Study sites
The U.S. Forest Service Southern Research Station’s (SRS) experimental forest network (EFN) consists of 19 units (Table 1) across the southeastern U.S. (Figure 1). Established decades ago, these experimental forests were intended to address a variety of research needs, from silviculture to naval stores production to forest genetics, erosion control, and hydrology. The SRS EFN covers seven of the nine distinct southeastern ecoregions as defined by Bailey (2016), including the Central Appalachian Forest, Southeastern Mixed Forest, Outer Coastal Plain Mixed Forest, Eastern Broadleaf Forest, Lower Mississippi Riverine Forest, Ouachita Mixed Forest, and Ozark Broadleaf Forest. The first three of these were the most sampled in this study (including three or more experimental forests) and correspond to the Southern Appalachians, Piedmont, and Coastal Plain physiographic regions, respectively.
In 2018-2019, a series of plots (coordinates) were systematically established on all experimental forests in the SRS EFN to provide a standardized and uniform framework for large scale analyses. The ultimate goal is to collect long-term forest data at these locations following the Forest Service’s Forest Inventory and Analysis (FIA) protocol (Bechtold & Patterson, 2005). However, such data have not yet been collected at most of the plots sampled in this study, and were thus not available for the current analysis. From this collection of plots, we randomly selected five from each experimental forest for use in this study. We selected two additional plots at Chipola EF because there was a possibility that some would be disturbed by planned management operations, and would therefore need to be dropped from the study. However, this did not happen, resulting in a total of 97 plots across the network. The distance between plots within experimental forests was at least 500 m while the distance between experimental forests ranged from ~20-1,400 km (Figure 1). All plots were forested, although canopy openness and successional stage differed considerably depending on recent disturbance events (e.g., Hurricane Michael) or management activities (e.g., mechanical thinning or prescribed burning). Based on National Land Cover Data from 2021, the mean and median percent forest cover within 500m of the plots were 86.8% and 91.0%, respectively, with a range of 31.3-100%.
Pollinator sampling and identification
Pollinators were sampled on all 19 experimental forests using colored pan traps. Although pan traps provide a standardized, simple, and effective method for sampling pollinators in forests (Ulyshen et al., 2022) they are known to be more effective at collecting some taxa than others (Baum & Wallen, 2011; Cane et al., 2000) and can also under-sample pollinators in areas with an abundance of flowers (Baum & Wallen, 2011). While netting off flowers would have improved the study (Roulston et al., 2007), we limited sampling to pan trapping given limits of time and personnel as well as the logistical challenges inherent to visiting remote sites across such a large geographic area. Although we did not conduct floral surveys in this study, we did measure canopy openness (see plot measurements, below) which correlates with flower availability (Chase et al., 2023) and include this term in models of pollinator richness (see statistical analysis, below). Sampling involved placing two sets of three pan traps at each plot, with the traps filled with soapy water during operation. Each set was situated 5 m to the north or south of plot center and consisted of a yellow, a blue and a white plastic bowl (15.5 cm opening, ~400 ml capacity) suspended 20-30 cm above the ground on wire stands. Sets were oriented east-west with a 5 m separation between traps. The traps were used as they were manufactured and were not painted.
Traps were operated once a month for three days at a time from March to September 2021. Target dates for sampling were established at the beginning of the study. However, because weather conditions differed considerably across our study region, the exact dates of sampling varied somewhat (Table S1) to ensure that trapping took place at all locations during periods of clear weather when pollinators would be most active. All bees, hover flies, and butterflies were pinned and identified by the senior author (bees and butterflies) and SR and ADY (hover flies) to species (or rarely to morphospecies) using online (discoverlife.org) and printed resources (Gibbs, 2011; Gibbs et al., 2013; Glassberg et al., 2000; Mitchell, 1960, 1962; Skevington et al., 2019). Voucher specimens have been deposited in the senior author’s reference collection. Data were pooled by plot and sampling period for analysis.
Plot measurements
We used a convex densiometer to measure canopy openness at the center of each plot (Lemmon, 1956). Measurements were made while facing each cardinal direction in May, June, and July and we used the average of these readings as the percent canopy openness for each plot. Measurements at these times of year captured the maximum shade condition of overstory trees across our study area. We also collected information on stand basal area and overstory composition by recording the diameter and species of all trees (>20 cm at 1.4 m above the ground) within a 0.1 ha circular area centered on each plot.
Statistical Analysis
Our analysis, performed in R (R Core Team, 2022) and ArcGis Pro, consisted of two parts. First, we related the alpha diversity (i.e., local species richness at the plot level) and community composition of pollinators to land cover data and stand metrics across all plots. We then tested for differences in community composition, pollinator diversity (alpha, beta, and gamma as defined below), and seasonal patterns of alpha diversity and abundance among the most-sampled ecoregions (i.e., those represented by three or more experimental forests: Central Appalachian, Coastal Plain and Southeastern Mixed, Figure 1). Whereas alpha diversity is defined as the number of species observed in a plot, beta diversity represents the dissimilarity in diversity among plots within each experimental forest and consists of turnover (species replacement) and nestedness (species loss) components (Baselga, 2010). Finally, gamma diversity refers to the total diversity within each ecoregion.
We used NLCD (National Land Cover Database, https://www.usgs.gov/centers/eros/science/national-land-cover-database) data from 2021 (Dewitz, 2023), with a resolution of 30 x 30 m, to calculate the percentage of each land cover category surrounding our plots. This was done at five spatial scales (radii of 250 m, 500 m, 1 km, 2 km, and 4 km). Land cover metrics tested for multicollinearity at each spatial scale were percentages of: non-wetland forest (i.e., deciduous forest + evergreen forest + mixed forest), non-wetland open land (i.e., grassland/herbaceous + shrub/scrub + pasture/hay + barren land + cultivated crops + developed high, medium, and low intensities + developed_open), wetland (i.e., woody wetlands + emergent herbaceous wetlands), and evergreen forest (hereafter referred to as conifer forest). Of these, non-wetland forest was dropped based on a high variance inflation factor (VIF) value, and non-wetland open land was dropped based on subsequent problems with model convergence.
After confirming a lack of spatial autocorrelation for any of the response variables among experimental forests based on Moran’s I using the package ape (Paradis & Schliep, 2019), we used the lme4 package (Bates et al., 2015) to produce negative binomial regression models (glmer.nb) of pollinator richness. To determine which spatial scale to use in the final models of pollinator diversity, we first tested separate models for each of the five spatial scales mentioned above and compared the R-squared values, using the rsq.glmm function of the rsq package (Zhang & Zhang, 2022), i.e., the scale of effect: Holland et al., (2004). This was done separately for the richness of bees, hover flies, butterflies, and all pollinators combined. Based on this analysis, we determined landscape metrics to be most informative at the scale of 500 m for bee, butterfly, and total pollinator richness, and at the scale of 2 km for hover fly richness (Figure S1), so these were the scales used in the corresponding models.
Our final model of pollinator richness included canopy openness, wetland, and conifer forest as fixed effects (all scaled to have a mean of 0 and a standard deviation of 1) and experimental forest as a random effect to account for the lack of independence among plots within each experimental forest. Additional terms for the number of tree genera and basal area were not included due to multicollinearity. Given our study’s large sample size, we used the default Wald Z-test to assess significance. However, to confirm these conclusions, we also conducted likelihood ratio tests by dropping predictors one at a time (drop1) and using the chi-square test to assess significance. Using the same methods as described above, we calculated R-squared for each model and the contributions of fixed and random terms to the total.
To further explore how the relationship between bee richness and conifer forest may vary among ecoregions, we re-ran the model after limiting to the Southeastern Mixed or Coastal Plain ecoregions. We did not examine this relationship for other ecoregions due to either low replication (e.g., some ecoregions were represented by just one experimental forest) or, as for the Central Appalachian ecoregion, a low coverage of conifer forests. Moreover, we did not repeat these analyses for hover flies or butterflies because conifer forest was not a significant predictor in the main models for those groups.
For comparisons of pollinator community composition (all groups combined) among plots, we performed non-metric multidimensional scaling (NMDS) on a Bray-Curtis distance matrix using the vegan package (Oksanen et al., 2007). This was based on Hellinger-transformed abundance data which involves dividing each count by the sample total and taking the square root of each resulting proportion. We then conducted envfit tests for significant correlations between the resulting axes and the following variables of interest: canopy openness, wetlands within 500 m, conifer forests within 500 m, basal area, and the number of tree genera. Then we used the adonis2 function of the vegan package (Oksanen et al., 2007) to test for pairwise differences (PERMANOVA) in pollinator composition among the three most-sampled ecoregions. Finally, to determine if any taxa were strongly associated with one or more of the ecoregions, we performed indicator species analysis using the multipatt function in the package indicspecies (De Caceres et al., 2016). This test produces values ranging from 0 (no association) to 1 (complete association).
To compare the alpha (plot-level) diversity of bees, hover flies, butterflies, and all pollinators among the most-sampled ecoregions, we ran negative binomial models consisting of ecoregion and experimental forest as fixed and random effects, respectively. Total beta diversity, along with its individual turnover and nestedness components, were calculated for each experimental forest using the beta.div.comp function of the adespatial package (Dray et al., 2023). These metrics were then compared among the most-sampled ecoregions using generalized linear models. To compare gamma diversity among the most-sampled ecoregions, we used the iNext package (Hsieh et al., 2022) to produce rarefaction and extrapolation curves for richness (q=0), with statistical significance indicated by non-overlapping confidence intervals at the maximum reference sample size (Chao et al., 2014).
To better understand how landscape and stand metrics differed among the most-sampled ecoregions, and how these differences might help explain patterns in pollinator diversity, we tested linear mixed effects models (lmer) relating wetland, conifer forest, and canopy openness to ecoregion. Canopy openness was square-root transformed to meet assumptions of normality. Because assumptions could not be met for wetland data, we made pairwise comparisons between ecoregions using the non-parametric Wilcoxon signed-rank tests using the pairwise.wilcox.test function of the stats package.
To test whether pollinator seasonality differed among ecoregions, we compared when the richness and abundance of bees, butterflies, and hover flies peaked among the most-sampled ecoregions. First, for each sampled plot, we calculated the richness or abundance of each taxon by month relative to the that taxon’s highest monthly richness or abundance from the same plot, resulting in a standardized value ranging from 0-1. We then plotted the mean ± SE of these values by ecoregion and month to visualize differences in seasonality among ecoregions. We used pairwise Wilcoxon signed-rank tests to determine if the peak month of response varied among the most-sampled ecoregions.
Finally, we used the Chao1 estimator to estimate the richness of bees, butterflies, hover flies, and all pollinators combined that could be collected using pan traps across our studied forests. These calculations were made for the entire region as well as for each of the three most-sampled ecoregions using the rareNMtests package (Cayuela & Gotelli, 2014). The estimates are based on the corresponding lists of collected species and their abundances.