Study region
The 74,788 km2 Madrean Archipelago Ecoregion (Omernik 1987), or Madrean sky islands, encompasses portions of the northern states of Sonora and Chihuahua in México, and southeastern Arizona and southwestern New Mexico in the United States (Fig. 1). Our study included 55 sky island mountain complexes within the ecoregion, with an area of approximately 45,000 km2 (Deyo et al. 2013). The international border divides the ecoregion near the center at 31°33 N.
The parallel valley and mountain configuration of the Madrean sky islands resulted from continental-scale deformation over the past several hundred million years in response to plate tectonics (Coblentz 2005). Valleys receded to expose older rocks derived from marine invasions, caldera eruptions and lava flows and metamorphic core complexes (Warshall 1995). Mountain ranges are surrounded by temperate desert grassland and both temperate and subtropical scrub communities formed at the junction of the Sonoran and Chihuahuan deserts.
Climate in the region is characterized by a bimodal precipitation pattern. Winter (November to March) frontal storms bring snow to higher elevations and rain to the lower elevations (Adams and Comrie 1997; Sheppard et al. 2002). In mid- to late-summer (July to August), moisture from the south creates a monsoonal weather pattern with thunderstorms. The arid foresummer period (April to June) preceding the monsoonal moisture is typically hot and dry and generally corresponds to the fire season. Annual precipitation varies greatly by elevation, with orographic uplift contributing to increased precipitation at upper elevations.
Pine species descriptions
The five pine species vary in their association with particular environments (Table 1). Many studies reported relative affinities based on comparisons with other species, limiting interpretations of absolute limits or preferences. However, reports of broad environmental tolerance suggest that diverse microclimates associated with topographic variability could be utilized by species including P. strobiformis. Likewise, reports of drought tolerance suggest that cooler and wetter microclimates (e.g., mesic, north-facing slopes inhabited by P. arizonica) may provide refugia to some degree as warmer and drier regional climates affect local conditions. For P. discolor, tolerance of cooler, more mesic conditions at higher elevations could accommodate upward migration, while drought resistance could promote survival in its current range. Sensitivity of P. engelmannii to drought severity could make microrefugia that are decoupled from regional climate especially important to this species. Although P. chihuahuana has shown potential to adapt to warmer temperature, poor soils and wildfire, conditions conducive to regeneration may benefit from sheltered habitats found in microrefugia.
Table 1. Affinities of the five pine species with the climatic environment, documented in recent literature. Geographic focus refers to data sources from Northern (mostly United States); Southern (primarily México) and Region-wide (sky islands and beyond). English and Spanish common names were reported in Felger et al. (2001, and verified by M.S. Gonzalez-Elizondo, pers. comm.). For some species, scientific names represent groupings of synonyms from Plants of the World Online (http://powo.science.kew.org) and International Plant Names Index (https://www.ipni.org): Pinus strobiformis; P. arizonica (syn. P. ponderosa var. arizonica); P. discolor (syn. P. cembroides var. bicolor); P. engelmannii; P. chihuahuana; (syn. P. leiophylla var. chihuahuana).
Species name
|
Pinus strobiformis
Southwestern white pine
Pino huiyoco
|
Pinus arizonica
Arizona pine
Pino blanco
|
Pinus discolor
Border pinyon
Piñón, pino piñonero
|
Pinus engelmannii
Apache pine
Pino real
|
Pinus chihuahuana
Chihuahua pine
Pino chino, pino chimonque
|
Geographic focus
|
|
|
|
|
|
Northern
|
Intermediate shade tolerance4
Broad environmental tolerance10
Associated with longer fire intervals3, 14, 15
|
Moisture availability and fire intervals related to stand dynamics3
Thrives in disturbed sites with shade and exposed soil14
Tree establishment is associated with longer fire intervals3, 14, 15
|
Water stress controls lower elevational limits; exhibits both drought resistance and tolerance of cold, more mesic conditions at higher elevations4
|
Water stress controls lower elevational limits; post-fire increase in light and temperature favorable to seedlings4
Sensitivity to water stress after wildfire can limit regeneration12, 13
|
Water stress controls lower elevational limits; low light and deep litter control upper elevational limits; less shade tolerant than higher elevational pines4
Potential for adaptation to warmer temperatures, reduced soil productivity, and wildfires9
Sensitivity to water stress after wildfire can limit regeneration12, 13
|
Southern
|
Linked to elevation and solar radiation, great temperature fluctuations across the year; large spatial distribution and niche breadth1
Tolerant of semi-arid conditions6
Found in mesic sites in pine forests8
Inhabits northern aspect slopes and topographically sheltered sites10
|
Moderately sensitive to climate change6
Responds to changes in drought severity & cumulative water deficit7
|
|
Dry to moderately summer-warm open mountain slopes, mild climate, variable rainfall, elevation and maximum temperatures, open & mixed pine-oak2
Sensitive to drought severity7
Found in semi-dry temperate areas in pine-oak forests8
|
Highly to moderately sensitive to climate change6
Sensitive to drought severity7
Found in semi-dry temperate areas in pine-oak forests8
|
Region-wide
|
Large range shift and contraction forecast; occurs in areas with high winter precipitation and low warmest month temperature11
|
Inhabits mesic, north-facing slopes or canyon bottoms5
|
More drought tolerant than other pines5
|
|
More drought tolerant than other pines5
|
|
|
|
|
|
|
|
|
|
|
|
|
1 Aguirre-Gutiérrez et al. 2015
|
7 González-Cásares et al. 2017
|
13 Poulos et al. 2020
|
|
2 Ávila-Flores et al. 2016
|
8 González-Elizondo et al. 2013
|
14 Iniguez et al. 2008
|
|
3 Barton et al. 2001
|
9 Hess and Fule 2020
|
15 Iniguez et al. 2016
|
|
4 Barton 1993
|
10 Looney and Waring 2013
|
|
|
5 Ferguson et al. 2013
|
11 Shirk et al. 2018
|
|
|
6 Gómez-Mendoza and Arriaga 2007
|
12 Barton and Poulos 2018
|
|
|
Field data and derivation of species’ presence/absence variables
We combined observations of pines from field studies conducted in the sky islands of México (Flesch et al. 2016; Flesch 2019) and the United States (Ganey et al. 1996; Sanderlin et al. 2013) to generate presence and absence data for each species (Fig. 1, Table 1). In México, vegetation data were gathered within 50 m of 1104 sampling points placed non-randomly in representative areas in major montane vegetation communities from 2009-2012. The majority of sky islands that support pines were sampled (Ferguson et al. 2013).
In the United States, data were collected within 318 plots between 2014 and 2015 (Sanderlin et al. 2014). Trees (> 10 cm diameter at 1.4 m above ground) distinguished by tree species were recorded within 0.1 ha circular plots (17.8-m radius). Plots were located along transects typically consisting of 12 points spaced at 300 m intervals (Sanderlin et al. 2013; Ganey et al. 2017), with transects being proportional to the approximate coverage of each montane vegetation type in the landscape.
For the response variable in species distribution models, species were considered present when that species was recorded as a tree within the sampled plot and absent when it was not recorded at the plot. Total sample sizes after combining observations from the México and United States field studies were n = 1422 for all species except P. discolor (n = 1204); a subset of plots did not include data for this species. Frequencies of occurrence for each species across datasets were: P. strobiformis 14%; P. arizonica 35%; P. discolor 23%; P. engelmannii 30%; P. chihuahuana 17% (see Maps of presence and absence data, Online Resource 2).
Using field data provided distinct advantages over more commonly used data in species distribution modeling which are often comprised of gridded species’ occurrence records with coarse spatial resolution (Araújo and Guisan 2006). Field data used in our analysis include reliable absence records that are especially important in estimating probabilities of occurrence by comparing environmental characteristics at sites where the species was observed with those at sites where it was not observed (Guisan and Thuiller 2005; Guillera-Arroita et al. 2015).
Disadvantages of the sampling designs of the field studies include spatial bias (Araújo & Guisan, 2006) towards larger mountain ranges and potential for missing variability in mountain ranges where pines were assumed to be absent. Therefore, in some locations, predicted probabilities may reflect potential rather than actual distribution. Lower predictive ability may also result from factors not considered in the study design (i.e., island size, area of vegetation type) but should not obscure relationships to topography that define microrefugia.
Assembly of predictor variables
We assembled spatial layers of predictor variables across the geographic extent of each sky island complex (as defined by Deyo et al. 2013). We sampled the spatial layers at the field plot locations using the extract function in the R raster package (method=bilinear; Hijmans 2019). The dataset of predictor variables was appended to each species presence and absence database for use in modeling species distributions.
Terrain metrics were calculated from a 30 m Shuttle Radar Topography Mission (SRTM; Farr et al. 2007) digital elevation model (DEM) using SAGA-GIS (Conrad et al. 2015). Metrics included local morphometrics, characteristics of the catchment basin, relative position, and terrain ruggedness (Table 2). The slope position metrics describe locations relative to surrounding landscape, for example, Valley Depth is defined as the difference between elevation and an interpolated ridge level. Slope position metrics represent hillslope scale variations in moisture availability and microclimate (Hoylman et al. 2019).
Table 2. Predictor variables using in species distributions models, data sources, software used and definitions. Models were developed using 1) terrain metrics only; 2) top four predictors from the terrain model plus two remote sensing indices (NDVI or NDMI); 3) top four predictors from the terrain model plus two bioclimatic variables (TD and AHM).
Variable name
|
Source, description
|
Terrain metrics
|
30 m DEM, SAGA-GIS
|
Aspect
|
Local downslope direction with maximum rate of change (radians)
|
Slope
|
Local maximum rate of change in elevation (radians)
|
Elevation
|
Height above sea level (m)
|
Catchment Area
|
Size of the contributing area (m2)
|
Catchment Slope
|
Average slope of the contributing area (radians)
|
Mid-Slope Position
|
Intermediate slope position between high and low (0-1)
|
Slope Height
|
Vertical distance from the base of the slope to its crest (m)
|
Valley Depth
|
Vertical distance to a channel network base level (m)
|
Terrain Ruggedness Index (TRI)
|
Sum of the differences in elevation between a cell and its surrounding cells, calculated in 210 m radius (7 x 7 30-m cell) moving window
|
Remote Sensing Indices
|
Composites of 30 m Landsat imagery, Google Earth Engine
|
Normalized Difference Vegetation Index (NDVI)
|
Spring (2011-03-01 to 2011-05-30) and Monsoon (2010-08-01 to 2010-10-30)
|
Normalized Difference Moisture Index (NDMI)
|
Spring (2011-03-01 to 2011-05-30) and Monsoon (2010-08-01 to 2010-10-30)
|
Bioclimatic Variables
|
Normals period (1981-2010), AdaptWest
|
Temperature Difference (TD)
|
Difference between Mean Temperature of the Coldest Month (°C), and Mean Temperature of the Warmest Month (°C)
|
Annual Heat Moisture Index (AHM)
|
(Mean Annual Temperature + 10)/Mean Annual Precipitation/1000)
|
Using Google Earth Engine cloud computing platform (Gorelick et al. 2017), we developed two median pixel top-of-atmosphere (TOA) reflectance composites of cloud-free 30 m Landsat Thematic Mapper imagery representing two distinct seasons: the summer monsoon (composite period: August 1, 2010 – October 30, 2010) and the spring growing season (composite period: March 1, 2011- May 30, 2011) (Table 2; Fig. 2). For each seasonal composite, we calculated the Normalized Difference Vegetation Index (NDVI) as a proxy for canopy structure and chlorophyll content (Gamon et al. 1995) and Normalized Difference Moisture Index (NDMI) as a proxy for vegetation liquid water content (Gao 1996).
We included two bioclimatic variables, Temperature Difference (TD) and Annual Heat Moisture Index (AHM) calculated for the normals period 1981-2010 at 1 km spatial resolution (AdaptWest Project 2015; Wang et al. 2016) (Table 2). We selected AHM and TD based on their importance in a principal components analysis of climate in the region (Villarreal et al. 2020). Climate normals (1981-2010) for TD and AHM had low correlations (Pearson’s r < 0.50) with each other as well as terrain and remote sensing predictors. The 1981-2010 normals period overlapped or just preceded dates of field data collection.
The AHM variable is derived from Mean Annual Temperature (positive correlation) and Mean Annual Precipitation (negative correlation) (Table 2); temperature and water balance impose limits on plant reproduction and regeneration (Barton et al. 2001; Crimmins et al. 2011; Hess and Fule 2020). Differences in winter and summer temperatures determine TD values; TD represents variability in coastal influences on temperature extremes (Table 2). In the context of the current warming trend, less warming is generally expected to occur in locations that are nearer to coastlines (i.e., low TD; Ashcroft et al. 2009).
Model development and spatial predictions
We developed species distribution models using boosted regression trees, a method which incorporates advantages of tree-based methods and improves performance by fitting many models and combining them for prediction (Elith et al. 2008). We developed the models using the R packages gbm (Greenwell et al. 2019) and dismo (Hijmans et al. 2017). All models were parameterized using the Bernoulli family (for presence and absence data); the proportion of observations used in selecting variables was set to 0.5; complexity of individual trees was 5. To parameterize the learning rate, we started with η=0.01 and adjusted to a slower rate (η=0.005 or 0.001) to improve performance. We used 10-fold cross validation for model tuning and evaluation. Accuracy was evaluated using mean area under the receiver operating characteristic (ROC) curve (AUC) for the cross validations, which is most useful when the goal is predicting realized distributions and when data include known absences (Jiménez-Valverde 2012).
We constructed a terrain-only model (nine variables) and three additional models using the top four terrain predictors: terrain-bioclimatic (AHM and TD); terrain-NDMI (spring and monsoon); and terrain-NDVI (spring and monsoon). We graphed partial dependence functions (R package pdp; Greenwell 2017) for the three variables with highest relative influence in the terrain-bioclimatic models and evaluated the trends in terms of local and broader-scale climate relationships.
We mapped pine species distributions (probability of occurrence) at 1 km resolution based on the terrain-bioclimatic model and the terrain-NDVI model using the predict function in the R raster package (Hijmans 2019). Spatial predictions were smoothed using a Gaussian kernel in a 7 x 7 cell window (3-km radius) to improve visualization of clusters of similar values (e.g., hotspots) while maintaining a continuous scale of predicted probability. We compared the two distribution maps for each species to identify distinct geographic trends.