2.1 Climate data sourcing
A variety of different datasets were employed to characterize change in regional climate. At the site-level, weather station data from two forest research sites in northern Minnesota – the Cloquet Forestry Center and Marcell Experimental Forest (Gill et al. 2020; Sebestyen et al. 2020). Both stations record daily temperature and precipitation, and cover the period from 1901-present, and 1961-present, respectively. A suite of gridded data products of both temperature and precipitation were employed, all available through the University of East Anglia Climate Research Unit (CRU) (Harris et al. 2020). All CRU datasets are on a 0.5˚ grid, with a monthly temporal resolution and cover the period from 1901–2020. The CRU TS 4.06 monthly gridded climate datasets for temperature, CRU TS 4.05 monthly dataset for precipitation, as well as the CRU Self-calibrating Palmer Drought Severity Index (PDSI) dataset, which is based on CRU TS 4.05 (van der Schrier et al. 2013; Harris et al. 2020). All climate variables were analyzed in this study, but given the emphasis on spring climatology, results for precipitation and other seasonal climate variables are reported in supplemental materials (Fig. S1). Single grid cells for the nearest to the weather stations were extracted for comparison, gridded records were used to evaluate spatial patterns of change across the Great Lakes region.
2.2 Detecting state changes in the spring temperature record
To test changes in the state of the climate system, Hidden Markov Models (HMMs) were used (Evin et al. 2011; Gennaretti et al. 2014). These models have been used in climatology (Evin et al. 2011; Mallya et al. 2013; Gennaretti et al. 2014), and are gaining popularity in ecology (Langrock et al. 2012; McClintock et al. 2020). The advantage of this type of model is that it makes few prior assumptions regarding the timing of different transitions, and yields a set of posterior probabilities of the likelihood of regime change, which can be used to ascribe confidence estimates to modeled states. The basic structure of an HMM can be represented as a series of observations that are determined by an underlying set of hidden states (Fig. 1).
The initial model distribution and parameters can be tailored to fit the data itself. In this case, models were defined following the mathematical formula:
\(p\left({X}_{t}|{S}_{t}=k\right)=N\left({X}_{t}\right|{\mu }_{k},{\sigma }_{k}^{2}\) )
(1)
Where p is the probability that observation X falls into state S1 or S2 at time step t, where k describes an initial distribution, in this case a Gaussian function (\({\mu }_{k},{\sigma }_{k}^{2}\)), which was roughly appropriate for my data. At each time step, the conditional distribution of the observation given a climate state P(Xt|St) is depended on the previous climate state P(St|St−1) following a Markov process (Mallya et al. 2013). St is defined by a state-dependent distribution (k) which is unique to N number of different states. The probability of switching between state S at time t to state S + 1 at time t + 1 is referred to as the transition probability, which is represented using a matrix with N, in this case 2, dimensions.
$$P\left({x}_{t}\right)= \left[\begin{array}{cc}f\left({x}_{t}|{S}_{t}=1\right)& 0\\ 0& f\left({x}_{t}|{S}_{t}=2\right)\end{array}\right]\genfrac{}{}{0pt}{}{{S}_{t}=1}{{S}_{t}=2}$$
(2)
The initial transition matrix for my Gaussian model specifies a high probability of being the first state at the beginning and a low probability of switching between states at each time step. The Markov process, implemented by a forward algorithm through the depmixS4 R package, calculates the step-by-step dependency of each observation on the previous one to determine the likelihood of transition at each time step (Zucchini & McDonald 2009; Visser & Speekenbrink 2010). The initial parameters are therefore relatively conservative with respect to the presence of underlying states, and allow the data to “speak for itself” through the Markovian procedure (Gennarretti et al. 2014).
In order to model change in spring climate, as well as change in climate-growth relationships through time, both climate and tree-ring time series were subjected to the HMM modeling process, modeling for both a two-state and one-state model. The HMM model framework can test for the presence of more than two states in the data, but given the hypothesis that spring temperatures should show a transition sometime during the last half-century, only two states were tested. Model selection via the Akaike Information and Bayesian Information Criterion (AIC and BIC, respectively) was used to compare model performance between the one- and two-state models (Gotelli & Ellison 2013). The model yielding the lower AIC and BIC values was deemed superior. For models where the two-state model was superior, the model posterior distributions were evaluated to determine whether the model was stable (i.e. unlikely to transition back), what the posterior probability of de-transitioning back to the former state was, and at what year the state transition was deemed to have occurred.
To determine whether shifts identified on a local basis were regionally coherent, gridded CRU spring temperature data were used perform the HMM procedure at each grid cell for the entire Great Lakes region. Based on the shifts identified in the station data, the posterior probability distributions were extracted and averaged for the same time periods as the station data. For the shifts identified at the stations, the probability of exhibiting the same two-state structure was examined for all regional grid points. The better the evidence of a regional regime shifts in temperature, the higher the mean posterior probability of exhibiting the same state-shifts as on a local basis.
2.3 Characterizing drivers of change in regional climatology
To place the HMM model results into a climatological context and to evaluate the significance of trends, the mean and variance of the entire series were calculated for each grid cell, and segmented each series out according to HMM-identified state changes. In order to determine the significance of any trends, the spring climate time series were subjected to a Mann-Kendall trend test, and simple F-test to determine difference in variance between groups. This analysis was performed on both a local and a regional basis. Basic climate data processing and analyses were done in the R statistical computing environment, using the netcdf4 package, and the Kendall package for trend analysis (Pierce 2019, McLeod 2022).
To ascertain the role of internal climate variability in temperature trends across the region, the Pacific North America (PNA) and Northern annular modes (NAM) climate modes were analyzed with respect to their relationships to Great Lakes regional spring temperature variability. Both modes are dominant in winter and spring, and known to influence spring temperatures across western North America, but their influence within the mid-continent climate is less certain given the further distance from the centers of oceanic variability (Quadrelli & Wallace 2004; Ault et al. 2011; Stendel et al. 2021). Both indices are calculated by taking the leading mode variability from an empirical orthogonal function of monthly mean atmospheric pressure (Barston & Livezey 1987; van den Dool et al. 2000). The PNA is based on a leading EOF at 500 mb centered on the North Pacific and covers the period from 1950–2020, and the NAM is defined as the first EOF of winter (Dec, Jan, Feb, Mar) sea level pressure across the Northern hemisphere from 1900 to 2020 (Zhang et al. 1997; Hurrell & Desser 2009). The PNA and NAM were spatially correlated with spring (March-April-May) temperatures using the CRUTS 4.06 dataset
2.4 Tree-ring site locations
In order to assess the sensitivity of regional forest phenology and growth to variation in spring climate, tree-ring data co-located with local weather stations were collected at two sites in northern Minnesota (Fig. 2). The Cloquet Forestry Center (CFC) (Fig. 2, a) was established in the 1909 by the University of Minnesota Department of Forest Resources as an experimental research forest. The Marcell Experimental Forest (MEF) (Fig. 2, b) was established by the USDA Forest Service in the 1970s to study the hydrology and biogeochemistry of forested peatlands (Kolka et al. 2011). Both sites represent managed, mixed hardwood and conifer forests on heterogeneous topography with soils ranging from well-drained sandy uplands to peat-dominated lowlands (Kolka et al. 2011; Gill et al. 2022). At CFC, black spruce (Picea mariana) and upland red pine (Pinus resinosa) were sampled using an increment borer. At MEF, red pine, black spruce, and Eastern larch (Larix laricina) were cored. Each tree-ring site represents fifteen individual trees, with two cores collected per tree. All five sites were cored by M. McPartland in August and September of 2019.
2.5 Tree-ring chronology development & climate response analysis
Standard dendrochronological methods were used to develop site chronologies for all five sampling locations across both research forests. Cores were dried, mounted on wood blocks and sanded using successively finer-grit sandpaper, through to a 1000-grit polishing paper. Cross-dating and total-ring width measurements were done using a Velmex measurement system (Velmex Inc., 2009). Dating accuracy was assessed using the COFECHA cross dating software to identify and resolve dating issues (Holmes et al. 1983). Expressed population signal (EPS) and RBAR were used alongside statistical cross dating to determine the strength of coherence among series in the chronologies (Table S1) (Wigley 1984). Average chronologies of dimensionless ring width indices were calculated using the dplR package in the R statistical programming environment (Bunn 2008). Ring-width indices were derived using site-specific detrending choices, either using negative exponential curves or 100-year splines, depending on the site. The different site chronologies were detrended based on the characteristics of raw ring-width series. Due to stand-wide disturbances, many ring-width series exhibited periodic growth releases in the middle of the series, such that negative exponential curves were not appropriate in all cases and rigid splines were preferred in most cases (Fig. S2).
Simple correlation analysis on the mean chronologies was performed to determine the relationship of annual growth to a variety of seasonal temperature and precipitation variables. To determine the changing relationship of tree-growth and temperature, running Pearson’s correlation analyses were used on the climate-growth relationships, binned by decade. Running correlation relationships were done using the CRU TS 4.06 data due to the fact that the MEF station data only stretches back to the 1970s. Running correlations were calculated using the gtools package (Bolker et al. 2022), and statistical significance of moving correlations was determined using a Mann-Kendall test for trends with a block bootstrapping simulation to determine a null distribution of correlation values, and assess significance against the permuted distribution (Kokfelt & Muscheler 2012). In addition, the trend in correlation fields were evaluated using a Mann-Kendall test with a block-bootstrapping approach to measure significance and to derive confidence limits. Block-bootstrapping is used to determine the significance of auto-correlated data, where the blocks are resampled rather than individual values to account for the dependence across observations (Özöz & Bayazit 2011). One thousand iterations were run on each timeseries, segmented by 10-year blocks.
2.4 Remote detection and analysis of canopy phenological responses to spring temperature
To track the sensitivity of canopy phenology to spring temperatures, the Moderate Resolution Imaging Spectrometer (MODIS) daily Normalized Difference Vegetation Indices (NDVI) from 2003-2021were analyzed. These indices are produced using 16-day return images from the MODIS Terra/Aqua Daily Level 3 Global 500-meter surface albedo images, provided by the United States Geological Survey and the NASA Earth Data program and accessed using the Google Earth Engine platform (Schaaf & Wang 2015; Gorelick et al. 2017). Daily NDVI data are calculated for each pixel using the ratio of a near-infrared and infrared band, at wavelengths that capture major changes in the reflectance spectrum of green vegetation (Kriegler et al. 1969).
In the literature, a range of NDVI values from around 0.4–0.7 have been used as thresholds to determine the start of spring (Jenkins et al. 2002, Holben et al. 2007; Kern et al. 2020). The shortfall of simple threshold approaches is that they apply a single value to an entire domain, where different cover types may begin the season with variable NDVI values, and have a range of peak values. In order to create a model that could capture differences among cover types based on their unique phenological properties, each years’ time series for each 250 square meter pixel was fit with a unique harmonic curve function (Fig. 3). Harmonic curves, based on Fourier-transformed time series, represents each series as a wave defined by the unique amplitude and phase angle (Jakubauskas & Legates 2000). Thus, each curve has a peak, through, and rate of change that reflect the unique phenological properties of that forest type (Kern et al. 2020, Wang et al. 2018). More information about harmonic curve fitting can be found in Shumway & Stoffer (2017).
Based on the harmonic curve fit, a threshold of 50% (hereafter referred to as NDVI50) was used as a threshold to establish the start of spring for each MODIS pixel, calculated as the midpoint on the curve between the min and max for each year:
$${NDVI}_{50}= {NDVI}_{min}+\left(\left({NDVI}_{max}- {NDVI}_{min}\right)*0.5\right)$$
(3)
This resulted in a composite image in which the value at each pixel was the date the NDVI50 value was reached during that year, rasterized into a stack of yearly images. Using CRU TS 4.06 gridded temperature record for the spring (MAM) months, point-by-point Pearson’s correlation coefficient were calculated at the 500 meter square MODIS pixel size for the correlation between NDVI50 and temperatures for the 2003–2021 period. This resulted in a single image of correlation values that could be segmented according to forest cover type.
Using a high-resolution forest cover map from the USDA Forest Inventory and Analysis program resampled to the same resolution (500m) as the MODIS data, each pixel of correlation values were classified by forest type (USDA Forest Service 2008), and then further aggregated four rough forest type categories represented in the region: upland conifer, lowland conifer, oak savannah, and upland deciduous (which were primarily areas of early successional aspen and birch) (Table S2). The sensitivity of different forest types was assessed using linear regression, with sensitivity to warming indicated by the regression coefficient (α).