Study area
This study takes place in the Hoh River Watershed (HRW) within the Pacific Northwest of the Conterminous United States (Fig. 1.) which contains some of the highest aboveground carbon and SOC stocks in the world reaching 375 MgC ha− 1 and 709 MgC ha− 1, respectively 34. In the HRW, mean annual air temperature is 7.2oC and mean annual precipitation is 274 cm and can exceed 300 cm with most of the precipitation in winter. Precipitation mainly falls as rain but snowfall is more common in the upper elevations 35. The mountains of the HRW were created 17–20 million years ago during the Miocene to Eocene periods with the uplifting of marine sedimentary rock over the denser ocean crust. The uplifted marine sedimentary rock also formed hills and terraces in the lower HRW. During the end of the Pleistocene and the period of deglaciation, large floods from glacial melt deposited material over the lower elevation so fthe HRW creating large floodplains. Rivers continued to incise this deposited glacial material over the Holocene and into the present depositing alluvium near the present main channel of the Hoh river that bisects the HRW 36. Current topography varies from mountains with steep slopes (> 40%) in the eastern portion of the HRW to rolling hills and flat areas in the lower floodplain that drains eastward to the Pacific Ocean. Soils of the HRW reflect this geologic history and topography with dominant soils containing loamy to sandy-clay coarse textures although there is a moderate presence of volcanic ash and which promotes andisol soil development 37. The HRW has a mix of both private and public forestlands dominated by Sitka Spruce and Western Hemlock in the lower elevations that is actively managed for timber harvest (Pelt, 2001). An area along the coast and in the upper watershed is part of the Olympic National Park and is protected old-growth forest with trees up to 4 m in diameter and 80 m in height (Harmon and Franklin, 1989). The mapped wetlands within the HRW are diverse, from precipitation-driven bogs to riparian wetlands (Fig. 1. insets). Many of the wetlands are under dense forest overstory but in some forested areas with high levels of inundation trees are stunted in size and have a lower overall height and biomass. The most prominent Hydrogeomorphic wetland classes are Riverine, Mineral Flats, Organic Flats, and Depressional 38. There is a notable difference between Riverine wetlands and the other wetland classes for SOC and we mark this distinction with grouping all wetland hydrogeomorphic classes into two classes for our soil pedon dataset: Riverine and Palustrine (non-riverine).
Mapping wetlands with the Wetland Intrinsic Potential Tool (WIP)
We mapped wetlands using the Wetland Intrinsic Potential (WIP) tool, a multi-scale terrain-based wetland identification and mapping tool developed by Halabisky et al., 39. The WIP tool models wetland presence in a spatially explicit, continuous pixel approach using input parameters related to hydrophytic vegetation, hydrology, and hydric soils. The topographic and terrain input data layers are derived from discrete point aerial lidar which was processed to create a digital elevation model at a 4 m resolution of the terrain surface (Lidar source: 2012–2013 Puget Sound LiDAR Consortium (PSLC) Topographic LiDAR: Hoh River Watershed, Washington (Deliveries 1 and 2). Unlike aerial or satellite imagery, lidar can detect small topographic features under tree canopy and terrain metrics were integrated into a random forest model that was trained on wetland presence/absence point datasets derived from the National Wetland Inventory (NWI) and validated with additional field collected ground-truthed datasets. The WIP tool was specifically developed to identify wet areas missing in most wetland inventories because they do not have standing water or are hidden under tree or vegetation canopy making them difficult to detect in satellite or aerial imagery. We refer the readers to Halabisky et al., 39 for the full summary of how the WIP tool was implemented in the HRW. The output produces a wetland probability score based on the proportion of classification trees in the random forest model of how likely a pixel is a wetland (0%-100%) which is the estimated likelihood that the wetland class label is correct for a given input of terrain, hydrology, and vegetation parameters. For example, a pixel that has a wetland probability of 80% will contain a combination of landscape features that generate a wetland within 80% of the dataset. Wetlands, therefore, represent the high end of continuum corresponding to landscape soil moisture and inundation and the other end is an absence of these conditions. Because the wetland probability is continuous across the entire landscape, it enables SOC stock to be modeled continuously across the entire HRW. However, setting a threshold probability also allows estimates of wetland extent. In order to determine potential wetland extent, we chose the threshold value of 50%, above which classifies a pixel as a wetland and below which classifies pixels as non-wetland or upland. WIP model accuracy for the HRW in Washington State using wetland probability ≥ 50% to create a binary class of wetland (> 50%) vs. upland (< 50%) was 93.0%. Readers should consider that wetlands defined by the WIP tool do not have jurisdictional boundaries which require field delineation and verification to determine their exact extent based on hydrology, hydric soil, and hydrophytic vegetation at a much smaller scale.
Field sampling
We developed a stratified random sampling approach across the HRW wetland probability distribution. We utilized 30 probabilty bins, and sampled 1 pedon at a random location per bin then added 6 additional pedons at the highest and lowest probabilities as time allowed. Once sampling locations were selected, we used Garmin Handheld Global Positioning System (GPS) to navigate to each point. After designating the pedon sampling location, we then used a JAVAD GNSS Triumph-2 for more precise georeferencing. In total, we sampled 8 wetlands and 28 uplands according to the WIP probability ≥ 50% cutoff for the wetland class from the mapped model. Within the wetland class defined by the WIP ≥ 50%, we classified two distinctive wetland types: riverine and palustrine, which differed in their parent material and organic matter content. Riverine wetlands consisted of recently deposited alluvial material and exhibited very little soil development. We classified these observations in the field and later used a surficial geology map to delineate riverine areas with lower predicted SOC described below.
At each pedon site, a pit was excavated to at least 100 cm depth or to a restricting layer to characterize soil horizons, color, texture, structure, and redoximorphic features 40. Samples were collected by each soil horizon for bulk density and total carbon analysis. Bulk density was carefully extracted from the pedon face for each horizon using a fixed volume metal cylinder for mineral soils with a volume of 98.175 cm3 or a beveled polyvinylchloride (PVC) cylinder with a volume of 132.536 cm3 for organic soils. Bulk soil samples were taken from each horizon for total carbon analysis. All samples were transported in coolers and stored in refrigerated spaces between 4-6o C until laboratory preparation and analysis. Laboratory sample preparation included drying all soil samples for at least 48 hrs or to a constant weight in drying ovens at 75 oC. Soil samples were then sieved to extract the fraction less than 2 mm and remove coarse fragments. Bulk density was calculated as the mass of the less than 2 mm fraction divided by the volume of the fixed volume soil core sampler. SOC was measured with the less than 2 mm fraction in both the loss on ignition method and in a Perkin Elmer Co. 2400 model Total Carbon, Hydrogen, and Nitrogen (CHN) Analyzer. Soil samples were prepared for the CHN Analyzer by ball milling a small subsample for 2 minutes at 1/30 second frequency to homogenize the sample. Then a 20 mg subsample was balled into tin capsules and run on the CHN Analyzer. SOC stocks for each horizon were calculated from the total carbon percentage from the CHN analyzer (C) multiplied by the bulk density (BD) and the soil horizon thickness (D) (Eq. 1).
$$1) SOC Stock=\sum {C}_{i}* {BD}_{i}*{D}_{i}*(1-{CF}_{i})$$
Where, \(C\)denotes the carbon percentage, \(BD\) represents bulk density, \(D\) represents the horizon thickness, and \(CF\) represents the coarse fragment fraction of the soil sample \(i\). For the purpose of this analysis we do not spatially predict SOC deeper than 1 m soil depth. Soil pedon landscape classes were defined as wetlands for pedons with WIP ≥ 50%, as uplands for WIP < 50%, and as riverine wetland or palustrine wetlands when the sample location was inside or outside the Hoh River floodplain defined by the surficial geology, respectively.
SOC stock modeling and covariates
To generate a prediction model for SOC, we used a linear mixed effects modeling approach using the ‘lme4’ 41 R package with fixed and random effects to conduct our SOC carbon stock spatial prediction. Linear mixed effect models were used to specify the fixed effect as the WIP probability metric for our primary covariate for SOC. We also investigated multiple remote sensing metrics such as NDVI, EVI, MNDWI, and single band reflectance from Landsat imagery as additional fixed effects in the model. We used surficial geology of the HRW as our random effect due to the mapping of riverine quaternary sediments which represent river floodplains that are strongly predicted as wetland areas in the WIP tool but do not develop soil or accumulate organic matter due to recent river water scouring. Surficial geology data were downloaded as 1:100,000 scale polygons from the Washington State Department of Natural Resources geologic information portal. Four broad classes of lithological material and geologic age were extracted from the surficial geology data to provide grouping for SOC samples: Clastic, Glacial Drift, Till/Outwash, and Alluvium. Step-wise variable selection using Akaike information criterion (AIC) was used to determine fixed effect covariates in addition to the WIP tool probability metric and surficial geology random effect. However, there were no significant effects from adding remote sensing metrics to the WIP and surficial geology. Further, the heterogeneity of the forested landscape due to forest harvest was prohibitive for using spectral remote sensing metrics or lidar metrics of forest structure to predict SOC which could weight clearings or reflective surfaces inappropriately in SOC modeling. Additional terrain metrics were also excluded to avoid intercorrelations with the WIP probability covariate which already incorporates terrain information. Overall, the best model according to AIC was also simplest using just the WIP probability with surficial geology classes (Eq. 2).
$$2) {\sqrt{\left(SOC Stock\right)} }_{ij}= {\mathcal{X}{\beta }_{WIP}}_{i}+{\mathcal{Z}{\alpha }_{Surfical Geology}}_{j}+ {\in }_{ij}$$
$${\in }_{ij}\sim N(0, {\sigma }_{\in }^{2})$$
$${{\alpha }_{Surfical Geology}}_{j}\sim N(0, {\sigma }_{\alpha }^{2})$$
Where \(\mathcal{X}\) is the fixed effects design matrix for the \({{\beta }_{WIP}}_{i}\) in pedon \(i\), \(\mathcal{Z}\) is the random effects design matrix for the random effects \({{\alpha }_{Surfical Geology}}_{j}\) for an geology type \(j\), and \({\in }_{ij}\) is our model error. The model error \({\in }_{ij}\) follows a normal distribution with \({\sigma }_{\in }^{2}\) error. Pedons sampled from the random effect \({{\alpha }_{Surfical Geology}}_{j}\) are considered a random sample from a separate normal distributions for each surficial geology type \(j\).
SOC Stock Prediction
The Eq. 2 model was used to predict SOC stock at a 1 m and 30 cm depth with a R2 of 0.63 and 0.61 respectively. The Root Mean Square Error (RMSE) for the 1 m model was 96.8 MgC ha− 1 and 31.0 for the 30 cm model. A leave one out cross validation computed a cross-validation RMSE of 22.8 MgC ha− 1 for 1 m SOC stocks and 11.7 MgC ha− 1 for 30 cm SOC stocks. Bootstrapped model predictions for the 1 m model showed 95% confidence intervals (2.5–97.5%) around the mean based on 1000 simulations were 216 to 511 MgC ha− 1 for the WIP, 3.67 to 129 MgC ha− 1 for the intercept, 77.0 to 131 MgC ha− 1 for the variance, and 49.5 to 145 MgC ha− 1 for the surficial geology random effect intercept. Bootstrapped model predictions for the 30 cm model, were 51.8 to 147 MgC ha− 1 for the WIP, 36.7 to 76.1 MgC ha− 1 for the intercept, 24.4 to 41.4 MgC ha− 1 for the variance, and 24.3 to 55.8 MgC ha− 1 for the surficial geology random effect intercept. We note these bootstrapped confidence intervals were computed on the non-transformed model which potentially widens the confidence intervals but allows for better interpretation with results in SOC response variable units of MgC ha− 1.
Rasters data layers for the WIP probability and surficial geology were projected to the NAD83 UTM Zone 10 (EPSG:26910), and resampled to match the WIP original 4 m pixel resolution. SOC stocks at 30 cm and 1 m depths were predicted across the HRW using the two raster data layers and the model from Eq. 2 which resulted a spatially continuous map of the square root SOC that was then back transformed with squaring to result in SOC stocks in MgC ha− 1. We masked surface water presence by using the median modified normalized difference water index (MNDWI) across a five year period from 2016 to 2021. We examined the riverine classification and classified all MNDWI values above 0.30 as river surface water to be masked out. The masking process also removed a small lake located in the mountains on the eastern portion of the watershed and small gravel pits in the center of the watershed. The resulting SOC prediction map was used to calculate the total HRW SOC stock, wetland SOC stock, forested wetland SOC stock, riverine wetland SOC stock, palustrine wetland SOC stock, and non-wetland/upland SOC stock. Wetland SOC stock was estimated by classifying pixels as wetlands with WIP probability ≥ 50% and we refer back to Halabisky et al., 39 for the discussion of error with this threshold. We note that this WIP-based classification reflects potential wetland extent but is not meant to confer jurisdictional wetland extent which requires ground truth delineations. Surficial geology delineation of the Hoh River main channel and floodplain was used to classify riverine wetland and palustrine wetland SOC stocks. Forested wetland SOC stocks were estimated from a forest/non-forest mask of wetland SOC stocks derived from tree cover ≥ 50% in the Global Tree Cover product in Sexton et al., 42. Non-wetlands were delineated as the total area outside of the WIP probability ≥ 50% and we classify this area and SOC stock as uplands.
We quantified uncertainty in several methods. First, we examined the R2 value of the fit vs. the predicted values in the final model output to judge the overall fit of the model on the actual SOC values in the current dataset. Next we calculated confidence intervals using the confint.merMod function in the lme4 R package 41. Next, we generated a prediction interval for the model using the ‘predictInterval’ function in the ‘merTools’ 43 R package. This function computes a simulated distribution for all parameters in the model. For the random effect simulation, the distribution is simulated by sampling from a multivariate normal distribution defined by the best linear unbiased prediction estimate and the variance-covariance matrix for each level of the grouping terms. The result is a matrix of simulated values for the linear mixed effects model and each random effect grouping term has a matrix for each observation. The 5th and 95th percentiles of the final simulated distribution were used to define the uncertainty in the prediction and root mean square error was calculated from the difference in the fit vs. the predicted values. Finally, we calculated the mapped SOC prediction uncertainty with a bootstrapping approach. Bootstrapped datasets were constructed by sampling the pedon SOC values from the current dataset with replacement, then integrating that dataset into the prediction model in Eq. 1 which was used to further predict SOC across the HRW. In total we used 300 bootstrapped SOC prediction maps where each pixel contained 300 predictions to simulate a distribution from which we extracted the standard deviation to represent the prediction interval uncertainty. We then compared the WIP wetland SOC stocks with 1m SOC stocks from the National Wetland Condition Assessment (NWCA) 12. Uhran et al., provides the latest mapped wetland SOC stocks at 1 m depths for the continental U.S. modeled from harmonized pedon data from NWCA and mapped using the NLCD wetland extents at a 30 m resolution. NWCA SOC stocks were also subtracted from the WIP wetland SOC stocks..
Wetland size distribution
Wetland size and extent was derived defining wetlands from the WIP tool probability greater than 50%. All wetland pixels greater than 50% were classified as wetland and converted to polygons using the ‘terra’ 44 and ‘sf’ 45 R packages. The wetland polygons were filtered to remove all wetlands below 64 m2 which is the area equivalent to 2x2 pixels in order to conservatively estimate wetland size classes. Examination of the wetlands below 64 m2 did not reveal significant cumulative proportions of SOC or extent. Wetlands above 64 m2 were used to extract SOC values in MgC ha− 1 from the prediction raster. Size classes were defined as quantiles: 1%, 25%, 50%, 75%, 96.4% and 100% and cumulative sums for SOC and areal extent were calculated. The 96.4% quantile marks the 1-acre or 0.40 ha extent that is the minimum mapping unit of the NWI and is used as a threshold for small wetlands. The NWI defines this as the minimum mapping unit but wetlands are still mapped at smaller extents less consistently 13.