Complex Drivers of Riparian Soil Oxygen Variability Revealed Using Self-Organizing Maps

Oxygen

• Soil water is used as a proxy for oxygen, which may lead to incorrect oxygen predictions (e.g., oxygen remains high under high moisture)
• A machine learning technique, the Self-Organizing Map, was used to assess the relationship between soil moisture and oxygen in riparian soil
• Soil and hydrologic conditions complicated the relationship between soil water content and oxygen, which led to anomalous oxygen conditions

Supporting Information:
Supporting Information may be found in the online version of this article.
Soil O2 levels are regulated by the diffusion of O2 into and displacement of O2 out of soil pores by water (physical processes), and the consumption of O2 via soil respiration (a biological process, i.e., aerobic microbial, plant root, and faunal respiration; Moyano et al., 2013; Neira et al., 2015; Ponnamperuma, 1972). Because O2 diffusion into soil water is much slower than in soil air (Moldrup et al., 2000), the presence of water inhibits O2 diffusion from the atmosphere to soil pores (Skopp et al., 1990). Thus, the combined effects of inhibited O2 diffusion and displacement, and soil respiration typically result in O2 depletion (Neira et al., 2015; Ponnamperuma, 1972).
Our ability to predict soil O2 concentrations across spatial and temporal gradients is limited. This is a result of the complex network of biotic and abiotic soil factors, as well as climatic conditions, that interact to modulate soil O2 and moisture dynamics and create widespread spatial and temporal soil O2 variability (Silver et al., 1999). For example, soil water inputs (precipitation and groundwater) vary seasonally and with site-specific characteristics, and are modulated by hydraulic conductivity. Water demand (i.e., vegetation water uptake) also fluctuates seasonally and varies by plant species (Ewe et al., 2007). Furthermore, O2 depletion by plant and microbial respiration is primarily controlled by soil temperature and soil water content and thus exhibits seasonal fluctuations (Chen et al., 2010; Kang et al., 2003; Lavigne et al., 2004).
Soil O2 variability is difficult to manually monitor in situ (i.e., using handheld soil probes or gas chromatography), and the collection of high spatial and temporal resolution O2 data is costly, as it requires soil probes and data logging capabilities. The challenges associated with measuring soil O2 have led to the use of soil moisture as a proxy measurement for O2 under the assumption that soil moisture is inversely proportional to O2 concentration (Heinen, 2005; Ridolfi et al., 2003; Rubol et al., 2013). This assumption has been implemented in many simplified process-based denitrification sub-models embedded in N-cycling and ecosystem models (i.e., those that do not account for microbial processes or gaseous diffusion). Some of these models utilize bivariate nonlinear power functions that are modeled after an inverse relationship between O2 and soil moisture to predict O2 depletion based solely on water-filled pore space. Examples include the NEMIS model (Hénault & Germon, 2000), the LEACHMN model (Sogbedji et al., 2001), and the SHETRAN model (Birkinshaw & Ewen, 2000). Simplified process-based denitrification models that exclude direct O2 measurements have been found to exhibit high sensitivity to formulations that represent soil moisture (Hénault & Germon, 2000).
Indeed, because a combination of multiple factors modulates the physical and biological mechanisms that control soil moisture dynamics and soil O2 depletion (Silver et al., 1999), the use of soil moisture as a proxy measurement for O2 can result in inaccurate O2 estimations. An initial analysis using our high-frequency riparian soil sensor data shows a nonlinear relationship between O2 and soil moisture, and other O2 covariates (Figure 1 and Figure S1 in Supporting Information S1). The results show that most soil moisture levels are associated with a wide range of O2 concentrations (Figure 1a and Figure S2 in Supporting Information S1), which suggests the relationship between these variables must be explored and defined using empirical data and nontraditional statistical methods.
Because riparian zones are located at the interface of terrestrial and aquatic ecosystems, they can function as hot spots for anaerobic biogeochemical soil processes (Vidon et al., 2010) and are therefore ideal study systems for soil O2 dynamics. Due to their unique position on the landscape, riparian soils experience frequent hydrologic changes that alter soil moisture content, which can modify soil O2 availability (Burgin & Groffman, 2012).
Changes in soil moisture are triggered by hydrologic fluctuations, and the magnitude of these shifts depends on site-specific riparian zone characteristics, such as topography, proximity to surface and groundwater flows, the size and depth of the upland aquifer, and soil hydraulic properties. Riparian zones can also experience seasonal hydrologic fluctuations resulting from changes in connectivity with the upland aquifer (Vidon & Hill, 2004) and variability in water inputs due to seasonal precipitation patterns. Furthermore, site-specific dominant vegetation types have unique water requirements, which could impact the physical soil wetting process. The diverse potential combinations of soil O2 drivers suggest that the response of soil O2 to fluctuations in soil moisture is a result of multivariate interactions that are highly dependent on site-specific soil conditions, seasonal fluctuations in environmental conditions, and ecosystem water and O2 demands.
[...] (Rivera et al., 2015), making it an ideal approach for detecting patterns in large environmental datasets. The SOM can overcome limitations of traditional statistical methods, as it can tolerate nonlinearity, temporal and serial autocorrelation, and multicollinearity (e.g., Figure 1; Kundu et al., 2013; Merdun, 2011). The SOM is often applied in exploratory data analysis to leverage the temporal (or spatial) autocorrelation that may exist in the data and identify clusters of like observations (Kalteh et al., 2008; Kohonen, 2013).
The SOM maps multivariate data to a two-dimensional map/lattice, where similar data points are situated in proximity. In contrast to more traditional clustering algorithms (e.g., k-means), the SOM approach enables visualization of the variables that drive clustering and is thus a potentially powerful statistical tool for leveraging the capacity of high frequency sensor networks to monitor physical and biogeochemical parameters. SOMs have been successfully applied to resolve spatial and temporal heterogeneity in complex systems within large soil and water quality databases (Liao et al., 2019; Obach et al., 2001; Wu et al., 2008), as well as to classify sediment (Alvarez-Guerra et al., 2008) and soil types (Tissari et al., 2007). Additionally, the SOM approach has been utilized to address questions concerning water resources and hydrology, such as rainfall-runoff relationships (Lin & Chen, 2006), precipitation dynamics (Kalteh et al., 2008), and links between physical soil properties and hydrologic soil processes (Merdun, 2011). However, to our knowledge, these tools have not yet been used to detect patterns in high frequency soil sensor time series.
Using the SOM approach, we addressed two main questions: (a) Can the SOM approach provide additional insight on the O2/soil moisture relationship by incorporating multiple predictors, and (b) Which combinations of variables lead to high versus low soil O2? We clustered high frequency soil and meteorological data collected over 3 years from a poorly drained wetland position within two riparian sites located in northeastern Vermont, USA. We studied two riparian soil environments with contrasting site characteristics (e.g., adjacent land use, vegetative cover, site elevation), allowing us to address our questions in two different riparian lowland settings.

Study Sites
To investigate and better characterize O2 variability within riparian soils, we studied two riparian soil transects with contrasting catchment characteristics. Both transects are located within Lake Champlain's Missisquoi Watershed in Vermont, USA (Figure 2) and are part of a larger soil monitoring network. This high frequency soil sensor network continuously measures physical and chemical soil conditions 15 cm below the soil surface along a gradient of landscape positions (i.e., spanning upland, wetland, and near-stream locations).
Included in this study are data collected from one low-lying, poorly drained position within each transect, where soil O2 concentrations ranged from anoxic to near atmospheric. One transect is situated within the Champlain Valley (CV) physiographic province (hereafter referred to as the "CV" site), a primarily agricultural catchment in Sheldon, VT. The other transect is located within a 95% forested catchment with minimal anthropogenic impact in the Northern Green Mountain physiographic province (hereafter referred to as the "GM" site), approximately 7 km north of the town of Montgomery, VT (Landsman-Gerjoi et al., 2020).
The elevation range of the CV site spans 101-106 m above sea level. [...] et al., 2020). Differences in elevation result in contrasting meteorological conditions between the two sites (Table 1). Groundwater table depth was measured continuously at 15-min intervals at both sites using HOBO water level loggers (Onset Corporation, Cape Cod, MA, USA) that were installed inside hand-dug wells in both upland and frequently inundated areas.

Soil Monitoring Network
We assessed ecosystem water delivery by measuring soil volumetric water content (VWC) and precipitation. Water demand (i.e., vegetation water uptake) was indirectly assessed by measuring ambient temperature (TA). Though soil carbon dioxide (CO2) measurements are not flux measurements and cannot be used as a direct indicator of soil respiration or O2 demand, CO2 measurements were used as a proxy for O2 demand. Previous studies (e.g., Jarecke et al., 2016) have documented strong inverse relationships between subsurface (10 cm) gas-phase soil pore O2 and CO2. Soil VWC (measuring range = 0-1.0 m3/m3; accuracy = ±0.03 m3/m3), temperature (T; measuring range = −40 to +60°C; accuracy = ±1°C), and electrical conductivity (EC; measuring range = 0-23 dS/m; accuracy = ±10% from 0 to 7 dS/m) were monitored at 15-min intervals using 5TE sensors (Meter Group, Pullman, WA).
Carbon dioxide in soil air was measured using GMT221 sensors (Vaisala, Helsinki, Finland; measuring range = 0-20%; accuracy = ±1.5% of range and ±2% of reading). Before deployment, the CO2 sensors were calibrated with a three-point adjustment and corrections for barometric pressure (based on elevation) and humidity. Pure N2 gas was used as a 0% CO2 standard. The internal temperature sensor on the CO2 probes was used for the temperature correction. Oxygen in soil air (i.e., gas-phase soil pore O2; hereafter referred to as soil O2) was monitored at 15-min intervals using Soil Response O2 sensors (Apogee Instruments, Logan, UT; measuring range = 0-100%; sensitivity = 2.6 mV per % O2). Before deployment, the O2 sensors were calibrated based on an open-air oxygen reading with corrections for barometric pressure (based on elevation), temperature, and humidity. The gas permeable membrane inlet of the O2 sensors was encased in a porous diffusion head and was equipped with a heating element to prevent water from pooling and blocking O2 diffusion. The O2 sensors are subject to long-term drift (1 mV/year).
Oxygen and CO2 sensors were installed within 45 cm of each other and were deployed inside polyvinyl chloride (PVC) pipe (with membrane inlets exposed to the soil) to secure their position in the soil. Silicone sealant was applied between the O2 sensor diffusion head and the PVC pipe to prevent water from entering the electrical connections. For water-proofing purposes, CO2 probes were deployed inside a gas-permeable membrane sleeve. We filled a portion of the membrane sleeve with silicone sealant to prevent moisture from entering the electrical connections between the probe and the data logger cable. Because the sensors were shielded from the soil and enclosed within either a diffusion head (for O2) or a membrane (for CO2), it is probable that the air inside the chambers took some time to reach the same concentration as the surrounding soil. Each site (CV and GM) was equipped with a meteorological station that measured TA, precipitation, photosynthetically active radiation, solar radiation, relative humidity, wind direction, wind speed, and dew point at 5-min intervals. Data included in this study were collected from July 2017 to June 2020.

Approach
To identify suites of soil conditions (clusters) associated with various soil O2 regimes, we employed a Kohonen unsupervised SOM approach using the kohonen package in R (Wehrens & Buydens, 2007). We used an "unsupervised approach," meaning we fed independent variables to the model and excluded the response variable, O2. Furthermore, we did not constrain the number of outcome clusters (i.e., soil condition descriptors), so that we could empirically determine the most suitable number of clusters for our dataset. We first used exploratory data analysis to identify independent variables with a potential to be linked to O2 variability within our sites, and to examine meaningful ranges of O2 as the response variable. We also selected input variables by examining component planes generated by early iterations of the SOM, which allow for the visualization of clustering according to each independent variable. We then ran the SOM analysis, which involved several iterative steps to optimize SOM execution and validate clusters, as described below. The SOM mapped our multivariate dataset to a two-dimensional map/lattice, where observations linked to similar combinations of values for input variables were situated in proximity to each other. Finally, we compared O2 values across clusters, post hoc, to better understand drivers of specific O2 ranges.

Overview of Kohonen Unsupervised Self-Organizing Map
A detailed description of the SOM algorithm can be found in Kohonen (2013) and Underwood et al. (2021). To summarize, the method clusters multivariate observations onto a reduced-dimension lattice. Each lattice node is first assigned a vector of random values (weights) ranging from 0 to 1. The length of this vector is equal to the number of input variables in each observation. A single vector of input values (observed data) is simultaneously presented to each node's weight vector. The vector of input values is compared to the weight vector using the Euclidean distance formula, and the lattice node with the closest matching weight vector is designated as the best matching unit (BMU). The weight vector of the BMU, in addition to nodes surrounding the BMU (defined as a "neighborhood"), is updated to resemble the input vector more closely. The neighborhood function is unique to the SOM, as other clustering tools (e.g., k-means) only update the weight vector of a single node (Merdun, 2011).
The user customizes the learning rate, α, which controls the amount by which weights are adjusted for both the BMU and nodes within the neighborhood around the BMU. This process is repeated until all observations have been presented to the lattice, which constitutes one iteration of the SOM algorithm. Both the learning rate, α, and the neighborhood size are decreased as subsequent iterations of the algorithm are executed, by relying on user-defined functions. The size of the neighborhood is eventually reduced to one node, the BMU. Multiple iterations are executed until the algorithm converges. Once the algorithm converges, the adjusted weight vectors will have self-organized across the lattice such that similar observations will be aggregated together. To define clusters of observations (i.e., nodes of the lattice containing similar weight vectors), the weight vectors are grouped by a hierarchical agglomerative clustering method based on the distances between them (Underwood et al., 2021).
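The training loop summarized above can be sketched in a few lines of plain Python. This is an illustrative sketch only, not the kohonen package's implementation: the function names are hypothetical, the lattice is rectangular rather than hexagonal, and the neighborhood is assumed to be a simple "bubble" in which every node within the current radius receives the same update. The learning rate and radius both decrease linearly, and the radius reaches 0 in the final iteration so that only the BMU is updated.

```python
import math
import random


def best_matching_unit(weights, x):
    """Return the lattice coordinate whose weight vector is closest to x."""
    return min(weights, key=lambda node: math.dist(weights[node], x))


def train_som(data, rows, cols, n_iter=20, alpha_start=0.05,
              alpha_end=0.01, seed=0):
    """Train a small SOM on range-normalized data (values in [0, 1]).

    Assumptions (not from the source): rectangular grid, bubble
    neighborhood, linear decay of learning rate and radius.
    """
    rng = random.Random(seed)
    dim = len(data[0])
    # Each node starts with a random weight vector in [0, 1].
    weights = {(r, c): [rng.random() for _ in range(dim)]
               for r in range(rows) for c in range(cols)}
    radius_start = (2.0 / 3.0) * max(rows, cols)
    for it in range(n_iter):
        frac = it / max(n_iter - 1, 1)
        alpha = alpha_start + frac * (alpha_end - alpha_start)
        radius = radius_start * (1.0 - frac)  # 0 in the last iteration
        # One iteration presents every observation to the lattice.
        for x in data:
            bmu = best_matching_unit(weights, x)
            for node, w in weights.items():
                if math.dist(node, bmu) <= radius:
                    # Move the weight vector toward the observation.
                    for j in range(dim):
                        w[j] += alpha * (x[j] - w[j])
    return weights
```

After training, calling `best_matching_unit(weights, x)` for each observation gives its position on the lattice; grouping the trained weight vectors (for example with hierarchical agglomerative clustering) would then define the clusters.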

Data Analysis: Initial Data Conditioning, Exploratory Analysis, and Selection of Independent Variables
To prepare our high frequency soil and meteorological observation data for input to the SOM, data conditioning steps were required. Data from 27 June 2017 to 27 June 2020 from both study sites were selected for analysis. Some observations were missing from the time series due to sensor and data logger dysfunction caused by inadequate solar power, mostly during winter and early spring. Timesteps with missing O2 or precipitation observations were excluded from the dataset, leaving 91% of the CV time series and 27% of the GM time series to use as SOM input data. Carbon dioxide was not included as an input variable to the GM SOM, as all CO2 observations were missing from the dataset. After removing timesteps with missing O2 or precipitation observations, missing values of other input variables (ranging from 0% at the CV site to 0.9% of observations at the GM site; Table S1 in Supporting Information S1) were replaced with the overall median value using the Hmisc package's impute function in R (Harrell, 2020). We then reduced the volume of data by aggregating our 15-min observations to an hourly frequency using hourly median values for each variable. We derived additional time series to potentially include in our dataset by computing rolling averages (for VWC, soil T, and ambient T) and rolling sums (for precipitation) ranging from 12 hr to 2 weeks prior to each observation to investigate the effects of antecedent soil and meteorological conditions on soil O2 dynamics. We calculated the Julian date of each observation to include in the SOM as an independent variable. We then performed Principal Component Analysis (PCA), a dimensionality reduction tool, on a correlation matrix of all available independent variables to confirm that the O2 data clustered distinctly, based on the available parameters. PCA was also used to establish the dimensions of the SOM lattice best suited to our dataset (Underwood et al., 2021).
We selected a suite of independent variables to include in our final SOMs (Table S2 in Supporting Information S1) by first running provisional SOMs with all available independent variables (listed in Supporting Information S1). We observed the resulting component planes and plotted the distribution of each variable, separated by cluster, using box and whisker plots. We examined the component planes (Figures S7-S10 in Supporting Information S1) and box and whisker plots to manually identify variables that lacked distinct variability across clusters and excluded those variables from the input dataset (e.g., wind speed). A clear pattern or structure revealed by a component plane indicates that the variable is an important contributor to the dataset's variability. A uniform or random pattern suggests that the variable does not explain much variability.
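Two of the conditioning steps described above, aggregation to hourly medians and antecedent (rolling) precipitation sums, can be illustrated with a short Python sketch. The function names and the data layout (minutes since the start of the record paired with a value) are assumptions made for illustration, and the rolling window here includes the current observation, which may differ from the windowing used in the study.

```python
from collections import defaultdict
from statistics import median


def hourly_median(records):
    """Aggregate (minutes_since_start, value) pairs to hourly medians.

    Returns a dict mapping hour index -> median of that hour's values.
    """
    buckets = defaultdict(list)
    for minute, value in records:
        buckets[minute // 60].append(value)
    return {hour: median(vals) for hour, vals in sorted(buckets.items())}


def rolling_sum(values, window):
    """Trailing sum over `window` steps, including the current step.

    Entries are None until a full window of observations is available.
    """
    out = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the value leaving the window
        out.append(running if i >= window - 1 else None)
    return out
```

With hourly data, a 1-week antecedent precipitation series would correspond to `rolling_sum(hourly_precip, 168)`; a rolling average follows by dividing each complete sum by the window length.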

Data Analysis: SOM Data Preprocessing
We prepared the input data for the SOM by normalizing each independent variable to a value between 0 and 1 using a range normalization technique. Normalization improves model performance by ensuring different measurement units and magnitudes do not influence the weight of each independent variable. In particular, range normalization has been found to result in optimal SOM performance, as it minimizes the distance between the data points and the nodes to which they are matched (measured as quantization error, QE), thus improving model fit. QE represents the magnitude of error between the input vector and its closest match on the SOM. Thus, a lower QE indicates the SOM is more effectively representing the input data in a lower-dimensional space. The QE can be used to compare the performance of different iterations of the SOM trained with the same input dataset (not to compare between SOMs with different input datasets; Céréghino & Park, 2009). Additionally, range normalization reduces discontinuities between the structure of the input data and the map layout (measured as topographic error; Alvarez-Guerra et al., 2008; Breard, 2017). The response variable, O2, was not presented as an input to the SOM.
We ran SOMs on different groupings of the CV observations to evaluate which driving features were revealed when differing amounts of data were fed to the SOM. We initially ran one SOM for all the soil O2 data from the CV site and found that the very abundant high O2 values masked the signal of the less frequent low O2 values. This prevented identification of the nuanced conditions or factors that led to low O2. We therefore performed SOMs on separate subsets of the CV data for a more refined assessment of factors associated with O2 dynamics, and to ensure that factors associated with high O2 conditions could be parsed from factors associated with low O2 conditions. We subdivided the associated multivariate time series observations into sets associated with distinct ranges of O2 values (high and low), using the Jenks Natural Breaks optimization method (Jenks, 1967; Khamis et al., 2018) via the BAMMtools package in R (Rabosky et al., 2014). The Jenks Natural Breaks optimization method is a data clustering technique designed to classify data into a user-defined number of ranges in a way that minimizes within-group distances between values. We ran only one SOM for all O2 observations from the GM site, as O2 values ranged from 0% to 0.6%, a range of values that corresponded to low O2 conditions determined for the CV site.
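As a concrete illustration of the two quantities discussed above, the following Python sketch range-normalizes a single variable and computes a quantization error for a set of observations against a set of node weight vectors. The function names are hypothetical, and QE is taken here as the mean Euclidean distance to the closest weight vector, which is one common definition and may not match the exact formula used by the authors' software.

```python
import math


def range_normalize(column):
    """Rescale a list of values linearly to the interval [0, 1]."""
    lo, hi = min(column), max(column)
    if hi == lo:
        # A constant variable carries no information; map it to 0.
        return [0.0 for _ in column]
    return [(v - lo) / (hi - lo) for v in column]


def quantization_error(observations, codebook):
    """Mean Euclidean distance from each observation to its closest
    weight vector (its best matching unit) in the codebook."""
    total = 0.0
    for x in observations:
        total += min(math.dist(x, w) for w in codebook)
    return total / len(observations)
```

For example, `range_normalize([2.0, 4.0, 6.0])` gives `[0.0, 0.5, 1.0]`. Because QE depends on the scale and content of the inputs, it is only meaningful for comparing maps trained on the same dataset, as noted in the text.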
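The idea behind natural-breaks classification can be shown with a small dynamic-programming sketch in Python. It splits sorted one-dimensional values into a chosen number of contiguous groups so that the total within-group sum of squared deviations is minimized. This is an illustrative stand-in, not the BAMMtools routine; the function name is hypothetical, and the quadratic running time makes it suitable only for small examples.

```python
def jenks_groups(values, n_classes):
    """Split values into n_classes contiguous groups (after sorting)
    that minimize the total within-group sum of squared deviations."""
    data = sorted(values)
    n = len(data)
    # Prefix sums give the squared deviation of any slice in O(1).
    s1, s2 = [0.0], [0.0]
    for v in data:
        s1.append(s1[-1] + v)
        s2.append(s2[-1] + v * v)

    def ssd(i, j):
        """Sum of squared deviations from the mean for data[i:j]."""
        total = s1[j] - s1[i]
        return (s2[j] - s2[i]) - total * total / (j - i)

    inf = float("inf")
    # cost[c][j]: best cost of splitting data[:j] into c groups.
    cost = [[inf] * (n + 1) for _ in range(n_classes + 1)]
    back = [[0] * (n + 1) for _ in range(n_classes + 1)]
    cost[0][0] = 0.0
    for c in range(1, n_classes + 1):
        for j in range(c, n + 1):
            for i in range(c - 1, j):
                cand = cost[c - 1][i] + ssd(i, j)
                if cand < cost[c][j]:
                    cost[c][j] = cand
                    back[c][j] = i
    # Walk the back-pointers to recover the groups.
    groups = []
    j = n
    for c in range(n_classes, 0, -1):
        i = back[c][j]
        groups.append(data[i:j])
        j = i
    return groups[::-1]
```

The first and last values of each returned group give the class ranges, analogous to the low and high O2 ranges used to subset the CV observations.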

Data Analysis: SOM Computation and Model Optimization
After selecting a suite of independent variables to include in each dataset (CV all observations (CVAO), CV high O2, CV low O2, GM), we adjusted the number of lattice nodes, the lattice dimensions, and the number of clusters, k, to maximize between-cluster variance and minimize within-cluster variance. It is important to optimize the dimensions of the SOM lattice for a given dataset, as an unsuitable lattice configuration can distort the distribution of the input variables across the map. For each dataset, we followed Vesanto's rule (Vesanto et al., 2000) to pinpoint the optimal number of lattice nodes. We also approximated the column-to-row ratio of the lattice as the ratio of the two largest eigenvalues from PCA using the original, non-normalized values (Céréghino & Park, 2009).
A hexagonal lattice arrangement was used. The lattices for the CVAO, CV high O2, CV low O2, and GM SOMs contained 16 rows and 57 columns, 17 rows and 34 columns, 13 rows and 33 columns, and 15 rows and 27 columns, respectively. The CVAO, CV high O2, CV low O2, and GM SOM lattice configurations therefore contained 912, 578, 429, and 405 nodes, respectively.
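The lattice-sizing heuristics mentioned above can be sketched as follows. Vesanto's rule is commonly stated as roughly 5√n nodes for n observations; the helper names and the rounding choices below are assumptions, and the eigenvalues passed in the example are hypothetical. The sketch is meant to show the arithmetic, not to reproduce the lattice sizes reported in the study, which were also tuned during optimization.

```python
import math


def vesanto_node_count(n_observations):
    """Heuristic number of lattice nodes, about 5 * sqrt(n)."""
    return round(5 * math.sqrt(n_observations))


def lattice_dimensions(n_nodes, eig1, eig2):
    """Choose (rows, cols) so that cols / rows is close to eig1 / eig2
    and rows * cols is close to n_nodes.

    eig1 and eig2 are the two largest PCA eigenvalues (eig1 >= eig2).
    """
    ratio = eig1 / eig2
    rows = max(1, round(math.sqrt(n_nodes / ratio)))
    cols = max(1, round(n_nodes / rows))
    return rows, cols
```

For instance, 10,000 observations would suggest about 500 nodes, and with a hypothetical eigenvalue ratio of 4 the sketch returns an 11 by 45 lattice.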
SOM training was performed over 20,000 iterations, and α was set to decrease linearly from 0.05 to 0.01. The neighborhood size started at a radius from the BMU of two-thirds the lattice size and decreased linearly to a value of 0, at which point only the weights of the BMU were being updated. For a given dataset, the SOM iteration that maximized the nonparametric F statistic (ratio of between-cluster to within-cluster variance) and minimized QE (measure of map resolution) was selected as the final model run (Underwood et al., 2021). The nonparametric F statistic was calculated using the adonis function in R's vegan package (Oksanen et al., 2019). For a given dataset, to identify unique attributes of each cluster, we plotted the (range normalized) intra-cluster mean, relative to the overall mean, for each input variable. This metric was used to visualize suites of variables and their relative value ranges that constituted different environmental conditions associated with various O2 levels. Mixed-effect models that incorporated random effects for temporal autocorrelation were used to test for differences in each input variable according to cluster assignment using the lme function in R's nlme package (Pinheiro et al., 2014). Standard errors of the mean of original (i.e., not range-normalized) values were compared to assess whether inter-cluster means were meaningfully different. The cluster assignments for each observation were then plotted onto an O2 time series figure (original values) to display temporal fluctuations in classes of environmental conditions.
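A pseudo-F statistic of the kind used to compare cluster solutions can be computed directly from cluster labels. The Python sketch below assumes Euclidean distances, in which case the statistic reduces to the between-cluster mean square divided by the within-cluster mean square. It is a simplified illustration with a hypothetical function name; it does not reproduce the adonis function, which additionally supports other distance measures and permutation-based significance tests.

```python
def pseudo_f(observations, labels):
    """Between-cluster mean square divided by within-cluster mean square.

    observations: list of equal-length numeric vectors.
    labels: cluster label for each observation (at least two clusters,
    and at least one cluster with internal spread).
    """
    n = len(observations)
    dim = len(observations[0])
    groups = {}
    for x, lab in zip(observations, labels):
        groups.setdefault(lab, []).append(x)
    k = len(groups)
    grand = [sum(x[j] for x in observations) / n for j in range(dim)]
    ss_between = 0.0
    ss_within = 0.0
    for members in groups.values():
        centroid = [sum(x[j] for x in members) / len(members)
                    for j in range(dim)]
        # Spread of cluster centroids around the grand mean.
        ss_between += len(members) * sum(
            (c - g) ** 2 for c, g in zip(centroid, grand))
        # Spread of members around their own centroid.
        for x in members:
            ss_within += sum((a - c) ** 2 for a, c in zip(x, centroid))
    return (ss_between / (k - 1)) / (ss_within / (n - k))
```

Larger values indicate clusters that are well separated relative to their internal spread, which is why a run with a higher statistic (together with a lower QE) would be preferred.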

Champlain Valley Site: Champlain Valley All Observations (CVAO) SOM
Because the processes that control meteorological and soil conditions exhibit patterns that repeat over time, high frequency environmental observations are usually temporally autocorrelated. We detected temporal autocorrelation in our input data for each SOM (Figures S3-S6 in Supporting Information S1). At the CV site, we first performed a SOM on all observations (O2 = 0-21.47%, n = 22,427) to assess whether our questions could be adequately addressed at this scale (Figure 3). After examining the component planes generated by the input variables (Figures S7-S10 in Supporting Information S1), we selected key input variables for the CVAO SOM (mean O2 = 11.14%): soil T (mean = 8.6°C), 1-week cumulative antecedent precipitation (hereafter referred to as "1-week precipitation") (16.28 cm), VWC (0.51 m3/m3), Julian date (183), CO2 (2,805 ppm), EC (0.19 dS/m), and dew point (2.41°C). The CVAO SOM assigned observations to six different clusters (Figure 3c and Table 2). Most (48.4% of observations) were categorized by the SOM as wet and cool, with low (below average) CO2, EC, and 1-week precipitation (clusters 5 and 6). These clusters occurred during early spring (cluster 5) and winter (cluster 6). Clusters 2 and 3 (21.2% of observations) fell into a dry and warm category with high (above average) CO2 and dew point, and low 1-week precipitation and EC. Both clusters occurred during summer months, but cluster 3 occurred during early spring, too. Cluster 1 (17.5% of observations) included dry and warm soil conditions, with high 1-week precipitation, CO2, and dew point, and low EC. Cluster 1 occurred from mid-summer to fall. Some (12.9%) observations were grouped into a wet category with high 1-week precipitation, EC, and dew point, and low CO2 (cluster 4). Cluster 4 occurred during winter months.
While the dry and warm (July-October) and wet and cool (October-April) categorization revealed a seasonal pattern in the variability of the key controls on soil O2, clusters derived from all CV observations did not adequately differentiate between conditions that led to high versus low O2. This is evidenced by the wide range of O2 values included in clusters 1 (11%-20%), 2 (2%-20%), 3 (0.1%-20%), 4 (1%-20%), 5 (0.1%-21%), and 6 (0%-21%). To more effectively address our questions, further sub-setting of our data into observations associated with low O2 versus high O2 was therefore needed to parse the unique factors (and possibly different combinations of factors) within or across seasons that led to anoxic and oxic conditions. Based on Jenks Natural Breaks, we therefore split the CV observations into two datasets associated with high (12.9%-21.5%, n = 12,593) and low (0%-4.3%, n = 7,043) O2 to differentiate conditions that led to contrasting O2 regimes. We performed separate SOMs (CV high O2 and CV low O2) on those subsets.

Champlain Valley Site: High O2 SOM
Key input variables selected for the CV high O2 SOM (mean O2 = 18%) were determined by examining component planes and were identified as soil T (mean = 10.3°C), 2-week cumulative antecedent precipitation (hereafter referred to as "2-week precipitation") (2.74 cm), VWC (0.46 m3/m3), CO2 (3,215.2 ppm), and Julian date (192) (Figure S8 in Supporting Information S1). The high O2 SOM identified four distinct clusters of multivariate time series observations (Figure 4c). High O2 events, which made up 49% of the CV O2 values, were somewhat evenly distributed among winter (26% of data points), spring (17%), summer (25%), and fall (32%). Oxygen values were consistently high from May 2018 through April 2019 (Figure 4b), and further research is required to identify what prevented O2 depletion under wet and cool soil conditions during this period. At the CV sampling location, groundwater levels generally decreased during the growing season (early May to early October; average = 0.82 m below the soil surface) and increased during cooler months (average = 0.38 m below the soil surface).
Soil conditions within clusters 1 and 4, which made up 62% of the CV high O2 data points, were generally dry and warm (Table 3). Cluster 1, which included 53% of data points, is associated with average (compared to the overall mean) 2-week precipitation and average CO2 values. In contrast, cluster 4, which encompassed 8.6% of data points and had below average 2-week precipitation, was associated with the highest average CO2 of all four clusters. There was overlap of the Julian date ranges of clusters 1 and 4 during warm months (Table 3). Wet and cool soil conditions (clusters 2 and 3) described 38% of data points within the CV high O2 dataset and were associated with the highest average O2 of all clusters. Key differences between clusters 2 and 3 include Julian date range (October-January, and January-April, respectively) and 2-week precipitation (39.9 and 19.2 cm, respectively) (Table 3).

Champlain Valley Site: Low O2 SOM
Low O2 values were distributed somewhat uniformly among the four seasons (winter = 32%, spring = 35%, summer = 12%, and fall = 21%). Low O2 values represented 27% of the O2 dataset and occurred intermittently [...] 4). In contrast to the CV high O2 SOM results, the majority of low O2 data points (69%) could be categorized as wet and cool with low 2-week precipitation and CO2 (clusters 1 and 3). Clusters 1 and 3 differed in Julian date ranges (Figure 5c). Clusters 4 and 5 included 9.3% of data points, which fell into a very dry and warm category, with average (cluster 4) and above average (cluster 5) CO2 and 2-week precipitation (Figure 5c). Unique to the CV low O2 SOM, an additional cluster was identified (cluster 2), which encompassed 21.5% of data points. Cluster 2 occurred throughout October and November, and May and June, and could be described as warmer than average, with average soil moisture, despite above average antecedent precipitation and below average CO2 (Table 4).

Green Mountains Site SOM
We included all values from the GM dataset in one SOM, as O 2 values at the GM site were all low, ranging from 0% to 0.6% (n = 6,921). A unique set of input variables was chosen for the GM model based on the component planes (Figure S10 in Supporting Information S1): cumulative 2-week antecedent ambient temperature (hereafter referred to as 2-week TA), VWC, 1-week precipitation, and Julian date. Average values for these variables were 11.0°C, 0.53 m 3 /m 3 , 1.99 cm, and 186.6, respectively. The low O 2 SOM grouped the observations into five clusters. Observations from the GM dataset can generally be described as dry and warm, wet and cool, or wet and warm (Table 5). Average groundwater levels at the GM sampling location were significantly higher (p < 0.001; average = 0.05 m below the soil surface) and remained elevated throughout the year.
Cluster 2 encompassed the most data points (46.8%), which can be described as dry and warm, with below average 1-week antecedent precipitation, occurring during early spring/summer. Clusters 3 and 5 included 28% of data points, which can be summarized as warmer and wetter than average conditions, with differing 1-week precipitation and Julian date ranges (Figures 6b and 6c). Cluster 4 (15.4% of data points) can be described as cool, and drier than average, with low 1-week precipitation. Encompassing 9.3% of data points, cluster 1 can be described as wet and cool, with low 1-week precipitation (Figure 6c).
Oxygen values included in the GM SOM remained consistently at or below zero percent throughout the entire sampling period, with the exception of an O 2 event in late December 2017 that reached 0.6% (Figure 6c). As such, it should be noted that the clustering results of the GM dataset incorporate the single O 2 event (Cluster 1) and therefore may not be representative of typical conditions observed at the GM site. Clusters 1 and 4 occurred throughout winter 2017/2018 and were associated with wet and cool soil conditions. Due to the interference of winter weather with our instrumentation, we were not able to monitor winter 2018/2019 or 2019/2020. Clusters 2 (warm/dry) and 3 (warm/wet) occurred intermittently from May to October 2018, and from June to September in 2019. O 2 values within cluster 5 (warm/wet) occurred as isolated events each summer during the months of July (2018 and 2019) and May and June 2020.
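To make the clustering step more concrete, the following is a minimal sketch of a Kohonen SOM written with the Python standard library only. It is not the implementation used in this study: the 3 x 3 grid, the linear decay schedules, the random seed, and the eight example observations are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
import random


def range_normalize(rows):
    """Rescale every column to [0, 1] so no single variable dominates distances."""
    columns = list(zip(*rows))
    lows = [min(col) for col in columns]
    highs = [max(col) for col in columns]
    return [
        [(v - lo) / (hi - lo) if hi > lo else 0.0 for v, lo, hi in zip(row, lows, highs)]
        for row in rows
    ]


def best_matching_unit(weights, x):
    """Return the index of the node whose weight vector is closest to x."""
    return min(
        range(len(weights)),
        key=lambda i: sum((w - v) ** 2 for w, v in zip(weights[i], x)),
    )


def train_som(data, n_rows=3, n_cols=3, n_iter=2000, lr0=0.5, seed=0):
    """Train a small rectangular SOM with a Gaussian neighborhood (assumed settings)."""
    rng = random.Random(seed)
    dim = len(data[0])
    coords = [(r, c) for r in range(n_rows) for c in range(n_cols)]
    weights = [[rng.random() for _ in range(dim)] for _ in coords]
    sigma0 = max(n_rows, n_cols) / 2.0
    for t in range(n_iter):
        frac = t / n_iter
        lr = lr0 * (1.0 - frac)                      # learning rate decays linearly
        sigma = sigma0 * (1.0 - frac) + 0.5 * frac   # radius shrinks from sigma0 to 0.5
        x = data[rng.randrange(len(data))]
        b_row, b_col = coords[best_matching_unit(weights, x)]
        for i, (r, c) in enumerate(coords):
            grid_d2 = (r - b_row) ** 2 + (c - b_col) ** 2
            h = math.exp(-grid_d2 / (2.0 * sigma ** 2))
            weights[i] = [w + lr * h * (v - w) for w, v in zip(weights[i], x)]
    return weights


# Hypothetical hourly observations: (VWC in m3/m3, soil temperature in deg C).
wet_cool = [(0.55, 4.0), (0.57, 5.0), (0.56, 3.0), (0.58, 6.0)]
dry_warm = [(0.30, 19.0), (0.32, 21.0), (0.31, 20.0), (0.29, 22.0)]
scaled = range_normalize(wet_cool + dry_warm)
som = train_som(scaled)
nodes = [best_matching_unit(som, x) for x in scaled]
wet_nodes, dry_nodes = set(nodes[:4]), set(nodes[4:])
```

In this toy example the wet and cool observations and the dry and warm observations should map to different nodes, which is the behavior that allows groups of nodes to be interpreted as clusters of similar soil conditions.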

SOM Results Confirm Multivariate and Complex Controls on O 2
We addressed the question: Can the SOM approach provide additional insight into the O 2 /soil moisture relationship by incorporating multiple predictors? Our results confirm that, given a unique combination of high frequency, multiparameter soil sensor data collected over multiple years, the SOM identified combinations of variables that control soil O 2 levels. Importantly, and in contrast to traditional ecological assumptions, low O 2 levels did not correspond solely with increasing soil moisture. Indeed, high O 2 levels persisted in both high and low moisture conditions. Low O 2 conditions instead depended on temporally varying combinations of water inputs, water demand, and O 2 demand. The output of our unsupervised clustering analysis from both field sites (i.e., all four SOMs) revealed a pronounced seasonal signal. This is evidenced by clustering according to dry soil conditions during warm months and wet soil conditions during cool months, with varying antecedent precipitation, soil CO 2 , and soil temperature conditions.
Our results show that the information gained from the SOM clustering changes with the magnitude of variability in the input data. When all observations from the CV site were included in one SOM, we were able to detect the seasonal pattern driving fluctuations in the controls on soil O 2 but not the nuanced conditions that led to oxic versus anoxic conditions. We thus ran separate SOMs on observations associated with high and low O 2 and hereafter refer to the results from those models.
At the CV site, the majority (69%) of low O 2 values were associated with wet and cool soil conditions with below average CO 2 and 2-week antecedent precipitation. These observations occurred from November to May (Figure 5, cluster 3). Under these conditions, a physical saturation process (as opposed to biological O 2 consumption) dominated, as the combined effects of decreased water demand (Sevanto et al., 2006) and sufficient water inputs from precipitation likely prevented the reaeration of soil pores (Neira et al., 2015). Furthermore, in northern temperate climates, ground frost can persist during this period (November-May), especially in areas of open land without significant canopy cover (Shanley & Chalmers, 1999), which could have prevented the reaeration of soil pores. All low O 2 values at the CV site that were observed under wet and cool conditions were associated with below average subsurface CO 2 (used here as an indicator of soil O 2 demand), indicating that O 2 consumption rates were relatively low under the majority (69%) of low O 2 conditions. This finding is consistent with those of Davidson et al. (1998) and Moyano et al. (2013), who observed decreased soil respiration rates under wet and cool soil conditions during the non-growing season, which resulted from reduced plant respiration and high soil moisture levels impeding O 2 diffusion, and thus, decomposition and CO 2 production (Doran et al., 1990; Moyano et al., 2013; Skopp et al., 1990). It is therefore possible that a restriction of air exchange between the atmosphere and soil pores is necessary in order for low levels of biological soil respiration to markedly deplete O 2 before it is replenished. We note that the effects of subtle shifts in CO 2 on O 2 depletion under wet and cool soil conditions may have occurred at time scales that were finer in resolution than our hourly SOM input data. Therefore, analyzing a dataset of finer temporal resolution, or one that encompasses a shorter time period, may help detect a more significant impact of biological O 2 consumption under wet and cool conditions.
In contrast, most (62%) high O 2 values from the CV site were associated with dry and warm conditions that occurred during warmer months (April-October, Figure 4). This indicates that dry and warm soil conditions inhibited soil O 2 depletion by allowing O 2 to readily exchange with the atmosphere via increased air-filled pore space. However, as 62% of high O 2 values were associated with above-average soil CO 2 levels (clusters 1 and 4), our findings suggest that water limitation did not suppress soil respiration. Furthermore, these findings contradict previous studies by Doran et al. (1990) and Orchard and Cook (1983) that documented decreased soil respiration rates resulting from elevated soil temperatures and low soil matric potential during warm months. It is therefore likely that, under dry and warm soil conditions at our sites, sufficient soil moisture is required to block O 2 diffusion in order for elevated soil respiration rates to sufficiently deplete O 2 . We note that VWC may need to decrease below field capacity (not measured in this study) in order for water limitation to significantly reduce soil respiration rates (Davidson et al., 1998). Additionally, dry and warm soil conditions coincide with the growing season in temperate systems where the combined effect of elevated plant water uptake and negative soil matric potential can lower hydraulic conductivity (i.e., inhibit additional water inputs from percolating through the soil matrix) (Hardie et al., 2012). Under this scenario, additional precipitation inputs may not have resulted in increased VWC and therefore, despite elevated CO 2 levels (i.e., high O 2 demand) at this time, O 2 usually remained near atmospheric levels.

High Soil Moisture Levels Do Not Always Lead to Low O 2
Although the majority of high O 2 levels occurred when soils were dry and warm, and most low O 2 levels occurred under wet and cool conditions, we also intermittently observed the opposite behavior, indicating soil moisture alone is not always an accurate proxy for soil O 2 . Similar to our findings, Burgin and Groffman (2012) observed low gas-phase soil pore O 2 under dry VWC conditions in a riparian wetland during summer. These exceptions hold key insights into the importance of antecedent conditions (i.e., time lag effects) in determining soil O 2 conditions. For example, low O 2 occurred in June and July under dry and warm (below average VWC) conditions with above-average 2-week antecedent precipitation inputs. This is evidenced by clusters 4 and 5 from the CV low O 2 SOM, which accounted for 9.3% of low O 2 observations (Figure 5). In this case, high antecedent precipitation inputs could have temporarily saturated the soil, thus stimulating soil respiration, while simultaneously blocking O 2 diffusion. This could have triggered a significant O 2 depletion that persisted even after soils dried back down. This scenario agrees with the preceding warm and above-average 2-week precipitation conditions typical for cluster 2. Our findings agree with those of Silver et al. (1999), who found that forest soil O 2 concentrations were negatively correlated with cumulative rainfall for up to 4 weeks preceding O 2 measurements. The impact of antecedent conditions on O 2 dynamics is also evidenced by the findings of Smyth et al. (2019), who found that high temporal resolution sensor data, especially gas-phase soil pore O 2 , revealed distinct lag periods between changes in soil conditions and subsequent biogeochemical activity.
It is important to note that clusters 4 and 5 were present only once throughout the entire sampling period (in June-July 2019) and occurred together in quick succession, which indicates these conditions were unusual for our site, at least within our three study years. The anomalous low O 2 conditions in summer (June-July 2019) were followed by a rapid increase in soil O 2 . Burgin and Groffman (2012) and Liptzin et al. (2011) documented rapid increases after low O 2 conditions in summer and attributed these events to dry soil macropores and plant senescence.
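Because antecedent precipitation is central to the lag effects discussed above, a short sketch of how such a predictor can be derived from an hourly precipitation series may be helpful. The function name is hypothetical, and the choice to exclude the current hour from the window is an assumption; the exact windowing convention is not specified here.

```python
def antecedent_sum(hourly_precip_cm, window_hours):
    """Cumulative precipitation over the window_hours preceding each observation.

    Assumption: the current hour is excluded, so the first value is always 0.
    """
    totals = []
    running = 0.0
    for i, precip in enumerate(hourly_precip_cm):
        totals.append(running)          # sum of the preceding window only
        running += precip               # add the current hour for later steps
        if i >= window_hours:
            running -= hourly_precip_cm[i - window_hours]  # drop the oldest hour
    return totals


# Toy series with a 3-hour window; a 2-week window would use 14 * 24 = 336 hours.
example = antecedent_sum([1.0, 0.0, 2.0, 0.0, 0.0, 3.0], 3)
```

With hourly data, `antecedent_sum(series, 14 * 24)` would give a 2-week antecedent total and `antecedent_sum(series, 7 * 24)` a 1-week total.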
Another example of counterintuitive patterns is the occurrence of observed high O 2 levels during fall, winter, and spring months (October-April), when conditions were wet and cool (38.3% of high O 2 data points). Two distinct near-atmospheric O 2 events occurred (one in December-February 2017/2018, the other in November-April 2018/2019) under such conditions (clusters 2 and 3 from CV high O 2 SOM, Figure 4), and the latter event had a relatively long duration. This suggests that the high O 2 levels were not a result of a brief O 2 transition period, but instead reflect the absence of an O 2 depletion mechanism in response to increased soil moisture. Interestingly, these conditions mirror those that resulted in low O 2 levels (Table 4), indicating that decreased soil respiration rates characteristic of clusters 2 and 3 from the high O 2 SOM likely did not prevent O 2 depletion. The only difference between wet and cool conditions that resulted in low O 2 (Table 4) and clusters 2 and 3 from the high O 2 CV SOM (Table 3) was the higher 2-week precipitation values associated with low O 2 during January-May, which further emphasizes the important role of antecedent conditions in O 2 depletion. If O 2 consumption rates did not exceed the rate of O 2 delivery, it is possible that these distinctive O 2 levels were a result of oxygenated water inputs originating from downward flow of oxygenated surface water, or oxygenated groundwater recharge (Lahiri & Davidson, 2019; Nelson, 2002). Regardless, our results show that very similar soil conditions can result in distinctive O 2 levels. A better understanding of the drivers of soil O 2 is therefore required to investigate such heterogeneity. Our results also illustrate the utility of high frequency in-situ time series in capturing infrequent and unanticipated events, especially in cases when antecedent conditions may alter the O 2 response. Indeed, intermittent manual sampling campaigns could either miss these events entirely, mischaracterize the commonality of their occurrence, or have limited capacity to identify event drivers. As in-situ sensing networks become more commonplace in soil science research, we expect increased detection of these counterintuitive events. This will ultimately change how we understand the drivers of fluctuating O 2 conditions in the soil environment, and in particular, the role of antecedent conditions.

Site-Specific Controls on the Drivers of O 2 Regimes
While the key controls on soil O 2 were constant across sites, site-specific characteristics modulated the relative rates and impacts of ecosystem water inputs, water demand, and O 2 demand, which could have led to variable O 2 regimes across sites. Indeed, the constant anoxia (O 2 = 0-0.6%) and elevated VWC values (0.48-0.61 m 3 /m 3 ) observed at the GM site were likely due to unique site-specific features. The combined effects of topography and groundwater hydrology dynamics provided a steady water supply that created consistently saturated soil conditions. This is further evidenced by wet and warm soil conditions unique to the GM site, indicating that soils did not dry out under increased ambient temperatures. This finding is consistent with those of Silver et al. (1999), who found that soil O 2 levels were sensitive to hydrologic inputs and were significantly correlated with a topographic gradient spanning ridge, slope, and valley locations. The consistently high water inputs at the GM site generated constant anoxia by preventing the reaeration of soil pores, and/or displacing O 2 . As expected, we observed seasonal fluctuations in key O 2 controls (2-week antecedent ambient T, 1-week antecedent precipitation, VWC), but in contrast to the CV site, this resulted in steadily low O 2 concentrations. These findings highlight a disconnection between the controls on O 2 and O 2 dynamics. This suggests that a physical soil wetting process is the primary mechanism controlling O 2 dynamics at the GM site, and that the prevention of soil pore reaeration, or O 2 diffusion, prevails, thus creating a low O 2 environment, regardless of seasonal fluctuations in O 2 controls. Our results show that relationships between O 2 and soil moisture may be reliably predicted by a traditional bivariate nonlinear power function at some sites (e.g., our GM site) but a multivariate regression model may be required at sites with higher O 2 variability (e.g., our CV site). Our findings indicate that the chosen regression approach should consider the environment in which the data were collected, and our SOM analysis has identified key features to be monitored and considered in such a multivariate regression approach.
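As an illustration of the bivariate alternative mentioned above, the sketch below fits a power function of the assumed form O2 = a * VWC^b by ordinary least squares on log-transformed values. The functional form, the function name, and the synthetic data are assumptions for illustration only; they are not the fitted relationship from this study. Note that observations with O 2 at or below zero cannot be log-transformed and are dropped here, which would matter for a persistently anoxic dataset.

```python
import math


def fit_power_function(vwc, o2):
    """Fit o2 = a * vwc**b via least squares on log(o2) = log(a) + b * log(vwc)."""
    pairs = [(math.log(w), math.log(o)) for w, o in zip(vwc, o2) if w > 0 and o > 0]
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in pairs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    b = sxy / sxx                         # slope in log-log space
    a = math.exp(mean_y - b * mean_x)     # back-transformed intercept
    return a, b


# Synthetic, noise-free data generated from a = 0.5 and b = -2 (illustrative values).
vwc_values = [0.2, 0.3, 0.4, 0.5, 0.6]
o2_values = [0.5 * w ** -2.0 for w in vwc_values]
a_hat, b_hat = fit_power_function(vwc_values, o2_values)
```

A negative fitted exponent corresponds to the commonly assumed decline in O 2 with increasing soil moisture; a multivariate model would add further predictors such as antecedent precipitation and soil temperature.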
The topography, groundwater hydrology, and vegetation characteristics unique to the CV site resulted in seasonal VWC fluctuations. For example, low VWC values observed throughout the growing season at the CV site were likely the result of depleted groundwater levels in combination with high plant water uptake by abundant sedges and nettles (water demand), and elevated soil respiration rates (O 2 demand). These conditions facilitated O 2 diffusion and reaeration of soil pores, thus restoring soil O 2 to near-atmospheric levels. Under dry and warm conditions, antecedent precipitation plays an important role in O 2 depletion, as soils at the CV site can become too dry to displace O 2 or block O 2 diffusion. The significant impact of antecedent precipitation conditions on soil O 2 is also highlighted by Silver et al. (1999), who found soil O 2 levels at ridge locations to be significantly correlated with cumulative 4-week antecedent precipitation.
Nutrient cycling processes that require anaerobic soil conditions or anaerobic microsites, such as denitrification, sulfate reduction, or methanogenesis, will not proceed if the soil environment is well aerated (Sexstone et al., 1982). Furthermore, the growing season is a critically important time for nutrient sequestration and transformation within agricultural watersheds (Wang et al., 2014), as fertilizer is generally applied to agricultural fields in early spring. High soil O 2 levels during this critical period for N mobilization could reduce soil denitrification rates, which could have detrimental effects on nearby aquatic ecosystems.

Complications Associated With Predicting Soil O 2 Based Solely on Soil Moisture
We asked the question: Which combinations of variables lead to high versus low soil O 2 ? The results of our clustering analysis suggest that riparian soil O 2 dynamics are controlled by a network of seasonally variable, rate-dependent, and location-dependent parameters, and as such, the relationship between O 2 and soil moisture is more complex than represented by traditional ecological models. In contrast to traditional ecological thought, high soil moisture does not always result in low O 2 levels, and vice versa. Reliance on these more traditional reduction functions would have yielded contrasting results at our two study sites. If we had predicted soil O 2 levels at the CV site solely based on the commonly assumed negative correlation between moisture and O 2 (i.e., VWC values that typically result in low O 2 ; VWC = 0.5-0.6 m 3 /m 3 ), 30.6% of O 2 values would have been incorrectly predicted as low, and 6.7% would have been incorrectly predicted as high (VWC ≤ 0.4 m 3 /m 3 ). In contrast, consistently high VWC observed at the GM site resulted in consistently low O 2 . Therefore, although we did not observe a significant negative correlation between O 2 and soil moisture (data not shown), our predictions based on soil moisture conditions alone would have been reasonably accurate at this site.
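The misprediction rates above can be thought of as the output of a simple threshold rule. The following sketch shows one way such rates could be computed; the VWC thresholds follow the text, whereas the function name, the O 2 cutoff separating low from high values, and the five example observations are hypothetical placeholders rather than values from this study.

```python
def proxy_error_rates(vwc, o2, o2_low_cutoff, wet_vwc=0.5, dry_vwc=0.4):
    """Percent of observations mispredicted by a moisture-only rule.

    Rule (assumed): VWC >= wet_vwc predicts low O2; VWC <= dry_vwc predicts high O2.
    o2_low_cutoff is a placeholder boundary between "low" and "high" O2 (percent).
    """
    n = len(o2)
    # Wet soil, yet O2 stayed high: incorrectly predicted as low.
    false_low = sum(1 for w, o in zip(vwc, o2) if w >= wet_vwc and o > o2_low_cutoff)
    # Dry soil, yet O2 was low: incorrectly predicted as high.
    false_high = sum(1 for w, o in zip(vwc, o2) if w <= dry_vwc and o <= o2_low_cutoff)
    return 100.0 * false_low / n, 100.0 * false_high / n


# Five hypothetical observations (VWC in m3/m3, O2 in percent).
vwc_obs = [0.55, 0.55, 0.35, 0.35, 0.45]
o2_obs = [1.0, 18.0, 19.0, 2.0, 10.0]
pct_false_low, pct_false_high = proxy_error_rates(vwc_obs, o2_obs, o2_low_cutoff=5.0)
```

In this toy dataset one wet observation retains high O 2 and one dry observation has low O 2 , so each error type accounts for one of five observations.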
Our findings have important implications for nutrient cycling models that rely solely on soil moisture measurements to predict soil O 2 , and for empirical studies that make inferences about soil biogeochemical processes based on O 2 estimations (Rubol et al., 2012). Soil O 2 dynamics strongly modulate the rate and efficiency of microbially-mediated soil elemental (e.g., C, N, S) cycling through shifts in redox potential. Incorrect estimations of soil O 2 can therefore result in inaccurate predictions of critical N, C, etc., process rates. For example, much of the literature involving the measurement of soil O 2 and its relationship to soil moisture is within the context of climate-change driven shifts in soil moisture regimes and the subsequent effects on C storage and soil respiration (O'Connell et al., 2018; Santiago et al., 2005). These changes in C storage are modulated by confounding effects of seasonally variable soil moisture and temperature, as well as O 2 (Davidson et al., 1998). Critical soil biogeochemical processes not only impact watershed nutrient mobilization and downstream water quality, but also soil greenhouse gas production. It is therefore imperative to continue to improve our understanding of soil O 2 dynamics, as they are likely to increase in complexity as we face complications linked to a changing climate. Models that incorporate mass balance approaches are a potential alternative to those that rely on traditional reduction functions to predict soil O 2 . For example, Parolari et al. (2021) applied mass balance models to power spectra quantified from soil sensor timeseries and found accurate predictability of O 2 variability using temperature and soil moisture as inputs.
Our analysis uniquely incorporated multivariate data of high temporal resolution, which allowed us to investigate and provide new insight into the mechanisms controlling O 2 dynamics within our study sites. However, the limitations of our analysis are highlighted by our spatially constrained dataset, as we included observations from one landscape position within two different riparian soil sites of contrasting adjacent land use. Therefore, our results cannot be directly scaled up to predict O 2 regimes across more expansive ecosystem scales. It is also important to consider how the limited spatial resolution of our dataset (within a given sensor location) impacts our ability to assess the suitability of soil conditions for biogeochemical processes that occur in deeper soil depths, especially soil denitrification (Groffman et al., 2009).
Furthermore, the inherent spatial heterogeneity of the environmental factors that control soil redox-sensitive processes results in hot spots and hot moments of biogeochemical activity, which are very difficult to predict. We also note that our sensor network was unable to detect O 2 within soil solution, or anaerobic microsites within soil aggregates, where hotspots of soil denitrification, for example, can occur (Parkin, 1987). These sources of spatial heterogeneity could result in rates of biogeochemical processes that are not reflective of the soil conditions measured by our instrumentation.
However, our results provide information about riparian soil O 2 dynamics that can be used for larger scale pattern analysis. As the factors that control soil O 2 were similar across the two sites, the seasonal variability we observed in the key O 2 controls may also apply to other riparian soil environments located in temperate climates. This seasonal O 2 framework could be an effective tool as a first pass prediction of whether O 2 conditions are conducive to aerobic or anaerobic soil processes. However, we also must consider that various site-specific characteristics will likely affect water inputs, water demand, and O 2 demand in ways that uniquely affect O 2 regimes. We posit that a fruitful next step would be to conduct comparable analyses that leverage more spatially expansive soil sensor networks that include other depths within the biologically active soil profile, as well as across variable climate, topographic, hydrologic, and geologic riparian soil environments (e.g., NEON, Critical Zone Observatories, LTER) to improve our understanding of the drivers of soil O 2 dynamics and our capacity to systematically model soil O 2 behavior and associated soil biogeochemical cycles.
It is also important to consider the implications of measuring gas-phase soil pore O 2 (measured in this study), compared to dissolved O 2 , especially in the context of biogeochemical process rates. Microbially mediated oxidation processes require dissolved-phase O 2 , whereas abiotic oxidation processes can occur with gas-phase soil pore O 2 . Gas-phase soil pore O 2 may have a more direct impact on abiotic oxidation processes, such as chemical weathering, photo-oxidation of organic compounds (e.g., lignin), electrochemical oxidation (i.e., the transfer of electrons between minerals, which changes their oxidation states), and abiotic oxidation of pollutants (e.g., heavy metals). However, the role of gas-phase soil pore O 2 as a predictor of microbially mediated processes is well-established. For example, Smyth et al. (2019) found gas-phase soil pore O 2 to be the best predictor for soil CH 4 fluxes, suggesting the type of soil O 2 measured in our study is linked to soil biogeochemical process rates. Rubol et al. (2013) also demonstrated the important role of soil pore O 2 in predicting rates of dissimilatory reduction of nitrate to ammonium. Thus, without labor-intensive methods needed to measure dissolved O 2 , our high frequency, continuous data provide information on the status of O 2 in soils, which has implications for both abiotic and microbially mediated processes.

Conclusions
We used a SOM approach to address key drivers of the spatial and temporal variability exhibited by riparian soil O 2 levels within the top 15 cm of the soil profile. Our results show that, in contrast to traditional ecological assumptions, O 2 cannot be accurately predicted based only on an inverse relationship between O 2 and soil moisture. Further, we found the SOM approach to be useful for identifying drivers of soil O 2 at intermediate soil moisture levels. However, in inundated soils, where O 2 is consistently depleted (i.e., our GM site), the reasoning that high soil moisture results in low soil O 2 holds true, eliminating the need for a multivariate approach to predict soil O 2 conditions. Our findings indicate that soil O 2 is controlled by a diverse set of seasonally variable parameters (antecedent precipitation, soil T, VWC, soil CO 2 ) and location-dependent conditions (topography and groundwater hydrology) that interact to result in a complex and nonlinear relationship between O 2 and soil moisture. Importantly, our results reveal that increases in soil moisture do not always trigger O 2 depletion, indicating that process-based ecosystem and denitrification models that rely on soil moisture alone to estimate soil O 2 may over-estimate denitrification, or other anaerobic process rates (e.g., iron reduction or methanogenesis). A more nuanced understanding of soil O 2 dynamics would therefore lead to improved predictions of temporal variability in redox-controlled nutrient cycling processes.
precipitation). This enables us to comprehensively assess potential drivers of soil moisture variability and the related O 2 response. Importantly, two types of O 2 can be measured in soil: gaseous O 2 (in air-filled pore spaces between soil particles) and dissolved O 2 (in soil water and water films surrounding soil particles; Neira et al., 2015). We measured gaseous O 2 in the present study. Continuous in-situ monitoring of soil O 2 most commonly targets gaseous O 2 , as measurement of dissolved O 2 in soil requires extraction of soil water and O 2 measurement via sensor (optical or electrochemical), titrimetric, or colorimetric methods. While high-frequency data for multiple parameters are advantageous for ecosystem monitoring, they require the use of tools that are specifically suited to analyze multivariate and nonlinear data. The Kohonen unsupervised self-organizing map (SOM), a type of artificial neural network, is a powerful clustering tool that can reliably analyze such multivariate and nonlinear data

Figure 1. Scatter plots displaying nonlinear relationships between (a) O 2 and volumetric water content (VWC), (b) log transformed O 2 and original VWC values, (c) O 2 and soil temperature, and (d) O 2 and hourly precipitation sum. High frequency soil sensor data were collected in a frequently inundated position within a riparian area in northern Vermont, USA from 2017 to 2020.

Figure 2. (a) Map of the USA with the State of Vermont highlighted in green. (b) Map of the State of Vermont, USA, and the province of Quebec, Canada, with the Missisquoi Basin, a subbasin of the Lake Champlain Basin, outlined in black. The Champlain Valley (CV) site, which is located within the Hungerford Brook subwatershed, is shaded in gray. The Green Mountains (GM) site is located within the Trout River subwatershed. (c) Satellite image of the CV site (Sheldon, VT) with a black circle indicating where sensors are installed. (d) Photograph of the CV site riparian transect. (e) Satellite image of the GM site (Montgomery, VT) with a red circle indicating where soil sensors are installed and (f) photograph of the GM transect.

Figure 3. Results from the Champlain Valley all observations (CVAO) self-organizing map (SOM), including (a) volumetric water content (VWC) time series with dashed line representing mean VWC for the CVAO dataset, (b) O 2 time series highlighted with the six clusters identified by the CVAO SOM, (c) bar plots displaying range normalized intra-cluster means of each input variable (n = number of observations per cluster). "NA" represents O 2 values that do not have corresponding independent variable values. 1-wk precip., one-week antecedent precipitation; EC, electrical conductivity.

Figure 4. High O 2 self-organizing map (SOM) results for the Champlain Valley (CV) site, including (a) volumetric water content (VWC) time series with dashed line representing mean VWC for the CV high O 2 dataset, (b) O 2 time series highlighted with the four clusters identified by the high O 2 SOM, (c) bar plots displaying range normalized intra-cluster means of each input variable (n = number of observations per cluster). Clusters that are not shaded (represented by "NA") correspond to O 2 values outside of the high O 2 range. 2-wk precip., two-week antecedent precipitation.
Within the same column, different letters represent significant differences between clusters (p < 0.05).

Figure 5. Low O 2 self-organizing map (SOM) results for the Champlain Valley (CV) site, including: (a) volumetric water content (VWC) time series with dashed line representing mean VWC for the CV low O 2 dataset, (b) O 2 time series highlighted with the five clusters identified by the low O 2 SOM, (c) bar plots displaying range normalized intra-cluster means of each input variable (n = number of observations per cluster). Clusters that are not shaded (represented by "NA") correspond to O 2 values outside of the low O 2 range. 2-wk precip., two-week antecedent precipitation.

Figure 6. Self-organizing map (SOM) results for the Green Mountains (GM) site, including: (a) volumetric water content (VWC) time series with dashed line representing mean VWC for the GM dataset, (b) O 2 time series highlighted with the five clusters identified by the GM SOM. Negative O 2 values, which were set to zero when fed to the SOM, are included in this time series to show O 2 variability. (c) Bar plots displaying range normalized intra-cluster means of each input variable resulting from the GM SOM (n = number of observations per cluster). 1-wk precip., one-week antecedent precipitation.

Table 2
Mean Value of Each Input Variable and O 2 Across Six Clusters Identified by the Champlain Valley All Observations Self-Organizing Map

Table 3
Mean Value of Each Input Variable and O 2 Across Four Clusters Identified by the Champlain Valley High O 2 Self-Organizing Map

Low O 2 values (n = 7,043) occurred less frequently than high O 2 values (n = 12,593) at the CV site, which has important implications for nutrient cycling. This finding suggests that the process of O 2 depletion requires the convergence of a more specific suite of soil conditions than high O 2 levels do. Due to seasonal fluctuations in the controls on soil O 2 at our sites, low O 2 values also occurred less frequently (25% of low O 2 values) during the growing season (generally early May to early October in Vermont), compared to high O 2 values (61%).

Table 4
Mean Value of Each Input Variable and O 2 Across Five Clusters Identified by the Champlain Valley Low O 2 Self-Organizing Map

Table 5
Inter-Cluster Means of Each Input Variable and O 2 Across Five Clusters Identified by the Green Mountains Self-Organizing Map