3.1 Biosphere 2 tropical rainforest description and drought experiment
Biosphere 2 (B2) in Oracle, AZ is a 1.27 ha steel and glass-enclosed building harboring five distinct biomes, including a 0.19 ha tropical rainforest (TRF), an enclosed ecosystem with variable topography and microhabitats with the ability to control rain and temperature inside providing an optimal setting for studying drought effects58–60. The B2 TRF harbors approximately 70 species of trees and shrubs, forming a canopy and understory, with soils that represent biogeochemical cycles present in rainforests61. In late 2019, a 65-day drought experiment was conducted as part of the Water, Atmosphere, and Life Dynamics campaign (WALD)44. During ambient (pre-drought) conditions, rainfall events were simulated by spraying 15,000 L of irrigation water from the top of the B2 TRF at a frequency of 3 days a week. The last rainfall event before the drought was on October 7, 2019, and the first post-drought rain-fall was on December 12, 2019. During the drought, ambient temperatures were maintained in the lowland region between 20 to 26.7 °C. For more detailed characterization of the WALD drought experiment, please refer to 44.
3.2 Position-specific (C1 and C2) 13C pyruvate labeling
Soil was labeled with position-specific (C1 or C2) 13C-pyruvate (henceforth referred to as C1-13C-pyruvate and C2-13C-pyruvate, respectively) to track C allocation into CO2, VOCs, and primary metabolites, as done previously in plants45,62. Three replicates each of C1-13C-pyruvate or C2-13C-pyruvate labeling was performed per site in the B2 TRF (Site 1, Site 2, and Site 3; n = 6 per site) (Fig S1a) within soil chambers (Fig S1b). 13C labeling was performed before and during drought (September 12 - 15 and November 7-19, respectively). The night before labeling, two automatic chambers were placed onto the corresponding soil collars and covered with a rain-out shelter. Each morning, these collars were labeled at around 10AM by using a 5x5 cm stencil with 1x1 cm openings placed into one side of each chamber and adding 100 μl of C1-13C-pyruvate or C2-13C-pyruvate solution (40 mg/ml) to each opening to a depth of 1 cm, for a total of 25 injections (Fig S1b). The stencil was then removed prior to soil chamber gas measurements.
Soil moisture and temperature was measured using a portable probe and LabQuest viewer (Vernier, Beaverton, OR, USA). For pre-drought, soil moisture and temperature data were collected on October 1 and 9 and for drought on November 11 and 18 for a subset of collars (P11 , P21, P23, P32, and P33 [Site #, replicate #]), except for November 18, where all collars from the experiment were tested (see Fig S1a). Soil moisture measured near the labeling sites decreased from 26.0 ± 6.9 to 13.8 ± 2.6 % (p < 0.001) between pre-drought and drought conditions, respectively. Soil temperature did not change significantly, and ranged from 23.0 ± 0.6 during pre-drought and 23.3 ± 1.2 °C during drought.
3.3 Continuous monitoring of VOCs and CO2
Prior to labeling, collars were measured at typical temporal resolution (~ 2 hr) over night. In order to capture any rapid changes in gas fluxes due to changes in microbial activity after pyruvate addition, measurement intervals were increased to high frequency (30 minutes) directly prior to labeling. After gas fluxes were expected to equilibrate, approximately 8 hr post pyruvate labeling (~ 6 PM), measurement intervals were decreased to low frequency (50 minutes) and remained at this frequency until measurements were stopped at 48 hr post labeling. For each measurement, the automatic chamber closed over the collar for a total of 10 min (pre-purge, 2.5 min; measurements, 6.5 min; post-purge, 1 min). Fluxes were measured using an automated, multiplexed Licor soil flux system (Licor 8100, Li-8150 16-port multiplexer and Lic 8100-104 Long-Term Chambers with opaque lids, Licor Inc.). The system was coupled to a Picarro G2201-i analyzer (Picarro Inc., Santa Clara, US) to measure CO2 and isotopic composition and a proton-transfer-reaction time-of-flight mass spectrometer (PTR-TOF 8000, Ionicon Inc. Innsbruck, Austria) for VOCs (including 13C-VOCs). The PTR-ToF-MS sampled the sub-flow from the soil flux system at 30 sccm via fluorinated ethylene propylene tubing heated at 60 °C. The drift voltage was 600 V, the drift temperature was 60°C and the drift pressure was 2.2 mbar, resulting in an E/N ratio of 137 Td. The time resolution was 10 ms and the mass range was up to 500 amu. The PTR-ToF was operated in the H3O+ mode, therefore only compounds having proton affinity higher than water (697 kJ/mol) underwent proton-transfer reactions and could be detected on their protonated mass to charge ratio (m/z), which includes the vast majority of VOCs. Measured ions were attributed to chemical formulas and specific chemical species based on the exact protonated m/z. PTR-TOF data were processed using the software PTRwid63. To account for possible variations of the reagent ion signals, measured ion intensities were normalized to the H3O+ counts in combination with the water-cluster ion counts64. At midnight, automatic calibrations were performed using standard gas cylinders containing different multi-VOC component calibration mixtures in Ultra-High Purity (UHP) nitrogen (Apel-Riemer Environmental, Inc., Colorado, USA). For a detailed description of the calibration setup see Werner et al. (2021). The concentrations of compounds included in the standard were calculated with an uncertainty of ≤23%. Concentrations of compounds not included in the calibration standard cylinders were calculated by applying the kinetic theory of proton transfer reaction65,66 with an uncertainty of ≤50%.
Select additional soil experiments were conducted with a Vocus proton transfer reaction time-of-flight mass spectrometer (PTR-TOF; TOFWERK, AG, Thun, Switzerland)67 coupled to a custom-built gas chromatograph (GC; Aerodyne Research, Inc., Billerica, MA, USA)68. The GC contains an integrated two-stage adsorbent-based thermal desorption preconcentration system for in situ collection of VOCs prior to separation on the chromatographic column. For preconcentration, a multi-bed (Tenax TA/Graphitized Carbon/Carboxen 1000; Markes International) sorbent tube was used for the first stage of sample collection, this tube is then subjected to a post-collection water purge before the sample was thermally desorbed to a multi-bed focusing trap (Tenax, Carbopack X, Carboxen 1003; Markes International) prior to injection onto the GC column. For this study, the GC was equipped with a 30-m Rxi-624 column (Restek, 0.25 mm ID, 1.4 µm film thickness) which resolves non- to mid-polarity VOCs in the C5 - C12 volatility range prior to ionization in the PTR detector. The GC-PTR sampled from in-situ soil gas probes69 on an alternating timed schedule. The GC-PTR can speciate structural isomers and help identify some unknowns by matching to calibrated retention times.
3.3.1 CO2 data analysis
CO2 fluxes and their isotopic composition were calculated based on data from the Picarro isotope analyzer. Fluxes were calculated with linear and exponential models, fitted to each individual chamber measurement. A deadband of 30 s was used for each measurement to allow for mixing in the chamber. The linear models were calculated based on the first 120 s, for the exponential model the full closure period of 6.5 min was used. Fluxes were quality controlled visually. Fluxes based on the exponential fit were used preferentially but replaced by the linear-fit flux where necessary. The isotopic composition was calculated based on the individual efflux rates of the 12C-CO2 and 13C-CO2 isotopologue. These are reported separately by the gas analyzer, and linear fits based on the first 120 s were used to calculate efflux rates. Due to the high enrichment of the labeled soil CO2 efflux, this method was found to be more reliable compared to conventional Keeling-plot approaches. The isotopic composition of the CO2 efflux was then calculated from the ratios of 12C-CO2 and 13C-CO2 efflux rates and normalized to the Vienna Pee Dee Belemnite (VPDB) scale (δ13CCO2 = ((13C/12C)CO2)/((13C/12C)VPDB) – 1). C isotope fluxes were quality controlled visually for each individual chamber and outliers removed manually.
13C-CO2 emitted from chambers that received 13C1-pyruvate is formed as C is decarboxylated via pyruvate dehydrogenase (PDH) to form acetyl-CoA or via an alternate decarboxylation reaction leading to biosynthesis of compounds (e.g., biomass, secondary metabolites, VOCs), while 13C-CO2 emitted from chambers that received 13C2-pyruvate is primarily formed during decarboxylation in the TCA cycle during energy production (Fig 1) (modified from25,26). Using this concept, we can approximate both total (ecosystem-wide flux of) and relative (proportion of total) C allocated to biosynthesis by calculating 13C-CO2-C1 - 13C-CO2-C2 (C1 - C2) and 13C-CO2-C1 / (13C-CO2-C1 + 13C-CO2-C2 ) (C1/[C2 + C2]), respectively, where 13C-CO2-C1 and 13C2-CO2-C2 are efflux from chambers that received 13C1-pyruvate or 13C1-pyruvate, respectively. To facilitate calculation of Bt and Br, fluxes across time were binned to 0-3 (3 hr), 3 - 6 (6hr), 6 - 12 (12hr), 12 - 18 (18 hr), 18 - 24 (24 hr), 24 - 30 (30 hr), 30 - 36 (36 hr), 36 - 42 (42 hr), and 42 - 48 (48 hr).
3.3.2 VOC data analysis
The isotopic composition of the VOCs flux rate was calculated by applying the linear model to calculate the rate of change in the fractional abundance of 13C (13C-VOC/[13C-VOC +12C-VOC ]). For each 6.5 min chamber measurement, a deadband of 30 s was used to allow for mixing in the chamber and the linear model was applied to the successive 10 data points collected at 10 s intervals. VOCs to examine were selected based on two criteria: 1) immediately downstream of pyruvate in the pyruvate metabolism KEGG pathway and 2) VOCs previously detected in soil emissions.
3.3.3 Cumulative fluxes and distribution of 13C from pyruvate
Cumulative 13C-CO2 and VOC effluxes from 0 to 48 h past 13C-pyruvate injection were calculated using the area under the curve (auc()) function within the flux R package 70. Based on the total amount of 13C-pyruvate added (100 mg, or 112 mmol), we could determine what percentage of total 13C ended up in CO2 or VOCs.
3.4 Soil sample collection and processing
For metagenomics, metatranscriptomics, and metabolomics, soil samples were collected just prior to 13C-pyruvate labeling (0 hr) then at 6 and 48 hr post labeling. For 0 hr collection, soil (~ 8 g) was collected directly outside of the stencil using a 2.25 cm diameter sterile metal ring pushed into the soil to 2 cm depth and placed into a sterile 50 mL tube. For 6 and 48 hr collections, soil samples were collected using the same method as for 0 hr, but inside the metal frame where pyruvate labeling occurred. After each soil collection, samples were immediately brought to the lab and allocated for different downstream analyses: 1 g for metabolomics stored at -20 °C and 2 g in Lifeguard Soil Preservation Solution (Qiagen) for DNA/RNA extractions stored at -80 °C.
3.4.1 DNA/RNA extractions
RNA and DNA were co-extracted from 1 g of soil using the RNeasy Powersoil Total RNA Kit (Qiagen) coupled with the RNeasy Powersoil DNA Elution Kit (Qiagen) following the manufacturer’s protocol and eluting in 100 μL of kit-provided elution buffer. RNA and DNA quality and concentrations were measured using a Qubit 4 Fluorometer (Thermo Fisher Scientific) and NanoDrop (Thermo Fisher Scientific). RNA was further treated with DNAse (DNAse Max, Qiagen) to remove any DNA contamination. Total RNA and DNA was sent to the Joint Genome Institute (JGI; Berkeley, CA) for library prep and sequencing.
3.4.2 Water extractions for metabolomics
Water extractions were performed on samples for nuclear magnetic resonance (NMR), followed by solid phase extraction (SPE) for Fourier-transform-ion-cyclotron-resonance MS (FTICR-MS). Water extraction procedures as previously described for bulk metabolite characterization were followed71–73. Briefly, 1 g of soil and 5 mL of double deionized (DDI) water were vortexed in a 15 mL centrifuge tube and sonicated for 2 hours; the sonication bath retained a temperature of 21 °C to ensure metabolite integrity. The 15 mL centrifuge tube was removed from the sonicator and centrifuged for 20 minutes. One mL of water-extractable organic C (WEOC) was removed and stored at -80 °C for NMR analysis at Pacific Northwest National Laboratory (PNNL) while 4 mL of WEOC was saved for downstream SPE.
SPE is an extraction technique used to clean and concentrate samples for the isolation and analysis of organic compounds74. Liquid samples are passed through a macroporous styrene-divinylbenzene crosslinked polymer that is able to retain polar organic compounds in water extracted solvents. For this process, Bond Elut PPL (Priority PolLutant) barrel cartridges, which were preactivated with methanol (MeOH), were used. The 4 mL of WEOC were acidified to pH 2 with 1M hydrochloric acid then passed through the sorbent with the vacuum set no higher than -5 pi, to allow compounds to absorb into the retention mechanisms. After the compounds of interest were retained, the cartridges were washed three times with 9 mL of 0.01M hydrochloric acid to remove all impurities. The cartridges were then dried with a pressure air hose for 2-3 minutes and rinsed with 1.5 mL of MeOH in a slow dropwise flow rate into 2 mL vials. The eluate was then capped, stored at -80 °C, and sent to PNNL for FTICR-MS analysis.
3.5 Metabolomics
3.5.1 NMR
1H-NMR bulk metabolite characterization was performed on soil water extracts. Samples (180 µL) were combined with 2,2-dimethyl-2-silapentane- 5-sulfonate-d6 (DSS-d6) in D2O (20 µL, 5 mM) and thoroughly mixed prior to transfer to 3 mm NMR tubes. Resonances corresponding to 13C labeling were identified by visual inspection, comparing labeled spectra to unlabeled spectra. Once differences were identified, the molecular location and quantification of 13C incorporation was determined by J-coupling pattern analysis and the ‘linefitting’ tool in Mestrenova, respectively. NMR spectra were acquired on a Bruker Avance III spectrometer operating at a field strength of 17.6 T (1H ν0 of 750.24 MHz) and equipped with a 5 mm Bruker TCI/CP HCN (inverse) cryoprobe with Z-gradient and at a regulated temperature of 298 K. The one-dimensional 1H spectra were acquired using a nuclear Overhauser effect spectroscopy (noesypr1d) pulse sequence. The 90° H pulse was calibrated prior to the measurement of each sample with a spectral width of 12 ppm and 1024 transients. The NOESY mixing time was 100 ms and the acquisition time was 4 s followed by a relaxation delay of 1.5 s during which presaturation of the water signal was applied. Time domain free induction decays (72114 total points) were zero filled to 131072 total points prior to Fourier transform, followed by exponential multiplication (0.3 Hz line-broadening), and semi-automatic multipoint smooth segments baseline correction. Chemical shifts were referenced to the 11H methyl or 13C signal in DSS-d6 at 0 ppm. The 1D 11H NMR spectra of all samples were processed, assigned, and analyzed using Chenomx NMR suite 9.2 (Chenomx Inc.; Edmonton, AB, Canada) with quantification of spectral intensities of compounds in the Chenomx library relative to the internal standard. Candidate metabolites present in each of the complex mixtures were determined by matching chemical shift, J-coupling, and intensity information of the experimental signals against signals of the standard metabolites in the Chenomx library. Signal to noise ratios (S/N) were measured using MestReNova v 14.2.014 with the limit of quantification equal to a S/N of 10 and the limit of detection equal to a S/N of 3. Standard 2-D experiments such as 11H / 13C - heteronuclear correlation (HSQC) experiments or 2-D 11H/ 11H Total Correlation spectroscopy (TOCSY) experiments further aided corroboration of several metabolite identifications where there was sufficient S/N.
3.5.2 FTICR
A 12 Tesla Bruker FTICR mass spectrometer (MS) was used to collect high-resolution mass spectra of WEOC by direct injection for secondary metabolite characterization. Approximately 100 µL of water extract was mixed with methanol (1:2) before injection onto the mass spectrometer to enhance ionization. A standard Bruker electrospray ionization (ESI) source was used to generate negatively charged molecular ions. Samples were introduced directly to the ESI source. The instrument settings were optimized by tuning on a Suwannee River fulvic acid standard, purchased from the International Humic Substances Society. Blanks (HPLC grade methanol) were analyzed at the beginning and end of the day to monitor potential carry over from one sample to another. The instrument was flushed between samples using a mixture of water and methanol. The ion accumulation time varied to account for differences in C concentration between samples. One hundred and forty-four individual scans were averaged for each sample and internally calibrated using an OM homologous series separated by 14 Da (CH2 groups). The mass measurement accuracy was <1 ppm for singly charged ions across a broad m/z range (100–1,000 m/z). The mass resolution was ∼240 K at 341 m/z. The transient was 0.8 s. Data Analysis software (BrukerDaltonik version 4.2) was used to convert raw spectra to a list of m/z values applying FTICR–MS peak picker module with a signal-to-noise ratio (S/N) threshold set to 7 and absolute intensity threshold to the default value of 100. Putative chemical formulae were then assigned using Formularity software75, as previously described76. Chemical formulae were assigned based on the following criteria: S/N > 7, mass measurement error <1 ppm, and taking into consideration the presence of C, H, O, N, S, and P and excluding other elements. To ensure consistent formula assignment and eliminate mass shifts that could impact formula assignment, all sample peak lists were aligned to each other. The following rules were implemented to further ensure consistent formula assignment: (a) picking formulae with the lowest error between predicted and observed m/z, and with the lowest number of heteroatoms, and (b) the assignment of one phosphorus atom requires the presence of at least four oxygen atoms. The chemical character of thousands of peaks in each sample's ESI FTICR–MS spectrum was evaluated using van Krevelen diagrams. Biochemical compound classes were reported as relative abundance values based on counts of C, H, and O for the following H:C and O:C ranges: lipids (0 < O:C ≤ 0.3 and 1.5 ≤ H:C ≤ 2.5), unsaturated hydrocarbons (0 ≤ O:C ≤ 0.125 and 0.8 ≤ H:C < 2.5), proteins (0.3 < O:C ≤ 0.55 and 1.5 ≤ H:C ≤ 2.3), amino sugars (0.55 < O:C ≤ 0.7 and 1.5 ≤ H:C ≤ 2.2), lignin (0.125 < O:C ≤ 0.65 and 0.8 ≤ H:C < 1.5), tannins (0.65 < O:C ≤ 1.1 and 0.8 ≤ H:C < 1.5), and condensed hydrocarbons (aromatics; 0 ≤ 200 O:C ≤ 0.95 and 0.2 ≤ H:C < 0.8)73. Analysis of FTICR data was performed using MetaboDirect (version 0.2.7)77 to create profiles of biochemical compound classes, and principal component analysis (PCA) plots.
3.6 Metatranscriptomics and metagenomics
3.6.1 Metagenomics
3.6.1.1 Metagenome libraries
Metagenome libraries were created as previously described78 by shearing 100 ng of genomic DNA to 300 bp using the Covaris LE220 instrument and size selected using TotalPure NGS beads (Omega Biotek). The Kapa-HyperPrep kit (Kapa Biosystems) was used to create an unamplified Illumina library, which was paired-end sequenced (2 x 150) on the Illumina NovaSeq 6000 platform using S4 flow cells.
3.6.1.2 Sequence processing
Sequence filtering, assembly, mapping, and annotation were performed at JGI as previously described78. Briefly, BBDuk (version 38.94), included in the BBtools package (https://sourceforge.net/projects/bbmap/), was used to trim reads that contained adapter sequences, homopolymers of G’s of the size 5 or more at the ends of reads, and reads where quality drops to 0. BBDuk was also used to remove reads that contained 1 or more ‘N’ bases, had an average quality score across the read less than 10, or had a minimum length <= 51 bp or 33% of the full read length. Reads mapped with BBMap (version 38.44), included in the BBtools package, to masked human, cat, dog, and mouse references at 93% identity common microbial contaminants were removed for downstream analysis. After filtering, each sample had an average of 136,442,170 ± 36,180,687 reads and 20.4 ± 5.4 Gb (Table S2). Filtered reads were error corrected prior to assembly using bbcms (version 38.90), included in the BBtools package, and assembled using metaSPAdes (version 3.15.2)79 with a minimum contig of 200 bp. Filtered reads were then mapped back to contigs using BBMap to obtain coverage information. Each sample had an average of 53.1 ± 6.6 % of filtered reads map to assembled contigs (Table S2). Feature prediction was next performed on assembled contigs by using tRNAscan-SE (version 2.0.6) to predict tRNAs, INFERNAL (version 1.1.3)80 to identify non-coding (nc) RNAs and rRNAs, CRT-CLI81 to identify CRISPR regions, and Prodigal (version 2.6.3)82 and GeneMarkS-2 (version 1.07)83 to identify protein-coding genes (CDSs). Functional annotation was performed by associating CDSs with Kegg Orthology (KO) terms. Gene copies per KO were calculated as the average coverage of the contigs each gene was predicted from, multiplied by the number of genes in the KO78.
3.6.2 Metatranscriptomics
3.6.2.1 Metatranscriptome libraries
rRNA was removed from 10 ng of total RNA using Qiagen FastSelect 5S/16S/23S for bacterial rRNA depletion (and additional FastSelect plant and/or yeast rRNA depletion) (Qiagen) with RNA blocking oligo technology. The fragmented and rRNA-depleted RNA is reverse transcribed to create first strand cDNA using Illumina TruSeq Stranded mRNA Library prep kit (Illumina) followed by the second strand cDNA synthesis which incorporates dUTP to quench the second strand during amplification. The double stranded cDNA fragments are then A-tailed and ligated to JGI dual indexed Y-adapters, followed with an enrichment of the library by 10 cycles of PCR. Metatranscriptomics of sample P35SSC1_191108_c were not completed due to sequencing issues.
3.6.2.2 Sequence processing
Sequence filtering, assembly, mapping, feature prediction, and annotation were performed by JGI as described above for metagenomics, except for the additional removal of ribosomal RNA during filtering and using MegaHit (version 1.2.9)84 for assembly. After filtering, each sample had an average of 91,405,261 ± 35,174,895 reads and 13.0 ± 5.0 Gb, and of these, an average of 73.6 ± 8.3 % mapped to assembled contigs (Table S2). Differential expression was performed in DESeq285 on gene copies of KOs, calculated as described above.
3.6.2.3 Weighted gene co-expression network analysis (WGCNA)
WGCNA was performed on metatranscriptomic gene expression data to identify modules of co-expressed genes. First, gene copy data was normalized in DESeq2 using the variance stabilization transformation (VST) method and samples were clustered to detect outliers, which removed only one sample (P15SSCI_191107_c). Next, the normalized data was analyzed using the WGCNA package in R86,87, with the following settings: soft-thresholding power of 6 (chosen based on where scale free topology model fit R2 = 0.8), minimum module size of 80, minimum KME of 0.35, and a ‘signed’ Topology Overlap Matrix (TOM). Pearson correlation coefficients (r) were calculated between module eigengenes, representing the first component of module, and the following: 1) condition (values for pre-drought and drought set to -1 and 1, respectively) and 2) 13C-enrichment of VOCs (acetate-C2 [from C2-13C-pyruvate]), acetone-C2, and diacetyl-C2) where values used were the average 12C/(12C + 13C) flux during a 2 hr window before (48 hr), after (0 hr), or before and after (6 hr +/- 1hr) pyruvate addition.
3.7 VOC cycling genes
Active metabolic pathways were determined by mapping KOs to KEGG pathways using the KEGG pathway mapper tool88. VOC cycling genes were chosen based on their being immediately up or downstream from the VOC in the metabolic pathway. Whether the gene was involved in production or consumption of VOC was assessed by whether it was positively or negatively correlated with VOC, respectively, or based on the literature. It should be noted that in some cases, genes can play a role in both the production and consumption of VOC, in which case we relied on the correlation as an indicator.
3.8 Statistical analysis
Statistical analyses of NMR data were performed in R (version 4.0.2). Principal component analysis (PCA) was performed on log-transformed NMR data using prcomp() and plotted with ggplot289. The Mann Whitney U-test was used to detect significant differences between pre-drought and drought conditions (data exhibited a non-normal distribution as determined after testing for normality using ggdensity(), ggqqplot() within the ggpubr package (0.4.0), and shapiro.test().
Correlations between 13C-VOCs and gene expression were performed by calculating Spearman correlation coefficients between flux of 13C-enriched VOCs and variance stabilization transformation (VST)-normalized gene copy data. For VOC and CO2 data, Mann Whitney U-tests were performed between emissions data averaged across all time points between pre-drought and drought conditions. For WGCNA eigengene expression, Kruskal Wallace and Dunn tests were performed to calculate significant differences between pairwise time points post pyruvate addition. All p-values reported (Spearman and Pearson correlations, Mann Whitney U-tests) have been corrected for false discovery rate (FDR) if making multiple comparisons. Enriched KEGG metabolic pathways within each module were calculated using the ClusterProfiler package in R90, with the KEGG reference database set to “ko”.