Larval rearing and microbiome sampling
Water samples were taken from a commercial hatchery for microbiome analysis on days 1, 5, 8 and 12 of a grow-out trial where Pacific geoduck larvae (4 days post fertilization on day 1) were held at two pH levels (Additional File 1). Water was pumped into a commercial shellfish hatchery from ~100 ft deep in Dabob Bay, WA and maintained at pH 7.1 or 8.2. The pH 8.2 treatment water was buffered with sodium bicarbonate, and CO2 aeration was used to maintain water at pH 7.1. Each pH treatment had its own header tank for treating water where pH was maintained; flow rate into the header tanks was 1.8 gallons per minute (gpm) and into each treatment tank was 0.6 gpm. Three replicate 200 L tanks (flow-through) were used for each pH treatment. Larvae were fed a mix of flagellates and diatoms (Tisochrysis lutea, Nannochloropsis oculata, Chaetoceros muelleri, Pavlova lutheri, Rhodomonas salina, Tetraselmis suecica) with 30,000 cells/mL of algae in the larval tanks at day 2 post-fertilization, 40,000 cells/mL by day 3, and 50,000 cells/mL by day 4 through the end of the experiment. Water (3.8 L) was collected from larval tank effluent filtered through serial Swinnex filters (Sigma-Aldrich, Saint Louis, MO) of 5, 0.7, and 0.22 µm. The 0.22 µm filters were folded and frozen in individual plastic bags at -80°C.
Metagenomics
DNA for library construction was isolated from 0.22 µm filters from a single tank from each pH treatment from days 5, 8, and 12 for pH 8.2 and days 1, 8, and 12 for pH 7.1. DNA was isolated with the DNeasy Blood and Tissue Kit (Qiagen, Hilden, Germany), following a modified version of the manufacturer’s Gram-Positive Bacteria protocol. Briefly, filters were unfolded and placed in plastic tubes (1.7 mL) and incubated with Buffer Al (400 µl) and proteinase K (50 µL) overnight at 56°C. The next morning, ethanol (100%, 400 µl) was added to each sample and samples were eluted in Buffer AE (50 µl). Samples were quantified on a Qubit 3.0 with the Qubit 1x dsDNA HS Assay Kit (Invitrogen, Carlsbad, CA) using 5 µl of sample.
Libraries were prepared for multiplexing using a Nextera DNA Flex Kit (Illumina, San Diego, CA) following the manufacturer’s protocol for DNA quantities <10 ng, with the following adjustments. Polymerase chain reaction (PCR) steps were performed in thin-walled PCR tubes (200 µl) with a PTC-200 thermalcylcer (MJ Research, Bio-Rad Laboratories, Hercules, CA); magnetic separations were performed in tubes (1.7 mL) using a DynaMag2 (Invitrogen). Libraries were quantified on a Qubit 3.0 with the Qubit 1x dsDNA HS Assay Kit (Invitrogen). Library quality was assessed on a 2100 Bioanalyzer with a High Sensitivity DNA Kit (Agilent, Santa Clara, CA). The average fragment size was ~530 bp. The six metagenome libraries were pooled and sequenced on an Illumina HiSeqX System with a 150 bp paired-end read length.
Sequencing reads were quality trimmed using TrimGalore (0.4.5; [12]). Reads were subjected to two rounds of trimming; initial quality and adapter trimming, followed by removing 20 bp from the 5’ end of each read. Pre- and post-trimming reads were assessed using FastQC (0.11.7;[13]) and MulitQC (1.6.dev0; [14]).
Gene prediction with the trimmed sequencing reads was performed using MetaGeneMark (v3.38; [15]) to produce a translated metagenome to create a database for metaproteomic protein inference.
Trimmed sequencing reads were functionally annotated with a combination of DIAMOND BLASTx (v 0.9.29; [16]) and MEGAN6 (v 6.18.3; [16]). Annotation with DIAMOND BLASTx [17] was run against NCBI nr database (downloaded September 25, 2019). The resulting DAA files were processed for importing into MEGAN6 with the add-meganizer tool using the following MEGAN6 mapping files: prot_acc2tax-Jul2019X1.abin, acc2interpro-Jul2019X.abin, acc2eggnog-Jul2019X.abin. This work was facilitated through the use of advanced computational, storage, and networking infrastructure provided by the Hyak supercomputer system at the University of Washington.
To perform taxonomic classification, meganized DAA files were imported and converted to RMA6 files via the MEGAN6 graphical user interface “Import from BLAST” dialog, using the default Naive LCA settings and the following mapping files: prot_acc2tax-Jul2019X1.abin, acc2interpro-Jul2019X.abin, acc2eggnog-Jul2019X.abin, acc2seed-May2015xx.abin. All mapping files were downloaded from the MEGAN6 website (https://software-ab.informatik.uni-tuebingen.de/download/megan6/old.html).
Diversity index (Shannon-Weaver and Simpson-Reciprocal) values were determined using MEGAN6.
Metaproteomics
Filters (0.22um) from two separate tanks at each pH treatment and all four time points (days 1, 5, 8, 12) (n=8) were opened in a cell culture dish on ice. Four washes of ice cold 50 mM NH4HCO3 (1 mL) were used to rinse the cells from the filters. The wash containing cells was centrifuged at 10,0000 rpm for 30 minutes. Liquid was removed from the tubes, leaving pelleted cells. Pellets were resuspended in 50 µm NH4HCO3 with 6M urea (100 µl). Each sample was sonicated three times (Sonic Dismembrator Model 100, Fisher Scientific; power set between 1-2) and chilled between sonications. Protein digestion and peptide desalting followed the protocol outlined in [18].
Metaproteomics data were acquired on a Q-Exactive (ThermoFisher, Waltham, MA). Both the pre-column (3.5 cm) and analytical column (30 cm) were packed in-house with Dr. Maisch (Ammerbuch, Germany) C18 3 µm beads. Peptide spectra were collected from 5 µl injections of 1µg total peptides in triplicate over a 60 minute gradient of 5-35% acetonitrile. Samples were run in randomized groups of four with a blank injection in between each group and a quality control standard (Pierce Peptide Retention Time Calibration mix + bovine serum albumin, ThermoFisher) after every second group of four. Proteomics data can be found in the ProteomExchange PRIDE repository under the accession number PXD020692, username: [email protected], password: DHXIRp02 [19].
Raw mass spectrometry files were searched against a database that included the translated metagenome (see above) and common lab contaminants (cRAPome.org) with Comet 2018.01 rev.2 [20,21]. After Comet, the TransProteomic Pipeline (TPP) was run to statistically score confident peptide spectra using Peptide prophet and infer proteins with Protein Prophet [22,23]. Results from the TPP were analyzed with Abacus with an FDR cut-off of 0.01 (combined file ProteinProphet probability of 0.93) to generate consensus peptide assignments and protein assignments across the entire experimental dataset (Additional File 2; [24]). Only proteins with at least 2 unique peptides identified across all experimental replicates were included in downstream analyses. Technical replicates were averaged for the analysis. Normalized spectral abundance (NSAF) values, calculated by Abacus, were used to analyze metaproteomic data across time and pH treatments with non-metric multidimensional scaling plot (NMDS) based on a Bray-Curtis dissimilarity matrix and analysis of similarity (ANOSIM) with the vegan package [25] in R v 3.6.3.
The change in detected taxa and Gene Ontology (GO) terms (i.e. abundance of peptides associated with a specific taxonomic group or GO term) were plotted in R using the peptide spectral matches (PSMs) or reads for a specific taxa/GO term normalized to all PSMs or reads for a mass spec run/library. The first time point in the plots was set to 0 and for a subsequent time point (n) were [ratio of reads or PSMs]n-[ratio of reads or PSMs]n-1. The ratios of PSMs versus the ratios of reads were plotted on an x-y plot with a linear model line of best fit and corresponding R2 value to determine how well reads and PSMs corresponded at each day and pH.
For comparison of statistically significant functional and taxonomic differences between pH treatments at each time point, metaGOmics [26] was used to compare proteins inferred between pH and between days within each pH treatment. All Comet search results (see above) were analyzed together with percolator [27] and the resulting file was parsed into individual mass spectrometry experiment files. Peptide spectral matches for each peptide were averaged within a day and pH treatment so that a single results file for each day/pH was uploaded to metaGOmics. The background proteome for metaGOmics was the metagenome-derived proteins inferred in the Comet-to-Percolator pipeline. Only prokaryotic sequences were used for the analysis. The portal for the metaGOmics analysis can be found here: https://www.yeastrc.org/metagomics/viewUploadedFasta.do?uid=QCqX3ZlvD4KkDZ2B. Taxonomic and GO results were considered significant if the metaGOmics Laplace corrected q-value was less than or equal to 0.05.