Meta-pangenomics Reveals Depth-dependent Shifts in Metabolic Potential for the Ubiquitous Marine Bacterial SAR324 Lineage
Background
Oceanic microbiomes play a pivotal role in the global carbon cycle and are central to the transformation and recycling of carbon and energy in the ocean’s interior. SAR324 is a ubiquitous but poorly understood uncultivated clade of Deltaproteobacteria that inhabits the entire water column, from ocean surface waters to its deep interior. Although some progress has been made in elucidating potential metabolic traits of SAR324 in the dark ocean, very little is known about the ecology and the metabolic capabilities of this group in the euphotic and twilight zones. To investigate the comparative genomics, ecology and physiological potential of the SAR324 clade, we examined the distribution and variability of key genomic features and metabolic pathways in this group from surface waters to the abyss in the North Pacific Subtropical Gyre, one of the largest biomes on Earth.
Results
We leveraged a pangenomic ecological approach, combining spatio-temporally resolved single amplified genome, metagenomic and metatranscriptomic datasets. The data revealed substantial genomic diversity throughout the SAR324 clade, with distinct depth and temporal distributions that clearly differentiated ecotypes. Phylogenomic subclade delineation, environmental distributions, genomic feature similarities, and metabolic capacities revealed congruent groups that, when merged, form Operational Ecogenomic Units (OEUs). The four SAR324 OEUs delineated in this study diverged strikingly from one another in their habitat-specific metabolic potentials. The OEUs living in the dark or twilight ocean shared genomic features and metabolic capabilities consistent with a sulfur-based chemolithoautotrophic lifestyle. In contrast, those inhabiting the sunlit ocean displayed higher plasticity in energy-related metabolic pathways, supporting a presumptive photoheterotrophic lifestyle. In epipelagic SAR324 OEUs, we observed two types of proton-pumping rhodopsins, as well as genomic, transcriptomic, and ecological evidence for active photoheterotrophy based on xanthorhodopsin-like light-harvesting proteins.
Conclusions
Our approach, combining pangenomic and multi-omics profiling, revealed a striking divergence in the vertical distribution, genomic composition, metabolic potential, and predicted lifestyle strategies of geographically co-located members of the SAR324 bacterial clade. The results highlight the utility of pangenomic approaches applied across environmental gradients to decipher the properties of, and the variation in function and ecological traits among, specific phylogenetic clades within complex microbiomes.
Figure 1
Figure 2
Figure 3
Figure 4
Figure 5
This is a list of supplementary files associated with this preprint.
Supplementary Table 1. Features of genomes used in this study
Supplementary Table 2. Links to repository of SAGs isolated from this study
Supplementary Figure 1. Venn diagrams of SAR324 genes pooled either by subclade (A) or by ecotype (B). The total number of genes shared among the genomes constituting each subclade or ecotype is displayed in italic grey below the number of genes unique to that subclade or ecotype. Venn diagrams were generated with http://bioinformatics.psb.ugent.be/webtools/Venn
Supplementary Figure 2. Depth distribution of SAR324 average coverage in the samples from which the SAGs were isolated (black) and in the HOT time-series (grey). Depth distributions of physical and chemical parameters for each month of 2016 are displayed in red and the average profile in blue. T: temperature, S: salinity, O: oxygen concentration, NO2+NO3: nitrite and nitrate concentration, P: phosphorus concentration.
Supplementary Figure 3. Depth distribution of SAR324 ABC-transporter at Station ALOHA.
Supplementary Figure 4. Placement of SAR324 16S rRNA coding genes (red) into the SILVA 132 reference tree. Subclades defined by ANI in this study are indicated by the inner brackets; outer brackets denote the classification in the SILVA database. 16S rDNA sequences retrieved from population genomes were aligned using SINA and placed into the reference tree using ARB_add_by_parcimony as implemented in the ARB software. Genes are identified as follows: Population genome identifier (this study) | GenBank assemblies (GCA) identifier | genome description
Supplementary Figure 5. Comparison between the phylogenetic tree of 16S rRNA coding genes (left) and the ANI classification (right). The phylogenetic tree of 16S rDNA genes was inferred from a MUSCLE alignment using Maximum Likelihood and the General Time Reversible model in MEGA X.
Supplementary Figure 6. Placement of SAR324 rhodopsin genes (red) into the MicRhoDE reference tree. Rhodopsin protein sequences retrieved from SAR324 population genomes were aligned to the MicRhoDE reference alignment using MAFFT --addfragments and back-translated using pal2nal before being placed into the reference tree using ARB_add_by_parcimony as implemented in the ARB software. Genes are identified as follows: Population genome identifier | orthogroups cluster identifier | gene identifier
Supplementary Figure 7. Protein alignment of rhodopsins identified in SAR324 population genomes and their closest relatives retrieved from HOT time-series metagenomes. Amino acid residue motifs involved in ion pumping, opsin fixation and spectral tuning are highlighted by black rectangles. Proteorhodopsin-like sequences are displayed in blue and xanthorhodopsin-like sequences in green. Amino acid residues are colored according to their properties and conservation (ClustalX). The consensus sequence and colored bars of sequence conservation were created using SnapGene Viewer v.4.2.6. Secondary and tertiary structures of the rhodopsins were predicted using the RaptorX server.
Supplementary Figure 8. Synteny map of the genic neighborhood of xanthorhodopsin-like coding genes retrieved from SAR324 genomes. The target gene is displayed in red, enzyme-coding genes in orange, hypothetical enzyme-coding genes in yellow, transporter-coding genes in green, other protein-coding genes in grey and hypothetical coding genes in white. tRNAs are shown as black bars. Contigs are identified as follows: Population genome identifier - GenBank assemblies (GCA) identifier (contig identifier). Name backgrounds are colored according to subclade.
Supplementary Figure 9. Synteny map of the genic neighborhood of proteorhodopsin-like coding genes retrieved from SAR324 genomes. The target gene is displayed in red, enzyme-coding genes in orange, hypothetical enzyme-coding genes in yellow, transporter-coding genes in green, other protein-coding genes in grey and hypothetical coding genes in white. tRNAs are shown as black bars. Contigs are identified as follows: Population genome identifier - GenBank assemblies (GCA) identifier (contig identifier). Name backgrounds are colored according to subclade.
Supplementary Figure 10. Phylogenetic reconstruction of SAR324 RuBisCO types. RuBisCO protein sequences retrieved from SAR324 population genomes (in red) were aligned to the type references from Tabita et al. 2007 using MUSCLE. The tree was reconstructed by Neighbor Joining using a Poisson model with 1000 bootstrap replicates.
Supplementary Figure 11. Box plot of GC percentage of SAR324 OEUs.
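As an illustration of the quantity summarized in Supplementary Figure 11, the sketch below computes GC percentage per genome and groups the values by OEU. It is a minimal example under stated assumptions, not the workflow used in the study: the function names (gc_percent, gc_by_group), the group labels (OEU_A, OEU_B) and the toy sequences are all hypothetical, and ambiguous bases are simply excluded from the denominator.

```python
from collections import defaultdict
from statistics import median


def gc_percent(sequence: str) -> float:
    """Return the GC content (%) of a nucleotide sequence.

    Assumption: only unambiguous A/C/G/T bases are counted, so
    ambiguous symbols such as N do not affect the percentage.
    """
    seq = sequence.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    if total == 0:
        raise ValueError("sequence contains no unambiguous A/C/G/T bases")
    return 100.0 * gc / total


def gc_by_group(genomes):
    """Group per-genome GC percentages by a label such as an OEU name.

    genomes: iterable of (group_label, sequence) pairs.
    Returns a dict mapping each label to a list of GC percentages,
    which is the input a box plot per group would need.
    """
    grouped = defaultdict(list)
    for label, seq in genomes:
        grouped[label].append(gc_percent(seq))
    return dict(grouped)


# Hypothetical toy data; real input would be full genome assemblies.
toy_genomes = [
    ("OEU_A", "ATGCGC"),
    ("OEU_A", "ATATGC"),
    ("OEU_B", "GGCCNN"),
]

# Median GC percentage per group (the center line of each box).
summary = {group: median(values) for group, values in gc_by_group(toy_genomes).items()}
```

The per-group lists returned by gc_by_group could be passed directly to any plotting library to draw the box plot; the median is computed here only to keep the example dependency-free.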
Posted 23 Feb, 2021
On 21 Feb, 2021
Invitations sent on 21 Feb, 2021
Received 21 Feb, 2021
On 15 Feb, 2021
On 11 Feb, 2021
On 10 Feb, 2021
On 08 Feb, 2021