Our strategy for examining species-specific transcription over diel cycles while accounting for viral activity is shown in Supplementary Fig. 1. To begin, we retrieved metatranscriptomic reads from 44 time points sampled during a cruise in the North Pacific Subtropical Gyre (NPSG) between July 25 and August 5, 2015. Because the transcripts were derived from the entire bacterial community, and our focus is on species-specific metabolic processes and their relationship to viral activity, we mapped the reads to individual species in the Station ALOHA Gene Catalogue. To compare and contrast lifestyles and metabolic capacity in natural populations that shift with diel cycles, we selected genes from one dominant photoautotroph (Prochlorococcus marinus) and one dominant heterotroph (Candidatus Pelagibacter ubique). These data were used to create transcript abundance tables for each species, which were then used in network analyses to find co-expressed genes (Supplementary Dataset 2 and Supplementary Dataset 3).
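As an illustration of the tabulation step, the sketch below builds one abundance table per species from mapped-read records. The record layout, the time point labels, and the function name `build_abundance_tables` are assumptions made for this example and are not taken from the original pipeline.

```python
from collections import defaultdict

# Hypothetical mapped-read records: (species, gene, time_point, read_count).
records = [
    ("Prochlorococcus marinus", "psbA", "T01", 120),
    ("Prochlorococcus marinus", "psbA", "T02", 30),
    ("Prochlorococcus marinus", "atpA", "T01", 10),
    ("Ca. Pelagibacter ubique", "groEL", "T01", 55),
    ("Ca. Pelagibacter ubique", "groEL", "T02", 60),
    ("Prochlorococcus marinus", "psbA", "T01", 5),  # same gene and time point again
]

def build_abundance_tables(records):
    """Return {species: {gene: {time_point: summed_count}}}."""
    tables = defaultdict(lambda: defaultdict(lambda: defaultdict(int)))
    for species, gene, time_point, count in records:
        # Counts for the same species, gene, and time point are summed.
        tables[species][gene][time_point] += count
    return tables

tables = build_abundance_tables(records)
pro = tables["Prochlorococcus marinus"]
pel = tables["Ca. Pelagibacter ubique"]
```

Each species ends up with its own gene-by-time-point table, so downstream co-expression analysis never mixes genes from the two organisms.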
The resulting gene modules (sets of co-expressed genes) were then linked to diel cycles and viral activity using Dynamic Bayesian Network analyses (Fig. 1). Data on viral activity for Prochlorococcus marinus or Candidatus Pelagibacter ubique were derived from prior work 5 by summing the normalized gene counts for viruses linked to each host. Data on diel cycles were derived from surface photosynthetically active radiation (PAR) values measured during sampling. This novel pipeline allowed us to examine species-specific metabolic functions, in conjunction with viral activity, that may be overlooked in a community-level analysis. Moreover, by separating the metabolic activity of photoautotrophic and heterotrophic populations, we can examine how their metabolic activity is synchronized in natural communities and with viral activity.
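The viral activity signal described above is a per-time-point sum of normalized counts over viral genes assigned to a host. A minimal sketch follows, assuming each viral gene carries a host label and a list of normalized counts; the gene names, values, and the function name `viral_activity` are hypothetical.

```python
# Hypothetical normalized counts for viral genes, each tagged with its putative host.
viral_genes = [
    {"gene": "phage_gene_1", "host": "Prochlorococcus marinus", "counts": [0.5, 2.0, 4.0]},
    {"gene": "phage_gene_2", "host": "Prochlorococcus marinus", "counts": [0.25, 1.0, 2.5]},
    {"gene": "phage_gene_3", "host": "Ca. Pelagibacter ubique", "counts": [1.0, 1.5, 1.25]},
]

def viral_activity(viral_genes, host):
    """Sum normalized counts across all viral genes linked to `host`, per time point."""
    selected = [g["counts"] for g in viral_genes if g["host"] == host]
    # zip(*selected) walks the time points in parallel across the selected genes.
    return [sum(values) for values in zip(*selected)]

pro_activity = viral_activity(viral_genes, "Prochlorococcus marinus")
pel_activity = viral_activity(viral_genes, "Ca. Pelagibacter ubique")
```

The result is one time series per host, which can then enter the network as a single "viral activity" node.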
Diel patterns in transcription activity for dominant microbes in the euphotic zone. To examine species-specific gene modules in the metatranscriptomic data, we reduced the dataset to two dominant species (one photoautotroph and one heterotroph) to explore fine-scale patterns in gene co-expression related to diel cycles and viral activity. Specifically, we used weighted gene co-expression network analysis (WGCNA) to group genes into gene modules based on their co-expression patterns for Prochlorococcus marinus (a photoautotroph, Fig. 1a) and Candidatus Pelagibacter ubique (a heterotroph, Fig. 1b). The complete lists of KEGG-annotated genes for each module are shown in Supplementary Dataset 4 and Supplementary Dataset 5. Overall, 3 gene modules for Prochlorococcus marinus showed diel patterns (during the day, night/dawn, dusk/night, and dusk) and 4 for Candidatus Pelagibacter ubique (during the day, day/dusk, and night; see Supplementary Table 1). Each of these gene modules is described in detail below.
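To make the idea of grouping genes by co-expression concrete, the sketch below is a deliberately simplified stand-in for WGCNA, not the WGCNA procedure itself. It computes a soft-thresholded adjacency from Pearson correlations and groups genes whose adjacency exceeds a cutoff, whereas WGCNA proper uses topological overlap and hierarchical clustering. The signed adjacency form, the values of `beta` and `cutoff`, and the toy expression profiles are all assumptions chosen for illustration.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def coexpression_modules(expr, beta=6, cutoff=0.5):
    """Group genes whose signed soft-threshold adjacency is >= cutoff."""
    genes = sorted(expr)
    parent = {g: g for g in genes}  # union-find forest over genes

    def find(g):
        while parent[g] != g:
            parent[g] = parent[parent[g]]  # path compression
            g = parent[g]
        return g

    for i, g in enumerate(genes):
        for h in genes[i + 1:]:
            # Signed adjacency (assumed): anti-correlated genes get a value near 0.
            adjacency = ((1 + pearson(expr[g], expr[h])) / 2) ** beta
            if adjacency >= cutoff:
                parent[find(g)] = find(h)

    modules = {}
    for g in genes:
        modules.setdefault(find(g), []).append(g)
    return sorted(modules.values())

# Toy profiles over 8 time points: two day-peaking and two night-peaking genes.
expr = {
    "psbA": [1, 5, 9, 5, 1, 5, 9, 5],
    "chlH": [2, 6, 10, 6, 2, 6, 10, 6],
    "atpA": [9, 5, 1, 5, 9, 5, 1, 5],
    "tuf": [8, 4, 0, 4, 8, 4, 0, 4],
}
modules = coexpression_modules(expr)
```

In this toy case the day-peaking and night-peaking genes fall into separate groups, which mirrors the kind of diel separation the modules are meant to capture.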
Bayesian network analysis and dynamic dependencies on diel cycles and viral activity.
Next, we examined temporal and environmental dynamics in gene modules for each species (Prochlorococcus marinus and Ca. Pelagibacter ubique) using a Dynamic Bayesian Network (DBN) analysis. To gain insight into complex interactions between each species and the environment, we examined the dynamic dependencies of surface PAR and viral activity in the network. Because both species are key contributors to microbial dynamics at Station ALOHA, known interactions between taxa and genes can be used to validate our models. Moreover, the data can be used to build predictive models for dominant photoautotrophs and heterotrophs based on previous time points, and potentially to uncover new biological interactions related to diel cycles or viral activity.
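A DBN relates the state of each node to the state of nodes at earlier time points. The sketch below shows one common way to arrange a time series for that purpose, pairing every variable at t-1 with every variable at t. The one-step lag, the variable names, and the values are assumptions for illustration; the structure learning itself is not shown.

```python
def lagged_pairs(series):
    """Pair each variable's value at t-1 with every variable's value at t."""
    names = sorted(series)
    n_points = len(series[names[0]])
    rows = []
    for t in range(1, n_points):
        previous = {f"{name}[t-1]": series[name][t - 1] for name in names}
        current = {f"{name}[t]": series[name][t] for name in names}
        rows.append({**previous, **current})
    return rows

# Hypothetical values for two environmental nodes and one gene module.
series = {
    "surface_PAR": [0.0, 800.0, 1500.0, 200.0],
    "viral_activity": [0.75, 1.0, 3.0, 6.5],
    "p_module_1": [0.1, 0.6, 0.9, 0.3],
}
rows = lagged_pairs(series)
```

Each row then holds candidate parents (the `[t-1]` columns) and candidate children (the `[t]` columns), so a directed link such as surface PAR at t-1 to a module at t can be scored.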
To examine biological interactions related to diel cycles and viral activity in Prochlorococcus marinus and Ca. Pelagibacter ubique, we obtained an averaged network structure for each species independently (Fig. 2a, b) and for the two species combined (Fig. 2c). The resulting networks contain nodes that are either gene modules or environmental parameters (viral activity or surface PAR). Lines connecting the nodes indicate directed links between nodes at different time points, and the width of each line indicates the strength of its bootstrap support. Support was estimated from 10 repeats of 10-fold cross-validation, which yields one hundred networks. For example, a node connected to viral activity (red) or surface PAR (green) with strength = 1 shows that connection in 100% of the one hundred networks (Fig. 2).
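The strength value can be read as the fraction of learned networks in which a given directed link appears. A minimal sketch under that reading follows; the edge lists are toy data, and the function name `edge_strength` is hypothetical rather than part of any specific package.

```python
from collections import Counter

def edge_strength(networks):
    """Return the fraction of networks that contain each directed edge."""
    # set(net) ensures an edge is counted at most once per network.
    counts = Counter(edge for net in networks for edge in set(net))
    total = len(networks)
    return {edge: counts[edge] / total for edge in counts}

# Toy example: 100 networks, as from 10 repeats of 10-fold cross-validation.
par_edge = ("surface_PAR", "p_module_1")
virus_edge = ("viral_activity", "p_module_3")
networks = [[par_edge, virus_edge] if i < 71 else [par_edge] for i in range(100)]

strength = edge_strength(networks)
```

Here the first edge is present in every network (strength 1.0) and the second in 71 of 100 (strength 0.71), which is how a strong and a weaker link would be distinguished.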
Diel Patterns in Prochlorococcus marinus and Ca. Pelagibacter ubique
Prochlorococcus marinus has increased transcription of genes related to photosynthesis and protein folding during the day. We found a significant connection between p_module_1 for Prochlorococcus marinus and surface PAR in the DBN (Fig. 2a-day; strength = 1). This module is overexpressed during the day (Fig. 1a) and contains genes related to photosynthesis and protein biogenesis. Specifically, p_module_1 showed increased daytime transcription of genes related to chaperones and protein folding (clpC, ftsH, hflB), chlorophyll metabolism (chlH, chlE), RNA degradation (rnj), and photosynthesis (psbA, psbC, psaA; Supplementary Table 1). Prior work shows that photoautotrophs have increased transcription of genes related to light capture and protein synthesis with increased surface PAR 9–11, which is consistent with these findings. clpC has previously been shown to play an important role in oxidative stress 34, and could be important in preventing photooxidative damage to biological molecules such as proteins, lipids, and DNA, or more specifically to photosystem components. Interestingly, we also identified an uncharacterized putative membrane protein (K06890) in p_module_1 that is associated with COG0670 (YbhL), a gene that is known to interact with ftsH. These membrane proteins are essential for disassembly and oligomerization of protein complexes during protein biogenesis 35. These findings suggest that Prochlorococcus marinus has increased daytime transcription of genes related to photosynthesis, removal of oxygen free radicals that can harm photosystem components, and chlorophyll metabolism.
Prochlorococcus marinus has increased transcription of genes related to oxidative phosphorylation and protein synthesis at night. At night/dawn, Prochlorococcus marinus shows increased transcription in p_module_3 of genes for oxidative phosphorylation (atpA) and for ribosome and protein synthesis (pnp, rplB, rpsC, tuf; Supplementary Table 1). These data support the hypothesis that cyanobacteria alternate between photosynthesis during the day and protein synthesis during the night 19.
Ca. Pelagibacter ubique prevents protein damage, regulates transcription, and scavenges sulfur from DMSP during the day. Ca. Pelagibacter ubique is the most abundant heterotrophic bacterium in the ocean. In the daytime, this species has abundant transcripts in cp_module_1 (day/dusk), cp_module_2 (day), and cp_module_4 (day), which contain genes for RNA degradation and biosynthesis (groEL). cp_module_2 and cp_module_4 are associated with RNA degradation, chaperones, and folding (dnaK, ftsH; Supplementary Table 1). In the DBN, cp_module_1 and cp_module_4 both showed strong connections (strength = 1) with surface PAR in Ca. Pelagibacter ubique (Fig. 2b-day). Previous studies on Ca. Pelagibacter ubique indicate that groEL, dnaK, and ftsH are among the most abundant proteins in marine pelagic populations 36. Interestingly, these genes can assist in protein folding and proteolysis to prevent protein damage from exposure to environmental stresses 36. cp_module_2 also contains abundant transcripts for S-adenosyl-L-homocysteine hydrolase (ahcY). This enzyme acts on S-adenosyl-L-homocysteine, the by-product of S-adenosyl-L-methionine (SAM)-dependent methyltransferases 37; SAM in turn binds an important riboswitch in Ca. Pelagibacter ubique that regulates transcription 38. ahcY transcripts are co-expressed with the ABC transporter gene (proX) in cp_module_1 and cp_module_2. proX is involved in glycine betaine and proline betaine uptake systems 39 and allows marine bacteria to degrade dimethylsulfoniopropionate (DMSP) 40–42 to create volatile sulfur species such as dimethylsulfide (DMS). Thus, the upregulation of proX during the day may allow Ca. Pelagibacter ubique to scavenge reduced sulfur from DMSP 43–45 for cell growth, which results in the production of DMS, a compound that contributes to cloud formation and cooling. This species also appears to use gluconeogenesis during the day (aldB in cp_module_1) to metabolize pyruvate, its primary carbon source 46, and to generate energy via the TCA cycle (sdhA in cp_module_1).
Finally, cp_module_2 and cp_module_4 contain genes related to nitrogen uptake systems, including an ammonium transporter (amtB) and the branched-chain amino acid transport system (livK), that could help to support growth under N-limiting conditions in the open ocean.
Ca. Pelagibacter ubique transcribes genes for energy production at night. cp_module_6 in Ca. Pelagibacter ubique shows increased transcription overnight of genes related to oxidative phosphorylation (atpA, atpD), glycolysis and gluconeogenesis (aldB), and elongation factors (tuf; Supplementary Table 1). These findings suggest that ATP production by oxidative phosphorylation occurs at night and may play a role in breaking down compounds released by phytoplankton when they bloom and are lysed by viruses during the day. These findings are consistent with prior work showing that light does not influence growth rates in Ca. Pelagibacter ubique 47. Thus, Ca. Pelagibacter ubique replicates throughout the day by generating energy from 1) proteorhodopsin-based energy production via the proton motive force across the cell membrane during the day 48 and 2) increased oxidative phosphorylation due to carbon and nutrient breakdown at night (cp_module_6, Fig. 1b).
Viral Activity in Prochlorococcus marinus and Ca. Pelagibacter ubique
Viral infection in Prochlorococcus marinus occurs at dusk, redirecting host energy, nucleotide, and carbon metabolism. In Prochlorococcus marinus, p_module_6 is strongly connected to viral activity in the DBN (Fig. 2a-day; Supplementary Table 2) and contains genes that are known to be overexpressed in the host during viral infection 49. Genes in p_module_6 are overexpressed at dusk (Fig. 1a), which is consistent with prior studies of temporal viral activity 5. Interestingly, viral activity has a strong connection to itself during the day (Fig. 2a-day), and none of the gene modules has a significant association with viral activity at night (Fig. 2a-night), indicating that viral activity is highly time-dependent. p_module_6 contains genes related to carbon metabolism (talA), DNA replication and repair (recA), and nucleotide metabolism (nrdJ) that are encoded and expressed by cyanophage during infection 49. Previously, talA was shown to be the most highly expressed transcript in cultured marine cyanobacteria during viral infection. Phage expression of talA is thought to redirect carbon from the Calvin cycle in the host to the pentose phosphate pathway (PPP) for viral energy production 49. Interestingly, this metabolic shift simultaneously creates reducing power (via the PPP) and the carbon skeleton needed for nucleotide metabolism, which is a limiting resource for phage production 7,49. Further, p_module_6 contains nrdJ, which may be overexpressed to support nucleotide metabolism for phage production 7. Similarly, recA is thought to play a role in phage replication by repairing DNA damage from UV radiation 5. p_module_6 also contains genes related to hypermodification (glyA), which may protect bacteriophage DNA and prevent its degradation by the host (clpX) 50. These opposing genes may be involved in the molecular arms race between phage and their host during infection.
Finally, p_module_6 contains gltS, a GS-GOGAT component for nitrogen assimilation that was previously shown to peak in Prochlorococcus at dusk 5. Phage particles are composed of up to 41% extracellularly derived N 51. Therefore, transcription of this gene at dusk may be important for viral replication rather than for host nitrogen assimilation, as previously suggested 19. Overall, p_module_6 contains genes that may be overexpressed because of a viral takeover of host metabolism rather than host-driven processes.
In comparison, p_module_3 shows a weaker association with viral activity during the day (Fig. 2a-day; strength = 0.71). p_module_3 contains genes for oxidative phosphorylation and for ribosome and protein synthesis that are overexpressed at night (Fig. 1a) and may be important for host protein synthesis 52. The different network structures between day and night indicate that viral activity is driven by diel cycles in the photoautotrophic host, as previously described 5.
Gene modules linked to viral infection in Ca. Pelagibacter ubique. In Ca. Pelagibacter ubique, viral activity is strongly connected with cp_module_4 and cp_module_5 during the day (Fig. 2b-day; strength = 1 and 0.87, respectively), compared to cp_module_4 and cp_module_8 at night (Fig. 2b-night; Supplementary Table 2). These data indicate that viruses may have different strategies for infection during the day versus night based on changes in the metabolic processes of their hosts (Fig. 2b). Therefore, although the host does not exhibit differences in growth rates due to diel cycles 47, viral infection strategies change. Abundant genes in cp_module_4 fall into three categories: 1) RNA degradation and biogenesis (groEL and dnaK); 2) peptidases and inhibitors (ftsH, hflB, and hflK); and 3) ammonia transporters (amtB). These genes are also found in cp_module_2, where they are overexpressed during the day (Fig. 1b) but not linked to viral activity (Fig. 2b-day). Given that dnaK, groEL, and ftsH are abundant both in this host and in marine pelagic populations 36, they could be expressed by the host in cp_module_2 during the day and by phage in cp_module_4 during the day and night to bolster host fitness during infection. cp_module_5 includes genes related to glycolysis, gluconeogenesis, and pyruvate metabolism (aldB and lldD), the TCA cycle (sdhA and frdA), DNA repair and recombination (recA), and RNA degradation (dnaK and HSPA9). Interestingly, pyruvate for gluconeogenesis is a primary carbon source for SAR11 clades 53 and may help to drive replication using proteorhodopsin-based energy production 54 during the day. At night, genes in cp_module_8 are related to unknown proteins and aminoacyl-tRNA synthetases (thrS). Unknown genes in cp_module_8 require further study to elucidate their role in host metabolism and potentially in viral replication.
A combined network structure elucidates the interaction of each species over diel cycles and with viral infection. Microbes in the ocean do not exist in isolation, and instead are part of complex interactions both between species and with the environment. Thus, to mimic these interactions, we combined the two species in a unified model. Overall, our combined network (Fig. 2c) mirrored the structure of the networks for individual species but clarified the strength of connections between nodes. In contrast to other findings that growth rates in Ca. Pelagibacter ubique are not tied to diel patterns 47, we see that genes in cp_module_4 have strong connections with surface PAR (strength = 1, Fig. 2c). Specifically, genes related to preventing protein damage, regulating transcription, and scavenging sulfur from DMSP may have diel periodicity in natural Ca. Pelagibacter ubique populations. Interestingly, this same module is also closely connected with viral activity in both the day and night. In contrast, Prochlorococcus marinus has strong connections to viral activity during the day only. Viral diel patterns for Prochlorococcus marinus are related to gene co-expression patterns in photosynthesis, energy production, and nucleotide biosynthesis (in p_module_6). Further, viral activity at night in Ca. Pelagibacter ubique is driven both by genes in cp_module_4 and by viral activity in Prochlorococcus marinus. These data suggest that viral lysis of photoautotrophic bacteria helps to drive viral activity in heterotrophs like Ca. Pelagibacter ubique, likely due to increased growth rates from available nutrients. Thus, the differences between the individual-species and combined networks indicate that more information can be gained by considering both species as a community. Moreover, viruses appear to be a major driver of microbial community dynamics, on par with diel cycles.