SARS-CoV-2 infection modulates the community composition and diversity of nasopharyngeal microbiomes
To shed light on the effects of SARS-CoV-2 infections in the diversity and composition of human nasal microbiome, we analyzed RNA-Seq data of nasopharyngeal samples of randomly selected Healthy human, COVID-19 and Recovered patients. The alpha-diversity (within sample diversity through Shannon and Simpson indices) assessment identified significant differences in microbial species richness among the COVID-19, Recovered and Healthy controls regardless of the method used to tabulate microbial abundances i.e., either PathoScope 2.0 (PS) or MG-RAST 4.13 (MR), showing higher diversity in the microbial niche of COVID-19 samples (COVID-19, p = 0.0065; Recovered, p = 0.0105; Healthy, p = 0.0401; Kruskal-Wallis rank sum test) (Fig. 1A). The beta diversity (between sample or metagenome diversity) through principal coordinate analysis (PCoA), as measured on the Bray-Curtis distance method using PS and MR data, showed distinct discrimination across the metagenomes and separated samples by microbial population structure (Fig. 1B). Therefore, significant variation in microbiome diversity and composition across these three metagenomes (p = 0.0059, Kruskal-Wallis test) was evident. The diversity of microbiomes between samples (PCoA) however did not vary significantly (p = 0.651, Kruskal–Wallis rank sum test) according to the gender (male or female) of the study population (Fig. 1B).
Microbiome composition at the domain level was numerically dominated by bacteria, with a relative abundance of 81.58%, followed by viruses (18.22%) and archaea (0.20%) (Data S1). The unique and shared distribution of microbes found in the three participant groups are represented by comprehensive Venn diagrams (Fig. 2). Our analyses detected 532 bacterial genera, including 486, 421 and 509 in COVID-19, Recovered and Healthy nasopharyngeal samples, respectively, and of them, 72.74% genera were common in three sample groups (Fig. 2A, Data S1). Notably, we detected 2281 bacterial species through PS analysis, of which 919, 676 and 1477 species were found in COVID-19, Recovered, and Healthy samples, respectively (Fig. 2B, Data S1). In this study, compared to COVID-19 and Recovered samples, the Healthy samples had unique or sole association of 1166 bacterial species which underwent dysbiosis during the pathogenesis and subsequent recovery phase of COVID-19 (Data S2). Of the identified bacterial species, 67.27% and 76.78% had sole association in COVID-19 and Recovered samples, respectively indicating their opportunistic inclusion in COVID-19 patients and re-establishment beneficial commensal flora with the recovery of SARS-CoV-2 infections (Fig. 2B, Table 1, Data S2).
The MR pipeline detected 55 viral and 48 archaeal genera in three metagenomes, and of them, 16.37% viral (Fig. 2C), and 27.08% archaeal (Fig. 2D) genera were found to be shared among these metagenomes. By comparing these genera across the sample categories, we identified 35, 31 and 23 viral, and 42, 20 and 35 archaeal genera in COVID-19, Recovered and Healthy-swabs, respectively (Table S1). We found that 5.45% and 16.36% viral (Fig. 2B), and 37.5%, and 8.33% archaeal genera in Healthy and Recovered samples, respectively shared with those of COVID-19 samples. The COVID-19 patients had sole association of 14 (25.45%) viral, and 7 (14.58%) viral genera (Fig. 2C-D, Table S1, Data S1). Moreover, 213, 52 and 71 viral species (bacteriophages mostly) were identified in COVID-19, Recovered and Healthy metagenomes, respectively. Among the detected viral species, 97.65% and 98.07% had sole association with COVID-19 and Recovered samples, respectively (Data S2).
SARS-CoV-2 infection reduces the diversity and composition of nasopharyngeal commensal bacteria
The present microbiome study demonstrated that both the composition and the relative abundances of bacterial taxa at the phylum, order, family, genus, and species-level differed significantly (p = 0.031, Kruskal Wallis test) among COVID-19, Recovered and Healthy controls. Firmicutes was found to be the most predominant bacterial phylum in COVID-19 and Recovered metagenomes with a relative abundance of 76.31% and 39.28%, respectively (Data S1). The other predominant bacterial phyla were Bacteroidetes (7.07%), Proteobacteria (6.85%), Fusobacteria (5.37%), Actinobacteria (1.45%), and Cyanobacteria (1.23%) in COVID-19 metagenome, and Proteobacteria (27.42%), Actinobacteria (16.61%), Bacteroidetes (12.86%), and Cyanobacteria (1.71%) in Recovered metagenome (Data S1). On the other hand, Proteobacteria (50.19%), Bacteroidetes (41.60%), and Firmicutes (6.42%) were the most abundant phyla in Healthy control samples (Data S1).
We found significant differences (p = 0.021, Kruskal-Wallis test) in the diversity and relative abundance of the bacteria at the genus level. The individual and inter-individual microbiome variability showed that Healthy individuals had higher number of bacterial genera (average 318.57/person) compared to COVID-19 (234.50/patient) and Recovered (156.29/person) patients (Fig. S1A). However, the bacterial genera detected in the metagenome COVID-19 patient mapped to the highest number of reads per genus (476.95 reads/genus) compared to the healthy (278.89 reads/genus) and COVID-19 recovered (37.36 reads/genus) individuals (Fig. S1B). In COVID-19 metagenome, Streptococcus was the most abundant bacterial pathogen with a relative abundance of 37.16% followed by Veillonella (24.25%), Prevotella (4.97%), Staphylococcus (3.49%), Fusobacterium (2.89%), Clostridium (2.59%), Leptotrichia (2.26%), and Coprobacillus (2.24%) (Fig. 3, Data S1). Likewise, top abundant bacterial genera in the metagenome of recovered individuals were Staphylococcus (28.82%), Streptomyces (9.11%), Acinetobacter (8.96%), Corynebacterium (5.18%), Streptococcus (2.48%), and Helicobacter (2.42%) (Fig. 3, Data S1). Conversely, the Healthy control metagenome was predominated by Pedobacter (11.69%), Sphingobacterium (6.35%), Pseudomonas (5.03%), Enterobacter (4.79%), Flavobacterium (4.62%), Pseudoalteromonas (3.82%), Escherichia (3.50%), Exiguobacterium (3.38%), Shewanella (3.16%), Chryseobacterium (2.58%), Aeromonas (2.52%), Klebsiella (2.50%), and Vibrio (2.03%). The rest of the genera detected in these metagenomes had relatively lower abundances (<2.0%) (Fig. 3, Data S1).
We further investigated the species-level differences of microbial communities across these three metagenomes through PS analysis, which revealed significant variations (p = 0.011, Kruskal-Wallis test) in microbiome composition, diversity and relative abundances (Fig. 4, Table 1, Data S1 and S2). The COVID-19 metagenome was dominated by 73 species (7.94%) of Streptococcus genus while Clostridium, Chrysobacterium, Paenibacillus, Neissaria, Acinetobacter, Staphylococcus, and Corynebacterium genera were represented by 38, 26, 25, 23, 22, 22 and 17 different species, respectively. Similarly, the bacteriome of the Recovered patients was dominated by different species of Corynebacterium (39), Acinetobacter (36), Chryseobacterium (30), Sphingobacterium (26), Staphylococcus (24), Pseudomonas (18), and Flavobacterium (16) (Fig. 4, Table 1, Data S1-S2). In contrast, the bacteriome of the Healthy control was predominantly represented by 145 species (9.82%) of Pseudomonas genus followed by 49, 41, 38, 38, 36, 36, 33, and 26 species of Vibrio, Shewanella, Acinetobacter, Flavobacterium, Chryseobacterium, Enterobacter, Pseudoalteromonas, and Sphingobacterium genera, respectively (Fig. 4, Table 1). Remarkably, 78.94% (1166/1477) bacterial species were solely associated with the Healthy nasopharyngeal samples, and not detected in SARS-CoV-2 infected (COVID-19) and Recovered patients. Among these depleted commensal species Pseudomonas sp. LPH1 (25.32%), Brevundimonas sp. Bb-A (4.85%), P.oleovorans (3.25%), Pseudomonas sp. phDV1 (1.75%) and Brevundimonas sp. DS20 (1.38%) were top abundant (Fig. 4, Table 1, Data S1-S2). Conversely, the COVID-19 samples had sole association of 609 opportunistic bacterial species, and of them, Streptococcussalivarius K12 (19.13%), S. mitis (18.13%), Neisseria subflava (13.77%), Veillonella dispar (11.03%), Acinetobacter junii 64.5 (3.31%), V. parvula (2.35%), Prevotella melaninogenica (2.26%), S. parasanguinis (2.22%), Streptococcus sp. LPB0220 (1.98%), N. flavescens (1.26%), and V. atypica (1.11%) were top abundant (Fig. 4, Table 2, Data S2). Likewise, the Recovered patients of COVID-19 had sole association of 519 species with comparatively higher relative abundances of Pseudomonas stutzeri DSM 4166 (8.48%), Staphylococcus capitis (5.89%), S. epidermidis RP62A (5.13%), P. mendocina NK-01 (3.12%), Moraxellaosloensis A1920 (2.60%), A. indicus A648 (2.57%), Escherichia coli (2.41%), Sphingobacterium sp. G1-14 (1.96%), A. junii 64.5 (1.85%), S. pneumoniae (1.60%), Ralstonia pickettii (1.55%), Micrococcus luteus (1.46%), Rheinheimera sp. D18 (1.33%), Corynebacterium segmentosum (1.26%), Elizabethkingia anopheles (1.25%), and Cutibacterium acnes (1.08%). The rest of the species in these metagenomes had relatively lower (< 1.0%) abundances (Fig. 4, Table 1, Data S2).
SARS-CoV-2 infection associated dysbiosis of viruses and archaea in the nasopharyngeal tract
The COVID-19 and Recovered patients had higher number of viral genera and/or species compared to Healthy controls. The COVID-19 and Healthy samples were predominated by the genus of Betacoronavirus with a relative abundance of 97.95% and 83.57%, respectively. The other abundant viral genera in COVID-19 samples were Alphacoronavirus (0.90%) and Siphovirus (0.54%) while Alphacoronavirus (3.54%), T1-like viruses (3.25%), Cystovirus (3.23%), Siphovirus (1.91%), Myovirus (1.09%), Lambda-like viruses (0.78%), and N4-like viruses (0.72%) (Fig. 5, Data S1). The Recovered sample were mostly dominated by Alphacoronavirus (92.54%) followed by Betacoronavirus (1.23%), Siphovirus (1.16%), T1-like viruses (0.81%), Macavirus (0.67%), and SP6-like viruses (0.56%) (Fig. 5, Data S1). Most of these viral genera had lower relative abundances in Healthy controls. In addition, the SARS-CoV-2 was found as the predominantly abundant viral strain in COVID-19 (99.44%) patients whereas Human coronavirus NL63 remained as the top abundant viral strain in Recovered (90.89%) patients. The rest of the viral strains in three metagenomes were mostly different strains of bacteriophages (Data S1).
In this study, top abundant archaeal genera COVID-19 samples were Halogeometricum (19.57%), Haloquadratum (10.53%), Natrialba (8.06%), Methanosarcina (6.25%), Halorhabdus (5.76%), Methanocaldococcus (5.76%), Haloterrigena (5.59%), Methanobrevibacter (4.11%), Halorubrum (3.95%), Methanococcoides (3.95%), Methanococcus (2.96%), and Methanocorpusculum (2.96%). Correspondingly, Haloterrigena (36.47%), Methanocaldococcus (35.29%), Halogeometricum (6.47%), Thermococcus (5.29%), Haloquadratum (2.94%), Methanosarcina (2.35%), and Pyrococcus (2.80%) were the predominating archaeal genera in the Recovered metagenome (Fig. 6, Data S1). The Healthy samples, however, possessed Methanospirillum (15.34%), Methanoregula (10.58%), Methanocaldococcus (10.05%), Methanosarcina (9.00%), Thermococcus (7.94%), Haloterrigena (6.88%), Methanococcus (3.18%), Methanoculleus (3.18%), Methanosphaera (3.18%), Euryarchaeota (3.18%), Methanobrevibacter (2.65%), Sulfolobus (2.65%), and Haloquadratum (2.12%) genera with relatively higher abundances. The rest of the archaeal genera had lower relative abundances (<2.0%) across these metagenomes (Fig. 6, Data S1). Notably, seven archaeal genera (14.58%) had inclusion in COVID-19 metagenome compared to those of Healthy controls, and of them, Methanopyrus (0.82%) and Metallosphaera (0.66%) were predominantly abundant (Data S2). The Recovered and Healthy samples also had sole association of 2 (4.17%) and 3 (6.25%) archaeal genera, and the relative abundances of these genera remained much lower (0.59%) in both metagenomes (Data S2).
SARS-CoV-2 infection associated changes in genomic potentials of the microbiomes
In our current RNA-Seq data set, there was a broad variation in resistance to antibiotics and toxic compounds (RATCs) diversity and composition across three metagenomes (Fig. 7, Data S3). The categories and relative abundances of the RATC were significantly correlated (p = 0.027, Kruskal-Wallis test) with the relative abundance of the associated bacteria found in COVID-19, Recovered and Healthy control metagenomes (Data S3). MG-RAST identified 49 RATCs distributed in the bacterial genomes of the three metagenomes (Fig. 7, Data S3). The COVID-19 associated microbiomes harbored the 40 different RATC groups while 32 and 44 RATC gene families were detected in the microbiomes of Recovered and Healthy control samples. Of the detected RATCs, genes associated with biofilm formation in Staphylococcus had the highest relative abundances in Recovered samples (80.0%) followed by COVID-19 samples (45.95%), and Healthy controls (3.57%). Moreover, genes associated with acriflavin resistance were also predominantly abundant in COVID-19 (58.86%) and Recovered (66.67%) metagenomes. The other predominantly abundant RATC functional groups in COVID-19 metagenome were quorum sensing: autoinducer-2 synthesis (29.73%), multidrug resistance cluster; mdtABCD (21.0%), cobalt-zinc-cadmium resistance (20.64%), BlaR1 regulatory family (15.56%), multidrug resistance efflux pump; pmrA (11.39%), resistance to fluoroquinolones (10.83%), lsrACDBFGE operon (10.81%) and multidrug resistance efflux pumps (10.25%) (Fig. 7, Data S3). Simultaneously, the microbiomes of the Recovered samples harbored higher abundance of genes encoding biofilm adhesin biosynthesis (20.0%), cobalt-zinc-cadmium resistance (18.97%), multiple antibiotic resistance (MAR) locus (17.50%), methicillin resistance in Staphylococci (14.66%), macrolide-specific efflux protein; macA (13.33%), arsenic resistance (12.93%), multidrug resistance efflux pumps (12.93%), and YjgK cluster linked to biofilm formation (11.0%) (Fig. 7, Data S3). Conversely, beta-lactamase resistance (35.71%), multidrug resistance efflux pump; pmrA (29.10%), biofilm formation in Staphylococcus (22.73%), quorum sensing in Yersinia (21.43%) and Pseudomonas (17.86%), and teicoplanin-resistance in Staphylococcus (12.82%) were top abundant RATC functional groups in the microbiomes of the Healthy controls. The rest of RATC groups also varied in their expression levels across the three metagenomes, being more prevalent in the COVID-19 and Recovered microbiomes (Fig. 7, Data S3).
By examining the correlation between the different gene families of the same KEGG pathway for COVID-19, Recovered and Healthy controls microbiomes, we found significant differences (p = 0.034, Kruskal-Wallis test) in their relative compositions and abundances. Our analysis revealed that genes coding for pyruvate carboxylase (pyc) had several-fold overexpression in the microbiomes of the COVID-19 (26.31%) and Recovered (4.11%) groups compared to Healthy controls (1.80%) (Fig. 8A, Data S3). In addition, genes encoding for adherent junction (25.17%), tight junction (24.48%), environmental information processing (15.03%), carbohydrates metabolism (11.51%) and oxidative phosphorylation (7.26%) had higher relative abundances in the COVID-19 associated nasopharyngeal microbiomes (Fig. 8A, Data S3). The Recovered microbiomes however had higher relative abundances of genes coding for focal adhesion (53.08%), transport and catabolism (40.44%), cell adhesion molecules (40.00%), genetic information and processing (35.12%), lysosome activity (28.68%), endocytosis (27.13%), and cell-to-cell communication (20.38%) (Fig. 8A, Data S3). Conversely, the gene families associated with bacterial secretion system (40.09%), cell motility (39.83%), succinyl-CoA synthetase subunit C and D (34.35%), gap junction (27.27%), protein metabolism (19.49%), TCA cycle (18.72%) and citrate synthase (16.89%) remained overexpressed in Healthy nasopharyngeal microbiomes (Fig. 8A, Data S3).
We also sought to gain further insight into the SEED hierarchical protein functions represented by different genes, and found 37 statistically different (p = 0.013, Kruskal-Wallis test) SEED functions in COVID (COVID-19 and Recovered) and Healthy control metagenomes. Overall, the COVID-19 and Recovered samples associated microbiomes showed a higher relative abundance of these SEED functions compared to those of Healthy controls. For instance, cytokine-cytokine receptor interaction (50.0%), regulation of oxidative stress response (26.82%), phage integration and excision (24.92%), toxin-antitoxin regulation system (17.24%), protection from reactive oxygen species (14.96%), and phage regulation (6.06%) related gene expression remained more than two-times overexpressed in Healthy commensal microbiomes compared two COVID-19 and Recovered sample-associated microbiomes (Fig. 8B, Data S3). The COVID-19 microbiomes were enriched in genes coding for cell growth and death (55.60%), Streptococcuspyogenes virulence regulators (42.44%), fratricide in Streptococcus (32.67%), neuroactive ligand receptor interaction (28.21%), epithelial cell signaling (20.35%), clustering-based subsystems (14.65%), proteolytic pathways (9.2%), prophage lysogenic conversion modules (3.69%) and bacterial invasion of epithelial cells (1.93%) compared to rest of the two metagenomes (Fig. 8B, Data S3). The Recovered microbiomes however had a higher abundance of SEED functions involved in glutathione: non-redox reactions (30.30%) and redox cycle (6.06%), coagulase cascade (20.0%), osmotic stress (16.07%), membrane transport (14.55%), MT1-MMP pericellular network (10.0%), BarA-UvrY(SirA) two-component regulatory system (9.29%), and oxidative stress (5.54%). In contrast, most of these SEED modules had relatively lower abundance in Healthy sample microbiomes (Fig. 8B, Data S3).