Here, we enrolled 98 hospitalized patients with COVID-19 (mean age 33 years; 53% male) and 78 non-COVID-19 controls matched for gender and co-morbidities (mean age 48 years; 42% male, Table 1). All COVID-19 patients and non-COVID-19 controls had stool specimens sampled at inclusion. Blood specimens were additionally sampled for COVID-19 patients at admission to test for pro-inflammatory markers and white cell counts (Supplementary Table 1). Amongst the COVID-19 patients, 37 (38%) had serial faecal samples collected from hospitalization until after discharge (Supplementary Figure 1). We enriched both faecal RNA and DNA virions from a total of 277 faecal samples and performed non-targeted shotgun metagenomic sequencing on the RNA virome (mostly eukaryotic viruses) and DNA virome (mostly prokaryotic bacteriophages). We report gut virome profiles in association with SARS-CoV-2 infection, disease severity, and blood parameters.
Alterations in faecal RNA virome of COVID-19 patients
To understand whether SARS-CoV-2 infection influences the gut RNA virome, we compared the faecal RNA virome of COVID-19 patients at baseline (Day 0, the first time point of stool collection after hospitalization) with that of non-COVID-19 controls. Among all host factors (SARS-CoV-2 infection, age, gender, medications, co-morbidities), SARS-CoV-2 infection showed the largest effect size on the composition of the faecal RNA virome (PERMANOVA test p<0.01, R2=0.041, Figure 1A), followed by chronic hepatitis B virus (HBV) infection and asthma. At the species level, SARS-CoV-2 was enriched in faecal samples of patients with COVID-19 compared with non-COVID-19 controls (MaAsLin2 analysis with adjustment for HBV infection and asthma, FDR p<0.05, Figure 1B). In contrast, Pepper mild mottle virus (PMMoV), a plant virus known to be prevalent and abundant in human faeces [17], was underrepresented in patients with COVID-19 (FDR p<0.05, Figure 1B&C). Seven (19%) of the 37 COVID-19 patients who had longitudinal follow-up showed prolonged faecal SARS-CoV-2 shedding, as indicated by continued SARS-CoV-2 RNA detection in faeces after nasopharyngeal clearance of the virus (Supplementary Figure 2A). PMMoV was persistently underrepresented in COVID-19 patients both during hospitalization and after disease resolution (Figure 1C, Supplementary Figure 2B). Diet over the course of hospitalisation (Supplementary Table 2) had no significant effect on the temporal variation of the gut RNA virome (PERMANOVA test, p=0.2) or on the relative abundance of PMMoV (PERMANOVA test, p=0.4).
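To make the PERMANOVA effect size concrete, the following is a minimal sketch of how R2 and a permutation p-value could be computed for a single grouping factor. It is an illustration only and not the analysis pipeline used in this study: the choice of Bray-Curtis dissimilarity, the toy abundance vectors, the group labels, and all function names are assumptions introduced here.

```python
import random
from itertools import combinations


def bray_curtis(u, v):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    num = sum(abs(a - b) for a, b in zip(u, v))
    den = sum(a + b for a, b in zip(u, v))
    return num / den if den else 0.0


def distance_matrix(samples):
    """Symmetric matrix of pairwise Bray-Curtis dissimilarities."""
    n = len(samples)
    dist = [[0.0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        dist[i][j] = dist[j][i] = bray_curtis(samples[i], samples[j])
    return dist


def permanova_stats(dist, groups):
    """Return (R2, pseudo-F) for a one-factor PERMANOVA."""
    n = len(groups)
    labels = sorted(set(groups))
    # Total sum of squares from all pairwise distances.
    ss_total = sum(dist[i][j] ** 2 for i, j in combinations(range(n), 2)) / n
    # Within-group sum of squares, accumulated group by group.
    ss_within = 0.0
    for label in labels:
        idx = [i for i, g in enumerate(groups) if g == label]
        ss_within += sum(dist[i][j] ** 2 for i, j in combinations(idx, 2)) / len(idx)
    ss_between = ss_total - ss_within
    if ss_within == 0:
        return ss_between / ss_total, float("inf")
    pseudo_f = (ss_between / (len(labels) - 1)) / (ss_within / (n - len(labels)))
    return ss_between / ss_total, pseudo_f


def permanova(dist, groups, permutations=999, seed=0):
    """Permutation test: shuffle group labels and compare pseudo-F values."""
    rng = random.Random(seed)
    r2, f_obs = permanova_stats(dist, groups)
    shuffled = list(groups)
    hits = 0
    for _ in range(permutations):
        rng.shuffle(shuffled)
        if permanova_stats(dist, shuffled)[1] >= f_obs:
            hits += 1
    return r2, f_obs, (hits + 1) / (permutations + 1)


# Hypothetical relative abundances of three viral taxa in six samples.
samples = [
    [10, 0, 5], [12, 1, 4], [11, 0, 6],  # labelled "covid"
    [2, 9, 5], [1, 10, 6], [3, 8, 4],    # labelled "control"
]
groups = ["covid"] * 3 + ["control"] * 3

r2, pseudo_f, p_value = permanova(distance_matrix(samples), groups)
print(f"R2={r2:.3f}, pseudo-F={pseudo_f:.2f}, p={p_value:.3f}")
```

In this formulation R2 is the fraction of the total squared dissimilarity that is attributable to the grouping factor, which is how a value such as R2=0.041 can be read. With only six toy samples the permutation p-value cannot become very small, because few distinct relabellings exist.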
Overall, the faecal RNA virome composition of COVID-19 patients remained distinct from that of non-COVID-19 controls during the disease course and after disease resolution (Figure 1D&E). Among 16 COVID-19 patients who had nasopharyngeal clearance of SARS-CoV-2 (disease resolution as determined by a negative PCR result for SARS-CoV-2 on nasopharyngeal swabs), 11 (69%) had a persistently altered faecal RNA virome after disease resolution, and in two (13%) the alteration lasted up to 30 days (Figure 1D).
We also performed quantitative RT-PCR assays to examine SARS-CoV-2 viral levels in faecal and nasopharyngeal swab specimens from patients with COVID-19 at hospitalisation. We found that SARS-CoV-2 levels were lower in faeces of patients with moderate COVID-19 than in those with asymptomatic/mild COVID-19 (p<0.01, Supplementary Figure 3A), a finding confirmed by the RNA virome shotgun metagenomic sequencing assay (p<0.05, Supplementary Figure 3B), though the detection sensitivity for faecal SARS-CoV-2 RNA by shotgun sequencing was not on par with that of quantitative RT-PCR. SARS-CoV-2 viral load in faecal specimens was approximately 2-log lower than that in nasopharyngeal specimens (p<0.0001, Supplementary Figure 3C). Importantly, SARS-CoV-2 levels in nasopharyngeal samples significantly correlated with SARS-CoV-2 levels in faecal samples (Pearson correlation Rho=0.3, p=0.0038, Supplementary Figure 3D).
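As a reading aid for the "2-log" difference and the correlation between paired specimens, the sketch below computes a mean log10 gap, the corresponding fold difference, and a Pearson coefficient on log-transformed values. The paired viral loads, their implied units, and the variable names are hypothetical and are not data from this study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


# Hypothetical paired viral loads from the same five patients.
nasopharyngeal = [1e7, 5e6, 2e8, 3e5, 8e6]
faecal = [2e5, 1e4, 1e6, 6e3, 3e4]

# Viral loads span orders of magnitude, so work on the log10 scale.
log_np = [math.log10(v) for v in nasopharyngeal]
log_fc = [math.log10(v) for v in faecal]

# A gap of 2 on the log10 scale corresponds to a 100-fold difference.
mean_log_gap = sum(a - b for a, b in zip(log_np, log_fc)) / len(log_np)
fold_difference = 10 ** mean_log_gap

r = pearson_r(log_np, log_fc)
print(f"mean log10 gap={mean_log_gap:.2f}, fold={fold_difference:.0f}, r={r:.2f}")
```

A positive coefficient here means that patients with higher nasopharyngeal loads tend to have higher faecal loads, even though the faecal values are far lower in absolute terms.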
Alterations in faecal DNA virome of COVID-19 patients
We then investigated the effect of SARS-CoV-2 infection on faecal DNA virome composition at baseline and during the disease course. At the community level, the viromes of COVID-19 patients at baseline differed significantly from those of non-COVID-19 controls (PERMANOVA p<0.01, Figure 2A) and were more heterogeneous than those of non-COVID-19 controls (p<0.0001, Figure 2B). Among all host factors (SARS-CoV-2 infection, age, gender, medications, co-morbidities), SARS-CoV-2 infection again showed the largest effect size on the composition of the faecal DNA virome (R2=0.018, Figure 2C), followed by hyperlipidemia and the antiviral medication Lopinavir-ritonavir. Administration of Lopinavir-ritonavir was inversely associated with the presence of Listeria phage (correlation coefficient -0.21, p=0.03), a phage infecting the pathogenic bacterium Listeria.
A total of 45 DNA virus species differed significantly in abundance in the faecal DNA virome between COVID-19 patients and non-COVID-19 controls: 19 virus species were enriched in COVID-19 patients and 26 in non-COVID-19 controls (identified via DESeq while controlling for hyperlipidemia and Lopinavir-ritonavir, Figure 2D). A majority (69%, 18 out of 26 virus species) of the DNA viruses enriched in faeces of non-COVID-19 controls were prokaryotic viruses, particularly bacteriophages (62%, 16 out of 26). In contrast, more eukaryotic viruses, particularly environment-derived eukaryotic viruses with unknown host, were enriched in faeces of COVID-19 patients.
The differentially enriched gut DNA virus species in COVID-19 patients showed substantial temporal variations during the disease course (Figure 3A). Diet during hospitalisation had no significant effect on the temporal variation of the faecal DNA virome (PERMANOVA test, p=0.3). Overall, the faecal DNA virome composition of COVID-19 patients differed markedly from that of non-COVID-19 controls during the disease course and after clearance of SARS-CoV-2 (Figure 3B&C). Among COVID-19 patients who had follow-up after disease resolution, six (32%) had a faecal DNA virome at the last follow-up that was markedly more dissimilar to that of non-COVID-19 controls than it had been at baseline (in three patients this lasted up to 20-30 days; Figure 3B).
Alterations in the functionality of the enteric virome in COVID-19 patients
We next investigated functional alterations of the gut virome using HUMAnN2 prediction. A larger number of gene families were enriched in COVID-19 viromes at baseline than in those of non-COVID-19 controls (28 versus 9 gene families, FDR p<0.05, Figure 4). We found significant enhancement in the functional capacities for gene mobilization and viral/phage integration into the host in COVID-19 viromes (Figure 4). Features of viral integration (expansion of temperate virions/phages) have been observed in the gut under inflammatory conditions in both humans and mice [18, 19]. In addition, functions involved in host stress/inflammation/virulence response (DNA repair, Arginine repressor, Hemolysin channel protein, DNA polymerase IV), bacterial metabolism and membrane transport were also enriched in the faecal virome of COVID-19 patients (Figure 4). Diet during hospitalization had no significant effect on the variation in virome functionality (p=0.45). These data suggest that SARS-CoV-2 infection may be associated with a functional shift of the human gut virome towards inflammation- and stress-related responses in relation to its hosts (both the commensal bacteria and humans). The viral functions enriched in COVID-19 (particularly those associated with host metabolism) were significantly associated with the abundances of viruses enriched in COVID-19, including Streptococcus phage, Escherichia phage, Homavirus, Lactococcus phage, Ralstonia phage, Solumvirus, and Microcystis phage (Supplementary Figure 4).
Faecal virome alterations correlated with disease severity of COVID-19
Based on COVID-19 disease symptoms and severity classification criteria [20], we stratified our patients into non-severe (N=56; asymptomatic/mild cases) and moderate/severe groups (N=42; moderate/severe/critical cases) (Figure 5A). Compared to non-severe cases, moderate/severe cases showed significantly higher blood levels of LDH, neutrophil count, C-reactive protein (CRP), and Alanine aminotransferase (ALT), and lower blood levels of Albumin at admission (all p<0.05, Figure 5B-F, Supplementary Figure 5). Our data are in line with recent reports highlighting that more severe cases had more pronounced systemic inflammatory responses [2, 21-24]. We then explored associations of baseline faecal RNA and DNA virome profiles with COVID-19 severity and blood measurements at hospitalization. Abundance of the plant-derived RNA virus Pepper chlorotic spot virus (PCSV) was higher in patients with non-severe disease than in those with moderate/severe disease (p=0.013, Figure 5G). In addition, a high abundance of PCSV in faeces was associated with low blood concentrations of the inflammation markers LDH and CRP (correlation coefficient Rho=-0.269 and -0.276 respectively, Figure 5H&I). Similarly, the abundance of 9 DNA virus species (Myxococcus phage, Rheinheimera phage, Microcystis virus, Bacteroides phage, Murmansk poxvirus, Saudi moumouvirus, Sphaerotilus phage, Tomelloso virus, and Ruegeria phage) in faeces negatively correlated with COVID-19 severity (all FDR p<0.05, Figure 5J). In particular, 8 out of the 9 DNA virus species showed strong negative correlation with blood levels of the inflammation indicators LDH, neutrophil count, white cell count, or CRP (Figure 5K). Interestingly, four of these viral species (Myxococcus phage, Bacteroides phage, Murmansk poxvirus, and Sphaerotilus phage), which were inversely associated with inflammation indicators, also inversely correlated with host age (Figure 5K).
This result coincides with the observation that elderly individuals are at higher risk of severe COVID-19 outcomes [2, 25]. These data suggest that such RNA and DNA viruses may counteract the effect of SARS-CoV-2 infection, predisposing infected subjects to a less severe COVID-19 course. Five out of the 9 severity-associated DNA virus species showed persistently lower abundances in the faeces of COVID-19 patients during the disease course and after disease resolution compared to non-COVID-19 controls (all p<0.05, Figure 6), indicating an unfavorable effect of SARS-CoV-2 infection on these gut viruses. Whether such associations are a cause or a consequence of disease needs to be further explored. In addition, a large number of DNA virus species in faeces (n=132) showed significant correlations with blood parameters in COVID-19 patients, most of which were negative correlations with blood LDH concentrations and with neutrophil and white cell counts (Supplementary Figure 6). These data underscore the potential significance of the gut DNA virome in calibrating host immunity and counteracting SARS-CoV-2 infection, warranting further investigation.
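Several comparisons above are reported at a false discovery rate (FDR) threshold of p<0.05. The sketch below shows one common way such adjusted p-values are obtained, the Benjamini-Hochberg procedure. Its use here is an assumption for illustration: the text does not state which FDR method was applied, and the virus names and raw p-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four taxa tested against disease severity.
raw = {"virus_A": 0.001, "virus_B": 0.012, "virus_C": 0.04, "virus_D": 0.3}

names = list(raw)
adjusted = benjamini_hochberg([raw[name] for name in names])
significant = [name for name, q in zip(names, adjusted) if q < 0.05]

for name, q in zip(names, adjusted):
    print(f"{name}: raw p={raw[name]:.3f}, adjusted p={q:.4f}")
print("significant at FDR<0.05:", significant)
```

In this toy example a raw p-value of 0.04 no longer passes the threshold after adjustment, which illustrates why an FDR-adjusted list of associated species can be shorter than the list obtained from unadjusted tests.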