Virome of the human body
We investigated the systemic distribution of the eukaryotic DNA virome in nine organs (skin, brain, colon, liver, lung, heart, kidney, blood, and hair) of 31 recently deceased individuals. To this end, we integrated quantification via PCR-based methods (a total of 25 DNA viruses) and genomic characterization via targeted viromics (a total of 38 viruses).
We found that the viral DNAs (vDNAs) have unique distribution profiles within the human body and across individuals. Out of the 279 samples analyzed, 92% were positive for vDNAs by either qPCR or NGS. We detected a total of 17 viruses (mean 6.7, range 2-12 viruses/individual), (Figure 1A).
Figure 1. Prevalence, distribution, and diversity of viral DNAs. The number of viruses detected with qPCR and NGS A) in the body (≥ 1 tissue positive for a virus) of the study population (mean in dashed line) and B) in different sample types (mean in bar column). C) Alpha diversity estimation of different organ viromes by mean Shannon index. Statistical significance was calculated by one-way ANOVA (p<0.001). Post-hoc pairwise comparison of groups was done by REGWF and divided into four categories (a,b,c,d) with p-value<0.05 between categories. D) Beta diversity was estimated using Bray-Curtis dissimilarity and plotted with t-distributed stochastic neighbor embedding (t-SNE) for visualization. Ellipses show 95% confidence interval of four observed clusters: blood samples (red); colon, kidney, liver, and lung samples (blue); hair (yellow); and brain, heart, and skin samples (green).
The overall prevalences were high (Figure 2), with human herpesvirus 6B (HHV-6B) being the most frequent (in 97% of individuals), followed by parvovirus B19 (B19V, 87%), torque teno viruses (TTV, 87%), Epstein-Barr virus (EBV, 85%), human herpesvirus 7 (HHV-7, 77%), Merkel cell polyomavirus (MCPyV, 58%), and JC polyomavirus (JCPyV, 48%). In addition, the prevalence of human papillomaviruses (HPVs), assessed exclusively by NGS in 10 individuals, was 80%.
The viral DNA copy numbers were normalized to cell counts using the human single-copy gene RNase P. The quantities are reported as viral copies per million cells (cp/mc), and the averages indicate geometric means. The viral quantities in the positive samples had a mean of 540 cp/mc. Of the most prevalent viruses, TTV and HHV-6B had the highest copy numbers, with means of 1600 and 1100 cp/mc, respectively. The mean quantity of B19V was 550 cp/mc, while those of HHV-7, EBV, and JCPyV were in the range of 75-140 cp/mc (p<0.05, one-way ANOVA), (Figure 3, Supplemental Figure 1).
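The normalization and averaging described above can be sketched as follows. This is a minimal illustration rather than the pipeline used in the study: the function names are hypothetical, and the factor of two RNase P copies per diploid cell is an assumption that the text does not state.

```python
import math


def copies_per_million_cells(viral_copies, rnasep_copies, rnasep_per_cell=2.0):
    """Normalize a viral copy number to copies per million cells (cp/mc).

    Assumption: each diploid cell carries `rnasep_per_cell` copies of the
    single-copy RNase P gene (2 by default), and both copy numbers were
    measured in the same volume of the same DNA extract.
    """
    if rnasep_copies <= 0:
        raise ValueError("RNase P copy number must be positive")
    cells = rnasep_copies / rnasep_per_cell
    return viral_copies / cells * 1_000_000


def geometric_mean(quantities):
    """Geometric mean of strictly positive quantities (positive samples only)."""
    quantities = list(quantities)
    if not quantities or any(q <= 0 for q in quantities):
        raise ValueError("geometric mean needs at least one value, all > 0")
    return math.exp(sum(math.log(q) for q in quantities) / len(quantities))
```

Under these assumptions, a sample with 50 viral copies and 200,000 RNase P copies corresponds to 100,000 cells and therefore to about 500 cp/mc. The geometric mean is computed on positive samples only, since the logarithm of zero is undefined.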
We reconstructed in silico 129 viral genomes with over 50% breadth coverage, of which 70 were complete or near-complete (>90% breadth, Figures 4 and 5). The viral genomes assembled were unique to each individual and shared high homology across the different organs of an individual (intra-host variation).
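To make the breadth thresholds concrete, the sketch below computes breadth and mean depth of coverage from read alignments. It is an illustration under stated assumptions, not the authors' assembly workflow: the function names are hypothetical, alignments are taken as 0-based half-open intervals on the reference, and mean depth is averaged over the whole reference length rather than over covered positions only.

```python
def coverage_stats(genome_length, alignments):
    """Return (breadth, mean_depth) for a reference of `genome_length` bases.

    `alignments` is an iterable of (start, end) intervals, 0-based and
    half-open, giving the reference span of each mapped read (assumption).
    """
    if genome_length <= 0:
        raise ValueError("genome_length must be positive")
    depth = [0] * genome_length
    for start, end in alignments:
        # Clip each interval to the reference before counting.
        for pos in range(max(start, 0), min(end, genome_length)):
            depth[pos] += 1
    covered = sum(1 for d in depth if d > 0)
    breadth = covered / genome_length          # fraction of positions with >= 1 read
    mean_depth = sum(depth) / genome_length    # average reads per position
    return breadth, mean_depth


def classify_genome(breadth):
    """Apply the thresholds used in the text (>50% and >90% breadth)."""
    if breadth > 0.90:
        return "complete or near-complete"
    if breadth > 0.50:
        return "over 50% breadth"
    return "below 50% breadth"
```

For example, two reads spanning positions 0-5 and 3-8 of a 10-base reference cover 8 positions, which gives a breadth of 0.8 and a mean depth of 1.0.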
The within-sample diversities (α-diversity) were highest in the lung, liver, colon, and kidney as calculated by the richness (mean 3.9, 3.7, 3.6, and 3.4 viruses/sample, respectively) and the Shannon index (mean 0.89, 0.72, 0.79, and 0.63, respectively), (Figures 1B and 1C). The blood, hair, skin, and heart had mean richness of 3.2, 2.3, 2.3, and 2.1 viruses/sample, respectively, and Shannon indices of 0.52, 0.21, 0.30, and 0.38, respectively. The brain had the lowest α-diversity, with a mean richness of 1.3 viruses/sample and a Shannon index of 0.17.
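The two within-sample diversity measures can be illustrated with a short sketch. The text does not state the logarithm base or the abundance measure used for the Shannon index; the natural logarithm and cp/mc values are assumed here, and the function names are hypothetical.

```python
import math


def richness(abundances):
    """Number of viruses detected in a sample (abundance > 0)."""
    return sum(1 for a in abundances if a > 0)


def shannon_index(abundances):
    """Shannon index H = sum(p_i * ln(1 / p_i)) over detected viruses.

    `abundances` holds one non-negative quantity per virus (assumed cp/mc);
    p_i is the share of virus i in the sample total. Natural log is assumed.
    """
    positives = [a for a in abundances if a > 0]
    total = sum(positives)
    if total == 0:
        return 0.0  # no virus detected: diversity is zero by convention
    return sum((a / total) * math.log(total / a) for a in positives)
```

A sample containing a single virus has a Shannon index of 0, and a sample with two viruses at equal abundance has an index of ln 2 (about 0.69). This is why organs dominated by one vDNA score low even when their richness is above 1.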
We analyzed the diversity between samples (β-diversity) by Bray-Curtis dissimilarity, which considers the presence/absence and the abundance of vDNAs in a sample. We visualized the calculated dissimilarities between samples with t-SNE (t-distributed stochastic neighbor embedding). Based on this analysis, we identified four different clusters of organs sharing similar viral profiles. The first cluster consisted principally of the colon, liver, lung, and kidney samples, the second of the skin, brain, and heart samples, and the third and fourth of the blood and hair samples, each with unique vDNA profiles (Figure 1D).
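As an illustration of the dissimilarity measure, the sketch below computes the Bray-Curtis dissimilarity between two samples and a pairwise matrix for a set of samples. The function names are hypothetical, each sample is assumed to be a list of per-virus quantities in a fixed virus order, and treating two virus-free samples as identical is a convention chosen here, not one taken from the text.

```python
def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity: sum|a_i - b_i| / sum(a_i + b_i).

    Returns 0.0 for identical profiles and 1.0 for profiles that share
    no virus. Inputs are non-negative per-virus quantities in the same order.
    """
    if len(sample_a) != len(sample_b):
        raise ValueError("samples must list the same viruses in the same order")
    numerator = sum(abs(a - b) for a, b in zip(sample_a, sample_b))
    denominator = sum(a + b for a, b in zip(sample_a, sample_b))
    if denominator == 0:
        return 0.0  # both samples empty: treated as identical (assumption)
    return numerator / denominator


def dissimilarity_matrix(samples):
    """Symmetric matrix of pairwise Bray-Curtis dissimilarities."""
    return [[bray_curtis(a, b) for b in samples] for a in samples]
```

For the profiles [10, 0, 5] and [5, 5, 0], the absolute differences sum to 15 and the totals to 25, so the dissimilarity is 0.6. A matrix of this kind can then be passed to a t-SNE implementation that accepts precomputed distances.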
We created a weighted correlation network to evaluate whether the presence (φ coefficient) and quantities (Spearman’s correlation) of specific viruses in a given organ correlate with those in other organs of the same individual (Supplemental Figure 2). According to this analysis, most of the vDNA findings in solid tissues were independent of the detection in blood, suggesting that the positivities were, in fact, inherent to the tissue (rather than blood-derived).
The network topology varied between viruses. For B19V, the positivity and quantity in one organ correlated statistically with that in other organs of the same individual (Supplemental Figure 2A). For other viruses, such as EBV, HHV-6B, and HHV-7, the correlation held only between certain tissue pairs (Supplemental Figure 2).
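The quantity-based edges of such a network rest on Spearman's rank correlation between the viral loads of two organs across individuals. The following is a minimal sketch of that statistic only, with hypothetical function names; it assumes paired lists of quantities (one value per individual) and handles ties by average ranks. How the edges were weighted and filtered in the study is not described here.

```python
import math


def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman(x, y):
    """Spearman's rho: Pearson correlation of the average ranks of x and y."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two paired lists with at least two values")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        return float("nan")  # undefined when one variable is constant
    return cov / math.sqrt(vx * vy)
```

Because the statistic depends only on ranks, it is insensitive to the wide dynamic range of cp/mc values across individuals.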
Virome of the digestive system
Colon. This organ harbored 13 different vDNAs (Figure 2), of which HHV-6B was the most prevalent (87%), followed by B19V (74%). Remarkably, the prevalence of HHV-7 was higher in the colon (65%) than in any other organ. TTV and EBV were detected in 42% and 39% of colonic samples, respectively, JCPyV in 16%, and HPV, MCPyV, HCMV, HPyV6, and BKPyV in 3-11%. The mean viral copies ranged from 1200 to 1800 cp/mc for TTV, HHV-6B, and B19V, and from 150 to 270 cp/mc for EBV and HHV-7 (Supplemental Figure 1).
We determined the φ coefficient of all vDNA pairs from each organ to evaluate whether the co-detection of certain vDNAs was statistically significant. In the colon, this was the case for HHV-7 and B19V (R=0.49; p=0.005) and for HHV-7 and TTV (R=0.36; p=0.049), (Supplemental Figure 3C).
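The co-detection measure can be sketched as follows. This is an illustrative implementation of the φ coefficient for two presence/absence vectors (one entry per individual, for a given organ); the function name is hypothetical and the code is not taken from the study.

```python
import math


def phi_coefficient(present_a, present_b):
    """Phi coefficient of two binary detection vectors.

    phi = (n11*n00 - n10*n01) / sqrt(row and column totals multiplied),
    where n11 counts individuals positive for both viruses, n00 negative
    for both, and n10 / n01 positive for only one of them.
    """
    if len(present_a) != len(present_b):
        raise ValueError("vectors must describe the same individuals")
    pairs = list(zip(present_a, present_b))
    n11 = sum(1 for a, b in pairs if a and b)
    n10 = sum(1 for a, b in pairs if a and not b)
    n01 = sum(1 for a, b in pairs if not a and b)
    n00 = sum(1 for a, b in pairs if not a and not b)
    denominator = math.sqrt((n11 + n10) * (n01 + n00) * (n11 + n01) * (n10 + n00))
    if denominator == 0:
        return float("nan")  # undefined if a virus is always or never detected
    return (n11 * n00 - n10 * n01) / denominator
```

The coefficient ranges from -1 (the two viruses never occur together) through 0 (independent detection) to 1 (always detected together). A p-value could then be obtained from a test on the same 2×2 table, for example Fisher's exact test; the text does not say which test was used.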
Liver. There was a high prevalence of HHV-6B and B19V in the liver (90% and 77%, respectively), (Figure 2). In comparison, the prevalence of HHV-7 was lower (45%) and that of TTV higher (77%) than in the colon. Neither HPV nor HPyV6 was detected in this organ.
The highest copies in the liver were of HHV-6B (4400 cp/mc) and TTV (2700 cp/mc), and the lowest were of EBV, HHV-7, and B19V (44-240 cp/mc), (Supplemental Figure 1G). The B19V quantities were significantly lower in the liver than in the colon (p<0.05), (Supplemental Figure 4B).
JCPyV and BKPyV were often co-detected in the liver (R=0.49, p=0.006), (Supplemental Figure 3).
Virome of the respiratory system
Lung. We found 11 different vDNAs (Figures 1B and 1C) in the lung. Here, HHV-6B and B19V had the highest prevalences (87% and 84%, respectively), followed by HHV-7 (61%), EBV (58%), and TTV (52%). The prevalences of EBV (58%) and HCMV (23%) were higher in the lung than in any other organ (Figure 2 and Supplemental Figure 1).
In the lung, TTV had the highest copies with a mean of 2700 cp/mc, while the quantities of the other viruses ranged between 32 and 630 cp/mc (Supplemental Figure 1).
In this organ, TTV was co-detected with EBV (R=0.49, p=0.006), HHV-7 (R=0.42, p=0.018), and JCPyV (R=0.37, p=0.039), (Supplemental Figure 3). The presence of JCPyV also correlated with that of HCMV (R=0.48, p=0.006).
Virome of the cardiovascular system
Whole blood. TTV was the virus most frequently found in blood (81%) and had the highest quantity (mean 6200 cp/mc, p<0.05), (Figure 2, Supplemental Figure 1). TTV was followed in prevalence by B19V (71%), HHV-6B (55%), EBV (45%), HHV-7 (26%), and JCPyV (19%). The mean quantities of these viruses ranged between 86 and 480 cp/mc. HCMV, BKPyV, HPyV6, and HPV were detected in one or two samples.
In the blood, JCPyV was co-detected with HPyV6 (R=0.37, p=0.040) or HCMV (R=0.54, p=0.002), and HHV-7 with HCMV (R=0.45, p=0.012), (Supplemental Figure 3).
Heart. The mean viral quantities were overall low (140-850 cp/mc) in the heart. B19V was the most prevalent (84%), followed by HHV-6B (45%). In contrast to blood, TTV was found in only 29% of the heart samples (Figure 2, Supplemental Figure 1).
The α-diversity was lower in the heart than in the blood (richness 2.1 vs. 3.2 viruses/sample, Shannon index 0.38 vs. 0.52).
In the heart, the co-detections of JCPyV and HHV-6B (R=0.42, p=0.017), JCPyV and HHV-7 (R=0.43, p=0.017), and EBV and HHV-7 (R=0.53, p=0.002) were statistically significant (Supplemental Figure 3).
Virome of the urinary system
Kidney. We found 11 different virus types in the kidney, of which HHV-6B (84%), B19V (84%), EBV (52%), and TTV (45%) were the most prevalent (Figure 2). Notably, the highest prevalence (39%) and mean quantity (560 cp/mc) of JCPyV were observed in this organ (Supplemental Figure 1). HHV-7, HCMV, BKPyV, and MCPyV were detected sporadically.
In the kidney, the positivity of EBV and TTV correlated statistically (R=0.52, p=0.003), as did that of HCMV and JCPyV (R=0.42, p=0.02), (Supplemental Figure 3).
Virome of the central nervous system
Brain. This organ contained the fewest vDNAs in our cohort, with approximately 1.3 viruses/sample and a Shannon index of 0.17 (Figures 1B and 1C). Among the findings were B19V (74%), HHV-6B (19%), TTV (13%), JCPyV (10%), and MCPyV (10%) with a mean quantity of 160 cp/mc (Supplemental Figure 1).
In the brain, the co-detection of TTV and JCPyV was statistically significant (R=0.43, p=0.017), (Supplemental Figure 3).
Virome of the integumentary system
Skin. The skin virome was less diverse than that of the lung, colon, liver, or kidney. The prevalence of vDNA was in general markedly lower, although a total of 14 different viruses were detected. B19V had the highest prevalence and mean quantity (87%, 1800 cp/mc), (p<0.05), (Figure 2, Supplemental Figures 1 and 5B). The quantities of other vDNAs in the skin ranged between 21 and 310 cp/mc (Supplemental Figure 1).
In general, we identified polyomaviruses and HPV more often in the skin than in internal organs, with prevalences of 23% for MCPyV, 22% for HPV, 16% for JCPyV, and 13% for HPyV6. In contrast, the frequencies of herpesviruses (HHV-6B, 23%; EBV and HHV-7, 16%) and TTV (16%) were lower than in other organs.
Pulled Hair. There was a high variance in hair compared with other tissue types; while some samples had no detectable vDNAs, others had six or more. We found in total 15 different viruses, of which the most frequent were HPVs (detected in 8/10 subjects by NGS), MCPyV (53%), and HSV-1 (20%). The MCPyV copies in hair were remarkably high, with a mean of 24,000 cp/mc compared to 100 cp/mc in the skin (Supplemental Figure 4B).
Interestingly, we found HSV-2, HPyV7, and HPyV10 exclusively in hair. In contrast to other organs, the evidence of B19V DNA in hair was limited to a few sequence reads in four samples (median 5.5 reads), (Figures 2, 3, and 4). In one hair sample HPV types 22 and 23 were co-detected, and in another sample HPV types 111 and 12.
In the skin, the co-detection of TTV and EBV was statistically significant (R=0.52, p=0.003), (Supplemental Figure 3).
Findings in three individuals that differed from the predominant core virome
In one individual with a history of metastatic pulmonary carcinoma (Supplemental Table 2), we detected in nearly all organs the DNAs of HSV-1 (900-910,000 cp/mc, mean 27,000 cp/mc), HCMV (92-620,000 cp/mc, mean 8,200 cp/mc), and JCPyV (10-150,000 cp/mc, mean 3,600 cp/mc) (Figure 3). These findings contrasted with those of the rest of the cohort, in which we found HSV-1 only in hair and skin, and HCMV only in lung, liver, and kidney, consistently at low prevalence. The quantities of EBV in this individual, particularly in lung, hair, and colon, were higher than in any other subject of the cohort (11,000, 6,700, and 1,800 cp/mc, respectively). Furthermore, this individual was positive by NGS for HPV-122 in blood, hair, colon, and skin; for HPV-105 in blood and for HPV-111 in hair. By NGS, we reconstructed HSV-1 genomes from blood, colon, and lung with breadth coverages of 90%, 57%, and 52%, respectively, as well as HCMV genomes from the heart, liver, lung, and skin with breadth coverages of 66%, 52%, 69%, and 56% (Figures 4 and 5). Moreover, from this individual, we assembled the genomes (with respective coverage breadths) of JCPyV in the kidney (100%), liver (99%), blood (97%), colon (92%), lung (90%), heart (54%), hair (45%), and skin (39%).
The sole finding of VZV DNA in the cohort corresponded to an individual with metastatic mantle cell lymphoma (stage IV). This subject presented with facial herpes zoster and erythema multiforme five weeks before death (of non-natural causes), (Supplemental Table 2). Although no signs of shingles were present at the external post-mortem examination, a skin sample taken from the face revealed 630,000 cp/mc of VZV but also, remarkably, 100,000 cp/mc of EBV, the highest quantity of this virus in any sample in the entire cohort. In comparison, in the femoral skin, heart, liver, and lung, the VZV quantities were 10-20 cp/mc, and those of EBV were 13-320 cp/mc. Of note, this was the only individual positive for EBV in the brain (320 cp/mc). From the facial skin sample, we assembled near-full genomes of VZV and EBV, with respective breadth coverages of 98% and 97% (Figure 5).
Another subject in our cohort was positive for HBV DNA with quantities of 630 cp/mc in the liver. In addition, we found traces of the DNA of this virus in the subject’s skin, colon, heart, lung, and kidney, at copies ranging from 10 to 50 cp/mc. The blood sample was negative by both NGS and qPCR. The presence of HBV core antibodies (HBcAb +) and the absence of both core IgM (HBcAbM -) and surface antigen (HBsAg -) pointed to a resolved past infection. From this individual, we assembled HBV genomes from the skin, colon, heart, liver, and lung with respective breadth coverages of 41%, 18%, 15%, 12%, and 8%.
Sequence analysis reveals unique viral strains in each individual
We reconstructed the following viral genomes with over 50% breadth coverage (total number of genomes; mean depth coverage): B19V (n=60; 56x), HHV-6B (n=22; 19x), MCPyV (n=11; 1,200x), JCPyV (n=14; 20x), HHV-7 (n=6; 3x), HSV-1 (n=4; 5x), HCMV (n=4; 3x), EBV (n=2; 25x), VZV (n=1; 32x), HPyV6 (n=1; 8x), HPyV7 (n=1; 2x), and HPV (n=1; 30x). The viral sequences generated per organ and individual are presented in Figure 4, and representative viral genome profiles are shown in Figure 5. The B19V genomes were of genotype 2 in 73% (8/11 subjects) and of genotype 1 in 27% (3/11) of the subjects. The JCPyV genotypes were 1B in 60% (3/5) and 4 in 40% (2/5) of the subjects. Both EBV genomes were of type 1.
The viral genomes assembled were unique to each individual, excluding the possibility of a common contaminant (e.g., from reagents). The viral consensus sequences generated from different organs of the same individual had over 99% identity, indicating limited intra-host variation.
The viral genome sequences with highest coverage were deposited in GenBank, accession numbers ON023008-ON023041.
NGS vs. qPCR
We found an overall positive agreement of 92% (274/299; 95% CI: 88-94%) and a negative agreement of 95% (1782/1866; 95% CI: 94-96%) between NGS and qPCR. We excluded TTV from this calculation, as its detection rate by NGS was uniquely low (positive agreement of 36/58). This is likely due to insufficient representation of this highly diverse virus group among the capture baits.
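The agreement figures can be reproduced from the reported counts with a short calculation. The text does not state how the confidence intervals were derived; the sketch below assumes a Wilson score interval, and the function name is hypothetical.

```python
import math


def agreement_with_wilson_ci(agreeing, total, z=1.959964):
    """Proportion of agreeing results with a Wilson score interval.

    `agreeing` / `total` is, e.g., 274/299 for positive agreement.
    `z` is the standard normal quantile (about 1.96 for a 95% interval).
    The choice of the Wilson method is an assumption, not taken from the text.
    """
    if total <= 0 or not 0 <= agreeing <= total:
        raise ValueError("need 0 <= agreeing <= total and total > 0")
    p = agreeing / total
    scale = 1 + z * z / total
    center = (p + z * z / (2 * total)) / scale
    half_width = z * math.sqrt(p * (1 - p) / total + z * z / (4 * total * total)) / scale
    return p, center - half_width, center + half_width


positive = agreement_with_wilson_ci(274, 299)    # positive agreement
negative = agreement_with_wilson_ci(1782, 1866)  # negative agreement
```

With these counts, the Wilson interval spans roughly 0.88 to 0.94 for the positive agreement and roughly 0.94 to 0.96 for the negative agreement, in line with the intervals reported above. Other interval methods give slightly different bounds.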
The breadth and depth of the assembled viral sequences correlated significantly with the qPCR copy numbers (ANOVA, p<0.001), (Supplemental Figure 5).