Viral biogeography of gastrointestinal tract and parenchymal organs in two representative species of mammals

In this study we report the first comprehensive metagenomic analysis of the prokaryotic and eukaryotic virome occupying luminal and mucosa-associated habitats along the entire length of the gastrointestinal tract (GIT) in two animal species, the domestic pig and rhesus macaque. The highest loads and diversity of bacteriophages are found in the lumen of the large intestine in both mammals. Mucosal samples contain much lower viral loads but a higher proportion of eukaryotic viruses. Parenchymal organs contain significant amounts of bacteriophages of gut origin, in addition to some eukaryotic viruses. GIT virome composition is both region- and species-specific, with a strong separation between upper and lower gut. Nonetheless, certain viral and phage species are found ubiquitously from the oral cavity to the distal colon. Correlations between individual phages and their potential microbial hosts in the GIT are overwhelmingly positive, which supports earlier concepts of the temperate-like life cycles of the majority of gut phages and a prevalence of “piggyback-the-winner” ecological dynamics.


Introduction
The gastrointestinal tracts (GIT) of humans and other mammals contain highly individualised microbiomes [1][2][3][4], composed of bacteria 5, archaea 6, eukaryotic microorganisms 7, and viruses [8][9][10]. The close association of microbes and their mammalian host as an ecological unit is increasingly recognised as important for health 11,12. The gut virome, which is largely composed of bacterial viruses (bacteriophages or phages), remained a relatively unexplored area until recently, when a potential role for the virome in shaping bacterial communities was postulated 9,[13][14][15]. A number of potential mechanisms by which such shaping could occur have been suggested. These include "kill-the-winner" dynamics in bacterial communities caused by phage predation (at least at strain and sub-strain level 16); diversifying selection, acting upon both adaptive mutations 17,18 and phase variations 19,20; and phage-mediated horizontal gene transfer (HGT), which could involve diverse mechanisms such as generalised, specialised and lateral transduction [21][22][23].
Our current understanding of the virome, and the phageome in particular, is limited and based mostly on sequencing-based studies of faecal samples, which represent static snapshots of the distal gut virome. Neither the temporal dynamics 16, nor the variation and flux of viral populations along the longitudinal and transverse axes of the GIT (the "viral biogeography" of the gut 24,25), have received proper attention. Recent human cohort studies highlighted a tight association between the gut virome and gut bacteriome in terms of both α- and β-diversity 16,26. Additionally, multiple lines of evidence suggest that many successful gut bacteriophages, such as the crAss-like phages, engage in long-term, persistent relationships with their hosts 27,28, in line with the "piggyback-the-winner" dynamics of temperate bacteriophages 29. It is important to obtain a more detailed view of both the temporal and spatial dynamics of the virome in order to understand its interplay with the bacterial microbiome, its significance for human health and its potential role in disease 30,31.
From a microbial perspective, the GI tract is a challenging and competitive environment with multiple biotopes and ecological niches occupied by different microbial taxonomic groups. The complex macro- and micro-anatomy of the alimentary tract, together with the exocrine functions of the GIT mucosa and accessory organs, creates a series of longitudinal and radial biochemical gradients, affecting the composition of the local resident microbiota, including viruses 24,32,33. Adaptation to such microhabitats is clearly evident amongst bacteria, such as body site-specific lactobacilli 34 or various mucin-foraging bacteria 35. Host-associated mutualistic and commensal bacteria have evolved persistence mechanisms such as adsorption and embedding into mucus layers, and potentially have access to anatomical sites protected from the luminal stream and the action of bacteriophages 24,33. On the other hand, the ability to bind to and accumulate in the mucus layer, restricting bacterial invasion, was also reported for certain bacteriophages, which prompted a discussion on the role of bacteriophages as a quasi-immune system of the alimentary tract [36][37][38]. Pronounced physiological and anatomical differences between homologous GIT segments in different species of mammals, associated with digestive adaptations, add another layer of complexity to this system 39.
In this study, we present the first comprehensive biogeographical analysis of viruses in the GIT of two mammalian species, the domestic pig (Sus scrofa domesticus) and rhesus macaque (Macaca mulatta). We focused our attention on bacteriophage populations and attempted to answer three key questions. Firstly, what are the differences in virome composition between different alimentary tract regions, and how representative are distal gut samples of the virome in the upper GIT? Secondly, to what extent can the virome be shared between the alimentary tract regions and with extra-GIT organs? Lastly, is there a correlation between bacterial and phage biogeography in the alimentary tract, and do the data support the concept of mostly temperate, "piggyback-the-winner" type phage-host ecological dynamics rather than "kill-the-winner" dynamics?

Results
Virome sequencing approach. We applied shotgun metagenomic sequencing of VLP-enriched samples [40][41][42] to characterise luminal/mucosal viral DNA and RNA content in different locations along the digestive tract. In order to adopt a broader taxonomic outlook and gain insights into spatial virome organisation that go beyond the physiological and anatomical specifics of a particular mammalian species, we included six healthy domestic pigs (Sus scrofa domesticus) and six rhesus macaques (Macaca mulatta). Thirteen anatomical locations were sampled for each species, including skin, tongue, stomach, small intestine (SI; proximal [duodenum], medial [jejunum], and distal [ileum]), caecum, large intestine (LI; proximal, medial, and distal colon), as well as liver, lung and spleen. At relevant sites, both the mucosa and the surrounding luminal content were sampled (Fig. 1; Table S1). Given the overwhelming prevalence of bacteriophages in mammalian faecal viromes 16,40, their possible role in shaping the gut bacteriome 15,18,43 and a lack of knowledge on their spatial distribution and population dynamics in the GIT 9,33, bacterial viruses were the primary focus of this study.
Genomic DNA and cDNA of mixed viral populations were sequenced using the Illumina NovaSeq platform to a median depth of 6.2 ± 5.8 M reads per sample (median ± IQR; Fig. S1). Unlike many previous studies, our viral metagenomics approach was designed to be relatively unbiased. A simple nucleic acid extraction procedure was adopted that deliberately avoided the use of micro-filtration, VLP precipitation using PEG/NaCl, chloroform extraction, or density gradient ultracentrifugation, all of which are known to introduce different biases in virome profiling studies 41,42,44. By avoiding whole-genome amplification we also avoided artificial skewing of virome composition, loss of viral diversity, and over-amplification of small circular ssDNA genomes 16,45. Lastly, including lactococcal phage Q33 as an artificial internal viral standard in our extraction procedure allowed us to estimate the absolute abundance of viral genomes in each sample by comparing their mean sequence coverage with that of the internal standard 16,42.
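The spike-in quantitation described above can be illustrated with a minimal sketch. It assumes that the number of Q33 genome copies added to a sample and the sample mass are known; these inputs, the function name and the example values are hypothetical and are not taken from the study, which states only that mean coverage was compared with that of the internal standard.

```python
def estimate_genome_copies_per_gram(scaffold_coverage, spike_coverage,
                                    spike_copies_added, sample_mass_g):
    """Scale a scaffold's mean coverage by that of the spike-in standard.

    Assumption (not from the source): a known number of spike-in genome
    copies was added to a sample of known mass, and coverage is
    proportional to genome copy number.
    """
    if spike_coverage <= 0 or sample_mass_g <= 0:
        raise ValueError("spike coverage and sample mass must be positive")
    copies_in_sample = scaffold_coverage / spike_coverage * spike_copies_added
    return copies_in_sample / sample_mass_g


# Hypothetical example: a scaffold covered 5 times more deeply than the
# standard, with 1e6 standard copies added to 0.5 g of gut content.
example_load = estimate_genome_copies_per_gram(50.0, 10.0, 1e6, 0.5)
```

A real pipeline would also need to account for genome type (ssDNA, dsDNA, RNA) and extraction efficiency, which this sketch ignores.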
Assembly of reads into contigs and scaffolds 46, removal of redundancy across individual samples and animals 16, and selection of viral sequences from a background of bacterial and mammalian host DNA yielded a catalogue of 70,615 scaffolds, corresponding to putative complete and fragmented viral genomes (Fig. S2; Datasets S1 and S2). At least 23 families of prokaryotic and eukaryotic viruses 47,48 were recognised across the viromes of the two mammalian species (Fig. S3). Approximately half (36,024) of the scaffolds were broadly similar (≥50% sequence identity) to previously reported genomes of either cultured or uncultured viruses [49][50][51][52], but the remaining half were only identifiable as viral at the level of encoded protein sequences 53. However, even among sequences homologous to previously reported viral genomes and genome fragments, the vast majority (72% of scaffolds with hits to the IMG/VR database) constitute novel viral species by the recently proposed standard of metagenomic viral species delineation (≥95% sequence identity over ≥85% of its length) 54.
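As an illustration of the species-level threshold quoted above, the check can be sketched as a simple predicate. This is a sketch under stated assumptions rather than the pipeline used in the study: the function and parameter names are hypothetical, and the 85% length criterion is assumed here to apply to the query scaffold.

```python
def same_viral_species(identity_pct, aligned_len, query_len,
                       min_identity=95.0, min_aligned_pct=85.0):
    """Return True if an alignment meets the >=95% identity over >=85%
    length criterion.

    Assumption: aligned_len and query_len are in nucleotides and the
    length fraction is measured against the query scaffold.
    """
    if query_len <= 0:
        raise ValueError("query length must be positive")
    aligned_pct = 100.0 * aligned_len / query_len
    return identity_pct >= min_identity and aligned_pct >= min_aligned_pct


# Hypothetical hits for a 10 kb scaffold against a reference database.
hit_same = same_viral_species(96.0, 9000, 10000)   # meets both thresholds
hit_novel = same_viral_species(96.0, 8000, 10000)  # aligned fraction too short
```

Under this reading, a scaffold with no database hit satisfying the predicate would be counted as a novel species.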

Absolute viral counts along the GIT proximal-distal axis.
Absolute quantitation of viral genomic scaffolds with ≥50% calculated completeness (n = 2,331), grouped at viral family level, revealed pronounced differences in the virome between GIT locations, as well as differences between the two animal species. The pig LI lumen is dominated by tailed bacteriophages (Siphoviridae, Myoviridae, Podoviridae, crAss-like phages 9,52,55), with total viral loads approaching 10^10 genome copies g^-1 of contents. Similar total counts are evident in macaques, although small ssDNA Microviridae phages 56 are the most numerous group of taxonomically classified viruses (Fig. 1). Total viral loads in large intestinal mucosa samples were three orders of magnitude lower than in matched luminal samples, and eukaryotic viruses (families Circoviridae, Astroviridae, Caliciviridae and Parvoviridae) had higher relative weights in those locations. Stomach and SI lumina and mucosae were colonised by relatively even mixes of bacteriophages and eukaryotic viruses, with a characteristic prevalence of Parvoviridae in the pig small intestinal mucosa. Similar combinations of viral families were detectable in tongue mucosa and skin samples in both animal species.
Interestingly, samples taken from lung, spleen and liver parenchyma in both species contained unexpectedly high viral loads, approaching and exceeding 10^6 genome copies g^-1 of tissue. In macaques, these viral populations, which are apparently associated with the interior body milieu of healthy animals, were mainly represented by eukaryotic viruses of the Circoviridae and Caliciviridae families. In both species, and especially in pigs, the viral consortia of the interior milieu included bacteriophages, primarily from the Microviridae family (Fig. 1).
We then used all 70,615 viral scaffolds, both high quality and highly fragmented 54, to identify compositional virome differences between different body sites in both animal species (Figs. S4, S5; Dataset S3) and to compute standard α- and β-diversity metrics 57. While highly fragmented viral scaffolds are less useful for taxonomic classification and host identification purposes 16,31, omitting them from diversity analyses would leave the majority of viral diversity untapped (>50% of all Illumina reads from most body sites) (Fig. S6A,B). To compensate for inter-individual virome differences and make the virome more comparable across animal cohorts, we used gene sharing networks 58 to group all non-singleton viral genomic scaffolds (n = 12,633) into 3,888 Viral Clusters (VCs, Fig. S6C,D).

Virome composition along the GIT proximal-distal axis. Multivariate virome comparison, based on fractional abundance of VCs at different sites and the Bray-Curtis ecological dissimilarity measure, revealed a strong separation of large intestinal viromes from the small intestinal and gastric viromes in both animal species (Fig. S7, Fig. 2). When viewed across the two species, differences between organs were responsible for 8% of variance (main effect, ADONIS with 1000 permutations, p ≤ 0.001). A similar fraction of variance was explained by animal species (p ≤ 0.001). Surprisingly, inter-individual virome differences accounted for 14% of variance. This is despite the fact that within each cohort animals were relatively inbred, lived in the same facility and were fed a standardised diet. Moreover, between-organ variance in interaction with the individual animal factor accounted for 19.6% of virome data variance (p ≤ 0.001), more than the percentage of variance explained by the similar interaction between organ and animal species factors (6.1%, p ≤ 0.001). Differences between mucosal and luminal viromes explained only a relatively minor fraction of variance (2.1% for the main effect, 2.6% and 1.8% in interaction with organ and animal species respectively, p ≤ 0.001). The major compositional separation axis between viromes of LI, SI and other organs seems to be closely aligned with overall diversity and total viral load (p ≤ 0.001 in PERMANOVA), with caecal and LI viromes being simultaneously the most taxonomically diverse and the most populous (Fig. 2, Fig. S8).
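For readers less familiar with the diversity measures used here, the following minimal sketch computes the Bray-Curtis dissimilarity between two samples and the Shannon index of a single sample from fractional VC abundances. It is an illustration only: the VC identifiers and abundances are hypothetical, and the study's own analysis (ordination, ADONIS/PERMANOVA) is not reproduced.

```python
import math


def bray_curtis(sample_a, sample_b):
    """Bray-Curtis dissimilarity between two abundance profiles.

    Each sample is a dict mapping a VC identifier to its abundance;
    VCs missing from a sample are treated as having zero abundance.
    """
    keys = set(sample_a) | set(sample_b)
    diff = sum(abs(sample_a.get(k, 0.0) - sample_b.get(k, 0.0)) for k in keys)
    total = sum(sample_a.get(k, 0.0) + sample_b.get(k, 0.0) for k in keys)
    return diff / total if total > 0 else 0.0


def shannon_index(sample):
    """Shannon diversity (natural log) of one abundance profile."""
    total = sum(sample.values())
    if total <= 0:
        return 0.0
    return -sum((v / total) * math.log(v / total)
                for v in sample.values() if v > 0)


# Hypothetical fractional abundances of three VCs at two gut sites.
caecum = {"VC_1": 0.5, "VC_2": 0.5}
ileum = {"VC_2": 0.5, "VC_3": 0.5}
dissimilarity = bray_curtis(caecum, ileum)
diversity = shannon_index(caecum)
```

A value of 0 indicates identical profiles and 1 indicates that the two samples share no VCs.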
In a single macaque (M6) and a single pig (E6), all mucosal sites were sampled twice, with 1 cm separation between each pair of samples, to assess whether close proximity of mucosal sites in the gut correlates with increased similarity of virome composition. Within the SI and LI, there was a tendency for these paired samples to resemble each other more closely than more distant sites within these and other animals, but this did not reach statistical significance (Fig. S9).
We attempted to identify specific VCs driving the separation between organ-specific viromes, as well as VCs responsible for separation between luminal and mucosal viromes and between the two animal species (Figs. S10-S12). Across the two species, a total of 676 VCs were differentially abundant between organ pairs in the following sequence: tongue-stomach-SI-caecum-LI (p < 0.05 in Kruskal-Wallis test with FDR correction), with the largest fraction of these VCs (n = 632) being discriminatory between the SI and caecum/LI (p < 0.05 in a post hoc Mann-Whitney test with FDR correction). To put this finding into the perspective of a complete GIT microbiome, we looked for correlations between fractional abundances of differentially abundant VCs on one hand, and those of bacterial genera (obtained using amplified 16S rRNA gene-based microbiota profiling, Table S2) on the other (Fig. S13). We observed that many of the organ-discriminatory VCs were in fact positively correlated (Spearman ρ ≥ 0.6; p < 0.05) with bacterial genera characteristic of a particular segment of the GIT (Fig. S14), further supporting the tight association of bacterial viruses with their bacterial hosts in the gut suggested in previous studies 8,16,26.

Sharing of individual viral species between different regions in the GIT.
Having observed this partial separation of GIT sites by virome composition, we reasoned that there should be extensive sharing of individual viral species/strains between multiple GIT sites in each of the animals. To investigate this we returned to the level of individual viral scaffolds and visualised their sharing between organs in a particular sequence (Fig. 3). Agreeing with the individualised nature of gut viromes demonstrated above, patterns of viral scaffold sharing between different organs were also unique, not only between pigs and macaques, but also between individual animals within each cohort (Suppl. Files 1-14). Despite that, common trends in viral sharing between organs could also be easily observed (Figs. S15, S16). As shown in the example of pig E1 and macaque M1, high-diversity populations of LI bacteriophages (Fig. 2) are also efficiently shared between all locations in the caecum and colon (Figs. 3 and 4). Specifically, in pig E1, 484-578 Siphoviridae, 170-287 Myoviridae, 117-131 Podoviridae and 98-105 crAss-like phage genomic scaffolds are shared in sequence between sites from the caecum to the distal colon (in luminal and mucosal samples together). By contrast, only seven Siphoviridae and seven Myoviridae scaffolds detected in the distal SI are represented in the caecum. Additionally, a number of gastric viral scaffolds not seen in the SI were also detectable in the caecum. In this example, a single Astroviridae genomic scaffold found in the lower and upper gut was also detectable in the stomach and on the animal's skin, whereas a Caliciviridae genome was detectable in the lung, spleen and on the skin. Interestingly, two scaffolds, a Microviridae phage and a porcine circovirus, were shared between liver parenchyma and the skin, while a single Myoviridae genomic scaffold from the stomach was also detectable in the liver and spleen (Fig. 3).
In the example with macaque M1, even more extensive sharing of phages and eukaryotic viruses was observed between the stomach, gut and the parenchymal organs, suggesting that extensive translocation and systemic circulation of bacteriophages from the gut is indeed possible (Fig. 4).
Overall, across the two animal species, we observed extensive directional sharing of viral genomic scaffolds within the upper and lower intestine. Between 17.5 and 34.9% of caecal viral diversity is represented in the distal colon, while 19.2-60.7% of viral diversity in the proximal SI is also seen in the caeca (Figs. S15, S16, S17). Similarly, 12.5-16.2% of stomach viral genomic scaffolds end up being detected in the caeca. Oral viromes are well connected with the stomach, with 23.3-41.4% of diversity being shared. However, only 3.8-5.6% of tongue-associated viral scaffolds are detectable in the caeca (2.3-3.1% in the distal colon).
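The directional sharing percentages reported above can be read as the fraction of scaffolds detected at an upstream site that are also detected at a downstream site. A minimal sketch of that calculation follows, assuming presence/absence calls per site are already available; the scaffold identifiers and the function name are hypothetical, and any detection thresholds used in the study are not modelled.

```python
def percent_shared(source_scaffolds, target_scaffolds):
    """Percentage of scaffolds present at the source site that are also
    present at the target site (directional: source -> target)."""
    source = set(source_scaffolds)
    if not source:
        return 0.0
    shared = source & set(target_scaffolds)
    return 100.0 * len(shared) / len(source)


# Hypothetical presence lists for two sites in one animal.
caecum_scaffolds = ["s1", "s2", "s3", "s4"]
distal_colon_scaffolds = ["s2", "s4", "s5"]
caecum_to_colon = percent_shared(caecum_scaffolds, distal_colon_scaffolds)
colon_to_caecum = percent_shared(distal_colon_scaffolds, caecum_scaffolds)
```

Because the denominator is the source site's richness, the measure is asymmetric, which is why the two directions in the example differ.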
The data provide convincing evidence for translocation of some alimentary tract bacteriophages across healthy gut epithelia 59, ending up in the internal organs (liver, lung, spleen), presumably via micropinocytosis 59, the portal vein (liver), the lymphatic system, or perhaps via regurgitation of stomach contents (lung). Specifically, the livers of both animal species, as well as the lungs and spleens of macaques, shared with the intestinal sites not only eukaryotic viruses (Circo-, Calici-, Parvoviridae) but also small genomes of Microviridae phages (Fig. S17, Suppl. Files 13-14). By contrast, the lungs and spleens of pigs E2, E4 and E6 contained much higher diversities of phages originating from stomachs and oral cavities (Suppl. Files 2, 4, 6). Some of the most ubiquitous phage genomic scaffolds are listed in Fig. S18. Eukaryotic viruses shared between different anatomical locations are discussed in Supplemental Results and are summarised in Figs. S19 and S20.

Virus-to-bacteria correlations in the GIT.
Having obtained evidence for the continuous presence of individual viral genomic scaffolds along the proximal-distal axis of the GIT, we attempted to detect correlations in fractional abundances of individual viral scaffolds and bacterial OTUs across all anatomical locations on a per-animal basis. Overall, we detected 185 ± 175 (median ± IQR) viral scaffolds per animal strongly associated with bacterial OTUs (Spearman ρ ≥ 0.7; p < 0.01 with Bonferroni correction). Detected correlations were overwhelmingly positive (Fig. 5A-C; Figs. S21-S22), agreeing with previous reports of mostly persistent and temperate-like interactions of gut phages with their bacterial targets [27][28][29]. To further confirm these observations and filter out cases of correlation unrelated to direct phage-host relationships, we performed a more focussed analysis of the phage-host pairs for which correlation with a particular host agrees with host prediction (to genus level) from viral sequence analysis. Such prediction was based on: a) host assignment for closely related viral homologs from the IMG/VR database 49; b) CRISPR spacer matches 60; c) detection of homologous prophages in members of the same bacterial genus; or d) matches of viral tRNA gene copies with those carried by bacterial genomes of the same genus (Table S3). Even with this focussed view the observed correlations were mostly positive, with a few notable exceptions (Fig. S23). As shown in Fig. 5D-F, a single Eubacterium OTU and a small 17 kb Podoviridae phage predicted to infect Eubacterium engage in almost mutually exclusive behaviour in the GIT of pig E2. By contrast, multiple Prevotella, Faecalibacterium and Ruminococcus OTUs in macaques M3 and M6 appear to be strongly and positively associated with their cognate phages belonging to the Microviridae, Siphoviridae and crAss-like phage families.
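The correlation screen described above can be sketched in plain Python as a Spearman rank correlation followed by a Bonferroni-adjusted significance filter. This is an illustrative sketch, not the study's code: p-values are assumed to come from a separate statistical routine, the helper names are hypothetical, and applying the ρ threshold to the absolute value (so that strong negative pairs are also retained) is an assumption.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have equal length of at least 2")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    if var_x == 0 or var_y == 0:
        return float("nan")  # undefined when one profile is constant
    return cov / math.sqrt(var_x * var_y)


def passes_filter(rho, p_value, n_tests, rho_min=0.7, alpha=0.01):
    """Strong correlation that survives Bonferroni correction.

    Assumption: the rho threshold is applied to |rho|.
    """
    return abs(rho) >= rho_min and p_value < alpha / n_tests


# Hypothetical fractional abundances of a phage and an OTU at four sites.
phage = [0.01, 0.02, 0.10, 0.20]
otu = [0.05, 0.07, 0.30, 0.40]
rho = spearman_rho(phage, otu)
```

With only a handful of sites per animal, rank correlations are coarse, so the multiple-testing correction carries much of the weight in such a screen.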

Discussion
Recent studies have observed correlations in gut bacteriome and phageome composition and claimed associations between altered virome composition and GIT diseases in humans 30,31. It has been speculated that phages could play a decisive role in controlling bacterial population density and structure via "kill-the-winner" or similar types of ecological dynamics 13,14,61. Indeed, in simplified microbiota models, exponential growth of phage under optimal conditions can lead to rapid collapse of sensitive bacterial populations 62, resulting in cascades of knock-on effects in non-susceptible bacterial populations via inter-bacterial interactions 15. In addition, limited evidence from animal and human virome transplantation studies suggests that invading native gut microbiomes with allochthonous phage consortia can cause significant shifts in bacteriome composition, and can even affect the physiological status of the recipient mammalian host [63][64][65].
There is also convincing evidence that points toward a much less disruptive role of phages in the microbiome, in that most numerically prevalent phage types are either temperate (existing in the form of prophages as well as free viral particles), or have evolved to support long-term, stable persistence in the microbiome with only limited effects on the density of bacterial host populations 66. A number of potential persistence mechanisms have been proposed, including phase variation of phage receptors in bacteria 19,27 leading to herd immunity 20, or physical segregation of mucus-embedded sensitive bacteria from luminal phages (the "source-sink" model) 33. It would be impossible to fully understand the dynamics of phage-host interaction, and therefore the role of phages as either "drivers" or "passengers" in real-world complex microbiomes, without a detailed map of the virome in both temporal and spatial (biogeographic) dimensions. In this study we provide such a map for two mammalian species, pigs and macaques.
From a technical perspective, the study was designed to minimise the biases typically associated with virome analysis 41,46,67. We used unamplified nucleic acids and assembly-based cataloguing of novel viruses, coupled with quantitation by comparison against a spike-in viral standard. We also clustered individual sequences into VCs to allow us to robustly detect and quantify both known and novel viruses with DNA or RNA genomes (Figs. S1-S7). Unlike many previous studies 67, we revealed an abundance of RNA viruses, including novel phages belonging to the class Leviviricetes, and novel mammalian Astroviridae and Caliciviridae (Fig. S8). Small ssDNA Microviridae phages were found to be a dominant group in the macaque colon, a finding that previously would have been dismissed as a result of DNA amplification bias 16,45. A limitation of this assembly-based approach was, however, that we almost certainly missed some of the low-abundance viruses seen in a previous study of the macaque virome 25.
We interrogated our data to find answers to the three key questions posed at the outset of the study. We looked at the differences in phageome composition between different regions of the alimentary tract in both pigs and macaques. As expected, the vast majority of phage biomass and diversity was concentrated in the colonic lumen (peaking in caeca), reflecting the dense community of bacterial prey at that site (Figs. 1 and 2). Upper GIT viromes were distinctly different and reflective of differences in bacteriome composition between different GIT regions (Fig. 2, S14). Interestingly, distal gut luminal viromes appeared to be very homogeneous from caecum to distal colon, and compositionally much more reflective of an individual animal than of a particular location in the colon (Fig. 2). This is in good agreement with results on bacterial biogeography in the macaque gut reported by Yasuda et al., who observed predominantly inter-individual, and less location-specific, variation of luminal microbiota in jejunal, ileal, and colonic sites 68. The same authors noted significant differences between luminal and mucosal microbiota in the same locations, with the latter being more influenced by biogeography than by the individual animal. In line with this, we observed enrichment of facultative anaerobic bacteria (Campylobacteriaceae, Helicobacteriaceae) in mucosal samples from pigs and macaques in bacteriome analysis (Fig. S13). However, on the virome side, this did not manifest in accumulation of phage VCs specifically associated with this type of habitat. Instead, mucosal samples show depletion of numerous phage VCs (Fig. S12), drastically reduced viral load and increased prevalence of viruses infecting mammalian cells (both in relative and absolute terms, Fig. 1).
These results support a recently proposed "source-sink" model 33, which argues that exclusion of bacteriophages from the mucus layer creates a refuge for bacterial cells, allowing co-existence of virulent phage and sensitive bacterial cells in close proximity. At the same time, this apparently disagrees with an earlier "bacteriophage adherence to mucus" (BAM) model 36, which argued that accumulation of bacteriophages and an increased virus-to-microbe ratio (VMR ~ 39:1) in the mucus creates a barrier limiting bacterial invasion and segregating the bacterial population to the luminal space. In the absence of quantitative data on bacteria, our study cannot speak to the VMR in the lumen and mucosa. The BAM model can therefore still accommodate our results, with the caveat that certain bacteriophages possessing the Ig-like protein domains required for binding to mucus 36 are equally abundant in the mucus and in the lumen, while phages lacking this ability are excluded into the luminal space. One can envisage complex scenarios of phage-host interaction in the GIT, with some phage-host pairs following "source-sink" dynamics, while others show behaviours more consistent with the BAM model.
We observed extensive sharing of individual viral strains throughout the entire GIT. The most prominent examples were phages found continuously in comparable quantities throughout the entire intestinal tract (Fig. S18). For the majority of strains, however, the continuous flow of phages from small to large intestine seems to be interrupted at the ileocaecal valve (Figs. 3-4, Suppl. Files 1-14). This can be explained in part by drastic differences in composition (and presumably total biomass) of bacteriomes between SI and LI, which in turn support the growth of completely different phage populations. However, a complete extinction of small intestinal phages during passage from SI to LI seems unlikely. The dilution effect, caused by the vastly larger viral biomass supported by greater numbers of bacteria, combined with limitations imposed by sequencing depth, is therefore a likely cause of the apparent disappearance of gastric and small intestinal phages in the caeca and LI.
Despite our original expectations, we could not confirm any tendency for mucosal samples taken from neighbouring sites to be closer in virome composition to each other than to more distant sites, which again suggests a relative homogeneity of the virome along the proximal-distal axis within each of the anatomically distinct alimentary tract regions. This observation calls for future longitudinal studies to examine viral flow and local temporal differences in virome composition in the gut. Agreeing with our previous longitudinal observation of the virome in humans 16, each of the individual animals in this study carried a distinctly different GIT virome, coupled to highly individualised patterns of viral distribution and sharing between different sites and organs (Suppl. Files 1-14).
Luminal samples from the distal LI, which can roughly be equated with faecal samples for the purpose of this study, are only representative of a fraction of the viral diversity present in different segments of the alimentary tract. This is especially evident in the case of eukaryotic viruses, many of which are readily detectable in colonic mucosa (Astroviridae in pigs) or SI lumen (Caliciviridae in both pigs and macaques), and in parenchymal organs such as liver, lung and spleen, but not in the distal LI lumen (Fig. S19). Interestingly, and agreeing with our earlier notion of virome individuality, each animal harboured a unique pattern of eukaryotic viruses with regard to their taxonomic composition, strain variation and biogeographic distribution (Fig. S20). The epidemiological and pathological significance of the biogeographic distribution of these common viruses in the porcine and macaque GIT (in particular porcine Astroviridae 69) is difficult to establish without further extensive population and longitudinal data collection.
One of the interesting findings in this study was possible evidence of bacteriophage translocation from the gut into the systemic circulation and eventually into parenchymal organs such as the liver, spleen and lungs. While animal dissection and sample collection for this study were conducted within a sanitary research environment, fully aseptic conditions were not possible. Therefore, it is possible that some of the viral biomass in parenchymal organs, which was orders of magnitude lower than in the gut, represents cross-contamination of solid organ samples. Nevertheless, we believe that contamination alone cannot explain our findings. Parenchymal organ viromes were dominated by eukaryotic viruses, and the phages present in them were specific strains shared with alimentary tract viromes, but not the most dominant strains. It has previously been demonstrated that at least some phage types are able to adhere to and translocate through the intestinal epithelial lining 36,59. In our study, a tendency towards enrichment for smaller phages (family Microviridae, Fig. S17) was observed in parenchymal organ viromes, which might indicate increased transepithelial diffusion of small viral particles. The exact fate of translocating phages and their systemic effects have so far remained unclear 70,71, and our observations might be informative for studying anti-phage immune responses 72.
Finally, we used correlation analysis to examine dynamic relationships between bacteriophages and their predicted bacterial hosts along the GIT proximal-distal axis. We observed that the majority of identifiable phage-host pairs had highly synchronous patterns of fractional abundance fluctuation across different body sites (Fig. 5, Figs. S21-S23). This observation implies that the replication of the majority of phages in the gut environment proceeds at levels, or via mechanisms, that do not lead to a collapse in numbers of the corresponding host bacteria. By contrast, cases of out-of-sync fluctuations of fractional abundance in phage-host pairs, implying growth of the bacterial population at sites where phage activity was low and its collapse at sites where the corresponding phage was actively replicating, were rare (Fig. 5D). These findings are indicative of weak control of bacterial population densities by phage predation in the gut, and agree with the "piggyback-the-winner" ecological model 29, which postulates that high bacterial densities in microbial ecosystems, such as the mammalian GIT, favour temperate or temperate-like behaviour of the resident bacteriophages, as opposed to the stricter population control and "kill-the-winner" dynamics imposed by phages in low-density marine environments 73.

Conclusion
We report the first comprehensive description of viral biogeography in the GIT of two large mammalian species, chosen to be phylogenetically, functionally and anatomically relevant for humans. This study employs numerous state-of-the-art methodological approaches, including bias-reducing library preparation and assembly steps, custom bioinformatic pipelines and quantitative virome profiling, to support a critical role for gastro-intestinal niches in the ecological dynamics of phage-host populations in the mammalian gut. This work also highlights dramatic under-sampling of gastro-intestinal viral communities (particularly eukaryotic viruses) with distal LI sampling (or faecal sampling) alone, and points to consistent drop-out of upper GI viral communities in colonic samples. This near-total bias against viral communities from the small intestine in faecal samples is important given that the small intestine accounts for about two-thirds of the human GIT's length 74. As such, our work joins growing research, in both viral 33 and bacterial 75,76 microbiome studies, highlighting the need for direct GIT sampling. In addition to these findings, we detected some overlap between viral communities in parenchymal organs and the GIT which was not related to their overall abundance, suggesting that there may be some degree of specificity to viral translocation. Finally, we propose that this dataset and its accompanying methodological and bioinformatic protocols may provide an important catalogue and resource for future investigators in the field.


Fig. 2. α- and β-diversity of viromes in various anatomical sites in domestic pigs and rhesus macaques. A, Constrained ordination (CAPSCALE) of Bray-Curtis dissimilarities between virome samples, based on fractional VC counts; anatomical locations, Shannon diversity index and total viral load used as constraining variables [estimated component of variation (ECV) of 11.8, 1.0, and 1.7% respectively, p = 0.001]; arrows indicate vectors of the two continuous constraining variables; B, Shannon diversity index calculated with read counts for individual viral genomic scaffolds; SI, small intestine; LI, large intestine; Prox/Mid/Dist, proximal, medial and distal portions, respectively.

Fig. 3. Sharing of viral genomic scaffolds between different anatomical sites in pig E1. The height of each vertical grey rectangle is proportional to viral richness (individual genomic scaffold counts) at that location, aggregated across luminal and mucosal samples; the thickness of coloured connectors is proportional to the number of genomic scaffolds of each viral family shared between pairs of locations; SI, small intestine; LI, large intestine; Prox/Mid/Dist, proximal, medial and distal portions, respectively; unclassified genomic scaffolds were excluded.

Fig. 4. Sharing of viral genomic scaffolds between different anatomical sites in macaque M1. The height of each vertical grey rectangle is proportional to viral richness (individual genomic scaffold counts) at that location, aggregated across luminal and mucosal samples; the thickness of coloured connectors is proportional to the number of genomic scaffolds of each viral family shared between pairs of locations; SI, small intestine; LI, large intestine; Prox/Mid/Dist, proximal, medial and distal portions, respectively; unclassified genomic scaffolds were excluded.

Fig. 5. Correlation of fractional abundance between viral genomic scaffolds and bacterial OTUs across anatomical sites. SI, small intestine; LI, large intestine; Prox/Mid/Dist, proximal, medial and distal portions, respectively. Absolute abundance of viral genomes was calculated by comparing coverage with that of the spike-in standard (phage Q33). Only genomes with >50% estimated completeness were taken into account when calculating viral loads.