Study cohort
In total, 424 stool samples from 40 different animal species, including Homo sapiens, belonging to four diverse mammalian orders were collected and microbiome-profiled (Table 1). After filtering (see Methods), the dataset included microbiota profiles from 368 unique subjects across 38 mammalian species, including four zookeepers, 44 non-zoo-keeping humans and 324 zoo animals. These animals were sampled in seven different zoos across Germany (Berlin, Neumuenster, Gettorf, Warder, Hamburg, Leipzig, Schwaigern) to analyze the effect of location and habitat as well as host phylogeny and ecology. Mammalian orders in the cohort comprise Artiodactyla, Carnivora, Perissodactyla and Primates. Artiodactyla (also called even-toed ungulates) encompass most of the world's species of large land mammals such as sheep, goats, camels, pigs, cows, and deer, of which ten species are included in this study. Six species were included from the order Perissodactyla (also called odd-toed ungulates), which consists of about 17 species of hoofed animals (e.g., horses and rhinoceroses). The order Carnivora comprises over 280 species of placental mammals, of which five were sampled in this study. Additionally, 23 different species from the order Primates (Old/New World) were analyzed, including the primate species most closely related to humans, Pan paniscus and Pan troglodytes. All stool samples, as well as the corresponding extracted DNA, are publicly available from the Institute's biobank for non-profit research purposes upon request.
Fecal microbiota composition highly reflects animals’ phylogeny and thus, also their diet
In this study, the V1V2 region of the 16S rRNA gene was sequenced from 417 fecal samples (post quality control), and ASV tables were generated and annotated to species level. This resulted in 63,780 unique ASVs and 1381 species across the 417 samples. Because some individual animals were sampled more than once, one sample per animal (or stool pool, see Methods) was selected, resulting in 368 samples from 38 species. The general prevalence of microbial species was restricted: 80.5% of microbial species were detected in less than 5% of the samples, likely reflecting the high diversity and distinctness of the various host species and their respective ecologies in the dataset [11].
As expected, the fecal microbiota profiles of the studied mammalian species followed their phylogenetic relationships regarding prevalence and abundance of microbial families (Fig. 1). The stacked bar plot in Fig. 1 illustrates the average microbiota profile of each selected host species in the study and highlights how substantially the Carnivora microbiota differ from the remaining host clades. Another interesting finding from this evaluation was the restricted abundance of Bacteroidaceae, which only showed high abundances in Homo sapiens, Callithrix jacchus, Varecia rubra and Suricata suricatta. Relatively high proportions of unclassified bacterial families were found in the orders Artiodactyla, Diprotodontia and Perissodactyla, underlining the understudied fecal microbial diversity within these orders. Evaluating the relative frequency of species in each of the five mammalian order clades, we found support for Carnivora as the limiting group to define a mammalian core microbiota. Out of 603 microbial species, five were found with a relative frequency above 50% in all groups, one belonging to each of Bacteroidaceae, Ruminococcaceae, Erysipelotrichaceae, Lachnospiraceae and unclassified Candidatus Saccharibacteria. None of the five ASVs were annotated at species level. When disregarding the Carnivora clade, 27 species met the threshold, while when Artiodactyla, Diprotodontia, Perissodactyla or Primates are left out, only 5, 6, 7 or 6 species met the threshold, respectively.
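The core-microbiota threshold described above (relative frequency above 50% in every host clade, with and without one clade left out) can be illustrated with a minimal sketch. The data layout, the function names and the toy counts below are assumptions made for illustration only; they are not taken from the study's analysis code.

```python
def relative_frequency(samples, species):
    """Fraction of samples in which `species` has a non-zero count."""
    present = sum(1 for s in samples if s.get(species, 0) > 0)
    return present / len(samples)


def core_species(groups, threshold=0.5):
    """Species whose relative frequency exceeds `threshold` in every group."""
    all_species = set()
    for samples in groups.values():
        for s in samples:
            all_species.update(s)
    return sorted(
        sp for sp in all_species
        if all(relative_frequency(samples, sp) > threshold
               for samples in groups.values())
    )


# Toy data (hypothetical): each sample maps species name -> read count.
toy_groups = {
    "Carnivora": [{"sp_A": 10, "sp_B": 0}, {"sp_A": 3, "sp_C": 1}],
    "Primates": [{"sp_A": 5, "sp_B": 2}, {"sp_A": 1, "sp_B": 7}],
}

core = core_species(toy_groups)
# Leave-one-clade-out variant: drop Carnivora and recompute.
core_without_carnivora = core_species(
    {k: v for k, v in toy_groups.items() if k != "Carnivora"}
)
```

In this toy example, removing the Carnivora group enlarges the core set, which mirrors the direction of the effect reported above.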
To evaluate if microbial species showed specificity to any single host clade or combination of host clades, we applied a multi-level pattern analysis (multipatt in R package indicspecies). We found 305 species with significant specificity (out of 602 tested) to one or a combination of host order clades, with 82 assigned to Artiodactyla, 156 to Carnivora, 114 to Perissodactyla and 83 to Primates (multipatt, p.adj<0.05, Table S1). Interestingly, a large percentage of the species associated with Carnivora were uniquely associated with this clade (139 of 156) and therefore show indicator tendency for this order; they include two species of the Escherichia/Shigella genus and 21 species of the order Clostridiales. For Primates, only 24 of the 83 species were specific to the order.
We then evaluated the phylogenetic relatedness of each species in the microbiome community and found a broad association of species abundance with host phylogeny. We calculated the mean phylogenetic distance (MPD) between all species and compared the observed phylogenetic relatedness to the pattern expected under a null community randomized while holding species richness constant, as we observed some association of host phylogeny with microbiome alpha diversity (picante::ses.mpd). We selected species present in >5 samples and regressed out the effect of location (see Methods). A total of 231 species (out of 536) showed phylogenetic relatedness (p.adj<0.05, abundance-weighted MPD model, 999 permutations, Table S2). Two Prevotella species, namely Prevotella copri and one unclassified at species level, were tested in the model and both showed significant phylogenetic relatedness (p.adj<0.05). Five species of Spirochaetes were analyzed, with three Spirochaetales showing significant phylogenetic relatedness (all unclassified at species level, two annotated as Sphaerochaeta and Treponema at genus level). The remaining two Spirochaetes showed nominal association but did not pass multiple testing correction (p.adj of 0.026 and 0.047).
Dietary preferences are one factor that strongly varies between mammalian orders and diet is believed to be one of the most important factors influencing the gut microbiome, in addition to host genetics, ecology, or habitat. However, as (i) host-phylogenetic clusters have strongly correlated dietary behaviors which largely follow host phylogeny, (ii) our dietary data is restricted to main food categories as binary data e.g. overall intake of meat and/or plant-based diet, and (iii) as dietary data is only available for approximately one third of the cohort (incl. only two Carnivora), consistent and reliable segregation of the microbial variation between diet and host phylogeny is not fully feasible using this dataset (strongly nested). Still, with a large overlap in diets for different species and some within-species differences due to between-zoo differences, we found it of relevance to consider the dietary patterns as much as the data allowed.
First, we evaluated if dietary information could explain parts of the variation in the gut microbiota community composition of the mammals. To this end, we used the multiple regression on matrices (MRM) model, as described in more detail below. Details on diet were available for 115 hosts belonging to 15 species. The data were used to generate eight dietary variables that reflect the main dietary categories of the animal diets, e.g. fruit, meat, vegetables, greenery (see Methods), and a distance matrix was calculated based on shared dietary patterns (Jaccard distance). We calculated the variation explained by phylogeny, location and diet using the 115 samples with available dietary data. In this subset of samples, both with and without the inclusion of dietary data, the effect of host phylogeny was significant (median p<0.05). Diet was not significant (median p>0.05) despite explaining 12% of the variation, probably due to the high correlation between diet and host phylogeny in general. Above we observed an association of Bacteroidaceae with host phylogeny and identified two Prevotella species showing phylogenetic association with the host (namely Prevotella copri and one unclassified at species level). Abundance patterns of Prevotella have previously been found to be associated with diet, with positive associations with fiber intake and negative associations with meat intake. Therefore, we zoomed in on these two species and evaluated the role of vegetables and meat on their abundance while considering location and host phylogeny via Phylogenetic Generalized Least Squares (PGLS) models. The unclassified Prevotella was significantly associated with meat intake (and with host phylogeny and location) (PGLS, meat p<0.05, lambda=0.34 (95% CI 0.16-0.61)), while Prevotella copri was significantly associated only with host phylogeny (lambda=0.96 (95% CI 0.90-0.98)).
At the family level, we evaluated the association of Bacteroidaceae with meat intake and identified both meat and host phylogeny as significant factors influencing its abundance across hosts (PGLS p<0.05, lambda=0.49 (95% CI 0.26-0.74)).
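The Jaccard distance on binary dietary variables mentioned above can be sketched as follows. The host labels and food sets are hypothetical, and representing each diet as a set of food categories is an assumption about the data layout rather than the authors' implementation.

```python
def jaccard_distance(a, b):
    """Jaccard distance between two sets of food categories."""
    union = a | b
    if not union:
        # Two empty diets are treated as identical (a convention chosen here).
        return 0.0
    return 1.0 - len(a & b) / len(union)


# Toy diets (hypothetical): presence of a category means it is part of the diet.
diets = {
    "host_1": {"fruit", "vegetables", "greenery"},
    "host_2": {"fruit", "meat"},
    "host_3": {"meat"},
}

names = sorted(diets)
# Symmetric host-by-host distance matrix, rows and columns ordered as `names`.
matrix = [[jaccard_distance(diets[r], diets[c]) for c in names] for r in names]
```

A matrix of this form is the kind of input that can then be compared against phylogenetic and geographic distance matrices.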
To further understand the influence of dietary preferences on microbiome diversity, we compared the alpha diversity between hosts grouped into their five orders. Shannon diversity varied between all pairs of host orders that were not both predominantly herbivores (two-way ANOVA correcting for location, q<0.001), while pairs comprising predominantly herbivores showed no significant difference (q>0.05; no difference between Perissodactyla and Artiodactyla) (Fig. 2). Carnivora (carnivores and omnivores) and Primates (predominantly omnivores) had on average approximately half the community diversity of herbivores (Artiodactyla and Perissodactyla), probably highlighting the diversity-increasing effect of higher plant and fiber intake, which requires a rich enzyme repertoire. This observation of a dietary-driven alpha-diversity pattern was further supported by a comparison of Shannon diversity between mammals grouped by their dietary behaviour (as opposed to phylogenetic order) (Fig. 2B). The observation that alpha diversity follows both the dietary behaviour and the host phylogeny of the mammals was further supported by a Phylogenetic Generalized Least Squares (PGLS) analysis: the association between alpha diversity and dietary behaviour becomes non-significant when considering host phylogeny (pgls in R package caper, lambda='ML', controlling for location, p>0.05 for both Shannon and Chao). Interestingly, within the order Primates, the alpha diversity varied greatly. At the genus level, Hylobates, Macaca, Pan and Pongo showed the highest diversity, while Varecia and Callithrix showed the lowest diversity (considering clades with min. 5 individuals, Figure S1). A similar pattern was found for richness (Chao), with the lowest diversity found in Carnivora and Primates (Figures S1 and S2). For richness, Perissodactyla showed a high diversity also compared to the other clade of predominantly herbivores (versus Artiodactyla, q=6.2×10⁻⁵).
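For readers less familiar with the two alpha-diversity measures used here, the following minimal sketch computes the Shannon index and a Chao1 richness estimate from a vector of counts. The text does not state which Chao variant was applied; the bias-corrected Chao1 form is used here as an assumption, and the function names are illustrative.

```python
import math
from collections import Counter


def shannon(counts):
    """Shannon index H = -sum(p_i * ln p_i) over taxa with non-zero counts."""
    total = sum(counts)
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def chao1(counts):
    """Bias-corrected Chao1: S_obs + F1 * (F1 - 1) / (2 * (F2 + 1)).

    F1 and F2 are the numbers of taxa observed exactly once and twice.
    """
    observed = sum(1 for c in counts if c > 0)
    freq = Counter(counts)
    f1, f2 = freq[1], freq[2]
    return observed + f1 * (f1 - 1) / (2 * (f2 + 1))


# Toy count vectors (hypothetical).
even_community = [1, 1, 1, 1]
skewed_community = [1, 1, 2, 5, 0]

h_even = shannon(even_community)        # equals ln(4) for four equally abundant taxa
richness_estimate = chao1(skewed_community)
```

Shannon diversity rewards both the number of taxa and the evenness of their abundances, whereas Chao1 estimates richness by extrapolating from rarely observed taxa.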
Host phylogeny directly dictates the animals' dietary preferences, in part by shaping their digestive abilities, such as the specialized stomach of ruminants that gives them the ability to acquire nutrients from plant-based food. The fermentation process is driven by microbial activity, and when comparing the microbial diversity between ruminants and non-ruminants in the dataset, we identified a significantly higher diversity in the ruminant mammals (Figure 2D). To evaluate if there was a detectable effect of individual food groups when controlling for phylogenetic relatedness and location, we applied a PGLS model to the 115 samples with available dietary data and evaluated the association of each of the eight food groups with Shannon diversity. The analysis detected a significant association for fruit, eggs and greenery (PGLS, p<0.05) and a trending association for multimineral/vitamin (p=0.05), with all models retaining a significant lambda, indicating a role of both host phylogeny and intake of these food groups in microbial diversity.
To evaluate the variability of microbial communities within host groups with different dietary preferences (carnivores n=6, herbivores n=84, omnivores n=278), we performed a dispersion analysis based on the Bray-Curtis dissimilarity between the hosts' microbiome compositions (betadisper in R, bias.adjust=T). The analysis was performed for microbial taxa at the ASV, species, genus and family level, and at all levels the analysis detected a significant difference in variability. However, the pattern of variability between the diet groups changed when moving from the ASV to the higher taxonomic levels. At ASV level, the carnivores had the lowest dispersion, while there was no significant difference between the herbivores and omnivores (mean distance 0.59, 0.67, and 0.68, respectively). At species level, the dispersion of the herbivores decreased drastically (mean distance 0.35), while that of the omnivores also decreased to 0.52 and that of the carnivores remained largely unchanged (0.55). In addition, there was a significant difference between the herbivores and omnivores at species level (median p=0.0017). The carnivores changed from having the lowest dispersion to having the highest, just above the omnivores, and the pattern remained stable at higher microbial taxonomic levels. Even after adjusting for sample bias, notable variation in the number of different species sampled within each host order clade remained. When calculating the difference between the herbivores and omnivores, we therefore performed 100 random samplings of five host species groups per diet group, performed the analysis of variation on each subsampling and then calculated the mean dispersion and p-value across the 100 analyses. The observed pattern is likely caused by the lower species assignment rates in the plant-eating groups as compared to carnivores; carnivores had the lowest number of ASVs with 328 ASVs, and omnivores the highest with 6198 ASVs.
At microbial species level the herbivores had 396 species (incl. unannotated) down from 3006 ASVs. The relative change in richness was very low for carnivores (1.77 times) as compared to herbivores and omnivores (7.59 and 10.58, respectively). The carnivores showed the highest percentage of annotated ASVs across microbial species, genera and family levels, followed by the omnivores (percentage annotated microbial families: 99.4% for Carnivora, 75.7% for Omnivore and 67.0% for Herbivore).
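As a quick arithmetic check, the relative change in richness for herbivores follows directly from the two counts given above (3006 ASVs collapsing to 396 species). The helper name below is hypothetical.

```python
def richness_fold_change(n_asvs, n_species):
    """Ratio of ASV-level richness to species-level richness."""
    return n_asvs / n_species


# Herbivore counts as reported in the text.
herbivore_fold = round(richness_fold_change(3006, 396), 2)
```

The result reproduces the 7.59-fold change stated for herbivores.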
Host phylogeny remains an important factor in shaping gut microbiome also for captive and geographically separated mammals
Next, we evaluated whether the variation in the gut microbiota of the mammals is mainly explained by location (given by the zoos' geographic locations and the humans' home city) or by their phylogenetic relatedness. We included one sample per individual mammal (as some mammals had been sampled multiple times) and used multiple regression on matrices (MRM), as described previously [11], to calculate how much of the microbial variation could be assigned to host phylogeny and location. The three distance matrices were based on geographic coordinates for location, patristic distances for host phylogeny and the species table for the microbiota (see Methods). To control for the effects of intra-species variation, we performed the analysis 100 times, each time with one randomly selected sample per host species. Thirty-eight host species had data points across all three matrices, and data were selected from a total of 386 samples. The analysis was performed considering both relative abundances (Bray-Curtis) and presence/absence (Jaccard) for microbiota composition. In both analyses, host phylogeny explained a significant amount of variation (median p-value<0.05, coefficient ~23% for BC and Jaccard), while the variation explained by location was not significant (median p-value>0.05, coefficient -0.03% BC, -0.09% Jaccard, Figure S3). In contrast, alpha diversity shows strong associations with both the geographic location and the phylogenetic relationships between the animals (Shannon lambda=0.77 (95% CI 0.63-0.87, lower and upper p<0.05), location p<0.05, R2=10%; Chao lambda=0.80 (95% CI 0.68-0.89, lower and upper p<0.05), location p<0.05, R2=17%).
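The resampling scheme described above (one randomly selected sample per host species, repeated 100 times, summarized by the median) can be sketched as below, together with the Bray-Curtis dissimilarity. This is only an illustration under stated assumptions: the MRM regression itself is not implemented, `toy_statistic` is a hypothetical stand-in for the statistic of interest, and all sample IDs and counts are invented.

```python
import random
import statistics


def bray_curtis(a, b):
    """Bray-Curtis dissimilarity between two abundance vectors."""
    total = sum(a) + sum(b)
    if total == 0:
        return 0.0
    return sum(abs(x - y) for x, y in zip(a, b)) / total


def one_sample_per_species(sample_to_species, rng):
    """Randomly pick one sample ID for every host species."""
    by_species = {}
    for sample, species in sample_to_species.items():
        by_species.setdefault(species, []).append(sample)
    return {sp: rng.choice(sorted(ids)) for sp, ids in sorted(by_species.items())}


def median_over_resamples(sample_to_species, statistic, n_iter=100, seed=0):
    """Median of `statistic` over repeated one-sample-per-species draws."""
    rng = random.Random(seed)
    values = [
        statistic(one_sample_per_species(sample_to_species, rng))
        for _ in range(n_iter)
    ]
    return statistics.median(values)


# Toy data (hypothetical): abundance profiles and host species per sample.
profiles = {"s1": [1, 2, 0], "s2": [0, 2, 3], "s3": [4, 0, 0]}
hosts = {"s1": "Pan troglodytes", "s2": "Pan troglodytes", "s3": "Suricata suricatta"}


def toy_statistic(picked):
    """Stand-in statistic: dissimilarity between the two selected samples."""
    a, b = (profiles[s] for s in picked.values())
    return bray_curtis(a, b)


example_bc = bray_curtis(profiles["s1"], profiles["s2"])
median_stat = median_over_resamples(hosts, toy_statistic, n_iter=100)
```

Drawing a single sample per species in each iteration prevents heavily sampled species from dominating the result, and the median across iterations gives a summary that is robust to individual draws.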
Despite the very limited variation in microbial community composition found to be explained by location when using the full host phylogenetic tree reflecting geological time and location reflecting geographic distance, further evaluation of the host phylogenetic subgroups using ANOVA and PerMANOVA (adonis) approaches did detect some influence of location. These analyses treat location and host phylogeny as categories unorganized by evolutionary distance or morphology (host taxonomy), or geographical distance (here zoo location). For host mammals grouped at genus level, the variation in microbial composition explained by location was 3.36% after adjusting for phylogeny, and variation explained by phylogeny was 34.7% (likewise after adjusting for location). These associations were highly significant (adonis2 p<0.001, species-level microbiome, 999 permutations, min. 5 animals per host group, Table S3). Visual evaluation of the community structure by host phylogeny and zoo location supported an effect of both factors (Figure 3), however only a small shift could be detected due to location. Whereas the order Carnivora again clusters decidedly different from all others, members of the Artiodactyla and Perissodactyla are more similar to each other even compared to the Primates, which displayed high variation (dispersion) among their microbial communities. Having a closer look into each order, Primates revealed a peculiar pattern in their microbial communities. Here, four human samples were included that did not belong to the human Kiel control group, but instead were sampled from two animal zookeepers from Gettorf, as well as from two workers not handling animals. 
The two samples of the animal zookeepers shifted away from the human samples from the geographically close Kiel area and towards Gettorf zoo, where they worked. For the zookeeper of lemurs, tamarins and squirrel monkeys, the shift was directed towards Saguinus oedipus (tamarins) (see Figure 3 “Primates”), indicating that their microbiomes are influenced by the animals they interact with. Otherwise, the clustering of primate species highly reflects their phylogeny, even though many of them live in different group sizes and together with many other species. A similar pattern could be observed for most other host orders, too, including e.g. sheep and goats within the Artiodactyla or the racoons within the Carnivora.
When considering the within-sample diversity (alpha diversity) using PGLS analysis, as opposed to the community composition (beta-diversity) evaluated above, support was detected for an effect of both host phylogeny and location, supporting the above observations for community structure (Shannon lambda= 0.77 (95% CI 0.63-0.87, lower and upper p<0.05), location p<0.05, R2=10%; Chao lambda= 0.80 (95% CI 0.68-0.89, lower and upper p<0.05), location p<0.05, R2=17%).
One way by which the confinement of animals to specific zoos could influence a possible phylogeny-driven microbial composition could be through local community dynamics and restricted bacterial/host dispersal between locations, eventually leading to zoo-specific microbial communities/signatures. Thus, we looked for zoo-specific microbial species within the 351 most abundant microbial species (min. abundance 0.001% in at least 3% of samples). Only four species (Alistipes finegoldii, Bacteroides stercoris, Bifidobacterium tissieri and Clostridium IV leptum) were unique to one location, namely A. finegoldii and C. IV leptum to Kiel, and B. stercoris and B. tissieri to Neumuenster. The bacteria specific to Kiel originate completely from human hosts, while the bacteria specific to Neumuenster Zoo originate from Primates and Carnivora. In Neumuenster, marmosets (Primates) hosted by far the majority of the two species (found in 17 of the 19 marmosets’ stool-pool samples), while samples from marmosets were also only available for this location. Both B. stercoris and B. tissieri were also found in one stool pool from ring-tailed coati, and B. tissieri was found in one stool pool of racoons. Both ring-tailed coatis and racoons were also sampled in other zoos (Gettorf and Berlin, respectively), indicating either some cross-host-species transfer within zoos or similarities in the hosts' ecologies. However, the overall pattern does not indicate widespread zoo-specific microbial species. A PGLS analysis of B. stercoris and B. tissieri with location confirmed the importance of host phylogeny over location (caper::pgls, lambda='ML', B. stercoris lambda=0.75 (95% CI 0.61-0.85, p upper and lower <0.05), location p=0.85; B. tissieri lambda=0.67 (95% CI 0.50-0.80, p upper and lower <0.05), location p=0.74).
PGLS cannot be used to evaluate the two taxa unique to Kiel, because the Kiel location only contains samples from one host species, which contributes all occurrences of these specific species (44 of 48).
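The abundance/prevalence filter and the search for location-unique species described above can be illustrated with a short sketch. It assumes relative abundances stored as fractions (so 0.001% corresponds to 1e-5) and a simple mapping from samples to locations; the species labels, sample IDs and values are hypothetical.

```python
def passes_filter(rel_abundances, min_abundance=1e-5, min_prevalence=0.03):
    """True if abundance >= min_abundance in at least min_prevalence of samples.

    Abundances are fractions, so 0.001% is written as 1e-5.
    """
    hits = sum(1 for x in rel_abundances.values() if x >= min_abundance)
    return hits / len(rel_abundances) >= min_prevalence


def unique_location(rel_abundances, sample_location):
    """Return the single location where the species occurs, otherwise None."""
    locations = {sample_location[s] for s, x in rel_abundances.items() if x > 0}
    return next(iter(locations)) if len(locations) == 1 else None


# Toy data (hypothetical).
sample_location = {"s1": "Kiel", "s2": "Kiel", "s3": "Neumuenster", "s4": "Neumuenster"}
table = {
    "species_X": {"s1": 0.10, "s2": 0.20, "s3": 0.0, "s4": 0.0},
    "species_Y": {"s1": 0.05, "s2": 0.0, "s3": 0.01, "s4": 0.02},
    "species_Z": {"s1": 0.0, "s2": 0.0, "s3": 0.000001, "s4": 0.0},
}

# Step 1: keep sufficiently abundant and prevalent species.
kept = {sp: ab for sp, ab in table.items() if passes_filter(ab)}
# Step 2: flag species that occur at exactly one location.
specific = {sp: unique_location(ab, sample_location) for sp, ab in kept.items()}
```

As the Kiel case shows, a species flagged this way is not necessarily driven by location: when a location contains a single host species, location and host are confounded.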
Variation in the family Hominidae
A reduced diversity and an increase in dispersion have been observed for humans as compared to closely related taxa or other mammals. Across mammalian families, Hominidae showed a highly variable diversity (Figure S1). The range overlapped with most other families but was clearly lower than that of most herbivorous families, and Hominidae showed generally higher Shannon diversity than the Carnivora and the primate families Callitrichidae (marmoset), Cebidae and Lemuridae. Our dataset includes three genera (comprising four species) within the family Hominidae. Since microbial diversity is also known to be decreased within westernized populations [15, 21], we here included another dataset comprising fecal bacterial microbiota results of a large children cohort from one of the poorest countries in the world, Guinea-Bissau (Western Africa) [25, 27]. We selected 159 individuals who were recruited at home (controls) at a minimum of 10 years of age and compared their alpha diversities with the human subjects of our study cohort. With regard to Shannon diversity, Guinea-Bissau human subjects had a significantly higher diversity compared to the German subjects (q=3.1×10⁻¹²) but lower diversity when compared to the hominid primate genera Pan (q=1.4×10⁻¹⁶) and Pongo (q=0.048) (Figure S4). As expected, human subjects from Germany had an even lower alpha diversity when compared to Pan (q=2.5×10⁻²⁸) and Pongo (q=7.4×10⁻⁶), whereas Pan and Pongo did not differ from each other (q=0.19). PGLS analysis confirmed the importance of phylogeny in shaping diversity of these four hominid species clades (PGLS with location, lambda='ML', including only German humans, Shannon lambda=0.73 (95% CI 0.33-0.95, p<0.05), Chao lambda=0.83 (95% CI 0.51-0.97, p<0.05)).
To further evaluate the microbial community (species) within the Hominidae, we evaluated the dispersion of samples in each subgroup of Pan, Pongo and humans grouped by location. The humans showed a higher dispersion as compared to Pan and Pongo, and the German subjects a higher dispersion compared to subjects from Guinea-Bissau (p=6.2×10⁻⁵, German mean distance 0.42, Guinea-Bissau mean distance 0.36).
To further explore the variation in the Hominidae, we used the indicator species analysis (multipatt, introduced above) to identify bacterial species that are specific to a certain host group (Pan, Pongo, and German and Guinea-Bissau human subjects). The analysis identified 141 species (out of 353 analyzed) with significant specificity to one or a combination of groups, with 2 assigned specifically to humans from Guinea-Bissau (out of 38 assigned to a combination that includes Guinea-Bissau), 35 specifically to humans from Germany (out of 71), 8 specifically to Pan (out of 59) and 20 specifically to Pongo (out of 76) (multipatt p.adj<0.05). Figure 4 shows the relative abundance and frequency of the 53 most significant species (p.adj<0.001); the full list is presented in Table S4. Of the 35 species specific to German humans, eight belonged to the Bacteroides clade and five to Alistipes. Prior studies comparing mammal clades and westernized with non-westernized populations reported that Spirochaetes are increasingly absent from populations consuming a westernized diet [14, 21]. The current dataset includes six species assigned to the order Spirochaetes, of which four showed significant association with non-human subgroups, while two are highly abundant and prevalent in subjects from Guinea-Bissau (potentially Brachyspira pilosicoli, B. aalborgi). Prevotella also shows an association with non-westernized subjects (Prevotella copri and one unclassified Prevotella, multipatt p.adj<0.05). In contrast to Prevotella, Bacteroides displayed a high abundance and prevalence only in human subjects from Germany (specificity to German humans, eight of 10 Bacteroides with multipatt p.adj<0.05).
These findings outline previously reported and yet unknown phenomena along gradients of westernization in humans and the hominids in general, and argue for the diverse interaction between host background, lifestyle, and microbiome as shown in previous studies on humans as well [28].
Globally, over 250 different monkey species exist, which can be divided into two main groups: the Old World monkeys, which are native to Africa and Asia, and the New World monkeys, which are native to Central and South America. Our dataset also includes various species of New (33) and Old World monkeys (52), with Old World monkeys belonging to the family Cercopithecidae and New World monkeys to the families Callitrichidae and Cebidae. Recently, Amato and colleagues found evidence that human fecal microbiota composition is more similar to that of members of the Cercopithecidae than to that of hominid apes [23]. Thus, we analyzed compositional differences using data from zoo animals instead of wild ones to reduce the effects of parasitism, environment and diet. Surprisingly, we can confirm the overall findings of Amato et al., even though zoo animals are in close contact with humans. Comparisons of the microbiota composition (species level, adonis2) between members of the Cercopithecidae, P. paniscus, P. troglodytes and human subjects from Germany and Guinea-Bissau showed a smaller difference between German humans and Cercopithecidae (R2=0.36) than between German subjects and members of the genus Pan (R2=0.40). Human subjects from Guinea-Bissau displayed a similar trend, although both differences were smaller (vs. Cercopithecidae R2=0.25, vs. Pan R2=0.30).