Factors shaping the abundance and diversity of archaea in the animal gut

1 Archaea are active members of the gut microbiome, but a thorough analysis of their diversity 2 and abundance in a wide range of animals is lacking. Here, we examined both quantitatively 3 and qualitatively the gut archaeome of 269 species from invertebrates to primates. Archaea 4 are present across many animals and mostly represented by four genera and one family of 5 methanogens, but also members of Thaumarchaeota. Five major events of adaptation to the 6 gut in the Archaea were identified. Host phylogeny, diet, and intestinal tract physiology are 7 key factors shaping the structure and abundance of the archaeome. The abundance of 8 methanogens is positively correlated with diet fibre content in mammals and 9 hydrogenotrophic methyl-reducing methanogenesis (the main methanogenesis pathway in 10 many animals) is linked to diet and methyl compounds-producing bacteria. Our results provide 11 unprecedented insights on the intestinal archaeome and pave the way for further studies on 12 their role in this environment. 13 of the absolute/relative abundance of methanogens were carried out on species where archaea have been detected. Correlation between digesta mean retention time vs Averaged absolute abundance of methanogens in Primates (n 24). Fibre consumption is significantly related to the abundance of Methanogens in primates.


14
The intestinal microbiome plays key roles in host health 1-8 . It is composed of bacteria, 15 archaea, microbial eukaryotes, and viruses/phages. Research on the microbiome of many 16 animals has unveiled features that influence the overall structure of the intestinal microbiome 17 such as diet and the ability to fly 9-12 . However, most of these studies have only targeted the 18 bacterial intestinal community. It is known that host-associated archaeal methanogens 19 produce a significant amount of methane gas in ruminants, which makes them ecologically 20 and environmentally important 13  The gut archaeome of other animals such as rats, hoatzin, pigs, seals, wallabies, 25 kangaroos, iguanas, fish, horses, and even in the tissue of sponges was examined by 26 independent studies using different molecular and cultural approaches 15,24-30 . Overall, these 27 studies reported that the most common methanogens in the gut are members of the 28

Methanobacteriales and Methanomassiliicoccales, and that the Methanosarcinales and 29
Methanomicrobiales are also present, although less frequently 13,15,31,32 . Only one study 30 addressed the distribution of intestinal methanogens in a wide variety of animals, but using 31 methane gas detection tests 33 . This study detected methanogens in a wide range of animals. 32 It also suggested that they have been acquired early in animal evolution and were completely 33 lost in some lineages such as the Carnivora. However, the methodology used in this study has 34 several limitations, as it does not provide taxonomic information and cannot detect 35 methanogenic populations with low concentrations in faeces or non-methanogenic archaea. 36 Here, we carried out a sequence-based analysis of the gut archaeome based on nearly 37 400 samples from 269 species covering a broad spectrum of animal diversity. We investigated 38 the host range of archaea in eight animal classes, identified the major gut archaeal lineages 39 and predicted the dominant methane metabolisms using both sequencing and quantitative 40 approaches. We discussed the number of events of adaptation to the gut in the Archaea, 41 including in ammonia-oxidizing Thaumarchaeota and in Bathyarchaeota, both previously 42 rarely identified in this environment. By using a large range of metadata from the literature 43 we define key factors structuring the abundance and composition of the gut archaeome across 44 the animal kingdom. 45 We collected faeces from 269 species of animals (n samples = 391) ranging from Invertebrates 48 to Mammals -the majority of which, except for birds, fish, and gastropods, came from captive 49 specimens (Table S1). We used tree approaches to characterize the archaeal community of 50 these samples: i) quantitative PCR (qPCR) targeting total Archaea, total Bacteria, and five 51 archaeal lineages previously found in the animal intestine (Methanobacteriales, 52 Methanomassiliicoccales, Methanomicrobiales, Methanimicrococcus and Thaumarchaeota), 53 and ii) 16S rRNA gene amplicon sequencing of the Archaea only and iii) of the entire microbial 54 community. We detected the presence of archaea in the gut microbiome of 175 species 55 belonging to all eight classes of animal investigated, including 14 orders of mammals ( Figure  56 1; Table S1). 57 Archaea were detected in a higher proportion of the samples when using archaea-specific 58 59 primers for qPCR (78%) or amplicon sequencing (76%) with respect to the universal prokaryote 60 primers for amplicon sequencing (44%). This difference was observed in most animal classes 61 ( Figure 1). In addition, universal prokaryote primers also captured a lower number of ASVs 62 (1.9 + 2.6 ASVs per sample) with respect to the archaea-specific primers (13.6 + 20.3 ASVs per 63 sample) (Kruskal-Wallis p = 1.65e -8 ( Figure S1), n = 218). With ~10,000 prokaryotic reads per 64 samples, the archaeal species/ASVs that represent less than 0.01% of the microbial 65 community are likely missed, which may explain both lower proportion of archaea-positive 66 animals and the lower archaeal alpha-diversity in the approach relying on prokaryote universal 67 primers. 68 69 Five major events of adaptation to the gut in the Archaea 70 The broad taxonomic coverage of the animal hosts and the use of archaeal specific primers 71 allowed us to identify archaeal ASVs belonging to 19 described families, 10 orders, 6 classes, 72 and 3 phyla. 84.9% of these ASVs (94.5% of the reads) share more than 95% identity with 73 species in the Living Tree Project (LTP, v138) database 34 amended with characterized 74 candidate species, and half of the reads share more than 99% identity with known species 75 (Table S2)  Orange arrows on the tree indicate proposed events of adaptations to the gut environment, either at the base or within displayed lineages. The histogram shows the proportion of sequences from a given lineage present in either animal digestive tract (« Gut »), open natural environment (« Environment ») or built environment (« Digester »). Circle surface area represents the percentage of reads attributed to each taxon in our study including only gut-related samples. b) Correlation between the absolute abundance of Archaea and the absolute abundance (16S rRNA copies/gram of feces) of Bacteria (black), summed methanogen lineages (Methanobacteriales, Methanomassiliicoccales, Methanomicrobiales, Methanimicrococcus; green) and Thaumarchaeota (purple), all determined by qPCR using lineage specific primers. The scale of the absolute abundance of Archaea is on panel c). Plotted samples correspond to those with amplified Archaea in Miseq, presented in panel c. c) Proportion of archaea corresponding to the dominant methanogen lineages (green), Nitrososphaeraceae (purple) and rarer taxa (light blue) in samples, based on Miseq sequencing with archaeal specific primers, according to absolute abundance of archaea in the sample (qPCR). Dots indicate the relative abundance of these three categories/lineages of archaea in each sample. Lines indicate the moving averages with a subset size of 25 samples. sequences ( Figure S2), suggesting that some general traits needed to maintain in the gut are 152 shared by these Bathyarchaeota. 153 154

Specific associations between archaea and their hosts 155
In mammals, main factors significantly affect the beta-diversity of archaea with the 156 following level of influence: host phylogeny > coefficient of gut differentiation > host diet > 157 digestive tract type, regardless of the diversity measurement used -i.e., 158 Weighted/Unweighted UniFrac, Bray-Curtis, and Jaccard ( Factors that influence the Beta Diversity of archaea in Mammals. Mammals with > 2 species per order (n = 73, unless otherwise indicated) rarefied to 3000 reads per sample were subject to beta diversity analyses. * including only zoo from which more than three samples were collected, and samples from the same species were treated separately (n = 99; df = 11). Signficant differences were tested for between beta diversity metrics using a permutational analysis of variance (PERMANOVA), p < 0.05 was considered signifcant.

162
The level of beta-diversity variance explained by host phylogeny is as high (or higher) 163 as the one previously reported for bacterial communities in mammals 9,10,47,55-58 . Specific Cetartiodactyla-associated clade ( Figure S4), suggesting that the ancestor of these two 176 Figure 3. Archaeal taxonomic diversity and abundance in the animal gut. a) Information on animal primary diet gathered using the Elton Trait database, the Animal Diversity Website database, or from specialists who provided fecal samples. Primary diet was considered food material that made up >70% of the animal's diet. b) Absolute abundance of archaea as determined by qPCR with archaea-targeting primers on a log scale. c) Observed richness (number of different ASV) of archaea. Green represents animals with low richness, black represents medium richness ~ 25; and red represents high richness >50 archaeal ASVs. d) Taxonomic diversity of archaea in the animal intestinal microbiome. Samples were rarefied to 3000 archaeal reads. e) Predicted methane metabolism, assigned to ASVs based on taxonomic annotation (Table S4).  Equidae (order Perissodactyla) and Bovidae (order Cetartiodactyla) within mammals ( Figure  203 3b; Table S1). Conversely, we found comparably low levels of archaeal richness within the 204 Aves and Actinopterygii. 205 206

Strong influence of diet on methanogen abundance and composition 207
Diet is another important factor affecting the gut archaeome, both in terms of alpha-208 diversity, beta-diversity (Table 1; Figure S9) and abundance ( Figure 6). 209 Figure 6: Influence of host diet-type, diet-fibre content, and mean retention time on the absolute abundance of total methanogens, Thaumarchaeota and Bacteria. Abundance of a) total methanogens (n = 161), b) Thaumarchaeota (n = 116) and c) Bacteria (n = 223) according to host diets-type. Significant differences across all groups were determined via the Kruskal-Wallis test, with p < 0.05 set as significant. Wilcoxon rank sum test with continuity correction was used to determine differences between diet types *: p < 0.05; **: p < 0.01; ***: p < 0.001; ****: p < 0.0001. Correlation between diet-fibre content and absolute abundance of d) methanogens (n = 103), e) Thaumarchaeota (n = 55) and f) Bacteria (n = 107) in mammal species. g) Dietary fibre versus averaged absolute abundance of methanogens in Primates (n = 12). Mean retention time is significantly related to the abundance of methanogens in primates. Statistical analyses and representation of the absolute/relative abundance of methanogens were carried out on species where archaea have been detected. h) Correlation between digesta mean retention time vs Averaged absolute abundance of methanogens in Primates (n = 24). Fibre consumption is significantly related to the abundance of Methanogens in primates.
Indeed, herbivorous animals have a higher number archaeal ASV than carnivorous and 210 omnivorous animals ( Figure S9). Moreover, the absolute and relative abundance of 211 methanogens is higher in animals with a plant-based diet (e.g., leaves, fruits) than in animals 212 feeding on meat or insects, and their abundance is intermediate in omnivorous animals 213 ( Figure 6a). This link between methanogen abundance and diet-type is further supported by 214 the positive correlation of both the absolute and relative abundances of methanogens (but 215 not of Thaumarchaeota, and very weakly for bacteria) with the fibre content of the diet ( Figure  216 6d-f; Figure S10). The increase in methanogen absolute/relative abundance reaches a limit at 217 around 200 g of crude fibre/kg of dry matter (Figure 6d; Figure S10). At lower host taxonomic 218 level, the positive correlation also holds for Primates, for which we sampled species with 219 contrasted average fibre intake ( Figure 6g). An increased diet fibre content was previously 220 reported to be associated with a higher expression level of methanogenesis genes in humans 221 59 and a greater methane production in pigs 60 and ruminants 61 . As the vast majority of 222 intestinal methanogens are hydrogenotrophic, these relationships can be explained by the 223 higher production of hydrogen from fibre/carbohydrates-rich diets (plant) than from 224 protein/fat-rich diets (meat) 62 . 225 However, the level of H 2 produced from fibre degradation also depends on which 226 bacteria are involved, Clostridiales being known to producing more H 2 than Bacteroides during 227 fibre degradation 63 . Thus, other than diet, methanogens are also influenced by the 228 composition of the bacteria degrading it. In humans, cellulolytic Ruminococcaceae 229 (Clostridiales, Firmicutes) spp. have been reported to be present in the gut of methane 230 producers, while cellulolytic Bacteroides spp. prevail in non-methane producers 63 , and 231 methanogens are enriched in subjects with Firmicutes/Ruminococcaceae enterotype 64 . We 232 found that eight Ruminococcaceae OTUs (including six from uncharacterized genera) co-occur 233 with methanogens, and -more generally-19 out of the 30 bacterial OTUs positively associated 234 with methanogens belong to Clostridiales and only four to Bacteroidales (Table S3, 235 Supplementary text). Other than benefiting from fibre degradation, methanogens can also 236 favour it by stimulating microbes involved in its degradation. Indeed, the presence of 237 methanogens in cocultures has been shown to increase the level of extracellular 238 polysaccharide-degrading enzymes of Ruminococcus flavefaciens 65 . 239 The abundance of hydrogenotrophic methyl-reducing methanogen lineages (i.e., 240 Methanomassiliicoccales and Methanimicrococcus) is less influenced by fibre content than 241 lineages that include hydrogenotrophic CO 2 -reducing methanogens (i.e., Methanobacteriales 242 and Methanomicrobiales; Figure S11). Moreover, hydrogenotrophic methyl-reducing 243 methanogens represent a lower proportion of the methanogens in herbivorous animals than 244 in animals having another type of diet (p = 0.003). As methyl-reducing methanogens depend 245 on different methyl-compounds (e.g. methanol, methylamines) for their energy metabolism 246 and because they can utilize hydrogen at lower concentration than CO 2 -reducing 247 methanogens 66 , their distribution may be more affected by the availability of methyl-248 compounds than by fibre content. One of these methyl-compounds, methanol, is produced 249 by the bacterial degradation of pectin 67 . This metabolism was shown to occur in the animal 250 gut (e.g., human, pigs, lemurs, ruminants) as revealed by the identification of bacteria with a 251 methylesterase activity 68,69 and by the increase in methanol concentrations in response to 252 pectin consumption 70-72 . Our data show that the ratio of hydrogenotrophic methyl-reducing 253 to CO 2 -reducing methanogens is higher in frugivorous species than in herbivorous ones (p = 254 0.005), which is likely related to large amounts of pectin in fruits. This support a previous 255 hypothesis that the high relative abundance of Methanosphaera stadtmanae (an obligate 256 methanol-reducing methanogen) in orangutan is related to their high fruit consumption 73 . 257 We also found a high relative abundance of hydrogenotrophic methyl-reducing 258 methanogens in most of the sampled Primates (Figure 3e), and particularly in Lemuridae, 259 which may be related to the presence of fruits in their diet (Table S1) present in various diets 81-83 and pectin is not limited to fruit but is also a constituent of the 276 plant cell wall 84 , which therefore do not limit the presence of hydrogenotrophic methyl-277 reducing methanogens to frugivorous animals. In our dataset, hydrogenotrophic methyl-278 reducing methanogens constitute almost 40% of the overall methanogen reads (Figure 7a; 279 Table S4) and represent a large fraction of the methanogens in many animals (Figure 3e; 280 Supplementary text). This contrasts with many non-host environments (e.g. sediments, peat 281 Figure 7: Main methanogenesis pathways in the animal gut. a) Proportion of the total archaeal reads that are assigned to taxa with a predicted CO 2 -dependent hydrogenotrophic methanogenesis (H 2 + CO 2 ; blue) or methyl-dependent hydrogenotrophic methanogenesis (CH 3 -R + H 2 ; orange) pathway. Methanosarcina spp. can have diverse methanogenesis pathways (the two above-mentioned pathways and the methyl-dismutation (or methylotrophic) and acetoclastic pathways. b) Diagram of the most favourable methanogenic metabolisms depending on methanol concentration (C(methanol) in mol/l) and hydrogen partial pressure (p(H 2 ) in bar). Coloured area in the map indicate ranges of C(methanol) and p(H 2 ) for which either CH 3 -R dismutation, CH 3 -R + CO 2 or CO 2 + H 2 is the most favourable pathway, i.e. concentrations and pressure ranges for which the associated ∆G cat expressed in kJ/mol CH 4 is the most negative. ∆G cat values were calculated for T = 298 K, pH = 7 and p(CO 2 ) = p(CH 4 ) = 10 -1 bar, when the difference in ∆G cat between two or three catabolisms was less than 10 kJ/mol CH 4 , catabolisms were then considered to be equally favourable. This corresponds to central

Impact of digestive tract physiology 311
Both the coefficient of gut differentiation 86 (i.e., proportion of the gut dedicated to 312 fermentation) and where the fermentation takes place (e.g. foregut, hindgut, caecum) explain part of the variance in of the beta-diversity (Table 1). In addition, many ASVs are almost 314 ubiquitous in the ruminant Cetartiodactyla (paraphyletic, Ruminantia and Tylopoda), but 315 mostly absent from non-ruminant Cetartiodactyla or other animals, highlighting possible cross 316 influence of gut physiology and host-phylogeny ( Figure S12). Whether these archaea found in 317 faeces originate from the rumen compartment or can colonize more largely the gut of these 318 animal is currently unknown. The total abundance of methanogens is positively correlated 319 with gut differentiation coefficient in mammals (R 2 = 0.33, p = 0.0036, n = 25), while there was 320 no correlation with abundance of Thaumarchaeota and Bacteria ( Figure S13).  (Table S1). Fresh faecal samples (n = 392) were stored at -20°C until DNA extraction. 376 Total DNA was extracted using a modified QIAamp PowerFecal DNA Kit (Hilden, Germany) protocol. Cells were lysed using the Fastprep (MP Biomedicals) cell homogenizer 'faecal 378 sample' default setting in the lysis buffer provided in the PowerfFecal DNA kit. For subsequent 379 analyses, genomic DNA was diluted ten times, to limit the effect of PCR inhibitors. 380 381

Quantitative PCR 382
Total bacteria, total archaea, and specific archaeal lineages (Methanobacteriales, 383 Methanomassiliicoccales, Methanomicrobiales, Methanimicrococcus, Thaumarchaeota) were 384 quantified using quantitative PCR with lineage specific primers (Table S5)  and resulted in more than 21 million reads and more than 16.7 million reads for the 407 prokaryotic and archaea specific sequencing, respectively. 408

Microbial Diversity Analyses 410
Reads were processed and assigned to amplicon sequence variants (ASVs) using the DADA2 411 software (v1.12.1) in R (v3.6.0). Briefly, reads were trimmed and quality-filtered using the 412 standard parameters -maximum expected errors for forward and reverse reads = 2, quality 413 score = 2, and trimming length = 273 and 170 base pairs for forward and reverse reads, 414 respectively. Forward and reverse reads were merged with a 20 base pair overlap, ASVs were 415 generated, and chimeras were discarded. ASV annotation was performed using the Silva 16S 416 rRNA database (v132). Assignment of ASVs to a main type of methane metabolism 417 (hydrogenotrophic CO 2 -reducing, hydrogenotrophic CH 3 -reducing, acetoclastic and 418 methylotrophic (methyl-dismutation)), was done based on their taxonomic affiliation, since 419 all members of almost all methanogen genera/families have the same dominant type of 420 methane metabolism (Table S4). Methanosarcina is the main exception, as species from this 421 group can have one or several types of methane metabolisms. All ASVs that were not 422 annotated as archaea were removed from the archaeal-specific primer generated sequences, 423 and ASVs annotated as archaea or bacteria were kept from the universal primer generated 424 sequences. Samples from the same species were merged by summing ASV abundances. These Methanomassiliicoccales and Thaumarchaeota. For diversity analyses, rarefaction was 431 performed to normalize sequencing depth to 3,000 reads, leading to 1,253 archaeal ASVs. 432 Bacterial ASVs were normalized to a sequencing depth of 12,000 reads per sample. Observed 433 richness (alpha diversity) was estimated and all beta diversity analyses were performed using 434 the 'phyloseq' package in R (v1.30.0). Subsequent statistical analyses were performed using 435 the base Rstudio 'stats' package (v3.6.0) as well as the R package 'vegan' (v2.5-6). To test for 436 significant differences using the various beta diversity metrics (Table 1)

Co-occurrence of Archaea and Bacteria 459
To identify co-occurrence signal between archaea and bacteria across Mammalia, Reptilia, 460 and Aves, we integrated the sequences from both the Universal and Archaea specific 16S rRNA 461 gene amplicon sequencing. Only bacterial reads were selected from the Universal 16S rRNA 462 gene amplicon sequencing for this analysis. We used VSEARCH 102 to cluster ASVs into OTUs 463 at 97% in order to reduce the size of the dataset and to filter out truly low abundance lineages 464 of microbes. Then, to merge these datasets in a way that accurately represented the microbial 465 community in terms of relative abundance between archaea and bacteria, we normalized the 466 two datasets both in terms of sequence depth and in terms of archaea-bacterial ratios -467 information which was gathered through qPCR data. OTUs that were present in less than 10% 468 of the animal classes -Mammalia, Aves, and Reptilia independently-were removed. Following 469 this, we implemented both the SPIEC-EASI (Spiec.Easi package v1.1.0, 106 ) and the SparCC 470 algorithms 107 (part of the Spiec.Easi package (v1.1.0)) in Rstudio (v3.6.0) to determine co-471 occurrence trends between archaea and bacteria. Networks were calculated with 1000 472 iterations. The output from these analyses were filtered using a 0.5 minimum threshold of 473 edge stability (SPIEC-EASI) (Table S3) and a p-value < 0.05 (SparCC), independently. Only the 474 co-occurrence patterns identified by both algorithms were further analysed. 475 476

Investigation of archaea distribution in the gut and other environment 477
All archaeal 16S rRNA gene sequences from Silva database 99 longer than 800 bp and with 478 more than 80% sequence quality, alignment quality and pintail quality were downloaded. 479 Sequences from metagenomes were removed because their environmental origin was not 480 clearly indicated. The annotation of each sequence was retrieved from GenBank and used to 481 classify them as "Gut", "Environmental" or "Human built" origin. Sequences from sponge, 482 animal environments (e.g. nest) or polluted sites (e.g. dump) were not included. For each catabolism Gibbs free energy (∆G cat ) calculations were performed using the R 495 package CHNOSZ 108 considering C(methanol) between 10 -3 and 10 -7 mol/l, p(H2) between 1 496 and 10 -7 bar, T = 298 K, pH = 7 and p(CO 2 ) = p(CH 4 ) = 10 -1 bar.