Fecal and Ruminal Microbiome Components Associated With Methane Emission in Beef Cattle.

Background: The impact of extreme changes in weather patterns in the economy and humanity welfare are some of the biggest challenges that our civilization is facing. From the anthropogenic activities that contribute to climate change, reducing the impact of farming activities is a priority, since its responsible for up to 18% of greenhouse gases linked to such activities. To this end, we tested if the ruminal and fecal microbiomes components of 52 Brazilian Nelore bulls, belonging to two experimental groups based on the feed intervention, conventional (A) and byproducts based diet (B), could be used as biomarkers for methane (CH 4 ) emission. Results: We identied a total of 5,693 Amplicon Sequence Variants (ASVs) in the Nelore bulls microbiomes from the experimental group B. Statistical analysis showed that the microbiome populations were signicantly different among treatment groups. Differential abundance (DA) analysis with the ANCOM approach identied 30 bacterial and 15 archaea ASVs as DA among treatment groups. Random forest models, using either bacteria or archaea ASVs as predictors, were able to predict the treatment group with high accuracy (r2>0.85). Association analysis using Mixed Linear Models indicate that bacterial and archaea ASVs are linked to the CH 4 emission phenotype, of which the most prominent were the ruminal ASV 40 and fecal ASV 35. These ASVs contributed to a 9.7% increase and 7.3% decrease of the variation in CH 4 emission, respectively, which indicated their potential as targets for feed interventions and/or biomarkers.

concentrates with the industrial by-products citrus pulp, corn germ, corn germ oil meal and peanut shell meal (Group B, n = 26). All animals received mineral supplements, active dry yeast, virginiamycin and monensin.
The experiment was conducted at the feedlot facility of "Embrapa Pecuária Sudeste" and lasted 105 days, which included 15 days for animal adaptation to the feedlot, 30 days for growth and 60 days for animal nishing. Feedlots were divided based on the dietary treatment and initial weights, with heavyweight and lightweight animals grouped separately ( Table 1). The facility has collective stalls with automatic GrowSafe® (GrowSafe Systems Ltd., Airdrie, Alberta, Canada) feed system, used to collect data regarding live weight and daily food consumption. The CH 4 emission during the nishing period in the feedlot was measured using the GreenFeed system (C-lock Inc., Rapid City, SD, USA). All animal data used in this study is available in Table 1. Approximately 10 g of feces were obtained from each animal two weeks before slaughtering, and 50 mL of rumen content immediately after slaughter. All samples were frozen in liquid nitrogen and permanently stored at −80 °C prior to analysis. DNA extraction was performed using the Quick-DNA™ Fecal/Soil Microbe Miniprep Kit (ZYMO Research Corp., Irvine, CA), as determined by the standard protocol. PCR target ampli cation was performed using the follow primer sets: 341-b-S-17F (3'CCTACGGGNGGCWGCAG5') and 785-a-A-21R (3'GACTACHVGGGTATCTAATCC5') for bacteria 16S rRNA gene sequences; Ar915aF (3'AGGAATTGGCGGGGGAGCAC5') and Ar1386R (3'GCGGTGTGTGCAAGGAGC5') for archaea 16S rRNA gene sequences; Reg1320R (3'AATTGCAAAGATCTATCCC5') and RP841F (3'GACTAGGGATTGGARTGG5') for protozoa 18S rRNA gene sequences. PCR conditions, sequencing libraries and DNA sequencing were performed as described in Andrade et al., (2020).
Data retrieval, pre-processing and analysis In addition to the dataset generated in this study, raw reads generated in our previous work with bulls fed conventional diet (Group A) were retrieved from the SRA database [accession number PRJNA525838], and processed to infer the impact of dietary treatments and to search for association with phenotypes.
All rRNA gene sequence reads from group A and B were ltered for quality (>Q25) and trimmed at the positions 220 (forward) and 175 (reverse) using QIIME 2 (version 2018.8) [29] . These positions were selected based on aggregation plots provided by QIIME2. The ltered data was submitted to DADA2 to generate Amplicon Sequence Variants (ASVs) with the option just-concatenate, and exclude chimeric sequences [30] . Bacterial sequences were classi ed using the SILVA database version 132 [31] , archaeal sequences using the Rumen and Intestinal Methanogen database (RIM-DB) [32] , which allows the classi cation of archaea up to the species level and protozoa using a curated database containing protozoa 18S rRNA genesequences [33] with the feature classi er plugin within QIIME 2. Rarefaction curves were generated for each dataset and used to standardize the data (Additional le 1: Figure S1A, Additional le 2: Figure S2A). The resulting ASV table was used to determine alpha (Number of ASVs and Shannon-Wiener index) and beta diversities (Unweighted Unifrac distance) with QIIME 2.

Statistical analysis
The Mixed models approach using the REML methodology was used in order to verify if the means of CH 4 emission among experimental groups was signi cantly different (P <0.05), using the dietary treatment as xed effect and the slaughter batch and residual as random effects. We assessed differences in the microbial community structure, using alpha and beta diversities and the statistical tests Kruskal-Wallis, Permanova, and Principal Coordinates Analysis (PCoA) within QIIME 2 (version 2018.8) after data rarefaction. Relative abundances were transformed using the Centered Log-Ratio method (CLR), available in the python package scikit-bio (http://scikit-bio.org/) to be used in further analysis.
We contrasted the microbiome of groups submitted to different dietary treatments using the Analysis of Composition of Microbiomes (ANCOM) V2.1 [34] , with signi cance values adjusted for multiple tests using the Benjamin-Hochberg method (α<0.05). We also applied a conservative W-statistic (W-statistic cutoff = 0.9), in which an ASV was considered as differential abundant if its composition varied when compared to 90% of the rest of the dataset, being the W-value the number of times the null hypothesis was rejected for a given ASV across two groups. ANCOM is a statistical approach that compares the abundances of each ASV individually transformed in Aitchison's log-ratio with all the remaining ASVs without any distributional assumptions.
We implemented Random Forest (RF) classi cation models to verify the use of microbiome population abundances as predictors to discriminate treatment groups. The models were trained using Python's scikit-learn package (http://scikit-learn.org/) with 70% of the data for the training set and 30% for the test set with ve-fold cross-validation, max_leaf_nodes=30 and n_estimators=100. Features and hyperparameters were optimized using scikit-learn functions Feature Importance and GridSearch.
Association analysis with CH 4 emission phenotype was conducted using the Linear Mixed Models approach, similar to Difford et al., [35] with CLR-transformed abundances. The regression model was implemented using the Python's Statsmodel package (http://statsmodel.org/) using the following formula: CH 4 daily mean (g/d) ~ CLR(ASV) + Weight group + Slaughter date Experimental group information was used as covariates, diet as xed effect and a random intercept for each group as a random effect in the statistical models. Signi cance values were adjusted for multiple tests using the Benjamin-Hochberg method with an exploratory signi cance level (α<=0.1). We considered only ASVs that presented a relative abundance higher than 0.5% and a prevalence in >10% of animals in experimental groups.

Microbiome composition
The sequencing of microbiome rRNA amplicons from ruminal and fecal samples of the experimental group B yielded a total of 10,573,763 paired-end reads (4,628,604 paired-end reads for bacteria, 4,443,390 for archaea and 1,501,769 for protozoa), reaching 20,241,296 paired-end reads with the addition of sequencing data from Group A. After quality control, and singleton exclusion, a total of 4,519 bacterial ASVs (2,680 ruminal ASVs and 1,839 fecal ASVs), 1,023 archaeal ASVs (421 ruminal ASVs and 602 fecal ASVs) and 151 ruminal protozoan ASVs across treatments. Rarefaction curves based on the alphadiversity metrics of Shannon-Wiener (diversity) reached a plateau, which indicated that additional sequences would not likely result in additional features.
Comparison of samples from different treatment groups using alpha-diversity metrics (Observed ASVs and Shannon-Wiener indexes) under the Kruskal-Wallis testing method revealed that rumen bacterial diversity was signi cantly more abundant and richer in animals fed the conventional diet (Group A) than those fed the byproducts diet (Group B) (P = 0.006 and P = 0.04, respectively). Similarly, the ruminal archaea diversity was also richer (P = 0.0004), but not more abundant. There was no signi cant difference when contrasting alpha diversity metrics of fecal samples. Comparisons of the beta-diversity metric Unweighted Unifrac using the PERMANOVA approach, revealed that samples of archaea and bacteria tended to form two signi cant clusters, which represented the treatment groups (adjusted P< 0.01) ( Supplementary Figures 1-3), a tendency most pronounced in fecal populations.

Phenotypic description
Methane emission was calculated for each experimental group as the average value of all visits to the GrowSafe feedlot during the nishing period. Animals from the experimental group A presented a mean methane emission of 179,1 g/day and a standard error of 26,18 g/day, while animals from the experimental group B presented a mean methane emission of 161 g/day and a standard variation of 26,05 g/day. Mixed Models showed that the difference in CH 4 emission between experimental groups was signi cant (P<0.0001).

Taxonomic composition of the experimental group B
Herein, we will present only results concerning the taxonomic composition of dietary treatment group B. Please refer to [28] for an extensive exploration of the taxonomic diversity of the group A.

Differential abundant ASVs in dietary treatment groups
We applied the analysis of composition of microbiomes (ANCOM) to investigate the in uence of dietary treatments in the microbiome composition at the ASV level. Seventeen ruminal ASVs of bacterial origin were differentially more abundant (DA) in the group A, from which the most prominent were classi ed as Bacteroidales Discrimination between dietary treatment groups with Random Forest classi cation models Random forest (RF) classi cation models were trained using CLR transformed relative abundances of each dataset, to test if the microbiome populations at ASV level could be used to discriminate the treatment group. Random forest has been shown to be the most accurate Machine Learning (ML) model for microbiome data analysis [36] . This method has the ability to discriminate groups, while considering interrelationships in high dimensional data [37] . The trained models resulted in high cross-validation scores for the bacteria test sets (r 2 =0.89 for rumen, r 2 =0.84 for feces), for archaea (r 2 =0.86 for rumen and r 2 =0.82 for feces) but not for protozoa (r 2 =0.57).
The feature importance function was used to select ASVs that contributed the most to the model's accuracy and to optimize the models. In short, the number of predictors were reduced to those with a contribution value >=0.01 to retrain the models, this resulted in 16 of 1683 as predictors for bacteria, 27 of 118 for archaea, and 30 of 52 for protozoa in rumen, while 22 of 1077 ASVs were predictors for bacteria and 33 of 88 for archaea in feces, respectively. This feature reduction resulted in an increased cross-validation for bacteria (r 2 =0.91 for rumen and r 2 =0.94 for feces), archaea (r 2 =0.91 for rumen and r 2 =0.86 for feces) and for protozoa (r 2 =0.71) with high recall and precision scores (Supplementary Table  1). Predictors used are available in the Supplementary Table 2.
Association between bacterial and archaeal ASVs found in rumen and feces and CH 4 emission.
Previous analysis showed a signi cant difference in the mean CH 4 emission of experimental groups, with group A (estimated mean = 179.11) emitting more methane than group B (estimated mean = 160.97). In order to investigate the proportion of variation of CH 4 emissions explained by the microbiome composition of these animals, Linear mixed models were used with experimental groups information as xed effects, weight and slaughter groups as co-variables, daily mean CH 4 emission (g/day) as the dependent variable and individual log-transformed ASVs abundances as independent variables. This analysis identi ed signi cant associations between bacteria and archaea and CH 4 emission in both environments. Within the rumen microbiome, the ASV 40, a Pseudobutyrivibrio (β=16.5contribution = 9.7%) and the ASV 44, a Bacteroidales (β=-2.6, contribution = -1.3%), were associated with CH 4 emission phenotype ( Figure 3).

Discussion
In our previous study, we extensively explored the taxonomic structure and relationships of bacteria, archaea and protozoa from two different sections of the Nelore cattle GIT [28] . Herein we expand this study by introducing a new experimental group under a different dietary treatment. We contrasted these experimental groups to investigate the impact of the dietary intervention on microbial abundance and diversity, as well as the impact of individual ASVs on host CH 4 emission.
The microbiome structure is affected by the feed composition Analysis with alpha-diversity metrics showed that both bacteria and archaea communities only differed in the rumen environment, being less rich in animals of the group B. Although being outside the scope of this study, a link between a poorer ruminal microbiome and the increase of the feed e ciency phenotype was detected, and evidences suggest that high e cient animals produces less methane [38,39] . As it will be further discussed, our methane association analysis reinforces the hypothesis of a favorable effect of the poorer microbiome on this trait. Also, PCoA analysis with the beta-diversity metric Unweighted Unifrac showed the existence of distinct clusters for treatment groups A and B for bacteria and archaea but not for protozoa. Altogether, these results indicate that feed is an important modulator of the microbiome, which agrees with previous studies in which the impact of different diets and feed components were evaluated [40,41] .
Differential abundance analysis with bacterial ASVs revealed a signi cant impact of the dietary treatment in the bacterial populations of both environments. ASVs classi ed as belonging to the Christensenellaceae family, as well as to the Prevotella, and Fibrobacter genera, which are all producers of Short-Chain Fatty Acids (SCFA) such as acetate and butyrate [42,43] , were identi ed as more abundant in the group A, while ASVs classi ed as genera known to produce succinate and propionate, Succiniclasticum and Succinivibrio as well as the Lachnospiraceae family [44,45], were identi ed as more abundant in group B. Differently from acetate and butyrate production, which increases H 2 , and consequently CH 4 production and emission [46] , propionate is an electron acceptor end-product of rumen fermentation that is probably an alternative to methanogenesis [47] . Also, it was shown that the increase in propionate concentrations is strongly associated with a decrease in CH4 production [48] . The three DA ASVs identi ed in the fecal samples corresponded to bacteria that commonly inhabit the hindgut, such as the Oscillibacter genus and Prevotellaceae family, both more abundant in the group B, and Rikenellaceae family, more abundant in the group A [49][50][51] . The identi cation of a small number of DA ASVs in the fecal microbiome is consistent with the alpha diversity analysis, in which there was no signi cant difference in both abundance and richness among experimental groups.
The dietary treatment also had a signi cant impact in the archaea populations that increased the abundance of ASVs classi ed as M. gottschalkii in both rumen and feces of animals from the group A, and M. ruminantium of animals from the group B. The relative abundance of these species differed in the experimental group under by-product diet (group B), in which it was observed a decreased relative abundance of M. gottschalkii and an increased abundance of M. ruminantium (9.6% and 23.8%) in rumen and feces when compared to conventional (group A). A study on sheep with contrasting phenotypes for CH4 emission found a higher abundance of the archaea M. gottschalkii in the higher emitter group and M. ruminantium in the lower emitter group [10] . This difference can be explained by speci c genomic structures because, unlike M. gottschalkii, M. ruminantium lacks the coding genes for methyl-CoM reductase II (McrII), affecting its tness in an environment with high concentrations of H 2 , the main substrate for the hydrogenotrophic pathway of ruminal methanogenesis [52] . Furthermore, the relative abundance of these highly abundant ASVs can partially explain the difference in methane emission observed between treatment groups, in which the group B emitted less methane than group A.
We also built RF models to test if the microbial ASVs CLR-transformed abundances could be used as predictors for host's outcomes, in this speci c case, the treatment group. Random Forests is a nonparametric ensemble machine learning approach, consisting in a collection of a multitude of decision trees, the forest, in which their predictions are averaged in a regression task, or selected based on a majority vote in a classi cation task. The R 2 score had a signi cant increase when re-trained with ASV that were contributing the most to the average reduction of weighted impurity in a tree, thus being more important for the classi cation. A small part of the bacterial ASVs identi ed as DA by the ANCOM approach, such as the ASVs 97 and 521 for rumen and ASVs 332 and 526 for feces, were selected for the re-training. On the other hand 12/15 of the archaea ASVs identi ed as DA were selected for re-training, which suggests that the archaeal populations are more sensitive to changes in the dietary treatments.
Random Forest models have been applied to the microbiome eld to classify experimental groups based on the microbiome composition [36], to identify fecal contamination in environmental samples [53] and to identify taxa whose abundances were different in mothers delivering prematurely [54] . Altogether, these results indicate that the microbiome composition is affected by the feed at the individual ASV level.

Phenotypic associations indicates biomarkers for CH 4 emission in both sections of the git
Mixed Linear model analysis identi ed a single bacterial ASV, the ASV40, as positively associated with CH4 emission in the rumen microbiome. This highly abundant ASV was classi ed as a member of the Pseudobutyrivibrio genus and presented the highest β coe cient, which explainined 9.72% of the variation in CH4 emission in the experimental groups. Butyrate-producing bacteria, such as Pseudobutyrivibrio and Butyrivibrio, were identi ed by Partial least squares as highly associated with CH4 emissions in a study with Bos taurus breeds representing extreme phenotypes [55] , thus con rming our ndings with a different methodology.
One ASV classi ed as Bacteroidales F082 group, abundant in the rumen of different cattle breeds and ruminant species [56] was identi ed as negatively associated with CH 4 emission levels in the rumen microbiome. A positive association with CH 4 emission was described by Difford et al., (2018) for the uncultured Bacteroidales BS11 gut group . His claim was supported by the functional annotation of two genera inside this group, in which the end products for cellulose fermentation included acetate, butyrate, propionate, CO 2 and H 2 [35,57]. Although being commonly identi ed in the rumen microbiome of different species, the Bacteroidales order is genetically diverse, as one would expect of members of low taxonomic rank to be, thus reinforcing the need for aditional exploration of this variability and for more studies in order to investigate the functional pathways responsible for the negative association of the F082 group and CH 4 emission.
Also, we identi ed 6 fecal ASVs that were signi cantly associated with CH 4 emission. This is the rst time that such a relationship has been observed in beef cattle. Of the 6 ASVs, two were positively associated with this phenotype, a Succinivibrio and a Parabacteroides, which together explained 7.7% of the variance in CH 4  Some authors consider this problem intractable because the ruminal microbiota can rapidly adapt to external interventions [70] . Additional experiments need to be performed to test the potential markers identi ed in this exploratory study. However, understanding the biology of speci c microorganisms that contribute to complex phenotypes may help to develop successful interventions for methane mitigation in bovines.

Conclusion
The feed composition induced signi cant differences in abundance and richness of ruminal and fecal microbial populations. The dietary treatment based on industrial byproducts applied to our experimental groups had an impact on the microbiome diversity of bacteria and archaea, but not on protozoa. Microbiome components (ASVs) of bacteria and archaea can be successfully used to predict the treatment group, thus giving support to the hypothesis that the feed intervention can modulate microbiome abundance and diversity. Microbiome components were associated with CH 4 emission in both ruminal and fecal microbiomes. While ruminal ASVs are expected to be directly associated to CH 4 production and emission, given that we monitored rumen CH 4 emission in the feedlot, the relation of fecal ASVs with this trait is unclear, although they can be biomarkers for CH 4 Figure 1 Relative abundance of bacterial populations, at genus level, in the rumen and feces of Nelore cattle fed with byproduct based diet. Only microorganisms with a relative abundance greater than 0.5% are shown in the legend.

Figure 2
Relative abundance of A) archaeal populations, at species level, in the rumen and feces of Nelore cattle fed with byproduct based diet and B) protozoa populations, at genus level, in the rumen of Nelore Cattle.
Only microorganisms with a relative abundance greater than 0.5% are shown in the legend.