Association between host genetics of sheep and the rumen microbial composition

A synergy between the rumen microbiota and the host genetics has created a symbiotic relationship, beneficial to the host’s health. In this study, the association between the host genetics and rumen microbiome of Damara and Meatmaster sheep was investigated. The composition of rumen microbiota was estimated through the analysis of the V3-V4 region of the 16S rRNA gene, while the sheep blood DNA was genotyped with Illumina OvineSNP50 BeadChip and the genome-wide association (GWA) was analyzed. Sixty significant SNPs dispersed in 21 regions across the Ovis aries genome were found to be associated with the relative abundance of seven genera: Acinetobacter, Bacillus, Clostridium, Flavobacterium, Prevotella, Pseudomonas, and Streptobacillus. A total of eighty-four candidate genes were identified, and their functional annotations were mainly associated with immunity responses and function, metabolism, and signal transduction. Our results propose that those candidate genes identified in the study may be modulating the composition of rumen microbiota and further indicating the significance of comprehending the interactions between the host and rumen microbiota to gain better insight into the health of sheep.


Introduction
The gut microbiome is a complex network with a symbiotic relationship with the host; hence, the microbiome composition may impact the balance of the rumen ecosystem, and their interaction provides beneficial functions to the host. The gut microbiome has a significant role in the maintenance of rumen ecosystem stability and the host's health, as it influences the nutritional, physiological, and immunological potential of the host (Zhao et al. 2013;Guo et al. 2015). The microbiota also extend the host's digestive capabilities, produce vitamins, and prevent pathogen colonization (Goodrich et al. 2016). Likewise, the gut microbiota is essential in developing and differentiating the intestinal mucosal epithelium , and they modulate the host's immune response to maintain gut homeostasis .
The gut microbiome composition and function are modulated by factors such as age, diet, environment, and host genetics (Malmuthuge et al. 2015). These factors influence how the microbiomes function and the microbial structure. Though it has been generally established that the significant factor modulating the gut microbial communities is diet (Henderson et al. 2015;Ellison et al. 2017), novel pieces of evidence through genome-wide association studies (GWAS) have shown that host genetics has a significant role in the modulation of gut microbial composition in mice and humans . Blekhman et al. (2015) showed an association between host genetic variation in SNPs and Bifidobacterium abundance close to the lactase (LCT) gene. LCT encodes for lactase enzyme that hydrolyzes lactose when expressed in the gastrointestinal tract. It has been suggested that host genetics is a vital component in modulating the gut microbial composition. GWAS in animals has been mostly used to analyze several complex traits such as association with feed efficacy (de Oliveira et al. 2014) and genetic resistance to diseases (Benavides et al. 2015;Zvinorova, 2017). Despite its significance, the compound relationship of host genetics, physiology, and gut microbiome components is poorly understood. Few studies assess the host genetics influence on the gut microbial composition (Sasson et al. 2017). Crespo-Piazuelo et al. (2019) identified pig genetic variation SNPs that were associated with the abundance of six genera and the majority of candidate genes identified in 17 associated regions encoded for enzymes participating in defense systems of the host, while other proteins were involved in metabolism. Studies on the association of host genetics and their rumen microbiota composition are limited in ruminants, and those few are focused on cattle compared to other ruminants (Sasson et al. 2017;Li et al. 2019). In particular, association studies between rumen microbial composition and the sheep genome have not yet been defined. This study therefore aimed at identifying genomic regions in the host that may modulate the rumen microbial composition and find the correlation between host genetic variation and rumen microbial abundance through host-microbiota association in sheep.

Animal material and sampling
All the procedures involving animals were permitted by the Agricultural Research Council-Animal Production Ethics committee (APIEC17/21). The trials were done at the Agricultural Research Council (ARC), GI Microbiology and Biotechnology unit, and the Small Stocks Unit in Irene, Gauteng province, South Africa.
Sixty-four sheep (32 Damara sheep breed and 32 Meatmaster sheep breed) were used for the trial, and they were approximately 7 months old with the average weight as follows: Meatmaster males (24.6 ± 3.4 kg); Meatmaster females (21.5 ± 3.1 kg); Damara males (36.6 ± 8.3 kg); Damara females (28.9 ± 6.9 kg). The animals were housed per treatment with males and females separated in open barn trial pens with ± 4m 2 shelters. Ten (10) animals were excluded from the experiment after adaptation period before the trial commenced, due to 6 showing signs of sickness and 4 mortalities. The animals were grouped into five treatment groups as follows: no antibiotic, no probiotics (T1), only potential probiotic (T2), only potential probiotic (T3), the combination of potential probiotics (T4), antibiotic (T5). Animals were fed on commercial pellet feed as previously described (Mani et al., 2021), with hay and freshwater supplied ad libitum. Rumen and blood samples were collected before the trial commenced and later collected again at the end of the trial. Rumen samples were collected using a stomach tube following the procedures of Shen et al. (2012).

Microbial DNA extraction, 16S rRNA gene amplification, and sequencing
Microbial DNA was extracted with QIAamp Fast DNA stool Mini Kit (Qiagen, Germany) following the guidelines of the manufacturer for pathogen detection, and DNA concentration was evaluated with Nanodrop 2000 (Thermo Electron Corporation, USA). The DNA samples were pooled according to their respective treatments, taking into consideration the sex of the animals (males and females DNA pooled separately). The pooled DNA quality was evaluated using gel electrophoresis on 1.5% agarose gels and visualized with UVP BioSpectrum 310 Imaging System (FisherScientific, UK). The pooled DNA samples were used as templates for amplifying the V3-V4 hypervariable region of the prokaryotic 16S rRNA gene according to the 16S Metagenomic Sequencing Library Preparation guide (Illumina, San Diego, USA). Illumina MiSeq platform was used to perform sequencing, and 300 bp paired-end reads were generated. Raw reads were deposited to the Sequence Read Archive (SRA) of the National Center of Biotechnology Information (NCBI) database under BioProject PRJNA578022.

Data analysis
Illumina universal adapter sequences and the low-quality sequence regions were removed using Trimmomatic (version 0.36), while trimming raw data sequences generated by the MiSeq platform. Demultiplexed sequence files were imported to QIIME2 software (version 2018.8) (Bolyen et al. 2018) for analysis. The imported reads were denoised and trimmed using DADA2. Sequences were assigned to operational taxonomic units (OTU) with 97% identity, and the OTU feature table was generated; chimeric sequences were detected and removed from the representative OTU sequences. For taxonomic analysis, OTU representative sequences were aligned to the Greengenes database. All other statistical analyses were carried out using the R package phyloseq (version 1.24.2) (McMurdie and Holmes, 2013) on RStudio (version 3.5.3) (www.r-proje ct. org). The 9134 OTUs were grouped into 20 phyla and 107 genera through tax_glom method in the phyloseq method in R. OTUs with the relative abundance of ≥ 0.05% in at least one sample were the ones retained in both breeds for subsequent analyses.

Host DNA extraction and SNP genotyping
Sheep whole blood was used to extract DNA with Wiz-ard™ Genomic DNA Purification Kit (Promega, USA). DNA samples for each breed were pooled according to their respective treatments, taking into consideration the sex of the animals (males and females DNA pooled separately). DNA quality and quantity were measured using 1.5% agarose gel electrophoresis (Bio-Rad) and Qubit 2.0 Fluorometer (Life Technologies), respectively. Genotyping of 40 DNA samples was done with the Illumina OvineSNP50 BeadChip (Illumina, San Diego, CA). Genotypes were obtained using GenomeStudio software and quality control of the data was done through PLINK software (version 1.9) (Purcell et al. 2007). Additional analyses were performed only on SNPs that mapped and aligned to the Oar_v3.1 reference assembly, retaining a total of 42,503 SNPs with missing genotype < 2% and minor allele frequency (MAF) > 2% to be used for genome-wide association analysis.

Genome-wide association (GWA) analysis
Associations between the SNPs mapped to the genome and microbiota composition at the genus level were carried out. Samples were normalized by their relative abundance and find.top.taxa function in R was used to extract genera that were present in ≥ 90% of the samples in both sheep breeds and had more than 0.05% of the total reads to avoid errors caused by low abundant genera. Unspecified genera were excluded from the analysis; thus, GWAS was carried out in 7 out of the 107 found genera. GWAS analysis was performed using PLINK -assoc and -linear options, with the -linear option performing a linear regression analysis with each individual SNP as a predictor. The multiple test correction was done with false discovery rate (FDR) method (Benjamini-Hochberg, 1995) using -adjust function in PLINK.

Functional gene annotation and prediction
Associated regions in the sheep genome were annotated at 1 Mb on each side of Oar_3.1. Gene locations were based on the Ensembl Genome Browser (www. ensem bl. org; release 99). The genes contained in the regions of significant SNPs in the sheep genome were extracted using Oar_3.1 reference in the Biomart tool (Kinsella et al. 2011) using Ensembl. Variant Effect Predictor tool (McLaren et al. 2010) from the Ensembl project was used in predicting the functional consequences of the significant SNPs.

Microbial composition
The microbial community within the rumen microbiome of the two breeds was phenotyped using the v4 region of the 16S rRNA gene and the sequencing resulted in 1,996,134 quality-filtered reads, as described previously (Mani et al 2021). The relative abundance of rumen microbial composition and its taxonomic distribution was estimated in both breeds, with a total of 15 bacterial phyla and 1 archaeal phylum ( Fig. 1a). At the genus level, Damara is indicated to be more diverse compared to Meatmaster (Fig. 1b). The core microbial communities across breeds show a highly conserved abundance and structure. This is in agreement with a study by Wallace et al. (2019), where they showed that core microbial members are closely related to each other compared to the noncore members. This relatedness could indicate how they share certain functional characteristics. The proportions of the microbial composition in the two Fig. 1 Relative abundance of rumen microbial composition in the two breeds at a phylum level and b genus level showing breed and sex association with microbiota breeds showed some variation, also with regards to the male and female within breed. This is in agreement with a study by Ji et al. (2018), where they observed varied proportions of microbial composition between the sexes of the animals. This shows that, even within closely related breeds, or animals of the same herd, there is diversity in the composition of rumen microbiota.

Genome-wide association study
GWAS was achieved using 42,503 SNPs genotyped in 40 animal samples and the relative abundance of seven genera, Acinetobacter, Bacillus, Clostridium, Flavobacterium, Prevotella, Pseudomonas, and Streptobacillus. Significant associations (FDR < 0.1) were found in all seven genera and 60 significant SNPs were dispersed in 21 chromosomes across the Ovis aries genome ( Fig. 2 and Table 1). There were two shared regions found for the abundance of Prevotella and Bacillus, located in the region of 25.47 ~ 26.10 Mbps on OAR17, Bacillus and Acinetobacter shared a region of 45.13 ~ 49.52 Mbps located on OAR26; also, they were associated with one SNP (s22112.1 on OAR26, P = 1.29e-20-3.95e-08, FDR = 2.24e-17-8.45e-05). Eightyfour candidate genes were identified in the regions of significant SNPs associated with rumen microbial composition.

Gene Ontology and associated pathways
To determine the molecular and metabolic functions where the candidate genes participate in, identification of their ontological categories and associated metabolic pathways was done. The candidate genes were categorized into 35 functional groups of the Gene Ontology classification, with 19 groups of biological process (BP), 5 groups of molecular function (MF), and 11 groups of cellular components (CC). The highest abundance of genes was represented in the CC category (Fig. 3). The analysis of KEGG pathways in which   these genes participate in showed that most of the genes are associated with metabolic pathways (Table 2).

Discussion
Previous studies have demonstrated that host genetics play a role in modulating the composition of gut microbiota in mammals (Blekhman et al. 2015;Goodrich et al. 2016;Crespo-Piazuelo et al. 2019;Li et al. 2019). The results in this study suggest that there are associations between the abundance of certain genera and host genotypes (SNPs), which might provide answers on which genetic components contribute to the rumen microbial composition. Some of the microorganisms that showed to be related to host genetics are relevant in the composition of the ruminal environment, degradation of feed. Wallace et al. (2019) have shown that the rumen function and productivity of cows can be predicted with the microbial abundance that forms part of the core community across different breeds and dietary diversity. The detailed description of the candidate genes mapped in the sheep genome's significant genetic variations is discussed.

Acinetobacter
The relative abundance of Acinetobacter is associated with genetic variations in six sheep genome regions. In the region OAR1, two candidate genes have been identified: Adenylate Kinase 4 (AK4) and Leptin Receptor (LEPR). AK4 regulates the activation of the energy sensor and phosphorylation of AMPK protein kinase by controlling cellular ATP levels. The kinase is also essential in oxidative stress cellular defense response (Lanning et al. 2014;Jan et al. 2019). While LEPR is a receptor for leptin, which is a hormone that is adipocyte-specific regulating body weight (Clément et al. 1998). This gene is associated with weight gain and reproductive seasonality traits in ewes, as the associations between lamb birth weight and adult body weight and the LEPR mutation were also observed (Haldar et al., 2014).
In the region of OAR3, two candidate genes were found: Integrin Subunit Beta 1 Binding Protein 1 (ITGB1BP1) and Rho Associated Coiled-Coil Containing Protein Kinase 2 (ROCK2). ITGB1BP1 is essential for cell adhesion, proliferation, differentiation, and migration (Fournier et al. 2002). ROCK2 is vital for regulating cell polarity and the actin cytoskeleton. ROCK2 is also involved in regulating smooth muscle contraction, neurite retraction, cell adhesion and motility, stress fiber, actin cytoskeleton organization, and focal adhesion formation (Takeda et al. 2019). ROCK2 might be involved in regulating the rumen smooth muscle contraction. A study by Xiang et al. (2018) reported that increased expression of genes regulating smooth muscle contraction in the rumen is related to sheep with shorter mean retention time (MRT), as the muscle contraction reduces MRT and subsequently the CH 4 . This may increase gastrointestinal muscle contraction linked to the rapid digesta passage rate, thus influencing rumen microbiota. Three candidate genes were found in the region of OAR22, Carbohydrate Sulfotransferase 15 (CHST15), Abraxas 2, BRISC Complex Subunit (ABRAXAS2), and Zinc Finger RANBP2-Type Containing 1 (ZRANB1). CHST15 is essential in glycosaminoglycan metabolism, a significant structural constituent of the extracellular matrix, and this gene potentially acts as a B-cell surface signaling receptor . ABRAXAS2 is involved in the deubiquitination of the interferon receptor IFNAR1 involved in interferon signaling, and it also down-regulates bacterial lipopolysaccharide (LPS) response (Zheng et al. 2014). ZRANB1 is essential in cell migration and stress fiber dynamics; it might also modulate TNF-alpha signaling (Bai et al. 2011). In the region of OAR25, three candidate genes were found: WDFY family member 4 (WDFY4), oxoglutarate dehydrogenaselike (OGDHL), and DEPP1 autophagy regulator (DEPP1). WDFY4 plays a significant role in regulating cDC1-mediated cross-presentation of viral and tumor antigens in dendritic cells. It is also known to be essential in B-cell survival through autophagy regulation (Theisen et al. 2018).
OGDHL encodes for similar protein that degrades glucose and glutamate (Dai et al. 2020). DEPP1 is known as fasting-induced gene as its expression is induced by fasting and progesterone and is a vital modulator of FOXO3-induced autophagy via oxidative stress (Stepp et al. 2014). While no candidate gene was found in other genetic variations in the region of OAR26, the genetic variant with significant SNP OAR26_46995252.1 contained nuclear receptor subfamily 1 group D member 2 (NR1D2) identified. NR1D2 regulates genes essential in metabolic functions such as lipid metabolism and the inflammatory response (Ramakrishnan and Muscat, 2006). Acinetobacter is said to be more involved in epithelial proliferation and disease (Mao et al. 2015) and these genes might modulate the abundance of Acinetobacter.

Bacillus
In the gut, Bacillus is known to partake in the metabolism of nutrients and is involved in the maintenance of intestinal homeostasis (Ilinskaya et al. 2017). The relative abundance of Bacillus is associated with genetic variations in 18 regions of the sheep genome; no candidate gene was found in three of the regions. In OAR2, significant SNP OAR2_107495304.1 was located in the region of scavenger receptor class A member 3 (SCARA3). SCARA3 has been shown to deplete reactive oxygen species as it's known as a cellular stress response gene and thus has a significant function in the protection of cells from oxidative stress (Han et al. 1998). Two significant SNP s04253.1 and OAR2_135296252.1 were located in the region of integrin subunit alpha 4 (ITGA4). ITGA4 encodes a protein in the integrin alpha chain family; integrins are a key class of cell surface receptors mediating associations of cells and extracellular matrix and are essential in cell surface adhesion and signaling (Rivera Rosado et al. 2011). Significant SNP OAR2_153029366.1 contained growth factor receptorbound protein 14 (GRB14) that may be modulating Bacillus abundance as it plays a role in inhibiting the signaling of the insulin receptor, inducing insulin resistance (Manning et al. 2012). The region of OAR3 contained ATPase Plasma Membrane Ca2 + Transporting 1 (ATP2B1), which catalyzes the ATP hydrolysis attached to calcium transport from the cytoplasm to the extracellular space thereby playing a vital role in intracellular calcium homeostasis (Wu et al. 2013).
Three candidate genes were found in the region of OAR4: Oxysterol Binding Protein like 3 (OSBPL3), Neuropeptide Y (NPY), and ADAM Metallopeptidase Domain 22 (ADAM22). OSBPL3 is essential in regulating actin cytoskeleton, cell adhesion, cell polarity, and cellular lipid metabolism (Lehto et al. 2008). NPY acts as a neuromodulator and is involved in several physiological processes, including stress response, food intake, cardiovascular function, and circadian rhythms (Vázquez et al. 2008). ADAM22 is involved in regulating cell adhesion and inhibiting cell proliferation (Liu et al. 2009). In the region of OAR6, two candidate genes were identified: alpha kinase 1 (ALPK1) and TRAF Interacting Protein with Forkhead Associated Domain (TIFA). ALPK1 initiates an innate immune response triggered by the detection of bacterial pathogen-associated molecular pattern metabolites (PAMPs), which is vital for eliminating pathogens and engaging adaptive immunity (Zimmermann et al. 2017). TIFA encodes an adapter protein essential in adaptive and innate immunity by activating proinflammatory NF-kappa-B signaling following the detection of bacterial PAMPs (Milivojevic et al. 2017). In the region of OAR7, two candidate genes were found: MCC regulator of WNT signaling pathway (MCC) and erythrocyte membrane protein band 4.1 like 4A (EPB41L4A). MCC plays a part in cell migration (Arnaud et al. 2009). While EPB41L4A product NBL4 is said to play a role in the beta-catenin signaling pathway, Beta-catenin regulates and coordinates cell-cell adhesion and gene transcription (Ishiguro et al. 2000). Region of OAR10 contained aconitate decarboxylase 1 (ACOD1) that is essential in the antimicrobial response of innate immune cells by producing itaconic acid that is involved in the antimicrobial activity of macrophages (Michelucci et al. 2013). Transmembrane Protein 98 (TMEM98) is found in one region of OAR11 and the secreted form of this gene promotes T helper 1 cells (Th1) differentiation (Fu et al. 2015).
Nine candidate genes were identified in the second region of OAR11: Alpha-1,6-Mannosylglycoprotein 6-Beta-N-Acetylglucosaminyltransferase B (MGAT5B), Acetylgalactosaminide Alpha-2,6-Sialyltransferase 2 (ST6GALNAC2), Cytoglobin (CYGB), Sphingosine Kinase 1 (SPHK1), Galanin Receptor 2 (GALR2), Galactokinase 1 (GALK1), Solute Carrier Family 16 Member 5 (SLC16A5), Otopetrin 2 (OTOP2), and Otopetrin 3 (OTOP3). MGAT5B is essential in the O-mannosyl glycan pathway, and the modulation of integrin and laminin-dependent adhesion and neuronal cell migration (Abbott et al. 2006). ST6GALNAC2 has roles at the cell surfaces in bacterial adhesion, protein targeting, and cell-cell and cell-substrate interactions (Samyn-Petit et al. 2000). CYGB might have a protective function against oxidative stress conditions and might have a role in intracellular oxygen storage or transfer (Hodges et al. 2008). SPHK1 and its product S1P are essential in the NF-kappa-B activation pathway and TNF-alpha signaling essential in immune processes (Kee et al. 2005). GALR2 expresses galanin, an important neuromodulator present in the gastrointestinal system, brain, and hypothalamopituitary axis. Also essential in regulating the growth hormone release and modulation of insulin release (Baltazar et al. 2001). GALK1 is a major enzyme for galactose metabolism (Petry and Reichardt, 1998). SLC16A5 catalyzes the prompt transport of many monocarboxylates such as branched-chain oxo acids derived from leucine, pyruvate, lactate ketone bodies acetoacetate, valine and isoleucine, beta-hydroxybutyrate, and acetate across the plasma membrane (Murakami et al. 2005). OTOP2 and OTOP3 are proton-selective channels that transport protons specifically into cells. The activity of these channels is possibly essential in cell types that use intracellular pH changes for cell signaling or the regulation of biochemical or developmental processes (Tu et al. 2018).
In the region of OAR12, Phospholipase A2 Group IVA (PLA2G4A) was proposed as a candidate gene as these are a group of enzymes hydrolyzing phospholipids into fatty acids and other lipophilic molecules (Niknami et al. 2009). Two candidate genes were found in the region of OAR13: Glutamate Decarboxylase 2 (GAD2) and Optineurin (OPTN). GAD2 encodes one of the various forms of glutamic acid decarboxylase and has been recognized as a vital autoantigen in insulin-dependent diabetes (Niknami et al. 2009). OPTN is involved in innate immune response activation during viral infections (Pourcelot et al. 2016). CD44 Molecule (Indian Blood Group) was found in the region of OAR15 and this gene encodes for a cell surface glycoprotein essential in cell adhesion, cell-cell interactions, and migration. This glycoprotein also participates in other cellular functions such as activation and homing T-lymphocytes, inflammation, and response to bacterial infections (Funaro et al. 1994;Yoshida et al. 2012). The region of OAR16 contained Fibroblast Growth Factor 10 (FGF10) that is essential in regulating the development of an embryo, cell proliferation, and differentiation. FGF10 may be involved in wound healing (Zhang et al. 2006). Two candidate genes were found in the region of OAR19: Contactin 6 (CNTN6) and cell adhesion molecule L1 like (CHL1). CNTN6 and CHL1 encode for proteins that function as cell adhesion molecules; these molecules are known to interact and have a key role in inflammatory responses (Wei et al. 1998;Murray et al. 1999).
In the region of OAR20, five candidate genes were identified: Methylmalonyl-CoA Mutase (MMUT), Glycine-N-Acyltransferase Like 3 (GLYATL3), Defensin Beta 133 (DEFB133), Defensin Beta 113 (DEFB113), and Defensin Beta 110 (DEFB110). MMUT plays a role in degrading various amino acids, cholesterol, and odd-chain fatty acids via propionyl-CoA to the tricarboxylic acid cycle (Forny et al. 2014). GLYATL3 catalyzes the conjugation of long-chain fatty acyl-CoA thioester and glycine producing long-chain N-(fatty acyl) glycine intermediate in the primary fatty acid amide biosynthetic pathway (Jeffries et al. 2016). DEFB133, DEFB113, and DEFB110 are a family of antimicrobial and cytotoxic peptides that are made by neutrophils. They have an innate immune defense response to bacteria (Jeffries et al. 2016). Lipin 2 (LPIN2) is found in the region of OAR23 and plays a significant role in regulating fatty acids metabolism at different levels (Valdearcos et al. 2012). Two candidate genes found in the region of OAR26 associated with three significant SNPs: 1-Acylglycerol-3-phosphate O-acyltransferase 5 (AGPAT5) in SNP DU481531_204.1 and Nuclear Receptor Subfamily 1 Group D Member 2 (NR1D2) in SNP OAR26_45130684.1 and OAR26_46516694.1. AGPAT5 converts lysophosphatidic acid (LPA) into phosphatidic acid and plays a role in LPA containing unsaturated or saturated fatty acids C15:0-C20:4 at the sn-1 position using C18:1-CoA as the acyl donor (Prasad et al. 2011). NR1D2 is found to be associated with Acinetobacter in the same region and has already been described.

Clostridium
The relative abundance of Clostridium is associated with genetic variations in region OAR2 of the sheep genome. SNP s43106.1 contained two candidate genes: RAD23 Homolog B, Nucleotide Excision Repair Protein (RAD23B) and Solute Carrier Family 44 Member 1 (SLC44A1). RAD23B is involved in cellular response to DNA damage stimulus (Ng et al. 2003) and cellular response to interleukin-7 (Aiello et al. 2018). SLC44A1 might provide choline for cell membrane phospholipid synthesis (Traiffort et al. 2013). SNP OAR2_58286833_X.1 contained the TLE Family Member 1, Transcriptional Corepressor (TLE1) gene which inhibits NF-kappa-B-regulated gene expression in human monocytes also, increased expression of TLE1 is significant for inhibition of interleukin 12 (IL-12) p70 expression mediated by zymosan (Alvarez et al. 2011) also as inhibiting immune suppression induced by Bacillus anthracis toxin (Larabee et al. 2013).

Flavobacterium
The relative abundance of Flavobacterium was associated with genetic variations in chromosome 1, 2, 6, 16, 22, and 23 along the sheep genome. In the region of SNP OAR1_74421103.1, three candidate genes were proposed: Growth factor independence-1 (GFI1) involved in regulating the pre-T cell differentiation, hematopoietic stem cell maintenance, and secretory cell type development in the intestines (Qu et al. 2015). The formin-binding protein 1-like (FNBP1L) is critical for antibacterial autophagy (Park et al. 2017). The ATP Binding Cassette Subfamily D Member 3 (ABCD3) is crucial in the peroxisomal degradation of branched-chain fatty acids and biosynthesis of bile acid (Baker et al. 2015). Unsaturated fatty acids released from bile are known to increase the solubility and absorption of saturated fatty acids in the rumen, which in turn improves fat absorption (Leat and Harrison, 1969). The regulation of this gene may directly influence the occurrence of Flavobacterium, as this bacteria is essential in fatty acid metabolism and biohydrogenation (McKain et al. 2010). Three candidate genes were proposed in the region of significant SNP OAR1_197593478.1: Immunoglobulin superfamily member 11 (IGSF11), Beta-1,4-Galactosyltransferase 4 (B4GALT4), Rho GTPase-Activating Protein 31 (ARHGAP31). IGSF11 functions as a cell-cell adhesion molecule in homophilic interactions and encourages cell growth (Harada et al. 2005). B4GALT4 encodes for enzymes that play a vital role in glycosaminoglycan biosynthesis (Gao et al. 2016). ARHGAP31 encodes a GTPase-activating protein regulating Cdc42 and Rac1 essential in protein trafficking, cell growth, and cell migration (Materna-Kiryluk et al. 2014). SNP OAR2_247207503.1 was associated with the abundance of Flavobacterium. The SNP is located within the region of Adenylate kinase 2 (AK2) and lymphocyte Cell-Specific Protein-Tyrosine Kinase (LCK): both these genes might be associated with microbial composition as their functions related with the immune system. AK2 is essential in cellular energy homeostasis and cell proliferation (Chen et al. 2012). LCK is essential in the selection and maturation of developing T-cells, and in T-cell antigen receptor (TCR)linked signal transduction pathways (Jung et al. 2016).
Candidate genes found in the region OAR16 are associated with innate and adaptive immune responses (C9) (Spicer et al. 2018); activation of T-cell factor signaling and mediating PGE2 induced expression of early growth response 1 (PTGER4) (Okano et al. 2006); cell growth and survival regulation responding to hormonal signals (RIC-TOR) (Sarbassov and Ali, 2014); activation and control of interleukin-2 expression (FYB1) (Veale et al. 1999); cell differentiation and can induce macrophage adhesion and spreading (DAB2) (Rosenbauer, 2002). Five candidate genes were identified in region OAR22; NFKB2 is essential in cellular responses and in regulating the immunological responses to infections and inflammation (Manna and Aggarwal, 2000;Krljanac et al. 2014). TRIM8 plays different roles in immune pathways, such as having a positive role in the TNF-alpha and IL-1beta signaling pathways (Li et al. 2011). CNNM2 is essential in magnesium (Mg2 +) homeostasis by mediating the epithelial transport and renal reabsorption of Mg2 + (Corral-Rodríguez et al. 2014). FGF8 is significant in regulating the development of an embryo, cell proliferation, differentiation, and migration (Pei et al. 2017). ELOVL3 is essential in the production of monounsaturated and saturated very-long-chain fatty acids (VLCFAs) associated with numerous biological processes as precursors of lipid mediators and membrane lipids (Leonard et al. 2000;Westerberg et al. 2006). SMAD Family Member 4 (SMAD4) is the candidate gene found in the OAR23 region; it serves as a transcription activator regulating TGF-beta receptor-mediated signaling, and is involved in IL-2 activation and signaling pathway (Yang et al. 2015).

Prevotella
Six regions on the sheep genome were associated with the abundance of Prevotella. In the region of OAR1 in the sheep genome, one candidate gene was identified: RUNX Family Transcription Factor 1 (RUNX1) which is crucial for the normal hematopoiesis development and positively regulates the expression of RAR Related Orphan Receptor C (RORC) gene in T-helper 17 cells (Fujimoto et al. 2007). One candidate gene was found in the region of OAR4; Glutamate metabotropic receptor 3 (GRM3) was the identified candidate gene, which is known to modulate synaptic glutamate levels that regulate stress response and a study showed fixed missense mutation on GRM3 differentiating between aggressive foxes from tame foxes . This gene could be associated with the behavioral traits of these two breeds because, compared to the nature of other sheep breeds, Damara and Meatmaster are aggressive animals. Also, this gene lies close to the region of a locomotive trait (ATLOC) in the sheep quantitative trait locus (QTL) database. In the region of OAR7, two candidate genes were identified: neural precursor cell expressed, developmentally downregulated 4 (NEDD4) and RAB27A (member RAS oncogene family). NEDD4 is a significant candidate molecular marker essential in cell proliferation, ovarian development, and sexual reproduction in zebrafish (Wiszniak et al. 2016). RAB27A is involved in cytotoxic granule exocytosis in lymphocytes, as it is essential for priming at the immunologic synapse and granule maturation and granule docking (Herrero-Turrión et al. 2008).
Three candidate genes have been found in the region OAR16: Dual Specificity Phosphatase 1 (DUSP1), BCL2 Interacting Protein 1 (BNIP1), and Stanniocalcin 2 (STC2). DUSP1 is essential in the cellular response to environmental stresses and negatively regulates cellular proliferation (Shen et al. 2017). BNIP1 interacts with the E1B 19 kDa protein, protecting host cells from virally induced cell death (Boyd et al. 1994). While STC2 encodes for homodimeric glycoprotein, it regulates cell metabolism, renal and intestinal calcium and phosphate transport, and cellular calcium/ phosphate homeostasis (Meyer et al. 2009). Prevotella microbes are involved in peptide and protein degradation in the rumen, in saccharolytic pathways, and in the production of saturated fatty acids (Stevenson and Weimer, 2007). STC2 has been found to have a significant association with the relative abundance of Prevotella in cattle, and it has been previously associated to fatty acids and cell metabolism (Gonzalez-Recio et al. 2018); hence, this gene is suggested that it could modulate the abundance of Prevotella composition in the rumen. In OAR 25 region, three candidate genes were identified; two are also found to be associated with Acinetobacter in the same region and have already been described (WDFY4 and OGDHL). The third gene is the mitogen-activated protein kinase 4 (MAPK8) and it acts as a point of integration for multiple biochemical signals and is essential in extensive cellular processes such as transcription regulation and development, proliferation, and differentiation (Wang et al. 2007).

Pseudomonas
The relative abundance of Pseudomonas is associated with genetic variations of two regions in OAR6 and OAR14. Three candidate genes have been found in the OAR6 region: 3-Hydroxybutyrate Dehydrogenase 2 (BDH2), Solute Carrier Family 9 Member B2 (SLC9B2), and CDGSH Iron Sulfur Domain 2 (CISD2). BDH2 facilitates the formation of 2,5-dihydroxybenzoic acid (2,5-DHBA), a siderophore that plays a role in iron assimilation and homeostasis, as it shares structural resemblances with bacterial enterobactin (Devireddy et al. 2010;Liu et al. 2014). This gene may be significant in the prevention of pathogens to invade the host. SLC9B2 is essential in regulating sodium homeostasis, intracellular pH, and cell volume , which may be significant in regulating ruminal microbial composition. It is also essential in insulin secretion and clathrinmediated endocytosis in beta-cells (Deisl et al. 2013). CISD2 regulates autophagy, contributing to antagonizing BECN1mediated cellular autophagy at the endoplasmic reticulum (Chang et al. 2010). Casein Kinase 2 Alpha 2 (CSNK2A2) and Glutamic-Oxaloacetic Transaminase 2 (GOT2) are two candidate genes identified in the region of OAR14. CSNK2A2 phosphorylate acidic proteins such as casein are essential in several cellular processes, like cell cycle control and circadian rhythms (Yang et al. 2002). GOT2 is essential in amino acid metabolism and is essential for the exchange of metabolites between mitochondria and cytosol. It also enables cellular uptake of long-chain free fatty acids (Blüher et al. 2017;Hong et al. 2019).

Streptobacillus
Two candidate genes found in the OAR25 region may be associated with the Streptobacillus relative abundance. Surfactant Protein A1 (SFTPA1) and Peroxiredoxin Like 2A (PRXL2A). SFTPA1 encodes a protein binding to particular carbohydrate moieties found on the surface of microorganisms and lipids. This protein is vital in the immune defense and surfactant homeostasis against respiratory pathogens (Pinnell et al. 2015). PRXL2A acts as an antioxidant and negatively regulates macrophage-mediated inflammation by inhibiting the production of macrophage of inflammatory cytokines by suppressing MAPK signaling pathway (Fang et al. 2015).

Conclusion
This study identified associations between sheep genetic variations and relative abundance of seven genera: Acinetobacter, Bacillus, Clostridium, Flavobacterium, Prevotella, Pseudomonas, and Streptobacillus. Candidate genes identified in the associated genetic regions encode proteins essential in host defense systems, containing the immune system, antimicrobial activities, inflammatory responses, and regulation of cell adhesion and migration. In contrast, other proteins contribute to fatty acid and bile metabolism and other biochemical processes. The results confirm that host genetics is important in modulating microbial composition; nonetheless, the association found in this study may be population-specific as various factors alter the rumen microbiota. More studies are necessary for diverse populations to help determine genetic combinations favorable for the enhancement of beneficial microbiota and provide excellent intestinal health to the individuals and avoid the invasion of potential pathogens.
Acknowledgements The authors acknowledge Centre for High Performance Computing, South Africa for the computer programs and facilities that were used in this project.
Author contribution SM, OAA, and MAA designed the experiment. SM carried out all the laboratory work and analyses. SM drafted the manuscript. OAA and MAA revised the manuscript. All authors read and approved the final manuscript. Data availability Raw reads used in the present study were deposited to the National Center of Biotechnology Information (NCBI) Sequence Read Archive (SRA) database under BioProject PRJNA578022 (https:// www. ncbi. nlm. nih. gov/ biopr oject/? term= PRJNA 578022). The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.