Previous studies have demonstrated that host genetics play a role in modulating the composition of the gut microbiota in mammals [3, 7, 8, 13]. The results of this study suggest that there are associations between the abundance of certain genera and host genotypes (SNPs), which might help identify the genetic components that contribute to rumen microbial composition. Some of the microorganisms shown to be related to host genetics are relevant to the composition of the ruminal environment and the degradation of feed. The candidate genes mapped to the significant genetic variants in the sheep genome are described in detail below.
Acinetobacter
The relative abundance of Acinetobacter is associated with genetic variations in six sheep genome regions. In the region of OAR1, two candidate genes were identified: Adenylate Kinase 4 (AK4) and Leptin Receptor (LEPR). AK4 regulates the activation and phosphorylation of the energy sensor AMPK protein kinase by controlling cellular ATP levels. The kinase is also essential in the cellular defence response to oxidative stress [20, 21]. LEPR is a receptor for leptin, an adipocyte-specific hormone that regulates body weight [22]. In the region of OAR3, two candidate genes were found: Integrin Subunit Beta 1 Binding Protein 1 (ITGB1BP1) and Rho Associated Coiled-Coil Containing Protein Kinase 2 (ROCK2). ITGB1BP1 is essential for cell adhesion, proliferation, differentiation and migration [23]. ROCK2 is vital for regulating cell polarity and the actin cytoskeleton. It is also involved in regulating smooth muscle contraction, neurite retraction, cell adhesion and motility, and the formation of stress fibres and focal adhesions [24]. ROCK2 might therefore be involved in regulating rumen smooth muscle contraction. Xiang et al. [25] reported that increased expression of genes regulating smooth muscle contraction in the rumen is found in sheep with a shorter mean retention time (MRT), as muscle contraction reduces MRT and subsequently CH4 production. ROCK2 may thus increase gastrointestinal muscle contraction, which is linked to a rapid digesta passage rate, and thereby influence the rumen microbiota.
Three candidate genes were found in the region of OAR22: Carbohydrate Sulfotransferase 15 (CHST15), Abraxas 2, BRISC Complex Subunit (ABRAXAS2) and Zinc Finger RANBP2-Type Containing 1 (ZRANB1). CHST15 is essential in the metabolism of glycosaminoglycans, which are significant structural constituents of the extracellular matrix, and this gene potentially acts as a B-cell surface signaling receptor [26]. ABRAXAS2 is involved in the deubiquitination of the interferon receptor IFNAR1, which participates in interferon signaling, and it also down-regulates the response to bacterial lipopolysaccharide (LPS) [27]. ZRANB1 is essential in cell migration and stress fibre dynamics; it might also modulate TNF-alpha signaling [28]. In the region of OAR25, three candidate genes were found: WDFY family member 4 (WDFY4), oxoglutarate dehydrogenase-like (OGDHL) and DEPP1 autophagy regulator (DEPP1). WDFY4 plays a significant role in regulating cDC1-mediated cross-presentation of viral and tumor antigens in dendritic cells. It is also known to be essential in B-cell survival through autophagy regulation [29]. OGDHL encodes a protein involved in the degradation of glucose and glutamate [30]. DEPP1 is known as a fasting-induced gene, as its expression is induced by fasting and progesterone, and it is a vital modulator of FOXO3-induced autophagy via oxidative stress [31]. In the region of OAR26, no candidate gene was found for the other genetic variants, but the variant with the significant SNP OAR26_46995252.1 contained nuclear receptor subfamily 1 group D member 2 (NR1D2). NR1D2 regulates genes essential in metabolic functions such as lipid metabolism and the inflammatory response [32]. Acinetobacter has been reported to be involved in epithelial proliferation and disease [33], and these genes might modulate its abundance.
Bacillus
In the gut, Bacillus is known to take part in the metabolism of nutrients and in the maintenance of intestinal homeostasis [34]. The relative abundance of Bacillus is associated with genetic variations in 18 regions of the sheep genome; no candidate gene was found in three of the regions. In OAR2, the significant SNP OAR2_107495304.1 was located in the region of scavenger receptor class A member 3 (SCARA3). SCARA3, a cellular stress response gene, has been shown to deplete reactive oxygen species and thus has a significant function in protecting cells from oxidative stress [35]. Two significant SNPs, s04253.1 and OAR2_135296252.1, were located in the region of integrin subunit alpha 4 (ITGA4). ITGA4 encodes a protein in the integrin alpha chain family; integrins are a key class of cell surface receptors that mediate interactions between cells and the extracellular matrix and are essential in cell surface adhesion and signaling [36]. The region of the significant SNP OAR2_153029366.1 contained growth factor receptor-bound protein 14 (GRB14), which may modulate Bacillus abundance, as it inhibits insulin receptor signaling and thereby induces insulin resistance [37]. The region of OAR3 contained ATPase Plasma Membrane Ca2+ Transporting 1 (ATP2B1), which catalyzes the hydrolysis of ATP coupled to the transport of calcium from the cytoplasm to the extracellular space, thereby playing a vital role in intracellular calcium homeostasis [38].
Three candidate genes were found in the region of OAR4: Oxysterol Binding Protein like 3 (OSBPL3), Neuropeptide Y (NPY) and ADAM Metallopeptidase Domain 22 (ADAM22). OSBPL3 is essential in regulating the actin cytoskeleton, cell adhesion, cell polarity and cellular lipid metabolism [39]. NPY acts as a neuromodulator and is involved in several physiological processes, including stress response, food intake, cardiovascular function and circadian rhythms [40]. ADAM22 is involved in regulating cell adhesion and inhibiting cell proliferation [41]. In the region of OAR6, two candidate genes were identified: alpha kinase 1 (ALPK1) and TRAF Interacting Protein with Forkhead Associated Domain (TIFA). ALPK1 initiates an innate immune response triggered by the detection of bacterial pathogen-associated molecular pattern metabolites (PAMPs), which is vital for eliminating pathogens and engaging adaptive immunity [42]. TIFA encodes an adapter protein essential in adaptive and innate immunity, which activates proinflammatory NF-kappa-B signaling following the detection of bacterial PAMPs [43]. In the region of OAR7, two candidate genes were found: MCC regulator of WNT signaling pathway (MCC) and erythrocyte membrane protein band 4.1 like 4A (EPB41L4A). MCC plays a part in cell migration [44]. The EPB41L4A product NBL4 is reported to play a role in the beta-catenin signaling pathway; beta-catenin regulates and coordinates cell-cell adhesion and gene transcription [45]. The region of OAR10 contained aconitate decarboxylase 1 (ACOD1), which is essential in the antimicrobial response of innate immune cells because it produces itaconic acid, a metabolite involved in the antimicrobial activity of macrophages [46]. Transmembrane Protein 98 (TMEM98) was found in one region of OAR11, and the secreted form of its product promotes the differentiation of T helper 1 (Th1) cells [47].
Nine candidate genes were identified in the second region of OAR11: Alpha-1,6-Mannosylglycoprotein 6-Beta-N-Acetylglucosaminyltransferase B (MGAT5B), Acetylgalactosaminide Alpha-2,6-Sialyltransferase 2 (ST6GALNAC2), Cytoglobin (CYGB), Sphingosine Kinase 1 (SPHK1), Galanin Receptor 2 (GALR2), Galactokinase 1 (GALK1), Solute Carrier Family 16 Member 5 (SLC16A5), Otopetrin 2 (OTOP2) and Otopetrin 3 (OTOP3). MGAT5B is essential in the O-mannosyl glycan pathway and in the modulation of integrin- and laminin-dependent adhesion and neuronal cell migration [48]. ST6GALNAC2 has roles at the cell surface in bacterial adhesion, protein targeting, and cell-cell and cell-substrate interactions [49]. CYGB might have a protective function under oxidative stress conditions and might have a role in intracellular oxygen storage or transfer [50]. SPHK1 and its product S1P are essential in TNF-alpha signaling and the NF-kappa-B activation pathway, both of which are important in immune processes [51]. GALR2 encodes a receptor for galanin, an important neuromodulator present in the gastrointestinal system, brain and hypothalamopituitary axis. Galanin is also essential in regulating growth hormone release and modulating insulin release [52]. GALK1 is a major enzyme for galactose metabolism [53]. SLC16A5 catalyzes the rapid transport across the plasma membrane of many monocarboxylates, such as lactate, pyruvate, branched-chain oxo acids derived from leucine, valine and isoleucine, and the ketone bodies acetoacetate, beta-hydroxybutyrate and acetate [54]. OTOP2 and OTOP3 are proton-selective channels that transport protons specifically into cells. The activity of these channels is possibly essential in cell types that use intracellular pH changes for cell signaling or for the regulation of biochemical or developmental processes [55].
In the region of OAR12, Phospholipase A2 Group IVA (PLA2G4A) was proposed as a candidate gene, as it belongs to a group of enzymes that hydrolyze phospholipids into fatty acids and other lipophilic molecules [56]. Two candidate genes were found in the region of OAR13: Glutamate Decarboxylase 2 (GAD2) and Optineurin (OPTN). GAD2 encodes one of the various forms of glutamic acid decarboxylase and has been recognized as a vital autoantigen in insulin-dependent diabetes [56]. OPTN is involved in the activation of the innate immune response during viral infections [57]. CD44 Molecule (Indian Blood Group) (CD44) was found in the region of OAR15; this gene encodes a cell-surface glycoprotein essential in cell adhesion, cell-cell interactions and migration. This glycoprotein also participates in other cellular functions such as T-lymphocyte activation and homing, inflammation and the response to bacterial infections [58, 59]. The region of OAR16 contained Fibroblast Growth Factor 10 (FGF10), which is essential in regulating embryonic development, cell proliferation and differentiation. FGF10 may also be involved in wound healing [60]. Two candidate genes were found in the region of OAR19: Contactin 6 (CNTN6) and cell adhesion molecule L1 like (CHL1). CNTN6 and CHL1 encode proteins that function as cell adhesion molecules; these molecules are known to interact and have a key role in inflammatory responses [61, 62].
In the region of OAR20, five candidate genes were identified: Methylmalonyl-CoA Mutase (MMUT), Glycine-N-Acyltransferase Like 3 (GLYATL3), Defensin Beta 133 (DEFB133), Defensin Beta 113 (DEFB113) and Defensin Beta 110 (DEFB110). MMUT plays a role in the degradation of various amino acids, cholesterol and odd-chain fatty acids via propionyl-CoA to the tricarboxylic acid cycle [63]. GLYATL3 catalyzes the conjugation of long-chain fatty acyl-CoA thioesters and glycine to produce long-chain N-(fatty acyl)glycine, an intermediate in the primary fatty acid amide biosynthetic pathway [64]. DEFB133, DEFB113 and DEFB110 belong to a family of antimicrobial and cytotoxic peptides made by neutrophils, and they take part in the innate immune defence response to bacteria [64]. Lipin 2 (LPIN2) was found in the region of OAR23 and plays a significant role in regulating fatty acid metabolism at different levels [65]. Two candidate genes were found in the region of OAR26, associated with three significant SNPs: 1-Acylglycerol-3-phosphate O-acyltransferase 5 (AGPAT5) in SNP DU481531_204.1, and Nuclear Receptor Subfamily 1 Group D Member 2 (NR1D2) in SNPs OAR26_45130684.1 and OAR26_46516694.1. AGPAT5 converts lysophosphatidic acid (LPA) into phosphatidic acid and acts on LPA containing saturated or unsaturated fatty acids C15:0-C20:4 at the sn-1 position, using C18:1-CoA as the acyl donor [66]. NR1D2 was also found to be associated with Acinetobacter in the same region and has already been described.
Clostridium
The relative abundance of Clostridium is associated with genetic variations in region OAR2 of the sheep genome. The region of SNP s43106.1 contained two candidate genes: RAD23 Homolog B, Nucleotide Excision Repair Protein (RAD23B) and Solute Carrier Family 44 Member 1 (SLC44A1). RAD23B is involved in the cellular response to DNA damage stimulus [67] and the cellular response to interleukin-7 [68]. SLC44A1 might provide choline for cell membrane phospholipid synthesis [69]. The region of SNP OAR2_58286833_X.1 contained TLE Family Member 1, Transcriptional Corepressor (TLE1), which inhibits NF-kappa-B-regulated gene expression in human monocytes. Increased expression of TLE1 is also significant for the inhibition of interleukin 12 (IL-12) p70 expression mediated by zymosan [70], as well as for the inhibition of immune suppression induced by Bacillus anthracis toxin [71].
Flavobacterium
The relative abundance of Flavobacterium was associated with genetic variations on chromosomes 1, 2, 6, 16, 22 and 23 of the sheep genome. In the region of SNP OAR1_74421103.1, three candidate genes were proposed. Growth factor independence-1 (GFI1) is involved in regulating pre-T cell differentiation, hematopoietic stem cell maintenance and the development of secretory cell types in the intestines [72]. Formin-binding protein 1-like (FNBP1L) is critical for antibacterial autophagy [73]. ATP Binding Cassette Subfamily D Member 3 (ABCD3) is crucial in the peroxisomal degradation of branched-chain fatty acids and the biosynthesis of bile acids [74]. Unsaturated fatty acids released from bile are known to increase the solubility and absorption of saturated fatty acids in the rumen, which in turn improves fat absorption [75]. The regulation of this gene may directly influence the occurrence of Flavobacterium, as this bacterium is essential in fatty acid metabolism and biohydrogenation [76]. Three candidate genes were proposed in the region of the significant SNP OAR1_197593478.1: Immunoglobulin superfamily member 11 (IGSF11), Beta-1,4-Galactosyltransferase 4 (B4GALT4) and Rho GTPase Activating Protein 31 (ARHGAP31). IGSF11 functions as a cell-cell adhesion molecule through homophilic interactions and stimulates cell growth [77]. B4GALT4 encodes an enzyme that plays a vital role in glycosaminoglycan biosynthesis [78]. ARHGAP31 encodes a GTPase-activating protein that regulates Cdc42 and Rac1, which are essential in protein trafficking, cell growth and cell migration [79]. SNP OAR2_247207503.1 was also associated with the abundance of Flavobacterium. The SNP is located within the region of Adenylate kinase 2 (AK2) and Lymphocyte Cell-Specific Protein-Tyrosine Kinase (LCK); both genes might be associated with microbial composition, as their functions are related to the immune system. AK2 is essential in cellular energy homeostasis and cell proliferation [80]. LCK is essential in the selection and maturation of developing T-cells, and in T-cell antigen receptor (TCR)-linked signal transduction pathways [81].
The candidate genes found in the region of OAR16 are associated with innate and adaptive immune responses (C9) [82]; activation of T-cell factor signaling and mediation of PGE2-induced expression of early growth response 1 (PTGER4) [83]; regulation of cell growth and survival in response to hormonal signals (RICTOR) [84]; activation and control of interleukin-2 expression (FYB1) [85]; and cell differentiation and the induction of macrophage adhesion and spreading (DAB2) [86]. Five candidate genes were identified in the region of OAR22. NFKB2 is essential in cellular responses and in regulating the immunological responses to infections and inflammation [87, 88]. TRIM8 plays different roles in immune pathways, such as a positive role in the TNF-alpha and IL-1-beta signaling pathways [89]. CNNM2 is essential in magnesium (Mg2+) homeostasis by mediating the epithelial transport and renal reabsorption of Mg2+ [90]. FGF8 is significant in regulating embryonic development, cell proliferation, differentiation and migration [91]. ELOVL3 is essential in the production of monounsaturated and saturated very-long-chain fatty acids (VLCFAs), which are associated with numerous biological processes as precursors of lipid mediators and membrane lipids [92, 93]. SMAD Family Member 4 (SMAD4) is the candidate gene found in the OAR23 region; it serves as a transcription activator regulating TGF-beta receptor-mediated signaling and is involved in the IL-2 activation and signaling pathway [94].
Prevotella
Six regions of the sheep genome were associated with the abundance of Prevotella. In the region of OAR1, one candidate gene was identified: RUNX Family Transcription Factor 1 (RUNX1), which is crucial for normal hematopoiesis and positively regulates the expression of the RAR Related Orphan Receptor C (RORC) gene in T-helper 17 cells [95]. One candidate gene was found in the region of OAR4: Glutamate metabotropic receptor 3 (GRM3), which is known to modulate synaptic glutamate levels that regulate the stress response; one study reported a fixed missense mutation in GRM3 that differentiates aggressive foxes from tame foxes [96]. In the region of OAR7, two candidate genes were identified: neural precursor cell expressed, developmentally downregulated 4 (NEDD4) and RAB27A, member RAS oncogene family (RAB27A). NEDD4 is a significant candidate molecular marker essential in cell proliferation, ovarian development and sexual reproduction in zebrafish [97]. RAB27A is involved in cytotoxic granule exocytosis in lymphocytes, as it is essential for granule maturation, granule docking and priming at the immunologic synapse [98].
Three candidate genes were found in the region of OAR16: Dual Specificity Phosphatase 1 (DUSP1), BCL2 Interacting Protein 1 (BNIP1) and Stanniocalcin 2 (STC2). DUSP1 is essential in the cellular response to environmental stresses and negatively regulates cellular proliferation [99]. BNIP1 interacts with the E1B 19 kDa protein, protecting host cells from virally induced cell death [100]. STC2 encodes a homodimeric glycoprotein that regulates cell metabolism, renal and intestinal calcium and phosphate transport, and cellular calcium/phosphate homeostasis [101]. Prevotella microbes are involved in peptide and protein degradation in the rumen, in saccharolytic pathways and in the production of saturated fatty acids [102]. STC2 has been found to have a significant association with the relative abundance of Prevotella in cattle, and it has previously been associated with fatty acids and cell metabolism [103]; hence, this gene could modulate the abundance of Prevotella in the rumen. In the region of OAR25, three candidate genes were identified; two of them (WDFY4 and OGDHL) were also found to be associated with Acinetobacter in the same region and have already been described. The third gene is mitogen-activated protein kinase 8 (MAPK8), which acts as a point of integration for multiple biochemical signals and is essential in a wide range of cellular processes such as transcription regulation, development, proliferation and differentiation [104].
Pseudomonas
The relative abundance of Pseudomonas is associated with genetic variations in two regions, on OAR6 and OAR14. Three candidate genes were found in the OAR6 region: 3-Hydroxybutyrate Dehydrogenase 2 (BDH2), Solute Carrier Family 9 Member B2 (SLC9B2) and CDGSH Iron Sulfur Domain 2 (CISD2). BDH2 facilitates the formation of 2,5-dihydroxybenzoic acid (2,5-DHBA), a siderophore that plays a role in iron assimilation and homeostasis and shares structural resemblances with bacterial enterobactin [105, 106]. This gene may be significant in preventing pathogens from invading the host. SLC9B2 is essential in regulating sodium homeostasis, intracellular pH and cell volume [107], which may be significant in regulating ruminal microbial composition. It is also essential in insulin secretion and clathrin-mediated endocytosis in beta-cells [108]. CISD2 regulates autophagy by contributing to the antagonism of BECN1-mediated cellular autophagy at the endoplasmic reticulum [109]. Casein Kinase 2 Alpha 2 (CSNK2A2) and Glutamic-Oxaloacetic Transaminase 2 (GOT2) are the two candidate genes identified in the region of OAR14. CSNK2A2 phosphorylates acidic proteins such as casein and is essential in several cellular processes, such as cell cycle control and circadian rhythms [110]. GOT2 is essential in amino acid metabolism and in the exchange of metabolites between the mitochondria and the cytosol. It also enables cellular uptake of long-chain free fatty acids [111, 112].
Streptobacillus
Two candidate genes found in the OAR25 region may be associated with the relative abundance of Streptobacillus: Surfactant Protein A1 (SFTPA1) and Peroxiredoxin Like 2A (PRXL2A). SFTPA1 encodes a protein that binds to particular carbohydrate moieties found on lipids and on the surface of microorganisms. This protein is vital in surfactant homeostasis and in the immune defence against respiratory pathogens [113]. PRXL2A acts as an antioxidant and negatively regulates macrophage-mediated inflammation by inhibiting the production of inflammatory cytokines by macrophages through suppression of the MAPK signaling pathway [114].