Expression QTL mappingExpression cis and DEG cis eQTL analysis
LSM2 and SOAT1 harbor some highly significant cis eQTLs. LSM2 binds to other members of the ubiquitous and multifunctional family Sm-like (LSM) in order to form RNA-processing complexes. These complexes are involved in processes such as stabilization of the spliceosomal U6 snRNA, mRNA decay and guide site-specific pseudouridylation of rRNA [10]. Lu et al. [11] identified two missense polymorphisms in SOAT1 associated with cholesterol in plasma and triglyceride levels in mice since they are able to increase enzyme activityG. None of these two genes were identified as DEG, therefore they must be more involved in skeletal muscle homeostasis.
Expression trans and DEG trans eQTL analysis, and master regulators
The 27 master regulators identified in the eQTL analysis could contribute to gene expression control by promoting cell signaling or by direct transcriptional activation/repression mechanisms. A number of structural proteins and transcription regulators were identified as master regulators. Neurotrophin 3 (NTF3),, Glutamate Decarboxylase 1 (GAD1),, FAT Atypical Cadherin 4 (FAT4),, Transmembrane 4 L Six Family Member 1 (TM4SF1),, Transmembrane 4 L Six Family Member 1 (TM4SF1) and Keratin 7 (KRT7) encode for transmembrane or cytoskeletal proteins. Zinc Finger Protein 804A (ZNF804A),, Paired Box 8 (PAX8),, Lysine Demethylase 4A (KDM4A) and RUNX1 Translocation Partner 1 (RUNX1T1 or Myeloid Translocation Gene on 8q22-MTG8) encode for transcription factors or histone demethylases. NTF3, TM4SF1, and KDM4A are further discussed.
NTF3 was identified as a master regulator in the present analysis since rs207649022 was able to explain variation in the expression of 76 genes (Table 1), 69.7% of which were DEG genes (Figure 3B). Since NTF3 was associated with a number of DEGs, this master regulator was able to explain variability in gene expression associated with meat quality. The Neurotrophic Factor gene family regulates myoblast and muscle fiber differentiation. It also coordinates muscle innervation and functional differentiation of neuromuscular junctions [12]. Mice with only one functional copy of the NTF3 gene showed a smaller cross-sectional fiber area and more densely distributed muscle fibers [13]. Upregulation of NTF3, stimulated by the transcription factor POU3F2, is present during neuronal differentiation [14]. The neocortex has multiple layers originated by cell fate restriction of cortical progenitors and NTF3 induces cell fate switches by controlling a feedback signal between postmitotic neurons and progenitors. Therefore, changes in NTF3 expression can modulate the amount of tissue present in each neocortex layer [15].
NTF3 was identified in a previous study as highly associated with cooking loss [16] pointing out that markers inside this locus are able to explain variation at both the phenotype and gene expression level. This implicates NTF3 as a positional and functional gene with a potential role in meat quality. These effects are probably not due to cis regulation on NTF3 given that the number of reads mapped to this gene was extremely low and it did not surpass the threshold used in order to be included in the DEG analysis (average = 6.7, min = 0; max = 23). However, NTF3 could be actively expressed in earlier developmental stages and then expressed at a basal level, exerting control on expression regulation later on when cellular morphology has been completely established. A Functional Annotation Clustering analysis for the NTF3 regulated genes indicated that the master regulator NTF3 could be involved in the regulation of specific mechanisms and pathways related to Mitochondrion, Transit peptide and Mitochondrion inner membrane (Additional file 6).
The expression of 62 genes was associated with rs378343630, a marker located in the TM4SF1 master regulator. This gene encodes a plasma transmembrane protein and belongs to a gene family involved in signal transduction processes; thus, it modulates development, growth, and motility [17]. TM4SF1 physically interacts with the membrane and some cytoskeleton-associated proteins to form cell projections named ‘nanopodia’ [18], which are described as frequently identified in multiple types of cancer. This gene is highly expressed in pancreatic cancer cells and stimulates metastasis by upregulating Discoidin Domain Receptor Tyrosine Kinase 1 (DDR1),, Matrix Metallopeptidase 2 (MMP2) and Matrix Metallopeptidase 9 (MMP9) [19]. In liver, TM4SF1 reduced apoptosis and promoted cell migration by upregulating MMP–2, MMP–9 and VEGF, and downregulating Caspase–3 and Caspase–9 [17]. Upregulation of miR–9 produces downregulation of TM4SF1, MMP2, MMP9 and VEGF in colorectal carcinoma, inhibiting cell migration and invasion [20]. In esophageal cancer stem-like cells, downregulation of miR–141 increases TM4SF1 expression, self-renewal ability and promotes cell invasion [21]. The Functional Annotation Clustering analysis for TM4SF1 found an overrepresentation of the transcription, DNA-templated term (Additional file 6); thus, TM4SF1 could be involved in the regulation of specific mechanisms and pathways associated with transcription in longissimus dorsi muscle. Neither TM4SF1 nor any gene in this cluster was identified as DEG; therefore they might be more related to skeletal muscle homeostasis than meat quality.
The KDM4A cluster has 32 regulated genes associated with rs135786834; KDM4A encodes a histone lysine demethylase able to modify trimethylated H3-K9/K36 to dimethylated products, contributing to gene expression, cellular differentiation and cancer development [22]. Histone H3K9 methylation promotes the silencing of muscle-specific genes in proliferating myoblasts and derepression of these genes is required to initiate muscle differentiation. Expression of a KDM4A isoform named DN-JMJD2A is upregulated during differentiation of myoblasts into myotubes promoting myotube formation and transcriptionally activating muscle-specific genes such as MyoD [23]. The only DEG master regulator identified in the present analysis was KDM4A and this master regulator harbors rs135786834, an SNP associated with expression of 32 genes by trans association. Therefore, changes in the expression of tKDM4A did not show evidence of promoting the expression of genes related to meat quality.
Multigenic effects based on the eQTL analysis
Some of the most interesting genes identified in this analysis were ULK2, MYL1, and PHF14.Forty-three SNPs were associated with ULK2 expression. ULK2 encodes a serine/threonine-protein kinase required for autophagy under nutrient-deprived conditions [24]. Downregulation of ULK2 activates mTOR c1 signaling, promoting cell proliferation [25]. The MYL1 gene encodes a fast-twitch regulatory light chain of myosin in skeletal muscle; downregulation of MYL1 alters myocyte morphology and muscle structure, and generates congenital myopathy in zebrafish [26]. A total of 40 and 36 polymorphisms were associated with the expression of MYL1 and PHF14, respectively. PHF14 is ubiquitously expressed and its protein has multiple PHD fingers, a domain present in chromatin-binding proteins which are able to recognize particular epigenetic marks on histone tails. The PHF14 knockout in mice generates neonatal lethality and severe structural changes in multiple organs especially lungs. PHF14 is an epigenetic regulator required for the normal development of multiple organs [27], is probably involved in skeletal muscle homeostasis.
Splicing QTL mapping
Splicing cis and DEG cis analysis
The TTN gene harbors a highly significant DEG cis sQTL (p-value = 2.0x10–7) and encodes a central sarcomeric protein. Some TTN mutations are associated with skeletal-muscle diseases such as tibial muscular dystrophy [28]. Fernandez-Marmiesse et al. [29] identified a non-sense mutation in a TTN exon only present in a fetal skeletal isoform and associated with a neuromuscular disorder; histologically, this mutation promotes sarcomeric deposition of a filamentous material. A DEG cis sQTL (p-value = 5.1x10–7) was identified in the TEK gene. This gene encodes a receptor for Angiopoietin–1 (ANGPT1),, and its signaling pathway is critical for migration, sprouting and survival of endothelial cells; TEK activates the SHC Adaptor Protein 1 (SHC1), a protein involved in triggering the Ras/mitogen-activated protein kinase pathway, regulating migration and endothelial organization [30]. Therefore, cis sQTLs in TTN and TEK could explain variation in the expression of these genes and variation in meat quality-related phenotypes.
Trans and DEG trans splicing QTL analysis, and master regulators
Similarly, as the identified expression master regulators, the splicing master regulators can be grouped as transcription regulators and structural proteins. Small Nucleolar RNA, H/ACA Box 3A (SNORA3),, Small Nucleolar RNA, H/ACA Box 19 (SNORA19),, Ring Finger Protein 20 (RNF20),, and Zinc Finger Protein 804A (ZNF804A) could be classified as transcription regulators. Phosphodiesterase 9A (PDE9A),, Olfactory Receptor Family 13 Subfamily F Member (OR13F1),, Dynein Axonemal Heavy Chain 7 (DNAH7) and Von Willebrand Factor C Domain Containing 2 (VWC2) can be identified as structural proteins.
The small non-coding RNAs such as SNORA3 and SNORA19 modulate stability, folding and interaction with proteins and more recently, functions such as mRNA editing, alternative splicing and posttranscriptional gene silencing were discovered [31]. However, no clear function of SNORA3 and SNORA19 is described. Exon expression of 34 exons from 17 genes and 20 exons from 15 genes were associated with the polymorphisms rs209617551 (SNORA3) and BTB_01634267 (SNORA19),, respectively.
Expression of 23 exons from 19 genes was associated with rs382101207, an SNP located in Ring Finger Protein 20 (RNF20).. Upregulation of RNF20 stimulates H2B monoubiquitination and methylation at H3K4 and H3K79; it promotes expression of Homeobox genes, a group of transcription factors [32]. RNF20 also regulates expression of H2A and H2B histones, p53, several proto-oncogenes and promotes cell migration and tumorigenesis [33]. The RNF20/RNF20 (Bre1 complex) is documented as a tumor suppressor by upregulating a set of tumor suppressor genes and by contributing to genomic stability maintenance. Bre1 deficient cells present a high frequency of DNA double-strand breaks (DSB), and abundant aberrant RNA-DNA structures (R-loops), indicators of replication stress and genomic instability [32].
Pierce et al. [1] theorized that a high proportion of trans associations are caused by cis effects. However, no cis QTL was identified in any expression or splicing master regulator. This result suggests that, in the present analysis, trans effects might contribute significantly to phenotypic variation related to skeletal muscle homeostasis and meat quality.
Multigenic effects based on the sQTL analysis
The large number of sQTLs identified in genes like TTN (324) and NEB (63) could be related to gene size, since these genes are 275 and 219 kb long, respectively, which would increase the probability of being involved in trans regulation.On the other hand, some relatively short genes such as TCEB2 (9.9 kb) and USF2 (3.9 kb)also had a large number of sQTLs (43 and 33, respectively) indicating a possible complex splicing regulation.
A total of 324 and 67 polymorphisms were associated with TTN and NEB ratio exon counts, respectively. TTN and NEB are involved in assembly and mechanical activity of striated muscles. Both proteins are large sarcomere filament-binding proteins expressed in skeletal muscle and multiple splicing events in the bovine homologous are described. In the human brain, NEB acts as an actin filament stabilizer and regulates neuronal length. It is also involved in myofibrillogenesis, modulates thin filament length and allows proper muscle contraction [34]. TTN, NEB, and USF2 were identified as DEG; therefore, sQTL regulation could contribute to phenotypic variability associated with meat quality in longissimus dorsi and skeletal muscle homeostasis.
Gene expression and splicing regulation mechanisms by plasma and organelle associated proteins
The cell cytoskeleton provides cellular mechanical constraints and extracellular matrix stiffness [35]. However, structural proteins are involved in multiple biological processes different from the organizational ones, with signaling and cell fate being some of the most important. Cell signaling is crucial since it orchestrates cellular responses to different microenvironmental stimuli, and transcription repression-activation and splicing regulation are influenced by signaling proteins. A number of receptors, transmembrane linkers, cytoskeletal fibers and membrane-associated transcription factors were previously associated with transcription repression-activation.
The OR4A47, GPR98, PDE9A, OR13F1 and SYT14 master regulators were also described as transmembrane protein-coding genes and this type of molecule is involved in cell signaling processes. Pandey et al. [36] reported that estrogen can signal using diverse receptors, the G Protein-Coupled Estrogen Receptor 1 (GPR30) being one of them. Stimulation of GPR30 by estrogen activates a transcription factor network that upregulates Cellular Communication Network Factor 2 (CCN2),, promoting proliferation and cell migration. The master regulators GAD1 and TM4SF1 encode transmembrane linkers similar to the integrin family. Integrins can modulate signal transduction cascades involved in cell survival, proliferation, differentiation and organ development [37]. The dimer ITGA1-ITGB1 can stall Epidermal Growth Factor Receptor (EGFR) signaling by stimulating Protein Tyrosine Phosphatase, Non-Receptor Type 2 (PTPN2). The cytoplasmic domain of ITGA1 interacts with PTPN2 and decreases EGFR phosphorylation after Epidermal Growth Factor (EGF) stimulation [38].
The cytoskeletal protein-coding genes KRT7, FAT4, MYH14, and DNAH7 were identified as master regulators. Some cytoskeletal proteins might drive transcription regulation and promote cellular mechanisms such as growth and apoptosis. Flouriot et al. [35] reported that the actin network can regulate Myocardin Related Transcription Factor A (MRTFA) subcellular localization, a protein involved in growth-quiescence switch. High F/G actin ratio or mutant MRTFA cells showed higher global biosynthetic activity and open chromatin state, associated with extensive histone modifications. In Drosophila, Hippo tumor suppressor pathway controls organ size, and proteins such as Yorkie (human homologous Yes Associated Protein 1 -YAP),, a transcriptional coactivator, and Hpo and Warts kinases (human homologous Serine/Threonine Kinase 3 -STK3- and Large Tumor Suppressor Kinase 1 -LATS1, respectively) belong to this pathway. YAP is negatively regulated by STK3 and LATS1. F-actin accumulation promotes overgrowth in Drosophila imaginal discs through modulating the activity of the Hippo pathway [39].
Applicability of the present results and future analysis
The present results provide biological support to some of the previously identified pQTLs related to complex phenotypes in cattle and could contribute to discovery of potential causative polymorphisms. pQTL and eQTL colocalization for NTF3 (cooking loss) and GPR98 (tenderness) was evident in the present population [16]; however, more research is required in order to be able to determine if these genes harbor actual causative markers associated with meat quality. The use of causative polymorphisms in genomic prediction is the ideal scenario since it is not affected by recombination events between the actual pQTL and the marker being genotyped, over time. In this respect, research showed that polymorphisms associated with expression regulation such as eQTLs and sQTLs can explain an important proportion of the genetic variance present on complex phenotypes in cattle.
Lopdell et al. [4] identified a set of 3,695 distinct eQTL variants for milk, fat and protein yield and showed that they have increased the predictive ability for milk composition related phenotypes. DGAT1, MGST1, and GPAT4 were identified as the most highly predictive regions. A 1 Mbp region nearby DGAT1 harbors three polymorphisms that are able to explain a high amount of the SNP variance in the set. Xiang et al. [40] classified 17,669,372 imputed variants into 30 sets of markers. This classification included categories such as inter-species conserved markers, polymorphisms associated with metabolic traits (several milk metabolites), expression regulation related polymorphisms (gene and exon expression QTLs, sQTLs, and allele-specific expression QTLs), and markers with evolutionary roles. An index was constructed for each marker using the amount of genetic variance explained by them for a total of 34 complex traits in cattle. Conserved markers, polymorphisms associated with metabolic traits and expression regulation related markers were able to explain the highest amount of genetic variance. Later, this index was applied to another population composed of 7,551 individuals and it was determined that high ranking variants significantly increased genetic variance estimates and genomic prediction accuracies for milk, fat and protein yield.
However, other research has found difficult to illustrate the potential use of eQTL and sQTL mapping on the predictive ability for complex phenotypes. The research of Berg et al. [41] was focused on identifying pQTLs caused by eQTL for milk, fat and protein yield, and calving interval. No strong evidence of association between pQTL and eQTL effects were evident.
The results reported by Berg et al. [41] could indicate that most eQTLs are able to explain a very small fraction of the variance associated to pQTLs; however, it is important to highlight that lack of power for eQTL effect estimation and long-range LD could contribute the difficulty of identifying pQTLs and eQTL colocalization. Additionally, the relationship between pQTL and eQTL effect could be dependent on the genetic architecture of the phenotype being assessed and its degree of transcriptional control. In this respect, Lopdell et al, [4] noticed that predictions for milk, fat and protein yield using eQTL variants did not surpass R2, of 0.5 since all the QTL effects present in these traits are not due to expression effects. Furthermore, eQTLs in related tissues or eQTLs present at different stages of development could contribute to these phenotypes as well.
In order to identify causative polymorphisms, the present results require validation through eQTL and sQTL mapping on additional populations with Angus, Brahman and mixed breed composition. After validation, candidate genes would also need to be confirmed using in-vitro and in-vivo analysis. For the assessment of proteins described as eQTL and sQTL associated transcription factors, techniques such as Electrophoretic mobility shift assay (EMSA) and Chip-seq could be used in order to identify actual DNA-protein interaction able to regulate gene expression of the potential target genes. To support eQTL and sQTL master regulator activity for structural proteins able to activate signaling cascades and gene expression, knockout and knockdown trials could verify if these proteins could module this biological activity. Finally, to identified cis regulations, reporter gene experiments can be used.