DOI: https://doi.org/10.21203/rs.3.rs-18231/v1
Fillet quality traits have economic importance to the aquaculture industry [1]. For profitable aquaculture production, there is a need for fish fillets with optimum nutritional attributes and consistent organoleptic qualities. Consumer attitude towards fish is influenced by fillet quality, including intramuscular fat and moisture content[2]. Fish provide a valuable source of long-chain omega 3 and other essential fatty acids for human diets, which can lower the risk of several chronic diseases [3]. Moreover, the quantity and quality of intramuscular lipid impact fillet quality attributes, particularly juiciness, flavor, color, texture, and shelf-life [4].
Previous studies showed a high correlation between fat, moisture, and protein content, which together account for 99% of the muscle weight [5]. In mammals, the intramuscular fat content exhibited a significant negative correlation with moisture content [6, 7]. A simultaneous lower protein and moisture content, along with increased fat content of the porcine meat was reported [7]. In fish, previous studies showed that muscle mobilizes protein or fat, replenishing them with water depending on the energetic demands associated with various physiological conditions [8–10]. Muscle degradation, accompanying sexual maturation in female rainbow trout, caused a significant decline in muscle protein content associated with a concomitant increase in moisture content [8]. Meanwhile, fat content was not affected by spawning in rainbow trout, suggesting that intramuscular fat is not required to support gonadal maturation. Additionally, muscle protein catabolism was implied as a source of energy during spawning, generating voids in the muscle that were replaced by water [8]. However, more recent studies in our laboratory revealed that gravid fish maintained on a high plane of nutrition showed reduced intramuscular fat and increased abundance of transcripts involved in fatty acid degradation compared to control sterile fish. These results suggested that female trout approaching spawning mobilize intramuscular fat to fuel gonadal maturation. Spawning fish exhibiting a decrease in muscle fat content showed a concurrent increase in moisture, shear force, and protein content [10]. Many changes in muscle growth and quality attributes occur 2–3 months pre-spawning. Control of the muscle fat content in fish can enhance muscle quality attributes and hence aquaculture industry profitability.
The fat content of the tissue is not usually measured directly on breeding candidates, which makes it difficult for the aquaculture industry to improve it. The aquaculture industry controls the fat content of fillets by adjusting lipid content in the ration [4]. Rearing triploid fish has been adopted in rainbow trout to avoid loss of fillet quality associated with fat mobilization and protein catabolism during sexual maturation [9]. Selective breeding can be used to enhance phenotypic traits of interest. One aim in the breeding programs is to convert feed into muscle rather than accumulating an excess of visceral fat. Also, accumulating excessive lipids in the muscle makes the fillet processing difficult and reduces the fillet firmness [1]. A two-generation program of selection on muscle fat content was initiated in rainbow trout to produce lean and fat lines where the fat percentage increased by ~ 15 to 31% in the fat line depending on the diet [11]. These lines were used as a model to study the effect of muscle fat content on fillet quality [11]. Selection on fat content enhanced fillet color and texture [11], feed conversion ratio (FCR), and protein-retention efficiency [12]. Separately, five generations of family-based selection on body weight of rainbow trout were performed at the USDA National Center of Cool and Cold Water Aquaculture (NCCCWA) [13]. In the fish population of year class (YC) 2010, fat content showed a moderate correlation with whole body weight (regression coefficient (R2) value of 0.50) [14]. Therefore, selection for bodyweight yielded heavier fish with more fat in the muscle. Similarly, gilthead seabream exhibited a 0.1% increase in muscular fat content concomitant with a 0.08% decline in moisture content per increment of ten grams in weight [15]. Muscle fat and moisture content showed moderate heritability in fish, including rainbow trout, implying the existence of genetic variance in a rainbow trout population selected for an enhanced rate of growth [16], thus making genetic responses to selection possible. However, the genetic architecture of fat and moisture content has not been delineated in fish. Understanding the genetic basis of the phenotypic traits in question and development of fish strains of improved genetic gain will enhance efficiency of breeding programs, aquaculture industry profitability, and consumer satisfaction.
Genome-wide association (GWA) studies help to identify variants responsible for phenotypic variations, which can be prioritized in breeding programs that rely on genomic information for making selection decisions. A few GWA studies have been conducted on aquaculture species to identify quantitative genomic loci (QTL) responsible for the genetic variability in body weight [13], fillet quality [13, 17], and disease resistance [18]. Two GWA studies were performed in Atlantic salmon [1] and common carp [19] to identify QTL associated with muscle fat content. In Atlantic salmon, four significant SNPs affecting muscle fat content were identified on chromosomes 9 and 10, based on a 5,650-SNP panel [1]. In common carp, a high-density, 250K SNP array revealed eight SNPs related to muscle fat content; however, none of the SNPs surpassed the genome-wide significance level [19]. The two studies did not identify QTL explaining a large proportion of the genetic variance in fat content for fish. To the best of our knowledge, no GWA studies have been performed in rainbow trout to identify SNP markers associated with genetic variance for fat and moisture content.
A 50K transcribed SNP-chip, suitable for GWA analyses, has been recently developed in our laboratory. The array has been used to identify large-effect QTL responsible for genetic variance in fillet yield, firmness, protein content, and body weight gain [17, 20]. The chip outperformed a 57K genomic SNP panel [13] in identifying major QTL associated with genetic variability in fillet yield and bodyweight gain in fish populations of two consecutive generations of the USDA growth-selection line. The current study aimed to identify QTL affecting additive genetic variance in fillet fat and moisture content for the same rainbow trout population.
The fish population used in the current GWA analyses was previously described in detail [21]. Briefly, a selective breeding program was established at NCCCWA in 2002 by intercrossing seven domesticated strains of rainbow trout; this fish population has gone through five generations of selection for improved growth performance [22]. Full-sib families collected from third- and fourth-generation fish (Year-class; YC 2010 and 2012) were used for the current GWA analyses. Phenotypic data for muscle fat and moisture content were obtained from 789 fish representing 197 families produced from two generations (YC 2010 and 2012). Single-sire × single-dam matings occurred over 6 weeks to produce full-sib families. Individuals from each family were reared together in a 200-L tank in order to keep the pedigree information. Tagging fish, at ~ 5-months post-hatch with a passive integrated transponder (Avid Identification Systems Inc., Norco, CA) allowed husbandman to rear different fish families together in 800-L communal tanks (~ 13 °C) until ~ 13 months post-hatch. A commercial fishmeal-based diet was provided, and the feeding schedule was previously described [23]. Fish were purged for 5 days before harvest to facilitate viscera removal.
Whole-body weight (WBW) was measured, and fish were sorted within family accordingly. For sampling, the 2nd or 3rd fish from each sorted family was selected in order to keep the distribution of WBW around the median of each family. For each YC, fish were randomly assigned to one of five harvest groups allowing a single fish per family per harvest group (~ 100 fish each). The harvest groups were sampled over five weeks (one group/week). Fish groups from the two-year classes were harvested at different ages. Fish produced from YC 2010 were harvested between 410- and 437-days post-hatch (mean body weight = 985 g; SD = 239 g), whereas, fish produced from YC 2012 were harvested between 446- and 481-days after hatching (mean body weight = 1,803 g; SD = 305 g). Fish were euthanized with tricaine methanesulfonate (Tricaine-S, Western Chemical, Ferndale, WA), harvested, and eviscerated. Head-on gutted carcasses were shipped on ice to the West Virginia University Muscle Foods Processing Laboratory (Morgantown, WV) where carcasses were manually processed into skinless fillets [10].
Proximate analyses, including crude lipid and moisture content, were previously described [24]. In brief, crude lipid and moisture analyses were performed using AOAC-approved methods (AOAC 2000). Crude lipid content was determined using Soxhlet extraction with petroleum ether. Moisture content was assessed by weighing the sample before and after drying at 110 °C for 18 hrs. When muscle fat and moisture content were regressed on body weight, regression coefficient (R2) values of 0.23 and 0.38 were observed, respectively. The pedigree-based heritability h2 (h2ped) for fat and moisture content was estimated according to Zaitlen et al. [25].
A 50K, transcribed gene SNP-chip was recently developed and used in identifying genomic loci responsible for additive genetic variance in fillet yield [17]. SNPs used to build the SNP chip were identified and reported in our previous study [21]. Briefly, the chip included ~ 21K SNPs showing potential allelic imbalances with growth, muscle quality, and susceptibility/resistance to Bacterial Cold Water Disease (BCWD) [17, 21]. About 5K nonsynonymous SNPs were included in the chip. Additional SNPs were added to the chip to have a minimum of 2 SNPs per each SNP-harboring gene. The array contains 50,006 SNPs.
In total, 1,728 rainbow trout fish were used for genotyping and quality assessment of the SNP chip. Genotyped fish were collected from the USDA/NCCCWA growth- and BCWD-selection lines [17]. Samples and genotypes had undergone quality control checks. The Affymetrix SNPolisher software was used for QC assessment and filtration of samples/genotypes at the default settings [26]. Genotyped samples were filtered using a call rate of 0.97 and Dish QC (DQC) threshold of 0.82. In the current study, 789 genotyped fish, with available phenotypic data for intramuscular fat and moisture content, were used for GWA analyses.
Estimates of SNP effects from weighted single-step GBLUP (WssGBLUP) were used to conduct the current GWA analysis as we previously described [17]. The WssGBLUP combines phenotypes, genotypes, and pedigree information into a single evaluation, where SNPs used to construct the genomic relationship matrix are weighted according to the proportion of additive variance they explain. The following a mixed model was used:
y = Xb + Z1a + Z2w + e
where y is the vector of phenotypes, b is the vector of fixed effects (harvest group and hatch-year), a is the vector of random animal effect, w is the vector of random family effect, and e is the vector of residual effects. X, Z1, and Z2 are incidence matrices for the effects contained in b, a, and w, respectively. Whereas the family and residual random effects were considered uncorrelated, the animal effect was correlated. The covariance structure for the animal effect was given by
where H is a matrix that combines pedigree- and genomic-based relationships[27] and
is the additive variance.
The variance components were estimated using AIREMLF90[28]. The inbreeding coefficient was calculated by INBUPGF90 [17, 29] using pedigree data representing 63,808 fish produced from five consecutive generations by the NCCCWA breeding program. Genomic data were edited using PREGSF90[28], and samples or SNP were kept according to the following parameters: minor allele frequency (MAF) > 0.05, call rate > 0.90, and Hardy-Weinberg equilibrium (HWE) < 0.15. Out of 50,006 SNPs, 35,322 SNPs (70.6%) passed QC and were used for the WssGBLUP analysis.
Two iterations were used in the current WssGBLUP analysis. All SNPs passing QC were assigned, by default, the same weight in the first iteration (i.e., weight = 1.0). In the second iteration, SNP weights were determined according to SNP effects (û) calculated in the first iteration as asû22p(1 - p), where p is the allele frequency in the current genotyped population. In brief, three steps were conducted for each iteration: 1) weights were assigned to the SNPs; 2) genomic estimated breeding values (GEBV) were calculated, based on H− 1 [29]; and 3) SNP effects and updated weights were computed by POSTGSF90 [29] using genomic sliding windows of 50 adjacent SNPs. This window size (n = 50 SNPs) was adopted instead of physical size (e.g. specific number of nucleotides) because SNPs were not evenly distributed over the whole genome. The qqman package [30] in R was used to generate Manhattan plots showing the proportion of additive variance explained by the 50 SNP windows.
PLINK [31] was used to filter the genomic data before performing a single marker association analysis. The filtering criteria included, MAF > 0.05 and HWE < 0.001. The command, --assoc, was used to retrieve R-squared values (R2) of association between the quantitative traits and genotypes. A generalized test score in ONETOOL [32] was used to perform family-based association analysis. This test allows for multiple covariates and accounts for family structure using a kinship matrix. Covariates were incorporated in the linear model to account for fixed effects (harvest group and hatch-year) and population structure. The first two principal components (PCs) were included in the model to account for population structure. P-values with Bonferroni correction were used to account for multiple testing. Manhattan plots showing single SNP markers associated with variations in intramuscular fat and moisture content were generated using qqman package [30].
To annotate the SNP-harboring genes, a bed file containing SNPs associated with the quality trait of interest was intersected with the rainbow trout genome gff/gtf file using Bedtools [17, 33]. SNPs were classified as genic or intergenic according to their physical position relative to the body of the gene. Genic SNPs exist in coding DNA sequence (CDS), introns, or untranslated regions (5’UTR and 3’UTR). Intergenic SNPs are defined as SNPs located in the region between genes.
Muscle fat and moisture contents are interrelated attributes that affect the organoleptic quality and nutritional value of muscle foods [19]. In fish, high-fat content may influence fillet processing and reduce the firmness leading to fillet downgrading [1]; moreover, it significantly impacts texture, juiciness, and flavor. In mammals, increased marbling scores are positively related to beef tenderness, accounting for ~ 9% of the shear force variation [34]. The inability to retain moisture during postmortem storage, in both fish and mammals, is associated with a high drip loss and, in turn, reduces the industry profitability by influencing processing yield and palatability [35, 36]. In the pork industry, drip loss results in up to 10% product losses affecting profitability at wholesale and retail levels [35]. Increased knowledge of the genetic basis of muscle quality traits will facilitate to further the commercial breeding in salmonids. GWA studies are powerful tools to identify genetic variants associated with complex traits [19]. However, no GWA studies were previously conducted to dissect the genetic architecture of fillet fat and moisture contents in rainbow trout. The estimated heritability for fat and moisture content was 0.31 ± 0.07 and 0.42 ± 0.07, respectively, suggesting existence of adequate genetic variability in the NCCCWA fish population to allow genetic improvement through selective breeding. A higher rate of genetic gain is obtained when genomic information is used [37].
In this study, we used genomic windows of 50 SNPs of a 50K SNP chip, to perform GWA analyses, in addition to the single-marker analysis approach, to identify genomic regions associated with the traits. Given that the 50 K, SNP chip contains SNPs of potential association with intramuscular fat content [28; 49], all fish used to build the SNP-chip were excluded from the GWA analysis in the present study.
The fish population used for the current GWA analyses had an average muscle fat content of 9.2 ± 1.91 (%) and moisture content of 69.93 ± 1.75 (%). Variations in fat and moisture content are shown in Figure (1). Previous studies reported a significant correlation between changes in fat and moisture content in fish [10, 15]. Consistently, our data showed a significant negative correlation between fat and moisture content (R = -0.88; p-value = 6.3E-262).
All 35,322 SNPs (70.6%) that passed QC were used in the WssGBLUP analysis. A complete list of proportions of additive genetic variance for fat content explained by all genomic windows is provided in Table S1. Of them, a total of 137 genomic sliding windows explaining at least 2% (arbitrary value) of the additive genetic variance for fat content are listed in Table S2. Most of the SNP sliding windows (n = 124; ~91%) were located within 62 protein-coding genes. Genomic loci affecting the additive variance for fat content were clustered in 5 chromosomes (1, 4, 5, 19, and 29) (Fig. 2).
Chromosome 19 harbored the highest number (n = 50) and the most significant peaks affecting fat content (up to 5.51%) (Table S2, Fig. 2). Many of the SNPs were located within the CDS of the SNP-harboring genes (n = 58) as well as their 3’UTR (n = 55). In order to understand the biological significance of the QTL associated with fat content, we annotated the SNP-harboring genes and searched their functions in the literature (described below).
Similarly, a complete list of the proportions of additive genetic variance for moisture content explained by all windows identified in this study is provided in Table S3. A total of 178 genomic sliding windows revealing at least 2% of the additive genetic variance for moisture content are listed in Table S4. Most of the SNP sliding windows (n = 165; ~93%) were located within 86 genes coding for proteins. Genomic loci affecting the additive variance for moisture content were clustered on 5 chromosomes (5, 14, 19, 25, and 29) (Fig. 3). Chromosome 29 harbored the highest number (n = 48), whereas the most significant peaks affecting moisture content (up to 4.46%) were identified on chromosome 19 (Table S4, Fig. 3). Many of the SNPs were located within CDS of the SNP-harboring genes (n = 68) as well as their 3’UTR (n = 72).
As shown above, a negative linear relationship has been established between fat and moisture content in this selectively bred rainbow trout population (YC 2010 and YC 2012), suggesting a common mechanism underlying the genetic variation in the two traits. This negative correlation was consistent with other studies in fish and mammals [6, 7, 10, 15]. The current WssGBLUP identified common SNPs affecting the additive genetic variance for fat and moisture content on chromosomes 19 and 29 (Tables S2 & S4). The majority of the common SNPs (n = 47) were located on chromosome 19. Thirty-two SNPs, out of 47, involved in lipid metabolism were identified in 16 protein-coding genes on chromosome 19 (Table 1). Briefly, cathepsin B had a single 3’UTR SNP. Cathepsin B regulates very-low-density lipoprotein (VLDL) secretion and free fatty acid uptake in response to oleic acid exposure in mice [38]. Thioredoxin-related transmembrane protein 1-like (TMX1) had three SNPs. Loss of TMX increases lipid peroxidation in TMX−/− mice, which, in turn, enhances oxidative stress [39]. Guanine nucleotide-binding protein GI/GS/GO gamma-2 subunit (GNG2) had a single 3’UTR SNP. GNG2 expression is positively correlated with adipocyte size [40]. SNPs in genes encoding beta-taxilin and Alpha-L-fucosidase 2 (FUCA2) were covering windows explaining the highest proportion of the additive genetic variation for fat and moisture content. Adipose tissue of obesity susceptible and resistant rats differentially expressed beta-taxilin under a high-fat diet [41]. FUCA2 is a glycolipid processing enzyme [42]. Two SNPs in F-box only protein 30 (FBXO30) and the microtubule-binding protein ensconsin were ranked next to beta-taxilin and FUCA2. An SNP in FBXO30 was located in a genomic region explaining 4.95% of the additive genetic variance for polyunsaturated fatty acids in cattle [43]. Knock-down of microtubule-binding or -associated proteins led to changes in fat accumulation during adipogenesis [44]. Dihydropyrimidinase-related protein 5-like (CRMP5) had a single synonymous SNP. CRMP5 has GO terms belong to lipid metabolic processes [45]. Five SNPs were identified in a gene encoding trifunctional enzyme subunit alpha, mitochondrial (HADHA). This gene is involved in fatty acid beta-oxidation [45]. mRNA decay activator protein ZFP36L1 had three SNPs in the 3’UTR. Knockdown of mammalian ZFP36L1 led to downregulation of ERK activation and inhibition of adipogenesis [46]. A single 3’UTR SNP was identified in ELM2 and SANT domain-containing protein 1 (ELMSAN1). Epigenome-wide association analysis showed DNA methylation changes in ELMSAN1 were associated with body mass index (a key measure of adiposity) [47]. Prostaglandin reductase 2 (PTGR2) had two nonsynonymous SNPs. This enzyme catalyzes reduction of the conjugated α,β-unsaturated double bond of 15-keto-PGE2 in an NADPH-dependent manner, which is a critical step in inhibition of PPARγ-mediated adipocyte differentiation [48]. Spectrin beta chain, erythrocytic (SPTB) gene had two SNPs. The SPTB interacts with phospholipids in natural membranes and has a role in controlling the fluidity of the inner lipid leaflet of the cell membrane [49].
Chromosome 29 had 14 SNPs in genomic windows explaining at least 2% of the additive genetic variance for intramuscular fat and moisture content (Tables S2 & S4). Of them, seven SNPs were involved in lipid metabolism (Table 1). A single SNP was identified in a gene encoding short-chain specific acyl-CoA dehydrogenase, mitochondrial (ACADS). This enzyme has a role in fatty acid beta-oxidation [50]. An intronic SNP was identified in a gene coding for arrestin domain-containing protein. The latter has GO terms belonging to fat pad and skin development and regulates the body mass [51]. Myocyte enhancer factor 2c (MEF2C) had the highest number of SNPs (n = 5) on chromosome 29. MEF2C is a transcription factor involved in skeletal muscle differentiation; however, it has been reported as a constituent of a mechanism that programs gene expression involved in development of brown adipocytes [52]. MEF2A and MEF2D isoforms exhibited in vivo differential expression in mammalian striated muscle and white adipose tissue of insulin-deficient diabetic mice [53]. To our knowledge, the role of MEF2C in white adipose tissue remains uncertain.
In addition, twelve SNPs in genes involved in transmembrane transport and cytoskeleton remodeling were identified in common QTL affecting additive variance for fat and moisture content (Table 2). The majority of these SNPs were identified on chromosome 19 (n = 11). Three synonymous SNPs were identified in a gene encoding intersectin-2 (ITSN2). This protein is necessary for the clathrin-mediated endocytosis and actin cytoskeleton remodeling [54]. Six SNPs were identified in 3 genes involved in vesicle-mediated transport (i.e., exocytosis); dnaJ homolog subfamily C member 5B, visinin-like protein 1, and syntaxin-binding protein 5. The actin cytoskeleton remodeling controls each step of exocytosis [55]. Three SNPs were identified in microtubule-associated protein RP/EB family member 3 (MAPRE3) and centrin-3. MAPRE3 and centrin-3 control the dynamics of the microtubule cytoskeleton [45, 56].
The actin cytoskeleton interacts with the cell membrane to control water transport [57]. In the current study, thirty-six variants in genes (n = 14) involved in cytoskeleton remodeling were identified, affecting the additive variance for moisture content in rainbow trout (Table 3). Briefly, bone morphogenetic protein receptor type-2 (BMPR2) had a single synonymous SNP. BMPR2 is known to interact with the cytoskeleton, and BMPR2 mutant mice exhibited cytoskeletal defects [58]. A single SNP was identified in a gene encoding muscle associated receptor tyrosine kinase (MUSK). Activation of MUSK in myotubes regulates the reorganization of the actin cytoskeleton [59]. Two SNPs were identified in THAP domain containing 1 (THAP1), which has a role in regulation of the mitotic cell cycle [60]. The gene encoding asparaginyl-tRNA synthetase (NARS) had 3 SNPs in windows explaining the highest additive variance (up to 3.46%; Table 3). Mutations in NARS leads to cell cycle arrest in the S phase [61]. The actin cytoskeleton undergoes dramatic changes during the cell cycle [62]. Ten SNPs were identified in three genes coding for cyclin-I (CCNI), cyclin-G1 (CCNG1), and cyclin-G2 (CCNG2). Cyclins function as regulators of the cell cycle and actin cytoskeleton dynamics [63]. The serine/threonine-protein, phosphatase 2A (PP2A), had a 3’UTR SNP. This phosphatase is associated with microtubule stabilization, where it binds and dephosphorylates the microtubule-associated proteins [64]. Annexin A6 (ANXA6) had two synonymous SNPs. ANXA6 contributes to membrane and cytoskeleton organization in a Ca2+-dependent manner [65]. Tubulin beta-4B chain (TUBB4B) had four synonymous SNPs within 1Kb of chromosome 25. TUBB4B is a critical component of microtubules [45]. Five SNPs, clustered in ~ 2Kb, were identified in a gene coding for mid1-interacting protein 1 (MID1IP1) [66]. This protein enhances fatty acid biosynthesis [66] and stabilizes microtubule organization [45]. Two SNPs were identified in a gene encoding tubulin-specific chaperone A (TBCE). TBCE is a tubulin-folding protein required for proper microtubule cytoskeleton organization [67]. Additionally, mutations in TBCE drive muscle atrophy [67]. Proteinase-activated receptor 1 (PAR1) and PAR2 had four SNPs. PAR-mediated, RhoA activation is vital for cytoskeletal reorganization [68].
A strong positive correlation between moisture and protein content has been established in different species. A simultaneous decline in protein and moisture content was previously reported in mammals [7]. Moisture content in rainbow trout exhibited a bidirectional relationship with protein content depending on physiological/metabolic status. For example, a negative correlation between moisture and protein content were previously reported under muscle catabolic conditions associated with full sexual maturation (R2 = 0.994, p < 0.01) [8]; whereas, a positive correlation was reported in female trout, on a high plane of nutrition, that were approaching spawning [10]. This was explained by selective mobilization of either protein during spawning or fat before spawning; in either case, the depleted macromolecule was replaced by water. It is noteworthy that protein content variation of the current study was not statistically significant between the 4 high-ranked families versus 4 low-ranked families (data not shown). The current WssGBLUP analysis indicated that thirteen SNPs in genes involved in protein degradation were involved in the additive genetic variance of moisture content (Table 4). Briefly, E3 ubiquitin-protein ligase RNF170 is an E3 ubiquitin-protein ligase that plays an essential role in the ubiquitination and degradation of inositol 1,4,5-trisphosphate receptor type 1 (ITPR1). The latter controls the calcium release from the endoplasmic reticulum [69], which affects the muscle protein content in rainbow trout [20] and has a profound effect on the regulation of cytoskeleton [70]. Cystatin-1, which possesses a peptidase inhibitor activity, had a single 5’UTR SNP. Thioredoxin-like 1 (TXNL1) had two synonymous SNPs. The knockdown of TXNL1 moderately stabilizes the ubiquitin-protein conjugates suggesting a connection between protein reduction and proteolysis [71]. Pre-mRNA-processing factor 19 (PRPF19) and ubiquitin-conjugating enzyme E2 D2 (UBE2D2) had four SNPs. These ligases catalyze polyubiquitin chain assembly and play a role in proteasomal protein degradation [72, 73]. Nuclear factor NF-kappa-B p105 subunit (NFKB1) had two 3’UTR SNPs. NFKB1 is involved in the negative regulation of cellular protein metabolic process [74] and apoptotic process [75].
In addition to the ubiquitin-protein ligases, SNPs in three genes involved in lysosomal/phagosomal pathways were identified. Ras-related protein rab7 (RAB7A) harbored a 3’UTR SNP. RAB7A is a major regulator of endo-lysosomal maturation/ trafficking and protein targeting to lysosome inducing autophagosome formation [76]. Thus, RAB7A positively regulates the protein catabolic process [77]. V-type proton ATPase subunit B (ATP6V1B2) had a 3’UTR SNP. V-ATPase is responsible for acidifying the intracellular compartments, including lysosomes [78]. The gene encoding the β chain of the adaptor protein-3 (AP-3) complex had a single synonymous SNP. Deletion in AP3B1 perturbs assembly of AP-3 complex and, in turn, trafficking of transmembrane lysosomal proteins [79].
Taken together, most of the common genomic loci affecting the highest proportion of the additive variance were involved in lipid metabolism, suggesting a common mechanism underlying intramuscular fat and moisture content and, partially, explaining the strong negative correlation between the fat and moisture content in this selectively bred rainbow trout population. Unique loci affecting moisture content were primarily involved in cytoskeleton regulations and protein turnover. Inhibition protease activity, such as calpains, reduced degradation of proteins responsible for cell membrane-cytoskeleton attachments and postmortem drip channel formation in muscle. The presence of calcium enhances proteolysis by µ-Calpain, of myofibrillar and other cytoskeletal proteins during postmortem storage [36].
To identify single SNP markers associated with variation in fat and moisture content, we analyzed SNPs that passed QC filtration (n = 29,451) using a generalized score test; this test incorporates multiple covariates in the analysis and accounts for family structure using a kinship matrix [32]. In this study, 8 and 24 significant SNPs, surpassing the genome-wide significance level, had a potential impact on the fat and moisture content (Bonferroni-corrected p < 1.69E-06; Figs. 4 & 5 and Tables S5 & S6), respectively.
SNPs associated with the fat content were mainly located on chromosome 5 (n = 7) and SNP-harboring genes have roles in lipid metabolism (Table 5). The list includes 78 kDa glucose-regulated protein (GRP78), spindle and kinetochore associated complex subunit 1 (SKA1), apelin receptor B (APLNR-B), desmoplakin, podocan, and calcium-binding mitochondrial carrier protein SCaMC-1 (SLC25A24). Briefly, two missense mutations were identified in genes coding for GRP78 and SKA1. GRP78 is essential for adipocyte differentiation and a balanced secretion of adipokines. Deletion of GRP78 causes lipoatrophy in mice observed as a dramatic reduction in gonadal and subcutaneous adipose tissue [80]. SKA1 was downregulated in adipose tissues between samples from obese and healthy control children and has been suggested as a candidate biomarker for childhood obesity [81]. Three synonymous mutations were identified in genes encoding APLNR, desmoplakin, and podocan. APLNR knockout mice demonstrated excess fatty acid accumulation in skeletal muscle [82]. Abnormalities in desmoplakin have been associated with changes in lipid metabolism [83]. Podocan belongs to the small leucine-rich proteoglycans (SLRPs) that bind to low-density lipoprotein receptor-related protein (LRP-1) [84]. A 3’UTR SNP was identified in a gene coding for SLC25A24. Mice fed a high-fat diet exhibited increased expression level of SLC25A24; whereas, adipocyte differentiation was suppressed in Slc25a24- knockout [85].
Puromycin-sensitive aminopeptidase (NPEPPS), on chromosome 17, had a nonsynonymous SNP explaining the highest variability in fat content (R2 = 3.2%) (Table 5). NPEPPS impacts different physiological processes, including protein turnover and cell cycle regulation. NPEPPS was upregulated in mitten crabs fed with a linseed oil rich in linoleic acid [86]. However, the effect NPEPPS on lipid metabolism in fish needs further investigations.
SNPs associated with moisture content (n = 24) were associated with protein turnover, calcium metabolism, and cytoskeleton integrity (Table S5). Most of these SNPs (n = 11; ~46%) were located on chromosome 17. An SNP in a gene coding for acylphosphatase 2 ranked at the top of the list (R2 = 7.4%) (Table 5); however, its physiological role is not clear. Eight SNPs associated with moisture content were identified in five genes engaged in protein metabolism (Table 5). These genes are NPEPPS, eukaryotic initiation factor 4A-III (EIF4A3), eukaryotic translation initiation factor 4B (EIF4B), ribosome binding protein 1 (RRPB1), and F-box only protein 46 (FBXO46). Briefly, the aminopeptidase, NPEPPS, was associated with variation in fat and moisture content, suggesting a correlation between moisture and fat content. Five SNPs were identified in two genes encoding EIF4A3 and EIF4B, suggesting a role for the protein translation machinery in determining variation in moisture content. RRPB1 is an ER integral membrane protein implicated in polysome assembly and, therefore, protein synthesis [87]. RRPB1 has been suggested as essential in regulation of UPR signaling molecules and autophagy [88]. Finally, the F-box family SCF-E3 ubiquitin ligase, FBXO46, had a single 3’UTR SNP.
Expression of genes involved in the actin cytoskeleton and cytoskeletal organization was positively correlated with drip loss [89]. In this study, a total of 10 SNPs associated with moisture content were identified in seven genes engaged in cytoskeleton regulation (Table 5). These genes are encoding serum response factor (SRF), kinesin-1 heavy chain (KIF5B), inositol-trisphosphate 3-kinase C (IP3KC), supervillain (SVIL), calcineurin subunit B type 1 (PPP3R1), eIF4A and EIF4B. Briefly, SRF is a master regulator of the actin cytoskeleton [90]. Mutations in KIF5A caused cytoskeletal defects in humans [91]. Two, 3’UTR SNPs were identified in IP3KC and PPP3R1. IP3K and calcium/calcineurin signaling play critical roles in maintaining Ca2+ homeostasis that has a profound effect on the cytoskeleton [70, 92, 93]. Supervillin (SVIL) is one of the first components of the costameric membrane skeleton to assemble during muscle formation. It establishes a high-affinity connection between the membranes and actin cytoskeleton [94]. Translation initiation factors, including eIF4A and EIF4B, associate with the actin cytoskeleton, which affects protein synthesis [95].
In our previous work, we profiled transcriptome expression of fish families (YC 2010) showing contrasting phenotypes in fat content, which revealed only 17 differentially expressed transcripts associated with fat content [14]. About 90% of the genetic variation among individuals comes from SNPs [96], and therefore, identifying SNP markers associated with complex traits is most suitable for genetic evaluation in selection programs. Two previous GWA studies identified a few SNPs responsible for the additive variance for fat content in Atlantic salmon and Common Carp [1, 19]. The current GWA analysis identified a total of 137 SNPs in windows explaining at least 2% of the additive genetic variance for fat content, suggesting a better characterization of the genetic basis underlying variation in fat content. The discrepancies among the different GWA studies might be due to; 1) usage of different algorithms in the GWA studies, 2) variation in population size, 3) substantial difference in the capacity of the SNP arrays, 4) polygenic nature of intramuscular fat content, 5) different thresholds in each study including sliding window size [13].
Compared to our WssGBLUP analysis, the single marker GWA analysis revealed a smaller number of SNP markers associated with variation in intramuscular fat and moisture content. In addition, these two GWA approaches revealed different significant peaks associated with traits of interest. Aguilar et al. (2019) showed that the highest peak based on the p-value was not the same based on the proportion of variance explained, and this is because the latter depends on allele frequency, i.e., high effect but low frequency decreases the variance explained[97]. This result is consistent with other studied traits, such as fillet firmness, protein content [20], and bodyweight gain [98] in rainbow trout. The potential factors associated with observed heterogeneity between the two approaches are different algorithms, thresholds, and windows size used in each approach. The WssGBLUB was more effective than the single marker GWA in examining the genetic architecture of studied traits and identifying common QTL between traits. This method has proven to be optimal for breeding populations given the data structure: phenotyped individuals may not have genotypes, and there is a long history of pedigree recording [97]. Common QTL identified in this study may explain the high negative correlation between fat and moisture content. The recombinational progression of QTL and nearby markers determines the information content of haplotypes [99]. However, SNP-harboring genes identified by the two approaches had similar biological functions and were involved in lipid metabolism, protein turnover, and cytoskeletal remodeling. Routine use of single-SNP and multi-makers for GWA analysis was previously recommended to take advantage of the complete information content of the genotypes [99].
The current GWA analyses identified novel genomic regions associated with additive genetic variance for fat and moisture content in rainbow trout. These genomic loci code for proteins involved in lipid metabolism, actin cytoskeleton remodeling, and protein synthesis/degradation. Compared to a previous GWA work in Atlantic salmon, this work reveals significant QTL affecting fat content, which appears to be a polygenic trait. The top common windows affecting additive genetic variance for fat and moisture content are mainly on chromosome 19. These findings provide a genetic basis for description of the molecular mechanisms underlying fat and moisture content in teleost fish. In addition, this work provides putative markers that could be prioritized when estimating genomic breeding values for fat and moisture content, aiming an increased accuracy.
Bacterial Cold Water Disease
Gene ontology
Genome-wide association
Hardy–Weinberg equilibrium
Minor allele frequency
USDA National Center of Cool and Cold Water Aquaculture
Quality control
Quantitative trait loci
Single nucleotide polymorphism
Untranslated region
Weighted single-step GBLUP
Year class
Ethics approval and consent to participate
Institutional Animal Care and Use Committee of the United States Department of Agriculture, National Center for Cool and Cold Water Aquaculture (Leetown, WV) specifically reviewed and approved all husbandry practices used in this study (IACUC protocol #056).
Consent for publication
Not applicable.
Competing interests
The authors declare that the research was conducted in the absence of any commercial or financial relationships that could be construed as a potential conflict of interest.
Funding
This study was supported by a competitive grant No. 2014-67015-21602 from the United States Department of Agriculture, National Institute of Food and Agriculture (MS), and by the USDA, Agricultural Research Service CRIS Project 1930-31000-010 “Utilizing Genetics and Physiology for Enhancing Cool and Cold Water Aquaculture Production”. The content is solely the responsibility of the authors and does not necessarily represent the official views of any of the funding agents.
Authors' Contributions
MS, TL, and BK conceived and designed the experiments. RA-T, MS, TL, and BK performed the experiments. RA-T, AA, DL, BK, and MS analyzed the data. AA and MS wrote the manuscript.
Acknowledgments
The authors acknowledge J. Everson, M. Hostuttler, K. Jenkins, J. Kretzer, J. McGowan, K. Melody, T. Moreland, and D. Payne for technical assistance. The use of trade, firm, or corporation names in this publication is for the information and convenience of the reader. Such use does not constitute an official endorsement or approval by the USDA or the ARS of any product or service to the exclusion of others that may be suitable. USDA is an equal opportunity provider and employer.
Availability of data and materials
All datasets generated for this study are included in the manuscript and/or the Additional Files. The genotypes (ped and .map files) and phenotypes are available in our previous publication [20].
Due to technical limitations, Tables 1-5 are provided in the Supplementary Files sections.
CAPTIONS:
Table 1. SNP markers in genomic sliding windows explaining at least 2% of the genetic variance for fat and moisture content and involved in lipid metabolism. A color gradient on the left indicates differences in additive genetic variance explained by windows containing the representative SNP marker (green is the highest and red is the lowest). SNPs are sorted according to their chromosome positions.
Table 2. SNP markers in genomic sliding windows explaining at least 2% of the genetic variance for fat and moisture content and involvement in transmembrane transport and cytoskeleton regulation. A color gradient on the left indicates differences in additive genetic variance explained by windows containing the representative SNP marker (green is the highest and red is the lowest). SNPs are sorted according to their chromosome positions.
Table 3. SNP markers in genomic sliding windows explaining at least 2% of the genetic variance for moisture content and involved in cell cycle and cytoskeleton regulation. A color gradient on the left indicates differences in additive genetic variance explained by windows containing the representative SNP marker (green is the highest and red is the lowest). SNPs are sorted according to their chromosome positions.
Table 4. SNP markers in genomic sliding windows explaining at least 2% of the genetic variance for moisture content and involved in proteolytic activities. A color gradient on the left indicates differences in additive genetic variance explained by windows containing the representative SNP marker (green is the highest and red is the lowest). SNPs are sorted according to their chromosome positions.
Table 5. SNP markers significantly associated with variability in fat and moisture content using family-based association analysis. A color gradient on the left shows the phenotypic variation explained by each SNP marker (green is the highest and red is the lowest). SNPs associated with the phenotypes are sorted according to their chromosome positions.