- HYPOTHESIS-FREE SCREEN
Data driven hierarchical clustering
Each fat depot could be clearly discriminated by gene expression as they were all awarded separate branches in the dendrogram tree (Figure 1). The gene expression differences between fat depots clearly overwhelm any breed differences within a depot. IMF was separated from the other 4 fat depots. Of the remaining 4 depots, Inter and Omen were most closely related, followed by Kid then SC. Given SC appears the most functionally divergent of the ‘pure’ (i.e. confident assertion of no muscle contamination) fat depot samples, we elected to compare all depots to SC in turn.
SC versus all other fat depots
We plotted all fat depots minus SC and manually explored the extreme genes (Figure 2). It is clear from examining the plots that 2 consistent outlier genes by expression profile in SC are HOXA10 and DLK1. Their log2 normalised expression of these two genes in each fat depot is compared in Table 1. Other genes of interest for their extreme expression in at least one depot are TDH, TMEFF2 and CLDN10.
We next focussed on the particular IMF versus SC comparison in more detail.
IMF versus SC
In the IMF versus SC comparison the most extreme 1% (145 out of 14, 476) downregulated genes in IMF enriched for ‘humoral immune response’ (Hypergeometric statistic, FDR q-value = 0.00017) based on genes including CFB, CXCL3, CD163, CD36 and CD163L1). We used a modified DE metric called Phenotypic Impact Factor (PIF) which is a product of DE and average abundance across the treatments of interest. The extreme 20 most downregulated genes are shown in Table 2.
From Table 2 it can be seen that various aspects of fat metabolism (SCD, DGAT2, FASN, ACSS2, AGPAT2, CIDEA, G0S2), extracellular matrix biology (TIMP4, SPARC, CCDC80) and some inflammation related genes (CXCL3, CD163) are prominently featured in those transcripts whose expression is lower in IMF than SC. In terms of formal enrichment, the 5% downregulated genes (PIF) in IMF enriched for ‘lipid metabolic process’ (hypergeometric P-value = 2.55 e-15) and ‘defense response’ (hypergeometric P-value 7.08 e-15), but these were not as extreme as the generic enrichment ‘response to chemical’ (hypergeometric P-value = 9.08 e-17).
On the other hand, the upregulated 1% in IMF enriched very significantly for ‘muscle system process’ (FDR q-value = 3.54 e-46) based on 44 genes generally regarded as either muscle-specific (e.g. myoglobin) or highly expressed in muscle cells (such as numerous specialised myosin light and heavy chain isoforms as illustrated by MYL2 and MYH2). A more lenient 5% extreme upregulated PIF (or 724 genes) marginally lessens the impact of the muscle specific detection (FDR q-value = 3.44 e-44). It is clear from these functional enrichments and manually exploring the MA plot that the dissected IMF must have contained a very small amount of LD muscle contamination not present in the other fat depots. Genes strongly expressed by the muscle cells in the IMF sample appear to make up the triangle shaped protuberance in the IMF versus SC MA plot (as highlighted by the red circle on Figure 2A). We have elected not to tabulate the extreme upregulated genes in IMF versus SC, as this would return a list of genes dominated by mRNA encoding muscle structural and muscle metabolic proteins.
To better determine the cellular origin of many of the upregulated genes in the dissected IMF we first computed Differential Expression (DE) between pure LD (nearly all muscle, small amount of IMF) and SC (all fat, no muscle contamination). We then identified genes which had nominally >4 fold higher expression in LD than SC as likely being transcribed predominantly from muscle cells, not fat cells. This yielded a list of 171 genes, which we have then highlighted on the IMF versus SC plot (Figure 3 panel A; Additional File 1). The highlighted genes all clearly fall in the triangular shaped protuberance distorting the overall IMF versus SC MA distribution.
This analysis reinforces the conclusion that those outlier genes in the IMF versus SC are almost certainly driven by the presence of some LD muscle in the ‘pure’ IMF sample and therefore need to be interpreted cautiously. The impact of the ‘contaminating’ muscle derived RNA on the expression of most of the remaining genes is harder to foresee, as there will be a continuum of shared expression between the myocytes and adipocytes, depending on the particular gene being investigated. In this submission, we have provided the normalised mean expression for LD in addition to all the fat depots for the 34,227 probes (Additional File 2) so the interested reader can assess the possible impact of contaminating LD on a case by case basis.
Mitoproteome
In an effort to explore the behaviour of the mitochondria in our understanding of the metabolism of IMF versus SC metabolism we highlighted those mRNA known to encode mitochondrial proteins. Using the downloaded mitochondrial protein database we matched 886 of these in our data, 595 of which were higher in IMF than SC, but only 287 of which were lower (Figure 3 panel B). This is a significant deviation (P = 2.29 e-26) from the null expectation of equilibrium (i.e. 443 above and 443 below). This result could be argued to be consistent with IMF having a higher mitochondrial content and / or mitochondrial activity than SC, but the presence of some high mitochondrial content LD muscle in the IMF depot is presumably influencing the result. Notable among the mitochondrial genes strongly downregulated in IMF compared to SC (despite the probable impact of the contaminating muscle) is PCK2.
Structural differences between marbling adipocytes and the other fat depot adipocytes
To visually account for the effect of the contaminating LD on the IMF gene expression we colour coded the MA plot comparing IMF to SC (Figure 4). Colour coding each gene individually visually highlights genes (such as CH25H) that are most likely higher in IMF than SC because of particularly high expression from the IMF adipocytes per se and not because of the contaminating LD muscle (yellow / orange dots above 0 on the y axis, purple dots below 0 on the y axis). To highlight these genes in a gene list we asked the question, “which mRNA are more highly expressed in IMF than the average of all other fat depots by <1.32 fold (a difference of 0.4 on the log2 scale) and also much more highly expressed in IMF than LD (>2 fold).” This analysis returns a manageable list of 49 genes whose expression is more confidently derived from marbling adipocytes (Additional File 1). Tabulating the top 20 of these ranked on IMF SC PIF yields the gene list in Table 3.
In this adapted list there is clear functional enrichment in IMF for components of the cytoskeletal architecture (CNN1, ACTA2, MYH11, ACTB, ACTRT2, SORBS2, KRT18). Other genes of interest include a) CH25H which encodes an enzyme that catalyses the production of a particular oxysterol metabolite b) the microRNA MIR145 and c) CTPS2 which catalyses the production of CTP, a high energy analog to ATP but whose hydrolysis is coupled to a restricted subset of metabolic reactions including glycerophospholipid synthesis.
In another analysis more directed at the specific IMF versus SC comparison, we asked for those genes >2 fold higher expression in IMF than SC (without considering expression in the other fat depots), and higher expression in IMF than LD (by any amount). This produced a list of 73 genes (Additional File 1) which enriches for extracellular matrix organisation (GFAP, COL28A1, COL2A1, ITGA8, TNC, SNCA, MYH11, MKX, COL4A6 and TNFRSF11B) using the 2 list option in GOrilla (FDR q value = 0.0068). Manual exploration and curation of the gene list highlighted 3 additional functional groups of interest that are relatively upregulated in IMF versus SC: cholesterol metabolism (PMP2, CH25H, CYP4B1), retinoic acid metabolism (STRA6, MEST) and insulin and carbohydrate metabolism (GRB14, NR4A3 and MOXD1). ANGPTL7 encodes a protein that is elevated in the plasma of obese humans. The expression profiles for the genes within these three functional groupings are shown in Table 4.
Moreover, we submitted the normalised mean expression of the 73 genes across the 5 fat depots and LD muscle into Permut Matrix and subjected it to hierarchical cluster analysis (Figure 5). This restricted subset of genes (which we have chosen to be theoretically diagnostic of the IMF depot due to high expression) satisfactorily clusters the tissues in a manner such that IMF is considered the most unique tissue while the other fat depots cluster with the LD muscle. This can be contrasted with the original clustering performed on a randomly selected 10,000 genes whose first branch unsurprisingly separates the LD muscle from all the fat depots. The clustering on rows clusters genes who are co-expressed across tissues and clearly some of the clusters reflect known functional relatedness, with KRT8 with KRT18 reflecting keratin biology and ACTA2 and MYH11 reflecting cytoskeletal biology.
Of those 41 genes we identified as more lowly expressed in IMF than SC (but unlikely to be due to low expression in the contaminating LD) (Additional File 1) the most extreme 10 are shown in Table 5.
- HYPOTHESIS-DRIVEN ANALYSIS
Lipogenesis in adipocytes
To formally connect these mRNA data to traditional biochemical knowledge, we tabulated the expression profiles of those genes encoding rate limiting enzymes and other proteins considered influential in the various lipogenic processes (Table 6). This includes the following biochemical processes: precursor transport into the adipocyte cells (glucose and free FA), aspects of intermediate energy metabolism (glycolysis and pyruvate metabolism), de novo FA synthesis, FA elongation, FA desaturation, FA esterification with glycerol and finally the supply of reducing power equivalents.
*GPAM (alias GPAT1) a quantitatively influential enzyme of esterification according to [14].
**Cytosolic NADP isocitrate dehydrogenase (IDH1) is not registered as expressed in our fat depot mRNA data.
*** While not considered rate-limiting for esterification (and glycerolipid synthesis) we wish to draw attention to the expression profiles for a number of additional genes in these two pathways as some have expression profiles indicating higher activity in the IMF than the SC (that cannot be explained by LD muscle contamination) (Table 7).
We can see that some of these canonical lipogenic pathways show clear, consistent patterns of gene expression based on the key enzymes. For example, de novo FA synthesis (FASN and ACACA 1.63-1.79 fold), FA elongation (ELOVL6 1.61 fold), desaturation (SCD 2.2 fold), supply of reducing power equivalents (G6PD, ME1 and PGD 1.33, 1.43 and 1.51 fold), esterification (GPAM and DGAT2 1.3 to 1.82 fold) and lipolysis (PNPLA2, LIPE, MGLL and PLIN2 1.31, 1.33, 1.39 to 1.61 fold) are rather consistently downregulated in IMF versus SC. On the other hand, there are patterns of both up and downregulation within other pathways. Glycolytic flux and pyruvate metabolism are two such pathways, comprising key genes exhibiting both higher and lower expressed in IMF than SC. The ~ 4 fold upregulation of PFKM and PKM (i.e. muscle specific isoforms of the enzymes) in IMF versus SC glycolytic flux can most likely be attributed to LD contamination.
We examined a subset of core lipogenic processes in more detail at the whole pathway level. The normalised mean expression data for the FA biosynthesis and FA elongation pathways are shown in Additional File 3. We next manually explored the gene lists, detecting a number of particular isoforms as being upregulated in IMF for esterification and glycerolipid synthesis. These have been tabulated (Table 7) and are contrary to the pathway output (based on GPAM, DGAT) outlined in Table 6. The expression of AGPAT9 (esterification) and MBOAT2 (glycerolipid synthesis) are particularly noteworthy, with a DE in excess of 1.4 fold higher in IMF compared to SC that is not attributable to LD contamination.
- qRT-PCR
CH25H was found to be significantly (P < 0.0001; 2 Tailed Mann Whitney U Test) more highly expressed in dissected IMF than SC by ~ 34-fold using the first primer pair (Figure 6) and 38-fold using the second primer pair, yielding a 36 fold average. The direction of change is the same as for the microarray probe but the absolute difference is ~ 10 fold higher.
- Analysis of LD muscle in Wagyu x Hereford crosses versus Piedmontese x Hereford crosses
The microarray based expression profile for CH25H in intact mature postnatal LD muscle is higher in Wagyu x Hereford crosses than Piedmontese x Hereford crosses by 20 months of age, the difference in expression increases with increasing developmental time and by 30 months of age is 2 fold higher (Figure 7). This 2 fold difference very closely approximates the close to 2 fold difference in carcass IMF previously reported (8.8% IMF in Wagyu x Hereford and 5.1% IMF in Piedmontese x Hereford animals). The increasing significance of the observed differences at 20m, 25m and 30m are reflected by P values of 0.478, 0.158 and 0.003 respectively.
- Oxysterol metabolite quantitation
The relationships of the oxysterols to the IMF phenotypes are documented below in Table 8 and Figure 8. The Full SAS output to all 15 phenotypes can be found in Additional File 4.
Despite most of the phenotypes provided (8/15) being non-marbling related, at the P < 0.05 threshold, the only phenotypes significantly associated with the various oxysterols are marbling related phenotypes (the one exception being 7α,25-dihydroxycholesterol and Eye Muscle Area with r = 0.72, P = 0.04). This means the non-IMF fat depot phenotypes did not reach significance with any of the quantitated oxysterols.
We have detected 4 positive correlations and 3 negative correlations. In terms of absolute correlations to IMF phenotypes, 24S-hydroxycholesterol’s relationship to Eye Round IMF is the top performer (r = 0.91 ; P < 0.001) (Additional File 4). No significant relationship to loin IMF was detected for any of the oxysterols although 7α,26-diHCO approached significance (r = 0.67; P = 0.0646). Finally, 25-hydroxycholesterol is detected at much higher levels in cattle plasma than in human plasma, consistent with our prediction that the metabolite is largely derived from IMF and humans are essentially a zero IMF species.