Sequencing data summary
In our study, libraries were constructed from liver tissue samples from ND-fed mice (n=5) and HFD-induced hepatic IR mice (n=5) and subjected to sequencing analysis. High-throughput sequencing produced 516,429,450 and 457,341,558 raw reads from the two groups, respectively. The raw reads were filtered to obtain high-quality clean reads. In total, 506,933,232 and 448,761,362 clean reads were retained for the ND and HFD groups, respectively (Table 1). The Q20 and Q30 of the clean data were subsequently calculated to evaluate the quality of the RNA-sequencing data. The Q20 and Q30 quality scores were higher than 90%, and the sequencing error rate was less than 1%. The data indicated that the RNA-sequencing results had reliable quality. The obtained clean reads were mapped to a mouse reference genome, with the mapping ratio ranging from 96.10% to 97.13% (supplemental file 2).
Identification and classification of lncRNAs in the liver of mice
Based on the mouse reference genome and related databases, 10,495 known lncRNAs were found. After a strict filtering process, 541 novel lncRNAs, which exhibited repeated emergence in four software programs (CPC, PfamScan, CPAT and CNCI), were identified (supplemental file 3). According to the relative chromosomal position of the coding gene, the novel lncRNAs were classified into five broad categories: 5,572 intergenic lncRNAs (50.5%), 4,004 antisense lncRNAs (36.3%), 642 sense exonic overlap lncRNAs (5.8%), 22 sense intronic overlap lncRNAs (0.2%) and 796 bidirectional lncRNAs (7.2%) (Figure 1A). Furthermore, the distributions of these lncRNAs in terms of length and chromosomal location were analyzed in this study. In different length groups, most of the lncRNAs were 201-1,000 nt long (Figure 1B). Chromosomal distribution analysis revealed that chromosomes 1, 2, 5 and Y contained relatively higher amounts of lncRNAs (Figure 1C).
Differential expression analysis of lncRNAs and mRNAs
After high-throughput sequencing, a total of 11,036 lncRNAs were obtained by comparing the ND and HFD groups. The analysis of the lncRNAs DE between the two groups showed that compared with the ND group, the HFD group contained 232 DE lncRNAs, among which 108 were upregulated lncRNAs and 124 were downregulated lncRNAs. Regarding the expression profiles of mRNAs, 40,675 mRNAs were present in both groups. Among all of these mRNAs, compared with the ND group, the HFD group contained 291 DE mRNAs, including 166 upregulated mRNAs and 125 downregulated mRNAs. A volcano plot and heatmap were generated to visualize the DE lncRNAs and DE mRNAs identified by comparing the two groups, as shown in Figure 2 and in supplemental file 4. Moreover, the chromosomal distributions of the DE lncRNAs and mRNAs are shown in Figure 3.
DE lncRNA validation by q-PCR
To validate the reliability of the lncRNA expression data, seven lncRNAs that exhibited relatively abundant expression and significant differential expression in the livers from mice in the two groups were selected and evaluated by q-PCR. These lncRNAs included one upregulated lncRNA (ENSMUST00000200707) and six downregulated lncRNAs (ENSMUST00000107095, ENSMUST00000146928, ENSMUST00000153523, MSTRG.7107.6, MSTRG.19471.1 and MSTRG.19772.3). The results showed that the level of the lncRNA ENSMUST000030200707 was significantly increased, while the levels of the lncRNAs ENSMUST00000107095, ENSMUST00000146928, ENSMUST00000153523, MSTRG.7107.6, MSTRG.19471.1 and MSTRG.19772.3 were obviously decreased in the HFD group compared with the ND group (Figure 4). The q-PCR validation results for these DE lncRNAs were consistent with our bioinformatic analysis based on high-throughput sequencing.
GO and KEGG enrichment analyses of the DE lncRNAs and mRNAs
First, target genes of the DE lncRNAs were predicted by bioinformatic approaches, and the results are presented in supplemental file 5. Subsequently, GO and KEGG enrichment analyses were performed on the predicted target genes of the DE lncRNAs. The top 30 GO terms are shown in Figure 5A, in which “glycerol biosynthetic process from pyruvate” was majorly enriched in the biological process category. GO enrichment also indicated enrichment of other important biological processes, including various compounds and nucleic acid metabolism in the cell, mitochondrial modification, catalytic enzyme activity and molecular function regulation. Furthermore, the top 30 enriched KEGG pathways are presented in Figure 6A, and the “Renin-angiotensin system” was the top enriched pathway. More importantly, pathway enrichment analysis showed enrichment of several pathways related to IR or glycolipid metabolism, such as the citrate cycle (TCA cycle), the PPAR signaling pathway, the AMPK signaling pathway, the MAPK signaling pathway, glycolysis/gluconeogenesis, the glucagon signaling pathway, biosynthesis of unsaturated fatty acids, the PI3K-Akt signaling pathway, insulin resistance, and the insulin signaling pathway.
For the DE mRNAs, the top 30 enriched GO terms and KEGG pathways are shown in Figure 5B and Figure 6B. The GO terms mainly included cellular_component, biological regulation, regulation of cellular process, metabolic process, intracellular organelle and cytoplasmic part. KEGG pathway analysis showed that DE mRNAs were mainly associated with Protein digestion and absorption, Pancreatic secretion, Arachidonic acid metabolism, PPAR signaling pathway and Retinol metabolism.
Interaction and coexpression network analysis
The interactions among the proteins encoded by the DE mRNAs are shown in Figure 7A. The Cxcl2, Serpine1 and Prss2 genes were the important genes that interacted with many of the other DE mRNAs in this network. Moreover, a DE lncRNA-mRNA coexpression network, which consisted of 87 nodes (59 DE lncRNAs and 28 DE mRNAs) and 1,125 edges, was constructed (Figure 7B). Within this RNA network, the lncRNA MSTRG.8007.2 had the maximum number of targets. The Pnlip, Ctrl, Cpb1 and Amy2a3 genes had the maximum number of coexpressed lncRNAs.