3.1 Characteristics of gene expression from GEO
The gene expression datasets GSE93789 and GSE101728 were obtained from the GEO data repository (http://www.ncbi.nlm.nih.gov/geo/). After reannotation, 33591 genes were obtained in GSE93789 (GPL16956) and 46102 genes in GSE101728 (GPL21047). Differentially expressed genes (DEGs) were identified with the limma package using the criteria |log FC| > 1 and adjusted P value < 0.05. GSE93789 contained 4553 DEGs (2735 up-regulated and 1818 down-regulated), and GSE101728 contained 1599 DEGs (657 up-regulated and 942 down-regulated). Details of the two datasets and the number of DEGs obtained from each are given in Table 1. Volcano plots show the significant DEGs (Fig. 2), and heatmaps display the changes in gene expression (Fig. 3).
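The thresholding step can be illustrated with a minimal sketch. It assumes a results table holding one log fold change and one adjusted P value per gene (the kind of table limma's topTable returns); the function name `classify_deg`, the gene names and the numeric values are hypothetical and are not taken from the study.

```python
def classify_deg(log_fc, adj_p, fc_cutoff=1.0, p_cutoff=0.05):
    """Label one gene as 'up', 'down' or 'ns' (not significant).

    A gene counts as a DEG only if it passes BOTH criteria:
    |log FC| > fc_cutoff and adjusted P value < p_cutoff.
    """
    if adj_p < p_cutoff and abs(log_fc) > fc_cutoff:
        return "up" if log_fc > 0 else "down"
    return "ns"


# Toy rows (gene, log FC, adjusted P); illustrative values only.
toy_results = [
    ("geneA", 2.3, 0.001),   # passes both cutoffs, positive change
    ("geneB", -1.7, 0.020),  # passes both cutoffs, negative change
    ("geneC", 0.4, 0.001),   # significant, but fold change too small
    ("geneD", 3.0, 0.200),   # large fold change, but not significant
]

labels = {gene: classify_deg(fc, p) for gene, fc, p in toy_results}
n_up = sum(1 for v in labels.values() if v == "up")
n_down = sum(1 for v in labels.values() if v == "down")
```

Counting the "up" and "down" labels in this way gives the kind of per-dataset totals reported in Table 1.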
3.2 Clinical data of HCC and the intersected DEGs
The clinical data of HCC patients were obtained from the TCGA database. The collected information mainly comprised sex, age, TNM stage, tumor grade, Child–Pugh classification, residual tumor classification and Ishak fibrosis score (Table 2). DEGs were identified with the limma package using the criteria |log FC| > 1 and adjusted P value < 0.05. In total, 148 lncRNAs (68 up-regulated and 80 down-regulated) and 2181 mRNAs (728 up-regulated and 1453 down-regulated) were screened. A Venn diagram shows the DEGs shared between the GEO and TCGA datasets: 289 differentially expressed mRNAs and 10 differentially expressed lncRNAs (Fig. 4 and Table 3).
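The overlap summarized by the Venn diagram amounts to a set intersection, sketched below. The sketch assumes that a gene is retained only if it appears in every GEO DEG list and in the TCGA DEG list; whether direction of change was also required to match is not stated here, so it is not checked. The function name `intersect_degs` and all gene identifiers are hypothetical.

```python
def intersect_degs(geo_deg_lists, tcga_degs):
    """Return the sorted genes present in the TCGA DEG list
    and in every one of the GEO DEG lists."""
    common = set(tcga_degs)
    for deg_list in geo_deg_lists:
        common &= set(deg_list)  # keep only genes seen in this list too
    return sorted(common)


# Toy DEG lists; identifiers are placeholders, not genes from the study.
geo_first = ["g1", "g2", "g3", "g4"]
geo_second = ["g2", "g3", "g5"]
tcga = ["g3", "g2", "g6"]

shared = intersect_degs([geo_first, geo_second], tcga)
```

Running the same function separately on the mRNA lists and on the lncRNA lists would yield the two intersected gene sets.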
3.3 ceRNA network
To determine whether the intersected lncRNAs and mRNAs participate in an mRNA-lncRNA-miRNA (ceRNA) network, we used them to construct such a network. Ultimately, the network comprised 4 lncRNAs, 49 miRNAs and 25 mRNAs (Fig. 5).
3.4 GO, KEGG pathway analysis, and PPI network
To explore the biological roles of the hub genes in the mRNA-lncRNA-miRNA network, we performed GO and KEGG analyses (Fig. 6). The results showed that the enriched biological processes were mainly related to nuclear division, sister chromatid segregation, organelle fission and chromosome segregation, while the enriched molecular functions were concentrated in monooxygenase activity, oxidoreductase activity, iron ion binding and heme binding. KEGG pathway analysis showed that most of the genes were related to the cell cycle, the p53 signaling pathway, oocyte meiosis, complement and coagulation cascades, and progesterone-mediated oocyte maturation. Furthermore, the differentially expressed mRNAs were used to construct a PPI network with STRING, which showed that most genes are associated with the cell cycle.
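GO and KEGG enrichment is commonly assessed with an over-representation test based on the hypergeometric distribution. The sketch below shows that calculation for a single term. It is an assumption that a test of this kind underlies Fig. 6, since the enrichment tool and its settings are not described in this section; the function name `enrichment_p` and the toy counts are hypothetical.

```python
from math import comb


def enrichment_p(n_background, n_term, n_query, n_overlap):
    """One-sided hypergeometric P value for over-representation.

    n_background: genes in the background universe
    n_term:       background genes annotated to the term
    n_query:      genes in the query list (e.g. hub genes)
    n_overlap:    query genes annotated to the term
    Returns P(X >= n_overlap) when n_query genes are drawn at random.
    """
    total = comb(n_background, n_query)
    upper = min(n_term, n_query)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_query - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Toy counts: 10 background genes, 4 annotated to the term,
# 3 query genes of which 2 carry the annotation.
p_value = enrichment_p(10, 4, 3, 2)
```

In practice the P values for all tested terms would also be corrected for multiple testing before terms are reported as enriched.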
3.5 Prognostic value of hub lncRNAs and mRNAs in patients with HCC
To determine whether the hub lncRNAs and mRNAs in the mRNA-lncRNA-miRNA network were related to the prognosis of HCC patients, we analyzed hub gene expression together with the survival data in TCGA. Because some patients lacked the complete information required for survival analysis, 415 cases were included. Among the four hub lncRNAs, GSEC was significantly associated with overall survival (OS) (P < 0.05), and its expression was negatively correlated with the survival time of HCC patients. Among the 25 hub mRNAs, 12 were related to OS: B3GNT5 (P < 0.001), SORBS2 (P = 0.037), ENAH (P = 0.003), SUCO (P = 0.004), ESR1 (P = 0.006), FAT4 (P = 0.047), SLC38A2 (P = 0.012), MASP1 (P = 0.031), GHR (P = 0.007), PARPBP (P = 0.002), RRM2 (P < 0.001) and MMS22L (P < 0.001). High levels of SORBS2, ESR1, FAT4, MASP1 and GHR predicted a good prognosis, whereas the other seven mRNAs (B3GNT5, ENAH, SUCO, SLC38A2, PARPBP, RRM2 and MMS22L) were negatively correlated with OS. The results of the Kaplan-Meier (KM) analysis are displayed in Fig. 7.
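For readers unfamiliar with KM analysis, the product-limit estimator behind such survival curves can be sketched in a few lines. The sketch assumes each patient contributes a follow-up time and an event flag (1 = death observed, 0 = censored); the function name `kaplan_meier` and the toy data are hypothetical, and the grouping of patients by expression level and the test used to compare groups are not shown because they are not specified in this section.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up time per patient
    events: 1 if death was observed at that time, 0 if censored
    Returns [(time, survival_probability)] at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Gather all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        at_risk -= leaving
    return curve


# Toy cohort of five patients (illustrative values only).
toy_times = [1, 2, 3, 4, 5]
toy_events = [1, 0, 1, 1, 0]
toy_curve = kaplan_meier(toy_times, toy_events)
```

Computing one such curve for patients with high expression of a gene and another for those with low expression gives the paired curves typically shown in a KM plot.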