Development of the lncRNA signature and lncRNA-based nomogram
A total of 415 patients were enrolled in this study, including 377 patients from the TCGA dataset and 38 patients from GEO datasets (24 patients from GSE115019 and 14 patients from GSE101728). The baseline clinical information of these patients is shown in Table S1.
Differential expression analysis was used to identify lncRNAs that were differentially expressed between tumor and normal tissues. There were 3046, 2613 and 113 differentially expressed lncRNAs in the TCGA (Fig. 1A), GSE101728 (Fig. 1B) and GSE115019 (Fig. 1C) datasets, respectively. The 10 differentially expressed lncRNAs shared by all three datasets were selected for construction of the lncRNA signature (Fig. 1D). Cox regression analysis of these 10 lncRNAs in the discovery set identified the most significant prognostic lncRNAs (AL161668.5 and DDX11-AS1), which were used to construct the signature (Fig. 2A). DDX11-AS1 was more highly expressed in tumor than in normal tissues, whereas AL161668.5 was more highly expressed in normal tissues. In addition, both DDX11-AS1 and AL161668.5 were significantly associated with overall survival (OS) (HR: 1.81, 95%CI: 1.29-2.54, P < 0.01; HR: 0.97, 95%CI: 0.94-0.99, P = 0.04, respectively). Subsequently, the risk score of each patient, based on the expression of these lncRNAs, was calculated using R software (Fig. 2B). Using the median risk score as the cutoff, HCC patients were divided into high-risk and low-risk groups (Fig. 2C). Low-risk patients had longer OS than high-risk patients (HR: 0.76, 95%CI: 0.63-0.91; P < 0.01, Fig. 2D).
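The risk-score and median-split steps can be illustrated with a minimal Python sketch. It assumes the score is a linear combination of lncRNA expression weighted by Cox coefficients. The fitted coefficients are not given in the text, so the sketch uses ln(HR) of the reported hazard ratios as illustrative stand-ins, and the four patients and their expression values are hypothetical.

```python
import math
from statistics import median

# ASSUMPTION: illustrative coefficients taken as ln(HR) of the reported hazard
# ratios; the coefficients actually fitted for the signature are not reported.
COEFFICIENTS = {
    "DDX11-AS1": math.log(1.81),
    "AL161668.5": math.log(0.97),
}


def risk_score(expression):
    """Linear predictor: sum of coefficient * expression over the signature lncRNAs."""
    return sum(COEFFICIENTS[gene] * expression[gene] for gene in COEFFICIENTS)


def split_by_median(scores):
    """Label a patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical expression values for four made-up patients.
patients = {
    "P1": {"DDX11-AS1": 3.0, "AL161668.5": 10.0},
    "P2": {"DDX11-AS1": 0.5, "AL161668.5": 40.0},
    "P3": {"DDX11-AS1": 2.0, "AL161668.5": 5.0},
    "P4": {"DDX11-AS1": 1.0, "AL161668.5": 30.0},
}

scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = split_by_median(scores)
```

In this toy cohort, patients with high DDX11-AS1 and low AL161668.5 expression receive the higher scores, matching the direction of the reported hazard ratios.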
A nomogram integrating our lncRNA signature with other clinical parameters, such as age, gender, grade and stage, was constructed in the discovery set (Fig. 3A). The C-index of the nomogram was 0.71. Table 1 shows that risk score and stage were significantly associated with OS in univariate Cox analysis (HR: 1.32, 95%CI: 1.10-1.60, P < 0.01; HR: 1.69, 95%CI: 1.33-2.14, P < 0.01, respectively). Additionally, risk score, age and stage were significantly correlated with the prognosis of HCC in the multivariate Cox regression model (HR: 1.30, 95%CI: 1.07-1.59, P < 0.01; HR: 1.02, 95%CI: 1.00-1.04, P = 0.04; HR: 1.76, 95%CI: 1.37-2.26, P < 0.01, respectively; Figure S1 A).
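For readers unfamiliar with the C-index, the following is a minimal Python sketch of Harrell's concordance index. It is not the routine used in this study: the function name and the toy data are hypothetical, and tied survival times are simply treated as non-comparable.

```python
def concordance_index(times, events, risks):
    """Harrell's C: share of comparable pairs in which the patient with the
    higher predicted risk has the shorter observed survival time."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's event or censoring time.
            if events[i] and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical toy data: follow-up time, event indicator (1 = death), predicted risk.
times = [5, 10, 15, 20]
events = [1, 1, 0, 1]
risks = [0.9, 0.4, 0.6, 0.1]
c_index = concordance_index(times, events, risks)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering of patients by risk.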
Validation of the lncRNA-based nomogram in test set
To confirm the prognostic value of this lncRNA-based nomogram, we also performed Cox analysis in the test set. HCC patients in the low-risk group again had longer survival (HR: 0.54, 95%CI: 0.30-0.96; P = 0.03, Figure S1 C). Similarly, risk score and stage were significantly associated with OS in univariate Cox analysis in the test set (HR: 1.98, 95%CI: 1.20-3.27, P = 0.01; HR: 1.56, 95%CI: 1.12-2.16, P = 0.01, respectively; Table 1). In addition, risk score and stage were also significantly correlated with the prognosis of HCC in the multivariate Cox regression model (HR: 1.68, 95%CI: 1.00-2.81, P = 0.05; HR: 1.53, 95%CI: 1.08-2.17, P = 0.02, respectively; Figure S1 B, Table 1). Furthermore, calibration plots showed that the nomogram also performed well in the test set (Fig. 3B, C). The C-index of the nomogram was 0.73.
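As a rough consistency check on the reported estimates, an approximate P value can be recovered from a hazard ratio and its 95% confidence interval. The Python sketch below assumes a Wald-type interval that is symmetric on the log scale; the function name is hypothetical, and rounding of the published values limits the precision of the result.

```python
import math


def wald_p_from_ci(hr, lower, upper, z_crit=1.959964):
    """Approximate two-sided Wald p-value implied by an HR and its 95% CI.

    ASSUMPTION: the interval is exp(log(HR) +/- z_crit * SE).
    """
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    return math.erfc(abs(z) / math.sqrt(2.0))


# Multivariate test-set estimates quoted above for risk score and stage.
p_risk = wald_p_from_ci(1.68, 1.00, 2.81)
p_stage = wald_p_from_ci(1.53, 1.08, 2.17)
```

Under these assumptions, both recovered values fall close to the reported P values of 0.05 and 0.02.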
Construction of a ceRNA network and identification of associated biological signaling pathways
The mRNAs and miRNAs targeted by the lncRNAs were predicted using the circlncRNAnet, miRcode and TargetScan databases. Only DDX11-AS1 had detailed information. There were 29 mRNAs and 5 miRNAs correlated with DDX11-AS1, which were selected for constructing the ceRNA network (Fig. 4A).
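The structure of such a network can be sketched in Python as an undirected adjacency map with lncRNA-miRNA and miRNA-mRNA edges. The node names other than DDX11-AS1 are placeholders, and the round-robin assignment of mRNAs to miRNAs is purely illustrative; the actual interactions are those shown in Fig. 4A.

```python
from collections import defaultdict


def build_cerna_network(lncrna, mirna_to_mrnas):
    """Return an undirected adjacency map with lncRNA-miRNA and miRNA-mRNA edges."""
    graph = defaultdict(set)
    for mirna, mrnas in mirna_to_mrnas.items():
        graph[lncrna].add(mirna)
        graph[mirna].add(lncrna)
        for mrna in mrnas:
            graph[mirna].add(mrna)
            graph[mrna].add(mirna)
    return dict(graph)


# HYPOTHETICAL placeholders: 5 miRNAs and 29 mRNAs, matching only the counts
# reported in the text, not the real identifiers or interactions.
mirnas = [f"miRNA_{i}" for i in range(1, 6)]
mrnas = [f"mRNA_{i}" for i in range(1, 30)]

# Round-robin assignment of mRNAs to miRNAs, for illustration only.
targets = {mirna: mrnas[k::5] for k, mirna in enumerate(mirnas)}

network = build_cerna_network("DDX11-AS1", targets)
```

An adjacency map of this kind can be exported as an edge list for visualization in a network tool.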
GSEA was performed to identify the biological signaling pathways associated with the lncRNAs. We found that the lncRNAs were highly associated with 11 KEGG pathways (Fig. 4B), including metabolic pathways, retinol metabolism, chemical carcinogenesis, metabolism of xenobiotics by cytochrome P450, drug metabolism, bile secretion, cell cycle, carbon metabolism, cellular senescence, microRNAs in cancer and oocyte meiosis. The mRNAs in the ceRNA network were also enriched in these pathways. In the GO functional analyses, they were mainly enriched in 51 GO terms, including 12 biological process terms (Fig. 4C), 18 molecular function terms (Fig. 4D) and 21 cellular component terms (Fig. 4E). The top three biological process, molecular function and cellular component terms were biological regulation, metabolic process and response to stimulus; protein binding, ion binding and nucleic acid binding; and membrane, nucleus and cytosol, respectively.