Development and Validation of a Prognostic Signature Based on Autophagy-related Long Noncoding RNA Analysis in Hepatocellular Carcinoma

Background: The present aimed to establish a prognostic signature based on autophagy-related long non-coding RNAs (lncRNAs) analysis to predict the clinical outcome of hepatocellular carcinoma (HCC) patients using a comprehensive analysis of micro array data from The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) databases. Methods: Based on co-expression network analysis, autophagy-related lncRNAs were obtained using edge R package. Univariate and multivariate analyses were used to develop an autophagy-related lncRNAs risk signature and a predictive nomogram by data from the TCGA database. Subgroup analysis and internal validation were performed to conrm the predictive power. Subsequently, the predictive autophagy-related lncRNA prognostic signature and nomogram were externally explored and validated by HCC patients from the ICGC database. Results: According to univariate and multivariate analyses, four autophagy-related lncRNAs (BACE1-AS, SNHG3, MIR210HG, and ZEB1-AS1) with prognostic value were identied in patients with HCC. Then, an autophagy-related lncRNAs prognostic signature was constructed. Based on the prognostic signature, all patients were subdivided into high- and low-risk groups. The autophagy-related lncRNA prognostic signature showed strong predictive power independent of all the clinicopathological features. A predictive nomogram based on the prognostic signature and multiple clinicopathological features was established through a survival analysis with the HCC data in TCGA. Internal validation demonstrated that the prognostic signature and nomogram have a strong power for predicting overall survival in patients with HCC. Whereas external validation further conrmed the robustly predictive power. Subsequently, functional enrichment analysis indicated the high-risk group mainly enriched in autophagy- and cancer-related pathways. Conclusions: The four autophagy-related lncRNAs prognostic signature and the predictive nomogram established in the study might support clinician in individual treatment optimization and clinical decision-making for patients with HCC.


Background
Hepatocellular carcinoma (HCC) is one of the most lethal malignancies globally due to its aggressive biologic behavior, dramatically growing incidence, and high mortality [1,2]. Undoubtedly, early diagnosis and initial surgical intervention are one of the biggest challenges facing most clinicians [3]. Despite diagnosis and multimodal therapies have signi cantly improved, the survival bene t remains to be limited, mainly due to the high heterogeneity [4][5][6]. Hence, there is a requirement to discover reliable predictive and prognostic biomarkers to increase risk prediction ability and guide individualized therapy.
Autophagy is a multi-step lysosomal degradation pathway that promotes nutrient cycling and metabolic adaptation. Those functions are biological processes of maintaining cellular function [7][8][9]. Autophagy has also been extensively proven involved in diverse pathologies including cancer [10]. What is more, the function of autophagy is bilateral. On the one hand, autophagy could provide the necessary circulating metabolic substrates and enzymes to response to various of poor circumstance; on the other hand, inadequacy autophagy also promotes malignant cells rapidly growing, especially in advanced cancer [11,12]. Novel potential target therapies involved in autophagy pathways have also been extensively investigated by many studies [13,14].
Long non-coding RNA (lncRNA), usually more than 200 nucleotides, is a batch of newly-discovered RNA transcripts that are lack of protein-coding ability. [15] An increasing number of lncRNAs were proven to be related to different physiological and physiological progresses such as genetic expression regulation, RNA decay, microRNA regulation and protein folding by regulating transcriptionally or posttranscriptionally biological processes including autophagy [16,17]. Increasing studies revealed that lncRNA inhibited or activated autophagy by regulating autophagy-related genes or pathways [17][18][19].
With the rapid development of RNA-sequencing techniques, the potential for utilizing lncRNA as biomarker to facilitate the detection, treatment, or prognosis of malignancy has been gradually unfolded [20,21]. Hence, the present aimed to establish a prognostic signature based on autophagyrelated lncRNAs analysis and a predictive nomogram to predict the clinical outcome of HCC patients using a comprehensive analysis of micro array data from The Cancer Genome Atlas (TCGA) and International Cancer Genome Consortium (ICGC) databases.

Patient data acquisition
Microarray datasets including the raw RNA-sequencing and related clinical data of HCC patients were downloaded from TCGA, https://portal.gdc.cancer.gov/) and ICGC (https://icgc.org/). 374 HCC patients from the TCGA microarray were used as a training set to develop an autophagy-associated lncRNA prognostic signature, whereas 243 HCC patients from the ICGC database were included as a validation dataset. Patients with a follow-up time of less than 1month were removed for survival analysis. Finally, 343 HCC patients from the TCGA database and 230 HCC patients from the ICGC database were registered in the present study.
Due to all the data was collected directly from public databases, not any protocol was required from the ethical committee.
Autophagy-related lncRNAs screening 232 autophagy-related genes were obtained from the Human Autophagy Database (HADb, http://autophagy.lu/clustering/index.html). Then the pro les of those autophagy-related genes and lncRNAs were freely retrieved from the TCGA and ICGC RNA transcription data, respectively.
The correlations between the expression of the lncRNAs and the corresponding autophagy-associated genes were investigated. LncRNAs with a correlation coe cient |R| > 0.5 and P < 0.050 were considered to be autophagy-related lncRNAs.

Construction of an autophagy-related lncRNA signature
The autophagy-related lncRNAs that were related to autophagy-associated genes in both TCGA and ICGC databases were considered as candidate indicators to construct a prognostic signature. According to Kaplan-Meier (KM) method and univariate Cox proportional hazards model, survival analysis of each autophagy-related lncRNA was done to screen prognostic biomarker with both signi cant P values less than 0.050. Then, multivariate Cox regression analysis was used to assess their contribution as prognostic factors among those nominated autophagy-related lncRNAs. Optimal autophagy-related lncRNA signature was acquired based on the lowest Akaike information criterion (AIC) value.
Subsequently, a prognostic signature was established by the sum of the value of coe cients multiply expression of autophagy-related lncRNAs. Subsequently, each HCC patients obtained a prognostic risk score.
Evaluation and validation of the prognostic signature Based on the median value of their risk score, all the patients were classi ed into high-(high risk score) and low-(low risk score) groups, respectively. K-M survival curve was used to compare the prognosis of the high-and low-risk group patients. Whereas the difference between the two groups was evaluated by a two-sided log-rank test. The receiver-operating characteristic (ROC) curve was performed to evaluate the predictive accuracy of the prognostic signature. Area under the ROC curve (AUC), which is a measure of discrimination, was performed to assess the prognostic accuracy. AUC ranges from 0.5 (no predictive power) to 1 (perfect prediction).
Strati ed survival analysis by clinicopathological characteristics was conducted to evaluate the accuracy of the prognostic signature. Moreover, the performance of the prognostic signature constructed by the TCGA training cohort was validated in ICGC testing dataset by a similar approach.
Independence of the prognostic signature from clinicopathological parameters To evaluate whether the predictive power of the prognostic signature could be independent of clinicopathological parameters, we performed univariate and multivariate Cox proportional hazards regression analyses in the TCGA and ICGC cohorts.

Establishment and validation of a predictive nomogram
A predictive nomogram to accurately predict the overall survival probability by using risk score calculated from the autophagy-related lncRNAs prognostic signature and clinicopathological parameters in the TCGA cohort. Harrell's concordance index (C-index) was performed to evaluate the prognostic accuracy. C-index ranges from 0.5 (no predictive power) to 1 (perfect prediction). Calibration plot was used to assess the performance characteristics of the nomogram. Subsequently, external validation from ICGC testing cohort further con rmed the robustly predictive power of the nomogram. Two-sided P value < 0.050 was considered to be statistically signi cant.

Construction of the lncRNA-mRNA co-expression network
The lncRNA-mRNA co-expression network was constructed to explore the correlations between the autophagy-related lncRNAs and corresponding mRNA. Cytoscape Software (version3.7.2) was used to visualize co-expression network. Sankey plots were utilized to reveal the detail correlations by R studio software using the package of ggalluvial.

Functional analysis
Based on co-expressed genes of the autophagy-related-lncRNAs, the possible functional pathways and categories were enriched by Gene Ontology (GO) and the Kyoto Gene and Genomic Encyclopedia (KEGG). P and q values are less than 0.050 were considered to be signi cant.
Gene set enrichment analysis (GSEA) GSEA was utilized to interpret the functional enrichment of gene expression data. This method derives its function by analyzing gene sets, so it can be also used to determine whether the gene set shows a statistically signi cant difference between the two biological states. Underlying mechanisms were investigated within "Molecular Signatures Database" of c2.cp.kegg. v6.2. Symbols through GSEA with a Java program. The random sample permutation number was set as 1,000, and the signi cance threshold was P < 0.050

Results
Identi cation of autophagy-related lncRNAs in HCC patient tissue samples TCGA HCC transcriptome pro ling contained 14142 lncRNAs and 19658 mRNAs transcripts. A total of 226 lncRNAs and 22688 mRNAs were identi ed by analyzing RNA-seq data of HCC patient tissue samples from the ICGC database.
The expression level of 232 autophagy genes were also extracted from the TCGA and ICGC databases, respectively. Pearson correlation analysis was performed to identify autophagy-related lncRNAs using |R| > 0.5 and P < 0.050 as the selection criteria. Finally, 349 and 67 autophagy-related lncRNAs were selected from the TCGA and ICGC databases, respectively. After autophagy-related lncRNAs were intersected from the two cohorts, 19 autophagy-related lncRNAs were obtained (Fig. 1A).
Identi cation of prognostic risk expressed autophagy-related lncRNAs in HCC patients from TCGA database According to KM method and univariate Cox proportional hazards model, survival analysis of each autophagy-related lncRNA was done to screen prognostic biomarker with both signi cant P values less than 0.050. Figure 1B showed that a total of 9 autophagy-related lncRNAs might have prognostic value for patients with HCC. Subsequently, Multivariate Cox regression analysis showed that 4 of 9 autophagyrelated lncRNAs were well candidates to construct a prognostic risk signature on the basis of the lowest Akaike information criterion (AIC). The four autophagy-related lncRNAs (BACE1-AS, SNHG3, MIR210HG, and ZEB1-AS1) were con rmed to be unfavorable factors for HCC (Fig. 2).
According to the prognostic signature, each patient obtained a risk score in connection with the patient's prognosis. All patients were divided into high-(n = 171) or low-risk groups (n = 172) by the median risk score, respectively. KM survival analysis demonstrated that the OS of HCC patients with low-risk score had signi cantly longer than those with high-risk score (shown in Fig. 1C). The 1-, 3-, and 5-year survival rates for the high-risk group patients were 75.60%, 49.90%, and 41.50%, respectively, whereas 1-, 3-, and 5year survival rates for the low-risk group patients were 93.40%, 76.30% and 57.00%, respectively. Distribution of patients' risk scores in different groups was ranked in Fig. 1D. The scatter dots plot showed that the correlations of dead status with the risk score (Fig. 1E). Those results demonstrated that patients with a higher risk score suffered from a shorter survival time and lower survival rates. Moreover, heat map showed different expression levels of the autophagy-related lncRNAs in the high-and low-risk groups. As shown, patients with high-risk score also expressed higher levels (Fig. 1F). Time-dependent ROC curve analysis further showed that the AUC value for the prognostic signature was 0.728 (Fig. 1G).
The correlation of the autophagy-related lncRNA prognostic signature with clinicopathological features Subsequently, analysis was done to investigate the clinical value of the prognostic signature in different subgroups strati ed by patients' clinicopathological characteristics. As shown in Table 1, patients with high riskscore were prone to nd in those with higher level of creatinine or AFP. Interesting, riskscore tends to increase in the advanced T stage and TNM staging, suggesting that the autophagy-related lncRNA prognostic signature might be signi cantly associated with HCC progression. No signi cant difference of the other clinicopathological variables with the prognostic signature was found. Prognostic value of the autophagy-related lncRNA prognostic signature in different subgroups Subgroup analysis was done to investigate the prognostic value of the autophagy-related lncRNA prognosis signature among different subgroups strati ed by clinicopathological variables. As shown in Table 2, the risk scores prognostic signature showed better performance in male patients without liver cirrhosis, the history of HBV or HCV infection as well as family history, whereas obese patients with poor physical condition could bene t more in prognosis based on the autophagy-related lncRNA prognostic signature. Considering laboratory index, the prognostic signature seemed more applicable to patients with relatively lower expression levels of serum AFP, albumin, and creatinine. Besides, the prognostic signature showed strong predictive power independent of clinicopathological features including gender, age, the history of alcohol consumption, tumor stage and histological grade. Validation of the autophagy-related lncRNA prognostic signature in the ICGC database Next, we further assessed the predictive power of the autophagy-related lncRNA prognosis signature in the ICGA database. A total of 230 HCC patients were enrolled. All the patients were separated into low and high-risk groups based on the median value of the risk score. Consistent with the results derived from the TCGA database, KM survival curve analysis demonstrated that the OS of HCC patients in the low-risk group was signi cantly higher than that of the patients in the high-risk group (Fig. 3A). The expression of each autophagy-related lncRNA showed that those lncRNAs were down-regulated in the low-risk group and up-regulated in the high-risk group (Fig. 3B). Time-dependent ROC curve analysis further demonstrated that the AUC value for the autophagy-related lncRNA prognostic signature was 0.685 in ICGC database, which further con rmed the robust prediction of the signature among HCC patients (Fig. 3C). The distribution of risk score and survival status was shown in Fig. 3D-E. Those results further con rmed that the signature could accurately predicted the prognosis of HCC patients.
Determination of the autophagy-related lncRNA prognostic signature as an independent prognostic factor Subsequently, univariate, and multivariate Cox regression analyses were performed to assess whether the prognostic signi cance was a prognostic factor independent of clinicopathological features in the TCGA and ICGC databases. As shown in Fig. 4A (Fig. 4B). Besides, based on univariate and multivariate analyses, autophagy-related lncRNAs prognostic riskscore was also proven to be an independent prognostic factor in the ICGC validation cohorts (Fig. 4C-D).

Constructing and Evaluating a nomogram to predict OS in patients with HCC
We constructed a predictive nomogram to accurately predict the 1-, 3-, and 5-year OS rate by using riskscore calculated form the autophagy-related lncRNA prognostic signature and clinicopathological parameters including age, gender and AJCC stage in the TCGA cohort (Fig. 5A). Calibration plots further identi ed that the nomogram performed well in predicted 1-, 3-, and 5-year survival probabilities with an ideal model, which indicated that the nomogram was perfectly calibrated to predict OS at assessing the performance characteristics (Fig. 5B-D). The C-index value for the nomogram was 0.734. Therefore, internal validation demonstrated that the nomogram was accurate and reliable.
Subsequently, external validation was investigated in the ICGC external validation cohort. The C-index of the nomogram was 0.701 in validation cohort. The calibration plots showed excellent agreement between the actual observation and the nomogram prediction in terms of the 1-,3-, and 5-year survival probabilities ( Fig. 5E-6G). Hence, external validation from ICGC testing cohort further con rmed the robustly predictive power of the nomogram.

Construction of the prognostic autophagy-associated lncRNA-mRNA co-expression network and functional enrichment analysis
To gure out the potential functions of the four-prognostic autophagy-associated-lncRNAs in HCC, we analyzed the lncRNAs mediated mRNA network by using cytoscape 3.6.1. As showed in Fig. 6A, 4 lncRNAs, 91 mRNAs and 141 lncRNA-mRNA pairs were shown in lncRNA-mediated genes network. Sankey plot also showed the detail relationships of prognostic autophagy-associated lncRNAs with autophagy-related-genes and risk types (Fig. 6B). Then, GO and KEGG pathway analyses further demonstrated that these genes were main correlated with autophagy and enriched in pathways in cancer ( Fig. 6C-D).

Gene set enrichment analysis
Further functional annotation was performed by GSEA. GSEA results showed the altered gene sets in the high-risk group were mainly found to be directly involved in the tumorigenesis and progression of malignant tumor (Fig. 7A). Besides, the differentially expressed genes between the two groups were main enriched in the autophagy-associated and tumor-related pathways including ERBB signaling pathway, MAPK signaling pathway, mTOR signaling pathway, VEGF signaling pathway, WNT signaling pathway, and P53 signaling pathway (Fig. 7B). In addition, the altered gene sets in the high-risk group were also found to be involved in the physical actions of autophagy (Fig. 7C). Whereas, the protective pathways involved in metabolism were signi cantly enriched in the low-risk group (Fig. 7D).

Discussion
HCC is one of the most lethal and common types of primary liver malignant neoplasms worldwide. Despite the diagnosis and multimodal therapies have signi cantly improved, the survival bene t remains to be limited, mainly because of high heterogeneity of HCC [22]. Clinically, histological grade, tumor stage, molecular subtype, and serum indicator were used to evaluate the prognostic effect among HCC patients [23]. However, those clinicopathological characteristics could not accurately provide prognostic value, which resulted in inaccurate judgment on prognosis. According to the situation, some high-risk patients may experience tumor cell growing uncontrollably due to inadequate treatment, whereas low-risk patients may receive over treatment. Therefore, reliable genetic signatures or biomarkers as prognostic predictors or therapeutic targets are of signi cance for HCC.
Malignant cell progress and extinction involve in reprogramming of energy metabolism including overutilization of amino acids as tryptophan, aerobic glycolysis, tricarboxylic acid (TCA) cycle, glutamine, and arginine, dysfunctional mitochondrial bioenergetics as well as oxidative phosphorylation and so on [24,25]. Moreover, these energy metabolisms are dramatically associated with autophagy. Hence, understanding the detail and direct relationships between autophagy and tumor progress might provide reliable basis for the development of speci c agents targeting these mechanisms, and then treating tumors. After decades of researches on genetic prognostic biomarker of tumor-related events like microRNAs, and genes, lncRNAs have aroused much attention in the recent. So far, the roles of lncRNA in tumorigenesis and progression of malignant tumor have been gradually uncovered. The prognostic value of lncRNAs also has been extensively studied. However, so far, there has been no systematic process to identify autophagy-related lncRNAs signatures for predicting the prognosis of HCC patients. Hence, it is critical to develop an autophagy-related lncRNAs signature to predict the clinical outcome among HCC patients.
In the present study, HCC patients from TCGA and ICGC databases were enrolled to explore the prognostic value of autophagy-related lncRNAs. Based on lncRNAs and autophagy genes co-expression network, we rstly identi ed 19 autophagy-related lncRNAs through analyzing high throughput RNA transcriptome data from the TCGA and ICGC databases. After univariate and multivariate Cox regression analyses, four prognostic autophagy-related-lncRNAs including BACE1-AS, SNHG3, MIR210HG, and ZEB1-AS1 were selected out for establishing a prognostic signature. Each patient obtained a risk score based on the expression of the four autophagy-related-lncRNAs. All the patients were divided into high-and low-risk based on the median value of risk scores. We further found that patients with low scores had signi cantly better OS than those with high scores in the TCGA patient cohort. In addition, the distribution of risk score, survival status and ROC curve analysis further con rmed the prognostic accuracy of the signature.
Similar results were also obtained when the autophagy-related-lncRNAs signature was validated in the ICGC patient cohort, which further con rmed the robust prediction of the prognostic signature for HCC patients. On the basis of univariate and multivariate regression analyses, the autophagy-related lncRNAs prognostic signature showed strongly predictive power independent of all the clinicopathological features including gender, age, history of alcohol consumption, tumor stage and histological grade in both cohorts. Hence, the autophagy-related lncRNAs prognostic signature showed powerful potential for clinical applications.
The associations of the prognostic signature and clinicopathological characteristics were investigated and results showed the signature was signi cantly related to advanced tumor and a higher level of serum AFP. The correlations were supported by the explanation that autophagy mainly promotes malignant cell rapid growth, invasion and migration, and its alteration might lead to poor prognosis of patients with advanced cancers [14,26]. Subgroups analyses strati ed by clinicopathological variables further veri ed the steadied prognostic value of the prognostic signature.
Nomogram is an effective and reliable clinical tool used to provide probabilistic prediction for an individual patient with cancer. Therefore, in order to achieve better predictive ability of prognosis for HCC patients, several clinicopathological variables (age, gender, and AJCC stage) and the autophagy-related lncRNAs prognostic signature were incorporated to establish a predictive nomogram in the present study. Calibration plots showed that the predicted 1-, 3-, and 5-year survival rates were comparable with the actual observation. High C-index con rmed that the nomogram showed powerful discrimination, indicating that it might function as a potentially prognostic tool. In addition, external validation from the ICGC testing cohort further con rmed the predictive power of the nomogram.
The role of autophagy in cancer is controversial because it could inhibit and promote tumor. However, with the development of understanding of autophagy, the role of autophagy in cancer has also been gradually uncovered. On the one hand, autophagy could provide the necessary circulating metabolic substrates and enzymes to response to various of poor circumstances such as tumor microenvironment; on the other hand, inadequacy autophagy also promotes malignant cell rapid growth especially in advanced tumor. Recently, lncRNAs' roles in mediating tumorigenesis, progression, metastasis, and drug resistance by regulating autophagy-related genes or microRNAs have gradually been found. In the present study, we rstly identi ed four autophagy-related lncRNAs for establishing a prognostic signature. BACE1-AS has been widely reported in previous studies to be signi cantly associated with prognosis of patients with cancer [27,28]. BACE1-AS could also promote autophagy-related neuronal damage through the miR-214-3p/ATG5 signaling axis in Alzheimer's disease [29]. Additionally, the role of SNHG3 in caner has been gradually uncovered [30]. Increasing studies indicated that SNHG3 affects the tumor development and progression by regulating autophagy-related microRNAs, genes, or corresponding pathways [31][32][33][34]. MIR210HG [35][36][37][38], and ZEB1-AS1 [39,40] have also been found to affect tumorigenesis, progression, and tumor metastasis, resulting in a poor prognosis for patients with cancer. Hence, the prognostic signature established by the four robust autophagy-related lncRNAs undoubtedly showed better prognostic value.
Subsequently, we also identi ed autophagy-related genes governed by the four lncRNAs and established the lncRNA-mRNA co-expression network. GO and KEGG functional enrichment analysis showed those genes mainly enrich in autophagy and tumor-related signaling pathways.
GESA functional enrichment analysis showed the signi cant difference between the low-and high-risk groups. The high-risk group main enriches in autophagy-and cancer-related pathways. The altered gene sets in the high-risk group were also found to be involved in the autophagy-associated and tumor-related pathways as well as the physical actions of autophagy. Whereas the predictive pathways involved in various of metabolisms were signi cantly enriched in the low-risk group. Therefore, these results further con rmed that autophagy is a critical modulator of oncogenesis and progression. We also concluded that the four autophagy-related lncRNAs might be potential therapeutic targets.
However, our study has several limitations. For example, functional experiments such as immunohistochemistry, quantitative real-time PCR, and ow cytometry should be conducted to uncover the potential molecular mechanisms for predicting the effect of autophagy-related lncRNAs and con rm our ndings.

Conclusions
In conclusion, the four autophagy-related lncRNAs signature was a reliable tool for predicting OS in HCC  Kaplan-Meier survival curves for the four-prognostic autophagy-related-lncRNAs in HCC patients from the TCGA database. The four autophagy-related lncRNAs were con rmed to be risk factors.   line indicates the performance of the proposed nomogram. The X-axis is nomogram predicted probability of survival and Y-axis is actual survival. Blue dots are sub-cohorts of the data set; Red vertical bars represent 95% con dence interval.

Figure 6
Construction of the prognostic autophagy-associated lncRNA-mRNA co-expression network and functional enrichment analysis. Network of four-prognostic autophagy-related-lncRNAs with co-expressed autophagy genes in HCC. The red nodes correspond to autophagy-related lncRNAs, the blue nodes