3.1. Establishment of a Co-expression Network
A total of 14,142 lncRNAs were separated form the patients’ expression profiles we downloaded from TCGA database. In searching of HADb, we found 257 autophagy related genes(Supplementary File1) of which 210 genes were identified in our TCGA-LUAD expression matrix (Table S1). Next, we constructed a co-expression network between the autophagy related gene we identified and the correlated lncRNAs with a cut-off value of ∣R2 | >0.3 and P < 0.001. Finally, a total of 1,703 autophagy-related lncRNAs (ARlncRNAs) were filtered out for further analysis.
3.2. Development of Prognostic Risk Model from Autophagy-related lncRNA
Based on our univariate Cox regression result, 74 autophagy-related lncRNAs were found to have a significant correlation with patients’ survival in lung adenocarcinoma. 57 of the 74 ARlncRNAs were favorable factors (HR<1) while 17 ARlncRNAs showed a harmful results (Table S2). The Lasso regression found 5 autophagy-related lncRNAs (Figure1a) and subsequent multivariate Cox regression included the 5
ARlncRNAs (LINC01137, AL691432.2, LINC01116, AL606489.1 and HLA-DQB1-AS1) into our risk score model (Figure1b-c, Table1). Besides, the Kaplan-Meier survival analysis was conducted to evaluate the prognostic value of each of the the ARlncRNAs. All of the five ARlncRNAs had significant prognostic value in predicting patients’ overall survival among which AL691432.2 and HLA-DQB1-AS1 were considered protective factor while the other three ARlncRNAs were risk factors (Figure 2). Moreover, we had established a co-expression network and Sanky diagram to show the correlation between significant ARlncRNAs and autophagy related genes (Figure 3). Meanwhile, the network also revealed that AL606489.1 had a significant correlation with BIRC6 and WDFY3 from the Pearson correlation test (Supplementary file 2). Finally, according to the results of the multivariate Cox regression analysis, a risk score model was generated based on the following formula: Risk Score= (0.0538 × LINC01137 expression) + (-0.0853 × AL691432.2 expression) + (0.0718 × LINC01116 expression) + (0.221 × AL606489.1 expression) + (-0.0562 × HLA-DQB1-AS1 expression).
3.3. The Prognostic Influence of the Established Signature
According to our risk score formula, the risk score of each patient was calculated and assigned to high/low risk group. The median risk score in our analysis was 0.9939 and 223 patients were assigned to low risk group while 222 patients was assigned to high risk group. The Kaplan-Meier survival analysis showed that the risk score was significantly correlated with patients’ overall survival in lung adenocarcinoma with a p-value <0.001 (Figure 4e). For the time-dependent ROC curve, the AUC value for 1-year, 3-year, 5-year, 7-year was 0.735, 0.672,0.662 and 0.732 respectively which indicating a reliability of this signature combination to predict patients’ survival (Figure 4a-d). Next, the risk score distribution along with the survival time between high and low risk group showed in Figure 5a implies a poorer survival probability in high risk group. For this prognostic signature, high expression of AL691432.2 and HLA-DQB1-AS1were found in low risk group while the other three were associated with high risk group (Figure 5b). Furthermore, we found a small number of deaths in the low risk group compared to high risk group.
3.4. Clinical Value of the Autophagy-Related lncRNA Signature
Univariate and multivariate Cox regression analysis were conducted on this 5- ARlncRNAs signature in patients’ cohort to evaluate whether this signature can be an independent factor of other relevant information such as gender, age and TNM stage. In our univariate Cox regression analysis, the risk score and tumor stage were independent prognostic indicators with a HR of 1.075 (95% CI: 1.046–1.104, P < 0.001) and 1.666 (95% CI: 1.409–1.969, P < 0.001) respectively (Table 2, Figure 6a). In the multivariate Cox regression analysis, the overall survival was used as a dependent variable and other clinical factors (age, gender, stage and TNM) were regarded as covariates. The multivariate Cox regression analysis indicated that risk score still was an independent factor in our analysis with a HR of 1.088 (95%CI = 1.057 − 1.120, P < 0.001), Table 3, Figure 6b). Further, according to the nomogram we drew from our analysis, risk score and tumor stage were the most two contributors to 1-year, 3-year and 5-year overall survival of patients with lung adenocarcinoma (Figure 6c). Besides, to evaluate the reliability of this scoring system, the AUC value under the ROC curve we drew from our results indicated risk score (0.668) as well as tumor stage (0.733) had certain credibility in predicting a patient’s prognosis based on the 5-ARlncRNAs signature (Figure 6d). The C-index of our model was 0.726 (95% CI: 0.671-0.781). A detailed clinical stratification on risk score showed this
5-ARlncRNAs signature had a close relationship with tumor stage especially the tumor size as well as the lymph nodes metastasis (Table 4).
3.5. Functional Analysis
GSEA was then applied to find the relevant gene ontology terms and KEGG pathways that involved between the high and low risk group. For GO terms, we found a total of 147 upregulated in high risk group and 324 downregulated in high risk group with a P<0.001. In our analysis, the most concentrated biological process including cell metabolism, cell division as well as T cell selection (Table S3). For KEGG analysis, a total of 20 pathways were enriched among which 14 were upregulated in high risk group. The enriched pathways were mostly related with p53 signaling pathway, sugar metabolism, protein export, DNA replication and JAK-STAT signaling pathway (Table S3). Moreover, most of the GO terms and KEGG pathways enriched in our analysis were found to be closely related to the occurrence and development of lung adenocarcinoma, indicating that the five lncRNA may play a role in lung cancer related functions.