Identification of metabolism-related lncRNAs in LUAD patients
The workflow is presented in Figure 1. We screened 490 LUAD patients and 14056 lncRNAs for this analysis. Table 1 shows the clinical details of samples in the training and testing sets. We analyzed and compared the expression levels of the 816 metabolism-associated genes among normal and tumor samples and identified 185 differentially expressed metabolism genes. Moreover, we calculated the corresponding lncRNA expression matrix of these genes. Of them, 43 were downregulated, and 142 were upregulated. The expression distribution of metabolism-associated differently expressed genes in LUAD was displayed in Figure 2A-2C. Analyses of the GO enrichment indicated that 185 differentially expressed genes participated in many critical physiological processes, especially in the alpha-amino acid metabolic and small molecule catabolic process, mitochondrial matrix, basolateral plasma membrane, anion transmembrane transporter activity, and active transmembrane transporter activity (Figure 2D). KEGG analysis discovered that 185 differentially expressed metabolic genes were associated with signaling pathways, including biosynthesis of amino acids, biosynthesis of cofactors, purine metabolism, and carbon metabolism (Figure 2E). Finally, the coexpression network between metabolism-related lncRNAs and differentially expressed metabolic genes was shown in the San-key diagram (Figure 2F), and 2633 metabolism-related lncRNAs were identified as metabolism-related lncRNAs. The correlation between metabolism-related lncRNAs and corresponding genes, such as ABCA8, PMM2, and lncRNAs, was shown in (Supplementary Table S1) and visualized in (Figure 2G).
Table 1
Sample details of the training group and the test group
Covariates
|
Type
|
Total
|
Test
|
Train
|
P-value
|
Age
|
<=65
|
231(47.14%)
|
110(45.08%)
|
121(49.19%)
|
0.409
|
>65
|
249(50.82%)
|
129(52.87%)
|
120(48.78%)
|
unknow
|
10(2.04%)
|
5(2.05%)
|
5(2.03%)
|
Gender
|
FEMALE
|
262(53.47%)
|
130(53.28%)
|
132(53.66%)
|
1
|
MALE
|
228(46.53%)
|
114(46.72%)
|
114(46.34%)
|
Stage
|
Stage I-II
|
378(77.14%)
|
192(39.18%)
|
186(37.96%)
|
0.8448
|
Stage III-IV
|
104(21.22%)
|
48(9.80%)
|
56(11.43%)
|
unknow
|
8(1.63%)
|
4(1.64%)
|
4(1.63%)
|
T
|
T1-2
|
426(86.94%)
|
210(42.86%)
|
216(44.08%)
|
0.8287
|
T3-4
|
61(12.45%)
|
32(6.53%)
|
29(5.92%)
|
unknow
|
3(0.61%)
|
2(0.82%)
|
1(0.41%)
|
M
|
M0
|
324(66.12%)
|
161(65.98%)
|
163(66.26%)
|
0.5842
|
M1
|
24(4.9%)
|
10(4.1%)
|
14(5.69%)
|
unknow
|
142(28.98%)
|
73(29.92%)
|
69(28.05%)
|
N
|
N0
|
317(64.69%)
|
161(65.98%)
|
156(63.41%)
|
0.8888
|
N1-3
|
162(33.06%)
|
77(15.71%)
|
85(17.35%)
|
unknow
|
11(2.24%)
|
6(2.46%)
|
5(2.03%)
|
Construction and Validation of Risk Model
We selected 527 metabolism-related lncRNAs by univariate COX regression analysis (Figure 3A, Supplementary Table S2). LASSO regression analysis was further applied, with the result that 15 lncRNAs showed a strong correlation with the OS of LUAD patients (Figure 3B-3C). Finally, we employed multivariate Cox regression analysis for screening the eight metabolism-related lncRNAs (Table 2) to construct the risk model (Figure 3D).
Table 2
Multivariate cox regression analysis of 8 hub lncRNAs
|
coef
|
HR
|
HR.95L
|
HR.95H
|
pvalue
|
AC068228.1
|
1.04537938
|
3.31067518
|
2.069563114
|
5.296079195
|
5.91E-07
|
LINC02390
|
-2.765775643
|
0.05374742
|
0.010851902
|
0.266200811
|
0.000341925
|
AC123595.1
|
-0.612033991
|
0.305070349
|
0.148148482
|
0.62820703
|
0.001275727
|
AC021016.1
|
-0.536593072
|
0.274268488
|
0.144707684
|
0.519828675
|
7.32E-05
|
LINC00707
|
0.409269302
|
1.955105801
|
1.532233927
|
2.494683498
|
6.98E-08
|
AL132656.2
|
-1.227239484
|
0.125174908
|
0.039862049
|
0.393074569
|
0.00037181
|
AL033397.2
|
0.313575844
|
1.47613924
|
1.183856926
|
1.840583105
|
0.00054189
|
LINC00941
|
0.330708502
|
1.778741168
|
1.396296046
|
2.265937908
|
3.12E-06
|
A risk score was obtained based on the following formula: risk score= expression of AC068228.1×(1.04537938048222)+ expression of LINC02390×(-2.76577564344421)+ expression of AC123595.1×(-0.612033991186885)+ AC021016.1×(-0.53659307165381)+ expression of LINC00707×(0.40926930226994)+ expression of AL132656.2×( -1.22723948408621)+ expression of AL033397.2×(0.3135758443623)+ expression of LINC00941×( 0.330708502412192)
We defined these LUAD patients as two groups: low-risk groups and high-risk groups(based on the median value of the predictive risk scores). Figure 3E presented the distributional patterns of risk scores among two subgroups in the training set. We summarized the survival parameters of these patients in Figures3F. The results suggested that as risk score increased, OS time decreased while mortality rise. Figure 3G illustrated the relative expression difference for the eight metabolism-related lncRNAs in the training set. Then, we performed a K-M analysis and found that the high-risk group held an inferior OS rate than that of the low-risk group (p < 0.001) (Figure 3H). Additionally, we evaluated the model’s efficacy based on the testing set and the entire set. We summarized the risk scores, survival parameters, and expression of the eight metabolism-related lncRNAs in the abovementioned two sets and listed them in (Figure 4A-4F). K-M analyses presented the same outcomes based on the testing set and the entire set (Figure 4G-4H). All the above results supported the power of our risk model.
Independent Prognostic Analysis and Nomogram
Using univariate Cox and multivariate Cox regression analyses, we combined metabolism-related lncRNAs with clinical parameters to investigate whether the risk model could be used independent factor for predicting the survival of LUAD. As shown in the univariate cox regression analysis, the hazard ratio (HR) of the risk model was 1.04, and the 95% confidence interval (CI) was 1.029–1.061 (p< 0.001) (Figure 5A). Multivariate cox regression analysis showed that the HR was 1.041 and CI was 1.024–1.059 (p< 0.001) (Figure 5B). Therefore, the risk model might serve as a prognostic factor independent of other clinical parameters such as age, sex, pathological stage and so on.
Furthermore, we constructed a nomogram for better predicting the 1,3,5, -year OS of LUAD patients by combining the risk model with clinical factors, including gender, age, stage, TNM, and risk score (Figure 5C). After that, the prediction accuracy of the nomogram was assessed. Observed survival rates are blue, while the optimized survival rates are shown in gray, indicating a good match between them (Figure 5D).
Assessing the Risk Model.
We used receiver operating characteristic (ROC) curve analysis to verify the efficacy of the risk model. The 1-, 3-, and 5-year AUC of entire set was 0.843, 0.816, and 0.814 (Figure 6A). The AUC for the risk score was higher than the AUC for any other clinicopathological feature, fully demonstrating that the predictive risk model for LUAD is robust and highly reliable (Figure 6B). The concordance index also suggested the accuracy of the risk model (Figure 6C). Then, we employed P-principal-component (PCA) and t-SNE analyses to assess the distribution between the high and low-risk groups in training and testing sets (Figure 6D-6E). The PCA and t-SNE analysis results confirmed that the metabolism-related lncRNAs model had grouping capabilities.
Besides, we also use PCA analysis to verify further the ability of the risk model between two subgroups based on the entire gene expression profiles, 185 differentially expressed metabolic genes, 2633 metabolism-related lncRNAs, and risk model according to the eight metabolism-related lncRNAs (Figure 6F-6I). The results confirmed that the distributional patterns of the high-risk and low-risk groups were significantly different, indicating that the risk model was competent to distinguish the two groups with high accuracy.
We evaluated the discrepancies between two subgroups based on the universal clinicopathologic characteristics. By further grouping patients of LUAD by gender, age, stage, and TNM, survival analysis found that LUAD low-risk patients also had better OS than high-risk patients (Figure 7). The above results indicated that the risk model maintained its powerful predictive ability among subgroups of different clinical features.
Identifying the immune infiltration status
We explored the immune infiltration status among two subgroups using the CIBERSORT algorithm (Supplementary Table S3). The proportions of 22 immune cells in every patient were presented in (Figures 8A-8B). Furthermore, the results based on the ssGSEA algorithm revealed that some factors reflecting immune functions were upregulated in the low-risk subgroup (e.g., T_cell_co_stimulation, HLA, Type_II_IFN_Response) (Figure 8C, Supplementary Table S4). The infiltration of ads, B_cells, DCS, iDCs, Mast_cells, Neutrophils, TIL, and T_helper_cells, was higher in the low-risk group (Figure 8D). These results suggest that the low-risk group has a higher immune infiltration status, combined with a better OS in the low-risk group, which we reasoned might contribute to the antitumor effect. Similarly, LUAD patients in the low-risk group also showed remarkably higher stromal, immune, and ESTIMATE scores, suggesting that the high-risk group held dissimilar TME(Figure 8E-G).
Furthermore, the comparison illustrates that approximately 93.49% of samples exhibited genetic mutations in the high-risk samples, as well as 82.19% of samples exhibited mutations in the low-risk samples were displayed in (Figure 9A-9B). Additionally, TMB scores were calculated from the TCGA mutation data. They presented a higher TMB status in the high-risk groups, indicatied that high-risk patients might benefit more from immunotherapy (Figure 9C). Therefore, we tested the correlation between the risk model based on metabolism-related lncRNAs and TMB (Figure 9D, R=0.3, P=1.6e-11). The results showed that the metabolism-based classifier index was highly correlated with TMB. To investigate the impact of TMB state on prognosis in LUAD patients, we applied survival analysis based on high and low TMB groups. However, the survival curve of patients with high TMB was similar to patients with low TMB, indicating that the TMB failed to distinguish the survival in LUAD (Figure 9E). Additionally, we checked the efficacy of TMB scores related to risk scores based on the model to judge its predictive ability for predicting the OS outcomes (Figure 9F). Surprisingly, the model showed a significant predictive power for patients with LUAD. Besides, according to the immune subtype data in TIMER2.0, we tested whether a the-risk model based on the eight metabolism-related lncRNAs could identify the different immune subtypes (Figure 9G, Supplementary Table S5). The result suggested that the risk model effectively distinguished the resistant subtype. Our findings might shield new light on understanding the molecular pathogenesis of LUAD from the perspective of metabolism-lncRNAs.
Clinical Treatment and Drug sensitivity analysis
Considering the significant differences in the immune microenvironment between the low-risk and high-risk groups, we speculated that responses to drugs, chemotherapy, critical ICPs, and immunotherapy might differ between the two groups. Using the “pRRophetic” package, we then evaluated the therapeutic response using the IC50 values of 138 anti-tumor drug patients obtainable in the GDSC database to explore potential drugs targeting our risk model and improve treatments for patients with LUAD. The IC50 of A.443654, A770041, AMG.706, AUY922, AZ6828, and AZD.0530 were higher in the low-risk group, suggesting high risk patients may respond better to those drugs. Interestingly, the ABT.888, AP.24534, ATRA, and Axitinib showed a higher level in the high-risk group (Figure 10A), indicating that low-risk patients were more sensitive to these drugs. Besides, with ICIs have been applied in the treatment of LUAD and other cancers, we further explored the differences in ICI-related biomarkers expression among two subgroups. The results presented that the low-risk group had high PD1, CTLA4, TIGIT, PD−L1, and HAVCR2 expression (Figure 10B). Additionally, we counted the IC50 of common anti-lung cancer drugs in two subgroups. Patients in the low-risk groups were related with a higher IC50 of targeted therapy such as erlotinib and gefitinib and chemotherapeutics like cisplatin, paclitaxel, etoposide, which indicated that the risk model served as a promising predictor of anti- tumor drug sensitivity (Figure 10C). Besides, we analyzed the correction between 8 metabolism-related lncRNAs and drugs. For example, the correlation coefficient between Sulfatinib and LINC00707 was the highest (Cor=−0.433, p<0.001)(Figure 10D). As a result, we might be able to select the most appropriate drugs for LUAD patients. Besides, a high-risk group with lower TIDE scores may be more sensitive to immunotherapy than the high-risk group. So, our metabolism-based classifier index might serve as a powerful indicator for instructing clinical treatment (Figure 10E).
Functional analysis
To understand the potential biological process involved, we employed enrichment analysis to identify the signature of the eight metabolism lncRNAs (Supplementary Table S6). As shown in Figure 11A, GO analysis revealed that it mainly participates in the humoral immune response, human antimicrobial response, clathrin-coated endocytic vesicle, multivesicular body, receptor-ligand activity, and signaling receptor activator activity. According to KEGG analysis, the signature was connected with Hematopoietic cell lineage, Amoebiasis, Arachidonic acid metabolism, Pancreatic secretion, and so on (Figure 11B-11C). Further, we leveraged GSEA software to explore better the differences in biological functions in the KEGG pathways (Supplementary Table S7). The GSEA results illustrated that the high-risk group was enriched in the pathway such as cell cycle, DNA replication, homologous recombination, glycolysis gluconeogenesis, pyrimidine metabolism, and others (Figure 11D). In contrast, pathways such as allograft rejection, asthma, B/T cell receptor signaling pathway, and others were enriched in the low-risk group (Figure 11E). Besides, we presented metabolism-related differently expressed genes, eight metabolism-related lncRNAs, and risk types in the Sankey network (Figure 11F). Finally, with the help of Cytoscape, we presented an interaction network for visualizing the co-expression between the lncRNAs and mRNAs (Figure 11G).
Verifying the expression Level of eight Prognostic lncRNAs in vitro
We performed RT-qPCR to validate the expression Level of eight Prognostic lncRNAs using BEAS-2B and LUAD cells, including A549 and H1975 (Figure 12). three lncRNAs (LINC02390, AC021016.1, AC123595.1) exhibited significant downregulation in both LUAD cells. Given that the high expression of lncRNAs could represent a better survival (Figure 3D), they might function as tumor suppressor factors. TwolncRNAs (LINC00941, LINC00707) were upregulated in both LUAD cells. Similarly, and thus the high expression of the four lncRNAs could be representative of worse survival, suggesting that they may serve as risk factors. Interestingly, AL132656.2 was downregulated in A549 cells while upregulated in H1975 cells, so a deeper understanding of the specific mechanism is required. In addition, we also verified the differences in the expression of these eight critical lncRNAs in LUAD samples and standard tissue samples based on the TCGA database. The expression patterns of the remaining three lncRNAs were reversed. Except for LINC02390, the other lncRNA expression patterns were consistent with our risk model's respective coef coefficients of lncRNAs.