3.1. Amino acid metabolism-related lncRNAs in LIHC
We obtained expression data for 370 LIHC tumor samples from the TCGA database. Co-expression analysis of amino acid metabolism-related genes identified 456 amino acid metabolism-related lncRNAs (correlation coefficient > 0.6, p < 0.001). The co-expression relationships between the amino acid metabolism-related genes and their associated lncRNAs were visualized with a Sankey plot (Fig. 1A).
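The co-expression filter can be illustrated with a short sketch that computes a correlation for each gene-lncRNA pair and keeps the pairs with r > 0.6 and p < 0.001. This is a minimal sketch and not the pipeline used in the study: the use of Pearson correlation, the Fisher z approximation for the p-value, and all function and variable names are assumptions made for illustration.

```python
import math
from statistics import NormalDist


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return sxy / math.sqrt(sxx * syy)


def approx_p_value(r, n):
    """Approximate two-sided p-value via the Fisher z-transform (assumption:
    a normal approximation is acceptable; an exact t-test could be used instead)."""
    if n < 4:
        raise ValueError("need at least 4 samples")
    r = max(min(r, 0.999999), -0.999999)  # keep atanh finite
    z = math.atanh(r) * math.sqrt(n - 3)
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def coexpressed_pairs(gene_expr, lncrna_expr, r_cut=0.6, p_cut=0.001):
    """Return (gene, lncRNA, r) for pairs passing both thresholds.

    gene_expr / lncrna_expr: dicts mapping a name to per-sample expression
    values, with samples in the same order in every list (hypothetical layout).
    """
    pairs = []
    for gene, gx in gene_expr.items():
        for lnc, lx in lncrna_expr.items():
            r = pearson_r(gx, lx)
            if r > r_cut and approx_p_value(r, len(gx)) < p_cut:
                pairs.append((gene, lnc, round(r, 3)))
    return pairs
```

The thresholds are exposed as parameters so that the cutoffs reported above (0.6 and 0.001) are the defaults rather than hard-coded values.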
3.2. Construction and validation of prognostic models
The 370 patients were randomly divided 2:1 into a training group (n = 245) and a validation group (n = 125), and the clinicopathological characteristics of the two groups were compared (Table 1). Univariate Cox regression analysis (Fig. 1B) identified 74 amino acid metabolism-related lncRNAs in the training group that were significantly correlated with patients' OS (p < 0.05). We then performed Lasso regression analysis (Fig. 1C-D), and the candidate lncRNAs it retained were entered into a multivariate Cox regression analysis. This yielded six amino acid metabolism-related lncRNAs significantly associated with survival in patients with hepatocellular carcinoma (p < 0.05), from which we constructed the prognostic model: risk score = LINC02870 × (0.242) + AL031985.3 × (0.602) + AC011476.3 × (-0.688) + AC012640.1 × (0.326) + AL365361.1 × (-0.718) + LUCAT1 × (0.474). A correlation heat map shows the expression correlation between the six lncRNAs and the related genes (Fig. 2).
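The risk score is a weighted sum of the expression of the six lncRNAs, which can be written as a few lines of code. The coefficients below are those reported above; the function name, the dictionary layout, and the assumption that expression values are on the same scale as the data used to fit the model are illustrative only.

```python
# Coefficients of the six-lncRNA model, as reported in the text.
COEFFICIENTS = {
    "LINC02870": 0.242,
    "AL031985.3": 0.602,
    "AC011476.3": -0.688,
    "AC012640.1": 0.326,
    "AL365361.1": -0.718,
    "LUCAT1": 0.474,
}


def risk_score(expression):
    """Weighted sum of lncRNA expression values for one patient.

    expression: dict mapping lncRNA name to its expression value
    (hypothetical layout; a missing lncRNA raises KeyError).
    """
    return sum(coef * expression[name] for name, coef in COEFFICIENTS.items())
```

For example, a hypothetical patient with an expression value of 1.0 for every lncRNA would receive a score equal to the sum of the coefficients, 0.238. Positive coefficients raise the score with higher expression, while the two negative coefficients (AC011476.3 and AL365361.1) lower it.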
To assess the prognostic value of the risk signature, patients were divided into a high-risk group and a low-risk group using the median risk score as the cutoff. Survival analysis showed that patients in the high-risk group had significantly lower survival than those in the low-risk group in the training cohort (Fig. 3A), the validation cohort (Fig. 3B) and the whole cohort (Fig. 3C) (p < 0.05), supporting the prognostic value of the risk score. Similar results were obtained for the PFS curves in the entire TCGA cohort (Fig. 3D): disease progression was significantly slower in the low-risk group than in the high-risk group, indicating that a higher risk score corresponds to a worse prognosis. We then examined the risk score distribution, survival status, and expression of the model lncRNAs in low- and high-risk patients in the training (Fig. 4A), validation (Fig. 4B) and entire cohorts (Fig. 4C). In all three cohorts, mortality was significantly lower in the low-risk group than in the high-risk group, and the heat maps showed that AC011476.3 and AL365361.1 were highly expressed in the low-risk group, whereas the remaining lncRNAs were highly expressed in the high-risk group.
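The median split and the survival curves compared here can be sketched as follows. This is an illustrative implementation of the standard Kaplan-Meier product-limit estimator, not the code used in the study; the function names, the data layout, and the convention that scores strictly above the median are labelled high risk are assumptions.

```python
from statistics import median


def split_by_median(scores):
    """Label each patient 'high' or 'low' relative to the median risk score.

    Assumption: scores strictly above the median are 'high'; ties go to 'low'.
    """
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival probability).

    times: follow-up time per patient; events: 1 = death observed, 0 = censored.
    One point is returned for each distinct time at which a death occurred.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve
```

Computing one curve per risk group gives the two curves that a log-rank test would then compare; the test itself is omitted from this sketch.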
Univariate (Fig. 5A) and multivariate (Fig. 5B) Cox regression analyses were used to assess whether the risk model based on the six amino acid metabolism-related lncRNAs was an independent prognostic factor for hepatocellular carcinoma. Age, gender, stage and risk score were included in both analyses. In the univariate analysis, the risk score (HR = 1.189, 1.133–1.247, p < 0.001) and stage (HR = 1.680, 1.369–2.062, p < 0.001) were significantly associated with OS. In the multivariate analysis, the risk score (HR = 1.203, 1.141–1.268, p < 0.001) and stage (HR = 1.670, 1.353–2.061, p < 0.001) remained independently associated with OS, indicating that the risk model is a prognostic factor for hepatocellular carcinoma independent of other clinical features. In addition, ROC curves (Fig. 5C) were used to assess the sensitivity and specificity of the risk model for the prognosis of patients with LIHC. We further compared the risk model with other predictors, including age, gender, and stage. The AUC of the signature was higher than that of all the other predictors at 1, 2 and 3 years. These results indicate that the risk model has a higher predictive value than the other clinicopathological parameters and a satisfactory ability to assess the prognosis of patients with hepatocellular carcinoma, with areas under the curve (AUC) of 0.754, 0.767, and 0.722 at 1, 3, and 5 years, respectively (Fig. 5D-F). The C-index of the risk score was also higher than that of the other clinical parameters (Fig. 5G). Together, these results demonstrate the good performance of the risk model.
3.3. Prognostic nomogram
To predict the survival of patients with hepatocellular carcinoma, we built a nomogram based on patient gender, age, risk score and clinical stage (Fig. 5I). Each factor in the nomogram is assigned a specific number of points; the points for each prognostic factor are summed for a given patient to obtain a total score, from which the patient's survival probability at 1, 2 and 3 years can be predicted. The calibration curves (Fig. 5H) showed that the predicted 1-, 3- and 5-year survival probabilities matched the observed survival probabilities, indicating good agreement between the actual survival of LIHC patients and the values predicted by the nomogram. These data suggest that the nomogram may support more accurate individual clinical decision making and monitoring.
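The C-index used to compare the risk score with the clinical parameters measures how often the patient with the higher score is the one who dies earlier. The sketch below implements Harrell's concordance index in its simplest form; it is an assumption that this is the variant reported in the study, and the function and argument names are hypothetical.

```python
def concordance_index(times, events, scores):
    """Harrell's C-index for a risk score where higher means worse prognosis.

    A pair (i, j) is comparable when patient i died (events[i] == 1) and did so
    strictly before patient j's follow-up time. The pair is concordant when
    patient i also has the higher score; tied scores count as half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot be the earlier death
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which is why a higher C-index for the risk score than for age, gender, or stage is read as better discrimination.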
3.4. Model validation and principal component analysis for clinical grouping
To further explore the relationship between the risk score and the clinical stage of hepatocellular carcinoma patients, we evaluated the model within clinical subgroups. Among patients grouped by risk score, overall survival was significantly longer for patients with low risk scores than for those with high risk scores in both early-stage and advanced-stage hepatocellular carcinoma (Fig. 6A-B; P < 0.001), indicating that the model retains its predictive ability across disease stages. In addition, PCA was performed to compare the low- and high-risk groups based on four expression profiles: the six model lncRNAs, all amino acid metabolism-related lncRNAs, amino acid metabolism genes, and the genome-wide expression profile (Fig. 6C-F). The low-risk and high-risk groups were distributed in clearly different directions, indicating that these lncRNAs can be reliably used to construct the signature and that the risk model clearly separates hepatocellular carcinoma patients into two groups with different survival status.
3.5. Enrichment analysis and immune function analysis
To investigate the biological mechanisms underlying the molecular heterogeneity between the low- and high-risk groups, we identified 644 differentially expressed genes (DEGs). In the GO analysis (Fig. 6G), most DEGs were enriched in terms such as immunoglobulin complex, humoral immune response, cell recognition, and external side of plasma membrane. KEGG pathway analysis (Fig. 6H) showed that the DEGs were enriched in Chemical carcinogenesis - DNA adducts and Metabolism of xenobiotics by cytochrome P450, suggesting that these lncRNAs are involved in tumor development. These results suggest that the DEGs are associated with cancer development and provide new ideas about the biological mechanisms associated with amino acid metabolism-related lncRNAs. Next, we compared the infiltration levels of 13 immune-related gene sets between the high- and low-risk groups to assess the correlation between amino acid metabolism-related lncRNAs and immune features (Fig. 7A). All 12 gene sets that differed significantly had higher levels in the low-risk group: Type_II_IFN_Reponse, APC_co_stimulation, CCR, Parainflammation, T_cell_co-inhibition, Check-point, T_cell_co-stimulation, Cytolytic_activity, Inflammation-promoting, APC_co_inhibition, HLA, and Type_I_IFN_Reponse. Therefore, the amino acid metabolism-related lncRNAs included in our study showed identifiable patterns in terms of prognosis, TME characteristics and immune features in patients with hepatocellular carcinoma.
3.6. Tumor mutation burden analysis and TIDE analysis
We used TCGA somatic mutation data to calculate TMB scores and used the maftools package to examine mutations in the low-risk (Fig. 7B) and high-risk (Fig. 7C) groups. For most genes, mutations were more frequent in the high-risk group (92.13%) than in the low-risk group (69.95%). The most frequently mutated genes were TP53 (37%), CTNNB1 (35%) and TTN (25%) in the high-risk group and TP53 (16%), CTNNB1 (16%) and TTN (22%) in the low-risk group. In addition, TMB differed significantly between the high-risk and low-risk groups (Fig. 7D; p = 0.001). These findings suggest that the high-risk group has a higher degree of malignancy. Survival analysis further showed that patients with higher TMB had poorer OS (Fig. 7F). Given the prognostic role of both the risk score and TMB in hepatocellular carcinoma, we explored the prognostic value of combining the two by dividing all samples into four groups: high TMB/high risk, low TMB/low risk, high TMB/low risk, and low TMB/high risk (Fig. 7G). Survival differed significantly among the four groups (p < 0.001), with the high TMB/high risk group having the worst OS and the low TMB/low risk group the best. Finally, the Tumor Immune Dysfunction and Exclusion (TIDE) algorithm was used to assess the potential response to immunotherapy in the high- and low-risk groups. A higher TIDE score indicates a greater chance of immune escape and therefore a lower likelihood of benefiting from immunotherapy. TIDE scores were higher in the low-risk group than in the high-risk group (Fig. 7E), indicating a higher likelihood of immune escape and a lower benefit from immunotherapy for patients in the low-risk group.
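Enrichment of a gene list in a GO term or KEGG pathway is commonly assessed with an over-representation (hypergeometric) test. The text does not state which test was used, so the sketch below is an assumption about the general approach rather than a description of the study's method; the function and parameter names are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_in_term, n_degs, n_overlap):
    """One-sided over-representation p-value, P(X >= n_overlap).

    n_background: number of genes in the background set
    n_in_term:    background genes annotated to the term or pathway
    n_degs:       number of differentially expressed genes (644 in the text)
    n_overlap:    DEGs annotated to the term or pathway
    """
    upper = min(n_in_term, n_degs)
    total = comb(n_background, n_degs)
    # Sum the hypergeometric probabilities of seeing n_overlap or more hits.
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_degs - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total
```

In practice the p-values for all tested terms would also be adjusted for multiple testing, which is not shown here.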
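The four-group stratification that combines TMB with the risk group can be sketched as below. The text does not state how high and low TMB were defined, so the median cutoff used here is an assumption, as are the function name and the data layout.

```python
from statistics import median


def stratify_tmb_risk(tmb, risk_group):
    """Assign each patient to one of four combined TMB/risk groups.

    tmb:        dict mapping patient ID to TMB score
    risk_group: dict mapping patient ID to 'high' or 'low' (from the risk model)
    Assumption: TMB strictly above the cohort median counts as 'high TMB'.
    """
    cutoff = median(tmb.values())
    labels = {}
    for pid, value in tmb.items():
        tmb_level = "high TMB" if value > cutoff else "low TMB"
        labels[pid] = f"{tmb_level}/{risk_group[pid]} risk"
    return labels
```

Each of the four labels then defines one survival curve, so the comparison reported for the four groups corresponds to a four-curve survival analysis.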
3.7. Compound sensitivity analysis
We screened potentially effective antitumor drugs using the pRRophetic package and found that the IC50 of some compounds differed between the high- and low-risk groups of the model (p < 0.001). The IC50 values of drugs such as Sunitinib, Salubrinal, and Epothilone B were lower in the high-risk group than in the low-risk group, representing a higher sensitivity to these drugs in high-risk patients and a negative correlation between their NRAV expression levels and OIS scores (Fig. 8A-E). Drugs such as Embelin and FTI-277 showed the opposite pattern. Based on these results, we believe that this risk score model may be useful in guiding the clinical treatment of hepatocellular carcinoma.