Construction and verification of the prognosis model
The single-factor cox regression analysis was developed by the TCGA data set samples with 760 TME-related gene expression values as continuous variables. The P -value < 0.01 was used as a threshold. 50 genes were finally screent out. Protective genes with HR value less than 1 are favorable for prognosis, while risk genes with HR value greater than 1 are unfavorable for prognosis. 2 of the 50 genes are protective genes, and the remaining 48 genes genes are risk genes. The forest diagram of the top 20 genes with the smallest P-value among the 50 genes is shown in Fig. 1A.
After that, the 50 selected genes were subjected to LASSO Cox regression analysis. According to the lambda values of the LASSO Cox analysis, we have determined that the optimal number of genes is 15 (Fig. 1B, whereas the lambda value is the smallest). The 15 genes are BCAT1, RANBP1, CORO2B, RASIP1, EZH2, TCEAL4, FST, CD63, GEMIN7, SPOCK1, S100A11, CXCL3, RUVBL2, ETV5 and EID1. Then the gene expression was weighted with the regression coefficients of the LASSO Cox regression analysis to establish a risk score model for predicting patient survival, according to Risk Score = (0.1459482*GABARAPL2)+(0.1549568*SAR1A)+(0.1554117*ST13)+(0.1892597 *GAPDH)+(0.1121593*FADD)+(0.0857282*LAMP1). We calculated the risk score of each patient, and divided the samples of the TCGA data set and GEO data set into high-risk groups and low-risk groups according to the median of the risk scores. Based on the survival analysis, we found that in the TCGA data set and GEO validation set, high-risk laryngeal cancer samples have poor overall survival compared with lower-risk group (Fig. 1C).
In addition, it can be demonstrated from the time-dependent ROC that the AUC of the 1-year, 3-year, and 5-year survival period of the TCGA data set are 0.7478, 0.8688 and 0.7829 respectively (Fig. 1D), indicating the risk model can effectively predict the prognosis of patients with laryngeal cancer. At the same time, we found that the 15 genes were differentially expressed between the high and low-risk groups (Fig. 1E). In general, the risk score constructed by 15 TME-related genes: BCAT1, RANBP1, CORO2B, RASIP1, EZH2, TCEAL4, FST, CD63, GEMIN7, SPOCK1, S100A11, CXCL3, RUVBL2, ETV5, and EID1 can better predict the prognosis of patients with laryngeal cancer.
Risk score is an independent prognostic marker of HCC
We included age, sex, TNM stage, HPV index and risk score in multivariate Cox regression analysis to determine whether the risk score was an independent prognostic indicator. The results are shown in Fig. 2A. It was found that risk score was significantly associated with overall survival, and the samples with high risk score had a higher risk of death and were unfavorable for prognosis (HR = 5.50, 95% CI : 3.54–8.6,P < 0.001).
In order to further explore the prognostic value of risk score in laryngeal cancer patients with different clinicopathological factors (including age and TNM stage), we regrouped patients, performing a Kaplan-Meier survival analysis. It could be demonstrated that the overall survival rate of the high-risk group is significantly lower than that of the low-risk group of the samples in different ages and stages (Fig. 2B-2C). These results confirm that the risk score can be used as an independent indicator to predict the prognosis for HCC patients.
Nomogram model can better predict the prognosis of HCC patients
We use four independent prognostic factors: age, gender, radiotherapy status as well as risk score to construct the nomogram model (Fig. 3A). For each patient, draw three lines upwards to determine the points obtained from each factor in the Nomogram. The "Total Points" axis is determined by the sum of these points, of which draw a line to generate the probability of HCC patients surviving 1, 3, and 5 years. The one-year and two-year corrected curves in the calibration chart are relatively close to the ideal curves (the 45-degree line that passes through the origin of the coordinate axis with a slope of 1), indicating that the predicted results of the model in one, three, and five years are in good agreement with the actual results (Fig. 3B-3D).
Differential gene analysis and functional enrichment results of patients with laryngeal cancer patients
In order to further investigate the differences of gene expression between high-risk and low-risk groups, we conducted a differential gene analysis and a functional enrichment analysis between the two sets of samples. Compared to the low-risk group, a total of 673 genes display a specific expression manner, including 583 up-regulated genes and 90 down-regulated genes (Fig. 4A).
We performed GO and KEGG enrichment analysis for these 673 genes. It was found that these 673 genes were significantly enriched in GO terms such as extracellular matrix organization and KEGG Pathway such as ECM-receptor interaction. The top 20 most significantly enriched genes and pathways are shown in Fig. 4B and 4C.
Immune status of laryngeal cancer patients in high and low-risk groups
We used the CIBERSORT method combined with LM22 feature matrix to estimate the difference of immune infiltration between 22 immune cells in high-risk and low-risk groups of patients with laryngeal cancer. Figure 5A summarized the results of immune cell infiltration in 169 laryngeal cancer patients. There are significant differences in the infiltration ratios of 6 types of immune cells, such as B.cells.naive, between high and low-risk groups (Fig. 5B).
Since the expression of immune checkpoints has become a biomarker of immunotherapy for laryngeal cancer patients, we analyzed the correlation between patient risk score and key immune checkpoints (CTLA4, PDL1, LAG3, TIGIT, IDO1, TDO2). It could be seen that the risk score is closely associated with all the 6 checkpoints, which demonstrate the significant differences between the high and low-risk groups of laryngeal cancer patients (Fig. 5C).