Patient Characteristics
A total of 412 patients are eligible for this study. There were 336 males (81.6%) and 76 females (18.4%). Most subjects had a current or history of smoking (72.1%). As shown in Table 1, 52.2% of patients had supraglottic squamous cell carcinoma, and 47.8% of patients had glottal laryngeal squamous cell carcinoma. Most patients have localized early tumors (T1 or T2) (65.3%), but most of them are moderately or poorly differentiated (59.0%) (Tab. 1).
The prognostic value of tumor local inflammation infiltration indicators peripheral blood inflammatory markers
In our study, 226 patients had lower iTILs (Fig. 2a), 186 patients had higher iTILs (Fig. 2b). The 5-year OS rate of the high iTILs group (50.5%) was significantly higher than that of the low iTILs group (42.5%, P<0.001, Figure 3a). According to the TILv stratification of patients, 126 patients had lower TILv (Fig. 2c), and 286 patients had higher TILv (Fig. 2d). The 5-year OS rate of the high TILv group (48.3%) was significantly higher than the low TILv group (41.3%, P<0.001, Fig. 3b). Lastly, we stratified the patients again according to TILf, 142 patients having lower TILf (Fig. 2e) and 270 patients having higher TILf (Fig. 2f). The 5-year OS rate of the high TILf group (47.8%) was significantly higher than the low TILv group (42.9%, P<0.001, Fig. 3c). In conclusion, the levels of iTILs, TILv, and TILf are closely related to the improvement of patient survival outcomes. In addition, we also evaluated the prediction accuracy of the three indicators for OS, and the ROC curve shows the good prediction performance of our three indicators (Fig. 3d). The statistical analysis of the relationship between iTILs, TILv, and TILf levels and the disease recurrence did not find any significant correlation. In the univariate cox analysis, BMI ≥ 24, poorly differentiated, supraglottic carcinoma, high T, N stage or TNM, low iTILs, TILv, and TILf levels were identified as predictors of poor prognosis. Next, we included statistically significant indicators into the multivariate analysis, and the results showed that BMI (P=0.041, HR: 0.642, 95% CI: 0.420–0.983), iTILs (P<0.001, HR: 0.502, 95% CI: 0.338-0.746), TILv (P<0.001, HR: 0.462, 95% CI: 0.318-0.672) and TILf (P=0.004, HR: 0.585, 95% CI: 0.405-0.845) levels showed significant statistical significance (Tab. 1). Therefore, we can consider that high iTILs, TILv, and TILf levels are independent predictors of a good prognosis.
The Kaplan-Meier curve is shown in Figure e-h. The 5-year OS rate of the high lymphocyte-to-monocyte ratio (LMR), platelet-to-lymphocyte ratio (PLR), neutrophil-to-lymphocyte ratio (NLR)and systemic immune-inflammation index (SII, platelets × neutrophils/lymphocytes) group was significantly higher than that of the low group (all P<0.05). In the univariate Cox analysis, high PLR, NLR, and SII were significantly correlated with better OS (Tab. 1). In the multivariate Cox analysis, PLR is still statistically associated with better OS in LSCC patients (P<0.05, HR: 0.585, 95% CI: 1.000-1.006), which indicates that PLR is LSCC independent prognostic indicators (Tab. 2).
In the TCGA database, the TILs of LSCC patients are related to the prognosis
We first retrieved the transcriptome data and clinical characteristics of LSCC from the TCGA database. Excluding samples with repeated sequencing and zero survival time, a total of 111 tumor samples were included in the study. We used the GTF file to annotate the probes and the “mean” method to delete duplicate gene names. According to immune gene sets, analyze tumor samples by immune-related single-sample gene set enrichment analysis (ssGSEA) and use heat maps for visualization(Fig. 4a). Next, the samples were grouped according to the enrichment of TILs, and the KM curve showed that LSCC patients with high TILs had longer survival times than those with low TILs (P<0.001) (Fig. 4b). We continued to use the Pearson correlation test to evaluate the correlation between TILs and survival time. As shown in Figure 4c, the survival time of patients gradually increased with the increase of TILs score (R=0.23, P=0.013).
Screening of gene co-expression modules related to TILs and functional analysis of module characteristic genes
We eliminated genes with low expression levels and selected the top 25% of the genes with variance variation rates. We cluster the samples to identify whether there are any obvious outliers. The height cutoff value was set to 120, and 102 samples were included in our analysis (Fig. S2a). After preliminary sorting, we constructed the WGCNA co-expression module of 6322 genes from 102 samples. To construct a WGCNA network, we first calculated the soft thresholding power β, to which the co-expression similarity is raised to calculate adjacency. We used the pick Soft Threshold function in WGCNA for the network topology analysis. The soft thresholding power β was set at 5 in the subsequent analysis because the scale independence reached 0.9 and had relatively high-average connectivity (Fig. S2b). We constructed the gene network and identified modules using the one-step network construction function of the WGCNA R package. To cluster splitting, the soft thresholding power was set at 5, the minimum module size was set at 30, and the deep Split was set at 2 (which implies a medium sensitivity). Finally, 17 genes co-expression modules were constructed (Fig. S2c). We mapped the relationships between the identified modules (Fig. S2d). The heatmap depicts the topological overlap matrix (TOM) among all genes included in the analysis. The light color represents a high overlap, and the progressively darker red color represents a decreasing overlap. The results of this analysis indicated that the gene expression was relatively independent between modules. We correlated modules with the TILs and immune checkpoints (ICPs) scores of LSCC samples and search for the most important associations. The analysis results show that the blue module is significantly associated with TILs and ICPs (P<0.0001) (Fig. 5a, b, and c).
The Venn diagram shows the intersection of the ICPs-related gene set and the TILs-related gene set (Fig. 5d), and finally 802 genes are obtained. We found that these genes are mainly concentrated in chr1 and chr19 (Fig. S2e). In order to understand the composition of these genes in the cell and the biological functions they perform, we performed GO and KEGG enrichment analysis on 802 WGCNA hub genes. KEGG results showed that they are mainly involved in the regulation of pathways including Cytokine-cytokine receptor interaction, chemokine signaling pathway, T cell, and B cell receptor signaling pathway, PD-L1 expression, and PD-1 checkpoint pathway in cancer (Fig. 5e). GO results show that 802 WGCNA hub genes are mainly enriched in Biological Processes, such as T cell activation, regulation of immune effector process and lymphocyte proliferation; Cellular Component, such as MHC protein complex, immunological synapse, and MHC class II protein complex and molecular functions such as immune receptor activity, cytokine receptor binding, and cytokine receptor activity (Fig. 5f). The P values of the above enrichment analysis results were all P<0.05.
Predictive model construction and evaluation
We included 111 TCGA samples in the survival correlation analysis. Univariate Cox analysis was performed on these WGCNA hub genes (P <0.05), and 69 WGCNA hub genes related to prognosis were obtained (Fig. S3a). In order to prevent overfitting, Lasso Cox regression analysis was performed on these prognostic WGCNA hub genes, and 19 WGCNA hub genes were obtained (Fig. S3b, c). We selected 19 WGCNA hub genes to construct features and obtained a well-balanced prognostic model (The list of immunization pathways and coefficients of WGCNA hub genes is shown in Table S1). We separately constructed the ROC curves of WGCNA hub genes signature, TNM stage, age, and gender. The area under the curve (AUC) of the WGCNA hub genes signature is 0.790 (Fig. 6a), which shows that our signature has excellent predictive power. Then, we plotted the ROC curve of WGCNA hub genes signature over time. We found that the areas under the WGCNA hub genes signature curve are: 1 year: 0.811; 3 years: 0.953; 5 years: 0.939. This shows that our model has a good predictive ability for patients with survival (Fig. 6b). We used the time-dependent ROC to determine the cutoff value for the best WGCNA hub genes signature. The best cutoff value for WGCNA hub genes signature is 3.510 (Fig. S3d). According to the cutoff value, patients were divided into high-risk groups and low-risk groups. We found that compared to patients in the low-risk group, the overall survival rate of patients in the high-risk group was significantly lower (Fig. 6c). Subsequently, we drew a survival status graph that included survival time and risk score regression curves. The results showed that as the risk score increased, the mortality rate of patients gradually increased, and the survival time gradually shortened (Fig. 6d). Moreover, as the risk score increases, the expression values of genes such as NUPR2, MT-ATP8, and AQP9 increase. On the contrary, the expression value of genes such as KDM5D, MAP3K14, and TNFRSF4 decreased with the increase of risk score (Fig. 6e).
ICGC database verification model
In the ICGC-GENEs data set, the risk scores of 85 LSCC samples were calculated according to the same method. We drew the ROC curve of the signature over time. We found that the areas under the ICGC-GENEs signature curve are: 1 year: 0.628; 3 years: 0.815; 5 years: 0.821. This shows that our model still has excellent predictive capabilities in the validation set (Fig. S4a). The patients were divided into high-risk groups and low-risk groups. As shown in Fig. S4b, patients in the low-risk group have a longer overall survival. Next, we used the same method to draw a survival status graph with regression curves. The results showed that as the risk score increased, the patient’s mortality rate gradually increased, and the survival time gradually shortened(Fig. S4c). This is consistent with the results we obtained in the TCGA database.
Correlation between predictive models and clinical characteristics
We evaluated the prognostic value of the LSCC risk score. In the univariate analysis, we found OS and risk score (HR = 5.658, 95% CI =3.430-9.331, P<0.001), M staging (HR =8.355, 95% CI=1.912-36.513, P<0.01), N staging (HR =1.437, 95% CI =1.042-1.980, P<0.05), gender (HR =3.142, 95% CI=1.511-6.533, P<0.05) were significantly correlated (Fig. 7a). Multivariate analysis showed that risk score (HR =6.156, 95% CI =3.644-10.401, P<0.001) and M stage (HR =12.416, 95% CI =2.714-56.799, P=0.01) were independent predictors of OS (Figure. 7b). In order to further improve the prediction accuracy, we constructed a new nomogram prediction map based on the WGCNA hub genes (Fig. 7c). The nomogram C-index is 0.835. By calculating the total score, oncologists can easily obtain the probability of OS predicted by the nomogram of a single patient. Besides, we used the calibration curve to evaluate the model’s prediction accuracy. The results showed that the prediction calibration curve of the three calibration points in 1, 3, and 5 years was close to the standard curve, which indicated that the model had good predictive performance (Fig. 7d). We also used the DCA to evaluate the reliability of the model (Fig. 7e). It can be seen that the profit of this model was significantly higher than the limit curve and clinical characteristic; that is to say, it had good reliability.
The relationship between immune cell infiltration and risk model
We explore the difference in immune cell infiltration between the two groups. Based on the ESTIMATE algorithm, we first calculated the immune score for each LSCC sample. As shown in Fig. 8a and 8b, compared to the high-risk group, the immune score of the low-risk group (P=0.029) is higher, and it is negatively correlated with the risk score (the correlation coefficient is -0.36, P<0.001). Next, we used the MCP counting method to calculate the abundance of 10 immune-related cells. Compared to the high-risk group, the low-risk group had more abundant four types of cell populations (CD8 T cells, Myeloid dendritic cells, Neutrophils, T cells) (Fig. 8c). We further explored the relationship between immune cell infiltration and risk score. The results showed that the degree of immune cell infiltration was significantly negatively correlated with the risk score (Fig. 8d). We use CIBERSORT to further supplement the relative proportion of 22 immune infiltrating cells in each sample (Figure 8e). The relative proportions of T cells CD8 and T cells CD4 memory resting in the low-risk group were higher. The relative proportions of Macrophages M0 and Eosinophils in the high-risk group were higher. There is a big difference in immune cell infiltration between high-risk and low-risk groups. It is worth noting that the radar chart shows that T cell CD4 memory cessation and M0 macrophage infiltration rate are higher in all patients.
Mutational situation in the ICGC database and Assessment of response to immunotherapy
The association of immune subtypes with mutational status higher tumor mutation burden (TMB) and somatic mutation rates are correlated to stronger anticancer immunity[30]. Therefore, we calculated the TMB and mutations in each patient using the mutect2-processed mutation dataset of ICGC, and analyzed the difference between the mutation and TMB in the high- and low-risk groups. The waterfall chart shows the overall mutation status of the top 20 genes in the high- and low-risk groups (Fig. 9a and 9b). Compared to the high-risk group, the low-risk group showed a higher mutation rate. Next, we plotted the TMB situation in the high and low risk groups. Compared to the high-risk group, the low-risk group showed higher TMB (Fig. 9c). These findings suggest that our prediction model can predict the TMB and somatic mutation rates of LSCC patients, and that patients in the low-risk group may have a positive response to immunotherapy.
The application of ICIs for tumor immunotherapy has become a promising treatment for recurrent and metastatic LSCC. In order to further study the relationship between risk models and immunotherapy, we explored the correlation between risk models and ICIs-related biomarkers. The results showed that the expression levels of LAG3 and PDCD1 were up-regulated in the low-risk group (both P<0.05), and the risk score was similar to LAG3 (r = -0.40, P<0.05), PDCD1 (r = -0.40, P<0.05) and CD274 (r = -0.20, P <0.05) was negatively correlated, indicating that the low-risk group benefited more from immunotherapy (Fig. 9d and 9e).
Gene set enrichment analysis of the risk model
Since the risk model obtained is highly correlated with the prognosis, immune microenvironment, and treatment effect of LSCC, we tried to explore their functional meaning and internal connection through subsequent enrichment analysis. The results of KEGG functional enrichment analysis showed that the differentially expressed genes of high-risk and low-risk populations were significantly enriched in 9 KEGG pathways (P <0.05). Among them, the high-risk group had the most abundant TGF beta signaling pathway and ECM receptor interaction, and the low-risk group had the most significant way to produce primary immunodeficiency and intestinal immune network for IgA production (Fig. 10a). To investigate if the signature was associated with tumor biological processes, Gene Set Enrichment Analysis (GSEA) was performed. As shown in Fig. 10b, a total of 9 hallmark gene sets were enriched (P<0.05). Four hallmarks (including HYPOXIA,KRAS_SIGNALING_UP AND MTORC1_SIGNALING) were associated with high-risk scores, suggesting that the activation of these biological processes may participate in LSCC progression. In contrast, the other 5 hallmarks (INTERFERON_ALPHA_RESPONSE, INTERFERON_GAMMA_RESPONSE, PI3K_AKT_MTOR_ SIGNALING, IL6_JAK_STAT3_SIGNALING, DNA_REPAIR and G2M_CHECKPOINT) were associated with low-risk scores, suggesting that their activation inhibits tumor progression and improves survival in LSCC patients.
Immunohistochemistry
In our study, 112 patients had a high CTSL expression by employing IHC experiments (27.2%), another 300 patients had a low CTSL expression (72.8%, Fig. 11a). The OS rate of patients with high CTSL expression was significantly lower than the low CTSL expression group (P<0.001, Fig. 11b). According to the KDM5D stratification of patients, 270 patients had high KDM5D expression (65.5%), 275 patients had low KDM5D expression (34.5%, Fig. 11a). The OS rate of patients with high KDM5D expression was significantly higher than the low KDM5D group (P<0.001, Fig. 11c). In addition, in the correlation analysis of TILs, we found that CTSL was negatively correlated with TILs (P<0.05), and KDM5D was significantly positively correlated with TILs (P<0.001) (Fig. 11d and 11e).