Mutation and copy number variation (CNV) of TRGs in HNSC
Somatic mutations were found in 33 TRGs in 80 (15.69%) of the 510 samples. The mutation frequencies from high to low were AHNAK, LIG3, LTBR, ITM2A, CXCL12, IL12B, CLIC1, and DBI, while the other TRGs did not have any mutations (Fig. 1a). Moreover, 33 TRGs of somatic CNV were found to be common. CNV increased significantly in AHNAK, MS4A3, DBI, LTBR, MRPL51, IL1RN, and AHCY and decreased significantly in AHNAK, MS4A3, DBI, LTBR, MRPL51, IL1RN, and AHCY (Fig. 1b). The position of the CNV on chromosomes in TRGs is shown in Fig. 1c. Factors that regulate gene expression include CNV, DNA methylation, transcription factors, among others [16,17]. The analysis results showed that mRNA expression of HNSC and normal tissue samples was significantly different (Fig. 1d), indicating a potential role of TRGs in HNSC genesis.
TRGs subtypes were obtained by consensus clustering
The research process is illustrated in Figure S1. The characteristics of the included patients with HNSC were shown in Table S1. First, K–M and univariate Cox regression analyses were performed to determine the effects of the 33 TRGs on the prognosis of patients with HNSC (Fig. S2 and Table S2,3), and TRGs regulatory connections and interactions were demonstrated through the TRGs network (Fig. 2a). The consensus clustering algorithm was used in this study to obtain TRGs subtypes to analyze the expression characteristics of prognostic TRGs in HNSC (Fig. S3). k = 2 was the best choice, that is, subtypes A (n = 507) and B (n = 264) (Fig. 2b). The expression of the two TRG subtypes was significantly different, as shown by the PCA analysis (Fig. 2c). Crucially, the K–M curve showed that patients with subtype B had better survival (log-rank test, p = 0.005; Fig. 2d). Finally, the heat map showed significant differences in TRGs expression levels among the different subtypes (Fig. 2e).
TME differences in TRGs subtypes
First, according to the GSVA results, subtype B was significantly abundant in the immune activation (Fig. 3a-b; Table S4). The abundance of multiple immunoinfiltrating cells (Table S5), showed that the infiltration of most immune cells of subtype B was significantly higher (Fig. 3c). Similarly, immune checkpoint analysis showed significantly elevated expression of subtype B at all eight immune checkpoints (Fig. 3d). Finally, the immune scores were evaluated, and the analysis showed that patients with subtype B with better prognosis still had higher immune scores in the TME (Fig. 3e).
Screening of DEGs and search for gene subtypes
Through "limma" package, 1270 DEGs were identified based on TRG subtypes, set the standard for a false discovery rate (FDR) < 0.05 and |logFC| > 0.585. Using GO and KEGG, biological functions and pathways of DEGs were identified (Fig. 4a-b; Table S6), these 1270 genes were mainly related to immune-related biological function and pathways. Finally, to analyze the prognostic value of 1270 DEGs, 307 DEGs related to survival were screened, and the samples were divided into three genomic subtypes: A, B, and C (Fig. 4c). The heat map and boxplot showed significant differences in the expression of DEGs and TRGs in the three gene subtypes (Fig. 4d-e). These results demonstrated that TRGs play a fundamental role in immune-related processes and prognosis.
Established of risk score
Fig. 5a shows the allocation of patients with HNSC into two TRG subtypes, three genetic subtypes, and two risk groups. First, 307 prognostic-related DEGs were analyzed by LASSO, and 44 survival-related genes were obtained (Fig. 5b-c). Then, 23 genes were finally obtained by multivariate Cox regression analysis (Table S7), including 11 high-risk genes (CCL19, ATP2A3, SH3KBP1, ARL4D, CPVL, LPL, ULBP2, HOXC13, SFRP2, IGFL1, and CSF2), and 12 low-risk genes (CTLA4, ANKRD44, FGD3, CLEC3B, LIMD2, and CTSG, NKX23, DSC3, PTPRZ1, ODAM, SPINK6, and TFF3). Moreover, the risk score for subtype C was the lowest among the three gene subtypes (Fig. 5d). Of the two TRG subtypes, subtype B with a higher immune score had a lower risk score; therefore, a lower risk score might be associated with immune activation (Fig. 5e). In particular, the expression levels of most TRGs differed significantly between the two risk groups (Fig. 5f) and the gene expression results were similar to those previously reported.
Evaluation of prognostic models
Risk ranking points and scatter plots showed that patients with high-risk scores were more likely to die, and heat maps were used to show the differential expression of 23 prognostic-related genes (Fig. 6a-c). KM showed that the prognosis in the low-risk group was significantly better (log-rank test, p <0.001; Fig. 6d-f). Analysis of 1-year, 3-year, and 5-year prognostic efficiency showed that the risk score had a higher AUC value, suggesting that the risk score had better predictive power for survival in HNSC (Fig. 6 g-i). These results were consistent across the training, testing, and across all sets.
Establishment of a nomogram
Forest maps illustrated that age, clinical stage, and risk score were important prognostic factors for patients with HNSC (Fig. 7a, b). These factors were used to construct nomogram models for HNSC (Fig. 7c). For example, the 1-year, 3-year, and 5-year survival rates of a 65-year-old patient with stage III-IV disease were 71.7%, 36.8%, and 22.8%, respectively. Finally, calibration plots showed that the nomogram had good prediction ability (Fig. 7d).
TME and immune checkpoint with risk score
First, scatter plots showed that immune cells were significantly associated with the risk scores (Fig. 8a). The heat map showed that most immune cells were significantly associated with the 23 prognostic genes (Fig. 8b). Finally, the boxplot showed that the expression of most immune checkpoints was significantly correlated with risk scores (Fig. 8c).
CSC, mutation and drug sensitivity
Figure 9a shows that HNSC cells with lower risk scores were characterized by more stem cells. The results of Figure 9b-c indicated that 226 (97%) of 233 high-risk groups had gene mutations and 238 (90.84%) of 262 low-risk groups had gene mutations, among which TP53 and TTN had the highest mutation frequency. However, the boxplot and scatter plot (Fig. 9d-e) showed no significant difference between the risk score and TMB, suggesting that the potential role of mutations between different risk groups may not be significant. Finally, we found that the low-risk group had a significantly lower IC50 for chemotherapy drugs such as methotrexate, mitomycin, and vinorelbine, suggesting that the risk score may also be potentially related to drug sensitivity (Fig. 9f-i).