Development of CSC gene-related signature in patients with HNSCC
Among 28 CSC genes encoding CSC biomarker proteins validated in HNSCC [10], we selected seven (CD44, MET, ALDH1A1, BMI1, PROM1, SOX2, and POU5F1) through a literature search. Each satisfied the following criteria: (a) expression of the corresponding CSC biomarker protein showed clinical significance in HNSCC and (b) the gene had been studied more than twice over the past 10 years [5, 8, 9, 23–31]. In the TCGA cohort, high expression of four of these genes (CD44, MET, ALDH1A1, and BMI1) was significantly associated with patient prognosis (p = 0.0069, 0.0051, 0.028, and 0.021, respectively; Fig. S1). Thus, these four genes were selected as reference CSC genes.
We identified the genes whose mRNA expression was correlated with at least one of the four reference CSC genes in the TCGA cohort. A total of 81 genes were identified and designated the CSC gene-related signature (Fig. S2A, Table S1). Using the CSC gene signature, patients in the TCGA cohort (n = 566) were divided into the CSC-HR (n = 285) and CSC-LR (n = 281) subgroups (Fig. 1A). The CSC-HR subgroup showed significantly higher mRNA expression of CD44 and MET than the CSC-LR subgroup (p < 0.0001 for both) and significantly lower mRNA expression of ALDH1A1 and BMI1 (p < 0.0001 for both). Using Kaplan–Meier analysis and the log-rank test, we confirmed that the CSC-HR subgroup showed significantly lower 5-year OS and RFS rates than the CSC-LR subgroup (p = 0.04 and p = 0.02, respectively; Fig. 1B-C).
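The correlation-based selection step can be illustrated with a minimal sketch. It assumes Pearson correlation on absolute value and a cutoff of 0.8; neither the correlation measure nor the cutoff is stated in this section, so both are hypothetical, as are the toy expression values and the names `GENE_A`, `GENE_B`, and `GENE_C`.

```python
import math

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def correlated_genes(expr, reference_genes, min_abs_r):
    """Return genes correlated with at least one reference gene.

    expr maps gene name -> list of expression values (one per patient).
    min_abs_r is an assumed cutoff on |r|, not a value from the paper.
    """
    selected = []
    for gene, values in expr.items():
        if gene in reference_genes:
            continue  # reference genes are not candidates themselves
        if any(abs(pearson(values, expr[ref])) >= min_abs_r
               for ref in reference_genes):
            selected.append(gene)
    return selected

# Toy expression matrix for five hypothetical patients.
toy_expr = {
    "CD44":   [1, 2, 3, 4, 5],
    "MET":    [2, 1, 4, 3, 5],
    "GENE_A": [2, 4, 6, 8, 10],   # rises with CD44
    "GENE_B": [3, 1, 3, 1, 3],    # only weakly related to either reference
    "GENE_C": [5, 6, 3, 4, 2],    # falls as MET rises
}
signature = correlated_genes(toy_expr, ["CD44", "MET"], min_abs_r=0.8)
```

In this toy setting a gene enters the signature if it tracks any one reference gene, in either direction, which mirrors the "at least one of the four reference CSC genes" rule.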
Independent validation of CSC gene-related signature
The CSC gene signature was validated using four independent cohorts: the Leipzig (n = 270), FHCRC (n = 97), MDACC (n = 74), and Greece (n = 109) cohorts (Fig. S2B). The clinical and pathological characteristics of each cohort used in this study are shown in Table 1. Patients in each validation cohort were classified into the CSC-HR and CSC-LR subgroups based on the CSC gene signature. The CSC-LR subgroup in each validation cohort showed better prognosis than the CSC-HR subgroup (Fig. 2). In the Leipzig cohort, the CSC-HR subgroup showed lower 5-year OS rates than the CSC-LR subgroup, although the difference did not reach statistical significance (p = 0.06; Fig. 2A). The 5-year OS rates were significantly lower in the CSC-HR subgroups than in the CSC-LR subgroups in the FHCRC and MDACC cohorts (p < 0.0001 and p = 0.02, respectively; Fig. 2B-C). Similarly, in the Greece cohort, the CSC-HR subgroup showed significantly lower 5-year RFS rates than the CSC-LR subgroup (p = 0.009; Fig. 2D).
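The subgroup comparisons above rest on Kaplan–Meier estimates and the log-rank test. The following is a minimal standard-library sketch of both, run on made-up follow-up times; it is not the software used in the study, and the function names and toy data are hypothetical.

```python
import math

def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times: follow-up times; events: 1 if the event occurred, 0 if censored.
    """
    survival, curve = 1.0, []
    for t in sorted({t for t, e in zip(times, events) if e}):
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if u == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve

def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square with 1 df, p-value)."""
    event_times = sorted({t for t, e in zip(times_a + times_b,
                                            events_a + events_b) if e})
    observed_minus_expected, variance = 0.0, 0.0
    for t in event_times:
        n_a = sum(1 for u in times_a if u >= t)
        n_b = sum(1 for u in times_b if u >= t)
        d_a = sum(1 for u, e in zip(times_a, events_a) if u == t and e)
        d_b = sum(1 for u, e in zip(times_b, events_b) if u == t and e)
        n, d = n_a + n_b, d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:  # hypergeometric variance is undefined for n == 1
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value

# Hypothetical follow-up (months): group A fails early, group B late.
high_risk_t, high_risk_e = [1, 2, 3, 4, 5], [1, 1, 1, 1, 1]
low_risk_t, low_risk_e = [6, 7, 8, 9, 10], [1, 1, 1, 1, 1]
chi2, p_value = logrank_test(high_risk_t, high_risk_e, low_risk_t, low_risk_e)
```

Censored patients stay in the risk set until their last follow-up and never count as events, which is why both functions separate `times` from `events`.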
Table 1
Clinical and pathological characteristics of the five independent HNSCC cohorts.
Characteristics | TCGA cohort (n = 566) | Leipzig cohort (n = 270) | FHCRC cohort (n = 97) | MDACC cohort (n = 74) | Greece cohort (n = 109) |
Age | | | | | |
≥ 60 | 316 (55.83%) | 117 (43.33%) | 47 (48.45%) | 37 (50.00%) | 74 (67.89%) |
< 60 | 249 (43.99%) | 153 (56.67%) | 50 (51.55%) | 37 (50.00%) | 35 (32.11%) |
Unknown | 1 (0.18%) | 0 | 0 | 0 | 0 |
Sex | | | | | |
Male | 415 (73.32%) | 223 (82.59%) | 66 (68.04%) | 58 (78.38%) | 104 (95.41%) |
Female | 151 (26.68%) | 47 (17.41%) | 31 (31.96%) | 16 (21.62%) | 5 (4.59%) |
Smoking | | | | | |
Yes | 423 (74.73%) | 222 (82.22%) | NA | 59 (79.73%) | 108 (99.08%) |
No | 128 (22.61%) | 48 (17.78%) | NA | 15 (20.27%) | 1 (0.92%) |
Unknown | 15 (2.65%) | 0 | NA | 0 | 0 |
Alcohol | | | | | |
Yes | 371 (65.55%) | 239 (88.52%) | NA | NA | 58 (53.21%) |
No | 182 (32.16%) | 31 (11.48%) | NA | NA | 51 (46.79%) |
Unknown | 13 (2.30%) | 0 | NA | NA | 0 |
Tumor site | | | | | |
Oral cavity | 346 (61.13%) | 83 (30.74%) | 97 (100%) | 71 (95.95%) | 0 |
Oropharynx | 82 (14.49%) | 102 (37.78%) | 0 | 3 (4.05%) | 0 |
Larynx | 128 (22.61%) | 48 (17.78%) | 0 | 0 | 109 (100%) |
Hypopharynx | 10 (1.77%) | 33 (12.22%) | 0 | 0 | 0 |
Unknown | 0 | 4 (1.48%) | 0 | 0 | 0 |
T classification | | | | | |
T1-T2 | 218 (38.52%) | 115 (42.59%) | NA | 30 (40.54%) | NA |
T3-T4 | 344 (60.78%) | 155 (57.41%) | NA | 44 (59.46%) | NA |
Unknown | 4 (0.71%) | 0 | NA | 0 | NA |
N classification | | | | | |
Negative | 295 (52.12%) | 94 (34.81%) | NA | 42 (56.76%) | NA |
Positive | 267 (47.17%) | 176 (65.19%) | NA | 32 (43.24%) | NA |
Unknown | 4 (0.71%) | 0 | NA | 0 | NA |
Stage | | | | | |
I-II | 135 (23.85%) | 55 (20.37%) | 41 (42.27%) | 19 (25.68%) | 30 (27.52%) |
III-IV | 417 (73.67%) | 215 (79.63%) | 56 (57.73%) | 55 (74.32%) | 79 (72.48%) |
Unknown | 14 (2.47%) | 0 | 0 | 0 | 0 |
HPV status | | | | | |
Positive | 68 (12.01%) | 60 (22.22%) | 0 | NA | NA |
Negative | 274 (48.41%) | 209 (77.41%) | 97 (100%) | NA | NA |
Unknown | 224 (39.58%) | 1 (0.37%) | 0 | NA | NA |
Radiotherapy | | | | | |
Yes | 304 (53.71%) | NA | NA | 47 (63.51%) | 54 (49.54%) |
No | 171 (30.21%) | NA | NA | 26 (35.14%) | 43 (39.45%) |
Unknown | 91 (16.08%) | NA | NA | 1 (1.35%) | 12 (11.01%) |
Treatment | | | | | |
Unimodal | 188 (33.22%) | 78 (28.89%) | 43 (44.33%) | 25 (33.78%) | 43 (39.45%) |
Multimodal | 278 (49.12%) | 189 (70.00%) | 53 (54.64%) | 48 (64.87%) | 54 (49.54%) |
Palliative | 1 (0.18%) | 3 (1.11%) | 0 | 0 | 0 |
Unknown | 99 (17.49%) | 0 | 1 (1.03%) | 1 (1.35%) | 12 (11.01%) |
CSC gene signature | | | | | |
CSC-HR subgroup | 285 (50.35%) | 122 (45.19%) | 38 (39.18%) | 47 (63.51%) | 57 (52.29%) |
CSC-LR subgroup | 281 (49.65%) | 148 (54.81%) | 59 (60.82%) | 27 (36.49%) | 52 (47.71%) |
HNSCC, head and neck squamous cell carcinoma; TCGA, The Cancer Genome Atlas; FHCRC, Fred Hutchinson Cancer Research Center; MDACC, MD Anderson Cancer Center; CSC, cancer stem cell; CSC-HR, CSC gene-associated high-risk; CSC-LR, CSC gene-associated low-risk; NA, not available
CSC gene signature as an independent prognostic factor of HNSCC
To identify independent prognostic factors in patients with HNSCC, we fitted univariate and multivariate Cox proportional hazards models that included the CSC gene signature, patient demographics, social history, HPV status, and clinical stage in the combined TCGA and FHCRC cohorts (n = 663). The CSC gene signature (CSC-HR vs. CSC-LR subgroup), HPV status (negative vs. positive), and advanced clinical stage (stage III and IV vs. stage I and II) were independent prognostic factors for OS in patients with HNSCC (p = 0.0086, 0.0031, and 0.0035, respectively; Table S2).
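To make the Cox analysis concrete, the sketch below fits a univariate model with a single binary covariate (1 = CSC-HR, 0 = CSC-LR) by Newton-Raphson on the partial likelihood, using the Breslow form for ties. It is a simplified illustration only: the multivariate models reported here would require a statistics package, and the function name and the ten toy patients are hypothetical.

```python
import math

def cox_univariate(times, events, x, max_iter=50, tol=1e-10):
    """Fit h(t | x) = h0(t) * exp(beta * x) for one covariate.

    Returns (beta, standard_error, two_sided_wald_p).
    Assumes the data are not perfectly separated, so beta is finite.
    """
    beta, info = 0.0, 0.0
    for _ in range(max_iter):
        score, info = 0.0, 0.0
        for t_i, e_i, x_i in zip(times, events, x):
            if not e_i:
                continue  # censored subjects only appear in risk sets
            # Risk set: everyone still under observation at time t_i.
            risk = [x_j for t_j, x_j in zip(times, x) if t_j >= t_i]
            s0 = sum(math.exp(beta * x_j) for x_j in risk)
            s1 = sum(x_j * math.exp(beta * x_j) for x_j in risk)
            s2 = sum(x_j * x_j * math.exp(beta * x_j) for x_j in risk)
            score += x_i - s1 / s0               # first derivative
            info += s2 / s0 - (s1 / s0) ** 2     # minus second derivative
        step = score / info
        beta += step
        if abs(step) < tol:
            break
    se = 1.0 / math.sqrt(info)
    p_value = math.erfc(abs(beta / se) / math.sqrt(2))  # Wald test
    return beta, se, p_value

# Hypothetical cohort: follow-up time, event indicator, subgroup (1 = CSC-HR).
toy_times = [2, 3, 5, 7, 11, 4, 6, 8, 10, 12]
toy_events = [1, 1, 1, 0, 1, 1, 0, 1, 1, 0]
toy_group = [1, 1, 1, 1, 1, 0, 0, 0, 0, 0]
beta, se, p_value = cox_univariate(toy_times, toy_events, toy_group)
hazard_ratio = math.exp(beta)
```

A hazard ratio above 1 means the group coded 1 has the higher estimated hazard; swapping the 0/1 coding flips the sign of `beta` and inverts the ratio.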
Association of CSC gene signature with HPV status of HNSCC
We reasoned that a survival analysis stratified by HPV status could help identify the patients in whom the CSC gene signature is most useful for predicting prognosis. Thus, we analyzed the prognosis of the CSC-HR and CSC-LR subgroups in patients with HPV (+) and HPV (-) HNSCC from the five HNSCC cohorts (Fig. 3). There were no significant differences in 5-year OS rates between the CSC-HR and CSC-LR subgroups in patients with HPV (+) HNSCC (n = 128, p = 0.2; Fig. 3A). However, the CSC-HR subgroup showed significantly lower 5-year OS rates than the CSC-LR subgroup in patients with HPV (-) HNSCC (n = 578, p = 0.003; Fig. 3B).
Association of CSC gene signature with the result of radiotherapy
The expression of CSC markers is correlated with poor prognosis after radiotherapy in HNSCC [32, 33]. However, the clinical correlation between radiotherapy and genes encoding CSC markers has not been clearly studied. Thus, we analyzed the prognosis of the CSC-HR and CSC-LR subgroups in patients from the five HNSCC cohorts who did and did not receive radiotherapy. The CSC-HR subgroup showed significantly lower 5-year OS rates than the CSC-LR subgroup among patients with HNSCC who received radiotherapy (p < 0.0001; Fig. 4A). In contrast, there were no significant differences in 5-year OS rates between the CSC-HR and CSC-LR subgroups among patients with HNSCC who did not receive radiotherapy (Fig. 4B). Next, we compared patients who received radiotherapy with those who did not within the CSC-HR and CSC-LR subgroups separately. In the CSC-HR subgroup, patients who received radiotherapy showed no significant differences in 5-year OS rates compared to those who did not receive radiotherapy (p = 0.1; Fig. 4C). However, in the CSC-LR subgroup, patients who received radiotherapy showed significantly higher 5-year OS rates than those who did not receive radiotherapy (p < 0.0001; Fig. 4D). An interaction test for OS was performed to determine whether the effect of radiotherapy depended on the CSC gene signature in HNSCC. The results revealed a significant interaction between radiotherapy and the CSC gene signature (p < 0.0001).
Pathway analysis
A total of 11 significant Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were identified using DAVID (Table S3). Several pathways appeared to be related to cancer or HNSCC, including focal adhesion (p = 1.0E-5), small-cell lung cancer (p = 3.1E-4), ECM–receptor interaction (p = 4.6E-3), proteoglycans in cancer (p = 7.2E-3), PI3K-Akt signaling pathway (p = 1.0E-2), and pathways in cancer (p = 6.5E-2). Pathways associated with endothelial–mesenchymal transition signaling were also identified: regulation of the actin cytoskeleton (p = 8.5E-3) and leukocyte transendothelial migration (p = 9.9E-3).
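Pathway enrichment of this kind asks whether a gene list overlaps a pathway more than chance would predict. The sketch below computes the plain hypergeometric upper-tail probability for that overlap. DAVID reports a modified Fisher exact p-value, so this simplified calculation is not expected to reproduce the values in Table S3; the pathway size, background size, and hit count in the usage example are hypothetical, and only the list size of 81 comes from the text.

```python
from math import comb

def enrichment_p(hits, list_size, pathway_size, background_size):
    """P(X >= hits) when list_size genes are drawn without replacement.

    hits: signature genes that fall in the pathway
    list_size: genes in the signature
    pathway_size: background genes annotated to the pathway
    background_size: all genes in the background
    """
    upper = min(list_size, pathway_size)
    outside = background_size - pathway_size
    total = comb(background_size, list_size)
    tail = sum(comb(pathway_size, k) * comb(outside, list_size - k)
               for k in range(hits, upper + 1))
    return tail / total

# Hypothetical example: 8 of 81 signature genes in a 200-gene pathway,
# against an assumed background of 20000 genes.
p_example = enrichment_p(8, 81, 200, 20000)
```

Exact integer binomial coefficients avoid the rounding problems that arise when factorials of large gene counts are evaluated in floating point.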