Identification and Functional annotation of differentially expressed ARGs in HNSCC
All clinicopathological and RNA-seq data in a total of 502 HNSCC and 44 non-tumor samples were retrieved from TCGA portal. The expression of all ARGs in these samples were calculated and compared. Then, 28 up-regulated and 10 down-regulated ARGs (P <0.05, fold change >2.0) were identified in tumor samples compared to non-tumor counterparts (Fig.1A, B). In detail, the expression patterns of these 38 differentially expressed ARGs were visualized in box plots (Fig.1C).
To gain insights into the biological functions of these differentially expressed ARGs in HNSCC, we performed gene annotation analysis using DAVID platform. As expected, our results indicated that these genes were significantly enriched in functional categories involved in regulation of apoptotic signaling and autophagy (Fig.2). Moreover, as displayed in Fig.3, KEGG pathway enrichment analysis revealed that these genes were significantly associated with apoptosis, EGFR tyrosine kinase inhibitor/Platinum resistance, HPV infection, and ErbB signaling pathway, et al. These results suggest that these ARGs might contribute to HNSCC tumorigenesis through various biological pathways and processes, and also highlight the incompletely known roles of autophagy in HNSCC initiation and progression.
Identification of prognostic ARG candidates in HNSCC
Among 546 HNSCC samples in TCGA, 499 patients had clinical follow-up information available and then enrolled for prognostic analyses. To explore the potential associations between ARGs and overall survival (OS) in patients with HNSCC, univariate Cox proportional hazards regression analysis was applied to screen the prognosis-related ARGs. Our results revealed that the abundance of 8 differentially expressed ARGs (BAK1, CXCR4, EGFR, NKX2-3, FADD, CDKN2A, CTSL1 and PARK2) were remarkably correlated with overall survival with P values less than 0.05 (Fig.4). In order to enhance the robustness, these 8 candidates were further subjected for multivariate Cox regression analyses and subsequently 3 ARGs including EGFR, FADD, PARK2 were consistently identified as independent prognostic factors and further included to develop a prognostic signature (Fig.5A).
The 3-ARGs prognostic signature predicts survival of patients with HNSCC
A risk score formula based on the expression level and weighted by coefficient of these three prognostic ARGs was generated as follows: risk score = (0.09646 × expression level of EGFR) + (0.18189 × expression level of FADD) + (0.52452 × expression level of PARK2). Subsequently, ROC analysis was performed to estimate the sensitivity and specificity of this 3-ARGs prognostic signature in survival prediction. The optimal cut-off point was selected as 1.045 which had the maximal sensitivity and specificity (Fig.5B). Accordingly, patients in the training cohort (TCGA-HNSCC) were stratified into subgroups with high risk score (n=191) or low risk score (n=308). The Kaplan-Meier analysis revealed that patients with increased scores had markedly inferior survival as compared with those with low scores (P< 0.0001, Fig.6A). As shown in Fig.6B-D, the survival among patients, the distribution of risk score and expression of individual ARGs in HNSCC samples were shown in detail.
Validation of the 3-ARGs prognostic signature for survival prediction
We next validated the predictive value of this 3-ARGs prognostic signature using other two independent HNSCC datasets (GSE41613 and GSE42743). Utilizing the established risk score formula, patients in these two cohorts were classified into high-risk and low-risk subgroups. Consistent with the results from the training cohort, patients with high scores had significantly inferior overall survival compared with those with low scores in the GSE41613 cohort (P=0.013, Fig.7A) and GSE42743 cohort (P =0.034, Fig.7C), respectively. Subsequently, ROC curve was plotted to evaluate the prognostic accuracy of this 3-ARGs prognostic signature, presenting AUC values 0.673 and 0.627 in these two datasets, respectively (Fig.7B, D).
Univariate and multivariate prognostic analyses of 3-ARGs signature in HNSCC
As shown in Table 1, we employed univariate Cox regression analyses to substantiate the prognostic values of 3-ARGs signature and revealed that 3-ARGs signature risk score was remarkably associated with overall survival of patient in the TCGA cohort (HR, 1.845; 95% CI, 1.410-2.416; P<0.001). As anticipated, some well-established prognostic parameters including tumor size, pathological grade and cervical node metastasis were also identified as prognostic predictors. Furthermore, to rule out other confounding factors, we performed multivariate Cox regression analyses and found that this 3-ARGs prognostic signature was identified as an independent prognostic predictor for overall survival in TCGA-HNSCC cohort (HR, 1.735; 95% CI, 1.282-2.348; P < 0.001).
Expression of EGFR, FADD and PARK2 in HNSCC and their correlations with autophagy markers
Finally, the expression of three identified ARGs (EGFR, FADD and PARK2) in clinical specimens from Human Protein Profiles (https://www.proteinatlas.org) was retrieved and compared with normal counterparts. Significantly higher expression of these genes in HNSCC tissue was observed (Fig.8A). Then, to further understand the associations of three genes and autophagy, we analyzed the correlation between the expression of three candidate genes and three well-established autophagy markers (ATG5, Beclin1 and LC3). Noticeably, significant associations between 3 genes and autophagy markers were identified (Fig.8B). In addition, Kaplan-Meier analyses revealed that patients with increased expression of EGFR and FADD had markedly inferior survival as compared with those with low expression, respectively (P= 0.0016 and 0.0043), while PARK2 shown no prognosis significance (P=0.28) (Fig.8C).