Initial screening of genes using GSEA
We obtained clinical data for 177 patients with PAAD, along with expression data for 56753 mRNAs, from the TCGA database. The hallmark gene sets from the Molecular Signatures Database (MSigDB), which comprise 50 gene sets, were selected to represent well-defined biological states or processes. GSEA was performed with these gene sets to determine whether they differed significantly between PAAD and adjacent normal tissues. A total of 36 gene sets were upregulated in PAAD. Among them, the HALLMARK_GLYCOLYSIS and HALLMARK_ESTROGEN_RESPONSE_LATE gene sets were significantly enriched (Figure 1). We then selected GLYCOLYSIS (P=0.028, NES=1.54), the top-ranked gene set, which contains 199 genes, for further analysis.
Identification of survival-associated glycolysis-related mRNAs
To identify novel genetic biomarkers associated with the survival of patients with PAAD, we applied univariate Cox regression to the 199 genes in the glycolysis gene set. A total of 7 genes were significantly associated with overall survival (OS) (P<0.05) and were entered into multivariate Cox regression analysis. Three independent genes (KIF20A, CHST2, and MET) were selected by multivariate Cox regression analysis (Table 2). Genes with HR>1 (KIF20A and MET) were associated with poorer survival, whereas the gene with HR<1 (CHST2) was associated with better survival.
Construction of a three-mRNA signature to predict patient outcomes
By combining the expression levels of the three genes with their corresponding regression coefficients from the multivariate Cox regression analysis, a prognostic risk score model was established as follows:
Risk score = 0.1755 * expression of KIF20A + (-0.1400) * expression of CHST2 + 0.0214 * expression of MET
We then ranked the patients by risk score in ascending order. Using the median risk score as the cutoff value, the 177 PAAD patients were classified into a high-risk group (n=88) and a low-risk group (n=89) (Figure 2A). The survival times of the patients are shown in Figure 2B. Patients in the low-risk group had lower mortality, whereas patients in the high-risk group showed poorer survival. Additionally, a heatmap displayed the expression profiles of the 3 mRNAs (Figure 2C). Compared with the low-risk group, the risk-associated mRNAs (KIF20A and MET) were upregulated in the high-risk group, whereas the protective mRNA (CHST2) was downregulated. The predictive performance of the three-mRNA signature for 1-year survival was assessed by the area under the receiver operating characteristic (ROC) curve (AUC). The AUC was 0.718 (Figure 3), indicating good sensitivity and specificity of the three-mRNA signature in predicting the survival of PAAD patients.
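As an illustration of how the risk score formula and the median cutoff operate, the following minimal Python sketch computes the score for a few patients and splits them at the median. The patient identifiers and expression values are hypothetical and serve only to show the arithmetic. The rule that a score equal to the median falls in the low-risk group is an assumption; it is consistent with the reported 88/89 split of 177 patients, but the tie-handling rule actually used is not stated here.

```python
from statistics import median

# Coefficients taken from the risk score formula above.
COEFFICIENTS = {"KIF20A": 0.1755, "CHST2": -0.1400, "MET": 0.0214}


def risk_score(expression):
    """Weighted sum of expression values using the Cox coefficients."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def split_by_median(scores):
    """Assign 'high' to scores above the median and 'low' otherwise.

    Assumption: a score equal to the median is placed in the low-risk group.
    """
    cutoff = median(scores.values())
    groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
    return cutoff, groups


# Hypothetical expression values, for illustration only (not study data).
patients = {
    "P1": {"KIF20A": 10.0, "CHST2": 5.0, "MET": 20.0},
    "P2": {"KIF20A": 2.0, "CHST2": 8.0, "MET": 10.0},
    "P3": {"KIF20A": 6.0, "CHST2": 2.0, "MET": 30.0},
    "P4": {"KIF20A": 4.0, "CHST2": 6.0, "MET": 5.0},
    "P5": {"KIF20A": 8.0, "CHST2": 1.0, "MET": 40.0},
}

scores = {pid: risk_score(expr) for pid, expr in patients.items()}
cutoff, groups = split_by_median(scores)

# Print patients in ascending order of risk score with their group label.
for pid in sorted(scores, key=scores.get):
    print(f"{pid}: score={scores[pid]:.3f} group={groups[pid]}")
```

With an odd number of patients, this rule places one more patient in the low-risk group than in the high-risk group, as in the 89 versus 88 split reported above.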
Risk score generated from the signature as an independent prognostic indicator
To verify whether the risk score is independent of other clinical features, we performed univariate and multivariate Cox regression analyses in PAAD patients with risk score, age, gender, grade, and TNM stage as covariates (Figure 4). In the univariate analysis, risk score (HR: 1.242; 95% CI: 1.157-1.334; P<0.001), age (HR: 1.028; 95% CI: 1.006-1.050; P=0.012), and grade (HR: 1.377; 95% CI: 1.020-1.859; P=0.037) were associated with patient survival. Risk score and age had independent prognostic value, as both were significant in the univariate and multivariate analyses. These results indicated that the three-gene signature had competitive prognostic value for survival prediction.
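The relationship between a reported HR, its 95% CI, and its P value can be illustrated with a short Python sketch. It assumes that the intervals are Wald-type (symmetric on the log scale) and that the P values come from a two-sided Wald test; neither is stated here, so the sketch is a consistency check under those assumptions rather than a description of the analysis actually performed. Because the inputs are rounded to three decimals, the recovered P values are only approximate.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def wald_from_ci(hr, lower, upper):
    """Recover log(HR), its standard error, z statistic, and two-sided P.

    Assumes a Wald-type interval: log(HR) +/- Z_95 * SE.
    """
    log_hr = math.log(hr)
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    z = log_hr / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return log_hr, se, z, p


# Univariate HRs and 95% CIs as reported in the text above.
reported = {
    "risk score": (1.242, 1.157, 1.334),
    "age": (1.028, 1.006, 1.050),
    "grade": (1.377, 1.020, 1.859),
}

results = {name: wald_from_ci(*values) for name, values in reported.items()}
for name, (log_hr, se, z, p) in results.items():
    print(f"{name}: log(HR)={log_hr:.4f} SE={se:.4f} z={z:.2f} P={p:.3g}")
```

Under these assumptions, the recovered P values fall close to the reported ones (P<0.001, P=0.012, and P=0.037), with small deviations attributable to rounding of the published HRs and CIs.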
Validation of three-mRNA signature for survival prediction by Kaplan–Meier curve analysis
Kaplan-Meier analysis showed a significant difference in survival between the high-risk and low-risk groups (P<0.001; Figure 5A). Additionally, the risk score remained a stable prognostic marker for patients with PAAD stratified by age (<65 or >=65), TNM stage (stage I or stage II-IV), T stage (stage 1-2 or stage 3-4), and N stage (Figure 5B, 5C, and 5D). In the M0 subgroup, patients in the high-risk group had significantly shorter OS than those in the low-risk group (Figure 5E). Because there were only 4 patients in the M1 subgroup, this subgroup was not analyzed. Together, these results indicated that the risk score was robust in predicting the prognosis of patients with PAAD.
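For readers unfamiliar with the Kaplan-Meier method, the following minimal Python sketch shows how the product-limit survival estimate is computed for one group. The follow-up times and event indicators are hypothetical and are not study data. The sketch produces only the survival curve; it does not implement the statistical test behind the P values reported above.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of survival at each distinct event time.

    times:  follow-up duration for each patient
    events: 1 if death was observed, 0 if the patient was censored
    Returns a list of (time, survival probability) pairs.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients who share the same follow-up time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set after time t.
        n_at_risk -= removed
    return curve


# Hypothetical follow-up times (months) and event indicators for one group.
example_times = [3, 5, 5, 8, 12]
example_events = [1, 1, 0, 1, 0]

curve = kaplan_meier(example_times, example_events)
for t, s in curve:
    print(f"t={t}: S(t)={s:.3f}")
```

Censored patients lower the number at risk at later times without causing a drop in the curve themselves, which is why the estimate changes only at times where a death is observed.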