Patients and clinical characteristics
A total of 519 NPC patients were enrolled in this study and randomly divided into a primary cohort (363 patients) and a validation cohort (156 patients). Table 1 summarizes the demographic data and clinical characteristics of the two cohorts. In the primary cohort, 209 (57.57%) patients were male and 154 (42.43%) were female. The mean age (SD) was 46.05 (10.87) years and the median OS was 51.0 months (interquartile range (IQR), 42.3-66.7 months). In the validation cohort, 92 (58.97%) patients were male and 64 (41.03%) were female. The mean age (SD) was 46.87 (11.58) years and the median OS was 50.4 months (IQR, 41.7-66.0 months). The 1-, 3-, and 5-year OS rates were 95.0%, 84.0%, and 46.8% in the primary cohort and 98.7%, 84.0%, and 45.5% in the validation cohort.
Model construction based on clinical characteristics
The sliding windows sequential forward feature selection method (SWSFS) was used to identify the important variables by minimizing the out-of-bag (OOB) error rate (Figure 1A). In the primary cohort, three variables were significantly associated with OS in NPC patients (Figure 1B): dNLR (HR = 1.14, 95% CI: 1.05-1.23, P = 9.14×10⁻⁴), HGB (HR = 0.98, 95% CI: 0.97-0.99, P = 5.24×10⁻³), and EBV DNA (HR = 1.59, 95% CI: 1.32-1.93, P = 1.22×10⁻⁶). The risk score was calculated as 0.466×DNA + 0.129×dNLR - 0.02×HGB, where DNA denotes EBV DNA. The heatmaps of NPC samples in the two cohorts are shown in Figure 2, in which red represents upregulated imaging features and blue represents downregulated imaging features. Three feature clusters (C1–C3) were identified in the heatmap by unsupervised hierarchical clustering of 59 imaging features.
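As an illustration only, the risk score formula reported above can be evaluated as in the following minimal sketch. The function name, the argument names, and the example patient values are hypothetical. The units and any transformation applied to each variable (for example, whether EBV DNA enters as a log-transformed or categorical value) are not stated in this section and are assumptions here.

```python
def risk_score(ebv_dna, dnlr, hgb):
    """Linear risk score using the coefficients reported in the text.

    Assumption: the inputs are already on the scale used when the
    model was fitted; that scale is not specified in this section.
    """
    return 0.466 * ebv_dna + 0.129 * dnlr - 0.02 * hgb


# Hypothetical patient values, chosen only to show the arithmetic.
example = risk_score(ebv_dna=1.0, dnlr=2.0, hgb=140.0)
print(round(example, 3))
```

Because HGB carries a negative coefficient and typically takes large values, scores computed this way tend to be negative, which is in line with the negative cut-off values reported for this model.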
ROC curves were used to assess the accuracy of the established risk score model, TNM stage, treatment, and EBV DNA. In the primary cohort, the AUCs for 1-year OS (Figure 3A) of TNM stage, treatment, EBV DNA, and our established model were 0.748, 0.591, 0.751, and 0.797, respectively. Our model also achieved a higher AUC than TNM stage, treatment, and EBV DNA for 3-year OS (Figure 3B) and 5-year OS (Figure 3C). In the validation cohort, the AUCs for 1-year OS (Figure 3D) of TNM stage, treatment, EBV DNA, and our established model were 0.399, 0.588, 0.932, and 0.854, respectively. For 3-year and 5-year OS, the AUCs of TNM stage, treatment, EBV DNA, and our established model were 0.728, 0.573, 0.794, and 0.821 and 0.725, 0.555, 0.747, and 0.791, respectively (Figure 3E, 3F). The time-dependent ROC curves for OS in the primary cohort (Figure 4A) and validation cohort (Figure 4B) show the AUCs of TNM stage, EBV DNA, treatment, and our established model in more detail.
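To make the AUC values above easier to interpret, the sketch below computes an AUC as the probability that a patient who died by a fixed time point has a higher score than a patient who survived it. This is a simplification and not the time-dependent ROC method used in the study: it assumes every patient's status at the time point is known and ignores censoring. The function name and the example data are hypothetical.

```python
def auc(scores, labels):
    """Rank-based AUC: fraction of (event, non-event) pairs in which
    the event case has the higher score; ties count as one half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical scores and status at a fixed time point (1 = died).
scores = [0.10, 0.40, 0.35, 0.80]
labels = [0, 0, 1, 1]
print(auc(scores, labels))
```

An AUC of 0.5 corresponds to a marker that ranks patients no better than chance, and a value below 0.5 means that the marker ranks patients in the wrong direction at that time point.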
Moreover, we evaluated the C-index of the established model, TNM stage, treatment, and EBV DNA for prediction of OS in the primary cohort and validation cohort. In the primary cohort, the established model achieved a higher C-index (0.733, 95% CI: 0.673-0.793) than TNM stage (0.712, 95% CI: 0.657-0.768), treatment (0.542, 95% CI: 0.505-0.580), and EBV DNA (0.691, 95% CI: 0.626-0.756). In the validation cohort, the C-indexes of our model, TNM stage, treatment, and EBV DNA were 0.772 (95% CI: 0.691-0.853), 0.699 (95% CI: 0.628-0.770), 0.551 (95% CI: 0.503-0.600), and 0.739 (95% CI: 0.652-0.826), respectively (Table 2).
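For readers unfamiliar with the C-index, the following is a minimal sketch of Harrell's concordance index for right-censored survival data. It is not the code used in the study: the function name and the example data are hypothetical, and confidence intervals such as those reported above would need an additional procedure (for example, bootstrapping) that is not shown.

```python
def harrell_c(times, events, scores):
    """Harrell's C-index for right-censored data.

    A pair (i, j) is comparable when patient i had an observed event
    and a shorter follow-up time than patient j. The pair is concordant
    when patient i also has the higher risk score; ties count one half.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # censored patients cannot anchor a pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Hypothetical follow-up times (months), event flags, and risk scores.
times = [5, 10, 15, 20]
events = [1, 0, 1, 0]
scores = [2.0, 1.5, 0.2, 0.3]
print(harrell_c(times, events, scores))
```

A value of 0.5 indicates no discrimination and 1.0 indicates perfect ranking of patients by risk.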
Performance of the established model in stratifying risk
Based on the risk score (0.466×DNA + 0.129×dNLR - 0.02×HGB), NPC patients were divided into high-risk (risk score ≤ -0.16) and low-risk (risk score > -0.16) subgroups. The cut-off value was determined with the R packages “survival” and “survminer”. The optimum cut-off of our model was -1.46. Patients with a high risk score had a significantly shorter OS than patients with a low risk score in the primary cohort (P < 0.01) (Figure 5A) and in the validation cohort (P < 0.01) (Figure 5E). In the primary cohort, the 1-, 3-, and 5-year survival probabilities were 90.0%, 71.3%, and 37.3% for high-risk patients and 99.0%, 93.0%, and 53.5% for low-risk patients. In the validation cohort, the corresponding probabilities were 97.2%, 70.4%, and 35.2% for high-risk patients and 98.8%, 95.3%, and 54.1% for low-risk patients. Moreover, among patients with stage III and stage IV disease, Kaplan-Meier curves showed that OS differed significantly between the high-risk and low-risk subgroups in the primary cohort (P < 0.001 and P = 0.011, respectively) and in the validation cohort (P = 0.015 and P = 0.021, respectively).
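The stratified survival probabilities above come from Kaplan-Meier estimates within each risk subgroup. The sketch below shows the general idea under stated assumptions: patients are split at a given cut-off and a Kaplan-Meier curve is estimated for each side. The function names, the example data, and the cut-off used in the example are hypothetical, and the two groups are deliberately labelled only as "above" and "at or below" the cut-off. The study itself used the R packages “survival” and “survminer”, whose procedures are not reproduced here.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (event time, survival) pairs."""
    surv = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    for t in event_times:
        at_risk = sum(1 for tm in times if tm >= t)
        deaths = sum(1 for tm, e in zip(times, events) if tm == t and e == 1)
        surv *= 1.0 - deaths / at_risk
        curve.append((t, surv))
    return curve


def split_by_cutoff(scores, cutoff):
    """Return patient indices above the cut-off and at or below it."""
    above = [i for i, s in enumerate(scores) if s > cutoff]
    at_or_below = [i for i, s in enumerate(scores) if s <= cutoff]
    return above, at_or_below


# Hypothetical data: follow-up in months, 1 = death, and risk scores.
times = [2, 3, 3, 5, 8, 9, 12, 14]
events = [1, 1, 0, 1, 0, 0, 1, 0]
scores = [-0.5, -0.8, -1.0, -0.9, -2.0, -2.4, -1.9, -2.2]

above, at_or_below = split_by_cutoff(scores, cutoff=-1.5)
for name, idx in (("above", above), ("at or below", at_or_below)):
    curve = kaplan_meier([times[i] for i in idx], [events[i] for i in idx])
    print(name, [(t, round(s, 3)) for t, s in curve])
```

The survival probability at a given year is read from each curve as the value at the last event time that does not exceed that year.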
The nomogram for the prediction of OS
We established a nomogram for OS that included risk score, TNM stage, and EBV DNA in the two cohorts. In the primary cohort, the nomogram achieved a C-index of 0.783 (95% CI: 0.730-0.836), which was significantly higher than that of the prognostic model (0.733, 95% CI: 0.673-0.793; P < 0.005) (Figure 6A, 7A). In the validation cohort, the nomogram achieved a C-index of 0.776 (95% CI: 0.709-0.844), which was not significantly higher than that of the prognostic model (0.772, 95% CI: 0.691-0.853; P = 0.455) (Figure 6E, 7B). Calibration curves for the probability of survival at 1, 3, and 5 years showed good agreement between the nomogram predictions and the actual observations in the two cohorts (Figure 6C-D, 6F-H). The RMS curves showed a larger slope for the nomogram in the primary cohort, indicating superior estimation of survival with the nomogram (Figure 7).
The correlations among the variables in the nomogram model
The correlations among the variables of the nomogram model are shown in Figure 8. In this figure, blue indicates positive correlations and red indicates negative correlations, and the color intensity and circle size are proportional to the correlation coefficients. In the primary cohort, there was a highly significant correlation between EBV DNA and risk score, and treatment was moderately associated with TNM stage. Consistent results were obtained in the validation cohort.