Patients' baseline characteristics and survival
The clinical characteristics of patients in the training and validation cohorts are shown in Table 1 and include sex, age, lymph node metastasis (LNM), tumor length, GTVp, clinical T (cT) stage, clinical N (cN) stage, clinical TNM (cTNM) stage, and treatment. The median age was 60 years. The optimal GTVp cutoff was defined as 30 cm³ in our previous study [(18)]. Upper thoracic ESCC was more frequent in males than in females in both cohorts. With regard to staging, the majority of patients were staged as T2-3 and/or N0-1, and > 50% of patients had lymph node metastasis. In the training and validation cohorts, 319 of 568 patients and 72 of 155 patients died, respectively. The 5-year OS rates of the training and validation cohorts were 44.6% and 51.3%, respectively.
Table 1
Clinical characteristics of patients with upper thoracic esophageal squamous carcinoma in two cohorts

| Characteristic | Training cohort (n = 568), n (%) | Validation cohort (n = 155), n (%) |
| --- | --- | --- |
| Gender | | |
| Male | 391 (68.8) | 123 (79.4) |
| Female | 177 (31.2) | 32 (20.6) |
| Age (years) | | |
| < 60 | 276 (48.6) | 61 (39.4) |
| ≥ 60 | 292 (51.4) | 94 (60.6) |
| LNM | | |
| No | 252 (44.4) | 68 (43.9) |
| Yes | 316 (55.6) | 87 (56.1) |
| Tumor length (cm) | | |
| ≤ 5 | 363 (63.9) | 109 (70.3) |
| > 5 | 205 (36.1) | 46 (29.7) |
| GTVp (cm³) | | |
| < 30 | 396 (69.7) | 121 (78.1) |
| ≥ 30 | 172 (30.3) | 34 (21.9) |
| Clinical T stage | | |
| T1 | 15 (2.6) | 11 (7.1) |
| T2 | 112 (19.7) | 23 (14.8) |
| T3 | 247 (43.5) | 105 (67.7) |
| T4 | 194 (34.2) | 16 (10.3) |
| Clinical N stage | | |
| N0 | 285 (50.2) | 68 (43.9) |
| N1 | 185 (32.6) | 49 (31.6) |
| N2 | 88 (15.5) | 22 (14.2) |
| N3 | 10 (1.8) | 16 (10.3) |
| 8th AJCC stage | | |
| I | 10 (1.8) | 11 (7.1) |
| II | 251 (44.2) | 68 (43.9) |
| III | 117 (20.6) | 46 (29.7) |
| IV | 190 (33.5) | 30 (19.4) |
| Treatment | | |
| Surgery | 238 (41.9) | 70 (45.2) |
| CRT | 216 (38.0) | 0 |
| Surgery + CRT | 114 (20.1) | 85 (54.8) |

Abbreviations: LNM, lymph node metastasis; GTVp, primary gross tumor volume; AJCC, American Joint Committee on Cancer.
Nomogram model construction and validation
In the training cohort, univariate analysis showed that sex, LNM, tumor length, GTVp, T stage, N stage, and cTNM stage were prognostic factors (all P < 0.05). In multivariate analysis, only sex, T stage, N stage, and GTVp remained independent prognostic factors (Table 2). A nomogram model was therefore constructed from these four independent prognostic factors, assigning each a weighted number of points (Fig. 1). The 1-, 3-, and 5-year OS rates were predicted from the sum of these points, with higher total scores corresponding to poorer predicted survival.
Table 2
Multivariable analysis of clinical variables to predict overall survival in the training cohort

| Variable | HR | 95% CI | P value |
| --- | --- | --- | --- |
| Gender | 0.719 | 0.557–0.929 | 0.011 |
| Clinical T stage | 1.239 | 1.061–1.448 | 0.007 |
| Clinical N stage | 1.284 | 1.120–1.471 | < 0.001 |
| GTVp | 1.578 | 1.237–2.012 | < 0.001 |

Abbreviations: HR, hazard ratio; CI, confidence interval.
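As a rough consistency check on Table 2, the P value implied by each hazard ratio and its 95% CI can be approximated. The sketch below assumes the intervals are Wald intervals (symmetric on the log scale), which the text does not state; the helper name `wald_p_from_ci` is hypothetical and is not part of the authors' analysis.

```python
import math

# (HR, lower 95% limit, upper 95% limit) copied from Table 2.
TABLE_2 = {
    "Gender": (0.719, 0.557, 0.929),
    "Clinical T stage": (1.239, 1.061, 1.448),
    "Clinical N stage": (1.284, 1.120, 1.471),
    "GTVp": (1.578, 1.237, 2.012),
}

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal


def wald_p_from_ci(hr, lower, upper):
    """Approximate two-sided P value implied by an HR and its 95% CI.

    Assumption: the CI is a Wald interval, symmetric around log(HR).
    """
    se = (math.log(upper) - math.log(lower)) / (2 * Z_95)
    z = math.log(hr) / se
    # Two-sided normal tail probability, 2 * (1 - Phi(|z|)).
    return math.erfc(abs(z) / math.sqrt(2))


implied_p = {name: wald_p_from_ci(*row) for name, row in TABLE_2.items()}

for name, p in implied_p.items():
    print(f"{name}: implied P = {p:.4f}")
```

Under this assumption the implied values fall close to the reported P values for gender and clinical T stage, and well below 0.001 for clinical N stage and GTVp.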
Next, the predictive performance of the nomogram model was tested in the training cohort. With 1,000 bootstrap resamples, the C-index for 5-year OS was 0.622 (95% CI: 0.591–0.654) (Table 3). The AUC of the ROC curve for 5-year OS was 0.709 (95% CI: 0.661–0.758) (Fig. 2A). Furthermore, the calibration curve for 5-year OS showed good agreement between actual and predicted clinical outcomes (Fig. 2B).
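For readers unfamiliar with the C-index, the following minimal sketch shows how Harrell's concordance index can be computed for right-censored data. The five-patient data set is hypothetical and purely illustrative, and the function `harrell_c` is an assumption of this sketch rather than the software used in the study.

```python
def harrell_c(times, events, risk):
    """Harrell's C-index for right-censored survival data.

    A pair (i, j) is comparable when patient i had an event strictly
    before patient j's follow-up time. The pair is concordant when the
    patient who failed earlier also has the higher predicted risk.
    Tied risk scores count as half concordant.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if events[i] != 1:
            continue  # a censored patient cannot anchor a comparable pair
        for j in range(n):
            if times[i] < times[j]:
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5
    return concordant / comparable


# Hypothetical toy data: follow-up in months, 1 = death, 0 = censored.
toy_times = [5, 10, 15, 20, 25]
toy_events = [1, 1, 0, 1, 0]
toy_risk = [0.9, 0.4, 0.6, 0.3, 0.1]  # higher = worse predicted outcome

c_index = harrell_c(toy_times, toy_events, toy_risk)
print(f"C-index = {c_index:.3f}")  # 7 of 8 comparable pairs are concordant
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which gives context for the reported values of 0.622 and 0.713.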
Table 3
The discriminatory ability of the nomogram model vs. AJCC stage

| Cohort and model | C-index (95% CI) | AUC (95% CI) | ΔC-index (P value) | NRI (P value) | IDI (P value) |
| --- | --- | --- | --- | --- | --- |
| TC, nomogram | 0.622 (0.591–0.654) | 0.709 (0.661–0.758) | - | - | - |
| TC, AJCC stage | 0.580 (0.548–0.612) | 0.654 (0.605–0.703) | - | - | - |
| VC, nomogram | 0.713 (0.656–0.771) | 0.739 (0.655–0.823) | - | - | - |
| VC, AJCC stage | 0.659 (0.602–0.716) | 0.689 (0.605–0.773) | - | - | - |
| TC, nomogram vs. AJCC stage | - | - | 0.057 (P < 0.001) | 26.6% (P < 0.001) | 6.4% (P < 0.001) |
| VC, nomogram vs. AJCC stage | - | - | 0.054 (P = 0.020) | 23.9% (P = 0.040) | 7.6% (P < 0.001) |

Abbreviations: TC, training cohort; VC, validation cohort; C-index, concordance index; CI, confidence interval; NRI, net reclassification index; IDI, integrated discrimination improvement.
Note: NRI or IDI > 0 indicates improvement, meaning that the nomogram model predicts better than AJCC stage. NRI or IDI < 0 indicates that the nomogram model predicts worse than AJCC stage. NRI or IDI = 0 indicates no difference between the two.
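To make the sign convention for NRI and IDI concrete, here is a simplified sketch for a binary outcome (death by 5 years: yes or no). It ignores censoring, uses the category-free form of the NRI, and runs on hypothetical predicted risks, so it only illustrates the definitions and does not reproduce the survival-based estimates in Table 3. The function names are assumptions of this sketch.

```python
def continuous_nri(y, p_old, p_new):
    """Category-free NRI: net upward moves in events plus net downward moves in non-events."""
    events = [(o, n) for yi, o, n in zip(y, p_old, p_new) if yi == 1]
    nonevents = [(o, n) for yi, o, n in zip(y, p_old, p_new) if yi == 0]
    up_e = sum(n > o for o, n in events) / len(events)
    down_e = sum(n < o for o, n in events) / len(events)
    up_ne = sum(n > o for o, n in nonevents) / len(nonevents)
    down_ne = sum(n < o for o, n in nonevents) / len(nonevents)
    return (up_e - down_e) + (down_ne - up_ne)


def idi(y, p_old, p_new):
    """IDI: change in mean predicted risk among events minus the change among non-events."""
    def mean(values):
        return sum(values) / len(values)

    delta_events = mean([n - o for yi, o, n in zip(y, p_old, p_new) if yi == 1])
    delta_nonevents = mean([n - o for yi, o, n in zip(y, p_old, p_new) if yi == 0])
    return delta_events - delta_nonevents


# Hypothetical data: 1 = died within 5 years, 0 = alive at 5 years.
y = [1, 1, 1, 0, 0, 0]
p_old = [0.5, 0.6, 0.4, 0.5, 0.3, 0.4]  # predicted risk, reference model
p_new = [0.7, 0.6, 0.5, 0.3, 0.4, 0.2]  # predicted risk, new model

nri_value = continuous_nri(y, p_old, p_new)
idi_value = idi(y, p_old, p_new)
print(f"NRI = {nri_value:.3f}, IDI = {idi_value:.3f}")
```

Both quantities are positive in this toy example because the new model raises predicted risk for patients who died and lowers it for most who survived.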
Finally, the nomogram model was validated in an independent external cohort. In the validation cohort, the C-index for 5-year OS was 0.713 (95% CI: 0.656–0.771) (Table 3) and the AUC for 5-year OS was 0.739 (95% CI: 0.655–0.823) (Fig. 2C), both numerically higher than in the training cohort. The calibration curve demonstrated good agreement between actual and predicted OS (Fig. 2D).
Comparison of the predictive accuracy between the nomogram and the AJCC staging system
The nomogram model was compared with the 8th AJCC staging system on four indices: C-index, AUC, NRI, and IDI (Table 3). The C-index of the nomogram model versus the 8th AJCC staging was 0.622 vs. 0.580 in the training cohort (ΔC = 0.0570, 95% CI: 0.0271–0.08072, P < 0.001) and 0.713 vs. 0.659 in the validation cohort (ΔC = 0.0541, 95% CI: −0.0026–0.0401, P = 0.020). Time-dependent ROC analyses showed that the AUC of the nomogram model was significantly higher than that of the 8th AJCC staging in both the training (0.709 vs. 0.654) and validation (0.739 vs. 0.689) cohorts. For 5-year survival, the NRI of the nomogram model relative to the 8th AJCC staging was 26.6% and 23.9% in the training and validation cohorts, respectively (both P < 0.05), and the IDI was 6.4% and 7.6%, respectively (both P < 0.001).
Moreover, the 5-year decision curve analysis (DCA) curves suggested a greater clinical benefit for the nomogram model than for the 8th AJCC staging (Fig. 3). The DCA also indicated that GTVp was a strong prognostic factor: the combination of GTVp and the 8th AJCC staging performed markedly better than the 8th AJCC staging alone.
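The quantity plotted in a decision curve is the net benefit at a given threshold probability. The sketch below shows the standard binary-outcome formula on hypothetical data; it ignores censoring, so it illustrates the idea only and does not reproduce the 5-year curves in Fig. 3. The function names are assumptions of this sketch.

```python
def net_benefit(y, p, threshold):
    """Net benefit of acting on patients whose predicted risk is >= threshold.

    NB = TP/n - FP/n * threshold / (1 - threshold)
    """
    n = len(y)
    tp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 1)
    fp = sum(1 for yi, pi in zip(y, p) if pi >= threshold and yi == 0)
    return tp / n - fp / n * threshold / (1 - threshold)


def net_benefit_treat_all(y, threshold):
    """Net benefit of the reference strategy that treats every patient."""
    prevalence = sum(y) / len(y)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical data: 1 = event, 0 = no event, with model-predicted risks.
y_toy = [1, 1, 1, 0, 0, 0, 0, 0]
p_toy = [0.9, 0.7, 0.2, 0.6, 0.3, 0.2, 0.1, 0.1]

nb_model = net_benefit(y_toy, p_toy, threshold=0.5)
nb_all = net_benefit_treat_all(y_toy, threshold=0.5)
print(f"model: {nb_model:.3f}, treat all: {nb_all:.3f}, treat none: 0.000")
```

A model shows clinical benefit at a threshold when its net benefit exceeds both the treat-all and treat-none (net benefit 0) strategies, which is how the curves for competing models are compared.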
Risk group categorization
Risk scores were calculated and, according to the 5-year OS rate, used to stratify patients in the training cohort into low-, moderate-, and high-risk groups. The low-, moderate-, and high-risk groups comprised 356, 132, and 80 patients, with risk-score intervals of < 152, 152–213, and > 213, respectively. The corresponding 5-year OS rates were 86.1%, 54.5%, and 28.1%, and survival differed significantly among the three groups (all P < 0.05) (Fig. 4A).
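The stratification rule above can be written as a small function. This is a minimal sketch using the cutoffs reported in the text (< 152, 152–213, > 213); the function name `risk_group` is hypothetical, and treating both boundary values as moderate risk follows the stated intervals.

```python
def risk_group(total_points):
    """Map a nomogram total score to the risk group reported in the text."""
    if total_points < 152:
        return "low"
    if total_points <= 213:
        return "moderate"
    return "high"


# Group sizes reported for the training cohort (n = 568).
group_sizes = {"low": 356, "moderate": 132, "high": 80}
total = sum(group_sizes.values())

for group, count in group_sizes.items():
    print(f"{group}: {count} patients ({100 * count / total:.1f}%)")

print(risk_group(120), risk_group(152), risk_group(213), risk_group(240))
```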
To further compare the nomogram with AJCC staging, the OS curves by AJCC stage are shown in Fig. 4B. The 5-year OS rates decreased with increasing AJCC clinical stage (88.9%, 55.4%, 36.9%, and 32.5%, in ascending stage order). However, the OS curves of stages I and II were not well separated, and those of stages III and IVA did not differ significantly (all P > 0.05).