Patients' baseline characteristics and survival
Table 1 summarizes the clinical characteristics of patients in the training and validation cohorts: sex, age, lymph node metastasis (LNM), tumor length, GTVp, clinical T (cT) stage, clinical N (cN) stage, clinical TNM (cTNM) stage, and treatment. The median age was 60 years. In both cohorts, upper thoracic ESCC was more frequent in males than in females. Most patients were staged as T2-3 and/or N0-1, and >50% of patients had lymph node metastasis. In the training and validation cohorts, 319 of 568 patients and 72 of 155 patients died, respectively. The 5-year OS rates of the training and validation cohorts were 44.6% and 51.3%, respectively.
Table 1 Clinical characteristics of patients with upper thoracic esophageal squamous cell carcinoma in the two cohorts
| Characteristic | Training cohort (n=568), n (%) | Validation cohort (n=155), n (%) |
|---|---|---|
| Gender | | |
| Male | 391 (68.8) | 123 (79.4) |
| Female | 177 (31.2) | 32 (20.6) |
| Age (years) | | |
| < 60 | 276 (48.6) | 61 (39.4) |
| ≥ 60 | 292 (51.4) | 94 (60.6) |
| LNM | | |
| No | 252 (44.4) | 68 (43.9) |
| Yes | 316 (55.6) | 87 (56.1) |
| Tumor length (cm) | | |
| ≤ 5 | 363 (63.9) | 109 (70.3) |
| > 5 | 205 (36.1) | 46 (29.7) |
| GTVp (cm3) | | |
| < 30 | 396 (69.7) | 121 (78.1) |
| ≥ 30 | 172 (30.3) | 34 (21.9) |
| Clinical T stage | | |
| T1 | 15 (2.6) | 11 (7.1) |
| T2 | 112 (19.7) | 23 (14.8) |
| T3 | 247 (43.5) | 105 (67.7) |
| T4 | 194 (34.2) | 16 (10.3) |
| Clinical N stage | | |
| N0 | 285 (50.2) | 68 (43.9) |
| N1 | 185 (32.6) | 49 (31.6) |
| N2 | 88 (15.5) | 22 (14.2) |
| N3 | 10 (1.8) | 16 (10.3) |
| 8th AJCC stage | | |
| I | 10 (1.8) | 11 (7.1) |
| II | 251 (44.2) | 68 (43.9) |
| III | 117 (20.6) | 46 (29.7) |
| IV | 190 (33.5) | 30 (19.4) |
| Treatment | | |
| Surgery | 238 (41.9) | 70 (45.2) |
| CRT | 216 (38.0) | 0 |
| Surgery+CRT | 114 (20.1) | 85 (54.8) |
Nomogram model construction and validation
In the training cohort, univariate analysis identified sex, LNM, tumor length, GTVp, T stage, N stage, and cTNM stage as prognostic factors (all P <0.05). In multivariate analysis, only sex, T stage, N stage, and GTVp remained independent prognostic factors (Table 2). A nomogram model was therefore constructed from these independent prognostic factors, assigning each a weighted number of points (Figure 1). The 1-, 3-, and 5-year OS rates were predicted from the sum of these points, with higher total scores corresponding to poorer predicted clinical outcomes.
Table 2 Multivariable analysis of clinical variables to predict overall survival in the training cohort
| Variable | HR | 95% CI | P value |
|---|---|---|---|
| Gender | 0.719 | 0.557-0.929 | 0.011 |
| Clinical T stage | 1.239 | 1.061-1.448 | 0.007 |
| Clinical N stage | 1.284 | 1.120-1.471 | <0.001 |
| GTVp | 1.578 | 1.237-2.012 | <0.001 |
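To illustrate how hazard ratios such as those in Table 2 can be turned into nomogram points, the following is a minimal sketch of one common convention: each predictor's log-hazard contribution is rescaled so that the predictor with the widest effect range spans 0 to 100 points. The variable coding (`LEVELS`), the treatment of cT and cN as ordinal scores, and the function names are assumptions made for illustration; they are not stated in this section, so the output is not expected to reproduce Figure 1 or the published risk-score cutoffs.

```python
import math

# Hazard ratios taken from Table 2 (multivariable Cox model, training cohort).
HAZARD_RATIOS = {"sex": 0.719, "cT": 1.239, "cN": 1.284, "GTVp": 1.578}

# ASSUMED coding (hypothetical, not given in the text):
# sex 0 = male, 1 = female; cT 1..4; cN 0..3; GTVp 0 = <30 cm3, 1 = >=30 cm3.
LEVELS = {"sex": (0, 1), "cT": (1, 4), "cN": (0, 3), "GTVp": (0, 1)}

# Cox coefficients are the natural logs of the hazard ratios.
BETAS = {name: math.log(hr) for name, hr in HAZARD_RATIOS.items()}


def effect_span(name):
    """Absolute change in log-hazard across a predictor's full range."""
    low, high = LEVELS[name]
    return abs(BETAS[name]) * (high - low)


# The predictor with the widest span defines the 100-point scale.
MAX_SPAN = max(effect_span(name) for name in BETAS)


def points(name, value):
    """Points for one predictor; its lowest-risk level scores 0."""
    low, high = LEVELS[name]
    beta = BETAS[name]
    # For a protective factor (beta < 0) the lowest-risk level is the highest code.
    reference = low if beta >= 0 else high
    return 100.0 * beta * (value - reference) / MAX_SPAN


def total_points(patient):
    """Sum of points over all predictors for one patient (dict of coded values)."""
    return sum(points(name, value) for name, value in patient.items())
```

Under these assumptions, a call such as `total_points({"sex": 0, "cT": 3, "cN": 1, "GTVp": 1})` returns the total score for a hypothetical male patient with cT3, cN1 disease and GTVp ≥ 30 cm3; a higher total corresponds to a higher predicted hazard.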
Next, the predictive performance of the nomogram model was tested in the training cohort. With 1,000 bootstrap resamples, the C-index for 5-year OS was 0.622 (95% CI: 0.591–0.654) (Table 3). The AUC of the ROC curve for 5-year OS was 0.709 (95% CI: 0.661–0.758) (Figure 2A). Furthermore, the calibration curve for 5-year OS showed good agreement between actual and predicted clinical outcomes (Figure 2B).
Finally, the nomogram model was validated in an external independent cohort. In the validation cohort, the C-index for 5-year OS was 0.713 (95% CI: 0.656–0.771) (Table 3) and the AUC for 5-year OS was 0.739 (95% CI: 0.655–0.823) (Figure 2C), both numerically higher than in the training cohort. The calibration curve demonstrated good agreement between actual and predicted OS (Figure 2D).
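For readers unfamiliar with the C-index reported for both cohorts, the sketch below shows how Harrell's concordance index can be computed for right-censored data. It is an illustrative implementation, not the software used in the study; the function name and the toy data are invented, and bootstrap resampling is omitted for brevity.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times: follow-up time for each patient
    events: 1 if death was observed, 0 if the patient was censored
    risk_scores: model output, where a higher score means higher predicted risk
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i died before patient j's
            # last recorded follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data, invented for illustration only (not taken from the study).
toy_times = [5, 12, 20, 30, 41]
toy_events = [1, 1, 0, 1, 0]
toy_scores = [220, 150, 180, 160, 90]
toy_c = harrell_c_index(toy_times, toy_events, toy_scores)  # 6 of 8 pairs concordant
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the reported 0.622 and 0.713 should be read.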
Comparison of the predictive accuracy between the nomogram and the AJCC staging system
The nomogram model was compared with the 8th AJCC staging system on four indices: C-index, AUC, NRI, and IDI (Table 3). The C-indexes for the nomogram model and the 8th AJCC staging were 0.622 vs. 0.580 in the training cohort (ΔC = 0.0570, 95% CI: 0.0271–0.08072, P <0.001) and 0.713 vs. 0.659 in the validation cohort (ΔC = 0.0541, 95% CI: –0.0026 to 0.0401, P = 0.020). Time-dependent ROC analyses showed that the AUC of the nomogram model was significantly higher than that of the 8th AJCC staging in both the training (0.709 vs. 0.654) and validation (0.739 vs. 0.689) cohorts. For the NRI of 5-year survival, the discrimination ability of the nomogram model relative to the 8th AJCC staging increased by 26.6% and 23.9% in the training and validation cohorts, respectively (all P <0.05). For the IDI of 5-year survival, the nomogram model showed increases of 6.4% and 7.6% in the training and validation cohorts, respectively.
Table 3 The discriminatory ability of the nomogram model vs. AJCC stage
| Cohort and model | C-index (95% CI) | AUC (95% CI) | ΔC-index (P value) | NRI (P value) | IDI (P value) |
|---|---|---|---|---|---|
| TC Nomogram | 0.622 (0.591-0.654) | 0.709 (0.661-0.758) | - | - | - |
| TC AJCC stage | 0.580 (0.548-0.612) | 0.654 (0.605-0.703) | - | - | - |
| VC Nomogram | 0.713 (0.656-0.771) | 0.739 (0.655-0.823) | - | - | - |
| VC AJCC stage | 0.659 (0.602-0.716) | 0.689 (0.605-0.773) | - | - | - |
| TC Nomogram vs. AJCC stage | - | - | 0.057 (P < 0.001) | 0.266 (P < 0.001) | 0.064 (P < 0.001) |
| VC Nomogram vs. AJCC stage | - | - | 0.054 (P = 0.020) | 0.239 (P = 0.040) | 0.076 (P < 0.001) |
Note: TC, training cohort; VC, validation cohort. An NRI or IDI >0 indicates improvement, meaning that the nomogram model predicted better than AJCC stage. An NRI or IDI <0 indicates that the nomogram model predicted worse than AJCC stage. An NRI or IDI of 0 indicates no difference in prediction ability between the two.
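To make the NRI and IDI columns of Table 3 concrete, the sketch below computes both indices for a reference model and a new model. This section does not state which NRI variant was used or how censoring was handled, so the sketch assumes the category-free (continuous) NRI and treats death by 5 years as a simple binary outcome; the function names and toy probabilities are invented for illustration.

```python
def continuous_nri(events, p_old, p_new):
    """Category-free NRI: net share of events whose predicted risk rose,
    plus net share of non-events whose predicted risk fell."""
    up_events = down_events = up_nonevents = down_nonevents = 0
    for event, old, new in zip(events, p_old, p_new):
        if new > old:
            if event:
                up_events += 1
            else:
                up_nonevents += 1
        elif new < old:
            if event:
                down_events += 1
            else:
                down_nonevents += 1
    n_events = sum(events)
    n_nonevents = len(events) - n_events
    return ((up_events - down_events) / n_events
            + (down_nonevents - up_nonevents) / n_nonevents)


def idi(events, p_old, p_new):
    """IDI: difference between the two models' discrimination slopes, where a
    slope is mean predicted risk in events minus mean predicted risk in non-events."""
    def discrimination_slope(probs):
        in_events = [p for p, e in zip(probs, events) if e]
        in_nonevents = [p for p, e in zip(probs, events) if not e]
        return (sum(in_events) / len(in_events)
                - sum(in_nonevents) / len(in_nonevents))
    return discrimination_slope(p_new) - discrimination_slope(p_old)


# Toy data, invented for illustration only: 1 = died within 5 years.
toy_events = [1, 1, 1, 0, 0, 0]
toy_p_old = [0.6, 0.5, 0.4, 0.5, 0.4, 0.3]  # e.g. a staging-only model
toy_p_new = [0.7, 0.6, 0.3, 0.4, 0.4, 0.2]  # e.g. a nomogram-style model
toy_nri = continuous_nri(toy_events, toy_p_old, toy_p_new)
toy_idi = idi(toy_events, toy_p_old, toy_p_new)
```

In both functions a positive value favors the new model, matching the sign convention in the table note.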
Moreover, decision curve analysis (DCA) supported these findings. The 5-year DCA curves showed a greater clinical benefit for the nomogram model than for the 8th AJCC staging (Figure 3). They also showed that GTVp was a strong prognostic factor: the combination of GTVp and the 8th AJCC staging performed markedly better than the 8th AJCC staging alone.
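The quantity plotted on a decision curve is the net benefit at each threshold probability. The sketch below shows the standard net-benefit formula for a binary outcome, alongside the "treat all" reference strategy. It assumes death by 5 years is treated as a binary outcome without censoring, and the function names and toy data are invented for illustration rather than taken from the study.

```python
def net_benefit(events, predicted_risk, threshold):
    """Net benefit of acting on patients whose predicted risk >= threshold.

    net benefit = TP/n - FP/n * threshold / (1 - threshold)
    """
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(events)
    true_pos = sum(1 for e, p in zip(events, predicted_risk) if p >= threshold and e)
    false_pos = sum(1 for e, p in zip(events, predicted_risk) if p >= threshold and not e)
    return true_pos / n - false_pos / n * threshold / (1 - threshold)


def net_benefit_treat_all(events, threshold):
    """Net benefit of the reference strategy that treats every patient as high risk."""
    prevalence = sum(events) / len(events)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Toy data, invented for illustration only: 1 = died within 5 years.
toy_events = [1, 1, 0, 0]
toy_risk = [0.9, 0.6, 0.7, 0.2]
toy_nb_model = net_benefit(toy_events, toy_risk, threshold=0.5)
toy_nb_all = net_benefit_treat_all(toy_events, threshold=0.5)
```

A model shows clinical benefit at a given threshold when its net benefit exceeds both the "treat all" value and zero (the "treat none" strategy); a decision curve repeats this comparison across a range of thresholds.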
Risk-groups categorization
Based on 5-year OS, risk scores were calculated to stratify patients in the training cohort into low-, moderate-, and high-risk groups. OS differed significantly among the three groups (all P <0.05) (Figure 4A). The low-, moderate-, and high-risk groups comprised 356, 132, and 80 patients, with risk-score intervals of <152, 152–213, and >213, respectively. The corresponding 5-year OS rates were 86.1%, 54.5%, and 28.1%.
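The stratification rule can be written as a short function. The cutoffs and group sizes below are the ones reported in the text; the function and constant names are hypothetical, and the handling of the boundary values (152 and 213 assigned to the moderate group) follows the stated intervals.

```python
# Risk-score cutoffs reported for the training cohort.
LOW_CUTOFF = 152   # scores below this value are low risk
HIGH_CUTOFF = 213  # scores above this value are high risk

# Group sizes reported for the training cohort.
GROUP_SIZES = {"low": 356, "moderate": 132, "high": 80}


def risk_group(score):
    """Map a total nomogram score to its risk group (<152, 152-213, >213)."""
    if score < LOW_CUTOFF:
        return "low"
    if score <= HIGH_CUTOFF:
        return "moderate"
    return "high"


# The three groups together account for the whole training cohort (n = 568).
training_total = sum(GROUP_SIZES.values())
```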
To further compare the nomogram with AJCC staging, the OS curves by AJCC stage are shown in Figure 4B. The 5-year OS rates decreased as AJCC clinical stage increased: 88.9%, 55.4%, 36.9%, and 32.5%. However, the OS curves of stages I and II were not well separated, and those of stages III and IVA did not differ significantly (all P >0.05).