Characteristics of RCTs included in the analysis
The systematic literature search identified 2,432 articles. After excluding 50 duplicates, 2,382 studies were further screened. The full texts of 28 articles were finally evaluated after excluding 2,354 studies. Of the 28 studies, 3 duplicate publications, 6 repeat publications, 10 non-RCT studies, and 2 non-chemotherapeutic studies were excluded. The remaining 7 RCTs were considered eligible for the meta-analysis (Figure 1).
The characteristics of the eligible RCTs are summarized in Additional File 1: Table S1 [7-13]. In the 7 eligible RCTs, 3,612 patients were randomly assigned to 17 treatment arms. All treatment arms comprised combination regimens with 3–5 cytotoxic drugs. One treatment arm included lung irradiation as the protocol treatment. No study included molecular-targeted therapy or immunotherapy. Study phases and post-protocol treatments were not clearly described in any study, and intention-to-treat (ITT) analyses were conducted in only 2. The primary endpoint was defined as EFS, including 3-year EFS, in 5 of the 7 RCTs, whereas 2 earlier studies described both survival time and time to relapse as major endpoints [7,8]. All RCTs included both EFS and OS as efficacy measures. Whereas most RCTs focused on localized ES, 2 had subgroup arms for high-risk and metastatic disease, which included 277 patients. The mean of the studies’ median follow-up periods was 6.79 (5.1–8.5) years.
Because the median EFS and OS were not reached in 7 and 5 treatment arms, respectively, analyses of median survival were not included in our study. The radiological response to chemotherapy was not described in any of the studies, and the histological response was assessed in only 2. Therefore, tumor responses could not be evaluated in the present study.
A significant difference in the HRs of EFS was observed between the control and experimental arms (HR 0.80, 95% CI 0.68–0.96, P = 0.01) (Additional File 2: Figures S1).
Meta-analyses of the OS HRs revealed significantly better survival in the experimental arm than in the standard arm (HR 0.79, 95% CI 0.63–0.98, P = 0.03) (Additional File 3: Figures S2). Figure 2 displays forest plots for the treatment effects estimated by hazard ratios (HR) on the 2-year OS and 1-year PFS, TTP, and TTF for each trial.
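As a rough consistency check on the pooled estimates reported above, the standard error of a log HR can be recovered from its 95% CI and converted into a two-sided P value under a normal approximation. The sketch below is illustrative only and is not the method used in the analysis; the function name `p_from_hr_ci` is hypothetical, and the only inputs are the rounded values quoted in the text.

```python
import math

def p_from_hr_ci(hr, lower, upper, z_crit=1.959964):
    """Recover SE(log HR), the z statistic, and a two-sided P value
    from a hazard ratio and its 95% CI (normal approximation)."""
    # The CI is symmetric on the log scale: width = 2 * z_crit * SE
    se = (math.log(upper) - math.log(lower)) / (2 * z_crit)
    z = math.log(hr) / se
    # Two-sided P value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return se, z, p

# Pooled estimates as quoted in the text (rounded to two decimals)
for label, hr, lo, hi in [("EFS", 0.80, 0.68, 0.96), ("OS", 0.79, 0.63, 0.98)]:
    se, z, p = p_from_hr_ci(hr, lo, hi)
    print(f"{label}: SE(log HR) = {se:.3f}, z = {z:.2f}, P = {p:.3f}")
```

Because the inputs are rounded, the recovered P values should be expected to agree with the reported ones only approximately.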
Correlations between EFS and OS
The trial-level correlation between HRs for EFS and OS was good (R2 = 0.747, 95% CI 0.531–0.981) (Table 1, Figure 2a). The Spearman’s rank correlation coefficient (ρ) was 0.683 (95% CI 0.035–0.927, P = 0.042). However, the R2 for the association between the OS HR and the 1-year EFS was moderate (R2 = 0.348, 95% CI 0.00–0.759; ρ = 0.450, 95% CI -0.305–0.858, P = 0.22) (Table 1, Figure 2b). The correlations between the OS HR and the 3-year EFS (R2 = 0.765, 95% CI 0.545–0.985; ρ = 0.717, 95% CI 0.10–0.936, P = 0.030) and the 5-year EFS (R2 = 0.695, 95% CI 0.423–0.967; ρ = 0.767, 95% CI 0.209–0.948, P = 0.016) were assessed as very good and good, respectively (Table 1, Figures 2c-d).
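For readers unfamiliar with the two statistics, the following minimal sketch shows one common way to compute a trial-level R2 (the squared correlation from a linear regression of log OS HR on log EFS HR, optionally weighted) and Spearman’s ρ (the Pearson correlation of the ranks). It is an assumption that the regression here was fitted on the log scale and whether weights were used; the section does not say. The HR values in the example are hypothetical placeholders, not trial data, and all function names are illustrative.

```python
import math

def pearson(x, y, w=None):
    """(Weighted) Pearson correlation between two equal-length sequences."""
    w = w or [1.0] * len(x)
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    cov = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    vx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    vy = sum(wi * (yi - my) ** 2 for wi, yi in zip(w, y))
    return cov / math.sqrt(vx * vy)

def ranks(values):
    """1-based ranks, with ties given their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    out = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            out[order[k]] = avg
        i = j + 1
    return out

def trial_level_r2(hr_surrogate, hr_os, weights=None):
    """R2 of a simple linear regression of log(OS HR) on log(surrogate HR).
    With one predictor this equals the squared (weighted) correlation."""
    x = [math.log(h) for h in hr_surrogate]
    y = [math.log(h) for h in hr_os]
    return pearson(x, y, weights) ** 2

def spearman_rho(a, b):
    """Spearman's rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(a), ranks(b))

# Hypothetical placeholder HRs for five comparisons (NOT data from the trials)
hr_efs = [0.70, 0.85, 0.95, 1.05, 0.80]
hr_os = [0.72, 0.90, 0.93, 1.10, 0.78]
print("R2 =", round(trial_level_r2(hr_efs, hr_os), 3))
print("rho =", round(spearman_rho(hr_efs, hr_os), 3))
```

Passing per-comparison sample sizes as `weights` would give larger trials more influence on R2; that choice is also an assumption and would need to match the analysis plan.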
Similar to what we observed for the 1-year EFS, the correlation between the 1-year OS and the OS HR was poor (R2 = 0.089, 95% CI 0.00–0.408; ρ = 0.214, 95% CI -0.642–0.833, P = 0.64). Meanwhile, the 3-year OS (R2 = 0.831, 95% CI 0.650–1.00; ρ = 0.929, 95% CI 0.584–0.990, P = 0.0025) and the 5-year OS (R2 = 0.809, 95% CI 0.625–0.993; ρ = 0.767, 95% CI 0.209–0.948, P = 0.016) showed very good correlations with OS HR (Table 1, Figures 3a-c).
Further sensitivity analyses were conducted for surrogacy evaluation by removing the treatment arms of the metastatic and high-risk populations. There were 2 RCTs (INT-0091 and EICESS-92) in which metastatic ES was included. In INT-0091, all 120 patients in the metastatic subgroup had metastatic disease [9]. In EICESS-92, by contrast, “high-risk” was defined as a large localized tumor (≥ 100 ml) or metastatic disease [10]. Thus, the high-risk subgroup in EICESS-92 included 157 patients with metastatic disease and 335 patients with non-metastatic large localized tumors. After the removal of these subgroups, localized ES analyses revealed an improved correlation between the surrogate endpoints and OS. The correlation between the EFS HR and the OS HR was very good (R2 = 0.818, 95% CI 0.625–1.00; ρ = 0.929, 95% CI 0.584–0.990, P = 0.0025) (Table 2, Figure 4a). The R2 for the association between the OS HR and the 1-year EFS remained moderate (R2 = 0.436, 95% CI 0.00–0.873; ρ = 0.750, 95% CI -0.007–0.961, P = 0.052) (Table 2, Figure 4b). The correlations between the OS HR and the 3-year EFS (R2 = 0.807, 95% CI 0.604–1.00; ρ = 0.857, 95% CI 0.294–0.979, P = 0.014) and the 5-year EFS (R2 = 0.772, 95% CI 0.537–1.00; ρ = 0.929, 95% CI 0.584–0.990, P = 0.0025) were very good (Table 2, Figures 4c-d). The correlation between the 1-year OS and the OS HR remained poor (R2 = 0.136, 95% CI 0.00–0.535; ρ = 0.257, 95% CI -0.701–0.884, P = 0.62); however, the 3-year OS (R2 = 0.858, 95% CI 0.693–1.00; ρ = 0.943, 95% CI 0.559–0.994, P = 0.0048) and the 5-year OS (R2 = 0.895, 95% CI 0.778–1.00; ρ = 0.929, 95% CI 0.584–0.990, P = 0.0025) showed nearly excellent correlations with OS HR (Table 2, Figures 5a-c).
When two very old RCTs conducted in the 1970s were further removed from the evaluation of surrogacy, the correlation between HRs for EFS and OS was good, with R2 = 0.519 (95% CI 0.041–0.997) and ρ = 0.800 (95% CI -0.280–0.986).
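The widening of the confidence intervals for ρ as comparisons are removed can be illustrated with the Fisher z-transformation, a standard approximation for the CI of a correlation coefficient. The sketch below is an assumption about how such intervals may be obtained, not a statement of the method used here. The number of paired comparisons is not given in this section; the values 9, 7, and 5 are assumptions chosen because they approximately reproduce the reported intervals for the EFS HR versus OS HR correlation.

```python
import math

def fisher_ci(rho, n, z_crit=1.959964):
    """Approximate 95% CI for a correlation coefficient using the
    Fisher z-transformation with standard error 1 / sqrt(n - 3)."""
    if n <= 3:
        raise ValueError("Fisher z interval requires n > 3")
    z = math.atanh(rho)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)

# rho values as quoted in the text; n values are ASSUMED, not reported
for label, rho, n in [
    ("all comparisons", 0.683, 9),
    ("localized ES only", 0.929, 7),
    ("1970s RCTs also removed", 0.800, 5),
]:
    lo, hi = fisher_ci(rho, n)
    print(f"{label}: rho = {rho:.3f}, n = {n}, 95% CI {lo:.3f} to {hi:.3f}")
```

With so few data points the interval is dominated by the 1 / sqrt(n - 3) term, which is why a point estimate of 0.800 can still have a lower bound below zero.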