Baseline characteristics
From January 2009 to December 2015, a total of 1,879 consecutive patients with pN0-1 breast cancer who were treated with mastectomy were included in the analysis. We first randomly divided the patients into a training cohort and a testing (internal validation) cohort at a 7:3 ratio, with 1,316 patients in the training cohort and 563 in the validation cohort. No variables differed significantly between the two cohorts. The median age at diagnosis was 56 years (range, 28–92 years) in the training cohort and 58 years (range, 23–92 years) in the validation cohort. The median tumor size was 2 cm in both cohorts. The majority of patients (1,590 patients, 84.6%) did not have lymph node (LN) metastasis. A total of 1,731 patients (92.1%) received adjuvant systemic therapy, and 623 patients (33%) received both adjuvant chemotherapy and hormonal therapy. Adjuvant chemotherapy was given to 785 patients (60%) in the training cohort and 311 patients (55.2%) in the validation cohort. Among the 1,096 patients who received chemotherapy, 537 (49%) were treated with doxorubicin, cyclophosphamide, and paclitaxel; 511 (46.6%) were treated with doxorubicin-containing or taxane-containing regimens; and the remaining 48 (4.4%) were treated with other regimens. Adjuvant hormonal therapy was given to 868 patients (66.0%) in the training cohort and 383 patients (68%) in the validation cohort. Among the 1,251 patients treated with hormonal therapy, 698 (55.8%) received aromatase inhibitors (AIs) and 549 (43.9%) received selective estrogen receptor modulators (SERMs). Detailed baseline characteristics are listed in Table 1.
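The random 7:3 split described above could be reproduced along the following lines. This is only a minimal sketch: the cohort sizes (1,879, 1,316, and 563) come from the text, whereas the function name `split_cohort`, the integer patient identifiers, and the seed are hypothetical, and the procedure the authors actually used is not stated.

```python
import random


def split_cohort(patient_ids, n_train, seed=0):
    """Shuffle patient identifiers and split them into training and testing sets.

    `seed` is an arbitrary value chosen for reproducibility (an assumption).
    """
    ids = list(patient_ids)
    rng = random.Random(seed)
    rng.shuffle(ids)
    return ids[:n_train], ids[n_train:]


# Hypothetical identifiers 0..1878 stand in for the 1,879 patients.
train_ids, test_ids = split_cohort(range(1879), n_train=1316)
print(len(train_ids), len(test_ids))
```

Fixing the seed keeps the split reproducible, which matters when baseline characteristics of the two cohorts are later compared.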
Survival outcomes
By the latest follow-up in October 2019, with a median follow-up of 60 months, a total of 101 patients (5.37%) in the entire cohort had died; 55 deaths (2.93%) were attributed to breast cancer and the remaining 46 (2.45%) to other causes. The survival outcomes of the entire cohort were excellent. The 5- and 10-year breast cancer-specific survival (BCSS) rates were 98% and 95%, respectively (supplemental figure 1A and 1B). The 5- and 10-year overall survival (OS) rates were 97% and 91%, respectively (figure 1C and 1D). A total of 44 patients developed locoregional recurrence (LRR), and the 5- and 10-year cumulative LRR rates were 2% and 3%, respectively (supplemental figure 2). In addition, a total of 90 patients developed distant metastasis (DM), and the 5- and 10-year cumulative DM rates were 4% and 6%, respectively (supplemental figure 2). In the external validation cohort, the 5-year OS and BCSS rates were 95% and 97%, respectively (supplemental figure 3), and the 5-year cumulative incidences of LRR and DM were 3% and 6%, respectively (supplemental figure 3).
Factors associated with BCSS, LRR and DM
A total of 16 variables were considered as potential predictors. We applied LASSO regression to these variables for predictor selection in the training cohort. For BCSS, as shown in figure 3, the partial likelihood deviance reached its minimum at a tuning parameter g of 0.0059 (log g = -5.14), and five variables with nonzero coefficients were obtained from the LASSO analysis (figure 2).
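For orientation, a standard form of the LASSO-penalized Cox objective is sketched below. The notation is an assumption rather than the authors' own: the tuning parameter (written g in the text) is denoted \lambda, x_i is the covariate vector of patient i, \delta_i is the event indicator, and R(t_i) is the set of patients still at risk at time t_i. Scaling conventions for the likelihood term differ between software packages.

```latex
% Cox partial log-likelihood (no tied event times assumed)
\ell(\beta) = \sum_{i:\,\delta_i = 1}
  \left[ x_i^{\top}\beta - \log \sum_{j \in R(t_i)} \exp\!\left(x_j^{\top}\beta\right) \right]

% LASSO estimate for a given tuning parameter \lambda
\hat{\beta}(\lambda) = \arg\min_{\beta}
  \left\{ -\frac{1}{n}\,\ell(\beta) + \lambda \sum_{k=1}^{p} \lvert \beta_k \rvert \right\}

% Arithmetic check of the reported value, assuming a natural logarithm
\ln(0.0059) \approx -5.13
```

The value -5.13 agrees with the reported -5.14 to within the rounding of the tuning parameter to 0.0059. Larger values of \lambda shrink more coefficients exactly to zero, which is why only a subset of the 16 candidate variables is retained.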
For LRR-free survival, LASSO regression showed that the number of positive LNs was the only significant predictor. Univariate Cox analysis also found that the number of positive LNs was a significant risk factor for LRR (HR 1.71, 95% CI: 1.13-2.60, p=0.027, table 2). Additionally, for DM-free survival, ten variables with nonzero coefficients were obtained from the LASSO analysis: tumor location, age, number of positive LNs, pathological T stage, Ki-67, PR status, grade, number of resected LNs, adjuvant hormonal therapy, and anti-HER-2 therapy.
Construction of the nomogram
Univariate analysis showed that all five variables were significant predictors of BCSS (all p<0.05). Multivariate Cox regression analysis showed that four variables, namely age (p=0.052), number of positive LNs (p<0.001), pathological T stage (p=0.021 and p<0.001), and Ki-67 (p=0.005, table 2), had independent prognostic significance for BCSS. All five variables were used to construct the nomogram for BCSS (figure 3A). The newly developed predictive model showed good discrimination, with a C-index of 0.81, and the calibration plot showed excellent concordance between the predicted and observed 5-year BCSS probabilities (figure 4A).
Additionally, univariate analysis showed that six of the ten variables were significant predictors of DM-free survival (number of positive LNs, pathological T stage, Ki-67, number of resected LNs, grade, and PR status; p<0.05). Multivariate Cox regression analysis showed that four variables, namely number of positive LNs (p=0.002), pathological T stage (p=0.01 and p=0.002), Ki-67 (p=0.018), and number of resected LNs (p=0.037), had independent prognostic significance for DM (table 2). Finally, six variables were used to construct the nomogram for DM-free survival (figure 3B). The C-index of the nomogram in the testing data set was 0.77, and the calibration plot indicated good concordance between the predicted and observed 5-year DM-free survival probabilities (figure 4D).
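To make the C-index values reported above concrete, the following is a minimal sketch of Harrell's concordance index for right-censored data. It assumes this is the variant behind the reported figures, which the text does not state; the function name `harrell_c_index` and the toy data are hypothetical and unrelated to the study cohorts.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's C: fraction of comparable pairs ranked correctly by risk score.

    A pair (i, j) is comparable when patient i had an event and patient j
    was followed for longer. Ties in risk score count as half concordant.
    Ties in follow-up time are ignored in this simplified version.
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # censored patients cannot anchor a comparable pair
        for j in range(n):
            if times[j] > times[i]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: follow-up in months, event indicator (1 = event), predicted risk.
toy_times = [5, 10, 15, 20]
toy_events = [1, 1, 0, 1]
toy_risk = [0.9, 0.4, 0.6, 0.1]
toy_c = harrell_c_index(toy_times, toy_events, toy_risk)
print(toy_c)
```

A value of 0.5 corresponds to random ranking and 1.0 to perfect ranking, so the reported values between 0.65 and 0.81 indicate moderate to good discrimination.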
Internal validation of nomogram
In the internal validation data set of 563 patients, the C-index for BCSS and DM-free survival was 0.65 and 0.70, respectively. Calibration plots were used to compare the predicted 5-year BCSS and DM-free survival probabilities with the observed 5-year probabilities. The calibration curves showed good concordance between the predicted and observed probabilities (figure 4B and figure 4E).
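The text does not say how the calibration curves were computed. A common approach, assumed here purely for illustration, groups patients by predicted survival probability and compares each group's mean prediction with its Kaplan-Meier estimate at the chosen horizon. The function names, the number of groups, and the toy data below are all hypothetical.

```python
def km_survival_at(times, events, horizon):
    """Kaplan-Meier survival probability at `horizon` (same time unit as `times`)."""
    surv = 1.0
    event_times = sorted({t for t, e in zip(times, events) if e and t <= horizon})
    for t in event_times:
        at_risk = sum(1 for u in times if u >= t)
        deaths = sum(1 for u, e in zip(times, events) if e and u == t)
        surv *= 1.0 - deaths / at_risk
    return surv


def calibration_table(pred_surv, times, events, horizon, n_groups=2):
    """Return (mean predicted, observed KM) survival per group of predicted risk."""
    if len(pred_surv) < n_groups:
        raise ValueError("fewer patients than groups")
    order = sorted(range(len(pred_surv)), key=lambda i: pred_surv[i])
    size = len(order) // n_groups
    rows = []
    for g in range(n_groups):
        start = g * size
        # The last group absorbs any remainder.
        end = (g + 1) * size if g < n_groups - 1 else len(order)
        idx = order[start:end]
        mean_pred = sum(pred_surv[i] for i in idx) / len(idx)
        observed = km_survival_at(
            [times[i] for i in idx], [events[i] for i in idx], horizon
        )
        rows.append((mean_pred, observed))
    return rows


# Toy data: predicted 5-year survival, follow-up in months, event indicator.
toy_pred = [0.70, 0.75, 0.80, 0.95, 0.97, 0.99]
toy_months = [20, 70, 40, 80, 90, 65]
toy_status = [1, 0, 1, 0, 0, 0]
rows = calibration_table(toy_pred, toy_months, toy_status, horizon=60)
for mean_pred, observed in rows:
    print(round(mean_pred, 2), round(observed, 2))
```

In a calibration plot, each row becomes one point, and points close to the diagonal indicate agreement between predicted and observed probabilities.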
External validation of nomogram
The model was externally validated in an independent cohort of 1,356 breast cancer patients from a phase III trial (NCT00041119). As Ki-67 data could not be obtained from the phase III trial, we constructed a modified nomogram based on four variables: age, number of positive LNs, pathological T stage, and grade. The C-index of the modified model for BCSS was 0.79 in the training data set and 0.72 in the external validation data set. The calibration plot also showed good concordance between the predicted and observed probabilities in the external data set (figure 4C).
In addition, because Ki-67 and the number of resected LNs could not be obtained from the trial, a modified nomogram for DM-free survival based on four variables (pathological T stage, number of positive LNs, grade, and PR status) was established for the external validation cohort. The C-index of the modified model for DM-free survival was 0.72 in the training data set and 0.69 in the external validation data set. Good concordance between the predicted and observed 5-year DM-free survival probabilities was observed in the external data set (figure 4F).