Baseline characteristics
In total, this study included 2065 patients diagnosed with early or locally advanced BC who were treated with NAC followed by curative surgery (Table S1). Data on every characteristic were available for all patients, except for baseline serum CEA (missing: n=35, 1.7%), CA-15-3 (n=30, 1.4%), and ER score (n=1, 0.1%). Median age at BC diagnosis was 46.6 years, and 65.5% of patients were premenopausal. By subtype, there were 643 (31.1%) HR+/HER2- BCs, 343 (16.6%) HR+/HER2+ BCs, 412 (19.9%) HR-/HER2+ BCs, and 667 (32.3%) TNBCs. Clinical stages I and II comprised 35.8% of cases, while stage III comprised 66.2% of cases. The pCR rate was 30.6% in the entire population, 10.6% in the HR+/HER2- group, 39.9% in the HR+/HER2+ group, 45.4% in the HR-/HER2+ group, and 30.3% in the TNBC group.
Feature selection
Univariate analyses of the relationship between baseline patient characteristics and pathologic response to NAC were performed (Table 1). Age, menopausal status, baseline CA-15-3, Allred scores of ER and PR, HER2 status, Ki-67 expression level, clinical stage, T and N stages, and NAC regimen were each significantly associated with pCR status (p<0.05 for each). Among these factors, age and clinical stage were excluded because they were strongly correlated with other factors: age with menopausal status (correlation coefficient: 0.75) and clinical stage with N stage (correlation coefficient: 0.92) (Figure S1). ER and PR scores were also strongly correlated (correlation coefficient: 0.84), but both were retained for further analysis because of their clinical importance. Lastly, CEA and histology were added after literature review, even though these two factors were not significantly associated with pCR [19-21]. Therefore, 11 features were selected: menopausal status, CEA, CA-15-3, histology, ER score, PR score, HER2 status, Ki-67, T stage, N stage, and NAC regimen.
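The correlation screen described above can be illustrated with a minimal sketch. It assumes a Pearson coefficient and a hypothetical cutoff of 0.7; the text does not state which coefficient or cutoff was used. The helper names pearson and collinear_pairs are hypothetical, and the toy values are invented for illustration, not study data.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def collinear_pairs(columns, cutoff=0.7):
    """Return (name_a, name_b, r) for every feature pair with |r| >= cutoff.

    The cutoff is an assumed value chosen for this illustration.
    """
    flagged = []
    for (name_a, col_a), (name_b, col_b) in combinations(columns.items(), 2):
        r = pearson(col_a, col_b)
        if abs(r) >= cutoff:
            flagged.append((name_a, name_b, round(r, 2)))
    return flagged


# Toy data (invented): age tracks menopausal status, Ki-67 does not.
toy = {
    "age": [34, 41, 45, 52, 58, 63],
    "postmenopausal": [0, 0, 0, 1, 1, 1],
    "ki67": [30, 10, 60, 20, 45, 15],
}
flagged = collinear_pairs(toy)
print(flagged)
```

A screen of this kind only flags candidate pairs. Which member of a pair to drop, or whether to keep both as was done for ER and PR, remains a clinical judgment.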
Multivariate analysis with machine learning model
Six ML models were tested using the 11 selected features (Figure 1A): logistic regression (LR), LR with L1 penalty, linear support vector machine (SVM-L), radial basis function SVM, random forest, and LightGBM. Among these six models, LightGBM had the highest performance for pCR prediction, with an AUC of 0.78 in a single-fold evaluation. Therefore, further multivariate analysis was performed using the LightGBM model. After hyper-parameter tuning with Bayesian optimization, the AUC increased from 0.7845 to 0.810, averaged over the 10-fold results (Figure 1B, Table 2). Detailed hyper-parameters of LightGBM are described in Supplementary Table S2.
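For readers less familiar with the metric, the following minimal sketch shows how an AUC can be computed on each held-out fold and then averaged. It assumes the rank-based (Mann-Whitney) definition of AUC; the helper names auc and mean_fold_auc are hypothetical, and the labels and scores are invented toy values rather than outputs of the models above.

```python
def auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def mean_fold_auc(folds):
    """Average AUC over (labels, scores) pairs, one pair per held-out fold."""
    values = [auc(labels, scores) for labels, scores in folds]
    return sum(values) / len(values)


# Two toy held-out folds (invented): 1 = pCR, 0 = no pCR.
folds = [
    ([1, 0, 1, 0], [0.9, 0.2, 0.6, 0.7]),  # 3 of 4 pairs ranked correctly
    ([0, 1, 0, 1], [0.1, 0.8, 0.3, 0.4]),  # all 4 pairs ranked correctly
]
fold_mean = mean_fold_auc(folds)
print(fold_mean)  # 0.875
```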
Weight of feature importance affecting the machine learning model
Permutation feature importance was performed to identify the features that most affected pCR prediction. This analysis identified seven features that significantly affected pCR prediction. NAC regimen was the largest contributor, with an AUC drop of 0.26 (-0.26 ± 0.033). Other contributing features included ER score (-0.04 ± 0.010), N stage (-0.02 ± 0.010), T stage (-0.02 ± 0.011), and Ki-67 (-0.01 ± 0.007). Permuting menopausal status, histology, PR score, or HER2 status did not change the AUC in this ML model.
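The idea behind permutation feature importance can be sketched as follows: one feature column is shuffled, predictions are recomputed, and the change in AUC relative to the unshuffled baseline is recorded over several repeats. The sketch assumes that the paired numbers reported above are the mean and standard deviation of the AUC change; the function names, the toy model, and the feature names signal and noise are all hypothetical.

```python
import random
from statistics import mean, stdev


def auc(labels, scores):
    """Rank-based AUC; ties between a positive and a negative count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def permutation_importance(predict, rows, labels, feature, n_repeats=20, seed=0):
    """Mean and SD of the AUC change when one feature column is shuffled."""
    rng = random.Random(seed)
    baseline = auc(labels, [predict(r) for r in rows])
    changes = []
    for _ in range(n_repeats):
        column = [r[feature] for r in rows]
        rng.shuffle(column)
        # Copy each row with only the chosen feature replaced.
        shuffled = [dict(r, **{feature: v}) for r, v in zip(rows, column)]
        changed = auc(labels, [predict(r) for r in shuffled])
        changes.append(changed - baseline)
    return mean(changes), stdev(changes)


def toy_model(row):
    """Hypothetical fitted model: the score depends on 'signal' only."""
    return row["signal"]


# Toy cohort (invented): 'signal' separates the labels, 'noise' is unused.
rows = [{"signal": s, "noise": n} for s, n in
        [(0.9, 3), (0.8, 1), (0.7, 4), (0.6, 1), (0.4, 5), (0.3, 9), (0.2, 2), (0.1, 6)]]
labels = [1, 1, 1, 1, 0, 0, 0, 0]

signal_change = permutation_importance(toy_model, rows, labels, "signal")
noise_change = permutation_importance(toy_model, rows, labels, "noise")
print(signal_change, noise_change)
```

In this toy setting, shuffling the unused feature leaves the AUC unchanged, which mirrors the features reported above as not changing the AUC, whereas shuffling the informative feature lowers it.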
Machine learning model for pCR prediction according to BC subtype
Further analyses were performed to establish a pCR prediction model for each BC subtype. The BC cohort consisted of 643 (31.1%) HR+/HER2- BCs, 343 (16.6%) HR+/HER2+ BCs, 412 (19.9%) HR-/HER2+ BCs, and 667 (32.3%) TNBCs. The pCR rate by subtype was 10.6% for HR+/HER2- BCs, 39.9% for HR+/HER2+ BCs, 45.4% for HR-/HER2+ BCs, and 30.3% for TNBCs. AUC, sensitivity, and specificity of pCR prediction by BC subtype are described in Table S4. Among the four subtypes, HR+/HER2- BCs had the highest AUC (0.841), followed by HR-/HER2+ BCs (0.753), HR+/HER2+ BCs (0.716), and TNBCs (0.653) (Figure 3).
Permutation feature importance was also performed for each BC subtype (Figure 4). In HR+/HER2- BCs, AUC changes were observed for CEA (-0.08 ± 0.053), ER score (-0.07 ± 0.017), CA-15-3 (-0.05 ± 0.033), Ki-67 (-0.03 ± 0.020), and PR score (-0.03 ± 0.016). For HR+/HER2+ BCs, PR score markedly decreased the AUC (-0.20 ± 0.054), while NAC regimen (-0. ± 0.025), menopausal status (-0.0 ± 0.031), N stage (-0.05 ± 0.044), Ki-67 (-0.0 ± 0.021), CEA (-0.03 ± 0.010), and ER score (-0.0 ± 0.031) also affected pCR prediction. Interestingly, N stage was the largest contributor to pCR prediction in HR-/HER2+ BCs (AUC change: -0.19 ± 0.027), followed by NAC regimen (-0.08 ± 0.024), serum CA-15-3 (-0.02 ± 0.012), T stage (-0.02 ± 0.015), and serum CEA (-0.01 ± 0.005). For TNBC, NAC regimen, T stage, and Ki-67 had the largest effects (AUC change: -0.07 ± 0.037, -0.07 ± 0.054, and -0.05 ± 0.016, respectively).