Patient characteristics
Table 1 summarizes the clinicopathologic characteristics of the 1536 patients with early-stage breast cancer from four centers, TCIA, and DUKE. The mean age of the enrolled patients was 52.0 ± 10.7 years (range: 21–85 years). Among the 1211 patients from the four centers, 540 (44.6%) had positive and 671 (55.4%) had negative ALN status. Of the 540 patients with positive ALN status, 215 (39.8%) had a high ALN burden. Patients from Center I and DUKE were followed up for overall survival (OS), with a median [interquartile range] OS of 19.8 [9.42–41.2] months and 46.4 [28.8–63.1] months, respectively.
Table 1
Demographic and clinicopathological characteristics
Characteristics | Levels | Centre I (n = 532) | Centre II (n = 113) | Centre III (n = 185) | Centre IV (n = 381) | TCGA (n = 99) | Duke (n = 226) |
---|
ALN burden | Low | 448 (84.2%) | 90 (79.6%) | 149 (80.5%) | 309 (81.1%) | NA | NA |
| High | 84 (15.8%) | 23 (20.4%) | 36 (19.5%) | 72 (18.9%) | NA | NA |
ALN status | Negative | 272 (51.1%) | 62 (54.9%) | 119 (64.3%) | 218 (57.2%) | NA | NA |
| Positive | 260 (48.9%) | 51 (45.1%) | 66 (35.7%) | 163 (42.8%) | NA | NA |
Menopausal Status | Postmenopausal | 241 (45.3%) | 52 (46%) | 124 (67%) | 297 (78%) | NA | NA |
| Premenopausal | 291 (54.7%) | 61 (54%) | 61 (33%) | 84 (22%) | NA | NA |
Histological grade | Grade 1 (low) | 34 (6.4%) | 9 (7.9%) | 14 (7.6%) | 45 (11.8%) | NA | NA |
| Grade 2 (intermediate) | 313 (58.8%) | 35 (31.0%) | 88 (47.6%) | 130 (34.1%) | NA | NA |
| Grade 3 (high) | 185 (34.8%) | 69 (61.1%) | 83 (44.9%) | 206 (54.1%) | NA | NA |
Histological type | Ductal | 494 (92.9%) | 100 (88.5%) | 165 (89.2%) | 341 (89.5%) | NA | 207 (91.6%) |
| Lobular | 12 (2.3%) | 3 (2.7%) | 4 (2.2%) | 14 (3.7%) | NA | 16 (7.1%) |
| Other | 26 (4.9%) | 10 (8.8%) | 16 (8.6%) | 26 (6.8%) | NA | 3 (1.2%) |
Enhancement pattern | Mass | 418 (78.6%) | 91 (80.5%) | 140 (75.7%) | 305 (80.1%) | NA | NA |
| Non-mass | 114 (21.4%) | 22 (19.5%) | 45 (24.3%) | 76 (19.9%) | NA | NA |
MRI ALN status | Negative | 405 (76.1%) | 88 (77.9%) | 164 (88.6%) | 296 (77.7%) | NA | NA |
| Positive | 127 (23.9%) | 25 (22.1%) | 21 (11.4%) | 85 (22.3%) | NA | NA |
MRI ALN burden | Negative | 505 (94.9%) | 104 (92%) | 174 (94.1%) | 358 (94%) | NA | NA |
| Positive | 27 (5.1%) | 9 (8%) | 11 (5.9%) | 23 (6%) | NA | NA |
Ki67 (%) | <14 | 135 (25.4%) | 29 (25.7%) | 49 (26.5%) | 139 (36.5%) | NA | NA |
| ≥ 14 | 397 (74.6%) | 84 (74.3%) | 136 (73.5%) | 242 (63.5%) | NA | NA |
ER | Negative | 85 (16%) | 35 (31%) | 57 (30.8%) | 112 (29.4%) | NA | 56 (24.8%) |
| Positive | 447 (84%) | 78 (69%) | 128 (69.2%) | 269 (70.6%) | NA | 170 (75.2%) |
PR | Negative | 120 (22.6%) | 34 (30.1%) | 69 (37.3%) | 121 (31.8%) | NA | 74 (32.7%) |
| Positive | 412 (77.4%) | 79 (69.9%) | 116 (62.7%) | 260 (68.2%) | NA | 152 (67.3%) |
HER2 | Negative | 417 (78.4%) | 63 (55.8%) | 107 (57.8%) | 278 (73%) | NA | 190 (84.1%) |
| Positive | 97 (18.2%) | 30 (26.5%) | 34 (18.4%) | 70 (18.4%) | NA | 36 (15.9%) |
| Uncertainty | 18 (3.4%) | 20 (17.7%) | 44 (23.8%) | 33 (8.7%) | NA | 0 (0%) |
Molecular subtype | Luminal | 456 (85.9%) | 80 (74.1%) | 134 (78.8%) | 277 (76.5%) | NA | NA |
| HER2-positive | 35 (6.6%) | 17 (15.7%) | 18 (10.6%) | 32 (8.8%) | NA | NA |
| Triple-negative | 40 (7.5%) | 11 (10.2%) | 18 (10.6%) | 53 (14.6%) | NA | NA |
Clinical Tumor stage | T1 | 206 (38.7%) | 51 (45.1%) | 75 (40.5%) | 150 (39.4%) | 43 (43.4%) | 128 (56.6%) |
| T2 | 326 (61.3%) | 62 (54.9%) | 110 (59.5%) | 231 (60.6%) | 56 (56.6%) | 98 (43.4%) |
Age | Median (IQR) | 50.00 [44.00, 59.00] | 51.00 [44.00, 60.00] | 56.00 [49.00, 68.00] | 53.00 [46.00, 62.00] | 53.00 [45.00, 62.00] | NA |
Radscore | Median (IQR) | 0.41 [0.25, 0.53] | 0.41 [0.22, 0.52] | 0.41 [0.22, 0.53] | 0.38 [0.19, 0.52] | -0.03 [-0.27, 0.25] | -0.08 [-0.43, 0.23] |
ALN, axillary lymph node; ER, estrogen receptor; PR, progesterone receptor; HER2, human epidermal growth factor receptor 2; IQR, interquartile range; NA, not available. |
Feature selection and radscore calculation
A total of 944 features were extracted from the ROI. Figures S1a–b show that the feature distributions from different centers were scattered before ComBat harmonization and converged once the center effect had been removed using ComBat. Features were then selected in four steps. First, 736 features with ICC > 0.75 were retained. Second, 388 features were retained using independent-samples t-tests or Mann–Whitney U tests (p < 0.05). Third, 37 features were retained based on correlation analysis. Fourth, logistic regression analysis was performed on the remaining features after upsampling high-burden patients four times (Figures S1c–d) to reduce inter-sequence redundancy, yielding nine features. Finally, a back-propagation neural network (BPNN) model was trained on these nine most predictive features to predict ALN burden, and its predicted probability constituted the radscore.
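As a rough illustration of the second and third filtering steps, the sketch below keeps features that differ between burden groups and then prunes highly correlated ones. It is not the authors' code. The Mann–Whitney test here uses a normal approximation without tie correction, and the correlation threshold `r_max = 0.9`, the use of Pearson correlation, and all function names are assumptions, since the text does not state the correlation cutoff or method.

```python
import math

def average_ranks(values):
    """Return 1-based ranks, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def mann_whitney_p(x, y):
    """Two-sided Mann-Whitney U p-value (normal approximation, no tie correction)."""
    n1, n2 = len(x), len(y)
    ranks = average_ranks(list(x) + list(y))
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2
    mu = n1 * n2 / 2
    sigma = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u1 - mu) / sigma
    return math.erfc(abs(z) / math.sqrt(2))

def pearson_r(a, b):
    """Pearson correlation; assumes neither input is constant."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((u - ma) * (v - mb) for u, v in zip(a, b))
    va = sum((u - ma) ** 2 for u in a)
    vb = sum((v - mb) ** 2 for v in b)
    return cov / math.sqrt(va * vb)

def select_features(features, labels, alpha=0.05, r_max=0.9):
    """features: dict name -> list of values; labels: 0/1 list (1 = high burden)."""
    # Univariate filter: keep features that differ between the two groups.
    kept = []
    for name, vals in features.items():
        high = [v for v, y in zip(vals, labels) if y == 1]
        low = [v for v, y in zip(vals, labels) if y == 0]
        if mann_whitney_p(high, low) < alpha:
            kept.append(name)
    # Correlation pruning: drop a feature if it is highly correlated
    # with one that has already been retained.
    pruned = []
    for name in kept:
        if all(abs(pearson_r(features[name], features[p])) <= r_max for p in pruned):
            pruned.append(name)
    return pruned
```

With toy data in which feature `b` is a linear copy of `a` and feature `c` is unrelated to the label, `select_features` retains only `a`. The ICC filter, the logistic regression step, and the BPNN itself are outside the scope of this sketch.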
Performance of the prediction models
Univariate and multivariate logistic regression analyses identified menopausal status, MRI-ALN status, MRI-ALN burden, and radscore as independent predictive factors for ALN burden (Table 2). A clinical model and a combined model were constructed based on these factors.
Table 2
Univariate and multivariate logistic regression analysis to assess the association of clinical characteristics and radscore with ALN burden
Characteristics | | OR (univariable) | OR (multivariable) |
---|
Menopausal Status | Postmenopausal | | |
| Premenopausal | 1.60 (1.22–2.10, p < .001) | 1.68 (1.21–2.34, p = .002) |
MRI ALN status | Negative | | |
| Positive | 7.09 (5.18–9.69, p < .001) | 3.14 (2.19–4.49, p < .001) |
MRI ALN burden | Negative | | |
| Positive | 59.33 (18.69–188.32, p < .001) | 19.68 (5.99–64.62, p < .001) |
Radscore | Mean ± SD | 7.47 (5.12–10.88, p < .001) | 3.90 (2.63–5.78, p < .001) |
Clinical T stage was not included in the modeling because it was not significant after combination with the radscore. OR, odds ratio; ALN, axillary lymph node. |
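For a binary predictor such as MRI ALN status, a univariable odds ratio and its Wald 95% confidence interval can be obtained directly from a 2 × 2 table, which gives the same estimate as a univariable logistic regression with that single predictor. The sketch below shows the arithmetic only. The function name and the example counts are hypothetical, because the underlying cross-tabulations are not reported here.

```python
import math

def odds_ratio_ci(a, b, c, d, z=1.96):
    """Odds ratio with a Wald confidence interval from a 2 x 2 table.

    a: predictor positive, high ALN burden
    b: predictor positive, low ALN burden
    c: predictor negative, high ALN burden
    d: predictor negative, low ALN burden
    All four counts must be non-zero.
    """
    odds_ratio = (a * d) / (b * c)
    # Standard error of the log odds ratio.
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se)
    upper = math.exp(math.log(odds_ratio) + z * se)
    return odds_ratio, lower, upper

# Hypothetical counts, for illustration only.
or_, lo, hi = odds_ratio_ci(20, 10, 10, 20)
```

The multivariable odds ratios in Table 2 additionally adjust for the other predictors and cannot be reproduced from a single 2 × 2 table.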
The BPNN radiomics model achieved AUCs of 0.856, 0.781, 0.809, and 0.783 in the training cohort and the three external validation cohorts, respectively. In the validation cohorts, its performance was comparable to that of the combined model (AUCs of 0.899, 0.826, 0.812, and 0.803 across the four cohorts; DeLong’s test, P = 0.112–0.850), whereas the difference was significant in the training cohort (Table 3). The BPNN radiomics model significantly outperformed the clinical model (AUCs of 0.771, 0.689, 0.620, and 0.643; P < 0.01). Table 3 and Figs. 2a–d show the model performance in detail. The BPNN radiomics model also performed better than the MRI-ALN burden assessment in all four cohorts (McNemar’s test, P < 0.001). Figure 3 presents three typical cases demonstrating the clinical application of the radiomics model. Patient 1 was pathologically ALN-negative but was misclassified as ALN-positive on MRI. In contrast, patient 2 was initially diagnosed as ALN-negative but was found to have a low ALN burden pathologically. Similarly, patient 3 was initially deemed ALN-negative but was found to have a high ALN burden.
Table 3
Model performance evaluation and comparison
Cohort | Model | AUC (95% CI) | Accuracy | Sensitivity | Specificity | DeLong P value |
---|
Training Cohort (Centre I) | Clinical model | 0.771 (0.741–0.802) | 0.715 | 0.595 | 0.828 | 0.000 |
| Radscore BPNN model | 0.856 (0.830–0.880) | 0.791 | 0.964 | 0.629 | 0.000 |
| Combined model | 0.899 (0.878–0.920) | 0.831 | 0.929 | 0.739 | - |
Validation Cohort I (Centre II) | Clinical model | 0.689 (0.566–0.805) | 0.752 | 0.435 | 0.833 | 0.005 |
| Radscore BPNN model | 0.781 (0.669–0.870) | 0.681 | 0.826 | 0.644 | 0.173 |
| Combined model | 0.826 (0.732–0.910) | 0.717 | 0.739 | 0.711 | - |
Validation Cohort II (Centre III) | Clinical model | 0.620 (0.514–0.718) | 0.789 | 0.25 | 0.919 | 0.000 |
| Radscore BPNN model | 0.809 (0.733–0.875) | 0.686 | 0.806 | 0.658 | 0.850 |
| Combined model | 0.812 (0.735–0.881) | 0.762 | 0.694 | 0.779 | - |
Validation Cohort III (Centre IV) | Clinical model | 0.643 (0.572–0.716) | 0.756 | 0.444 | 0.828 | 0.000 |
| Radscore BPNN model | 0.783 (0.722–0.835) | 0.706 | 0.764 | 0.693 | 0.112 |
| Combined model | 0.803 (0.748–0.852) | 0.738 | 0.667 | 0.754 | - |
CI, confidence interval; AUC, area under the receiver operating characteristic curve; BPNN, back-propagation neural network. |
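The metrics reported in Table 3 follow standard definitions, which the minimal sketch below makes explicit. It assumes binary labels (1 = high ALN burden) and a predicted score per patient. The function names and the toy data are illustrative and are not taken from the study, and the confidence intervals and DeLong comparisons are not covered.

```python
def auc(scores, labels):
    """AUC as the probability that a positive case outscores a negative one.

    Ties between a positive and a negative score count as 0.5.
    """
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def threshold_metrics(scores, labels, threshold):
    """Accuracy, sensitivity, and specificity when score >= threshold means positive."""
    pairs = list(zip(scores, labels))
    tp = sum(1 for s, y in pairs if s >= threshold and y == 1)
    fn = sum(1 for s, y in pairs if s < threshold and y == 1)
    tn = sum(1 for s, y in pairs if s < threshold and y == 0)
    fp = sum(1 for s, y in pairs if s >= threshold and y == 0)
    return {
        "accuracy": (tp + tn) / len(pairs),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
    }

# Toy data, for illustration only.
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_labels = [0, 0, 1, 1]
toy_auc = auc(toy_scores, toy_labels)
toy_metrics = threshold_metrics(toy_scores, toy_labels, threshold=0.5)
```

The pairwise AUC is quadratic in the number of patients, which is acceptable for cohorts of this size; a rank-based formulation gives the same value more efficiently.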
Prognostic stratification analysis of ALN status-related radscore
A radscore cutoff of 0.542 was determined by maximizing the Youden index, and patients were categorized into predicted high- and low-ALN burden groups. Kaplan–Meier survival curves revealed that the predicted low-ALN burden group had significantly better OS than the predicted high-ALN burden group in both the Center I cohort (hazard ratio [HR] = 31.52, P = 0.034) and the DUKE cohort (HR = 20.72, P = 0.031) (Figs. 2e–f).
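The cutoff selection described above can be sketched as follows. The Youden index is J = sensitivity + specificity − 1, and the cutoff is the score threshold that maximizes J. This is a minimal illustration with toy data. The function names are hypothetical, and treating scores at or above the cutoff as "high" is an assumed convention.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    Every observed score is tried as a threshold; score >= threshold
    is treated as a predicted high burden (labels: 1 = high, 0 = low).
    """
    n_pos = sum(1 for y in labels if y == 1)
    n_neg = len(labels) - n_pos
    best_cutoff, best_j = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:  # on ties, the lowest threshold is kept
            best_cutoff, best_j = t, j
    return best_cutoff, best_j

def stratify(scores, cutoff):
    """Assign each patient to the predicted high- or low-burden group."""
    return ["high" if s >= cutoff else "low" for s in scores]

# Toy data, for illustration only.
toy_scores = [0.1, 0.2, 0.3, 0.6, 0.7, 0.9]
toy_labels = [0, 0, 0, 1, 1, 1]
cutoff, j = youden_cutoff(toy_scores, toy_labels)
groups = stratify(toy_scores, cutoff)
```

The resulting group labels are what a Kaplan–Meier analysis or a Cox model would then compare.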
Potential biological analysis of the ALN-related radiomics model
This study identified 231 differentially expressed genes (DEGs) between the predicted high- and low-ALN burden groups, comprising 120 upregulated and 111 downregulated DEGs in the predicted high-ALN burden group (Fig. 4a). Hierarchical clustering analysis revealed that the DEGs were mainly distributed in five functional modules: response chemotaxis to adhesion; ERK1, ERK2, and MAPK cascade; blood activation body coagulation; receptor surface signaling pathway; and epidermal epidermis development differentiation (Fig. 4b). GO functional analysis revealed that pathways related to epidermal cell differentiation, keratinocyte differentiation, and the epidermis were downregulated in the predicted high-ALN burden group (Figures S2a–c). Conversely, migration/invasion pathways, such as cell chemotaxis, regulation of chemotaxis, and cell-substrate adhesion, were upregulated in the predicted high-ALN burden group (Figures S2d–f).
Relationship between tumor immune infiltration and radiomics
Significant differences between the low and high radscore groups were observed in eight types of RNA-based immune markers (Fig. 4d). Specifically, microvascular (mv) endothelial cells, pericytes, and others were more abundant in the predicted high-ALN burden group, whereas common lymphoid progenitors (CLPs), smooth muscle cells, and others were more prevalent in the predicted low-ALN burden group. These results indicate differences in tumor immunity and the tumor microenvironment between the high- and low-radscore groups. Correlation analysis revealed a strong negative correlation between radiomic features and immune cell scores (Fig. 5).