Baseline Characteristics
In this study, 1318 patients were recruited from four centers. Patients were excluded for inadequate MRI quality (n = 20), lack of histopathologic data (n = 34), previous history of breast cancer (n = 13), external institution surgery or unassessed pCR (n = 81), and incomplete clinical data (n = 61), resulting in 1109 patients being included in the study (Fig. 1A). For predicting pCR, the training cohort comprised 435 patients (center A [n = 413], center B [n = 22]; median age, 51 years [IQR, 46–55 years]). Two external validation cohorts consisted of 351 patients from center C (median age, 48 years [IQR, 43–52 years]) and 323 patients from center D (median age, 48 years [IQR, 40–56 years]).
In all cohorts, significant differences were observed in HR status, HER2 status, max FD, and global FD between pCR and non-pCR groups (all P < 0.05) (Table 1).
Reproducibility of Fractal Analysis
The FDs from both 3D and 2D fractal analyses showed good consistency, with Bland-Altman repeatability coefficients ranging from 0.11 to 0.19 (Supplementary Table S3 and Fig. S2). Variance-components analysis indicated that the between-patient variance (0.0172-0.0195) exceeded the between-reading variance (0.0001-0.0009) for both 3D and 2D fractal analyses. The between-reading coefficient of variation (COV) was higher for 2D FDs (COV: 3.21-6.78) than for 3D (COV: 2.65), whereas the between-patient COV was lower for 2D FDs (COV: 377.94-467.28) than for 3D (COV: 486.94). Furthermore, the five FDs showed good inter-observer consistency (ICC: 0.87-0.93) (Supplementary Table S4).
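As an illustration of the repeatability statistic reported above, the following minimal sketch computes a Bland-Altman repeatability coefficient from two readings per patient. It assumes the common definition RC = 1.96 × √2 × s_w, where s_w is the within-subject standard deviation. The study's exact computation is not stated here, and the function name and FD values are hypothetical.

```python
import math


def repeatability_coefficient(first, second):
    """Bland-Altman repeatability coefficient for two readings per subject.

    Assumed definition: s_w^2 = sum(d_i^2) / (2n) with d_i = first_i - second_i,
    and RC = 1.96 * sqrt(2) * s_w.
    """
    if len(first) != len(second) or not first:
        raise ValueError("need two equally long, non-empty lists of readings")
    n = len(first)
    within_subject_variance = sum((a - b) ** 2 for a, b in zip(first, second)) / (2 * n)
    return 1.96 * math.sqrt(2 * within_subject_variance)


# Hypothetical FD readings for five patients (not study data).
reading_1 = [2.10, 2.25, 2.40, 2.05, 2.30]
reading_2 = [2.14, 2.20, 2.43, 2.09, 2.27]
print(round(repeatability_coefficient(reading_1, reading_2), 3))
```

Under this definition, the absolute difference between two repeated readings on the same patient is expected to fall below the coefficient in about 95% of cases.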
Correlation Analysis
Spearman correlation analysis indicated that global FD was strongly and positively correlated with max, median, and mean FD (correlation coefficient [r]: 0.69 to 0.81, P < 0.05), and weakly negatively correlated with HER2 status (r: -0.12 to -0.01, P ≤ 0.04). Global FD was negatively correlated with sphericity and surface area to volume ratio (r: -0.65 to -0.18, P ≤ 0.04), and positively correlated with diameter and volume (r: 0.54 to 0.68, P ≤ 0.04) (Fig. 2, Supplementary Tables S5 and S6).
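For readers unfamiliar with the statistic, a Spearman coefficient is the Pearson correlation of the ranks of the two variables. The sketch below shows this with the Python standard library only. The helper names and the sample values are hypothetical and are not taken from the study.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return cov / (sd_x * sd_y)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the average ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical values: global FD against tumour volume for five lesions.
global_fd = [2.05, 2.20, 2.31, 2.18, 2.40]
volume = [1.2, 3.4, 6.1, 2.9, 5.0]
print(round(spearman(global_fd, volume), 2))
```

Because only ranks enter the calculation, any monotone relationship yields a coefficient of ±1 even when it is far from linear.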
Variables Associated with pCR
Univariable logistic regression analysis showed that HR status, HER2 status, Ki-67 status, clinical T stage, and global FD were associated with pCR. In the multivariable model, which included the variables with P < 0.05 in the univariable analysis, HR status (odds ratio [OR], 0.234 [95% CI: 0.135, 0.406]; P < 0.001), HER2 status (OR, 3.320 [95% CI: 1.923, 5.729]; P < 0.001), and global FD (OR, 0.352 [95% CI: 0.261, 0.480]; P < 0.001) were independent predictors of pCR (Table 2). These independent predictors were then used to develop nomogram model-1. Following the same process, clinicopathological variables (HR and HER2 status) and morphological features (sphericity, major axis length, and maximum 2D diameter [row]) were identified and used to develop nomogram model-2 (Supplementary Table S7).
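The odds ratios above can be read on the log scale, where a Wald 95% CI is symmetric around the regression coefficient: OR = exp(β) and CI = exp(β ± 1.96 × SE). The sketch below back-calculates the standard error from each reported interval and rebuilds the interval around the reported OR as a consistency check. It assumes Wald intervals were used, which the text does not state. The function names are hypothetical.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution


def se_from_ci(lower, upper):
    """Standard error of log(OR) implied by a Wald 95% CI on the OR scale."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


def ci_from_or(odds_ratio, se):
    """Wald 95% CI on the OR scale, given the OR and the SE of log(OR)."""
    beta = math.log(odds_ratio)
    return math.exp(beta - Z_95 * se), math.exp(beta + Z_95 * se)


# Multivariable estimates as reported in the text: (OR, lower, upper).
reported = {
    "HR status": (0.234, 0.135, 0.406),
    "HER2 status": (3.320, 1.923, 5.729),
    "global FD": (0.352, 0.261, 0.480),
}

for name, (odds_ratio, lower, upper) in reported.items():
    se = se_from_ci(lower, upper)
    rebuilt_lower, rebuilt_upper = ci_from_or(odds_ratio, se)
    print(f"{name}: beta={math.log(odds_ratio):.3f}, SE={se:.3f}, "
          f"rebuilt CI=({rebuilt_lower:.3f}, {rebuilt_upper:.3f})")
```

An OR below 1 corresponds to a negative coefficient. For global FD, this means that a higher value is associated with lower odds of pCR in the reported model.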
Performance of Models for Prediction of pCR
For predicting pCR to NAC, the five univariable FD models achieved AUCs ranging from 0.52 to 0.73 across the two external validation cohorts (Supplementary Fig. S2). Global FD achieved AUCs of 0.73 (95% CI: 0.67, 0.79) and 0.68 (95% CI: 0.61, 0.74) in the two external validation cohorts, respectively, significantly outperforming the morphological models, which achieved AUCs of 0.61 (95% CI: 0.54, 0.64) and 0.55 (95% CI: 0.49, 0.62) (DeLong test, all P < 0.001) (Fig. 3A-C, Table 3, Supplementary Table S8).
The nomogram model-1 achieved AUCs of 0.80 (95% CI: 0.75, 0.86) and 0.74 (95% CI: 0.68, 0.79) in the two external validation cohorts, respectively (Fig. 3A-C, Fig. 4), significantly outperforming nomogram model-2, which achieved AUCs of 0.74 (95% CI: 0.68, 0.80) and 0.69 (95% CI: 0.62, 0.74) (DeLong test, P < 0.001) (Supplementary Table S8).
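To make the AUC values concrete: the AUC equals the probability that a randomly chosen pCR patient receives a higher model score than a randomly chosen non-pCR patient, with ties counted as one half. The following minimal sketch computes it by direct pairwise comparison. It is not the study's implementation and does not reproduce the DeLong test. The function name and the example scores are hypothetical.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise comparison (Mann-Whitney form).

    labels: 1 for pCR, 0 for non-pCR; scores: higher means pCR is more likely.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for q in negatives:
            if p > q:
                wins += 1.0
            elif p == q:
                wins += 0.5  # a tie counts as half a concordant pair
    return wins / (len(positives) * len(negatives))


# Hypothetical predicted pCR probabilities for six patients (not study data).
predicted = [0.81, 0.35, 0.62, 0.20, 0.55, 0.47]
observed = [1, 0, 1, 0, 0, 1]
print(auc(predicted, observed))
```

The pairwise form is quadratic in the cohort size, which is acceptable for cohorts of a few hundred patients. Rank-based formulas give the same value more efficiently.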
The calibration between predicted and observed probabilities was good for nomogram model-1 (Hosmer-Lemeshow test, P: 0.35-0.78) (Fig. 3D-F). Decision curve analysis showed that nomogram model-1 offered greater clinical benefit across most threshold ranges and demonstrated net benefits in two external validation cohorts at thresholds of 0.07 to 0.68 and 0.13 to 0.66 (Fig. 3G-I).
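Decision curve analysis compares models by net benefit at a given threshold probability p_t, commonly defined as TP/n − (FP/n) × p_t/(1 − p_t). The sketch below assumes that standard definition, together with the treat-all reference strategy. The function names and example data are hypothetical, and nothing here reproduces the curves in Fig. 3G-I.

```python
def net_benefit(probabilities, labels, threshold):
    """Net benefit of treating patients whose predicted probability >= threshold."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    n = len(labels)
    true_pos = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 1)
    false_pos = sum(1 for p, y in zip(probabilities, labels) if p >= threshold and y == 0)
    # False positives are weighted by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1 - threshold)


def net_benefit_treat_all(labels, threshold):
    """Net benefit of the reference strategy that classifies everyone as positive."""
    if not 0 < threshold < 1:
        raise ValueError("threshold must lie strictly between 0 and 1")
    prevalence = sum(labels) / len(labels)
    return prevalence - (1 - prevalence) * threshold / (1 - threshold)


# Hypothetical predictions for four patients (not study data).
predicted = [0.9, 0.7, 0.4, 0.2]
observed = [1, 1, 0, 0]
for threshold in (0.1, 0.3, 0.5):
    print(threshold,
          net_benefit(predicted, observed, threshold),
          net_benefit_treat_all(observed, threshold))
```

A model is useful at a threshold when its net benefit exceeds both the treat-all value and zero, where zero is the net benefit of treating no one.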
Model Performance for Prediction of pCR in Patient Subgroups
Four subgroup analyses were conducted based on molecular subtypes, age, menopausal status, and Ki-67 status. In the two external validation cohorts, the global FD for prediction of pCR to NAC achieved AUCs ranging from 0.65 to 0.83 for patients with the four molecular subtypes (HR+/HER2-, HR+/HER2+, HR-/HER2-, and HR-/HER2+) (Supplementary Fig. S3).
The nomogram model-1 achieved AUCs ranging from 0.72 to 0.83 for patients aged ≤ 45 years or > 45 years, and from 0.74 to 0.82 for premenopausal or postmenopausal patients in the two external validation cohorts. For patients with high and low Ki-67 expression, the nomogram model-1 achieved AUCs of 0.77 (95% CI: 0.69, 0.85) and 0.80, respectively, in external validation cohort 1 (Supplementary Fig. S4).
Survival Analysis
For survival analysis, 171 patients from center A (median age, 50 years [IQR, 45-55 years]) were enrolled. During the follow-up (DFS: median, 29 months [IQR, 15-44 months]; OS: median, 37 months [IQR, 18.05-48.85 months]), 52 patients had recurrence and 14 patients died (Table 4).
Cox proportional hazards analysis identified menopausal status (hazard ratio [HR], 1.88 [95% CI: 1.08, 3.28]; P = 0.03), NAC treatment response (HR, 3.75 [95% CI: 1.14, 12.33]; P = 0.03), and global FD (HR, 2.03 [95% CI: 1.08, 3.81]; P = 0.03) as independent prognostic factors for DFS, whereas cT4 stage (HR, 5.92 [95% CI: 1.50, 23.34]; P = 0.01) and global FD (HR, 4.85 [95% CI: 1.05, 22.46]; P = 0.04) were independent prognostic factors for OS (Fig. 5).
For DFS, the cutoff for dividing patients into high- and low-risk groups was 11.71, while for OS, the cutoff value was 42.01. Kaplan-Meier analysis for DFS and OS revealed significant differences between the low- and high-risk groups (log-rank test, DFS: P = 0.04; OS: P < 0.001), with the low-risk group exhibiting better DFS and OS (Fig. 6).
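The Kaplan-Meier curves compared above are product-limit estimates: at each event time, the running survival probability is multiplied by (1 − events / number at risk), and censored patients leave the risk set without changing the estimate. The sketch below shows this under the convention that events at a given time are counted before censorings at the same time. The function name and follow-up data are hypothetical, and the log-rank test is not reproduced.

```python
def kaplan_meier(times, events):
    """Product-limit survival estimate.

    times: follow-up in months; events: 1 if the event occurred, 0 if censored.
    Returns a list of (time, survival probability) at each distinct event time.
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        n_events = 0
        n_leaving = 0
        # Collect everyone whose follow-up ends at this time point.
        while i < len(data) and data[i][0] == current_time:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events > 0:
            survival *= 1 - n_events / at_risk
            curve.append((current_time, survival))
        at_risk -= n_leaving
    return curve


# Hypothetical follow-up for five patients (not study data).
months = [5, 8, 12, 12, 20]
recurred = [1, 0, 1, 1, 0]
for time_point, probability in kaplan_meier(months, recurred):
    print(time_point, round(probability, 3))
```

Applying the function separately to the low-risk and high-risk groups gives the two step curves that a log-rank test would then compare.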