MRI-Based Radiomics Approach Predicts Tumor Recurrence in ER+/HER2- Early Breast Cancer Patients

Oncotype Dx is a genetic assay providing a recurrence score (RS) correlated with the risk of cancer recurrence and adjuvant treatment response in breast carcinoma. We investigated the ability of an MRI-based radiomics approach to predict the risk of tumor recurrence in breast cancer. A total of 62 patients with biopsy-proven ER+/HER2- early breast cancer who underwent pre-treatment MRI and Oncotype Dx were included. An RS>25 was used as the cutoff between low-intermediate and high risk of tumor recurrence. Two readers segmented each tumor. Radiomics features were extracted from the tumor and the peritumoral tissues. Partial least squares (PLS) regression was used as the multivariate machine-learning algorithm. The most relevant radiomics features were identified as the 5% of features with the largest PLS β-weights in magnitude (top 5%). The diagnostic performance of the radiomics model was assessed through receiver operating characteristic (ROC) analysis. Statistical significance was set at p < 0.05. The predictive model delivered an AUC of 0.76. Of the 47 features included in the top 5%, 33 were texture-related and derived from the tumor and the peritumoral tissues. After a prospective evaluation in more extensive clinical trials, this approach may non-invasively identify patients who are more likely to benefit from adjuvant therapy.


Introduction
Breast cancer is a leading cause of death and the most common cancer in women [1]. Breast cancer consists of four main subtypes, classified by tumor genotype and molecular characterization into luminal-A, luminal-B, HER2-enriched, and basal-like cancer [1,2]. Luminal tumors account for most invasive breast cancers (70%) in western countries. These tumors are usually estrogen-receptor positive (ER+) and/or progesterone-receptor positive (PR+) and HER2-receptor negative (HER2-) [3]. Luminal-A breast cancer has low Ki-67 levels and a good prognosis. In contrast, luminal-B cancer shows a higher Ki-67 proliferation rate, higher histologic grade, greater risk of relapse, and worse disease-free survival [1,2].
Hormone therapy represents a mainstay of patient management [4]. About 15% of luminal-B cancers will recur within 10 years from the diagnosis if treated with hormonal therapy alone. Although the risk of recurrence could be lowered by adjuvant chemotherapy, how to select the patients who might benefit from it remains debated [5][6][7].
Oncotype Dx (Genomic Health, Redwood City, CA) is a validated genetic assay providing a recurrence score (RS) that correlates with clinical outcomes in patients with ER+/HER2- invasive breast carcinoma. Recent studies demonstrated that RS correlates with recurrence rates and adjuvant treatment response [5,10,12,18,26]. This method is now incorporated into clinical practice guidelines for treatment decisions [26]. However, the technique is invasive and costly. These limitations prompted researchers to investigate new imaging-based biomarkers [15,28-33]. Multiparametric magnetic resonance imaging (MRI) is the most sensitive modality to diagnose breast cancer and assess treatment response [34][35][36][37]. In this regard, recently developed imaging-based methods, such as radiomics, allow the extraction of many quantitative features from imaging data, thereby adding information from the whole tumor volume to the conventional qualitative visual assessment [15,28-33,38]. MRI-based predictors of tumor recurrence would allow the non-invasive selection of patients at high risk of recurrence, with significant improvements in patient care and overall costs. Few studies have investigated the relationship between Oncotype Dx RS and MRI-based radiomics features extracted from the tumor [39][40][41][42][43][44][45][46][47]. Unlike previous studies, ours focused not only on the tumor but also on the peritumoral tissues. Moreover, we stratified the recurrence risk according to the recent American Society of Clinical Oncology (ASCO) clinical practice guidelines update [10].
This study investigated the ability of MRI-based radiomics features extracted from the tumor and the peritumoral tissues to predict the risk of tumor recurrence in ER+/HER2- breast cancer patients. Demonstrating the presence of such imaging-based biomarkers could allow the non-invasive identification of patients who are more likely to benefit from adjuvant therapy.

Methods
Subjects. This study received formal approval from the Ethical Committee of the University G. d'Annunzio of Chieti-Pescara, Italy; informed consent was waived by the same ethics committee that approved the study (Comitato Etico per la Ricerca Biomedica delle Province di Chieti e Pescara e dell'Università degli Studi "G. d'Annunzio" di Chieti e Pescara). The study was conducted according to the ethical principles laid down by the latest version of the Declaration of Helsinki. A total of 62 patients who underwent clinically indicated breast MRI between January 2016 and May 2020 at our institution were retrospectively included. Inclusion criteria were 1) ER+/HER2- early breast cancer confirmed via biopsy, 2) MRI performed on a 1.5 T scanner, 3) availability of Oncotype DX RS.
MRI Protocol. All patients in this cohort underwent a clinically indicated breast MRI consisting of standard T1-weighted (T1w), T2-weighted (T2w), diffusion-weighted imaging (DWI), and Dynamic Contrast Enhancement (DCE) acquisitions performed using a 1.5 T MR scanner (Achieva, Philips Medical System, Best, the Netherlands) equipped with a dedicated phased-array breast coil. Detailed information regarding the DCE acquisition is described in Table 1.
Table 1 footnotes: *FFE = fast field echo; ** = first ("early") and second ("peak") contrast-enhanced dynamic phase.
Imaging analysis. Manual whole-volume segmentation of the tumor (T) was performed on the first ("early") and second ("peak") contrast-enhanced dynamic T1w images for each patient by two independent senior radiology residents. The software used for the segmentation was an open-source medical image computing platform, 3DSlicer Version 4.8 (www.3dslicer.org). To create the "tissue surrounding tumor" (TST) segmentations, the AFNI program "3dmask_tool" was used [48]. First, a 2 mm dilation ("dilate") and a 2 mm erosion ("erode") were obtained from the T of each patient. Then, the two masks were subtracted ("dilate" - "erode") to obtain the TST, which was 4 mm thick [49]. All the TST segmentations were then checked by the two readers and manually adjusted if necessary to include only the outer border of the tumor and the adjacent perivisceral tissue. T and TST are shown in Figure 1a.
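The dilate-minus-erode construction of the TST can be sketched as follows. This is a minimal NumPy illustration, not the AFNI "3dmask_tool" call used in the study. It assumes a binary tumor mask on a 1 mm isotropic grid (so two voxel steps correspond to 2 mm) and a face-connected neighborhood; the function names `dilate`, `erode`, and `peritumoral_ring` are hypothetical.

```python
import numpy as np


def dilate(mask, steps):
    """Grow a binary mask by `steps` voxels using face-connected neighbours."""
    out = np.asarray(mask, dtype=bool)
    for _ in range(steps):
        # Pad with background so np.roll never wraps tumor voxels around.
        padded = np.pad(out, 1, mode="constant", constant_values=False)
        grown = padded.copy()
        for axis in range(padded.ndim):
            grown |= np.roll(padded, 1, axis=axis)
            grown |= np.roll(padded, -1, axis=axis)
        out = grown[(slice(1, -1),) * padded.ndim]
    return out


def erode(mask, steps):
    """Shrink a binary mask by `steps` voxels (erosion = dilating the background)."""
    return ~dilate(~np.asarray(mask, dtype=bool), steps)


def peritumoral_ring(tumor_mask, margin_voxels=2):
    """Return the shell between the dilated and the eroded tumor mask."""
    outer = dilate(tumor_mask, margin_voxels)
    inner = erode(tumor_mask, margin_voxels)
    return outer & ~inner


# Toy example: a 6x6x6 voxel "tumor" inside a 16x16x16 volume.
tumor = np.zeros((16, 16, 16), dtype=bool)
tumor[5:11, 5:11, 5:11] = True
ring = peritumoral_ring(tumor, margin_voxels=2)
ring_profile = ring[:, 7, 7]  # ring voxels along one axis through the tumor
```

Along a line through the center of the toy tumor, the ring spans two voxels outside and two voxels inside the original border on each side, which mirrors the 4 mm thickness described above under the 1 mm voxel assumption.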
Radiomic Features Extraction. The extraction of radiomic features from the masked (T and TST) T1w images was performed using PyRadiomics [50]. The reproducibility of the features extracted from the segmentations of the two readers was assessed for all patients. To avoid data heterogeneity bias and promote reproducibility, MR images and masks were resampled to 3 isotropic voxel sizes (1x1x1 mm, 2x2x2 mm, 3x3x3 mm). For each segmentation and each image resolution (1 mm, 2 mm, and 3 mm), ten built-in filters (Original, wavelet, Laplacian of Gaussian (LoG), square, square root, logarithm, exponential, Gradient, LBP2D, LBP3D) were applied, and seven feature classes (first-order statistics, shape descriptors, glcm, glrlm, ngtdm, gldm, glszm) were calculated, which resulted in a total of 1409 radiomic features for each image (see Results). The features were converted into z-scores based on their distribution across subjects.
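The two preprocessing steps mentioned here, inter-reader reproducibility screening and z-scoring across subjects, can be illustrated with a short NumPy sketch. It is not the authors' code. It assumes that reproducibility is measured with the Pearson correlation between the two readers' feature values (using the r > 0.95 threshold stated in the Machine Learning Analysis) and that z-scores use the sample standard deviation; the function names and the toy data are hypothetical.

```python
import numpy as np


def repeatable_features(reader_a, reader_b, r_min=0.95):
    """Boolean mask of features whose inter-reader Pearson r exceeds r_min.

    reader_a, reader_b: float arrays of shape (n_subjects, n_features).
    """
    a = reader_a - reader_a.mean(axis=0)
    b = reader_b - reader_b.mean(axis=0)
    denom = np.sqrt((a ** 2).sum(axis=0) * (b ** 2).sum(axis=0))
    # Constant features have zero variance; give them r = 0 so they are dropped.
    r = np.divide((a * b).sum(axis=0), denom,
                  out=np.zeros_like(denom), where=denom > 0)
    return r > r_min


def zscore(features):
    """Standardize each feature (column) across subjects (rows)."""
    mean = features.mean(axis=0)
    std = features.std(axis=0, ddof=1)  # sample standard deviation (assumption)
    std = np.where(std > 0, std, 1.0)   # avoid division by zero
    return (features - mean) / std


# Toy data: 62 subjects, 5 features; the readers disagree on feature 0 only.
rng = np.random.default_rng(0)
reader_a = rng.normal(size=(62, 5))
reader_b = reader_a.copy()
reader_b[:, 0] = rng.normal(size=62)

keep = repeatable_features(reader_a, reader_b)
z = zscore(reader_a[:, keep])
```

In this toy case only the four features on which the readers agree survive the screening, and each surviving column ends up with zero mean and unit standard deviation.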
Machine Learning Analysis. A machine learning approach was used to exploit the multidimensionality of the radiomic features and infer the risk of recurrence (high vs. low-intermediate). Two main strategies were implemented to address the large number of features extracted [51,52]. The first strategy reduced the number of features by selecting only those that were highly repeatable between the masks delineated by the two radiologists (r > 0.95). The second strategy leveraged the high collinearity among radiomic features by inferring the risk of recurrence with a linear regression that incorporates a dimensionality reduction procedure, namely partial least squares (PLS) regression [51][52][53][54]. PLS has one hyperparameter, namely the number of uncorrelated components to be used in the regression. Leave-one-out nested cross-validation (nCV) was used to optimize the hyperparameter and evaluate the generalizable performance of the procedure [54][55][56]. In nCV, data are divided into folds, and the model is trained on all data except one fold in an iterative, nested manner. The outer loop estimates the model's performance across iterations (test), whereas the inner loop selects the optimal hyperparameter (validation). If the number of folds equals the number of samples (one fold per sample), the procedure is defined as leave-one-out nCV, an approach highly suited for medical applications where samples represent subjects [57][58][59]. The whole leave-one-out nCV PLS analysis was repeated for each of the following sets of images and masks: a) DCE images ("early" and "peak") in both T and TST, b) "peak" DCE in both T and TST, c) "early" DCE in both T and TST, d) "peak" DCE in T, e) "peak" DCE in TST, f) "early" DCE in T and g) "early" DCE in TST.
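The leave-one-out nested cross-validation described above can be sketched in a few lines of NumPy. This is an illustrative sketch, not the authors' implementation: it uses a textbook NIPALS PLS1 fit, selects the number of components by mean squared error in the inner loop (an assumption, since the selection criterion is not stated), and runs on synthetic data. All function names are hypothetical.

```python
import numpy as np


def pls1_fit(X, y, n_components):
    """Fit single-response PLS with NIPALS; return beta weights and means."""
    x_mean, y_mean = X.mean(axis=0), y.mean()
    Xk, yk = X - x_mean, y - y_mean
    W, P, q = [], [], []
    for _ in range(n_components):
        w = Xk.T @ yk
        norm = np.linalg.norm(w)
        if norm < 1e-12:             # nothing left to explain
            break
        w = w / norm
        t = Xk @ w                   # component scores
        tt = t @ t
        p = Xk.T @ t / tt            # X loadings
        c = (yk @ t) / tt            # regression of y on the scores
        Xk = Xk - np.outer(t, p)     # deflate X
        yk = yk - c * t              # deflate y
        W.append(w)
        P.append(p)
        q.append(c)
    if not W:
        return np.zeros(X.shape[1]), x_mean, y_mean
    W, P, q = np.array(W).T, np.array(P).T, np.array(q)
    beta = W @ np.linalg.solve(P.T @ W, q)
    return beta, x_mean, y_mean


def pls1_predict(model, X):
    beta, x_mean, y_mean = model
    return (X - x_mean) @ beta + y_mean


def loo_predictions(X, y, n_components):
    """Leave-one-out predictions for a fixed number of components."""
    n = len(y)
    preds = np.empty(n)
    for i in range(n):
        train = np.arange(n) != i
        model = pls1_fit(X[train], y[train], n_components)
        preds[i] = pls1_predict(model, X[i:i + 1])[0]
    return preds


def nested_loo(X, y, candidates):
    """Outer loop: held-out test prediction. Inner loop: choose n_components."""
    n = len(y)
    preds, chosen = np.empty(n), []
    for i in range(n):
        train = np.arange(n) != i
        Xtr, ytr = X[train], y[train]
        # Inner leave-one-out on the training subjects only (validation).
        errors = [np.mean((loo_predictions(Xtr, ytr, k) - ytr) ** 2)
                  for k in candidates]
        best = candidates[int(np.argmin(errors))]
        chosen.append(best)
        preds[i] = pls1_predict(pls1_fit(Xtr, ytr, best), X[i:i + 1])[0]
    return preds, chosen


# Synthetic example: 24 subjects, 10 collinear features driven by one latent factor.
rng = np.random.default_rng(0)
n, p = 24, 10
latent = rng.normal(size=n)
X = np.outer(latent, rng.normal(size=p)) + 0.1 * rng.normal(size=(n, p))
y = (latent > 0).astype(float)       # 1 = "high risk", 0 = "low-intermediate"
preds, chosen = nested_loo(X, y, candidates=[1, 2, 3])
```

Each entry of `preds` comes from a model that never saw the corresponding subject, neither for fitting nor for choosing the number of components, which is the property that makes the outer-loop predictions usable for the ROC analysis.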

Statistical Analysis
The classification performance was assessed through Receiver Operating Characteristic (ROC) analyses considering the inferred (out-of-training-sample) recurrence risk in the outer loop fold of the machine learning framework. Patients with low-intermediate recurrence risk were attributed to the "negative" group, whereas patients with high recurrence risk were attributed to the "positive" group. The ROC analyses were also performed on randomly shuffled outcomes to simulate the null hypothesis and evaluate its confidence interval (repeated 10^6 times). The ROC analysis delivered an Area Under the Curve (AUC), which, using the randomly shuffled outcomes, could be transformed into a z-score for assessing its statistical significance. The statistical analysis was performed in Matlab.
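The shuffled-outcome procedure can be illustrated as follows. The study used Matlab and 10^6 shuffles; the sketch below is a Python illustration with far fewer shuffles for speed, and the function names, seed, and toy data are assumptions introduced here.

```python
import numpy as np


def roc_auc(scores, labels):
    """AUC as the probability that a positive outranks a negative (ties count 0.5)."""
    scores = np.asarray(scores, dtype=float)
    labels = np.asarray(labels, dtype=bool)
    pos, neg = scores[labels], scores[~labels]
    diff = pos[:, None] - neg[None, :]
    return float(((diff > 0).sum() + 0.5 * (diff == 0).sum()) / diff.size)


def permutation_z(scores, labels, n_perm=2000, seed=0):
    """z-score of the observed AUC against AUCs obtained from shuffled labels."""
    rng = np.random.default_rng(seed)
    labels = np.asarray(labels, dtype=bool)
    observed = roc_auc(scores, labels)
    null = np.array([roc_auc(scores, rng.permutation(labels))
                     for _ in range(n_perm)])
    return observed, (observed - null.mean()) / null.std(ddof=1)


# Four subjects: 3 of the 4 positive-negative pairs are ranked correctly.
auc_small = roc_auc([0.1, 0.4, 0.35, 0.8], [0, 0, 1, 1])

# Twenty subjects with a perfectly separated toy outcome.
scores = np.arange(20, dtype=float)
labels = np.arange(20) >= 10
auc, z = permutation_z(scores, labels)
```

Shuffling the labels keeps the class sizes fixed while breaking any link between scores and outcome, so the spread of the shuffled AUCs reflects what chance alone would produce for this sample size.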

Results
The mean age of the 62 women included in the study was 49 years (Table 2). In total, 1409 radiomic features were extracted for each image. Each MRI included "early" and "peak" DCE images. We extracted two masks (T and TST) from each set of images ("early" DCE-T, "early" DCE-TST, "peak" DCE-T and "peak" DCE-TST). All MRI images were resampled at 3 resolutions, for a total of 33816 features per patient. In detail, the numbers of features selected (based on inter-reader repeatability of r>0.95) for each repetition of the analysis were: n=940 from "early" and "peak" DCE T+TST images, n=644 from "peak" DCE T+TST images, n=296 from "early" DCE T+TST images, n=315 from "peak" DCE T, n=329 from "peak" DCE TST, n=230 from "early" DCE T and n=66 from "early" DCE TST. Using the nCV machine learning framework, a significant inference on the risk of recurrence was obtained when including all features in the analysis ("early" and "peak" DCE T+TST, optimal number of PLS components, n=19), with an AUC=0.76, z=3.01, p=1.1·10^-3 (Figure 2a). Standalone combinations of "early" and "peak" DCE images of T and TST did not deliver a significant multivariate inference of the risk (p>0.05, Figure 2b). Figure 3a reports the distribution of the nCV β-weights, depicting the strength and sign of the effect of the original radiomic features on the inference of the outcome. Since the larger label value of "1" was associated with an increased risk of tumor recurrence, a positive β-weight indicated a higher risk with increasing feature value, and vice versa for negative weights. Figure 3b reports the top 5% (n=47) β-weights associated with the most relevant features involved in the prediction (those with the largest β-weight magnitudes). These features were balanced between T and TST (23 and 24, respectively). In detail, 25 of the top 5% features were associated with images at 1 mm resolution, 15 at 2 mm resolution, and 7 at 3 mm resolution.
Most of those features (33/47, 70%) were texture-related, i.e., associated with the second-order analysis of the images (e.g., features computed using the gray-level co-occurrence matrix, GLCM, or the gray-level dependence matrix, GLDM), whereas only 14 features were related to first-order analysis. In addition, more of the features derived from "peak" (n=33) than from "early" (n=14) DCE images.
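Selecting the top 5% of features by β-weight magnitude is a simple ranking step; with the 940 features retained for the full analysis, 5% corresponds to the 47 features reported. A minimal sketch follows, in which the weights are random placeholders rather than the study's values and the rounding rule and function name are assumptions.

```python
import numpy as np


def top_fraction(beta, fraction=0.05):
    """Indices of the features with the largest |beta|, by decreasing magnitude."""
    beta = np.asarray(beta, dtype=float)
    n_top = int(round(fraction * beta.size))
    order = np.argsort(-np.abs(beta))
    return order[:n_top]


rng = np.random.default_rng(0)
beta = rng.normal(size=940)               # placeholder weights, one per feature
top = top_fraction(beta)
n_positive = int((beta[top] > 0).sum())   # features whose increase raises predicted risk
```

Ranking by absolute value keeps both positive weights (higher risk with increasing feature value) and negative weights (lower risk) in the selection, consistent with the interpretation of the β-weight sign given above.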

Discussion
Our results showed that MRI-based radiomics can predict the risk of recurrence in ER+/HER2- early breast cancer patients. These findings confirmed the promising preliminary results showing a significant association between radiomics signatures and risk of breast cancer recurrence [41,43]. For example, Li et al. reported that radiomics features including tumor size and tumor heterogeneity predicted multigene assay recurrence scores [41]. A recent study generated a radiomics signature based on dynamic contrast-enhanced MRI to distinguish between low (recurrence score < 18) and non-low (recurrence score > 18) Oncotype DX risk groups in estrogen receptor (ER) positive invasive breast cancer [43]. The authors obtained a Rad-score based on 10 radiomics features reaching an AUC of 0.759 [43]. Interestingly, most of the top 5% features derived from images at 1 mm resolution. This result suggests that high-resolution imaging was a relevant parameter for the prediction performance. Supporting this hypothesis, most of those features were texture-related, reflecting the degree of heterogeneity in breast tissue. The role of contrast-enhanced imaging was also relevant in our study. In this regard, 70% of the top 5% features were extracted from contrast-enhanced images obtained at the "peak" of contrast enhancement. These results align with the current state-of-the-art breast MRI that recommends the acquisition of images approximately 60-90 seconds after the administration of contrast [35]. Although breast MRI without intravenous contrast administration has been proposed as a screening procedure, current techniques, such as DWI, are not sensitive enough to replace DCE-MRI [63, 64].
Our study has some limitations. First, it included a relatively low number of patients and lacked a validation cohort. This is due in part to the high cost of genetic testing, which limited the study population size. However, our analysis is intended as a proof-of-concept study, and the nCV implemented in our study minimized the effects of the reduced number of samples and of overfitting [55]. Second, ours is a retrospective single-center study. Further prospective, multicenter studies are warranted to confirm our findings and refine the standardization of our approach. Third, we only analyzed dynamic contrast-enhanced MRI images, thereby excluding T2-weighted and diffusion-weighted images. Further studies are needed to clarify the potential role of these sequences in tumor recurrence prediction.
In conclusion, a radiomics-based machine learning approach showed the potential to accurately predict the recurrence risk in early ER+/HER2- breast cancer patients. Most of the discriminant radiomics features were related to texture analysis of the tumor and the peritumoral environment. Prospective validation studies, possibly in larger multicenter cohorts, are warranted to better define the role of radiomics as a predictive biomarker in breast cancer.

Declarations
Data Availability. The datasets generated and/or analyzed during the current study are not publicly available due to the clinical and confidential nature of the material but can be made available from the corresponding author on reasonable request.
Figure 2 ROC analysis of the machine learning (PLS) classification performance. Patients with RS<25 were attributed to the "negative" group, whereas patients with RS>25 were attributed to the "positive" group.
Figure 3 (b) β-weights associated with the top 5% of features with the largest β-weights in magnitude.