AI-enhanced Synchronized Multiparametric 18F-FDG PET/MRI for Accurate Breast Cancer Diagnosis

Purpose: To assess whether a radiomics and machine learning (ML) model combining quantitative parameters and radiomics features extracted from synchronized multiparametric 18F-FDG PET/MRI images can differentiate benign from malignant breast lesions. Methods: A total of 102 consecutive patients with 120 BI-RADS 0, 4 and 5 breast lesions (101 malignant, 19 benign) detected by ultrasound and/or mammography were prospectively enrolled and underwent hybrid 18F-FDG PET/MRI for diagnostic purposes. Quantitative parameters and radiomics features were extracted from dynamic contrast-enhanced (MTT, VD, PF), diffusion (ADCmean of breast lesions and contralateral breast parenchyma), PET (SUVmax, SUVmean and SUVmin of breast lesions, SUVmean of uni- and contralateral breast parenchyma) and T2-w images. Different diagnostic models were developed using a fine Gaussian support vector machine algorithm, exploring different combinations of quantitative parameters and radiomics features to obtain the highest accuracy in discriminating benign from malignant breast lesions with 5-fold cross-validation. The performance of the best radiomics and ML model was compared with that of expert reader review using the McNemar test. Results: Eight radiomics models were developed. The integrated model combining MTT and ADC with radiomics features extracted from PET and ADC images obtained the highest accuracy for breast cancer diagnosis (AUC 0.983); this was higher than, but not significantly different from, expert reader review (AUC 0.868; p = 0.508). Conclusion: A radiomics and ML model combining quantitative parameters and radiomics features extracted from synchronized multiparametric 18F-FDG PET/MRI images can accurately discriminate benign from malignant breast lesions.


Introduction
Breast cancer is the most commonly occurring malignancy in women worldwide, representing 11.6% of newly diagnosed cancer cases in 2018 [1]. Prognosis changes dramatically depending on whether breast cancer is diagnosed at an early or a later stage, with 5-year survival rates of 98-100% and 66-98%, respectively [2]. Despite the many advances in surgical approaches and targeted drug development, early diagnosis still represents one of the most effective means to conquer breast cancer. Imaging techniques currently used to diagnose breast cancer comprise mammography, ultrasound and MRI [3]. Among these, MRI is the most sensitive imaging modality for breast cancer detection, through the depiction of neoangiogenesis as a tumor-specific feature. A challenge to the broader use of breast MRI is false-positive findings, which lead to unnecessary invasive biopsies of benign tumors and cause avoidable costs and patient anxiety [4]. Other factors that affect MRI specificity are related to image acquisition technique and readers' experience [4].
Carcinogenesis is a complex, multistep process during which cancers develop distinct pathological biological properties, the cancer hallmarks, including sustained proliferation, evasion of growth suppressors and apoptosis, promotion of angiogenesis, invasion and metastasis [5]. Advanced imaging techniques providing morphologic, functional and metabolic information have been introduced to allow the non-invasive depiction of these pathophysiological processes at the cellular level. These novel imaging data can be used for tumor diagnosis and characterization, assessment of the response to specific treatments and prediction of patients' outcome [6].
Synchronized multiparametric 18F-fluoro-2-deoxy-D-glucose (18F-FDG) PET/MRI is a novel imaging technique that combines multiparametric morphologic and functional information from MRI with metabolic information provided by PET, offering unique insights into tumor biology to achieve the ultimate goal of precision medicine in oncology [7,8]. Recent data support the use of 18F-FDG PET/MRI in breast cancer patients for different diagnostic purposes [9,10]. Initial data using the combination of separately acquired MRI and PET data indicate an improvement in the differentiation of benign and malignant breast lesions [11], but at present the role of synchronized multiparametric 18F-FDG PET/MRI for breast cancer diagnosis has not been fully assessed.
Recently, a new paradigm in healthcare has emerged with advances in medical imaging technologies and image analysis and with the advent of artificial intelligence (AI) and its application to medical imaging. Radiomics is the extraction of large numbers of quantitative features from standard-of-care medical images using computer algorithms; these features can be correlated with various data (e.g., patient characteristics and outcomes) and pooled in large-scale analyses to create decision support models [12][13][14]. Radiomics has the potential to represent "the bridge between medical imaging and personalized medicine" [15].
We hypothesized that an AI-based radiomics model combining quantitative simultaneously acquired 18F-FDG PET/MRI data enables accurate differentiation of benign and malignant breast tumors. Therefore, the aim of our study was to develop and validate a diagnostic AI model using quantitative perfusion, diffusion and metabolic data as well as radiomics features extracted from simultaneous multiparametric 18F-FDG PET/MRI to non-invasively differentiate benign from malignant breast lesions.

Patient population
This prospective single-institution study was approved by the institutional review board, and written informed consent was obtained from all participants. From June 2016 to July 2020, 154 patients were included in the study and underwent synchronized multiparametric 18F-FDG PET/MRI of the breast for diagnostic purposes. Patients fulfilled the following inclusion criteria: age > 18 years; not pregnant or breastfeeding; imaging abnormality (BI-RADS 0, 4 or 5) on ultrasound and/or mammography (i.e., asymmetries, microcalcifications, architectural distortion, breast mass); no previous treatment; no contraindications to contrast-enhanced MRI. Exclusion criteria were: no histopathology or follow-up available; incomplete 18F-FDG PET/MRI examinations; 18F-FDG PET/MRI images not suitable for the subsequent quantitative and radiomics analysis (e.g., image artifacts, incomplete dynamic scans). After exclusions, 102 patients (mean age 50 years, age range 23-82 years) with 120 breast lesions (101 malignant and 19 benign) were finally included in this study. The BI-RADS category distribution of the included lesions was: BI-RADS 0 (n=8), BI-RADS 4 (n=16), BI-RADS 5 (n=96). The flow chart of the patient selection process is illustrated in Figure 1.

Standard of reference
Histology was used as the standard of reference for lesions classified as BI-RADS 4 (n=22) or BI-RADS 5 (n=95) at 18F-FDG PET/MRI. In case of a benign histopathological diagnosis at image-guided needle biopsy, the final diagnosis was benign. In case of a high-risk lesion with uncertain potential for malignancy, the final diagnosis was established with open surgery. In patients with malignant lesions, the standard of reference was histological analysis of the surgical specimen; in patients who received neoadjuvant treatment, the biopsy results were considered the standard of reference. For the three lesions classified as BI-RADS 2 (n=2) or BI-RADS 3 (n=1) at 18F-FDG PET/MRI, stable imaging follow-up was available for at least two years.

Multiparametric PET/MRI acquisition protocol
All patients underwent synchronized multiparametric 18F-FDG PET/MRI performed on a Biograph mMR system (Siemens, Germany), an MRI-compatible PET detector integrated with a 3.0 T MRI scanner.
Patients fasted for at least 5 hours before receiving an intravenous injection of 2.5-3.5 MBq/kg body weight of 2-deoxy-2-[18F]fluoro-D-glucose (18F-FDG). All measured blood glucose levels were less than 150 mg/dL (8.3 mmol/L) prior to tracer injection. The PET/MRI acquisition started after an uptake time of 60 min. MRI-based attenuation correction was applied using Dixon-VIBE sequences, obtaining in-phase and opposed-phase as well as fat-saturated and water-saturated images. A three-dimensional (3D) acquisition technique was used that offered an axial field of view (FOV) of approximately 26 cm and a transverse FOV of 59 cm with a sensitivity of 13.2 cps/kBq. The multiparametric MRI was performed using a dedicated 16-channel breast coil (Rapid Biomedical, Germany) and the imaging protocol consisted of the following sequences: 1. Axial T2-weighted sequence, TR/TE = 4820/192 ms, matrix size 640 x 480, FOV 360 x 360 mm, slice thickness 2.5 mm, gap 3 mm, FA 128°.

Image analysis
Two board-certified radiologists with 10 and 6 years of experience in breast imaging independently evaluated the MRI data. A nuclear medicine physician with 10 years of experience and a radiologist with 6 years of experience, trained in hybrid imaging under the supervision of a nuclear medicine physician, independently evaluated the PET images. Readers were blinded to the final histopathological results and previous examinations. To assess the intra-observer reproducibility of PET/MRI quantitative parameter measurements, all lesions were reassessed by the same readers after a wash-out period of four weeks. Breast lesions were identified on DCE post-contrast subtracted images, and lesion location and size (maximum diameter on DCE post-contrast subtracted images in the axial plane) were recorded.
For quantitative perfusion analysis, a pixel-by-pixel fast-deconvolution method was applied using the open source MRI perfusion analysis tool UMMPerfusion (Horos plugin) [16]. The arterial input function was selected by drawing a 2D region of interest (ROI) in the right ventricle. Breast lesions were identified and segmented on subtracted images at early post-contrast time points, as soon as the lesion was clearly visible [17]. 2D ROIs were drawn over the enhancing tumor portion, avoiding the inclusion of cystic, hemorrhagic or necrotic areas and susceptibility artifacts from biopsy markers, and then pasted on the corresponding quantitative maps to extract mean transit time (MTT), plasma flow (PF) and volume of distribution (VD).
DW images and the corresponding quantitative ADC maps were analyzed. In detail, breast lesions were first identified on high b-value DW images; thereafter, 2D ROIs were positioned on the ADC maps over the qualitatively darkest part of the tumor, using DCE images as a reference to identify contrast-enhanced regions and again avoiding the inclusion of cystic, hemorrhagic or necrotic areas and susceptibility artifacts from biopsy markers [18]. Using this approach, the ADCmean of primary lesions as well as of the normal-appearing unilateral and contralateral breast parenchyma was calculated.

18F-FDG PET
For PET quantification, a volume of interest (VOI) was manually drawn around every suspicious breast lesion to obtain its maximum (SUVmax), mean (SUVmean) and minimum (SUVmin) standardized uptake value (SUV) using the Hermes Hybrid Viewer (Hermes Medical Solutions, Stockholm, Sweden). The VOI was defined using a 3-dimensional region-growing approach with a fixed threshold chosen to capture the PET metabolic tumor volume but not physiological 18F-FDG uptake in surrounding tissues. For metabolic quantification of non-tumoral unilateral and contralateral breast tissue, a VOI was placed in the normal breast parenchyma, away from the nipple and areola, to obtain its SUVmean. Examples of ROI placement over breast lesions on DCE-MRI, ADC and PET images for the extraction of quantitative parameters are illustrated in Figure 2.

Radiomics analysis and model development
PET/MRI images were imported into dedicated software (ITK-SNAP v. 3.6.0) [19] for lesion segmentation. A radiologist with 6 years of experience in breast imaging annotated each lesion on DCE, DWI, PET and T2-w images. First, whole breast lesions were segmented on DCE-MRI images using a semi-automated method. The second post-contrast time point was chosen for lesion segmentation in order to better depict tumor enhancement relative to the surrounding breast parenchyma. The same approach was applied to DWI and PET images. Finally, manual segmentation was performed to annotate breast lesions on T2-w images slice by slice. In all steps, care was taken to avoid the inclusion of cystic/necrotic areas. When a biopsy marker was present, a distance of at least 2 mm from it was kept. Examples of tumor segmentation are illustrated in Figure 3.
Considering the unbalanced distribution of benign and malignant breast lesions, adaptive synthetic sampling was employed to equalize class sizes [20]. Data for all four image types were initially reduced to 16 gray levels. Radiomics features were calculated using CERR [21]. DCE, T2-w, ADC and PET images were used for radiomics feature extraction. Segmentations performed on DWI images were used for the extraction of radiomics features from ADC images. Because T2-w and ADC images were not isotropic, feature extraction was performed in a 2D fashion for each slice and then aggregated over the whole lesion (BTW3 as defined by the Image Biomarker Standardisation Initiative) [22]. LASSO regression was then used to determine which radiomics features were of most importance. A maximum of 5 features were selected to avoid overfitting. Diagnostic models were then developed in MATLAB using a fine Gaussian support vector machine (SVM), one of the most widely employed ML classifiers in medical imaging [23], and 5-fold cross-validation. Data were initially standardized to prevent any particular dependence on an individual parameter. This process was then repeated 1000 times to provide the final diagnostic metrics. Analysis was performed for each of the four image types independently and then in various combinations to assess potential improvements in diagnostic accuracy for the discrimination of benign and malignant breast lesions.
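As an illustration of two of the preprocessing steps named above (reduction to 16 gray levels and standardization), the following Python sketch shows one common way to implement them. It is not the CERR or MATLAB code used in the study: the fixed-bin-number discretization formula, the use of the sample standard deviation, and the function names `discretize_fixed_bin_number` and `standardize` are assumptions made for illustration only.

```python
import math


def discretize_fixed_bin_number(intensities, n_levels=16):
    """Map raw ROI intensities to integer gray levels 1..n_levels.

    Assumption: a fixed-bin-number scheme over the ROI intensity range;
    the discretization settings actually used in the study are not stated.
    """
    lo, hi = min(intensities), max(intensities)
    if hi == lo:
        # A flat ROI carries no texture; put every voxel in the first bin.
        return [1] * len(intensities)
    levels = []
    for x in intensities:
        level = math.floor(n_levels * (x - lo) / (hi - lo)) + 1
        levels.append(min(level, n_levels))  # the ROI maximum falls in the top bin
    return levels


def standardize(values):
    """Z-score one feature across lesions (sample standard deviation, n - 1)."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
    sd = math.sqrt(variance)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mean) / sd for v in values]


# Hypothetical intensities, for illustration only.
print(discretize_fixed_bin_number([0.0, 1.0, 15.9, 16.0, 32.0]))  # [1, 1, 8, 9, 16]
print(standardize([1.0, 2.0, 3.0]))  # [-1.0, 0.0, 1.0]
```

Standardizing each feature before fitting keeps parameters measured on very different scales (for example ADC and SUV) from dominating a distance-based classifier such as a Gaussian-kernel SVM.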

Clinical 18F-FDG PET/MRI interpretation
DCE-MRI was assessed according to the fifth edition of the BI-RADS lexicon [24]. A BI-RADS category from 2 to 5 was assigned to each lesion. BI-RADS scores were then dichotomized as follows: 2-3 = benign and 4-5 = malignant. Subsequently, ADC values were calculated for each lesion, as described above. An ADC value of 1.3 × 10⁻³ mm²/s was used as the diagnostic threshold for defining benignity and malignancy, as suggested by the EUSOBI consensus statement on diffusion-weighted imaging [18]. Lesions showing ADC values equal to or greater than 1.3 × 10⁻³ mm²/s were classified as benign, while lesions with ADC values lower than 1.3 × 10⁻³ mm²/s were classified as malignant. On PET, a lesion was classified as benign if it did not show 18F-FDG uptake higher than background activity; conversely, a lesion showing 18F-FDG uptake greater than the surrounding parenchyma was classified as malignant [25]. To achieve a final diagnosis, the following criteria were applied for the combined DCE-MRI, DWI and PET evaluation:
A lesion was classified as malignant if at least two of DCE-MRI, DWI and PET were positive for malignancy.
A lesion was classified as benign if at least two of DCE-MRI, DWI and PET were negative for malignancy.
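The combined reading rule amounts to a two-out-of-three majority vote. The Python sketch below encodes it under stated assumptions: the function name `combined_reading` is hypothetical, and the PET assessment, which is visual in the text, is represented here by a boolean flag supplied by the reader.

```python
ADC_THRESHOLD = 1.3e-3  # mm^2/s, the diagnostic threshold quoted in the text


def combined_reading(birads, adc_mean, pet_uptake_above_background):
    """Return 'malignant' or 'benign' from the three per-modality calls.

    birads: DCE-MRI BI-RADS category (2 to 5).
    adc_mean: lesion ADCmean in mm^2/s.
    pet_uptake_above_background: True if the reader judged 18F-FDG uptake
        to be higher than the surrounding parenchyma (assumed boolean input).
    """
    if birads not in (2, 3, 4, 5):
        raise ValueError("BI-RADS category must be 2, 3, 4 or 5")
    votes = [
        birads >= 4,                        # DCE-MRI: 4-5 = malignant
        adc_mean < ADC_THRESHOLD,           # DWI: below threshold = malignant
        bool(pet_uptake_above_background),  # PET: uptake above background = malignant
    ]
    return "malignant" if sum(votes) >= 2 else "benign"


# Hypothetical lesions, for illustration only.
print(combined_reading(4, 1.5e-3, False))  # benign (only DCE-MRI is positive)
print(combined_reading(3, 1.0e-3, True))   # malignant (DWI and PET are positive)
```

With three binary inputs a tie is impossible, so the two written criteria cover every lesion.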

Statistical analysis
Intra- and interobserver reproducibility of quantitative parameter measurements was assessed using the intraclass correlation coefficient (ICC). Agreement was rated as follows: poor with ICC < 0.5, moderate between 0.5 and 0.75, good between 0.75 and 0.90, and excellent when greater than 0.90 [26]. The Kolmogorov-Smirnov test was performed to assess whether quantitative parameters were normally distributed. Accordingly, the independent t test or the Mann-Whitney U test was used to compare quantitative parameters between benign and malignant breast lesions. Diagnostic accuracy, sensitivity, specificity, and positive and negative likelihood ratios of the radiologists and nuclear medicine physician in classifying breast lesions were also calculated. ROC curves of the BI-RADS score as well as of the significant quantitative DWI, perfusion and PET parameters for breast cancer diagnosis were also calculated. Differences in performance among the different radiomics models and between the best performing radiomics model and clinical interpretation were assessed using the McNemar test. A p value < 0.05 was considered statistically significant. Statistical analysis was performed using SPSS, released 2017,
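For readers unfamiliar with the McNemar test and the ICC rating scale described in this section, a minimal Python sketch follows. It is not the SPSS procedure used in the study: the exact (binomial) form of the test, the handling of the shared category boundaries at 0.75 and 0.90, and the names `mcnemar_exact` and `rate_icc` are assumptions made for illustration.

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: lesions classified correctly by method A but not by method B.
    c: lesions classified correctly by method B but not by method A.
    Only these discordant pairs enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


def rate_icc(icc):
    """Rate agreement with the cut-offs quoted in the text.

    Assumption: values exactly at 0.75 or 0.90 are rated 'good'.
    """
    if icc < 0.5:
        return "poor"
    if icc < 0.75:
        return "moderate"
    if icc <= 0.90:
        return "good"
    return "excellent"


# Hypothetical discordant counts, for illustration only.
print(mcnemar_exact(1, 5))  # 0.21875
print(rate_icc(0.93))       # excellent
```

Because the test depends only on the discordant pairs, two methods with different overall accuracy can still yield a non-significant p value when few lesions are classified differently.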

Radiomics models
One hundred and one features in six classes (22 first order, 26 based on gray-level co-occurrence matrices, 16 based on run length matrices, 16 based on size zone matrices, 16 based on neighborhood gray level dependence matrices, and five based on neighborhood gray tone difference matrices) were extracted from each of the DCE, ADC, T2-w and PET images. Eight radiomics models were developed to predict breast cancer diagnosis, based on different combinations of multiparametric 18F-FDG PET/MRI images. Radiomics models with the corresponding selected radiomics features are reported in Table 3. First, a radiomics model based on quantitative parameters alone was built. ADCmean of breast lesions, MTT and SUVmax were selected by the LASSO regression and used by the SVM classifier, obtaining an AUC of 0.981 for correctly classifying breast lesions. A summary of all radiomics models with the corresponding accuracy metrics, including AUROC, diagnostic accuracy, sensitivity, specificity, and positive and negative likelihood ratios, is reported in Table 4. As reported in Table 5, the performance of the integrated model combining quantitative parameters and radiomics features was higher than, but not significantly different from, that of the other radiomics models (p > 0.069).

Discussion
At present, no studies have been published on AI applied to synchronized 18F-FDG PET/MRI for breast cancer diagnosis. The aim of this study was to investigate whether an AI-based radiomics model combining quantitative simultaneously acquired 18F-FDG PET/MRI data enables accurate differentiation of benign and malignant breast tumors. A model including both quantitative parameters and radiomics features can accurately discriminate benign from malignant breast lesions. Our results indicate that AI-enhanced functional and metabolic breast imaging showed an excellent performance, numerically higher than that of expert readers, and thus has the potential to assist human readers in correctly classifying suspicious breast lesions and to obviate unnecessary invasive breast procedures.
While DCE-MRI is undisputedly the most sensitive test for breast cancer detection, with a pooled sensitivity of 99% [27], there is still room for improvement in diagnostic accuracy due to overlap in the imaging features of benign and malignant breast tumors, interpretation-influencing physiological factors such as background parenchymal enhancement and, last but not least, human detection or interpretation error [28].
To compensate for these limitations, additional functional and metabolic imaging techniques such as DWI, perfusion and PET have been developed that provide insights into tumor biology and thus improve diagnostic accuracy. Several studies have shown the incremental diagnostic value of these individual parameters [29,30]; in particular, their combined application as multiparametric MRI or PET/MRI has been shown to improve diagnostic accuracy for breast cancer detection and characterization [11,31].
Our findings also indicate that different functional and metabolic imaging techniques enable the non-invasive simultaneous depiction of oncogenic processes such as induction of neoangiogenesis, metabolic reprogramming and sustained proliferation.
In our study, the clinical interpretation of 18F-FDG PET/MRI showed good diagnostic accuracy, with an AUC of 0.868 for breast cancer diagnosis, and these findings are in line with previous studies [11,31,32].
To fully leverage the wealth of information provided by synchronized multiparametric 18F-FDG PET/MRI, we aimed to develop and validate a diagnostic AI model using quantitative perfusion, diffusion and metabolic data as well as radiomics features to non-invasively differentiate benign from malignant breast lesions.
The AI model with the best diagnostic accuracy was based on radiomics features extracted from ADC and PET images together with quantitative DCE (MTT) and DWI (ADCmean) information of breast lesions, and achieved an accuracy, sensitivity and specificity of 94.8%, 95.3% and 94.3%, respectively. This indicates that information on tumor cellularity, metabolism and permeability is desirable for the most accurate breast cancer detection.
It is worth noting that the model based on quantitative parameters only (i.e. ADC, MTT and SUVmax) also showed a good performance (accuracy: 93.2%).
Although the multiparametric 18F-FDG PET/MRI AI-based radiomics model performed best, its performance was not statistically different from the clinical interpretation by expert readers. It has to be noted that while the model and clinical interpretation achieved similar sensitivities (95.3% vs 100%), the multiparametric 18F-FDG PET/MRI AI-based radiomics model achieved a higher specificity (94.3% vs 73.7%), highlighting the potential to reduce false-positive findings and obviate unnecessary breast biopsies in benign breast tumors [29].
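To make the practical meaning of these sensitivity and specificity figures concrete, the short Python sketch below converts them into positive and negative likelihood ratios using the standard definitions LR+ = sensitivity / (1 - specificity) and LR- = (1 - sensitivity) / specificity. The function name `likelihood_ratios` is hypothetical, and the values are derived from the rounded percentages quoted in the text, so they may differ slightly from tabulated figures.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-) for proportions given on a 0-1 scale.

    Requires 0 < specificity < 1 so that both ratios are defined.
    """
    if not 0.0 <= sensitivity <= 1.0:
        raise ValueError("sensitivity must lie in [0, 1]")
    if not 0.0 < specificity < 1.0:
        raise ValueError("specificity must lie strictly between 0 and 1")
    lr_positive = sensitivity / (1.0 - specificity)
    lr_negative = (1.0 - sensitivity) / specificity
    return lr_positive, lr_negative


# Rounded figures quoted in the text.
model = likelihood_ratios(0.953, 0.943)    # AI-based radiomics model
readers = likelihood_ratios(1.000, 0.737)  # clinical interpretation

print(f"model:   LR+ = {model[0]:.2f}, LR- = {model[1]:.3f}")
print(f"readers: LR+ = {readers[0]:.2f}, LR- = {readers[1]:.3f}")
```

On these rounded inputs the higher specificity of the model translates into a considerably larger positive likelihood ratio (about 16.7 versus about 3.8), which is the quantity most directly related to avoiding biopsies of benign lesions.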
Several studies have been published on the use of AI applied to MRI for breast cancer diagnosis, mainly aiming at increasing its relatively low specificity compared to its high sensitivity, with accuracy values ranging from 0.728 to 0.920 [33][34][35][36][37]. Similar to our work, Zhang et al. explored the possibility of improving the accuracy of the ML classifier by combining radiomics features extracted from both morphological and functional contrast-enhanced and diffusion kurtosis MRI images of 207 histologically proven breast lesions. They found that the model based on radiomics features from T2-w, DKI and quantitative DCE pharmacokinetic parameter maps had the best discriminatory ability for benign and malignant breast lesions (AUC = 0.921) [33]. Radiomics coupled with machine learning analysis applied to DCE-MRI, including both radiomics features and clinical data, also proved to be accurate in the characterization of < 1 cm breast lesions in ninety-six high-risk BRCA mutation carriers, with a diagnostic accuracy of 81.5%, significantly higher than qualitative morphological assessment with BI-RADS classification (AUC 53.4%) [37]. Regarding PET imaging, the role of this functional technique has been explored in breast cancer mainly for prognostic/therapeutic purposes, particularly in the early prediction of the response to neoadjuvant chemotherapy [39][40][41]. In a recently published study, the usefulness of radiomics and ML applied to PET/CT to differentiate breast carcinoma from lymphoma was investigated in a small number of lesions (19 breast lymphoma and 25 breast cancer lesions) [42]. Different predictive models were built using combinations of clinical data, quantitative parameters (SUV), radiomics features (first and second order parameters extracted from both PET and CT images) and CT images.
Models based on clinical data, SUV and PET radiomics features, as well as on clinical data and CT radiomics features, were the most accurate, with AUCs of 0.806 and 0.759 in the validation cohort, respectively [42]. In an experimental study by Vogl et al. conducted on 34 breast lesions, a computer-aided segmentation and diagnosis (CAD) system was developed for automated lesion segmentation and classification (benign vs malignant) using separately acquired MRI and 18F-FDG PET/CT images [43]. The CAD system achieved a Dice similarity coefficient of 0.665 for lesion segmentation and an AUC of 0.978 for breast cancer diagnosis. While PET and DWI features were found to improve DCE-MRI segmentation performance, such improvement was not observed for lesion characterization [43].
Limitations of our study have to be acknowledged. Firstly, our study is limited by the small sample size and the unbalanced distribution of benign and malignant breast lesions, with relevant implications for specificity. To overcome the limitation of the relatively small sample size, especially regarding benign lesions, we opted for an internal 5-fold cross-validation, which has proven to be robust in such cases [44]. The unbalanced distribution of benign and malignant tumors is related to this study being conducted at a single tertiary care cancer centre and to the inclusion of BI-RADS 0, 4 and 5 lesions with a clinical indication for breast 18F-FDG PET/MRI. We addressed this limitation by using well-established adaptive synthetic sampling to balance the two classes. Another limitation is the lack of external validation of the proposed AI model, which may limit generalizability. To date, only a limited number of centers worldwide have clinical synchronized PET/MRI scanners and perform clinical breast imaging; a collaboration with a different institution to validate our models is in development. Furthermore, two different dynamic sequences were used, one before and one after an MRI update. However, acquisition parameters were kept similar and AI techniques are meant to be applied to images acquired with different acquisition protocols; indeed, this issue did not affect the good accuracy of the ML classifier. Finally, several cases had to be excluded from the analysis because at least one of the DCE-MRI, DWI or PET images was not suitable for the extraction of quantitative parameters or for the radiomics analysis, so as not to impair the reliability of the data. Despite this stringent exclusion criterion, and also considering the limited access to such an advanced imaging technique, an adequate number of breast lesions was finally included, which allowed a good performance to be achieved in the AI discrimination task.
In conclusion, a synchronized multiparametric 18F-FDG PET/MRI AI-based radiomics model allows accurate differentiation of benign and malignant breast lesions with a performance comparable to that of expert readers. Initial data indicate that AI-enhanced functional and metabolic breast imaging has the potential to improve breast cancer diagnosis while simultaneously reducing the number of biopsies without missing cancers.
Larger multi-center studies are planned to validate the multiparametric 18F-FDG PET/MRI AI-based radiomics model.

Figure 2: ROI placement over breast lesions on DCE-MRI, ADC and PET images for the extraction of quantitative parameters.
