Improving the malignancy prediction of breast cancer based on the integration of radiomics features from dual-view mammography and clinical parameters

Radiomics has been a promising imaging biomarker for many malignant diseases. We developed a novel radiomics strategy that incorporates radiomics features extracted from dual-view mammograms and clinical parameters to distinguish benign from malignant breast lesions, and evaluated whether this radiomics assessment could improve the diagnostic accuracy for breast cancer. A total of 380 patients (mean age, 52 ± 7 years) with 621 breast lesions imaged with mammography in craniocaudal (CC) and mediolateral oblique (MLO) views were randomly allocated into the training (n = 486) and testing (n = 135) sets in this retrospective study. A total of 1184 and 2368 radiomics features were extracted from the single-position region of interest (ROI) and the position-paired ROI, respectively. Clinical parameters were then combined for better prediction. Recursive feature elimination and least absolute shrinkage and selection operator methods were applied to select optimal predictive features. Random forest was used to build the predictive model. The intraclass correlation coefficient test was used to assess the repeatability and reproducibility of features. After preprocessing, 467 radiomics features and clinical parameters remained in the single-view and dual-view models. The performance and significance of the models were quantified by the area under the curve (AUC), sensitivity, specificity, and accuracy. The correlation between variables was evaluated using the correlation ratio and the Pearson correlation coefficient. The model using a combination of dual-view radiomics and clinical parameters achieved a favorable performance (AUC: 0.804, 95% CI: 0.668–0.916), outperforming the single-view model and the model without clinical parameters. Combining radiomics features from dual-view (CC and MLO) mammograms with age, breast density, and type of suspicious lesion can provide a noninvasive approach to evaluate the malignancy of breast lesions and facilitate clinical decision-making.


Introduction
Breast cancer (BC) has become a serious threat to women's health and quality of life, resulting in millions of deaths each year [1]. With the advancement of medical imaging technologies and increased public awareness of health, raising the breast cancer diagnosis rate has become a pressing necessity for population screening [2]. Different imaging methods for breast cancer screening are now widely used to aid the detection of small breast lesions in order to reduce breast cancer deaths in women [3]. For example, breast ultrasonography evaluates lesions using a combination of morphology and blood flow, but is insensitive to calcification. Breast magnetic resonance imaging (MRI) has good sensitivity and accuracy in discriminating breast lesions; however, its accuracy is lower for small lesions than for advanced lesions. Furthermore, due to its long imaging time and high cost, MRI has not been adopted as a standard examination method [4,5]. Digital mammography (DM), which uses a 2D technique, is a widely used tool for detecting breast cancer, and the American Cancer Society has declared that mammography is one of the best approaches to preserving women's health [6,7]. However, mammography still has a serious restriction in that the appearance of lesions may be diminished, leading to missed or incorrect diagnosis when relying exclusively on a morphological appraisal of suspicious findings, which may foster an increase in the positive predictive value (PPV) of work-up examinations [8,9]. Radiomics, which uses quantitative methods that can be more beneficial for diagnosing and defining benign and malignant breast tumors, has been used to overcome these constraints [10,11].
According to current research, most breast cancers can be precisely classified using radiomics analysis, which adds morphological and texture details such as the presence of an uneven lesion border and heterogeneous interior enhancement. Cheung et al. showed that radiomics characteristics can predict the underestimation of ductal carcinoma in situ with reasonable accuracy [12]. Moreover, other studies have shown that precancerous mammograms contain unique imaging information beyond breast density that can be used for risk assessment of breast cancer [13–16]. Sun et al. extracted the complementary information between CC and MLO mammographic views of a breast mass, greatly improving the classification performance and diagnostic speed for mammographic breast masses [17]. In addition to their utility in risk assessment, parenchymal and texture features can help evaluate and predict breast cancer [18]. Song et al. found that MRI-based radiomics can provide a noninvasive approach to evaluate microvascular invasion before surgery, which can help surgeons decide on surgical strategies and assess a patient's prognosis [19]. However, in the clinical setting, using radiomics methods to determine the status of breast cancer lesions remains a challenge for clinicians and requires more data to support its real application. More importantly, the accuracy and repeatability depend on the quality of the input data and annotation [20].
To solve this problem, many researchers developed predictive models, such as molecular classification of breast cancer, efficacy evaluation of neoadjuvant chemotherapy, risk prediction of early breast cancer, and prediction of axillary lymph node metastasis in breast cancer, to help clinical decision-making. Subsequent studies have incorporated different modalities of radiomics data to predict the malignancy of breast lesions [21]; commonly used mammography inputs include the CC position alone [8,22], mixed CC and MLO positions [23,24], and a combination of the CC and MLO positions [6,25]. The relationship between breast imaging and the observation of developing breast cancer has been studied extensively [26]. A further study [27] reported that radiomics features extracted from mammographic masses on the combination of CC and MLO position images provided similar diagnostic performance for malignant mammographic masses as biopsies. Besides the change of position, previous studies have found a high correlation between parenchymal texture features extracted from the left and right breast due to left-right breast symmetry [28]. However, that study did not discuss how features extracted from the different positions would affect the final model.
In this study, we aimed to develop a model that combines dual-view mammographic radiomics features and clinical features to classify breast lesions. The purpose was to explore whether combining position-paired radiomics with clinical information characterizes breast lesions better than radiomic features alone.

Patients
The protocol was approved by the Institutional Research Ethics Board (Approval [2021] No.004) in the People's Hospital of SND. The informed consent requirement was waived due to the retrospective nature of this study. A total of 4128 patients who underwent DM examination from November 2017 to May 2021 were screened to identify eligible cases. The exclusion criteria were (a) patients lacked a visible lesion on the mammography images, (b) patients had a history of breast implantation, (c) patients had undergone biopsy, radiotherapy, or chemotherapy before the examination, and (d) image quality did not satisfy the needs of diagnosis. Finally, 380 women (mean age = 51.70 ± 7.48 years; range 32-74 years) with 621 breast lesions were enrolled in this study, including 95 malignant lesions and 526 benign lesions. Figure 1 shows the selection of the study group.

Image acquisition
All patients underwent mammography examination using a digital breast X-ray machine (Siemens Mammomat Fusion) with a device tube voltage of 50 kV and a tube current of 100 mA. The craniocaudal (CC) and mediolateral oblique (MLO) views were acquired with an exposure voltage of 20 to 35 kV and an exposure current of 20 to 500 mAs. All mammograms were saved at a 12-bit quantization level and were not further processed or normalized [29]. Clinical information of enrolled patients was obtained from the hospital information system (HIS) and recorded.

Patients grouping
A total of 621 breast lesions were separated into a testing cohort with 135 lesions (95 benign and 40 malignant lesions) and a training cohort with 486 lesions (431 negative and 55 positive lesions). Notably, the cohort was divided into training and testing groups according to whether a definite pathological result was available.
Because pathological results are scarce in clinical research, our study treated surgically proven lesions as testing data and the rest of the lesions as training data. Of all 135 lesions in the testing group proved by pathological diagnosis, the 40 malignant breast lesions included invasive ductal carcinoma (n = 19), ductal carcinoma in situ (n = 14), invasive lobular carcinoma (n = 4), and invasive papillary carcinoma (n = 3), while the 95 benign breast lesions included fibroadenomas (n = 35), adenosis (n = 42), hyperplasia (n = 13), intraductal papilloma (n = 3), and granulomatous inflammation (n = 2). According to the breast imaging reporting and data system (BI-RADS) benchmark and previous studies [30,31], lesions of BI-RADS category ≤ 3 show a low malignancy rate (< 2%). Therefore, this study treated these probably benign (BI-RADS category ≤ 3) lesions as negative samples and treated the other BI-RADS category

Image interpretation by radiologists
All digital imaging and communications in medicine (DICOM) images were uploaded from the picture archiving and communication system (PACS). After the acquisition, all DM images were reviewed and interpreted by two independent breast radiologists who were blinded to all clinical information including pathology reports (Radiologist 1 with 5 years of experience; Radiologist 2 with 10 years of experience) to obtain the radiological features. Radiological features included the BI-RADS category, breast density (a, b, c, or d), and type of suspicious lesions (mass, calcification, asymmetry, or architectural distortion). When the two radiologists' opinions disagreed, a senior Radiologist 3 with 25 years of diagnostic experience participated in the image evaluation, and an agreement was reached after consultation.

Radiomics feature selection
The ROIs from each patient were retrospectively identified and manually outlined in CC and MLO views by Radiologist 1 (with 5 years of diagnostic experience) and Radiologist 2 (with 10 years of diagnostic experience) on the InferScholar platform (https://research.infervision.com/v2/). All ROIs were reviewed by Radiologist 3 (25 years of diagnostic experience). A sample of lesion segmentation is shown in Fig. 2a, b. Subsequently, quantitative radiomics features were automatically extracted from the ROI images on the InferScholar platform. Features from CC and MLO views were extracted as separate features. Three months later, 60 patients were randomly selected to assess the reproducibility of radiomics feature extraction, and intraclass correlation coefficients (ICCs) were employed to assess the intra- and inter-observer agreement [1]. Features with ICCs greater than 0.75 were considered to indicate good agreement and were kept in the data sets for the radiomics feature selection [32]. Firstly, we extracted 1184 and 2368 radiomics features from single-view and dual-view mammograms, respectively. Secondly, 467 radiomics features with ICC > 0.75 were retained after feature reduction. The retained radiomics features were divided into seven categories: first-order features (n = 104), gray-level co-occurrence matrix (GLCM) features (n = 102), gray-level run length matrix (GLRLM) features (n = 99), gray-level size zone matrix (GLSZM) features (n = 78), gray-level dependence matrix (GLDM) features (n = 63), neighboring gray tone difference matrix (NGTDM) features (n = 14), and shape-based features (n = 7). The heat map of all selected radiomics features is shown in Fig. 3.
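The ICC-based stability filter described above can be sketched as follows. This is a minimal illustration only: the text does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form ICC(2,1) is an assumption, and the function names icc2_1 and stable_features, as well as the feature names in the usage note, are hypothetical.

```python
from statistics import mean


def icc2_1(ratings):
    """ICC(2,1) for a table with one row per lesion and one column per reader.

    Assumed form (not stated in the paper): two-way random effects,
    absolute agreement, single rater.
    """
    n = len(ratings)        # number of lesions (subjects)
    k = len(ratings[0])     # number of readers or reading sessions
    grand = mean(v for row in ratings for v in row)
    row_means = [mean(row) for row in ratings]
    col_means = [mean(row[j] for row in ratings) for j in range(k)]

    # Two-way ANOVA sums of squares
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((v - grand) ** 2 for row in ratings for v in row)
    ss_err = ss_total - ss_rows - ss_cols

    msr = ss_rows / (n - 1)               # between-lesion mean square
    msc = ss_cols / (k - 1)               # between-reader mean square
    mse = ss_err / ((n - 1) * (k - 1))    # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


def stable_features(feature_ratings, threshold=0.75):
    """Keep the names of features whose ICC exceeds the threshold."""
    return [name for name, table in feature_ratings.items()
            if icc2_1(table) > threshold]
```

For example, stable_features({"glcm_contrast": [[1, 1], [2, 2], [3, 3]], "glszm_zone": [[1, 3], [2, 1], [3, 2]]}) keeps only the first, perfectly reproducible feature. A feature that is constant across all lesions would cause a division by zero and would need separate handling.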

Development of the radiomics model
To eliminate the influence of redundant features on the modeling process, we used recursive feature elimination (RFE) and least absolute shrinkage and selection operator (LASSO) regression to select the key features from the retained radiomics features and clinical features. After feature selection, our study used the random forest (RF) algorithm [32,33] to build the predictive model. The discriminant models of single-view and dual-view were trained and tested on the independent data set. Fivefold cross-validation was used to develop the model. The radiomic framework is shown in Fig. 4.

Statistical analysis
The statistical analysis was performed in SPSS software (IBM Corporation, New York, version 26.0). Quantitative data were displayed as mean ± standard deviation and median (minimum-maximum). Categorical data were displayed as a percentage. The Kolmogorov-Smirnov test was used to test the normality of data and the consistency of data distribution. The independent sample t-test and Mann-Whitney U-test were used for quantitative variables, and the Chi-square test was used for qualitative variables. The correlation ratio (η²) for qualitative variables and the Pearson correlation coefficient for quantitative variables were applied for correlation analysis. The receiver operating characteristic (ROC) curve analysis was performed to evaluate the diagnostic performance of each mammographic radiomics model, with the area under the ROC curve (AUC), accuracy, sensitivity, and specificity calculated as comparison metrics. DeLong's test was used to compare the AUCs of different models. P < 0.05 was regarded as indicating a statistically significant difference.
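As an illustration of the AUC and its confidence interval, the following sketch computes the AUC as the Mann-Whitney probability that a malignant lesion receives a higher score than a benign one, and a percentile bootstrap interval. The analysis in this study was performed in SPSS, and the method used to obtain the 95% CI is not stated; the percentile bootstrap, the function names, and the default parameters below are therefore assumptions.

```python
import random


def auc(labels, scores):
    """AUC as P(score of a positive > score of a negative); ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes are required to compute the AUC")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def bootstrap_ci(labels, scores, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for the AUC (assumed method, see lead-in)."""
    auc(labels, scores)  # fail early if only one class is present
    rng = random.Random(seed)
    idx = range(len(labels))
    stats = []
    while len(stats) < n_boot:
        sample = [rng.choice(idx) for _ in idx]   # resample lesions with replacement
        ys = [labels[i] for i in sample]
        if len(set(ys)) < 2:                      # skip one-class resamples
            continue
        stats.append(auc(ys, [scores[i] for i in sample]))
    stats.sort()
    lower = stats[int((alpha / 2) * n_boot)]
    upper = stats[int((1 - alpha / 2) * n_boot) - 1]
    return lower, upper
```

With labels coded 1 for malignant and 0 for benign, auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]) gives 0.75, because three of the four malignant-benign pairs are ranked correctly.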

Patient characteristics
A total of 380 female patients (age range, 32-74 years; median age, 52 years) with 621 breast lesions were enrolled in this study. Depending on the presence or absence of the histopathological results, all patients were divided into the training cohort (n = 289) and testing cohort (n = 91). The characteristics of all breast lesions are provided in Table 1. In the testing group, the difference in age, breast density, and BI-RADS category was statistically significant (P < 0.05), while the clinical feature of lesion type showed no differences between benign and malignant lesions (P = 0.091).

Significant features contributing to classification
In the random forest method, feature importance was determined by how much each feature contributes to the final decision tree. Among all 467 features with high stability (ICC > 0.75), RFE and LASSO algorithms were applied to select the significant features contributing to the classification of benign and malignant lesions. Finally, the most important features entered the subsequent modeling process; the importance scores of the features and the classes to which each feature belongs are shown in Fig. 5. For qualitative clinical information (lesion type and breast density), the correlation ratio was applied to assess the strength of correlation. For quantitative clinical information (age), the Pearson correlation test was performed to measure the correlation. The squared eta values were mostly less than 0.1, and the absolute values of the Pearson correlation coefficient were mostly between 0 and 0.1. These results indicated that radiomics and clinical features were essentially uncorrelated. The details of the correlation analysis can be seen in supplementary material Fig. S3.
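The two correlation measures used above can be written out explicitly. The sketch below is illustrative only (the study used SPSS); the function names are hypothetical. The correlation ratio η² is the share of the variance of a radiomics feature explained by a categorical clinical variable, and the Pearson coefficient measures the linear association with a quantitative variable such as age.

```python
from math import sqrt
from statistics import mean


def correlation_ratio(categories, values):
    """eta^2 = between-group sum of squares / total sum of squares."""
    grand = mean(values)
    groups = {}
    for c, v in zip(categories, values):
        groups.setdefault(c, []).append(v)
    ss_between = sum(len(g) * (mean(g) - grand) ** 2 for g in groups.values())
    ss_total = sum((v - grand) ** 2 for v in values)
    return ss_between / ss_total if ss_total else 0.0


def pearson_r(x, y):
    """Pearson correlation coefficient between two quantitative variables."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)
```

An η² near 0 means that the feature has almost the same mean in every category (for example, in every lesion type), which is the pattern reported here for most feature pairs.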

Comparison of diagnostic performance between radiomics models and radiomics-clinical models
A predictive model incorporating radiomics and clinical features was built using the random forest algorithm in the training set and used as a convenient visible tool to predict the malignancy of breast lesions. The diagnostic abilities of RF classification methods in four models (namely, lesion alone, lesion with clinical information, dual-view lesion, and dual-view lesion with clinical information) are shown in Table 2. The ROC curves for evaluating the four discriminant models on the testing cohort are shown in Fig. 6a-d. On the testing cohort, models with clinical information performed better than those without clinical information, and the highest AUC score of the four radiomics models was 0.808 (95% CI: 0.683-0.921), which outperformed the other models. The radiomics model combining dual-view mammographic features with clinical parameters using the random forest algorithm demonstrated the best predictive performance. The accuracy and area under the curve metrics were 75.9% and 0.804, respectively. Regarding the impact of the projection position as an independent factor in radiomics models, the AUC scores of the single-view and dual-view models were 0.734 and 0.808, respectively. To further investigate the effect of the input view on prediction results, the single-view was subdivided into CC-view and MLO-view to compare the difference in classification performance.
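For reference, the threshold-based metrics reported in Table 2 follow directly from the confusion matrix. The sketch below assumes that labels are coded 1 for malignant and 0 for benign and that a predicted probability of at least 0.5 is called malignant; the cut-off used in this study is not stated, so both the threshold and the function name are assumptions.

```python
def diagnostic_metrics(labels, scores, threshold=0.5):
    """Sensitivity, specificity, and accuracy at a fixed probability cut-off."""
    preds = [1 if s >= threshold else 0 for s in scores]
    pairs = list(zip(labels, preds))
    tp = sum(1 for y, p in pairs if y == 1 and p == 1)  # malignant, called malignant
    fn = sum(1 for y, p in pairs if y == 1 and p == 0)  # malignant, missed
    tn = sum(1 for y, p in pairs if y == 0 and p == 0)  # benign, called benign
    fp = sum(1 for y, p in pairs if y == 0 and p == 1)  # benign, called malignant
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "accuracy": (tp + tn) / len(pairs),
    }
```

Unlike the AUC, these three values change with the chosen threshold, which is one reason a model can have the best AUC without having the best accuracy.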

Comparison of diagnostic performance between single-view models and dual-view radiomics models
The diagnostic abilities of RF classification methods in six models (namely, CC/MLO-view lesion alone, CC/MLO-view with clinical information, dual-view lesion, and dual-view lesion with clinical information) are shown in Table 3, and the P values for the difference in AUCs between models are shown in Table 4. The ROC curves for evaluating the six discriminant models on the testing cohort are shown in Fig. 7a, b.
Among them, fusing clinical information improved diagnostic efficiency, with the highest accuracy of 79.60% and specificity of 85.30% (for the combination of MLO-view and clinical information). The highest sensitivity of 82.40% was reached with the combination of CC-view and clinical information. Further, we found that the model combining dual-view and clinical information reached the best performance (AUC = 0.808, 95% CI: 0.683-0.921) among these models in the testing set. The performance of radiomics features from the combined position and clinical parameters in the differentiation between benign and malignant mammographic lesions was better than that of the radiomics features alone. The difference in AUCs between single-view (MLO) and dual-view nearly reached a significant level (P = 0.07). Although we saw some evidence of improved AUC in the combined model, the difference in AUC between the two groups did not reach the traditional level of statistical significance.
To further investigate the relationship between radiomics features and the malignant probability of breast lesions, the Kolmogorov-Smirnov test was applied to compare the distribution of radiomics features between malignant and benign lesions, and the corresponding P values were evaluated. As shown in Table 5, six GLSZM features and one first-order feature had statistically different distributions between the pathological categories (benign or malignant). Seven (43.75%) of the 16 radiomics features in our proposed model differed between pathological categories at the significance level (P < 0.05).
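The two-sample Kolmogorov-Smirnov comparison can be sketched as below. The statistic D is the largest gap between the empirical distribution functions of a feature in benign and malignant lesions. The P value shown uses the standard asymptotic series with a small-sample correction, which is an assumption for illustration and not necessarily what SPSS computes; the function names are hypothetical.

```python
from bisect import bisect_right
from math import exp, sqrt


def ks_statistic(a, b):
    """Maximum absolute difference between the two empirical CDFs."""
    a, b = sorted(a), sorted(b)
    d = 0.0
    for x in a + b:
        fa = bisect_right(a, x) / len(a)   # share of sample a that is <= x
        fb = bisect_right(b, x) / len(b)   # share of sample b that is <= x
        d = max(d, abs(fa - fb))
    return d


def ks_pvalue(d, n, m, terms=100):
    """Asymptotic two-sided P value for D with sample sizes n and m."""
    en = sqrt(n * m / (n + m))
    lam = (en + 0.12 + 0.11 / en) * d
    if lam < 0.3:
        # The alternating series converges poorly here; the P value is ~1.
        return 1.0
    p = 2 * sum((-1) ** (j - 1) * exp(-2 * (j * lam) ** 2)
                for j in range(1, terms + 1))
    return min(max(p, 0.0), 1.0)
```

A feature would be flagged as differently distributed when ks_pvalue(ks_statistic(benign, malignant), len(benign), len(malignant)) falls below 0.05. For small groups, an exact P value from a statistics package is preferable to this approximation.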
From the results shown above, it is clear that radiomic features from mammograms can accurately discriminate malignant lesions from benign lesions, which provides a noninvasive approach to diagnosing.

Fig. 5 Importance score in dual-view radiomics with clinical features. The columns of bar graph illustrated the category of the selected features from radiomics or clinic. GLCM, gray-level co-occurrence matrix; GLSZM, gray-level size zone matrix; GLDM, gray-level dependence matrix

Discussion
In this retrospective study, we constructed a stable predictive model with clinical features and significant radiomics features extracted from mammograms to discriminate the malignancy of breast lesions. It was observed that the discrimination of the model improved with dual-view and clinical information. We also investigated the additive role of
With more attention paid to women's health awareness, breast cancer screening with digital mammography is becoming popular, which increases the workload of radiologists, especially in areas where primary care is severely under-resourced. In clinical practice, differential diagnoses made by radiologists should depend on several factors, including the age of the patient and the location, morphology, and boundary of the lesion [34]. In this study, the two body positions were paired to compensate for the macroscopic directional features, which follows the radiologist's observation habits, and the analysis of multiple mammographic views is beneficial to the radiomics process. This type of tool may help radiologists in assessing the investigated breast and in choosing the appropriate follow-up without resorting to histology. It has been well established that radiologists are better able to interpret mammograms when two mammographic views are available [26]. Consequently, a radiomic model with two mammographic projections was observed to assist in the diagnosis of breast lesions [35].
In previous studies [36], radiomics features were extracted from different views, but most focused on a single view in the CC or MLO position, and the single-view approach involved manual intervention: two radiologists jointly selected only one CC or MLO image per lesion that better represented the integrity and heterogeneity of the lesion and extracted its textural features for inclusion in the data analysis.
There is no evidence to confirm whether this human intervention is the best method, so we followed up the previous study with a methodological comparison and confirmed that human intervention is still controversial in the process of computer-aided diagnosis [26,37]. The comparison between single- and dual-view models demonstrated that human intervention was less effective than relying only on machine learning to extract features, which incorporates all information without trade-offs. This is not difficult to understand: in the process of manual selection, some factors are unintentionally lost while others are obtained, and the presence of each factor has an impact on the final results. In recent research, Ma et al. demonstrated that combining radiomics data from both CC and MLO positions had good classification performance between HER2-enriched BC and non-HER2-enriched BC [38]. Intuitively, we expect that there would only be value in merging data from two views if they provide complementary information. By combining features extracted from two different mammographic views with risk factors for breast cancer, the predictive model showed promising performance in identifying benign and malignant breast lesions on mammograms.
In this research, it can be seen that clinical features, first-order features, and gray-level features account for a large proportion of the model inputs and are important for the differential diagnosis of benign and malignant breast lesions. First-order statistics describe the size and morphology of the tumor, and the shape regularity of the lesion is usually negatively correlated with the degree of malignancy. Because breast cancer often grows invasively and is mostly irregular, this is consistent with the findings of Hui et al. [8]. Gray-level features reflect tumor heterogeneity objectively and can identify characteristics that cannot be distinguished by the naked eye. In previous studies, the gray-level co-occurrence matrix has been reported to be highly repeatable, which supports its ability to identify benign and malignant breast lesions. In a recent study, Shuxian Niu found that radiomics based on mammography performed better than MRI in the prediction of Luminal A and B [39], which inspired us to explore radiomics based on dual-view mammography for the prediction of breast cancer molecular subtypes.
Admittedly, our study has some limitations. Firstly, the data set included in this research is imbalanced, not all lesions were confirmed by pathological analysis, and pathological features were not included in our research. However, data imbalance and a lack of pathological results are common in clinical practice. The promising finding was that, despite the imbalanced data and the absence of a gold standard for the training data, a model with good predictive efficacy for lesion malignancy was built, and it performed well on data with precise pathological results. Secondly, using a combination of dual-view mammograms as input showed better performance in differentiating between benign and malignant breast lesions, but in reality both CC and MLO position images are not available for every lesion, especially for asymmetry and calcification. Further work is needed to construct a predictive model using more pathological data and external validation, and to compare the two-view correlation of features for subgroups of data, including masses versus calcification and benign versus malignant lesions, to investigate the correspondence in radiomics features between the MLO and CC mammographic views of breast lesions.
In conclusion, the model incorporating the radiomics signature and clinical risk factors may facilitate the individualized, preoperative prediction of malignant potential in breast cancer. Currently, single-modal radiomics is often used, but it cannot capture the complexity of the tumor ecosystem. In this study, we obtained more histological and clinical characteristics of the tumor ecosystem through multi-view integration and machine learning to predict the benignity and malignancy of breast lesions, and this approach can also be used to develop other cancer prediction models [40].

Fig. 2 Examples of delineating the region of interest in a 52-year-old woman with a 2.24 cm mass in the right breast. a The CC-view and b the MLO-view images. CC, craniocaudal; MLO, mediolateral oblique

Fig. 3 Heat map of radiomic features after feature selection. Development set included the training set and validation set. GLCM, gray-level co-occurrence matrix features; GLRLM, gray-level run length matrix features; GLSZM, gray-level size zone matrix features;

Fig. 6 The receiver operating characteristic (ROC) curves of different radiomics models in the testing cohort. a Single-view radiomics. b Single-view radiomics with clinical features. c Dual-view radiomics. d Dual-view radiomics with clinical features

Fig. 7 The receiver operating characteristic (ROC) curves of three different position models in the testing cohort using a radiomics features and b radiomics with clinical features. AUC, area under the receiver operating characteristic curve; CC, craniocaudal; MLO, mediolateral oblique

Table 1
Clinic-radiological characteristics in the training and testing cohorts

Table 4
The P values for the difference in AUCs between different predictive models. AUC, area under the receiver operating characteristic curve; CI, confidence interval; CC, craniocaudal; MLO, mediolateral oblique

Table 5
The P value for distribution of radiomic features between benign and malignant lesions. *p < 0.05; GLCM, gray-level co-occurrence matrix; GLSZM, gray-level size zone matrix; GLDM, gray-level dependence matrix