Performance of a novel Convolutional Neural Network with the incorporation of automatic segmentation of peritumoral region in shear-wave elastography for predicting breast cancer

doi:10.21203/rs.3.rs-2114378/v1

Download PDF

Research Article

Performance of a novel Convolutional Neural Network with the incorporation of automatic segmentation of peritumoral region in shear-wave elastography for predicting breast cancer

https://doi.org/10.21203/rs.3.rs-2114378/v1

This work is licensed under a CC BY 4.0 License

Version 1

posted

You are reading this latest preprint version

Objective: Peritumoral stiffness of lesions measured by shear-wave elastography (SWE) can improve the diagnostic specificity of the conventional ultrasound(US) in breast cancer. Our aim was to develop dual-modal CNN models based on combining US images and SWE of peritumoral region to improve prediction of breast cancer.

Materials and Methods: We retrospectively collected US images and SWE data of 1271 BI-RADS-4 breast lesions from 1116 female patients (mean age ± standard deviation, 45.40 ± 9.65 years). The lesions were divided into three subgroups based on the maximum diameter (MD): ≤15 mm; >15 mm and ≤25 mm; >25 mm. We recorded lesion stiffness (SWV1) and 5-point average stiffness of the peritumoral tissue (SWV5). Based on the segmentation of different widths of peritumoral tissue (0.5 mm, 1.0 mm, 1.5 mm, 2.0 mm) and internal SWE image of the lesions, CNN models were built using PP-LiteSeg and EfficientNet-B0 architecture. All single-parameter CNN models, dual-modal CNN models, and quantitative SWE parameters inthe training cohort (971 lesions) and the validation cohort (300 lesions) were assessed by receiver operating characteristic (ROC) curve.

Results: The dual-modal CNN models based on the combination of US and SWE segmentation images had better diagnostic performance than the quantitative SWE parameters, US CNN model, and SWE CNN models in predicting breast cancer (P < 0.001 for all). The US + 1.0 mm SWE model achieved the highest area under the ROC curve (AUC) in the subgroup of lesions with MD ≤15 mm in both the training (0.94) and the validation cohorts (0.91). In the subgroups with MD between 15 and 25 mm and above 25 mm, the US + 2.0 mm SWE model achieved the highest AUCs in both the training cohort (0.96 and 0.95, respectively) and the validation cohort (0.93 and 0.91, respectively).

Conclusion: The dual-modal CNN models based on the combination of US images and peritumoral region SWE images allow accurate prediction of breast cancer.

convolutional neural networks (CNN)

shear-wave elastography (SWE)

peritumoral stiffness

segmentation

breast cancer

Breast cancer, the main cause of cancer death in women worldwide, has replaced lung cancer as the malignancy with the highest incidence rate [1]. The morbidity of breast cancer in Asian women with dense breasts is even 4–6 times higher than that in western women with fatty breasts [2,3]. Because of its capability of real-time dynamic imaging and imaging of dense breast tissue, breast ultrasound (US) has been recognized as the main imaging method for diagnosing breast cancer [4-6].At present, the Breast Imaging Reporting and Data System (BI-RADS) is used for the evaluation and follow-up recommendations of breast lesions detected by US. The idea of BI-RADS is to ensure standardized diagnosis of breast lesions; however, radiologists’ subjective classification differences may affect diagnostic performance, thereby leading to overtreatment, especially for lesions with BI-RADS 4 category [5-6].

Previous studies have shown that shear-wave elastography (SWE) may improve the diagnostic specificity of the conventional US for breast cancer, even in cases of small or interval cancer [4-7]. Although the value of a lesion’s stiffness measured by SWE is objective, inhomogeneity within the lesion (hemorrhage, calcification, and cystic appearance) introduces subjective bias, which influences the final measured value, thereby resulting in diagnostic differences. The diagnostic process highly relies on visual interpretation of image information by experienced radiologists; hence, it is time-consuming, subjective, and limits the diagnosis accuracy, which may cause low specificity (false-positive detection) and lead to unnecessary biopsies [7-8].

Previous studies have confirmed that the assessment of peritumoral stiffness of breast lesions can improve the accuracy of SWE in predicting breast cancer, considering desmoplastic reaction and tumor cell infiltration into the peritumoral stroma [9-12]. Several other studies have demonstrated that the peritumoral invasion is an independent prognostic factor significantly associated with an increased risk of relapse and death in node-negative breast cancer patients [13,14]. However, it is difficult to distinguish the boundary between the normal and tumor tissue in SWE [13]. Therefore, peritumoral stiffness of breast lesions is highly dependent on radiologists’ experience rather than on the integrated high-throughput imaging information [12-14]. Thus, it is desirable to develop approaches using artificial intelligence (AI) to integrate high-throughput imaging information that cannot be directly identified by unaided eye, so as to offer assistance to radiologists and improve the efficiency and accuracy of breast cancer diagnosis.

US images are well suited for deep learning as they have rich and consistent data sources. The deep learning–based methods consider the raw US image pixels and corresponding class labels from the medical imaging data as the input, and automatically learn the feature representations [15-17]. Recently, deep convolutional neural network (CNN)–based approaches have been considered as a stable, effective approach for the feature extraction, detection, and classification of US images in breast cancer diagnosis [15-18]. However, most of the CNN models used in the diagnosis of breast cancer have been based on the US or SWE images of intratumoral tissue rather than peritumoral tissue. Judging from the results of previous studies, peritumoral stiffness of breast lesions is an accurate predictor of breast cancer [9-12]. However, based on the traditional SWE technology, it is difficult to obtain accurate SWE image of the peritumoral tissue, and it is hard to estimate which width of the peritumoral tissue should be evaluated to provide the optimal diagnostic index of benign and malignant lesions [13].

To the best of our knowledge, no studies have used CNN-based AI diagnostic systems to predict breast cancer based on peritumoral region’s SWE image.Therefore, the purpose of this study was to develop a dual-modal CNN model based on peritumoral SWE image of breast lesions and examine its diagnostic performance in breast cancer. The contributions of this work are threefold: first, we improved the novel architecture for real-time semantic segmentation, named PP-LiteSeg and EfficientNet-B0 architecture, which can automatically recognize the location of breast lesions in B-mode US images. After mapping the lesions’ boundaries detected on B-mode US images to SWE images, segmentation of different widths (0.5 mm, 1.0 mm, 1.5 mm, 2.0 mm) of the peritumoral tissue can be automatically completed. Then, we aimed to evaluate the predictive performance of each CNN model based on different peritumoral widths for breast cancer. Finally, the dual-modal CNN prediction models for breast cancer based on effective peritumoral region SWE segmentation images were established.

Study population

This retrospective study was approved by the institutional ethics committee of the First Affiliated Hospital of the University of Science and Technology of China (USTC). Informed consent was not required from the patients because only anonymous US images were analyzed retrospectively.

Between December 2019 and April 2022, the initial population included 1876 breast lesions in 1532 consecutive patients who had undergone US and SWE examinations. The inclusion criteria were as follows: (i) BI-RADS 4 category of breast lesions according to the American Society of Radiology fifth edition mammography Breast Imaging Report and Data System (BI-RADS); (ii) solid or cystic solid breast lesions examined by B-mode ultrasound (US) and SWE; (iii) core needle biopsy or surgical resection performed to obtain accurate pathological results. The exclusion criteria were as follows: (i) radiotherapy, chemotherapy, or biological treatment before ultrasonic examination; (ii) a history of breast surgery (including excision or plastic surgery); (iii) pregnancy or lactation; (iiii) non-mass lesions or larger lesions (larger than 40 mm), which were beyond the maximum range of the SWE sampling frame. Finally, a total of 1271 BI-RADS 4 lesions from 1116 female patients were analyzed in this study. The lesions were assigned to the training cohort (971 lesions) and the verification cohort (300 lesions) by random sampling at an approximate ratio of 3:1 (Fig. 1).

Dual-modal image acquisition and preprocessing

All of the US and SWE ultrasound examinations were performed by breast radiologists using the Siemens ACUSON Sequoia (Siemens Healthcare GmbH, USA) US system equipped with 10 MHz linear array transducers. We acquired and stored transverse and longitudinal static images with the maximum diameter of breast lesions on US, and images containing the lesion’s characteristics (such as calcification, angulation, and speculation sign) were also stored. All breast lesions included in this study were of BI-RADS 4 category. The category of each lesion was reassessed by two radiologists with more than eight years of working experience, and another radiologist with twelve years of breast examination experience was consulted to reach a final decision when disagreements occurred.

SWE examination of each lesion was performed in machine default modes, and we adjusted the size of a rectangular region of interest (ROI) to cover the whole lesion. Then, the probe was kept still for 3–5 seconds; meanwhile, the patient was asked to hold breath. We stored a static SWE image of the lesion and the surrounding tissue without measurement (for image segmentation); the lesion was put in the middle of the SWE region ensuring that the ROI included the lesion and at least 2 mm of the surrounding breast tissue.Then, quantitative SWE parameters were measured as follows: (i) A round ROI containing the lesion was measured and its shear-wave velocity (SWV) was recorded as SWV1, which represents the average stiffness of the lesion. (ii) Five 3-mm-wide round ROIs were selected for measurement; four of them were placed at locations adjacent to the lesion (including peritumoral and intratumoral tissues), and one ROI was placed inside of the lesion, and the average SWV was recorded as SWV5, which represents the average stiffness of the lesion including the peritumoral tissue (Fig. 2). All SWE images were taken from the longitudinal and transverse section of the breast lesions, and two senior breast radiologists with more than eight years of clinical experience performed all of the SWE examinations. The radiologists in this study were blinded to the patients’ clinical data and pathological results. All of the US and SWE images extracted from the Siemens database sites were in .jpg format. To maintain a high quality of US images, all images were screened, and low-quality images containing severe artifacts or significant image resolution reductions were removed.

Theoretically, the peritumoral tissue affected by breast cancer of different sizes shows statistically different results on an SWE diagram, and the peritumoral tissue radiated by small lesions is less widespread than that affected by large lesions. In order to determine whether the size of lesions has an impact on the selection of optimal peritumoral region SWE images of the lesions, in this study we divided the lesions to three subgroups depending on their maximum diameter (MD) (≤15 mm; >15 mm and ≤25 mm; >25 mm).

All of the breast lesions were pathologically confirmed through US-guided core needle biopsy after US and SWE examination. A total of 712 lesions were resected after malignant, atypical, or high-risk core biopsy result, and the final diagnosis was confirmed by surgical pathology. Benign lesions were followed-up with US for at least 18 months, and lesion stability was confirmed in all patients.

Labeling and segmentation of the peritumoral region

We constructed a CNN segmentation model for segmentation of lesions on SWE images, and compared it with the performance of manual segmentation to evaluate the consistency between the CNN segmentation model and the observers.

Traditionally, color elastography images are often not suitable for deep learning or induce instability because boundaries of lesions are difficult to label. Therefore, for the segmentation of the interior of the lesion, we labeled the SWE images according to the lesion boundary on the US image.First, we pretrained two backbone models of PP-LiteSeg and EfficientNet-B0 using Simsiam network architecture to identify the lesions on US images. Simsiam is a siamese network that has shown to be an effective self-supervised method. We used 83590 non-annotated US images of breast lesions and thyroid nodules for pretraining. Then, the lesion’s region in each US image in the training cohort was manually labeled using an open-source annotation tool (Labelme, https://github.com/wkentaro/labelme) by one radiologist (with eight years of experience in breast US) who was blinded to the clinical data and histopathological results of the patients.

After training, we obtained a segmentation model for describing the shape of the lesion on US images. The segmentation model can draw an ROI encompassing breast lesion’s boundary on the two-dimensional image, and then map the lesion’s boundary onto the SWE image and automatically expand the boundary of the peritumoral tissue with different widths (0.5 mm, 1.0 mm, 1.5 mm, 2.0 mm). For each SWE image of breast lesions, six images were finally segmented: the interior of the lesion; the lesion including 0.5 mm of peritumoral tissue; the lesion including 1.0 mm of peritumoral tissue; the lesion including 1.5 mm of peritumoral tissue; the lesion including 2.0 mm of peritumoral tissue; and the whole rectangular SWE ROI image.

The segmentation model comprised encoder, aggregation, and decoder. It was implemented on PaddlePaddle (https://github.com/PaddlePaddle/PaddleSeg). Compared to the default config, we removed some data augmentations including ResizeStepScaling and RandomPaddingCrop, added rotation augmentations, modified the num_classes to 2, and limited input_size to 320×320. OHEM loss was selected according to the better performance in a small target binary segmentation than cross-entropy loss. The process of model pretraining and segmentation network construction is shown in Fig. 3.

To achieve high consistency of the automatic segmentation model, the Dice similarity coefficient, Cohen’s kappa, Hausdorff 95 (95% HD), and the segmentation metrics (area, major axis length, and minor axis length) were used to evaluate each lesion’s pixel and boundary consistency of the three radiologists and the CNN segmentation model in the validation cohort. Each of the three radiologists had eight years of experience in breast US and was blinded to the clinical data and histopathological results of the patients.

CNN-based predictive model building based on the segmentation of peritumoral region SWE images

For predicting breast cancer, we built benign and malignant binary classification CNN models by EfficientNet-B0. Seven single-parameter CNN models were trained based on seven individual input images (Fig. 4). Five of them were segmented on SWE (Internal SWE CNN, 0.5 mm SWE CNN, 1.0 mm SWE CNN, 1.5 mm SWE CNN, 2.0 mm SWE CNN); one was segmented on US image; and the last one was the whole heat map region on SWE image. EfficientNet-B0 architecture is shown in Fig. 4. The optimizer was SGD with 0.9 momentum and 1e-4 weight-decay; the loss function was cross entropy loss; batch was 128; and the learning rate was 0.01. Data augmentation included random horizontal flip, random brightness, random contrast, and random saturation. We implemented them in an open-source machine learning framework (PyTorch, version is 1.10, https://pytorch.org).

CNN-based predictive model building based on the fusion of peritumoral region segmentation SWE images and US images

Furthermore, six corresponding US + SWE image CNN models (US + Internal SWE CNN; US + 0.5 mm SWE CNN; US + 1.0 mm SWE CNN; US + 1.5 mm SWE CNN; US + 2.0 mm SWE CNN; US + ROI SWE CNN) were also trained based on the fusion input binary classification CNN by using both US and SWE images as the network input. We fused the two images on the first convolution layer of the network. For fusion input, we modified the EfficientNet-B0, which increased input channel of the first convolution layer from 3 to 6. Correspondingly, two 3-channels images were concatenated into 6-channels on the channel axis and then resized into 224×224. Other parameters of the model were the same as before, and the model architecture is shown in Fig. 5.

Statistical analysis

Statistical analysis was performed using commercially available SPSS software (version 19.0; Chicago, USA). All numerical data were presented as the mean ± standard deviation. The Shapiro–Wilk test was used to verify whether the quantitative data were normally distributed. The Mann–Whitney U tests were used to compare the continuous variables between the benign and malignant groups or between the three subgroups in the training and validation cohorts. To evaluate the consistency between the radiologists and the CNN segmentation model segmentation, we utilized the sensitivity, specificity, Dice coefficient, Cohen's kappa, and 95% Symmetric Hausdorff Distance. The Pearson correlation coefficients and Wilcoxon signed-rank tests were used to determine whether the CNN segmentation model’s performance aligned with that of the radiologists. We used three metrics (area, length of major axis, and length of minor axis) to evaluate the consistency of segmentation. The performances of all predictive CNN models and two quantitative SWE parameters were assessed using receiver operating characteristic (ROC) curve analysis. ROC was also used to calculate the corresponding sensitivity (SEN), specificity (SPE), positive predictive value (PPV), negative predictive value (NPV), and area under the ROC curve (AUC). The McNemar test was used for paired comparisons of proportions. All statistical tests were two-sided, and p values lower than 0.05 were considered to indicate statistical significance.

Clinicopathologic characteristics of all breast lesions

The clinicopathologic characteristics and subgroups of all 1271 breast lesions are summarized in Table 1. Among the 1271 BI-RADS 4 lesions, 749 (58.9%) were malignant and 522 (41.1%) were benign, as shown in Table 2. The mean age of the entire cohort was 45.40±9.65 years (range, 19–79 years); the mean age in the malignant group was greater than that in the benign group (51.29±7.29 vs. 39.56±6.73 , P < 0.001), and there was no significant difference in the mean age between the training and the validation cohorts (P > 0.05 for all). The maximum diameter of the malignant group was larger than that of the benign group (19.98±5.22 mm vs. 14.55±7.74 mm, P < 0.001), but there was no difference between the training cohorts and the validation cohorts (P > 0.05 for all). The lesions were divided into subgroups depending on the maximum diameter. The subgroup with the maximum diameter of lesions (MD) ≤15 mm included 218 lesions (167 in the training cohort and 51 in the validation cohort; 89 benign and 129 malignant); the subgroup with MD between 15 mm and 25 mm included 779 lesions (598 in the training cohort and 181 in the validation cohort; 327 benign and 452 malignant); and the subgroup with MD >25 mm included 274 lesions (206 in the training cohort and 68 in the validation cohort; 112 benign and 162 malignant).

Quantitative SWE parameters of all breast lesions

Considering 1271 lesions, SWV values of the malignant group were significantly higher than those of the benign group, including the intratumoral stiffness (SWV1: 3.76±0.78 m/s vs. 1.85±0.65 m/s; P < 0.01) and the peritumoral stiffness (SWV5: 4.02±0.82 m/s vs. 1.67±0.74 m/s; P < 0.01).

SWV5 values were significantly higher than SWV1 values in the malignant group; SWV5 values were lower than SWV1 values in the benign group (all P < 0.01). SWV5 and SWV1 values in the subgroup with 15 mm <MD ≤25 mm were higher than those in the subgroups with MD ≤15 mm and MD >25 mm, because there were more malignant lesions in this subgroup (all P < 0.01).

Consistency analysis of lesion segmentation manually vs. CNN model

We calculated the average values of the segmentation metrics’ intra- and interrater consistency of the three radiologists, as shown in Table 3. Wilcoxon signed-rank tests were conducted, and Pearson’s correlation coefficients were calculated using geometric features extracted from pairwise comparison metrics in the radiologists’ segmentations. Wilcoxon tests indicated that the CNN segmentation model satisfactorily matched the performance of the radiologists regarding sensitivity, specificity, the Dice coefficient, Cohen’s kappa, and 95% Symmetric Hausdorff Distance (P > 0.05). The Dice coefficient and Cohen's kappa had the best values in Radiologist 2-CNN (0.83; 0.82); the specificity had the best values in Radiologist 3-CNN (0.99); and the 95% Symmetric Hausdorff Distance had the best values in Radiologist 1-CNN (1.19 mm) (Table 3). The box-plot diagrams show the intra- and interrater consistency of the radiologists and the CNN model compared to the radiologists (Fig. 6). The Pearson’s correlation coefficients showed a strong relationship between the area, major and minor axis length between all of the observers (radiologists vs. CNN model r = 0.98, 0.97, 0.99). As shown in Bland–Altman plots, the differences between the CNN model and the three radiologists in segmentation area, major axis length, and minor axis length were almost 0 (Fig. 7).

Diagnostic performance of the quantitative SWE parameters, US CNN model, and SWE CNN models for predicting breast cancer in the training and the validation cohorts

The diagnostic performances of quantitative SWE parameters, US CNN models, and SWE CNN models in both the training and the validation cohorts are summarized in Table 4.

Among these three single models, the 1.0 mm SWE image CNN model had the highest AUC, ACC, sensitivity, and specificity for predicting breast cancer in the subgroup with MD ≤15 mm both in the training and in the validation cohort (0.81, 79.26%, 68.86%, 82.52% vs. 0.75, 74.49%, 62.97%, 78.53%).

In the subgroups with 15 mm <MD ≤25 mm and MD >25 mm, the 2.0 mm SWE image CNN model had the highest AUC, ACC, sensitivity, and specificity both in the training cohort (15 mm <MD ≤25 mm: 0.85, 82.64%, 66.24%, 80.33%; MD >25 mm: 0.84, 80.34%, 69.34%, 82.63%) and the validation cohort (15 mm <MD ≤25 mm: 0.81, 78.87%, 63.44%, 76.85%; MD >25 mm: 0.78, 77.73%, 65.73%, 77.64%).

Regardless of the grouping method, the AUCs and ACC of the SWE image CNN models, except for the 0.5 mm and the internal SWE image CNN models, were all better than SWV5 and the US CNN model in predicting breast cancer (all P < 0.05). There was no significant difference between the SWV5 and the US CNN model in sensitivity, specificity, and AUCs (all P > 0.05, Fig. 8).

Diagnostic performance of the US + SWE dual-modal CNN model for predict breast cancer in the training and the validation cohorts

In both the training and the validation cohort, the overall performances of the US + SWE image CNN models were slightly higher than those of the corresponding single CNN models.

The US CNN + 1.0 mm SWE model achieved the highest AUC for MD ≤15 mm both in the training cohort (0.94) and in the validation cohort (0.91) (Fig. 8). In the subgroup with MD ≤15 mm, the US + 1.0 mm SWE CNN model achieved the highest ACC, sensitivity, and specificity in the training cohort (88.67%, 78.53%, 85.63%, respectively) and the validation cohort (85.54%, 77.72%, 82.52%, respectively).

In the subgroups with 15 mm <MD ≤25 mm and MD >25 mm, the US CNN + 2.0 mm SWE model achieved the highest AUCs both in the training cohort (0.96, 0.95, respectively) and in the validation cohort (0.93, 0.91, respectively) (Fig. 8). Comparing the results obtained by the US + SWE dual-modal CNN model to those of the US images CNN model, there were 6.2%, 5.7%, and 8.7% average percentage increases for ACC, sensitivity, and specificity, respectively, where the improvement in specificity was significant, indicating that the dual-modal CNN model can improve specificity without loss of sensitivity for classifying breast cancer.

Similarly, the US + 2.0 mm SWE CNN model achieved the highest ACC, sensitivity, and specificity in the training and the validation cohorts for 15 mm <MD ≤25 mm (91.34%, 75.63%, 86.98% and 90.76%, 72.53%, 84.22%) and MD >25 mm (86.65%, 80.32%, 87.44% and 84.23%, 77.38%, 84.75%).

The most important contribution of this work is the introduction of the dual-modal CNN architecture to predict breast cancer. We adopted the deep learning algorithm–assisted strategy for clinical diagnosis of breast cancer based on the automatic segmentation of peritumoral region on ultrasound SWE images. Our dual-modal CNN architecture not only improved the SWE diagnostic accuracy of breast cancer, reaching 90.76%, but also allowed to identify the effective peritumoral region around the breast lesions, which can achieve more accurate, efficient, and superior diagnostic performance in breast cancer.

The BI-RADS category based on the conventional US has been frequently used to diagnose breast lesions, and it is primarily based on the morphological features visible on US images. This approach has high sensitivity but relatively low specificity, thereby leading to unnecessary biopsy and excessive diagnosis [19,20]. The SWE examination has been reported to be able to increase specificity and sensitivity for predicting breast cancer [8-10].Among the quantitative SWE parameters, we showed that SWV5 values were higher than SWV1 values in the malignant group (all P < 0.001), while there was no significant difference between SWV1 and SWV5 in the benign group . SWV5had higher AUCs than the SWV1 both in the training and in the validation cohorts for predicting breast cancer, regardless of the subgroup . Our results demonstrated that stiffness that includes some peritumoral tissue is a better indicator of breast cancer, which is consistent with our previous studies [21-23] and other studies [11-14,24], because the peritumoral region of breast cancer has abnormally stiff collagen fibers, which is related to cancer fibroblasts, as well as infiltration of cancer cells into the surrounding tissue [9-12].The peritumoral stiffness of breast lesions has been shown to be a robust effective predictor of benign and malignant lesions [23-24,25]; however, based on the traditional SWE technology, it is difficult to accurately select the surrounding tissues and obtain accurate peritumoral stiffness value. This may be the reason why SWE technology has not yet been widely applied to clinical diagnosis of breast cancer.

Although SWE imaging-based deep learning approaches have been demonstrated to improve the accuracy of breast cancer diagnosis [27.28], few studies have evaluated the value of intra- and peritumoral regions in the prediction of breast cancer, and no studies have shown the diagnostic efficacy of CNNs segmenting different widths of peritumoral region of breast cancer.Given that we hypothesized that lesions with different diameters have differential infiltration into the peritumoral tissue, we grouped the lesions depending on their maximum diameter.

In the subgroup with MD ≤15 mm, comparing the results obtained by dual-modal CNN models integrating US and SWE images to those of just using US or SWE images single CNN models, the accuracy, sensitivity, and AUC scores were all improved, with specificity increasing most significantly (average 7.8%). Such an improvement in specificity may be because SWE images provide information on the stiffness of lesions, complementing the US image diagnosis of breast cancer from another dimension. The US + 1.0 mm SWE model achieved the highest AUC in diagnosing breast cancer both in the training (0.94) and in the validation cohorts (0.90) .

In the subgroups with 15 mm <MD ≤25 mm and MD >25 mm, the dual-modal US + SWE model still had higher AUC scores, accuracy, sensitivity, and specificity than any of the single-parameter CNN models. The US + 2.0 mm SWE model achieved the highest AUC in predicting breast cancer both in the training (0.96; 0.95) and in the validation cohort (0.93; 0.91). These results show that the most effective SWE image region for predicting breast cancer is the area containing 2.0 mm of peritumoral tissue. Similarly, for the lesions with the maximum diameter ≤15 mm, the most effective image region for SWE to predict breast cancer is the area containing 1.0 mm of peritumoral tissue.Regardless of the group, the 0.5 mm SWE CNN model and the US + 0.5 SWE dual-modal CNN model showed no significant difference from the corresponding lesion’s internal SWE CNN model, possibly because stiffness of the 0.5 mm peritumoral tissue was very close to intratumoral tissue stiffness.

If deep learning can be regarded as a tool for assisting radiologists, the reliability and repeatability of automatic segmentation are crucial. In our research, the segmentation CNN model achieved stable consistency with all three radiologists in terms of Dice coefficient and Cohen's kappa. The difference of each observer in the segmentation area and the maximum diameter of the transverse and longitudinal sections of the lesions was very close to 0, indicating that the CNN segmentation model has excellent consistency with radiologists.

In the wake of advanced deep learning algorithms, machine learning methods based on manual extraction of radiomics signatures are rapidly becoming obsolete (29,30). The CNN architecture does not require manual input of radiomics signatures, but automatically selects useful features from US images for classification; this improves diagnostic performance and efficiency while minimizing artificial false-positive or false-negative errors [31,32]. However, it is unknown which region of the CNN model based on US and SWE images can improve the diagnostic efficiency of breast cancer, so it is called a black-box learning method, which may lead to ambiguous interpretation of CNN features [33,34].In our study, first, we constructed an image segmentation model to automatically segment the effective area (the optimal peritumoral width) on SWE images that is conducive to the diagnosis of breast cancer, and then integrated the automatic segmentation model into our CNN prediction model, thereby breaking the previous CNN black-box learning mode and greatly improving the efficiency of the CNN model in the diagnosis of breast cancer.

The results of this study showed that the CNN models combining US images with SWE images containing 1 mm or 2 mm of the peritumoral tissue were superior to whole SWE ROI image models in predicting breast cancer both in the training and in the validation cohorts, indicating that the optimal predictor of breast cancer is not the whole SWE image, but the area covering a certain peritumoral tissue.Our dual-modal US + SWE models achieved superior performance by more fully capturing the comprehensive information of breast lesions. The AUCs were between 0.91 and 0.96, and the ACCs were between 84.23% and 91.34%, indicating that our method based on peritumoral tissue has the most advanced breast cancer classification ability compared with the single models. With high-speed CNN algorithm, our CNN model can be used to quickly evaluate the SWE images of breast lesions, thereby improving the work efficiency of radiologists and reducing their SWE inspection workload.

There are some limitations to this study and directions for future work. First of all, this study only considered BI-RADS 4 lesions, which are difficult to diagnose, and the other categories of breast lesions were not included. Therefore, there are some deviations in the study samples. We will recruit other categories of breast lesions in future to further evaluate our methods. Second, our study was implemented by reviewing US and SWE images in one center only. Therefore, a larger data cohort acquired from multiple centers with different models of US equipment is necessary to create a more comprehensive training cohort. In the future, we hope to expand the sample size through multicenter research and build more a comprehensive CNN model that can predict the molecular typing and axillary lymph node metastasis of breast cancer.

In conclusion, the dual-modal CNN models based on the combination of US images and peritumoral region’s SWE allow accurate prediction of breast cancer. Moreover, when lesion’s diameter is ≤15 mm, the best diagnostic SWE image area for predicting breast cancer contains 1.0 mm of the peritumoral region. When lesion diameter is >25 mm, or between 15 mm and 25 mm, the SWE image containing 2.0 mm of the peritumoral region is the optimal diagnostic area for predicting breast cancer.

US: Ultrasound; SWE: Shear wave elastography; SWV: Shear wave velocity; MD: maximum diameter; CNN: Convolutional Neural Network; AUC: Area under the curve; ROC: Receiver operating characteristic; ROI: Region of Interest

Acknowledgements

The authors would like to thank Dr. Hu L for insightful comments regarding statistical analysis and Dr. He HA, Dr. Pei C, Dr. Cui YY, for their assistance in data acquisition at diferent periods during the patient studies. Also, the authors would like to thank Mr.Liu X for scanning patients and Mr.Liu Zhen for model building.

Authors’ contributions

HL contributed to conceptualization, methodology, investigation, visualization, validation, supervision, project administration, and writing—review and editing. HNA contributed to conceptualization, methodology,investigation, validation, supervision, project administration, resources. XL contributed to data curation, image processing, statistical analysis, formal data analysis, and writing—original draft, review and editing. LZ contributed to software, visualization and model building. PC contributed to formal analysis, and data analysis. LX contributed to validation,image processing, writing—review and editing. CYY contributed to image processing, validation and formal data analysis. WYQ contributed to validation and writing—review and editing. YL contributed to datacuration and image processing. All authors read and approved the final manuscript.

Funding

This study has no funding.

Availability of data and materials

The data that support the findings of this study are available from the corresponding author upon reasonable request. The requested data may include figures that have associated raw data.

Code availability

The custom code or mathematical algorithms that are deemed central to the conclusions are available from the corresponding author upon request.

Ethics approval and consent to participate

This retrospective study was approved by the ethics committee of the First Affiliated Hospital of University of Science and Technology of China, and the requirement to obtain an informed consent was waived (Ethical Approval NO.:2022–RE–164).

Consent of publication

Not applicable.

Competing interests

The authors do not have competing interests to declare.

Author details

¹ Department of Ultrasound, The First Affiliated Hospital of USTC, Division of Life Sciences and Medicine, University of Science and Technology of China, Hefei, Anhui, 230001, P.R. China; ² Hebin Intelligent Robots Co., LTD., Hefei, 230027, PR China

³Department of Respiratory and critical care medicine, The First People's Hospital of Hefei City, The Third Affiliated Hospital of Anhui Medical University, Hefei 230001, China ⁴Anhui Medical University,Hefei, Anhui, 230001, P.R. China

Hyuna S, Jacques F, Rebecca LS, Laversanne M, Isabelle S, Ahmedin J, et al.Global Cancer Statistics 2020:GLOBOCAN Estimates of Incidence and Mortality Worldwide for 36 Cancers in 185 Countries. Ca-Cancer J Clin. 2021;71(3):209-249.
John AS, Karla K. Do fatty breasts increase or decrease breast cancer risk?Breast Cancer Res. 2012;14(1):102.
Daniëlle VW, André LMV, Mireille JMB. Breast density and breast cancer-specific survival by detection mode. BMC cancer. 2018;18(1):386.
Young KM, Soo-Yeon K, Soo KY, Kim ES, Chang JM. Added value of deep learning-based computer-aided diagnosis and shear wave elastography to b-mode ultrasound for evaluation of breast masses detected by screening ultrasound. Medicine. 2021;100(31):e26823.
Evans A, Whelehan P, Thomson K, McLean D, Brauer K, Purdie C, et al.Quantitative shear wave ultrasound elastography:initial experience in solid breast masses. Breast Cancer Res. 2010;12(6):R104.
Gu J, Ternifi R, Larson NB, Carter JM, Boughey JC, Stan DL, et al. Hybrid high-definition microvessel imaging/shear wave elastography improves breast lesion characterization. Breast Cancer Res. 2022;24(1):16.
Tomoyuki F, Leona K, Kazunori K, Mori M, Kikuchi Y, Kato A, et al. Classification of Breast Masses on Ultrasound Shear Wave Elastography using Convolutional Neural Networks. Ultrasonic Imaging. 2020;42(4-5):213-220.
Choi HY, Seo M, Sohn YM, Hwang JH, Song EJ, Min SY, et al. Shear wave elastography for the diagnosis of small ( 2 cm) breastlesions:Added value and factors associated with false results. Br J Radiol. 2019;92 20180341.
Xiang Z, Ming L, Yang ZH, Zheng CS, Wu JY, Ou B, et al. Deep Learning-Based Radiomics of B-Mode Ultrasonography and Shear-Wave Elastography:Improved Performance in Breast Mass Classification. Front Oncol. 2020;10:1621.
Xie X, Zhang Q, Liu S, Ma Y, Liu Y, Xu M, et al. Value of quantitative sound touch elastography of tissues around breast lesions in the evaluation of malignancy. Clin Radiol. 2021;76(1):79.e21-79.e28.
Zhang L, Xu JF, Wu HY, Liang WY, Ye XQ, Tian HG, et al. Screening Breast Lesions Using Shear Modulus and Its 1-mm Shell in Sound Touch Elastography. Ultrasound Med Biol. 2019;45(3):710-719.
Kong WT, Wang Y, Zhou WJ, Zhang YD, Wang WP, Zhuang XM, et al.Can measuring perilesional tissue stiffness and stiff rim sign improve the diagnostic performance between benign and malignant breast lesions? J Med Ultrason. 2021;48(1):53-61.
Xu P, Wu M, Yang M, Xiao J, Ruan ZM, Wu LY. Evaluation of internal and shell stiffness in the differential diagnosis of breast non-mass lesions by shear wave elastography. World J Clin Cases. 2020;8(12);2510-2519.
Mao YJ, Lim HJ, Ni M, Yan WH, Wong DW, Cheung JC. Breast Tumour Classification Using Ultrasound Elastography with Machine Learning:A Systematic Scoping Review. Cancers. 2022;14(2).
Zhou JQ, Zhan WW, Dong YJ, Yang ZF, Zhou C. Stiffness of the surrounding tissue of breast lesions evaluated by ultrasound elastography. Eur Radiol. 2014;24(7):1659-67.
Yaghoub P, Esmaeil Z, Mohammad SP, Amin SM. Presentation of Novel Architecture for Diagnosis and Identifying Breast Cancer Location Based on Ultrasound Images Using Machine Learning. Diagnostics (Basel). 2021;11(10).
Agata W, Jacek A, Bartłomiej P. An Automatic Biopsy Needle Detection and Segmentation on Ultrasound Images Using a Convolutional Neural Network. Ultrasonic Imaging. 2021;43(5):262-272.
Jiang MP, Lei SL, Zhang JH, Hou LQ, Zhang MX, Luo YC. Multimodal Imaging of Target Detection Algorithm under Artificial Intelligence in the Diagnosis of Early Breast Cancer. J Healthc Eng. 2022.9322937.
Wang Y, Jung CE, Younhee C, Hao C, Jin GY, Ko SB. Breast Cancer Classification in Automated Breast Ultrasound Using Multiview Convolutional Neural Network with Transfer Learning. Ultrasound Med Biol. 2020;46(5):1119-1132.
Jaeil K, Jung KH, Chanho K, Won Hwa K. Artificial intelligence in breast ultrasonography. Ultrasonography. 2021;40(2):183-190.
Saba T, Abunadi I, Sadad T, Khan AR, Bahaj SA. Optimizing the transfer-learning with pretrained deep convolutional neural networks for first stage breast tumor diagnosis using breast ultrasound visual images. Microsc Res Techniq. 2022;85(4):1444-1453.
Cui Y, He NA, Ye XJ, Hu, L, Xie, L, Zhong W, et al. Evaluation of Tissue Stiffness Around Lesions by Sound Touch Shear Wave Elastography in Breast Malignancy Diagnosis. Ultrasound Med Biol. 2022;48(8):1672-1680.
Lei H，Xiao L，Chong P，Li X，Nianan H. Assessment of perinodular stiffness in differentiating malignant from benign thyroid nodules. Endocr Connect. 2021;(10)5:492~501.
Lei H, Nianan H，Li X, Xianjun Y, Xiao L，Chong P, et al. Evaluation of the perinodular stiffness potentially predicts the malignancy of the thyroid nodule, J Ultrasound Med. 2020,89:3251~3255.
Sun PH, Jung SH, Chang SK, Hee CJ, Young CE, Jung CW, et al. Comparison of peritumoral stromal tissue stiffness obtained by shear wave elastography between benign and malignant breast lesions. Acta Radiol. 2018;59(10):1168-1175.
Sun QC, Lin XN, Zhao YS, Li L, Yan K, Liang D, et al. Deep Learning vs. Radiomics for Predicting Axillary Lymph Node Metastasis of Breast Cancer Using Ultrasound Images:Don't Forget the Peritumoral Region. Front Oncol. 2020;10:53.
Dong FJ, Wu HY, Zhang L, Tian H, Liang W, Ye X, et al. Diagnostic Performance of Multimodal Sound Touch Elastography for Differentiating Benign and Malignant Breast Masses. J Ultras Med. 2019;38(8):2181-2190.
Choi JS, Han BK, Ko ES, Bae JM, Ko EY, Song SH, et al. Effect of a Deep Learning Framework-Based Computer-Aided Diagnosis System on the Diagnostic Performance of Radiologists in Differentiating between Malignant and Benign Masses on Breast Ultrasonography. Korean J Radiol. 2019;20(5):749-758.
Wang QC, Chen H, Luo GN, Li B, Shang HT, Shao H, et al. Performance of novel deep learning network with the incorporation of the automatic segmentation network for diagnosis of breast cancer in automated breast ultrasound. Eur Radiol. 2022.
Misra S, Jeon SW, Ravi M, Seiyon L, Gyuwon K, Chiho K, et al. Bi-Modal Transfer Learning for Classifying Breast Cancers via Combined B-Mode and Ultrasound Strain Imaging. IEEE T Ultrason Ferr. 2022;69(1):222-232.
Jabeen K, Khan MA, Alhaisoni M, Tariq U, Zhang YD, Hamza A, et al. Breast Cancer Classification from Ultrasound Images Using Probability-Based Optimal Deep Learning Feature Fusion. Sensors (Basel). 2022;22(3).
Yi J, Kang HK, Kwon JH, Kim KS, Park MH, Seong YK, et al. Technology trends and applications of deep learning in ultrasonography:image quality enhancement, diagnostic support, and improving workflow efficiency. Ultrasonography. 2021;40(1):7-22.
Gu JH, Tong T, He C, Xu M, Yang X, Tian J, et al. Deep learning radiomics of ultrasonography can predict response to neoadjuvant chemotherapy in breast cancer at an early stage of treatment: a prospective study. Eur Radiol. 2022;32(3):2099-2109.
Kim YS, Lee SE, Chang JM, Kim SY, Bae YK. Ultrasonographic morphological characteristics determined using a deep learning-based computer-aided diagnostic system of breast cancer. Medicine. 2022;101(3):e28621.

Table 1 Distribution of subgroups studied and quantitative SWE parameters of all lesions
	Total	Benign	Malignant	Training cohort	Validation cohort	Subgroup A MD≤15mm	Subgroup B 15mm≤MD≤25mm	Subgroup C MD＞25mm
Age	45.40 ± 9.65	39.56 ± 6.73	51.29 ± 7.29	47.10 ± 6.65	46.87 ± 7.54	47.33 ± 7.45	46.98 ± 6.11	47.28 ± 9.29
lesions (n)	1271	522	749	971	300	218	779	274
Maximum diameter (mm)	17.45 ± 8.51	14.55 ± 7.74	19.98 ± 5.22	18.55 ± 7.74	17.67 ± 6.42	^_	^_	^_
B vs. M	522 vs.749	^_	^_	401 vs. 570	121 vs. 179	89 vs.129	327 vs.452	112 vs. 162
T vs. V	971 vs.300	401 vs.121	570 vs. 179	^_	^_	167 vs.51	598 vs.181	206 vs.68
SWV1(m/s)	2.67 ±1.87	1.85±0.65	3.76±0.78	2.65±1.34	2.71±1.52	2.48±1.35	2.98±0.98	2.65±1.08
SWV5(m/s)	2.98 ±1.65	1.67±0.74	4.02±0.82	2.91±1.42	3.05±1.17	2.66±1.12	3.21±1.02	2.73±0.97
P value	0.000	0.000	0.000	0.000	0.000	0.000	0.000	0.000
MD＝maximum diameter of lesions；T= training cohort; V= validation cohort; B= benign; M= malignant; ^_indicates not applicable

Table 2 Summary of pathologic findings
Histopathologic results		n (%)
Benign (n = 522)	Fibroadenoma	206(39.5)
	Adenosis	136(26.1)
	Intraductal papilloma	59(11.4)
	Adenosis with Fibroadenoma or Intraductal papilloma	54(10.3)
	Others^#	67(12.7)
Malignant (n =749)	Invasive ductal carcinoma	556(74.3)
	Ductal carcinoma in situ	84(11.2)
	Invasive lobular carcinoma	54(7.3)
	Intraductal papillary carcinoma	20(2.6)
	Mucinous carcinoma	13(1.7)
	Others*	22(2.9)

Table 3 Mean values of pairwise comparison metrics between all observers
	Dice	Cohen’s kappa	95% HD	Specificity	Sensitivity
Radiologists 1-2	0.79	0.74	1.45 mm	0.98	0.91
Radiologists 2-3	0.81	0.79	1.34 mm	0.98	0.84
Radiologists 1-3	0.78	0.80	1.26 mm	0.95	0.76
Radiologists 1-CNN	0.80	0.81	1.19mm	0.96	0.84
Radiologists 2-CNN	0.83	0.82	1.29 mm	0.97	0.85
Radiologists 3-CNN	0.82	0.81	1.44 mm	0.99	0.77
The best values are shown in bold. The radiologists are numbered 1, 2, 3, and the model is labeled CNN.

Table 4 Performanc summary of SWE CNN model, US+SWE CNN models and quantitative SWE parameters for predicted breast cancer in the training and validation cohort
		SWE CNN models							US + SWE CNN models							US CNN models	SWV1	SWV5
		Internal	0.5 mm	1.0 mm	1.5 mm	2.0 mm		ROI	Internal	0.5 mm	1.0 mm	1.5 mm	2.0 mm	ROI		US CNN models	SWV1	SWV5
Subgroup of lesions ≤ 15 mm (n= 216 )
AUC	T	0.65	0.70	0.81	0.76	0.75		0.74	0.81	0.89	0.94	0.92	0.90		0.87	0.75	0.70	0.79
AUC	V	0.59	0.66	0.75	0.65	0.64		0.65	0.79	0.85	0.91	0.89	0.88		0.82	0.77	0.65	0.73
ACC%	T	64.23	72.46	79.26	74.70	73.56		70.71	75.65	82.97	88.67	87.32	85.45		84.57	75.54	78.64	81.89
ACC%	V	62.45	68.38	74.49	72.46	72.23		69.55	74.11	80.43	85.54	85.54	82.24		80.98	72.55	74.61	73.57
SEN%	T	47.65	63.69	68.86	66.85	65.64		64.53	57.53	65.64	78.53	76.58	70.32		69.63	77.33	65.52	65.63
SEN%	V	45.42	61.42	62.97	64.53	63.52		62.37	55.36	60.48	77.22	75.22	67.73		65.53	75.26	63.63	62.73
SPE%	T	68.23	73.63	82.52	75.64	74.63		72.52	74.66	79.64	85.63	84.93	85.50		83.57	76.58	67.53	79.33
SPE%	V	66.53	69.93	78.53	72.25	71.55		68.42	73.52	74.74	82.52	81.86	81.93		79.87	71.48	62.53	75.78
PPV%	T	67.75	73.97	77.64	75.90	74.94		72.73	76.33	82.53	85.85	84.63	77.63		75.44	76.34	71.64	76.53
PPV%	V	62.77	70.53	73.28	71.63	70.64		68.28	74.77	80.27	82.12	81.73	79.24		74.24	73.64	70.53	72.63
NPV%	T	65.75	67.85	75.66	74.63	72.64		70.64	72.74	75.72	77.14	76.84	74.35		73.63	74.86	71.63	74.25
NPV%	V	62.65	65.66	73.67	73.32	70.35		66.67	70.53	73.02	73.54	72.53	71.89		70.52	73.67	68.52	72.55
Subgroup of 15 mm > lesions ≤ 25 mm (n= 778 )
AUC	T	0.69	0.73	0.76	0.78	0.85		0.78	0.82	0.86	0.91	0.93	0.96		0.91	0.76	0.75	0.81
AUC	V	0.64	0.68	0.70	0.73	0.81		0.70	0.75	0.81	0.85	0.87	0.93		0.86	0.73	0.71	0.76
ACC%	T	67.33	68.43	72.32	75.46	82.64		75.34	78.64	80.63	85.64	88.69	91.34		90.45	81.43	76.64	80.32
ACC%	V	64.43	64.76	69.76	72.42	78.87		73.53	75.75	76.75	83.63	85.85	90.76		88.65	77.64	75.74	78.34
SEN%	T	46.53	57.35	60.63	64.35	66.24		61.53	70.53	71.74	72.64	74.53	75.63		74.53	64.53	63.63	64.33
SEN%	V	44.43	55.53	57.63	60.42	63.44		56.63	65.53	67.53	66.53	70.53	72.53		70.53	58.42	59.34	63.53
SPE%	T	71.74	73.64	76.53	75.68	80.33		74.78	75.85	78.96	84.36	84.74	86.98		85.64	76.74	66.84	74.26
SPE%	V	66.47	70.63	72.35	72.87	76.85		72.25	73.64	72.63	80.75	81.85	84.22		83.63	73.45	62.67	70.75
PPV%	T	70.53	71.53	75.63	76.76		80.56	73.64	83.65	83.76	85.64	85.98	87.96		84.53	78.64	75.76	79.54
PPV%	V	65.75	68.82	72.73	73.81	76.73		70.97	78.14	80.14	83.34	81.34	85.34		82.53	74.22	73.25	74.82
NPV%	T	64.54	66.76	67.54	65.53	76.87		65.78	74.65	76.64	77.75	79.94	81.26		80.09	76.75	71.64	75.84
NPV%	V	61.85	63.64	62.75	64.74	74.93		62.97	70.93	72.64	73.36	75.13	77.57		76.32	73.21	68.96	73.26
Subgroup of lesions > 25mm (n=274 )
AUC	T	0.66	0.74	0.78	0.80	0.84		0.81	0.84	0.88	0.91	0.94	0.95		0.92	0.77	0.72	0.82
AUC	V	0.63	0.67	0.70	0.74	0.78		0.70	0.75	0.78	0.87	0.89	0.91		0.87	0.75	0.68	0.74
ACC%	T	65.45	67.63	71.77	73.53	80.34		71.76	79.75	80.74	83.64	85.64	86.65		86.32	78.79	77.90	78.06
ACC%	V	62.25	64.75	69.85	70.74	77.73		69.64	75.63	78.74	80.25	83.22	84.23		83.63	75.52	76.73	74.86
SEN%	T	50.64	53.65	65.83	68.85	69.34		67.53	56.33	67.73	76.53	78.58	80.32		75.63	76.53	74.59	79.61
SEN%	V	48.52	50.66	63.27	66.83	65.73		64.33	54.78	65.37	73.72	75.72	77.38		72.63	74.27	71.23	76.36
SPE%	T	72.64	75.64	73.78	76.86	82.63		75.53	74.53	78.57	82.64	84.35	87.44		84.53	76.63	74.53	80.53
SPE%	V	68.87	72.54	70.98	72.29	77.64		70.28	73.74	73.88	80.75	83.98	84.75		80.37	73.96	73.64	74.23
PPV%	T	67.85	70.65	73.66	75.63	82.43		72.64	82.64	83.64	84.23	86.04	86.57		84.35	80.42	76.53	79.64
PPV%	V	64.65	67.37	70.75	72.64	77.74		70.83	78.44	80.85	82.75	83.44	84.37		83.25	76.74	73.67	75.36
NPV%	T	65.64	67.54	70.63	73.53	83.65		72.64	79.64	80.53	81.04	82.03	87.64		80.74	76.37	73.53	79.84
NPV%	V	62.53	62.79	66.95	70.95	76.96		69.85	73.26	74.86	75.74	75.98	86.43		73.44	74.75	70.64	75.89

The best values are shown in bold. T= training cohort; V= validation cohort; ROI= whole SWE ROI box image；SWV=shear wave velocities

No competing interests reported.

Download PDF

Version 1

posted

You are reading this latest preprint version

Performance of a novel Convolutional Neural Network with the incorporation of automatic segmentation of peritumoral region in shear-wave elastography for predicting breast cancer

Status:

Version 1

Abstract

Figures

Introduction

Materials And Methods

Results

Discussion

Conclusion

Abbreviations

Declarations

References

Tables

Additional Declarations

Status:

Version 1