In this study, the deep learning model based on DM + DBT images showed the best discriminative performance among the four models evaluated. On the internal dataset, it achieved an AUC of 0.908, accuracy of 83.3%, sensitivity of 86.2%, and specificity of 80.7%. On the external validation dataset, the deep learning model based on DM + DBT still outperformed the other models, with an AUC of 0.854, accuracy of 79.6%, sensitivity of 81.8%, and specificity of 75.0%. Moreover, on the internal dataset, the performance of the deep learning model was assessed in subgroups of patients with different tumor sizes, age ranges, and breast densities. The advantage of the deep learning model based on DM + DBT images was especially prominent in patients with masses smaller than 1 cm, patients aged 20 to 40 years, and patients with dense breasts.
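For reference, accuracy, sensitivity, and specificity all follow from the confusion matrix at a fixed decision threshold. The sketch below is a minimal illustration and not the pipeline used in this study; the function name, the 0.5 threshold, and the toy labels and scores are assumptions made for the example.

```python
from typing import Dict, Sequence


def classification_metrics(
    labels: Sequence[int],
    scores: Sequence[float],
    threshold: float = 0.5,  # assumed cut-off; the study's threshold is not stated
) -> Dict[str, float]:
    """Accuracy, sensitivity, and specificity for binary labels (1 = malignant)."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        pred = 1 if s >= threshold else 0
        if y == 1 and pred == 1:
            tp += 1
        elif y == 1 and pred == 0:
            fn += 1
        elif y == 0 and pred == 1:
            fp += 1
        else:
            tn += 1
    total = tp + fp + tn + fn
    nan = float("nan")
    return {
        "accuracy": (tp + tn) / total if total else nan,
        # Sensitivity: fraction of malignant cases that are flagged.
        "sensitivity": tp / (tp + fn) if (tp + fn) else nan,
        # Specificity: fraction of benign cases that are cleared.
        "specificity": tn / (tn + fp) if (tn + fp) else nan,
    }


# Hypothetical toy data, not taken from the study.
labels = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.6, 0.3, 0.7, 0.4, 0.2, 0.1]
metrics = classification_metrics(labels, scores)
```

Unlike these three metrics, the AUC does not depend on the chosen threshold, which is why it is reported alongside them.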
In recent years, radiomics has emerged as a promising method for disease diagnosis because it extracts quantitative imaging features that describe pathophysiological characteristics[17–20]. In our study, the radiomics model based on DM + DBT was more powerful than the radiomics model based on DM alone, with AUCs of 0.869 and 0.810, respectively. Another study showed that the AUCs of the test ROC curves for mass classification using 2D alone, 3D alone, and combined 3D and 2D were 0.85, 0.86, and 0.91, respectively[21]. In that study, the most frequently selected features included two morphological features, six global and five local SGLD texture features for the 2D detection approach, and two morphological features, four gray-level features, two RLS texture features, and one SGLD texture feature for the 3D detection approach. In our study, among the 825 radiomic features, the gray-level features had higher values, which indicates more complex texture and greater heterogeneity of the tumoral region. This finding is consistent with previous reports by Kontos's group[22, 23], who also found that texture features were closely correlated with breast cancer in DBT images. Niu et al.[24] evaluated the tumoral and peritumoral regions in breast DBT images for differentiating malignant from benign lesions using handcrafted and deep features, and developed a radiomics nomogram that integrates the radiomics signature with important clinical factors to facilitate early diagnosis of breast cancer. Fusing DM and DBT images is a promising approach to small mass detection. A larger dataset should be collected to improve the training efficiency of the fusion model.
Recently, medical software devices based on deep learning have been developed to automatically detect lesions and classify them as benign or malignant on DM and DBT images. In a previous study, researchers used transfer learning to predict the malignancy of masses in DBT images[25]. Rabili et al. attempted to detect malignant masses and calcifications with the common Faster RCNN model[16]. However, these studies relied on 2D analysis with deep neural networks. In this study, we showed that the 3D deep learning method is superior to the 2D methods for classifying masses ≤ 2 cm as benign or malignant. Compared with the radiomics model, the deep learning model based on DM + DBT performed better, with an AUC of 0.908. Our results were comparable to those reported by Ricciardi et al.[26] for three different DCNNs, one developed ad hoc (DBT-DCNN) and the other two being AlexNet and VGG-19, with AUC values ranging from 0.70 to 0.93 and accuracies ranging from 69% to 93%. Fan et al.[15] showed that Mask RCNN had better lesion-based detection performance, whereas Faster RCNN achieved better breast-based mass detection in DBT images. We applied the widely used ResNet-34 architecture to our datasets. The deep learning model might be less influenced by lesion-specific features than feature-based radiomics methods, which may give it a better chance of recognizing masses ≤ 2 cm and may explain its significantly better DM + DBT-based detection performance.
Systematic analysis of the deep learning model based on DM + DBT showed that its AUC for masses varied across subgroups defined by characteristics such as age, mass size, and breast density. Interestingly, the radiomics model based on DM + DBT performed better than the deep learning model based on DM + DBT in fatty breasts. One possible reason is that the lesion-specific features of a mass are more prominent in fatty breasts because of the absence of tissue overlap. Fan et al.[20] also found that 3D-Mask RCNN had significantly better mass detection performance than the 2D methods for patients aged 40–59 years, benign tumors, irregular tumors, and dense breasts. These findings are comparable to our results, in which the AUC improved markedly from 0.736 to 0.843 in the subgroup aged 20 to 40 years and from 0.682 to 0.886 in the dense breast subgroup. This improvement can be attributed to the fact that DBT reduces tissue overlap and increases lesion conspicuity, particularly in dense breasts and younger women.
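A subgroup comparison of this kind amounts to computing the AUC separately within each stratum. The following is a minimal sketch under stated assumptions, not the analysis code of this study: the function names, the subgroup labels, and the toy data are hypothetical, and the AUC is computed with the Mann-Whitney pairwise formulation, in which ties count as one half.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Sequence, Tuple


def auc_mann_whitney(labels: Sequence[int], scores: Sequence[float]) -> float:
    """AUC as the probability that a positive case outscores a negative one."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        return float("nan")  # AUC is undefined without both classes
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(pos) * len(neg))


def auc_by_subgroup(
    labels: Sequence[int],
    scores: Sequence[float],
    groups: Sequence[Hashable],
) -> Dict[Hashable, float]:
    """Split cases by subgroup label and compute the AUC within each subgroup."""
    buckets: Dict[Hashable, Tuple[List[int], List[float]]] = defaultdict(
        lambda: ([], [])
    )
    for y, s, g in zip(labels, scores, groups):
        buckets[g][0].append(y)
        buckets[g][1].append(s)
    return {g: auc_mann_whitney(ys, ss) for g, (ys, ss) in buckets.items()}


# Hypothetical toy data, not taken from the study.
labels = [1, 0, 1, 0, 1, 0, 1, 0]
scores = [0.9, 0.2, 0.6, 0.7, 0.8, 0.3, 0.6, 0.5]
groups = ["dense", "dense", "dense", "dense", "fatty", "fatty", "fatty", "fatty"]
subgroup_auc = auc_by_subgroup(labels, scores, groups)
```

With small subgroups, the resulting AUCs carry wide uncertainty, so differences between strata should be read with caution.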
Our study had several limitations. First, the radiomics and deep learning models were evaluated on a relatively small patient cohort, which may limit the generalizability of the current models to a larger patient population. Moreover, because of the limited amount of external data, no subgroup analysis was performed on the external dataset. Second, the ROIs on each slice were segmented manually, which increased the workload. Third, we used image patches for detection to save computer memory, so future studies should examine the entire image. In addition, model performance may depend strongly on factors that affect image quality, such as the DBT reconstruction methods and parameters, and on image acquisition settings, such as the x-ray technique, number of projection views, and tomographic angle. Further investigation is needed to evaluate the effects of these factors on the performance of different models using different approaches.
In summary, the deep learning model based on DM + DBT images achieved the best performance among the models evaluated. We compared the DM and DM + DBT models in subgroups defined by age range, lesion size, and breast density. The advantage of the deep learning model was especially prominent in patients with masses smaller than 1 cm, patients aged 20 to 40 years, and patients with dense breasts. AI is expected to play a major role in the DBT evaluation of masses ≤ 2 cm, particularly in detecting early breast cancer in the screening setting.