Histo-Fusion: a novel domain-specific learning approach to identify invasive ductal carcinoma (IDC) from histopathological images

Globally, Invasive Ductal Carcinoma (IDC) is one of the leading malignancies among women undergoing breast cancer screening. A slight delay in detecting and diagnosing this disease may result in irreversible complications. Histopathological images obtained from a biopsy examination contain rich structural information, which helps significantly in improving the prognosis of the disease. Pathological analysis involving microscopic examination of histopathological slides is a highly challenging task, and conventional computer-aided detection (CAD) methods have limited ability to reach a precise diagnosis. The problem is accentuated by the limited availability of medical data in the public domain. Lately, deep learning methods using artificial neural networks have been used consistently to improve the performance of these CAD methods. To cope with this data scarcity, transfer learning is now commonly practised to enable a deep neural network to train on a specific dataset. However, the performance of this method is not remarkable on small and low-resolution medical image datasets compared to those comprising whole-slide images of higher resolution. In this direction, this study proposes a domain-specific learning strategy, Histo-Fusion, with the objective of detecting IDC more precisely. In the proposed method, a deep CNN model is initially trained on higher-resolution histopathological images of breast tissue, which present ample information for the model to learn significant features for better discrimination between normal and malignant tissue. Subsequently, through a positive transfer of domain features, the model is further trained on small and low-resolution images, enabling it to classify these histology images into IDC- and IDC+ categories. Moreover, shallow and deep neural network architectures are utilized in the study to compare their performance across the two learning approaches, transfer learning and Histo-Fusion, on the IDC dataset. As the results reveal, the proposed Histo-Fusion learning approach improves each deep CNN's discriminating ability, yielding accuracy scores around 5% higher than those obtained with the commonly used transfer learning strategy. The procedure is therefore expected to reduce false-positive rates and help expert pathologists reach accurate diagnoses.


Introduction
Breast cancer is the most commonly diagnosed cancer among women today. With 2.3 million new cases and 685,000 deaths worldwide, breast cancer is the fourth leading cause of cancer-related mortality [53]. The Indian Council of Medical Research (ICMR) predicted 200,000 new breast cancer cases among Indian women [34]. An effective prognosis of this morbidity is possible with earlier diagnosis through an efficient detection system. The assessment process for detecting breast cancer follows three steps: mammogram, clinical assessment, and needle biopsy (H&E-stained histology image) [21]. Mammograms and histopathological images are the most commonly used modalities for breast cancer screening and diagnosis. However, around 10% of patients are generally advised to undergo a more thorough assessment after mammography [37]. Despite being an effective method, the procedure exhibits a trade-off between specificity (0.91) and sensitivity (0.84), resulting in needless biopsies [17]. As a result, patients are subjected to significant stress and trauma, besides bearing increased healthcare costs.
Histopathological analysis is presently the only definitive method for diagnosing malignancy in breast tissue, i.e., invasive ductal carcinoma (IDC) [17]; it involves keen microscopic examination by an expert pathologist. Nevertheless, in the present-day healthcare scenario, India faces a severe dearth of pathologists against a growing number of patients. As per available statistics, India has roughly one pathologist per 65,000 people [43], a far worse ratio than in the USA, which has around one per 17,500. The complex nature of histopathological examination and the massive number of investigations per pathologist may lead to false diagnoses [18,59]. To address this problem, computer-aided detection (CAD) methods using image processing and analysis have been developed to help domain experts arrive at a correct diagnosis [21]. Such techniques aim to provide additional insight into tumour region identification, mitotic activity, and the identification of breast cancer subtypes such as IDC (invasive ductal carcinoma) and ILC (invasive lobular carcinoma) [56]. Although CAD methods are effective, accurate detection of abnormal breast tissue remains a challenge because the texture patterns present in histopathological images are highly complex. For precise diagnosis, a feature extraction method capable of extracting significant features from these complicated patterns is highly desirable [60]. This cannot be achieved through CAD, because such systems rely primarily on prior knowledge of the input data [5], while the focus remains on finding a fair balance between detection accuracy and computational complexity [35]. CAD systems tend to produce high rates of false positives, and recent studies have confirmed that the diagnostic capacity of these models cannot improve much further [29]. As such, there is an urgent need for more advanced and accurate detection procedures.
Rapid advances in computational technology have made it possible to use deep learning methods (especially deep convolutional neural networks) for object detection and classification tasks [14,20,32]. Such methods can also be used for various critical tasks, such as the early detection of malignancies affecting breast, colon, and lung tissues.
Recently, several studies employing deep learning methods have achieved better results than traditional CAD systems by providing diverse analyses of suspicious scans [23,31]. Unlike conventional machine learning and CAD methods, deep learning models can automatically extract significant features from the input data [62]. Deep CNNs carry out this feature learning in a hierarchical framework, combining the features extracted from low-level and high-level abstractions in a non-linear manner [21]. Deep learning methods adjust themselves according to the inputs provided, improving the correlation between input and output through an iterative training process [8]. Indeed, deep learning techniques using transfer learning have provided competitive results in diagnosing breast cancer from histopathological samples. However, their performance is less promising for datasets comprising small, low-resolution image samples.
The present study proposes a novel Histo-Fusion learning method that aims to identify IDC from histopathological images more accurately than conventional methods currently in use. The study uses various state-of-the-art deep CNNs, including shallow network architectures (AlexNet and VGG19) along with deep architectures (ResNet and DenseNet), to compare the performance of the Histo-Fusion learning approach against the established transfer learning strategy.

Related literature
Data acquisition, data pre-processing, feature mining, and decision mining are the integral modules of any automatic classification system [49]. Among these, feature mining holds prominence because classification performance correlates directly with the quality of the features extracted from input images. Conventional methods like Local Binary Pattern (LBP) [38], Parameter-Free Threshold Adjacency Statistics (PFTAS) [15], Oriented FAST and Rotated BRIEF (ORB) [47], and Local Phase Quantization (LPQ) [39] use texture as the significant attribute for feature extraction. [50] evaluated the feature extraction ability of these conventional methods in conjunction with various classifiers like 1-Nearest Neighbor (1-NN) [58], Quadratic Discriminant Analysis (QDA) [55], Support Vector Machine (SVM) [9], and Random Forest (RF) [30]. The study concluded that SVM, using fractal dimension as a feature descriptor, performed best among these classifiers on low-resolution images [11]. An in-depth comparative analysis of different machine learning classifiers concluded that the CatBoost (CB) classifier proved more efficient than the Extra-Trees model, multi-layer perceptron (MLP) and RF methods [10,40,46]. Further, after extracting textural features from the histopathological images, an 'ensemble learning by stacking' technique classified them into IDC- and IDC+ categories. However, the results so obtained were highly inconsistent owing to the complete dependence of these techniques on the quality of the features retrieved by the various feature descriptors. To address this issue, researchers have turned to the deep learning methods discussed below.
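To make the conventional pipeline concrete, the following is a minimal sketch, not the cited studies' code, pairing an LBP texture descriptor (scikit-image) with an SVM classifier (scikit-learn); the patch size, LBP parameters, and toy data are illustrative assumptions.

```python
import numpy as np
from skimage.feature import local_binary_pattern
from sklearn.svm import SVC

def lbp_histogram(gray_patch, p=8, r=1):
    """Compute a normalized LBP histogram as a texture descriptor."""
    lbp = local_binary_pattern(gray_patch, P=p, R=r, method="uniform")
    n_bins = p + 2  # 'uniform' LBP yields P + 2 distinct codes
    hist, _ = np.histogram(lbp, bins=n_bins, range=(0, n_bins), density=True)
    return hist

# Toy example: random grayscale patches standing in for histology images.
rng = np.random.default_rng(0)
patches = rng.integers(0, 256, size=(100, 50, 50)).astype(np.uint8)
labels = rng.integers(0, 2, size=100)  # 0 = IDC-, 1 = IDC+

features = np.stack([lbp_histogram(patch) for patch in patches])
clf = SVC(kernel="rbf").fit(features[:80], labels[:80])
print("toy accuracy:", clf.score(features[80:], labels[80:]))
```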
Cruz-Roa [16] pioneered the use of deep CNNs to identify IDC from whole-slide histopathological images, alongside classical handcrafted methods such as Fuzzy Color Histogram (FCH), RGB Histogram (RGBH), LBP, Haralick features, graph-based features, and Gray Histogram (GH). With an F1-score of 0.7180 and a balanced accuracy of 0.8423, a three-layer CNN provided better results than the handcrafted methods, among which FCH (F1: 0.6753, balanced accuracy: 0.7874) and RGBH (F1: 0.666, balanced accuracy: 0.7724) performed best. [57] applied several CNNs for automatic feature extraction and classification to optimize the F1 and area under the curve (AUC) scores, four of which had already been implemented by [16]. Each CNN architecture was trained on the IDC dataset until it achieved maximum accuracy with minimal gradient. The deeper networks among the four deep CNNs yielded higher AUC and F1 scores than the shallower ones. Specifically, the deepest of the four CNNs with a single dropout layer yielded F1: 0.923, balanced accuracy (BAC): 0.866, and accuracy: 0.89, compared to a network comprising two dropout layers, which yielded F1: 0.911, BAC: 0.814, and accuracy: 0.87. An interesting inference drawn from the study is that the performance gain from data augmentation is directly proportional to the depth of the CNN used. [42] applied data augmentation through random rotation, flipping and shifting on a training set comprising IDC histopathological images. A six-layer CNN with ReLU non-linearity and the Adadelta optimizer, using data augmentation, classified the test samples into IDC- and IDC+ categories, obtaining F1: 0.8934 and accuracy: 0.89. Moreover, the architecture displayed less overfitting, apparently by virtue of the data augmentation.
B. N. Narayanan [36] used color consistency and histogram equalization for pre-processing the histopathological patch images. The deep CNN used in the study obtained an AUC value of 0.935 on the patches pre-processed using color consistency, comparatively higher than the 0.876 obtained on histogram-equalized patches. This increment may be attributed to the color consistency procedure's ability to maintain contrast levels across all images of the dataset, imparting better discriminating abilities to a deep CNN for classification tasks. [2] investigated the performance of a depthwise separable convolution model against a standard convolutional neural network model on the IDC dataset. Depthwise separable convolution characterizes itself by performing one convolution operation on a single channel at a time, whereas standard convolution operates on all channels simultaneously. To introduce non-linearity at the end of the convolution operations, several activation functions, namely ReLU, Tanh, and Sigmoid, were used independently in both models to test their responses in the classification task. The standard convolutional neural network model provided better performance results (precision: 0.81, specificity: 0.73, sensitivity: 0.71, F1: 0.3, and accuracy: 0.87) than the depthwise convolutional neural network. Among the activation functions used in this model, ReLU (accuracy: 0.871) outperformed the others, followed by Sigmoid (accuracy: 0.864). [24] used a balanced dataset for training a CNN architecture as implemented by [42], with a slight variation, i.e., the use of variable dropout rates. The optimized CNN-based approach yielded an accuracy score of 0.854 on the IDC dataset. Further, the study used an under-sampling strategy to reduce the bias of the classifier towards the majority class by removing some patches from the class with the higher sample count. [45] implemented multi-level batch normalization using an Inception CNN as the base architecture. These modules help mitigate internal covariate shift, enabling better training of the CNN model. This method finally obtained a BAC score of 0.89 on an imbalanced IDC test set.
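The distinction between the two convolution types discussed above can be illustrated in a few lines of PyTorch; the channel counts below are illustrative assumptions, not the configuration of [2].

```python
import torch
import torch.nn as nn

standard = nn.Conv2d(in_channels=32, out_channels=64, kernel_size=3, padding=1)

# Depthwise separable = per-channel (depthwise) conv + 1x1 pointwise conv.
depthwise_separable = nn.Sequential(
    nn.Conv2d(32, 32, kernel_size=3, padding=1, groups=32),  # one filter per channel
    nn.Conv2d(32, 64, kernel_size=1),                        # mixes channels
)

x = torch.randn(1, 32, 48, 48)
assert standard(x).shape == depthwise_separable(x).shape  # same output shape

# Parameter comparison: the separable version is much cheaper.
count = lambda m: sum(p.numel() for p in m.parameters())
print(count(standard), "vs", count(depthwise_separable))
```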
Sujatha [12] used ResNet34 and ResNet50 on different training-set configurations containing varying numbers of IDC patch samples. ResNet50 performed slightly better than ResNet34 in identifying IDC- and IDC+ samples. The study concluded that the discriminating ability of the CNN models decreases in proportion to the reduction in training-set sample instances. [1] investigated the performance of machine learning methods like Logistic Regression, KNN, and SVM, along with deep learning methods involving three different CNN architectures. SVM outperformed its machine-learning counterparts in classifying samples into IDC- and IDC+ categories, yielding an accuracy score of 0.785. Of the three CNNs used in the study, the two shallower networks yielded accuracy scores of 0.59 and 0.76 against the accuracy score of 0.87 obtained by the deeper network.
An ensemble model that seeks better performance by combining the outputs of two deep CNNs (DenseNet121 and DenseNet169) obtained a BAC score of 0.9207 and an F1 score of 0.9570 [7]. In conjunction with test-time augmentation, the ensemble model displayed better performance in classifying IDC patch samples than the individual pre-trained models. The study further highlights the performance improvements achieved in the classification task by upscaling the image patches from 50 × 50 pixels to 250 × 250 pixels. [4] used a data-driven learning approach similar to the one used in this study, initially training a deep CNN on a large dataset comprising skin cancer images. Afterwards, the deep CNN was trained and fine-tuned on histopathological images of the BreakHis dataset, achieving an accuracy score of 0.975 against 0.852 with the training-from-scratch approach. [6] proposed a novel automatic detection method, Histo-CADx, for identifying breast cancer from histopathological images. In its initial phase, the model investigated the impact of combining features extracted using deep learning methods with those obtained from traditional methods. Next, Histo-CADx implemented a multi-classifier system by fusing the outputs of three individual classifiers. Such an arrangement achieved better performance in identifying breast cancer from the histopathological images of the BreakHis and ICIAR datasets.
The literature cited above reveals that most studies employing deep CNN methods have used either a learn-from-data approach (training from scratch) or transfer learning to identify IDC from histopathological images. Deep learning models require enormous amounts of labelled medical data for good performance and generalization [4]. On the other hand, the limited availability of labelled medical data makes feature learning a highly challenging task for deep CNN methods, owing to the expensive and time-consuming annotation process [19]. Transfer learning is an effective strategy for training deep CNN models where data availability is scarce. Training a deep CNN model using a transfer learning strategy involves pre-training it on a source dataset, ImageNet [48], consisting of around 1.2 million images of heterogeneous nature. Afterwards, transfer learning allows the pre-trained network to be trained on a specific dataset, enabling it to perform the required classification task. Contemporary research has endorsed this technique as effective for improving the performance of deep CNN models on image-related tasks like detection, segmentation and classification [3,16]. However, the images present in the ImageNet dataset differ substantially from histopathological images in terms of colour, shape, and resolution [4].
The small-sized and low-resolution histopathological images containing IDC are characterized by limited semantic features, which is why such images are ideally captured at higher resolutions. A recent study implemented 'super-resolution microscopy' to improve the resolution of ovarian biopsy images [13]. The framework used channel-fusion transfer learning to generate higher-resolution slice images from input slices of lower resolution and yielded better performance. On the other hand, recent studies have empirically demonstrated that a CNN trained from scratch on histopathological data provides results nearly similar to those obtained by a pre-trained deep CNN using transfer learning [41]. This fact is corroborated by the information presented in Table 1 as well. Deep CNNs using transfer learning often do not perform satisfactorily due to the negative-transfer problem [61]. Another issue faced by deep CNNs employing transfer learning is early model overfitting, which leads the model to learn unnecessary details, including noise [26], inhibiting its generalizing ability on new data. Given these disadvantages, the suitability of transfer learning as a technique for precisely detecting affected tissue in images has come under question, especially on datasets exhibiting a high degree of inter-class similarity [26,27].
To combat the problems stemming from transfer learning, the present study proposes a domain-specific training framework, Histo-Fusion, that aims to classify small and low-resolution IDC images more accurately. The proposed framework offers several advantages: (i) positive transfer of learning, (ii) learning of significant features by the deep CNNs used, and (iii) reduced false-positive rates, which enhance the discriminating ability of the models used.

Proposed approach
As mentioned above, the present study proposes a novel Histo-Fusion learning approach for classifying small-sized and low-resolution IDC images with a higher level of precision. The proposed framework is based on rigorous training of each deep CNN model on the large, high-resolution whole-slide histopathological images of the BreakHis dataset. These images, captured at magnifications of 40X, 100X, 200X, and 400X, present a large number of morphological features from which the deep CNN models can learn to distinguish between benign and malignant breast tissue. Afterwards, each model is further trained and fine-tuned on small-sized, low-resolution images of the related domain, i.e., the IDC dataset. This helps each deep CNN model maximize its feature space by fusing the histology features acquired through two distinct feature extraction processes, viz. those obtained from the BreakHis images and those from the IDC patch images. Finally, each deep CNN applies the fused histology features to categorize small-sized and low-resolution test IDC images into their appropriate categories of IDC- and IDC+. We have coined the term 'Histo-Fusion' for this fusion strategy. An illustrative rendition of the proposed Histo-Fusion strategy is given in Fig. 1. The effectiveness of the proposed framework against the commonly practised transfer learning approach is tested using five deep CNNs comprising both shallower (AlexNet and VGGNet) and deeper networks (ResNet and DenseNet). Each model is adequately trained on a domain-specific dataset to improve the convergence of its weights. The domain-specific learning pursued in this manner imparts a positive learning experience to all models, which yields comparatively encouraging results, as discussed in subsequent sections. Considering the clinical significance of identifying IDC- and IDC+ samples correctly from histopathological images, a framework with even a marginally better discriminating ability is an immense gain.

Dataset description
In order to classify the histopathological images into the sub-categories of IDC- and IDC+, and to validate the proposed Histo-Fusion approach, two different datasets, BreakHis and IDC, have been used. Both datasets store histopathological image samples of breast tissue in PNG format. Generally, the distribution of benign and malignant image samples is highly imbalanced across multi-class datasets, which also holds for the datasets used in this study (Tables 2 and 3).

BreakHis dataset
The BreakHis dataset [50], currently the largest repository of breast cancer histopathological whole-slide images publicly accessible to the research community, serves as the source dataset for pre-training the deep CNN models utilized in this study. In the BreakHis dataset, the benign and malignant samples are subdivided into eight sub-classes, four in each category.

IDC dataset
The IDC dataset [25] used in this study is the target dataset. It belongs to the class of breast cancer histopathological datasets and comprises 162 whole-slide images captured at 40X magnification, with huge spatial dimensions of the order of 10^10 pixels. A group of expert pathologists subdivided each whole-slide image into a number of non-overlapping patches. After annotating each patch with an appropriate label, the pathologists extracted 198,738 patch samples belonging to the IDC- category and 78,786 belonging to the IDC+ category. Figure 3 shows IDC- and IDC+ histopathological image patches, in which the IDC- samples representing normal/benign tissue are outlined in green, while the IDC+ samples representing abnormal/malignant tissue are outlined in red. A macroscopic examination of these slides reveals that a good number of them contain large white or black segments. Such samples are removed from the total count to avoid confusing the deep CNN models during training. Finally, the number of patch samples is reduced to 190,972 for IDC- and 34,977 for IDC+.

Convolutional neural networks
Deep learning models have recently been widely used for computer vision and object recognition tasks [36]. In the present study, four state-of-the-art architectures (AlexNet, VGG19, ResNet and DenseNet) are used for classifying the IDC samples. The study prefers these CNNs for their remarkable performance in diagnosing malignancy from medical images with high precision [12,26,54]. Of the CNNs utilized, two network architectures, AlexNet and VGG19, belong to the category of shallow CNNs, whereas ResNet and DenseNet are deep network architectures. Such an arrangement provides an opportunity to test the discriminating ability of the proposed Histo-Fusion strategy on both network configurations, shallow as well as deep.

DenseNet
DenseNet [24] is a recent paradigm for implementing a deep CNN that uses dense connections interconnecting the layers of a network in a cascade fashion. Each layer accepts the feature maps of all preceding layers, which enables the seamless propagation of rich features across the network and ensures that it is least affected by the vanishing-gradient problem. The model comprises a convolution layer, a dense block, a transition layer, and a classifier. The architecture is considered one of the most computationally efficient deep CNNs [24].
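The dense connectivity just described can be sketched compactly in PyTorch. The block below is a hedged illustration of the concatenation pattern, with an assumed growth rate and layer count rather than DenseNet121's exact configuration.

```python
import torch
import torch.nn as nn

class DenseLayer(nn.Module):
    def __init__(self, in_channels, growth_rate):
        super().__init__()
        self.bn = nn.BatchNorm2d(in_channels)
        self.conv = nn.Conv2d(in_channels, growth_rate, kernel_size=3, padding=1)

    def forward(self, x):
        return self.conv(torch.relu(self.bn(x)))

class DenseBlock(nn.Module):
    def __init__(self, in_channels, growth_rate=12, n_layers=4):
        super().__init__()
        self.layers = nn.ModuleList(
            DenseLayer(in_channels + i * growth_rate, growth_rate)
            for i in range(n_layers)
        )

    def forward(self, x):
        features = [x]
        for layer in self.layers:
            # Every layer sees the concatenation of all earlier feature maps.
            features.append(layer(torch.cat(features, dim=1)))
        return torch.cat(features, dim=1)

block = DenseBlock(in_channels=16)
print(block(torch.randn(1, 16, 56, 56)).shape)  # channels: 16 + 4*12 = 64
```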

Pre-processing
Before a pathological examination, the extracted tissues are generally stained using two chemicals: Haematoxylin and Eosin. Haematoxylin gives a dark purple colour to the tissue nuclei, while Eosin imparts a light pinkish colour to the cytoplasm. Since the histopathological images are captured at different times from distinct patients, some of these images are under-stained while others are over-stained due to the disproportionate use of the Haematoxylin and Eosin stains. Therefore, it is highly desirable to carry out stain normalization of the histopathological images before performing any classification task. These two stain colours (purple and pink) may essentially represent all the pixels of an ideal H&E image. Generally, these stain colours vary across images and may be represented by a stain matrix $I_s$:

$$I_s = \begin{bmatrix} H_r & H_g & H_b \\ E_r & E_g & E_b \end{bmatrix}$$

where $H_r$, $H_g$, $H_b$ and $E_r$, $E_g$, $E_b$ in the first and second rows of the matrix represent the Haematoxylin and Eosin stain colours in RGB space, respectively. In the present study, the stain-normalization procedure proposed by [33] is applied to the whole-slide histopathological images to conform them to a uniform colour space. In this procedure, a logarithmic transformation is first applied to an original H&E image $I$ to obtain its optical-density version ($OD$), followed by a singular value decomposition that produces the 2D projections of highest variance.
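The optical-density conversion and SVD projection just described can be sketched as follows. This is a hedged illustration loosely following the Macenko-style method of [33], not the authors' implementation; the OD threshold and demo image are assumptions, and the full method additionally selects stain vectors by angular percentiles before the final normalization.

```python
import numpy as np

def od_and_stain_plane(rgb, od_threshold=0.15):
    """Convert an RGB image to optical density and find the 2D SVD stain plane."""
    rgb = rgb.astype(np.float64)
    od = -np.log10((rgb + 1.0) / 256.0)             # optical density, avoids log(0)
    flat = od.reshape(-1, 3)
    flat = flat[(flat > od_threshold).any(axis=1)]  # drop near-transparent pixels
    # SVD of the OD pixels; the top two right-singular vectors span the plane
    # of highest variance, on which the H and E stain vectors lie.
    _, _, vt = np.linalg.svd(flat, full_matrices=False)
    return od, vt[:2]  # (H, W, 3) OD image and a 2x3 plane basis

demo = np.random.randint(0, 256, size=(64, 64, 3), dtype=np.uint8)
od, plane = od_and_stain_plane(demo)
print(plane.shape)  # (2, 3)
```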
In the final step, image histogram stretching and a colour-space transform are applied to obtain the stain-normalized image. An original H&E image from the BreakHis dataset and its corresponding stain-normalized image are shown in Fig. 4. Furthermore, the study applies data augmentation to the training set after the train/valid/test dataset split. These techniques increase the size of the training data, which helps overcome overfitting and enhances the generalizing ability of the deep CNN models. Image augmentation in the present study is accomplished through random rotation (between 45 and 315 degrees), zooming, flipping (horizontal and vertical), and shifting.
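A minimal sketch of such an augmentation pipeline, using torchvision transforms, might look as follows; apart from the stated 45 to 315 degree rotation range, the parameter values are assumptions.

```python
from torchvision import transforms

train_augment = transforms.Compose([
    transforms.RandomRotation(degrees=(45, 315)),   # random rotation, as stated
    transforms.RandomHorizontalFlip(),
    transforms.RandomVerticalFlip(),
    # shift and zoom; the 10% ranges are illustrative assumptions
    transforms.RandomAffine(degrees=0, translate=(0.1, 0.1), scale=(0.9, 1.1)),
    transforms.ToTensor(),
])
# Applied only to the training set, after the train/valid/test split,
# so that validation and test images remain unaugmented.
```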

Proposed model
As discussed earlier (Section 3.1), the proposed Histo-Fusion strategy is a domain-specific learning method formulated for training deep CNNs with the aim of identifying IDC from small and low-resolution images. The strategy performs training in two distinct stages.
Histo-Fusion stage 1 In this stage, each deep CNN is trained on the whole-slide images of the BreakHis dataset. Since the images of the BreakHis dataset have larger dimensions and are captured at higher magnification factors, they provide the deep CNNs with more detailed and robust features. Each deep CNN is trained from scratch on the BreakHis histopathological images, with weights assigned randomly at initialization. Each input image is down-sampled to 224 × 224 pixels from its original size of 700 × 460 pixels. Each model starts with an initial convolution block that performs the preliminary convolution and pooling operations on the 3-channel 224 × 224 input image. The convolution layers are responsible for extracting features from the input image through different filters, producing output feature maps of reduced size towards the end of each convolution operation. The pooling layers work on individual feature maps and perform a down-sampling operation to accelerate computation, besides helping reduce overfitting. After taking the flattened output, the fully connected layer forwards its output to a final layer comprising a SoftMax activation function, which provides the probability of the input belonging to a particular class (benign/malignant). Finally, the cross-entropy loss function is used to calculate the error rate during each training cycle. The deep CNNs continue to train through forward and backward propagation until no further error-rate improvements are observed. Afterwards, the weights of each model are recorded for further use in Histo-Fusion stage 2. Figure 5 presents a schematic visualization of the DenseNet121 architecture for Histo-Fusion stage 1. The architectural details (number of convolution layers, size of input and feature map after each convolution operation) of the remaining deep CNN models are summarized in Table 4.
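A condensed sketch of this stage-1 protocol in PyTorch follows: DenseNet121 with random weights, a two-class head, cross-entropy loss, and the hyperparameters stated later in the paper (learning rate 0.01, momentum 0.9, weight decay 10^-4). DataLoader construction and the output filename are assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.densenet121(weights=None)                       # random initialization
model.classifier = nn.Linear(model.classifier.in_features, 2)  # benign vs malignant
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01,
                            momentum=0.9, weight_decay=1e-4)

def train_one_epoch(loader):
    """One pass of forward/backward propagation over the BreakHis training set."""
    model.train()
    for images, labels in loader:          # images: (N, 3, 224, 224) tensors
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()

# Once validation error stops improving, record the weights for stage 2:
torch.save(model.state_dict(), "stage1_breakhis.pt")  # filename is an assumption
```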
Histo-Fusion stage 2 In this stage, the deep CNN models pre-trained on the BreakHis dataset are further fine-tuned on the small and low-resolution histopathological images of the IDC dataset. Initially, the IDC patch images are downsized to 48 × 48 pixels from their original size of 50 × 50 pixels. Next, each deep CNN is initialized with the weights recorded in stage 1 of Histo-Fusion learning. The deep CNN models then perform convolution and pooling operations to extract significant features for better discrimination between normal and malignant IDC tissues, in a manner similar to that described for Histo-Fusion stage 1. Training of the deep CNNs on the IDC dataset is performed using partial network training and layer-wise fine-tuning approaches. In the partial network training approach, only the terminal layers participate in the re-training process, keeping the rest of the network intact (i.e., the weights of the initial network layers are kept as they are); the terminal layers are trained using a constant learning rate. Layer-wise fine-tuning, on the other hand, is accomplished through discriminative learning rates, in which the layers of a network are split into three distinct groups: initial layers, middle layers and terminal layers. Each group is assigned a different learning rate using an incremental approach: lower learning rates for the initial layers, moderately higher learning rates for the middle layers, and the highest learning rates for the terminal layers.
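The partial-network-training step might be sketched as follows, assuming the stage-1 weights were saved to a file; the filename and the exact layer grouping are illustrative assumptions.

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.densenet121(weights=None)
model.classifier = nn.Linear(model.classifier.in_features, 2)
model.load_state_dict(torch.load("stage1_breakhis.pt"))  # stage-1 weights

for param in model.features.parameters():    # keep initial layers intact
    param.requires_grad = False

# Only the terminal (classifier) layers are re-trained, at a constant rate.
optimizer = torch.optim.SGD(model.classifier.parameters(),
                            lr=0.01, momentum=0.9, weight_decay=1e-4)
```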
Since the initial layers of a network capture very fine details (edges or lines) about the data, compared to the more abstract information learnt by the terminal layers, such information is not likely to change across domains. Therefore, through the use of discriminative learning rates, optimal network training is achieved without disturbing these generic low-level features.
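Layer-wise fine-tuning with discriminative learning rates maps naturally onto PyTorch parameter groups. The split point between initial and middle layers and the specific rates below are assumptions for illustration.

```python
import torch
import torch.nn as nn
from torchvision import models

model = models.densenet121(weights=None)
model.classifier = nn.Linear(model.classifier.in_features, 2)
model.load_state_dict(torch.load("stage1_breakhis.pt"))  # stage-1 weights

blocks = list(model.features.children())
half = len(blocks) // 2
initial, middle = blocks[:half], blocks[half:]            # assumed split point

optimizer = torch.optim.SGD(
    [
        {"params": [p for b in initial for p in b.parameters()], "lr": 1e-4},  # lowest
        {"params": [p for b in middle for p in b.parameters()], "lr": 1e-3},   # moderate
        {"params": model.classifier.parameters(), "lr": 1e-2},                 # highest
    ],
    momentum=0.9,
    weight_decay=1e-4,
)
```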

Experimental setup and parameter setting
Before training the deep CNNs using the proposed Histo-Fusion strategy, the study uses a random image distribution for the dataset splitting. The BreakHis dataset is split into a train set (80%) and a validation set (20%). On the other hand, 70% of the IDC dataset is reserved for model training, 10% for model validation, and 20% for testing. In carrying out the dataset splits, it is ensured that there is no overlap of images across the train/valid/test splits of each dataset. The validation set reports the error rate at the end of each training cycle, which helps (i) in fine-tuning the hyperparameters more conveniently and (ii) in selecting the best model for testing on new data.
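One way to realise the stated 70/10/20 split with disjoint subsets is sketched below using scikit-learn; the file names and labels are placeholders, not the actual dataset.

```python
import numpy as np
from sklearn.model_selection import train_test_split

paths = np.array([f"patch_{i}.png" for i in range(1000)])  # placeholder names
labels = np.random.randint(0, 2, size=1000)                # placeholder labels

train_p, test_p, train_y, test_y = train_test_split(
    paths, labels, test_size=0.20, stratify=labels, random_state=42)
train_p, valid_p, train_y, valid_y = train_test_split(
    train_p, train_y, test_size=0.125, stratify=train_y, random_state=42)
# 0.125 of the remaining 80% = 10% overall; disjoint arrays guarantee no overlap.
```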
Each deep CNN is trained independently on two distinct training sets: one belonging to the BreakHis dataset (Histo-Fusion stage 1) and the other to the IDC dataset (Histo-Fusion stage 2). The training set enables each model to learn distinctive features from the stain-normalized input histopathological images. In both stages of Histo-Fusion learning, various hyperparameters, viz. learning rate, number of epochs, weight decay and momentum, are fine-tuned during training for better network convergence. Hyperparameter tuning is highly significant as it influences the performance of a deep CNN model. The initial constant learning rate is set to a small value of 0.01 to prevent the model from fluctuating or diverging around the minima. Next, discriminative learning rates provide variable learning rates to the different layers of the individual networks. Since the study utilizes different deep CNN models to assess the generalization ability of the proposed approach, the models converge at different epochs owing to their varying numbers of trainable parameters. Training of each deep CNN model is stopped only after obvious signs of overfitting appear. Moreover, the values of the other hyperparameters, weight decay and momentum, are set at 10^-4 and 0.9 respectively. Finally, the proposed strategy's performance is evaluated on a test set comprising small-sized and low-resolution images of the IDC dataset.

Table 4 Architectural details of the various CNNs used in the study for an input histopathological image (BreakHis dataset) of size 224 × 224 pixels. 'C' refers to a convolutional block and 'OS' is the output size after each convolution operation; within each convolution block, the notation [n × n, f] denotes a filter of size n × n with f channels.
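The stopping criterion described above can be sketched as a simple early-stopping loop over the validation error; `train_one_epoch` is reused from the stage-1 sketch, while `evaluate`, the loaders, and the patience value are hypothetical assumptions.

```python
import torch

best_val, patience, bad_epochs = float("inf"), 5, 0   # patience is an assumption
for epoch in range(200):
    train_one_epoch(train_loader)      # as in the stage-1 sketch above
    val_loss = evaluate(valid_loader)  # hypothetical helper: mean validation loss
    if val_loss < best_val:
        best_val, bad_epochs = val_loss, 0
        torch.save(model.state_dict(), "best.pt")     # keep the best model
    else:
        bad_epochs += 1
        if bad_epochs >= patience:     # validation error keeps worsening,
            break                      # an obvious sign of overfitting: stop
```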

Performance evaluation
The performance of the deep CNN models using the proposed Histo-Fusion approach is evaluated on various standard evaluation metrics, viz. accuracy, F1-score, specificity, sensitivity and precision. All these metrics are computed on the basis of True Positives (TP), True Negatives (TN), False Positives (FP), and False Negatives (FN). TN signifies the total number of benign/IDC- samples classified correctly, whereas TP is the total number of malignant/IDC+ samples classified correctly. FP is the number of benign samples classified incorrectly, whereas FN is the number of malignant samples classified incorrectly. These metrics are calculated at the image level in the following manner:

$$Accuracy = \frac{TP + TN}{TP + TN + FP + FN}, \quad Precision = \frac{TP}{TP + FP}$$

$$Sensitivity = \frac{TP}{TP + FN}, \quad Specificity = \frac{TN}{TN + FP}$$

$$F1 = \frac{2 \times Precision \times Sensitivity}{Precision + Sensitivity}$$

Furthermore, to establish the correspondence between true-positive rates and false-positive rates, the AUROC (AUC, area under the curve; ROC, receiver operating characteristic) is used to assess the relationship.
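These formulas translate directly into a small helper function; this is a generic implementation of the stated metrics, not the authors' evaluation code.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard metrics from confusion-matrix counts, as defined above."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    sensitivity = tp / (tp + fn)          # recall / true-positive rate
    specificity = tn / (tn + fp)
    f1 = 2 * precision * sensitivity / (precision + sensitivity)
    return dict(accuracy=accuracy, precision=precision,
                sensitivity=sensitivity, specificity=specificity, f1=f1)

print(classification_metrics(tp=90, tn=80, fp=10, fn=20))
```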

Results
The study conducts a series of experiments to identify IDC from small and low-resolution images. These experiments pertain to the deep CNN models that are separately trained using two distinct learning strategies: (i) the established transfer learning and (ii) the proposed Histo-Fusion learning. Results obtained by adopting these strategies are discussed as follows:

Transfer learning results
In transfer learning, the weights of the pre-trained deep CNN models are optimized to perform a 1000-class classification task on the ImageNet dataset. Compared to training deep CNNs from scratch, this approach has the advantage of achieving better accuracy besides economising on training time. Since these deep CNNs are trained on a different dataset (the ImageNet dataset of 1000 distinct classes), the last layer of each network is replaced with a SoftMax layer comprising two neurons for the classification of IDC images. The performance, in the form of accuracy scores, of the deep CNNs employed on the IDC dataset using the established transfer learning approach is presented in Table 5. An intra-tabular comparison of these scores reveals that DenseNet121 yielded the highest accuracy score of 0.9305, followed by ResNet50 with 0.9227. AlexNet attained an accuracy of 0.9041, the lowest among all the deep CNNs used. Various studies applying transfer learning with different deep CNNs to the classification of IDC patch samples have reported almost similar results (Table 9).
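The head replacement described above might look as follows in PyTorch/torchvision, with DenseNet121 as the example; the weight identifier follows current torchvision conventions, and nn.CrossEntropyLoss applies the softmax implicitly during training.

```python
import torch.nn as nn
from torchvision import models

# Load ImageNet-pretrained weights (1000-class head)...
model = models.densenet121(weights=models.DenseNet121_Weights.IMAGENET1K_V1)
# ...and replace the final layer with a two-neuron head for IDC- vs IDC+.
model.classifier = nn.Linear(model.classifier.in_features, 2)
# Training then proceeds on the IDC dataset only, unlike the
# two-stage Histo-Fusion procedure.
```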

Histo-Fusion learning results
As discussed earlier, the proposed Histo-Fusion learning operates in two stages: Histo-Fusion stage 1 and Histo-Fusion stage 2. In the first stage, each deep CNN employed in the study is trained from scratch on the high-resolution images of the BreakHis dataset. Table 6 shows the validation error rates (Section 3.2.1) incurred by each deep CNN model in Histo-Fusion stage 1.
In Histo-Fusion stage 2, training is carried out on the IDC images, where each model is initially trained through a partial network training strategy at a constant learning rate. Afterwards, the individual deep network architectures are trained using a layer-wise training strategy, providing an opportunity to use discriminative learning rates. The accuracy scores obtained using the partial network training strategy are fairly high, but those obtained by applying the layer-wise training strategy are significantly higher. A comparison of accuracy scores obtained by the different deep CNNs on the IDC dataset is brought out in Table 7, which points to a critical observation: the deeper the network, the greater the accuracy score accrued by applying this technique. The increment may be small in magnitude, yet it has significant clinical implications. Moreover, Table 7 reveals that the application of discriminative learning rates enables the deep CNN architectures to use different learning rates for different layers, eventually enabling each model to increase its accuracy score.
Performance scores obtained by the various deep CNNs after training using the Histo-Fusion approach are presented in detail in Table 8. In light of the data presented, it is evident that the proposed domain-specific learning, i.e., Histo-Fusion, has considerably improved the discriminating ability of each model, including AlexNet, the most primitive of the networks employed, by reporting superior metric scores compared to transfer learning, as illustrated in Fig. 6.
To measure the diagnostic ability of the various deep CNNs used in the study, the true-positive rates are plotted against the false-positive rates for each model in the form of a ROC curve (Fig. 7). This metric is significant in quantifying the diagnostic ability of each model across classification thresholds.

Comparison with related studies
In order to assess the effectiveness of the proposed Histo-Fusion method, a brief review of the methods used and the results achieved by other studies on the same dataset is presented in this section. For the sake of brevity, a comparison of the results obtained by these studies and those achieved in the present study is reproduced in Table 9. The procedures adopted in these studies have already been briefly discussed in the preceding paragraphs (Section 2 and Table 1). Studies based on conventional machine learning showed limited abilities to perform the required task, yielding lower accuracy and increased false-positive rates. [46] implemented a wide range of machine learning methods, the most efficient of which was the CatBoost method, which also yielded less accurate results, even lower than those obtained by simple CNN methods. On the other hand, studies employing deep learning methods for identifying IDC have reported comparatively better results. An ensemble of the pre-trained networks DenseNet121 and DenseNet169, implemented by [7], achieved superior results among the studies employing deep learning methods. As reflected in Table 9, the performance scores obtained using the proposed Histo-Fusion strategy are significantly superior to those achieved using the established transfer learning strategy and to those reported in the related studies.

Discussion
Most of the studies employing deep CNNs to classify IDC images have used the transfer learning approach [12,24,57]. A substantial decrease in false-positive rates has been observed with deep CNNs employing transfer learning compared to traditional CAD methods. However, an important observation is that, irrespective of the network depth, the diagnostic ability of these deep CNNs cannot be improved beyond a certain level using this technique (93.05%; Tables 5 and 9). The results obtained in the present study corroborate this fact (Tables 5 and 9 and Fig. 6). By using the proposed Histo-Fusion learning approach, the overall discriminating ability of each deep CNN is enhanced, and an appreciable increase in the accuracy score of each model is achieved (Table 8). This performance increase may be attributed to the positive transfer of features taking place across the two stages of Histo-Fusion. In the proposed strategy, each model initially carries out the feature engineering process on histopathological images of higher resolution. These high-resolution images present ample distinctive features for a model to perform feature extraction effectively. Subsequently, each deep CNN supplements its pool of learnt features by performing feature extraction on the small and low-resolution images. This fusion of histology features from disparate resolutions of histopathological images enables each deep CNN to identify IDC more precisely.

Table 9 Comparison of the performance of the proposed Histo-Fusion learning with related studies. In the table header, Sen., F1, Prec., and Acc. indicate sensitivity, F1-score, precision and accuracy respectively; '-' indicates that the specific value is not available and '*' represents balanced accuracy.

Conclusion
IDC is one of the most prevalent abnormalities affecting women's breast tissue worldwide. Histopathological analysis serves as the appropriate procedure for its diagnosis. This analysis is carried out by an expert pathologist and is a highly challenging task. The situation has worsened due to the increased patient-pathologist ratio on a global scale. Previously, CAD methods employing various image analysis techniques were conceived to help pathologists arrive at an accurate diagnosis. However, despite being effective, these methods are prone to high false-positive rates.
Deep learning methods utilizing transfer learning have recently yielded competitive results in the prognosis of breast malignancy from histopathological images. Unfortunately, this does not hold for datasets comprising smaller images of comparatively low resolution. The poor performance of transfer learning on these images may mainly be attributed to the negative-transfer problem. To address the issue, the study proposes a domain-specific learning method, Histo-Fusion, which is an efficient procedure for training deep CNN architectures to automatically classify small-sized and low-resolution histopathological images into IDC- and IDC+ categories. In Histo-Fusion learning, the deep CNNs are trained successively on two different datasets, BreakHis and IDC, each comprising histopathological images of breast tissue. The proposed Histo-Fusion strategy enables the deep CNNs to improve their ability to discriminate between positive and negative samples, thereby reducing false-positive rates. Discriminative learning rates have also been used to enhance the performance of the deep CNNs in the automatic classification task.
Funding The authors received no funding for this work.

Declarations
Competing interests The authors declare that there are no competing interests.