A Deep Neuro-Fuzzy Rule-Based System for Mammography Image Classification

doi:10.21203/rs.3.rs-1647197/v1


Research Article


This work is licensed under a CC BY 4.0 License

Version 1


Breast cancer has become one of the most significant diseases worldwide, and its mortality rate continues to rise. Early diagnosis is therefore very important for reducing mortality: the earlier breast cancer is detected, the more successfully it can be treated. The aim of this work is to calculate a breast cancer risk score using deep learning methods and a fuzzy rule-based system, so that people can take precautions against breast cancer. To this end, we describe a hybrid system consisting of two independent parts. In the first part, a deep neural network is trained to provide inputs to the fuzzy rule-based system. In the second part, the fuzzy rule-based system calculates the breast cancer risk score from the output of the deep neural network. We also propose a simple rule-based classifier that uses the calculated risk score to diagnose breast cancer. We compare the results of our model with existing neural network-based models and show that breast cancer can be diagnosed with high confidence using the proposed neuro-fuzzy rule-based system.

Index Terms: Computer aided diagnosis, Fuzzy systems, Hybrid intelligent systems, Image classification, Neural networks

Breast cancer has become one of the most significant diseases worldwide, and its mortality rate continues to rise. Breast cancer is caused by the uncontrolled growth of breast cells. A tumor formed by these cells can be seen on mammogram images or felt as a nodule. Such tumors fall into two classes: benign (non-cancerous) and malignant (cancerous) [1–2].

Early diagnosis of breast cancer is very important to reduce the mortality rate. However, there is a need for reliable and correct diagnostic methods for the early detection of breast cancer [3]. Mammography imaging is one of the most common radiological methods to diagnose breast cancer at an early stage. Making an accurate diagnosis of breast cancer based on mammography imaging results is very difficult due to the low specificity and high sensitivity of mammography images [4]. This may lead to some false-positive findings. Therefore, there is a need for a technique that reduces the number of false-positive findings while physicians interpret mammography results to diagnose breast cancer.

Recently, it has become very popular among doctors to use computer-aided diagnosis techniques in diagnosing breast cancer. This is because these computer-aided diagnosis techniques help physicians to interpret mammography images more accurately and minimize the number of false-positive findings [5]. There exists a variety of computer-aided diagnosis techniques used in medical imaging [6]. Deep learning methods and fuzzy systems have been the most successful of these methods in image analysis [7].

Deep learning models have been widely used in breast cancer classification [8–9]. However, most studies classify mammogram images directly from the output of a deep learning model to decide whether the patient has breast cancer [10–11]. In this study, we use fuzzy logic in addition to deep learning to further assess the patient's breast cancer risk. We determine the class of breast calcifications as malignant or benign by combining a classical Convolutional Neural Network (CNN) architecture with a rule-based fuzzy system. For this purpose, we build a hybrid system that works in two stages. In the first stage, the deep learning model receives the mammogram image as input and generates the probabilities that the image belongs to the malignant and benign classes. In the second stage, the rule-based fuzzy system receives these probability values and determines a breast cancer risk score. Moreover, we propose a simple rule-based classifier that receives the calculated risk score and determines whether the image belongs to the "benign" or "malignant" class.

In the deep learning part of our system, we test two different architectures to improve the inputs of the fuzzy rule-based system. For this purpose, we compare the classification success of ResNet50 and a classical CNN architecture [12–13]. We also compare our hybrid system with a pure deep learning model in terms of precision and F-score when classifying calcifications in mammography images as benign or malignant, using the BCDR-D02 dataset and a larger merged dataset [14].

There are some studies [15–22] that use fuzzy logic-based approaches in the diagnosis of breast cancer. Some of these studies [15], [18], [20–22] use only fuzzy logic or combine it with classical machine learning methods, while the others [16–17], [19] use neuro-fuzzy systems for breast cancer diagnosis. As an example, in [15], a Mamdani fuzzy expert system was proposed to decide the BIRADS category of mammogram images by using the size of the mass and calcification as input to the fuzzy inference system. In [20], features of breast mass smears obtained by Fine Needle Aspirate (FNA), such as area and circumference, are used as input to a fuzzy system to diagnose breast cancer. The authors used the Wisconsin Diagnostic Breast Cancer (WDBC) dataset and applied trapezoidal membership functions and the centroid (center of area) defuzzification method. In another study [18], a knowledge-based breast cancer classification system was developed by using Classification and Regression Trees (CART) to generate fuzzy rules. The Wisconsin Diagnostic Breast Cancer and mammography mass datasets were used to test the success of the knowledge-based system in diagnosing breast cancer. That study combines Expectation-Maximization, Principal Component Analysis, CART, and fuzzy rule-based methods, and shows that the use of clustering, noise removal, and fuzzy rule-based techniques gives good predictive accuracy for breast cancer.

In other studies, a risk score for breast cancer is calculated using fuzzy logic-based techniques. In [21], breast cancer risk is predicted using a well-selected fuzzy rule set that includes patient age and automatically extracted tumor features. The proposed system was tested on data from 60 patients, each record containing the patient's age, tumor segmentation results extracted from mammograms, and the clinical and pathological ground truth determined by experts. When the breast cancer risk calculated by the system was compared with the clinical truth, the fuzzy logic results were found to be consistent with it. In [22], symptoms of breast cancer, such as lumps, size, shape, papillary discharge and retraction, and skin changes in the breast, are analyzed and used as input to a fuzzy logic-based system to calculate a risk value.

Some other systems use a hybrid of neural networks with fuzzy logic to make breast cancer predictions from mammogram images. One such study is [16], which compared the performance of the Adaptive Neuro-Fuzzy Inference System (ANFIS) with a fuzzy inference system in diagnosing breast cancer using the Wisconsin dataset. The authors implemented different fuzzy inference systems with different membership functions and fuzzy inference models to compare the classification results with those of ANFIS. In [17], the researchers designed a neuro-fuzzy rule-based expert system for breast cancer diagnosis using a mammography mass dataset provided by UCI Machine Learning Repository. This neuro-fuzzy system consists of three-layer feed-forward architecture with an input layer, a hidden layer, and an output layer, where the hidden layer represents fuzzy rules. They developed an ex-DBC inference engine based on these neuro-fuzzy rules. Another study [19] used artificial intelligence technologies to extract fuzzy rules for breast cancer classification. They used a neuro-fuzzy classification tool called NEFCLASS to obtain strong fuzzy rules. The NEFCLASS fuzzy classifier was tested with the Mammographic Mass dataset.

According to recent research, fuzzy logic has been widely used in breast cancer diagnosis. However, to the best of our knowledge, there is no study using a rule-based neuro-fuzzy system that combines a deep neural network architecture with a fuzzy rule-based system for breast cancer diagnosis. In our model, we use mammogram images as input to the deep learning model, which generates probabilities for the image to belong to the “benign” or “malignant” class. Then, we use the output of the deep learning model as input to our rule-based fuzzy system to compute a breast cancer risk score. Then, our rule-based classifier determines the class label of the mammogram image based on the calculated risk score. In this study, we use a fuzzy inference system because it provides effective results based on uncertain data. The output of the fuzzy rule-based system can be used as an alternative method to assist experts in breast cancer diagnosis.

The decision-making process for selecting the most appropriate follow-up treatment for a suspected breast cancer case depends heavily on the correct diagnosis and assessment of breast cancer risk [23]. Therefore, we believe that our proposed deep-neuro fuzzy system can be used by experts to support their breast cancer diagnosis.

Our system is different from previous research in the literature because we use Deep Learning, while other neuro-fuzzy systems [16–17], [19] use classical neural network structures. In addition, we use mammogram images as input unlike studies in [15], [17], [19–22] which use symptoms of breast cancer, or some extracted features and patient's age. Our system is somewhat similar to [24], in which a hybrid neuro-fuzzy decision support system was developed for diagnosing myocardial perfusion in cardiac images. In [24], a backpropagation neural network model was used to classify SPECT images, while in our study, we use a neuro-fuzzy model to classify mammogram images. Similar to [24], we use a fuzzy approach to improve the success of the neural network model.

The rest of this paper is organized as follows: in the second section, we present the approach and the methods used in this research. Section 3 discusses the experimental results in detail. Finally, Section 4 concludes our study.

The main objective of this research is to design a hybrid system that calculates breast cancer risk scores with a deep neuro-fuzzy system, and to build a simple rule-based classifier that assigns class labels from the calculated risk scores. Such a hybrid model offers the health sector an alternative to the traditional methods of diagnosing breast cancer. If the disease can be diagnosed more accurately with the proposed method, the number of unnecessary biopsies can be reduced, along with their negative effects such as anxiety and cost.

Figure 1 shows the architecture of our proposed deep neuro-fuzzy system. This hybrid system consists of two separate parts: a deep neural network model and a rule-based fuzzy system. In the first stage, we train a deep neural network model using the BCDR-D02 dataset and the merged dataset, which combines images from the BCDR-D02 and mini-MIAS datasets with images obtained from our university hospital. We obtain two output variables from the deep neural network model. The first output is the probability that the mammogram image belongs to the benign class (NN output 1), and the second output is the probability that the mammogram image belongs to the malignant class (NN output 2). In the second stage, the output variables obtained from the deep learning model are used as input to the rule-based fuzzy system, which uses a set of fuzzy rules and fuzzy sets to evaluate the risk of breast cancer. Then, this risk value is used to classify the mammogram image more accurately. For example, if the computed breast cancer risk is 90%, the risk is high and the mammogram image is therefore classified as malignant.

The classical CNN and ResNet50 deep learning models trained on the Breast Cancer Digital Repository (BCDR-D02) dataset and the merged dataset are used in the proposed system. To evaluate how well the models detect breast abnormalities in mammogram images, we use only the images with calcification abnormalities from both datasets.

A. Deep Learning

Recently, deep learning has been widely used to diagnose breast cancer. In this study, we use deep learning methods to generate input variables for the fuzzy rule-based system. We start by training two different architectures, a classical CNN and ResNet50, with the BCDR-D02 dataset and the merged dataset. We chose these architectures because CNNs have been successfully used in breast cancer detection and the ResNet50 model is fast and provides higher accuracy in classifying mammograms [12], [25].

A simple CNN architecture, shown in Fig. 2, is used to obtain knowledge from mammogram images, and we compare its results with those of the ResNet50 model. This classical CNN model consists of 3 convolutional blocks, a global average pooling layer, and a sigmoid layer. Each of these blocks, referred to below as convolutional layers 1 to 3, consists of 2 convolutional layers and 1 max pooling layer. As shown in Fig. 2, a mammogram image is passed through convolutional layer 1 as an input matrix to extract features from the image. Convolutional layer 1 generates 2 feature maps using a 32x32 filter. Then the max pooling layer reduces the dimensionality of the feature maps using a 2x2 matrix. Convolutional layer 2 takes the output of convolutional layer 1 as input and generates feature maps using a 64x64 filter, after which max pooling again reduces the dimensionality of the feature maps. Similarly, convolutional layer 3 takes the output of convolutional layer 2 as input and generates feature maps using a 32x32 filter, followed by a final max pooling step. The global average pooling layer takes the output of convolutional layer 3, computes the mean of each feature map, and passes it to a sigmoid layer. The sigmoid layer produces a vector whose elements are the class probabilities; the class with the higher probability is the predicted class.
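The last two stages of the classical CNN described above (global average pooling followed by a sigmoid layer) can be illustrated with a minimal pure-Python sketch. The feature-map values, weights, and biases below are hypothetical toy numbers, not parameters of the trained model, and the helper names are chosen only for this illustration.

```python
import math


def global_average_pool(feature_maps):
    """Return the mean activation of each 2-D feature map."""
    return [
        sum(sum(row) for row in fmap) / (len(fmap) * len(fmap[0]))
        for fmap in feature_maps
    ]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def sigmoid_layer(features, weights, biases):
    """One independent sigmoid unit per class (benign, malignant)."""
    return [
        sigmoid(sum(w * f for w, f in zip(unit_weights, features)) + b)
        for unit_weights, b in zip(weights, biases)
    ]


# Two hypothetical 2x2 feature maps standing in for the output of the last block.
maps = [[[1.0, 3.0], [5.0, 7.0]], [[0.0, 0.0], [2.0, 2.0]]]
pooled = global_average_pool(maps)

# Hypothetical weights and biases for the benign and malignant units.
probs = sigmoid_layer(pooled, [[0.5, -1.0], [-0.5, 1.0]], [0.0, 0.5])
print(pooled, probs)
```

Because each class has its own sigmoid unit in this sketch, the two outputs need not sum to 1, which is consistent with the probability pairs listed in Table 5 (for example 0.36 and 0.89).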

The dataset used in this study contains only a few data instances (see subsection C), which are not sufficient to accurately train a deep neural network. Therefore, we use the pre-trained ResNet50 model, because transfer learning is a very effective method when the training dataset is small. Like the classical CNN model, we train the ResNet50 model with the BCDR-D02 dataset and the merged dataset to generate the input for the fuzzy rule-based system. As shown in Fig. 3, we use the original architecture of the ResNet50 model except for the last fully connected layer, where the number of nodes depends on the number of classes in the dataset. To adapt the ResNet50 architecture to our breast cancer dataset, we updated this last fully connected layer.

B. Fuzzy rule-based system

In this paper, the fuzzy system is used to determine the risk scores of mammography images. This expert system helps us to model how a doctor determines the risk of breast cancer. The fuzzy rule-based system consists of two input variables which are the outputs of the neural network model and one output variable which is the breast cancer risk.

Table I. The numerical range of the probability of belonging to malign/benign class for each neural network output

NN output (class label)	Max	Min	Mean
NN output 1 (Benign)	0.99	0.39	0.69
NN output 2 (Benign)	0.59	0.0001	0.30
NN output 2 (Malign)	0.99	0.53	0.76
NN output 1 (Malign)	0.38	0.008	0.19

First, the parameters affecting the risk calculations for breast cancer diagnosis were determined and the input/output variables were defined. Table 1 shows the maximum and minimum probability values calculated by the deep neural network for the malignant and benign classes. For example, according to the first row in Table 1, NN output 1 lies between 0.39 and 0.99 for images in the benign class. The ranges in Table 1 were considered when defining the input/output variables of the rule-based fuzzy system. The system consists of two inputs and one output: "probability of belonging to benign class (NN output 1)" and "probability of belonging to malignant class (NN output 2)" are the inputs, while "breast cancer risk" is the output of the fuzzy system.

The output of the fuzzy system (breast cancer risk) is in the range of [0, 100] and the inputs of the fuzzy system are in the range of [0, 1]. Since the triangular and trapezoidal membership functions are the most used and have the simplest shapes, they are selected as membership functions. As shown in Fig. 4, we use both triangular and trapezoidal membership functions to characterize NN-output 1 and NN-output 2. Similarly, both triangular and trapezoidal membership functions are used for breast cancer risk, as shown in Fig. 5. These membership functions are used to fuzzify the input and output variables of the rule-based fuzzy system.

The purpose of the fuzzy rule-based system is to reduce the number of unnecessary breast biopsies by calculating the risk of breast cancer, thereby eliminating the negative consequences of unnecessary breast biopsies such as cost and anxiety. First, appropriate numerical ranges are created for the system’s inputs and outputs. For each range, suitable linguistic expressions are found and assigned. All these ranges are validated with the help of a domain expert. As can be seen in Table 2, for each input there are three different linguistic variables "low", "medium" and "high" whose minimum and maximum values for benign and malignant classes are defined in the table.

Table II. Distribution of probability values obtained from the neural network model

Probability of each class	Min	Max	MEAN
NN output 1 (Benign) Low	0.1	0.4	0.25
NN output 1 (Benign) Medium	0.2	0.8	0.5
NN output 1 (Benign) High	0.6	0.9	0.75
NN output 2 (Malign) Low	0.1	0.4	0.25
NN output 2 (Malign) Medium	0.2	0.8	0.5
NN output 2 (Malign) High	0.6	0.9	0.75

Based on the training results of the neural network models, we obtained two different outputs. One is the probability that the image belongs to the malignant class (NN output 2); the other is the probability that the image belongs to the benign class (NN output 1). As listed in Table 2, there are three important points for NN output 2: 0.1 (lower bound), 0.5 (middle bound), and 0.9 (upper bound). Therefore, we have defined three different linguistic variables such as low, medium, and high for NN output 2. We have defined the low, high, and medium linguistic variables respectively in the range of 0.1 to 0.4, 0.6 to 0.9 and 0.2 to 0.8. When we examine the distribution of probabilities, we find that there are probabilities smaller than 0.1 and larger than 0.9. We include values smaller than 0.1 for low linguistic variable, and values larger than 0.9 for high linguistic variable by using a trapezoidal membership function. Thus, the degree of low linguistic variables is 1 for probabilities between 0 and 0.1, and as this probability grows from 0.1, the degree of low linguistic variables decreases. Thus, the degree of the high linguistic variable becomes 1 for all probabilities between 0.9 and 1. We use the triangular membership function for the medium linguistic variable to express ranges between 0.2 and 0.8. This is because at a probability of 0.5, the medium degree of membership is 1. While the mean of 0.1 and 0.4 establishes the lower bound of the triangular membership function of the medium linguistic variable, the mean of 0.6 and 0.9 yields the upper bound for the medium linguistic variable.

Therefore, as shown in Fig. 4, the trapezoidal membership function for the low linguistic variable, defined by a lower bound of 0.1 and an upper bound of 0.4, is calculated by using Eq. (1).

$$\mu_{low}(x)=\begin{cases} 0, & x>0.4 \\ \dfrac{0.4 - x}{0.3}, & 0.1 \leqslant x \leqslant 0.4 \\ 1, & x<0.1 \end{cases} \tag{1}$$

Similarly, the trapezoidal membership function for the high linguistic variable in Fig. 4 is defined by a lower bound of 0.6 and an upper bound of 0.9, and it is calculated by using Eq. (2).

$$\mu_{high}(x)=\begin{cases} 0, & x<0.6 \\ \dfrac{x - 0.6}{0.3}, & 0.6 \leqslant x \leqslant 0.9 \\ 1, & x>0.9 \end{cases} \tag{2}$$

The triangular membership function of the medium linguistic variable in Fig. 4 is defined by a lower bound as 0.2, an upper bound as 0.8, and a value m where 0.2 < m < 0.8. We calculate the medium degree of membership by using the following equation:

$$\mu_{medium}(x)=\begin{cases} 0, & x \leqslant 0.2 \\ \dfrac{x - 0.2}{m - 0.2}, & 0.2<x \leqslant m \\ \dfrac{0.8 - x}{0.8 - m}, & m<x<0.8 \\ 0, & x \geqslant 0.8 \end{cases} \tag{3}$$
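The three input membership functions of Eqs. (1) to (3) can be written down directly as a small sketch. The peak m = 0.5 is taken from the statement above that the medium degree of membership is 1 at a probability of 0.5; the function names and the `fuzzify` helper are illustrative only and are not taken from the authors' MATLAB implementation.

```python
def mu_low(x):
    """Eq. (1): 1 below 0.1, falling linearly to 0 at 0.4."""
    if x < 0.1:
        return 1.0
    if x <= 0.4:
        return (0.4 - x) / 0.3
    return 0.0


def mu_high(x):
    """Eq. (2): 0 below 0.6, rising linearly to 1 at 0.9."""
    if x < 0.6:
        return 0.0
    if x <= 0.9:
        return (x - 0.6) / 0.3
    return 1.0


def mu_medium(x, m=0.5):
    """Eq. (3): triangle on (0.2, 0.8) with its peak at m."""
    if x <= 0.2 or x >= 0.8:
        return 0.0
    if x <= m:
        return (x - 0.2) / (m - 0.2)
    return (0.8 - x) / (0.8 - m)


def fuzzify(x, m=0.5):
    """Membership degree of one network output in each linguistic term."""
    return {"low": mu_low(x), "medium": mu_medium(x, m), "high": mu_high(x)}


print(fuzzify(0.8))  # benign probability used in the worked example
print(fuzzify(0.2))  # malignant probability used in the worked example
```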

On the other hand, in Fig. 5, we define the risk parameter with 5 different linguistic variables such as very low, low, medium, high, and very high. The risk of 50% is the middle point to define the risk of breast cancer as low or high. When the risk is lower than 50%, we can express the range [0, 50] with the linguistic variables low and very low. Similarly, if the risk is higher than 50%, then we define the range [50, 100] with the linguistic variables high and very high. And if the risk is 50%, then we can define the risk with medium linguistic variable. Table 3 shows the range of each linguistic variable for risk score. We use trapezoidal membership functions to include values between 0 and 10 and between 90 and 100. Triangular membership functions are used for risk values between 10 and 90.

Table III. Ranges for breast cancer risk score

RISK	Min	Max	MEAN
Very Low	0	10	5
Low	10	40	25
Medium	25	75	50
High	60	90	75
Very High	90	100	85

Algorithm 1 Fuzzy rules for breast cancer risk

Finally, the risk of breast cancer is determined as "very low", "low", "medium", "high", or "very high" by applying the rules in Alg. 1. As shown in Alg. 1, this expert system uses 9 rules to calculate the risk values for breast cancer. These fuzzy rules were determined with the help of a domain expert, a physician. In our system, we give the probabilities NN output 1 and NN output 2 as input to the fuzzy inference system, which calculates the risk values using the fuzzy rules in Alg. 1. Then, the defuzzification process is applied to convert the fuzzy risk variable into a numerical value. We calculate the numerical risk values by using the centre of area defuzzification method. The midpoint of area method finds the vertical line that divides the area under the curve into two equal areas. The defuzzified value, or numerical risk value, $x^*$ is calculated by applying Eq. (4), where $\alpha = \min\{x \mid x \in X\}$ and $\beta = \max\{x \mid x \in X\}$.

$$\int_{\alpha}^{x^*} \mu_A(x)\,dx=\int_{x^*}^{\beta} \mu_A(x)\,dx \tag{4}$$
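Eq. (4) can be approximated numerically by accumulating thin slices of area until half of the total is reached. The sketch below is one simple way to do this; the function name `bisector`, the midpoint-rule discretisation, and the toy triangular set are assumptions made for illustration and do not describe how the MATLAB implementation works internally.

```python
def bisector(mu, alpha, beta, steps=100_000):
    """Approximate x* with equal area under mu on [alpha, x*] and [x*, beta]."""
    dx = (beta - alpha) / steps
    # Midpoint-rule area of each thin slice under the membership curve.
    areas = [mu(alpha + (i + 0.5) * dx) * dx for i in range(steps)]
    half = sum(areas) / 2.0
    running = 0.0
    for i, area in enumerate(areas):
        running += area
        if running >= half:
            return alpha + (i + 1) * dx
    return beta


def triangle(x):
    """Toy symmetric fuzzy set on the risk axis [0, 100], peaking at 50."""
    return max(0.0, 1.0 - abs(x - 50.0) / 50.0)


x_star = bisector(triangle, 0.0, 100.0)
print(x_star)
```

For a symmetric set such as this toy triangle, the dividing line falls at the centre of the set.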

In the following, we give a simple example of how to obtain risk values with fuzzy rules: Suppose the probability values computed by the deep neural network are as follows: Benign probability = 0.8, and Malignant probability = 0.2. According to Fig. 6, the benign probability is in the range of high linguistic variable, so the membership degree of the benign input is calculated by using Eq. 2 as follows:

$$\mu_{high}(0.8) = (0.8 - 0.6)/0.3 = 2/3$$

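According to Fig. 7, the malignant probability is in the range of the low linguistic variable, so the membership degree of the malignant input is calculated by using Eq. (1) as follows:

$$\mu_{low}(0.2) = (0.4 - 0.2)/0.3 = 2/3$$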

In this case, Rule 7 in Alg. 1 is applied, which reads, "If benign is high and malignant is low, the risk is very low." Since the fuzzy rule contains the operator AND, the membership degree for very low is the minimum value of the membership degrees of the inputs benign and malignant, which is calculated as follows:

$$\mu_{very\text{-}low} = \min\left(2/3,\ 2/3\right) = 2/3$$

The very low output set, clipped at 2/3, is then split into a rectangle (area $A_1$, centroid $(X_1, Y_1)$) and a triangle (area $A_2$, centroid $(X_2, Y_2)$), and the centroid $(X, Y)$ of the combined shape is computed:

$$\begin{array}{ll} A_1 = 15 \times 2/3 = 10 & A_2 = 10 \times 2/3 \times 1/2 = 10/3 \\ X_1 = 15/2 = 7.5 & X_2 = 15 + (10 \times 1/3) = 55/3 \\ Y_1 = 2/3 \times 1/2 = 1/3 & Y_2 = 1/3 \times 2/3 = 2/9 \end{array}$$

$$X = \frac{A_1 X_1 + A_2 X_2}{A_1 + A_2} = \frac{75 + 550/9}{40/3} \approx 10.2$$

$$Y = \frac{A_1 Y_1 + A_2 Y_2}{A_1 + A_2} = \frac{10/3 + 20/27}{40/3} \approx 0.3$$

Therefore, the risk score is 10.2.
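The arithmetic of this worked example can be checked with exact fractions. The sketch below assumes, based only on the numbers used in the example, that the "very low" output set has membership 1 on [0, 10] and falls linearly to 0 at 25; this shape is an inference from the worked values, and the variable names are illustrative.

```python
from fractions import Fraction

# Firing strength of "benign is high AND malignant is low" (min operator).
strength = min(Fraction(2, 3), Fraction(2, 3))

# Assumed "very low" output set: 1 on [0, 10], falling linearly to 0 at 25.
plateau_end, foot = Fraction(10), Fraction(25)

# Clipping at `strength` leaves a rectangle up to x_clip and a triangle after it.
x_clip = foot - strength * (foot - plateau_end)

a1 = x_clip * strength                    # rectangle area
a2 = (foot - x_clip) * strength / 2       # triangle area
x1 = x_clip / 2                           # rectangle centroid (x coordinate)
x2 = x_clip + (foot - x_clip) / 3         # triangle centroid (x coordinate)

# Area-weighted mean of the two centroids.
risk = (a1 * x1 + a2 * x2) / (a1 + a2)
print(float(risk))
```

Under the same assumed shape, the equal-area line of Eq. (4) would fall at x = 10, because half of the total area 40/3 is reached inside the rectangle of height 2/3; the reported value of 10.2 therefore matches the area-weighted centroid computed here.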

After obtaining the numerical risk score value, we apply our simple rule-based classifier to determine the class label of the image. Our classifier labels an image as benign if the risk score is less than or equal to 50, otherwise it is classified as malignant. We have developed this classification rule with the help of an expert. According to our classifier, our toy example where the risk score is 10.2, is classified as benign.
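The classification rule described above amounts to a single threshold on the risk score. A minimal sketch, with a hypothetical function name and the threshold of 50 taken from the text:

```python
def classify(risk_score, threshold=50.0):
    """Label an image from its risk score: benign if risk <= threshold."""
    return "benign" if risk_score <= threshold else "malignant"


print(classify(10.2))  # the toy example above
```

Note that a risk score of exactly 50 is labeled benign under this rule.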

C. Datasets

BCDR-D02 Dataset:

In this research, we use the Breast Cancer Digital Repository (BCDR-DM) to evaluate the performance of the proposed system. The BCDR-DM consists of images from Portuguese patients. We used the BCDR-D02 dataset to classify only calcification abnormalities. The BCDR-D02 database, downloaded from [14], consists of 397 benign and 42 malignant mammogram images. The number of images in the benign class is larger than in the malignant class. We balance the number of instances in each class by applying undersampling [26]. So, we randomly select 42 benign images to have the same number of instances in each class. In addition, we increase the number of mammogram images by combining flipping with rotation transformations of 0, 90, 180, and 270 degrees [27]. In this way, we create 8 new instances for each resulting mammogram image and end up with 336 malignant and 336 benign images.
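The augmentation step (flipping combined with rotations of 0, 90, 180, and 270 degrees) can be sketched on an image stored as a nested list. Which flip axis the authors used is not stated, so the horizontal mirror below is an assumption, and the helper names are illustrative.

```python
def rotate90(img):
    """Rotate a 2-D list of pixels by 90 degrees clockwise."""
    return [list(row) for row in zip(*img[::-1])]


def flip_horizontal(img):
    """Mirror the image left to right (assumed flip axis)."""
    return [row[::-1] for row in img]


def augment(img):
    """Return 8 variants: 4 rotations of the image and 4 of its mirror."""
    variants = []
    for base in (img, flip_horizontal(img)):
        current = base
        for _ in range(4):  # 0, 90, 180, 270 degrees
            variants.append(current)
            current = rotate90(current)
    return variants


toy = [[1, 2], [3, 4]]
views = augment(toy)
print(len(views))
```

With 42 images per class after undersampling, 8 variants per image give the 42 x 8 = 336 images per class reported above.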

Merged Dataset:

The merged dataset contains images from BCDR-D02, mini-MIAS, and our hospital dataset. The hospital dataset was collected from patients' mammogram images at the Department of General Surgery, Faculty of Medicine, Cukurova University, with the help of doctors and experts at the hospital. It contains 74 malignant and 32 benign mammogram images.

The merged dataset also includes images from the mini-MIAS dataset [29], which contains 322 mammogram images. From these, we use 20 images of calcifications, 10 malignant and 10 benign.

In the merged dataset we have a total of 126 malignant and 439 benign images. We balance the number of instances in each class by applying undersampling [26]. Thus, we randomly select 126 benign images to have the same number of instances in each class. We also increase the number of mammogram images by combining flipping with rotation transformations of 0, 90, 180, and 270 degrees [27]. In this way, we create 8 new instances for each resulting mammogram image and eventually obtain 1008 malignant and 1008 benign images.
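Random undersampling of the majority class can be sketched as follows. The image identifiers are placeholders, the fixed seed is used only to make the illustration reproducible, and the function name is hypothetical.

```python
import random


def undersample(majority, minority, seed=0):
    """Randomly keep as many majority-class items as there are minority items."""
    rng = random.Random(seed)
    return rng.sample(majority, len(minority)), list(minority)


# Placeholder identifiers for the 439 benign and 126 malignant merged images.
benign_ids = [f"benign_{i}" for i in range(439)]
malignant_ids = [f"malignant_{i}" for i in range(126)]

benign_kept, malignant_kept = undersample(benign_ids, malignant_ids)
print(len(benign_kept), len(malignant_kept))
```

Each of the 126 images kept per class then yields 8 augmented instances, giving the 1008 images per class stated above.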

D. Evaluation Metrics

We use accuracy, precision, recall, and F-score metrics for evaluating the proposed method. These metrics are computed by using the confusion matrix [28] given in Table 4, which consists of the values for true positive (TP), true negative (TN), false positive (FP), and false negative (FN).

Table IV. A Confusion Matrix

Actual class	Predicted class = YES	Predicted class = NO
Class = YES	True Positive (TP)	False Negative (FN)
Class = NO	False Positive (FP)	True Negative (TN)

According to Table 4, TP is the number of samples of the positive class (class = YES) that were correctly predicted, TN is the number of samples of the negative class (class = NO) that were correctly predicted, FP is the number of instances of the negative class that were predicted to be in the positive class, and FN is the number of instances of the positive class that were predicted to be in the negative class.

We use accuracy, precision, recall, and F-score values, which can be computed from the TP, TN, FP, and FN values, to evaluate the performance of the proposed system. Accuracy is the ratio of correctly predicted observations to total observations, and it is calculated according to Eq. (5). Precision is the ratio of correctly predicted positive observations to total predicted positive observations, and Eq. (6) is used to compute it. Recall is the ratio of correctly predicted positive observations to all observations in the actual positive class, and it is calculated according to Eq. (7). F-score, which is the harmonic mean of precision and recall, is given in Eq. (8).

$$Accuracy=\frac{TP+TN}{TP+TN+FP+FN} \tag{5}$$

$$Precision=\frac{TP}{TP+FP} \tag{6}$$

$$Recall=\frac{TP}{TP+FN} \tag{7}$$

$$F\text{-}score=\frac{2 \times \left( Recall \times Precision \right)}{Recall+Precision} \tag{8}$$
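Eqs. (5) to (8) translate directly into a few lines of code. As an illustration, the sketch below applies them to the confusion-matrix counts reported for the baseline CNN in Table 6 (TP = 238, FN = 98, FP = 92, TN = 244); the function name is illustrative, and no guard against empty classes is included.

```python
def metrics(tp, fn, fp, tn):
    """Accuracy, precision, recall, and F-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f_score = 2 * recall * precision / (recall + precision)
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f_score": f_score,
    }


# Baseline CNN counts as listed in Table 6.
cnn = metrics(tp=238, fn=98, fp=92, tn=244)
print({name: round(value, 2) for name, value in cnn.items()})
```

Rounded to two decimals, these values agree with the CNN column of Table 7.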

In this study we calculate the risk of breast cancer by combining a deep neural network with fuzzy approach, and then we use this risk score to label mammogram images as benign or malignant. To show the success of the proposed risk model for diagnosing breast cancer, we train two different deep neural network models: a classical CNN and ResNet50 to provide inputs to the rule-based fuzzy system that calculates the risk score. The fuzzy rule-based system was implemented in MATLAB. The CNN and ResNet50 architectures are also used as baseline models to compare the classification results of our hybrid system.

We used the BCDR-D02 dataset and the merged dataset to test the performance of the hybrid model and the baseline models. The total number of malignant images is less than that of benign images in these datasets. We apply the method of undersampling to solve this problem of unbalanced data. After balancing the dataset, we divide the dataset into training and test groups. For the experiments, we apply 5-fold cross validation method. After computing the risk scores, each image in the test set is labeled by applying our simple rule-based classifier. Then, the accuracy, precision, recall, and F-score values of our proposed system are calculated and compared with the values obtained when only the baseline CNN and ResNet50 deep neural network architectures are used to classify the test images.
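A 5-fold split of the balanced datasets can be sketched as below. How the folds were actually drawn (shuffling, seed, stratification by class) is not described, so the shuffled round-robin assignment here is only an assumption for illustration.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Yield (train, test) index lists for k-fold cross validation."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # round-robin assignment to folds
    for i in range(k):
        test = folds[i]
        train = [j for fold in folds[:i] + folds[i + 1:] for j in fold]
        yield train, test


bcdr_fold_sizes = [len(test) for _, test in k_fold_indices(672)]
merged_fold_sizes = [len(test) for _, test in k_fold_indices(2016)]
print(bcdr_fold_sizes, merged_fold_sizes)
```

With 672 and 2016 balanced images, each test fold holds roughly 134 and 403 images, respectively.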

In this study, we also compare the performances of two Deep Learning architectures, namely CNN and ResNet50, so we repeat the experiments for both architectures. In our balanced datasets, we have a total of 672 and 2016 images, (i.e., 134 and 403 test cases), for BCDR-D02 and the merged dataset, respectively. Table 5 shows the risk values computed by our proposed system to give an idea of the results of our deep neuro fuzzy system. To save space, the risk values for 7 randomly selected test cases are given in Table 5, where we report the probabilities of malignant and benign classes obtained by training the classical CNN architecture, and the computed risk percentages for these samples. If the probability value of the benign class is lower than the probability value of the malignant class, the risk percentage of that test case is higher. For example, in Table 5, for case 1, the probability of belonging to the benign class is 0.36, while this probability of the malignant class is 0.89, so the risk is calculated as 76%. Then, if the risk value is greater than 50%, the case is labeled as malignant, otherwise it is labeled as benign. So, case 1 is labeled by our risk-based system as malignant.

Table V. Breast cancer risk results using CNN

Mammogram image	Benign	Malignant	Risk %
Case 1	0.36	0.89	76
Case 2	0.52	0.35	47
Case 3	0.74	0.63	34
Case 4	0.48	0.41	50
Case 5	0.92	0.71	44
Case 6	0.87	0.69	41
Case 7	0.32	0.95	79
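
The labeling rule applied to the risk values can be written in a few lines. The sketch below assumes only the 50% threshold stated in the text and takes the risk percentages from Table V as given; it does not reproduce the fuzzy inference that computes those risks. The names `RISK_THRESHOLD` and `label_from_risk` are hypothetical.

```python
RISK_THRESHOLD = 50  # percent; a case is malignant only if its risk exceeds this

def label_from_risk(risk_percent, threshold=RISK_THRESHOLD):
    """Rule-based classifier: malignant if risk > threshold, otherwise benign."""
    return "malignant" if risk_percent > threshold else "benign"

# Risk percentages of cases 1-7 as reported in Table V.
table_v_risks = [76, 47, 34, 50, 44, 41, 79]
labels = [label_from_risk(risk) for risk in table_v_risks]
```

Because the comparison is strict, case 4 (risk exactly 50%) is labeled benign, while cases 1 and 7 are labeled malignant.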

We compute the accuracy, recall, precision, and F-score values for the risk-based classification model and compare them with the results of the deep neural network architectures, the classical CNN and ResNet50. For the BCDR-D02 dataset, Table 6 presents the confusion matrices and Table 7 the accuracy, precision, recall, and F-score of all the tested models. The best values are shown in bold in the tables.

Table VI. Confusion matrices of CNN, ResNet50 and risk score based hybrid deep neuro-fuzzy model for BCDR-D02 dataset

Actual Label	Predicted Label
	CNN		CNN + Fuzzy		ResNet50		ResNet50 + Fuzzy
	yes	no	yes	no	yes	no	yes	no
yes	238	98	214	132	294	42	291	45
no	92	244	46	290	48	288	41	295

Table VII. Results of CNN, ResNet50 and risk score based hybrid deep neuro-fuzzy model for BCDR-D02 dataset

	CNN	CNN + Fuzzy	ResNet50	ResNet50 + Fuzzy
Accuracy	0.72	0.75	0.87	0.87
Recall	0.71	0.64	0.88	0.87
Precision	0.72	0.82	0.86	0.88
F-Score	0.71	0.72	0.87	0.87
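
The four measures follow directly from the confusion-matrix counts. The sketch below uses the standard definitions and assumes that "yes" in Table VI denotes the malignant (positive) class; the function name and dictionary keys are hypothetical. It is applied here to the ResNet50 + Fuzzy column of Table VI.

```python
def classification_metrics(tp, fn, fp, tn):
    """Accuracy, recall, precision and F-score from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + fn + fp + tn)
    recall = tp / (tp + fn)        # share of actual positives that are found
    precision = tp / (tp + fp)     # share of predicted positives that are correct
    f_score = 2 * precision * recall / (precision + recall)
    return {
        "accuracy": accuracy,
        "recall": recall,
        "precision": precision,
        "f_score": f_score,
    }

# ResNet50 + Fuzzy counts from Table VI (rows: actual label, columns: predicted label).
metrics = classification_metrics(tp=291, fn=45, fp=41, tn=295)
rounded = {name: round(value, 2) for name, value in metrics.items()}
# rounded reproduces the ResNet50 + Fuzzy column of Table VII:
# accuracy 0.87, recall 0.87, precision 0.88, F-score 0.87
```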

As can be seen in Table 7, when we use the baseline CNN model to classify mammogram images, we obtain 72% accuracy, 71% recall, 72% precision, and 71% F-score. When we instead use the same CNN model to compute the inputs of the fuzzy rule-based system (i.e., CNN + Fuzzy), we obtain 75% accuracy, 64% recall, 82% precision, and 72% F-score. Our hybrid model therefore achieves higher accuracy, precision, and F-score than the classical CNN model, at the cost of lower recall. Comparing the baseline models in Table 7, the values obtained using ResNet50 are higher, which shows that the pre-trained ResNet50 model improves the classification results compared to the classical CNN.

Combining a deep neural network model with the rule-based fuzzy system improves precision for both the CNN and ResNet50 architectures. We obtain the best results when the ResNet50 architecture is used with the rule-based fuzzy system: this hybrid gives the highest precision (88%), while the precision of the pure CNN and ResNet50 models is 72% and 86%, respectively. Its F-score (87%) is also higher than that of the pure CNN (71%) and equal to that of the pure ResNet50. Taken together, Tables 6 and 7 show that our proposed risk score-based hybrid deep neuro-fuzzy model outperforms the pure CNN and ResNet50 deep learning models in precision while matching or exceeding their accuracy and F-score, which indicates that using a fuzzy rule-based system improves the precision of breast cancer diagnosis. The fuzzy rule-based system improves the predictions of the CNN and ResNet50 models especially for benign cases, which helps to reduce unnecessary biopsies (see Table 6).

Table VIII. Confusion matrices of CNN, ResNet50 and risk score based hybrid deep neuro-fuzzy model for the Merged dataset

Actual Label	Predicted Label
	CNN		CNN + Fuzzy		ResNet50		ResNet50 + Fuzzy
	yes	no	yes	no	yes	no	yes	no
yes	677	331	622	386	910	98	876	132
no	398	610	272	736	134	874	91	917

Table IX. Results of CNN, ResNet50 and risk score based hybrid deep neuro-fuzzy model for the Merged dataset

	CNN	CNN + Fuzzy	ResNet50	ResNet50 + Fuzzy
Accuracy	0.64	0.67	0.88	0.89
Recall	0.67	0.62	0.90	0.87
Precision	0.63	0.70	0.87	0.91
F-Score	0.65	0.66	0.88	0.89

Tables 8 and 9 show the experimental results of our risk-based hybrid system (i.e., CNN + Fuzzy and ResNet50 + Fuzzy) and of the classical CNN and ResNet50 models on the merged dataset. The results are similar to those obtained with the BCDR-D02 dataset. Again, ResNet50 has better prediction performance than CNN, and using the fuzzy rule-based system with these deep learning models improves prediction especially for benign cases, which reduces unnecessary biopsies. With the larger merged dataset, the ResNet50-based models score equal or higher on accuracy, recall, precision, and F-score, whereas the CNN-based models score lower than on BCDR-D02 (compare Tables 7 and 9).

The fuzzy approach has the advantage of handling uncertainty well and expressing it in real numbers. In this study, a fuzzy rule-based system was applied to the outputs of trained deep neural network models in order to reduce the number of unnecessary biopsies in breast cancer diagnosis. Our aim was also to compare the baseline models with the proposed hybrid model, which combines a deep neural network, a fuzzy approach, and a rule-based classifier, and to determine the best method in terms of classification accuracy, precision, recall, and F-score. We used the BCDR-D02 dataset and the merged dataset (BCDR-D02 + mini-MIAS + hospital data) to test the performance of the fuzzy rule-based system in diagnosing breast cancer. We also trained two deep learning models, a classical CNN and ResNet50, as baselines against which to measure the effectiveness of the deep neuro-fuzzy rule-based system in classifying mammogram images. We used a pre-trained ResNet50 model because the BCDR-D02 dataset and the merged dataset are quite small for training a deep neural network, and more data would be needed to improve a model trained from scratch.

First, we tested the classification performance of the classical CNN and ResNet50 baseline models on the BCDR-D02 and merged datasets. As a baseline model, ResNet50 improved the classification performance compared to the classical CNN. Then, we tested the performance of the hybrid model, which is based on a fuzzy approach and uses the outputs of the CNN and ResNet50 models as inputs to the risk model, and compared its results with those of the baseline models. We show that our hybrid neuro-fuzzy system using ResNet50 as the deep neural network model has the highest precision and F-score values.

Consequently, we found that combining the fuzzy rule-based system with the deep learning models gives more precise mammogram classification than using pure deep learning. Our study shows that breast cancer can be diagnosed more accurately by using the fuzzy rule-based system. We believe that this fuzzy rule-based system can help to reduce the number of unnecessary biopsies and thus reduce negative impacts on patients such as anxiety, cost, and unnecessary surgery.

In the future, we plan to extend this study by using different neural network architectures and fuzzy rule bases to help physicians in breast cancer decision making.

Ethics approval and consent to participate:

Disclosure of potential conflicts of interest: No conflicts of interest.

Consent for publication: All authors whose names appear on the submission approved the version to be published.

Availability of data and materials: Two of the datasets analysed during the current study are publicly available: the Breast Cancer Digital Repository (BCDR) [14] and mini-MIAS [http://peipa.essex.ac.uk/info/mias.html]. Our hospital dataset is available from the corresponding author on reasonable request.

Competing interests: The authors have no relevant financial or non-financial interests to disclose.

Authors' contributions: The concept was proposed by Pınar Uskaner Hepsağ, Selma Ayşe Özel, and Adnan Yazıcı. The first draft and literature search were by Pınar Uskaner Hepsağ. The review was revised by Selma Ayşe Özel and Adnan Yazıcı. All authors commented on the previous versions and approved the final manuscript.

Funding: This work was supported by Scientific Research Project Unit of Çukurova University [grant number FDK-2016-6931]; and Nazarbayev University (Kazakhstan) Faculty-development competitive research [grant number FY2019-FGP-1-STEMM].

[1] UA Irshath, M Yokesh, MV Rao, “Awareness of breast cancer and breast self-examination among female students in South Chennai”, Drug Invention Today, vol. 11, no. 11, pp. 2850–2853, 2019.
[2] BF Binhussien, M Ghoraba, “Awareness of breast cancer screening and risk factors among Saudi females at family medicine department in security forces hospital, Riyadh”, Journal of Family Medicine and Primary Care, vol. 7, no. 6, pp. 1283–1287, 2018. doi: 10.4103/jfmpc.jfmpc_286_18
[3] VL Irvin, Z Zhang, MS Simon, RT Chlebowski, SW Luoh, et al., “Comparison of Mortality Among Participants of Women’s Health Initiative Trials with Screening-Detected Breast Cancers vs Interval Breast Cancers”, JAMA Network Open, vol. 3, no. 6, pp. e207227-e207227, 2020. doi: 10.1001/jamanetworkopen.2020.7227
[4] MV Prummel, D Muradali, R Shumak, V Majpruz, P Brown, et al., “Digital Compared with Screen-Film Mammography: Measures of Diagnostic Accuracy among Women Screened in the Ontario Breast Screening Program”, Radiology, vol. 278, no. 2, pp. 365–373, 2016. doi: 10.1148/radiol.2015150733
[5] C Dromain, B Boyer, R Ferre, S Canale, S Delaloge, et al., “Computed-aided diagnosis (CAD) in the detection of breast cancer”, European Journal of Radiology, vol. 82, no. 3, pp. 417–423, 2013. doi: 10.1016/j.ejrad.2012.03.005
[6] V Goreke, E Uzunhisarcikli, B Oztoprak, “Type-2 fuzzy inference system design for computer aided detection in mammogram image”, 2016 Medical Technologies National Congress (TIPTEKNO), pp. 1–5, 2016. doi: 10.1109/TIPTEKNO.2016.7863068
[7] E Uzunhisarcikli, V Göreke, SARI Vekil, “Comparison of type-2 fuzzy inference method and deep neural networks for mass detection from breast ultrasonography images”, Cumhuriyet Science Journal, vol. 41, no. 4, pp. 968–975, 2020. doi: 10.17776/csj.691683
[8] L Shen, LR Margolis, JH Rothstein, E Fluder, R McBride, et al., “Deep learning to improve breast cancer detection on screening mammography”, Scientific Reports, vol. 9, no. 1, pp. 1–12, 2019. doi: 10.1038/s41598-019-48995-4
[9] D Lévy, A Jain, “Breast mass classification from mammograms using deep convolutional neural networks”, 30th Conference on Neural Information Processing Systems (NIPS 2016), arXiv:1612.00542 [cs.CV]
[10] MM Jadoon, Q Zhang, IU Haq, S Butt, A Jadoon, “Three-class mammogram classification based on descriptive CNN features”, BioMed Research International, vol. 2017, pp. 1–11, 2017. doi: 10.1155/2017/3640901
[11] PU Hepsağ, SA Özel, A Yazıcı, “Using deep learning for mammography classification”, 2017 International Conference on Computer Science and Engineering (UBMK), pp. 418–423, 2017. doi: 10.1109/UBMK.2017.8093429
[12] L Wenzhong, L Hualan, H Caijian, Z Liangjun, “Classifications of breast cancer images by deep learning”, medRxiv, pp. 13, 2020. doi: 10.1101/2020.06.13.20130633
[13] A Khan, A Sohail, U Zahoora, AS Qureshi, “A survey of the recent architectures of deep convolutional neural networks”, Artificial Intelligence Review, vol. 53, no. 8, pp. 5455–5516, 2020. doi: 10.1007/s10462-020-09825-6
[14] MG Lopez, N Posada, DC Moura, RR Pollán, JMF Valiente, et al., “BCDR: a breast cancer digital repository”, 15th International Conference on Experimental Mechanics, vol. 1215, 2012.
[15] WNA Baharuddin, RI Hussain, SNHS Abdullah, N Fitri, A Abdullah, A, “Mamdani-fuzzy expert system for BIRADS breast cancer determination based on mammogram images”, Communications in Computer and Information Science, vol. 378, pp. 99–110, 2013. doi: 10.1007/978-3-642-40567-9_9
[16] M Arora, D Tagra, “Neuro-fuzzy expert system for breast cancer diagnosis”, in Proceedings of the International Conference on Advances in Computing, Communications and Informatics, pp. 979–985, 2012. doi: 10.1145/2345396.2345554
[17] A Keleş, A Keleş, U Yavuz, “Expert system based on neuro-fuzzy rules for diagnosis breast cancer”, Expert Systems with Applications, vol. 38, no. 5, pp. 5719–5726, 2011. doi: 10.1016/j.eswa.2010.10.061
[18] M Nilashi, O Ibrahim, H Ahmadi, L Shahmoradi, “A knowledge-based system for breast cancer classification using fuzzy logic method”, Telematics and Informatics, vol. 34, no. 4, pp. 133–144, 2017. doi: 10.1016/j.tele.2017.01.007
[19] A Keleş, A Keleş, “Extracting fuzzy rules for the diagnosis of breast cancer”, Turkish Journal of Electrical Engineering & Computer Sciences, vol. 21, no. 5, pp. 1495–1503, 2013. doi: 10.3906/elk-1012-938
[20] GR Sizilio, CR Leite, AM Guerreiro, ADD Neto, “Fuzzy method for pre-diagnosis of breast cancer from the Fine Needle Aspirate analysis”, BioMedical Engineering OnLine, vol. 11, no. 1, pp. 1–21, 2012. doi: 10.1186/1475-925X-11-83
[21] V Balanică, I Dumitrache, M Caramihai, W Rae, C Herbst, “Evaluation of breast cancer risk by using fuzzy logic”, University Politehnica of Bucharest Scientific Bulletin, Series C, vol. 73, no. 1, pp. 53–64, 2011.
[22] Sahria, I Mandang, “Diagnosis study of carcinoma mammae (breast cancer) disease using fuzzy logic method”, in Journal of Physics: Conference Series, vol. 1277, no. 1, pp. 012039, 2019. doi: 10.1088/1742-6596/1277/1/012039
[23] AV Senthil Kumar (Ed.), “Fuzzy expert systems for disease diagnosis”, pp. 200–246, IGI Global, 2015. doi: 10.4018/978-1-4666-7240-6
[24] M Negnevitsky, “Design of a hybrid neuro-fuzzy decision-support system with a heterogeneous structure”, 2004 IEEE International Conference on Fuzzy Systems (IEEE Cat. No. 04CH37542), vol. 2, pp. 1049–1052, 2004. doi: 10.1109/FUZZY.2004.1375554
[25] A Haija, Q Abu, M Ghandi F., “Development of breast cancer detection model using transfer learning of residual neural network (ResNet-50)”, American Journal of Science & Engineering, vol. 1, no. 3, pp. 30–39, 2020. doi: 10.15864/ajse.1304
[26] A Haseeb, MNM Salleh, K Hussain, A Ahmad, A Ullah, et al., “A review on data preprocessing methods for class imbalance problem”, International Journal of Engineering and Technology, vol. 8, no. 3, pp. 390–397, 2019. doi: 10.14419/ijet.v8i3.29508
[27] ML Huang, TY Lin, “Dataset of breast mammography images with masses”, Data in Brief, vol. 31, pp. 105928, 2020. doi: 10.1016/j.dib.2020.105928
[28] J Han, M Kamber, J Pei, “Data Mining: Concepts and Techniques”, Morgan Kaufmann, 3rd ed, pp. 365–366, 2011.
[29] SUCKLING J, P., “The mammographic image analysis society digital mammogram database”, Digital Mammo, pp. 375–386, 1994.
