A novel machine learning approach on texture analysis for automatic breast microcalcification diagnosis classification of mammogram images

Screening programs use mammography as a diagnostic tool for the early detection of breast cancer. Mammogram enhancement is used to increase the local contrast of the mammogram so that the lesions are more visible in the advanced image. For accurate diagnosis in the early stage of breast cancer, the appearance of masses and microcalcification on the mammographic image are two important indicators. The objective of this study was to evaluate the feasibility of the automatic separation of images of breast tissue microcalcifications and also to evaluate its accuracy. The research was carried out by using two techniques of image enhancement and highlighting of breast tissue microcalcifications for the desired areas by regional ROI based on fuzzy system and also Gabor filtering method. After determining the clusters of breast tissue microcalcifications, the clusters are classified using the decision tree classification algorithm. Then, for segmentation, samples suspected of microcalcification are highlighted and masked, and in the last stage, tissue characteristics are extracted. Subsequently, with the help of an artificial neural network (ANN), determining the benign and malignant types of segmented ROI clusters was accomplished. The proposed system is trained with a Digital Database for Screening Mammography (DDSM) developed by the University of South Florida, USA, and the simulations are performed under MATLAB software and the results are compared with previous work. The results of this training performed under this work show an accuracy of 93% and an improvement of sensitivity above 95%. The result indicates that the proposed approach can be applied to ensure breast cancer diagnosis.


Introduction
Breast cancer is a malignant tumor that starts growing in the breast cells. A malignant tumor is a group of cancer cells that can spread to nearby tissues or throughout the body. It is most common in women, but rarely in men (Caldarone et al. 2021;Fox 2010). One in eight women develops breast cancer and the most important cause of death in women 40-44 years old is cancer (Hosseini et al. 2022), so its diagnosis and treatment are very imperative. Since 1990, the death rate of women due to breast cancer has decreased, which is because breast cancer can be diagnosed using mammography and other diagnostic methods, as well as increasing women's awareness (Fox 2010).
At present, mammography is considered the most effective imaging method for diagnosing breast cancer. Digital mammography, which uses a digital detector to capture the transmitted rays, enables a better diagnosis of breast cancer, especially in its early stages, by optimizing each stage of the process, including image processing and image screening (Schulz-Wendtland et al. 2009). Digital mammography has many advantages over film mammography, which has led to its rapid adoption, especially in developed countries; for example, in Germany about 88% of mammography systems are digital. Some of the benefits are as follows:
1. High efficiency and wide dynamic range of the detector, which can reduce the patient's dose and also image noise.
2. The possibility of applying image processing techniques and preparing the image for better diagnosis by the treating physician without reducing the image resolution, in contrast to scanning mammography film.
3. The possibility of eliminating noise and fixed patterns in the image by applying the necessary corrections, such as uniformity correction, and of eliminating some of the main sources of noise, such as non-uniformity in the structure of the phosphor grains.
4. The possibility of applying different imaging techniques to extract texture properties and light intensity in both low-energy and high-energy images.
X-ray mammography is one of the most common methods used by radiologists to diagnose and screen for breast cancer and to determine the presence of cancerous masses and cysts. However, mammographic images are difficult to interpret, and according to the National Cancer Center in the United States, 10-30 percent of breast masses are not visible to a radiologist (Wallis et al. 1991). Masses and microcalcifications, which are very small particles, are signs of cancer in mammographic images, and it is very difficult to diagnose them correctly. Masses are generally classified into two categories, benign and malignant, each of which has a characteristic appearance in the image. Benign masses are elliptical and characterized by distinct, well-defined edges, whereas malignant masses have uneven, irregular, and sharp edges, often lobular in shape (Xu et al. 2022).
Several works have proposed computer methods for detecting abnormalities in mammography. These methods play a key role in the early diagnosis of breast cancer and thus help to reduce mortality due to breast pathologies in a cost-effective manner (Jalalian et al. 2013). Such procedures are known as computer-aided diagnosis (CAD) systems and may provide radiologists with reliable assistance in evaluating mammographic images (Vafaeinika et al. 2022;Molani et al. 2019). If a physician has to examine a large set of mammograms, their visual evaluation capacity is greatly reduced. As a result, CAD is being developed to facilitate the diagnostic process for radiologists (Mohaidat et al. 2022). Standard functions of CAD systems include segmentation (Duarte et al. 2015), feature extraction (Maas et al. 2021), and classification (Zou et al. 2022;Jung et al. 2021) to determine whether a lesion is present.
The main purpose of the CAD system is to identify suspicious areas within the underlying parenchyma (i.e., the functional tissue of the organ, as distinct from supporting or connective tissue). This ensures that the mammogram can be divided, without overlap, into distinct regions of interest in which potential mass candidates can be identified. Architectural distortions, which cause changes in length, shape, and edges, are valid indicators of malignant change when they appear together with lesions such as thickening, asymmetry, and microcalcification. Image segmentation is therefore essential to ensure that the entire mass detection and classification system is sensitive and accurate.
Image preprocessing is essential to identify any anomalies in medical images. Since most medical images have poor contrast, diagnosis is complicated unless the contrast of these images is improved. Mammographic images are used to detect the presence of breast lesions, or any abnormal tissue in the breast, in order to diagnose early-stage breast cancer. Contrast enhancement is required to improve the visual clarity of mammography images, and it is usually preferred by radiologists so that cancer can be caught before it progresses to an advanced stage. In this work, the first preprocessing step improves the contrast of the images using the method of Al-Ameen et al. (2012). In the second preprocessing stage, a fuzzy image enhancement technique is used to further improve the quality of the original image (Sheet et al. 2010;Trik et al. 2021).
In mammographic images, very fine microcalcification particles often resemble noise, and masses have very low intensity contrast, making both difficult for radiologists and physicians to detect. Since accurate and timely diagnosis of a cancerous mass and its type is of special importance to public health, the human error associated with the diagnosis of breast cancer masses deserves the attention of researchers. For this reason, it is important to provide an automated methodology based on image processing techniques and recognition algorithms, and to optimize and upgrade existing diagnostic systems, in order to reduce human error and assist early detection of the disease. In this paper, we aim to achieve higher accuracy by introducing a new, automated procedure for detecting and extracting breast cancer masses. Given the importance of diagnosing breast cancer in its early stages and the benefits of X-ray imaging, this study uses two techniques to enhance digital mammography images, estimates suspicious regions during image processing, and finally classifies the selected clusters as benign or malignant. Improvements to images have been made using facilities similar to digital radiology systems, which can be practiced and accurately performed by digital mammography using conventional radiological systems. The techniques proposed for segmenting ROI images and selecting suspicious masses are carried out with the help of the decision tree algorithm on 250 images from the Digital Database for Screening Mammography (DDSM). To diagnose cancerous masses and microcalcifications, the images are segmented and their texture features are extracted. Malignant and benign microcalcifications are then distinguished by an artificial neural network (ANN) implemented in MATLAB.
In addressing the problem of finding suspicious nodules in mammography images, the main contributions of this article are as follows:
• Highlighting areas suspected of microcalcification using the Gabor filter and fuzzy highlighting techniques.
• Selecting the highlighted suspicious nodules using the decision tree algorithm.
• Masking the selected areas in the whole breast and classifying them as benign or malignant for different images with the help of the proposed ANN technique.
The rest of the article is as follows: The related work is shown in "Literature review". In "Preliminary", we describe the basic mathematics of the fuzzy set, the Gabor filter, the neural network, and the decision tree algorithm. The model for improving mammography design is examined in detail using the new fuzzy deviation criterion and Gabor segmentation technique by selecting the suspicious parts using the decision tree algorithm in the proposed design, in "Methodology". "Results and discussion" shows the experimental results and validation. Conclusions and future work are shown in "Conclusion".

Literature review
In recent years, extensive research has been done to reduce errors in the detection and recognition of breast cancer masses and to increase speed and accuracy in order to assist radiologists. In general, research in this area includes selecting an appropriate database of digital mammography images (images of healthy tissue and of tissue with benign and malignant masses), image preprocessing, identifying and extracting the parts of breast tissue that contain cancerous masses, and expanding the suspicious area up to the boundaries of the mass, or vice versa. Hence, the boundaries of the mass are revealed first, and then the final diagnosis of the image containing the cancerous mass is carried out. Various features (such as morphological, geometric, texture-based, and wavelet features) are extracted from the mass, and the extracted features are then classified by intelligent algorithms to determine the type of cancerous mass (benign or malignant). The remainder of this section introduces some of the most important research in this field.
Breast cancer is one of the main causes of death for women. Recent studies have revealed that the best way to prevent cancer is through routine screening and prompt treatment. X-ray mammography can be considered an early diagnostic procedure. Good and sometimes very gentle contrast of healthy and unhealthy lymph node tissues has been described to help treat radiologists and their internal medicine specialists. A new approach is presented in diagnosing mammograms with weight in breast cancer (Ghantasala et al. 2020). The proposed study of the morphological operators is noticeably identified for the segmentation and clustering of existing anomalies in the form of biomass and microcalcification. The findings indicate that most of the abnormalities are recognized and eliminated for the lower gray comparison points; however, there are other areas with the same texture. Certain areas with identical textures, on the other hand, gradually vanish from the image for higher gray reference values, although the area of distinct disturbance and discrimination is against a smaller area. Therefore, the appropriate value of the gray reference level is required to ensure that the suspected regions are effectively divided and extracted, and efficiently classified, thereby preventing the extraction of areas unrelated to the same tissues is minimized. The optimal value of the gray reference point in this regard is shown to be calculated about to with concerning the overall image size as a result of the relative error evolution. However, this cannot be a safer option in terms of medical recognition assistance because sensitive areas associated with non-relevant areas can be identified instead of being ignored and then deprived of a significant sequence of the potential pathological sites. The planned algorithm enables reliable and scalable computing only by changing the grey reference point to a given threshold value.
A method of enhancing mammograms for visual interpretation of breast masses was suggested using an adaptive neuro-fuzzy divergence measure under hyperbolic regularization (HR) (Ghosh and Ghosh 2022). The proposed scheme is developed to enhance abnormal cell growths such as breast masses, abnormal tissues, and nodules in mammography images. In the first stage, a complementary image of the mammogram is produced to separate the object and the background. Then, both the source and complement images are represented as intuitionistic fuzzy sets (IFS) under hyperbolic regularization (HR). A new entropy-based fuzzy deviation criterion is designed using the hyperbolic function to modify the membership. Moreover, the distance function of the fuzzy ambiguity interval is obtained from the hesitation vectors of both the source and the complement images. Werner's AND-OR has been used on both to generate a modified membership function. Finally, the enhanced mammogram, in which breast masses are more prominent, is obtained through the defuzzification process.
A computer method has recently been presented for the segmentation of microcalcifications in mammograms (Ciecholewski 2017). It is based on morphological transformations and consists of two parts. The first part detects microcalcifications morphologically, which allows the approximate area of their occurrence to be determined, the contrast to be improved, and the noise in the mammograms to be reduced. In the second part, watershed segmentation of the microcalcifications is carried out. The study was performed on a test set of 200 ROIs of 512 × 512 pixels taken from mammograms in the Digital Database for Screening Mammography (DDSM), including 100 malignant lesions and 100 benign lesions. The tests were averaged to obtain the following values of the measured indices: 80.5% (similarity index), 75.7% (overlap fraction), 80.8% (overlap value), and 19.8% (additional fraction). The average execution time of all steps of the method for one ROI is about 0.83 s.
An automated binary model for tissue diagnosis in digital mammograms has been suggested as a support tool for radiologists (Fanizzi et al. 2020;Ha et al. 2022). For each ROI, texture features are computed on the Haar wavelet decomposition, together with interest points and corners identified using the speeded up robust feature (SURF) and minimum eigenvalue (MinEigenAlg) algorithms. A binary random forest classifier is then trained on a subset of features selected by two different types of feature selection techniques, namely filter and embedded methods. The proposed model was tested on 260 ROIs extracted from digital mammograms of the BCDR public database. The best predictive performance for the normal/abnormal and benign/malignant problems is an average AUC of 98.16% and 92.08% and an accuracy of 97.31% and 88.46%, respectively.
Since radiologists use mammography extensively as the primary means of screening for breast cancer, the exact diagnosis of microcalcifications is an unavoidable step in developing an effective diagnosis system. A stationary wavelet transform (SWT) technique has been presented for the detection and classification of breast microcalcifications (Mazumder et al. 2020). To identify suspicious regions of the mammogram, SWT was performed at different levels, and stationary wavelet energy (SWE) features were extracted from each of the resulting SWT coefficients. Four different classifiers were used to categorize microcalcifications as benign or malignant from these SWE features, with 10-fold cross-validation.
The study of Kang et al. (2021) assessed the diagnostic performance of deep convolutional neural networks (DCNNs) in classifying breast microcalcifications on mammograms. For this purpose, 1579 mammographic images were collected from patients with suspected microcalcifications in screening mammography between July 2007 and December 2019. Five pre-trained DCNN models were used to classify microcalcifications as malignant or benign. Nearly one million images from the ImageNet database had been used to pre-train the five DCNN models. Then, 1121 mammographic images were used to fine-tune each model, 198 for validation, and 260 for testing. Gradient-weighted class activation mapping (Grad-CAM) was used to verify that the DCNN models highlight the microcalcification areas when determining the final class.
Two new methods (Christopher and Simon 2020) are presented for enhancing mammogram images for cancer diagnosis, based on a new unsharp mask design called NLUM GMIN (nonlinear UM and L0 gradient minimization). Three different techniques are combined in the proposed method: (I) a non-linear filtering process to enhance fine detail in a local 3 × 3 neighborhood; (II) a global L0 gradient minimization step that preserves high-contrast edges while suppressing low-contrast detail such as matte fiber textures, after which a detail mammogram is obtained by subtracting the smoothed mammogram from the original mammogram; and (III) an unsharp mask step that combines the details of the mammogram filtered through the non-linear filter, using PLIP operators that satisfy both Weber's law and the saturation characteristics of the human visual system (HVS). An HVS-based analysis scheme is used to analyze and visualize malignant areas on the enhanced mammogram. Different arrangements of the PLIP operators in the proposed framework lead to the NLUM Gminauto and NLUM0Gmin methods, which use the NLUM method and other available techniques to enhance the mammogram. The results indicated that the proposed NLUML0Gmin scheme for detecting cancerous masses and microcalcifications in dense X-ray mammograms is robust and effective in helping physicians diagnose cancer more accurately.
The correlation between two-dimensional shear-wave elastography (2D-SWE) and the histopathological results of microcalcifications (MCs) visualized with ultrasonography (USG) was evaluated (Kayadibi et al. 2021). Fifty people with suspected MCs were examined with USG and 2D-SWE before Tru-Cut biopsy. SWE values and histopathological features were compared statistically, and variables between groups were analyzed using the Mann-Whitney U test. The study concluded that SWE is a useful approach in clinical practice for identifying MCs that can be visualized with USG.
Automatic detection of microcalcification is essential for proper diagnosis and treatment. An automated approach for detecting and segmenting microcalcifications in mammographic images has been introduced (Hossain 2019). First, image preprocessing is applied to enhance the image. Subsequently, the breast area is separated from the pectoral area. Suspicious areas are identified using the fuzzy C-means clustering algorithm and divided into negative and positive patches. This eliminates the need to manually label the area of interest. Positive patches containing microcalcification pixels are used to train a modified U-Net segmentation network. Finally, the trained network is used to automatically segment the microcalcification area in mammographic images. This procedure can assist radiologists in early detection and increase the accuracy of segmentation of microcalcification areas.
A method has been presented for mammographic imaging using fuzzy rough set theory (FRST) and a detailed approach to texture and feature extraction (Punitha and Perumal 2019). The main purpose of using FRST is feature extraction, which is carried out with a rapid reduct algorithm that helps identify the tumor in a short period of time without losing pixels. Fuzzy rough instance selection (FRIS) is used to eliminate noise from the mammographic image, and finally, the fuzzy-rough nearest neighbor (FRNN) method is used in the segmentation.
A three-step method for the detection of clustered microcalcifications has been proposed (Basile et al. 2019). It consists of a preprocessing step, aimed at highlighting potentially interesting breast structures, followed by a single-microcalcification detection step based on the Hough transform, which is able to capture the specific shape of the structures of interest. Finally, a cluster identification step groups the microcalcifications using a clustering algorithm capable of implementing expert domain rules.
An automated approach has been developed to detect the location of each microcalcification in mammographic images completely and simply (Hakim et al. 2021). First, image processing algorithms were applied to increase the image quality. Subsequently, the microcalcification area was labeled using image segmentation based on the radiologist's expertise. Positive patches containing microcalcification pixels were used to train a segmentation network. In this study, 354 images from the INbreast dataset were used. Finally, the trained network was used to automatically detect the microcalcification area in mammographic images.
Calcifications were characterized by descriptors derived from deep learning and by handcrafted descriptors (Cai et al. 2019). The performance of different feature sets on digital mammograms was compared; these sets include deep features alone, handcrafted features alone, their combination, and filtered deep features. Experimental results revealed that deep features outperform handcrafted features, but handcrafted features can provide complementary information to deep features. In this work, the classification accuracy is 89.32% and the sensitivity is 86.89% using filtered deep features, which is the best performance among all feature sets.
Two approaches to feature extraction using empirical mode decomposition (EMD) are proposed to classify masses in mammographic images as benign or malignant (Nagarajan et al. 2019). The first method of feature extraction is based on bi-dimensional empirical mode decomposition (BEMD). When BEMD is applied to the region of interest (ROI) of a mammogram, the ROI is broken down into a set of different frequency components called bi-dimensional intrinsic mode functions (BIMFs). Gray level co-occurrence matrix (GLCM) and gray level run length matrix (GLRM) features are extracted from these BIMFs and given as input for benign or malignant classification. Due to the mode mixing problem that exists in BEMD, the BIMFs obtained from BEMD are less orthogonal to each other. To overcome this, a second method of feature extraction is proposed, named modified bi-dimensional empirical mode decomposition (MBEMD). BIMFs are extracted from the mammogram ROI using the proposed MBEMD, and features are extracted in the same way as in the BEMD method. Support vector machine (SVM) and linear discriminant analysis (LDA) are used to classify the mammogram masses.
Diagnosing suspicious nodules and microcalcifications from mammography images is an important challenge for doctors. Bearing the above considerations in mind, it was observed that various efforts have been made in the direction of intelligent identification. Each of the methods has achieved worthy results for detecting different areas, but the proposed method, compared to the technique of other works, has responded to two important goals. In the first objective, it has been able to help doctors identify suspicious areas by scanning ROI to different areas of the breast and automatically searching. In the second objective, the speed of action of the proposed method for identifying and classifying different areas in the proposed methods works very well and will help to speed up the diagnosis of the disease.

Preliminary
In this section, we will introduce the basic topics used in this article. The studied concepts include fuzzy set system, mean and Gabor filters in different directions, decision tree classification algorithms, and artificial neural networks, which will be described in the following.

Fuzzy logic technique
Fuzzy image processing can be defined as the collection of all methods that understand, represent, and process images, their segments, and their features as fuzzy sets. The representation and processing depend on the selected fuzzy technique and on the problem to be solved. Fuzzy image processing has three main stages: image fuzzification, modification of membership values, and, if needed, image defuzzification. Fuzzification is the coding of the image data and defuzzification is the decoding of the results. These two steps are what make it possible to process images with fuzzy techniques, as shown in Fig. 1 (Arora and Kaur 2012).
The most powerful element of fuzzy image processing is the middle stage, the modification of membership values. It can be called the intelligence step, because this step is what distinguishes one approach from another. Fuzzy logic offers a large variety of membership functions, as shown in Fig. 2, each of which has its own distinctive effect. Using an appropriate membership function in the fuzzy inference system increases the effectiveness of the method. The method considers the neighboring pixels of each point and then divides them into classes using the membership function (Salahshour et al. 2020;Trik et al. 2023).
For an image to be handled by fuzzy logic, it has to be converted to gray levels and then into membership values (the fuzzification step), which can be readily adjusted by fuzzy techniques. The adjustment may be a fuzzy clustering, a fuzzy rule-based approach, or a fuzzy integration approach. Fuzzy image processing is essential for dealing with uncertain data. The main advantage of image processing based on fuzzy logic is that real data always lack exactness, and fuzzy logic takes this imprecision into account in the way it represents knowledge.
In several image processing applications, to handle various types of complexities such as object recognition and scene analysis, it is suggested to utilize human logic according to if-then rules which can be offered by fuzzy set theory and fuzzy logic. On the other hand, many reasons such as randomness, ambiguity, and vagueness lead to uncertainty in image processing results and data. Moreover, those uncertainties hurt image processing progress that leads to many difficulties (Feng and Zhang 2015;Wang and Gao 2015).
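The three stages described in this section (fuzzification, modification of membership values, defuzzification) can be illustrated with a minimal sketch. The text does not state which membership modification is used, so the classic contrast intensification operator is assumed here purely for illustration; the function names and the 8-bit gray range are likewise assumptions.

```python
def fuzzify(image, max_level=255):
    """Code gray levels as membership values in [0, 1]."""
    return [[pixel / max_level for pixel in row] for row in image]


def intensify(mu):
    """Assumed membership modification: contrast intensification.

    Memberships below 0.5 are pushed toward 0 and those above 0.5 toward 1.
    """
    if mu <= 0.5:
        return 2.0 * mu * mu
    return 1.0 - 2.0 * (1.0 - mu) ** 2


def defuzzify(membership, max_level=255):
    """Decode membership values back to integer gray levels."""
    return [[int(round(mu * max_level)) for mu in row] for row in membership]


def fuzzy_enhance(image, max_level=255):
    """Run the three stages in order on a gray-level image (list of rows)."""
    membership = fuzzify(image, max_level)
    modified = [[intensify(mu) for mu in row] for row in membership]
    return defuzzify(modified, max_level)


enhanced = fuzzy_enhance([[0, 64, 192, 255]])
```

In this sketch dark pixels become darker and bright pixels brighter, which is one simple way the middle stage can raise local contrast; a different membership function would give a different enhancement.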

Gabor filter
Gabor filters can be applied to images to extract features aligned at particular angles. They have optimal localization properties in both the spatial and frequency domains. The most important parameters of a Gabor filter are its angle and frequency. Features that share a similar angle or frequency can be selected and used to discriminate between different facial expressions depicted in images. In the spatial domain, a Gabor filter is a complex exponential modulated by a Gaussian function. It is a linear band-pass filter that can be represented by Eq. 1 (Wang and Ou 2006;Sun et al. 2022):

$$g(x, y; \lambda, \theta, \psi, \sigma, \gamma) = \exp\left(-\frac{x'^2 + \gamma^2 y'^2}{2\sigma^2}\right)\exp\left(i\left(2\pi\frac{x'}{\lambda} + \psi\right)\right) \tag{1}$$

where $x' = x\cos\theta + y\sin\theta$ and $y' = -x\sin\theta + y\cos\theta$. In this equation, λ represents the wavelength of the sinusoidal factor, θ represents the orientation of the normal to the parallel stripes, ψ is the phase offset, σ is the standard deviation of the Gaussian envelope, and γ is the spatial aspect ratio.

Fig. 1 Steps involved in Fuzzy image processing (Arora and Kaur 2012)
The amplitude and phases of the Gabor filter bank both contribute valuable cues about specific patterns present in images. The amplitude consists of directional frequency spectrum information and a phase contains information about the location of edges and image details.
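As a minimal sketch of Eq. 1, the complex Gabor function (a Gaussian envelope multiplied by a complex sinusoid along the rotated axis) can be sampled on a small square grid. The grid size and the function name are assumptions made for illustration; the parameters follow the symbols λ, θ, ψ, σ, and γ defined in the text.

```python
import cmath
import math


def gabor_kernel(size, lam, theta, psi, sigma, gamma):
    """Sample the complex Gabor function on a size x size grid centred at (0, 0).

    `size` is assumed to be odd so that the grid has a central sample.
    """
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotated coordinates x' and y'.
            x_rot = x * math.cos(theta) + y * math.sin(theta)
            y_rot = -x * math.sin(theta) + y * math.cos(theta)
            # Gaussian envelope with aspect ratio gamma.
            envelope = math.exp(-(x_rot ** 2 + gamma ** 2 * y_rot ** 2) / (2.0 * sigma ** 2))
            # Complex sinusoid with wavelength lam and phase offset psi.
            carrier = cmath.exp(1j * (2.0 * math.pi * x_rot / lam + psi))
            row.append(envelope * carrier)
        kernel.append(row)
    return kernel


kernel = gabor_kernel(size=5, lam=4.0, theta=0.0, psi=0.0, sigma=2.0, gamma=0.5)
```

Varying `theta` and `lam` over a set of orientations and wavelengths would give a filter bank; the real part of each kernel responds to even-symmetric structures and the imaginary part to odd-symmetric ones.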
The feature extraction method converts the pixel data into a higher-level representation of the structure, movement, intensity, surface characteristics, and spatial configuration of the face or its components. The Gabor features are computed by convolving an input image with the Gabor filter bank. Let I(x, y) be a grey-scale face image of size M × N pixels. Feature extraction can then be defined as filtering the face image I(x, y) with the Gabor filter of size u and angle v, as given in Eq. 2.
If the holistic approach is used in Gabor feature extraction, features are extracted from the entire image. Gabor filters are applied to images to extract features at a particular angle (orientation). The Gabor feature representation $|O(x, y)|_{m,n}$ of an image I(x, y), for x = 1, 2, …, N, y = 1, 2, …, M, m = 1, 2, …, N_L, and n = 1, 2, …, N_O, is computed as the convolution of the input image I(x, y) with the Gabor filter bank function $g(x, y, \lambda_m, \theta_n)$. The convolution is performed separately for the real and imaginary parts, as given in Eq. 3:

$$\operatorname{Re}(O(x, y))_{m,n} = I(x, y) \ast \operatorname{Re}(g(x, y, \lambda_m, \theta_n)), \qquad \operatorname{Im}(O(x, y))_{m,n} = I(x, y) \ast \operatorname{Im}(g(x, y, \lambda_m, \theta_n)) \tag{3}$$

This is followed by the amplitude calculation given in Eq. 4:

$$|O(x, y)|_{m,n} = \left(\left(\operatorname{Re}(O(x, y))_{m,n}\right)^2 + \left(\operatorname{Im}(O(x, y))_{m,n}\right)^2\right)^{1/2} \tag{4}$$

Research issues with the Gabor feature extraction technique
The advantages of the Gabor technique are as follows:
1. Gabor filters have excellent localization properties in the spatial and frequency domains.
2. Gabor filters are more robust than the DCT to variations in lighting.
3. Amplitude and phase provide valuable cues about specific patterns present in an image.
The disadvantages of the Gabor feature technique are as follows:
1. Computing the features takes a long time because the feature vector has a very high dimension.
2. The features are highly redundant, and this redundancy reduces the recognition rate.
Research issues: High dimensionality and high redundancy are problems for Gabor features, even though the features have maximum variance. Both should be reduced using a suitable technique. The dimension reduction technique for Gabor features is called filtering, so the whole technique is called the Gabor filter. Researchers have proposed several such filtering techniques, including sampling and average filtering.
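The feature computation described in this section (separate convolutions of the image with the real and imaginary parts of a Gabor kernel, followed by the amplitude) can be sketched as follows. This is a plain-Python illustration under assumptions: the zero padding at the image border, the function names, and the toy kernel are not taken from the source.

```python
import math


def convolve_same(image, kernel):
    """2-D convolution with zero padding; the output has the size of `image`."""
    rows, cols = len(image), len(image[0])
    k_rows, k_cols = len(kernel), len(kernel[0])
    r_half, c_half = k_rows // 2, k_cols // 2
    out = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            acc = 0.0
            for a in range(k_rows):
                for b in range(k_cols):
                    # Convolution flips the kernel relative to the image.
                    ii = i - (a - r_half)
                    jj = j - (b - c_half)
                    if 0 <= ii < rows and 0 <= jj < cols:
                        acc += image[ii][jj] * kernel[a][b]
            out[i][j] = acc
    return out


def gabor_magnitude(image, kernel):
    """Amplitude of the response from separate real and imaginary convolutions."""
    real_part = [[complex(v).real for v in row] for row in kernel]
    imag_part = [[complex(v).imag for v in row] for row in kernel]
    re = convolve_same(image, real_part)
    im = convolve_same(image, imag_part)
    return [[math.hypot(r, q) for r, q in zip(re_row, im_row)]
            for re_row, im_row in zip(re, im)]


# Toy check: convolving a unit impulse returns the kernel, so the amplitude
# equals the absolute value of each kernel entry.
impulse = [[0, 0, 0], [0, 1, 0], [0, 0, 0]]
toy_kernel = [[1j, 0j, 0j], [0j, 3 + 4j, 0j], [0j, 0j, 2 + 0j]]
magnitude = gabor_magnitude(impulse, toy_kernel)
```

Repeating this for every wavelength and orientation in the bank and concatenating the amplitude maps gives the long feature vector whose dimension and redundancy are discussed above.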

Sampling filtering
In Gabor, features are distributed uniformly, so in sampling filtering a sample is created by selecting features uniformly from the high-dimensional Gabor feature vector. For example, one Gabor feature is selected out of every 25 features of the Gabor feature vector.
In Gabor feature extraction, the computation time is very large and the Gabor feature vector is very long. For example, an image of size 256 × 256 pixels is transformed through a Gabor bank of size 35 (frequency = 5, angle = 7), in which there are 5 Gabor feature matrices of size 25 × 25, 5 of size 51 × 51, 5 of size 76 × 76, 5 of size 102 × 102, and 5 of size 128 × 128, giving a feature vector of length 178,950 samples. To reduce the size of these feature vectors, downsampling with a sampling filter can be performed without losing important information. With a downsampling factor of 125, the length of each feature vector is reduced to 1431 samples (Li et al. 2010).
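The arithmetic of this example can be checked with a short sketch. The function names are hypothetical, and the rule of keeping one feature per complete block of 125 is an assumption chosen because it reproduces the stated length of 1431.

```python
def gabor_vector_length(matrix_sizes, matrices_per_size):
    """Total number of features when every s x s matrix is flattened and concatenated."""
    return sum(matrices_per_size * s * s for s in matrix_sizes)

def downsample(vector, factor):
    """Keep the first feature of each complete block of `factor` consecutive features."""
    n_keep = len(vector) // factor
    return [vector[i * factor] for i in range(n_keep)]

sizes = [25, 51, 76, 102, 128]
total = gabor_vector_length(sizes, matrices_per_size=5)
features = list(range(total))      # placeholder values standing in for real Gabor features
reduced = downsample(features, 125)
print(total, len(reduced))         # 178950 1431
```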

Decision tree algorithm
Systems that build classifiers are among the most widely used techniques in data mining (Kumar and Verma 2012). In data mining, classification algorithms can handle a vast volume of information. They can be used to make assumptions regarding categorical class names, to classify knowledge based on training sets and class labels, and to classify newly available data (Alqudah et al. 2013;Nikam 2015). Machine learning offers several classification algorithms; this work focuses on the decision tree (DT) algorithm. Figure 3 illustrates the structure of a DT. Decision trees are powerful methods commonly used in various fields, such as machine learning, image processing, and pattern recognition (Stein et al. 2005). A DT is a sequential model that efficiently and cohesively combines a series of basic tests, each of which compares a numeric feature to a threshold value (Damanik et al. 2019;Trik et al. 2022). Its conceptual rules are much easier to construct than the numerical weights of the connections between nodes in a neural network (Ha et al. 2019;Gupta 2014). DTs are used mainly for grouping purposes and are a commonly used classification model in data mining (Gavankar and Sawarkar 2017). Each tree is composed of nodes and branches. Each node represents a feature of a category to be classified, and each subset defines a value that the node can take (Verma 2022). Because they are simple to analyze and precise on multiple forms of data, decision trees have found many fields of application. Figure 4 shows an example of a DT.

Entropy and information gain
Entropy is used to measure the impurity or randomness of a dataset (Charbuty and Abdulazeez 2021). For a two-class problem, the value of entropy always lies between 0 and 1. It is best when it equals 0 and worst when it equals 1, i.e., the closer its value is to 0 the better, as shown in Fig. 5. If the target takes c different attribute values, the entropy of the classification of set S with respect to the c states is given by Eq. 5:
$$\mathrm{Entropy}(S) = -\sum_{i=1}^{c} P_i \log_2 P_i \tag{5}$$
where Pi is the ratio of the number of samples in the subset with the i-th attribute value to the total number of samples. Information gain is a metric used for segmentation and is often called mutual information. Intuitively, it indicates how much is learned about the value of a random variable (Liu et al. 2013). It is the opposite of entropy: the higher its value, the better. Based on the definition of entropy, the information gain Gain(S, A) is defined as shown in Eq. 6 (Taneja et al. 2014; Maszczyk and Duch 2008):
$$\mathrm{Gain}(S, A) = \mathrm{Entropy}(S) - \sum_{v \in V(A)} \frac{|S_v|}{|S|}\, \mathrm{Entropy}(S_v) \tag{6}$$
where V(A) is the range of attribute A, and Sv is the subset of set S in which attribute A takes the value v (Liu et al. 2013).
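The two quantities can be computed directly from class labels, as in the following minimal sketch. It assumes the standard base-2 definitions of entropy and information gain; the attribute values and labels are made-up toy data, not samples from the paper.

```python
import math
from collections import Counter

def entropy(labels):
    """Shannon entropy (base 2) of a list of class labels."""
    n = len(labels)
    return -sum((count / n) * math.log2(count / n)
                for count in Counter(labels).values())

def information_gain(attribute_values, labels):
    """Entropy of the whole set minus the weighted entropy of the subsets
    obtained by splitting on the attribute."""
    n = len(labels)
    remainder = 0.0
    for value in set(attribute_values):
        subset = [lab for val, lab in zip(attribute_values, labels) if val == value]
        remainder += len(subset) / n * entropy(subset)
    return entropy(labels) - remainder

labels = ["benign", "benign", "malignant", "malignant"]
informative = ["low", "low", "high", "high"]     # separates the classes perfectly
uninformative = ["a", "b", "a", "b"]             # tells nothing about the class
print(entropy(labels))                           # 1.0
print(information_gain(informative, labels))     # 1.0
print(information_gain(uninformative, labels))   # 0.0
```

A decision tree built with this criterion would split on the attribute with the highest gain, here the one called `informative`.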

Artificial neural networks (ANN)
Artificial neural networks are model-free, intelligent dynamic systems based on experimental data; by processing the experimental data, they transfer the knowledge or law behind the data to the network structure. Artificial neural networks learn general rules from calculations on numerical data or examples and try to model the neuro-synaptic structure of the human brain (Krose and Smagt 2011).
Neural computation, one of several methods of machine learning, attempts to imitate the way the brain works. Neural networks can create features that humans can use to solve complex problems in areas such as logic simulation, expert systems, analytical techniques, and software technologies. The artificial neural network is inspired by the brain and the human nervous system and, like the human brain, is composed of a large number of neurons. These networks, like the human brain, can learn, retain, and communicate data. Implementing the remarkable features of the brain in an artificial system has always been tempting and desirable. Many researchers have worked in this field over the years, but the result of these efforts, regardless of valuable findings, has been a growing belief that the human brain is unattainable. These networks are used in many sciences, including physics, electrical engineering, environmental engineering, chemical engineering, computer science, and robotics (Norvig and Intelligence 2002). Artificial neural networks are of two types, feed-forward and feedback; the feed-forward type is used for training in this work.
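A feed-forward pass can be sketched in a few lines. The layer sizes, the sigmoid activation, and the weights below are illustrative assumptions only; they are not the trained network of this work.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def dense(inputs, weights, biases):
    """One fully connected layer: weighted sum plus bias, then sigmoid, per neuron."""
    return [sigmoid(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]

def feed_forward(inputs, layers):
    """Propagate the inputs through the layers in order; there are no feedback connections."""
    activation = inputs
    for weights, biases in layers:
        activation = dense(activation, weights, biases)
    return activation

# Hypothetical network: 3 inputs -> 2 hidden -> 2 hidden -> 1 output.
layers = [
    ([[0.5, -0.2, 0.1], [0.3, 0.8, -0.5]], [0.0, 0.1]),
    ([[1.0, -1.0], [0.5, 0.5]], [0.0, -0.1]),
    ([[0.7, -0.3]], [0.05]),
]
score = feed_forward([0.2, 0.4, 0.6], layers)[0]
print(0.0 < score < 1.0)
```

In a two-class setting, the single output can be compared with a cut-off such as 0.5 to assign the class.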

Methodology
The main purpose of the proposed method is to search for masses and microcalcifications in different parts of the mammogram image and then to classify them as benign or malignant using the extracted features. For this task, extraction of Local Oriented Statistics Information Booster (LOSIB) and Gray-Level Co-Occurrence Matrix (GLCM) features is proposed. The mammogram images for the proposed work are collected from the Digital Database for Screening Mammography (DDSM), a publicly available database. The proposed work was tested in the MATLAB 2017b environment on a computer with a 2.67 GHz Intel Core i5 processor and 4 GB of RAM.
The DDSM database includes images taken from about 2,500 patients. For each patient, two images of each of the left and right breasts are provided: the craniocaudal (CC) view and the mediolateral oblique (MLO) view. Of these images, 125 images with benign masses and 125 images with malignant masses are used in this work. The block diagram is shown in Fig. 6. The proposed extraction method includes the following steps:
I. Pre-processing for image enhancement, noise removal, and background subtraction: In this step, fuzzy segmentation is used to highlight the breast. Then, with thresholding and the active contour technique, the main part of the breast is separated from the radiological image.
II. ROI extraction for mass cutting and microcalcification: In this step, certain areas of the image with dimensions of 70 × 60 pixels or other standard ranges are cut from the original image. This selection helps the tracking system identify suspected areas of microcalcification in the selected area more accurately.
III. ROI analysis based on the two proposed methods for segmentation of masses and microcalcifications in the target ROI image: This step is implemented with two techniques, Gabor filtering and fuzzy segmentation, to highlight suspicious areas.
IV. Selection of suspicious mass and microcalcification sections with the help of a decision tree algorithm: At this stage, the decision tree algorithm, trained on the training samples, classifies the selected areas to separate mass and microcalcification areas from non-suspicious areas.
V. Extraction of GLCM and LOSIB features from the highlighted and masked images: The microcalcification areas determined in the ROI images are masked and highlighted with the help of the binary method. Then the conventional texture features (GLCM and LOSIB) are extracted from these regions.
VI. Classification of the final ROI as benign or malignant: For ROI classification, a trained feedforward neural network classifier is used.
Finally, the results obtained from this classification are compared with other works in terms of accuracy and sensitivity.

Pre-processing techniques
In this section, two pre-processing actions are performed: improving image quality with the help of the fuzzy technique (Sheet et al. 2010), and removing the background and reducing the noise of mammogram images, including numbers, patient information, and so on. At the image quality improvement stage, a modification of the brightness preserving dynamic histogram equalization technique is used to improve its brightness-preserving and contrast-enhancement abilities while also reducing its computational complexity. The modified technique is called Brightness Preserving Dynamic Fuzzy Histogram Equalization (BPDFHE), which uses fuzzy statistics of digital images for their representation and processing. Representing and processing images in the fuzzy domain enables this technique to handle the inexactness of gray-level values in a better way, resulting in better performance. The runtime depends on the size of the image and the nature of the histogram (Sheet et al. 2010;Trik et al. 2021). In the second step, the background is removed using a binary masking technique and image segmentation; the section with the largest area is then highlighted as the breast portion of the image for use in later stages, and the remaining sections are removed from the original image. Pre-processing results for the first and second stages are shown in Fig. 7b and c, respectively.
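The second pre-processing step (binary masking followed by keeping only the largest section) can be sketched as follows. The fixed threshold, the 4-connectivity, and the function names are assumptions made for illustration; the paper does not specify them.

```python
from collections import deque

def largest_component_mask(image, threshold):
    """Binarize the image, then return a 0/1 mask of the largest 4-connected bright region."""
    h, w = len(image), len(image[0])
    binary = [[image[r][c] > threshold for c in range(w)] for r in range(h)]
    seen = [[False] * w for _ in range(h)]
    best = []
    for r in range(h):
        for c in range(w):
            if binary[r][c] and not seen[r][c]:
                component, queue = [], deque([(r, c)])
                seen[r][c] = True
                while queue:                      # breadth-first flood fill
                    y, x = queue.popleft()
                    component.append((y, x))
                    for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                        ny, nx = y + dy, x + dx
                        if 0 <= ny < h and 0 <= nx < w and binary[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                if len(component) > len(best):
                    best = component
    mask = [[0] * w for _ in range(h)]
    for y, x in best:
        mask[y][x] = 1
    return mask

def remove_background(image, threshold):
    """Zero out every pixel that is not part of the largest bright region."""
    mask = largest_component_mask(image, threshold)
    return [[p if m else 0 for p, m in zip(img_row, mask_row)]
            for img_row, mask_row in zip(image, mask)]

toy = [
    [0, 0, 0, 0, 9],   # the isolated 9 stands in for a label or patient marking
    [0, 5, 6, 0, 0],
    [0, 7, 8, 0, 0],
    [0, 0, 0, 0, 0],
]
cleaned = remove_background(toy, threshold=1)
print(cleaned[0][4], cleaned[1][1])   # 0 5
```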

Tracking the mass or microcalcification from the image
In this section, we first manually cut the standard part of the image and prepare it for the current stage. Cutting areas of the image as ROIs enhances the accuracy of microcalcification detection, so standard cuts are used for identification in this section. In this work, two techniques for tracking masses in ROI images are suggested, and each is discussed and examined. These techniques yield an initial delineation of the defects, masses, and microcalcifications in the selected image.
Fig. 6 Block diagram of the proposed feature extraction method

Segmentation technique with Gabor filter
This proposed method is faster than the other proposed method. The technique examines the ROI image for abnormalities and sudden tissue changes with the help of a Gabor filter at wavelength 3, applied in different directions at angles of 0, 30, 45, 60, 90, 120, 135, 150, and 180 degrees. In this way, areas of the image that stand out against their neighborhoods are highlighted and considered suspected microcalcification areas.

The proposed technique for enhancing the contrast
When improving the contrast of computed tomography (CT) medical images, two factors should be taken into account: speed and efficiency. The proposed technique addresses both by providing quick processing with effective results. The technique operates in the spatial domain. In addition, instead of processing the image pixel by pixel, it is applied directly to the entire image. The image is normalized based on its size as follows: first, the size of the processed image is determined; then the variable of interest (K) is calculated, where X is the initial image, min and max are the minimum and maximum pixel values of the processed image, K is the enhancement variable, and EI is the contrast-enhanced image. In this paper, a fuzzy technique is used to replace the exponential function (e^K) in order to improve performance. The bell-shaped membership function is used for this purpose to improve the differentiation of the pixels of the ROI image, as shown in Fig. 9. Eq. 8 is modified accordingly. Figure 10 shows the proposed correction results for a sample mammogram image. The contrast improvement obtained with the method of (Al-Ameen et al. 2012) and with the improved fuzzy method is shown in Fig. 10B and C, respectively. As can be seen, the resolution of the image is further improved.

Segmentation steps:
1. ROI image.
2. Increase the contrast according to (Feng and Zhang, 2015).
3. Apply an average filter to blur the image.
4. Gabor filtering.
5. Threshold-based binary masking of the averaged ROI image.
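The averaging and masking steps of this chain can be sketched as below. The Gabor step is omitted here, and the threshold rule (mean plus k standard deviations of the blurred image) is an assumption, since the paper does not state how the threshold is chosen.

```python
import math

def mean_filter(image, radius=1):
    """Replace each pixel by the average of its (2*radius+1)^2 window, clipped at the borders."""
    h, w = len(image), len(image[0])
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            window = [image[y][x]
                      for y in range(max(0, r - radius), min(h, r + radius + 1))
                      for x in range(max(0, c - radius), min(w, c + radius + 1))]
            out[r][c] = sum(window) / len(window)
    return out

def threshold_mask(image, k=1.0):
    """Mark pixels brighter than mean + k * standard deviation (assumed rule)."""
    values = [v for row in image for v in row]
    mean = sum(values) / len(values)
    std = math.sqrt(sum((v - mean) ** 2 for v in values) / len(values))
    return [[1 if v > mean + k * std else 0 for v in row] for row in image]

roi = [[10] * 5 for _ in range(5)]
roi[2][2] = 100                      # one bright spot on a uniform background
mask = threshold_mask(mean_filter(roi))
print(sum(map(sum, mask)))           # 9: the blurred spot covers a 3 x 3 block
```

Blurring spreads an isolated bright pixel over its neighborhood, so the mask marks a small region rather than a single pixel.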

Segmentation technique with the proposed fuzzy system
In this section, a fuzzy logic system of the Takagi-Sugeno type is used to provide a more accurate technique for identifying suspicious points of microcalcification. The proposed approach is based on determining the average gray level of the ROI area. The fuzzy system creates a threshold for image segmentation to highlight suspected pixels that are brighter than, and distinct from, the rest of the ROI. This method takes more time than the previous method because it examines the pixels of different areas of the ROI, but it is much more accurate in identifying mass and microcalcification areas and determining their boundaries. Figure 11 displays the proposed fuzzy system, consisting of membership functions and input-output characteristics, and introduces the process by which the fuzzy system highlights microcalcifications. This model is based on rules defined with the help of the Takagi-Sugeno fuzzy system. In this proposed work, pixels with a higher light intensity than their neighboring areas in the ROI are more likely to be microcalcifications or masses. The proposed operation is applied to the original image after normalization of the pixel light intensity in the ROI region. The fuzzy rules of the proposed system are defined based on the contrast intensity of the ROI images in each step. That is, fuzzy thresholding based on contrast intensity makes it possible to identify these suspicious areas. The fuzzy rules are obtained by training the fuzzy system on several different samples of ROI images to improve the proposed method. Figure 12 shows the images and the steps of executing the proposed method.
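The idea of deriving a segmentation threshold from the average gray level with a Takagi-Sugeno system can be sketched as follows. The three triangular membership functions and the constant rule outputs are hypothetical placeholders; the trained membership functions and rules of the paper (Fig. 11) are not reproduced here.

```python
def triangular(x, a, b, c):
    """Triangular membership function with feet at a and c and peak at b."""
    if x <= a or x >= c:
        return 0.0
    if x <= b:
        return (x - a) / (b - a)
    return (c - x) / (c - b)

# Hypothetical zero-order Sugeno rules: (membership parameters, constant output threshold).
RULES = [
    ((-0.5, 0.0, 0.5), 0.4),   # IF mean gray level is dark   THEN threshold = 0.4
    ((0.0, 0.5, 1.0), 0.6),    # IF mean gray level is medium THEN threshold = 0.6
    ((0.5, 1.0, 1.5), 0.8),    # IF mean gray level is bright THEN threshold = 0.8
]

def sugeno_threshold(mean_gray):
    """Weighted average of the rule outputs, weighted by rule firing strength."""
    x = min(1.0, max(0.0, mean_gray))
    strengths = [triangular(x, *params) for params, _ in RULES]
    return sum(s * out for s, (_, out) in zip(strengths, RULES)) / sum(strengths)

def highlight(roi):
    """Mask pixels of a normalized ROI (values in [0, 1]) brighter than the fuzzy threshold."""
    pixels = [p for row in roi for p in row]
    threshold = sugeno_threshold(sum(pixels) / len(pixels))
    return [[1 if p > threshold else 0 for p in row] for row in roi]

roi = [[0.1, 0.1], [0.1, 0.9]]
print(highlight(roi))   # [[0, 0], [0, 1]]
```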

Selection of areas suspected of microcalcification by decision tree (DT) algorithm
After the different areas of the ROI image are segmented and binary-masked, each highlighted part must be labeled. In this work, the parts suspected of being masses or microcalcifications are selected and highlighted by a trained decision tree, used as the machine learning system. The non-suspicious areas are then masked with the background color.
In this step, four characteristics of each section (light intensity, variance, area, and scattering) are used to classify the different parts. Figure 13 shows some examples of search and selection operations performed with the proposed idea. The results obtained with the two previously proposed segmentation techniques, followed by the selection of suspicious areas by the decision tree algorithm, are reviewed and shown.
As can be seen, the proposed fuzzy method performs better than Gabor filtering; when higher speed is required, the Gabor technique can be used at the cost of lower accuracy.
In these results, the final image is the output of the decision tree, which selects the areas suspected of microcalcification from the mask image. The main reason for using the decision tree algorithm in this work is to reduce the number of nodules by removing those that do not belong to the microcalcification category. The use of this algorithm increases the accuracy and sensitivity of detection in the proposed scheme.
With the proposed techniques, the suspected areas of microcalcification or mass in each ROI are highlighted against the background. Of the two highlighting methods, the fuzzy technique produces a more accurate demarcation than the Gabor method, so its final images are segmented with better quality. The Gabor segmentation can, however, be improved by increasing the number of angles. The fuzzy technique for highlighting the areas is fully trained and based on the ROI contrast. Applying a trained decision tree to select the areas that correspond closely or exactly to microcalcifications can increase the accuracy of the identification system. This step is performed in both methods. It helps increase the sensitivity of the classification because irrelevant areas are separated from the extracted nodules. Hence, the decision-making is improved.

The proposed artificial neural network (ANN)
In this paper, two sets of texture features are used: the Local Oriented Statistics Information Booster (LOSIB) and GLCM. A total of 29 features are extracted for each image (125 images with malignant masses and 125 images with benign masses), and the system is implemented and simulated in MATLAB.
LOSIB is a descriptor booster based on the extraction of gray-level differences along several directions. In particular, the average of the differences along specific orientations is considered. Experiments with several classical texture descriptors have shown that classification results are better when the descriptors are combined with LOSIB (Rastegarpouyani et al. 2018;García-Olalla et al. 2014). Combining the LOSIB texture descriptor with the GLCM method in this work yields higher accuracy in the simulation results than the methods found in other papers. Figure 14 shows the proposed feedforward neural network structure for 19 features selected from the previous samples. As shown in the figure, the structure has two hidden layers; the results of training the system with the defined features are reported in the results section.
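To make the GLCM part of the feature set concrete, the sketch below builds a normalized, symmetric co-occurrence matrix for one pixel offset and derives three common texture features from it. The choice of offset, the symmetric counting, and the particular features are assumptions; the 29 features used in this work are not listed in the text.

```python
def glcm(image, levels, dx=1, dy=0):
    """Normalized symmetric gray-level co-occurrence matrix for the offset (dx, dy).
    `image` holds integer gray levels in the range 0 .. levels-1."""
    counts = [[0] * levels for _ in range(levels)]
    h, w = len(image), len(image[0])
    for r in range(h):
        for c in range(w):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < h and 0 <= c2 < w:
                i, j = image[r][c], image[r2][c2]
                counts[i][j] += 1
                counts[j][i] += 1        # count each pair in both directions
    total = sum(map(sum, counts)) or 1   # avoid division by zero when no pair exists
    return [[v / total for v in row] for row in counts]

def glcm_features(p):
    """Contrast, energy, and homogeneity (with the 1 + |i - j| denominator)."""
    n = len(p)
    cells = [(i, j, p[i][j]) for i in range(n) for j in range(n)]
    return {
        "contrast": sum((i - j) ** 2 * v for i, j, v in cells),
        "energy": sum(v * v for _, _, v in cells),
        "homogeneity": sum(v / (1 + abs(i - j)) for i, j, v in cells),
    }

smooth = [[0, 0], [1, 1]]    # horizontal neighbors always have equal gray levels
striped = [[0, 1], [0, 1]]   # horizontal neighbors always differ
print(glcm_features(glcm(smooth, levels=2)))
print(glcm_features(glcm(striped, levels=2)))
```

The smooth patch gives zero contrast and maximal homogeneity, while the striped patch gives the opposite, which is why such features help separate textures.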

Results and discussion
Ground-truth information is available for each mammogram image in the DDSM dataset, which enables quantitative evaluation. Several evaluation criteria, namely sensitivity, specificity, precision, and accuracy, are calculated as follows:
$$\mathrm{Sensitivity} = \frac{TP}{TP + FN}, \quad \mathrm{Specificity} = \frac{TN}{TN + FP}, \quad \mathrm{Precision} = \frac{TP}{TP + FP}, \quad \mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$
where TP denotes a true positive, a benign sample correctly identified as benign. TN denotes a true negative, a malignant sample correctly identified as malignant. FP denotes a false positive, a sample falsely identified as benign. FN denotes a false negative, a benign sample falsely identified as malignant. In Table 1, the simulation results are reported for training data extracted from different samples for different values. Of the image samples, 80% are selected for training and 20% for testing, and the results are compared in Table 1 with the proposed ANN method. The comparison shows that the proposed design achieves very good sensitivity for different image samples. The proposed technique, with its fast intelligent operation, improves the accuracy compared to other works. Table 2 compares the classification results of the proposed scheme for microcalcification detection with other works. The results show that the proposed ANN technique performs quite well compared to other methods and has good accuracy and sensitivity. One advantage of this method for future use is that the mammography image input can be supplied in any image format (DICOM, TIF, JPEG/JPG, BMP, etc.). In addition, the speed of the proposed system, from the beginning of image processing to the final identification of benign and malignant cancers, is enhanced.
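The four criteria follow directly from the confusion-matrix counts, as in this minimal sketch. It assumes the standard definitions, and the counts in the example are made up for illustration; they are not results from Table 1 or Table 2.

```python
def classification_metrics(tp, tn, fp, fn):
    """Standard evaluation criteria computed from confusion-matrix counts."""
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "precision": tp / (tp + fp),
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
    }

# Hypothetical counts for a 50-image test set (illustration only).
metrics = classification_metrics(tp=24, tn=22, fp=3, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```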

Conclusion
This work presents a mammography enhancement scheme based on a new fuzzy system that improves image quality in order to highlight breast masses, lumps, lesions, and deformed textures. The proposed design first separates a mammogram into foreground and background by means of a complementary image. This is followed by intuitive fuzzification, measurement, and membership grade correction using averaging operations over the measured ROI area of the source and supplementary images. Finally, an improved visual mammogram is obtained through the fuzzy process. The fuzzy structure and rules are obtained from mammograms by training a Takagi-Sugeno fuzzy system, which, due to its parametric nature, acts on high-intensity pixels for local intensity manipulation. Compared to advanced methods, the proposed enhancement scheme performs better owing to increased contrast and improved visual quality of the mammograms. In a second technique, Gabor filtering is used for highlighting in order to increase the detection speed of microcalcifications. It is worth noting that this method performs worse than the proposed fuzzy design. In the next step, suspicious areas are selected with the help of the decision tree algorithm, so that accuracy is increased by classifying the areas selected by the two methods. With this initial screening, the radiologist can make an initial assessment of breast masses, soft textures, or tumors. The main advantage of the proposed design is that it accurately measures ambiguity in the dynamic range of gray levels, thus highlighting small abnormal tissues in the breast. In this work, the similarity coefficient is used with the decision tree training, which leads to faster convergence and in turn reduces the computational cost.
In the future, we will improve the current algorithm in various ways and work carefully on the real-time automatic detection of abnormal texture areas, masses, and microcalcifications in the early stages.
Based on the abovementioned results, we will use a type-2 fuzzy system for automatic detection. To classify the detected areas, meta-heuristic algorithms combined with an artificial neural network (ANN) will be utilized.
Author contributions
Availability of data and materials: The data used to support the findings of this study are included within the article.
Competing interest: The authors have no competing interests to declare that are relevant to the content of this article.
Funding: No funding was received for conducting this study.
Disclosure of potential conflicts of interest: The authors have no conflict of interest to declare.

Table 2 Classification results of the proposed scheme compared with other works
Method | Accuracy (%)
(Han et al. 2017) | 94.9
Grassmannian + VLAD (Dimitropoulos et al. 2017) | 90.5
CNN is filtered by morphologic (Cai et al. 2019) | 88.59
CNN (Hakim et al. 2021) | 90.30
This work | 93