Several previous articles have argued for the use of CNNs and AI in image recognition and healthcare monitoring. Roughly 58% of samples are correctly classified across all classes, 70% across the mass classes, and 94% across the calcification classes [26]. Except for the calcification-only and mass-only results, all of these results can be improved [27][32][33][34][35]. The overarching goal of this study is to improve the ability of CNNs to detect breast cancer. We propose a method for automatic breast cancer detection by investigating several distinct CNN architectures. The plan incorporates numerous deep learning and regression-based techniques. The proposed system builds on a CNN and adds two further architectures, motivated by a large dataset of over 284,000 RGB image patches of 48 × 48 pixels. The precision of the numerical results is examined with validation tests. A primary focus of this investigation is the impact of different CNN architectures on the optimal parameters. The remainder of the paper is structured as follows: introduction, materials and methods, results and discussion, and a conclusion that includes limitations, suggestions for future study, and recommendations.

*A. Analysis by incremental independent components*

The incremental independent component analysis (IICA) technique extracts features from non-overlapping sub-windows of a reference texture and an input texture image. It then uses the Euclidean distance between the defect-free data set and each sub-window of the input texture to classify that sub-window as defective or non-defective.

*B. IICA Data Pre-Processing*

In incremental independent component analysis, centering and whitening are typical pre-processing steps. Subtracting the mean of each column of the input matrix X yields the centred matrix. Second-order dependence is then removed with a whitening matrix V, which can be derived from the covariance matrix of the data as its inverse square root, $V = C^{-1/2}$. After the images have been pre-processed in this way, the IICA technique is used to find the flaws.
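The two pre-processing steps can be sketched as follows. This is a minimal NumPy sketch, not the paper's implementation; `center_and_whiten` is an illustrative name, and the whitening matrix is built here via the standard eigendecomposition of the covariance matrix.

```python
import numpy as np

def center_and_whiten(X):
    """Pre-process data for ICA: centre each column, then whiten.

    Whitening uses V = C^(-1/2), the inverse square root of the
    covariance matrix, so the output has identity covariance.
    """
    # Centre: subtract the mean of each column
    Xc = X - X.mean(axis=0)
    # Covariance matrix of the centred data
    C = np.cov(Xc, rowvar=False)
    # Eigendecomposition gives C = E diag(d) E^T
    d, E = np.linalg.eigh(C)
    # Whitening matrix V = E diag(d^(-1/2)) E^T, i.e. C^(-1/2)
    V = E @ np.diag(1.0 / np.sqrt(d)) @ E.T
    # Whitened data: zero mean, unit variance, no correlation
    return Xc @ V

rng = np.random.default_rng(0)
X = rng.normal(size=(500, 3)) @ rng.normal(size=(3, 3))  # correlated data
Z = center_and_whiten(X)
```

After whitening, the covariance of `Z` is (numerically) the identity matrix, which is exactly the removal of second-order dependence described above.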

**The IICA Design**

The IICA model can be described as

$$X = A \cdot SS \qquad (3.1)$$

Where,

**X** – the observed data matrix (data samples)

**SS** – the source matrix (incremental independent components)

**A** – the mixing matrix

In IICA, only the data matrix X is known beforehand; the mixing matrix A and the source matrix SS are unknown and must be estimated. Once A has been estimated, the sources can be recovered as

$$SS = W \cdot X \qquad (3.2)$$

Here, W is the (pseudo)inverse of the mixing matrix A and is referred to as the separation (unmixing) matrix.
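Equations (3.1) and (3.2) can be illustrated with a small NumPy sketch. In practice the mixing matrix is unknown and must be estimated (e.g. by an ICA algorithm); here we assume it is known, purely to show how the separation matrix recovers the sources — the variable names are illustrative, not from the paper.

```python
import numpy as np

rng = np.random.default_rng(1)
# SS: two independent source signals (rows), 200 samples each
SS = rng.laplace(size=(2, 200))
# A: the mixing matrix (unknown in practice, simulated here)
A = np.array([[1.0, 0.5],
              [0.3, 2.0]])
# Forward model, Eq. (3.1): the observed mixtures
X = A @ SS
# Separation, Eq. (3.2): W is the (pseudo)inverse of A
W = np.linalg.pinv(A)
SS_hat = W @ SS if False else W @ X  # apply the separation matrix to X
```

Because A is square and invertible here, `SS_hat` matches the original sources exactly; with an estimated mixing matrix the recovery would hold up to scaling and permutation of the components.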

*C. Support vector machine*

The support vector machine (SVM) is a standard supervised learning method for classification and regression problems, and it is most frequently used for classification. By computing the optimal line or decision boundary that partitions an n-dimensional space into classes, the SVM can quickly assign subsequent data points to the correct class. This optimal decision boundary is a hyperplane. The SVM selects the extreme points (vectors) that help construct the maximum-margin hyperplane; these points are called support vectors, which gives the technique its name.

*D. Working SVM*

The SVM algorithm can be illustrated with an example. Suppose we have a data set with two classes and two features (x1 and x2): a blue group representing depressed individuals and a black group representing healthy, non-depressed individuals. To tell the blue points from the black points, we need a classifier [Figure 3.0]. Since this is a two-dimensional space, the two categories can be separated by drawing a line between them. However, many different lines can separate these categories [Fig. 3]. The SVM method therefore finds the optimal line or decision boundary, also called a hyperplane. It does so by locating the points of the two categories that lie closest to the boundary; these points are the support vectors. The margin is the distance between the hyperplane and these vectors, and the goal of the SVM is to maximise it. As Fig. 2 shows, the hyperplane with the largest margin is the optimal one [11][13][36][37][38].
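The two-group example above can be sketched with scikit-learn. This is a minimal illustration, not the paper's experiment: the two clusters and their labels are synthetic stand-ins for the blue (depressed) and black (healthy) groups.

```python
import numpy as np
from sklearn.svm import SVC

rng = np.random.default_rng(2)
# Two features (x1, x2); class 0 = "blue" group, class 1 = "black" group
blue = rng.normal(loc=[-2.0, -2.0], scale=0.5, size=(50, 2))
black = rng.normal(loc=[2.0, 2.0], scale=0.5, size=(50, 2))
X = np.vstack([blue, black])
y = np.array([0] * 50 + [1] * 50)

# A linear SVM finds the maximum-margin hyperplane w.x + b = 0
clf = SVC(kernel="linear", C=1.0).fit(X, y)

# The support vectors are the training points closest to the boundary
n_sv = len(clf.support_vectors_)
pred = clf.predict([[-2.0, -2.0], [2.0, 2.0]])
```

Only a handful of the 100 training points end up as support vectors; the rest could be removed without changing the learned hyperplane, which is the key property of the maximum-margin formulation.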

All patches are represented as RGB pixels, with values ranging from 0 to 255. One of our original goals was to classify these images with machine learning systems, so to work with the techniques we rescaled the values to the range 0 to 1.
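The rescaling step is a single division by the maximum 8-bit intensity. A minimal sketch, using random stand-in patches of the stated 48 × 48 RGB shape rather than the actual dataset:

```python
import numpy as np

# A batch of 48x48 RGB patches with 8-bit intensities (0-255)
patches = np.random.default_rng(3).integers(
    0, 256, size=(4, 48, 48, 3), dtype=np.uint8
)

# Rescale to floats in [0, 1] before feeding the classifiers
scaled = patches.astype(np.float32) / 255.0
```

Converting to float before dividing avoids the integer truncation that `patches / 255` on unsigned types could otherwise hide.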

In machine learning and statistics, classification refers to a supervised learning technique in which a program first uses the input data to learn how to classify incoming observations. Only datasets with discrete class labels, binary or multiclass, are suitable [31]. Speech recognition, character recognition, person authentication, handwriting recognition, and so on are all examples of significant classification problems.

The pattern-recognition algorithm known as k-nearest neighbour uses previously collected data to find the examples most similar to a new one. Following nearest-neighbour theory, several training samples adjacent to the new point are identified and then used to predict its label. K-nearest neighbour learning allows the user either to fix the number of samples or to let it adapt to the local point density. A standard Euclidean distance is often used as the distance metric, but other distance metrics are acceptable. Its straightforward design makes k-nearest neighbour a viable option for a wide variety of datasets, and the method has been shown to produce good results even for complex decision boundaries. As Figure 3 shows, higher values of K yield a smoother boundary with smaller variance.

The SVM, in particular, excels in situations with several dimensions: it plots the data set in n-dimensional space, where n is the number of features and each coordinate is one feature value, and the best separating hyperplane is then used to form the categories. For KNN, on the other hand, the Euclidean distance is chosen as the metric of closeness between data points. The equation in Fig. 4 is used to represent the Euclidean distance d(x, y).

$$d\left(x,y\right)= \sqrt{{\left({x}_{1}-{y}_{1}\right)}^{2}+ \dots +{\left({x}_{n}-{y}_{n}\right)}^{2}}$$
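The distance computation and majority vote that make up KNN can be sketched as follows. This is a minimal NumPy sketch under the assumptions above (Euclidean metric, fixed k); `knn_predict` is an illustrative name, not from the paper.

```python
import numpy as np
from collections import Counter

def knn_predict(X_train, y_train, x_test, k=3):
    """Classify x_test by majority vote among its k nearest neighbours,
    using d(x, y) = sqrt(sum_i (x_i - y_i)^2)."""
    # Euclidean distance from x_test to every training point
    d = np.sqrt(((X_train - x_test) ** 2).sum(axis=1))
    # Indices of the k closest training points
    nearest = np.argsort(d)[:k]
    # Majority vote over their labels
    return Counter(y_train[nearest]).most_common(1)[0][0]

X_train = np.array([[0.0, 0.0], [0.1, 0.2], [0.9, 1.0], [1.0, 0.8]])
y_train = np.array([0, 0, 1, 1])
label = knn_predict(X_train, y_train, np.array([0.05, 0.1]), k=3)  # -> 0
```

With k = 3, the query point's three nearest neighbours carry labels 0, 0, and 1, so the majority vote returns class 0; a larger k averages over more neighbours, which is what produces the smoother boundaries noted above.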