The proposed approach automatically detects potato leaf disease using machine learning and computer vision, classifying leaves into early blight, healthy and late blight categories. Because high accuracy is expected, the system must perform robust feature extraction after pre-processing, feature selection using principal component analysis (PCA) and classification with a non-destructive technique. The fundamental steps used for potato leaf disease detection are shown in Figure I. The following sub-sections explain each step in detail.

**a. Image Acquisition** The proposed algorithm uses two different potato leaf datasets. The first dataset (D1) contains early blight, healthy and late blight images (Plant Village Dataset); a total of 1500 images were taken from Kaggle.com, created by Muhammad Ardi Putra (2021). The second dataset (D2) contains healthy and late blight images; a total of 430 images were taken from mendely.com, created by Natnael Tilahun (2020). Table III describes the datasets used by the proposed algorithm. Figure II shows some sample images from the datasets.

Table III Description of the datasets used

| S.No. | Dataset | Author | Early Blight | Healthy | Late Blight |
|---|---|---|---|---|---|
| 1 | D1 | Muhammad Ardi Putra (2021) | 500 | 500 | 500 |
| 2 | D2 | Natnael Tilahun (2021) | - | 363 | 67 |

**b. Image Pre-processing** Images acquired with different techniques can contain various kinds of noise that reduce image quality, so they may not provide enough information for image processing. Therefore, intensity adjustment is applied or a colour map is created to enhance the image. This strengthens the contrast by choosing a wider intensity range for the image. Defects can then be identified conveniently through colour adjustment. Colour (RGB) images are available from digital Kodak cameras, but luminance details cannot be separated from colour. Hence, grayscale images are used, which also reduces the processing time. The conversion takes the mean of the RGB values (Dorj et al. 2017).

$$G_{Int.}=\frac{1}{3}\left(R+G+B\right) \tag{1}$$

where \(G_{Int.}\) is the grayscale intensity and \(R\), \(G\) and \(B\) are the intensities of the red, green and blue channels.

Gaussian filters are used to reduce the noise. The image structure is enhanced at various scale values. Mathematically, the filtering uses the kernel values of the Gaussian function.
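The pre-processing described above can be sketched in a few lines. This is a minimal illustration rather than the authors' implementation: the function names, the kernel radius, the value of `sigma`, the edge padding and the random stand-in image are all assumptions made for the example.

```python
import numpy as np

def to_gray(rgb):
    """Average the R, G and B channels, as in Eq. (1)."""
    return rgb.astype(np.float64).mean(axis=2)

def gaussian_kernel(sigma, radius):
    """1-D Gaussian kernel, normalised so that its values sum to 1."""
    x = np.arange(-radius, radius + 1, dtype=np.float64)
    k = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    return k / k.sum()

def gaussian_smooth(gray, sigma=1.0, radius=2):
    """Separable Gaussian filter: rows first, then columns (edge-padded)."""
    k = gaussian_kernel(sigma, radius)
    padded = np.pad(gray, radius, mode="edge")
    # 'valid' convolution removes the padding again, so the shape is preserved.
    rows = np.apply_along_axis(lambda r: np.convolve(r, k, mode="valid"), 1, padded)
    return np.apply_along_axis(lambda c: np.convolve(c, k, mode="valid"), 0, rows)

# Hypothetical 8x8 RGB image standing in for a leaf photograph.
rng = np.random.default_rng(0)
rgb = rng.integers(0, 256, size=(8, 8, 3), dtype=np.uint8)
gray = to_gray(rgb)
smooth = gaussian_smooth(gray)
```

A larger `sigma` removes more noise but also blurs fine lesion edges, so the scale would normally be tuned on sample images.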

**c. Segmentation**

Fuzzy c-means (FCM) partitions the data points into groups, where each point belongs to every cluster to a degree given by a membership function. The cluster centroids are chosen to minimise a dissimilarity objective function. Mathematically, the membership matrix (M) satisfies

$$\sum_{j=1}^{l} M_{jk}=1, \quad \forall k=1,2,\dots,n \tag{2}$$

The dissimilarity function (D) is given by

$$D\left(M, l_{1}, l_{2},\dots, l_{l}\right)=\sum_{j=1}^{l} D_{j}=\sum_{j=1}^{l}\sum_{k=1}^{n} M_{jk}^{m} E_{jk}^{2} \tag{3}$$

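where \(l_{j}\) is the centroid of cluster \(j\), \(E_{jk}\) is the Euclidean distance between centroid \(j\) and data point \(k\), \(l\) is the number of clusters, \(n\) is the number of data points and \(m > 1\) is the fuzzification exponent.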
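The alternating update behind Eqs. (2) and (3) can be sketched as follows. This is an illustrative version under stated assumptions, not the authors' code: it uses the standard FCM update rules, a fuzzification exponent `m = 2`, a random initial membership matrix and six made-up grayscale intensities as data.

```python
import numpy as np

def fuzzy_c_means(X, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Return (M, centroids); M[j, k] is the membership of point k in cluster j."""
    rng = np.random.default_rng(seed)
    M = rng.random((n_clusters, X.shape[0]))
    M /= M.sum(axis=0, keepdims=True)            # Eq. (2): each column sums to 1
    for _ in range(n_iter):
        W = M ** m
        centroids = (W @ X) / W.sum(axis=1, keepdims=True)
        # E[j, k]: Euclidean distance between centroid j and point k.
        E = np.linalg.norm(centroids[:, None, :] - X[None, :, :], axis=2)
        E = np.maximum(E, 1e-12)                 # guard against division by zero
        inv = E ** (-2.0 / (m - 1.0))
        M_new = inv / inv.sum(axis=0, keepdims=True)
        converged = np.abs(M_new - M).max() < tol
        M = M_new
        if converged:
            break
    return M, centroids

def dissimilarity(M, centroids, X, m=2.0):
    """Objective of Eq. (3): sum over j and k of M_jk^m * E_jk^2."""
    E = np.linalg.norm(centroids[:, None, :] - X[None, :, :], axis=2)
    return float(((M ** m) * E ** 2).sum())

# Hypothetical normalised pixel intensities: three dark and three bright pixels.
X = np.array([[0.10], [0.12], [0.15], [0.80], [0.85], [0.90]])
M, centroids = fuzzy_c_means(X, n_clusters=2)
labels = M.argmax(axis=0)                        # hard label = strongest membership
```

Taking the strongest membership per pixel turns the soft partition into a segmentation mask; the soft memberships themselves can be kept when boundary pixels need to be treated as uncertain.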

**d. Feature Extraction**

The processed image is analysed further through feature measurement. This step builds a feature vector for each input image from values computed on the image, which improves the recognition rate. The proposed method extracts distinct features such as textural, statistical, geometrical, histogram of gradients, Laws' texture energy and discrete wavelet transform features (Ou et al. 2014, Moallem et al. 2017) to improve accuracy. SURF (Speeded-Up Robust Features) contributes 112 more features, extracted using the image processing toolbox. Fig III shows all the features extracted in the proposed automated potato leaf disease detection technique.

Statistical features include variance, smoothness, standard deviation, RMS, kurtosis, skewness and inverse difference moment, which represent first-order moment values. The proposed method uses the pixel histogram of the defective region. Textural features include energy, contrast, homogeneity, entropy and correlation, which represent second-order statistics. The gray-level distribution describes the image texture element by element and indicates how uniform the pattern is.
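Several of the statistical features listed above can be computed directly from the pixel values of a region. The sketch below is illustrative only: the function name and the sample pixel list are hypothetical, population (not sample) moments are assumed, and smoothness is taken to be the common definition \(1 - 1/(1+\sigma^{2})\), which the text does not specify.

```python
import math

def statistical_features(pixels):
    """First-order statistics of a flat list of grayscale pixel values."""
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n       # population variance
    std = math.sqrt(var)
    rms = math.sqrt(sum(p * p for p in pixels) / n)
    # Standardised third and fourth central moments; a flat region gives 0.
    skew = sum((p - mean) ** 3 for p in pixels) / (n * std ** 3) if std else 0.0
    kurt = sum((p - mean) ** 4 for p in pixels) / (n * std ** 4) if std else 0.0
    smoothness = 1.0 - 1.0 / (1.0 + var)                 # assumed definition
    return {"variance": var, "std": std, "rms": rms,
            "skewness": skew, "kurtosis": kurt, "smoothness": smoothness}

# Hypothetical pixel values from a defective region.
features = statistical_features([1, 2, 3, 4, 5])
```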

Laws' texture energy measures apply several masks to capture the different textures of the image. The region of interest (ROI) is determined from the energy of the convolved image. The DWT provides data compression and noise removal across the frequency sub-bands (LL, LH, HL, HH). The LL component contains the most energy and information. The combination of statistical, colour, textural, texture energy, histogram of gradients and discrete wavelet features enables the detection of various levels of potato leaf disease.
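The sub-band decomposition mentioned above can be illustrated with a one-level 2-D Haar transform. The Haar wavelet is an assumption made for this sketch, since the text does not name the wavelet family, and the LH/HL naming convention varies between libraries. The test image is a synthetic gradient, not data from the paper.

```python
import numpy as np

def haar_dwt2(img):
    """One-level orthonormal 2-D Haar DWT; height and width must be even."""
    a, b = img[0::2, 0::2], img[0::2, 1::2]   # top-left, top-right of each 2x2 block
    c, d = img[1::2, 0::2], img[1::2, 1::2]   # bottom-left, bottom-right
    LL = (a + b + c + d) / 2.0                # approximation (low-pass both ways)
    LH = (a - b + c - d) / 2.0                # differences between columns
    HL = (a + b - c - d) / 2.0                # differences between rows
    HH = (a - b - c + d) / 2.0                # diagonal differences
    return LL, LH, HL, HH

# Synthetic 8x8 gradient image with pixel (i, j) equal to i + j.
img = np.add.outer(np.arange(8), np.arange(8)).astype(np.float64)
LL, LH, HL, HH = haar_dwt2(img)
energy = [float((band ** 2).sum()) for band in (LL, LH, HL, HH)]
ll_share = energy[0] / sum(energy)            # fraction of energy held by LL
```

For a smooth image such as this one, almost all of the energy falls in the LL band, which is consistent with the statement that LL carries most of the information.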

**Feature Selection**

The proposed system does not use all 112 features concurrently, because doing so degrades performance. Therefore, principal component analysis (PCA) is used as the feature-reduction step to select a 30-feature subset among 150 features. The advantage of feature reduction is lower space and time complexity. According to Li et al. (2019), the central idea of PCA is to reduce the dimensionality of a dataset with a large number of interrelated variables while retaining as much as possible of the variation present in the dataset. This is achieved by transforming to a new set of variables, the principal components, which are uncorrelated and ordered so that the first few retain most of the variation present in all of the original variables. Mathematically,

$$X_{k}=X- \sum_{s=1}^{k-1} X W_{s} W_{s}^{T} \tag{4}$$
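Eq. (4) describes deflation: the variance explained by the components already found is subtracted from the data before the next component is computed. A minimal sketch of that idea follows. It assumes that \(W_{s}\) denotes the \(s\)-th principal direction (a unit vector) and that \(X\) is mean-centred; the function name and the random 40-by-10 matrix are illustrative, not the paper's data.

```python
import numpy as np

def pca_deflation(X, n_components):
    """Return (scores, W) where the columns of W are the principal directions."""
    Xc = X - X.mean(axis=0)                  # centre each feature
    Xk = Xc.copy()
    directions = []
    for _ in range(n_components):
        # eigh returns eigenvalues in ascending order, so the last eigenvector
        # is the direction of largest remaining variance.
        _, eigvecs = np.linalg.eigh(Xk.T @ Xk)
        w = eigvecs[:, -1]
        directions.append(w)
        Xk = Xk - Xk @ np.outer(w, w)        # deflate, as in Eq. (4)
    W = np.column_stack(directions)
    return Xc @ W, W

# Hypothetical feature matrix: 40 leaves, 10 features each, reduced to 3.
rng = np.random.default_rng(0)
X = rng.normal(size=(40, 10))
scores, W = pca_deflation(X, n_components=3)
```

The same function would be called with `n_components=30` on the full feature matrix to obtain the 30-feature subset described in the text.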

**e. Classification**

Classification is the fundamental step for determining the disease category of a plant. Here, the relevant information is drawn from the processed image. The proposed method uses the following classification approaches to determine the condition of potato leaves.

**k-Nearest Neighbour Classifier (k-NN)**

k-NN is a non-parametric, supervised classification method (Pandey et al. 2014) that makes no assumption about the distribution of the data. In this approach, a sample is assigned to the majority class among the training samples closest to it. The distance to each neighbour is calculated with the Hamming formula.
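A minimal sketch of k-NN with the Hamming distance is given below. It assumes that the feature vectors have been binarised, since the Hamming distance counts differing positions; the training vectors, the labels and the choice `k = 3` are made up for illustration.

```python
from collections import Counter

def hamming(a, b):
    """Number of positions at which two equal-length feature vectors differ."""
    return sum(x != y for x, y in zip(a, b))

def knn_predict(train_X, train_y, query, k=3):
    """Majority vote among the k training samples closest to the query."""
    order = sorted(range(len(train_X)), key=lambda i: hamming(train_X[i], query))
    votes = Counter(train_y[i] for i in order[:k])
    return votes.most_common(1)[0][0]

# Hypothetical binarised feature vectors and their class labels.
train_X = [(0, 0, 0, 1), (0, 0, 1, 1), (0, 1, 0, 0),
           (1, 1, 1, 0), (1, 1, 0, 0), (1, 0, 1, 0)]
train_y = ["healthy", "healthy", "healthy",
           "late blight", "late blight", "late blight"]

pred_a = knn_predict(train_X, train_y, (0, 0, 0, 0))
pred_b = knn_predict(train_X, train_y, (1, 1, 1, 0))
```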

**Logistic Regression (LR)**

Logistic regression models the relationship between the independent variables and the dependent variable. The class of each object is identified from a weighted combination of its independent variables (Irin et al. 2014).
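The sketch below shows a binary logistic regression fitted by gradient descent, for example healthy (0) versus late blight (1) as in dataset D2. It is an illustration under assumptions: the single standardised feature, the learning rate and the iteration count are hypothetical, and a three-class problem such as D1 would need a multinomial or one-vs-rest extension.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def add_intercept(X):
    """Prepend a column of ones so that w[0] acts as the intercept."""
    return np.hstack([np.ones((X.shape[0], 1)), X])

def fit_logistic(X, y, lr=0.5, n_iter=2000):
    """Gradient descent on the mean log-loss; returns the weight vector."""
    Xb = add_intercept(X)
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = sigmoid(Xb @ w)                  # predicted probability of class 1
        w -= lr * Xb.T @ (p - y) / len(y)    # gradient step
    return w

def predict(w, X):
    """Assign class 1 when the predicted probability is at least 0.5."""
    return (sigmoid(add_intercept(X) @ w) >= 0.5).astype(int)

# Hypothetical standardised feature values for six leaves.
X = np.array([[-0.4], [-0.3], [-0.2], [0.2], [0.3], [0.4]])
y = np.array([0, 0, 0, 1, 1, 1])
w = fit_logistic(X, y)
pred = predict(w, X)
```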

**Artificial Neural Network (ANN)**

An ANN consists of neurons that combine weights and threshold values through transfer functions (Wen et al. 2012). When a statistical model cannot separate the objects, an ANN becomes the preferred choice. ANNs are a leading approach for classifying agricultural products. The weighted and biased inputs of a neuron produce its output. Most neurons apply an activation function of the form

$$f\left(net\right)=\tanh\left(x \cdot net-y\right)+z \tag{5}$$
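A single neuron using the transfer function of Eq. (5) can be sketched as follows. The text does not define \(x\), \(y\) and \(z\); the sketch assumes they are a slope, a horizontal shift and a vertical offset, and the default values, inputs and weights are hypothetical.

```python
import math

def transfer(net, x=1.0, y=0.0, z=0.0):
    """Eq. (5): f(net) = tanh(x * net - y) + z."""
    return math.tanh(x * net - y) + z

def neuron_output(inputs, weights, bias, **shape):
    """Weighted sum of the inputs plus a bias, passed through the transfer function."""
    net = sum(w * i for w, i in zip(weights, inputs)) + bias
    return transfer(net, **shape)

# Hypothetical two-input neuron.
out = neuron_output([1.0, 2.0], [0.5, -0.25], bias=0.0)
shifted = neuron_output([1.0, 2.0], [0.5, -0.25], bias=0.0, z=1.0)
```

With the defaults, the output stays within \([-1, 1]\); a non-zero `z` moves that range up or down.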

**Support Vector Machine (SVM)**

The SVM separates both linearly and non-linearly separable data. SVM (Chang & Lin 2001) constructs a hyperplane in a high-dimensional space to divide two distinct classes. Bias is introduced by ordering random samples. Figure IV shows the optimal hyperplane and the margin lines of the training samples. Line reversal is used to select samples. The decision function is

$$C\left(k\right)=\operatorname{sgn}\left(\sum_{i=1}^{N} d_{i}^{*} c_{i}+e^{*}\right) \tag{6}$$

where \(d_{i}^{*}\) are the Lagrange multipliers, \(e^{*}\) is the threshold and \(c_{i} \in \{+1, -1\}\) are the class labels.

Mathematically,

$$M\left(w, w'\right)=\left\langle \phi\left(w\right), \phi\left(w'\right)\right\rangle \tag{7}$$

Here, \(w\) and \(w'\) are input vectors and \(M\) is a kernel that satisfies Mercer's condition,

$$M\left(w, w'\right)=\sum_{n}^{\infty} a_{n} \phi_{n}\left(w\right) \phi_{n}\left(w'\right) \tag{8}$$

$$\iint M\left(w, w'\right) g\left(w\right) g\left(w'\right)\, dw\, dw' \geq 0 \tag{9}$$

These expressions define the kernel as an inner product in the feature space.
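Mercer's condition can be checked numerically on a finite sample: for a valid kernel, the Gram matrix is symmetric and the quadratic form \(g^{T} K g\), a discrete analogue of the double integral in Eq. (9), is non-negative for any vector \(g\). The sketch below uses the Gaussian (RBF) kernel as an example. That choice, the value of `gamma` and the random sample points are assumptions, since the text does not state which kernel is used.

```python
import numpy as np

def rbf_kernel(w, w2, gamma=1.0):
    """Gaussian (RBF) kernel, a standard kernel that satisfies Mercer's condition."""
    return np.exp(-gamma * np.sum((w - w2) ** 2))

def gram_matrix(samples, kernel):
    """Matrix K with K[i, j] = kernel(samples[i], samples[j])."""
    n = len(samples)
    return np.array([[kernel(samples[i], samples[j]) for j in range(n)]
                     for i in range(n)])

# Hypothetical feature vectors: 6 samples with 3 features each.
rng = np.random.default_rng(0)
samples = rng.normal(size=(6, 3))
K = gram_matrix(samples, rbf_kernel)

g = rng.normal(size=6)                            # arbitrary test vector
quad = float(g @ K @ g)                           # discrete analogue of Eq. (9)
min_eig = float(np.linalg.eigvalsh(K).min())      # smallest eigenvalue of K
```

A negative `min_eig` (beyond rounding error) would indicate that the candidate function is not a valid kernel for these samples.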