Microscopic peripheral malarial parasite detection and classification in blood smears using Gabor Filters and Machine learning algorithms

Background: Malaria is a fever-causing disease, mainly caused by the Plasmodium parasite, which infects red blood cells. Manual counting of blood cells is tedious, which makes diagnosis laborious, and the effect is felt most in large-scale screening. Introduction: With advances in technology, computer-aided detection and analysis of malaria is proposed, based on Gabor filters followed by a comparison of the XG-Boost classifier, Support Vector Machine and Neural Network classifier algorithms, chosen as the architectures for recognition and classification of malarial blood cells. Objective: The goal of this paper is to reduce the complexity arising from model discrepancies and to achieve better robustness and generalization, through the development of models that detect and classify the parasitized and uninfected blood cells in a given sample. Roughly 13750 parasitized and 13750 unparasitized samples were taken for the experiments. Results: In the experiments, S.V.M achieved 94%, XG-Boost achieved 90% and the neural network classifier achieved 80%; of these, S.V.M gave the best results in classifying and recognizing the parasitized and uninfected blood cells, improving the accuracy of decision making. Conclusion: These M.L models handle these problems with low variance and obtained excellent results.

risk. A distinct and confined area, found in a currently or formerly malarious area, that contains the epidemiological and ecological factors necessary for malaria transmission. The WHO promotes the development of rapid and economical diagnostic tests that allow proper treatment methods to be identified [1][2][3]. Malaria is a fatal disease caused by parasites that spread to people through the bites of infected female Anopheles mosquitoes. According to a survey report, there were an estimated 229 million cases of malaria worldwide in 2019, and the estimated number of deaths stood at 409 000. Children under 5 years of age are the group most susceptible to malaria [17][18]. The WHO report indicates that the African Region carries an extremely high share of the global malaria burden. Microscopy is the widely accepted method of testing, but it is time consuming, its result depends on the parasitologist [4], and misdiagnosis leads to wrong treatment. To overcome these drawbacks, automated malaria diagnosis is an emerging research area that aims to support better treatment by providing reliable results, quantifying diagnostic accuracy, and reducing cost in rural areas [20][21]. According to medical experts, five Plasmodium species cause malaria in humans: P. falciparum, P. vivax, P. malariae, P. ovale and P. knowlesi. The two most common are P. falciparum and P. vivax. P. falciparum is more severe than the others and causes the most deaths. The stages of the malarial cells are shown in Figure 3. In the first slide, P. falciparum trophozoites and gametocytes can be observed along with white blood cells. Next, the enlarged nucleus is compared with the rest of the red blood cells in the image. In the second image, P. falciparum ring stages appear together with schizonts [16].

Classification of malaria cells
In recent times, a growing number of studies have been devoted to applying computer vision and machine learning to the automated diagnosis of malaria. Following recent related efforts [1][2][3], an automated analysis method was presented in [4] for the detection and assessment of red blood cells (RBCs) infected by the malaria parasite. To classify RBCs, three different ML algorithms were set up as RBC classifiers and compared for prediction accuracy and speed.

Image Smoothing
The cell images were corrupted with Gaussian noise and salt-and-pepper noise, and the effect of blurring with box, Gaussian, median and bilateral filters was compared on both kinds of noisy image; the results were not as promising as expected. 2D convolution filtering was then applied with various low-pass and high-pass filters to remove noise and blur the image. A high-pass filter produced promising results by finding the edges in the cell images. A 2 × 2 averaging filter kernel was applied to the cell images, $K = \frac{1}{9}\begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}$. This filtering kernel gave the expected results. For every pixel, a 2 × 2 window is centered on that pixel, all the pixels falling inside the window are summed, and the result is divided by 9. This value is taken as the average of the pixel values inside the window. The filtered output images are shown in Figures 4 to 7. In these results for parasitized images, the parasitic regions were mostly circular, and hence a circular kernel was chosen for feature extraction and recognition.
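The window-averaging step described above can be sketched as follows. This is a minimal illustration rather than the implementation used in the experiments. It assumes a 3 × 3 window, so that dividing by 9 yields a true average, and it assumes border pixels are handled by edge replication. The function name `mean_filter` and the nested-list image representation are hypothetical.

```python
def mean_filter(image, size=3):
    """Averaging (box) filter over a size x size window.

    `image` is a list of rows of gray values. `size` is assumed to be odd.
    Pixels outside the image are replaced by the nearest edge pixel
    (an assumption; the text does not state the border handling).
    """
    height, width = len(image), len(image[0])
    radius = size // 2
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            total = 0.0
            for dy in range(-radius, radius + 1):
                for dx in range(-radius, radius + 1):
                    yy = min(max(y + dy, 0), height - 1)  # clamp row index
                    xx = min(max(x + dx, 0), width - 1)   # clamp column index
                    total += image[yy][xx]
            # Divide the window sum by the number of pixels in the window.
            out[y][x] = total / (size * size)
    return out
```

With the default `size=3`, the division is by 9, matching the normalization factor quoted in the text.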

Gabor Filtration technique applied
Many samples can be seen with the naked eye to contain no malaria-infected cells, and these can be used to reduce the overall processing run-time. To identify the infected cell samples, a statistical analysis technique was implemented. The infected area was then located, and thresholding was performed on the color image using the Gabor filter method. The outcome confirmed that noise was present not only in the background but also inside the RBCs. A series of morphological operations was then applied to fill the holes and obtain individual samples, as shown in the figure.

Figure 8 (infected image) and Figure 9 (uninfected image) show the real and imaginary parts of the Gabor kernel. Let I(x, y) be the gray value at (x, y). The convolution of the sample I with the Gabor kernel at scale ω and orientation θ is as given in the equation below.
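To illustrate how a complex Gabor kernel with real and imaginary parts can be built and convolved with a gray-level image, a minimal sketch is given here. It uses one common form of the kernel, a Gaussian envelope multiplied by a complex sinusoid. The envelope width `sigma`, the kernel `radius`, the isotropic envelope and the edge-replication border handling are all assumptions, and the function names are hypothetical. The exact kernel used in the experiments may differ.

```python
import cmath
import math


def gabor_kernel(omega, theta, sigma=2.0, radius=4):
    """Complex Gabor kernel of side 2*radius+1 as nested lists.

    omega: angular frequency (scale) of the carrier, in radians per pixel.
    theta: orientation of the carrier, in radians.
    sigma, radius: assumed values, not taken from the paper.
    """
    kernel = []
    for y in range(-radius, radius + 1):
        row = []
        for x in range(-radius, radius + 1):
            # Coordinate along the carrier direction after rotation by theta.
            xr = x * math.cos(theta) + y * math.sin(theta)
            envelope = math.exp(-(x * x + y * y) / (2.0 * sigma * sigma))
            row.append(envelope * cmath.exp(1j * omega * xr))
        kernel.append(row)
    return kernel


def gabor_response(image, kernel, cx, cy):
    """Magnitude of the convolution of `image` with `kernel` at (cx, cy)."""
    radius = len(kernel) // 2
    height, width = len(image), len(image[0])
    acc = 0j
    for ky, row in enumerate(kernel):
        for kx, weight in enumerate(row):
            # Convolution flips the kernel: image index is centre minus offset.
            yy = min(max(cy - (ky - radius), 0), height - 1)
            xx = min(max(cx - (kx - radius), 0), width - 1)
            acc += image[yy][xx] * weight
    return abs(acc)
```

The real part of each kernel entry is the even (cosine) component and the imaginary part is the odd (sine) component. Evaluating `gabor_response` over several values of `omega` and `theta` gives one feature per scale and orientation for each pixel.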

Data-preprocessing
The behaviour and performance of a model depend heavily on the data fed to the supervised learner, so preprocessing plays a major role in decision making. Since image smoothing and Gabor filters were used for feature extraction, a large feature vector was obtained for each image. To classify the data properly, each vector was normalized to the range 0 to 255. For feature selection, the chi-square method was applied to select the best 80% of the features for the final feature vector.
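The two preprocessing steps can be sketched as below, under stated assumptions. The scaling is assumed to be min-max scaling per feature vector, and the chi-square score is assumed to follow the usual formulation for non-negative features, comparing the per-class sum of a feature with the sum expected if the feature were independent of the class. All function names are hypothetical.

```python
def scale_to_255(vector):
    """Min-max scale one feature vector to the range 0 to 255."""
    lo, hi = min(vector), max(vector)
    if hi == lo:
        return [0.0 for _ in vector]  # constant vector: no spread to scale
    return [255.0 * (v - lo) / (hi - lo) for v in vector]


def chi2_scores(X, y):
    """Chi-square score of each non-negative feature against the labels."""
    classes = sorted(set(y))
    n_samples = len(X)
    scores = []
    for j in range(len(X[0])):
        total = sum(row[j] for row in X)
        score = 0.0
        for c in classes:
            observed = sum(row[j] for row, label in zip(X, y) if label == c)
            class_share = sum(1 for label in y if label == c) / n_samples
            expected = total * class_share
            if expected > 0:
                score += (observed - expected) ** 2 / expected
        scores.append(score)
    return scores


def select_top_fraction(X, y, fraction=0.8):
    """Keep the given fraction of features with the highest chi-square score.

    Returns the reduced feature matrix and the kept column indices.
    """
    scores = chi2_scores(X, y)
    n_keep = max(1, int(round(fraction * len(scores))))
    ranked = sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)
    kept = sorted(ranked[:n_keep])
    return [[row[j] for j in kept] for row in X], kept
```

A feature whose value is the same in both classes receives a score of zero and is therefore among the first to be dropped.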

Support Vector Machine
SVM is generally useful for statistical learning and for determining the location of the decision boundaries that give the optimal separation of classes. The SVC classifier was used; it implements the "one-against-one" approach for multi-class classification problems, where the labels are drawn from a finite set of several elements. With infected (parasitized) and uninfected samples, the number of classes here is 2. The decision function shape option allows the results of the "one-against-one" classifiers to be transformed into a decision function of shape (13000 samples, 2 classes). Applying each classifier to the test data vectors gives one vote to the predicted class.
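The one-against-one voting scheme can be illustrated with a short sketch. It assumes each pairwise classifier is available as a function returning a signed score. The linear scoring function and its weights below are hypothetical stand-ins for a trained SVM, not values from the experiments.

```python
from itertools import combinations


def ovo_predict(x, classes, pairwise):
    """One-against-one prediction by majority vote.

    pairwise[(a, b)] is a function of x whose positive score votes for
    class a and whose non-positive score votes for class b.
    Returns the winning class and the vote tally.
    """
    votes = {c: 0 for c in classes}
    for a, b in combinations(classes, 2):
        winner = a if pairwise[(a, b)](x) > 0 else b
        votes[winner] += 1
    return max(classes, key=lambda c: votes[c]), votes


# Hypothetical two-class setup: a single pairwise classifier is needed.
classes = ["parasitized", "uninfected"]
pairwise = {
    ("parasitized", "uninfected"): lambda x: 0.02 * x[0] - 0.01 * x[1] - 1.0,
}
label, tally = ovo_predict([200.0, 50.0], classes, pairwise)
```

With k classes the scheme trains k(k-1)/2 pairwise classifiers, so the two-class problem considered here reduces to a single classifier casting a single vote.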

Neural Network Classifier
Neural networks were used for classification and recognition, since a neural network consists of an artificial network of functions with parameters that can learn the features of the images in order to analyze new data after receiving

Boosting builds the model from individual 'weak learners' in an iterative way. Unlike in a random forest, the individual models in boosting are not built entirely on random subsets of the data or features. Gradient boosting uses gradient descent to minimize the loss function. The XGBoost algorithm was used to optimize accuracy and speed; it uses L1 and L2 regularization to prevent overfitting and is fast to compute. With this algorithm, an accuracy of about 90% was achieved, as shown in Table 3.
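The iterative use of weak learners can be sketched as follows. This is not XGBoost itself but a minimal gradient-boosting illustration under several assumptions: squared-error loss, one-split decision stumps as weak learners, splits chosen by squared error rather than by the XGBoost gain, and an L2 penalty `lam` that shrinks each leaf value (the L1 term is omitted). All names and default values are hypothetical.

```python
def fit_stump(X, residuals, lam):
    """Best one-split stump on the residuals, or None if no split exists.

    Each leaf value is sum(residuals) / (count + lam), i.e. the leaf mean
    shrunk by the L2 penalty lam.
    """
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(row[j] for row in X)):
            left = [r for row, r in zip(X, residuals) if row[j] <= t]
            right = [r for row, r in zip(X, residuals) if row[j] > t]
            if not left or not right:
                continue
            lv = sum(left) / (len(left) + lam)
            rv = sum(right) / (len(right) + lam)
            err = sum((r - lv) ** 2 for r in left)
            err += sum((r - rv) ** 2 for r in right)
            if best is None or err < best[0]:
                best = (err, j, t, lv, rv)
    return None if best is None else best[1:]


def boost(X, y, rounds=20, rate=0.3, lam=1.0):
    """Fit an additive model of stumps by gradient boosting."""
    model, pred = [], [0.0] * len(y)
    for _ in range(rounds):
        # For squared-error loss the negative gradient is the residual.
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        stump = fit_stump(X, residuals, lam)
        if stump is None:
            break
        j, t, lv, rv = stump
        # Store leaf values already scaled by the learning rate.
        model.append((j, t, rate * lv, rate * rv))
        pred = [p + (rate * lv if row[j] <= t else rate * rv)
                for row, p in zip(X, pred)]
    return model


def predict(model, row):
    """Sum of the stump outputs for one feature vector."""
    return sum(lv if row[j] <= t else rv for j, t, lv, rv in model)
```

For a two-class problem with labels 0 and 1, thresholding `predict` at 0.5 gives a class decision. Each round fits the next stump to what the current ensemble still gets wrong, which is the sense in which the model is built iteratively from weak learners.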

Result Analysis
To assess the accuracy of malarial parasite detection, a series of experiments was run using various machine learning algorithms. The SVM classification accuracy obtained was 94%, as shown in Table 2. The XG-Boost