K-nearest neighbors (KNN) is a classification algorithm that assigns a class to a sample according to how close it lies to labelled samples. Distance measures such as the Euclidean distance are used to find the most similar samples in the dataset. In this study, KNN was used as the classification model for identifying PTB bacilli and was applied to the feature vector constructed for each detected candidate. If the PTB bacilli object class is identified, the recognition/classification is complete. If the class is still not known after KNN classification, KNN is applied to reduce the number of classes to two, and the distance matrix computed during KNN is then converted to a kernel matrix using the kernel trick. The experiments were conducted by dividing the dataset into different numbers of training and test images, with each image passing through preprocessing, feature extraction, feature reduction and feature vector construction. The KNN classifier was applied to the selected views of sputum smear images in a sub-sample of the dataset mentioned before; 70% of this dataset was used for training and 30% for testing in this scenario.
KNN classifies data based on a distance metric and can be used as a multi-class classifier. The distance metric is computed each time the classifier encounters a new unlabelled sample.
As presented in the previous section, the experiments were conducted under this scenario using the features extracted from the sputum smear images. The results were obtained with the KNN classifier using holdout validation with 30% of the data held out; the performance for this scenario is shown in Table 2.
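The conversion from a distance matrix to a kernel matrix can be illustrated with a minimal sketch. The text does not state which kernel was used, so the Gaussian (RBF) kernel, the `gamma` value and the function names below are assumptions chosen only for illustration.

```python
import math


def euclidean_distance_matrix(points):
    """Pairwise Euclidean distances between feature vectors."""
    n = len(points)
    return [[math.dist(points[i], points[j]) for j in range(n)] for i in range(n)]


def distances_to_rbf_kernel(dist, gamma=0.01):
    """Map each distance d to exp(-gamma * d^2).

    Assumption: a Gaussian (RBF) kernel; the study does not name its kernel.
    """
    return [[math.exp(-gamma * d * d) for d in row] for row in dist]


# Two hypothetical 2-D feature vectors that are 5 units apart.
points = [(0.0, 0.0), (3.0, 4.0)]
dist = euclidean_distance_matrix(points)
kernel = distances_to_rbf_kernel(dist, gamma=0.01)
```

With this choice, identical samples get a kernel value of 1 and the value decays towards 0 as the distance grows, so the kernel matrix stays symmetric like the distance matrix it was built from.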
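A minimal sketch of this distance-based voting is given below. The toy two-dimensional feature vectors, the value `k=3` and the function name `knn_predict` are illustrative assumptions; the study's own feature vectors have fourteen dimensions and its value of K is not stated here.

```python
import math
from collections import Counter


def knn_predict(train_features, train_labels, query, k=3):
    """Return the majority label among the k training vectors nearest to query."""
    # The distances are recomputed for every new unlabelled sample.
    distances = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_features, train_labels)
    )
    nearest_labels = [label for _, label in distances[:k]]
    return Counter(nearest_labels).most_common(1)[0][0]


# Hypothetical training data: label 1 = PTB positive, -1 = PTB negative.
train_features = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1),
                  (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_labels = [-1, -1, -1, 1, 1, 1]

prediction = knn_predict(train_features, train_labels, (0.95, 1.0), k=3)
```

An odd `k` avoids tied votes when there are only two classes.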
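The holdout procedure can be sketched as follows. Whether the original split was random or stratified is not stated, so the seeded random shuffle and the helper name `holdout_split` are assumptions.

```python
import random


def holdout_split(items, test_fraction=0.3, seed=0):
    """Shuffle the items and hold out a fraction of them for testing."""
    indices = list(range(len(items)))
    random.Random(seed).shuffle(indices)  # seeded so the split is reproducible
    n_test = round(len(items) * test_fraction)
    test = [items[i] for i in indices[:n_test]]
    train = [items[i] for i in indices[n_test:]]
    return train, test


# Hypothetical image identifiers standing in for the 180 sputum smear images.
image_ids = list(range(180))
train_ids, test_ids = holdout_split(image_ids, test_fraction=0.3)
```

For 180 images and a 30% holdout, this yields 126 training images and 54 test images.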
The dataset contained a total of 180 sputum smear images. There were two output classes in this study because each sputum smear image was predefined as PTB bacilli positive or negative. To evaluate the system's performance, the classification of each test image as PTB bacilli positive or negative was compared with the category assigned by domain experts (pathologists) selected from the Ethiopian Public Health Institute. As indicated in Table 4.1, the KNN classifier was evaluated on the test set of 54 sputum smear images using the computed morphological and color features. Fourteen features were combined (eight morphological features and six color features), together with the label predefined by the radiologist reading each sputum smear image as PTB bacilli positive (1) or negative (-1). Finally, the classification performance of the prototype system was computed based on Table 3.4, using the level set method with the classifier on the features extracted from the 54 test views of sputum smear images.
As described before, pathologists working manually can fail to identify PTB bacilli, missing less than 50% owing to oversight errors (Lumb et al., 2013). In PTB diagnosis, it is therefore evident that pathologists' skills play an essential role in the accuracy of bacilli detection. In this regard, the developed model could help achieve a higher level of accuracy in support of pathologists' skills and decision-making. The researchers developed a model for PTB bacilli detection, and its accuracy was tested using a sample dataset selected from the ground truth and the sources mentioned previously. As described above, performance measures such as accuracy, sensitivity and specificity were computed on the 30% of the dataset held out for testing. A confusion matrix was used for this purpose. The four categories in the confusion matrix are true positive, false positive, false negative and true negative. True positives are images that the prototype model classified as containing bacilli and that the domain expert also identified as positive. False positives are images that the model classified as positive although the domain expert labelled them negative; in these cases the model reports bacilli that are not present. True negatives are images that both the prototype model and the domain expert identified as negative, meaning that PTB bacilli were not found. False negatives are images that the domain expert labelled positive but that the prototype model classified as negative.
Table 2: Confusion matrix of prototype system of KNN classifier
As shown in Table 2, of the 54 test sputum smear images, the KNN classifier produced 36 true positives and 14 true negatives. According to the evaluation by the domain experts, 37 images were actually positive and 17 were negative. From Table 2, the researchers calculated accuracy, sensitivity, specificity and F-measure to assess PTB bacilli detection. In total, 36 true positives, 14 true negatives, three false positives and one false negative were observed. For the KNN algorithm, the overall detection accuracy was 92.6%, with sensitivity, specificity and F-measure of 93%, 92% and 94.7%, respectively.
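The measures can be recomputed from the four counts with a short sketch. It assumes the standard definitions (sensitivity = TP/(TP+FN), specificity = TN/(TN+FP), precision = TP/(TP+FP), F-measure as the harmonic mean of precision and sensitivity); the function name `confusion_metrics` is hypothetical.

```python
def confusion_metrics(tp, tn, fp, fn):
    """Compute standard binary classification measures from confusion counts."""
    total = tp + tn + fp + fn
    sensitivity = tp / (tp + fn)   # share of expert-positive images detected
    specificity = tn / (tn + fp)   # share of expert-negative images rejected
    precision = tp / (tp + fp)     # share of predicted positives that are correct
    f_measure = 2 * precision * sensitivity / (precision + sensitivity)
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "precision": precision,
        "f_measure": f_measure,
    }


# Counts observed on the 54 test images.
metrics = confusion_metrics(tp=36, tn=14, fp=3, fn=1)
```

Under these definitions, the counts give an accuracy of 50/54 (92.6%) and an F-measure of 72/76 (94.7%), in agreement with the reported values. Sensitivity comes out as 36/37 (97.3%) and specificity as 14/17 (82.4%), so the reported 93% and 92% may follow a different convention for these two measures.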
Algorithms that guarantee reliable detection in unpredictable situations are data dependent.
KNN can function successfully even if the data points are heterogeneously distributed. However, for most practical problems KNN is a poor choice because it scales poorly: finding the K nearest neighbors takes time linear in the number of examples.
Graphical User Interfaces of PTB Bacilli Detection
A graphical user interface (GUI) is a set of techniques and mechanisms used for interactive communication between programs and users. The GUI was designed to respond to user actions and display the PTB bacilli detection results. It gives users a better view of the operations that they can perform. The GUI for PTB bacilli detection makes the program easier to use by providing a consistent appearance and intuitive controls such as buttons, boxes, axes and menus. In this study, the researchers developed a GUI with a user guide for browsing images and analyzing the displayed results, that is, whether PTB bacilli were detected or not. The user can browse images by clicking the corresponding button from any location.
After the image is loaded, the PTB bacilli detected button displays the processed image and performs the classification using the k-nearest neighbor classifier. The presented result then gives the user a clear view of whether each processed image is PTB bacilli positive or negative at the click of a button. In general, the GUI can be used to identify an image as PTB positive or negative once it has been analyzed. The same GUI can be used for other image processing tasks by altering the callbacks. With GUI-based programs, PTB can be detected quickly and effectively without the need to rewrite the program's code. The proposed GUI, which clearly shows whether the result is PTB bacilli positive or negative, is shown in Fig. 1.
The first step is the acquisition of images from each of the two categories, PTB positive and PTB negative. The second step is image preprocessing, which manipulates the acquired images to remove unwanted noise and enhance image quality. Image preprocessing is therefore employed to make images look better to human viewers and to prepare them for segmentation of the region of interest. To reduce the workload associated with image preprocessing, the study considers various environmental parameters, such as illumination and camera resolution. In addition, the positions of the light sources and the camera relative to the items of interest, that is, the geometry of the viewing scenario, typically also have a significant impact on the contrast between the object and its background.
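Noise removal and enhancement of this kind can be sketched on a grayscale image stored as a list of rows. The specific operations here, a 3x3 median filter followed by a linear contrast stretch, are assumptions chosen for illustration; the text does not name the filters used in the study, and both function names are hypothetical.

```python
from statistics import median


def median_filter_3x3(image):
    """Replace each pixel by the median of its neighbourhood (clipped at borders)."""
    rows, cols = len(image), len(image[0])
    filtered = [row[:] for row in image]
    for r in range(rows):
        for c in range(cols):
            window = [
                image[rr][cc]
                for rr in range(max(0, r - 1), min(rows, r + 2))
                for cc in range(max(0, c - 1), min(cols, c + 2))
            ]
            filtered[r][c] = median(window)
    return filtered


def stretch_contrast(image, low=0, high=255):
    """Linearly rescale intensities so they span the range [low, high]."""
    flat = [pixel for row in image for pixel in row]
    darkest, brightest = min(flat), max(flat)
    if brightest == darkest:
        return [row[:] for row in image]  # flat image: nothing to stretch
    scale = (high - low) / (brightest - darkest)
    return [[low + (pixel - darkest) * scale for pixel in row] for row in image]


# Hypothetical 3x3 patch with one impulse-noise pixel in the centre.
noisy = [[10, 10, 10], [10, 255, 10], [10, 10, 10]]
denoised = median_filter_3x3(noisy)
stretched = stretch_contrast([[50, 100], [150, 200]])
```

The median filter suppresses isolated outlier pixels while keeping edges sharper than an averaging filter would, which is one reason it is a common choice before segmentation.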