Viral and Bacterial Pneumonia Detection using Artificial Intelligence in the Era of COVID-19

Background: The outbreak of COVID-19 on the eve of January 2020 has led to a global crisis. The disease was declared a pandemic by the World Health Organization (WHO) in mid-March. Currently the outbreak has affected more than 150 countries, with more than 20 million confirmed cases and more than 700,000 deaths. The standard method for detection of COVID-19 is the Reverse-Transcription Polymerase Chain Reaction (RT-PCR), which has limited sensitivity, is expensive and requires specialized health experts. As the number of cases continues to grow, there is a pressing need for a rapid screening method that is accurate, fast and cheap. Methods: We propose a Deep Learning approach based on a pretrained AlexNet model for classification of COVID-19, non-COVID-19 viral pneumonia, bacterial pneumonia and normal Chest X-ray (CXR) images obtained from different public databases. Result and Conclusion: For non-COVID-19 viral pneumonia and healthy datasets, the model achieved 94.43% accuracy, 98.19% sensitivity and 95.78% specificity. For bacterial pneumonia and healthy datasets, the model achieved 91.43% accuracy, 91.94% sensitivity and 100% specificity. For COVID-19 pneumonia and healthy CXR images, the model achieved 99.16% accuracy, 97.44% sensitivity and 100% specificity. For classification of COVID-19 pneumonia and non-COVID-19 viral pneumonia, the model achieved 99.62% accuracy, 90.63% sensitivity and 99.89% specificity. For the multiclass datasets, the model achieved 94.00% accuracy, 91.30% sensitivity and 84.78% specificity for 3 classes (COVID-19, bacterial pneumonia and healthy). For 4 classes (COVID-19, non-COVID-19 viral pneumonia, bacterial pneumonia and healthy), the model achieved an accuracy of 93.42%, a sensitivity of 89.18% and a specificity of 98.92%.


Introduction
Pneumonia is a common disease caused by different microbial species such as bacteria, viruses and fungi, as shown in Fig 1. The word "pneumonia" comes from the Greek word "pneumon", which translates to lungs; thus, the word pneumonia is associated with lung disease. In medical terms, pneumonia is a disease that causes inflammation of the parenchyma of one or both lungs [1]. Pneumonia may or may not result from infection; non-infectious causes include food aspiration and exposure to chemicals. Infectious pneumonia occurs as a result of inflammation caused by pathogens, which leads the lung's alveoli to fill up with fluid or pus. This decreases the exchange of carbon dioxide and oxygen between the blood and the lungs, making it hard for infected persons to breathe. Symptoms of pneumonia include shortness of breath, fever, cough and chest pain. The people most at risk of pneumonia are the elderly (above 65 years), children (below the age of 5 years) and people with other complications such as HIV/AIDS, diabetes, chronic respiratory diseases, cardiovascular diseases, cancer and hepatic disease [2,3,4,5]. Table 1 presents the classification of pathogens that cause pneumonia.

Diagnosis and Treatment of Pneumonia
There are different approaches for the diagnosis of pneumonia. These include chest X-rays and CT scans (which form the basis of our contribution), sputum tests, pulse oximetry, thoracentesis, blood gas analysis, bronchoscopy, pleural fluid culture and complete blood count. Pneumonia infection is mostly treated according to the causative pathogen. For bacterial pneumonia, antibiotics are used; for viral pneumonia such as influenza, SARS and MERS, antiviral drugs are used; antifungal drugs are used for fungal pneumonia [5,6,7]. [11,12,13,14]. The pandemic caused by SARS-CoV-2 is alarming because there is no approved drug or vaccine [15].
In order to curb further spread of the virus, parliaments or governments of various countries and states imposed city lockdowns, flight cancellations, border restrictions, closure of workplaces and restaurants, postponement of sporting, religious, cultural and entertainment events and activities, wearing of face masks, social distancing of 1-2 m, and hygiene awareness campaigns. Many countries face challenges regarding the number of reported COVID-19 cases as a result of the lack of RT-PCR test kits and delays in testing. This delay is detrimental because it leads to more cases through interaction between infected patients awaiting results and the healthy population [16,17].

Deep Learning (DL) and Transfer Learning (TL)
Deep Learning (DL) is a subfield of machine learning (ML), itself a subset of Artificial Intelligence (AI), inspired by the make-up of the human brain. It works in a manner loosely similar to the biology of the human brain, taking data and processing it through layers of neural networks. Computer-aided diagnosis based on AI models is already used for many biomedical health issues, such as the detection of cancer (brain tumor and breast cancer). In particular, DL models can detect hidden features in images that are not apparent to, or cannot be detected by, medical experts. Within DL, the Convolutional Neural Network (CNN) is the leading tool and is widely used in different subfields of healthcare because of its ability to extract features and learn to distinguish between different classes (i.e. positive and negative, infected and healthy, cancer and non-cancer etc.). Transfer learning has provided an easier way to quickly retrain neural networks on a selected dataset with high accuracy [18,19,20]. [...] the model [21,22].

Challenges
As the number of COVID-19 patients grows exponentially, there is a pressing need for mass detection, which is critical for prevention and control. Medical practitioners all over the world require sophisticated systems to accurately diagnose COVID-19. Different approaches are currently in use for detection of different types of pneumonia. However, detection of different strains of pathogens using molecular testing is still not up to the standard of point-of-care diagnostics. Instead, specimens collected from sites of infection are transferred to equipped or specialized laboratories for diagnosis using the RT-PCR sequencing approach, which is the current gold standard [23]. This method is expensive and often leads to false results. Moreover, underdeveloped countries and remote areas with limited testing kits and few hospitals equipped with ventilators have become the epicenter of the disease. Thus, there is a pressing need for an alternative approach that is fast, cheap, simple and reliable. The use of X-rays has proven to be an alternative; however, this method is sometimes tedious for qualified radiologists [24]. These challenges can be addressed by a computer-aided detection method using a DL approach that is accurate, fast and precise.

Contribution
Our contributions are summarized as follows.
We propose the use of a pretrained (transfer learning) AlexNet model to detect COVID-19 pneumonia, non-COVID-19 viral pneumonia, bacterial pneumonia and normal/healthy patients using CXR images.
We trained the models separately to differentiate between binary class pairs (for example, COVID-19 pneumonia versus normal/healthy patients) and multiclass groupings.

Related Work
The last decade has seen an exponential rise in the application of DL in healthcare. Different studies have shown that DL models can be used for pathological cancer images, diabetic retinopathy, CT scans of pneumonia and tuberculosis, as well as microbial slide images. In the field of pathology, pathologists, computer scientists and radiologists have been working together to detect diseases such as cancer, pneumonia and tuberculosis using computer-aided diagnosis [25,26,27].
In terms of the application of DL models for detection of pneumonia using CT scans and X-ray images, we provide a literature review of related studies.
Chest scanning based on chest X-ray or Computed Tomography (CT) is an approach radiologists use to distinguish between patients suffering from pneumonia and healthy persons. The distinction is based on the presence of white hazy patches, known as "ground-glass opacity", in infected patients, which are absent in healthy persons. Because tests for diagnosing COVID-19 are scarce, and because the RT-PCR method is costly (120-130 USD), time consuming, laborious and of low sensitivity, scientists have turned to chest scans such as CT and X-ray as an alternative approach for diagnosis of severe pneumonia caused by SARS-CoV-2 and of bacterial pneumonia [28]. However, this approach has its own challenges, such as the shortage of experts (i.e. radiologists) who can interpret the results and the tediousness of interpreting thousands of CT scans and X-ray images. These challenges are addressed by AI-driven models, which have shown high efficiency in assisting medical experts in classification and prediction of disease [29,30].
Many studies have reported the use of CXR and CT scans along with Deep Learning models to achieve automated detection of COVID-19 pneumonia and other types of pneumonia, such as non-COVID-19 viral pneumonia and bacterial pneumonia. Moreover, many studies have shown the viability of using TL models, which are deep networks pretrained on the ImageNet database, for classification of pneumonia from healthy CT scans [31,32,33].
The approach of TL in DL was utilized by Chowdhury et al., 2020 [17].
The number of CXR images used for each class is presented in Table 3.
A comparison between some state-of-the-art approaches and our models is presented for COVID-19 versus non-infected (healthy) CXR images and for the multiclass setting, as shown in Table 6.

Performance Evaluation
The datasets are divided into two parts: 70% is used for training and 30% for testing. Performance of the models is evaluated based on testing accuracy, sensitivity and specificity. First, we carried out a pilot study using 371 CXR images each for COVID-19, non-COVID-19 viral pneumonia, bacterial pneumonia and healthy images. We obtained low accuracy, sensitivity and specificity because of the small size of the dataset. We carried out this study to analyze the linearity of the dataset, using the same amount of training and testing data for each class, because only 371 COVID-19 CXR images were available.
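The 70/30 division described above can be sketched as follows. This is a minimal illustration, not the authors' procedure: the text does not say whether the split was stratified by class, so the per-class split, the function name `stratified_split`, the class names and the fixed random seed are all assumptions.

```python
import random
from collections import defaultdict


def stratified_split(labels, train_fraction=0.7, seed=0):
    """Return (train_indices, test_indices), splitting each class separately.

    Splitting per class (an assumption, not stated in the paper) keeps the
    class proportions the same in the training and testing subsets.
    """
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)

    rng = random.Random(seed)  # fixed seed so the split is reproducible
    train, test = [], []
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        cut = round(train_fraction * len(indices))
        train.extend(indices[:cut])
        test.extend(indices[cut:])
    return train, test


# Pilot-study setting from the text: 371 images for each of the four classes.
# The class names are placeholders.
classes = ["covid19", "viral", "bacterial", "healthy"]
labels = [name for name in classes for _ in range(371)]
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))
```

With 371 images per class, each class contributes 260 training and 111 testing images under this rounding rule.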
Before carrying out multiclass classification, we trained each type of pneumonia against healthy (non-pneumonia or non-infected) CXR images. For non-COVID-19 viral pneumonia and healthy datasets, we achieved 94.43% testing accuracy, 98.19% sensitivity and 95.78% specificity. For bacterial pneumonia and healthy datasets, we achieved 91.43% testing accuracy, 91.94% sensitivity and 100% specificity. This shows that the model has learned to classify negative images (non-infected/healthy) more accurately than positive CXR images (bacterial pneumonia). The majority of recent studies have focused on COVID-19 pneumonia and non-infected CXR datasets. On this task our model achieved high performance, with 99.16% testing accuracy, 97.44% sensitivity and 100% specificity.
CXR images of a variety of viral pneumonias are similar, making it hard for radiologists to distinguish COVID-19 from other viral pneumonia. This limitation can lead to misdiagnosis, including non-COVID-19 viral pneumonia being misdiagnosed as COVID-19 pneumonia [17]. To address this limitation, we trained our model to distinguish between COVID-19 pneumonia and non-COVID-19 viral pneumonia. The model achieved 99.62% testing accuracy, 90.63% sensitivity and 99.89% specificity.
For the multiclass dataset, before training on all classes, we examined the performance of the model on 3 classes (COVID-19, bacterial pneumonia and healthy) to see how the model would perform before integrating non-COVID-19 viral pneumonia. The model achieved lower accuracy than most of the models trained to distinguish between 2 classes, with 94.00% testing accuracy, 91.30% sensitivity and 84.78% specificity. Based on this result, we hypothesized that performance would be lower still for 4 classes (COVID-19, non-COVID-19 viral pneumonia, bacterial pneumonia and healthy). Consistent with this hypothesis, the model achieved lower testing accuracy (93.42%) and sensitivity (89.18%) than with 3 classes, although it achieved higher specificity (98.92%), as shown in Table 5 and Figure 4.
Bai et al., 2020 [43] and Narin et al., 2020 [32] have also reported a high degree of similarity between COVID-19 and other viral pneumonia from a physiological and clinical perspective.
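For the multiclass results, sensitivity and specificity have to be derived per class from a confusion matrix. The sketch below uses the common one-vs-rest convention; the paper does not state which convention or averaging was used, so this choice, the function name `one_vs_rest_metrics` and the confusion-matrix counts are all hypothetical and are not the paper's results.

```python
def one_vs_rest_metrics(confusion, class_names):
    """Overall accuracy plus per-class sensitivity and specificity.

    confusion[i][j] is the number of images whose true class is i and whose
    predicted class is j. Each class in turn is treated as "positive" and
    all remaining classes together as "negative" (one-vs-rest).
    """
    total = sum(sum(row) for row in confusion)
    per_class = {}
    for k, name in enumerate(class_names):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                 # class k images missed
        fp = sum(row[k] for row in confusion) - tp  # other images called k
        tn = total - tp - fn - fp
        per_class[name] = {
            "sensitivity": tp / (tp + fn),
            "specificity": tn / (tn + fp),
        }
    correct = sum(confusion[k][k] for k in range(len(class_names)))
    return correct / total, per_class


# Hypothetical 3-class test-set counts (rows: true class, columns: predicted).
names = ["covid19", "bacterial", "healthy"]
matrix = [
    [45, 3, 2],
    [4, 40, 6],
    [1, 2, 47],
]
accuracy, per_class = one_vs_rest_metrics(matrix, names)
print(round(accuracy, 4), per_class["covid19"])
```

A single multiclass sensitivity or specificity figure is then usually an average of the per-class values, which is one way a specificity can sit well below or above the overall accuracy.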
With regard to the classification of COVID-19 and normal CXR images, our model performs considerably better than studies that utilized small datasets, such as Mahmud et al., 2020 [34], and than models developed from scratch. The strong performance of the model is attributed to the use of TL based on pretrained models, which have been shown to perform efficiently with less data than models designed from scratch, such as Tan et al., 2018 [19]. For classification between non-COVID-19 viral pneumonia and healthy CXR images, several studies utilized the same dataset made available by Kermany et al., 2018 [42]. The majority of these studies achieved accuracy above 90%, such as Stephen et al., 2019 [36], Saravia et al., 2019 [38] and Rajaraman et al., 2018 [40]. Our model achieved a result within the same range, with 94.43% accuracy. The high performance achieved for classification of COVID-19 pneumonia versus non-COVID-19 viral pneumonia, and of COVID-19 pneumonia versus healthy CXR images, shows that a computer-aided detection approach can be used as an alternative or confirmatory approach alongside the RT-PCR method, which has been shown to be less sensitive, time consuming and laborious. One limitation of this research is that we used a small dataset of COVID-19 pneumonia. This makes it difficult to generalize our result. In the future, we hope to acquire more data and to train the images [...]

The models were trained separately to differentiate:
Between COVID-19 pneumonia and normal/healthy patients
Between non-COVID-19 viral pneumonia and normal/healthy patients
Between bacterial pneumonia and normal/healthy patients
Between COVID-19 pneumonia and non-COVID-19 viral pneumonia
Between COVID-19 pneumonia, bacterial pneumonia and normal/healthy patients
Between COVID-19 pneumonia, non-COVID-19 viral pneumonia, bacterial pneumonia and normal/healthy patients
We assessed the performance of the network based on accuracy, sensitivity and specificity.

Parameters
To assess how the trained models performed, three parameters are employed: accuracy, sensitivity and specificity. Accuracy is the ratio of correctly classified images to the total number of images; in the binary case it equals the average of sensitivity and specificity weighted by the proportions of positive and negative samples. The formulas used for evaluating the loss and accuracy of a model are shown in equations 1 and 2. Sensitivity (true positive rate) is the proportion of positive image samples that are correctly identified as positive. Specificity (true negative rate) is the proportion of negative samples that are correctly identified as negative; it equals one minus the false positive rate (FPR), which is the proportion of negative samples that are incorrectly identified as positive. The formulas for sensitivity and specificity are shown in equations 3 and 4 respectively, where TPs = true positives, FNs = false negatives, TNs = true negatives and FPs = false positives.

Results
In this section, the performance of the models is presented for each type of pneumonia (COVID-19, bacterial and non-COVID-19 viral pneumonia) against healthy CXR images, for COVID-19 against non-COVID-19 viral pneumonia, and for the multiclass settings, (1) COVID-19, bacterial pneumonia and healthy and (2) COVID-19, non-COVID-19 viral pneumonia, bacterial pneumonia and healthy, as shown in Table 5 and Fig 4.
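The three evaluation parameters named in the Parameters section can be computed directly from confusion-matrix counts. The sketch below uses the standard definitions: accuracy = (TP + TN) / (TP + FN + TN + FP), sensitivity = TP / (TP + FN) and specificity = TN / (TN + FP). The function name `binary_metrics` and the counts in the example are hypothetical and are not taken from the paper's experiments.

```python
def binary_metrics(tp, fn, tn, fp):
    """Accuracy, sensitivity and specificity from binary confusion counts."""
    total = tp + fn + tn + fp
    return {
        "accuracy": (tp + tn) / total,   # correctly classified / all images
        "sensitivity": tp / (tp + fn),   # true positive rate
        "specificity": tn / (tn + fp),   # true negative rate = 1 - FPR
    }


# Hypothetical test set: 100 infected and 100 healthy images.
metrics = binary_metrics(tp=90, fn=10, tn=95, fp=5)

# Accuracy is the average of sensitivity and specificity weighted by the
# share of positive and negative samples, so it always lies between them.
positive_share = (90 + 10) / 200
weighted = (positive_share * metrics["sensitivity"]
            + (1 - positive_share) * metrics["specificity"])
print(metrics, weighted)
```

Because of the weighting shown in the last step, the accuracy of a two-class model always falls between its sensitivity and its specificity.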

Figure 1. Classification of Pneumonia
Figure 2

Table 1. Classification of pneumonia based on pathogens

Coronaviruses invade the lung's alveoli (the structures responsible for the exchange of O2 and CO2), thus causing pneumonia. The symptoms of COVID-19 include dry cough, fatigue, fever, septic shock, organ failure, anorexia, dyspnea, myalgias, sputum secretion, severe pneumonia and Acute Respiratory Distress Syndrome (ARDS).
COVID-19 was reported in Wuhan, Hubei province of mainland China, on 31st December, 2019. The virus spread from city to city and from one country to another, leading to a global health crisis. However, it was not until March 11, 2020 that WHO declared it a pandemic [8,9,10]. COVID-19 can be transmitted through respiratory droplets that are exhaled or secreted by infected persons.

Chowdhury et al., 2020 [17] applied TL to differentiate between COVID-19 and viral pneumonia based on a dataset acquired from a public database. The models were trained using 423 COVID-19, 1458 viral pneumonia and 1579 normal chest X-ray images in two settings: (I) with augmentation and (II) without augmentation. The models achieved high accuracies, sensitivities and specificities. A multi-dilation CNN was utilized by Mahmud et al., 2020 [34] to classify COVID-19 and other forms of pneumonia. [...] COVID-19 viral pneumonia, 94.7% for COVID-19 vs bacterial pneumonia and 90% for multi-class. In order to show the difference between COVID-19 and Community Acquired Pneumonia (CAP), Li et al., 2020 [35] utilized a 3-dimensional DL framework known as COVID-19 detection neural network (COVNet) using 4352 CT scans (1292 of COVID-19, 1735 of CAP and 1325 normal CT scans). The model achieved 90% sensitivity and 96% specificity for detection of COVID-19, and 87% sensitivity and 92% specificity for detection of CAP. Apostolopoulos et al., 2020 [31] utilized a TL approach on a dataset that contains 1427 X-ray images (504 normal X-ray images, 700 bacterial pneumonia and 224 COVID-19 X-ray images). The model achieved 96.78% accuracy, 96.46% specificity and 98.66% sensitivity. The summary of the application of AI for detection of pneumonia is presented in Table 2.

Table 2. Detection of different types of pneumonia using AI-driven tools.
*Ac is Accuracy, *BP is Bacterial Pneumonia, *Sv is Sensitivity, *Sf is Specificity, *VP is Viral Pneumonia

Methods
In this section, we detail the procedures of the proposed approach and its main assumptions. The work process of the proposed approach is shown schematically in Fig 2. TL on DL models has been shown to perform efficiently even with a small dataset, compared to Deep Learning models built from scratch, which require a large amount of data [41].
We removed 1 image due to low contrast, making the total number of images 371. We also obtained 1341 normal X-ray images and 1345 non-COVID-19 viral pneumonia images.
3. 1341 normal, non-COVID-19 viral pneumonia and 4274 bacterial pneumonia images from https://www.kaggle.com/sudalairajkumar/novel-corona-virus-2019-dataset
4. CXR images made available by Kermany et al., 2018 [42]. The dataset contains 3 folders (training, validation and testing) with a total of 5856 positive and negative cases. Each folder contains two subfolders, named Pneumonia and Normal. The X-ray images were collected retrospectively from pediatric patients between the ages of 1 and 5.
The pretrained AlexNet model is employed because of its high accuracy in feature extraction and image classification. The training is carried out for 20 epochs with a learning rate of 0.0001.
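A transfer-learning setup of this kind can be sketched as below. This is an illustration under stated assumptions, not the authors' implementation: the paper does not name its framework, optimizer or loss, so the use of PyTorch/torchvision, SGD with momentum 0.9, cross-entropy loss and the helper names `TrainConfig`, `build_alexnet` and `train` are all assumptions. Only the 20 epochs and the 0.0001 learning rate come from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TrainConfig:
    num_classes: int = 4          # COVID-19, viral, bacterial, healthy
    epochs: int = 20              # stated in the text
    learning_rate: float = 1e-4   # stated in the text
    momentum: float = 0.9         # assumption: not given in the paper


def build_alexnet(config, pretrained=True):
    """Load AlexNet and replace its final layer for the CXR classes."""
    import torch.nn as nn
    from torchvision import models

    # ImageNet weights are downloaded on first use when pretrained=True.
    weights = models.AlexNet_Weights.IMAGENET1K_V1 if pretrained else None
    model = models.alexnet(weights=weights)
    # The last classifier layer maps 4096 features to 1000 ImageNet classes;
    # swap it for a new layer sized to the number of CXR classes.
    model.classifier[6] = nn.Linear(4096, config.num_classes)
    return model


def train(model, loader, config):
    """Fine-tune on (images, targets) batches yielded by `loader`."""
    import torch

    criterion = torch.nn.CrossEntropyLoss()  # assumption
    optimizer = torch.optim.SGD(
        model.parameters(), lr=config.learning_rate, momentum=config.momentum
    )
    model.train()
    for _ in range(config.epochs):
        for images, targets in loader:
            optimizer.zero_grad()
            loss = criterion(model(images), targets)
            loss.backward()
            optimizer.step()
    return model


config = TrainConfig(num_classes=4)
```

For the two-class and three-class experiments, the same sketch would be used with `TrainConfig(num_classes=2)` or `TrainConfig(num_classes=3)`, so that only the size of the replaced output layer changes.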

Table 5
Comparison between our Result and State of the Art
As seen in Table 5, the performance of the pretrained AlexNet models is compared with other proposed models. In contrast to our work, the study carried out by Li et al., 2020 [35] grouped viral and bacterial pneumonia together as Community Acquired Pneumonia (CAP). Our study disputes this approach, because COVID-19, as a viral disease, resembles other viral pneumonia. The result we achieved when comparing COVID-19 with other viral pneumonia shows lower sensitivity and specificity (90.63% and 99.89% respectively) than COVID-19 versus healthy, which achieved 97.44% sensitivity and 100% specificity. Our claim is also supported by Chowdhury et al., 2020 [17], who stated that "Models performed extremely well when used for classifying COVID-19 and normal images compared to COVID-19 and other viral pneumonia."

Table 6. Comparison between our result and state of the art

Conclusion
This work presents the utilization of a deep neural network based on the TL approach (a pretrained AlexNet model) for automatic detection of COVID-19 pneumonia, non-COVID-19 viral pneumonia and bacterial pneumonia. The models were trained on 2 classes and on multiple classes. For 2 classes, the models were trained on each of COVID-19, non-COVID-19 viral pneumonia and bacterial pneumonia against healthy CXR images, and on COVID-19 against non-COVID-19 viral pneumonia. For multiclass, the models were trained on (1) 3 classes (COVID-19, bacterial pneumonia and healthy CXR images) and (2) 4 classes (COVID-19, non-COVID-19 viral pneumonia, bacterial pneumonia and healthy CXR images). The models were evaluated using accuracy, sensitivity and specificity. The models achieved 94.43% testing accuracy, 98.19% sensitivity and 95.78% specificity for non-COVID-19 viral pneumonia and healthy datasets. For bacterial pneumonia and healthy datasets, the model achieved 91.43% testing accuracy, 91.94% sensitivity and 100% specificity. For COVID-19 pneumonia and healthy CXR images, the model achieved 99.16% testing accuracy, 97.44% sensitivity and 100% specificity. For classification of COVID-19 pneumonia and non-COVID-19 viral pneumonia, the model achieved 99.62% testing accuracy, 90.63% sensitivity and 99.89% specificity. For the multiclass datasets, the model achieved 94.00% testing accuracy, 91.30% sensitivity and 84.78% specificity for 3 classes (COVID-19, bacterial pneumonia and healthy), and a testing accuracy of 93.42%, sensitivity of 89.18% and specificity of 98.92% for 4 classes (COVID-19, non-COVID-19 viral pneumonia, bacterial pneumonia and healthy).