DarkCVNet: Optimized Pneumonia and Covid 19 detection using CXR Images

Since the advent of the Covid 19 infectious disease, a variety of studies have been conducted around the world to accurately forecast its outcome. Covid 19 is linked to the pre-existing lung disease pneumonia, since many individuals died from severe chest congestion (a pneumonic condition). The objective of this study is to use computerized chest X-ray (CXR) images to automatically identify bacterial pneumonia, Covid 19 viral pneumonia, other viral pneumonia, lung opacity and normal cases. The paper begins with a detailed review of progress in the accurate identification of pneumonia, followed by the authors' technique. For transfer learning, thirteen different deep Convolutional Neural Networks (CNNs) were used: AlexNet, ResNet18, ResNet50, ResNet101, VGG16, VGG19, MobileNetV2, GoogLeNet, DenseNet201, NasNetMobile, NasNetLarge, DarkNet and Inception-ResNet V2. Our dataset consists of 6530 images divided into five categories: bacterial pneumonia, Covid 19 viral pneumonia, other viral pneumonia, lung opacity, and X-ray images of normal (healthy) subjects. The pre-processed images were then used to train the transfer-learning-based classifiers. In contrast to other deep learning classification tasks with large image databases, obtaining a large pneumonia dataset for this classification task is difficult. We therefore applied several data augmentation techniques, which raised the model's validation and classification accuracy to 99.50% and 98.20% respectively. This DarkCVNet (DarkCovidNetwork) approach could help address the trustworthiness and interpretability issues that frequently arise when dealing with medical images.


Introduction
Pneumonia is a dangerous lung infection caused by bacteria, viruses (particularly coronaviruses), or fungi [1]. Since December 2019, the novel Covid 19 virus (SARS-CoV-2) has spread across Wuhan, China, and many other countries. The Covid 19 virus, like other viruses and bacteria, has been reported to cause pneumonia in some people. In all of these circumstances, however, the treatment differs. Initial screening/testing determines whether or not a person has pneumonia. It is critical to determine whether a patient has bacterial pneumonia, Covid 19 viral pneumonia, or another kind of viral pneumonia in order to stop the virus from spreading [2]. By January 1st, 2022, about 305 million cases had been confirmed, with approximately 5.5 million deaths worldwide (https://www.worldometers.info/coronavirus). Covid 19 is mainly spread through personal contact from one individual to another. Healthy persons can become infected by coming into contact with those who have Covid 19 through their breath, hands, or mucus [3].
Examining every X-ray image and extracting key information takes a significant amount of time and requires medical professionals in the field. As a result, medical practitioners need computer assistance in the diagnosis of Covid 19 pneumonia patients using CXR images [4]. In the current situation, when millions of people must be tested every day to see whether they are infected with the deadly Covid 19 virus, an automatic, efficient and precise computer-assisted approach is necessary to distinguish the signs of infection. There have been reports of significant subjective variation in radiologists' decisions while diagnosing pneumonia. In low-resource countries there is also a shortage of certified radiologists, particularly in rural regions [5]. As a result, there is a pressing demand for computer-aided diagnosis systems that can assist radiologists in quickly recognising distinct kinds of pneumonia from chest X-ray images. Deep learning methods in computer-assisted approaches contribute greatly to state-of-the-art medical image analysis and exhibit successful outcomes [6]. Most deep learning techniques and CNN models have used chest X-ray image data to detect the disease [7]. Despite the availability of a variety of imaging methods, lung radiography is thought to have rather low sensitivity for the relevant examination findings [8]. Medical practitioners routinely employ X-ray imaging to analyse pneumonia, and X-ray imaging systems are an important aspect of medical care around the world [9]. Ground glass opacity (GGO) refers to the hazy grey regions that may be seen in lung CT scans or X-rays. These grey patches represent increased density within the lungs. GGO can be caused by a variety of factors, including inflammation, infections and growths. A 2020 review found that GGO was the most prevalent abnormality among persons with Covid 19 related pneumonia [10]. The proposed study assessed the performance of thirteen distinct pre-trained network designs (AlexNet, ResNet18, ResNet50, ResNet101, VGG16, VGG19, MobileNetV2, GoogLeNet, DenseNet201, NasNetMobile, NasNetLarge, DarkNet and Inception-ResNet V2) using a transfer learning approach.
The state-of-the-art approaches for Covid 19 diagnosis are discussed in Section 2. Section 3 covers the materials and methods of our proposed model. Section 4 presents results and discussion, along with suggestions for future improvements, and Section 5 concludes with our findings.

Related works
In contrast to deep learning systems, machine learning relies heavily on domain knowledge for the selection and extraction of appropriate features and offers limited performance. As a result, deep learning approaches have become increasingly popular in recent years. The following state-of-the-art algorithms are based on deep learning and are built using chest X-ray images or CT scans [12]. Amit Kumar Jaiswal et al. utilized pixel-wise segmentation with a Mask R-CNN model and achieved high accuracy [13]. Tulin Ozturk et al. introduced the DarkNet model with 87.02% accuracy for multi-class classification [14]. Table 1 summarizes some of these deep learning techniques and their accuracies.

Materials and Methods
The suggested approach is divided into four steps, as illustrated in Fig. 4: i) dataset collection; ii) image pre-processing; iii) image augmentation; iv) pre-trained CNNs to distinguish pneumonia caused by bacteria, by viruses other than Covid 19, or specifically by the Covid 19 virus, as well as lung opacity and healthy cases.
MATLAB (R2020a) was used to train, assess, and test the algorithms in this work. To train the models, an Intel i9 9820X processor with an NVIDIA GeForce RTX 2080 Ti GPU with 16 GB memory was employed.

Dataset
Kaggle is the open-source repository from which CXR images of all five classes were collected: lung opacity, normal/healthy, bacterial pneumonia, Covid 19 viral pneumonia and other viral pneumonia. Among the 6530 chest X-ray images there are 3840 images from various subjects affected by pneumonia (1290 images of bacterial pneumonia, 1345 images of Covid 19 pneumonia and 1205 images of other viral pneumonia), 1345 images of lung opacity and 1345 images from healthy persons; this collection is referred to as dataset1. After applying different augmentation techniques, the image count was increased to 16000, referred to as dataset2. We used 80 percent of the images for training and the remaining 20% for testing. Furthermore, we used 20% of the images from the 80 percent of training data for validation purposes. Table 2 provides the dataset details associated with this study.
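The split protocol above (80% training, 20% testing, then 20% of the training portion held out for validation) can be sketched as follows. This is an illustrative Python snippet; the study's own experiments were run in MATLAB.

```python
def split_counts(n_images, test_frac=0.2, val_frac=0.2):
    """Return (train, val, test) image counts for the described protocol."""
    n_test = int(n_images * test_frac)       # 20% held out for testing
    n_train_full = n_images - n_test         # remaining 80%
    n_val = int(n_train_full * val_frac)     # 20% of training for validation
    n_train = n_train_full - n_val
    return n_train, n_val, n_test

train, val, test = split_counts(6530)
print(train, val, test)  # 4180 1044 1306
```

Applied to the 6530 original images, this yields roughly 4180 training, 1044 validation and 1306 test images.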

Image preprocessing
Images are preprocessed in two phases.
Step 1: Image resizing is the key step in the preprocessing method. All of the initially acquired images were scaled to the target size of 299×299.
Step 2: The resized grayscale input images are filtered with an edge-preserving Gaussian bilateral filter. When the degree of smoothing is small, this filter smooths neighborhoods with low variance (homogeneous areas) but not neighborhoods with high variance (strong edges). As the degree of smoothing increases, the filter smooths both uniform areas and neighborhoods with greater variance. The standard deviation of the spatial Gaussian smoothing kernel, the spatial sigma, is specified here. Larger spatial sigma values increase the contribution of more distant neighboring pixels, effectively enlarging the neighborhood, as represented in Figure 1.
The bilateral filter output is given in equations (1) and (2):

I_filtered(x) = (1 / W_x) Σ_{x_i ∈ Ω} I(x_i) f_r(|I(x_i) − I(x)|) g_s(‖x_i − x‖)   (1)

W_x = Σ_{x_i ∈ Ω} f_r(|I(x_i) − I(x)|) g_s(‖x_i − x‖)   (2)

Here I_filtered is the filtered output image, I is the original input image to be filtered, x denotes the coordinates of the pixel currently being filtered, Ω is the window centered on x, and x_i ∈ Ω is a neighboring pixel. The range kernel f_r (a Gaussian function) smooths differences in intensity, the domain (spatial) kernel g_s (a Gaussian function) smooths differences in coordinates, and W_x is the normalization term.
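A minimal NumPy sketch of this filter follows, illustrative only: the study's pipeline was implemented in MATLAB, and the radius and sigma values below are arbitrary examples, not the paper's settings.

```python
import numpy as np

def bilateral_filter(img, radius=2, sigma_spatial=1.5, sigma_range=0.1):
    """Edge-preserving bilateral filter: each output pixel is a weighted
    average of its neighbours, with weights that fall off with both
    spatial distance (sigma_spatial) and intensity difference (sigma_range)."""
    img = img.astype(np.float64)
    pad = np.pad(img, radius, mode='edge')
    h, w = img.shape
    out = np.zeros_like(img)
    # Precompute the spatial (domain) Gaussian kernel g_s.
    ax = np.arange(-radius, radius + 1)
    xx, yy = np.meshgrid(ax, ax)
    spatial = np.exp(-(xx**2 + yy**2) / (2 * sigma_spatial**2))
    for i in range(h):
        for j in range(w):
            patch = pad[i:i + 2 * radius + 1, j:j + 2 * radius + 1]
            # Range kernel f_r penalises intensity differences,
            # which preserves strong edges.
            rng = np.exp(-(patch - img[i, j])**2 / (2 * sigma_range**2))
            wgt = spatial * rng
            out[i, j] = (wgt * patch).sum() / wgt.sum()  # W_x normalisation
    return out
```

On a homogeneous region the filter behaves like ordinary Gaussian smoothing, while across a strong edge the range kernel drives the weights of dissimilar pixels toward zero, so the edge survives.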

Image Augmentation
As previously stated, CNNs perform better when dealing with large datasets; the working database, however, is not that large. A common practice in deep learning is to use data augmentation techniques to turn a small dataset into a larger one. If the number of collected images varies across classes, this difference in image count creates a serious class-imbalance problem. Imbalanced data leads to a variety of issues, such as overfitting, where the model cannot generalize well to unseen datasets; in that situation, accuracy is also no longer a suitable key metric. Overfitting causes the network to learn the specifics of the training samples to the point where it cannot generalize properly. Consequently, regularization procedures such as data augmentation are used in this work to combat overfitting.
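A small sketch of label-preserving augmentation in NumPy is shown below. The specific transforms (flip, 90° rotation, small translation) are generic examples, not necessarily the exact set used to grow dataset1 into dataset2.

```python
import numpy as np

def augment(img, rng):
    """Generate simple label-preserving variants of an X-ray image:
    horizontal flip, rotation (a 90-degree turn here for brevity),
    and a small pixel shift. Real pipelines typically add small-angle
    rotations, scaling and brightness jitter as well."""
    out = [img]
    out.append(np.fliplr(img))                       # horizontal flip
    out.append(np.rot90(img, k=1))                   # rotation
    dx, dy = rng.integers(-2, 3, size=2)
    out.append(np.roll(img, (dy, dx), axis=(0, 1)))  # translation
    return out

rng = np.random.default_rng(0)
base = rng.random((8, 8))
variants = augment(base, rng)
print(len(variants))  # 4 images from 1 original
```

Applying a few such transforms per image multiplies the effective dataset size and also lets under-represented classes be oversampled, which mitigates the class-imbalance problem described above.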

Proposed model
Transfer learning is a methodology in which the knowledge a CNN has acquired from one dataset is transferred to solve a related but different problem, which would otherwise require training a CNN from scratch with new data from a limited sample. A CNN uses building blocks such as convolution layers, pooling layers, and fully connected layers to learn spatial hierarchies of features automatically and adaptively via backpropagation. For transfer learning, thirteen different deep Convolutional Neural Networks (AlexNet, ResNet18, ResNet50, ResNet101, VGG16, VGG19, MobileNetV2, GoogLeNet, DenseNet201, NasNetMobile, NasNetLarge, DarkNet and Inception-ResNet V2) were used to perform multi-class classification in this study. Figure 3 shows the workflow of our model.

Fig 3. Work Flow of our model
As a base model, we used 13 different CNNs pre-trained on ImageNet with pre-trained weights. The suggested model (fine-tuned DarkCVNet) has already learned the basic features of computer vision (e.g., edges and boundaries), so it does not have to learn from scratch each time it is trained on CXR datasets. We investigated the most appropriate hyperparameters, with a maximum of 10 epochs and 500 iterations, and tried various initial learning rates (0.0001, 0.0003, 0.001) for different optimizers (SGD, ADAM, RMSProp); the maximum accuracy for the suggested model was achieved after several rounds of experimentation. Figure 4 shows our proposed model architecture.
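The core transfer-learning idea, freezing a pre-trained feature extractor and training only a new classification head, can be sketched in plain NumPy. Everything here is a stand-in: the "base" is a fixed random projection playing the role of pre-trained ImageNet features, and the 5-class labels are synthetic; the paper's actual experiments fine-tuned full CNN backbones in MATLAB.

```python
import numpy as np

rng = np.random.default_rng(0)
W_base = rng.normal(size=(64, 16)) * 0.1      # frozen "pre-trained" weights

def features(x):
    """Frozen feature extractor (never updated during training)."""
    return np.maximum(x @ W_base, 0.0)        # ReLU features

def train_head(X, y, n_classes=5, lr=0.3, epochs=800):
    """Train only the new softmax head with plain gradient descent."""
    F = features(X)
    W = np.zeros((F.shape[1], n_classes))
    Y = np.eye(n_classes)[y]                  # one-hot labels
    for _ in range(epochs):
        logits = F @ W
        p = np.exp(logits - logits.max(axis=1, keepdims=True))
        p /= p.sum(axis=1, keepdims=True)     # softmax probabilities
        W -= lr * F.T @ (p - Y) / len(X)      # update head weights only
    return W

# Synthetic, learnable 5-class problem for demonstration.
X = rng.normal(size=(200, 64))
W_true = rng.normal(size=(16, 5))
y = np.argmax(features(X) @ W_true, axis=1)
W_head = train_head(X, y)
train_acc = (np.argmax(features(X) @ W_head, axis=1) == y).mean()
```

Because only the head's weights receive gradient updates, the limited CXR data cannot destroy the generic low-level features the base has already learned, which is why transfer learning works well on small medical datasets.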

AlexNet
AlexNet is a CNN model that has had a major impact on the application of deep learning to machine vision. It won the 2012 ImageNet LSVRC-2012 challenge by a large margin (a 15.3 percent top-5 error rate versus 26.2 percent for the runner-up). The network's design was very similar to Yann LeCun et al.'s LeNet, but it was deeper, with more filters per layer and stacked convolutional layers. It included convolutions, max pooling, dropout, data augmentation, ReLU activations, and SGD with momentum, applying a ReLU activation after each convolutional and fully connected layer. Moreover, instead of conventional regularisation, dropout is used to address overfitting.

ResNet 18, 50, 101
ResNet is a well-known deep learning architecture first published by Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. The ResNet model was the winner of ILSVRC 2015, with a 3.57 percent error rate, and its input image size is 224 by 224 pixels. The ResNet18 model consists of 18 layers, while ResNet50 is a deep 50-layer network built from residual blocks, each with two or three convolutional layers. ResNet-101 comprises 101 convolutional neural network layers [25]. The architecture is based on the idea of skip connections and makes extensive use of batch normalization to successfully train many layers without losing speed.

VGG 16, 19
VGG is a type of CNN that has been around for quite some time. It emerged from research into how to make such networks deeper. The network employs small 3 x 3 filters. Aside from that, the network is known for its simplicity, with only pooling layers and fully connected layers as additional components. The VGG16 model consists of 16 layers and VGG19 is 19 layers deep.

MobileNetV2
MobileNetV1 introduced the depthwise separable convolution, which drastically reduces the network's computational cost and model size, making it ideal for mobile platforms or devices with limited processing capacity. MobileNetV2 introduces an improved module with an inverted residual structure, in which non-linearities in the narrow layers are removed. With MobileNetV2 as the backbone for feature extraction, state-of-the-art object detection and semantic segmentation results have also been achieved.
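The saving from a depthwise separable convolution is simple arithmetic: a standard k×k convolution with C_in input and C_out output channels is split into a depthwise k×k stage plus a 1×1 pointwise stage. The channel counts below are generic examples, not taken from any particular MobileNet layer.

```python
def conv_params(k, c_in, c_out):
    """Weights in a standard k x k convolution."""
    return k * k * c_in * c_out

def separable_params(k, c_in, c_out):
    """Weights in a depthwise k x k stage plus a 1 x 1 pointwise stage."""
    return k * k * c_in + c_in * c_out

std = conv_params(3, 128, 256)       # 294912
sep = separable_params(3, 128, 256)  # 1152 + 32768 = 33920
print(std / sep)                     # roughly 8.7x fewer parameters
```

For 3×3 kernels the reduction factor approaches 9 as the number of output channels grows, which is the source of MobileNet's small model size.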

Google/Inception
The ILSVRC 2014 challenge was won by the Inception network (GoogLeNet), which had a top-5 error rate of 6.67%, close to human-level performance. The model was designed by Google and features an upgraded version of the original LeNet design, built around the inception-module idea. GoogLeNet is a 22-layer deep CNN and a version of the Inception network. Nine inception modules are stacked in the GoogLeNet design, with max pooling layers between some of them (to halve the spatial resolution). There are 22 layers in total (27 including the pooling layers). After the last inception module, it applies global average pooling.

DenseNet201
Huang et al. [26] presented the DenseNet201 architecture, one of the most recent variants based on the dense network. Each layer in DenseNet receives additional inputs from all preceding layers and passes its own feature maps to all subsequent layers; the connections are made by concatenation, so each layer receives the collective knowledge of the layers before it. A pooling layer and bottleneck blocks are also used in the DenseNet201 model. Here, the error signal can be propagated more directly to earlier layers. Because earlier layers receive near-direct supervision from the final classification layer, this acts as a form of implicit deep supervision, which is the advantage of this model. DenseNet201, used in this work, consists of 201 layers.

NasNet
Google introduced NASNet, which framed the task of finding the ideal CNN architecture as a reinforcement learning problem, leveraging its vast computing capacity and engineering talent. The basic idea was to search for the optimal combination of filter sizes, strides, output channels, number of layers, and other characteristics within a specified search space. The accuracy of the searched architecture on the given dataset was the reward for each search action in this reinforcement learning setting.
In the ImageNet competition, NASNet achieved a state-of-the-art result. The computing power required for NASNet, on the other hand, was so large that only a few companies were able to apply the same technique. NasNetMobile and NasNetLarge are the two variants used here.

DarkNet 19, 53
In the same way as Highway Networks and ResNet were introduced to resolve the vanishing gradient problem, DenseNet was proposed to solve it (Huang et al. 2017, He et al. 2015a, Srivastava et al.). Because ResNet explicitly preserves information through additive identity mappings, several layers may contribute very little or no new knowledge. To alleviate this difficulty, DenseNet uses cross-layer connections in an improved form, connecting every previous layer to the next layer in a feed-forward fashion, so that all subsequent layers use the feature maps of previous layers as inputs, as represented by equations (3) and (4). DarkNet19 is 19 layers deep and DarkNet53 consists of 53 layers.

Inception-ResNet-V2
Inception-ResNet-V2 is one of the upgraded versions of the Inception family of models. Inception V4 with residual connections (Inception-ResNet) has a similar generalization ability to plain Inception V4 but greater depth and width, according to Szegedy et al. They found that Inception-ResNet converges faster than Inception V4, implying that training Inception networks with residual connections significantly speeds up the process.
A parameter is a variable that is automatically optimized during the training phase, whereas a hyperparameter is one that must be set ahead of time. Figure 5 shows the hyperparameters involved in enhancing the performance of our model.

Figure 5. Hyper parameters of CNN Model
The number of epochs refers to how many times the full training set is passed through the neural network. The number of epochs should be increased until the gap between the test and training errors becomes negligible; here, 10 is used. Mini-batch training is usually preferred in the learning phase of a convnet; a range of 16 to 128 is a decent starting point, and it is worth noting that batch size affects the results. Here, 50 is used. The activation function introduces non-linearity into the model. In most cases, a rectifier (ReLU) works well in a convnet, though sigmoid, tanh, and other activation functions are options depending on the task. The learning rate determines how strongly the weights are updated in each step of the optimization method. Depending on the optimizer employed (SGD, Adam, RMSProp), one can use the three constant learning rates of 0.0001, 0.0003 and 0.001, a progressively decreasing learning rate, momentum-based approaches, or adaptive learning rates.
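The relationship between epochs, mini-batch size and iterations, useful when reading the training plots later in the paper, is: one iteration is one mini-batch update, and one epoch is one full pass over the training set. The training-set size below is a hypothetical round number for illustration only.

```python
import math

def iterations_per_epoch(n_train, batch_size):
    """One iteration = one mini-batch gradient update."""
    return math.ceil(n_train / batch_size)

# Hypothetical illustration: with the batch size of 50 noted above and
# 5000 training images, 10 epochs correspond to 1000 iterations.
it = iterations_per_epoch(5000, 50) * 10
print(it)  # 1000
```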

Optimizer
One of the most crucial steps is selecting an algorithm for optimizing a neural network. The most commonly used optimization algorithms in deep learning, Stochastic Gradient Descent with Momentum (SGDM), Adaptive Moment Estimation (ADAM) and Root Mean Squared Propagation (RMSProp), are investigated in this paper. When these algorithms run on minimal resources, DL approaches become more practical, reducing recurring costs and giving efficient results in less time. An optimizer is a strategy or algorithm for updating the model parameters so that the loss can be reduced with far less effort. Figure 6 shows the different kinds of optimizers available for neural network design. SGDM accumulates a velocity term from past gradients and applies it to the parameters, as given in equations (5) and (6):

v_t = γ v_{t−1} − η ∇_θ J(θ_t)   (5)
θ_{t+1} = θ_t + v_t   (6)

where γ is the momentum coefficient, η is the learning rate, and ∇_θ J(θ_t) is the gradient of the loss.

ADAM
Adam adapts the learning rate for each weight of the neural network by estimating the first and second moments of the gradient. It combines the advantages of AdaGrad, which works well with sparse gradients, and RMSProp, which works well in online and non-stationary settings. Adam has significant advantages: the learning rate needs little fine-tuning, and it is an easy-to-use method that is invariant to diagonal rescaling of the gradients. It has a low memory footprint and is computationally efficient. Furthermore, Adam is well suited to non-stationary objectives as well as problems with very noisy and sparse gradients. The Adam method requires a first-moment variable m and a second-moment variable u. At time step t, after computing the gradient g_t, the biased first and second moment estimates are updated as given in equations (7) and (8):

m_t = β1 m_{t−1} + (1 − β1) g_t   (7)
u_t = β2 u_{t−1} + (1 − β2) g_t²   (8)
The first and second moments are then bias-corrected, and the parameter update is generated and applied using the revised moment estimates, as given in equations (9), (10), (11) and (12):

m̂_t = m_t / (1 − β1^t)   (9)
û_t = u_t / (1 − β2^t)   (10)
Δθ_t = −η m̂_t / (√û_t + ε)   (11)
θ_{t+1} = θ_t + Δθ_t   (12)

RMSProp
RMSProp is another algorithm that modifies AdaGrad. By adopting an exponentially decaying average of the squared gradients rather than their cumulative sum, it seeks to reduce the aggressive decay of the learning rate: the exponential average shrinks the step in directions with many large recent updates while allowing larger steps in directions with few. After computing the gradient g_t, the running average of the squared gradient is updated as given in equation (13):

E[g²]_t = ρ E[g²]_{t−1} + (1 − ρ) g_t²   (13)
where ρ is the decay rate. Then the parameter update is calculated and applied as given in equations (14) and (15):

Δθ_t = −η g_t / (√(E[g²]_t) + ε)   (14)
θ_{t+1} = θ_t + Δθ_t   (15)
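The three update rules discussed above can be sketched in a few lines of NumPy, applied to minimising the toy objective f(x) = x², whose gradient is 2x. The hyperparameter values are common defaults, not the ones tuned in the paper.

```python
import numpy as np

def sgdm(x, v, g, lr=0.1, gamma=0.9):
    v = gamma * v - lr * g                       # velocity update
    return x + v, v                              # parameter update

def rmsprop(x, eg2, g, lr=0.01, rho=0.9, eps=1e-8):
    eg2 = rho * eg2 + (1 - rho) * g**2           # running squared gradient
    return x - lr * g / (np.sqrt(eg2) + eps), eg2

def adam(x, m, u, g, t, lr=0.01, b1=0.9, b2=0.999, eps=1e-8):
    m = b1 * m + (1 - b1) * g                    # first moment
    u = b2 * u + (1 - b2) * g**2                 # second moment
    m_hat = m / (1 - b1**t)                      # bias corrections
    u_hat = u / (1 - b2**t)
    return x - lr * m_hat / (np.sqrt(u_hat) + eps), m, u

# Minimise f(x) = x^2 (gradient 2x) from the same starting point.
x1 = x2 = x3 = 5.0
v = eg2 = m = u = 0.0
for t in range(1, 1001):
    x1, v = sgdm(x1, v, 2 * x1)
    x2, eg2 = rmsprop(x2, eg2, 2 * x2)
    x3, m, u = adam(x3, m, u, 2 * x3, t)
# all three iterates should now be close to the minimum at x = 0
```

On this simple convex objective all three methods reach the minimum; their practical differences show up on noisy, high-dimensional losses, which is why the paper compares them empirically.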

Results and Discussions
Covid 19 cases had been confirmed in 305 million people around the world, with over 5 million deaths, as of 1st January 2022, and Covid 19 has been declared a global pandemic by the World Health Organization. We were able to identify Covid 19 from CXR data using the described method. Because pneumonia is one of Covid 19's symptoms, we compared Covid 19 viral pneumonia, bacterial pneumonia, other viral pneumonia, lung opacity and normal lung images using thirteen different CNN models. The method starts with downsizing the chest X-ray images to a fraction of their original size. The convolutional neural network architecture then extracts features from each image and classifies it, so identification and classification are performed in the next stage. Our proposed DarkCVNet architecture is based on the DarkNet19 baseline model. In this study, MATLAB (R2020a) was used to train, evaluate, and test the algorithms, and an Intel i9 9820X processor with an NVIDIA GeForce RTX 2080 Ti GPU with 16 GB memory was employed to train the models. The experimental study is based on performance metrics derived from the confusion matrix, such as TP (True Positive), FP (False Positive), TN (True Negative), FN (False Negative), accuracy, sensitivity, specificity and F-score, which are shown in Table 3.
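The metrics above follow directly from the confusion-matrix counts. The counts in this snippet are made up purely to demonstrate the formulas, not results from the study.

```python
def metrics(tp, fp, tn, fn):
    """Standard metrics derived from confusion-matrix counts."""
    accuracy    = (tp + tn) / (tp + fp + tn + fn)
    sensitivity = tp / (tp + fn)            # recall / true positive rate
    specificity = tn / (tn + fp)            # true negative rate
    precision   = tp / (tp + fp)
    f_score     = 2 * precision * sensitivity / (precision + sensitivity)
    return accuracy, sensitivity, specificity, precision, f_score

acc, sen, spe, pre, f1 = metrics(tp=90, fp=5, tn=95, fn=10)
print(round(acc, 3), round(sen, 3), round(spe, 3))  # 0.925 0.9 0.95
```

In a multi-class setting, each class is scored one-versus-rest and the per-class values are averaged, which is how the averages reported for the five-class task are obtained.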
Table 4 reports the performance of the various CNN techniques for the three optimizers (SGD, ADAM and RMSProp) with learning rates of 0.0001, 0.0003 and 0.001. Table 5 shows that the suggested model detected pneumonia and Covid 19 with an average accuracy of 98.20 percent, with average precision, recall, specificity, and F-score values of 98 percent, 98.1 percent, 99.49 percent, and 97.99 percent, respectively. Performance is analysed in terms of precision, recall, specificity, F-score and accuracy for the five classes (bacterial pneumonia, Covid pneumonia, healthy, lung opacity and other viral pneumonia), as described in Figure 7.

Figure 7 Performance Metrics Comparison of 5 class classification task
The training plots reveal useful information about the model's behaviour, such as its speed of convergence over epochs (the slope), whether the model has already converged (the plateau of the line), and whether the model is over-learning the training data (an inflection in the validation line). Our proposed model was trained for 100 epochs with 5000 iterations and produced 99.50% validation accuracy, as shown in Figure 8. Each iteration involves a gradient estimation and a network parameter update. Apart from tuning hyperparameters, class imbalance is a major issue in deep learning models, because an imbalanced model will not behave well on unseen data. Our fine-tuned DarkCVNet model provides the highest accuracy of 98.20% with a learning rate of 0.001 and the ADAM optimizer. Some sample images obtained from the DarkCVNet model are shown in Figure 9. Since X-ray signs of infection are not always clear, they are frequently misidentified as other diseases or benign anomalies. Among the most significant drawbacks of lung radiography analysis is the inability to detect Covid 19 in its early phases due to a lack of sensitivity in GGO monitoring. Furthermore, specialists may misclassify bacterial or viral pneumonia images, resulting in patients receiving incorrect medication and, as a result, a deteriorating condition. Another limitation is overfitting, in which the model has not learned the true decision boundaries and, instead of generalizing to unseen data, has merely memorized the training data, which should be avoided.

Figure 9 Sample output images of proposed model
In the future, we want to add more images to validate our model. The model can be stored in the cloud to give fast diagnoses and aid in the rehabilitation of affected patients, which should result in a significant reduction in clinician workload. We will also use CT scans to detect Covid 19 and compare the results to the suggested model trained on X-ray images.

Conclusion
We described a deep learning approach for pneumonia detection from chest X-rays by fine-tuning thirteen pre-trained convolutional models (including VGG16, VGG19, AlexNet, DenseNet201, ResNet50, ResNet101, MobileNetV2, and GoogLeNet) on our training set, models that have shown promise in a number of tasks in recent years, and evaluated their effectiveness for pneumonia diagnosis. Before using deep learning techniques to train on the data, we preprocessed each image. The application of pre-processing techniques to the images is one of the novel aspects of the suggested approach, since more effective characteristics are derived from the image data when pre-processing is used. Here a Gaussian bilateral filter is used, which smooths both uniform areas and neighborhoods with greater variance as the degree of smoothing increases. In our study, the thirteen fine-tuned pre-trained models were compared on the augmented and non-augmented datasets. Our fine-tuned DarkCVNet model achieved an accuracy of 99.16% with a learning rate of 0.001 and the ADAM optimizer. We hope that this CAD (computer-assisted diagnostic) tool can considerably assist radiologists in taking clinically more useful images and quickly identifying pneumonia and its type.

Figure 1 .
Figure 1. Sample X-ray images of the 5 different classes after edge-preserving Gaussian bilateral filtering: a) healthy lung b) lung opacity c) Covid pneumonia d) bacterial pneumonia e) other viral pneumonia

Figure 4
Figure 4 Proposed DarkCVNet architecture for 5 class classification

Figure 6
Figure 6 Different kinds of Optimizers in Neural Network Design

Figure 6
Figure 6 Confusion matrix of the 5-class classification task of our proposed DarkCVNet model. Table 5 also includes precision, recall, specificity, F-score, and accuracy statistics for the five-class classification test.

Table 2 Total number of images in each category. Columns: S. No; Image Category; Original No. of Images (Dataset 1); Training Set after Augmentation (Dataset 2); Test Dataset; Validation Dataset.

Table 4 Performance of Various CNN Techniques for different optimizers with different learning rate
We compared the thirteen fine-tuned pre-trained CNN models with three optimizers (SGD, ADAM, and RMSProp) and three learning rates (0.0001, 0.0003, and 0.001). Our proposed DarkCVNet model outperforms the other state-of-the-art methods with a training accuracy of 98.20% and a validation accuracy of 99.10% using the ADAM optimizer with a learning rate of 0.001. Figure 6 provides the confusion matrix for the 5-class classification of our proposed DarkCVNet model.