Transfer learning is an efficient way to train a CNN when adequate training data and computational resources are scarce.16 The parameters learnt on a large dataset such as ImageNet are used for weight initialization. Because training and classification then take less time, transfer learning can be carried out on CPUs, with no special requirement for GPUs. The earlier layers of a pre-trained network usually capture low-level information (edges, color), while the later layers capture high-level attributes, particularly the categorical details. This paper proposes a transfer-learning-based approach that uses pre-trained weights for the initial layers of the network and fine-tunes the last few layers on the dataset.6 This helps in classifying the chest X-Ray images and identifying the COVID-19 images in the dataset. The proposed network architecture is shown in Fig. 1.
3.1. Dataset Description
The dataset is taken from the Kaggle dataset repository,17 an open-access database of chest X-Ray images. It contains X-Ray images of COVID-19 patients, patients with bacterial and viral pneumonia, and normal subjects. Two image classes are considered in this work: COVID-19 and normal. There are 70 COVID-19 and 930 normal images, so the dataset is heavily skewed towards normal chest images. Therefore, 70 COVID-19 and 80 normal images are used in the proposed work. The images come in different sizes and are pre-processed by resizing them to 224x224 pixels before being fed to the network. The dataset includes both training and testing images; 20% of the data is used for validation. Fig. 2 and Fig. 3 show some COVID-19 and normal X-Ray images from the dataset.
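The class balancing and hold-out described above can be sketched as follows. This is a minimal illustration rather than the authors' code: the file names, the random seed, and the helper names `balanced_subset` and `holdout_split` are hypothetical, and a simple random (non-stratified) split is assumed because the text does not say how the 20% was drawn.

```python
import random

def balanced_subset(covid_files, normal_files, n_normal=80, seed=0):
    """Keep every COVID-19 image and a random sample of n_normal normal images."""
    rng = random.Random(seed)
    chosen_normal = rng.sample(normal_files, n_normal)
    labelled = [(name, "covid") for name in covid_files]
    labelled += [(name, "normal") for name in chosen_normal]
    return labelled

def holdout_split(samples, held_out_fraction=0.2, seed=0):
    """Shuffle the labelled samples and hold out a fraction of them."""
    rng = random.Random(seed)
    shuffled = samples[:]
    rng.shuffle(shuffled)
    n_held_out = round(len(shuffled) * held_out_fraction)
    return shuffled[n_held_out:], shuffled[:n_held_out]

# Placeholder file names standing in for the 70 COVID-19 and 930 normal images.
covid_files = [f"covid_{i:03d}.png" for i in range(70)]
normal_files = [f"normal_{i:03d}.png" for i in range(930)]

samples = balanced_subset(covid_files, normal_files)
train, held_out = holdout_split(samples)
print(len(samples), len(train), len(held_out))  # 150 120 30
```

With 150 selected images, a 20% hold-out leaves 120 images for training and 30 for evaluation.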
3.2. Training and Algorithm
The proposed model works as follows. Some layers of a pre-trained model are used as a feature-extraction component, and the last few layers are fine-tuned.6 To fine-tune, the pre-trained model is loaded without its original classification layers, by setting the include_top argument to False, and new layers are then added on top.
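Loading a pre-trained base in this way might look like the sketch below, written with the Keras applications API. VGG16 is an assumed stand-in, since this section does not name the backbone, and freezing every base layer is one common choice rather than a detail confirmed by the text. The helper name `load_frozen_base` is hypothetical.

```python
from tensorflow.keras.applications import VGG16

def load_frozen_base(weights="imagenet", input_shape=(224, 224, 3)):
    """Load a convolutional base without its original classifier head.

    VGG16 is an assumption; the text does not name the backbone here.
    """
    base = VGG16(weights=weights, include_top=False, input_shape=input_shape)
    for layer in base.layers:
        layer.trainable = False  # keep the pre-trained weights fixed
    return base
```

Calling `load_frozen_base()` downloads the ImageNet weights on first use; passing `weights=None` builds the same architecture with random weights, which is enough to inspect the output shape before new layers are attached to `base.output`.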
The final three layers of the model are modified conforming to the new classification task as follows:
- “Average Pooling” Layer,
- “Dense (fully-connected)” layer with ReLU activation function,18
- “Dense (fully-connected)” layer with Softmax function18 with a classification output.
The pooling layer is added to extend the feature-extraction capacity of the model, and the new fully connected layers are added to perform classification by learning features of the new dataset. ReLU (Rectified Linear Unit) is the activation function most commonly used in CNNs. It introduces non-linearity into the network and is preferred because it converges faster and mitigates the vanishing gradient problem.10 Mathematically, it is defined as y = max(0,x).18 The size of the final dense layer is set to two, the number of classes in the given dataset, i.e., COVID-19 and normal. The complete methodology for classifying chest X-Ray images using transfer learning is summarized in Steps 1 to 8 below.
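The computation performed by these three layers can be illustrated with a small NumPy forward pass. This is a sketch under stated assumptions: the feature-map shape (7x7x512), the 64 hidden units, the random weights, and pooling over the full spatial extent are all illustrative choices that the text does not specify.

```python
import numpy as np

def relu(x):
    """ReLU: y = max(0, x), applied element-wise."""
    return np.maximum(0.0, x)

def softmax(z):
    """Softmax over the last axis, shifted by the max for numerical stability."""
    shifted = z - z.max(axis=-1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=-1, keepdims=True)

def classification_head(feature_maps, w1, b1, w2, b2):
    """Average pooling -> dense + ReLU -> dense + softmax.

    feature_maps has shape (batch, height, width, channels).
    """
    pooled = feature_maps.mean(axis=(1, 2))  # average over spatial positions
    hidden = relu(pooled @ w1 + b1)          # first dense layer with ReLU
    return softmax(hidden @ w2 + b2)         # class probabilities

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 7, 7, 512))                      # assumed shape
w1, b1 = rng.normal(scale=0.05, size=(512, 64)), np.zeros(64)   # 64 units: assumption
w2, b2 = rng.normal(scale=0.05, size=(64, 2)), np.zeros(2)      # two classes
probs = classification_head(features, w1, b1, w2, b2)
print(probs.shape)  # (4, 2)
```

Each row of `probs` sums to one, so its two entries can be read as the predicted probabilities of the COVID-19 and normal classes.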
Step 1: The dataset is taken from the Kaggle dataset repository;17 it contains two folders, one of COVID-19 and one of normal chest X-Ray images. Some images are shown in Fig. 2 and Fig. 3.
Step 2: Resize the input images so that they match the size of the input layer of the pre-trained network.
Step 3: Augment the images during training to handle varying rotations, with the rotation range set to 15 degrees.
Step 4: Partition the data into training and test sets: 80% of the images are used for training and 20% for testing the network.
Step 5: Modify the network architecture by replacing the final layers of the pre-trained network with “average pooling”, “fully-connected” and “softmax” layers with a “classification output”, which gives the probability of the COVID-19 and normal classes.
Step 6: Train the Network.
Step 7: Test the new classifier on the testing dataset.
Step 8: Plot the accuracy and loss during the training and validation phases to analyze the performance of the model.
3.3. Performance Measures
Validating the effectiveness of a model is a crucial step. The performance of the proposed model is evaluated in terms of sensitivity, specificity, and accuracy, which can be calculated using Eq. (1)-(3).19 In the equations, P and N denote the number of positive and negative samples respectively, while TP, FN, TN and FP denote true positives, false negatives, true negatives and false positives.
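Written in terms of these counts, with P = TP + FN and N = TN + FP, the three measures take their standard forms. The equations are left unnumbered here, since their order in the original numbering is not given in the text.

```latex
\begin{align*}
\text{Accuracy}    &= \frac{TP + TN}{P + N} = \frac{TP + TN}{TP + FN + TN + FP} \\
\text{Sensitivity} &= \frac{TP}{P} = \frac{TP}{TP + FN} \\
\text{Specificity} &= \frac{TN}{N} = \frac{TN}{TN + FP}
\end{align*}
```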
Accuracy is the ratio of the correctly classified samples to the total number of samples.
Sensitivity is the true positive rate (TPR); it represents the ratio of correctly classified positive samples to the total number of positive samples.
Specificity is the true negative rate (TNR); it represents the ratio of correctly classified negative samples to the total number of negative samples.
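As a worked illustration of these definitions, the three measures can be computed from predicted and true labels as sketched below. The label vectors are made up for the example and are not results from this work; the function names are hypothetical, and label 1 is assumed to denote the positive (COVID-19) class.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count TP, FN, TN and FP for a binary classification problem."""
    tp = fn = tn = fp = 0
    for truth, pred in zip(y_true, y_pred):
        if truth == positive:
            if pred == positive:
                tp += 1
            else:
                fn += 1
        else:
            if pred == positive:
                fp += 1
            else:
                tn += 1
    return tp, fn, tn, fp

def performance_measures(tp, fn, tn, fp):
    """Accuracy, sensitivity (TPR) and specificity (TNR) from the four counts."""
    p = tp + fn  # number of positive samples
    n = tn + fp  # number of negative samples
    return {
        "accuracy": (tp + tn) / (p + n),
        "sensitivity": tp / p,
        "specificity": tn / n,
    }

# Made-up labels for illustration only (1 = positive, 0 = negative).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 0, 1, 0]

counts = confusion_counts(y_true, y_pred)
measures = performance_measures(*counts)
print(counts)  # (3, 1, 5, 1)
print(measures)
```

For these toy labels the accuracy is 8/10, the sensitivity 3/4 and the specificity 5/6.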