A general overview of the proposed MLHC-COVID-19 model for identifying COVID-19 infection in CXR images is shown in Fig. 1. The block diagram depicts the main processes embedded in the model; the following subsections discuss each process.
COVID-19 CXR images dataset
In this retrospective study, five publicly available datasets of CXR images were used: Mendeley [22], the Italian Society of Medical and Interventional Radiology (SIRM) [23], Github [24], Radiopaedia [25], and Kaggle [26]. A total of 4,200 images were used, divided into three classes: (a) healthy, (b) non-COVID-19 (viral pneumonia + bacterial pneumonia), and (c) COVID-19. Each image was annotated by medical specialists. The first four sources contained 1,050 CXR images of COVID-19-infected patients (912 + 138). The last source contained 3,150 CXR images of healthy, viral pneumonia, and bacterial pneumonia patients (1,050 healthy and 2,100 non-COVID-19). Table 2 lists the CXR image datasets used in this study, and samples of the CXR images in the combined dataset are shown in Fig. 2.
Table 2 CXR image datasets used in this study

| Open Source | Classes | Number of Images |
| --- | --- | --- |
| Kaggle [26] | Healthy | 1,050 |
| Kaggle [26] | non-COVID-19 (Viral + Bacterial) | 2,100 (1,050 + 1,050) |
| Mendeley dataset [22] | COVID-19 | 912 |
| SIRM [23], Github [24], and Radiopaedia [25] | COVID-19 | 138 |
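For illustration, the following minimal sketch shows one way the combined three-class dataset could be assembled from the downloaded sources; the local folder names and the `collect_dataset` helper are hypothetical and not part of the published datasets.

```python
from pathlib import Path

# Hypothetical local folder layout; the archives from Mendeley, SIRM, Github,
# Radiopaedia and Kaggle must be downloaded and unpacked beforehand.
SOURCES = {
    "healthy":     ["kaggle/normal"],                                   # 1,050 images
    "non_covid19": ["kaggle/viral", "kaggle/bacterial"],                # 1,050 + 1,050 images
    "covid19":     ["mendeley/covid", "sirm_github_radiopaedia/covid"], # 912 + 138 images
}

def collect_dataset(root="cxr_dataset"):
    """Return (image path, class label) pairs for the combined three-class dataset."""
    samples = []
    for label, folders in SOURCES.items():
        for folder in folders:
            for img_path in Path(root, folder).glob("*"):
                if img_path.suffix.lower() in {".png", ".jpg", ".jpeg"}:
                    samples.append((str(img_path), label))
    return samples

if __name__ == "__main__":
    data = collect_dataset()
    print(f"collected {len(data)} images")  # expected 4,200 for the full combined dataset
```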
Image Preprocessing
In this study, image preprocessing consists of three steps: image segmentation, image enhancement, and feature extraction, as shown in Fig. 3. For image segmentation, we designed three sub-steps: removal of the partial black background, image resizing, and grayscale conversion. The CXR images in the combined dataset had different formats and a partial black background, caused by factors such as the filming position and the type of X-ray machine used. The black background affected the classification performance and was therefore removed. Fig. 4 shows an example of a CXR image with the black background completely removed. Furthermore, the CXR images had different dimensions; consequently, all images were resized to an identical size of 500 × 500 pixels. The last image segmentation sub-step converts the images to grayscale to reduce the number of image features. Reducing the image features improves the classification result and reduces the complexity of the algorithm [27]. The grayscale formula is given by Eq. 1.
$${G}_{intensity}\left(x,y\right)= 0.2989 \cdot f\left(x,y,R\right)+0.5871 \cdot f\left(x,y,G\right)+0.1140 \cdot f\left(x,y,B\right)$$
1
where \({G}_{intensity}\left(x,y\right)\) is the grayscale intensity at coordinates (x, y), f(x, y, R) is the pixel value at (x, y) in the red channel, f(x, y, G) is the pixel value at (x, y) in the green channel, and f(x, y, B) is the pixel value at (x, y) in the blue channel [28].
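A minimal sketch of the segmentation sub-steps using OpenCV is given below. The background-cropping rule (trimming rows and columns whose mean intensity is near zero) is our assumption, as the exact removal procedure is not specified; the resizing to 500 × 500 and the grayscale weights follow Eq. 1.

```python
import cv2
import numpy as np

def segment_cxr(path, size=(500, 500), bg_thresh=10):
    """Remove the partial black background, resize to 500x500, and convert to grayscale."""
    img = cv2.imread(path, cv2.IMREAD_COLOR)  # BGR image

    # Assumed cropping rule: keep only rows/columns whose mean intensity exceeds
    # a small threshold, which trims a uniformly black border.
    probe = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    rows = np.where(probe.mean(axis=1) > bg_thresh)[0]
    cols = np.where(probe.mean(axis=0) > bg_thresh)[0]
    img = img[rows.min():rows.max() + 1, cols.min():cols.max() + 1]

    img = cv2.resize(img, size)  # identical 500 x 500 dimensions

    # Grayscale conversion with the Eq. 1 weights (OpenCV stores channels as B, G, R).
    b, g, r = img[..., 0], img[..., 1], img[..., 2]
    gray = 0.2989 * r + 0.5871 * g + 0.1140 * b
    return gray.astype(np.uint8)
```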
The second image preprocessing step was image enhancement. We utilized the power-law transformation to adjust the brightness of the CXR images, applying a γ value of 0.5. The power-law transformation formula is shown in Eq. 2 [29, 30].
$$s= c{r}^{\gamma }$$
2
where s is the output pixel value, c is a value of the normalized image, γ is the gamma value, and r is the input pixel value. In addition, the two-dimensional Gaussian filter technique was selected to reduce Gaussian and salt-and-pepper noise [31]; it is given by Eq. 3. Fig. 5 shows a CXR image after image enhancement.
$${G}_{Filter}\left(x,y\right)= \frac{1}{2\pi {\sigma }^{2}}{e}^{-\frac{{x}^{2}+{y}^{2}}{2{\sigma }^{2}}}$$
3
where σ² is the variance of the Gaussian filter with a 3 × 3 kernel size, and x and y are the horizontal and vertical coordinates within the kernel [32].
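A minimal sketch of the enhancement step is shown below, assuming the intensities are first normalized to [0, 1] and taking c = 1 in Eq. 2 (both assumptions, as the paper does not state them): the power-law transform with γ = 0.5 is followed by a 3 × 3 Gaussian filter as in Eq. 3.

```python
import cv2
import numpy as np

def enhance_cxr(gray, gamma=0.5, c=1.0, sigma=1.0):
    """Apply the power-law (gamma) transform of Eq. 2 and the Gaussian filter of Eq. 3."""
    # Power-law transform s = c * r^gamma on intensities normalized to [0, 1].
    r = gray.astype(np.float32) / 255.0
    s = c * np.power(r, gamma)

    # 3x3 Gaussian filter to suppress Gaussian and salt-and-pepper noise.
    filtered = cv2.GaussianBlur(s, (3, 3), sigma)
    return (filtered * 255.0).astype(np.uint8)
```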
The last image preprocessing step is feature extraction, for which histogram analysis and L2-normalization were performed. Histogram analysis reduces the image features by retrieving vital image statistics; specifically, low-intensity values produce a dark-toned image and vice versa [33] (see Algorithm 2 in Appendix A). After the histogram analysis, the features in the histogram had different scales. To obtain a standard scale, we therefore performed L2-normalization. L2-normalization, or the Euclidean norm, normalizes the features of the histogram to the same scale using Eq. 4. Fig. 6 shows the output of feature extraction.
$$L2= \frac{X}{\sqrt{{X}_{i}^{2} + {X}_{i+1}^{2} + {X}_{i+2}^{2} + \dots + {X}_{i+n}^{2}}}$$
4
where X is a feature of the histogram.
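A sketch of the feature extraction step is given below, assuming (consistent with the 256-feature input of the NN described later) that a 256-bin grayscale histogram is computed and then L2-normalized per Eq. 4.

```python
import numpy as np

def extract_features(gray):
    """256-bin grayscale histogram followed by L2-normalization (Eq. 4)."""
    hist, _ = np.histogram(gray, bins=256, range=(0, 256))  # intensity counts per bin
    hist = hist.astype(np.float32)

    norm = np.sqrt(np.sum(hist ** 2))  # Euclidean norm, denominator of Eq. 4
    return hist / norm if norm > 0 else hist
```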
Multi-layer Hybrid Classification Model
The original MLHC was developed by our research team to classify sleep stages automatically. Each layer in the MLHC is a binary classification model that uses a different machine-learning technique [34]. The MLHC-COVID-19 method in this study was designed with two layers. The first differentiates unhealthy subjects (viral, bacterial, and COVID-19 infection) from healthy subjects. The second classifies the unhealthy subjects into COVID-19 and non-COVID-19 (viral and bacterial). Fig. 7 shows the pseudocode of the MLHC-COVID-19.
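A minimal sketch of the two-layer cascade is shown below; the integer class encoding and the scikit-learn estimators used here are placeholders for whichever technique (DT, SVM, or NN) each layer finally embeds.

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier

class MLHCCovid19:
    """Two-layer hybrid cascade: layer 1 separates healthy from unhealthy subjects,
    layer 2 separates COVID-19 from non-COVID-19 among the unhealthy ones."""

    def __init__(self):
        # Placeholder estimators; each layer may embed DT, SVM, or NN.
        self.layer1 = SVC(kernel="rbf")          # healthy (0) vs. unhealthy (1)
        self.layer2 = DecisionTreeClassifier()   # non-COVID-19 (0) vs. COVID-19 (1)

    def fit(self, X, y):
        # X: numpy feature matrix; y: 0 = healthy, 1 = non-COVID-19, 2 = COVID-19
        self.layer1.fit(X, (y != 0).astype(int))
        unhealthy = y != 0
        self.layer2.fit(X[unhealthy], (y[unhealthy] == 2).astype(int))
        return self

    def predict(self, X):
        pred = np.zeros(len(X), dtype=int)       # default prediction: healthy
        unhealthy = self.layer1.predict(X) == 1
        if unhealthy.any():
            pred[unhealthy] = np.where(self.layer2.predict(X[unhealthy]) == 1, 2, 1)
        return pred
```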
Three machine-learning techniques, DT, SVM, and NN, were experimentally evaluated as candidates for embedding into the MLHC-COVID-19. DT is a well-known supervised machine-learning method: each internal node represents a condition used to split the data, the branches represent the outcomes of that test, and the leaves represent the class labels [35, 36]. DT is one of the simplest techniques to understand and is suitable for classification tasks [37]. SVM is a supervised machine-learning method with promising performance in statistical classification [38]. It separates the data by finding hyperplanes; the hyperplane is refined iteratively toward the best separator during training [39]. The radial basis function (RBF) was selected as the kernel function [40]. Finally, NNs are mathematical models for processing data with connected computation nodes that mimic the functions of biological neural networks [41]. They build complex models between the inputs and outputs with high efficiency [42]. We designed four fully connected (dense) layers and two dropout layers. The input to the first dense layer consisted of the 256 histogram features. The first and second dense layers used 128 neurons and the third dense layer used 32 neurons. The last dense layer was fed into a SoftMax classifier [43]. In addition, dropout layers with a 0.2 ratio were interpolated between the dense layers [44]. Fig. 8 shows the MLHC-COVID-19 flowchart.
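A sketch of the described NN in Keras is given below, assuming the 256 histogram features as input. The ReLU hidden activations, the placement of the two 0.2 dropout layers, the two-unit SoftMax output (one binary decision per MLHC layer), and the optimizer/loss are our assumptions, as they are not stated above.

```python
from tensorflow.keras import Sequential
from tensorflow.keras.layers import Dense, Dropout, Input

def build_nn(input_dim=256, n_classes=2):
    """Four dense layers (128, 128, 32, n_classes) with two 0.2 dropout layers."""
    model = Sequential([
        Input(shape=(input_dim,)),               # 256 histogram features
        Dense(128, activation="relu"),
        Dropout(0.2),                            # assumed placement of dropout
        Dense(128, activation="relu"),
        Dropout(0.2),
        Dense(32, activation="relu"),
        Dense(n_classes, activation="softmax"),  # SoftMax classifier
    ])
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model
```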