A New Radiomic Study on Lung CT Images of Patients with Covid-19 using LBP and Deep Learning (Convolutional Neural Networks (CNN))

The Covid-19 outbreak that emerged in China at the end of 2019 has had a devastating effect worldwide. In patients with severe symptoms, the Covid-19 virus causes pneumonia, which leads to intense involvement of and damage to the lungs. Although the disease emerged only a short time ago, many studies have already documented these effects on the lungs with the help of lung CT imaging. In this study, a total of 25 lung CT images (15 from Covid-19 patients and 10 normal) was increased tenfold (250 images in total) using three data augmentation methods, namely contrast change, brightness change and noise addition, and these images were subjected to automatic classification. Experiments were carried out for each of the following inputs: the lung CT images used directly (gray-level and RGB), the images obtained by applying Local Binary Pattern (LBP) to these images (gray-level and RGB), and the combination of the two (gray-level and RGB). A 23-layer Convolutional Neural Network (CNN) architecture was developed and used for classification. The leave-one-group-out cross validation method was used to test the proposed system. The best AUC and EER values were obtained for the combination of the original (RGB) and LBP-applied (RGB) images, at 0,9811 and 0,0445 respectively. Using LBP-applied images as the CNN input increased sensitivity but decreased specificity. The highest sensitivity, 0,9947, was obtained with LBP-applied (RGB) images. The highest specificity and accuracy, 0,9120 and 95,32% respectively, were obtained with the gray-level lung CT images. The results of the study indicate that analyzing lung CT images with deep learning methods will speed up the diagnosis of Covid-19 and significantly reduce the burden on healthcare workers.


Introduction
In December 2019, an outbreak occurred in Wuhan, in China's Hubei province, caused by a new coronavirus of zoonotic origin that generally causes acute respiratory tract infection [1]. As the virus spread over the following days, the disease began to seriously affect countries all over the world. In March 2020, the World Health Organization announced that the disease had become a global pandemic. The World Health Organization also named the new disease "Covid-19". Common symptoms of the disease are fever, cough, shortness of breath, muscle pain and weakness [2]. In general, the disease has severe negative effects on the lungs. Many studies documenting these effects with the help of lung CT imaging have been published in a short time. These studies reveal that, besides clinical symptoms and blood and biochemical tests, lung CT imaging is an important tool for diagnosing the disease.
In a study by Qin et al. [3], lung CT images of four Covid-19 cases, two men and two women, were examined clinically. The results reveal that lung lesions due to Covid-19 show a high degree of involvement in patients with pneumonia. Albarello et al. [4] investigated how chest X-ray and lung CT images of 2 Covid-19 cases in Italy changed over the course of the disease. The authors concluded that, considered together with the clinical findings, monitoring lung deterioration through radiological images is an important alternative for early diagnosis of the disease. Lin et al. [5] examined how the lung CT images of a 61-year-old male Covid-19 patient changed as the disease progressed, and reported that lung involvement increased with disease progression. In a similar study, Li et al. [6] clinically evaluated lung CT images of 5 Covid-19 patients aged between 10 months and 6 years. They reported that 2 of these patients showed no signs of disease on lung CT images, whereas 3 showed significant abnormalities.
Xu et al. [7] clinically evaluated lung CT images of 50 Covid-19 patients in total, of whom 9 were mild, 28 moderate, 10 severe and 3 critical cases. They reported that no changes in the radiological images occurred in 9 patients, while symmetrical lesions developed in 26 patients and asymmetrical lesions in 15 patients. In their discussion, they also concluded that repeated CT scanning is useful for monitoring disease progress and for timely treatment of Covid-19. Xia et al. [8] analyzed CT images of 20 pediatric patients diagnosed with Covid-19. The results reveal that all patients had subpleural lesions. Chen et al. [9] carried out a similar study by analyzing the clinical records of 9 pregnant women diagnosed with Covid-19, and stated that lung CT images have high diagnostic value for Covid-19. Huang et al. [10] studied the clinical data of patients diagnosed with Covid-19. They reported that lung abnormalities were detected in 40 of the 41 Covid-19 patients examined and that the involvement was bilateral.
Hu et al. [11] evaluated the lung CT images of 2 Covid-19 patients. Although the symptoms of the disease decreased after two days of treatment, this improvement was not consistent with the lung CT images. Liu et al. [12] clinically evaluated lung CT images of 73 Covid-19 cases of various severities. The results show that all patients, except the 8 percent who experienced the disease as mild pneumonia, had abnormal lung CT images. Xu et al. [13] examined the clinical records of 90 Covid-19 patients and reported multiple patchy ground-glass opacities on the CT images. A similar study was carried out by Pan et al. [14] using the clinical data of 63 Covid-19 patients, and abnormalities were detected in the patients' lung CT images. In the study by Shen et al. [15], the lesion levels in lung CT images of 44 Covid-19 patients were labeled by radiologists and by computer. The results show that computerized labeling is a reliable alternative for detecting the severity and distribution of pneumonia due to Covid-19. Li et al. [16] performed a clinical evaluation of lung CT images taken during the course of pneumonia due to Covid-19, drawing attention to the importance of lung CT imaging in understanding the effect and progress of the disease. Lim et al. [17] investigated the clinical symptoms and lung CT images of a 54-year-old male Covid-19 patient in South Korea. A similar study was conducted by Cheng et al. [18] for the first Covid-19 case in Taiwan.
In a review study conducted by Long and Ehrenfeld [19], it is emphasized that using artificial intelligence methods to reduce the effects of Covid-19 outbreak crisis is an essential requirement.
This study proposes the automatic classification of lung CT images for early diagnosis of Covid-19 using a Convolutional Neural Network (CNN), one of the deep learning methods. In addition to the results obtained when the images are fed directly to the CNN classifier, results obtained with Local Binary Pattern (LBP) as a preprocessing step are reported. The proposed method was also tested separately with the two kinds of image data combined. The results of the study indicate that analyzing lung CT images with deep learning methods will speed up the diagnosis of Covid-19 and significantly reduce the burden on healthcare workers.

Used Data
The lung CT images of Covid-19 patients used in the study were taken from a data set compiled together with its metadata by Cohen et al. [20] and made publicly available through GitHub. From this data set, a total of 15 lung CT images in 24-bit RGB format were used. These images come from a total of 8 patients and were contributed to the data set by Lim et al. [17] (2 images of 1 patient), Cheng et al. [18] (4 images of 1 patient) and the Italian Medical and Interventional Radiology Association [21] (9 images of 6 patients). Images obtained from the same patient belong to different days in the course of the disease and are therefore not identical. The image sizes vary widely, between 509×341 and 2024×1523. First, each image was cropped so as to include the lung region, in order to isolate the area of interest. After this step, all images were resized to 224×224. In addition, 10 normal lung CT images were used. These were taken from a previously published data set, the IA/I-ELCAP public access research database [22]. The processing applied to the abnormal images was also applied to the normal ones.

Data Augmentation
Within the scope of the study, the 25 lung CT images, of which 15 belong to Covid-19 patients and 10 to normal subjects, were augmented tenfold (250 images in total) using conventional data augmentation methods, and these images were subjected to automatic classification. Data augmentation was performed by applying contrast change, brightness change and noise addition to the original images. First, the pixel values of the original image were multiplied by 0,8 and 0,6 respectively to change the contrast, giving the second and third images. The brightness was then changed by increasing the value of each pixel of the second and third images by 7, giving the fourth and fifth images. Finally, salt and pepper noise with a density of 0,03 was added to the original image and to the four images derived from it, which increased the number of images to 10. The images obtained by applying these steps to the image of one of the patients diagnosed with Covid-19 are given in Figure 1. The noise type and the parameter values used in data augmentation were chosen by considering studies in the literature that report successful results [23], [24]. Different parameter values, noise types and other data augmentation methods could also be used.
Fig. 1 Data augmentation example: a) original image, b) and c) images obtained by multiplying the pixel values of image a by 0,8 and 0,6, d) and e) images obtained by adding 7 to the pixel values of images b and c, f) to j) images obtained by adding salt and pepper noise of density 0,03 to images a to e
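The augmentation steps described above can be sketched in a few lines of Python with NumPy. This is only an illustrative sketch, not the code used in the study: the function names are hypothetical, values are clipped to the 8-bit range, and the noise is assumed to hit whole pixels with equal shares of salt and pepper, none of which is stated in the text.

```python
import numpy as np

def adjust_contrast(img, factor):
    """Multiply pixel values by `factor` (0.8 and 0.6 in the text) and clip to 8 bits."""
    scaled = img.astype(np.float64) * factor
    return np.clip(scaled, 0, 255).round().astype(np.uint8)

def adjust_brightness(img, offset=7):
    """Add a constant offset to every pixel value and clip to 8 bits."""
    return np.clip(img.astype(np.int16) + offset, 0, 255).astype(np.uint8)

def add_salt_and_pepper(img, density=0.03, rng=None):
    """Set roughly `density` of the pixels to 0 (pepper) or 255 (salt).

    Assumption: noise is drawn per pixel position, in equal shares.
    """
    rng = np.random.default_rng() if rng is None else rng
    noisy = img.copy()
    draw = rng.random(img.shape[:2])
    noisy[draw < density / 2] = 0
    noisy[(draw >= density / 2) & (draw < density)] = 255
    return noisy

def augment(img, rng=None):
    """Return the 10 images (a to j) derived from one original image."""
    a = img
    b = adjust_contrast(a, 0.8)
    c = adjust_contrast(a, 0.6)
    d = adjust_brightness(b, 7)
    e = adjust_brightness(c, 7)
    clean = [a, b, c, d, e]
    noisy = [add_salt_and_pepper(x, 0.03, rng) for x in clean]
    return clean + noisy
```

Applied to each of the 25 source images, `augment` yields the 250 images mentioned in the text; passing a seeded `np.random.default_rng` makes the noise reproducible.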

Local Binary Pattern (LBP)
Local Binary Pattern (LBP) was first introduced by Ojala et al. [25]. The method is often used to reveal local spatial structures. LBP sequentially compares a center pixel with the values of its neighboring pixels, which makes it a non-parametric method. The comparison was first defined for a 3×3 square operator and was later extended to operators of different sizes. Equation (1) gives the mathematical representation of this operation.
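For reference, the LBP operator is commonly written in the following form. The notation is an assumption of this sketch rather than the authors' own: g_c is the gray value of the center pixel at (x_c, y_c), g_p is the gray value of the p-th of P neighbors at radius R, and s is the thresholding function.

```latex
\mathrm{LBP}_{P,R}(x_c, y_c) = \sum_{p=0}^{P-1} s(g_p - g_c)\, 2^{p},
\qquad
s(z) =
\begin{cases}
1, & z \ge 0 \\
0, & z < 0
\end{cases}
```

With P = 8 neighbors the sum takes 2^8 = 256 distinct values, which matches the number of combinations mentioned in the text.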
As can be seen from Equation (1), thresholding is performed by comparing the central pixel with its neighbors. The thresholding yields an LBP code with 256 possible combinations in total. This code is then converted to a single number, which becomes the new pixel value. A sample LBP application is shown in Figure 2. Figure 3 shows the images obtained by applying LBP to each color channel of the images given in Figure 1.
When LBP is applied to the images, the image size decreases. For example, when an LBP operation with a radius of 2 is applied to a 224×224 image, a 220×220 image is obtained, because the operation cannot be applied to the pixels in the first and last rows and columns. The radius of the LBP used in this study is 2. To prevent this size reduction from causing problems in the other processing steps, the images obtained after the LBP operation were restored to 224×224. Using LBP in the study provides new images, derived from the original ones, that reflect local features. Thus, the total image feature depth was increased.
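The size arithmetic above (224×224 in, 220×220 out for a radius of 2) can be illustrated with a minimal Python sketch. It is not the authors' MATLAB implementation: the function names, the eight-neighbor square sampling pattern, the `>=` comparison and the edge padding used to return to 224×224 are all assumptions made for illustration, since the text does not state how the size was restored.

```python
import numpy as np

def lbp_square(gray, radius=2):
    """8-neighbour LBP on a 2-D gray image; output is (H - 2r) x (W - 2r)."""
    g = gray.astype(np.int32)
    h, w = g.shape
    r = radius
    center = g[r:h - r, r:w - r]
    # Neighbour offsets (dy, dx) on a square ring, clockwise from top-left.
    offsets = [(-r, -r), (-r, 0), (-r, r), (0, r),
               (r, r), (r, 0), (r, -r), (0, -r)]
    codes = np.zeros_like(center)
    for p, (dy, dx) in enumerate(offsets):
        neighbour = g[r + dy:h - r + dy, r + dx:w - r + dx]
        # Bit p is set when the neighbour is at least as bright as the center.
        codes += (neighbour >= center).astype(np.int32) << p
    return codes.astype(np.uint8)

def restore_size(lbp_img, radius=2):
    """Pad the LBP image back to the input size by repeating edge values (assumption)."""
    return np.pad(lbp_img, radius, mode="edge")

def lbp_per_channel(rgb, radius=2):
    """Apply LBP separately to each channel of an H x W x 3 image."""
    channels = [restore_size(lbp_square(rgb[..., c], radius), radius)
                for c in range(rgb.shape[-1])]
    return np.stack(channels, axis=-1)
```

A true circular LBP would interpolate the diagonal neighbors at distance R; the square ring is used here only to keep the sketch short.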

Convolutional Neural Network (CNN)
Deep learning performs the learning process using many layers. The Convolutional Neural Network (CNN) is the most frequently used deep learning model and has come into wide use in recent years, especially in image processing applications. A CNN consists of layers such as convolutional layers, activation functions, pooling layers and fully connected layers. Convolutional layers are often stacked one after another and make it possible to extract feature patterns ranging from low-level to high-level image features [23]. Activation functions in a CNN architecture map their inputs to a certain range, or pass some input values while suppressing others. Pooling layers reduce the size of the feature matrices through sampling. The fully connected layer performs the classification according to the features obtained through convolution, activation and pooling, and works like a classical artificial neural network. Before classification, the feature matrices are converted into feature vectors, that is, flattened. Figure 4 shows the general architecture of the CNN classifier.
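To make the role of each layer type concrete, the following toy Python sketch chains one convolution, one activation, one pooling step, flattening and a fully connected layer. It is a didactic example only and does not reproduce the 23-layer architecture of the study: the function names, the ReLU activation, the 2×2 max pooling and the softmax output are assumptions chosen for illustration.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over `image` without padding (cross-correlation, as in CNNs)."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

def relu(x):
    """Activation: pass positive values, suppress negative ones."""
    return np.maximum(x, 0.0)

def max_pool(x, size=2):
    """Reduce the feature map by keeping the maximum of each size x size block."""
    h2, w2 = x.shape[0] // size, x.shape[1] // size
    trimmed = x[:h2 * size, :w2 * size]
    return trimmed.reshape(h2, size, w2, size).max(axis=(1, 3))

def forward(image, kernel, weights, bias):
    """Convolution, activation, pooling, flattening, then a fully connected layer."""
    features = max_pool(relu(conv2d_valid(image, kernel)))
    flat = features.ravel()                   # flattening step
    logits = weights @ flat + bias            # fully connected layer
    exp = np.exp(logits - logits.max())       # softmax over the two classes
    return exp / exp.sum()
```

For an 8×8 input and a 3×3 kernel, the convolution gives a 6×6 map, pooling reduces it to 3×3, and flattening produces the 9-element vector that the fully connected layer consumes.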
Within the scope of the study, a CNN architecture consisting of 23 layers in total was designed, and all experiments were based on this architecture. Table 1 lists the layers of this architecture and their parameters. MATLAB 2019a was used for the implementation, and the layer name and parameter columns give the function names and parameters exactly as used in the program. The first layer, the image input layer, is listed with four different dimensions because experiments were carried out with input images of several different sizes.

Evaluation Criteria for the Classification Results
In this study, the counts TP, TN, FN and FP, together with the measures derived from them, namely sensitivity (SEN), specificity (SPE) and accuracy (ACC), were used to evaluate the results. Receiver Operating Characteristic (ROC) analysis was also performed, and the areas under the ROC curve (AUC) were compared. Here, TP is the number of times patient data is labeled as patient by the classifier, and FP is the number of times non-patient data is labeled as patient. TN is the number of times non-patient data is labeled as non-patient, and FN is the number of times patient data is labeled as non-patient. The sensitivity (SEN), specificity (SPE) and accuracy (ACC) values calculated from these counts are defined in Equations (2)-(4). In this study, SEN, SPE and ACC were calculated at a threshold (cut-off) value of 0,5.
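In their usual form, these measures are SEN = TP / (TP + FN), SPE = TN / (TN + FP) and ACC = (TP + TN) / (TP + TN + FP + FN). The short Python sketch below computes them from classifier scores at a cut-off of 0.5. The function names are hypothetical, and treating a score equal to the threshold as positive is an assumption.

```python
def confusion_counts(y_true, scores, threshold=0.5):
    """Count TP, FP, TN, FN; label 1 = patient, label 0 = non-patient."""
    tp = fp = tn = fn = 0
    for label, score in zip(y_true, scores):
        predicted = 1 if score >= threshold else 0  # assumption: ties count as patient
        if label == 1 and predicted == 1:
            tp += 1
        elif label == 0 and predicted == 1:
            fp += 1
        elif label == 0 and predicted == 0:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn

def sensitivity(tp, fn):
    """Share of patient samples labeled as patient."""
    return tp / (tp + fn)

def specificity(tn, fp):
    """Share of non-patient samples labeled as non-patient."""
    return tn / (tn + fp)

def accuracy(tp, tn, fp, fn):
    """Share of all samples labeled correctly."""
    return (tp + tn) / (tp + tn + fp + fn)
```

For example, labels `[1, 1, 1, 0, 0]` with scores `[0.9, 0.6, 0.2, 0.4, 0.7]` give TP = 2, FN = 1, TN = 1 and FP = 1.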
For two-class classification, ROC analysis examines how sensitivity (SEN, y-axis) changes against the complement of specificity (1-SPE, x-axis), that is, the false positive rate, as the discrimination threshold (cut-off value) is varied [26]. The area under the resulting curve is called the AUC. An AUC approaching 1 indicates that the classification approaches perfection, while a value approaching 0 indicates that the classification deteriorates. The point at which the ROC curve intersects the diagonal drawn between the points where the x-axis and the y-axis take the value 1, that is, between (1, 0) and (0, 1), gives the Equal Error Rate (EER). Since SEN and SPE are equal at this point, 1-EER is also equal to the ACC value at this intersection point. Figure 5 illustrates these aspects of ROC analysis.
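A minimal Python sketch of these quantities follows. It is one possible way to compute them, not the procedure used in the study: the function names are hypothetical, the AUC uses the trapezoidal rule, and the EER is found by linear interpolation between the two ROC points that bracket the line SEN = 1 - (1-SPE).

```python
def roc_points(y_true, scores):
    """Return (1-SPE, SEN) pairs obtained by lowering the threshold step by step."""
    pos = sum(1 for y in y_true if y == 1)
    neg = len(y_true) - pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 1)
        fp = sum(1 for y, s in zip(y_true, scores) if s >= t and y == 0)
        points.append((fp / neg, tp / pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area

def equal_error_rate(points):
    """Value of 1-SPE where the ROC curve crosses the line from (0, 1) to (1, 0)."""
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        d0 = x0 - (1.0 - y0)  # false positive rate minus false negative rate
        d1 = x1 - (1.0 - y1)
        if d0 <= 0.0 <= d1:
            if d1 == d0:
                return x0
            t = -d0 / (d1 - d0)
            return x0 + t * (x1 - x0)
    raise ValueError("ROC curve does not cross the equal-error line")
```

With labels `[1, 0, 1, 0]` and scores `[0.9, 0.8, 0.7, 0.1]`, three of the four patient/non-patient pairs are ranked correctly, so the AUC is 0.75, and the curve crosses the equal-error line at (0.5, 0.5).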

Experiments
Within the scope of the study, a deep learning-based artificial intelligence application was implemented for the automatic classification of lung CT images for early diagnosis of Covid-19. A total of 25 lung CT images were used, 15 belonging to patients diagnosed with Covid-19 and 10 to normal subjects.
First, each image was cropped so as to include the lung region, in order to isolate the area of interest. Since the images were of very different sizes, they were resized to 224×224. The images were then saved as 24-bit RGB images in jpg format. These steps can be regarded as preprocessing.
In the second stage of the study, data augmentation was performed using the contrast change, brightness change and noise addition methods described earlier. As in the first stage, the images were saved as 24-bit RGB images in jpg format. At the end of the second stage, the amount of data had been increased tenfold, including the original images from which the others were derived.
In the third stage of the study, the 23-layer CNN architecture described earlier was developed and used in all experiments. The leave-one-group-out cross validation method was applied in the training runs of these experiments. In other words, each training run used all images except the image to be classified and the 9 other images in the same augmentation group. At the end of each training run, the 10 images of the held-out group were tested and classified. The training and test procedure was repeated 25 times, so that a classification result was obtained for every image. Six experiments were carried out in this part of the study.
- In the first experiment, training and test operations were carried out on 8-bit gray-level images obtained by converting the RGB images.
- In the second experiment, the images were given as inputs to the CNN in RGB format, that is, with three image matrix spaces (channels).
- In the third experiment, the images obtained by applying LBP to the 8-bit gray-level images of the first experiment were used.
- In the fourth experiment, the data with three image matrix spaces obtained by applying LBP separately to each color channel of the RGB images were given as input to the CNN.
- In the fifth experiment, the gray-level images used in the first and third experiments were combined, giving image data with a total of two image matrix spaces.
- In the sixth and final experiment, the RGB images used in the second and fourth experiments were combined, and image data with a total of six image matrix spaces was given as input to the CNN.
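The six input configurations differ only in how the image planes are stacked along the last axis. The Python sketch below shows one way to assemble them with NumPy, assuming the LBP images have already been computed and restored to 224×224. The function names are hypothetical, and the gray-level conversion uses the common 0.299/0.587/0.114 luma weights as an assumption, since the text does not specify the conversion.

```python
import numpy as np

def to_gray(rgb):
    """Convert an H x W x 3 image to 8-bit gray using luma weights (assumption)."""
    weights = np.array([0.299, 0.587, 0.114])
    gray = rgb.astype(np.float64) @ weights
    return np.clip(gray.round(), 0, 255).astype(np.uint8)

def build_inputs(rgb, lbp_gray, lbp_rgb):
    """Return the CNN input array for each of the six experiments.

    rgb:      H x W x 3 original image
    lbp_gray: H x W     LBP of the gray-level image
    lbp_rgb:  H x W x 3 LBP applied to each color channel
    """
    gray = to_gray(rgb)[..., np.newaxis]     # H x W x 1
    lbp_gray = lbp_gray[..., np.newaxis]     # H x W x 1
    return {
        1: gray,                                       # gray-level CT
        2: rgb,                                        # RGB CT
        3: lbp_gray,                                   # LBP of gray-level CT
        4: lbp_rgb,                                    # LBP of RGB CT
        5: np.concatenate([gray, lbp_gray], axis=-1),  # 2 planes
        6: np.concatenate([rgb, lbp_rgb], axis=-1),    # 6 planes
    }
```

The resulting depths of 1, 3, 1, 3, 2 and 6 planes correspond to the input dimensions 224×224×1 to 224×224×6 used in the experiments.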
Since each LBP application reduced the image size, the LBP-applied images were restored to 224×224. Each experiment was repeated 10 times so that the results would stabilize, because some initial weights and parameters of the CNN are assigned randomly.
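The leave-one-group-out scheme used in these experiments can be sketched as follows, where a group is one source image together with its nine augmented versions. The function name and the index layout (the ten variants of each source image stored consecutively) are assumptions made for illustration.

```python
def leave_one_group_out(groups):
    """Yield (held_out_group, train_indices, test_indices) for every group."""
    for held_out in sorted(set(groups)):
        test_idx = [i for i, g in enumerate(groups) if g == held_out]
        train_idx = [i for i, g in enumerate(groups) if g != held_out]
        yield held_out, train_idx, test_idx

# 25 source images, each contributing 10 images (the original plus 9 derived ones).
groups = [source for source in range(25) for _ in range(10)]
folds = list(leave_one_group_out(groups))
```

Each of the 25 folds trains on 240 images and tests on the 10 images of the held-out group, so no augmented copy of a test image is ever seen during training. Grouping by patient rather than by source image would only require different group labels.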
The time needed to classify an image was also measured. The hardware and software on which the experiments were run must be taken into account when evaluating these times. The experiments were carried out using MATLAB 2019a running on a computer with an Intel i5-2430M 2.4 GHz processor and 4 GB of RAM.

Results
In the first experiment, training and test procedures were performed on the gray-level lung CT images, that is, the 8-bit gray-level images obtained by converting the RGB images. As mentioned earlier, each experiment was repeated 10 times so that the results would stabilize, because some initial weights and parameters of the CNN are assigned randomly. The dimensions of the image given to the CNN input in this experiment are 224×224×1. The results obtained for the first experiment are presented in Table 2. As can be seen from Table 2, the average sensitivity is 0,9807 and the average specificity is 0,9120. The average accuracy is 0,9532 and the average area under the ROC curve is 0,9804 (EER=0,0458). These classification processes took an average of 11,1942 seconds per image. The best and worst ROC curves obtained in the experiment can be seen in Figure 6.
Fig. 6 The best (green line) and worst (red line) ROC curves obtained using lung CT images (gray-level)
In the second experiment, training and test procedures were performed using the lung CT images (RGB). The dimensions of the image given to the CNN input in this experiment are 224×224×3. The results obtained for the second experiment are presented in Table 3. As can be seen from Table 3, the average sensitivity is 0,9687 and the average specificity is 0,8920. The average accuracy is 0,9380 and the average area under the ROC curve is 0,9728 (EER=0,0701). These classification processes took an average of 14,1371 seconds per image. The best and worst ROC curves obtained in the experiment can be seen in Figure 7.
Fig. 7 The best (green line) and worst (red line) ROC curves obtained using lung CT images (RGB)
In the third experiment, training and test procedures were performed using the images obtained by applying LBP to the gray-level lung CT images used in the first experiment. The dimensions of the image given to the CNN input in this experiment are 224×224×1. The results obtained for the third experiment are presented in Table 4. As can be seen from Table 4, the average sensitivity is 0,9940 and the average specificity is 0,7150. The average accuracy is 0,8824 and the average area under the ROC curve is 0,9606 (EER=0,0717). These classification processes took an average of 11,7981 seconds per image. The best and worst ROC curves obtained in the experiment are shown in Figure 8.
Fig. 8 The best (green line) and worst (red line) ROC curves obtained using LBP images (gray-level)
In the fourth experiment, the data with three image matrix spaces obtained by applying LBP separately to each color channel of the lung CT images (RGB) was given as input to the CNN. The dimensions of the image given to the CNN input in this experiment are 224×224×3. The results obtained for the fourth experiment are given in Table 5. As can be seen from Table 5, the average sensitivity is 0,9947 and the average specificity is 0,6600. The average accuracy is 0,8608 and the average area under the ROC curve is 0,9592 (EER=0,0807). These classification processes took an average of 15,2373 seconds per image. The best and worst ROC curves obtained in the experiment are shown in Figure 9.
Fig. 9 The best (green line) and worst (red line) ROC curves obtained using LBP images (RGB)
In the fifth experiment, the gray-level images used in the first and third experiments were combined, and the data with a total of two image matrix spaces was given as input to the CNN. The dimensions of the images given to the CNN input in this experiment are 224×224×2. The results obtained for the fifth experiment are given in Table 6. As can be seen from Table 6, the average sensitivity is 0,9827 and the average specificity is 0,8300. The average accuracy is 0,9216 and the average area under the ROC curve is 0,9746 (EER=0,0654). These classification processes took an average of 12,9802 seconds per image. The best and worst ROC curves obtained in the experiment are shown in Figure 10.
Fig. 10 The best (green line) and worst (red line) ROC curves obtained by combining lung CT images (gray-level) and LBP images (gray-level)
In the sixth and last experiment, the RGB images used in the second and fourth experiments were combined, and the data with a total of six image matrix spaces was given as input to the CNN. The dimensions of the images given to the CNN input in this experiment are 224×224×6. The results obtained for the sixth experiment are given in Table 7. As can be seen from Table 7, the average sensitivity is 0,9907 and the average specificity is 0,8420. The average accuracy is 0,9312 and the average area under the ROC curve is 0,9811 (EER=0,0445). These classification processes took an average of 16,9653 seconds per image. The best and worst ROC curves obtained in the experiment are shown in Figure 11. A summary of the results given in Tables 2 to 7 can be seen in Table 8.
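The averages reported for the six experiments can be collected in a small Python structure to compare them side by side. The values below are the ones quoted in the text, written with decimal points; the dictionary keys and helper names are hypothetical. The sketch also checks that each reported accuracy agrees with the reported sensitivity and specificity when weighted by the 150 Covid-19 and 100 normal images.

```python
# Reported averages per experiment (sec = seconds per image).
RESULTS = {
    "CT gray":     {"sen": 0.9807, "spe": 0.9120, "acc": 0.9532, "auc": 0.9804, "eer": 0.0458, "sec": 11.1942},
    "CT RGB":      {"sen": 0.9687, "spe": 0.8920, "acc": 0.9380, "auc": 0.9728, "eer": 0.0701, "sec": 14.1371},
    "LBP gray":    {"sen": 0.9940, "spe": 0.7150, "acc": 0.8824, "auc": 0.9606, "eer": 0.0717, "sec": 11.7981},
    "LBP RGB":     {"sen": 0.9947, "spe": 0.6600, "acc": 0.8608, "auc": 0.9592, "eer": 0.0807, "sec": 15.2373},
    "CT+LBP gray": {"sen": 0.9827, "spe": 0.8300, "acc": 0.9216, "auc": 0.9746, "eer": 0.0654, "sec": 12.9802},
    "CT+LBP RGB":  {"sen": 0.9907, "spe": 0.8420, "acc": 0.9312, "auc": 0.9811, "eer": 0.0445, "sec": 16.9653},
}

N_COVID, N_NORMAL = 150, 100  # images per class after augmentation

def best(metric, lowest=False):
    """Name of the experiment with the highest (or lowest) value of `metric`."""
    pick = min if lowest else max
    return pick(RESULTS, key=lambda name: RESULTS[name][metric])

def accuracy_from_rates(sen, spe):
    """Accuracy implied by sensitivity and specificity for the given class sizes."""
    return (sen * N_COVID + spe * N_NORMAL) / (N_COVID + N_NORMAL)
```

For instance, `best("auc")` and `best("eer", lowest=True)` both select the combined RGB input, while `best("spe")` and `best("acc")` select the gray-level CT images.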

Discussion
Important results were obtained in this study, which proposes the automatic classification of lung CT images for early diagnosis of Covid-19 and makes use of the Convolutional Neural Network (CNN), one of the deep learning methods. The study shows that this automatic classification can be achieved with an AUC as high as 0,9811 (EER=0,0445). Three factors are considered decisive in reaching this result: the use of a CNN, the enlargement of the training set through conventional data augmentation, and the use of images obtained by applying LBP.
The highest sensitivity, 0,9947, was obtained when the images produced by applying LBP to each color channel of the lung CT images (RGB) were given to the CNN input. More generally, using the LBP-applied images directly, or combining the original images with the LBP-applied ones, improves sensitivity compared with the corresponding original images. In addition, combining the original RGB images with their LBP-applied versions improves the AUC and EER values. These results are consistent with other studies [27][28][29][30][31] in which CNN and LBP were used together and LBP was shown to improve performance. The highest specificity and accuracy, 0,9120 and 95,32% respectively, were obtained using the gray-level lung CT images.
The results of the study indicate that analyzing lung CT images with the help of deep learning methods will speed up the diagnosis of Covid-19 and significantly reduce the burden on healthcare workers. To improve such studies and obtain better results, it is critical to increase the amount of radiological and clinical data on Covid-19 patients and to make it available to researchers through open access.
As the size of the input given to the CNN increases, the time taken for classification increases. The slowest case, an average of 16,9653 seconds per image, occurred when the lung CT images (RGB) were combined with the images obtained by applying LBP to each color channel of these images. The fastest case, an average of 11,1942 seconds per image, occurred when the gray-level lung CT images were given to the CNN input.
In future work, the aim is to automatically classify chest X-ray images, which like lung CT images are an important diagnostic tool, using deep learning-based methods for Covid-19 diagnosis. It is expected that using multiresolution analyses such as the wavelet transform alongside LBP can contribute to the results. In addition, complex-valued CNNs and transfer learning approaches will be explored to improve performance.