A Novel Automated Method for COVID-19 Infection and Lung Segmentation using Deep Neural Networks

The COVID-19 pandemic first originated in Wuhan, China and has spread to every country in the world. Without a viable cure in the near future, there is an urgent need for rapid diagnosis of COVID-19, faster test results and automated segmentation of infected regions in the lungs. The aim of this paper is to assist in the rapid detection and segmentation of COVID-19 infections using deep learning techniques. This paper proposes a method for automatic segmentation of the lung and infected regions of COVID-19 patients using a lung CT scan dataset. This has been done using a modified U-Net model along with different cross-validation folds. The segmented region of infection contains the lesions, which, if identified in the early stages, can be beneficial during treatment of the patient. This can help doctors determine the severity of the infection and suggest treatments accordingly. A comparative analysis of the proposed architectures against recently published results demonstrates the superiority of our models in terms of dice similarity coefficients.


Introduction
The COVID-19 pandemic, which was first reported in late December 2019 in the city of Wuhan, in the Hubei province of China, was declared a pandemic in March 2020. As of mid-September 2020, there had been about 28 million cases in the world, with around 900,000 deaths. The inability to control the spread and the failure of containment have had severe impacts on global health and the world economy [1,2].
This virus causes respiratory disease with clinical symptoms of cough, fever and shortness of breath, which is complicated by lung inflammation. In more severe cases, the virus can cause pneumonia, severe acute respiratory syndrome and kidney failure, and finally leads to the death of the patient [3][4][5][6]. Complications also arise because of the mutations the virus undergoes, which make it more difficult to diagnose and to prepare any specific treatment [7]. The global economy went through a decline and unemployment started to rise when the governments of various countries halted everything and went into lockdown. The practice of social and physical distancing, as well as wearing masks, was widely publicised as a way to stop the transmission of the virus in view of the lack of any absolute means of diagnosis [8,9]. Scientists have also proved the existence of asymptomatic patients, which means the presence of the virus in a person does not cause any symptoms until it is too late to diagnose [10]. An important step is the regular and effective screening of patients, so as to provide quick treatment facilities or quarantine them so that they do not spread the virus to others.
Due to the unavailability of a treatment or vaccine for COVID-19, it is usually managed with treatment options for pre-existing diseases [11,12]. This has had some success because of the low mortality rate of the virus. But this is not a long-term solution, since the countries which were once able to control the flow of transmission of the virus are seeing a second wave of infections. Scientists have also confirmed that patients who tested negative for COVID-19 after undergoing treatment have once again been diagnosed positive for the virus. This suggests that immunity to the virus might not be permanent. Scientists have also shown that patients, after recovering from the virus, often have problems such as not regaining their full appetite, staying tired and fatigued, and having a general lack of energy. It is also advised not to socialise after recovery, as studies have shown the presence of the virus in the respiratory tract after testing negative. The ever-changing genetic sequence of the virus and the fact that it undergoes various mutations present a challenge to the development of a vaccine [13].
This makes early diagnosis an important factor in the survival of a patient. Early diagnosis leads to immediate quarantine of the infected person, stops the further flow of the virus and can make a difference in saving the lives of asymptomatic patients. Various conventional methods of testing for the presence of COVID-19 exist, which include molecular tests, antigen tests and serological tests [14,15]. There are several limitations associated with these forms of testing. They include the probability of generating high numbers of false negatives and false positives, which is usually because of the sampling procedures. Their accuracy is often questionable, and they can take a long time to return test results, usually because of the backlog at a testing site, its distance from the patient, as well as the human factors involved [16].
Chest radiography imaging, which can be either X-rays or CT (Computed Tomography) scans, has important applications in the diagnosis of COVID-19 [17]. Medical professionals use these scans of the chest of the patient to get more information about the presence and extent of the virus, the severity of the spread, as well as the regions of infection. These reports inform doctors regarding multilobe involvement, ground-glass and peripheral opacities. The patterns or abnormalities present in these parameters and reports are usually minute and need to be identified by expert radiologists and clinicians. Since there are a large number of tests to be processed and a limited number of technicians, there are inadvertent delays and wrong results. This leads to the need for an automated method for the identification and detection of the virus in the patient and the segmentation of the infected and healthy regions of a lung CT scan [18].
Deep learning solutions have proved to be extremely powerful in these situations [19]. These techniques can provide game-changing solutions which can prevent the breakdown of a country's health institutions.
Since the advent of the pandemic, several researchers have published solutions for automatic detection and segmentation of COVID-19 in suspected patients, which will be discussed in Section 2 of this paper.
So far, there are very few publicly available diverse datasets. The datasets which are available do not contain more than a few hundred COVID-19-positive sample images. The aim of this paper is to propose an automated technique for infection and lung segmentation of COVID-19 patients based on chest CT scan images. The chest CT scan images have been obtained from Zenodo [20]. This dataset contains both the lungs and infections, which are labelled and verified by expert pathologists.
A modified version of the U-Net model has been used for lung and infection segmentation. This is done to automate the process of segmenting the region of infection in the lung as well as to accurately identify all the lesions in the lung region. This can be immensely helpful for doctors, who can determine the severity of the disease based on these lesions and suggest treatment options. The objectives of the paper are summarized as follows: (1) to overcome challenges associated with conventional methods of testing and diagnosis of COVID-19; (2) to present an automated technique using the U-Net architecture for lung and infection segmentation from chest CT scan images; (3) to provide a comparative analysis of the proposed methods with recent results and demonstrate the superiority of our model. This paper has been divided into 5 sections. Section 1 gives the introduction and lists the objectives of this paper. Section 2 provides a detailed literature study and a short description of recent models and results for COVID-19 diagnosis. Section 3 gives a detailed explanation of the proposed methodology. Section 4 presents the results obtained and provides a discussion. Section 5 concludes this paper and suggests future research opportunities.

Literature Review
In this section, various studies in the field of COVID-19 have been studied and analysed. Numerous studies have been conducted in the field of medical imaging [21][22][23]. Many studies have been published which confirm the superiority of deep learning techniques over traditional machine learning methods, hand-crafted methods, manual selection and texture analysis methods [24,25]. Within the last ten years or so, deep learning has had far-reaching applications. It has proven to be immensely successful and accurate in medical imaging, classification, disease prediction and segmentation applications.
Convolutional Neural Networks (CNNs), a branch of deep learning, have been a topic of research in huge proportions. They have been used in the diagnosis of diseases like Alzheimer's disease, in cancer detection, in brain neuroimaging [26][27][28] and in many other applications.
Although COVID-19 is very recent in comparison to other diseases, it has piqued a lot of interest among scientists, academics and deep learning practitioners. There are numerous studies regarding COVID-19 prediction. This paper will only analyse those studies which deal in automated techniques using deep learning methodologies, where the datasets used are chest X-ray or chest CT scan images. It has been found that patients infected with COVID-19 present various abnormalities in chest radiography images [29][30][31]. These abnormalities can be identified by good pathologists but can also be more accurately identified using deep learning techniques. Various promising and accurate results have been published which can detect patients with COVID-19 using chest radiography images, focusing on CT scan images [32,33]. Many of these proposed systems have been developed on private or closed datasets, not available to the general public, which were obtained with permission from hospitals. In more recent studies, there is a considerable push for AI-based deep learning solutions and automated methods which can assist in COVID-19 diagnosis. This has led to more open-access COVID-19 datasets being developed. Many studies have been published which show very good results in terms of accuracy using chest X-ray datasets [34,35].
Medical image segmentation can be defined as the detection of some type of boundary or region within an image. It has an important role in recognizing patterns in any kind of disease, which can assist further diagnosis [36]. It segments a region of an organ, tissue or part of the body based on a specific description for the detection of infected regions. Many segmentation techniques have been proposed which provide promising results [37,38]. A technique was developed in [39] which described an automated method for segmentation of the infected region inside the lung of a COVID-19 patient. One of the benchmark techniques has been proposed in [40], where the authors developed their own segmentation network, namely Inf-Net. This is usually a starting point for many researchers. In [41], a residual attention network was developed based on the U-Net architecture for automated multi-class segmentation of COVID-19 CT scan images. Results have also been published using a feature variation block to enhance the capability of feature representation [42]. This paper will demonstrate the superiority of its methodologies over previously defined and published architectures by providing a comparative analysis.

Proposed Methodologies
In this section, the proposed methodology for the segmentation tasks has been explained in detail. The entire process before training starts, including the pre-processing techniques as well as the hyperparameters involved, has been described. The architecture of the U-Net model has also been described in brief [43]. The methodology for lung and ROI (region of infection) segmentation in CT scan images of COVID-19 patients has been described in detail. The dataset was downloaded from Zenodo. It contained CT scans of 20 patients, along with infection mask, lung-and-infection mask and lung mask images. These images have been verified by an experienced radiologist. The masks give the region of infection and the region of the lung in the actual CT scan. The U-Net architecture has been applied. This architecture has given state-of-the-art results in medical image segmentation. For infection segmentation, three-fold, four-fold and seven-fold cross-validation methods have been used. The dice similarity coefficient is used for validating the overlap or similarity between two images. Fig. 1 illustrates samples of images from the dataset containing the original CT scan and its infection mask. Fig. 2 illustrates samples of images from the dataset containing the original CT scan and the lung mask.
In Fig. 1 and Fig. 2, the highlighted portions in the images on the right-hand side show the masked regions. These images are already masked and verified by an expert radiologist, and will be used to train a deep neural architecture which can also accurately determine the region of infection and the lung by segmentation.
U-Net Architecture: U-Net is a CNN architecture developed mainly for biomedical image segmentation. Its architecture was modified to work with a smaller dataset and to obtain segmentations with higher accuracy. The architecture consists of a contracting path and an expansive path, which gives the U-type architecture. Along the contracting path, the images are passed through a series of convolution, ReLU and pooling operations where the spatial dimensionality is reduced while the number of feature maps increases. The expansive path combines the spatial and feature maps using a series of concatenations which help to increase the resolution of the output.

Pre-processing
The images are first resized to 512x512 dimensions. The CLAHE (Contrast Limited Adaptive Histogram Equalization) technique is applied to enhance the contrast of the images. Medical images, especially CT scan images, usually suffer from low contrast. The parameters involved in applying CLAHE are the clip limit and the grid size. A correct clip limit prevents the over-amplification of noise in the image. This is one of the main advantages of CLAHE over AHE (Adaptive Histogram Equalization). This value is generally kept between 2 and 4; here, the clip limit has been tuned to 3. The processor in this method has to go through a lot of black areas, which are not necessary for the segmentation process.
The images can be cropped so that only the region of interest is utilized. This can be done by slicing the images using trial and error, but that would be generalized for the dataset in use only. A better approach is to draw contours over the image and crop out the rectangle enclosing the contour with the biggest area. This would point to the contour covering both the lungs. The maximum region of intersection can be obtained by taking the next two largest contours by area, one for each lung, and combining them. While cropping a CT scan, the corresponding segmentation map should also be cropped by the same limits to avoid wrong labelling of a region. In global thresholding, an arbitrarily chosen value can be used as the threshold, in contrast to Otsu's method, which determines the value automatically [44]. A total of 3520 slices were obtained. About twenty percent of the slices from the beginning and end of each file generally did not have any infection mask, and some did not have lungs either, which is why they were discarded as noise. A total of around 500 slices had a completely black mask, which meant there was no infection in these regions. They were kept out of the segmentation model. Finally, about 1600 samples were obtained, which were later split into train and test sets. Since the images and masks were cropped, they were not all of the same size, so they were resized to 224x224 dimensions. The same pre-processing steps were followed for all K-fold infection segmentations as well as for lung segmentation. Fig. 3 illustrates a comparison between the images enhanced using the CLAHE algorithm and their original CT scans. In the enhanced image, the infection can be clearly distinguished. Fig. 3 also provides the histogram comparison of the original and enhanced images. Fig. 4 illustrates the pre-processed (enhanced using the CLAHE algorithm and cropped) image along with the infection mask. Fig. 5 illustrates the enhanced CT scan of the lung along with the mask of the lung.

Proposed Architecture
The architecture proposed in this paper is based on the concepts employed in the U-Net architecture. The architecture for the segmentation process is illustrated in Fig. 6. It is a modified version of the U-Net architecture. It has the familiar U-shape, with various changes made to the individual blocks. The contraction path can be explained as a combination of feature extraction blocks. The feature map has to be converted to a vector, and an image has to be reconstructed from this vector. The main idea behind the process is to reuse the feature maps which were obtained during the contraction path. These feature vectors are then used in the expansion path to form a new segmented image which contains only the boundary or region of interest of the original image, preserving the integrity of the image.
In Fig. 6, input images of 224x224 dimensions are given to the input layer. This layer is connected to the first block of the contraction path. There are four contraction path blocks altogether. The numbers of filters in the convolutional layers of the contraction path are 32, 64, 128 and 256 respectively for each subsequent block. In each block, a 3x3 filter and the ReLU activation function have been applied. Two convolutional layers are followed by a batch normalization layer, which is followed by a two-dimensional max pooling layer with a 2x2 pool size. This performs the down-sampling operation to reduce the dimensionality of the feature map. The job of each contraction block is to extract fine features of the image. This can be thought of as a feature extraction task. A routine dropout regularization layer has been added at the end of each contraction block.
After the feature extraction is completed by the four contraction blocks, two convolutional layers with 512 filters are added. These are followed by the expansion blocks. Each expansion block starts with a 2D convolutional transpose layer, which is essentially the inverse of a pooling operation. It performs the up-sampling operation and has the ability to interpret raw data to fill in the feature matrix. This layer is a combination of a convolutional and an up-sampling layer in two dimensions. It is followed by a concatenate layer, which joins the previous layer with the nearest batch normalization layer of the contraction path, in order to combine the information on the location of the features from the contraction path with the contextual information obtained in the expansion path. This is followed by two convolutional layers. The numbers of filters in the convolutional and convolutional transpose layers are 256, 128, 64 and 32 respectively for each subsequent block in the expansion path. The size of the filters used is 3x3, with the ReLU activation function. The final convolutional layer is connected to the output layer, which is also a convolutional layer with a single filter and a sigmoid activation function, giving the single segmented output image.
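The architecture above can be sketched in Keras as follows. This is a minimal sketch under stated assumptions: the dropout rate of 0.1, "same" padding and the function name build_modified_unet are not fixed by the text, while the filter counts, 3x3 kernels, batch normalization placement, transpose-convolution up-sampling and sigmoid output follow the description.

```python
import tensorflow as tf
from tensorflow.keras import layers, Model

def conv_block(x, filters):
    # Two 3x3 ReLU convolutions followed by batch normalization,
    # matching the described contraction-block layout.
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    x = layers.BatchNormalization()(x)
    return x

def build_modified_unet(input_shape=(224, 224, 1)):
    inputs = layers.Input(input_shape)
    skips = []
    x = inputs
    # Contraction path: four blocks with 32, 64, 128 and 256 filters.
    for f in (32, 64, 128, 256):
        x = conv_block(x, f)
        skips.append(x)              # batch-norm output, reused later
        x = layers.MaxPooling2D(2)(x)
        x = layers.Dropout(0.1)(x)   # assumed rate
    # Bottleneck: two convolutional layers with 512 filters.
    x = conv_block(x, 512)
    # Expansion path: transpose convolution, concatenation with the
    # corresponding batch normalization output, then two convolutions.
    for f, skip in zip((256, 128, 64, 32), reversed(skips)):
        x = layers.Conv2DTranspose(f, 3, strides=2, padding="same")(x)
        x = layers.concatenate([x, skip])
        x = layers.Conv2D(f, 3, padding="same", activation="relu")(x)
        x = layers.Conv2D(f, 3, padding="same", activation="relu")(x)
    # Single-filter sigmoid output producing the binary segmentation mask.
    outputs = layers.Conv2D(1, 1, activation="sigmoid")(x)
    return Model(inputs, outputs)
```

A 224x224 input is halved four times (to 14x14) on the way down and restored to 224x224 on the way up, so the predicted mask has the same spatial size as the input slice.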
The modifications made in this model which improve its performance are the additions of the batch normalization layers and the convolutional transpose layers. Using the transpose layer instead of only the up-sampling layer, as is done in the conventional U-Net model, lets the network interpret the coarse input data in a better way. Batch normalization is an important part of the contraction, or classification, process [45]. These layers give the proposed model an advantage over the original U-Net architecture.

Model Training and Hyperparameter Tuning
The lung and infection segmentation tasks were both trained using the same proposed model. The infection segmentation task was performed with three, four and seven folds. Each fold in turn is used for validation, while the remaining folds act as the training set.
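The fold rotation can be sketched as below. kfold_splits is a hypothetical helper that partitions the slice indices contiguously; the actual split strategy (e.g. grouping slices by patient) is not specified in the text.

```python
def kfold_splits(n_samples: int, n_folds: int):
    # Partition sample indices into n_folds contiguous folds; each fold
    # serves once as the validation set while the remaining folds
    # together form the training set.
    indices = list(range(n_samples))
    fold_size = n_samples // n_folds
    for k in range(n_folds):
        start = k * fold_size
        stop = start + fold_size if k < n_folds - 1 else n_samples
        val = indices[start:stop]
        train = indices[:start] + indices[stop:]
        yield train, val
```

With three, four or seven folds, the model is trained that many times, and the reported metrics are averaged over all folds.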
The hyperparameters used in this paper have been tuned to their values after repeated trial-and-error training runs for a fixed number of epochs. All the segmentation tasks have been carried out with the same hyperparameters. For model optimization, the Adam optimizer has been used with a learning rate of 0.0005. The metric to be measured is the dice coefficient, Dice(A, B) = 2|A ∩ B| / (|A| + |B|), computed between the predicted mask A and the ground-truth mask B, and the loss is calculated as 1 − Dice(A, B).
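A NumPy sketch of the dice coefficient and the corresponding loss is given below, assuming the standard 1 − Dice formulation; the smoothing term, which avoids division by zero on empty masks, is a common convention rather than a detail stated in the text.

```python
import numpy as np

def dice_coefficient(y_true, y_pred, smooth=1.0):
    # Dice = 2*|A ∩ B| / (|A| + |B|); the smooth term keeps the
    # ratio well-defined when both masks are empty.
    y_true = np.asarray(y_true, dtype=np.float32).ravel()
    y_pred = np.asarray(y_pred, dtype=np.float32).ravel()
    intersection = np.sum(y_true * y_pred)
    return (2.0 * intersection + smooth) / (
        np.sum(y_true) + np.sum(y_pred) + smooth)

def dice_loss(y_true, y_pred):
    # Loss minimized during training: one minus the dice coefficient.
    return 1.0 - dice_coefficient(y_true, y_pred)
```

Identical masks give a dice coefficient of 1 (loss 0), while disjoint masks drive the coefficient toward 0.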

Results And Discussions
This section presents the results achieved in this paper. It demonstrates the superiority of the proposed model in terms of accuracy and dice similarity coefficient in comparison with previous architectures. For the infection segmentation task, three-fold, four-fold and seven-fold segmentation was performed.
Since the metric used here is the dice similarity coefficient and the data is trained in folds, the mean over the entire training process has been calculated. The segmentation process does not have labels like the classification process: the predicted segmented image needs to be checked against the actual segmentation mask. The aim is to get the maximum similarity between the predicted output mask and the actual mask. This is calculated using the segmentation performance metric, the dice similarity coefficient.
For the three-fold training process, the dice coefficient was calculated for all the folds at different threshold values. The mean of all obtained dice values was 94.68%, the mean precision was 94.03% and the mean recall was 95.35%. For the four-fold training process, the dice coefficients were calculated in a similar manner: the mean dice value was 95.58%, the mean precision was 95.34% and the mean recall was 95.86%. For the seven-fold segmentation process, the mean dice coefficient was found to be 96.91%, the mean precision was 96.42% and the mean recall was 97.4%. The above data has been presented in tabular form in Table 1. This table shows that increasing the number of folds helps improve the performance evaluation parameters. For the lung segmentation process, the obtained dice similarity coefficient is 98.45%. The best IoU (Intersection over Union) score obtained was 0.969, at a threshold of 0.492. The average precision value obtained was 96.26%, while the average recall score obtained was 98.29%. Fig. 7 illustrates the comparison between the original CT scan and the actual and predicted masks for all seven folds of the infection segmentation process. Fig. 8 illustrates the comparison between the original lung mask and the predicted mask of the image. Figures 7 and 8 show that the predicted mask is almost identical to the actual mask. Table 2 demonstrates the superiority of the proposed modified U-Net segmentation model through a comparative analysis with previous results. The comparison has been made between the results achieved by our model and those achieved by other models in terms of their average dice similarity coefficients.
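For reference, the thresholded IoU evaluation can be sketched as below; iou_at_threshold is a hypothetical helper, and the 0.492 default mirrors the best-scoring threshold reported above rather than a universal choice.

```python
import numpy as np

def iou_at_threshold(y_true, y_prob, threshold=0.492):
    # Binarize the predicted probability map at the given threshold,
    # then compute Intersection over Union against the ground truth.
    y_pred = (np.asarray(y_prob) >= threshold).astype(np.float32)
    y_true = np.asarray(y_true, dtype=np.float32)
    intersection = np.sum(y_true * y_pred)
    union = np.sum(np.clip(y_true + y_pred, 0.0, 1.0))
    return intersection / union if union > 0 else 1.0
```

Sweeping the threshold and keeping the best IoU, as done for the lung segmentation task, amounts to calling this function over a grid of threshold values.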

Conclusion
This paper proposes a novel and efficient algorithm for the automated segmentation of COVID-19 chest CT scan images. The proposed architecture fulfils the objectives of this paper: it identifies the lesions or regions of infection and very accurately segments those regions along with the lung. The model has been designed based on the structure of the U-Net architecture. This modified U-Net model accurately predicts the segmented lung with a dice similarity coefficient of 98.45%. The model also predicts the segmented region of infection in the lung using a K-fold training process, achieving a dice similarity coefficient of 96.91% for the infection region segmentation. The average similarity coefficient achieved by the model is 97.68%. A comparison of the results achieved by previous studies has been made against the results achieved by our proposed model, and the comparative analysis has been presented in tabular form.
This shows that our proposed segmentation model has achieved better results than previous studies in terms of the dice similarity coefficient, which proves the superiority of our model. This model can be very efficient in the quick and accurate identification of the infected region of the lung of a COVID-19 patient, and in segmenting this region along with the lung portion from a CT scan image. This method can be run in a completely automated way by giving the CT scan image as an input to software with the proposed model as its backbone. This can provide the segmented output, which the doctor can access remotely to suggest treatment. Since this method is automated, it decreases the need for physical contact with a doctor or clinician, further reducing the chances of transmission of the virus. There are various future research opportunities in this field. One of the most interesting would be to make use of patient medical history and data in conjunction with chest X-ray or CT scan images to improve the reliability of the model and use it for diagnostic purposes. Such a model could be used to predict the probability of infection in the patient. The process after the segmented output can also be automated by developing a model which can suggest treatment options on its own, just from the parameters involved in the segmented output. The severity of the infection can also be predicted, which can give an insight into the duration of medical care needed. Such models would predict the possibility of a person catching the virus rather than detect whether the person is infected. AI-based solutions using thermal sensors pre-equipped with these data can also provide far more accurate results.

Conflict of Interest
The author has declared no conflict of interest with any organization.