Per-Covid-19: A Benchmark Database For Covid-19 Percentage Prediction From CT-scans

Recognizing Covid-19 infection is a very important step in the fight against the Covid-19 pandemic. Many methods have been used to recognize Covid-19 infection, including Reverse Transcription Polymerase Chain Reaction (RT-PCR), X-ray scans and CT-scans. In addition to recognizing the infection, CT-scans can provide further important information about the evolution of the disease and its severity. With the large number of Covid-19 infections, estimating the Covid-19 infection percentage can help intensive care units free up resuscitation beds for critical cases and follow other protocols for less severe cases. In this paper, we propose a Covid-19 infection percentage estimation database. Moreover, we evaluate the performance of three Convolutional Neural Network (CNN) architectures: ResneXt-50, Densenet-161 and Inception-v3. For the three CNN architectures, we use two loss functions: MSE and Dynamic Huber. In addition, two pretraining scenarios are investigated (ImageNet pretrained models and X-ray pretrained models). The evaluated approaches achieved promising results, with Inception-v3 combined with the Dynamic Huber loss function and the X-ray pretrained model achieving the best performance. The experimental results show that using the X-ray pretrained models, which were trained on a medical imaging task, improves the results. Moreover, the experiments using the Dynamic Huber loss function achieved better performance than those using the standard MSE loss function. On the other hand, Inception-v3 outperformed the ResneXt-50 and Densenet-161 architectures in both scenarios.


Introduction
Since the end of 2019, the world has faced a health crisis caused by the spread of the Covid-19 pandemic. The crisis has influenced all aspects of human life. To save the lives of infected persons and stop the spread of the disease, many methods have been used to recognize infected persons. These methods include Reverse Transcription Polymerase Chain Reaction (RT-PCR) [1], X-ray scans [2-4] and CT-scans [5, 6]. Although the RT-PCR test is considered the global standard method for Covid-19 diagnosis, it has many downsides [7, 8]. In detail, the RT-PCR test is time consuming and has a considerable false-negative rate [1]. X-ray and CT-scan methods can replace the RT-PCR test and give efficient results in terms of both time and accuracy [2, 8]. However, both of these methods need an expert radiologist to identify Covid-19 infection. Artificial Intelligence (AI) can make this process automatic and limit the need for a radiologist to recognize Covid-19 infection from these medical images. Indeed, the computer vision and machine learning communities have proposed many algorithms and frameworks that have proved their efficiency, especially deep learning methods, which have proved their efficiency on different tasks [9], including medical imaging tasks [10, 11].
Compared with the other two diagnosis methods, the CT-scan method has many advantages. In addition to recognizing Covid-19 infection, CT-scans can be used for other important tasks, which include quantifying the infection and monitoring the evolution of the disease; this can help with the treatment and save the patient's life [12]. Moreover, the evolution stage can be recognized: the typical signs of Covid-19 infection are ground-glass opacity (GGO) in the early stage and pulmonary consolidation in the late stage [7, 8]. According to the Covid-19 infection percentage estimated from the CT-scans, the patient state can be classified as Normal (0%), Minimal (<10%), Moderate (10-25%), Extent (25-50%), Severe (50-75%) or Critical (>75%) [13].
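The severity grading above can be sketched as a simple threshold lookup. This is a minimal illustration, not the authors' code: the function name `severity_class` is hypothetical, and since the text does not say which class the boundary values (10, 25, 50, 75) belong to, the handling of those boundaries below is an assumption.

```python
def severity_class(percentage: float) -> str:
    """Map a Covid-19 infection percentage in [0, 100] to a severity label."""
    if not 0.0 <= percentage <= 100.0:
        raise ValueError("percentage must be between 0 and 100")
    if percentage == 0.0:
        return "Normal"      # 0%
    if percentage < 10.0:
        return "Minimal"     # < 10%
    if percentage < 25.0:
        return "Moderate"    # assumption: 10 <= p < 25
    if percentage < 50.0:
        return "Extent"      # assumption: 25 <= p < 50
    if percentage <= 75.0:
        return "Severe"      # assumption: 50 <= p <= 75
    return "Critical"        # > 75%


if __name__ == "__main__":
    for p in (0.0, 5.0, 18.0, 40.0, 60.0, 90.0):
        print(p, severity_class(p))
```

Applied to the mean of the slice-level estimates of one patient, such a lookup would turn a regression output into one of the six states listed above.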
The state-of-the-art methods using CT-scans can be classified into two main tasks: Covid-19 recognition [5, 6, 14] and Covid-19 segmentation [7, 8]. In [15], Zheng, C. et al. proposed DeCoVNet, a 3D deep convolutional neural network to detect COVID-19 from CT volumes. The input to DeCoVNet is a CT volume and its 3D lung mask, which is generated using a pre-trained UNet [16]. The DeCoVNet architecture has three parts: a vanilla 3D convolution, 3D residual blocks (ResBlocks) and a progressive classifier (ProClf). He, K. et al. proposed a multi-task multi-instance deep network (M^2UNet) to assess the severity of COVID-19 patients [17]. Their approach classifies volumetric CT-scans into two classes of severity: severe or non-severe. M^2UNet consists of a patch-level encoder, a segmentation sub-network for lung lobe segmentation, and a classification sub-network for severity assessment. In [18], Yao, Q. et al. proposed the NormNet architecture, a voxel-level anomaly modeling network, to distinguish healthy tissues from COVID-19 lesions in the thorax area. Zhao, X. et al. proposed a dilated dual attention U-Net (D2A U-Net) for COVID-19 lesion segmentation in CT slices, based on dilated convolution and a novel dual attention mechanism [7].
Most of the state-of-the-art methods have concentrated on recognizing Covid-19 from CT-scans or segmenting the infected regions. Despite these huge efforts, the state-of-the-art methods do not provide a tool to monitor the patient's state, the evolution of the infection and the patient's response to treatment, all of which can play a crucial role in saving the patient's life. In this paper, we propose a fully automatic approach that evaluates the evolution of Covid-19 infection from CT-scans as a regression task, which can provide richer information about the evolution of the infection. Estimating the Covid-19 infection percentage can help intensive care units identify the patients that need urgent care, especially the critical and severe cases. With the large number of Covid-19 infections, it can also help free up resuscitation beds for critical cases and allow other protocols to be followed for less severe cases.
Unlike the mainstream work on Covid-19 recognition and segmentation, this paper addresses the estimation of the Covid-19 infection percentage. To this end, we created the Per-Covid-19 database, then used it to evaluate the performance of three CNN architectures with two loss functions and two pretraining scenarios. In summary, the main contributions of this paper are:
• We propose the Per-Covid-19 dataset for estimating the Covid-19 infection percentage at both slice level and patient level. The proposed database consists of 183 CT-scans with the corresponding slice-level Covid-19 infection percentages, which were estimated by two expert radiologists. To the best of our knowledge, our work is the first to propose a finer granularity of Covid-19 presence and to address the challenging task of exact estimation of the Covid-19 infection percentage.
• In order to test some state-of-the-art methods, we evaluate the performance of three CNN architectures: ResneXt-50, Densenet-161 and Inception-v3. For the three CNN architectures, we use two loss functions: MSE and Dynamic Huber loss. In addition, two pretraining scenarios are investigated. In the first scenario, models pretrained on ImageNet are used. In the second scenario, to study the influence of pretraining on a medical imaging task, we use models pretrained on X-ray images.
• We make our database and code publicly available to encourage other researchers to use them as a benchmark for their studies 1 (last accessed on May 2nd, 2021).

Per-Covid-19 dataset
Figure 1 shows the histogram of the number of CT-scans over the number of CT slices. Figure 2 shows example slice images with their corresponding Covid-19 infection percentages. To evaluate different machine learning methods, we divided the Per-Covid-19 database into five patient-independent folds, where all slices of a given patient are included in a single fold.
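A patient-independent split of this kind can be sketched as follows. This is only an illustration under stated assumptions: the function name `patient_independent_folds`, the patient identifiers and the round-robin assignment after shuffling are hypothetical, and the text does not describe how patients were actually distributed over the five folds.

```python
import random


def patient_independent_folds(slice_patient_ids, n_folds=5, seed=0):
    """Return one fold index per slice so that all slices of a patient share a fold."""
    # Work at patient level: each patient is assigned to exactly one fold.
    patients = sorted(set(slice_patient_ids))
    rng = random.Random(seed)
    rng.shuffle(patients)
    # Assumption: round-robin assignment of the shuffled patients to folds.
    fold_of_patient = {p: i % n_folds for i, p in enumerate(patients)}
    return [fold_of_patient[p] for p in slice_patient_ids]


if __name__ == "__main__":
    # Hypothetical patient id of each slice in the dataset.
    ids = ["p01", "p01", "p02", "p03", "p03", "p03", "p04", "p05", "p06"]
    print(patient_independent_folds(ids))
```

In each cross-validation round, the slices of one fold serve as test data and the slices of the other four as training data, so no patient contributes slices to both sides.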

Loss Functions
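In our experiments, we used two loss functions: Mean Squared Error (MSE) and Dynamic Huber loss. The loss functions are defined for a batch of size N, where $X = (x_1, x_2, ..., x_N)$ are the ground-truth percentages and $\hat{X} = (\hat{x}_1, \hat{x}_2, ..., \hat{x}_N)$ are their corresponding estimated percentages. MSE is sensitive to outliers. For N predictions, the MSE loss function is defined by:

$$L_{MSE} = \frac{1}{N} \sum_{i=1}^{N} (x_i - \hat{x}_i)^2 \qquad (1)$$

On the other hand, the Huber loss function is less sensitive to outliers in the data than the $L_2$ loss function. For a training batch of N images, the Huber loss function is defined by [19]:

$$L_{Huber} = \frac{1}{N} \sum_{i=1}^{N} z_i \qquad (2)$$

where N is the batch size and $z_i$ is defined by:

$$z_i = \begin{cases} \frac{1}{2} (x_i - \hat{x}_i)^2 & \text{if } |x_i - \hat{x}_i| < \beta \\ \beta |x_i - \hat{x}_i| - \frac{1}{2} \beta^2 & \text{otherwise} \end{cases} \qquad (3)$$

where β is a controlling hyperparameter. In our experiments, β decreases from 15 to 1 during training.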
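The two losses can be sketched in plain Python as below. This is a framework-free illustration rather than the authors' implementation. It assumes the standard Huber form (quadratic for errors below β, linear above), and the linear decay of β in `dynamic_beta` is an assumption: the text only states that β goes from 15 to 1 during training, not how.

```python
def mse_loss(targets, preds):
    """Mean squared error over a batch of percentages."""
    return sum((x - xh) ** 2 for x, xh in zip(targets, preds)) / len(targets)


def huber_loss(targets, preds, beta):
    """Standard Huber loss: quadratic below beta, linear above."""
    total = 0.0
    for x, xh in zip(targets, preds):
        err = abs(x - xh)
        if err < beta:
            total += 0.5 * err ** 2
        else:
            total += beta * err - 0.5 * beta ** 2
    return total / len(targets)


def dynamic_beta(epoch, n_epochs=30, beta_start=15.0, beta_end=1.0):
    """Assumed schedule: beta decays linearly from beta_start to beta_end."""
    if n_epochs <= 1:
        return beta_end
    frac = epoch / (n_epochs - 1)
    return beta_start + frac * (beta_end - beta_start)


if __name__ == "__main__":
    truth, pred = [10.0, 40.0], [12.0, 70.0]
    for epoch in (0, 15, 29):
        b = dynamic_beta(epoch)
        print(epoch, b, huber_loss(truth, pred, b), mse_loss(truth, pred))
```

With a large β early in training the loss behaves like (half) the MSE for most samples, while a small β late in training reduces the weight of the few slices with large errors.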
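To evaluate the slice-level performance, we used the Mean Absolute Error (MAE), the Root Mean Square Error (RMSE) and the Pearson Correlation coefficient (PC), which are defined in equations 4, 5 and 6, respectively.

$$MAE = \frac{1}{n} \sum_{i=1}^{n} |y_i - \hat{y}_i| \qquad (4)$$

$$RMSE = \sqrt{\frac{1}{n} \sum_{i=1}^{n} (y_i - \hat{y}_i)^2} \qquad (5)$$

$$PC = \frac{\sum_{i=1}^{n} (y_i - \bar{y})(\hat{y}_i - \bar{\hat{y}})}{\sqrt{\sum_{i=1}^{n} (y_i - \bar{y})^2} \sqrt{\sum_{i=1}^{n} (\hat{y}_i - \bar{\hat{y}})^2}} \qquad (6)$$

where $Y = (y_1, y_2, ..., y_n)$ are the ground-truth Covid-19 percentages of the testing data, which consists of n slices, and $\hat{Y} = (\hat{y}_1, \hat{y}_2, ..., \hat{y}_n)$ are their corresponding estimated percentages. In equation 6, $\bar{y}$ and $\bar{\hat{y}}$ are the means of the ground-truth percentages and the estimated ones, respectively. In addition, we used subject-level metrics, MAE_subj, RMSE_subj and PC_subj, which are defined in equations 7, 8 and 9, respectively.

$$MAE\_subj = \frac{1}{s} \sum_{i=1}^{s} |y^s_i - \hat{y}^s_i| \qquad (7)$$

$$RMSE\_subj = \sqrt{\frac{1}{s} \sum_{i=1}^{s} (y^s_i - \hat{y}^s_i)^2} \qquad (8)$$

$$PC\_subj = \frac{\sum_{i=1}^{s} (y^s_i - \bar{y}^s)(\hat{y}^s_i - \bar{\hat{y}}^s)}{\sqrt{\sum_{i=1}^{s} (y^s_i - \bar{y}^s)^2} \sqrt{\sum_{i=1}^{s} (\hat{y}^s_i - \bar{\hat{y}}^s)^2}} \qquad (9)$$

where s is the number of patients in the testing data, $Y^s = (y^s_1, y^s_2, ..., y^s_s)$ are the ground-truth patient-level percentages (the mean of the Covid-19 percentages over each patient's slices) and $\hat{Y}^s = (\hat{y}^s_1, \hat{y}^s_2, ..., \hat{y}^s_s)$ are their corresponding estimated patient-level percentages (the means of the estimates over each patient's slices). In equation 9, $\bar{y}^s$ and $\bar{\hat{y}}^s$ are the means of the ground-truth patient percentages and the estimated ones, respectively. MAE and RMSE are error indicators, where smaller values indicate better performance. On the other hand, PC is a statistical measure of the linear correlation between two variables Y and $\hat{Y}$. A value of 1 means that there is a total positive linear correlation and 0 indicates no linear correlation.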
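The slice-level and subject-level metrics can be sketched as follows, assuming the usual definitions of MAE, RMSE and the Pearson correlation coefficient. The helper names (`mae`, `rmse`, `pearson`, `subject_means`) and the example values are hypothetical; the subject-level scores are obtained by first averaging the slices of each patient and then applying the same three functions.

```python
import math
from collections import defaultdict


def mae(y, y_hat):
    """Mean absolute error."""
    return sum(abs(a - b) for a, b in zip(y, y_hat)) / len(y)


def rmse(y, y_hat):
    """Root mean square error."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(y, y_hat)) / len(y))


def pearson(y, y_hat):
    """Pearson correlation; undefined (division by zero) if either input is constant."""
    my, mh = sum(y) / len(y), sum(y_hat) / len(y_hat)
    cov = sum((a - my) * (b - mh) for a, b in zip(y, y_hat))
    sy = math.sqrt(sum((a - my) ** 2 for a in y))
    sh = math.sqrt(sum((b - mh) ** 2 for b in y_hat))
    return cov / (sy * sh)


def subject_means(patient_ids, values):
    """Average the slice values of each patient, in sorted patient order."""
    sums, counts = defaultdict(float), defaultdict(int)
    for p, v in zip(patient_ids, values):
        sums[p] += v
        counts[p] += 1
    return [sums[p] / counts[p] for p in sorted(sums)]


if __name__ == "__main__":
    ids = ["a", "a", "b", "b", "c"]          # hypothetical patient of each slice
    y = [0.0, 10.0, 40.0, 60.0, 20.0]        # hypothetical ground truth
    y_hat = [2.0, 12.0, 38.0, 58.0, 20.0]    # hypothetical estimates
    print(mae(y, y_hat), rmse(y, y_hat), pearson(y, y_hat))
    ys, yhs = subject_means(ids, y), subject_means(ids, y_hat)
    print(mae(ys, yhs), rmse(ys, yhs), pearson(ys, yhs))
```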

Results
To train and test the CNN architectures (ResneXt-50, Densenet-161 and Inception-v3), we used the PyTorch [20] library and the SGD optimizer with a momentum of 0.9. All experiments were carried out on a PC with 64 GB of RAM and an NVIDIA GeForce TITAN RTX GPU with 24 GB of memory. Each CNN architecture was trained for 30 epochs with an initial learning rate of $10^{-4}$, which decays by a factor of 0.1 every 10 epochs, and a batch size of 20. In addition, we used active data augmentation by rotating the input image by a random angle between -10 and 10 degrees. In summary, our experiments are divided into two scenarios. In the first scenario, we used models pretrained on ImageNet, while in the second scenario, we used models pretrained on a medical imaging task.
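The training schedule described above can be illustrated with a small, framework-free sketch. It assumes that the value 10^-4 is the initial learning rate and that "decays by 0.1 every 10 epochs" means multiplication by 0.1 at epochs 10 and 20; the function names are hypothetical.

```python
import random


def learning_rate(epoch, initial_lr=1e-4, decay=0.1, step=10):
    """Step decay: multiply the learning rate by `decay` every `step` epochs."""
    return initial_lr * decay ** (epoch // step)


def random_rotation_angle(rng):
    """Draw an augmentation angle in degrees, uniformly in [-10, 10]."""
    return rng.uniform(-10.0, 10.0)


if __name__ == "__main__":
    rng = random.Random(0)
    for epoch in (0, 9, 10, 19, 20, 29):
        print(epoch, learning_rate(epoch), round(random_rotation_angle(rng), 2))
```

In PyTorch, a schedule of this shape would typically be expressed with `torch.optim.SGD(..., momentum=0.9)` and `torch.optim.lr_scheduler.StepLR(step_size=10, gamma=0.1)`; whether the authors used exactly these classes is not stated in the text.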

First Scenario
In the first scenario, we used three CNN architectures pretrained on ImageNet [21]: ResneXt-50 [22], Inception-v3 [23] and Densenet-161 [24]. Moreover, we used two loss functions: MSE and Dynamic Huber. Table 1 summarizes the results obtained in the first scenario. From these results, we notice that for almost all models the Dynamic Huber loss gives better results than the MSE loss function. This supports the use of the Dynamic Huber loss function for this regression task. On the other hand, the three models trained with the Dynamic Huber loss achieved close results. In detail, ResneXt-50 achieved the best performance in MAE, PC_subj, MAE_subj and RMSE_subj, while Densenet-161 achieved the best result for the PC metric and Inception-v3 for the RMSE metric.

Second Scenario
In the second scenario, we use the same models as in the first scenario, but this time they were pretrained on the recognition of Covid-19 from X-ray scans [2]. In more detail, four lung diseases plus a neutral class were used to train the CNN architectures [2]. The objective of this scenario is to study the influence of pretrained models that were trained on a medical imaging task. The experimental results are summarized in Table 2. From these results, we notice that Inception-v3 achieved the best performance. Similar to the results of the first scenario, the Dynamic Huber loss gives better results than the MSE loss function for most of the evaluation metrics.

Discussion
The comparison between the experiments of the first and second scenarios (Tables 1 and 2) shows that the models pretrained on a medical imaging task give better results than the models pretrained on ImageNet. From all experiments, we conclude that the best configuration for Covid-19 infection percentage estimation is the Inception-v3 architecture with the X-ray pretrained model and the Dynamic Huber loss function.

Conclusion
In this paper, we introduced the Per-Covid-19 database for Covid-19 infection percentage estimation. Moreover, we evaluated the performance of three CNN architectures: ResneXt-50, Densenet-161 and Inception-v3. For the three CNN architectures, we used two loss functions: MSE and Dynamic Huber loss. In addition, we evaluated two pretraining scenarios. In the first scenario, we used ImageNet pretrained models. In the second scenario, we used models that were pretrained on X-ray scans, to investigate the influence of pretraining on a medical imaging task. The experimental results show that using the X-ray pretrained models improves the results. Moreover, the experiments using the Dynamic Huber loss function achieved better performance than those using the standard MSE loss function. On the other hand, Inception-v3 outperformed the ResneXt-50 and Densenet-161 architectures in both scenarios.
