The feasibility of transfer learning for differentiating H1N1 Influenza from COVID-19 on chest CT

Objectives: It is unlikely that a standard vaccine or treatment for COVID-19 infection will be available by the fall and winter of 2020. In this period, differentiation between COVID-19 and Influenza-induced pneumonia will be critical for patient management. To develop an automated platform for this task, artificial intelligence models were developed using transfer learning techniques on chest CT. Methods: Chest CT images from known cases of COVID-19, H1N1 Influenza-induced pneumonia (before December 2019), and normal chest CTs were collected. Different pre-trained Convolutional Neural Network (CNN) models, including VGG 16, VGG 19, ResNet-50, Wide ResNet, InceptionV3, and SqueezeNet, were fine-tuned on this dataset. 60% of the dataset was used for training, 20% for validation, and 20% for testing the final models. The accuracy, precision, recall, and F1 score of each model were calculated. Results: For differentiation of COVID-19 pneumonia versus H1N1 Influenza pneumonia versus normal CTs, ResNet-50 (accuracy above 92%) outperformed the other models, followed by InceptionV3 and Wide ResNet. Conclusions: Fine-tuning pre-trained image classification AI models to differentiate COVID-19 from H1N1 Influenza pneumonia is feasible. In this context, the ResNet-50 and then the InceptionV3 architectures appear most promising and are suitable starting points for further development. We share the source code and trained models in the supplement of this manuscript for use by other researchers in further development.


Introduction

COVID-19
A new form of respiratory infection was detected in Wuhan, China, in December 2019. In less than one month, the pathogen was recognized as a novel type of coronavirus. It was named "severe acute respiratory syndrome coronavirus 2" (SARS-CoV-2), and the resultant infection was called COVID-19.
Despite robust international measures and quarantines in many countries, this infection became a pandemic in less than three months. By July 24, 2020, about 9.3 million infections and 478,000 deaths had been reported in 188 countries because of COVID-19 (1-4). SARS-CoV-2 is an enveloped single-stranded RNA virus that attaches to the angiotensin-converting enzyme 2 of the airways. The duration of infectiousness extends from 3 days before the onset of symptoms until clearance of the virus. The incubation time is 2-14 days. No vaccine or standard treatment is available for COVID-19 at this time. It is unknown whether the prevalence of COVID-19 differs between seasons (5).

Influenza
Influenza infection is one of the common forms of respiratory viral infection. It has been estimated that between 291,000 and 645,000 deaths occur worldwide each year because of influenza infection (6, 7). There are four types of influenza virus: A, B, C, and D. Seasonal Influenza is usually caused by types A and B (8). H1N1 Influenza is one of the worst subtypes of influenza infection and caused two pandemics, in 1918 and 2009, with 50 million and over 280,000 deaths, respectively (9, 10). The influenza virus is an enveloped single-stranded RNA virus that attaches to N-acetylneuraminic acid in the airways. It spreads via droplets. The duration of infectiousness extends from 1 day before illness to as long as severe symptoms persist, with an incubation time of 1-4 days. There are several approved vaccines and treatments for the Influenza virus, and its prevalence is seasonal (5).
Clinical differentiation between COVID-19 and Influenza is difficult. There is a substantial overlap between the clinical manifestations of Influenza and COVID-19. Fever, cough, expectoration, and dyspnea are the main manifestations of these two infections. Other clinical presentations of these viral infections are headache, sore throat, chest pain, fatigue, myalgia, nausea, vomiting, and diarrhea. Cough and expectoration have been reported more commonly in Influenza. The prevalence of other clinical findings is similar in these two diseases (6). There is also a substantial overlap in the laboratory findings of these two diseases. Lymphopenia and elevated C-reactive protein and erythrocyte sedimentation rate levels have been reported in both diseases without significant differences (6). The procalcitonin level is also not significantly different between these two infections (6).

Reverse transcription-polymerase chain reaction (RT-PCR)
While the rapid RT-PCR assay is a very robust technique for the diagnosis of Influenza A and B, with a sensitivity of 98% and a specificity of 99% (11), the same is not true for COVID-19 PCR. The accuracy of PCR in COVID-19 depends on the time and technique of sampling. Its false-negative rate is very high in the incubation phase (100% false-negative five days before initial symptoms, 67% false-negative one day before initial symptoms). PCR is falsely negative in 38% of patients on the day of initial symptoms. Its false-negative rate is 20% three days after initial symptoms and 21% four days after initial symptoms (12).

CT scan
Currently, there are few studies regarding the role of medical imaging in the differentiation of these two infections. The disease burden of COVID-19 on chest CT is reported to be higher than that of Influenza A (CT score of 13 versus 6) (6). It has also been reported that the frequencies of bronchiectasis, pleural effusions, linear opacities, crazy-paving opacities, and vascular enlargement within the pulmonary lesions differ between COVID-19 and Influenza A pneumonia and can potentially be used for differentiation (6). In another study, COVID-19 patients had more rounded opacities and interlobular septal thickening on chest CT but fewer pulmonary nodules, tree-in-bud opacities, and pleural effusions than patients with Influenza A and B (13). Even though most CT manifestations of viral infection are nonspecific, the preliminary reports are promising about the role of CT in differentiating these two infections.

Fall and winter of 2020
It is unlikely that a vaccine and standard treatment for COVID-19 will be available by the fall and winter of 2020. In this period, seasonal flu can be superimposed on the COVID-19 pandemic. Differentiation between these two infections is critical for patient management. Given the facts mentioned above and the substantial overlap between the clinical and laboratory presentations of COVID-19 and Influenza, CT would be critical for this task. Differentiation of COVID-19 from Influenza on chest CT is challenging for medical centers without experienced chest radiologists. In this study, we tested the feasibility of automated diagnostic techniques based on transfer learning AI on chest CT images. Such platforms can help physicians without chest imaging experience during the fall and winter of 2020.

Materials And Methods
This study was approved by the ethical committee of Arak University of Medical Sciences (IR.ARAKMU.REC.1398.339). Medical data and images from patients with diagnoses of COVID-19 and H1N1 Influenza were reviewed by a pulmonologist with 25 years of experience and a radiologist with 12 years of experience. Cases with motion artifacts, poor image quality, or chronic lung disease were excluded. Medical data and chest CT images from 72 patients with a clinical and PCR diagnosis of COVID-19 were collected from February 2020 to May 2020 at the tertiary referral centers of the mentioned university. Medical data and chest CT images from 39 patients with PCR-positive H1N1 Influenza-induced pneumonia were collected from 2017 to December 2019 at the same medical centers. The Influenza cases were collected before December 2019 to make sure that there was no concurrent infection with COVID-19 and Influenza. Finally, 26 normal chest CT studies were also collected.
All CT scans were performed with the standard chest protocol (mA: 24-40, kVp: 100-110, slice thickness: less than 1.5 mm, pitch factor: 0.8, and matrix: 512×512). Axial slices were used in this study. Using the ImageJ platform, the grayscale images were converted to RGB format. All slices of the COVID-19 and Influenza cohorts were reviewed by the same radiologist, and slices without visible pathology were deleted. Then, one slice out of every nine in the normal cohort and one slice out of every four in the COVID-19 and Influenza cohorts were selected and used for training (this was done to feed the models with different slices and help them capture useful information). Each axial slice was divided vertically into right and left hemithorax. Augmentation was also performed with 30 degrees of rotation and 0.2 rescaling (0.2 width shift range, 0.2 height shift range, and 0.2 shear range). The final labeled images were uploaded to different pre-trained CNN models. Overall, 12744 images were used, including 2503 COVID-19, 3035 Influenza, and 7206 normal CT images. Images were resized to be acceptable to each model.
The pre-trained models were VGG 16, VGG 19, ResNet-50, Wide ResNet, InceptionV3, and SqueezeNet. Each model was pre-trained on the 1000 image classes of ImageNet. For ResNet-50, four models were developed with 0%, 20%, 30%, and 40% trainability. For InceptionV3, three models were developed with 0%, 20%, and 30% trainability. The other models were developed with 0% and 20% trainability. 60% of the data was used for training, 20% for validation, and 20% for testing. The training was done on Deep Learning Studio (14). The output of each pre-trained model was flattened using a flatten layer and then fed to a dense layer with a three-class output representing the prediction for normal, COVID-19, and Influenza. The training process was the same for all models (number of epochs: 10, batch size: 32, loss function: categorical cross-entropy, optimizer: Adam, beta 1: 0.9, beta 2: 0.999, decay: 0, and lr: 0.001). The final models were tested on the test cohort (20% of unseen data). The accuracy, precision, recall, and F1 score of each model were calculated and reported.
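The evaluation metrics named above can be illustrated with a minimal Python sketch. It assumes one-vs-rest precision, recall, and F1 for each of the three classes plus overall accuracy; the manuscript does not state how the metrics were aggregated, and the function name, label strings, and toy labels below are hypothetical and are not taken from the supplementary code.

```python
from collections import Counter

# Hypothetical label strings for the three output classes.
CLASSES = ("normal", "covid19", "influenza")


def per_class_metrics(y_true, y_pred, classes=CLASSES):
    """Return (overall accuracy, {class: precision/recall/f1}) for label lists.

    Precision, recall, and F1 are computed one-vs-rest for each class.
    This aggregation is an assumption; the paper does not specify it.
    """
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")

    # Count every (true label, predicted label) pair once.
    pairs = Counter(zip(y_true, y_pred))
    total = len(y_true)
    correct = sum(pairs[(c, c)] for c in classes)

    report = {}
    for c in classes:
        tp = pairs[(c, c)]
        fp = sum(pairs[(t, c)] for t in classes if t != c)  # predicted c, truly other
        fn = sum(pairs[(c, p)] for p in classes if p != c)  # truly c, predicted other
        precision = tp / (tp + fp) if tp + fp else 0.0
        recall = tp / (tp + fn) if tp + fn else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        report[c] = {"precision": precision, "recall": recall, "f1": f1}

    accuracy = correct / total if total else 0.0
    return accuracy, report


if __name__ == "__main__":
    # Toy labels for illustration only; these are not study data.
    y_true = ["normal", "normal", "covid19", "covid19", "influenza", "influenza"]
    y_pred = ["normal", "normal", "covid19", "influenza", "influenza", "influenza"]
    accuracy, report = per_class_metrics(y_true, y_pred)
    print(f"accuracy: {accuracy:.3f}")
    for name, scores in report.items():
        print(name, {k: round(v, 3) for k, v in scores.items()})
```

Reporting the scores per class, rather than as a single average, keeps a weak class visible even when the classes are imbalanced, as they are in this dataset.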

Results
The COVID-19 cohort consisted of 38 males and 34 females, with a mean age of 60.9 years and a mean time interval between the initial symptoms and CT of 4.37 days. The Influenza cohort consisted of 20 males and 19 females, with a mean age of 62.4 years and a mean time interval between initial symptoms and CT of 5.41 days. With our pipeline design, ResNet-50, InceptionV3, and Wide ResNet with various trainabilities, and VGG 19 with 0% trainability, were able to capture useful information and performed well on the validation and test cohorts. ResNet-50 with 20% and 30% trainability had the best performance in differentiating COVID-19 from Influenza-induced pneumonia. The results of each model are summarized in

Discussion
COVID-19, a novel infectious pandemic, is now one of the worst challenges modern medicine has ever encountered. It became a pandemic in less than three months, involving the entire world. It is unlikely that an effective treatment or vaccine will be available in the near future. This will be challenging in the fall and winter of 2020, when the seasonal Influenza outbreak may be superimposed on the COVID-19 pandemic. There is substantial clinical and laboratory overlap between these two diseases. To make things even more complicated, the current standard of care (PCR testing) is not perfect for the diagnosis of COVID-19. In this context, imaging (chest CT) may play a critical role. Chest CT is the backbone of medical imaging in such respiratory infections. In one study by Yin et al., the prevalence of bronchiectasis, pleural effusion, linear opacities, crazy-paving sign, vascular enlargement, and pleural thickening was statistically different between these two infections. However, the distribution of lesions, ground-glass opacities, consolidations, nodular opacities, bronchial thickening, lymphadenopathy, pericardial effusion, and air bronchograms were not statistically different between these two infections (6). Despite these findings, differentiation between Influenza and COVID-19 remains challenging because the CT manifestations of these viral infections are nonspecific. Diagnosis would be even more challenging in medical centers without expert chest radiologists. Given the facts mentioned above, an automated diagnostic system to differentiate COVID-19 from Influenza on chest CT images may improve diagnostic accuracy and patient management. Such automated platforms would be an essential part of the diagnostic process, especially in areas without access to expert radiologists and when the pandemic overload overwhelms the medical staff.
Recently, state-of-the-art AI models have been used for image classification. In this context, convolutional neural network (CNN) models are very promising. Different CNN-based models have even achieved better-than-human accuracy in image classification. A common problem in applying CNN models to medical imaging is the size of the datasets. Modern AI models, including CNNs, are data-hungry algorithms. To achieve accuracy equal to or better than that of humans, these models need thousands to millions of images for training. Assembling datasets of medical images containing millions of samples is challenging. This problem has been partially solved by transfer learning techniques.
In transfer learning, the CNN model is not developed from scratch. Instead, a pre-trained model is used. Pre-trained models are usually trained on large datasets of millions of images (mostly the ImageNet dataset, which contains millions of non-medical images). Such a pre-trained model is then retrained (fine-tuned) on a small dataset of medical images. The idea behind transfer learning is that basic image classification tasks (such as detecting edges and vertical and horizontal lines) can be learned from non-medical datasets. These pre-trained models are then fine-tuned on medical images and can achieve acceptable performance on small medical datasets (15). The same concept applies in our study.
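Partial fine-tuning can be pictured as a per-layer freeze plan. The sketch below assumes that the trainability percentages reported in the Methods unfreeze the last X% of layers (those nearest the output) and keep the earlier layers at their ImageNet weights. The manuscript does not define the term, so this interpretation, the function name, and the layer count of 50 are illustrative assumptions only.

```python
def freeze_plan(num_layers, trainable_percent):
    """Return one boolean per layer: True means the layer is updated in fine-tuning.

    Assumption (not stated in the paper): the trainable layers are the last
    `trainable_percent` percent of the network, counted from the output side.
    """
    if num_layers < 0:
        raise ValueError("num_layers must be non-negative")
    if not 0 <= trainable_percent <= 100:
        raise ValueError("trainable_percent must be between 0 and 100")

    # Integer arithmetic avoids floating-point rounding surprises.
    n_trainable = (num_layers * trainable_percent) // 100
    n_frozen = num_layers - n_trainable

    # Early layers (generic edge and line detectors) stay frozen;
    # late layers (task-specific features) are left trainable.
    return [index >= n_frozen for index in range(num_layers)]


if __name__ == "__main__":
    # 50 is a hypothetical layer count chosen only for illustration.
    for percent in (0, 20, 30, 40):
        plan = freeze_plan(50, percent)
        print(f"{percent}% trainability: {sum(plan)} of {len(plan)} layers trainable")
```

With 0% trainability, only the newly added classification head would learn, which corresponds to using the pre-trained network purely as a fixed feature extractor.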
Here, we were able to train CNN models for image classification even though our dataset is small (137 subjects and 12744 images, which is considered a small dataset for CNN models). We observed high performance with ResNet-50 and InceptionV3. The highest accuracy was achieved by ResNet-50 with 20% and 30% trainability. The accuracy and precision of ResNet-50 with 30% trainability were 97.17% and 100%, respectively, for the diagnosis of COVID-19. The accuracy of the same model was 94.1% and 91.2% for the diagnosis of normal and Influenza CTs, respectively. Such a model can add value to daily clinical practice. Our models have been trained to predict normal, COVID-19, or Influenza as their output.
Recently, there have been a few case series of co-infection with COVID-19 and Influenza (16, 17). It is unclear what the output of our models would be when they see a case of such co-infection. Such co-infections are expected to be encountered more frequently by the end of 2020.
The source code and the trained models are provided in the supplements and can be used by other researchers for further research projects (they are not approved for clinical applications). It must be noted that the provided models have been trained on RGB images, so grayscale CT images must be converted to RGB format before being fed to the models. Also, they work only on axial images. They have been trained on images of the lung parenchyma, so images of the neck base and upper abdomen must be deleted before these models are applied. Lastly, for data augmentation, these models have been trained on the right or left hemithorax, so for deployment, they must be fed divided axial slices.
The input of the models is an axial image of each hemithorax in the lung window, and the output is a prediction of normal, H1N1 Influenza, or COVID-19 for that hemithorax.
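The input requirements above can be sketched as a small preprocessing step. The example assumes that a slice is given as a list of pixel rows, that the split is made at the central column, and that RGB conversion replicates each gray value across three channels. None of these details is specified in the manuscript, and the function names are hypothetical.

```python
def split_hemithoraces(slice_2d):
    """Split a 2-D grayscale slice (list of equal-length rows) at the central column.

    Returns (image_left_half, image_right_half). Which half shows the patient's
    right lung depends on the display orientation, so it is not assumed here.
    """
    if not slice_2d or not slice_2d[0]:
        raise ValueError("slice_2d must contain at least one non-empty row")
    width = len(slice_2d[0])
    if any(len(row) != width for row in slice_2d):
        raise ValueError("all rows must have the same length")

    mid = width // 2
    image_left_half = [row[:mid] for row in slice_2d]
    image_right_half = [row[mid:] for row in slice_2d]
    return image_left_half, image_right_half


def gray_to_rgb(image_2d):
    """Replicate each gray value into an (R, G, B) triple with identical channels."""
    return [[(value, value, value) for value in row] for row in image_2d]


if __name__ == "__main__":
    # A toy 2 x 4 "slice"; real CT slices in the study were 512 x 512.
    toy_slice = [
        [0, 10, 20, 30],
        [40, 50, 60, 70],
    ]
    left_half, right_half = split_hemithoraces(toy_slice)
    for half in (left_half, right_half):
        rgb = gray_to_rgb(half)
        print(len(rgb), "rows x", len(rgb[0]), "columns x", len(rgb[0][0]), "channels")
```

Each half would still need to be resized to the input size expected by the chosen network before prediction.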
In conclusion, the development of an automated diagnostic platform to differentiate COVID-19 from H1N1 Influenza or normal chest CTs is feasible with acceptable accuracy. To our knowledge, our platform is the first solution for this task. Such a platform may enhance the efficiency of radiologists with limited chest imaging experience. ResNet-50 and then InceptionV3 were promising for this task in our study and are suitable starting points for developing an automated platform.