Application Research on Medical Imaging Testing of Novel Coronavirus Pneumonia (COVID-19) Based on Transfer Learning

Novel coronavirus pneumonia (COVID-19) is a highly infectious and fatal pneumonia-type disease that poses a great threat to public safety. A fast and efficient method for screening COVID-19-positive patients is essential. At present, the main detection methods are nucleic acid testing and manual diagnosis of medical imaging (CT/X-ray images), both of which take a long time to produce a diagnosis. This paper discusses common approaches to the problem of insufficient medical image data. Transfer learning and convolutional neural networks are then used to construct a screening and diagnosis model for COVID-19; different transfer models are analyzed and compared to select a better pre-training model, which is trained and analyzed on small data sets. Finally, the paper analyzes and discusses how to train a highly reliable model that can quickly help doctors provide advice at critical moments of epidemic prevention and control when only a small sample data set is available.


Research background
The 2019 novel coronavirus pneumonia (COVID-19) is extremely contagious and fatal. Patients with COVID-19 typically present with fever, dry cough, sore throat, headache, fatigue, muscle pain, and shortness of breath. The emergence of asymptomatic patients further complicates existing epidemic prevention efforts. The SARS-CoV-2 virus is highly infectious, with an incubation period of 1 to 14 days. The initial symptoms of COVID-19 patients are often not obvious, which makes the virus more insidious and more easily transmitted. Therefore, rapidly screening potential COVID-19 infections and taking corresponding measures in time are essential for the prevention and control of the epidemic and the protection of people's health.
Traditional COVID-19 detection methods mainly include nucleic acid testing [1] and manual diagnosis of medical imaging. Nucleic acid testing relies on detection kits and takes a long time to produce results; it is not completely accurate and may require multiple RT-PCR or other tests to confirm a diagnosis. Manual diagnosis of medical imaging, meanwhile, requires highly trained physicians, takes even longer to analyze, and cannot exclude the possibility of other subtle or hidden findings. Thus, to complement traditional COVID-19 detection with faster and more accurate clinical diagnosis, there is an urgent need for an auxiliary diagnostic system that can screen for COVID-19 through X-ray/CT images, improve the screening efficiency of diagnosing physicians, and achieve "early detection, early diagnosis, early isolation, and early treatment" in COVID-19 epidemic prevention and control.

Current research status at home and abroad
To address the current state of COVID-19 screening, more and more research teams at home and abroad have built their own data sets and applied deep learning models to construct automatic diagnosis systems for COVID-19. Several reference neural network models are described below.

OpenCovidDetector [3]
OpenCovidDetector is an AI diagnostic system for rapid detection of COVID-19, developed jointly by research teams from the Department of Automation of Tsinghua University and the Union Hospital affiliated to Tongji Medical College of Huazhong University of Science and Technology. The dataset used by the system combines data from three hospitals in Wuhan with several publicly available datasets, containing over 10,000 chest CT scans from subjects with COVID-19, influenza A/B, non-viral community-acquired pneumonia (CAP), and no pneumonia. The system takes CT image data directly as input, extracts and segments the lung region from the image, and produces a prediction. Its prediction accuracy on CC-CCII and MosMedData reaches 92.99% and 93.25%, respectively.

DAMO Academy AI system
The DAMO Academy team, the research and innovation organization of Alibaba Group, used the diagnostic guidelines issued by Dr. Shi Heshui, a senior radiologist at Tongji Medical College, to train its algorithms for diagnosing COVID-19. The AI diagnostic system can diagnose COVID-19 in CT images with 96% accuracy within 20 seconds. It has been deployed in 26 hospitals in China and has helped diagnose more than 30,000 cases of COVID-19.

CovidGAN [4]
CovidGAN is an Auxiliary Classifier Generative Adversarial Network (ACGAN) that generates synthetic chest X-ray images to improve the detection performance of a Convolutional Neural Network (CNN). Using a CNN alone to classify COVID-19 X-ray images achieves an accuracy of 85%; after adding CovidGAN-generated images, accuracy reaches 95%.

COVIDiagnosis-Net [5]
COVIDiagnosis-Net, an AI model for COVID-19 detection based on deep SqueezeNet and Bayesian optimization, is a novel model for rapid diagnosis of COVID-19. It addresses the class imbalance of the common data set and augments the data at multiple scales. Thanks to SqueezeNet's computation speed and small model size, COVIDiagnosis-Net is convenient to deploy in embedded and mobile systems, helping physicians diagnose COVID-19 patients quickly and efficiently.

CoroNet [6]
CoroNet, a model based on deep convolutional neural networks, automatically screens for COVID-19 in chest X-ray images. It is built on the Xception pre-trained convolutional neural network, with a Dropout layer and two fully connected layers added. Experimental results show that the overall accuracy of the CoroNet model on a specific data set reaches 89.6%. In four-class classification (COVID-19 vs. viral pneumonia vs. bacterial pneumonia vs. normal), the precision and recall for COVID-19 cases reach 93% and 98.2%; in three-class classification (COVID-19, pneumonia, and normal), the model's classification accuracy reaches 95%.

CovXnet [7]
The CovXNet model, based on a deep Convolutional Neural Network (CNN), uses depthwise convolutions with different dilation rates to efficiently extract diverse features from chest X-rays, and uses a stacking algorithm to jointly combine the features extracted from X-ray images of different resolutions. CovXNet was extensively tested on two different datasets and achieved very high accuracy: 97.4% for COVID vs. normal, 96.9% for COVID vs. viral pneumonia, 94.7% for COVID vs. bacterial pneumonia, and 90.2% for the multi-class combination.

Methods used in this paper
This paper draws on the above research and the experimental methods of the above neural network models, and introduces transfer learning and pre-training models in light of the actual situation. Next, a small self-built data set is constructed from X-ray image data collected from open source COVID-19 data sets; the X-ray images are preprocessed to remove black borders, and the self-built data set is expanded with data augmentation. Finally, the pre-training models are fine-tuned through transfer learning and applied to the transfer learning task on the self-built data set; the training results are analyzed to screen out high-performance pre-training models, and the prediction performance of the selected models is tested on small sample data sets of different sizes.

Transfer Learning
Definition: Given a source domain Ds and learning task Ts, and a target domain Dt and learning task Tt, the purpose of transfer learning is to use the knowledge acquired in Ds and Ts to help improve the learning of the prediction function ft(·) in Dt, where Ds ≠ Dt or Ts ≠ Tt.
The process of transfer learning is shown in Figure 1. The green part on the upper side represents the whole process of traditional machine learning, while the red part on the lower side represents the whole process of transfer learning. Transfer learning not only uses the data of the target task as input to the new learning algorithm, but also uses everything learned in the source domain through the traditional machine learning process, including training data, model, and task, as input. In other words, transfer learning can acquire more knowledge from the source domain and thus alleviate the lack of training data in the target domain.
Thus, transfer learning is not learning from scratch, but building on the experience learned while solving other problems. Especially in Computer Vision (CV) tasks, transfer learning can be carried out through pretrained models. These pre-trained models are usually models trained on large data sets and perform well in classification tasks of large data sets. Through transfer learning, they can be used to solve problems similar to those we want to solve, such as image classification problems.
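The idea can be sketched in TensorFlow/Keras: a convolutional base pre-trained on a large data set (here VGG-19 on ImageNet) is reused as-is, and only a new classifier head is trained on the target task. The layer choices are illustrative, not the paper's exact architecture.

```python
import tensorflow as tf

def build_transfer_model(input_shape=(224, 224, 3), weights="imagenet"):
    # Source-domain knowledge: a convolutional base trained on ImageNet
    base = tf.keras.applications.VGG19(
        weights=weights, include_top=False, input_shape=input_shape)
    base.trainable = False  # reuse the learned features unchanged

    # Target-domain task: a new binary classifier (COVID-19 vs. normal)
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy",
                  metrics=["accuracy"])
    return model
```

Only the pooling and dense layers carry trainable weights, so even a small target data set can fit them without overfitting the whole network.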

Pre-training models in Computer Vision
VGG [15] was constructed and trained by Karen Simonyan and Andrew Zisserman at the University of Oxford in 2014; VGG-19 is a deep convolutional neural network with 19 layers.
ResNet-50 [16] (Residual Network) is a 50-layer deep convolutional neural network constructed and trained by Microsoft in 2015.

Fine-tuning strategies for the pre-training model
In transfer learning, the original classifier in the pre-training convolutional neural network model can be removed, because it corresponds to the original classification task. The pre-training model can then be applied to a new classification task by adding a suitable new classifier. Three strategies for fine-tuning the model are introduced below.
(1) Train the entire model: use the complete structure of the pre-trained model and train it from scratch on the data set of the target task; (2) Freeze some layers and train the others: typically, the general feature-extraction layers of the pre-training model that are not specific to the problem are frozen, so the pre-trained weights are reused for feature extraction, while the remaining layers keep their structure and are trained. In general, if the data set is small and the model has many parameters, more layers should be frozen to avoid overfitting; if the data set is large and the model has few parameters, fewer layers can be frozen and more layers trained to adapt to the new task.
(3) Freeze the entire convolutional base: keep the weights of the pre-training model's convolutional base unchanged and connect its output directly to the classifier. This is usually done when computing power is insufficient, the data set is small, or the source domain task is similar to the target domain task.
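The three strategies can be sketched in Keras as follows; `n_frozen` is an illustrative cut-off for strategy (2), not a value prescribed by the paper.

```python
import tensorflow as tf

def apply_strategy(base, strategy, n_frozen=100):
    if strategy == 1:
        # (1) Train the entire model: all layers trainable,
        # trained from scratch on the target data set.
        base.trainable = True
    elif strategy == 2:
        # (2) Freeze the general, low-level feature layers and
        # train the remaining layers on the new task.
        for layer in base.layers[:n_frozen]:
            layer.trainable = False
        for layer in base.layers[n_frozen:]:
            layer.trainable = True
    else:
        # (3) Freeze the entire convolutional base; only a new
        # classifier added on top would be trained.
        base.trainable = False
    return base
```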

Self-built data set
In the open source data set [23], the number of images in each category (COVID-19-positive patients, normal, other pneumonia, etc.) differs, and the image sizes also vary. To meet the needs of this experiment, we built our own data set based on the open source data set. Since this experiment analyzes X-ray images, 200 X-ray images of COVID-19-positive patients and 200 normal images were randomly selected from the X-ray list above as the data set for this experiment.

Image black border problem
Most public data sets have not been uniformly preprocessed, and the CT/X-ray medical images (as shown in the figure below) have black borders of different widths. Most images of healthy subjects have obvious black borders on the left and right sides, and the chest occupies a smaller proportion of the image than in COVID-19-positive patients. In CT/X-ray images of COVID-19-positive patients, there are almost no black borders on the left and right sides, and the chest occupies a significantly larger proportion of the image. Consequently, the width of the black borders and the proportion of the chest in the image are very likely to bias the model: a trained model may well judge whether a new sample is COVID-19-positive or -negative based on the black borders at the sides.
Manual examination of the data revealed that almost all negative images had prominent black borders on the left and right sides, while fewer than 15 percent of the positive images had black borders. Such an obvious data bias can easily mislead the model, which would then predict, with high probability, that images with black borders, or with a large proportion of black border, are negative. Therefore, preprocessing of the data set plays a decisive role in what the model learns to recognize.

The solution to the image black border problem
As shown in Figure 4, the image on the left is unprocessed. There are obvious black borders at the top and on the left and right sides, and their area is relatively large. This has a large impact on model training. Therefore, it is necessary to preprocess the image: remove the black borders at the top and on the left and right sides, crop the central part of the image at the corresponding proportion, and then scale it up, so that the width and height of all preprocessed images are as consistent as possible; the result is shown in the right image below.

The steps of image black border problem
The key to dealing with black borders is to detect those that appear in the image: the black border at the top, and the possible black borders at the bottom and on the left and right sides. The basic steps are as follows: (1) Set a black border threshold, i.e., the percentage of black pixels among all pixels in a row or column, as the basic criterion for judging a black border: at or above this percentage, the row or column is judged to be part of a black border; (2) Scan successively from the outside inward and determine whether the proportion of black pixels in each row or column reaches the threshold. If so, delete all pixels in that row or column, until the proportion of black pixels in 2-4 consecutive rows or columns falls below the threshold; (3) Check whether the black borders are cleared; if so, crop the image toward the center, as large as possible, at a specific width-to-height ratio; (4) Take the next image and repeat the above three steps; (5) Obtain the preprocessed data set.
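These steps can be sketched as a runnable NumPy function, assuming the image is a grayscale array. The near-black cutoff of 40 and the 90% threshold are illustrative values, not settings from the paper, and for simplicity the sketch stops at the first non-black row/column rather than waiting for 2-4 clean lines.

```python
import numpy as np

def remove_black_borders(img, black_level=40, ratio=0.90):
    """Scan from each side inward; drop rows/columns whose fraction of
    near-black pixels is at or above `ratio` (the black-border threshold)."""
    is_black = img < black_level

    def edge(fracs):
        # count leading rows/columns classified as black border
        n = 0
        while n < len(fracs) and fracs[n] >= ratio:
            n += 1
        return n

    row_frac = is_black.mean(axis=1)   # fraction of black pixels per row
    col_frac = is_black.mean(axis=0)   # fraction of black pixels per column
    top, bottom = edge(row_frac), edge(row_frac[::-1])
    left, right = edge(col_frac), edge(col_frac[::-1])
    h, w = img.shape
    return img[top:h - bottom, left:w - right]
```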

Image equal proportion cropping strategy
In order to obtain the output image with the same proportion as the input image, the image with the black edge removed must be further processed.
When removing black borders, record the number of rows removed from the top and bottom and the number of columns removed from the left and right sides. Then compute the difference between the removed row count and the removed column count, divide it by 2, and, depending on the sign of the result, delete additional columns of pixels from the left and right sides or additional rows from the top and bottom. Finally, we obtain an output image with the same width-to-height ratio as the input image.
On the one hand, the equal-proportion cropping strategy can reduce image loading time and save memory; on the other hand, it can reduce the negative impact of invalid image regions on model training. Pre-training models impose size requirements on input images. For example, VGG-19 and ResNet-50 require 224×224-pixel input, while Inception-V3 and Xception require 299×299-pixel input. Therefore, we crop the image to the largest possible 1:1 region and then scale it to the size required by the model.
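The 1:1 cropping and scaling can be sketched in NumPy; the nearest-neighbour resize here is an illustrative stand-in for the bilinear resize a real pipeline would typically use.

```python
import numpy as np

def center_crop_square(img):
    # Crop the largest possible 1:1 region around the image centre
    h, w = img.shape[:2]
    side = min(h, w)
    top = (h - side) // 2
    left = (w - side) // 2
    return img[top:top + side, left:left + side]

def resize_nearest(img, size):
    # Nearest-neighbour scaling to size x size (illustrative stand-in)
    h, w = img.shape[:2]
    rows = np.arange(size) * h // size
    cols = np.arange(size) * w // size
    return img[rows][:, cols]
```

For a VGG-19/ResNet-50 pipeline, `resize_nearest(center_crop_square(img), 224)` would yield the required 224×224 input.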

Background
At present, the data sets published on the Internet are becoming more and more abundant, and more and more data can be used for training, which greatly improves the accuracy of trained models. However, for other sudden public health emergencies, it is difficult to obtain enough data in a short period of time, and it is unrealistic to train a robust model quickly.
This section describes a general method for alleviating the problem of insufficient training data: data augmentation.

The impact of insufficient training data or unbalanced data
(1) Due to the lack of training data, the model easily learns information that is irrelevant to the problem; that is, the noise in the sample data is too large (for example, the influence of special cases on the images), which readily leads to overfitting; (2) With less training data, less data is also available for testing the model, making it difficult to measure the training results, i.e., the quality of the model;
(3) If the training data is too small, training easily falls into a local minimum; (4) Model training may not converge, since the information the model can learn from such a small amount of data is limited.

Data Augmentation: a strategy to alleviate the lack of training data
When small sample data sets are processed in image deep learning tasks, the data set can be expanded using data augmentation, which manually generates more equivalent images from the existing limited positive-patient image data. However, data augmentation is not omnipotent or fully reliable: since the generated positive-patient images inevitably differ from real image data, they inevitably introduce noise.
Next, we explain how to expand the data set through basic data augmentation.

Flip transform
The image flipping transformation includes horizontal flipping and vertical flipping, which is the flipping operation of the image with respect to the horizontal or vertical axis.

Moving
Moving the image up, down, left, or right, or in a random combination of these, compensates for deviations caused by lung positions not being perfectly centered in the image data.
For example, if all lung images of negative patients in the sample data set are centered while those of positive patients are not, this augmentation method can effectively avoid the adverse effects of the position deviation.

Rotation
Rotate the image by a randomly selected 10-45 degrees, clockwise or counterclockwise, to change the orientation of the lungs in the image. In general, the rotation angle must be chosen carefully, since excessive rotation can produce unrealistic chest orientations.

Saturation
By randomly adjusting the saturation of the image, i.e., the vividness of its colors, the purity of the image's color is changed, achieving data augmentation.

Contrast
The image is augmented by randomly adjusting the contrast, i.e., the difference in brightness between the brightest white and the darkest black areas of the image.

Integrated use
The original image can be randomly combined with the above methods for data enhancement, so as to generate more image data.

Introduction to experimental environment
In this experiment, the deep learning framework Tensorflow 2.4.0 is mainly used for model training, and the entire training process is completed through Jupyter Notebook.

Experimental steps for transfer learning
The transfer learning task is mainly divided into the following three steps: (1) Select a pre-training model: choose a model suitable for the problem from the list of pre-training models; (2) Classify the problem according to the size-similarity matrix: generally speaking, the problem is classified by comparing the size of the current data set and its similarity to the data set on which the pre-training model was trained; (3) Fine-tune the pre-training model with the strategy indicated by that classification.

Performance index of classification effect
The six indicators for evaluating CNN performance are sensitivity, specificity, accuracy, precision, F1 value, and the area under the receiver operating characteristic curve (AUC) [24].
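These six indicators can be computed from a model's predictions, for example with scikit-learn (assumed to be available alongside TensorFlow; the 0.5 decision threshold is a common default, not a value from the paper).

```python
from sklearn.metrics import (accuracy_score, confusion_matrix, f1_score,
                             precision_score, recall_score, roc_auc_score)

def classification_metrics(y_true, y_prob, threshold=0.5):
    """Compute the six indicators from labels and predicted probabilities."""
    y_pred = [int(p >= threshold) for p in y_prob]
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    return {
        "sensitivity": recall_score(y_true, y_pred),  # TP / (TP + FN)
        "specificity": tn / (tn + fp),                # TN / (TN + FP)
        "accuracy":    accuracy_score(y_true, y_pred),
        "precision":   precision_score(y_true, y_pred),
        "f1":          f1_score(y_true, y_pred),
        "auc":         roc_auc_score(y_true, y_prob),  # threshold-free
    }
```

Note that AUC is computed from the raw probabilities, so unlike the other five it does not depend on the chosen threshold.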

Comparison experiment of classification performance of pre-training model
In this experiment, VGG-19, ResNet-50, and Inception V3 were used as the pre-training models. Following the size-similarity matrix, we chose to freeze the entire convolutional base of each pre-training model and replaced the classifier with a two-class (normal vs. COVID-19) classifier.
In order to obtain relatively accurate results and avoid the randomness of training, the self-built data set is randomly divided into 80% for training and 20% for testing. For each pre-training model, in addition to scaling the images of the self-built data set to the corresponding input size, the self-built data set also needs to be augmented for the comparison runs. Next, the performance of VGG-19, ResNet-50, and Inception V3 is analyzed by comparing their results before and after data augmentation.
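The 80/20 random split can be sketched with scikit-learn's `train_test_split`, an assumed convenience rather than the paper's stated tool; small placeholder arrays stand in for the 400 actual X-ray images.

```python
import numpy as np
from sklearn.model_selection import train_test_split

images = np.zeros((400, 8, 8, 1), dtype="float32")  # placeholder X-rays
labels = np.array([0] * 200 + [1] * 200)            # 0 = normal, 1 = COVID-19

# Stratified 80/20 split keeps both classes balanced in each partition
x_train, x_test, y_train, y_test = train_test_split(
    images, labels, test_size=0.2, stratify=labels, random_state=0)
```

Fixing `random_state` makes the split reproducible across the comparison runs with and without augmentation.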

Result analysis
(1) Without data augmentation, the training results are shown in Table 4. From the comparison data, it can be seen that among the three pre-training models, ResNet-50 performs well.
(2) With data augmentation, the training results are shown in Table 5. Again, from the comparison data, ResNet-50 performs well among the three pre-training models.

Analysis summary
According to the above data analysis, in terms of accuracy, the pre-training models without data augmentation obtain better results than those with data augmentation. The average AUC of the models without data augmentation is also higher than or equal to that of the models with data augmentation.
All in all, the best pre-training model is ResNet-50, which has the best accuracy and the highest F1 value and AUC. Therefore, when using transfer learning, the above analysis method can be used to select high-performance pre-training models for training, so as to respond to COVID-19 in crisis situations.

Analysis of the training effect of ResNet-50 on a small sample data set
The comparison experiment on classification performance shows that ResNet-50 performs best among the three common pre-training models studied in this paper. Therefore, the following briefly explores the classification performance of ResNet-50 on a small sample data set.

Self-built data set
Different from the data set used in Sec.4.4, this experimental data set consists of 50 images from each of the COVID-19 and normal categories of the COVID-19 Radiography Database [17]. Each image has been carefully selected and processed: images with marks, symbols, etc. are excluded, and the chest occupies no less than 80% of each image; each image is manually cropped to remove the surrounding black borders as much as possible, so that the core region is centered; finally, the processed images are resized to ResNet-50's input size of 224×224 pixels. A processed image is shown in Figure 12 below.

Model training
Due to the small amount of data in this experiment, the data augmentation API provided by Keras is used. The data set is divided using the same ratio as in Sec.4.4, i.e., 80% for training and 20% for validation. Because of the small number of training samples, the convolutional base of the entire ResNet-50 pre-training model is frozen, and the final output is computed with a sigmoid activation function. The model structure is shown in Table 6 below.
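This configuration can be sketched as follows: a fully frozen ResNet-50 base, a sigmoid output, and the 0.0005 learning rate used in the experiments below. The pooling layer and the exact head structure are inferred assumptions, not a transcription of Table 6.

```python
import tensorflow as tf

def build_small_sample_model(weights="imagenet", lr=5e-4):
    base = tf.keras.applications.ResNet50(
        weights=weights, include_top=False, input_shape=(224, 224, 3))
    base.trainable = False                    # freeze the whole convolutional base
    model = tf.keras.Sequential([
        base,
        tf.keras.layers.GlobalAveragePooling2D(),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # binary output
    ])
    model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=lr),
                  loss="binary_crossentropy", metrics=["accuracy"])
    return model
```

With the base frozen, only the final dense layer's weights are updated, which keeps training tractable on 60-100 images.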

Experimental results
This experiment is divided into three runs to explore the influence of the training set size on the ResNet-50 pre-training model. The pre-training model is trained with a learning rate of 0.0005, and the training set sizes are 100, 80, and 60 (COVID-19 and normal each make up half). The test data come from the COVID-19 Radiography Database [17] and do not overlap with the training set. The experimental results are shown in Figure 14, Figure 15, and Figure 16, where Train corresponds to the training set and Test corresponds to the validation set.

Experiment analysis
According to the Loss curves in Figures 14, 15, and 16, the Loss value decreased significantly at the beginning of training, indicating that the learning rate was appropriate and gradient descent was proceeding. After a certain stage of learning, the Loss curve became relatively stable (see Figure 15), but the Loss values in Figures 14 and 16 still oscillated considerably; the oscillation amplitude of the loss can be alleviated by increasing the batch size. The Accuracy curves in Figures 14, 15, and 16 show that the validation-set Accuracy curve often lies above the training-set Accuracy curve, indicating that the model has not overfitted and performs relatively well on the validation set.
As can be seen from Table 7, as the loss decreases, the accuracy increases and the recognition performance of the model improves. At the same time, compared with the experiment in Sec.4.4, a higher-quality data set achieves a better training effect even with fewer samples. The experimental data in Table 7 and Sec.4.4 show that the training effect of the pre-training model is positively correlated with the quality of the training set: even with a small sample size, the training effect can be improved by optimizing the data set itself.

Conclusions
For scenarios with a small number of training samples, it is feasible, to a certain extent, to screen COVID-19-positive patients from medical images (CT/X-ray) with a transfer-learning-based model: using only the pre-training model, accuracy already approaches 99 percent.
This study also found that, among the three pre-training models compared, ResNet-50 achieves the highest performance in COVID-19 diagnosis from X-ray images under the parameter settings and fine-tuning strategies used here, and can be tried in the COVID-19 diagnosis process.
Given the shortage of nucleic acid test kits and the severe shortage of imaging physicians, it is necessary to rapidly screen potential COVID-19-positive patients in a large population and prevent transmission of the novel coronavirus as much as possible. This paper presents the "Diagnostic process through the COVID-19 Diagnostic System" (Figure 17) as a reference for rapid screening in response to health emergencies.
A COVID-19 imaging diagnosis system can greatly reduce the workload of imaging physicians and provide rapid and effective diagnosis and treatment measures for suspected COVID-19-positive cases, greatly improving the efficiency of the entire healthcare system and preventing the spread of the epidemic to the greatest extent.

Prospects
The research in this paper still has many deficiencies, and the model still needs further tuning and improvement.
Some pre-trained CNN models achieve high performance in image recognition, but developing neural networks suitable for medical image analysis and applying them in clinical practice still requires serious discussion: for example, whether the preprocessing of data sets complies with medical norms [25], how to screen out a highly reliable pre-training model, and how to adjust the training strategy of the pre-training model [26].
As for pre-training models, the current common high-performance models are all trained on large data sets, but the images in those data sets may have nothing to do with medical images. These pre-training models have strong, general feature-extraction abilities, but may still converge to poor local optima on medical images. Especially when training on small sample data sets, the original pre-training model can learn little.
Therefore, existing image data could be collected, classified, and assembled into a large medical image data set, and the neural network structures of current pre-training models used as a reference for training on it, so as to obtain pre-training models with high performance on large medical image data sets. Such a pre-training model could extract medical image features to a greater extent and provide higher performance even on small sample data sets.
Due to limitations of machine performance, this study only considered binary classification (COVID-19 pneumonia vs. healthy) on X-ray images, not three-class classification (COVID-19 pneumonia, other pneumonia, and healthy) or more. Compared with the work in this paper, the transfer-learning-based deep neural network proposed by Ibrahim et al. [28] has made good progress in the automatic detection of COVID-19 pneumonia, non-COVID-19 viral pneumonia, and bacterial pneumonia; in three-class classification (COVID-19, bacterial pneumonia, and healthy chest X-ray images), it achieved 93.42% accuracy on the test set. DarkCovidNet [29] used the IEEE8023 COVID chest X-ray data set and the ChestX-Ray8 database [30] for binary and three-class classification, achieving 87% accuracy with an F1 value of 0.87 in the three-class case.
A limitation of this study lies in the classification setting. In binary classification, the fine-tuned ResNet-50 model reaches high accuracy on the test set, but realistic clinical settings involve three-class, four-class, and even finer-grained classification, for which a single binary medical imaging classifier cannot meet practical operational requirements; there remains a very large research space.
In general, this paper verifies the feasibility of using transfer learning to screen COVID-19-positive patients from medical images. We believe that, in the future, pre-training models built on large medical image data sets can provide higher performance and better support imaging diagnosis for health emergencies even with small sample data sets.