Since the first COVID-19 case was identified in 2019, more than 9.47 million cases of novel coronavirus pneumonia have been diagnosed worldwide, with 484,249 deaths reported, according to the World Health Organization Coronavirus disease (COVID-19) Situation Report - 158. Currently, the detection of COVID-19 relies mainly on nucleic acid testing. However, many infected patients with obvious typical symptoms tested negative in multiple nucleic acid tests and were diagnosed as positive only in the last test 1. This high false-negative rate delays treatment and even aggravates the spread of the pandemic. On February 5, the Chinese National Health Commission released the "Novel Coronavirus Pneumonia Diagnosis and Treatment Program (Trial Version 5)", which updated the diagnostic criteria for novel coronavirus pneumonia by adding CT imaging examination as one of the main bases for the clinical diagnosis of COVID-19. CT screening is widely available, easy to perform, and sensitive to COVID-19, which makes it critical for both early diagnosis and pandemic control.
Nevertheless, influenza virus pneumonia and other types of pneumonia may also occur in the same season. In many respects, especially with regard to clinical features, it is difficult to differentiate COVID-19 from general pneumonia (GP). For instance, the main manifestations of COVID-19 in the early stage are fever, fatigue, dry cough, and expiratory dyspnea, and patients with general pneumonia have similar symptoms 2. COVID-19 pneumonia places a huge burden on the health care system because of its high morbidity and mortality. Therefore, early diagnosis and isolation of GP and COVID-19 patients can better prevent the spread of the pandemic and optimize the allocation of medical resources. However, in addition to the overlapping symptoms and laboratory test abnormalities, the CT manifestations of GP and COVID-19 are also similar, causing instability and uncertainty in distinguishing them 3, 4.
Typical CT manifestations of COVID-19 patients consist of the pleural indentation sign, unilateral or bilateral pulmonary ground-glass opacities (GGO), opacities with rounded morphology, and patchy consolidative pulmonary opacities with a predominance in the lower lung 5-8. GP infections have similar CT manifestations at presentation. However, COVID-19 presents more bilateral extensive GGO, while GP shows more unilateral GGO or consolidation 9. Furthermore, the other CT findings of GP and COVID-19 are difficult to observe, and the lung areas contain a large proportion of insignificant extraneous regions. To avoid the interference of irrelevant information and to identify COVID-19 from GP more accurately and stably, the GGO was cropped as the region of interest (ROI) and features were extracted from the ROIs. Figure 1 shows samples of COVID-19 and GP CT images from the collected dataset.
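The exact feature definitions are described later in the paper; purely as an illustration of the idea, the following is a minimal Python sketch, assuming scikit-image (>= 0.19) and SciPy are available and that the GGO ROI is given as a hand-drawn bounding box. The specific first-order statistics and GLCM properties shown here are examples, not necessarily the features used in this work.

```python
# Minimal sketch (not the authors' exact pipeline): compute simple first-order
# statistics and GLCM-based texture descriptors from a cropped GGO ROI.
import numpy as np
from scipy import stats
from skimage.feature import graycomatrix, graycoprops  # scikit-image >= 0.19 naming

def roi_texture_features(ct_slice, bbox):
    """ct_slice: 2-D array of one CT slice; bbox: (row0, row1, col0, col1) of the GGO ROI."""
    r0, r1, c0, c1 = bbox
    roi = ct_slice[r0:r1, c0:c1].astype(np.float64)

    # First-order (histogram) statistics of the ROI intensities.
    first_order = [roi.mean(), roi.std(),
                   stats.skew(roi.ravel()), stats.kurtosis(roi.ravel())]

    # Second-order texture from a gray-level co-occurrence matrix (GLCM).
    levels = 64
    quantized = np.digitize(roi, np.linspace(roi.min(), roi.max(), levels)) - 1
    glcm = graycomatrix(quantized.astype(np.uint8), distances=[1],
                        angles=[0, np.pi / 2], levels=levels,
                        symmetric=True, normed=True)
    glcm_props = [graycoprops(glcm, p).mean()
                  for p in ("contrast", "homogeneity", "energy", "correlation")]

    return np.array(first_order + glcm_props)
```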
Lin et al. proposed a deep learning model, COVNet, based on visual features from volumetric CT images to distinguish COVID-19 from community acquired pneumonia (CAP) 10. A total of 4536 three-dimensional CT images (COVID-19: 30%; CAP: 40%; non-pneumonia: 30%) were included in their study. U-Net was applied to crop the lung region as the ROI, and both 2D and 3D features were extracted by COVNet from the ROIs. The features were then combined and input into the proposed scheme for prediction. The sensitivity and specificity for detecting COVID-19 were 90% and 96%, while those for CAP were 87% and 92%; the corresponding AUCs were 0.96 and 0.95. However, as with other deep learning methods, which features determine the result remains unknown, so the method lacks interpretability and transparency.
Charmaine et al. evaluated a ResNet model with a location-attention mechanism for screening COVID-19 11. Two ResNet models were used in their study. Three-dimensional features were extracted by ResNet-18 and fed into ResNet-23 with a location-attention mechanism in the fully connected layer for classification, while a ResNet without the location-attention mechanism was also applied for comparison with the proposed method. The results show that the proposed method achieved better performance, with an overall accuracy of 86.7%.
Asif et al. proposed the CoroNet model based on the Xception architecture, using X-ray images to differentiate COVID-19 from healthy, bacterial pneumonia, and viral pneumonia cases 12. Notably, Xception is a transfer learning model that was pretrained on the ImageNet dataset and then retrained on the collected X-ray dataset. In the proposed architecture, the classical convolution layers were replaced by depthwise separable convolutions with residual connections. The overall accuracy was 89.6%, while the average accuracy of detecting COVID-19 was 96.6%. To test its stability and robustness, CoroNet was also evaluated on the dataset prepared by Ozturk et al. 13, achieving an accuracy of 90%.
Ozturk et al. developed the DarkNet model based on the you only look once (YOLO) system to detect and classify COVID-19 13. Their model achieved an accuracy of 98.08% for classifying COVID-19 versus non-infection cases and 87.02% for the three-class task of distinguishing COVID-19, no-findings, and GP. Nevertheless, the methods proposed by Asif et al. and Ozturk et al. were based on X-ray images. X-ray screening is not sensitive to GGOs, which are one of the most significant manifestations at the early stage of COVID-19. This can cause a high error rate and ineffective containment of the pandemic.
Kang et al. developed a machine learning method with structured latent multi-view representation learning to diagnose COVID-19 and community acquired pneumonia 14. In their work, V-Net was leveraged to extract lung lesions. Radiomic and handcrafted features, 189 dimensions in total, were then extracted from the CT images. The proposed model yielded the best accuracy of 95.50%, with a sensitivity of 96.6% and a specificity of 93.2%. Compared with the other methods in the study, the accuracy was improved by 6.1% to 19.9%, and the sensitivity and specificity were improved by 4.61% to 21.22%.
To our knowledge, most recent studies on detecting COVID-19 are based on deep learning. However, deep learning models require a large amount of training data, while COVID-19 samples were initially in short supply. Transfer learning might be a promising method when only a small amount of data is available, but negative transfer may occur, because the source dataset and the target domain may not be related to each other, and the criteria for judging whether training data are sufficiently related remain unclear.
Machine learning plays an irreplaceable role in artificial intelligence and has achieved outstanding results in medical image classification. We developed a machine learning method using an ensemble of bagged trees based on statistical texture features of CT images, focusing on differentiating COVID-19 from GP. The method demonstrates high efficiency in identifying COVID-19 and GP, helping to reduce misdiagnosis and control the transmission of the pandemic.
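As an illustration of this kind of classifier (not the authors' exact configuration), the following sketch trains an ensemble of bagged decision trees on ROI texture-feature vectors using scikit-learn; the data shapes, number of trees, and cross-validation setup are placeholders.

```python
# Minimal sketch (hypothetical data, not the authors' exact configuration):
# an ensemble of bagged decision trees trained on ROI texture-feature vectors.
import numpy as np
from sklearn.ensemble import BaggingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

# X: one row of statistical texture features per ROI (e.g. from roi_texture_features above);
# y: 1 for COVID-19, 0 for general pneumonia. Both are random placeholders here.
rng = np.random.default_rng(0)
X = rng.normal(size=(88, 8))          # 88 cases x 8 features (illustrative only)
y = rng.integers(0, 2, size=88)

bagged_trees = BaggingClassifier(DecisionTreeClassifier(), n_estimators=30,
                                 bootstrap=True, random_state=0)
scores = cross_val_score(bagged_trees, X, y, cv=5, scoring="accuracy")
print("5-fold CV accuracy: %.3f +/- %.3f" % (scores.mean(), scores.std()))
```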
Material
From January 2020 to March 2020, 73 COVID-19 cases confirmed by positive nucleic acid tests and 27 general pneumonia cases were enrolled in this study (ages ranging from 14 to 72 years). The chest CT scans of both COVID-19 and GP patients were retrospectively reviewed by two senior radiologists. Of the COVID-19 cases, twelve patients without obvious characteristics on CT images were excluded (negative rate 16.4%, 12/73). Finally, 61 confirmed COVID-19 cases and 27 general pneumonia cases were enrolled in this study.
The images were independently assessed by two radiologists. If the two radiologists disagreed, a senior radiologist was invited to review the pulmonary CT images and make the final decision. All CT images were acquired on a Siemens Sensation 16-slice spiral CT scanner (Siemens, Erlangen, Germany) and stored in Digital Imaging and Communications in Medicine (DICOM) format. The scan parameters were: tube voltage, 120 kV; tube current, automatic regulation; slice thickness, 1-2 mm; slice interval, 1-2 mm; pitch, 1.3; and collimation, 16 × 0.625 mm.
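The paper does not state which software was used to read the scans; as an assumed but typical approach, the following sketch loads a DICOM CT series with pydicom and rescales the stored pixel values to Hounsfield units.

```python
# Minimal sketch (assumed tooling; the paper does not specify how DICOM files were read):
# load a DICOM CT series and convert stored pixel values to Hounsfield units.
from pathlib import Path
import numpy as np
import pydicom

def load_ct_series(series_dir):
    """Read all DICOM slices in series_dir and return a (slices, rows, cols) HU volume."""
    slices = [pydicom.dcmread(p) for p in Path(series_dir).glob("*.dcm")]
    slices.sort(key=lambda s: float(s.ImagePositionPatient[2]))  # order along the z-axis

    volume = np.stack([s.pixel_array.astype(np.float32) for s in slices])
    slope = float(slices[0].RescaleSlope)        # rescale to Hounsfield units
    intercept = float(slices[0].RescaleIntercept)
    return volume * slope + intercept
```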