Differentiating novel coronavirus pneumonia from general pneumonia based on machine learning
Background: Chest CT screening is a crucial supplementary tool for diagnosing novel coronavirus pneumonia (COVID-19), owing to its high sensitivity and widespread use. Machine learning is adept at discovering intricate structures in CT images and has achieved expert-level performance in medical image analysis.
Methods: An integrated machine learning framework for differentiating COVID-19 from general pneumonia (GP) on chest CT images was developed and validated. Seventy-three confirmed COVID-19 cases were consecutively enrolled, together with twenty-seven confirmed general pneumonia patients, at Ruian People’s Hospital from January 2020 to March 2020. To classify COVID-19 accurately, region of interest (ROI) delineation based on ground glass opacities (GGOs) was performed before feature extraction. Then, 34 statistical texture features were extracted from the COVID-19 and GP ROI images: 13 gray level co-occurrence matrix (GLCM) features, 15 gray level-gradient co-occurrence matrix (GLGCM) features, and 6 histogram features. Because high-dimensional feature sets can degrade classification performance, the ReliefF algorithm was used to select features. The relevance of each feature was taken as the average of the weights calculated by ReliefF over n runs, and features with relevance larger than an empirically set threshold T were selected. After feature selection, the optimal feature set and four other feature combinations selected for comparison were applied to an ensemble of bagged trees (EBT) and four other machine learning classifiers: support vector machine (SVM), logistic regression (LR), decision tree (DT), and K-nearest neighbor (KNN) with Minkowski distance and equal weighting. All classifiers were evaluated using 10-fold cross-validation.
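The feature selection step described in the Methods can be sketched as follows. This is a minimal pure-Python illustration under stated assumptions, not the authors' implementation: the function names (`relieff_weights`, `select_features`), the toy data, the neighbor count `k`, the per-run sample size `m`, the number of runs, and the threshold value are all hypothetical. The sketch uses the two-class form of ReliefF, averages the weights over repeated runs, and keeps the features whose average relevance exceeds the threshold.

```python
import random


def relieff_weights(X, y, k, m, rng):
    """One ReliefF run (two-class form) on m randomly sampled instances."""
    n, d = len(X), len(X[0])
    # Per-feature value range, so each difference is scaled into [0, 1].
    spans = []
    for j in range(d):
        col = [row[j] for row in X]
        spans.append((max(col) - min(col)) or 1.0)

    def diff(a, b, j):
        return abs(X[a][j] - X[b][j]) / spans[j]

    def dist(a, b):
        return sum(diff(a, b, j) for j in range(d))

    weights = [0.0] * d
    for i in rng.sample(range(n), m):
        ranked = sorted((t for t in range(n) if t != i), key=lambda t: dist(i, t))
        hits = [t for t in ranked if y[t] == y[i]][:k]    # nearest same-class
        misses = [t for t in ranked if y[t] != y[i]][:k]  # nearest other-class
        for j in range(d):
            # Reward features that differ on nearest misses,
            # penalize features that differ on nearest hits.
            if misses:
                weights[j] += sum(diff(i, t, j) for t in misses) / (len(misses) * m)
            if hits:
                weights[j] -= sum(diff(i, t, j) for t in hits) / (len(hits) * m)
    return weights


def select_features(X, y, runs, k, m, threshold, seed=0):
    """Average ReliefF weights over several runs, then threshold them."""
    rng = random.Random(seed)
    totals = [0.0] * len(X[0])
    for _ in range(runs):
        w = relieff_weights(X, y, k, m, rng)
        totals = [a + b for a, b in zip(totals, w)]
    relevance = [t / runs for t in totals]
    selected = [j for j, r in enumerate(relevance) if r > threshold]
    return relevance, selected


# Hypothetical toy data: feature 0 separates the classes, feature 1 does not.
X_toy = [[0.10, 0.5], [0.20, 0.1], [0.15, 0.9], [0.05, 0.3],
         [0.90, 0.4], [0.80, 0.8], [0.95, 0.2], [0.85, 0.6]]
y_toy = [0, 0, 0, 0, 1, 1, 1, 1]

relevance, selected = select_features(
    X_toy, y_toy, runs=5, k=2, m=5, threshold=0.1, seed=0
)
print("relevance:", relevance)
print("selected feature indices:", selected)
```

On this toy input only the class-separating feature passes the threshold. In practice the threshold would be set empirically, as the Methods describe.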
Results and Conclusions: The classification accuracy (ACC), sensitivity (SEN), and specificity (SPE) of the proposed method were 94.16%, 88.62%, and 100.00%, respectively. The area under the receiver operating characteristic curve (AUC) was 0.99. The experimental results indicate that the EBT algorithm with statistical texture features based on GGOs achieved high transferability, efficiency, specificity, and sensitivity, along with impressive accuracy, in differentiating COVID-19 from general pneumonia. This can help inexperienced doctors diagnose COVID-19 more accurately and is essential for controlling the spread of the disease.
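For reference, the three reported metrics follow from the counts of a binary confusion matrix. The sketch below is illustrative only: the function name `binary_metrics`, the toy labels, and the choice of COVID-19 as the positive class (label 1) are assumptions, and the numbers it produces are unrelated to the study's results.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, and specificity from paired label lists.

    Assumes both classes occur in y_true, so no denominator is zero.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    return {
        "ACC": (tp + tn) / len(pairs),  # fraction of all cases classified correctly
        "SEN": tp / (tp + fn),          # fraction of positives detected
        "SPE": tn / (tn + fp),          # fraction of negatives correctly rejected
    }


# Hypothetical labels: 1 = COVID-19, 0 = general pneumonia.
y_true_toy = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred_toy = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

metrics = binary_metrics(y_true_toy, y_pred_toy)
print(metrics)
```

Under 10-fold cross-validation, one common choice is to compute these per fold and average them, or to pool the predictions from all folds first. The abstract does not say which was used.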
Figure 1
Figure 2
Figure 3
Figure 4
Posted 02 Jul, 2020