Predicting COVID-19 Disease Progression with Chest CT Images

Background: Some patients with mild COVID-19 deteriorate to moderate or severe disease within a week as the infection runs its natural course. Identifying these patients early and treating them promptly is crucial. Chest computed tomography (CT) has been shown to be useful in assisting the clinical diagnosis of COVID-19. In this study, machine learning was used to develop an early-warning model based on CT features for predicting which mild patients are at risk of malignant progression.
Methods: A total of 140 patients with mild COVID-19 were enrolled. Patients were divided into two groups according to whether malignant progression occurred after admission: an alleviation group and an exacerbation group. Clinical and laboratory data at admission, the first CT, and the follow-up CT at the critical stage were compared between the two groups using the chi-square test. CT feature data (distribution, morphology, etc.) were used to build prediction models with Fisher's linear discriminant method and unconditional logistic regression. The models were validated on a separate set of 40 cases, and the area under the ROC curve (AUC) was used to evaluate them.
Results: The models retained three CT features: distal air bronchogram, fibrosis, and the reversed halo sign. Notably, distal air bronchograms were less common in the alleviation group, whereas fibrosis and the reversed halo sign were more common. For unconditional logistic regression, the sensitivity, specificity, and Youden index were 86.1%, 92.6%, and 78.7%, respectively. For Fisher's linear discriminant analysis, they were 83.3%, 94.1%, and 77.4%. On the validation data, the generalization ability of the two models was consistent, with a sensitivity of 95.89%, a specificity of 100%, and a Youden index of 83.33%.
Conclusions: The machine learning model based on CT imaging features has high sensitivity for identifying mild patients who are likely to deteriorate into severe or critical cases, so that these patients can be treated in time and the pressure on medical resources can be relieved.

(COVID-19) on February 12, 2020. At the same time, the International Committee on Taxonomy of Viruses (ICTV) named the virus severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). One of the primary manifestations of COVID-19 is pneumonia, which greatly challenges the public medical system because of its high infectivity and mortality [1][2][3][4]. According to Chinese epidemic data, the majority of patients (81%) had mild disease [5]. Some mild patients can deteriorate to moderate or severe disease within a week with the natural progression of COVID-19. It is important to identify those patients in order to prevent malignant progression and reduce the mortality of COVID-19. However, most studies have focused on cross-sectional descriptions, comparisons of clinical, laboratory and CT imaging findings, or analysis of risk factors for death [6][7][8], and few have focused on predicting which mild COVID-19 patients are at risk of malignant progression.
Chest CT is a fast, accurate, and non-invasive imaging modality for COVID-19 [9]. The course of lesion evolution can be assessed from the distribution (subpleural, central), the morphology (ground glass opacification (GGO), mixed GGO, consolidation), distal vascular dilatation, etc. [10][11]. Machine learning (ML) is widely used to develop automatic predictors for pneumonia classification, but automatic predictors for COVID-19 disease progression are still in their infancy. Thus, to understand the benefits of ML in predicting COVID-19 disease progression, a study was designed to analyze multivariate heterogeneous data (clinical data and serial chest CT imaging) and to develop accurate and effective prediction models based on unconditional logistic regression and Fisher's linear discriminant analysis, two learning-based methods that have been widely adopted and have performed well in disease classification and prediction.
Therefore, the purpose of our study is to develop CT feature models that identify mild patients who are likely to deteriorate into severe cases, using unconditional logistic regression and Fisher's linear discriminant analysis respectively.

Ethics statement
This study was conducted in accordance with the Declaration of Helsinki. Informed consent was waived due to the retrospective nature of the study and the analysis of anonymous clinical data. Institutional Review Board approval was obtained.

Data collection
A total of 151 patients with confirmed COVID-19 infection were hospitalized in Chongqing city and Hubei province from December 21, 2019 to February 21, 2020. Patients who tested positive for COVID-19 nucleic acid were identified as confirmed cases. All clinical data on signs and symptoms, epidemiology (including history of recent exposure), underlying comorbidities, and laboratory results were retrospectively collected from electronic medical records. After cases with invalid information were excluded, 140 patients remained, as shown in the flowchart (Fig. 1). The severity of COVID-19 (severe vs. non-severe) at the time of admission was defined according to the American Thoracic Society guidelines for community-acquired pneumonia [12]. Only non-severe cases (blood oxygen saturation 90%) were included in the analysis. They received treatment focused on symptomatic and respiratory support, and systemic corticosteroid therapy was not applied. Of these patients, 62 had two follow-up CT scans and 78 had more than two. The follow-up interval ranged from 1 to 21 days. According to the results of clinical follow-up, the patients were divided into two groups: the alleviation group (n = 72) and the exacerbation group (n = 68).

Results
Demographic, clinical and CT characteristics
The 140 patients with mild COVID-19 pneumonia at admission included 92 men and 48 women, aged 18 to 82 years (45 ± 15). Sixty-eight patients (68/140, 48.6%) progressed to severe disease during hospitalization, while the remaining 72 (72/140, 51.4%) did not. The demographic characteristics of the 140 COVID-19 patients are shown in Table 1. The laboratory data and clinical manifestations, including leucocytes, lymphocytes, elevated C-reactive protein level, fever, cough, etc., are shown in Table 2. At the time of admission, in most patients the C-reactive protein level was clearly increased, the lymphocyte level was decreased, and the leucocyte level was normal. Fever and cough were the most common symptoms, whereas headache and the absence of obvious symptoms were rare. Serial CT imaging features of patients with and without severe progression are summarized in Table 3. In brief, compared with the exacerbation group, lesions in the alleviation group were mainly single or multiple in distribution. Fibrosis and the reversed halo sign were common in the alleviation group, while the distal air bronchogram sign was common in the exacerbation group.
All patients underwent non-contrast-enhanced chest CT scans. The images were reconstructed with a slice thickness of 1.5 mm. Lesions and imaging features were assessed in each lung segment of each patient. All imaging features were reviewed independently by two experienced radiologists (9 and 15 years of experience in chest CT) who were blinded to the clinical information, and any discrepancy was resolved by consulting a third radiologist (18 years of experience in chest CT).
The CT features of lung lesions were evaluated before and during follow-up. They included the distribution characteristics (peripheral, central, single, multiple, diffuse), morphology (GGO and mixed GGO, consolidation), and associated manifestations (distal air bronchogram sign, fibrosis, reversed halo sign, etc.). A lesion was considered peripheral if it was located in the outer one-third of the lung; otherwise, it was considered central [13].

Statistical analysis
All statistical analyses were performed using SPSS (version 18.0), with the significance level set at 0.05. Continuous variables were compared with the two independent samples t-test, and categorical variables with the chi-squared test. The discriminant models were built with unconditional logistic regression and Fisher's linear discriminant method. The discriminant equation was obtained, and the discriminant function was used to classify cases. The outputs of stepwise logistic regression and Fisher's linear discriminant analysis were then entered into MedCalc 19.2, and the area under the curve (AUC) of the two models was analyzed. Finally, a Z test was used to compare the prognostic efficacy of the two models.
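The Youden index reported alongside sensitivity and specificity is, by its standard definition, sensitivity + specificity - 1. The short sketch below is an illustration added here, not part of the original SPSS/MedCalc workflow; the function name `youden_index` is an assumption. It applies that definition to the sensitivity and specificity values stated in the abstract.

```python
def youden_index(sensitivity: float, specificity: float) -> float:
    """Return Youden's J = sensitivity + specificity - 1 (inputs are fractions in [0, 1])."""
    for value in (sensitivity, specificity):
        if not 0.0 <= value <= 1.0:
            raise ValueError("sensitivity and specificity must lie between 0 and 1")
    return sensitivity + specificity - 1.0


# Sensitivity and specificity as reported in the abstract for each model.
reported = {
    "unconditional logistic regression": (0.861, 0.926),
    "Fisher's linear discriminant": (0.833, 0.941),
}

for model, (sens, spec) in reported.items():
    print(f"{model}: J = {youden_index(sens, spec):.3f}")
```

The two printed values, 0.787 and 0.774, match the Youden indices of 78.7% and 77.4% given in the abstract.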

Establishment of prediction model
The machine learning models were built from the significant influencing factors, such as distribution, morphology, and the distal air bronchogram sign. The final models retained three variables: distal air bronchogram sign, fibrosis, and reversed halo sign. Notably, during follow-up, distal air bronchograms were less common in the alleviation group, whereas fibrosis and the reversed halo sign were more common (Fig. 2-4).
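To make concrete how a logistic model with these three binary CT features would score a patient, the following is a minimal sketch. The coefficient values are hypothetical, since the fitted coefficients are not reported in this text. Only their signs follow the findings above: positive for the distal air bronchogram sign, negative for fibrosis and the reversed halo sign. The function and dictionary names are likewise assumptions made for illustration.

```python
import math

# HYPOTHETICAL coefficients, for illustration only: the fitted values are not
# given in the text. Signs reflect the reported direction of each feature
# (bronchogram -> exacerbation; fibrosis and reversed halo -> alleviation).
HYPOTHETICAL_COEFFICIENTS = {
    "intercept": 0.0,
    "distal_air_bronchogram": 1.5,
    "fibrosis": -1.5,
    "reversed_halo_sign": -1.0,
}


def exacerbation_probability(features, coefficients=HYPOTHETICAL_COEFFICIENTS):
    """Logistic model: p = 1 / (1 + exp(-(b0 + sum(b_j * x_j)))), with x_j in {0, 1}."""
    z = coefficients["intercept"]
    for name, present in features.items():
        z += coefficients[name] * (1.0 if present else 0.0)
    return 1.0 / (1.0 + math.exp(-z))


# Example: follow-up CT shows a distal air bronchogram but no fibrosis or reversed halo.
patient = {"distal_air_bronchogram": True, "fibrosis": False, "reversed_halo_sign": False}
print(f"Predicted probability of exacerbation: {exacerbation_probability(patient):.2f}")
```

With real data, the coefficients would come from fitting the model to the training cohort, and the classification threshold would be chosen from the ROC curve rather than fixed at 0.5.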

Discussion
Assessing the progression of COVID-19 is crucial for treatment and control of the disease. In the early stage, it is mostly the lung stroma that is invaded, which manifests as thickening of the interlobular septa, angioedema dilatation, and GGO. As the disease progresses, the alveolar structure is gradually affected by inflammation, and alveolar edema, exudation, and bleeding may occur. On CT images, this appears as lung consolidation and mixed GGO. Some lesions exhibit the distal air bronchogram sign and thickening of the bronchial wall, while others display other signs. According to clinical data published in the recent literature, almost all patients with COVID-19 had characteristic CT features during the course of the disease, including the angioedema dilatation sign, the paving stone sign, etc. [14].
The machine learning models successfully extracted the imaging manifestations that influence the course of COVID-19. Three objective variables, namely fibrosis formation, distal air bronchogram sign, and reversed halo sign, were finally incorporated into the models and could serve as potential indicators for predicting disease outcome. In the follow-up CT images, 61.8% (42/68) of patients with exacerbation had the distal air bronchogram sign. Among the alleviated cases, fibrosis formation and the reversed halo sign were observed in 83.3% (60/72) and 63.9% (46/72), respectively. The pathological mechanism of fibrosis formation is that, when the immune response is intense or the walls of small blood vessels are damaged by edema, vessel wall permeability increases and plasma and fibrin exude; the fibrin can interweave into a net that limits the spread of pathogens and attenuates the lesion [15]. The reversed halo is a rare sign consisting of a focal ground glass area surrounded by a complete ring of consolidation. Surgical pathology confirmed that the central GGO was actually alveolar septal inflammation and cellular debris, while the surrounding alveoli tended to show mechanical inflammation. Some literature has suggested that the lesions turn out to be benign when their central part begins to be absorbed [16][17][18]. The bright bronchogram seen within an area of diseased lung tissue is known as the air bronchogram sign, which can be considered strong evidence of inflammatory lesions. It has been reported that the distal air bronchogram sign helps to distinguish lung lesions from pleural or mediastinal lesions: alveolar lesions can show the air bronchogram sign, whereas thoracic reef and mediastinal lesions do not [19][20].
In this study, we observed more distal air bronchograms in the follow-up CT images of patients with exacerbation, suggesting that the lesion is further aggravated as it expands from the septal injury to the alveoli. Fibrosis and the reversed halo sign are predictors of a benign outcome.
Unconditional logistic regression and Fisher's linear discriminant analysis are well-established classification methods in machine learning. They automatically derive a generalized description of a dataset from known historical data in order to predict future events [21][22]. The results of the two models are relatively satisfactory and consequently afford greater confidence in the assessment of COVID-19. In addition, the above CT features indicate that the lesion is in a critical period, and this change in trend helps clinicians to judge the therapeutic effect and predict the outcome of the disease. In the univariate analysis, some variables, such as the crazy paving pattern, were associated with the evolution of COVID-19. In the multivariate analysis, however, such an association can be influenced by other factors and turn out to be a "false association", which should be adjusted for in the analysis.
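For reference, the standard two-group form of Fisher's linear discriminant can be written as below. This is the textbook formulation, not an equation taken from the study. The symbols are notation assumed here: $\mathbf{x}$ for the vector of CT features, $G_0$ and $G_1$ for the alleviation and exacerbation groups, $\bar{\mathbf{x}}_k$ for the group means, and $c$ for the cut-off.

```latex
% Within-group scatter matrix pooled over the two groups
\mathbf{S}_W = \sum_{k \in \{0,1\}} \sum_{i \in G_k}
  (\mathbf{x}_i - \bar{\mathbf{x}}_k)(\mathbf{x}_i - \bar{\mathbf{x}}_k)^{\top}

% Discriminant direction: maximizes between-group separation relative to within-group scatter
\mathbf{w} \propto \mathbf{S}_W^{-1} (\bar{\mathbf{x}}_1 - \bar{\mathbf{x}}_0)

% Decision rule: assign a case to G_1 when its discriminant score exceeds the cut-off c
\mathbf{w}^{\top} \mathbf{x} > c
```

The cut-off $c$ is commonly placed midway between the projected group means, or chosen from the ROC curve of the discriminant scores.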
This study has several limitations. First, there was no long-term clinical follow-up, and CT examination data for discharged patients were lacking. Hence, the severity of pulmonary fibrosis at the time of its formation and its later changes need further observation. Second, severe cases were not included. The prognosis of severe patients can be affected by many factors.

Conclusions
In summary, the machine learning model based on CT imaging features provides a non-invasive and easy-to-use method for predicting the outcome of COVID-19 patients. Our future work will focus on mining richer spatial information with deep learning technologies to screen CT and clinical risk factors.
Prof. CW and WJ take responsibility for the conception and design of the study as the corresponding authors.
LQ designed the study and wrote the manuscript. QM collected and analyzed datasets from study patients. They contributed equally as first authors.
ZQ was in charge of statistics. CH, HF and LC revised the initial manuscript draft.
All authors read and approved the final manuscript.