Computed Tomography (CT)-Based Evaluation of Temporal Changes in Lung Abnormalities During the Recovery Stage in COVID-19 In-Hospital Patients

Objectives: To assess late-phase CT changes in COVID-19 patients and to identify factors predicting lung abnormalities in the late phase. Methods: We conducted a retrospective study of 42 patients (14 males, 28 females; age 65 ± 10 years) with COVID-19 admitted between February 7, 2020 and March 27, 2020. Only patients with at least 3 CT scans taken at least 3 weeks after initial symptom onset were included in the study. CT images were analyzed by 2 independent radiologists using two scoring systems: (1) area-based scoring (ABS); and (2) intensity-weighted scoring (IWS). Temporal changes in the average lung lesion burden were evaluated by the averaged area under the curve (AUC) of the CT score-time curve. Correlations between averaged AUCs and clinical characteristics were determined. Results: With the ABS system, temporal changes in CT findings of lung abnormalities during recovery (weeks 3 through 8) were variable (P=0.934). By contrast, the IWS system detected more subtle changes in lung abnormalities during the late phase of recovery in COVID-19 patients, with consistent week-to-week relative reductions in IWS (P=0.025). In assessing the correlation between averaged AUCs and clinical characteristics, strong relationships were observed with D-dimer and C-reactive protein (CRP) levels on admission, with hazard ratios (HR) (95% CI) of 5.32 (1.25-22.6) (P=0.026) and 1.05 (1.10-1.09) (P=0.017), respectively. Conclusion: Our results suggest that an intensity-weighted rather than area-based scoring system is more sensitive for detecting subtle temporal CT changes in COVID-19, with D-dimer and CRP levels on admission being predictive of the time course of late-phase recovery from the disease.


Background
Coronavirus disease 2019 (COVID-19), caused by severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) (1), broke out in Wuhan, China in December 2019 and has been declared a worldwide pandemic by the World Health Organization (WHO), with a total of 20,525,924 cases including 1,622,333 deaths worldwide reported as of December 15, 2020 (2). Chest computed tomography (CT) imaging is critical for clinical evaluation, monitoring progression, and guiding management of COVID-19 patients (3). The characteristic patterns of the early phase (i.e., symptom onset to ~3 weeks) have been reviewed (4,5), with lung abnormalities including ground-glass opacity, consolidation, and crazy paving commonly observed, after which lesions begin to decline 2-3 weeks after symptom onset (6,7). However, although many patients are discharged within 3 weeks of symptom onset, lung abnormalities may persist and recovery may be incomplete (8), and the later phase of disease (i.e., ≥ 3 weeks post-symptom onset) has yet to be fully characterized.
In the present retrospective study, a semi-quantitative CT grading system was used to evaluate temporal changes and the degree of recovery of lung abnormalities during the late phase (weeks 3-8) of disease, with the time course of lung abnormalities linked to clinical characteristics and disease severity. Interpreting the long-term patterns of recovery of lung abnormalities in COVID-19 patients using CT imaging may contribute greatly to clinical decision-making on patient discharge and follow-up.

Materials And Methods
Patient screening, management and recording
A total of 63 in-patients (31 males and 32 females) diagnosed with COVID-19 at China-Japan Union Hospital of Jilin University and Tongji Hospital of Huazhong University of Science and Technology, who were admitted between February 7, 2020 and March 27, 2020, were initially screened as candidates for the study. The retrospective cohort was restricted to patients for whom at least 3 CT examinations taken ≥ 3 weeks after COVID-19 symptom onset were available, resulting in 42 patients (14 males and 28 females, aged 65 ± 10 years) being included in the final study to assess temporal changes in lung abnormalities (Fig. 1). The Community-Acquired Pneumonia (CAP) Symptom questionnaire (CAP-Sym 18) (9) was used to assess symptom changes during the study period; scores were calculated from the medical records by the investigators, who were also the attending doctors. The CAP-Sym 18 system comprises 18 items, and the total score can exceed 18. The discharge criteria for our patient cohort were: 1) Complete recovery from symptoms as judged by the attending doctor, with a CAP-Sym 18 score < 5; 2) Two negative RNA tests for COVID-19 separated by at least 24 hours; and 3) Evidence of complete control of comorbid diseases. The study was approved by the Ethics Commission of Tongji Hospital (TJ-IRB20200345) and the Ethics Commission of China-Japan Union Hospital of Jilin University (2020032622).

CT Scan Protocol
All scans were performed with the patient in the supine position at end-inspiration. One of three CT scanners (uCT780, United Imaging; Somatom Force, Siemens Healthcare; or NeuViz 128) was used for chest CT examinations. No standard CT protocol was applied owing to the retrospective design of our study, but image quality was evaluated by both technologists and radiologists. CT images with a 5-mm slice thickness were used in our analysis.

CT Data Interpretation And Analysis
Image collection and analysis were performed using the PACS database system. All CT images were interpreted by two different radiologists, each with over 10 years of experience. Images were independently reviewed and the average of the two scores was determined. To settle disagreements between the interpretations of the two primary radiologists, a third senior radiologist with 20 years of experience adjudicated a final decision. Given the retrospective nature of the study, no negative controls were examined.
For each of the 42 patients, CT scans were evaluated using two semi-quantitative scoring systems, which were compared to assess the severity of lung abnormalities and the time course of recovery: (1) area-based scoring (ABS); and (2) intensity-weighted scoring (IWS). Both scoring systems were modified from previously reported systems (10,11). Specifically, for the ABS system, the extent of involvement of each of the five lobes of the lung was evaluated semi-quantitatively and assigned a score (0-5) based on the percentage of parenchymal opacification: score 0 for 0%, score 1 for less than 5%, score 2 for 5%-24%, score 3 for 25%-49%, score 4 for 50%-74%, and score 5 for greater than 75% involvement (for a maximal severity score of 25 points). For the IWS system, intensity was graded from grade 1 to grade 4: Grade 1 corresponded to mild and pure ground-glass opacity (GGO), with GGO defined as hazy increased lung attenuation with preserved bronchial and vascular margins; Grade 2 corresponded to GGO with a crazy-paving (CP) pattern; Grade 3 corresponded to GGO with mixed consolidation (GMC); and Grade 4 corresponded to complete consolidation. Consolidation was defined as opacity obscuring the margins of vessels and airway walls (12). For each lobe, the total score was calculated from the following equation: lobe score = (extent score of grade 1 × 1) + (extent score of grade 2 × 2) + (extent score of grade 3 × 3) + (extent score of grade 4 × 4). The sum of the five lobe scores was recorded as the final lesion score of the patient, and the maximal score for a given patient was 100 points.
Representative CT images and score assessments for each grade are shown in Fig. 2(A-D). In addition, other patterns of lung abnormalities, including lesion distribution, scattering, and pleural changes, shown in the CT images were also recorded.
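The two scoring rules described above can be illustrated with a short sketch. It is not the authors' software; the function names, the per-lobe data layout, and the example percentages are hypothetical. Two points the text leaves open are treated as assumptions: exactly 75% involvement is assigned score 5, and the ABS extent of a lobe is taken from the summed percentage across all grades.

```python
# Illustrative sketch of the ABS and IWS scoring rules (hypothetical names and data).
GRADES = (1, 2, 3, 4)  # 1 = pure GGO, 2 = GGO + crazy paving, 3 = GGO + consolidation, 4 = consolidation


def extent_score(percent):
    """Map percentage of opacification in one lobe to an extent score of 0-5."""
    if percent <= 0:
        return 0
    if percent < 5:
        return 1
    if percent < 25:
        return 2
    if percent < 50:
        return 3
    if percent < 75:
        return 4
    return 5  # assumption: exactly 75% is scored 5


def abs_score(lobes):
    """Area-based score: sum of per-lobe extent scores (maximum 25)."""
    total = 0
    for grade_percents in lobes.values():
        # Assumption: total involvement is the sum over grades, capped at 100%.
        involved = min(100.0, sum(grade_percents.values()))
        total += extent_score(involved)
    return total


def iws_score(lobes):
    """Intensity-weighted score: extent score of each grade times its grade (maximum 100)."""
    total = 0
    for grade_percents in lobes.values():
        for grade in GRADES:
            total += extent_score(grade_percents.get(grade, 0.0)) * grade
    return total


# Hypothetical patient: percentage of each lobe occupied by each intensity grade.
example_lobes = {
    "RUL": {1: 30.0},
    "RML": {},
    "RLL": {1: 10.0, 3: 20.0},
    "LUL": {4: 80.0},
    "LLL": {2: 4.0},
}

print("ABS:", abs_score(example_lobes))
print("IWS:", iws_score(example_lobes))
```

In this layout a lesion that fades from consolidation to pure GGO without shrinking leaves the ABS unchanged but lowers the IWS, which is the behaviour the comparison in this study relies on.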
To assess the link between clinical biomarkers and the extent of, and temporal changes in, lung abnormalities during the late phase of recovery, CT imaging data were compared with individual patient characteristics on admission, including elevated inflammatory indices such as C-reactive protein (CRP), white blood cell (WBC) count, and D-dimer, which may be predictive of disease progression and may assist in clinical decision-making.
To evaluate temporal changes in the lesion burden during late-phase recovery, an average lesion burden during recovery was calculated as follows: after first generating a CT score-time plot, the area under the curve (AUC) of the mean lesion burden was calculated per assessment week (i.e., weeks 3-8) to reveal changes in the intensity of lung lesions during the late-phase recovery period (shown in Fig. 2E). Lastly, in order to evaluate the rate of change in CT score over time, we defined the relative change in CT score as follows: relative CT score change of week n = (CT score of week n+1 - CT score of week n) / CT score of week n.
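As a worked illustration of these two quantities, the sketch below computes an averaged AUC and the week-to-week relative changes for one hypothetical patient. The text does not state how the area was integrated, so the trapezoidal rule used here is an assumption, as are the function names and the example scores.

```python
# Illustrative sketch: averaged AUC and relative week-to-week change (hypothetical names and data).

def averaged_auc(weeks, scores):
    """Trapezoidal area under the score-time curve divided by the time span covered."""
    if len(weeks) != len(scores) or len(weeks) < 2:
        raise ValueError("need at least two (week, score) pairs of equal length")
    area = 0.0
    for i in range(1, len(weeks)):
        # Assumption: linear interpolation between consecutive scans (trapezoidal rule).
        area += (scores[i] + scores[i - 1]) / 2.0 * (weeks[i] - weeks[i - 1])
    return area / (weeks[-1] - weeks[0])


def relative_changes(scores):
    """(score[n+1] - score[n]) / score[n] for consecutive weeks; None when score[n] is 0."""
    changes = []
    for current, following in zip(scores, scores[1:]):
        if current == 0:
            changes.append(None)  # relative change is undefined for a zero baseline
        else:
            changes.append((following - current) / current)
    return changes


# Hypothetical IWS scores for one patient at weeks 3 to 6.
example_weeks = [3, 4, 5, 6]
example_scores = [40, 30, 20, 20]

print("averaged AUC:", averaged_auc(example_weeks, example_scores))
print("relative changes:", relative_changes(example_scores))
```

A negative relative change indicates a falling score, so a run of negative values corresponds to the consistent week-to-week reductions reported for the IWS system.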

Statistical analysis
Continuous variables were presented as mean ± standard deviation (SD) for normally distributed data or median (range) for non-normally distributed data. Categorical variables were presented as n (%). The Kruskal-Wallis test was used to compare non-normally distributed data (i.e., symptom scores). The chi-square (χ²) test was used to compare weekly ABS and IWS changes by testing for differences in the proportion of individuals with a score change ≤ 10%. Violin plots were generated to visualize the distribution and probability density of area-based and intensity-weighted scores during the study period. Individuals were grouped into a high-level AUC group and a low-level AUC group based on whether their average AUC ranked in the higher 50% or the lower 50%. A Spearman correlation test was used to determine the correlation between the average AUC level and critical clinical factors. Logistic regression analysis was used to identify predictive factors among these clinical markers. A P-value < 0.05 was considered statistically significant. Statistical analyses were done using SPSS 24 (IBM Corporation) software.
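The grouping and correlation steps can be sketched in plain Python as follows. The analyses in this study were run in SPSS; the code below is only an illustration with hypothetical function names and made-up values. Assigning values equal to the median to the low group is an assumption, since the text does not say how ties at the cut-off were handled.

```python
# Illustrative sketch: median split of averaged AUC and Spearman rank correlation
# (hypothetical names and data; not the SPSS procedure used in the study).
import math
import statistics


def median_split(values):
    """Label each value 'high' if above the median, otherwise 'low' (ties go to 'low')."""
    cut = statistics.median(values)
    return ["high" if v > cut else "low" for v in values]


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their rank positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2.0 + 1.0
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation computed on the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Made-up averaged AUC values and admission CRP levels for six patients.
example_auc = [10, 20, 30, 40, 50, 60]
example_crp = [5, 8, 7, 20, 35, 30]

print("groups:", median_split(example_auc))
print("Spearman rho:", round(spearman_rho(example_auc, example_crp), 4))
```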

Results
Demographic, clinical and image findings on admission
A total of 63 patients who were treated between February 7, 2020 and March 27, 2020 were screened. Among them, 21 patients were excluded (see Methods and Fig. 1), resulting in a total of 42 patients being included in the final analysis. The median number of CT assessments per patient was 4 (range: 2-5), while the median follow-up period was 4 weeks (range: 3-6 weeks). The major demographic characteristics, clinical outcomes and laboratory findings of the cohort on admission are summarized in Table 1. Consistent with previous studies (4,5), the majority of patients presented with fever (38/42, 90.5%), cough (39/42, 92.9%), and dyspnea (29/42, 69.0%) on admission, with an average CAP-Sym 18 score of 18.7 ± 8.4. Moreover, the most frequent laboratory abnormalities on admission were elevations in white blood cell, lymphocyte, and platelet counts, as well as in D-dimer (32/42, 76.2%) and C-reactive protein (CRP) (38/42, 90.5%) levels. Lastly, high-sensitivity cardiac troponin I was elevated in a subset of patients (8/42, 19.1%).
On admission, a number of lung abnormalities were detected in chest CT images among all patients. As outlined in Table 2, patients typically presented with multi-lobar involvement (more than one lobe), with the majority of patients showing lesions in 4 (23.8%) or 5 (61.9%) lobes. Consistent with previous studies (4,5), lung abnormalities were peripherally distributed (31.0%), with a diffuse (50.0%) and/or multifocal pattern (97.6%) frequently observed on admission. In addition to the prominent finding of ground-glass opacities (GGO), a crazy-paving pattern of lung abnormalities was common (81.0%). Atypical CT manifestations, including pleural changes, microvascular dilation, and subpleural transparent lines (45.2%), were less commonly observed.
To assess temporal changes in patient symptoms over the study period, CAP-Sym 18 scores were evaluated from week 3 to week 8. As shown in Fig. 3, CAP-Sym 18 scores were highly variable in weeks 3 and 4, but showed a progressive decline from week 3 (median score 8, range = 0-20) to week 8 (median score 1, range = 0-4), consistent with patient recovery from COVID-19 and hospital discharge throughout the study period.
Course of CT image pattern changes after 3 weeks from symptom onset
To determine temporal changes in CT findings from at least 3 weeks after symptom onset to patient discharge, CT scans were analyzed by two experienced radiologists using two different scoring systems based on the area (i.e., ABS analysis) and intensity (i.e., IWS analysis) of lung abnormalities. Temporal changes in the two scores as well as the IWS grades (1-4) are shown in Fig. 4. Specifically, as shown in Fig. 4B, while individual IWS grades, namely Grade 1 (GGO), Grade 2 (GGO with crazy paving), Grade 3 (GGO with consolidation), and Grade 4 (total consolidation), appeared consistent between weeks 3 and 4, there was a clear reduction in Grades 2 through 4 over time, consistent with resolution of more severe lung abnormalities. In comparing the ABS and IWS approaches (Fig. 4A), IWS scores decreased consistently over time in all patients, indicative of progressive resolution of lung abnormalities that aligned with symptom improvement. By contrast, rather than showing a reduction in lesion areas, ABS scores unexpectedly remained relatively stable, developing a more binary distribution in which scores remained elevated in a number of patients. Specifically, we found that in many patients the intensity of pulmonary lesions declined throughout the study period without any change in lesion area. To determine the efficacy of the ABS versus IWS systems for tracking temporal changes in lung abnormalities within a given patient, we compared the relative (week-to-week) changes in ABS and IWS scores. As shown in Fig. 4C, the IWS system showed more dramatic and progressive changes in lung abnormalities from week 3 after symptom onset onward compared to the ABS system, suggesting that the IWS system is more sensitive for detecting improvements in lung abnormalities. Thus, changes in lung abnormality scoring with the IWS system closely track improvements in symptom scores, whereas the ABS system shows an apparent disconnect.
Given the apparently greater sensitivity of the IWS system for assessing temporal changes in lung abnormalities, an area under the curve (AUC) analysis was undertaken to further determine the average lesion burden during recovery, dividing the area under the IWS-time curve by the time covered. As shown in Fig. 5, the distribution of averaged AUC scores was 26.9 ± 16.1 (points per week), suggesting that variability in the extent of lung lesions persisted into the late phase of recovery. Specifically, some patients still exhibited strong pulmonary pathology at weeks 7 and 8, the long-term impact of which requires further study. Next, we sought to determine whether temporal changes in mean lesion burden (AUC score) were linked to clinical characteristics on admission. As shown in Table 3, Spearman correlations revealed significant relationships between the average AUC level of a patient and white blood cell count (WBC) (p = 0.0435), lymphocyte percentage (Lym%) (p = 0.0073), C-reactive protein (CRP) (p = 0.0004), and D-dimer levels (p = 0.0039) on admission. To assess whether these four factors were predictive of mean lesion burden in our patient cohort, a logistic regression analysis was undertaken (Fig. 6), with individuals first grouped into a high-level AUC group and a low-level AUC group based on whether their average AUC ranked in the higher 50% or the lower 50%. Our results revealed that only D-dimer and CRP levels, with hazard ratio (HR) values of 5.32 and 1.05, respectively, were predictive of the degree of lung abnormalities in late-phase recovery. In spite of elevated inflammatory indices, no evidence of persistent interstitial fibrosis on CT images was observed during the late phase in our patient cohort.
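To make the regression step concrete, the sketch below fits a single-predictor logistic regression by Newton-Raphson in plain Python and reports the exponentiated slope with a Wald 95% confidence interval. In standard logistic regression terminology this exponentiated slope is an odds ratio per unit of the predictor. The study itself used SPSS with several candidate markers; the function names, the iteration count, and the data here are assumptions for illustration only and do not reproduce the reported values.

```python
# Illustrative sketch: one-predictor logistic regression via Newton-Raphson
# (hypothetical names and made-up data; not the SPSS analysis used in the study).
import math


def _score_and_information(b0, b1, x, y):
    """Gradient of the log-likelihood and the Fisher information entries."""
    g0 = g1 = h00 = h01 = h11 = 0.0
    for xi, yi in zip(x, y):
        p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
        w = p * (1.0 - p)
        g0 += yi - p
        g1 += (yi - p) * xi
        h00 += w
        h01 += w * xi
        h11 += w * xi * xi
    return (g0, g1), (h00, h01, h11)


def fit_logistic(x, y, iterations=25):
    """Return intercept, slope, and the standard error of the slope."""
    b0 = b1 = 0.0
    for _ in range(iterations):
        (g0, g1), (h00, h01, h11) = _score_and_information(b0, b1, x, y)
        det = h00 * h11 - h01 * h01
        # Newton step: solve the 2x2 system (information matrix) * step = gradient.
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    _, (h00, h01, h11) = _score_and_information(b0, b1, x, y)
    se_b1 = math.sqrt(h00 / (h00 * h11 - h01 * h01))
    return b0, b1, se_b1


def ratio_with_ci(b1, se_b1, z=1.96):
    """Exponentiated slope with a Wald confidence interval (z = 1.96 for 95%)."""
    return math.exp(b1), math.exp(b1 - z * se_b1), math.exp(b1 + z * se_b1)


# Made-up admission marker values and high-AUC group membership (1 = high, 0 = low).
example_marker = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
example_high_auc = [0, 0, 1, 0, 1, 1, 0, 1]

b0, b1, se_b1 = fit_logistic(example_marker, example_high_auc)
ratio, ci_low, ci_high = ratio_with_ci(b1, se_b1)
print("ratio per unit: %.2f (95%% CI %.2f-%.2f)" % (ratio, ci_low, ci_high))
```

A ratio above 1 whose interval excludes 1 would indicate that higher admission values of the marker are associated with membership in the high-level AUC group.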

Discussion
Chest CT images are critical for the evaluation of lung abnormalities in COVID-19 patients, especially in patients with severe disease, and the early phase (≤ 3 weeks from symptom onset) of disease progression has been well characterized (4,5). To our knowledge, this is the first study focusing on late-phase (≥ 3 weeks) temporal changes in lung CT patterns in COVID-19 patients. These findings should enhance our understanding of the time course and pattern of lung recovery, which may extend beyond patient discharge and symptom resolution. The key findings of our study are twofold. First, although symptom scores declined after week 3, the two CT scoring systems used to assess changes in lung abnormalities, ABS and IWS, revealed differential resolution of lesion burden over time. Specifically, the IWS system tracked the resolution of disease symptoms more closely than area-based scoring, suggesting that intensity-weighted scoring is more sensitive for evaluating temporal changes in lung abnormalities during late-phase recovery in COVID-19 patients. Second, D-dimer and C-reactive protein (CRP) levels on admission were strong independent risk factors for high-level lesion burden during late-phase recovery, indicating that a strong inflammatory response to the disease may be predictive of long-lasting lesion burden in COVID-19 patients.
Chest CT imaging has become one of the most important approaches for assessing COVID-19 severity and progression and for guiding effective management (3). The specific chest CT changes of COVID-19 patients have been described by several previous studies (3,13,14), with lung abnormalities including ground-glass opacities, consolidation, reticular pattern and crazy-paving patterns found to be common hallmarks from symptom onset into peak illness in COVID-19 patients (4,5). Our findings confirm that these typical patterns persist throughout the late phase (≥ 3 weeks from symptom onset) of the disease, even appearing in some scans in week 8. While these lung patterns were consistent between early- and late-phase disease, their presence alone was not indicative of the time course of resolution of lung abnormalities from symptom onset. As different CT patterns and the intensity and extent of lesion burden would indicate different pathophysiological changes in the lungs (15)(16)(17), taking these factors into consideration is necessary for assessing the severity of COVID-19-mediated lung abnormalities. The semi-quantitative visual CT grading approaches used in the current study have been previously described, with both area-based (ABS) (8,18,19) and intensity-weighted (IWS) (10) grading systems having been used to assess various pulmonary abnormalities. While the ABS system has been applied to COVID-19 patients (8), the present study is the first to use these grading systems to evaluate pulmonary CT changes in the later stages of recovery in COVID-19 patients. It is clear from our findings that these two techniques are not equivalent when evaluating temporal changes in lung abnormalities throughout recovery. Specifically, the IWS approach aligned with symptom presentation and showed progressive reductions in lesion burden. By contrast, the ABS approach suggested less recovery over time, with the pattern and time course of changes in many patients lagging behind the recovery implied by symptom scores and the IWS approach. Given that the IWS approach captures the specific nature of the lung abnormalities in its scoring of lesion extent and intensity, this method appears to be more sensitive for detecting subtle lesion resolution and to assess recovery more adequately, especially in patients in whom lesion area remains unchanged.
The time course of resolution of pulmonary lesion burden during the late phase of recovery is of great concern in COVID-19 patients, especially given the potentially long-term consequences in elderly cohorts that are disproportionately impacted by the disease (20). Although there was no evidence of delayed virus elimination in patients with higher lesion burden during recovery in the current study, our data suggest that elevations in inflammatory markers on admission, specifically D-dimer and CRP, may be predictive of a higher and more prolonged lesion burden in COVID-19 patients. The link between elevated lesion burden and inflammatory markers is not surprising, especially in light of established links between peak CRP levels and ground-glass opacities in patients with severe acute respiratory syndrome (11). Moreover, late pulmonary fibrosis, which is a severe complication after recovery from other coronavirus infections, is highly linked to inflammation (21)(22)(23); thus there is a concern that elevated pro-inflammatory markers and high lesion burden on admission may be a harbinger of pulmonary fibrosis in the long term. While there is currently no evidence of severe fibrosis in recovered COVID-19 patients, including our own patient cohort, the long-term impact of SARS-CoV-2 is unclear owing to the lack of long-term follow-up of patients after discharge. Nonetheless, the results of our AUC analysis suggest that recovery from this disease is slower than from other forms of viral pneumonia, which typically resolve within 2-3 weeks (24)(25)(26), with lung abnormalities detectable in some patients 8 weeks after symptom onset. Hence, given the paucity of studies examining the long-term pulmonary effects of COVID-19 (27), and as our understanding of the complexity and pathophysiology of COVID-19 continues to evolve, late-phase follow-up should be indicated, especially in patients with high levels of inflammatory markers on admission.

Study Strengths And Weaknesses
To the best of our knowledge, our study is the first investigation to formally document late-phase (≥ 3 weeks after symptom onset) CT findings in COVID-19 patients, which is essential to further our understanding of the pathophysiology of COVID-19 and to evaluate and manage the recovery phase of the disease. The retrospective design of our study may influence the accuracy of symptom scoring and lead to selection bias in our patient cohort. However, the infectious nature of COVID-19 and the sudden onset of the disease outbreak in our region made a prospective study design unfeasible. Moreover, all the patients treated by our group were analyzed to minimize selection bias, and symptom scoring was performed by 2-3 independent attending physicians to ensure reliability.

Conclusion
The present study analyzed chest CT images from the late phase (3-8 weeks post-symptom onset) of COVID-19, showing that an intensity-weighted scoring approach is more sensitive for assessing temporal resolution of lung abnormalities, with the intensity and extent of lesion burden predicted by elevations in inflammatory markers on admission. As our understanding of the pathophysiology of COVID-19 continues to evolve at a rapid pace, our findings suggest that more long-term CT assessments will enhance our insight into disease progression, patient management, and clinical improvement.