Lung Ultrasound in COVID-19 Patients: Characteristics and Factors Affecting its Diagnostic Accuracy

Background: Although chest computed tomography (CT) is the gold standard for diagnosing the majority of lung conditions, its use in screening patients for coronavirus disease 2019 (COVID-19) pneumonia is not recommended. Lung ultrasound (LUS) is an alternative modality. This study aimed to investigate the characteristics and diagnostic accuracy (DA) of bedside LUS for lung lesions in patients with COVID-19 and to determine the factors influencing the DA of LUS. Methods: A total of 330 patients with COVID-19 admitted to the hospital between February and March 2020 were retrospectively recruited. The imaging characteristics of LUS and CT scans were analysed and summarized. DA was calculated using chest CT as the reference standard. Furthermore, a binary logistic regression analysis was conducted to investigate the factors influencing the DA of LUS for interstitial syndrome. Results: The ultrasound findings of COVID-19 patients presented mainly as B lines (195/330, 59.1%), unsmooth or interrupted pleural lines (118/330, 35.8%), consolidation lesions (74/330, 22.4%), and pleural effusion (11/330, 3.33%). Compared with chest CT, the DA of LUS for interstitial syndrome, consolidation, pleural effusion, and pleural thickening was 0.821, 0.927, 0.988, and 0.863, respectively. The diagnostic coincidence rates of LUS and chest CT in the mild, common, severe, and critical groups were 93%, 68.6%, 100%, and 100%, respectively. According to the results of the binary logistic regression, sex, disease duration, experience of the doctor, and involved lobes were independent predictors of the DA for interstitial syndrome. Conclusions: LUS had good diagnostic performance for COVID-19 pneumonia but showed a relatively low DA for interstitial syndrome. Female sex, doctors with less experience, long disease duration, and lesions limited to the upper or lower lobes may decrease the DA.


Background
The global coronavirus disease 2019 (COVID-19) pandemic, first confirmed at the end of 2019, has entered the stage of regular epidemic prevention and control. Because chest computed tomography (CT) is the gold standard for the majority of lung pathologies [1], many authors have proposed it as the first assessment technique for COVID-19 infection in epidemic areas [2]. However, the American College of Radiology does not recommend the use of chest CT to screen patients for COVID-19 pneumonia and has stated that CT scanning should be reserved for symptomatic patients with specific clinical indications [3]; performing CT scans indiscriminately would place a huge burden on radiology departments.
Lung ultrasound (LUS) penetrates the air barrier and plays an important role in the diagnosis and management of many pulmonary diseases, including pulmonary oedema, pneumothorax, and interstitial fibrosis [4][5][6][7]. Compared with CT, LUS is a non-invasive, low-cost, and radiation-free imaging modality that allows repeated imaging, reducing the exposure of healthcare workers and other patients to potential infection, as well as alleviating the potential risk of hemodynamic and respiratory instability when patients are transferred to the radiology department [8]. Many studies have suggested that LUS has high accuracy and specificity, comparable to those of CT scans, in diagnosing lung pathologies [4,5,9]. Therefore, LUS can be used easily and immediately as a bedside tool and may be a major game-changer in the diagnosis and treatment of COVID-19.
Several articles describe LUS characteristics in COVID-19 patients, but no reports exist on the factors influencing the diagnostic accuracy (DA) of LUS. In this study, using chest CT as the reference standard, we systematically assessed the LUS manifestations of COVID-19 pneumonia, analysed the DA of LUS, and determined the factors influencing the DA of LUS.

Materials And Methods
This retrospective study was approved by the Institutional Review Board. The need for informed consent was waived given the retrospective nature of the study. From February 15, 2020, to March 20, 2020, patients with confirmed COVID-19 at four hospitals in Wuhan, China were selected as research subjects. The inclusion criteria were as follows: (1) patients with positive results for the COVID-19 nucleic acid test (reverse transcription-polymerase chain reaction assay) using respiratory secretions obtained by nasopharyngeal swabs or oropharyngeal swabs and (2) patients who underwent both chest CT and bedside ultrasound examination (interval ≤ 72 h). The exclusion criteria were (1) patients aged < 18 years, (2) patients whose chest CT and bedside ultrasound images were of poor quality and could not be used for image analysis, and (3) patients whose baseline demographic and clinical characteristics could not be obtained.
According to the diagnostic criteria for novel coronavirus pneumonia issued by the National Health Commission, People's Republic of China [10], the patients were divided into mild, common, severe, and critical types.
First, using chest CT as the gold standard, the diagnostic performance of LUS in patients with COVID-19 pneumonia of different classifications was analysed. Second, the possible reasons for inconsistencies between LUS and CT were analysed. Finally, the clinical data and the CT and LUS images were analysed to determine the factors influencing diagnostic accuracy in patients with COVID-19.

Data collection
Data were obtained from the hospital case notes. Clinical information on each patient, including patient demographics, severe acute respiratory syndrome coronavirus-2 testing, symptoms and the timing of symptom onset, and main comorbidities, was collected. In the four hospitals, 508 patients with positive results for the COVID-19 nucleic acid test underwent both lung CT and LUS; 58 patients were excluded due to poor image quality for CT or LUS, 52 patients were excluded because of insufficient clinical characteristics, and 68 patients were excluded because the interval between chest CT and LUS examination exceeded 72 h. Finally, 330 patients were included in the analysis. Figure 1 presents the flowchart of the study.
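The patient flow above can be checked with simple arithmetic. The following Python sketch (variable names are illustrative and not taken from the study) subtracts the three exclusion groups from the 508 eligible patients to arrive at the analysed cohort.

```python
# Patient flow as reported in the text; variable names are illustrative.
screened = 508  # nucleic-acid-positive patients with both chest CT and LUS

exclusions = {
    "poor CT or LUS image quality": 58,
    "insufficient clinical characteristics": 52,
    "CT-LUS interval > 72 h": 68,
}

# Patients remaining after all exclusions are applied.
included = screened - sum(exclusions.values())
print(f"Included in the analysis: {included}")
```

The result matches the 330 patients reported as included.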

Technique and equipment
All patients underwent simultaneous LUS and chest CT scans (defined as an exam interval ≤ 72 h). Bedside LUS examinations were performed using an ultrasound scanner (Mindray M8 Expert, Mindray Medical Systems, Shenzhen, China), equipped with a C5-2 probe (bandwidth: 5-2 MHz) and an L9-3 probe (bandwidth: 9-3 MHz). CT examinations were performed with a 1.5-mm slice thickness on two commercial multi-detector CT scanners (Philips Ingenuity Core 128, Philips Medical Systems, Best, the Netherlands; SOMATOM Definition AS, Siemens Healthineers, Germany).

CT review
All CT images were reviewed by two radiologists with more than 5 years of experience. Imaging was reviewed independently, and final decisions were reached by consensus. No negative controls were examined, and no blinding was performed. For each patient, the CT image was evaluated for the following characteristics: (1) the presence of ground-glass opacities, (2) the presence of consolidation, (3) the number of lobes affected by ground-glass or consolidative opacities, and (4) the presence of pleural effusion.

Lung ultrasound
Within 72 h of CT scanning, bedside LUS was performed by 10 ultrasound-certified doctors who were blinded to the chest CT findings. The operators wore adequate personal protective equipment. Images were saved using ultrasound software and reviewed by operators after the examination to avoid unnecessarily prolonged contact with patients. Six regions of each hemithorax (anterior-superior, anterior-inferior, lateral-superior, lateral-inferior, posterior-superior, and posterior-inferior) were scanned for the presence of B lines, consolidations, pleural line abnormalities, and pleural effusion, defined according to the recommendations of the International Consensus Conference on Lung Ultrasound [5]. The anterior-superior, lateral-superior, and posterior-superior regions correspond to the upper lobes on CT, while the anterior-inferior, lateral-inferior, and posterior-inferior regions correspond to the lower lobes on CT. B lines were defined as hyperechoic lines extending vertically from the pleural line to the opposite edge of the screen, indicating alveolar-interstitial oedema. Within each region, having two or fewer B lines in the absence of consolidation was considered normal [11]. Consolidation could be demonstrated by hypoechoic or hyperechoic artifacts that disrupted the pleural line.
Consolidation adjacent to pleural effusion could be a consequence of pulmonary atelectasis and was thus not considered to represent pulmonary infiltrates.
Two observers evaluated the LUS, and each was blinded to the results of the other and the patients' clinical data.
Disagreements were resolved by consensus. An independent observer recorded the demographics, clinical data, initial symptoms, and comorbidities.
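The region-level reading rule described in this section can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `RegionFinding` record, the helper names, and the example findings are hypothetical and do not come from the study; only the twelve-region layout, the "two or fewer B lines without consolidation" rule, and the superior/inferior mapping to upper/lower lobes follow the text.

```python
from dataclasses import dataclass

ZONES = (
    "anterior-superior", "anterior-inferior",
    "lateral-superior", "lateral-inferior",
    "posterior-superior", "posterior-inferior",
)
# Six regions per hemithorax, hence twelve regions per patient.
REGIONS = [f"{side} {zone}" for side in ("left", "right") for zone in ZONES]


@dataclass
class RegionFinding:
    """Hypothetical per-region record of what the operator saw."""
    b_lines: int = 0
    consolidation: bool = False


def is_normal(finding: RegionFinding) -> bool:
    # Rule from the text: two or fewer B lines and no consolidation.
    return finding.b_lines <= 2 and not finding.consolidation


def lobe_on_ct(region: str) -> str:
    # Superior regions map to the upper lobes, inferior regions to the lower lobes.
    return "upper" if region.endswith("superior") else "lower"


def abnormal_regions(exam: dict) -> list:
    # Regions missing from the exam are treated as having no findings.
    return [r for r in REGIONS if not is_normal(exam.get(r, RegionFinding()))]


# Made-up example examination (not patient data).
exam = {
    "left posterior-inferior": RegionFinding(b_lines=5),
    "right lateral-superior": RegionFinding(b_lines=1, consolidation=True),
    "right anterior-superior": RegionFinding(b_lines=2),
}
flagged = abnormal_regions(exam)
print(flagged)
print({region: lobe_on_ct(region) for region in flagged})
```

In this made-up example, the region with exactly two B lines and no consolidation is still read as normal, while the other two regions are flagged.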

Statistical analysis
Continuous variables were expressed as mean ± standard deviation and ranges, while categorical variables were expressed as counts and percentages. Diagnostic agreement between LUS and CT was assessed using the Kappa statistic. The diagnostic performance of LUS was evaluated using a multivariate binary logistic regression model, where the consistency of the LUS and CT results was the dependent variable, and sex, age, body mass index (BMI), disease duration, the experience of the doctor, comorbidities, and the number of involved lobes were the independent variables. The Hosmer-Lemeshow goodness-of-fit test was used to determine how well the model fit the data. The model was explained using the Nagelkerke R square. One-way ANOVA was used to compare sex, age, BMI, disease duration, experience of the doctor, comorbidities, and involved lobes. Parameters with a statistical probability (p) value below 0.0625 were included in the logistic regression risk analysis. From the binary logistic regression analysis, corrected p-values were obtained for the risk factors whose effects were examined, and p < 0.05 was considered statistically significant. All statistical analyses were performed using SPSS version 23 (IBM Corp, Armonk, NY, USA).
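For readers unfamiliar with these measures, the sketch below shows how sensitivity, specificity, predictive values, accuracy, and Cohen's kappa are derived from a 2 × 2 table of LUS against CT. It is a plain-Python illustration, not the SPSS procedure used in the study; the function names and the example counts are hypothetical, and the functions assume that no row or column of the table is empty.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard 2 x 2 metrics with CT as the reference standard."""
    n = tp + fp + fn + tn
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / n,
    }


def cohen_kappa(tp, fp, fn, tn):
    """Agreement between two binary ratings, corrected for chance."""
    n = tp + fp + fn + tn
    observed = (tp + tn) / n
    # Chance agreement from the marginal totals of each method.
    expected = ((tp + fp) * (tp + fn) + (fn + tn) * (fp + tn)) / n**2
    return (observed - expected) / (1 - expected)


# Hypothetical counts for illustration only (not study data).
example = dict(tp=40, fp=10, fn=5, tn=45)
metrics = diagnostic_metrics(**example)
kappa = cohen_kappa(**example)
print({name: round(value, 3) for name, value in metrics.items()})
print(round(kappa, 3))
```

Kappa is lower than raw accuracy whenever part of the observed agreement would be expected by chance alone, which is why the two are reported separately.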

Results

Clinical manifestations
Three hundred thirty patients (140 males, 190 females; mean age 48.66 ± 11.43 years) were included in our study. Among them, 88 patients were diagnosed with mild, 169 with common, 43 with severe, and 30 with critical COVID-19 pneumonia. The most frequent presenting symptoms were fever (86.67%), cough (78.18%), and fatigue (33.33%). On admission, at the time of LUS, the mean duration of symptoms was 20.00 ± 12.15 days. A total of 119 patients (36.06%) required high-flow oxygen support (≥ 15 L/min). The baseline clinical characteristics of the patients are summarized in Table 1.

[...] the main ultrasonic manifestations, and small patchy consolidations under the pleura appeared in some patients (9/169, 5.33%). In the severe and critical groups, B lines and interrupted pleural lines were found in each patient, consolidation was observed in most patients (81.40% and 100.00%, respectively), and pleural effusion was detected in a few patients (13.95% and 16.67%, respectively). The number of areas involved by B lines increased with the severity of the disease (p < 0.05). The main ultrasound findings are shown in Fig. 2.

[Table fragment; the caption and the preceding rows were not recovered. Values are mean ± SD, median.]
Group: mild; common; severe; critical
[row label not recovered]: [...]; [...]; [...]; 11.10 ± 1.38 (a, b, c), 12
Consolidation, involved areas: 0 (c, d), 0; 0.08 ± 0.35 (c, d), 0; 3.14 ± 2.44 (a, b, d), 4; 4.60 ± 2.02 (a, b, c), 5
Pleural effusion, involved areas: 0, 0; 0, 0; 0.37 ± 0.99, 0; 0.70 ± 1.68, 0
a: P < 0.05 vs. mild group; b: P < 0.05 vs. common group; c: P < 0.05 vs. severe group; d: P < 0.05 vs. critical group.

Diagnostic coincidence rate and value analysis of the two types of imaging
Using chest CT as the reference standard, the sensitivity, specificity, positive predictive value, negative predictive value, and diagnostic accuracy of LUS for the diagnosis of consolidation, interstitial syndrome, pleural thickening, and pleural effusion are reported in Table 3. The coefficients of diagnostic consistency between LUS and chest CT for interstitial syndrome, consolidation, pleural effusion, and pleural thickening were 0.61, 0.82, 0.84, and 0.69, respectively.
There was strong agreement between LUS and chest CT in terms of consolidation and pleural effusion. Among the four groups, the diagnostic accuracy in the common group was the lowest (p < 0.05) (Table 4). Six patients in the mild group found to have B lines on LUS were negative on CT, and 53 patients with lung lesions on CT were missed by LUS in the common group. However, none were missed by LUS in the severe and critical groups.

Analysis of factors predictive of inconsistency between ultrasound and CT findings
To analyse the factors predictive of inconsistency between ultrasound and CT findings, a multivariate binary logistic regression analysis was conducted. The independent variables were sex (male = 0, female = 1), age (< 60 years = 0, ≥ 60 years = 1), BMI (< 25 kg/m² = 0, ≥ 25 kg/m² = 1), disease duration (< 15 days = 1, 15-30 days = 2, > 30 days = 3), experience of doctors (senior doctors with more than 2 years' experience in LUS = 0, doctors with less than 2 years' experience = 1), history of diseases (without history of lung or heart diseases = 0, with history of lung or heart diseases = 1), and the number of involved lobes (without lesions in lung = 0, lesions located in upper lobes = 1, lesions located in lower lobes = 2, diffuse pulmonary lesions = 3); the dependent variable was the diagnostic accuracy of LUS. The results demonstrated that sex, the experience of the doctors, the disease duration, and involved lobes were independent predictors (p < 0.0625), after adjusting for the Charlson Comorbidity Index. The odds of inconsistency between ultrasound and CT findings were 2.135 times higher in women than in men (p = 0.031). The findings of LUS performed by junior doctors were more likely to be inconsistent with the CT results (OR = 3.665). The odds of inconsistency between ultrasound and CT findings were 83% and 72.1% in patients whose disease duration was < 15 days and 15 to 30 days, respectively. The odds of inconsistency between ultrasound and CT findings were higher in patients with upper lung lesions (OR = 9.797) and lower lung lesions (OR = 4.021) compared to those with bilateral diffuse lung lesions (Table 5).

Discussion
Traditionally, because of the air barrier, ultrasound was not applied to the diagnosis and management of most lung diseases and was limited to the study of superficial pleural conditions, such as tumors and effusions, and to the guidance of invasive procedures [12,13]. However, this situation has changed, with literature supporting the use of LUS for multiple conditions [5][6][7]. The use of LUS was recently demonstrated for the detection of a broad range of pathologies characterized by a decrease in lung air content and an alteration of the tissue, air, and fluid ratio [14].
In the first part of our study, LUS signs of the 330 diagnosed patients included B lines; thickening or irregularity of the pleural line; consolidation, most of which appeared in severe and critical patients; and, rarely, pleural effusion. The percentage of these four signs increased with disease severity. The severe and critical groups had more areas with B lines and consolidation than the mild and common groups. Therefore, LUS may be useful for the preliminary evaluation of disease severity and follow-up.
The pathophysiological mechanisms of these signs can be explained using autopsy results [15]. At the start of the infection, leakage from pulmonary capillaries into the interstitium and alveoli leads to interstitial pulmonary oedema and pulmonary alveolar oedema. LUS reveals a gradual increase in the B lines. As the leaked fluid continues to increase, resulting in a decrease in the air content of lung tissues, it causes pulmonary consolidation characterized by a heterogeneous subpleural hypoechoic region. When the inflammatory exudate involves the pleura and the thoracic cavity, it manifests as pleural thickening or irregularity and small amounts of pleural effusion.
In our study, LUS had a 92.70% diagnostic accuracy in detecting consolidation, which was similar to that reported in the literature [16]. Of the 20 consolidations missed on LUS, nine were located in the upper lobes (anterior-superior region), where lesions were easily missed on ultrasound due to their distance from the pleura and the influence of the spine and scapula. In 11 missed cases, the consolidations occurred deeply and did not extend up to the pleural surface.
Thus, LUS is capable of detecting superficial pneumonia; however, it remains doubtful whether it can detect deep alveolar lesions [12].
LUS showed a relatively low diagnostic accuracy (82.10%) and a high false-negative rate for interstitial syndrome. Fifty-three patients with interstitial syndrome, all in the common group, were missed on LUS. Six patients in the mild group with no lung lesions were misdiagnosed with interstitial syndrome on LUS. Thus, further analysis of predictors of the diagnostic accuracy for interstitial syndrome was conducted; this is discussed later. LUS had an 86.30% diagnostic accuracy in detecting pleural thickening or irregularity. Forty-five cases were diagnosed with pleural thickening or irregularity on LUS that was absent on CT. The CT images of these 45 patients were re-analysed focusing on the pleura; 32 patients were found to be consistent with the LUS findings. In our opinion, LUS with high-frequency superficial probes was more sensitive to superficial tissue such as the pleura than was CT; however, the subjectivity of LUS may be the reason for the false-positive cases. LUS had a 98.80% diagnostic accuracy for detecting pleural effusion; the four missed cases were all critical patients who were unable to turn over to allow the examination of each area.
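The accuracies quoted above can be cross-checked against the discordant case counts given in the same paragraph. The Python sketch below does this under one explicit assumption that the text does not state outright: that the listed discordant cases are the only disagreements with CT for each finding. Consolidation is left out because only its missed cases, not its false positives, are reported.

```python
N_PATIENTS = 330

# Discordant cases per finding as described in the text.
# Assumption: no other LUS/CT disagreements occurred for these findings.
discordant = {
    "interstitial syndrome": 53 + 6,  # 53 missed on LUS + 6 LUS-positive, CT-negative
    "pleural thickening": 45,         # 45 LUS-positive, CT-negative
    "pleural effusion": 4,            # 4 missed on LUS
}

# Accuracy is the share of patients in whom LUS and CT agreed.
accuracy = {
    finding: (N_PATIENTS - errors) / N_PATIENTS
    for finding, errors in discordant.items()
}

for finding, value in accuracy.items():
    print(f"{finding}: {value:.4f}")
```

Under that assumption, the computed values agree with the reported 82.10%, 86.30%, and 98.80% to within about 0.1 percentage point.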
The diagnostic coincidence rates in mild, common, severe, and critical types were 93.18%, 68.64%, 100.00%, and 100.00%, respectively. In general, the lungs of severe and critical patients had more lesions than those of patients with mild and common types, which might be the reason for the good diagnostic coincidence of LUS for severe and critical types. In the mild group, six patients were positive on LUS and negative on CT. The images were re-analysed, and four cases showed two to three B lines near the diaphragm in the lower lobe of the lung. As reported in the literature, B lines can be found in the intercostal space of the chest near the diaphragm in normal lungs. Though they do not usually exceed two to three lines, they are easily misdiagnosed as positive cases [17]. The remaining two patients had a history of interstitial pneumonia. The diagnostic accuracy was the lowest in the common group.
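The per-group coincidence rates follow directly from the group sizes and the discordant cases reported in the Results. A minimal Python check, with illustrative variable names, is given below.

```python
# Group sizes and LUS/CT discordant cases as reported in the text.
group_sizes = {"mild": 88, "common": 169, "severe": 43, "critical": 30}
discordant_cases = {"mild": 6, "common": 53, "severe": 0, "critical": 0}

# Coincidence rate: percentage of patients in whom LUS and CT agreed.
coincidence_rate = {
    group: 100 * (size - discordant_cases[group]) / size
    for group, size in group_sizes.items()
}

for group, rate in coincidence_rate.items():
    print(f"{group}: {rate:.2f}%")
```

The four group sizes sum to the 330 included patients, and the computed rates reproduce the reported 93.18%, 68.64%, 100.00%, and 100.00%.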
Since LUS showed a relatively low DA for interstitial syndrome, a regression analysis was conducted on the data to find the possible influencing factors.
First, according to our results, female patients were more likely to have inconsistent findings between LUS and CT than male patients. The ultrasound images of the missed female cases were re-read, and it was found that the missed area in female patients was mainly located in the anterior chest area, which corresponded to the deep surface of the breast. Female breast tissue increases the scan depth of the lungs, which may lead to false-negative results. LUS can detect superficial pneumonia, but the increase in depth from skin to lungs may lessen the accuracy [12]. To address this, when scanning the anterior chest area in female patients, especially those with large breasts, we suggest that the probe frequency be reduced appropriately to increase penetration; careful scanning is necessary to avoid false-negative findings.
Second, the results showed that junior ultrasound doctors were more likely to miss lesions than their senior counterparts. As reported in the literature [18], lung scanning should be performed by well-trained operators to achieve high diagnostic accuracy. There appears to be a gap in the operating skills and experience of junior ultrasound doctors, especially in the challenging environment of the epidemic isolation ward. Therefore, we recommend that ultrasound doctors undergo systematic training and assessment before performing LUS.
Third, as the duration of the disease increased, the diagnostic coincidence rate of ultrasound decreased. In the early stage of disease, the coincidence rate of ultrasound was higher than that in the recovery stage, probably because, with prolonged disease duration and effective treatment, pulmonary lesions became smaller, thus reducing the concordance of ultrasound with CT.
Fourth, according to the regression analysis, lesions located only in the upper or lower lobe were more likely to be missed than diffuse lung lesions. This could be due to the overlapping of adjacent regions. For example, the lower-left lobe of the lung may be obscured by the heart or gas in the fundus of the stomach. Gastrointestinal decompression, if possible, could be helpful for patients with severe flatulence. Lesions in the upper lobes of the lungs were more likely to be missed, and the reasons are not well-defined. A previous study [19] suggested that false-negative results might be due to the distance from the pleura, and the coverage of the spine and scapula might make it difficult to detect lesions.
More detailed scanning from multiple angles is recommended in areas covered by surrounding organs or bones to reduce false negatives.
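To make the reported odds ratios easier to interpret, the sketch below converts each one to its logistic regression coefficient (the natural logarithm of the odds ratio) and shows how it would shift the probability of an inconsistent LUS result from an assumed baseline. The 10% baseline is purely hypothetical and chosen for illustration; it is not a value from the study, and the helper name is likewise illustrative.

```python
import math

# Odds ratios for LUS/CT inconsistency as reported in the Results.
reported_odds_ratios = {
    "female vs. male": 2.135,
    "junior vs. senior doctor": 3.665,
    "upper-lobe vs. diffuse lesions": 9.797,
    "lower-lobe vs. diffuse lesions": 4.021,
}


def shifted_probability(baseline_p, odds_ratio):
    """Probability after multiplying the baseline odds by an odds ratio."""
    odds = baseline_p / (1 - baseline_p)
    new_odds = odds * odds_ratio
    return new_odds / (1 + new_odds)


BASELINE_P = 0.10  # hypothetical reference-group probability, not study data

for factor, odds_ratio in reported_odds_ratios.items():
    beta = math.log(odds_ratio)  # coefficient on the log-odds scale
    p = shifted_probability(BASELINE_P, odds_ratio)
    print(f"{factor}: beta = {beta:.3f}, probability {BASELINE_P:.0%} -> {p:.1%}")
```

Because an odds ratio multiplies odds rather than probabilities, an odds ratio of about 2 does not double the probability; the size of the shift depends on the baseline.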

Limitations
First, the sample size was not sufficiently large, especially in the critical group, which was consistent with the clinical characteristics of COVID-19 [10]. Most patients with COVID-19 pneumonia fall into the mild or common group and are either asymptomatic or have mild symptoms. Furthermore, some critically ill patients were too weak to tolerate the entire LUS process and were not enrolled in our study.
Due to a shortage of doctors and a lack of medical resources, most patients only underwent LUS one to three times, and in some patients, the interval between the onset of the disease and the time of LUS examination was longer than 15 days. At the time of imaging examination, some severe- or critical-type patients were improving after treatment, and their pulmonary lesions had become smaller.
During the epidemic, because of the shortage of ultrasound doctors, some junior ultrasound doctors who had no experience in LUS performed these scans after hurried training. According to our results, the lack of experience with LUS appears to be a factor influencing the results.
Finally, due to the use of thick isolation gowns and the challenging examination environment during the epidemic, the examination duration had to be as short as possible. Thus, a 12-point protocol, instead of scanning every intercostal space, was implemented, which might be the reason for the false-negative findings with LUS.

Conclusion
Characteristic LUS findings of COVID-19 pneumonia include B lines, consolidations, thickening or irregularity of the pleural line, and pleural effusion. LUS has good diagnostic performance in diagnosing COVID-19 pneumonia. The diagnostic accuracy for interstitial syndrome was slightly lower than for the other conditions. Female sex, doctors with less experience, long disease duration, and lesions limited to the upper or lower lobes may decrease the diagnostic accuracy. These results were obtained with a limited sample size and under special conditions; further investigation of these findings will likely be of interest.

Figure 1. Flowchart of the patient inclusion process.