Lung Ultrasound Signs to Diagnose and Discriminate Interstitial Syndromes in ICU Patients: A Diagnostic Accuracy Study in Two Cohorts

OBJECTIVES: To determine the diagnostic accuracy of lung ultrasound signs for both the diagnosis of interstitial syndrome and the discrimination of noncardiogenic interstitial syndrome (NCIS) from cardiogenic pulmonary edema (CPE) in a mixed ICU population.
DESIGN: A prospective diagnostic accuracy study with derivation and validation cohorts.
SETTING: Three academic mixed ICUs in the Netherlands.
PATIENTS: Consecutive adult ICU patients who received a lung ultrasound examination.
INTERVENTIONS: None.
MEASUREMENTS AND MAIN RESULTS: The reference standard was the diagnosis of interstitial syndrome (NCIS or CPE) or noninterstitial syndromes (other pulmonary diagnoses and no pulmonary diagnoses) based on full post-hoc clinical chart review excluding lung ultrasound. The index test was a lung ultrasound examination performed and scored by a researcher blinded to clinical information. A total of 101 patients were included in the derivation cohort and 122 in the validation cohort. In the derivation cohort, patients with interstitial syndrome (n = 56) were reliably discriminated from other patients based on the presence of a B-pattern (defined as greater than or equal to 3 B-lines in one frame) with an accuracy of 91.0% (sensitivity, 90.9%; specificity, 91.1%). For discrimination of NCIS (n = 29) from CPE (n = 27), the presence of bilateral pleural line abnormalities (at least two: fragmented, thickened, or irregular) had the highest diagnostic accuracy (94.6%; sensitivity, 89.3%; specificity, 100%). A diagnostic algorithm (Bedside Lung Ultrasound for Interstitial Syndrome Hierarchy protocol) using B-pattern and bilateral pleural abnormalities had an accuracy of 0.86 (95% CI, 0.77–0.95) for diagnosis and discrimination of interstitial syndromes. In the validation cohort, which included 122 patients with interstitial syndrome, bilateral pleural line abnormalities discriminated NCIS (n = 98) from CPE (n = 24) with a sensitivity of 31% (95% CI, 21–40%) and a specificity of 100% (95% CI, 86–100%).
CONCLUSIONS: Lung ultrasound can diagnose and discriminate interstitial syndromes in ICU patients with moderate-to-good accuracy. Pleural line abnormalities are highly specific for NCIS, but sensitivity is limited.

manifestations and radiographic appearances, but of clinical importance considering their associated therapeutic and prognostic implications. Despite advances in diagnostic and monitoring instruments, the current gold standard remains post-hoc expert clinical review (1).
Lung ultrasound is an accurate bedside diagnostic tool that can differentiate between several causes of respiratory failure (2,3). In healthy lungs, the pleura acts as a specular reflector of ultrasound beams because of the high acoustic impedance disparity with the air-filled lung beneath. In diseased lungs, this acoustic impedance disparity is altered, resulting in artifactual or anatomical ultrasound findings. Both NCIS and CPE disrupt the (regional) acoustic behavior of the lung surface, but their distinct pathogenesis would theoretically lead to different ultrasound findings (4). The 2012 international consensus recommendations suggest a potential role for ultrasound to diagnose and discriminate interstitial syndromes, but advise further research because the recommendations are based on one study (5,6). To date, barring a study using M-mode, this role remains unvalidated (7).
This study aims to evaluate the diagnostic accuracy of predefined lung ultrasound signs for the diagnosis of interstitial syndrome and for the discrimination of NCIS from CPE in ICU patients. We hypothesized that a B-pattern can be used to accurately diagnose interstitial syndrome, whereas consolidations and pleural abnormalities can discriminate between NCIS and CPE.

Study Design
This is a multicenter observational diagnostic accuracy study with a derivation and a validation cohort. The protocols for the derivation and validation cohorts were reviewed and approved by the respective local institutional review boards (Medisch-ethische toetsingscommissie Vrije Universiteit medisch centrum [VUmc], registration 2016.002; Medisch-ethische toetsingscommissie Vrije Universiteit medisch centrum, registration W18_311). Written informed consent for use of data was obtained from the patient or legal representative for the validation cohort. Procedures were followed in accordance with the ethical standards of the institutional review boards and with the Helsinki Declaration of 1975. The Standards for Reporting Diagnostic accuracy studies checklist was used (Enhancing the QUAlity and Transparency Of health Research network, 2015).

Participants
The derivation cohort included prospectively collected adult (>18 yr) patients admitted to the academic ICU of Amsterdam University Medical Center, location VUmc, Amsterdam, The Netherlands, between January 1, 2018, and August 1, 2020. Patients were included when they received a clinically indicated lung ultrasound (as determined by the clinical team).
The validation cohort consisted of a post-hoc analysis using prospectively acquired data from an observational study performed by separate teams at the academic ICUs of Amsterdam UMC, location AMC and Maastricht UMC+, The Netherlands, between March 26, 2019, and February 26, 2021. Patients were included when their expected ventilation duration was greater than 24 hours.

Test Methods: Index Test
For the derivation cohort, all images were acquired using a standardized protocol based on the Bedside Lung Ultrasound in Emergency (BLUE) protocol (details are in Supplemental Digital Content 1, http://links.lww.com/CCM/H164) (8,9).

KEY POINTS
• Question: Can lung ultrasound both diagnose interstitial syndrome and discriminate NCIS from CPE?
• Findings: In the validation cohort, bilateral pleural abnormalities discriminated NCIS from CPE with a sensitivity of 31% and specificity of 100%.
• Meaning: Lung ultrasound can diagnose interstitial syndrome with high diagnostic accuracy, and discriminate NCIS from CPE with high specificity but limited sensitivity.

All 2D and M-mode images and clips were independently evaluated offline by two investigators

Test Methods: Reference Standard
In the derivation cohort, the reference standard was determined by expert consensus. Two investigators (M.L.A.H., S.R.K.-E.) independently classified patients as having NCIS, CPE, other pulmonary diagnoses ("other"), or no pulmonary diagnoses ("healthy") at the moment of examination, based on full post-hoc clinical chart review including imaging, laboratory data, physical examination, and other clinical characteristics but excluding lung ultrasound. Any disputes were resolved by a third investigator (P.R.T.). Interstitial syndrome was defined as the composite group of NCIS and CPE. The NCIS group included patients with interstitial lung disease (ILD) and acute respiratory distress syndrome (ARDS) (10). The CPE group included patients with decompensated heart failure (detailed in Supplemental Digital Content 1, http://links.lww.com/CCM/H164). The noninterstitial group was defined as the composite group of "other" and "healthy": "other" were patients with noninterstitial pulmonary pathology (atelectasis, pneumonia, or pleural effusion), and "healthy" were patients without pulmonary pathology. Patients with an overlapping diagnosis of NCIS and CPE were excluded from the study. Patients were also excluded if a pneumothorax was present, as it impedes a normal view of the pulmonary surface.
In the validation cohort, the NCIS group consisted of patients with certain ARDS scored by an expert panel (10). CPE was scored by one expert (L.D.J.B.). The groups "other" and "healthy" were not specified in the validation cohort as data on these diagnoses were not prospectively collected.

Outcomes and Analysis
The primary outcome of the study was to identify which predetermined lung ultrasound signs were most accurate to diagnose and discriminate interstitial syndromes. To address this, the following comparisons were made: 1) patients with interstitial syndrome versus the noninterstitial group and 2) patients with NCIS versus CPE. Novel findings from the derivation cohort were corroborated in the validation cohort. To further evaluate clinical applicability, secondary outcomes were the interrater agreement of ultrasound signs and the construction of a clinical diagnostic algorithm to arrive at the correct pulmonary diagnosis. In both cohorts, baseline, laboratory, and ventilator characteristics and the Sequential Organ Failure Assessment score were collected from the electronic patient database at the time of ultrasound examination. Baseline characteristics and lung ultrasound signs were tested for differences across patient groups using the Pearson chi-square test. All data were analyzed with SPSS for Windows, Version 22 (IBM Corp., Armonk, NY) and R, Version 4.0.3 (R Foundation for Statistical Computing, Vienna, Austria).
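As a rough illustration of the group comparison described above, the Pearson chi-square statistic for a 2×2 table (sign present or absent, by diagnostic group) can be sketched as follows. The function name and the counts are hypothetical and chosen only for illustration; the study itself used statistical packages, and whether a continuity correction was applied is not stated here, so none is assumed.

```python
import math


def pearson_chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for the table

                 sign present   sign absent
        group 1       a              b
        group 2       c              d

    Returns (statistic, p_value) for 1 degree of freedom.
    """
    n = a + b + c + d
    row1, row2 = a + b, c + d
    col1, col2 = a + c, b + d
    if 0 in (row1, row2, col1, col2):
        raise ValueError("every row and column needs at least one observation")
    statistic = n * (a * d - b * c) ** 2 / (row1 * row2 * col1 * col2)
    # With 1 degree of freedom, the chi-square survival function
    # equals erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(statistic / 2))
    return statistic, p_value


# Hypothetical counts, not study data.
statistic, p_value = pearson_chi_square_2x2(20, 10, 8, 22)
```

With small expected cell counts an exact test would usually be preferred over this approximation.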

Diagnostic Accuracy
Diagnostic accuracy parameters (sensitivity, specificity, positive likelihood ratio, and negative likelihood ratio) of 2D and M-mode ultrasound signs were calculated for diagnosis of interstitial syndrome and for discrimination of NCIS and CPE. Specific pleural line abnormalities (fragmented, thickened, and irregular) were tested for internal consistency across items with a two-way random intraclass correlation model with average measures of absolute agreement to evaluate whether compilation was possible. In case of excellent consistency (intraclass correlation coefficient greater than 0.8 [11]) between specific pleural abnormalities, diagnostic accuracy parameters for the simultaneous presence of one, two, or three pleural line abnormalities, in any ultrasound zone or bilaterally, were calculated.
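The parameters named above follow directly from a 2×2 table of index test against reference standard. The sketch below uses hypothetical counts (not study data) and a function name of our own choosing. Because accuracy is a prevalence-weighted average of sensitivity and specificity, it always lies between the two.

```python
import math


def diagnostic_accuracy(tp, fp, fn, tn):
    """Accuracy parameters from a 2x2 table of index test vs reference standard.

    tp/fn: reference-positive patients with a positive/negative index test.
    fp/tn: reference-negative patients with a positive/negative index test.
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one reference-positive and one reference-negative patient")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    accuracy = (tp + tn) / (tp + fp + fn + tn)
    # Likelihood ratios are unbounded when specificity is 1 (LR+) or 0 (LR-).
    lr_positive = sensitivity / (1 - specificity) if specificity < 1 else math.inf
    lr_negative = (1 - sensitivity) / specificity if specificity > 0 else math.inf
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        "accuracy": accuracy,
        "lr_positive": lr_positive,
        "lr_negative": lr_negative,
    }


# Hypothetical counts, not study data.
result = diagnostic_accuracy(tp=45, fp=4, fn=5, tn=46)
```

A sign with perfect specificity, such as the one reported for bilateral pleural line abnormalities, yields an unbounded positive likelihood ratio, which this sketch returns as infinity.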

Interrater Agreement
Agreement for dichotomous and ordinal variables as derived by two observers was evaluated with a kappa statistic and a Spearman correlation coefficient, respectively.
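For two observers scoring a dichotomous sign, the kappa statistic is commonly computed as Cohen's kappa; that this variant was used is an assumption, as the text only says "kappa statistic." A minimal sketch with hypothetical ratings:

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters scoring the same items."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of items")
    n = len(rater_a)
    # Observed proportion of items on which the raters agree.
    observed = sum(x == y for x, y in zip(rater_a, rater_b)) / n
    # Agreement expected by chance, from each rater's marginal frequencies.
    freq_a, freq_b = Counter(rater_a), Counter(rater_b)
    expected = sum(freq_a[k] * freq_b[k] for k in freq_a) / n ** 2
    if expected == 1:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Hypothetical scores for one sign in 10 clips (1 = present, 0 = absent).
observer_1 = [1, 1, 1, 1, 0, 0, 0, 0, 1, 0]
observer_2 = [1, 1, 1, 0, 0, 0, 0, 0, 1, 1]
kappa = cohen_kappa(observer_1, observer_2)
```

In this example the observers agree on 8 of 10 clips, chance agreement is 0.5, and kappa is therefore 0.6.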

Diagnostic Algorithm
For the derivation cohort, a hierarchical diagnostic algorithm was built based on substantial reliability (kappa statistic > 0.60 or Spearman correlation coefficient > 0.70 [12,13]) and substantial accuracy (positive and negative likelihood ratios of > 4.0 and < 0.30, respectively, indicating a clinically useful shift in disease probability of at least 25% [14]). The initial step was diagnosis of interstitial syndromes, and thereafter discrimination of NCIS and CPE, as well as other and healthy patients, respectively.
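The link between the likelihood ratio thresholds and the shift in disease probability can be checked with the odds form of Bayes' theorem. The sketch below assumes a pretest probability of 50%, which is our choice for illustration; the size of the shift depends on the pretest probability.

```python
def post_test_probability(pre_test_probability, likelihood_ratio):
    """Convert a pretest probability to a post-test probability via odds."""
    if not 0 < pre_test_probability < 1:
        raise ValueError("pretest probability must lie strictly between 0 and 1")
    if likelihood_ratio < 0:
        raise ValueError("likelihood ratio cannot be negative")
    pre_test_odds = pre_test_probability / (1 - pre_test_probability)
    post_test_odds = pre_test_odds * likelihood_ratio
    return post_test_odds / (1 + post_test_odds)


# Assumed pretest probability of 50% (illustrative only).
after_positive = post_test_probability(0.5, 4.0)   # LR+ threshold
after_negative = post_test_probability(0.5, 0.30)  # LR- threshold
shift_up = after_positive - 0.5
shift_down = 0.5 - after_negative
```

Under this assumption, a positive likelihood ratio of 4.0 raises the probability from 50% to 80%, and a negative likelihood ratio of 0.30 lowers it to about 23%, so both shifts exceed 25 percentage points.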

Sample Size
A sample size calculation was made using a chi-square contingency table. Based on previous literature and expert opinion, the proportion of pleural line abnormalities was estimated to be 0.9 in the NCIS group and 0.1 in the CPE group (5,7). At least 21 patients per group would be required assuming an α of 0.05 and a β of 0.20. Patients were collected until each group reached the required sample size. This study was powered for pleural line abnormalities because previous literature indicated their importance for discrimination, whereas anterior consolidations and B-pattern are already well-established signs of NCIS and CPE, respectively. No sample size calculation was performed for the validation cohort.

RESULTS
The derivation cohort included a total of 110 patients until the sample size per group was reached. Nine patients were excluded due to pneumothorax (n = 5) or simultaneous occurrence of CPE and NCIS (n = 4). Finally, 101 patients were included, and 1,010 ultrasound files were examined. Twenty-seven (2.7%) of these files were deemed of insufficient quality, leaving 983 for analysis. Characteristics of patients at time of ultrasound examination are shown in Table 1. Supplemental Digital Content 3 (http://links.lww.com/CCM/H164) contains details concerning the imaging modalities employed by the clinical team around the time of lung ultrasound.
The distribution of ultrasound signs across diagnostic groups is shown in Supplemental Digital Content 4 (http://links.lww.com/CCM/H164). Prevalence of B-pattern was different for interstitial compared with noninterstitial groups. Anterior consolidation, abnormal lung sliding, and pleural abnormalities had a different prevalence for the NCIS group compared with the CPE group. Any ultrasound sign appearing bilaterally had a different prevalence in the interstitial groups compared with the noninterstitial group, except for M-mode subpleural vertical orientation, which was not differently distributed across diagnostic groups.
In the validation cohort, 122 patients with interstitial syndrome were included. Baseline characteristics for the validation cohort are shown in Supplemental Digital Content 5 (http://links.lww.com/CCM/H164). Table 2 shows the diagnostic accuracy parameters of ultrasound signs for interstitial syndrome (vs noninterstitial). Only signs with a sensitivity and specificity of greater than 60% for diagnosis or discrimination are presented (full table of estimates per group is shown in Supplemental  (11). Therefore, diagnostic accuracy parameters of one, two, or three composite pleural line abnormalities were also calculated and reported. For diagnosis of interstitial syndrome, presence of a B-pattern had the highest diagnostic accuracy (91.0%; 95% CI, 83.6-95.8); specificity increased to 100% when observed bilaterally.

Secondary Outcome: Interrater Agreement
All ultrasound signs had at least a substantial strength of interrater agreement, except for a nonhomogenous B-pattern and M-mode pleural line evaluation, which had fair and moderate agreements, respectively (Supplemental Digital Content 10, http://links.lww.com/CCM/H164).

Secondary Outcome: Diagnostic Algorithm
The high diagnostic accuracy and interrater agreement of ultrasound signs enabled the development of a diagnostic algorithm presented as the Bedside Lung Ultrasound for Interstitial Syndrome Hierarchy (BLUISH) protocol (Fig. 2). First, presence of any B-pattern was selected to diagnose interstitial syndromes, and presence of bilateral pleural line abnormalities (at least two out of three specific pleural abnormalities present in a zone) was subsequently used to discriminate NCIS from CPE for a total accuracy of 0.86 (95% CI, 0.79-0.93). Although not part of our primary outcome, diagnostic accuracy parameters to discriminate other noninterstitial pulmonary pathology from healthy lungs using anterior consolidation and positive PLAPS were included (Supplemental  When restricting the diagnostic algorithm to mechanically ventilated patients, its accuracy remained 0.86 (95% CI, 0.77-0.95).
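The hierarchy described above can be sketched as a small decision function. This is an illustration of the steps named in the text, not a reproduction of Figure 2: the class and function names are ours, and combining anterior consolidation and positive PLAPS with a logical "or" is an assumption, since the text does not state how the two signs are combined.

```python
from dataclasses import dataclass


@dataclass
class LungUltrasoundFindings:
    b_pattern: bool                        # >= 3 B-lines in one frame, any zone
    bilateral_pleural_abnormalities: bool  # >= 2 of fragmented/thickened/irregular, on both sides
    anterior_consolidation: bool
    plaps_positive: bool                   # posterolateral alveolar and/or pleural syndrome


def bluish_classify(findings: LungUltrasoundFindings) -> str:
    """Return 'NCIS', 'CPE', 'other', or 'healthy' following the described hierarchy."""
    # Step 1: any B-pattern indicates interstitial syndrome.
    if findings.b_pattern:
        # Step 2: bilateral pleural line abnormalities point to NCIS, otherwise CPE.
        return "NCIS" if findings.bilateral_pleural_abnormalities else "CPE"
    # Noninterstitial branch (assumption: either sign suffices for "other").
    if findings.anterior_consolidation or findings.plaps_positive:
        return "other"
    return "healthy"


example = bluish_classify(
    LungUltrasoundFindings(
        b_pattern=True,
        bilateral_pleural_abnormalities=False,
        anterior_consolidation=False,
        plaps_positive=False,
    )
)
```

Placing the B-pattern check first mirrors the stated order of the protocol: diagnosis of interstitial syndrome precedes discrimination of its cause.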

Post-Hoc Analysis
The NCIS group of the derivation cohort contained 12 ILD patients (41.4%), whereas in the validation cohort, it contained ARDS exclusively. A logistic regression analysis on the derivation cohort showed that bilateral B-pattern had an odds ratio of 0.067 (95% CI, 0.007-0.678) for ILD (vs ARDS), and total pleural abnormalities had an odds ratio of 2.99 (95% CI, 1.1-8.3) for ILD (vs ARDS).

DISCUSSION
The main findings of this diagnostic accuracy study on lung ultrasound to diagnose and discriminate interstitial syndromes in ICU patients are as follows: 1) B-pattern is the most accurate ultrasound sign to diagnose interstitial syndromes; 2A) bilateral pleural line abnormalities are the most accurate ultrasound sign to discriminate NCIS from CPE; 2B) in the validation cohort, bilateral pleural line abnormalities had a specificity of 100% for the differentiation of NCIS from CPE, but limited sensitivity; 3) interrater agreement for the aforementioned ultrasound signs is excellent and substantial, respectively; and 4) an ultrasound diagnostic algorithm (BLUISH protocol) can diagnose and discriminate interstitial syndromes in critically ill patients with a high accuracy of 0.86. Discriminating NCIS from CPE in the acute or critical setting is notoriously difficult but of major clinical importance. Ultrasound signs to discriminate NCIS from CPE have been included in expert consensus recommendations despite a paucity of evidence (6). The only prior study on this subject had a small and selected population and a single derivation cohort, and is inapplicable in a complex ICU setting. Similarly, recently described M-mode signs of NCIS lack validation (7). We comprehensively evaluated and validated the diagnostic accuracy of NCIS-specific ultrasound signs in ICU patients and show that lung ultrasound is an accurate tool in this regard. Evaluating these signs during ultrasound examinations could lead to a quicker bedside diagnosis and facilitate timely and appropriate treatment (15). As such, prompt use of lung ultrasound could reduce the use of more invasive or costly monitoring tools such as pulmonary artery catheters or chest CT (16). Furthermore, it could provide a framework for an improved clinical definition of ARDS (17).
Based on the most accurate lung ultrasound signs, we developed a diagnostic algorithm, the BLUISH protocol, for use in ICU patients. The BLUISH protocol contains simple and often-used sonographic signs, which will allow for rapid clinical implementation. Its high diagnostic accuracy of 86% is comparable with the accuracy of the landmark BLUE protocol in emergency department patients. Clinically, accuracy of NCIS and CPE discrimination may even be further improved by incorporating functional echocardiography. Combined thoracic ultrasound is increasingly familiar to ICU physicians and, depending on sonographer skill and experience, may provide integral clues for the discrimination of NCIS from CPE (18). Finally, the addition of nonultrasound, but readily available, clinical information such as C-reactive protein and brain natriuretic peptide may also be used to augment diagnostic certainty.
Inconsistent definitions and methodology are a persistent issue in lung ultrasound literature, especially concerning pleural line abnormalities (19). The consensus is to categorize specific pleural line abnormalities as "irregular," "fragmented," and "thickened," but other terms such as "blurred," "coarse," and "tightening" are also used, albeit equally ill-defined (6,20–22). Consequently, reproducibility and generalizability of studies are limited. The current study's explicit a priori definitions generated substantial interrater agreement, increasing external validity. In addition, specific pleural abnormalities carry distinct diagnostic accuracies, thickening being the most accurate for NCIS, followed by irregular and fragmented. Considering their consistency across items, it is reasonable to compile pleural line abnormalities for bedside decisions. After testing many variables, it appears that the dichotomy of normal or abnormal pleural line is most appropriate to balance clinical applicability and efficacy of the test. Future investigations should explore whether specific pleural line abnormalities have characteristic spectral signatures and distinct histopathological correlates (4,23).
External validation demonstrates that clinically evident bilateral pleural abnormalities could be an excellent sign to rule in NCIS, with a specificity of 100%. The lower sensitivity found in the validation cohort may be due to several differences between the cohorts. First, 41% of included NCIS patients in the derivation cohort were classified as ILD, whereas the validation cohort did not include ILD patients. Post-hoc analyses showed that ILD had a higher propensity for pleural abnormalities and a lower propensity for B-pattern when compared with ARDS. Second, scoring of the pleural line differed between the cohorts. Finally, transverse lung ultrasound scanning was performed in the validation cohort, which increases the visible pleural surface when compared with examination perpendicular to the ribs. Despite these differences, this finding is useful for the derivation of refined ARDS definitions and the clinical diagnosis of NCIS, but it does not allow for ruling out CPE.
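A specificity of 100% in the 24 CPE patients of the validation cohort still carries uncertainty. When every one of n patients is classified correctly, the exact (Clopper-Pearson) lower confidence bound has the closed form (α/2)^(1/n). The sketch below assumes this interval method, which is not stated in the text; under that assumption it reproduces the reported lower bound of 86%.

```python
def exact_lower_bound_all_correct(n, confidence=0.95):
    """Two-sided Clopper-Pearson lower bound when all n observations are successes."""
    if n <= 0:
        raise ValueError("n must be a positive integer")
    if not 0 < confidence < 1:
        raise ValueError("confidence must lie strictly between 0 and 1")
    alpha = 1 - confidence
    # Solve p**n = alpha / 2 for p.
    return (alpha / 2) ** (1 / n)


# 24 CPE patients in the validation cohort, none with bilateral pleural line abnormalities.
lower_bound = exact_lower_bound_all_correct(24)
```

The bound tightens only slowly with sample size, which is one reason a perfect point estimate from a small group should be read together with its interval.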
This study has several limitations. There is a risk of selection bias as the patient population was derived from patients with a clinical indication for bedside ultrasound. Besides a relatively high ILD population, baseline characteristics of our population are comparable with the case mix of ICU patients in The Netherlands (24). The additional validation on an external cohort further diminished this limitation. There are differences in cohort design, such as timing of ultrasound examination, examination protocol, and center-specific population, but these can be considered within the scope of normal between-center variation and, therefore, further increase external validity. Furthermore, only novel findings to discriminate NCIS from CPE were validated, as the identification of interstitial syndrome using B-pattern is already widely established in previous literature (25,26). Future investigations may, aside from including echocardiographic and nonultrasound components, consider the relevance of "spared areas," another potential sign of interest (5). It should be noted that this study evaluates the diagnostic accuracy of singular lung ultrasound signs in a deterministic fashion, whereas clinician-based practice in a complex ICU setting often relies on clinical conglomerates. This should be considered in future research and during clinical application (3). Finally, the validation cohort contained exclusively mechanically ventilated patients, and those with uncertain ARDS diagnosis were excluded.
This study also has several strengths. Our study is adequately powered and is, moreover, the largest study to date on this subject. Compared with previous research, we increased external validity for a general ICU population by adding "other" and "healthy" patient populations as control groups and by including a validation cohort.

CONCLUSIONS
This diagnostic accuracy study in ICU patients shows that lung ultrasound is a valuable tool to diagnose and discriminate interstitial syndromes. Pleural line abnormalities, anterior consolidations, and subtle lung sliding have high diagnostic accuracy for NCIS and substantial to excellent interrater agreement. Our diagnostic algorithm (BLUISH protocol), containing a B-pattern and bilateral pleural line abnormalities, has an accuracy of 0.86 for diagnosing and discriminating the cause of interstitial syndrome. External validation demonstrates that pleural line abnormalities have a perfect specificity for NCIS, but limited sensitivity to rule out CPE.