Developing a Diagnostic CT Scoring System for Granulomatous-Lymphocytic Interstitial Lung Disease (GLILD) in Common Variable Immunodeficiency Patients


Purpose: To determine the utility of a GLILD CT scoring system for assessment of GLILD in patients with CVID to avoid the need for lung biopsy.
Methods: The CT scoring system was developed as a consensus based on review of the literature. Two radiologists, blinded to patient clinical information, retrospectively scored CT scans of selected patients with a histopathologic diagnosis of GLILD. Discrepancies were settled by a third radiologist. Inter-observer reproducibility was calculated with intra-class correlation coefficients.
Results: We identified 25 CVID patients with histologically confirmed GLILD (12 male, 13 female). The median age was 38 years (range 12-72 years). Inter-observer reproducibility was 0.76 (95% CI 0.53-0.89). Splenomegaly/splenectomy and basilar predominant perilymphatic nodules greater than or equal to 4 mm had good interobserver agreement in our CT scoring system. Discrepancy between observers occurred most often in differentiating intermediate versus high confidence diagnosis of GLILD, as opposed to differentiating between low versus intermediate confidence.
Conclusion: A GLILD CT scoring system of pulmonary disease in CVID patients can stratify patients into categories that may assist in avoiding lung biopsy for diagnosis in a subset of patients.


Introduction
Common variable immunodeficiency disorders (CVID) are a group of heterogeneous diseases primarily characterized by hypogammaglobulinemia and increased risk of infection. As treatment with immunoglobulin replacement therapy has increased median survival, noninfectious complications have become the predominant source of morbidity and mortality in these patients (1-3). Depending on the study, 8% to 22% of people living with CVID develop an interstitial lung disease (ILD) termed "granulomatous-lymphocytic interstitial lung disease" (GLILD) (4-8), which has been associated with early mortality in patients with CVID (5).
Over ten years ago, Routes and colleagues established that the presence of GLILD significantly worsens survival in CVID. However, GLILD has proven challenging to diagnose and difficult to treat, and it is not reversed by conventional immunoglobulin (Ig) replacement therapy in most patients. New treatment using immunomodulation, especially the combination of rituximab and azathioprine, has shown promise (6); however, lung disease may return or continue to progress when treatment is terminated.

The pathogenesis of GLILD in CVID is not well understood. While most pulmonary manifestations of CVID represent recurrent infections, with up to 50% of patients developing bronchiectasis, not all patients have or will progress to GLILD. Currently, confirmatory diagnosis of GLILD requires lung biopsy, an invasive procedure with substantial risk of complications. It is relatively common for CVID patients to have structural lung abnormalities on high resolution computed tomography (HRCT), likely due to frequent respiratory infections (9-11). Indeed, a comparison of chest x-rays (CXR), pulmonary function tests (PFT), and HRCT showed HRCT to be the most sensitive method for identification of structural abnormalities, detecting pulmonary complications that were missed on CXR and PFT in 2-59% of patients (12). Scoring systems based on HRCT have been commonly used to diagnose and track progression of various lung diseases such as cystic fibrosis (13-16), sarcoidosis (17,18), tuberculosis (19), and idiopathic pulmonary fibrosis (IPF) (20,21). A CT scoring system for GLILD may offer a reliable alternative to biopsy in a subset of CVID patients demonstrating a classic CT appearance of GLILD and may provide prognostic value in determining treatment. Therefore, we propose a potential CT scoring system for the initial diagnosis of GLILD and test it retrospectively in patients with a confirmed diagnosis of GLILD in order to determine reproducibility between radiologists and refine the scoring system for future studies.

Materials And Methods
Study Population
Following IRB approval, a retrospective chart review of the electronic database was conducted to identify CVID patients with GLILD who underwent chest CT between March 2013 and October 2016 as part of the routine clinical protocol to evaluate GLILD. GLILD diagnosis was confirmed by histopathologic evidence of granulomatous disease in the lungs or alternative organ(s) in the setting of CVID and CT-proven interstitial lung disease. The CT at the time of GLILD diagnosis was used for scoring purposes. All patients were evaluated at Mayo Clinic, Rochester, MN.
Proposed GLILD scoring system on Chest CT
The identification of variables to be included in the analysis was based on a consensus of three board-certified radiologists specializing in thoracic radiology and guided by a review of the literature (22-27). Features visualized on chest imaging suggestive of GLILD that were consistent across the reviewed literature included the presence of perilymphatic pulmonary nodules, septal thickening in the lung bases, patchy bilateral ground glass opacities, thoracic lymphadenopathy (LAN), and splenomegaly.
Imaging features of perilymphatic nodularity and perivascular cysts were included to reflect the known overlap in appearances with sarcoidosis and lymphocytic interstitial pneumonia, respectively. Nodules that surround patent airways were included based on anecdotal experience of the radiologists. We categorized these features into two groups: pulmonary nodules/nodularity, for which a score of 1 to 3 is assigned based on the size and distribution pattern of the pulmonary nodules, and the other abnormalities listed above, for which a score of 1 is assigned when present. Representative CT images of typical GLILD CT patterns are provided in Figure 1. The total scores were derived as composites of all scored variables. The proposed scoring system is summarized in Table 1. In the setting of a clinical diagnosis of CVID, we prospectively set a score of 5 or more as high confidence for radiologic diagnosis of GLILD, a score of 3-4 as intermediate confidence, and a score of 2 or less as low confidence.
Chest CT Acquisition
Of the 25 identified patients, 21 underwent high resolution helical chest CT in supine inspiration at our institution with reconstructed images at 1.5 mm slice thickness. "High resolution" chest CT was defined for the study as less than 2 mm slice thickness in a high spatial resolution kernel, as per the Fleischner Society Diagnostic Criteria for Idiopathic Pulmonary Fibrosis (28). The remaining four CT exams were performed at outside institutions utilizing variable equipment and acquisitions, comprising two high-resolution chest CT studies, one pulmonary CT angiography, and one standard chest CT at 2 mm slice thickness. Eight chest CT exams were performed with IV contrast.
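The confidence thresholds stated for the scoring system can be expressed as a simple lookup. The following Python sketch is illustrative only and is not the authors' implementation: the function names are hypothetical, the per-feature weights of Table 1 are not reproduced, and it assumes that a nodule score of 0 denotes absent nodules and that each other abnormality present contributes 1 point.

```python
def composite_score(nodule_score: int, other_findings_present: int) -> int:
    """Sum the nodule score and the count of other scored abnormalities.

    Assumptions (not from the paper's Table 1): nodule_score is 0 when no
    qualifying nodules are seen, otherwise 1 to 3; every other abnormality
    that is present adds exactly 1 point.
    """
    if not 0 <= nodule_score <= 3:
        raise ValueError("nodule_score must be between 0 and 3")
    if other_findings_present < 0:
        raise ValueError("other_findings_present cannot be negative")
    return nodule_score + other_findings_present


def glild_confidence(total_score: int) -> str:
    """Map a composite score to the confidence category stated in the text:
    5 or more is high, 3-4 is intermediate, 2 or less is low."""
    if total_score < 0:
        raise ValueError("total_score cannot be negative")
    if total_score >= 5:
        return "high"
    if total_score >= 3:
        return "intermediate"
    return "low"


if __name__ == "__main__":
    # Hypothetical patient: nodule score 3 plus two other abnormalities.
    total = composite_score(3, 2)
    print(total, glild_confidence(total))
```

The categories apply only in the setting of a clinical diagnosis of CVID, so a function like this would be called after that condition has been established.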
Evaluation of Chest CT
Blinded to clinical information beyond the study inclusion criteria, two independent board-certified, fellowship-trained thoracic radiologists (CWC and RML, with 12 and 18 years of experience, respectively) evaluated each CT and assigned scores based on the proposed scoring system. Images were reviewed on GE Centricity RA 1000 version 4.1 PACS with EIZO RadiForce RX340 diagnostic monitors, and scoring was recorded on a computerized form. Following completion of scoring, the evaluators also provided a subjective assessment regarding the expected appearance of GLILD as well as non-scored findings. Discrepancies between the primary evaluators were resolved through a tie-breaker score provided by a third board-certified, fellowship-trained thoracic radiologist (JHC, with 10 years of experience).
Statistical Analysis
Interobserver reproducibility of the scoring systems was tested with the intraclass correlation coefficient (Ri) and a Bland-Altman plot. Values of Ri greater than 0.80 are generally considered to represent good agreement between observers. Interobserver agreement of the various scoring system variables was evaluated with the κ statistic. κ values of ≤0.20, 0.21-0.40, 0.41-0.60, 0.61-0.80, and 0.81-1.00 are generally considered to represent poor, fair, moderate, good, and very good agreement, respectively. Radiologic diagnoses based on the scoring system were then compared to the histopathologic diagnosis using the Spearman rank coefficient. A software package (JMP Pro version 13) was used for statistical analysis. A difference with a P value less than .05 was considered significant.
Results
Table 2 summarizes the distribution of the study population. We identified 25 CVID patients with GLILD (12 male patients, 13 female patients). The median age was 38 years (range 12-72 years) at the time of CT imaging. GLILD was confirmed by histopathologic analysis of lung biopsy in 21 of the patients.
Four of the patients were diagnosed by biopsy of granulomas found in the spleen or lymph nodes in the setting of interstitial lung disease proven by CT. Clinical and immunological characteristics of these patients are summarized in Table 2. Eight patients were current smokers while the other 17 patients never smoked.

Thin-Section CT Scoring Systems
When considering radiologic findings identified by at least one thoracic radiologist, the most common CT finding in the study cohort was splenomegaly or splenectomy (n=20, 80%). Other common CT findings identified by at least one observer included pulmonary nodules surrounding patent airways (n=16, 64%) and bibasilar predominant perilymphatic nodules equal to or greater than 4 mm (n=14, 56%) (Table 3). When considering agreement between the two primary thoracic radiologists for a given radiologic finding, splenomegaly/splenectomy remained the most common finding (n=17), and bibasilar predominant perilymphatic nodules equal to or greater than 4 mm became the second most common finding (n=10).
With regard to uncommon CT findings, bilateral perilymphatic nodularity less than 4 mm was considered present in only one patient by one thoracic radiologist. Pulmonary cysts were not identified by any reviewer.
Interobserver variability was generally good (Ri 0.76, 95% CI 0.53-0.89) (Table 3). Agreement was good in rating of bibasilar perilymphatic nodules equal to or greater than 4 mm, bilateral perilymphatic nodularity less than 4 mm, and splenomegaly/splenectomy. However, coefficients were less than 0.61 for most scoring system variables. For bilateral perilymphatic nodules without basilar predominance equal to or greater than 4 mm, septal thickening in the bases, LAN, and chronic bilateral ground glass/cysts, coefficients were less than 0.40 for all comparisons.
Predictive value of the CT scoring systems
Of the 25 cases evaluated by the CT scoring system, 12 cases (48%) were categorized as high confidence for GLILD, 3 cases (12%) as intermediate confidence, and 10 cases (40%) as low confidence (Table 3). The Bland-Altman plot (Figure 2) illustrates that there was no particular bias between the two reviewers. The main disagreement between the reviewers was in differentiating intermediate vs. high confidence diagnosis of GLILD.
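For readers less familiar with per-variable agreement statistics, the following Python sketch shows one common way to compute Cohen's kappa for two raters and to label it with the agreement bands given in the Statistical Analysis section. It is an illustration under stated assumptions, not the study's JMP analysis: the function names and the example ratings are hypothetical, and the data are not taken from the study cohort.

```python
def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters labelling the same cases."""
    if len(rater_a) != len(rater_b) or len(rater_a) == 0:
        raise ValueError("both raters must score the same non-empty case list")
    n = len(rater_a)
    # Observed proportion of cases on which the raters agree.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Agreement expected by chance from each rater's marginal frequencies.
    labels = set(rater_a) | set(rater_b)
    expected = sum(
        (list(rater_a).count(lab) / n) * (list(rater_b).count(lab) / n)
        for lab in labels
    )
    if expected == 1.0:
        raise ValueError("kappa is undefined when both raters use one label only")
    return (observed - expected) / (1.0 - expected)


def agreement_band(kappa):
    """Label a kappa value using the bands quoted in the text."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


if __name__ == "__main__":
    # Hypothetical presence (1) / absence (0) ratings of one CT finding.
    reader_1 = [1, 1, 1, 0, 0, 0, 1, 0]
    reader_2 = [1, 1, 0, 0, 0, 0, 1, 1]
    k = cohen_kappa(reader_1, reader_2)
    print(round(k, 2), agreement_band(k))
```

In this made-up example the readers agree on 6 of 8 cases, chance agreement is 0.5, and kappa is 0.5, which falls in the moderate band.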
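The quantities summarized by a Bland-Altman plot can be computed directly from paired scores. The sketch below is a minimal illustration, assuming the conventional definition (mean difference as bias, with limits of agreement at plus or minus 1.96 standard deviations of the differences); the function name and the example scores are hypothetical and are not the study data.

```python
from statistics import mean, stdev


def bland_altman(scores_a, scores_b, z=1.96):
    """Return (bias, lower limit, upper limit) for paired reader scores."""
    if len(scores_a) != len(scores_b) or len(scores_a) < 2:
        raise ValueError("need at least two paired scores")
    # Per-case difference between the two readers.
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    bias = mean(diffs)   # systematic offset between readers
    spread = stdev(diffs)  # sample standard deviation of the differences
    return bias, bias - z * spread, bias + z * spread


if __name__ == "__main__":
    # Hypothetical total CT scores from two readers for four patients.
    reader_1 = [5, 3, 2, 6]
    reader_2 = [4, 4, 2, 6]
    bias, lower, upper = bland_altman(reader_1, reader_2)
    print(f"bias={bias:.2f}, limits of agreement=({lower:.2f}, {upper:.2f})")
```

A bias close to zero, as in this made-up example, corresponds to the statement that neither reader scored systematically higher than the other.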

Discussion
This study attempts to develop and test a systematic scoring system that can assist clinicians and radiologists in evaluation of GLILD in patients with CVID using chest CT. Based on our review of the literature and experience, our proposed scoring system included classic findings of GLILD as well as potential characteristic patterns typical of sarcoidosis (perilymphatic nodularity) and lymphoproliferative disease (perivascular cysts). Consistent with the findings of Mannina et al, splenomegaly or splenectomy was the most common finding in CVID-related GLILD (25), with our study supporting this as a reproducible finding between observers. Bibasilar predominant perilymphatic nodules greater than or equal to 4 mm were weighted heavily to reflect the characteristic finding in GLILD and proved to have good agreement between radiologists, which favors keeping this as a central finding in GLILD CT scoring. Nodules surrounding patent airways are a common finding associated with GLILD, although our study suggests that characterization of such nodules may be inconsistent between thoracic radiologists. Bilateral perilymphatic nodularity less than 4 mm and perivascular cysts, more characteristic of sarcoidosis and lymphocytic interstitial pneumonia, respectively, were uncommon findings and may not contribute meaningfully to the scoring system.
Certain CT imaging findings commonly seen in CVID were purposefully excluded as they suggest alternative diagnoses that are not mutually exclusive from GLILD. Bronchiectasis in the lower lungs is a common finding in CVID-related lung disease, although it is attributable to immunodeficiency and recurrent infection. Alternatively, mass-like opacities or consolidation are not characteristic of GLILD and raise concern for complicating lymphoma, particularly when chronic (29). Our novel scoring system may have several advantages. Firstly, it has the potential to help standardize and streamline the diagnosis of GLILD. A community radiologist who is not familiar with GLILD may be able to use this scoring system with the provided example images to help make an early diagnosis, which has so far relied on individual features without a conclusive system to guide further management. Secondly, this may improve overall patient management and be cost saving. When the score is in the range of "high confidence for GLILD", as it was for nearly half the patients in our study, additional invasive interventions such as open lung biopsy may be avoidable. Some radiologists may score confident cases as intermediate or low confidence for GLILD and, in those cases, a second radiology opinion, interdisciplinary conference discussion, and/or additional interventions may be pursued.
Overall, the inter-observer variability based on these criteria is good, and most of the disagreement is in differentiating intermediate vs high confidence diagnosis of GLILD instead of differentiating between low vs intermediate or high confidence. Category descriptors are critical in defining downstream decisions, and alternative category names could be considered, such as "consistent with", "probable" and "inconclusive for" GLILD. Such categories could then favor biopsy in patients with "inconclusive" CT findings, favor avoiding biopsy in "consistent with" CT findings and favor multidisciplinary discussion for consideration of biopsy in "probable/possible" findings. While our scoring system only captured 60% of the GLILD confirmed cohort when scored across multiple radiologists, these patients could have potentially avoided the risk of biopsy by meeting imaging criteria for GLILD, much like how diagnostic criteria for IPF define the usual interstitial pneumonia CT pattern to avoid biopsy (20,21).
Primary limitations of our study include the retrospective design, limited number of patients, absence of a control group, variability of CT imaging during the time course of disease, absent surveillance imaging to assess evolution of disease, and cohort selection from a quaternary referral center. At the same time, the cohort was relatively large for the rarity of GLILD and did reflect our experience of the scope of GLILD CT findings. While the scoring did not represent the evolution of disease over time, the scoring of a single timepoint at initial diagnosis is more representative of general practice and the intended application of the scoring system. Future considerations for study should focus on prospective application of CT scoring of possible GLILD in practice, to include determining sensitivity and specificity when compared to histopathologic results. Additional potential considerations include refinement of scoring by inclusion of alternative laboratory or physiologic testing (30), the relationship of prognosis to CT scoring and the potential for CT scoring in treatment decision making.
In conclusion, splenomegaly/splenectomy and basilar predominant perilymphatic nodules greater than or equal to 4 mm are typical CT findings of GLILD and have good interobserver agreement in a CT scoring system. Bilateral perilymphatic nodularity less than 4 mm and perivascular cysts can likely be excluded from CT scoring of GLILD aimed at identifying characteristic findings of GLILD, but this may need further validation using a larger prospective cohort. Finally, a GLILD CT scoring system of pulmonary disease in CVID patients can help in early identification and stratification of patients, which may assist in clinical management and potentially avoid invasive lung biopsy for diagnosis.

Declarations
Compliance with ethical standards
This study was approved by the university human research ethics committee and all procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. Informed consent was obtained from all individual participants included in the study.

Authors Contribution
SPH participated in the design of the study, collected and interpreted the data, and drafted the manuscript.
RML and JHC participated in the design, conducted the study, and helped to draft the manuscript. CWC and AYJ conceived of the study, formulated its design, coordinated the conduct of the study including data collection, helped to interpret the data and helped to draft the manuscript. All authors read and approved the final manuscript.

Funding
No funding was received for conducting this study.

Competing interest
The authors declare no conflict of interest.

Availability of data and materials
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Disclosure of conflict of interest: The authors declare no conflict of interest. The authors received no specific funding for this work.

Tables
Table 1 (partial; only the diagnostic thresholds and footnotes were recovered). In the setting of a clinical diagnosis of CVID:
High confidence of GLILD radiologic diagnosis: ≥ 5
Intermediate confidence of GLILD radiologic diagnosis: 3-4
Low confidence of GLILD radiologic diagnosis: ≤ 2
* Nodules greater than 1.5 cm, nodules growing over time, masses or chronic consolidation should be evaluated for lymphoma.
** Chronic defined as persisting longer than 3 months without immunosuppressive treatment.

Table 3 (partial; column headers and preceding rows were not recovered).
Intermediate Confidence: 1, 7, 3
Low Confidence: 13, 9, 10
PL = perilymphatic, LAN = lymphadenopathy, GGO = ground glass opacities