Mass cytometry identifies the source of cellular and functional changes of gene expression profiles associated with Idiopathic Pulmonary Fibrosis mortality

Background We have previously identified a 52-gene signature that distinguishes Idiopathic Pulmonary Fibrosis (IPF) patients with low and high risk genomic profiles with significant differences in mortality. Here, we aimed to apply mass cytometry to characterize phenotypic and functional differences in immune cells between controls and IPF patients with low and high-risk genomic profiles based on the 52-gene signature. Methods PBMCs from 20 subjects, including 9 low-risk, 6 high-risk and 5 controls were barcoded with unique palladium-based combinatorial labels and stained as a single multiplexed sample using mass cytometry. 30 antibodies were studied simultaneously. K-means clustering was used to classify cell events into cell subsets. Differences in immune cell subpopulations were examined by generalized linear mixed-effects model. Linear mixed-effects model was used for differential analysis of functional targets in each individual cell subset between groups. Results Compared to controls, IPF patients were characterized by significantly increased number of classical monocytes, mDCs, activated Th and Tc cells, and decreased naïve Th cells. We observed a significantly higher frequency of classical monocytes in IPF patients with high versus low risk genomic profiles. Significantly increased levels of CD121b and decreased levels of CD28, LCK, ICOS, CD127, CD96 and ETS1 were observed in monocytes, T-cells, B-cells and granulocytes in high-risk compared to low-risk IPF suggesting these cells as the source of our previously identified gene expression findings. Conclusions Our study identified novel immune aberrations in IPF patients and the source of cellular and functional changes of genomic risk profiles associated with IPF mortality.


Abstract
Background We have previously identified a 52-gene signature that distinguishes Idiopathic Pulmonary Fibrosis (IPF) patients with low and high risk genomic profiles with significant differences in mortality. Here, we aimed to apply mass cytometry to characterize phenotypic and functional differences in immune cells between controls and IPF patients with low and high-risk genomic profiles based on the 52-gene signature.
Methods PBMCs from 20 subjects, including 9 low-risk, 6 high-risk and 5 controls were barcoded with unique palladium-based combinatorial labels and stained as a single multiplexed sample using mass cytometry. 30 antibodies were studied simultaneously. Kmeans clustering was used to classify cell events into cell subsets. Differences in immune cell subpopulations were examined by generalized linear mixed-effects model. Linear mixed-effects model was used for differential analysis of functional targets in each individual cell subset between groups. Results Compared to controls, IPF patients were characterized by significantly increased number of classical monocytes, mDCs, activated Th and Tc cells, and decreased naïve Th cells. We observed a significantly higher frequency of classical monocytes in IPF patients with high versus low risk genomic profiles. Significantly increased levels of CD121b and decreased levels of CD28, LCK, ICOS, CD127, CD96 and ETS1 were observed in monocytes, T-cells, B-cells and granulocytes in high-risk compared to low-risk IPF suggesting these cells as the source of our previously identified gene expression findings. Conclusions Our study identified novel immune aberrations in IPF patients and the source of cellular and functional changes of genomic risk profiles associated with IPF mortality.

Background
Idiopathic pulmonary fibrosis (IPF) is a chronic, progressive and irreversible interstitial lung disease of unknown etiology. The median survival of untreated IPF patients is approximately 3.5 years and the clinical course of IPF is unpredictable [1,2] . It progresses very rapidly in some patients while others can have a stepwise decline. Another subgroup of patients progress more insidiously with periods of clinical stability [1][2][3] . No reliable clinical parameters are available to reflect disease prognosis. Consequently, the identification of candidate biomarkers that can accurately predict disease progression and their cellular source is critical for development of clinical prediction models and for the identification of novel therapeutic targets.
Peripheral blood has been a valuable source for biomarker discovery in IPF. Circulating humoral factors studied in this context include MMP7 [4][5][6] , CXCL-13 [7,8] , IL-8 [9,10] , CCL-18 [11] and anti-HSP70 antibody [12] , levels of which are increased in IPF and highly correlated with disease progression. Changes in distribution patterns of peripheral immune cell subsets, such as CD4 + CD28 + T cells [13,14] , fibrocytes [15] and Semaphorin7a + regulatory T cell subset [16] , are associated with mortality in IPF. We previously identified a 52-gene expression signature in peripheral blood mononuclear cells (PBMCs) that was associated with transplant-free survival (TFS) in two independent cohorts [14] . In a follow up study, the 52-gene signature was used to classify IPF into low-risk and high-risk genomic profiles with significant differences in mortality and TFS [17] findings validated in 6 independent IPF cohorts. In the present study, we hypothesized that genomic risk profiles in PBMC are reflective of changes in the proportion and function of specific immune cell populations in IPF, thus we sought to determine these differences in IPF patients when compared to control subjects and in IPF patients with low versus high-risk gene expression profiles based on selected proteins of the 52-gene signature. To validate our hypothesis, we used mass cytometry by time of flight (CyTOF), a highly parameterized single cell-based platform that can determine functional characteristics of several immune cell subsets in the same subject [18][19][20] . We found distinct immune cell subpopulation patterns and functional markers on each immune cell subsets in PBMCs that distinguished IPF patients from controls, and high-risk from low-risk IPF patients based on the 52-gene signature. Our study identified novel immune aberrations in IPF patients when compared to controls and the cellular source of gene expression changes associated with high risk of mortality in patients with IPF.

Human participants
Studies were carried out with explicit informed consent on protocols that were approved by the Human Investigation Committee at Yale University School of Medicine (approval number 1307012431). IPF diagnosis was based on current guidelines of the American Thoracic Society and European Respiratory Society [21] . Patients who were followed at the Yale Interstitial Lung Disease Center of Excellence were eligible to enroll. Healthy, agematched controls were recruited from Yale's Program on Aging and from the greater New Haven community.

Isolation of peripheral blood mononuclear cells (PBMCs)
PBMCs were isolated from whole blood by using Ficoll paque-based separation (Stem Cell Technologies, Vancouver, BC, Canada) and stored in liquid nitrogen for future studies.

Study strategy and patient population
We designed a CyTOF panel of 30 metal isotope-tagged antibodies to analyze and compare the phenotypic and functional heterogeneity of PBMCs across IPF patients and control ( Figure 1 and Table 1). The phenotypic panel contained 15 lineage markers to distinguish the major immune cell populations of PBMCs. The functional panel contained 15 selected protein markers of a 52-gene signature previously associated with increased IPF mortality.
These two panels were combined with PBMC samples from control individuals (N=5) and IPF patients (Total IPF N=15; patients with low-risk genomic profile N=9 and patients with high-risk genomic profile N=6). The patient's information is listed in Table 2 and Table 3.

Palladium (Pd) barcoding and staining by automated robotic platform
Twenty-samples were processed in batches using a state of the art automated Biomek robotic platform (custom BeckmanCoulter Biomek® robotic machine FXp) for cell barcoding and staining to minimize experimental variations [24,25] . Samples were washed and counted using automated counter (Guava). 500,000 cells for each sample were

Preprocess
The Single Cell Debarcoder was used to debarcode the barcoded FCS file into separate FCS files for each subject [26] . The marker intensities in each sample were log10-transformed.
Cell events with low intensities for the two DNA markers (191Ir and 193Ir) and cell events with high intensity for the cell viability marker (195Pt) were gated out using the threshold defined as three median absolute deviations below the median intensity for the pooled cells.
viSNE viSNE was implemented in MATLAB and Cytobank . [27] . 10,000 cells were uniformly subsampled from each group. The phenotypic markers were selected for building up the structure of viSNE map. viSNE maps showed marker expression levels from low to high as color coded. Analysis was performed multiple times with downsampling to optimize graphical output and the results were comparable and stable.

Statistical analysis
To decide whether each cell event belongs to the cell subset of interest, we performed the following analysis: for each phenotypic marker, we classified all cell events into cell events having low, medium, or high expression of this marker using K-means clustering method with the number of clusters fixed at three. Based on the expression levels of all relevant phenotypic markers for a given cell subset of interest, cell events were decided whether they belonged to this cell subset. Because the definitions of cell subsets are not mutually exclusive, a cell event can be classified into multiple cell subsets. For clarity, we excluded ambiguous cell events from our analysis, which were classified into more than one cell subsets. For a given cell subset, we conducted differential analysis in the proportions of this cell subset for both IPF patients versus controls and low-risk versus high-risk patients. The differential analysis was done using a generalized linear mixedeffects model for binomial data, in which the group label was included as a fixed effect and random intercepts were included to account for intra-patient correlations. For functional markers on each of the cell subset, we conducted differential analyses for both IPF patients versus controls and low-risk versus high-risk patients. The differential analysis was done using a linear mixed-effects model, in which the group label (IPF versus controls, IPF low-risk versus IPF high-risk) was included as a fixed effect and random intercepts were included to account for intra-patient correlations. We declared a p-value < 0.05 to be significant. All statistical analyses were conducted using R statistical software (v.3.4.0) and the lme4 package.

Mass cytometry identifies differences in immune cell subpopulations between IPF patients and control subjects.
Our study design is summarized in figure 1. viSNE plots are shown in two dimensions and each dot represents a single cell positioned in the two dimensional space. The intensity of each marker was visualized as color bar indicated ( Figure 2). By K-mean clustering, 17 immune cell subpopulations were inferred based on 15 selected phenotypic surface markers. This robust unsupervised clustering algorithm allowed us to automatically classify the number of cell events into each of the cell subsets studied based on more than ten markers. We compared cell frequencies between IPF patients and control subject on the total of 17 immune cell populations analyzed and identified distinct immune cell composition between these 2 groups ( Figure 3A and B). As indicated in Figure 3C, myeloid DCs and classical monocytes were increased in IPF patients versus controls (myeloid DC, p<0.05; classical monocytes, p<0.001). In addition to monocytes, activated CD4 + Th cells and activated CD8 + Tc cells were also significantly increased in IPF patients versus controls (activated Th cells, p<0.01; activated Tc cells, p<0.001). Naïve CD4 + Th cells were significantly reduced in IPF patients versus controls (p<0.01).

52-gene signature functional markers differentiate IPF patients and controls
To determine cell-subtype specific changes of functional markers of the 52-gene signature, at single-cell resolution, we analyzed the expression of 15 selected targets of this signature in each of the 17 immune cell subsets studied. We found statistically significant differences in several cell subsets between IPF patients and controls. Notably, both innate and adaptive immune cell subsets showed differences between IPF and controls. In non-classical monocytes, IPF patients exhibited lower expression of CD127  Figure 6A and B. In general, the subset distribution between PBMC samples from low high-risk patients was similar with the exception of a significant increase in the frequency of classical monocytes in patients with a high-risk genomic profile compared to low risk (p<0.05, Figure 6C). The proportion of naïve Th cells was decreased in high-risk versus low-risk IPF patients (25.7% versus 33.8%), however, this difference was not statistically significant (p=0.12).

52-gene signature functional markers differentiate IPF patients with low-risk versus high-risk genomic profiles and identify the cellular source of gene expression changes
We identified significant differences in functional markers from the 52-gene signature between IPF patients with low-risk versus high-risk genomic profiles. CD121b was significantly higher in classical monocytes in high-risk versus low-risk IPF patients ( Figure   7A, p<0.05). In non-classical monocytes, CD127, CD96, and LCK were lower in high-risk versus low-risk IPF patients ( Figure 7B, CD96, p<0.05 ; LCK, p<0.05; CD127, p<0.05).
Analysis of the CD4 + T helper cell compartments showed that the expression of CD28 was lower in high-risk versus low-risk IPF patients in the memory Th cell subset ( Figure 7C, CD28, p<0.001). Lower expression of ETS1 and CD96 was detected in high-risk versus lowrisk IPF patients in the effector Tc cell subset ( Figure 7D, ETS1, p<0.01; CD96, p<0.05).
LCK expression was lower in high-risk versus low risk IPF subjects in the memory Tc cell subset ( Figure 7E, LCK, p<0.01). In B cell populations, high expression of ICOS was noted between high-risk versus low-risk IPF patients in naïve B cells ( Figure 7F, ICOS, p<0.05).
Within IPF patients, significant differences were also noted in granulocytes, where increased levels of MCEMP1 and CD121b were observed in high-risk versus low-risk IPF patients ( Figure 7G, MCEMP1, p<0.0001; CD121b, p<0.05).
In summary, we present directional changes of all the functional makers in low-risk versus high-risk IPF patients in Figure S1. Considering that one of the main goals of this study was to identify the cellular source of the gene expression changes of the 52-gene signature in IPF of bulk RNA from PBMC [14,17] , we further compared cell-type specific changes with our previous gene expression results (Table 4). As indicated, MCEMP1 and CD121b showed increased expression and CD28, CD96, CD127, ETS1, LCK and ETS1 showed decreased expression in both bulk RNA from PBMC in low-risk and high-risk IPF subgroups by gene expression and by mass cytometry analysis independently, validating the cellular source of functional targets of the 52-gene signature.

Discussion
We have previously identified and validated a 52-gene signature in PBMC predictive of mortality and poor disease outcomes in IPF [14,17] . Here, we performed deep immunophenotyping using mass cytometry in IPF samples and controls, compared the immune composition between these two groups and in IPF patients with high and low genomic profiles based on the 52 gene signature associated with their respective high and low risk of mortality. We also analyzed 15 selected functional makers of the 52-gene signature by mass cytometry to identify the cellular source of gene expression changes and compared the levels of these markers in IPF and controls samples and between IPF patients with different genomic risk profiles.
When we analyzed the immune cell composition of IPF and control samples, we observed that monocytes and myeloid dendritic cells were significantly increased in PBMCs of IPF patients. [28,29] Our finding demonstrating increased monocytes in IPF patients deserves special attention. Increased monocyte counts appear to have prognostic value in IPF and other fibrotic lung diseases suggesting a role in lung fibrosis progression [29,30] . In this study, we demonstrated increased proportions of classical monocytes in patients with a high-risk genomic profile based on the 52-gene signature, previously associated with increased IPF mortality [14,17] . Our findings validate a very large study by Scott et al involving 7000 patients with fibrotic lung diseases from five independent cohorts [30] showing that increased monocyte counts were associated with increased mortality and poor disease outcomes in IPF and other fibrotic lung diseases. While the findings by Scott et al are impressive, they are limited by the fact that authors did not perform detailed immunophenotyping of the monocytes involved in IPF progression. Our study shows that increased proportions of classical monocytes, particularly those expressing CD121b are associated with increased risk of mortality in IPF. CD121b (Interleukin 1 receptor, type), one of the IPF mortality prediction genes when up-regulated interacts with TOLLIP, a gene that encodes a ubiquitin-binding protein involved in interleukin-1 receptor trafficking and in the turnover of IL1R-associated kinase. Interestingly, TOLLIP variants have been associated with IPF survival [31] thus follow up studies are required to establish the relationship between these two proteins in patients with high risk genomic profiles based on the 52-gene signature. It is important to mention that human findings suggesting the role of monocytes in IPF progression are supported by murine studies in bleomycininduced lung fibrosis. A recent report showed that circulating Ly6C hi monocytes facilitated the progression of pulmonary fibrosis in mice exposed to bleomycin through direction of alternatively activated profibrotic macrophages [32] .
Besides monocytes, we also found increased proportions of circulating dendritic cells in IPF patients when compared to control subjects. While previous reports have shown increased immature and mature DCs in the lung tissue [33] and bronchoalveolar lavage [34] from IPF patients, to our knowledge, our report is the first to demonstrate increased levels of DCs in peripheral blood of IPF patients when compared to controls. This finding is relevant since DCs may have a role in lung fibrosis pathogenesis and progression as demonstrated by the large accumulation of mature DC in mice lungs after bleomycin exposure [35] and the fact that antibodies used to block DC maturation or deplete mature DC chemokine receptors attenuate the pathological hallmarks of pulmonary fibrosis in this model [36] .
Our study also identified higher frequencies of activated T lymphocytes, mainly activated Th and Tc subsets in PBMCs of IPF patients when compared to controls. Other reports have shown than relative to controls, lung tissue and BAL fluid from patients with IPF are enriched for several population of T lymphocytes [37,38] . One of the unexpected findings of our study, which likely resulted from the advantage of deep immunophenotyping using mass cytometry, is that CD4 + T cell from IPF patients have a skewed composition with an subsets correlated with IPF [29] , we have previously shown that decreased proportion of CD4 T cells in peripheral blood was found to be associated with poor IPF outcomes. Further studies including more samples and additional validation cohorts will be required to finally determine whether lower proportion of naïve T cells and not increased proportion of activated T cells in peripheral blood are associated with IPF progression and mortality.
Our study not only shed light regarding the differences in immune cell composition in IPF patients when compared to controls and between IPF patients with different genomic subphenotypes associated with increased mortality but more importantly, validated our previous findings demonstrating that decreased levels of T-cell costimulatory pathway genes such as CD28 and LCK can be seen in T-cell subsets of IPF patients with a high-risk genomic profile. Mass cytometry also provided some unexpected findings regarding our previously identified survival prediction genes such as decreased levels of LCK in nonclassical monocytes and decrease levels of ICOS in B-cells in patients with a high-risk profile. Decreased levels of CD127, CD96 were also seen in non-classical monocytes whereas ETS1 and CD96 were decreased in seen in effector T-cells in the in the IPF highrisk subgroup. Our study also showed other novel findings such as increased expression MCEMP1 [39] , mast cell-expressed membrane protein 1, one of the mortality predictive genes in IPF patients when upregulated based on our previous findings, when IPF samples were compared to control samples. Increased MCEMP1 protein levels were seen in IPF in several immune subpopulations such as classical monocytes, naïve Th cell, effector Th cells and granulocytes. These and other findings of our study will require additional validation in larger cohort of patients.
Despite our impressive results and the validation of our previous gene expression findings, our study obviously has limitations, the two most important being our small sample size and the lack of a validation cohort. While we were able to perform genomic profiling of N=425 individual IPF patients in our 52-gene signature validation study [17] , only a limited number of samples had viable PBMC collected simultaneously for mass cytometry. This limitation may have impaired our ability to detect more granular differences in the proportions of immune cell subpopulations studied across subgroups. It may have also impaired our ability to detect more impressive changes of functional markers in patients with low-and high-risk genomic profiles. Additional studies with larger number of samples and a validation cohort with simultaneous gene expression and mass cytometry measurements are required to further validate our findings. Future studies should also include the collection of time course samples to determine whether cellular and functional changes are static or dynamic, particularly in response to anti-fibrotic therapies. Another limitation of our study is the purity of PBMC samples, while Ficoll-Paque isolation provides a highly cell purified pellet containing more than 95% mononuclear cells, they may include up to 3±2% granulocytes. In our data, we were able to detect a strong signal from granulocytes between studied groups and although they accounted for only a minor percentage of cells this may be off relevance since a recent study showed that accumulation of aged neutrophils in the lung may cause fibrotic interstitial lung disease [40] thus, additional studies looking a gene expression and functional differences in neutrophils in IPF is warranted, however, this is beyond the scope of our study.

Ethics approval and consent to participate
The study was approved by the Yale University Human Investigation Committee (approval number 1307012431), and written informed consent was obtained from all the participating patients.

Consent for publication
Not applicable.

Availability of data and material
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Competing interests
Dr. Kaminski reports personal fees from Biogen Idec, Boehringer Ingelheim, Third Rock, Miragen, Pliant, personal fees from Samumed, NuMedii, Indaloo, Theravance, LifeMax, Three Lake Partners, outside the submitted work; In addition, Dr. Kaminski has a patents on New Threapies in Pulmonary Fibrosis Peripheral Blood Gene Expression in IPF. Other authors declare that they have no competing interests.       The expression of cell type-specific functional molecules differentiates IPF patients from control subjects. Functional molecules displayed significantly differential changes between IPF patients and normal subjects in non-classical  The expression of cell type-specific functional molecules differentiates high-risk from low-risk IPF patients. Functional molecules displayed significantly differential changes between high-risk and low-risk IPF patients classical monocytes (A), non-classical monocytes (B), memory Th cells (C), effector Tc cells (D), memory Tc cells (E), naïve B cell (F) and granulocytes (G). * represents p<0.05, ** represents p<0.01, *** represents p<0.001, **** represents p<0.0001

Supplementary Files
This is a list of supplementary files associated with the primary manuscript. Click to download.