1. Single cell RNA sequencing reveals 17 distinct cell clusters from human lungs with severe COPD.
We conducted single cell transcriptomic analysis of all cells obtained from the whole parenchymal lung tissue of three nonsmokers without underlying lung disease and three patients with severe COPD (demographic data: Table 1). The pathology with the H&E staining confirmed the presence of moderate to severe emphysematous changes in all three COPD patients (see Supplementary Figure S1). We examined a total of 29,961 cells from six subjects (3914–5920 cells/sample). All the samples were pooled and analyzed together to gain the power to detect rare cell types as described previously [24]. t-distributed stochastic neighbor embedding (t-SNE) blots were generated using statistically significant principal components and cells were clustered using an unbiased graph-based clustering algorithm (smart local moving [SLM] clustering), which in identified 17 distinct types of cells (Fig. 1A). Cells from disease conditions and the individual subjects were indicated in different colors (Figs. 1B and 1C, respectively). SLM clustering was made according to the distinct gene expression patterns based on various cell types in the lung (Fig. 1D). The cell clusters that cannot be reliably classified due to fewer unique molecular identifiers were referred as “low quality” cells. Cell clusters were predominantly identified as: FABP4 as a cell cluster of FABP4 macrophages (#0); S100A8 as a cluster of monocytes (#1); CD3D as cytotoxic T cells (#2); GNLY as NK cells (#3); CLDN5 as endothelial cells (#4); LTB as T cells (#5); IGKC as B cells (#6); CAPS as ciliated epithelial cells (#7); SFTPC as alveolar type 2 epithelial (AT2) cells (#8); MALAT1 as low quality T cells (#9); RSPH1 as low quality cells (#10); DCN as fibroblasts (#11); SCGB1A1 as club epithelial cells (#12); AGER as alveolar type 1 epithelial (AT1) cells (#13); CCL21 as lymphatic endothelial cells (#14); TPSAB1 as mast cells (#15); KIAA0101 as proliferating macrophages (#16). These and other markers provided strong transcriptome signatures for each cell cluster (see Supplementary Table 1, shown graphically in feature plots (Fig. 1E). The prevalence of individual cell types between normal nonsmokers and patients with COPD is shown in Table 2. None of the difference in the abundance of individual cell types between normal nonsmokers and COPD patients reached statistical significance (Wilcoxon), however, a trend existed for decreased percentages of macrophages, endothelial cells, AT2 cells, and fibroblasts in the COPD lungs.
Table 2
Identified clusters, number of differentially expressed genes in Case vs. control, mean % of cells belonging to this cluster both in cases and controls, fold change differences between cluster percentage and the p value of the comparison of the cluster frequency in cases vs. controls both with t-test and Wilcoxon-test.
Cluster
|
Population
|
# DE genes
|
up
|
down
|
Control % of cells
|
COPD % of cells
|
Fold Change
|
P value (t.test)
|
P value (Wilcox)
|
0
|
FABP4 Macrophages
|
868
|
620
|
248
|
27.97 ± 10.74
|
5.15 ± 1.57
|
5.43
|
0.02
|
0.057
|
1
|
Monocytes
|
1499
|
677
|
822
|
23.80 ± 12.90
|
11.24 ± 4.84
|
2.12
|
0.18
|
0.23
|
2
|
Cytotoxic T cells
|
539
|
165
|
374
|
11.18 ± 6.48
|
21.86 ± 9.80
|
-1.96
|
0.14
|
0.11
|
3
|
NK cells
|
431
|
233
|
198
|
3.85 ± 2.89
|
22.14 ± 27.41
|
-5.75
|
0.23
|
0.4
|
4
|
Endothelial
|
242
|
185
|
57
|
9.17 ± 3.90
|
2.37 ± 0.52
|
3.87
|
0.03
|
0.057
|
5
|
T cells
|
213
|
64
|
149
|
2.11 ± 1.00
|
11.68 ± 9.02
|
-5.54
|
0.08
|
0.23
|
6
|
B Cells
|
153
|
30
|
123
|
0.99 ± 0.59
|
6.80 ± 7.95
|
-6.84
|
0.19
|
0.23
|
7
|
Ciliated
|
590
|
433
|
157
|
1.84 ± 0.98
|
6.18 ± 4.36
|
-3.35
|
0.10
|
0.23
|
8
|
AT2
|
476
|
34
|
442
|
4.12 ± 2.52
|
0.96 ± 0.32
|
-4.28
|
0.09
|
0.057
|
9
|
Low quality T cells
|
41
|
14
|
27
|
1.75 ± 0.68
|
3.71 ± 2.07
|
-2.12
|
0.13
|
0.11
|
10
|
Low quality cells
|
596
|
369
|
227
|
2.02 ± 0.95
|
2.83 ± 1.50
|
-1.40
|
0.41
|
0.63
|
11
|
Fibroblasts
|
96
|
87
|
9
|
3.17 ± 1.62
|
0.27 ± 0.07
|
11.60
|
0.03
|
0.057
|
12
|
Club cells
|
98
|
70
|
28
|
1.35 ± 0.81
|
2.72 ± 2.16
|
-2.02
|
0.29
|
0.63
|
13
|
AT1
|
68
|
59
|
9
|
2.60 ± 1.91
|
0.46 ± 0.36
|
5.71
|
0.12
|
0.11
|
14
|
Lymphatic-endothelial
|
27
|
20
|
7
|
1.93 ± 1.10
|
0.33 ± 0.29
|
5.85
|
0.06
|
0.23
|
15
|
Mast cells
|
32
|
25
|
7
|
1.32 ± 0.79
|
0.91 ± 0.24
|
1.44
|
0.44
|
0.63
|
16
|
Proliferative macrophages
|
111
|
55
|
56
|
0.84 ± 0.35
|
0.39 ± 0.25
|
2.17
|
0.12
|
0.11
|
2. Gene expression in 17 individual types of cells in severe emphysematous lungs.
For each cluster, the differential gene expression among the cells of the 3 COPD patients vs. the controls was computed using a Wilcoxon test. This analysis identified the number of differentially expressed genes per cluster when applying the filter of log fold change >|0.4| and FDR = 0.05 (Supplementary Table S2). Interestingly, the major populations showing the largest differences in gene expression were: monocytes, macrophages, low quality cells, ciliated epithelial cells, T cells, AT2 cells, and NK cells (Table 2). Most gene expression changes corresponded to an increased transcription (up-regulation) in patients with COPD (Table 2).
3. The Gene Ontology enrichment Gene expression in 17 individual types of cells from severe emphysematous lungs.
Next, we performed the functional enrichment with the upregulated or downregulated genes in COPD for each population for GO biological processes (Supplementary Tables 3 and 4). Results for the upregulated genes in COPD (i.e., macrophages, monocytes, ciliated epithelial cells and NK cells), have been summarized with Revigo and are shown as Treemaps (Supplementary Figures S2-S5). Of note, these four individual cell types shared ontologies related to the activation of T cells, defense response and three of them (not macrophages) also presented ontologies related to the viral life cycle.
4. Single cell contribution to the 127 emphysema-related gene signatures
Next, we identified how the transcriptomic changes of each cell type contribute to the previously described 127 gene signatures of emphysema reported by Campbell et al [23]. We computed the overlap using GSEA (Table 3). Three individual cell types were enriched with a nominal p value < 0.05 and an FDR < 0.1 (i.e., ciliated epithelial cells, cytotoxic T cells, and low quality T cells). The genes in the core enrichment for these 3 individual cell types and the values of differentially expressed genes in Campbell’s data set are shown in Supplementary Table 5. EPAS1, QKI and STOM were differentially expressed core genes in all three-cell types (Supplementary Table 5).
Table 3
GSEA enrichment of the different clusters with the 127 gene signature
Cluster
|
ES
|
NES
|
NOM p-val
|
FDR q-val
|
FWER p-val
|
Ciliated
|
0.33
|
1.33
|
0.00
|
0.09
|
0.00
|
Cytotoxic T
|
0.33
|
1.29
|
0.00
|
0.09
|
0.00
|
Low quality
T cells
|
0.47
|
1.29
|
0.00
|
0.11
|
0.00
|
AT2
|
0.39
|
1.27
|
0.09
|
0.18
|
0.04
|
Fibroblasts
|
0.40
|
1.31
|
0.10
|
0.18
|
0.05
|
T cells
|
0.31
|
1.17
|
0.10
|
0.21
|
0.05
|
Proliferative macrophages
|
0.37
|
1.23
|
0.10
|
0.22
|
0.05
|
B cells
|
-0.24
|
-0.97
|
0.18
|
0.28
|
0.09
|
NK
|
0.31
|
1.05
|
0.29
|
0.37
|
0.14
|
AT1
|
0.33
|
1.00
|
0.43
|
0.52
|
0.22
|
Monocytes
|
0.20
|
0.84
|
0.68
|
0.77
|
0.36
|
Macrophages
|
-0.22
|
-0.92
|
0.70
|
0.83
|
0.37
|
Club
|
-0.20
|
-0.82
|
0.76
|
0.88
|
0.39
|
Low quality
|
-0.25
|
-0.78
|
0.88
|
1.00
|
0.43
|
Endothelial
|
-0.17
|
-0.53
|
0.89
|
1.00
|
0.44
|
Mast cells
|
0.28
|
0.77
|
0.92
|
1.00
|
0.47
|
As a complementary analysis, we showed which individual cell types differentially expressed the 127 emphysema related genes (Supplementary Table 6 and Supplementary Fig. 6). FCN3, RTN4 and CCR7 were differentially expressed in a total of 5 individual cell types, and EPAS1, QKI and STOM in the 4 individual cell types. The individual cell types with more differentially expressed genes were: monocytes (n = 10), AT2 cells (n = 9), macrophages (n = 8), ciliated cells (n = 5), and T cells (n = 5).
5. Comparison with severe airflow limitation genes.
Next, we determined whether the gene expression changes observed in the individual cell types represent the previously identified gene signatures in whole lung tissue of patients with severe airflow limitation. To achieve this, we assessed the enrichment with GSEA of the individual cell types with the differentially expressed genes between n = 17 non-smokers and n = 30 patients with GOLD stage 4 obtained from LTRC (GSE47460, GPL14550). Overall, the enrichment in all individual cell types appeared to be consistent with the genes differentially expressed in whole lung tissue. However, the enrichment was significant at a nominal p value for only 4 individual cell types (mast cells, proliferating macrophages, monocytes, and FBPB4 macrophages). The full list of genes differentially expressed in the LTRC and differentially expressed in the individual cell clusters is shown in Supplementary Table 7.
Our analysis identified several genes expressed in a distinct type of cells, which were previously reported to be differentially expressed in lung tissue homogenates according to the severity of airflow limitation [14, 31]. These genes are FGG (AT2 cells), CCL19 (monocytes), PLA2G7 (macrophages), HP (macrophages), TNFSF13B (monocytes) and FCRLA (B cells). In addition, following genes were differentially expressed in several individual types of cells: S100A10 (n = 7), RPS10, GNG11, and CAV1 (n = 6), S100A6 (n = 5), and AGER (n = 4).
6. Quantifying protein levels of some genes significantly altered between normal and COPD lungs.
To determine whether significantly altered genes between normal and COPD lungs also correlate with the related protein levels, whole parenchymal lung tissues from non-smokers without COPD (n = 5 per group) and former-smokers with COPD GOLD stage 3 or 4 (n = 6 or 7 per group) were evaluated for protein expression of QKI, STOM, EPAS1, and IGFBP5 by immunoblot analysis. Consistent with altered gene expression, QKI and IGFBP5 protein levels were significantly increased in the COPD lungs relative to non-smokers (Figs. 2A and 2B), but neither STOM nor EPAS1 protein levels were altered (Supplementary Figure S7A). Further, we determined whether QKI and IGFBP5 gene expressions in whole lung tissue correlated with the emphysema severity. QKI expression appreared to decrease according to % emphysema (n = 208; p = 0.0854), whereas, IGFBP5 expression significantly increased according to % emphysema (p = 00150). Since IGFBP5 is an excretory protein, we measured the serum levels of IGFBP5 in smokers with or without COPD (n = 40, each group) that were not significantly altered between the two group (Supplementary Figure S7B). These results suggest that some of the significantly altered genes in COPD lungs identified by scRNA seq indeed correlate with the individual protein levels in the whole lung tissue.