High-resolution landscape of NSCLC by scRNA-seq
The brief workflow is shown in Fig. 1A. Four NSCLC tumor tissue and matched circulating blood samples were used in our study. Nine types of cells were identified by UMAP clustering analysis (Fig. 1B). Most of the cells in TILs are T cells. Ten types of T cells were identified. The most abundant T cells are natural killer T and effector memory CD4+ T cells. Graph-based clusters were manually annotated based on known marker genes for the main expected cell types (Fig. 1C).
The proportion of each T cell lineage varies greatly in different tumors and blood samples. Compared with blood samples, naïve CD4+ and effector memory CD8+ T cells are significantly increased in tumor, while resident memory CD8+ and regulatory T cells are decreased (Fig. 1D). Natural killer T cells contains the largest number of differentially expressed genes (DEGs). Volcano plots and Kyoto encyclopedia of genes and genomes (KEGG) pathways of the DEGs in natural killer T and naïve CD4+ T cells were shown in Fig. 1E and F. Some pathways have significant systemic anti-tumor effects in immunotherapy, such as: MARK, p53, NF-kappa B and TNF signaling. The cell differentiation pathways of Th1, Th2 and Th17 were also found, which drove CD4+ T naive cells into diverse T helper subsets.
Identification of potential prognostic biomarkers of TILs in NSCLC
Next, by using Kaplan-Meier plots, DEGs in T cells were used to further study their overall survival (OS) and recurrence-free survival (RFS). Best cutoff analysis showed that the low expressed of CD8A, CD8B, CD38, CD69, ENTPD1, GZMA, GZMH, MYO1F, SYNE1, TSC22D3 and XCL2 were associated with the poor OS and RFS of LUAD patients (Fig. 2A).
Some of them have higher Moran’s I values, and they may largely represent markers of different T cells (Fig. 2B). We consistently observed substantially higher Moran’s I values of them and revealed that they may largely represented markers of different T cells. CD69 and TSC22D3 have peaks in exhausted CD8+ T cells, GZMA, GZMH and SYNE1 in resident memory CD8+ T cells, while CD8A and CD8B increased in effector memory CD8+ T cells. CD69, GZMA and TSC22D3 were highly expressed in T helper cells, but low expressed in naïve CD4+ T cells.
In hierarchically-clustered heatmaps, compared with other types of T cells, CD8A, GZMA, CD8B and GZMH are highly expressed in effector memory CD8+ and naïve CD4+ T cells (Fig. 2C). Furthermore, box-and-whisker plots of these genes were shown. CD69, MYO1F, SYNE1, and TSC22D3 were significantly higher (P-value < 0.01) expressed in tumor tissue than in normal samples (Fig. 2D). The expression levels of these genes spanning different stages are retrieved (Fig. 2E). They were lower in the late/advanced stage of LUAD patients, but higher in early/primary stage (except CD38). All these indicate that the above genes could be used as potential prognostic biomarkers of TILs in NSCLC.
Identification of lower expressed genes in exhausted CD4/8+ T than in CD8+ T cells
Next, we used We used the GSE123902 dataset which contains transcriptionally profiling of single cells from patients spanning different stages of LUAD progression to discover genes which are lower expressed in exhausted CD4/8+ T than in naive CD8+ T cells. First, ten types of cells were identified by using UMAP (Fig. 3A). By using hierarchical clustering of heatmaps (Fig. 3B). CD8A, CD8B, CD69, GZMA, GZMH and XCL2 were higher expressed in naive CD8+ and CD8+ T cells than other types of cells.
Next, afore-mentioned genes across different stages of LUAD progression were studied. All of them are statistically lower (P-value < 0.05) expressed in metastatic than in normal or primary tumor patients (Fig. 3C). In addition, CD8A, CD8B, CD69, GZMA, GZMH and XCL2 were significantly lower (P-value < 0.0001) expressed in exhausted CD4/8+ than in naive CD8+ or CD8+ T cells (Fig. 3D). Among all these genes, GZMA has the highest expression (about 1.2) in CD8+ T cells, therefore GZMA could be a potential biomarker for the prognosis of patients with LUAD.
GZMA was also correlated with some genes by using bulk RNA-seq GSE90728 (Fig. 3E). It was positively correlated with cyclins, such as: CCL5, CD3D, CCL3, CD2, CCR5, CD3G, CD27, checkpoint inhibitor: LAG3 and PDCD1, chemokine: CXCR6, natural killer cell: ID2, cytokine: IFNG, GZMK, LF2, cell phenotype: CCR5 and CD27, and negatively correlated with CD68 and IL7R. These genes may have anti-tumor immune functions related to GZMA. Therefore, GZMA can be used as a novel prognostic marker of TILs in LUAD.
Finally, the correlation between immune cell infiltration and GZMA expression was evaluated by TIMER2.0 15. As shown in the scatter plots (Fig. 3F), GZMA negatively correlates with tumor purity (correlation = -0.427, P = 2.67e-23), suggesting that the main source of GZMA expression detected are stromal and immune cells. Also, GZMA was positively correlated with immune infiltration of CD8+, CD4+, regulatory, gamma delta T, myeloid dendritic cell, macrophage, and B cell. These findings further revealed a strong relationship between GZMA and immunosuppression in the inflammatory tumor microenvironment.
scTCR-seq repertoire of TILs in NSCLC
To investigate the diversity and dynamics of T cell repertoire of TILs in NSCLC, a combination of scRNA-seq and scTCR- seq data from GSE162500 was used 11. Firstly, the number of unique clonotypes pre-patient was studied (Fig. 4A). Except for patient 58, the percentage of unique clonotypes in circulating blood was higher than that in tumor tissue. These findings indicated that circulating blood has more unique clonotypes than tumor tissue.
In fact, by tracking the single clones of all patients, we revealed the top 25 most amplified cell clones, and these clones had significant dynamic changes in patients (Fig. 4B). Except for patient 58, all these clones had more significant amplification in tumor tissue than in circulating blood.
Corresponding to Fig. 1b, the top eight most amplified cell clones were concentrated in terminally exhausted CD8+ T cells, with the first one referring to the amino acid sequence: CAASRNAGNMLTF_CASSISGTGEIGEAFF (Fig. 4C). There are more terminally exhausted CD8+ and regulatory T cells clones in tumor tissue than in circulating blood samples, and most of them have been expanded in large and medium scales.
Top ten uses of TCR gene are shown in Fig. 4D. A putative binding motif for CDR3 is predominantly composed of 5-residue peptides. The first two peptides are SSSSG and GGGGF, with polar residues S and G and hydrophobic residues F and A.
Interestingly, highest diversity indices were observed in dendritic cells and macrophages, while natural killer and T were the lowest (Fig. 4E). To calculate the degree of TCR repertoire overlap among various cell types, we used the overlap coefficient method. There is substantial degree of 36.4% TCR clones overlap between dendritic cell and T cells, between macrophages and T cell populations (Fig. 4F).
The existence of clonal cells spanning several different tissue sites suggests the migration, transformation and expansion of the specified cell types. As expected, among all the cells, macrophages (FTL) and natural killer (GNLY) showed the highest clonal transition, while T (IL7R) cells showed the highest clonal expansion index (Fig. 4G and H). Moreover, macrophages (FTL) and dendritic (AIF1) cells transfer the most CD3 TCR clones to T (IL7R) cells. These results implicate a potential role of different immune cells in shaping the NSCLC during T cell infiltration
scTCR-seq repertoire of CD4+ and CD8+ T cells in NSCLC
Next, we investigated the diversity and dynamics of CD4+ and CD8+ T cells repertoire in NSCLC. Figure 5A shows the UMAP, developmental trajectory and clonalNetwork of CD4+ T cells. T helper appeared at first, followed by effector memory or regulatory T cells. It is worth noting that naive cells will still appear, and mainly accumulate at the final stage of progression. Effector memory cells receive most clones from regulatory and naïve cells.
From alluvialClonotypes in Fig. 5B, effector memory shows the high proportion. There are 13.5% clone overlapping between regulatory and T helper cells (Fig. 5C). Interestingly, highest diversity indices were observed in naïve and effector memory cells, while T helper cells was the lowest (Fig. 5D). Besides, T helper showed the highest clonal transition and expansion index (Fig. 5E).
In CD8+ T cells, cytotoxic appeared at first, followed by exhausted or memory cells. Terminally exhausted cells receive most clones from pre-exhausted and resident memory cells (Fig. 5F). The proportion of terminally exhausted cells is the highest (Fig. 5G). There is 13.5% clone overlapping between cytotoxic and terminally exhausted cells (Fig. 5H). Higher diversity indices were observed in effector memory and pre-exhausted cells, while terminally exhausted was the lowest (Fig. 5I). Also, cytotoxic and resident memory showed the highest clonal transition index, while terminally exhausted cells showed the highest clonal expansion (Fig. 5J).
Collectively, in CD4+ T cells, T helper (CXCL13) and regulatory T (FOXP3) cells have higher ability of transition and expansion. They also transfer the largest number of CD3 TCR clones to each other. In CD8+ T cells, cytotoxic (NKG7) and resident memory (GZMK) have higher ability of transition, and terminally exhausted (CCL5) have higher expansion ability. Cytotoxic (NKG7) transports the most CD3 TCR clones to terminally exhausted (CCL5) cells (Fig. 5K).
Developmental trajectories of NSCLC during tumor progression
To further understand the different cell states in NSCLC spanning different stages, the developmental trajectories of tumor cells was studied. Figure 6A shows the human tissue samples in GSE123902: adjacent non-tumor involved lung (n = 4, ‘normal’), primary LUAD (n = 8, ‘primary tumor’), and metastatic tumor of LUAD (n = 5, ‘metastasis’).
UMAP of different types of cells is shown in Fig. 6B. Naïve CD4+ T cells only appeared in normal samples, while regulatory T cells in metastasis. Pre-exhausted CD8+ T cells showed up in primary tumor, while cytotoxic CD8+ T cells in metastasis.
The predicted ordering of different cell types was displayed in box plots (Fig. 6C). In normal samples, macrophages first appeared, followed by CD4+, CD8+ T cells. In primary tumor, progenitor and epithelial developed at first, then is T helper, effector CD8+, CD4+, pre-exhausted CD8+ T cells. Furthermore, cytotoxic CD8+ T, effector memory and recently activation CD4+ T still exist at the late stage of metastasis.
Additionally, the top 10 (less differentiated; red) and bottom 10 (most differentiated; blue) genes in this dataset based on their correlation with CytoTRACE can be predicted (Fig. 6D). Some of them are significantly related to the specific states in different types of cells in pseudotime trajectory (Fig. 6E). In normal samples, FGFBP2, NKG7 and PRF1 are associated with cytotoxic CD8+ T cells; while FTH1, FTL and HLA-DRA are correlated with macrophages. In primary tumor samples, BTG1, CCL5, CXCR4, FTL, IL7R, and TMSB10 are highly expressed during the process of development. In metastasis samples, FTH1 and FTL were related to either CD4+ or CD8+ T cells
The expression profiling of cytokines, checkpoint receptors and their ligands during tumor progression
Cytokine therapy helps the immune system stop the growth of cancer cells or kill them 21. To address the expression profiling of cytokines receptors and their ligands in different samples, 26 kinds of cytokines were selected (Fig. 7A). Most of them are highly expressed in primary tumors, while CCL5 and CCL4 are highly expressed in cytotoxic CD8+ T cells of metastasis samples. Cytokine receptors are highly expressed in macrophages of normal and primary tumor samples, while CXCR4 is highly expressed in CD4+ T cells in metastasis samples.
It is generally believed that cancer cells are the only source of checkpoint ligands and are responsible for inhibiting T-cell immune responses 22. Checkpoint receptors are highly expressed in chronic activation CD4+ T cells of normal samples, T helper of primary tumor samples, and regulatory T of metastasis samples (Fig. 7B).
Next, the transcript levels of some ligands and receptor in different samples were also studied (Fig. 7C). Notably, in primary tumor samples, FTL was highly expressed in T helper cells, and IFNG was in pre-exhausted CD8+ T cells. In normal and metastatic samples, CCL5 was highly expressed in cytotoxic CD8+ T cells. TNFRSF4, TIGIT and CTLA4 were highly expressed in regulatory T cells of metastasis samples.
By integrating the data of single cell transcriptome of different samples, the development trajectory of CD8+ T cells is shown below (Fig. 7D). In normal samples, it is from effector memory, then is progenitor and cytotoxicity (FGFBP2, NKG7, PRF1 and CCL5). In primary tumor, it is from effector (CCL5) to pre-exhausted (CCL5 and IFNG). Cytotoxicity (CCL5) still appeared in the metastatic samples.
In CD4+ T cells, the development trajectory is as follows. In normal samples, it starts from chronic activation, T helper, and then naïve. In primary tumor, it is from T helper (TNFRSF4, TIGIT and FTL) to effector. In metastatic samples, it begins with regulatory T (CTLA4, TIGIT and FTL) to effector memory, and then recently activation.