Tumor Characteristics and Single-cell Transcriptome Profiling of Immune Cells
To uncover the complexity of immune ecosystem in TNBC, we performed deep single-cell RNA sequencing on immune cells flow-sorted from the primary tumors of 14 treatment naïve TNBC patients based on the expression of CD45 (Fig. 1a and Supplementary Fig. 1a). After filtering out cells with low quality, we obtained 9,683 CD45+ cells with an average of 14.80 million uniquely mapped reads per cell (Supplementary Fig. 1b and Supplementary Table 1). This sequencing depth assured the detection for genes with low expression level, allowing reliably profiling of cytokines and transcription factors in immune cells (Supplementary Fig. 1c).
The Immune Landscape of TNBC
To reveal the immune cell populations in TNBC, we performed unsupervised clustering using Seruat method on CD45+ cells and obtained 22 cell clusters, which were then visualized by t-SNE algorithm 24 (Fig. 1b). We identified the immune cells as T cells, B cells, macrophages and DCs based on the expression of classic cell type markers as well as reference component analysis 25 (Fig. 1c and Supplementary Fig. 1d). With this approach, nine clusters were annotated as T cell clusters, six as macrophage clusters, three as B cell clusters, two as DC clusters, and also there remained two clusters that could not be well determined (Fig. 1b and Supplementary Fig. 1e). The two DC clusters could be further categorized as one classical DC subset that expressed CD1c and Dectin 1 (CLEC7A), and one plasmacytoid DC subset specifically expressing IL3RA and CD303 (CLEC4C)26. For the B cell clusters, all three clusters are CD10- mature B cells27. Most of the clusters consisted of cells from multiple patients, indicative of common immune traits among patients (Fig. 1d; Supplementary Fig. 1f). Besides, the proportion of different immune cell types and clusters varied across patients, consistent with descriptions in previous researches (Fig. 1e,f; Supplementary Fig. 1g) 17.
Subtype Analysis of Tumor Infiltrated T cells
Next, we sought to characterize infiltrated T cells in TNBC tumors. Unsupervised clustering of CD45+ cells identified nine T cell clusters, including three clusters of CD4+ T cells, and six clusters of CD8+ T cells (Fig. 2a). Distinct signatures of T cell clusters were revealed by expression patterns of known T cell functional markers (cytotoxic, naïve, regulatory or exhausted) and differentially expressed genes (DEGs) among clusters (Fig. 2b, c; and Supplementary Table 2). Among the CD8+ clusters, T2 and T4 were suggested as CCR7+ T cell clusters for their specific expression of CCR7, SELL, and LEF1, indicative of naïve or memory T cell 28. To reveal the proportion of naïve T cells in T2 and T4, we analyzed the expression of CD45RA/RO isoforms (method). We showed that the proportions of CD45RA, which indicates T cells’ naïve status, were 14.2% and 8.0% for T2 and T4, respectively. This observation was further confirmed by flow cytometry analyses on cells isolated from three freshly collected TNBC samples, using antibodies of CD45RA, CD45RO and CD8. We found that CD45RA+CD8+ cells accounted for 6.33%, 8.27% and 17.46% of CD8+ cells, respectively (Supplementary Fig. 2). These results indicated that naïve T cells accounted for a small portion of T2 and T4, others were considered as memory T cells. T1 and T6 showed high expression of exhaustion markers HAVCR2 (TIM-3), TIGIT, and LAG329, suggestive of exhausted CD8+ T cell clusters. T3 and T5 were characterized by high expression of genes associated with cytotoxicity, including IFNG, PRF1, GZMA and GZMB, and meanwhile expressed low levels of exhaustion markers, representing cytotoxic T cell clusters (Fig. 2b).
For the three CD4+ clusters, T7 and T9 exhibited remarkable regulatory T cell (Treg) features for high expression of FOXP3 and IL2RA (CD25). The remaining CD4+ cluster T8 was marked by high levels of CXCL13, CD200, BTLA, PDCD1 (PD-1), and CTLA4, consistent with previously reported dysfunctional T helper cell (Th) features30, 31. Both CD8+ and CD4+ T cell clusters exhibited distinct distributions among patients (Fig. 2d).
Genes Uniquely Expressed by Exhausted CD8+ T Cells and Tumor-infiltrated Tregs in TNBC
Current immune checkpoint blockade therapies mainly target exhausted CTL or Tregs. To uncover potential therapeutic targets for immunotherapy of TNBC, we sought to investigate genes uniquely expressed by these two immunosuppressive T cell subsets. By comparing the expression profiles of exhausted CD8+ clusters (T1 and T6) with non-exhausted clusters (T2, T3, T4 and T5) using R package limma, we obtained a set of 114 exhaustion-specific genes (adjusted P < 0.01 and log2FC≥ 1, Supplementary Table 3), including multiple known exhaustion markers, such as TIGIT, HAVCR2, LAG3, and CTLA412 (Fig. 2e). Among these genes, 23 were also demonstrated in the exhausted CD8+ T cell-specific gene list by a previous study of liver cancer 31 (Fig. 2e; Supplementary Table 4). Notably, tumor-infiltrating CD8+ T cells in our TNBC tumors expressed low level of PDCD1 (PD-1), while high levels of other exhaustion markers such as HAVCR2, TIGIT, and CTLA4 were detected (Fig. 2g). While both exhausted T cell clusters showed high levels of exhaustion marker expression, T1 was also characterized by tissue-resident memory T (TRM) cell feature for high expression of TRM marker genes, such as CD69, ITGAE (CD103) and ITGA132(Fig. 2c).
The same method was applied to tumor-infiltrating Tregs and a total of 343 genes uniquely expressed by Tregs were identified (Supplementary Table 5). The TNBC Treg-specific genes largely overlapped with those identified in previous studies in liver cancer, melanoma and breast cancer 31, 33 (Fig. 2f).
Pseudotime State Transition of T cells and a “Pre-exhaustion” T Cell Subset
Pseudotime analysis provides us a method to depict the T cell developmental trajectories that correspond to biological processes such as activation or exhaustion, as recently suggested 31. To probe into the T cell functional state transition in TNBC, we used the Monocle2 algorithm to order CD8+ and CD4+ T cells in pseudotime based on transcriptional similarities34. The CD8+ T cell trajectory began with cells of CCR7+ clusters T2 and T4, which were separately located at two different branches, followed by cytotoxic clusters T3 and T5, and terminated with exhausted clusters T1 and T6 (Fig. 3a). The enrichment of exhausted T cells at the late period of the pseudotime was in line with previous analysis 31, 32, indicating the CD8+ T cell developmental process through activation to exhaustion. Similarly, we analyzed the CD4+ T cells’ differentiation trajectory, in which Treg clusters T7 and T9 were located at the other end of the exhausted CD4+ Th cluster T8, demonstrating distinct functional states among these T cell subsets (Fig. 3b).
As shown by the trajectory analysis of CD8+ T cells, T5 appeared to be an intermediate functional state, locating between cytotoxic cluster (T3) and exhausted clusters (T1 and T6). Comparison of the expression patterns of T5 and T3 revealed a set of 259 genes with elevated expression in T5, including exhausted marker genes CTLA4, HAVCR2, and TIGIT (Fig. 3c). Survival analysis of the BC cohort from METABRIC revealed that patients expressing T5high T1low or T3high T1low signature were associated with better overall survival and breast cancer specific survival than patients expressing T1highT5low or T1highT3low signature, respectively (Fig. 3d; supplementary figure 3). Also, patients expressing high T3 and low T5 signature showed comparable survival when compared with high T5 and low T3 signature patients (Fig. 3d; supplementary figure 3). This survival disparity suggested that although T5 showed an elevated expression of exhausted markers, its signature was associated with more favorable prognosis than the exhausted T1 signature, consistent with the description of a “pre-exhaustion” T cell state that have been suggested in lung cancer and liver cancer 31, 32. Genes involved in the cellular senescence pathway, such as KDM6B and ZFP36L2 (Supplementary Table 2), were highly expressed by cells of T5, indicating a possible mechanism that triggers T cell exhaustion 35.
Clonal Enrichment of Exhausted T Cells and Tregs in TNBC Microenvironment
TCR analysis provides another approach to gain insight into the various states of T cells. The diversity of TCR sequences is pivotal for recognizing viral antigens or tumor-specific neoantigens presented by the major histocompatibility complex (MHC) on antigen-presenting cells (APCs). While the TCR repertoire is enormous due to the large amount of TCRs and random recombination, identical TCR sequences can indicate T cell clonal expansion. Our single-cell RNA-seq data allowed us to track the lineage of each T cell based on their full-length TCR α and β sequences assembled by the TraCeR method 36. Full-length TCRs with both α and β chains were obtained for 3,315 T cells from 14 patients, among which 2,311 harbored unique TCRs and 1,004 harbored repeatedly used TCRs, implying clonal expansion. Patterns of clonal expansion were detected at varying degrees in different T cell clusters. While only approximately 10% CD8+ cells of CCR7+ clusters (T2 and T4) harbored clonal TCRs (those whose α and β TCR pairs were shared by at least two cells), the percentage reached over 30% in exhausted CD8+ T cell clusters (T1 and T6) (Fig. 3e). For the CD4+ clusters of Treg and exhausted Th, 17.82% to 30.61% of clonal CD4+ T cells were observed (Fig. 3e). Thus, identifying clonal TCRs at single-cell level verified the previously suggested activation and exhaustion status of different T cell clusters in TNBC microenvironment.
A CXCL8-Producing CD8+ T Cell Subset Associated with Survival
CXCL8 production is mostly ascribed to myeloid and epithelial cells, and previous studies have shown that human umbilical cord blood CD4+ naive T cells can express CXCL8, but CXCL8-producing T cells become rare in adults 37. Here we observed T2, a new subset of CD8+CXCL8+ T cells, which was found in 10 of 14 TNBC tumors and accounted for 16.00% of the T cell population (Figure 4a). We also performed immunofluorescence testing to validated the existence of CXCL8+CD8+ T cells, and quantification analysis showed consistent percentage of this subset of cells with RNA-seq result (Figure 4b). The broad existence of CXCL8+ T cells highlighted the possibility that this T cell subset might have active and potentially significant functions in TNBC microenvironment. By analyzing the T cell function-associated gene expression in T2, we found this subset of cells expressed high levels of naïve markers CCR7 and SELL, and activation marker CD69, but low levels of differentiation markers, including CD57 and KLRG1 (Supplementary Fig. 4a). These CXCL8+ T cells expressed low levels of CXCR1 or CXCR2, the receptor for CXCL8, but elevated levels of other chemokines including CCL3, CCL4 and CCL2, when compared with T4, the other CCR7+ T cell cluster (Supplementary Table 6). These chemokines might work corporately with CXCL8 in TNBC microenvironment.
The observation of CD8+CXCL8+T cells was further confirmed using data from other single cell studies of breast cancer, liver cancer and colorectal cancer17, 31, 38, 39. By defining CXCL8 positive using the cut-off of log2TPM>1, we found CD8+CXCL8+ T cell in all these four datasets (Supplementary Fig. 4b). We then explored whether CD8+CXCL8+ T cells revealed in other single cell studies shared common characteristics with ours. We compared the high expressed genes (log2FC>1) between CD8+CXCL8+ T cells observed in Zheng et al.’s study and that of our T2 cluster, which revealed 25 shared genes (Supplementary Fig. 4c). Among the 25 genes, nine were cytokines and chemokines, indicating the secretory function of this subset of T cells.
Interestingly, a recent study revealed that CD4+CXCL8+ T cells might support tumor growth and lymphoid metastasis via CXCL8-mediated neutrophil migration40. Consistently, survival analysis of an independent METABRIC cohort of breast cancer showed that patients with high expression of T2 signature genes and low T4 signature had significantly worse survival compared to those with high T4 and low T2 signature gene expression (P = 0.0263, Fig. 4c). To rule out the influence of CXCL8 produced by cells other than T cells, we further excluded CXCL8 from the signature genes used for the survival analysis, which consistently showed worse survival for patients with high T2 signature (Supplementary fig. 4d). These results suggest that CXCL8+CD8+T cells may associate with poor prognosis in breast cancer, illuminating a potential therapeutic target in TNBC.
To reveal the potential function of CD8+CXCL8+ T cells in breast cancer, we performed DEG analysis between tumors showing high and low T2 signature using data from the METABRIC cohort. This analysis revealed 75 genes with elevated expression (Log2FC≥1) in the T2high group (Supplementary Fig. 4e; Supplementary Table 7). GO biological process enrichment analysis showed these highly expressed genes were generally enriched in the leukocytes (including granulocyte and myeloid leukocyte) chemotaxis and migration pathways (Fig. 4d; Supplementary Table 7). Moreover, this analysis also revealed that MAPK and ERK1/ERK2 cascades were activated in the T2high group (Supplementary Table 7). These results indicate that CD8+CXCL8+ T cells, a subset of chemokine-producing T cells, might promote cancer progression through mediating leukocytes migration to tumor site and activating the MAPK/ERK pathways.
Characterization of TAM subpopulations in TNBC
For the categorization of the macrophages in TNBC, we first applied the classic macrophage polarization model. Based on the activation state, macrophages can be recognized as anti-tumoral classically activated (M1) cells and pro-tumoral alternatively activated (M2) cells 41. We observed highly positive correlation of M1 and M2 signature gene expressions in all the macrophages we identified (Pearson correlation test, R = 0.624, P < 2.20×10-16, Fig. 5a). Although this M1/M2 co-expressed state in single cell level raised challenge to the classic model of macrophage polarization showing exclusive discrete M1 and M2 activation state, along with recent report17, 42, we were able to classify the macrophages into six clusters using Seruat based on their transcriptome features. Totally, we identified six macrophage clusters, each one of which had its uniquely expressed genes (Fig. 5b, Supplementary Table 8).
Identification and Confirmation of TCR+ Macrophages in TNBC tumors
Interestingly, we found a subset of macrophages specifically expressed T cell marker CD3 (Fig. 5c and Supplementary Fig. 5a), as well as the constant region of TCR α/β chains (Fig. 5c), which suggested that these are macrophages bearing α/β TCR (or TCR+ macrophages, the same hereinafter). Only paired productive TCRs can exert antigen recognition function. Thus, we explored whether these macrophages expressed paired productive TCRs by reconstructing the TCRs in these macrophages through the scRNA-seq data. Totally, we identified 382 TCR+ macrophages with paired productive α and β chains in 13 of the 14 patients, making up 14.28% of the total amount of macrophages, with the proportion ranged from 0.25% to 33.52% across patients (Supplementary Fig. 5b). The percentages of TCR+ macrophages in MΦ1 to MΦ6 cluster ranged from 1.97% to 19.67% (Supplementary Fig. 5c). We considered these cells as macrophages rather than T cells for they clustered with other macrophages. In these cells, the expression levels of CD3 and TCR genes were between those for T cells and macrophages, while the macrophage markers had a similar expression level as in macrophages (Fig. 5d). In addition, through reference component analysis, we compared the transcriptome signature of TCR+ macrophages with that of the reference bulk transcriptomes summarized by Li et al.25. We found that these cells were more similar to monocytes and macrophages, and were significantly different from T cells, B cells and NK cells (Supplementary Fig. 5e).
Next, we sought to validate the existence of TCR+ macrophages. To verify the accuracy of TCR assembly and the expression of intact productive TCR α/β chains, we conducted Sanger sequencing validation in 10 selected single cells (including 5 macrophages and 5 T cells). The whole V region and 50 bp of the 5’ C region of α and β chains were successfully amplified in 9 and 7 cells, respectively by 5’race PCR in corresponding cDNA libraries. The Sanger results of all these amplicons were compared to the assembled TCR α/β chains of the corresponding cells as well as the VJ references from IMGT database. In V/J region, all amplified sequences reached more than 99.5% identity with IMGT references. And in the length of the amplicon, all assembled sequences reached more than 97% identity with corresponding sanger sequences (Supplementary Table 9).
Furthermore, we performed immunofluorescence assay on the tumor specimens of our study cohort to confirm the existence of these macrophage populations in TNBC tumor microenvironment. We observed the co-localization of CD68 and CD3/TCRα as well as CD14 and TCRα on the cell membrane in tumor infiltrated immune cells (Fig. 5e and Supplementary Fig. 5d, and Fig. 6), suggesting the presence of TCR+ macrophages. Additionally, we extended our validation by performing FACS analysis using tumor infiltrated immune cells isolated from fresh breast tumors of additional patients independent of our cohort. We found 26.58% and 25.96% of the tumor infiltrated CD68+ cells co-expressed CD3 and TCR, respectively (Fig. 5f). Taken together, these results reconfirmed that there is a subset of TCR+ macrophages in TNBC microenvironment.
Possible TCR signaling pathways activation in TCR+ Macrophages Revealed by Uniquely Expressed Gene Analysis
We further investigated the differentially expressed genes of TCR+ macrophages compared with TCR- macrophages. Notably, significantly up-regulated genes included LCK, LAT, FYN, and ICOS, which were necessary for TCR signaling in T lymphocytes, and GZMA, and GZMB, which were involved in the cytotoxic effect (Fig. 6a, b and Supplementary Table 10). Gene set enrichment analysis showed that the up-regulated genes in the TCR+ macrophages (log2FC ≥1) were enriched in TCR signaling pathway, MAPK pathway, JAK-STAT signaling pathway, and nature killer cell mediate cytotoxicity pathway (Fig. 6c). These findings illuminated that there might exist activated TCR signaling in these TCR+ macrophages. To verify the hypothesis, we examined the phosphorylation state of ZAP-70, a main switch controlling the activation of TCR signaling pathways 43. We showed that phosphorylated ZAP-70 in TCR+ macrophages were observed in tumor specimens from multiple TNBC patients (Fig. 6d). We believe these findings raised a very interesting topic that worth future exploration—whether these TCR+ macrophages can exhibit functions related to TCR signaling transduction.