Standardization of NK cell activation
The activation of NK cells was established, in principle, by the evaluation of cell lines NK-92MI and MDA-MB-231. Activation was assessed after 24 and 48 hours of cell incubation in a proportion of 3 NK cells for each MDA-MB-231 by evaluating the markers (CD25, CD69, and CD314) expressed from cell-cell contact, using flow cytometry (Sony SH800S).
Culturing MDA-MB-231 and NK-92MI
MDA-MB-231 cells (triple-negative breast carcinoma, ATCC® HTB-26 ™) were cultured using the DMEM (high modified) culture medium supplemented with 10% inactivated fetal bovine serum and 1% penicillin/streptomycin, replicating the culture every three days, keeping the culture at a confluence of 40% at the time of passage, at 37°C in a humid atmosphere at 5% CO2. The adherent cells were dissociated with the TrypLE reagent (Gibco) and resuspended in the complete culture medium.
NK-92MI cells (human NK cells genetically modified to produce interleukin 2, ATCC® CRL-2408) were cultured using the AMEM (Alpha Minimum Essential medium) culture medium, plus 0.2 mM inositol, 0.1 mM mercaptoethanol, 0.02 mM folic acid, 12.5% fetal bovine serum and 12.5% horse serum. The cells were homogenized to separate the clusters of NKs prior to the replication of the culture into a new 25 cm² flask (1mL of cells cultured previously + 9mL new culture medium). Both cell lines were cultured separately for the subsequent cell-cell interaction experiment using the Polaris system.
Selection and incubation using the Polaris system (Fluidigm)
The first step is to prime the IFC with beads that will allow the adhesion of the cells to be incubated in the microchambers. After treatment, reagents and cells already labeled using specific markers to differentiate cell types (for NK cells, celltracker far red, and celltracker orange for cancer cells) and viability (calcein AM) were pipetted into the IFC. The selection of the cells to be incubated was performed on the Polaris system. It was configured for NKs and cancer cells to select positive cells for celltracker far red and calcein AM.
Some IFC wells were maintained with only one cell for later comparison of the gene expression of single cells with the gene expression obtained when the cell was incubated in doublets (NK cells + breast cancer cell lineage). After selection, the equipment was programmed to keep the cells in incubation for 16 hours, performing the replacement of culture medium (20% DMEM + 80% AMEM) every 5 hours, and time-lapse images every hour. Images were captured immediately before and after culture medium change.
71 single cancer cells, 77 single NK cells, and 132 cells in doublets (NK + cancer cells) were incubated, followed by the cell lysis, reverse transcription, and cDNA amplification. Subsequently, sequencing of the single-cell RNA, using NextSeq (Illumina) was performed.
Distance measurement between cells
The distance shown in our study is the shortest distance between the membrane of MDA-MB-231 cell and NK cell on IFC within the doublets (NK + cancer cells) group. The distance has been measured for 13 time points. To analyze the videos frame by frame to yield the distance data, we used ImageJ, a public domain Java-based image processing software developed at the National Institutes of Health60.
Data preprocessing
In total, we obtained expression profiles of 340 cells. The cells can be classified as follows. 1. Single NK cells (n = 77); 2. single tumor cells (n = 71); CIDs throughout all time points (n =132), and CIDs that were left with NK cells alone at the terminal time point (n=10). Out of the 340 cells, we removed 50 expression profiles. The cells discarded include — 1. 2 bulk RNA-Seq replicates of each of the NK-92MI and 2 MDA-MB-231 cell-lines; 4 empty chambers (no cells could be traced in the chambers from the start to the end); 8 empty chambers that initially contained tumor cells; 10 empty chambers that initially contained NK cells; 2 CIDs that started with single NK cells; 22 tumor cells that started as CIDs. After removal of these transcriptomes, we were left with 290 cells/doublets. Further, we discarded cells having less than 2000 expressed genes (non-zero RNA-seq by Expectation Maximization (RSEM) expression value61). Next, we retained genes having RSEM expression >5 in at least 10 cells. After these filtering steps, we were left with an expression matrix constituting 290 cells and protein-coding 8907 genes.
Batch correction, clustering, and visualization of the single cells
The matrix obtained after performing the basic pre-processing steps was used as input for the Seurat, single cell analysis R package, as well as for other downstream analyses. We used Seurat’s data integration workflow to process the data with the genes detected in at least 5 cells, further employing the standard routines for log-normalization, variance stabilizing transformation for identification of highly variable genes using NormalizeData() and FindVariableFeatures() with default parameter settings. The cells used in this study originated from two independent runs. For integration and batch correction, we used the FinIntegrationAnchors() with k.filter=100 for identification of anchor cells that represent matching cell pairs across the two datasets in order to project the transcriptomes into a shared space and IntegrateData() function was used for integrating these anchors, which involve Canonical Correlation Analysis (CCA). These steps provided the batch corrected matrix. 2-D map of the cells was created using the RunUMAP() function.
Differential gene expression analysis
Limma-voom62 was used to identify differentially expressed genes (DEGs) among the cell-groups. Top DEGs that qualified adjusted P-value cutoff of 0.05 and absolute log2 fold change cutoff of 1 were further analyzed for their biological significance.
Survival analysis based on the upregulated genes governing NK cell antitumor activity
PROGgeneV236 was used for gene signature based (based on a gene list) overall survival analysis combined gene signature analysis functionality using the TCGA-BRCA dataset, containing survival information for 594 patients. Out of 187 upregulated genes in NK cells showing antitumor activity, 164 genes were found overlapping with the TCGA-BRCA dataset. PROGgeneV2 produced a Kaplan-Meir plot, segregating the patients into high and low risk groups using the median value of the combined gene expression signature as cut-off.
Regulation of cell-cell distance and transcriptional memory
We considered two matrices to track the association between temporally recorded cell-cell distance and terminally profiled gene expression. First, the processed gene expression matrix of dimensions |G| ×|C|, where |G|, and |C| denote the number of genes and the number of CIDs, respectively. Second, the cell-cell distance matrix of dimensions |C| ×|T|, where |T|denotes the time points when the cell-cell distance measurements were recorded. We used these two matrices to compute the third matrix of dimensions |G| ×|T| containing the Pearson’s correlation coefficients 〖ρ〗_ g,tfor each gene-time point pairs {(g,t) | t∈T,g∈G}. Notably, |G|=2000,|C|=102,and |T|=13. We retained 90 genes with 〖|ρ〗_ g,t|≥0.25for at least one time point for further analysis. These 90 genes were subjected to hierarchical clustering based on the computed correlation values. 4 gene modules were retrieved using cutree(). For each of these modules, motif enrichment and transcription factor activity analyses were performed using RcisTarget35 and ShinyGO34, respectively. RcisTarget discerned enriched TF binding motifs and reported a module-specific list of TFs, using the hg19-tss-centered-10kb-7species.mc9nr.feather database containing genome-wide ranking for the motifs. Gene regulatory networks were constructed using igraph63.
Cell-cell signaling among CIDs
We used iTALK R package featuring information on 2,648 cancer-specific ligand-receptor interactions. We applied the rawParse() function to select the top 50% highly expressed genes from our processed gene expression matrix (constituting 290 cells and 8907 genes), using mean as a method of choice. This identified 230 ligand-receptor pairs using the FindLR() function. We computed Pearson’s correlation coefficient between expression values associated with the selected ligand-receptor pairs across CIDs (TU-NK/TU-NK, TU-NK/NK) and qualified ones that exhibited Pearson's correlation coefficient > 0.4. To rule out the possibility that the co-expression stems solely from either NK or Tumor cells alone, we tracked Pearson's correlation coefficient among NK/NK and TU/TU cases as well. With these criteria, 20 ligand-receptor pairs were selected, and three of these were found directly involved in breast cancer signaling.