3.1 The landscape of TIIC composition in GC
CIBERSORT coupled with the LM22 signature matrix allows highly sensitive and specific discrimination of human TIIC composition and has been applied in previous studies[18-20]. We first ran CIBERSORT to systematically describe the composition of the GC immune microenvironment. As shown in Figure 2A-2C and Figure S2, the TIIC composition of GC tissue differed substantially from that of normal gastric tissue. In particular, plasma cells and T cells CD4 memory resting were the most frequent populations in normal tissue, whereas macrophages M0 and macrophages M1 were the most frequent in GC tissue (Figure 2B). To evaluate the relationship between the CIBERSORT P-value and the barcode genes, we then recalculated the CIBERSORT results after randomly removing barcode genes in increments of 10%[20]. The P-value was sensitive to the removal of barcode genes, which helps to limit the selection bias of barcode genes (Figure 2D and Figure S3).
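The barcode-gene removal step can be illustrated with a minimal Python sketch. It shows only the random subsampling in 10% increments; running CIBERSORT on each reduced signature is not shown. The function name, the seed handling and the placeholder gene names are assumptions made for illustration and are not taken from the authors' pipeline.

```python
import random


def subsample_barcode_genes(genes, percent_removed, seed=0):
    """Return the barcode genes that remain after randomly removing a percentage.

    The original gene order is preserved. `percent_removed` is an integer
    percentage between 0 and 99.
    """
    if not 0 <= percent_removed < 100:
        raise ValueError("percent_removed must be between 0 and 99")
    rng = random.Random(seed)  # fixed seed so a run can be reproduced
    n_remove = len(genes) * percent_removed // 100
    removed = set(rng.sample(range(len(genes)), n_remove))
    return [g for i, g in enumerate(genes) if i not in removed]


# LM22 contains 547 barcode genes; the names below are placeholders.
barcode_genes = [f"GENE{i}" for i in range(547)]

for percent in range(0, 100, 10):  # 0%, 10%, ..., 90% removed
    kept = subsample_barcode_genes(barcode_genes, percent, seed=percent)
    # In a real analysis, CIBERSORT would be rerun here with `kept`
    # as the reduced signature and the resulting P-values recorded.
    print(percent, len(kept))
```

Using a separate seed per increment keeps each removal independent; repeating every increment with several seeds would give a distribution of P-values rather than a single value.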
3.2 Meta-analysis of the proportions of TIICs
To confirm the above results, we repeated the analysis in independent GC datasets that contained both adjacent normal and GC specimens. Gene expression in the GEO datasets was measured on various platforms, as detailed in Table 1 (except GSE84426 and GSE84437). In total, 1542 GC and 416 normal cases were included in the subsequent analyses. We pooled the datasets from different platforms and removed batch effects. Bar charts summarized the TIIC subpopulations and CIBERSORT P-values by study (Figure S4). Although these expression profiles were obtained from different specimen sources and platforms, the abundance of TIIC subpopulations showed no evident cohort bias. Furthermore, the relative proportions of the 22 TIIC subpopulations were compared between the two independent data sources (TCGA and GEO), and their distributions were highly consistent (Figure 2E-2F, P < 0.001).
Meta-analysis is a standard method that summarizes the results of many studies more efficiently and effectively than subjective judgment. We therefore conducted a meta-analysis for each significant TIIC subpopulation. Notably, the proportions of plasma cells (SMD = -1.18; 95% CI, -1.32 to -1.04; p < 0.0001) and mast cells resting (SMD = -0.71; 95% CI, -1.68 to 0.27; p = 0.16) showed a decreasing trend in GC cases; the proportions of T cells CD4 memory activated (SMD = 1.30; 95% CI, 0.63 to 1.97; p < 0.0001), T cells follicular helper (SMD = 0.31; 95% CI, -0.19 to 0.82; p = 0.23), T cells regulatory (SMD = 0.53; 95% CI, 0.21 to 0.84; p = 0.001), macrophages M1 (SMD = 2.69; 95% CI, 1.62 to 3.76; p < 0.0001), macrophages M0 (SMD = 6.42; 95% CI, 4.23 to 8.62; p < 0.0001) and mast cells activated (SMD = 0.98; 95% CI, 0.46 to 1.50; p = 0.0002) showed an increasing trend in GC cases. In brief, these meta-analysis results, together with prior studies, indicated that our CIBERSORT results could reliably discriminate TIIC subpopulations in GC patients.
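For readers unfamiliar with how a standardized mean difference (SMD) and its pooled confidence interval are obtained, the following Python sketch may help. It assumes Cohen's d as the SMD (GC minus normal, so a negative value means a lower proportion in GC) and DerSimonian-Laird random-effects pooling. The text does not state which estimator or pooling model was used, so both choices, all function names and the input values are illustrative assumptions.

```python
import math


def smd_and_variance(mean_gc, sd_gc, n_gc, mean_normal, sd_normal, n_normal):
    """Cohen's d (GC minus normal) and its approximate sampling variance."""
    pooled_var = ((n_gc - 1) * sd_gc ** 2 + (n_normal - 1) * sd_normal ** 2) / (
        n_gc + n_normal - 2
    )
    d = (mean_gc - mean_normal) / math.sqrt(pooled_var)
    var = (n_gc + n_normal) / (n_gc * n_normal) + d ** 2 / (2 * (n_gc + n_normal))
    return d, var


def pool_random_effects(effects, variances):
    """DerSimonian-Laird pooled effect and its 95% confidence interval."""
    w = [1 / v for v in variances]
    fixed = sum(wi * e for wi, e in zip(w, effects)) / sum(w)
    # Cochran's Q measures heterogeneity between studies.
    q = sum(wi * (e - fixed) ** 2 for wi, e in zip(w, effects))
    c = sum(w) - sum(wi ** 2 for wi in w) / sum(w)
    tau2 = max(0.0, (q - (len(effects) - 1)) / c) if c > 0 else 0.0
    w_star = [1 / (v + tau2) for v in variances]
    pooled = sum(wi * e for wi, e in zip(w_star, effects)) / sum(w_star)
    se = math.sqrt(1 / sum(w_star))
    return pooled, pooled - 1.96 * se, pooled + 1.96 * se


# Hypothetical per-study summaries of one cell-type proportion:
# (mean_gc, sd_gc, n_gc, mean_normal, sd_normal, n_normal)
studies = [
    (0.05, 0.04, 100, 0.10, 0.05, 30),
    (0.06, 0.05, 200, 0.12, 0.06, 50),
    (0.04, 0.03, 80, 0.09, 0.05, 20),
]
per_study = [smd_and_variance(*s) for s in studies]
pooled, low, high = pool_random_effects(
    [d for d, _ in per_study], [v for _, v in per_study]
)
print(f"SMD = {pooled:.2f}; 95% CI, {low:.2f} to {high:.2f}")
```

A random-effects model is a common choice when studies come from different platforms, because it allows the true effect to vary between cohorts; a fixed-effect model would use the weights `w` directly.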
3.3 CIBERSORT p-Values Reflect the Proportion of TIICs in GC
A greater proportion of TIICs yields a correspondingly smaller CIBERSORT P-value, so the P-value reflects the proportion of TIICs relative to non-immune cells. We therefore examined CIBERSORT P-values against immune cytolytic activity, which Rooney et al. defined as the geometric mean of GZMA and PRF1 expression[15]. As shown in Figure 4A, cytolytic activity was strongly related to the P-value thresholds in both the GEO and TCGA datasets. Cytolytic activity was also moderately correlated with the proportions of different TIIC subpopulations (Figure 4B). Specifically, as shown in Figure 4C, cytolytic activity correlated most strongly with the proportions of T cells CD4 memory activated (Pearson correlation = 0.59) and T cells CD8 (Pearson correlation = 0.52).
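The cytolytic activity score and its correlation with a cell-type proportion can be sketched in a few lines of Python. The geometric mean of GZMA and PRF1 follows the definition cited above; the expression units, any offset or normalization, and the example values are hypothetical and are not taken from the study.

```python
import math


def cytolytic_activity(gzma, prf1):
    """Geometric mean of GZMA and PRF1 expression for one sample."""
    return math.sqrt(gzma * prf1)


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical per-sample expression values and CD8 T-cell proportions.
gzma = [12.0, 30.5, 8.2, 55.1, 20.3]
prf1 = [9.1, 25.0, 7.5, 40.2, 18.8]
cd8_fraction = [0.05, 0.14, 0.03, 0.20, 0.09]

cyt = [cytolytic_activity(g, p) for g, p in zip(gzma, prf1)]
print(f"Pearson correlation = {pearson(cyt, cd8_fraction):.2f}")
```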
Considering the important role of TIIC composition in cancer progression, we then investigated its role in carcinogenesis. Common cancer immune signature processes include MHC class, HLA expression, checkpoint recognition, IFN response, APC recognition, inflammation promotion and parainflammation. We obtained the gene sets of the relevant immune pathways from KEGG[21] and applied single-sample gene set enrichment analysis (ssGSEA) to quantify these immune signatures. Immune cells exerted multiple effects on different immune responses in GC (Figure 4D). We then explored in detail the relationship between TIICs and pro-inflammatory cytokines, inflammatory cytokines and checkpoint recognition. As can be seen from Figure 4D, T cells CD4 memory activated, T cells CD8 and macrophages M1 played significant roles in these immune responses.
3.4 Distribution of TIIC proportions and their clinical characteristics
Immune scores calculated with the ESTIMATE algorithm facilitate the quantification of immune cells. Using ESTIMATE, we calculated an immune score for each GC patient; scores ranged from -2862 to 2826. With the median immune score as the cutoff, we divided the patients into high- and low-immune-score groups. As shown in Figure 5A, the high-immune-score group tended toward poorer survival, although the difference did not reach the conventional 0.05 significance level (P = 0.064, log-rank test). The average immune score of diffuse-subtype GC cases was the highest of all 5 subtypes, followed by the mucinous and signet ring subtypes (Figure 5B). Since Hp infection is a powerful factor in GC development, we analyzed its effect in detail. As shown in Figure 5B, Hp infection was associated with markedly higher immune scores in GC patients. To further explore the relationship between GC progression and immune scores, we plotted the distribution of immune scores by TNM stage. As shown in Figure 5C, the immune score increased with advancing GC stage. In short, the immune score may serve as an emerging biomarker for GC subtype classification.
According to the classification proposed by the Asian Cancer Research Group, MSI, TTP and TP53 are newly recognized biological characteristics for GC classification and treatment. When the mismatch repair system is defective, mismatch mutations accumulate in the genome, especially in repetitive DNA (microsatellite) regions, which leads to microsatellite instability (MSI). We therefore conducted exploratory subgroup analyses of the immune score by molecular subtype defined by TTP, TP53 and MSI. As shown in Figure 5D, TP53 mutation, TTP mutation and MSS status were associated with lower immune scores in GC patients.
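The median split and the log-rank comparison of the high- and low-immune-score groups described in this section can be sketched as follows. This is a minimal standard-library illustration of a two-group log-rank test, not the authors' code; the function names and the toy follow-up data are hypothetical, and samples with a score exactly equal to the median are assigned to the low group here as an arbitrary convention.

```python
import math
import statistics


def split_by_median(scores):
    """Return True for samples whose score is above the median."""
    cutoff = statistics.median(scores)
    return [s > cutoff for s in scores]


def logrank_test(times, events, in_high_group):
    """Two-group log-rank test; returns (chi-square, p-value) with 1 df.

    times: follow-up time per patient
    events: 1 if the event (death) was observed, 0 if censored
    in_high_group: True if the patient is in the high-score group
    """
    observed = expected = variance = 0.0
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        n = sum(1 for ti in times if ti >= t)  # at risk overall
        n_high = sum(1 for ti, g in zip(times, in_high_group) if ti >= t and g)
        d = sum(1 for ti, e in zip(times, events) if ti == t and e)
        d_high = sum(
            1 for ti, e, g in zip(times, events, in_high_group)
            if ti == t and e and g
        )
        observed += d_high
        expected += d * n_high / n
        if n > 1:
            variance += d * (n_high / n) * (1 - n_high / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = (observed - expected) ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return chi2, math.erfc(math.sqrt(chi2 / 2))


# Hypothetical immune scores, follow-up times (days) and event indicators.
scores = [-1500, -800, -200, 100, 650, 900, 1400, 2100]
times = [900, 750, 820, 640, 400, 510, 300, 260]
events = [0, 1, 0, 1, 1, 1, 1, 1]

high = split_by_median(scores)
chi2, p = logrank_test(times, events, high)
print(f"chi-square = {chi2:.2f}, P = {p:.3f}")
```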
To obtain a better overview of the clinical characteristics of immune subpopulations, stratified analyses were performed in the entire TCGA cohort of GC patients grouped by clinical characteristics. Because a smaller P-value represents greater confidence in the sample, we selected cases with a CIBERSORT P ≤ 0.05 for follow-up analysis. As shown in Figure S5A, the proportions of mast cells resting, macrophages M1, T cells regulatory, T cells CD4 memory activated and T cells CD8 increased with advancing tumor grade, whereas those of mast cells activated, macrophages M0 and neutrophils decreased. Macrophages M1, macrophages M2 and eosinophils may also help to distinguish GC T stage (Figure S5B). To further confirm the clinical value of TIICs, we sought validation cohorts among the GEO datasets with clinical information. Only GSE26899 provided both mRNA expression data and clinical information. As shown in Figure S6A-S6B, macrophages M1 and macrophages M2 showed a similar tendency in the validation cohort (the proportion of eosinophils was too small to analyze).
3.5 Immune clusters associated with GC prognosis
Many findings suggest that the proportions of TIIC subgroups partly reflect GC prognosis. We therefore used hierarchical clustering of all samples to identify distinct patterns of immune infiltration. After restricting the calculations to samples with CIBERSORT P ≤ 0.05, 440 patients remained, with a median follow-up of 493.78 days. Detailed results are provided in Table S2. We first performed hierarchical clustering of the samples and selected the optimal number of clusters by combining the elbow method and the gap statistic. The scatter plot identified three well-separated clusters (Figure 6A-6B). Cluster 1 was defined by relatively high levels of T cells CD4 memory resting and low levels of T cells CD4 memory activated, T cells CD8 and macrophages M0. Moreover, the three immune clusters were closely associated with distinct survival patterns (Figure 6C).
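To make the cluster-number selection concrete, the sketch below runs a naive agglomerative clustering and computes the within-cluster sum of squares (WCSS) used by the elbow method. The linkage criterion and distance metric are not stated in the text, so average linkage with Euclidean distance is an assumption; the gap statistic is not implemented, and the two-dimensional toy points stand in for the 22-dimensional TIIC proportion vectors.

```python
import math


def agglomerative(points, k):
    """Naive average-linkage agglomerative clustering down to k clusters.

    Returns a list of clusters, each a list of point indices.
    """
    clusters = [[i] for i in range(len(points))]
    while len(clusters) > k:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average pairwise distance between the two clusters.
                total = sum(
                    math.dist(points[i], points[j])
                    for i in clusters[a]
                    for j in clusters[b]
                )
                avg = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or avg < best[0]:
                    best = (avg, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return clusters


def wcss(points, clusters):
    """Within-cluster sum of squared distances to each cluster centroid."""
    dim = len(points[0])
    total = 0.0
    for members in clusters:
        centroid = [
            sum(points[i][d] for i in members) / len(members) for d in range(dim)
        ]
        total += sum(math.dist(points[i], centroid) ** 2 for i in members)
    return total


# Toy data: three well-separated groups of three points each.
points = [
    (0, 0), (0, 1), (1, 0),
    (10, 10), (10, 11), (11, 10),
    (20, 0), (20, 1), (21, 0),
]

# Elbow method: look for the k after which WCSS stops dropping sharply.
for k in range(1, 6):
    print(k, round(wcss(points, agglomerative(points, k)), 2))
```

With these toy points the WCSS falls steeply up to k = 3 and only slightly afterwards, which is the "elbow" pattern that would suggest three clusters.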
Having confirmed the close association between immune clusters and GC prognosis, we further investigated the association of each TIIC subpopulation with GC overall survival (OS) and progression-free survival (PFS) by Cox regression analysis. As shown in Figure 6D, T cells follicular helper (OS: HR=1.31, 95% CI=1.25-1.48, P=0.13; PFS: HR=1.47, 95% CI=1.05-2.18, P=0.048) and T cells CD4 memory activated (OS: HR=1.55, 95% CI=1.11-2.17, P=0.0093; PFS: HR=1.73, 95% CI=1.16-2.58, P=0.00097) were significantly associated with poor overall survival and PFS, whereas B cells naïve (PFS: HR=0.57, 95% CI=0.36-0.88, P=0.011) was correlated with improved PFS. In line with the hierarchical clustering results, TIIC subpopulations, especially the macrophage subgroups, T cells follicular helper and T cells CD8, made great contributions to tumorigenesis.
3.6 Prognostic Subsets of TIICs in GC molecular subtypes
Epstein-Barr virus (EBV) infection and the microsatellite instability subtype (accounting for 9% and 22% of cancers in the TCGA cohort, respectively[22]) have proven to be of great immunological significance for GC clinical therapy. EBV-positive cancer is characterized by frequent PIK3CA mutations[22]; MSI tumors are characterized by hypermutation, chromosomal instability and TP53 mutation[23]. SNPs have been shown to be closely associated with GC risk and prognosis. As shown in Figure S7, TTN and TP53 were the most characteristic SNP genes in GC. Based on these findings, we conducted exploratory PIK3CA, TTN and TP53 subgroup analyses of the prognostic effect of the 22 TIIC subgroups. As shown in Figure S8A, the proportions of T cells CD8, T cells CD4 memory resting, T cells CD4 memory activated, T cells follicular helper, T cells regulatory, T cells gamma delta, NK cells activated, macrophages M0, macrophages M1 and neutrophils differed significantly between PIK3CA-mutant and wild-type patients (P ≤ 0.05, Table S3). The distribution of TIICs also differed significantly between wild-type and mutant GC patients for TP53 and TTN (Figure S7B-7C and Table S4-S5).
We next carried out exploratory subgroup analyses of the prognostic effect of the 22 TIIC subsets by molecular subgroup defined by PIK3CA, TP53 and TTP based on the TCGA classifier. As shown in Figure 7, T cells follicular helper showed a strong association with poor outcome in the TTP mutant (HR=1.75, 95% CI 1.04–2.94) and TP53 wild-type (HR=1.57, 95% CI 1.01–2.34) subgroups. In contrast, dendritic cells activated and T cells regulatory were associated with poorer outcome in the TTP wild-type and TP53 mutant subgroups, respectively. Most strikingly, monocytes and polarized macrophages had the greatest effect in SNP-related gastric cancer patients, with the largest effect observed for monocytes in TP53-mutant GC (HR=0.51, 95% CI 0.3–0.86). Similarly, B cells naïve were associated with better outcome in the TP53 wild-type subgroup (HR=0.65, 95% CI 0.42–0.96). Collectively, these results indicate that the nature of the immune system varies considerably across GC, is partly determined by key molecular characteristics of the primary cancer, and significantly affects clinical outcome.