Genome-wide mutation analysis
Mutation data for 433 samples were obtained from the TCGA database and visualized with the ‘maftools’ package. Among the variant classifications, missense mutation was the most common category (Fig 1A). Among the variant types, single nucleotide polymorphisms were far more frequent than deletions and insertions (Fig 1B), and C > T was much more frequent than the other single nucleotide variants (Fig 1C). The number of mutated bases in each sample was also calculated (Fig 1D), and a box plot summarized the variant classifications (Fig 1E). The histogram showed the top ten mutated genes, ranked by mutation frequency from high to low: TTN (48%), TP53 (44%), MUC16 (31%), ARID1A (25%), LRP1B (24%), SYNE1 (22%), FLG (19%), FAT4 (19%), CSMD3 (18%), and PCLO (17%) (Fig 1F). Moreover, the waterfall plot showed the detailed variant categories of the mutated genes, distinguished by color, in 89.38% of samples (Fig 1G). The cloud map (Fig 1H) displayed the mutated genes in various colors, with a larger font indicating a higher variant frequency. The relationships between mutated genes are shown in Fig 1I, where brown indicates that variants in two genes are mutually exclusive and green indicates that they co-occur.
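The per-gene percentages reported for Fig 1F correspond to the share of samples carrying at least one variant in a gene. A minimal Python sketch of that tally is given here for illustration only; the analysis itself used ‘maftools’, and the function name, record layout and toy data are assumptions rather than part of the original pipeline.

```python
from collections import defaultdict


def gene_mutation_frequency(records, n_samples):
    """Return (gene, percent of samples mutated), sorted from high to low.

    records: iterable of (sample_id, gene) pairs, one pair per variant call.
    A sample with several variants in the same gene is counted once.
    """
    samples_by_gene = defaultdict(set)
    for sample_id, gene in records:
        samples_by_gene[gene].add(sample_id)
    freqs = [
        (gene, 100.0 * len(samples) / n_samples)
        for gene, samples in samples_by_gene.items()
    ]
    # Sort by descending frequency, then by gene name to break ties.
    freqs.sort(key=lambda item: (-item[1], item[0]))
    return freqs


# Toy variant calls (hypothetical, not TCGA data).
toy_records = [
    ("S1", "TTN"), ("S1", "TTN"), ("S1", "TP53"),
    ("S2", "TTN"), ("S3", "TP53"), ("S3", "MUC16"),
    ("S4", "TTN"),
]
top = gene_mutation_frequency(toy_records, n_samples=4)
```

In the toy data, sample S1 carries two TTN variants but contributes only once to the TTN frequency.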
Evaluation of TMB in GC
After calculating TMB (the number of mutations per million bases) for each GC sample, the samples were divided into TMBlow and TMBhigh groups based on the median value. Kaplan-Meier analysis showed that the TMBhigh group had better OS (Fig 2A, P = 0.043), better disease-specific survival (DSS) (Fig 2B, P = 0.029), and a longer progression-free interval (PFI) (Fig 2C, P = 0.004). Additionally, TMB was closely related to the patients’ clinicopathologic variables (age, sex, TNM stage, T stage and N stage): TMB was higher in elderly and female patients, and higher TMB was associated with earlier TNM stage, T stage and N stage (Fig 2D).
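The TMB calculation and median split can be sketched as follows. This is an illustrative sketch only: the 38 Mb exome size is an assumed denominator that the text does not state, the rule that samples equal to the median fall into the low group is also an assumption, and the toy counts are hypothetical.

```python
from statistics import median

# Assumption: an exome size of 38 Mb; the source does not state the denominator.
EXOME_SIZE_MB = 38.0


def tmb_per_mb(mutation_counts, exome_size_mb=EXOME_SIZE_MB):
    """Convert per-sample mutation counts into mutations per megabase."""
    return {sample: n / exome_size_mb for sample, n in mutation_counts.items()}


def split_by_median(tmb):
    """Label each sample 'high' or 'low' relative to the cohort median.

    Assumption: samples exactly at the median are assigned to 'low'.
    """
    cutoff = median(tmb.values())
    groups = {s: ("high" if v > cutoff else "low") for s, v in tmb.items()}
    return cutoff, groups


# Toy mutation counts (hypothetical).
toy_counts = {"S1": 38, "S2": 76, "S3": 190, "S4": 380}
tmb = tmb_per_mb(toy_counts)
cutoff, groups = split_by_median(tmb)
```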
Joint analysis of TMB and MSI
We obtained the MSI of each sample in the TCGA cohort from previous research. Correlation analysis showed that TMB was positively correlated with MSI (Fig 3A), and MSI differed significantly between the two TMB groups, being higher in the TMBhigh group than in the TMBlow group (Fig 3B). Additionally, analysis of MSI against the clinicopathologic variables (Fig 3C) showed that MSI was closely related to age, sex and N stage, with higher MSI in elderly patients, female patients and patients with earlier N stage.
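The text does not specify which correlation coefficient was used for TMB and MSI. As one possible reading, the sketch below computes a Spearman rank correlation in plain Python; the choice of Spearman, the function names and the toy values are all assumptions made for illustration.

```python
from math import sqrt


def average_ranks(values):
    """Return 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        rank = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors."""
    return pearson(average_ranks(x), average_ranks(y))


# Toy TMB and MSI values (hypothetical).
toy_tmb = [1.0, 2.0, 5.0, 10.0]
toy_msi = [0.1, 0.3, 0.2, 0.9]
rho = spearman(toy_tmb, toy_msi)
```

A positive coefficient, as in the toy example, corresponds to the direction of association reported in Fig 3A.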
Comprehensive analysis of TMB and TME
After calculating the immune score, stromal score and tumor purity of the TME, correlation analysis demonstrated that TMB was negatively associated with the immune/stromal scores and positively correlated with tumor purity (Fig 4A). Consistently, the t-test showed that the TMBlow group had higher immune/stromal scores and lower tumor purity (Fig 4B). Additionally, the chi-square test was used to explore the relationship between the TME and clinicopathologic variables (Fig 4C-4E). Low-grade patients had lower immune/stromal scores and higher tumor purity than high-grade patients; patients with earlier T stage had lower stromal scores and higher tumor purity; and patients with earlier TNM stage had higher tumor purity.
Integrated analysis of TMB and immune cell infiltration
The content of 22 immune cell types in each sample was calculated with the CIBERSORT algorithm, with different colors indicating cell types (Fig 5A), and a correlation analysis of TMB and immune cells was then performed (Fig 5B). TMB was negatively correlated with naive B cells, resting mast cells, monocytes and regulatory T cells, and positively associated with activated memory CD4 T cells, follicular helper T cells, M0 macrophages and M1 macrophages. In addition, the difference analysis (Fig 6A) demonstrated that the TMBlow group had more resting memory CD4 T cells, regulatory T cells, monocytes, resting dendritic cells, resting mast cells and activated NK cells, while the TMBhigh group had more activated memory CD4 T cells, follicular helper T cells, activated mast cells, neutrophils, M0 macrophages, M1 macrophages and resting NK cells. Furthermore, Kaplan-Meier curves (Fig 6B) revealed that infiltration of activated dendritic cells and M2 macrophages predicted worse OS, while infiltration of activated NK cells and CD8 T cells predicted better OS. Infiltration of activated memory CD4 T cells and follicular helper T cells also showed a trend toward better OS, but without statistical significance, which may be due to the insufficient sample size.
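The Kaplan-Meier curves used for these survival comparisons rest on the product-limit estimator, which can be sketched as follows. This is a minimal illustration with hypothetical follow-up data; it omits confidence intervals and the log-rank test, and it is not the software used in the analysis.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of the survival function.

    times:  follow-up time for each patient.
    events: 1 if death was observed at that time, 0 if censored.
    Returns a list of (time, survival probability) at each event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths > 0:
            survival *= 1 - deaths / n_at_risk
            curve.append((t, survival))
        # Deaths and censored patients both leave the risk set.
        n_at_risk -= removed
    return curve


# Toy follow-up data (hypothetical): the patient at time 2 is censored.
toy_curve = kaplan_meier(times=[1, 2, 3, 4], events=[1, 0, 1, 1])
```

The censored patient lowers the number at risk without producing a step in the curve, which is why the drop at time 3 is computed over two patients rather than three.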
Immune checkpoint expression across two TMB groups
Thirty-eight confirmed immune checkpoints were collected from previous studies (Fig 7). Differential expression analysis revealed significant differences in the expression of CD28, CD40, CD40LG, IFNG, IL12B, JAK1, LDHA, PD-L1, PTPRC, TNFRSF4, TNFSF18, TNFSF9 and VTCN1 between the two TMB groups: CD28, CD40, CD40LG, IL12B, JAK1, PTPRC, TNFRSF4, TNFSF18 and VTCN1 were highly expressed in the TMBlow group, while IFNG, LDHA, PD-L1 and TNFSF9 were up-regulated in the TMBhigh group. These differences may help guide immunotherapy separately for the two groups of patients.
Functional analysis of DEGs
A total of 816 DEGs were identified, and the heatmap showed the top 40 DEGs with significantly different expression between the two TMB groups, most of which were significantly up-regulated in the TMBlow group (Fig 8A). According to GO annotation, the DEGs were mainly enriched in regulation of membrane potential, calcium ion homeostasis and regulation of blood circulation for biological process (BP); in collagen-containing extracellular matrix, contractile fiber and myofibril for cellular component (CC); and in receptor ligand activity, actin binding and glycosaminoglycan binding for molecular function (MF) (Fig 8B). Moreover, KEGG enrichment analysis revealed that the DEGs were closely related to the cAMP signaling pathway, calcium signaling pathway and ECM-receptor interaction, among others (Fig 8C). Additionally, GSEA was performed with TMB as the phenotype: the active pathways in the TMBlow group included the calcium signaling pathway, cell adhesion molecules (CAMs) and ECM-receptor interaction (Fig 8D), while the cell cycle, DNA replication and P53 signaling pathways were activated in the TMBhigh group (Fig 8E).
Construction, evaluation and validation of TMB-based signature and nomogram
After univariate analysis of the DEGs in the training set (TCGA cohort), 261 prognostic genes were obtained and intersected with the genes of the validation set (GSE84437) to construct a TMB-based prognostic scoring signature via Lasso regression analysis. Finally, a three-gene signature was identified for OS in GC, and the RS was calculated from the expression levels of the three genes and their corresponding coefficients: RS = (0.0816923878082827 × expression of SCG5) + (0.0398504629411058 × expression of SERPINA5) + (0.00259059404152605 × expression of SPARCL1). In both the training set (Fig 9A, P = 9.508e-03) and the validation set (Fig 9B, P = 4.273e-03), the Kaplan-Meier curves revealed that OS in the low-risk group was better than that in the high-risk group.
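The RS formula can be applied per patient as in the following sketch. The coefficients are those reported for the three-gene signature; the median cutoff used to define the high- and low-risk groups is an assumption, since the text does not state the threshold, and the patient identifiers and expression values are hypothetical.

```python
from statistics import median

# Coefficients of the three-gene signature as reported in the text.
COEFFICIENTS = {
    "SCG5": 0.0816923878082827,
    "SERPINA5": 0.0398504629411058,
    "SPARCL1": 0.00259059404152605,
}


def risk_score(expression):
    """RS = sum over the three genes of coefficient * expression level."""
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


def stratify(scores):
    """Assign 'high' or 'low' risk relative to the cohort median.

    Assumption: the median is the cutoff; the source does not state it.
    """
    cutoff = median(scores.values())
    return {p: ("high" if s > cutoff else "low") for p, s in scores.items()}


# Toy expression profiles (hypothetical values).
toy_expression = {
    "P1": {"SCG5": 0.0, "SERPINA5": 0.0, "SPARCL1": 0.0},
    "P2": {"SCG5": 1.0, "SERPINA5": 1.0, "SPARCL1": 1.0},
    "P3": {"SCG5": 10.0, "SERPINA5": 5.0, "SPARCL1": 100.0},
}
toy_scores = {p: risk_score(e) for p, e in toy_expression.items()}
toy_groups = stratify(toy_scores)
```

Because all three coefficients are positive, higher expression of any of the three genes raises the RS.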
In the training set, univariate analysis showed that age, TNM stage, N stage and RS were closely related to OS (Fig 9C). After adjustment for these variables in multivariate analysis, age and RS remained independent prognostic indicators of OS (Fig 9D). Subsequently, three clinical attributes (age, TNM stage and N stage) and the RS were incorporated into a nomogram for predicting the 3- and 5-year OS of GC patients (Fig 9E). The areas under the ROC curves for predicting 3- and 5-year OS were both 0.705 (Fig 9F). Additionally, the calibration curves showed that the nomogram predictions of 3-year and 5-year OS did not deviate significantly from the actual reference line (Fig 9G). Moreover, when the same variables were used to construct a nomogram in the validation set, the ROC curves also indicated high predictive efficiency (Fig 9H), and the calibration curves showed no deviation (Fig 9I).