3.1 Identification of common DEGs
The flow chart of the study was generalized in Fig. 1. After performing difference analysis with dataset collection (GSE116959, GSE21933, GSE31210) in GEO database, a total of 356 LUAD samples (304 tumor tissues and 52 normal tissues) in TCGA database were explored. There were 1868 DEGs filtered from the GSE116959 data set, including 624 upregulated and 1244 downregulated genes; 2570 DEGs screened from the GSE21933 data set, including 1193 upregulated and 1377 downregulated genes; 7220 DEGs selected from the GSE31210 data set, including 4107 upregulated and 3013 downregulated genes. Making the results more intuitive, we visualized the results. We displayed the DEGs among each data set via volcano plots (Fig. 2A–2C). What’s more, cluster analysis of DEGs showed two evidently different distribution patterns between the tumor and normal samples, suggesting crucial roles of DEGs in tumorigenesis and progression of LUAD (Fig. 2D–2F). Through Venn diagram analysis, 670 common DEGs in the intersection of the three data sets were identified and selected for further analysis.
3.2 The Selection of Signature Gene
In order to search for the signature gene, we performed gene set enrichment analysis on GSE21933, GSE31210 and GSE116959. Afterward, with GSEA, we found there were nine sets of results related to immunity closely and all of them were highly expressed gene sets. In particular, GSE21933 was associated with macrophage, which are one of our focuses. The information on GSEA results was listed in Fig. 3A. Next, we selected up-regulated genes for further analysis of the differences between genes and enrichment.
We conducted GO analysis and KEGG pathway enrichment analysis on DEGs from differential analysis results to explore their potential biological functions and pathways associated with LUAD. The results of GO analysis in Fig. 3C showed that DEGs were significantly related to mitotic nuclear division, cell-substrate adhesion, organelle fission, mitotic sister chromatid segregation, nuclear division, regulation of cell-substrate adhesion, regulation of mitotic nuclear division, microtubule cytoskeleton organization involved in mitosis, chromosome segregation, regulation of chromosome segregation, sister chromatid segregation, mitotic spindle organization, extracellular structure organization, urogenital system development, regulation of nuclear division, DNA-dependent DNA replication, cell junction assembly, which were essential for the rapid growth of tumors. Additionally, shown in Fig. 3B, there were 5 hub genes (SHMT2, PYCR1, PSAT1, PC, LDHA) of common DEGs selected from Cytoscape software and, attentionally, the gene SHMT2 was at a specific central position in Fig. 3B, which suggested SHMT2 played an important role in regulating cellular behavior. In a function model of TISIDB, we verified the SHMT2 was involved in Glycine, serine and threonine metabolism, metabolic pathways, carbon metabolism, biosynthesis of amino acids, providing the crucial basis for protein and nucleic acid production for cancer growth and metastasis. Thus, we believe that SHMT2 plays an important role in regulating LUAD growth. Immediately following, we carried out further analysis of SHMT2 in other aspects.
3.3 SHMT2 High Expression Level in tumors
The expression level of SHMT2 in tumor and corresponding normal tissues in cancer was verified on Oncomine database. As shown in Fig. 4D, SHMT2 displayed a higher expression level in Bladder cancer, Breast cancer, Colorectal cancer, Kidney cancer, Lung cancer and Lymphoma, while the lower expression level was in Liver cancer and Pancreatic cancer. We also analyzed the mRNA-seq expression data in tumors by UALCAN database and TIMER database (Fig. 4A and 4C). The results both showed the semblable results that SHMT2 display obviously high expression in LUAD. Besides, we explored the protein expression of SHMT2 between LUAD and normal tissues in UALCAN database (Fig. 4B) and conduct immunohistochemistry (IHC) on The Human Protein Atlas (HPA) (Fig. 5A-5F). From above analysis, we summarize the protein expression of SHMT2 was significantly reduced in normal while the protein level significantly elevated in tumors. Higher expression level suggested that SHMT2 possess diverse functions in various tumors, especially in LUAD.
3.4 Prognostic Value of SHMT2 in LUAD
We calculated the area under the curve (AUC) of the receiver operating curve (ROC) to evaluate the discriminative ability of prediction rules. And the AUC score for the training dataset was 0.842 (Fig. 6), indicating better survival prediction performance of the training data set. A Kaplan-Meier analysis showed poor overall survival in patients with higher SHMT2 level. Multivariate analysis and the Cox proportional hazard regression model uncovered that the expression of SHMT2 is an independent prognostic indicator for patients with LUAD (Fig. 7A-7B).
We then examined the prognostic value of SHMT2 using the Kaplan-Meier plotter and the Gene Expression Profiling Interactive Analysis (GEPIA) database. We calculated the Cox P/log-rank P value and hazard ratio with 95% intervals. We set Cox P/log-rank P = 0.05 as the thresholds. The patients were divided into two groups based on the median level of the SHMT2 expression in each queue. Univariate analysis was carried out through the Kaplan-Mayer plotter database and GEPIA to assess the impact of SHMT2 on various cancer survival rates (Fig. 7C-7D). The results indicated that the expression level of SHMT2 had a significant effect on the prognosis of LUAD. Moreover, low level of SHMT2 evinced a longer survival period for patients with lung adenocarcinoma. Given all that, these results suggested that high expression of SHMT2 was related to the poor prognosis of LUAD.
3.5 SHMT2 immune regulation molecules
The result of GESA analysis based on 3 datasets in Fig. 8 showed that the up-regulated gene in GSE21933, GSE31210 was apparently correlated with immune-related biological functions and SHMT2 was proved as a significantly up-regulated gene in 3 datasets, suggesting SHMT2 is possibly connected with immune.
In order to explore whether SHMT2 exert potential biological roles in immune infiltration, we conducted an integrated analysis based on TIMER database and TISIDB database, analyzing the link between SHMT2 and immune cell infiltration as well as the gene markers of immune cell subtypes in LUAD. The results in Fig. 8 suggested high levels of SHMT2 mRNA expression was associated with high immune infiltration in LUAD. SHMT2 mRNA expression level was significantly negatively correlated with infiltrating levels of immune cells, CD4 + T cells (r = -0.055, P = 2.22e-01), macrophages (r = -0.17, P = 1.65e-04) and dendritic cells (DCs) (r = -0.111, P = 1.44e-02) (Fig. 9). Besides, Table 1 also demonstrated the SHMT2 mRNA expression level had significant correlations with immune cells, TAMs, DCs, CD4 + T cells, neutrophils, Th1, Th2, Thf, T Cell exhaustion in LUAD.
Table 1
Correlation analysis between SHMT2 and related gene markers of immune cells
Description | Gene marker | SHMT2 | | Description | Gene marker | SHMT2 |
None | Purity | | None | Purity |
cor | p | cor | p | | cor | p | cor | p |
TAM | VSIG4 | -0.0859 | ** | -0.05576 | 0.079945 | | B cell | CD40 | -0.13984 | ** | -0.10275 | * |
CSF1R | -0.11735 | ** | -0.07542 | 0.094366 | | CD80 | -0.09015 | * | -0.05303 | 0.239835 |
FCGR2A | -0.09113 | * | -0.05383 | 0.232868 | | CXCR4 | -0.09868 | * | -0.07199 | 0.110367 |
FCER2 | -0.19299 | *** | -0.15948 | *** | | CXCR5 | -0.11635 | ** | -0.07293 | 0.105789 |
Treg | FOXP3 | 0.024662 | 0.576457 | 0.085218 | 0.058653 | | TLR4 | -0.13316 | ** | -0.1059 | * |
STAT5B | -0.06034 | 0.171461 | -0.05006 | 0.267301 | | Dendritic cell | HLA-DPA1 | -0.25847 | *** | -0.23163 | *** |
TGFB1 | -0.19353 | *** | -0.16703 | *** | | CCR7 | -0.18066 | *** | -0.05576 | 0.216479 |
CCR8 | -0.00152 | 0.972596 | 0.049541 | 0.27226 | | ITGAM | -0.17099 | *** | -0.05383 | 0.232868 |
Th1 | STAT4 | -0.18081 | *** | -0.15209 | *** | | CD59 | -0.39605 | *** | 0.085218 | 0.058653 |
CD4 | -0.16932 | *** | -0.13212 | ** | | HLA-DPB1 | -0.30492 | *** | -0.28417 | *** |
STAT1 | 0.161045 | *** | 0.207551 | *** | | HLADQB1 | -0.16634 | *** | -0.136 | ** |
IFNG | 0.120743 | ** | 0.165272 | *** | | HLA-DRA | -0.2821 | *** | -0.26451 | *** |
Th2 | GATA3 | -0.08165 | 0.064121 | -0.03198 | 0.478618 | | ITGAX | -0.09346 | * | -0.05407 | 0.230776 |
CXCR4 | -0.09868 | ** | -0.07199 | * | | CD1C | -0.3915 | *** | -0.37063 | *** |
CCR4 | -0.17256 | *** | -0.1376 | ** | | THBD | -0.1806 | *** | -0.16764 | *** |
Tfh | IL21 | 0.134901 | ** | 0.174578 | *** | | Natural killer cell | KIR2DL1 | 0.071633 | 0.104432 | 0.082454 | 0.067362 |
BCL6 | -0.12226 | ** | -0.11344 | * | | KIR2DL3 | 0.094744 | * | 0.115068 | * |
T Cell exhaustion | PDCD1 | 0.103873 | * | 0.167267 | *** | | KIR2DL4 | 0.254623 | *** | 0.284648 | *** |
CTLA4 | -0.00659 | 0.881405 | 0.054575 | 0.22644 | | KIR3DL1 | 0.05615 | 0.203316 | 0.080966 | 0.072474 |
LAG3 | 0.157443 | *** | 0.205703 | *** | | KIR3DL2 | 0.120751 | ** | 0.14666 | ** |
GZMB | 0.262616 | *** | 0.323581 | *** | | KIR3DL3 | 0.171332 | *** | 0.193336 | *** |
Neutrophil | CCR7 | -0.18066 | *** | -0.14587 | *** | | M1 Macrophage | IRF5 | -0.15333 | *** | -0.13549 | ** |
ITGAM | -0.17099 | *** | -0.13038 | *** | | | | | | | |
CD59 | -0.39605 | *** | -0.38037 | *** | | | | | | | |
CCR7 | -0.18066 | *** | -0.14587 | ** | | | | | | | |
*P < 0.05, **P < 0.01, ***P < 0.001 |
For further investigation, we found the expression of SHMT2 was associated with tumor-infiltrating lymphocytes (TILs), including activated Type 1 T helper cell, nature killer cell, T follicular helper cell, active B cell, immature B cell, active CD4 T cell, Type 17 T helper cell, Tem CD8 cell, CD56dim nature killer cell (Fig. 10A-10I). Particularly, the P-value of above-mentioned cells are all less than 0.001. Overall, these results suggested that the SHMT2 and its associated genes were important for immune cells infiltration in LUAD microenvironment and possibly exited a more significant effect on the prognosis of LUAD.