Screening and identification of SLC4A4 as an independent protective factor of prognosis prediction in colon cancer.
First, we retrieved the raw data of 150 pyruvate metabolism related genes from the MSigDB database. In order to explore the comprehend functions of pyruvate metabolism related genes, GO analysis was performed and the results showed that the top five enriched BP terms were “pyruvate metabolic process”, “glycolytic process”, “ATP generation from ADP”, “ADP metabolic process” and “nucleoside diphosphate phosphorylation”. The top five enriched CC terms were “nuclear envelope”, “host cellular component”, “host cell”, “nuclear pore” and “mitochondrial matrix”. The top five enriched MF terms were “structural constituent of nuclear pore”, “lyase activity”, “monosaccharide binding”, “carbohydrate kinase activity” and “oxidoreductase activity, acting on the aldehyde or oxo group of donors” (Figure 1A). KEGG pathway analysis revealed that pyruvate metabolism related genes were involved in glycolysis/gluconeogenesis, carbon metabolism, citrate cycle (TCA cycle), RNA degradation, and HIF-1 signaling pathway (Figure 1B). Then, expression analysis of 150 pyruvate metabolism related genes was performed and 116 differentially expressed genes were selected, which were subsequently subjected to univariate and multivariate Cox regression analysis and the results indicated that the pyruvate metabolism related genes SLC4A4 and PPARCG1A were independent protective factors, and INSR and SLC16A8 were independent risk factors in colon cancer (Figure 1C, 1D). Therefore, ENO3, INSR, PPARGCA1, SLC16A8 and SLC4A4 were selected as pyruvate metabolism hub genes for subsequent analysis. With the expression of ENO3, INSR, PPARGCA1, SLC16A8 and SLC4A4, we generated a 5-gene signature risk model and found out that higher 5-gene signature represented poor prognosis. To verify the validity of 5-gene signature, Kaplan-Meier curves for the training cohorts in TCGA and validation cohorts in GEO database were shown in Figure S1A&S1D. The 5-gene signature was positively correlated with poor prognosis. Then, we divided patients into high and low risk groups according to the best cut-off value of 5-gene signature. The distribution of the survival data and 5-gene signature for each patient, as well as the heatmaps of ENO3, INSR, PPARGCA1, SLC16A8 and SLC4A4 were shown in Figure S1B、S1C、S1E、S1F, in which patients with higher 5-gene signature usually had shorter survival time. Moreover, we detected the relationship between expression levels of ENO3, INSR, PPARGCA1, SLC16A8 and SLC4A4 with prognosis of colon cancer with Kaplan Meier analysis, and the results showed that the differences between the expression of INSR and PPARGC1A and OS were not significant, high expressions of ENO3 and SLC16A8 were associated with worse OS, whereas high expression of SLC4A4 was correlated to better OS, disease specific survival and progression free survival status (Figure 1E-L). On the basis of above results of Cox regression analysis and Kaplan Meier analysis, SLC4A4 was an independent protective factor of prognosis prediction and was selected for in-depth study to investigate its expression and function in colon cancer.
The correlation of SLC4A4 with clinicopathological factors and validation of its expression in clinical specimens of colon cancer.
We obtained raw data of RNA-seq profile and clinical information from the Genomic Data Commons (GDC) of the TCGA database, including 478 primary colon cancer samples, and the baseline data of training set are shown in Table 1. Using Xiantao online tools, the expression levels of SLC4A4 with clinicopathological factors were analyzed, and the results showed that the expression of SLC4A4 was significantly lower in colon cancer samples than in normal colon tissues (p<0.001, Figure 2A). The expression of SLC4A4 in colon cancer was also lower than in paired normal tissues (p<0.001, Figure 2B). The expression of SLC4A4 was lower in T4 than in normal tissues (p<0.001) while there was no difference between stage T4 and T1&T2&T3 (p=0.34, Figure 2C). And there were significant differences between stage N1&N2 and N0 (p<0.05, Figure 2D), stage M1 and M0 (p<0.05, Figure 2E), pathological stage Ⅳ and Ⅰ&Ⅱ&Ⅲ (p<0.05, Figure 2F). Further analysis of relationship between expression levels of SLC4A4 and clinicopathological factors in TCGA cohort revealed that the expression levels of SLC4A4 was significantly correlated to N stage (p=0.018), M stage (p=0.042) and pathological stage (p=0.016) (Table S3). Therefore, SLC4A4 correlates with a better prognosis and clinicopathological factors in colon cancer.
Analysis of potential upstream factors of SLC4A4.
Genetic and epigenetic variants significantly contribute to the regulation of carcinogenesis and immune tolerance. We examined the frequency of SLC4A4 mutations in the TCGA database (10,967 samples/10,953 patients in 32 studies) using cBioPortal and the results showed no significant differences in mutation rates in COAD (Figure S2A), with SLC4A4 mutations accounting for 3.2% of all mutations. Then we subsequently analyzed the DNMIVD database to test whether methylation factors influence the expression of SLC4A4 in colon cancer, and the results showed no significant effect of methylation on the difference in SLC4A4 expression in colon cancer (Figure S2B), no significant difference in methylation between colon tumors and normal tissue (Figure S2C), and no significant correlation between SLC4A4 and methylation (Figure S2D). The prognostic differences between the two groups were investigated by grouping patients according to whether SLC4A4 were methylated, and the results showed that the prognosis of different human cancer cases with SLC4A4 methylation was similar to human cancer cases without SLC4A4 methylation, including overall survival (p=0.429), disease-free survival (p=0.509) and progression-free survival (p=0.281) (Figure S2E-G). Then we compared SLC4A4 expression levels across cell lines for pre- and post-cytokine treated samples via the TISMO online database. The cytokine treatments included in this module included four methods, IFNγ, IFNβ, TNFα and TGFb1. The results showed that there were significant differences in SLC4A4 expression levels after the three cytokine treatments, IFNγ, IFNβ and TNFα (Figure S3A). We predicted the TFs and miRNAs of SLC4A4 and TFs of miRNAs predicted through TransmiR and PROMO databases, and then intersected TFs and miRNAs and finally obtained 12 miRNAs and 28 TFs (Figure S3B). The results indicated that 28 TFs such as FOXP3, IRF1 and HNF1B acted on 12 miRNAs such as has-miR-92b-3p and has-miR-211-5p to regulate the expression of SLC4A4 through the feed-forward loops, which provide transcriptional–post transcriptional regulatory network.
Functional enrichment analysis of SLC4A4
To gain insight into the potential functions of SLC4A4 in colon cancer and the regulatory networks involved, a series of functional enrichment analysis was performed. The volcano plot showed 1121 genes associated with SLC4A4, including 896 up-regulated associated genes such as FAM11B, CA2, CDKN2BAS et al. and 225 down-regulated associated genes such as BYSL, DDAH2, WDR46 (Figure 3A). Heat map showing the top 50 up-regulated and 50 down-regulated SLC4A4-related genes (Figure 3B). Then we analyzed SLC4A4 co-expressed genes using Metascape, and the results indicated that SLC4A4 co-expressed genes were enriched in cargo concentration in the ER, ribosome biogenesis, regulation of viral-induced cytoplasmic pattern recognition and metabolism of RNA (Figure 3C, 3E). The PPI network identified densely connected regions between proteins and three molecular complex detection (MCODE) were extracted, including MCODE1, MCODE2, and MCODE3. MCODE1 is consisted of RPL37A, RRS1, WDR74, RPL37, RPL31, RPL30, and EIF6. MCODE2 is consisted of CTNNBL1, PRPF6, POLR2J, GTF2F2, and PQBP1. MCODE3 is consisted of SEC24D, TMED10, LMAN1, and SERPINA1 (Figure 3D). Then the top 100 DEGs of SLC4A4 were identified, and a heatmap indicated the correlation between them (Figure 4A). Gene set enrichment analysis (GSEA) revealed that DEGs of SLC4A4 were mainly enriched in cell cycle regulatory processes, including mitotic G1 phase and G1/S transition, DNA replication, cell cycle mitotic G2/M checkpoint, and cell cycle checkpoint, indicating that SLC4A4 may play an important role in proliferation of colon cancer (Figure 4B). GO analysis were subsequently performed on 300 SLC4A4 DEGs, which suggesting that the top five molecular function (MF) terms were hormone activity, UDP-glycosyltransferase activity, peptidoglycan binding, beta-1,3-galactosyltransferase activity and galactosyltransferase activity. The top five biological process (BP) terms were isoprenoid metabolic process, terpenoid metabolic process, diterpenoid metabolic process, retinoid metabolic process and retinol metabolic process. The top five cellular component (CC) were cluster of actin-based cell projections, brush border, brush border membrane, zymogen granule and zymogen granule membrane (Figure 4C). Moreover, KEGG pathways analysis was carried out to investigate comprehend functions of SLC4A4 DEGs and the results showed that SLC4A4 DEGs were involved in neuroactive ligand-receptor interaction, pancreatic secretion, retinol metabolism and nitrogen metabolism (Figure 4D).
SLC4A4 is correlated with immune cell infiltration and tumor microenvironment
Immune cell infiltration plays a crucial role in tumor progression. To investigate the relationship between SLC4A4 with immune cell infiltration, CIBERSORT, CIBERSORT-ABS, EPIC, MCPCOUNTER, QUANTISEQ, TIMER and XCELL algorithms were performed on SLC4A4 high expressed group and low expressed group. The heatmap indicates the correlation of SLC4A4 with immune infiltration phenotypes, including T cells, B cells, CD8+ T cells, macrophages, mast cells, neutrophils and NK cells (Figure 5A). The subsequent analysis of relationship between expression levels of SLC4A4 with immune cell infiltration showed that high expression levels of SLC4A4 are associated with higher expression of CD4+ T cell (p=0.019, TIMER, Figure 5B; p=2.5e-09, EPIC, Figure 5C), CD8+ T cell (p=3.4e-05, TIMER, Figure 5D; p=0.00047, EPIC, Figure 5E) and NK cell (p<2.22e-16, MCPCOUNTER, Figure 5F; p=0.031, QUANTISEQ, Figure 5G). Besides, high expression of SLC4A4 are associated with higher ESTIMATE score (p=0.0015), immune score (p=1.3e-06) and TMB (p=0.022) (Figure 5H-5J). Moreover, the correlation network indicates that SCL4A4 was positively correlated with FAS, CD274 and CTLA4 (Figure 5K).
SLC4A4 inhibits colon cancer cell proliferation and migration
Next, to explore the role of SLC4A4 in regulating colon cancer cell proliferation and migration, a series of experiments were performed in vitro. We firstly upregulated the expression of SLC4A4 in LOVO and RKO cells and the efficiency of overexpression was verified by qRT-PCR and western blotting (p=0.0275, Figure 6B&6C; p=0.0209, Figure 6E&6F) and qRT-PCR (p<0.0001, Figure 6A; p<0.0001, Figure 6D) analysis. The wound healing assay revealed that upregulation of SLC4A4 inhibits the migrative ability of LOVO (p<0.0001, Figure 6I&6J) and RKO (p<0.001, Figure 6K&6L) cells. Next, CCK8 assay was performed that the results indicated that SLC4A4 promotes the cell viability of colon cancer cells. The subsequent EdU assay also revealed that upregulation of SLC4A4 promotes the cell viability in LOVO (p<0.01, Figure 6M&6N) and RKO (p<0.001, Figure 6O&6P) cells.
SLC4A4 suppresses partial EMT signaling in vitro.
To further investigate the mechanisms of SLC4A4 in suppressing colon cancer progression, we firstly screened genes interacting with SLC4A4 using GeneMANIA online tool, and the results showed that there were 20 genes correlating with SLC4A4, including CA4, SLC4A7, CA2, MAPK7, et al. (Figure 7A). Then we analyzed the top 5 genes with the strongest correlations in cancer-related pathways by GSCALite , which revealed that SLC4A4 had strong inhibitory effects on EMT, apoptosis, cell cycle, DNA damage response, while had the strongest activate effects on RKT, suggesting that SLC4A4 affects oncogenesis through multiple pathways (Figure 7B). Since above bioinformatic analysis indicated the potentially inhibitory effect of SLC4A4 on EMT, we detected the protein levels of EMT signaling pathway related proteins through western blotting assays. And the results showed that the upregulated of SLC4A4 significantly increased the protein levels of E-cadherin (p=0.0017, Figure 7C&7D; p=0.0030, Figure 7E&7F) and decreased the protein levels of Twist1 (p=0.0015, Figure 7C&7D; p=0.0009, Figure 7E&7F), while there was no obvious change in N-cadherin in both LOVO and RKO cells. These findings suggest that SLC4A4 inhibits colon cancer proliferation and migration through suppressing partial EMT signaling pathway.
Expression and prognostic analysis of SLC4A4 and its correlation with immune cell infiltration in pan-cancer
Next, to investigate the role of SLC4A4 in pan-cancer, a series of bioinformatic analysis were performed. Figure 8A showed that the expression levels of SLC4A4 were significantly upregulated in tumors including GBM, GBMLGG, LGG, STES, PRAD, STAD, SKCM, LAML, PCPG, and CHOL, while downregulated in tumors including BRCA, KIRP, COAD, COADREAD, HNSC, LUSC, LIHC, WT, BLCA, THCA, READ, PAAD, TGCT, ALL, ACC, KICH, CESC, ESCA, KIPAN, and OV. Survival analysis revealed that SLC4A4 comes with better prognosis in tumors including GBMLGG, KIRC, MESO, COAD, COADREAD, NB, and SKCM (Figure 8B). Then we explored the correlation of SLC4A4 with immune cell infiltration in pan-cancer using Sangerbox online tool, and the heatmap showed the correlation of expression level of SLC4A4 with immune regulators including chemokines, receptors, MHC, immune inhibitors and immune stimulators in pan-cancer (Figure 8C). Besides, Figure 8D suggested the SLC4A4 was positively associated with immune checkpoints including CD276, IL4, IL13, CTLA4 et al. and negatively correlated to immune checkpoints including CD70, IL1A, CX3CL1 et al. in pan-cancer.
To investigate the immune cell phenotypic heterogeneity of SLC4A4, we performed single cell RNA-seq analysis of immune cells from colon cancer patients through THE HUMAN PROTEIN ATLAS (https://www.proteinatlas.org/) online website. After data pre-processing and quality control, we analyzed in the Monaco database to obtain immune cell type expression maps, which showed that the expression of SLC4A4 in the bar graph (Figure 9A) was closely related to Gama Delta (γδ) T cells, T cells, and NK cells, and the heat map (Figure 9B) further showed that the expression of SLC4A4 was mostly associated with γδ T cells. To identify cell subpopulations in colon cancer patients, we performed SC Transform standardization to regress UMI counts, gene counts, percentage of mitochondrial reads and cell cycle phases in immune cells, followed by integrated normalization of the dataset using Seurat. Following SingleR annotation, 52 immune cell expression clusters were predicted from the samples (Figure 9C) and correlated Gotreemap was done for γδ T cells (Figure 9D), showing that SLC4A4 expression was closely associated with cell membrane, plasma membrane and T cell receptor complexes in cellular components, and in biological processes, with negative regulation of translocation, positive regulation of calcium ion transport, and positive regulation of natural killer cell-mediated cytotoxicity, cytolysis, and adaptive immune responses.
Exploring the correlation of SLC4A4 with tumor mutation load (TMB) and microsatellite instability (MSI) and its role in pan-cancer immunotherapy.
To understand the effectiveness of SLC4A4 in predicting immune checkpoint inhibitor (ICI) therapy, we assessed the correlation of expression levels of SLC4A4 and the two biomarkers commonly used clinically for immunotherapy prediction: tumor mutational load (TMB) and microsatellite instability (MSI). The results showed that SLC4A4 expression was positively correlated with TMB values in COAD, COADREAD, USCN, OV, MESO and KIVH, and negatively correlated with TMB in CHOL, LUAD, THYM, LUSC and GBMLGG (Figure 10A). In addition, a positive correlation between SLC4A4 and MSI expression was found in TGCT, COAD, COADREAD, CHOL, UVM and READ; and a negative correlation was found in KICH, KIPAN, PAAD, UCS and LUSC (Figure 10B). The results suggest that SLC4A4 may have the ability to predict the efficacy of ICIs in the corresponding cancers. To this end, we further investigated the role of SLC4A4 expression levels in pan-cancer immunotherapy. In anti-PD-1 treatment group, the difference between the correlation of SLC4A4 and prognosis was not significant. Higher expression of SLC4A4 was negatively correlated with prognosis of patients in Anti-PD-1 treatment Nivolumab only group, whereas positively associated with better prognosis of patients in Anti-PD-1 treatment Pembrolizumab only group. In Anti-PD-L1 treatment, Anti-PD-L1 treatment Atezolizumab only, and Anti-CTLA4 treatment groups, SLC4A4 expression was positively correlated with prognosis (Figure 10C-H). Then we used the TISMO database(http://tismo.cistrome.org/) to verify the correlation between SLC4A4 with immunotherapy, through comparing SLC4A4 expression levels in different tumor models and ICB treatments and comparing SLC4A4 expression before and after ICB treatment and in responders and non-responders. Four ICB treatments were included in the module, including anti-PD1, anti-PDL1, anti-PDL2 and anti-CTLA4. The results suggested that the expression levels of SLC4A4 were significantly correlated to immunotherapies including anti-PD1, anti-PDL1 and anti-CTLA4 (Figure S4). To conclude, SLC4A4 is significantly associated with TMB and MSI and plays a crucial role in pan-cancer immunotherapy.