Functional enrichment of cadherin genes
Through functional enrichment analysis, the main functions of cadherin genes were found to include calcium ion binding and participation in various biological processes involving cell adhesion (Figure 1A).
Gene-gene interaction analysis revealed significant co-expression and pathway interactions among the CDH genes (Figure 1B-C). Subsequently, in order to verify the co-expression interaction relationship of these genes in GC tumor tissue, the cor function was used to calculate the co-expression correlation coefficient of these genes in R. These CDH genes were also found to have significant co-expression gene-gene interactions in GC tumor tissue (Figure 2).
Clinical significance of cadherin genes
A total of 408 GC-derived RNA-seq samples were obtained from TCGA website. These included 32 para-carcinoma and 375 tumor tissue samples. The pheatmap package in R was used to draw a heat map of the CDH gene expression distribution in GC para-carcinoma and tumor tissue samples (Figure 3). In the CDH gene family, 12 genes were found to be differentially expressed between GC para-carcinoma and tumor tissue. Five of these genes were significantly down-regulated in the cancer tissue, while seven were significantly up-regulated (Figure 4, Table 1). When adjusting for tumor stage and age in the subsequent multivariate Cox proportional hazard regression model of the CDH genes found that five CDH genes were significantly associated with GC prognosis. These five prognostic genes included CDH6, CDH10, CDH7, CDH2, and CDH13 (Table 1, Figure 5A-E). High expression of each these five genes was significantly associated with poor GC prognosis. Among these five prognostic CDH genes, with a time-dependent area under the ROC curve of 0.680, CDH6 was found to have the highest accuracy in predicting five-year survival in GC patients (Figure 6A-E). To verify the prognostic value of these five CDH genes in GC, GC patient data derived from the Kaplan-Meier plotter database were used as a validation cohort. CDH6, CDH10, CDH7 and CDH2 were found to be significantly correlated with GC prognosis, and that high expression of these genes were significantly correlated with poor clinical outcome in GC cases (P<0.01 for all log-rank values, Figure 7A-D). No significant correlation was found between CDH13 mRNA expression levels and GC prognosis (log-rank P=0.47, Figure 7E). An expression matrix of the TCGA cohort was therefore used for the CDH6, CDH10, CDH7 and CDH2 genes to construct the prognostic risk score model. Through step function screening, a prognostic risk score model was constructed based on the expression of CDH2 and CDH6. High- and low-risk GC patients were defined according to the median risk score value. The risk score calculation model was as follows: risk score=(0.0979 × CDH2 expression) + (0.1841 × CDH6 expression). In the analysis of the prognostic risk scores, it was observed that high-risk GC patients were significantly associated with poor prognosis (log-rank P<0.001, adjusted P<0.001, HR=1.910, 95%CI=1.339-2.724, Figure 8A-B), and that their median survival time of 675 days was shorter than low-risk patients (1686 days). The construction of the prognostic risk score model based on CDH2 and CDH6 was observed to significantly improve the accuracy when predicting the five-year survival of GC patients (AUC=0.698, Figure 8C). In order to evaluate the contribution of CDH genes to the prognosis of GC patients, CDH prognostic genes and associated risk scores were used to construct the nomogram models. Using the nomogram model, it was found that tumor stage had the greatest contribution to prognosis. Among the CDH prognostic genes, CDH6 had a greater contribution to GC prognosis than the other prognostic CDH genes (Figure 9A). In the risk-score nomogram model, tumor stage was found to have had a greater contribution to the prognosis of GC than risk score, but risk score had a greater contribution to GC prognosis than single CDH prognostic genes (Figure 9B).
Functional enrichment analysis of cadherin genes in GC
In order to understand the potential biological mechanisms involved in the prognostic CHD genes and risk scores in GC, GSEA was used to perform enrichment analysis for different CDH expression levels or risk score phenotypes. GSEA of CDH2 in TCGA cohort found that a high CDH2 expression phenotype was significantly involved in several systems and pathways. These included the integrin1 pathway, VEGFA targets, tumor tumorigenesis, JNK signaling dn, metastasis epithelial-mesenchymal transition (EMT) up, PI3K cascade: FGFR2, metastasis, Wnt signaling pathway, TGF beta signaling pathway, tumor vasculature up, calcium signaling pathway, cell substrate adhesion, G protein coupled receptor signaling pathway coupled to cyclic nucleotide second messenger, vascular endothelial growth factor signaling pathway, positive regulation of Erk1 and Erk2 cascade and adherens junction assembly (Figure 10A-P). For CDH6, we found that the high CDH6 expression phenotype was significantly associated with the integrin1 pathway, and Wnt signaling pathway, VEGFA targets, focal adhesion, NF-κB signaling, metastasis up, calcium signaling pathway, tumorigenesis up, tumor vasculature up, vascular endothelial growth factor signaling pathway, transforming growth factor beta binding, cell cell adhesion via plasma membrane adhesion molecules, Wnt protein binding, cell cell adhesion mediated by cadherin, G protein coupled neurotransmitter receptor activity and phosphatidylinositol 3 kinase signaling (Figure 11A-P). For CDH7, the differences between low- and high-CDH7 expression phenotypes were significant in metastasis dn, mTOR 4 pathway, TNF pathway, ERBB2/ERBB3 pathway, Notch signaling pathway, MAPK pathway, apoptosis, p53 downstream pathway, ERBB signaling pathway, NF-κB pathway, Akt pathway, signaling by EGFR, Ras pathway, G protein coupled glutamate receptor signaling pathway, cell cycle process and cell-cell recognition (Figure 12A-P). The high CDH10 expression phenotype was significantly involved in calcium signaling pathway, metastasis, PI3K cascade: FGFR1, PI3K cascade: FGFR2, EZH2 targets, targets of CCND1 and CDK4 up, fibroblast growth factor receptor binding, phospholipase C activating G protein coupled receptor signaling pathway, cell cell adhesion via plasma membrane adhesion molecules, Wnt protein binding, G protein coupled receptor signaling pathway coupled to cyclic nucleotide second messenger and regulation of cAMP mediated signaling (Figure 13A-L). The high-risk score phenotype was significantly involved in Kras targets up, ECM receptor interaction, integrin1 pathway, Wnt signaling pathway, PI3K cascade: FGFR1, VEGFA targets, calcium signaling pathway, metastasis EMT up, NF-κB signaling, vascular endothelial growth factor signaling pathway, regulation of cell junction assembly and regulation of non canonical Wnt signaling pathway (Figure 14A-L).
Drug screening for GC risk score model
In order to screen targeted therapeutic drugs for GC risk scores, edgeR was used. This enabled the screening of DEGs between high- and low-risk phenotypes. A total of 344 DEGs were obtained across high- and low-risk phenotypes (Figure 15, Table S7). The heat map for these DEGs is shown in Figure S1. A multivariate Cox proportional hazards regression model, corrected for tumor stage and age, was then used for prognostic analysis. A total of 45 DEGs were observed to be significantly correlated with GC prognosis in the TCGA cohort (Table S8, Figure 16A). The three most significant DEGs included cerebellin 4 precursor (CBLN4, log-rank P=0.00096, Figure 16B), chorionic gonadotropin subunit beta 3 (CGB3, log-rank P=0.0029, Figure 16C) and butyrylcholinesterase (BCHE, log-rank P=0.004, Figure 16D). Through functional enrichment analysis of the DEGs, it was found that these DEGs may participate in calcium signaling pathway, ECM-receptor interaction, cAMP signaling pathway, cGMP-PKG signaling pathway, gastric acid secretion, adenylate cyclase-activating G-protein coupled receptor signaling pathway, calcium-dependent protein binding and G-protein coupled receptor signaling pathway, coupled to cyclic nucleotide second messenger function, was also found to be impacted. These biological processes and signaling pathways may be the potential mechanisms driving the differences in clinical outcome among GC patients across high- and low-risk phenotypes (Table S9). The DEGs were used to conduct targeted drug screening through the CMap online tool. Three potential small molecule compounds, anisomycin, nystatin and bumetanide, were found to be linked to the GC risk score. The chemical structures of the three drugs are shown in Figure17A-C, while the CMap analysis results are summarized in Figure 17D. The STITCH online tool was used to construct the drug-gene interaction network. Bumetanide was found to potentially play a role in GC by targeting ERAS (ES cell expressed Ras) genes, while nystatin may play a role in GC by targeting SST (somatostatin) and ADRB3 (adrenoceptor Beta 3) genes (Figure 18).
Relationship between prognostic CDH genes expression and tumor immune infiltration abundance
By analyzing the relationship between the prognostic CDH genes and the abundance of tumor immune infiltration (Figure 19A-D), we found that expression level of CDH6 (r=0.138, P=7.82×10-3) and CDH7 (r=0.104, P=4.57×10-2) were significantly related to B cell infiltration in GC tumor tissues. For CD8+ T cell, we found that the expression level of CDH10 (r=0.183, P=4.06×10-4) in GC tumor tissues were closely related to CD8+ T cell immune infiltration. The expression level of CDH2(r=0.263, P=3.54×10-7), CDH6 (r=0.307, P=2.04×10-9) and CDH10 (r=0.33, P=9.99×10-11) in GC tumor tissues were closely related to CD4+ T cell immune infiltration. Tumor immune infiltration abundance of macrophage also shown a significantly associated with CDH2 (r=0.474, P=3.59×10-22), CDH6 (r=0.418, P=4.20×10-17) and CDH10 (r=0.492, P=6.60×10-24) in GC tumor tissues. Tumor immune infiltration abundance of neutrophil were closely correlated with CDH2 (r=0.149, P=4.10×10-3) and CDH10 (r=0.147, P=4.54×10-3) expression. All these four prognostic genes were significant associated with dendritic cell Tumor immune infiltration abundance: CDH2 (r=0.296, P=5.84×10-9), CDH6 (r=0.164, P=1.49×10-3), CDH6 (r=-0.105, P=4.41×10-2) and CDH10 (r=0.274, P=7.78×10-8).