Identication of VCAM1 as a Novel Biomarker in the Immune Microenvironment of Gastric Cancer by Comprehensive Bioinformatics Analysis

Gastric cancer (GC) is a common malignant tumor of the digestive tract worldwide and has high morbidity and mortality. The tumor immune microenvironment (TIME), especially the immune cell inltration, plays an important role in the progression and prognosis of GC. In this study, we investigated the TIME-related genes and explored their role in the GC immune microenvironment. We used ssGSEA to assess the immune cell inltration in 375 patients with GC downloaded from TCGA. Then GC samples were divided into high-, medium-, and low-immune cell inltration groups by hierarchical clustering. Differentially expressed genes analysis were further proceed between groups to determine TIME-related differentially expressed genes (DEGs). By protein interaction network and Cox analysis, the angiogenesis gene was intersected. The results showed that vascular cell adhesion molecular 1 (VCAM1) was the most critical gene. We further analyze the importance of VCAM1 in the progression of GC and its role in the GC microenvironment. Immuno-correlation analysis addition,


Introduction
Gastric cancer (GC) is the fth most common malignancy and the third leading cause of cancer-related death globally [1]. In 2018, more than 1 million new cases of GC have been recorded worldwide, resulting approximately 783,000 deaths. The incidence of GC in males is twice that of females, and this rate varies among regions [2]. At present, gastroscopy is the main diagnostic method for GC, but the application of this process is limited due to its invasiveness and inconvenience [3]. In addition, patients with early GC usually have no apparent symptoms. Most diagnosed patients are in the advanced stage, which undoubtedly increases the di culty of the treatment of GC. Therefore, understanding the pathogenesis of GC and nding effective biomarkers to improve the diagnosis, treatment, and prognosis of GC are essential.
In recent years, the tumor immune microenvironment (TIME) has attracted increasing attention in oncology. The TIME is a complex environment composed of various immune cells, among which immune cell in ltration (ICI) plays a leading role in TIME. Due to the high heterogeneity of immune-cell-in ltrated tumors, the nature and frequency of ICIs have been considered as important determinants of tumor development and prognosis [4][5][6]. Among the tumor-in ltrating immune cells, Curiel TJ et al. found that high rates of CD8 + T cells to Foxp3+ regulatory T cells (Treg cells) were associated with favorable clinical outcomes in ovarian tumor [7]. In addition, Mφ and Tregs in ltration have been associated with improved survival in patients undergoing radical gastrectomy for advanced GC [8]. More importantly, antitumor effect can be achieved by improving the immune cell in ltration in the TIME. For example, by blocking the interaction between the tumor cells expressing immune checkpoint and immune cells, tumor immune escape can be stopped, and the antitumor effect can be stimulated. Therefore, an in-depth understanding of the TIME may provide an important perspective for the progression of GC.
Angiogenesis is another malignant feature of cancer cells as a source of nutrition and a way of metastasis. To support the high proliferation rate of cancer cells, tumor cells often secrete high levels of angiogenic factors that promote the formation of disordered, immature, and permeable vascular networks [9,10]. Such immature blood vessels often have a great effect on the tumor microenvironment, leading to hypoxia, immune cell invasion, and increased risk of metastasis and migration [11][12][13].
Antiangiogenesis therapy has become a hot spot in clinical treatment. For example, apatinib, a smallmolecule tyrosine kinase inhibitor, highly selectively binds to and inhibits the vascular endothelial growth factor receptor 2. This molecule works well in patients with advanced or metastatic GC that is refractory to chemotherapy [14]. Therefore, identifying critical biomarkers related to angiogenesis and immunity in the TIME is of great importance to tumor diagnosis and treatment.
In this study, single-sample gene set enrichment analysis (ssGSEA) was used to categorize patients with GC in the Cancer Genome Atlas (TCGA) into high-, medium-, and low-immune cell in ltration groups. The results are validated in combination with ESTIMATE, CIBERSORT, and K-M analysis. Then, the key candidate genes were screened by differential analysis and univariate Cox analysis combined with angiogenesis-related genes. Finally, vascular cell adhesion molecular 1 (VCAM1) was identi ed as a key gene related to immunity and angiogenesis in GC. In addition, the role of VCAM1 in the development of GC was analyzed, and the underlying mechanism was elucidated.

Materials And Method
Data acquisition and clustering A total of 375 GC samples were collected from TCGA(https://portal.gdc.cancer.gov/). The data included transcriptome pro les and clinical data. The ssGSEA is a method of scoring the abundance of immune cell in ltrates in the immune microenvironment of each sample on the basis of 29 immune cellassociated gene sets (i.e., immune cell types, functions, and pathways) [15]. We used the R package GSEAbase to achieve ssGSEA analysis of the GC samples and obtained immune enrichment level of each GC sample. Then the R package sparcl was used to perform hierarchical clustering of the GC samples.
These GC samples were divided into high-, medium-, and low-immune cell in ltration groups. The heatmap further visualized the differences in the immune cell content among the three groups.

Determination of the validity of immune clustering
The ESTIMATE algorithm is based on the level of immune and stromal cell-speci c gene expression and aims to re ect the level of in ltration of the immune and stromal cells in the tumor microenvironment. Based on the R software package ESTIMATE, we calculated the tumor purity, immune score, and stroma score for each of the three GC cluster samples [16]. Then, the cluster heatmap was drawn by the R package pheatmap to verify the effectiveness of the ssGSEA grouping. Violin plots and boxplots were drawn using the R package ggpubr to show differences in the tumor purity, immune score, stroma score, and some immune-related genes among the three clusters. In addition, the abundance of 22 human in ltrating immune cells in each GC sample was calculated using the Cell type Identi cation By Estimating Relative Subsets of RNA Transcripts (CIBERSORT) [17]. Differences among the 22 kinds of immune-cell in ltration among the three groups were further compared. Finally, combined with survival data, survival analysis in the three clusters was performed using the R package Survival.

Identi cation of DEGs and functional enrichment analysis
According to the clustering results, the R software package limma was used for the difference analysis of the high-and low-immune cell in ltrating groups to obtain the DEGs (|lgFC| > 1 and P-Val < 0.05). Based on the STRING database(https://string-db.org/), a protein-protein interaction (PPI) network of these DEGs was constructed with a threshold of con dence of interaction relationship between nodes greater than 0.9. Then, Cytoscape software was used to identify the top 60 hub genes in the PPI network. In addition, Gene Ontology (GO) was used to analyze the DEGs and reveal the functions and molecular functions of genes in the biological processes. The results were visualized using R packages GGplot2, Enrichment Plot, and clusterPro ler. P value less than 0.05 indicated a statistically signi cant difference. Identi cation of key prognostic genes associated with angiogenesis and immunity in the immune microenvironment The angiogenesis-related genes were obtained from the GeneCards database based on a correlation coe cient ≥5. By the intersection of these angiogenesis-related genes and DEGs, the candidate gene associated with angiogenesis and immune in TIME was determined. Then, univariate Cox regression analysis was used to evaluate the relationship between the candidate genes and overall survival (OS), and P <0.05 indicated the prognostic gene. Venn diagrams were used to show the common genes between prognostic and hub genes in the PPI network. These shared genes were considered to be the vital prognostic genes associated with angiogenesis and immunity in the immune microenvironment.

Signi cance of VCAM1 Expression in GC
The expression pattern and prognostic signi cance of VCAM1 in GC were analyzed using TCGA data and the Gene Expression Pro ling Interactive Analysis website (GEPIA; http://gepia.cancer-pku.cn/). The site includes the difference in the expression of target genes in the tumor and adjacent normal tissues, the correlation between gene expression and clinical stage, and the prognostic value. P< 0.05 was considered statistically signi cant. In addition, according to the expression of VCAM1, patients with GC were divided into high-and low-expression groups to explore the potential mechanism of VCAM1 in GC. GSEA software was used for the enrichment analysis of the KEGG pathway, with truncation value NOM P <0.05 and FDR<0.05.

Role of VCAM1 in TIME
To explore the role of VCAM1 in the TIME of GC, multiple databases were integrated, and the Spearman correlation between VCAM1 expression and immune-related markers was analyzed. TISIDB is a website for the comprehensive research on tumor-immune interactions(TISIDB (hku.hk)) [18]. The section of immunomodulator and chemokine was selected to analyze their correlation with VCAM1 expression. TIMER is another comprehensive resource for analyzing immune in ltration across multiple cancer types (TIMER2.0 (cistrome.org)) [19]. The Gene_Corr module was used to detect the correlation between VCAM1 expression and various immune cells. In addition, according to the proportion of immune cell in ltration in the tumor microenvironment calculated by the CIBERSORT algorithm, the correlation between VCAM1 expression and immune cell in ltration in the tumor microenvironment was further explored. P < 0.05 was set as the statistical truncation value.
Predicting binding of miRNA and lncRNA to VCAM1 ENCORI is a comprehensive database dedicated to RNA interactions (https://starbase.sysu.edu.cn/index.php) [20]. Given the importance of VCAM1, its possible regulatory mechanisms were further explored. ENCORI database was used to predict miRNAs and lncRNAs that can bind to the target genes. Differential, correlation, and survival analyses were performed on the predicted miRNAs and lncRNAs in the GC cohort. MiRNAs and lncRNAs that satis ed the above conditions were regarded as the most potential candidates. Finally, Cytoscape was used to construct the mRNA-miRNA-lncRNA interaction ceRNA network.

Results
Clustering and Subtype Differential Analysis of GC All work ow charts are shown in Figure 1. A total of 375 GC samples were obtained from TCGA, and ssGSEA algorithm was used to calculate the degree of in ltration scores of 29 immune-related cells in all GC samples. Hierarchical cluster analysis was used to classify the tumor samples with different enrichment scores into high-immunity group (Cluster 1), medium-immunity group (Cluster 2), and lowimmunity group (Cluster 3) (Figures 2A, B). To verify the applicability of the clustering results, the ESTIMATE algorithm was also used to calculate the stroma score, immune score, and tumor purity of the tumor samples. The results showed signi cant differences in the immune and stromal scores among the different immune groups and an increasing trend from the low-immunity group (Immunity-L) to the highimmunity group (Immunity-H). By contrast, tumor purity was higher in the Immunity-L than in the other two groups (Figures 2C-F). Immunomodulatory factors,such as PD-L1 and CTLA4, showed the same increasing trend in the clusters, and they were all higher in the Immunity-H group ( Figures 2J, H). Figure 2I demonstrates that in the different groups, the expression levels of most HLA genes in the Immunity-H group were signi cantly higher than in the other two groups. CIBERSORT algorithm was also used to calculate the degree of immune cell in ltration in the GC samples. Comparison of the immune cell in ltration status among the different groups showed that the high-immunity group had higher immune in ltration ( Figure 2G). Finally, survival analysis showed signi cant differences in the survival rates among the three groups (P=0.043, Figure 2K). These results proved that the clustering strati cation of the patients with GC was meaningful and could be used for further analysis.

Identi cation and functional analysis of DEGs
To investigate the mechanism of the immune in ltration in the patients with GC, the DEGs between the high-and low-immunity groups were identi ed through differential analysis. According to |log2FC| > 2 and P<0.05, a total of 463 DEGs were obtained, which were regarded as TIME-related gens ( Figure 3A).
Then, PPI network diagrams were constructed through the STRING database to analyze the interactions between these DEGs ( Figure 3B). After calculating the number of adjacent nodes of each gene, the top 60 core genes were screened and arranged ( Figure 3C). Afterward, functional enrichment analysis was performed on the 463 TIME-related genes. Figures 3D-F show that these TIME-related genes were mainly involved in immune-related regulatory processes, such as T cell activation, lymphocyte differentiation, and various immune cell proliferation.
Identi cation of key genes related to angiogenesis in TIME Considering the critical role of angiogenesis in tumor development, 378 genes related to angiogenesis were obtained from the GeneCards database(https://www.genecards.org/). A total of 14 target genes were obtained by analyzing the intersection of angiogenesis-related genes and TIME-related DEGs. Further univariate Cox regression analysis showed that seven genes were associated with the OS of GC ( Figure 4A). Based on the above screening and the hub genes in the PPI interaction network, the candidate genes most related to angiogenesis in the GC immune microenvironment were obtained for further analysis ( Figure 4B) Expression pattern and prognostic signi cance of VCAM1 in GC From the above analysis, the core gene VCAM1 was obtained for follow-up research. First, Wilcoxon ranksum test results showed that VCAM1 expression was signi cantly higher in the tumor tissues than in the normal tissues in the unpaired and paired GC samples ( Figure 5A, P<0.001; Figure 5B, P=0.00083). GEPIA online database analysis showed that VCAM1 was signi cantly overexpressed in the GC tumor tissues ( Figure 5C). Combined with the GC samples from the TCGA and GEPIA online database analysis, VCAM1 expression level was also found to be signi cantly correlated with the clinical stage ( Figures 5D, E). The role of VCAM1 in GC survival was also assessed. Survival analysis results showed that patients with GC with high VCAM1 expression had a poor OS rate in in TCGA data set ( Figure 5G, P=0.01) and GEPIA database ( Figure 5H, P=0.025).
To explore the regulatory role of VCAM1 in the GC, the samples were divided into high-expression and low-expression groups based on the VCAM1 expression in the GSEA analysis. Figure 5F shows that VCAM1 was involved in the immune and angiogenesis-related regulatory processes, such as the chemokine signaling pathway, JAK-STAT signaling pathway, natural killer cell-mediated cytotoxicity, Tolllike receptor signaling pathway, KEGG hematopoietic cell lineage, and vascular smooth muscle contraction. These results suggest that VCAM1 plays a vital role in the development of GC.

Correlation between VCAM1 expression and in ltrating immune cell markers
The importance of VCAM1 in the immune microenvironment of GC was further analyzed. Gastric adenocarcinoma (STAD) is the most common histopathological type of GC [21]. Online analysis of the TISIDB database showed that VCAM1 expression was signi cantly positively correlated with immunostimulatory genes, chemokines, and chemokine receptors in pan-cancer, especially in STAD ( Figure 6A). The TIMER database showed that VCAM1 expression was negatively correlated with tumor  Figure 6B). The results of the CIBERSORT algorithm based on GC samples further con rmed that VCAM1 expression was signi cantly correlated with the B cells, CD8+T cells, CD4+T cells, and other immune cells ( Figure 6C). These results suggest that VCAM1 plays an important role in the regulation of immune activity in TIME.

Potential mechanism and expression regulation of VCAM1 in GC
CeRNA network is an important mechanism of tumor development and involves the participation of various RNA molecules. This network provides a brand new perspective for the transcriptome study of tumors and helps to comprehensively and deeply explain some biological phenomena [22]. Based on the ENCORI database, a ceRNA regulatory network in which VCAM1 may participate was constructed. By acquiring the information of miRNA-mRNA pairs and miRNA-lncRNA pairs step by step and combining with differential expression and correlation analyses, signi cant target genes were screened to construct the ceRNA network. As shown in Figures 7A and B, the expression of hsa-miR-33a-5p and hsa-miR-183-5p was negatively correlated with the expression of VCMA1. Subsequent differential and survival analyses revealed that only hsa-miR-183-5p was differentially expressed between the GC tumor tissues and normal tissues and was correlated with the prognosis of patients with GC ( Figure 7D, P=1.6E−15; Figure 7F, P=0.08). In contrast, hsa-miR-33A-5p expression was signi cantly different between gastric cancer tissues and normal tissues, but was not associated with the prognosis of gastric cancer patients ( Figure  7C Figure 7G, P=0.043). In conclusion, we speculated that AC104211.1 sponge hsa-miR-183-5p enhances the expression of VCAM1 and promotes the malignant behavior of gastric cancer ( Figure 7K).

Discussion
With the advent of the era of abundant biological data, bioinformatics technology provides a more convenient platform for researchers. Using high-throughput sequencing data to classify a large number of samples can better re ect the characteristics and heterogeneity of tumors. The classi cation of cancer subtypes can not only describe the characteristics of tumors from multiple dimensions but also help predict patient prognosis and guide immunotherapy [15,23,24]. However, previous studies have not fully considered the interaction between tumor and immunity. The progress of a cancer depends not only on the cancer cells but also on the TIME. The TIME is the internal environment of tumor malignant progression, which affects tumor progression and immunotherapy [25,26]. In this study, a hierarchical clustering algorithm was used to divide the GC samples into three clusters based on the enrichment of 29 immune cell types. Signi cant differences in the immune score, stromal score, tumor purity, and immunerelated gene expression were found in the three cell in ltration clusters. Moreover, the CIBERSORT algorithm was used to analyze the proportion of 22 immune cells in the three clusters, and the results showed signi cant differences among the three groups. Survival analysis also showed a signi cant difference in the survival rates among the three groups. These results suggest that the immune cell in ltration pattern in TIME played a crucial role in the progression of GC. The DEGs among the different immune-in ltrating clusters were identi ed, and enrichment analysis was conducted. The PPI network was constructed, and the hub genes were identi ed. Integration of the angiogenesis-related genes and Cox regression analysis showed that VCAM1 may play an important role in the regulation of the TIME and angiogenesis. VCAM1 expression was also found to be signi cantly higher in the GC tissues than in the normal tissues and was associated with clinical stage and survival in the patients with GC. This result suggested that VCAM1 may play an important role in the occurrence and development of GC and may be a therapeutic target and prognostic indicator of GC.
VCAM1 is an important member of the immunoglobulin superfamily and is considered to be a cell adhesion molecule that contributes to the regulation of in ammation-related vascular adhesion and trans-endothelial migration of white blood cells [27]. Studies have shown that VCAM1 plays a vital role in various diseases, such as rheumatoid arthritis, asthma, and cancer [28,29]. VCAM1 expression can be induced by the activation of proin ammatory cytokines, such as tumor necrosis factor, interleukin, and adipocytokine visfatin [30]. In high levels of in ammation and cancer, VCAM1 can also be activated by autosecretory or paracrine expression. For example, VCAM1 is highly expressed in the tissues of colorectal cancer (CRC) and has been strongly associated with aggressiveness, clinical features, and poor prognosis in patients with CRC [31]. In addition, recent studies have shown that cancer-associated broblasts (CAFs) secrete VCAM1 to enhance the growth and invasion of cancer cells by activating the AKT and MAPK signaling pathways of lung cancer cells [32]. Moreover, CAF-derived VCAM1 molecules have been found to interact with integrin αvβ1/5 to promote the invasion of GC cells in vitro and in vivo [33]. The serum concentrations of ICAM-1 and VCAM1 in patients with GC have been found to be signi cantly higher than those in the healthy controls, and their expression levels were signi cantly correlated with tumor stage, gastric wall invasion, distant metastasis, and prognosis [34]. In this study, VCAM1 was signi cantly associated with the occurrence, development, and survival of GC. These ndings further suggest that VCAM1 may be a potential biomarker for the diagnosis and prognosis of GC.
The potential role of VCAM1 in regulation of the TIME was also explored. TISIDB database analysis showed that VCAM1 expression was signi cantly correlated with immunostimulatory genes, chemokines, and chemokine receptors in various cancers, especially in GC. The correlation between VCAM1 and in ltrated immune cells in TIME was analyzed according to the CIBERSORT algorithm and TIMER database. The results showed that VCAM1 expression was associated with various in ltrating immune cells, such as B cells, CD8+T cells, and CD4+T cells. Furthermore, GSEA showed that VCAM1 was involved in the regulation of immunity and angiogenesis, including the chemokine signaling pathway and JAK-STAT signaling pathway. Chemokine signaling and the JAK-STAT signaling pathway are involved in immune function and the growth and differentiation of cells. The abnormal activation of both processes can easily lead to the occurrence of tumors [35][36][37]. Most cytokine-induced immune responses also depend on these processes [38,39]. PIK3CA mutations were reported in 5% of GCs, and mutations in this gene led to the constructional activation of this pathway [40]. Gong et al. examined 86 patients who were undergoing gastrectomy and found that activated STAT3 was associated with multiple angiogenesis factors, which in turn was found to contribute to the progression of GC [41]. Studies have also shown that the expression of PD-L1 in GC is mainly regulated by the JAK-STAT pathway-related interferon gamma [42]. In this study, the results also showed that the expression of VCAM1 was signi cantly correlated with various immunostimulatory genes, such as CD274(PD-L1) and CTLA4( Figure 6A).
Embelin, an IAP inhibitor, has been reported to accelerate tumor-in ltrating T cell numbers and enhance antitumor immune response by restoring VCAM1 expression in tissue endothelial cells in an animal model of ductal adenocarcinoma of pancreatic cancer [43]. Liu et al. found that the VCAM1 expression was positively correlated with glioblastoma multiforme of monocyte adhesion, and knockout of VCAM1 could eliminate the enhancement of monocyte adhesion [44]. In addition, macrophages are attracted to the tumor microenvironment by soluble VCAM1, promoting the drug resistance of pancreatic cancer cells to gemcitabine. The change in the soluble VCAM1 in the plasma is an independent prognostic factor of gemcitabine treatment in patients with advanced pancreatic cancer [45]. All data suggest that VCAM1 may be a promising key regulator of the GC immune microenvironment and involved in the regulation of immune activity status in the TIME. However, the regulatory mechanism between different VCAM1 expression levels and the immune microenvironment of patients with GC and the detailed immune cell regulation map need further studies.

Conclusion
In summary, the patients with GC were divided into different immune in ltrating groups by using ssGSEA algorithm. Further analysis and veri cation showed that the immune cell in ltrating mode in the TIME might play a key role in the progression of GC. The TIME-related DEGs were also explored, and the results showed that VCAM1 was related to the occurrence and development of GC. VCAM1 may be involved not only in the regulation of immune-in ltrating activity in the TIME but also in angiogenesis-related regulation. This protein may be a promising biomarker and therapeutic target.

Declarations
Ethical Approval and Consent to participate: Not applicable.
Consent for publication: All authors have made substantive contributions to this manuscript and approve the manuscript for publication.
Availability of data and materials All original data can be obtained from the public website, the speci c websites are provided in the article, data processing mainly using R software. Figure 1 The ow chart of this study Enrichment levels and types of 29 immune-related cells in the high, medium and low immune cell in ltrates groups, combined with tumor purity, estimate score, immune score, and stromal score. (D-F) Violin showed differences in immune score, stromal score and tumor purity among the three cluster groups.(J-I) The boxplot showed that PD-L1,CTLA4 and most HLA genes were signi cantly differentially expressed in the high, medium and low immune cell in ltration groups.(G) Proportion of immune cells in high, medium and low immune in ltration groups based on CIBERSORT algorithm.(K) The survival curve showed that the prognosis of patients in GC subgroups was different.