Prognostic Value of Immune-Related genes in the Tumor Microenvironment of Glioma - Analyzed by Integrated Bioinformatics

Object: Immune related genes play an important role in the process of tumor genesis and development. Therefore, we aim to nd the Immune genes which are related to the prognosis of glioma patients, and to explore the inltration of Immune cells in glioma microenvironment. Methods We downloaded the data of the glioma samples from the CGGA database, and performed batch correction to screen the primary glioma samples for subsequent analysis. Then the ESTIMATE algorithm was used to deal with the Stromal scores and Immune scores of the primary glioma samples, and the difference was analyzed. Then the common Immune related genes (IRGs) were obtained by intersecting with the Immune genes in the ImmPort database. Moreover, we used common IRGs to construct protein-protein interaction (PPI) networks, from which we screened the top 30 genes with high connectivity, and Lasso regression was used to screen the IRGs. Lastly, we obtained the combined genes, which were overlapped both in the top 30 high-connection genes and Lasso regression genes. The nal genes were used to construct COX risk prediction models. The accuracy of the model were veried by the TCGA glioma data, and the model genes were analyzed for Immune-related pathways, as well as the Hallmark and KEGG enrichment. Additionally, we used CIBERSOFT algorithm to estimate the Immune cell content of the samples, and analyzed the differences, correlations and survival of the Immune cells in high and low risk groups.


Introduction
The glioblastoma is the most common primary malignant brain tumors, which is typically associated with a dismal prognosis and poor quality of life [1]. By 2019, the CBTRUS statistical reports found that the incidence of the central nervous system (CNS) tumors was about 23.41 per 100,000, and the mortality rate was about 442 per 100,000 [2]. At present, the glioma can be divided into two main types: diffuse glioma, which was the most common one, and non-diffusive glioma. World Health Organization (WHO) divided this malignancy into four grades, for grade and grade were the Low Grade Glioma (LGG), and grade and grade were the High Grade Glioma (HGG). Meanwhile, the grade had the highest malignancy and extremely poor prognosis [3]. The systematic management for the tumor were including surgery, chemoradiotherapy, molecular targeting, Tumour Treating Fields (TTFs) therapy and so on [4].
For decades, immunotherapy has achieved fruitful results in the treatment of solid tumors, while is no remarkable breakthrough in the treatment of glioma, one of the might be the lack of the exploration and understanding of the effective Immune related genes (IRGs) and the Immune in ltration of glioma [5].
Recent studies have found that the growth and invasion of glioma are closely related to the Immune process and Immune response [6], while the underlying mechanism is still remain unclear.
The exploration of IRGs which were differentially expressed in the glioma patients with different prognosis can help us understand the pathogenesis of glioma, providing new ideas for the treatment of this malignancy and the management of the patients with this disease. In addition, tumor development was largely determined by the tumor microenvironment, which was an 'ecosystem' composed of Stromal cells, Immune cells, tumor stem cells and other appendages [6]. In previous studies, most of the studies mainly focused on the expression of tumor cell genes, but ignored the effects of the cell surrounding environment. However, it is found that Immune cells can promote and inhibit the invasion and growth of glioma. In brain glioma, microglia and macrophages can promote the growth, invasion and metastasis of glioma, which are associated with poor prognosis [7,8], although the relationship between the Immune cells with the genes remains to be explored.
Thus, for the rst time, we studied the relationship between IRGs and Immune cells, to reveal the role of immunity in glioma. At the same time, we also found the targeted polyenes, which would be the Immune prognostic markers of glioma. This study mainly used ESTIMATE algorithms to evaluate Stromal cells and Immune cells [9], and CIBERSOFT algorithms to estimate the in ltrating Immune cells in tumor samples [10]. Both of the two algorithms were successfully applied in the studies of urinary bladder carcinoma [11], lung cancer [12] and renal carcinoma [13], for which the reliability and accuracy were recognized.
In the present study, we downloaded primary glioma samples from the Chinese Glioma Genome Atlas (CGGA, http://www.cgga.org.cn/) database, and used the ESTIMATE algorithms to estimate Stromal cell and Immune cell scores in glioma sample, extracting the differentiated genes related to the scores. Then the co-expressed IRGs were obtained from the intersected gene sets with ImmPort Immune database.
Multiple bioinformatics analyses were perform, and COX risk prediction model was constructed. Survival analysis, performance prediction and validation of the model were done, and the results indicated that the prognosis of the patients in high-risk group and the low-risk group of the prognostic model was signi cantly different. Moreover, pathway analysis showed that the targeted genes in high-risk groups were signi cantly enriched in Immune response and Immune related processes, and were associated with multiple Immune-related pathways in tumors. Finally, we analyzed the Immune microenvironment of glioma by CIBERSOFT algorithm, and found that in numerous kind of Immune cells, there were signi cant differences between the model of high-risk group and low-risk group. The article research process is shown in Fig. 1.

Data collection and Processing
At rst, to explore the IRGs and Immune in ltration of tumor cells in the glioma Immune microenvironment, we collected mRNA-seq data from the CGGA database. The data were divided into two batches. Secondly, We combined the data for batch correction and selected the primary glioma samples for subsequent analysis. Thirdly, we used the ESTIMATE package for Stromal and Immune scoring of primary glioma samples. Finally, to compare the effects of the two scores on survival analysis, we took the median of the Stromal scores and divided the samples into the high Stromal score group and the low Stromal score group. The same method was used for the Immune scores and estimate scores. To analyze the correlations between clinical information and scores, we compared the clinical information correlations with Stromal scores, Immune scores, and estimate scores.

Screening of co-differentially expressed IRGs and enrichment analysis
Using the grouping principle in the previous survival analysis, the samples were divided into high Stromal score groups and low Stromal score groups according to the Stromal score's median value. Then the 'Limma' package was used for difference analyses. The nal result was screened using the criteria of |LogFC|≥1 and FDR < 0.001 [14]. For differentially expressed Immune genes with Immune scores, the same methods and criteria were used for screening. In addition, to obtain the Immune genes, we screened the Immune genes through gene list (https://www.immport.org/shared/home) in the ImmPort database. After the duplicate Immune genes were deleted, they were crossed with the differential genes in the Stromal scores and Immune scores to obtain the common differential IRGs. To explore the role of codifferentially expressed IRGs in tumor immunity, we obtained the co-differentially expressed IRGs via the Venn diagram and used the 'org.hs.eg.db' package [15] to convert geneID into entrezID. The genes extracted from entrezID were analyzed for GO and KEGG pathways analysis using the 'cluster pro ler' package, and the results were screened with a p-value < 0.05.

Construction of protein-protein interaction (PPI) network and risk prediction model
The PPI network was built using the online web tool STRING (Version 11.0, https://stringdb.org/cgi/input?sessionId=BBPrHBKJPUey&input_page_show_search=off), setting the minimum required interaction score value to 0.4 and hide disconnected nodes in the network. Finally, the PPI network was imported into Cytoscape software (Version 3.8.0, https://cytoscape.org/). Its plug-in Cytohubba was used to screen for gene connectivity, which proved that the higher the degree of the connectivity was, the more critical the gene was in the network.
To construct the risk prediction model, we extract the expression level of the common differentially expressed IRGs and combine the expression level with each sample's survival time. Then the 'Survival' package [16], 'Survivminer' package [17], and 'Glmnet' package [18] was used for univariate Cox and Lasso regression analysis. The nal result was intersected with the top 30 genes with the highest degree of connectivity in the PPI network. Finally, the common IRGs (target genes) of the two sets were used to construct a Cox risk prediction model, and the formula for calculating the risk prediction model was: Risk score = Exp1 * β1 + Exp2 * β2+ ···Expn * βn. The risk values of each sample obtained by the above formula were divided into high-risk groups and low-risk groups by the median of the risk values, and then subjected to survival analysis and model prediction accuracy analysis. To verify the constructed model's accuracy, we downloaded 672 glioma samples from the TCGA database and screened 569 samples with complete survival data (Table 1). In the same way, the risk prediction model was constructed, and its predictive value and the model's accuracy were veri ed. To further understand the relationship between the risk prediction model's high and low-risk groups, and the related Immune pathways, we conducted pathway analysis on Immune Response (M19817) and Immune System Process (M13664). All analysis were based on Molecular Signaling Database v7.1 (https://www.gsea-msigdb.org/gsea/msigdb/index.jsp). Similarly, we explored the Hallmark and KEGG pathways in gliomas and screened the analysis result with FDR q-val < 0.05.

Immune microenvironment of glioma
To evaluate the Immune in ltration of glioma samples' Immune microenvironment, we analyzed the Immune microenvironment of all primary glioma samples using the online network tool CIBERSORT (https://cibersort.stanford.edu/). CIBERSORT is a commonly used tool for assessing the level of Immune cells in tumor samples, and it can assess the cell content of 22 Immune cells. Firstly, to further understand the correlation between Immune cells in the Immune microenvironment, we analyzed Immune cells' correlation in tumor samples. Secondly, to explore the difference in the relative content of Immune cells between the high-risk groups and low-risk groups, the 'Vioplot' package was used to analyze the difference between the high and low-risk groups. Thirdly, Immune cells related to Immune genes and risk scores were screened by correlation test. In addition, we performed survival analysis on all Immune cells in the Immune microenvironment. Finally, we used FunRich (version3.1.3, http://www.funrich.org/) to intersect the results of different analysis, correlation analysis, and survival analysis of Immune cells, and the Immune cells with different expressions, high correlation and impacts on the prognosis of patients with glioma were obtained .

Data processing
The mRNA-seq data in the CGGA database were divided into two batches. A total of 1018 glioma samples were collected (including 426 cases of primary low-grade glioma, 199 cases of recurrent lowgrade glioma, 225 cases of primary glioblastoma, 133 cases of recurrent glioblastoma, 30 cases of secondary recurrent glioma, and 5 cases of incomplete information). To study the Immune microenvironment of primary gliomas, 626 patients with primary gliomas (including low-grade gliomas and high-grade gliomas) were screened for subsequent analysis. The survival analysis results of the Stromal score and the Immune score were both statistically different (P < 0.05) (Fig. 2). Meanwhile, the Stromal score, the Immune score and the composite score were correlated with the patient's Age, glioma's Grade, Radio status, Chemo status, MGMT P Methodology Status, IDH mutation status, and 1p19q codeletion status (P < 0.05) (Fig. 3&4&5).

Screening common differentially expressed genes (DEGs) and enrichment analysis
A total of 903 DEGs (including 684 up-regulated genes and 219 down-regulated genes) were obtained through difference analysis of Stromal scores. Similarly, 1006 DEGs (including 619 up-regulated genes and 387 down-regulated genes) were obtained through difference analysis of Immune scores. In the gene list of the ImmPort database, the duplicate genes were deleted, then the 1793 Immune genes were obtained. Then 117 common IRGs (111 up-regulated genes and 6 down-regulated genes) (Fig. 6) ( Table 2) were obtained form the intersection sets of the three sets. Evidently, the results of GO enrichment analysis (Fig. 7A) showed that IRGs in the biological process (BP) pathway was mainly enriched in humoral Immune response, cellular immunity, and lymphocyte-mediated immunity. Furthermore, the cellular component (CC) pathway was mainly enriched in immunoglobulin complex, blood microparticle, and cytoplasmic vesicle lumen. In molecular function (MF) pathway, it was mainly enriched in antigen binding, immunoglobulin receptor binding, and cytokine receptor binding. In addition, the results of the KEGG pathway enrichment analysis were also meaningfully, for the genes were mainly enriched in the Chemokine signaling pathway, TNF signaling pathway, and Antigen Processing and presentation (Fig. 7C).  (Fig. 8C). The Lasso analysis of the117 IRGs resulted in 26 genes (Fig. 8A&B). Then six IRGs (CXCL10, IL6, MMP9, CCL5, CSF3, and IL2Ra) were obtained form the intersection of the two sets (Fig. 8D). Subsequently, we used the six IRGs to construct a Cox risk prediction model and analyze the model's prediction accuracy (Fig. 8E). The results showed that there was a signi cant difference in prognosis between the high-risk groups and the low-risk groups (P < 0.001), and the ROC curve showed that the prediction accuracy of the model within ve years was greater than 0.8 (area under curve,AUC) (Fig. 9A&B). Finally, sample validation using TCGA glioma revealed a signi cant difference in prognosis between the high and low-risk groups (P < 0.001), with an AUC under the ROC curve of > 0.8, suggesting that the model was highly accurate in prediction (Fig. 9C&D).

Functional enrichment analysis of the model
The results of the pathway analyze showed that the high-risk group was more likely to accumulate on the Immune Response and Immune System Process pathways than the low-risk group, which indicated that the higher the risk score was, the more intense the Immune Response process was (Fig. 10A&B). Analysis of the glioma pathway at Hallmark showed that the high-risk group genes were mainly enriched in In ammatory Response, p53 Pathway, and PI3k Akt Mtor Signaling (Fig. 10C). In the KEGG pathway analyze, the high-risk group genes were more likely enriched in p53 signaling pathway, JAK-STAT signaling pathway, antigen processing and presentation, pathways in cancer, and cell cycle (Fig. 10D).

Immune microenvironment analysis
Using the online network tool of CIBERSORT, we estimated the relative content of 22 Immune cells in 626 glioma samples (Fig. 11). The results showed that the relative content of Macrophages M0, Macrophages M2, Mast cells resting, and Mast cells activated was more than 0.45. The results of the difference analysis showed that the contents of T cells follicular helper, NK cells resting, and Monocytes in the lowrisk group were higher than those in the high-risk group (Fig. 11B). Furthermore, the high-risk group had higher level of Plasma cells, T cells CD8, T cells CD4 naïve, T cells regulatory (Tregs), Macrophages M0, Macrophages M1, and Neutrophils than the low-risk group (Fig. 11B). The correlation between the risk score of the predictive model and Immune cells was also analyzed. The results showed that the risk score had a positive correlation with T cells regulatory (Tregs), Eosinophils, Macrophages M1, Neutrophils, T cells follicular helper and T cells gamma delta, and the positive correlation with Macrophages M0 was the highest (R = 0.72, p < 2.2e − 16). In contrast, it was negatively correlated with T cells CD4 memory recall and T cells CD4 naive, and was most negatively correlated with Moncyte (R= -0.6, p < 2.2e − 16). Finally, we obtained ve types of Immune cells (Fig. 12) through the intersected sets of the difference analysis, correlation analysis, and survival analysis, wherein the Macrophages M0, Neutrophils, and T cells regulatory (Tregs) cells were signi cantly correlated with the prognosis of glioma (Fig. 13F&g&H).
Interestingly, the higher the relative amounts of Monocytes and T cells CD4 naïve (Fig. 13I&J), the higher the survival rate of patients with gliomas.

Discussion
Glioma is a primary tumor of the brain, which accounting for about 81% of intracranial malignancies.
According to the invasiveness of the tumor, WHO divided the glioma into I-IV grades, while the grade IV glioma had the highest degree of malignancy, of which the ve-year survival rate was about 5 percent and the median survival time was about 12-15 months [19,20]. Because the tumor microenvironment of glioma survival is too complex, it brings challenges to our research and treatment, which leads to poor clinical therapeutic effect and even therapeutic resistance [21]. In recent years, with the development of bioinformatics and the multi-omics researches of tumor, we have a deeper understanding of the process of tumorigenesis and development. Immune in ltration plays an important role in the development of glioma cells, which recruits immune cells for their growth by secreting chemokines [22], although the speci c roles of immune cells in glioma still remains unclear. In line with the recent success of immunotherapy in multiple solid tumor [23], it is meaningful to explore the immune related genes of glioma and clarify the in ltration of immune cells in tumors.
CGGA database is constructed for offering convenience for the researchers to exploring the pathological mechanisms, molecular typing and therapeutic targets for glioma. Meanwhile, ImmPort database collected comprehensive immunological data [24]. By downloading the data of glioma samples from the have found that LncRNA CRNDE may trigger in ammation through Toll-like receptor pathway to regulate tumorigenesis and tumor development [33]. To sum up, the above evidences suggested that the IRGs could promote glioma proliferation, migration and invasion through multiple pathways.
Evidence revealed that the CXCL10 genes were overexpressed in glioblastoma, and were associated with its growth and progression [34]. Maru et al. revealed that CXCL10 could induce the Erk1/2 increase and uptake of [ 3 H] thymidine, then promoting glioma cell proliferation [35]. The level of IL-6 mRNA was reported to have relationship with prognosis and survival of the glioma patients [36]. What's more, through the JAK/STAT3 pathway, tumor growth may be promoted [37]. Meanwhile, it was also found that IL-6 can also promote the expression of MMP9 in U251 and T98G cell lines [37]. Recent studies have proved that IL-6 can induce the production of T H 17 cells and myeloid-derived suppressor cells (MDSCs) to promote tumorigenesis, as well as transforming the tumor-associated tumorigenic M1 macrophages into the immunosuppression-related M2 macrophages [38]. The hypoxia and in ammation stimulated the overexpression of MMP9, which is closely related to the glioma progression [39,40]. CCL5/CCR5 axis was reported to promote glioma cell proliferation activates through PI3K/Akt pathway [41,42]. Recently, researchers found that CCL5 expression is associated with poor prognosis in LGG patients [43]. Gao et al. found that CSF3 gene is involved in in ammatory reactions and is the GBM diagnostic plasma markers [44]. More importantly, IL2RA gene receptors are reported to locate on the surface of regulatory T cells and may be involved in immunosuppression of GBM tumor microenvironment through cytokine signaling pathways [45]. Moreover, through GSEA analysis of the high and low risk groups in risk prediction models, we found that the high risk groups were mainly enriched in the Immune response related processes, which suggested that the Immune response was reacted intensively in glioma prognostic models. Meanwhile, in Hallmark and KEGG pathways are enriched p53 signaling pathway, which is mainly related to cell cycle and apoptosis [45]. In KEGG pathways, we also found the high-risk groups signi cantly enriched in JAK-STAT signaling pathway, antigen processing and presentation, pathways in cancer and cell cycle.

The tumor microenvironment (TME) refers to the cell environment in which tumor cells or tumor stem cells exist [46], which is mainly composed of Stromal cells, endothelial cells, Immune cells and other
accessory cells [47]. Its main function is to participate in the tumor vascular survival, promoting invasion and metastasis [48]. In the microenvironment of glioma, tumor-associated macrophages (TAM) and microglia played key roles, while the inhibition of TAM could intervene the growth of glioma [49].
Moreover, TAM is also involved in the angiogenesis in brain tumors and the drug resistance of antiangiogenic drugs [50]. Prior to that, TAM were generally divided into M1 and M2 two phenotypes.
However, Gabrusiewicz proposed that there were still a large number of non-polarized M0 macrophages in the GBM region, and found that CD14 + cells were very similar to M0 macrophages, which may be related to the Immune in ltration of GBM [51]. In addition, Haung et al. found that the higher levels of M0 macrophages in the glioma samples [52],which was consistent with our ndings. They also suggested that the expression level of EFEMP2 was associated with M0 macrophage cell assembly, and that EFEMP2 could also act as a chemokine for TAM. Bertaut et al. found that the Neutrophils count in peripheral blood of GBM patients could predict the e cacy of bevacizumab [53]. Moreover, Liang et al.
suggested that Neutrophils in ltration was associated with glioma progression and drug resistance during anti-VEGF therapy [54]. T cells Regulatory (Tregs) plays a crucial role in normal immunity, mediating autotolerance [55], and regulating the function of Immune cells. In the research of the cancer eld, it is found that overactivity of the Tregs is associated with tumor cell escape [56]. Meanwhile, there is also a correlation between the degree of invasion of Tregs and WHO classi cation of the tumor [57]. In addition, Tregs recruitment is evident in glioma mouse models, suggesting that it might be necessary for tumor growth. Therefore, targeted anti-Treg therapy was proposed, and its effectiveness was veri ed in glioma mouse experiment and GBM patient treatment [58,59]. Monocytes can be transformed into tumorrelated macrophages to perform roles, while the relevant mechanism remains to uncover [60].

Conclusion
In this study, 6 genes (CXCL10 IL6 MMP9 CCL5 CSF3 IL2RA) of the risk prediction model were constructed by scoring Immune cells and Stromal cells, and the predictive ability of the model is veri ed.
The GSEA analysis of the model showed that the high-risk group of the model was signi cantly enriched in Immune-related pathways. In addition, we also estimated the content of Immune cells in glioma, and focus on Macrophages M0, Neutrophils and T Cells Regulatory Tregs, of which the contents were negatively correlated with the prognosis of patients. Combined with relevant studies, we concluded that the higher contents of the three kinds of Immune cells were related to the stronger the Immune in ltration of glioma, consequently, the causing worse prognosis of the patients.
6.Ethics approval and consent to participate Not applicable.

7.Consent for publication
All authors agreed on the manuscript.

8.Availability of data and materials
Data used to support the results of this study can be obtained from the corresponding authors as required.

9.Competing interests
The authors declare that they have no competing interests.

12.Acknowledgements
We thank all team members of TCGA and CGGA projects for providing public-access data.

Declarations
Ethics approval and consent to participate Not applicable.

Consent for publication
All authors agreed on the manuscript.

Availability of data and materials
Data used to support the results of this study can be obtained from the corresponding authors as required.

Competing interests
The authors declare that they have no competing interests.   Survival analysis results of Stromal score, Immune score and Estimate score of glioma samples(A&B&C). X-axis represents survival rate, Y-axis represents survival time.    The common immune-related genes of the intersected sets from Stromal score, Immune score and Immport database.

Figure 8
Identi cation of target genes and establishment of risk prediction models. (A&B) 117 IRGs were screened through univariate Cox ltering and Lasso regression analysis to obtain genes related to prognosis. (C) The top 30 high-connection genes in the PPI network. (D) PPI network gene and Lasso regression analysis take the intersection to screen target genes. (E) A risk prediction model constructed using six target genes. In the risk score map, green represents the low-risk group, and red represents the high-risk group. In the survival time map, green represents survival and red represents death. In the heat map, blue represents the high-risk group, pink represents the low-risk group, in the heat map red represents high expression, green represents low expression, and black represents medium expression. PPI: Protein Protein Interactions.

Figure 9
The results of the survival analysis and model prediction ability of the risk prediction model in CGGA (A) and TCGA (C). The X-axis in A&C represents survival time, the Y-axis represents survival rate, the red curve represents the high-risk group, and the blue represents the low-risk group. In B&D, the X-axis represents the false positive rate, the Y-axis represents the true positive rate, and the curves of different colors represent