Identication of a Potential Prognostic Signature based on Immune-related Genes in Primary Glioblastoma

Background: Glioblastoma (GBM) is a common primary brain tumor with a high incidence in adults with malignant and fast-growing biological characteristics. In this study, we explore the immune-related prognosis markers of GBM at the mRNA level. Methods: The sequencing data and clinical information of GBM patients were downloaded from The Cancer Genome Atlas (TCGA) and the Chinese Glioma Genome Atlas (CGGA). The differentially expressed genes (DEGs) were calculated between normal tissues from the Genotype-Tissue Expression (GTEx) database and tumor tissues from TCGA and CGGA. We obtained immune genes from the ImmProt database. The intersection of DEGs and immune genes were dened as the immune-related differential genes (IR-DEGs), based on which, survival associated IR-DEGs were determined by multivariate Cox regression analysis. The survival risk score (SRS) was determined for each sample with the top 6 prognostic associated IR-DEGs. One-year, two-year, and three-year potential survival were predicted by the prognosis prediction model established by multivariate and univariate Cox regression. In addition, we performed CIBERSORT in GBM patients with samples from TCGA cohort; association analysis was performed with prognostic IR-DEGs and immune cells. Furthermore, the inuence of prognostic IR-DEGs on the brain tumor microenvironment (TME) was validated in single-cell sequencing analysis. Results: We found 301 IR-DRGs in GBM primary tumor compared with normal tissue, and 19 of them could predict the overall survival (OS) more accurately in GBM patients. Six IR-DEGs (PLAUR, TNFSF14, CTSB, SOCS3, PTX3, and FCGR2B) were selected to construct the SRS, with which, GBM patients were divided into two different groups which combined with high and low risk. The SRS was found to be an independent prognostic factor for GBM and could predict GBM patients’ possible survival with an acceptable eciency. Moreover, the expression of 6 IR-DEGs and their co-expressed IR-DEGs could inuence TME and were associated with GBM prognosis. Conclusions: This study identied a potential immune prognostic signature of glioblastoma, which could enhance the prognosis prediction ability for GBM patients. The immune-related-genes in the TME could potentially benet the immunotherapy development for GBM patients.


Introduction
Glioblastoma (GBM) is a common primary brain tumor with a high incidence in adults with malignant and fast-growing biological characteristics. Due to the dismal prognosis and the impact on life quality of life and cognitive function, GBM has become one of the most feared diseases. Remarkably, the median survival of glioblastoma is 9-15 months and the 5-year survival rate of glioblastoma is 5.4% [1]. Currently, the best treatment strategies available are maximum safe resection plus adjuvant chemoradiotherapy in glioblastoma patients. What's more, tumor-treating elds (TTF) with adjuvant temozolomide can also prolong GBM patient survival [1,2]. However, primary glioblastomas will always recur due to aggressive and treatment-resistant recurrences, because of poor standard treatment for recurrent glioblastoma and nally the patients will quickly die with a short living time [3]. Extensive studies of immune checkpoint inhibitors (ICIs) therapy, especially anti-programmed cell death protein-1 (PD-1)/programmed death ligand-1 (PD-L1) and anti-cytotoxic T lymphocyte-associated protein 4 (CTLA-4), have been conducted on primary and recurrent GBM in clinical research and were unsuccessful [4,5]. Prognostic factors of GBM are related to age, performance status, tumor grade, speci c biomarkers (MGMT methylation, mutation of IDH1, IDH2 or TERT, 1p19q co-deletion, overexpression of EGFR, etc.) and the extent of resection [6,7]. Up to now, the existing evaluation markers cannot cover all the disease information of GBM patients. Thus, there is an immediate requirement for effective prognostic factors to help predict the prognosis evaluation of GBM. TME plays a vital role in tumor growth and invasion, and ultimately affects the patient's prognosis. TME in GBM contains a variety of immune cell types, including tumor-associated macrophages (TAM), regulatory T cells (Tregs), microglia, monocyte and bone marrow-derived suppressor cells [8,9]. The proin ammatory cytokines such as IL-12, IL-18, and IFN-γ in TME of GBM were signi cantly reduced, while the soluble inhibitory molecules (interleukin 10, VEGF and beta transforming growth factor) were enriched [10,11]. Under certain circumstances, the complement system plays a key and complex role. It can help swallow antibody-coated tumor cells, induce a series of in ammatory responses, and help immune cells ght infection and maintain homeostasis, or hinder anti-tumor T cell responses that favor tumor progression [12]. Complement proteins exist in TME and tumor cells have the capacity to produce in situ [13]. C3a produced by tumor cells promotes immunosuppressive TME by acting on TAMs [14]. However, the imbalance between complement activation and inhibition will in uence tumor development at different stages and nally in uence tumor progression and invasion [15]. Various studies have reported that complement is activated in GBM, and tumor growth in situ is signi cantly reduced after the inhibition of the complement system in a cascade [15,16]. However, the harmful impact on the prognosis of GBM remains to be con rmed.
Therefore, in order to identify the immune-related prognostic gene signature, we used differential gene analysis in The Atlas Cancer Genome (TCGA) GBM dataset and Chinese Glioma Genome Atlas (CGGA) dataset, intersection with ImmProt database, to screen 301 IR-DEGs. Then, ltered by Kaplan Meier survival analysis, the potential genes were applied to the multivariate analysis combining clinicalpathological features and gene expression. Six prognostic IR-DEGs (PLAUR, TNFSF14, CTSB, SOCS3, PTX3, FCGR2B) were selected. The prognosis model integrating six IR-DEGs was established to optimize the prognosis evaluation. The prognosis model was validated in an independent GBM cohort from CGGA.
To identify the impact of prognostic IR-DEGs on TME, a total of 22 types of in ltrating immune cells in the 166 patients from TCGA divided into two different groups were estimated in CIBERSORT. The results were further veri ed in single-cell sequencing analysis. The regulatory network was constructed based on prognostic IR-DEGs and their co-expressed genes. The study provides a piece of bene ts information that will guide clinical immunotherapy strategies. Moreover, the prognostic IR-DEGs could act as therapeutic targets for clinical immunotherapy of GBM.

Samples and Data acquisition
We obtained 171 mRNA expression les with FPKM normalized from The Cancer Genome Atlas (TCGA, https://tcga-data.nci.nih.gov/tcga/), included 5 normal samples and 166 glioblastoma samples. We downloaded both GBM clinical samples from UCSC Xena (http://xena.ucsc.edu/), in which 107 patients have completed survival information and clinicopathological features. 129 cases of mRNA expression were downloaded from CGGA (http://www.cgga.org.cn/) for validation and 86 of them had complete survival information and clinicopathological features. We got data for the single cell GBM analysis was obtained from the Gene Expression Omnibus (GEO, https://www.ncbi.nlm.nih.gov/geo/), in which 2,343 cells from tumor cores and 1,246 cells from peripheral regions, During the process, the reading depth of 10× genomics was on the basis of Illumina NextSeq 500, the accession number is GSE84465. Differential analysis and GBM sc-RNA data processing We used the R package "limma" (R version 4.0.2) to analyze differential gene expression. Genes with |logFC| (fold change) > 0.75 and FDR < 0.05 were considered as differential one. We got 5927 DEGs from the TCGA dataset, 4770 DEGs from the CGGA dataset, and 1793 immune-related genes from the ImmPort database (https://www.immport.org/). Next, we intersected each other and obtained 301 IR-DEGs in both TCGA and CGGA datasets. The statistical analysis was done in the scRNA-seq data, using the "Seurat R" package as quality control. According to the composition patterns of the marker genes in the singleR package, we determined and annotated different cell clusters on the basis of veri cation and correction from the CellMarker database [17].

Identi cation of prognostic IR-DEGs
Kaplan-Meier survival analysis was done by "survival" package in R to screen the IR-DEGs, which were signi cantly associated with the overall survival (OS) of the 107 samples from the TCGA dataset. To further determine the prognostic genes, all the information of 19 OS-related genes with P < 0.05 were analyzed by multivariate cox regression analysis included clinical features such as age, gender, karnofsky performance score, radiation therapy and IDH mutation status. Then, we performed ROC analysis to screen the best prognostic genes, selecting six of them.

Construction and evaluation of prognostic risk model
The risk model was constructed on the ground of the six prognostic IR-DEGs. We used the following formula to calculate the risk score in each patient: where Coef i mean the regression coe cient, calculated by the multivariate Cox regression, and x i represents the expression level of IR-DEG. GBM patients were divided into low-risk and high-risk groups according to the value of median risk score. Kaplan-Meier survival curve was charted to evaluate the OS of patients in two groups. The difference between the two groups was evaluated by a two-sided log-rank test. The ROC curve within 1, 3 and 5 years was plotted according to the "survivalROC" R package to assess the predictive accuracy of the risk model. To make clear that whether the risk score maybe an independent prognostic factor out of other clinicopathological parameters, we used multivariate Cox regression and univariate Cox regression analysis in TCGA testing cohort and CGGA validation cohort, Pvalue < 0.05 was considered statistically signi cant.
Tumor-In ltrating Immune Cells in high-and low-risk group Immune cells were classi ed into 22 types such as T cells, B cells, plasma cells, NK cells, and myeloid subsets by CIBERSORT. According to the signature gene expression pro le of the high-throughput array of RNA-sequencing data, the way of CIBERSORT mainly relied on the LM22 signature matrix [17]. Thus, we use CIBERSORT to analyze the mRNA expression matrix including 83 low-risk and 83 high-risk of glioblastoma samples from the TCGA dataset. Meanwhile we calculated the p-value for the deconvolution in each sample by the Monte Carlo sampling method. Finally, the relationship between IR-DEGs and immune cells was uncovered through Person's correlation analysis, and the comprehensive connection of IR-DEGs and tumor in ltrating immune cells were represented, which was performed in TIMER2.0 (http://timer.cistrome.org/).

Identi cation of prognostic IR-DEGs
The work ow chart of this study was shown in Figure 1. To investigate the immune-related-genes that might in uence the prognosis of GBM. We obtained sequencing information with clinical data of 166 GBM samples from the TCGA dataset and 86 samples from CGGA, including RNA-seq (FPKM and counts format). Combined with normal samples from the GTEx dataset. Differential analysis was performed in normal tissues and tumor tissues with the lter condition FDR < 0.05, |logFC| > 1.5 (Fig. 2a, 2b). And then, the overlap in Venn diagram of the intersection between differentially expressed genes in TCGA and CGGA and immune related genes in the Immprot dataset contained 301 IR-DEGs (Fig. 2c). Filtered by Kaplan-Meier survival analysis (p < 0.05), 19 survival related IR-DEGs were determined (Table 1). Then, those 19 IR-DEGs were applied to multivariate analysis combined with clinical-pathological features and gene expression, and six prognostic IR-DEGs (PLAUR, TNFSF14, CTSB, SOCS3, PTX3, FCGR2B) were selected for further analysis (Table 1). IR-DEGs based Risk score was an independent prognostic factor for GBM To optimize the prognosis evaluation system, we established prognosis model that integrated six prognostic IR-DEGs from GBM patients. In each sample, we calculated the risk score on the basis of the coe cients of the six predictive genes (PLAUR, TNFSF14, CTSB, SOCS3, PTX3, FCGR2B). According to the median risk score (Fig. 3a, 3b), we divided patients into high-and low-risk subgroups (Fig. 3a, 3b). Heatmap showed that the expression of the six prognostic IR-DEGs were different between the high-risk and low-risk group. The number of dead patients increased gradually as the risk score rose (Fig. 3a, 3b).
Moreover, the Kaplan Meier survival curve showed that the OS between high-and low-risk groups was signi cantly different (Fig. 3c, 3d). The prognostic model showed higher accuracy in predicting 3-year and 5-year OS of GBM in TCGA and CGGA dataset, according to time-dependent ROC curve (Fig. 3c, 3d).
To evaluate the independence of the prognostic model, we did univariate and multivariate Cox regression analyses in the TCGA dataset, and validated the result in an independent GBM cohort from CGGA. As shown in Table 2, the risk score was remarkably correlated with overall survival (OS) (HR=1.573, 95% CI=1.148-2.156, P = 0.005) in univariate analysis. The multivariate analysis illustrated that the risk score was an independent prognostic indicator (HR=1.566, 95% CI=1.133-2.166, P = 0.007). In summary, these results showed that the risk score on the ground of prognostic IR-DEGs was an independent prognostic factor (Fig. 3c, 3d).

Identi cation of Tumor-In ltrating Immune cells in two risk group
To identify the difference of in ltrating immune cells from the high-risk and low-risk groups of GBM, we analyzed the compendium of 22 immune cell types that in ltrated in tumors, in TCGA dataset with CIBERSORT. The making-up of immune cells from 166 samples and the relationship between the in ltrating immune cells showed in Figure 4. Macrophages were enriched both in the high-risk and lowrisk groups (Fig. 4a). The share of monocytes was remarkably increased in the high-risk group, while the share of NK cells was signi cantly enriched in the low-risk group (Fig. 5b).These results showed an immunosuppressive microenvironment in high-risk group.
Prognostic IR-DEGs promote immune-suppressive microenvironment The relationship between six prognostic IR-DEGs and 22 immune cells was performed by using Person's correlation analysis. The results indicated that the expression levels of 6 prognostic IR-DEGs were positively correlated with monocytes and neutrophils, and had negative correlation with NK cells (Fig. 5a,  5c). In single cell analysis, PLAUR and CTSB expressed high in macrophages, SOCS3 and FCGR2B expressed low in macrophages. However, TNFSF14 and PTX3 were not detected in any of the cell types (Fig. 6). Furthermore, we identi ed the co-expressed genes related to six prognostic IR-DEGs by using Person's correlation analysis, and applied them to GO and KEGG enrichment analysis. The results of GO analysis showed that the prognostic IR-DEGs can be enriched in biological processes including neutrophil degranulation, cell chemotaxis and macrophage activation (Fig. 7b). KEGG analysis showed that prognostic IR-DEGs were mainly enriched in TNF signaling pathway and complement and coagulation cascades (Fig. 7a). In general, prognostic IR-DEGs plays a vital role in promoting the immunosuppressive microenvironment.

Construction of regulatory network
In order to clarify the regulation mechanism of prognostic IR-DEGs, we performed regulatory network combined with prognostic IR-DEGs and their co-expressed genes, as well as the regulatory pathways. PTX3 and PLAUR related to complement and coagulation cascades, SOCS3 activated TNF signaling pathway, CTSB activated antigen processing and presentation, NF-kappa B signaling pathway regulated by TNFSF14, FCGR2B activated Fc gamma R-mediated phagocytosis (Fig. 8a). Moreover, the transcription factors regulatory network of six prognostic IR-DEGs is shown in Figure 8b. These regulatory mechanisms collectively mediated immunosuppressive TME and promote tumor progression, and nally affecting the prognosis of GBM patients.

Discussion
The accumulation analysis of data shows that the basis of the interplay between cancer cells and the innate immune system is a uni ed cohesion mechanism. According to our analysis of TCGA and CGGA data, we rst identi ed 301 differential immune-related genes and then con rmed that PLAUR, TNFSF14, CTSB, SOCS3, PTX3 and FCGR2B were signi cantly correlated with prognosis in univariate and multivariate Cox regression. The risk score of the six prognostic DEGs has great predictive potential in estimating the survival rate of GBM patients and promoting immune-suppressive TME of GBM. Currently, the therapeutic strategies of glioblastoma have been advanced, but the 5-year survival rate is 5.4% [1,19]. Therefore, the immune micro-environment acts an essential factor in the growth of novel therapies and shapes the outcome of patients with glioblastoma, and the prognostic IR-DEGs can be the promising prognostic predictors and therapeutic targets in GBM.
Glioblastoma has mortal cues. Take for instance, youth, poor cure state and incomplete scope of removal are identi ed as poor prognostic factors [1,19]. In elder, the median survival is less than four months with the best suitable treatment [22]. On the molecular level, isocitrate dehydrogenase 1 (IDH-1), IDH-2 variation and MGMT methylation have a better assessment of prognosis [23]. IDH mutation status can be divided into two different subgroups, including IDH-wild-type glioblastoma and IDH-mutant-type glioblastoma. On the one hand, IDH-Wild-type glioblastoma corresponds to the development of primary glioblastoma. This type represents approximately 90% of glioblastoma patients and usually appears in older patients with more clinical aggression [24]. On the other, IDH-Wild-type glioblastoma originates from either one-up spread or primary astrocytoma. This type contains approximately 10% of all glioblastoma patients. The diagnosis mainly emerges in 44-years median ages with a better prognosis [25]. In our analysis, the IDH mutation status was related to the overall survival in line with the ending of univariate analysis, but that was not an independent prognostic factor according to the multivariate analysis in the TCGA dataset.
The underlying reason might be that it predicts OS of glioblastoma cooperate with other elements.
In GBM microenvironment, macrophages were enriched in both high-risk and low-risk groups, and the diversity in in ltration between the two groups was not remarkable. However, the abundance ratio of NK cells was higher in the low-risk group than in the high-risk group. The prognostic IR-DEGs were positively correlated with macrophages and negatively correlated with NK cells. Macrophages are the primary immune cells in brain tumors, accounting for up to ~30% of malignant tumor [26]. TAMs (tumorassociated macrophages and microglia) are categorized as tissue-resident microglia and bone marrowderived macrophages (BMDMs), that are sorted to the glioma environment. TAMs have immunosuppressive impacts by releasing a broad range of growth factors and cytokines in response to the material derived from cancer cells [27]. In pathological circumstances, encircling monocytes are recruited to the brain essence and cause BMDMs [28]. The accumulative of macrophages in and around glioma actively in uence glioma growth and invasion. Researchers found that, the expression of MT1-MMP has positive correlations with the increasing glioma malignancy grade [29]. Toll-like receptors (TLR) are indispensable in the process of mediating immunologic responses to pathogens [29]. In the GBM microenvironment, TLR2 is appointed as the main TLR to trigger the course of MT1-MMP upregulation [31]. In the regulatory network, the Toll-like receptor signaling pathway is activated by TLR2, the co-expressed gene of the prognostic IR-DEGs, and that could promote tumor growth.
NK cells, the effector lymphocytes of the innate immune system, have the ability to limit tumors growth and dissemination [32]. However, tumors may resist attacks from endogenous NK cells through several paths, leading to immune evasion. The activation course of NK cell is mediated by the CD16 receptor (also known as FCGR3A), and the FCGR3A binds the constant region (Fc) of immunoglobulins to in ict heavy casualties on the antibody-coated cells [33]. Activating NK cell receptors and triggering optimal NK cells cytotoxicity could be a potential treatment target of GBM and improve patient prognosis.
In TME, the complement function is diverse, not only participating in anti-tumor defense but also promoting tumor progression and invasion. The two types combined body of C3a and C5a with their match receptors, C3aR and C5aR1 or C5aR2, play a crucial role to in bringing about in ammation and cause activation of immune cells as well as speci c malignant cells [34]. The in uence that C3a and C5a exert on TME are the recruitment of tumor-promoting macrophages [35], and negative recruitment of CD4+ T cells, neutrophils[36] and NK cells [37]. PTX3 activates and modulates the complement cascade by the interaction with C1q or Factor H, and PTX3 involves in the recruitment of TAMs [35]. C5a recruit neutrophils and leads to the upregulation of receptors such as toll-like receptors (TLRs) and/or complement receptors. Ultimately this action could cause a more vigorous NET response [38].
Researchers illustrate that C3aR drives pro-tumoral neutrophil extracellular trap (NET) formation and the latter causes thrombus formation [39]. The embolus excites neutrophils into a protumorigenic (N2) state and strengthens the pathological process of hypercoagulation and alexin activation [39]. Finally, the interaction of the complement and the coagulation worse the outcome of GBM patients.
We found six immune-related genes were signi cantly correlated with prognosis and have a profound in uence on the GBM microenvironment. In high-risk and low-risk groups, TAMs were positively relevant to the expression levels of prognostic IR-DEGs and have an impact on tumor progression and invasion.
NK cells were negatively relevant to the prognostic IR-DEGs and enriched higher in low-risk group, resulting in a better prognosis. The differential activation of complement and coagulation cascade in the high-risk and low-risk groups were related to the prognostic IR-DEGs, which might explain the differential prognosis of GBM patients. They may become new strategies for improving GBM patient outcomes in the future.
In summary, we developed a novel prognostic risk score based on the expression of PLAUR, TNFSF14, CTSB, SOCS3, PTX3 and FCGR2B by using univariate and multivariate analysis in TCGA and CGGA databases. This prognostic signature can help predict the survival rate of GBM patients more accurately.
What's more, we identi ed the difference of micro-environment in high-and low-risk group in GBM, and formed a regulatory network that needs further veri cation. These results provide a valuable insight into other further immunotherapy strategies, and the six prognostic IR-DEGs could become a target in improving the prognosis of GBM patients.