The Prognostic Value of Risk Score Based on Immunogenenomic Landscape Analysis in Glioma

Background: Glioma is a lethal intracranial tumor, and inflammation plays an important role in the initiation and development of glioma. Hence, there is an urgent need to conduct a bioinformatics analysis of immune-related genes (IRGs) for glioma. The present study aims to explore the association of the risk score with clinical outcomes and predict the prognosis with glioma. Methods: In The Cancer Genome Atlas (TCGA) database, 462 low grade glioma (LGG) samples and 166 glioblastoma (GBM) samples were reviewed, and IRGs correlated with the prognosis were selected by performing a survival analysis and establishing a Cox regression model. The potential molecular mechanism of these IRGs were also explored with assistance of computational biology. The risk score based on seven survival-associated IRGs was determined with the help of the multivariable Cox analysis, the patients were divided into two subgroups according to their risk score. Results: It was found that these differentially expressed IRGs were involved with the cytokine-cytokine receptor through functional enrichment analysis. The risk score based on the seven IRGs (SSTR5 CXCL10CCL13SAA1CCL21CCL27 and HTR1A) performed well in predicting patient’s the overall survival (OS), and correlated with age, 1p/19q codeletion status, IDH status, and WHO grades, both in the training (TCGA) datasets and the validation ((Chinese Glioma Genome Atlas) CGGA) datasets. The risk score also could reflect infiltration through several types of immune cells. Conclusions: This present study screened some IRGs associated with the patient’s clinical characteristic and prognosis, connect to the immune repertoire, demonstrated the importance of the risk score as a promising biomarker for estimating the clinical prognosis of glioma.

challenging malignancies in the world. Through the development of omics studies, the understanding of the potential molecular mechanisms of glioma has made great progress. However, a number of problems still needs to be further explored, and the identification of the specific molecular markers of glioma may bring opportunities to precisely predict the prognosis and choose a suitable personal medicine for the treatment of glioma. The human immune system has an important ability to regulate the potential molecular mechanisms of tumors, and a number of evidences have proven the signature of IRGs in the complex regulatory network [9,10]. Recently, studies on immunotherapy have provided a new perspective to effectively and safely treat some lethal tumors [11,12]. However, the molecular mechanism of the immunology in glioma still remains unclear, especially its immunogenomic effect [13]. At present, researchers can quickly identify crucial biomarkers for tumor monitoring and surveillance due to the development of gene expression databases. For example, Li et al investigated the potential molecular mechanisms of IRGs in non-squamous non-small cell lung cancer, which developed an immune signature to create a new individualized treatment [14]. In addition, Lin et al also comprehensively explored the prognostic value of IRGs in papillary thyroid cancer, and found some IRGs that were closely correlated to the clinical prognosis of patients [15]. However, the clinical relevance of IRGs in glioma remains unclear and needs further exploration.
In the present study, the clinical relevance of IRGs in glioma was demonstrated by using the RNA sequencing data from TCGA datasets and was verified in CGGA datasets. The differentially expressed IRGs were selected, survival analysis and the Cox regression model were used to identify the risk score relevant to the seven survival-associated IRGs in these glioma patients. Furthermore, bioinformatics analyses were conducted to investigate the underling biological functions of IRGs in glioma. In conclusion, the present study provides a novel insight that may have great promotion in the immunotherapy of glioma patients.
We screened the differentially expressed and survival-related genes in TCGA datasets, and the data of IRGs were through the Immunology Database and Analysis Portal (ImmPort) database (https://immport.niaid.nih.gov) [16]. A variety of IRGs supported by ImmPort were a strong basis of immunology research, and provided a number of IRGs for the tumor study. The IRGs downloaded from the ImmPort website had a significant bearing on the initiation and development of immune activity.

Selection of differentially expressed and survival-related IRGs
In order to identify the IRGs associated with glioma, differentially expressed IRGs between LGG and GBM were selected using the R language software (http://bioconductor.org/packages/edgeR/) [17].
Then, a univariate Cox regression analysis of the expressions of IRGs and prognosis of patients in TCGA was performed. Hence, we demonstrated that all differentially expressed IRGs were survivalrelated IRGs in glioma.

Molecular characteristics of Hub IRGs
IRGs with clinical applications as hub IRGs were identified, and the clinical values were further systematically explored. The protein-protein interaction (PPI) network, which was based on data obtained from the string online datebase (https://stringdb.org/) [18], which was constructed to determine the interactions between these differentially expressed IRGs. The result of the PPI network was displayed by Cytoscape v3.6.1 [19]. Functional enrichment analyses were also conducted to investigate the underling biological roles of these differentially expressed IRGs and hub IRGs in glioma. In order to explore molecular mechanisms of these IRGs, the Database for Annotation, Visualization and Integrated Discovery (DAVID) version 6.7 (http//David.abcc.ncifcrf.gov/) [20], via the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathways were used to analyze these differentially expressed IRGs [21], and performed the functional and pathway enrichment analyses. P<0.05 was set as the cutoff criterion.
Since transcription factors (TFs) with significant molecular functions to directly up-regulate and downregulate the gene expression, it is essential to determine the potential molecular biological functions of TFs to regulate these hub IRGs. Cistrome Cancer is a database that contains 318 TFs, and is a significant tool for computational and experimental tumor research [22]. TFs correlated to the OS were extracted to construct the regulatory network with 20 hub IRGs.

Development of the immune-related prognostic risk score
To further screening for the functions of these genes, hub IRGs were submitted for multivariate Cox regression analysis, with selecting seven IRGs as independent prognostic indicators to determine the risk score. The risk score was defined using the formula as follows [23]: This formula was used to calculate a risk score for each patient both in the training (TCGA) and validation (CGGA) datasets. In order to divided the glioma samples, high risk and low risk groups were established using the median risk score. The prognostic value of the risk score was assessed in patients with different grade of glioma both in the training datasets and validation datasets.
The TIMER online database has been used to visualize and analyze the abundance of infiltrating immune cells in tumors, which integrates a number of samples, including 32 kinds of tumors from TCGA datasets [24]. Furthermore, the TIMER database also has been used to research the infiltration level of six types of immune cells in tumors. Hence, the immune infiltrate levels of glioma patients were downloaded from the TIMER database, and the relationship between the risk core and immune cell infiltration was explored in present study.

Statistical analysis
The R language v 3.6.1 (https://www.rproject.org/) was mainly used to perform the statistical analysis.
In addition, the R language v 3.6.1 was used to conduct gene functional enrichment analyses, and determine the areas under the curves (AUC) of the survival, ROC curve, KM survival analysis [25].
Furthermore, the nomogram and risk score were also formulated using the R package. Differences among clinical outcomes were test using independent t-tests. A P-values < 0.05 was considered statistically significant.

Differential expression of IRGs
A total of 676 differentially expressed genes were finally identified using the R language v 3.6.1, which included 392 over-expressed and 284 lowly-expressed genes (Figs. 1A and 1B). In addition, 91 differentially expressed IRGs were also selected, which included 66 over-expressed and 25 lowlyexpressed genes (Figs. 1C and 1D). Furthermore, the cytokine-cytokine receptor interaction was the most frequently implicated in these differentially expressed IRGs in the KEGG pathways ( Fig. 2A).
"Behavior", "extracellular region", and "cytokine activity", were the most frequent biological terms among cellular component, biological process, and molecular function (Table 1).

Selection of survival-associated IRGs
Since the OS is crucial for clinical diagnosis and treatment, we focused our efforts on identifying the potential molecular biomarkers which could affect the prognosis or assist in the diagnosis of diseases.
In the present study, to extract survival-associated IRGs, we selected differentially expressed IRGs were related to the OS (P < 0.05). Finally, all differentially expressed IRGs were crucially related to the OS in glioma patients after screening (Additional file 2: table S2).

Identification of Hub IRGs
The PPI network analysis proved the 20 hub genes among these survival-associated IRGs (Figs. 2C and 2D). We found that the inflammatory pathway was the most often enriched by these hub IRGs, the chemokine signaling pathway and cytokine-cytokine receptor interaction was the most frequently implicated in these hub IRGs in the KEGG pathways (Fig. 2B). "cell", "immune response", and "C-C chemokine receptor activity", were the most frequent biological terms among cellular component, biological process, and molecular function ( Table 2). A forest plot demonstrated 8 hub IRGs were lowly-expressed and 12 hub IRGs were over-expressed in glioma samples (Fig. 2E). Owing to the significant clinical value of these IRGs, we embarked on a comprehensive exploration of their molecular characteristics. Moreover, genetic alterations (mutation and coy number change) of these hub IRGs were examined and the frequencies were very low (< 0.8%) in glioma (Fig. 3), demonstrating that the different expressions of these Hub IRGs in glioma were not caused by genetic alterations.

TF-based Network with Hub IRGs
In order to explore the regulatory biomolecular function of these hub IRGs that corresponded to the clinical outcomes, the potential molecular mechanisms of survival-associated IRGs were researched.
The expression of 318 TFs were detected, and 9 differentially expressed TFs between the GBM and LGG samples were selected (Figs. 4A and 4B). Among these 9 differentially expressed TFs, all of them were correlated with the OS of glioma patients, 8 TFs were up-regulated while 1 TF was downregulated in glioma samples (Fig. 4C). Furthermore, a gene regulatory framework was established based on the 9 TFs and 20 hub IRGs (Fig. 4D). The regulatory relationships between these survivalassociated TFs and hub IRGs were illustrated by the regulatory TF-based network.
The risk scores associated with clinicopathological features The expression of seven IRGs were selected to determine the risk score based on the minimum criteria, and the coefficients obtained from the least absolute shrinkage and selection operator (LASSO) in TCGA dataset. Then, the risk score was used to separate the glioma samples into high-risk and low-risk groups with differentially clinicopathological features (Fig. 5). The risk score played a crucial role in the identification of glioma patients, according to the potential differential clinical characteristics (P < 0.05) (Fig. 6A). In addition, the ROC curve suggested a very good predictive performance for the risk score based on IRGs in predicting glioma patient clinical outcomes (AUC = 0.885, Fig. 6B). It was also found that the risk score which was as a prognostic factor in different grades of glioma (P < 0.05) (World Health Organization [WHO] grade II, and III and GBM) (Figs. 6C-6E).
Furthermore, the univariate Cox regression analysis and the multivariate Cox regression analysis demonstrated that the risk score could become an independent prognostic predictor after other parameters were adjusted, including age, grade, gender, IDH status and 1p/19q codel status both in the training dataset (TCGA) (Fig. 7A, Fig. 7C) and the validation dataset (CGGA) (Fig. 7B, Fig. 7D).
Further, we also found the risk score could be a prognostic factor in all grade glioma and GBM groups in CGGA dataset (P < 0.05) (Figs. 7E-7F). This conclusion confirmed the risk score derived from seven IRGs could be considered as an independent prognosis predictor in glioma patients. The clinical outcomes of seven survival-associated IRGs were also explored both in the training dataset (TCGA) ( Table 3) and the validation dataset (CGGA) (Additional file 3: Table S3). We also examined the relationships between the risk score and each clinical characteristic, including age, grade, gender, IDH status and 1p/19q codel status. The risk score was significantly higher in old age, 1p/19q noncodel, IDH-wildtype and WHO Ⅳ, no difference was observed in gender (Fig. 8B) in both the TGGA (Figs. 8A-8E) and CGGA (Additional file 1: Fig.S1) datasets. In order to determine whether the risk score could reflect the GBM immune microenvironment status, the relationships between the risk score and infiltration level of immune cells were described (Figs. 9A-9F). Table 3 The relationship between the expression of seven IRGs and the clinical characteristics in TCGA dataset

Discussion
Although it has been confirmed that IRGs play an important role in tumorigenesis and tumor progression, there is still a need to make a comprehensive genome-wide profiling research to determine the relationship between molecular biological functions and patient clinical characteristics [26]. The risk score was chosen as a tool to monitor the immune microenvironment and suggested the OS in glioma patients. In addition, seven survival-associated IRGs (SSTR5, CXCL10, CCL13, SAA1, CCL21, CCL27 and HTR1A) were selected by the LASSO to calculate the risk score in TCGA dataset.
Considering that the underlying molecular mechanism corresponding to the potential clinical value of these seven survival-associated IRGs, among these genes, SSTR5 is downregulated in glioma and inhibits glioma cell proliferation [32], while CXCL10 CCL20 is upregulated and promotes glioma cell proliferation [33][34]. SAA2 can increase migration and invasion behaviors of glioma cell [35]. no literature has been reported on the molecular biological functions of HTR1A in glioma at present.
Overall, these studies provided little information on the molecular biological functions of these survival-associated IRGs in glioma.
The nomogram has been extensively applied to calculate the clinical risk factors and predict clinical prognosis in many tumors, which has shown favorable effects [36,37]. Hence, the risk score was connected with the clinical characteristics (grade, gender, age, IDH status, 1p/19q code), and a nomogram was constructed to prove that the risk score has a great performance in predicting the OS of glioma patients. In addition, the nomogram could also predict the 5-year survival rate in the differential risk score groups via the KM survival analysis. According to the nomogram established in the present study, the risk score based on seven survival-associated IRGs were demonstrated as an independent prognostic factor, which may provide new evidences for the treatment of glioma.
In order to explore the tumor-immune interactions, it is necessary to characterize the immune infiltration landscape. The relationship between risk score and level of immunocyte infiltration was established to determine the regulation mechanism of the immune microenvironment in GBM. In the present study, we demonstrated the infiltrations level of B_ cell and Neutrophil were evidently positively correlated with the risk score, while CD_4 T cell infiltration level was significantly negatively correlated with the risk score. These results indicated that the higher infiltration levels of B_ cell, Neutrophil and lower CD_4 T cell might be observed in high-risk patients.
These results suggested that immune cells played an important role in the pathogenesis of glioma, and it was also confirmed that the risk score could be regarded as a potential predictor for monitoring the infiltration level of immune cells. Previously, Ge J et al demonstrated that glioma patients had significantly upregulated frequencies of CD4 + cells, when compared to healthy controls [38].
Sokratous G et al suggested that the infiltration level of cytotoxic T cell in glioma could importantly affect the clinical prognosis [39]. However, the biological function of these immune cells in glioma has not be fully and comprehensive explored. The analysis in the present study can provide novel insights for solving several problems, and be a foundation for more in-depth and high-quality researches in the treatment of glioma patients.
There were still some limitations in the present research. First, the samples did not contain information about the excision scope of glioma that connected to a patient's prognosis. Hence, the collection of more thorough and comprehensive information needs to be explored in the future. In addition, without verifying with an independent cohort, the credibility of these present research results remains to challenged. Furthermore, the lack of an in vitro or in vivo experiment was also a limitation of present study.

Conclusions
In the future, certain problems still need to be explored. For example, the potential relationship between immunogenomics and proteomics needs to be further explored to clarify the immunologic varieties in glioma. Furthermore, the relationships between disturbed immuno-genomes and premalignant lesions also needs to be deeply explored. In the present study, it demonstrates that IRGs play a significantly role in the initiation and progression of glioma, and that the risk score is

Availability of data and materials
Data for this study were obtained from the TCGA and CGGA datasets.

Ethics approval and consent to participate
Because patient data in the TCGA and CGGA datasets were de-identified, signed informed consent was waived in this study.

Consent to publish
Not applicable.        26 The relationship between the risk score and the clinical characteristics in TCGA datasets.

Figure 9
The relationships between the risk score and infiltration level of the six types immune cells.

Supplementary Files 27
This is a list of supplementary files associated with this preprint. Click to download. Figure S1.tif Table S1.docx Table S2.docx   Table S3.docx