A 1p/19q Codeletion-Associated Immune Signature for Predicting Lower Grade Glioma Prognosis

Lower grade gliomas (LGGs) with codeletion of chromosomal arms 1p and 19q (1p/19 codeletion) have a favorable outcome. However, its overall survival (OS) varies. Here, we established an immune signature associated with 1p/19q codeletion for accurate prediction of prognosis of LGGs. The Chinese Glioma Genome Atlas (CGGA) and The Cancer Genome Atlas (TCGA) databases with RNA sequencing and corresponding clinical data were dichotomized into training group and testing group. The immune-related differentially expressed genes (DEGs) associated with 1p/19q codeletion were screened using Cox proportional hazards regression analyses. A prognostic signature was established using dataset from CGGA and tested in TCGA database. Subsequently, we explored the correlation between the prognostic signature and immune response. Thirteen immune genes associated with 1p/19q codeletion were used to construct a prognostic signature. The 1-, 3-, 5-year survival rates of the low-risk group were approximately 97%, 89%, and 79%, while those of the high-risk group were 81%, 50% and 34%, respectively, in the training group. The nomogram which comprised age, WHO grade, primary or recurrent types, 1p/19q codeletion status and risk score provided accurate prediction for the survival rate of glioma. DEGs that were highly expressed in the high-risk group clustered with many immune-related pathways. Immune checkpoints including TIM3, PD1, PDL1, CTLA4, TIGIT, MIR155HG, and CD48 were correlated with the risk score. VAV3 and TNFRFSF11B were found to be candidate immune checkpoints associated with prognosis. The 1p/19q codeletion-associated immune signature provides accurate prediction of OS. VAV3 and TNFRFSF11B are novel immune checkpoints.


Introduction
Glioma, which derives from glial cells, is the commonest primary intracranial malignancy and is associated with poor outcomes. Gliomas are classified into grade I, II, III, or IV (Louis et al. 2007). Those in histological grade IV, such as glioblastoma (GBM), are considered high grade gliomas, while those in grade II and III are regarded as lower grade gliomas (LGG) (Kiran et al. 2019). The median GBM survival is 1 to 2 years after diagnosis while the overall survival (OS) for LGG patients ranges between 5 and 10 years. Despite advances in cancer screening, diagnosis and treatment, LGGs often progresses into high grade glioma within years (Huang et al. 2017;Kiran et al. 2019;Stupp et al. 2005). While LGG patients experience a longer survival times and a better quality of life, progression into GBM, is associated with poor therapeutic options and significantly lower prognosis. Therefore, effective LGGs treatments are of utmost importance for improved glioma outcomes.
Conventionally, brain tumors are classified through histogenesis, by observing microscopic tumor features. However, over time, it became clear that more efficient techniques were needed, leading to the development of molecular classification features techniques (Louis 2012). Currently, the WHO recommends that molecular parameters, such as the codeletion of chromosomal arms 1p and 19q (1p/19 codeletion), and isocitrate dehydrogenase (IDH) mutation status be Jie Xu and Fang Liu should be considered joint first authors.
In glioma treatment, surgery, followed by chemotherapy and radiotherapy are associated with some improvement in therapeutic benefits relative to surgery alone. Immunotherapy is expected to improve treatment outcomes against glioma. While 1p/19 codeletion is being used in LGGs classification, little is known about the correlation between 1p/19 codeletion, the immune system and OS (Ceccarelli et al. 2016). Here, we uncovered a prognostic immune signature that correlates with 1p/19 codeletion. We hypothesize that gene expression reprogramming that follows the 1p/19 codeletion might modulate the immune system.

Identification of Immune-Related Genes Correlated with 1p19q Codeletion
Glioma RNA-seq datasets and corresponding clinical information were downloaded CGGA (https ://www.cgga.org.cn/) and TCGA (https ://porta l.gdc.cance r.gov/). Kaplan-Meier (KM) analysis was then used to evaluate survival. Log-rank tests were used to assess the correlation between 1p19q codeletion status and OS in various WHO grade phenotypes. The 1p19q codeletion status assessed using gene expression analysis as done previously Hu et al. (2017). Next, univariate and multivariate Cox regression analyses were used to evaluate the value 1p19q codeletion as an independent prognostic factor. Differentially expressed genes (DEGs) in 1p19q codeletion vs non-deleted samples were identified using the "limma" package in R software (version 3.6.1), by imposing the following criteria: |log2 fold change, log2FC|> 1 and an adjusted p = < 0.05. This analysis involved data from 192 1p19q codeletion LGG samples and 394 non-deletion LGG samples. 1p19q codeletion-associated immune-related DEGs were identified from the DEGs based on immune-related gene annotation on the IMMPORT website (https ://www. immpo rt.org/) (Zhang et al. 2019a). Only genes shared by CGGA and TCGA were included in downstream analyses. Only samples for which an OS time of > 90 days were retained for downstream analyses.

Elucidation of the Prognostic Signature
Next, univariate Cox proportional hazards regression and LASSO (least absolute shrinkage and selection operator) Cox regression analyses were done on the 1p19q codeletionassociated immune-related DEGs to prognosis-associated genes. The LASSO regression algorithm is used to reduce overfitting high-dimensional prognostic genes (Castro et al. 2019;Goeman 2010). Multivariate Cox proportional hazards regression analysis was then used to establish a prognostic signature with a coefficient (β) based on all the genes included in the signature (Deng et al. 2019). The risk score was a sum value calculated in accordance with the formulate: risk score = (expression of gene A1*β1) + (expression of gene A2*β2) + (expression of gene A3*β3) + … (expression of gene An*βn) (Qian et al. 2018). All CGGA dataset samples were identified and classified as either low-risk or high-risk based on the median risk score (Liu et al. 2019). KM survival plots and log-rank tests were used to evaluate the correlation between risk scores and OS.

Validation of the Prognostic Signature
Next, internal and external validation analyses were done to verify the prognostic signature's predictive power, which was evaluated using survival plots, 1-, 3-, and 5-year timedependent receiver operating characteristic (ROC) curves, and survival status plots were (Yang et al. 2020). Heatmaps and violin plots were used to visualize the expression profiles of the prognostic signature genes in the low and highrisk groups.

Evaluation of the Independent Value of the Prognostic Signature
Correlation between risk score and clinical information including age, sex, radiotherapy status, chemotherapy status, tumor grade, primary or recurrent tumor and IDH mutation status were analyzed. Univariate and multivariate Cox regression coupled with available clinical information were used to evaluate the independent prognostic capacity of the risk score.

Prognostic Nomogram Analysis and Validation
Independent prognostic factors emerging from the CGGA dataset were subjected to nomogram analysis to predict the 1-, 3-and 5-year survival (Long et al. 2019). 5 independent prognostic factors, age, WHO grade, primary or recurrent glioma types, 1p19q codeletion status, and risk score, were used to develop the nomogram. We then validated the nomogram's prognosis accuracy using concordance index (C-index) combined with a calibration curve plot. This analysis was done in 1000 reiterations (Duan et al. 2018;Kiran et al. 2019).

Gene Ontology Analysis
Next, we executed a GO term analysis of DEGs between the high and low-risk groups. DEGs were identified by setting the following thresholds: log2FC > 1.2 and p-value < 0.05. Significantly enriched GO terms were indicated by p-value < 0.01, q value < 0.01, and gene counts > 10. Results from this analysis were visualized on circle plots.

Analysis of Correlation Between Risk Score and Expression of Immune Checkpoints
Differential expression of 7 established immune checkpoint genes in the low and high-risk groups was analyzed. These are, T cell immunoglobulin domain and mucin domain 3 (TIM3), programmed cell death 1 (PD1), PD1 interacts with programmed death ligand 1 (PDL1), cytotoxic T lymphocyte antigen 4 (CTLA4), T cell immunoreceptor with Ig and ITIM domains (TIGIT), long non-coding RNA MIR 155 host gene (MIR155HG), and CD48 (Filippova et al. 2018;Hung et al. 2018;Liu et al. 2018Liu et al. , 2020Peng et al. 2019). This analysis evaluated the correlation between risk score and expression of the checkpoint genes. p-value = < 0.05 was considered statistically significant.

Evaluation of the Candidate Immune Checkpoints
The following criteria used to elucidate underlying immune checkpoints: (1) DEGs in high-risk and low-risk groups were common in the CGGA and TCGA datasets, (2) genes correlated with OS, (3) the genes were independent of clinical prognosis parameters including age, sex, WHO grade, and IDH1 mutation status, (4) genes had an AUC > 0.7, (5) there was correlation between gene expression and risk score, (6) candidate immune checkpoints genes had a correlation value > 0.6 both in CGGA and TCGA datasets, (7) candidate immune checkpoints show correlation with familiar immune checkpoints, (8) select candidates with a correlation score > 0.4 as immune checkpoints.

1p/19q Codeletion and Differentially Expressed Immune Genes
Analysis of survival in the CGGA datasets, revealed significantly lower OS in cases with 1p/19q codeletion relative to those lacking the codeletion. Interestingly, outcomes were markedly in grade II and III tumors with 1p/19q codeletion relative to grade IV tumors (Fig. 1a). Univariate and multivariate Cox analyses revealed 1p/19q codeletion as an independent prognostic factor for LGGs (Fig. 1bc). Differential gene expression analysis revealed that 551 DEGs between 1p/19q codeletion samples (n = 191) and non-codeletion samples (n = 393). Of the 551, 56 are immune-related genes. No statistically significant differences were noted in codeletion vs non-codeletion samples with regards to sex, age, primary or recurrent type, chemotherapy or radiotherapy status (Table 1).

Elucidation and Internal Validation of the Immune-Related Prognostic Signature
Univariate Cox proportional hazards regression and LASSO regression analyses of candidate genes revealed 23 genes that were screened in multivariate Cox proportional hazards regression. Thirteen immune-related genes associated with coefficients were included in the prognostic signature (Table 2). KM analysis of the low and high-risk LGG samples using log-rank test revealed that 1-, 3-and 5-year survival rates in the low-risk group were 97%, 89%, and 79%, respectively, while in the high risk groups survival rates were 81%, 50% and 34%, respectively (Fig. 2a). The prognosis was significantly better in cases with lower risk scores, indicating that the risk score negatively correlated with OS. AUC values for the signature's prediction of 1-, 3-, and 5-year survival were 0.818, 0.793, and 0.750, respectively (Fig. 2b). Indicating high risk score correlated with decreased survival (Fig. 2c). The heatmap depicted the visual difference trends of transcript expression of genes incorporated in the signature between the high-and low-risk categories (Fig. 2d). The violin plot presented a statistically differential expression between the two categories (Fig. 2e).

External Validation of the Prognostic Signature
The 1-, 3-and 5-year survival in the TCGA database were 99%, 89%, and 76%, respectively, while in the testing cohort, the corresponding survival rates were only 84%, 51%, and 36%, respectively (Fig. 3a). The capacity of the testing cohort to predict survival was very similar to that of the training cohort. ROC curve analysis was used to validate prediction accuracy. The AUC values for 1-, 3-and  5-year survival were 0.896, 0.785 and 0.708, respectively (Fig. 3b). A similar trend was observed in the testing cohort ( Fig. 3c-e).

Evaluation of the Independent Prognostic Value
Correlation between risk score and clinical parameters, including age, sex, radiotherapy and chemotherapy status, tumor grade, primary or recurrent types and IDH mutation status, was assessed. The value of risk score was lower in The heatmap (d) and violin plot (e) of 13 genes between the low-and high-risk groups included the signature chemotherapy, 1p/19 codeletion, tumor grade II, primary and IDH mutation categories (p < 0.05, Fig. 4a). Univariate and multivariate Cox proportional hazards regression indicated that the risk score phenotype had independent prognostic value in the training testing cohort (Fig. 4b-e).

Nomogram Analysis
Nomogram analysis, using 5 prognostic markers (age, tumor grade, primary or recurrent type, risk score, and 1p/19 codeletion status), was used to predict survival in and violin plot (e) of 13 genes between the low-and high-risk groups included the signature the training set. Among these prognostic factors, risk score ranked a vital proportion in the total points (Fig. 5a). To validate the accuracy of the individual assessment, concordance index (C-index) and calibration curve of the nomogram were evaluated for internal validation. The C-index of the nomogram was 0.794. The visualized calibration curve for probabilities for 1-, 3-and 5-year OS revealed good agreement between the predicted nomogram and actual survival (Fig. 5b-d). Additionally, internal validation was done by randomly sampling 50% of the CGGA samples. The C-index was 0.797, and the calibration curves had goodness-off-fit ( Fig. 5e-g).

GO Term Analysis
503 genes were differentially expressed between low and high-risk groups. Of these, 255 had a log2FC > 1.2, and 166 of them were included in the term GO analysis.
The GO term analysis produced 33 terms that had gene counts > 10, 12 of which (including 37 DEGs) were associated with immune-related terms, including B cell receptor signaling pathway, lymphocyte-mediated immunity and humoral immune response (Fig. 6).

Elucidation of Immune Checkpoints and Risk Score
Next, the relationship between 7 established immune checkpoints and risk score was evaluated (Fig. 7). Expression of immune checkpoint genes, (except TIGIT), in the low-risk vs high-risk samples was statistically significant in the training and testing sets (Fig. 7a, b). TIM3, MIR155, and CD48 exhibited the highest correlation (> 0.4) in the training set ( Fig. 7c-i). In the testing set, TIM3, MIR155, PD1, and PDL1 expression positively correlated with risk score (Fig. 7g-p). Analysis of the correlation between DEGs (in low vs high-risk samples) and survival indicated that 279 and 199 genes in the CGGA and TCGA dataset, respectively, are significantly associated with OS. Further analysis revealed 42 and 73 genes in the CGGA and TCGA datasets (AUC > 0.7), respectively, that were independent of age, gender, tumor grade and IDH mutation status. Of these, 20 were common between the 2 datasets. Analysis of correlation between expression of the 20 genes and the risk score revealed 6 (colorectal neoplasia differentially expressed (CRNDE), transmembrane protein 71 (TMEM71), growth arrest specific 2 like 3 (GAS2L3), insulin like growth factor 2 mRNA binding protein 3 (IGF2BP3), vav guanine nucleotide exchange factor 3 (VAV3), TNF receptor superfamily member 11b (TNFRSF11B)) with a correlation coefficient > 0.6 in the training and he validation groups ( Fig. 8a-d). VAV3 and TNFRSF11B are immunerelated genes. Sankey diagram analysis revealed co-expression between the 7 established immune checkpoint genes and the 6 immune checkpoint genes we identified. CD48, MIR155HG, PDL1 showed a strong relationship (Cor > 0.4, p < 0.05) with other immune checkpoints in the training and testing set (Fig. 8e, f). VAV3 exhibited a close relationship with MIR155HG, while TNFRSF11B correlated with MIR155HG and PD1 in the training and testing sets.

Discussion
1p/19q codeletion is a well-established biomarker (Durand et al. 2010), currently recommended by the WHO for tumor grade classification (Ceccarelli et al. 2016). Here, we find that the 1p/19 codeletion related immune genes have prognostic potential in LGG. To design an unbiased prognostic system, we uncovered a prognostic signature by analyzing LGGs RNA-seq and clinical data from CGGA and TCGA. This prognostic signature validated the hypothesis that improved outcomes upon 1p/19q codeletion are associated with altered immunoregulation. In addition to the wellestablished immune checkpoints, including PD1 and TIM3, uncovered 6 novel immune checkpoint candidate. Numerous studies have previously described prognostic signatures for glioma (Jang and Kim 2018;Zhang et al. 2019b;Zhou et al. 2018). Additionally, there has been a growing interest in glioma immunotherapy (Chheda et al. 2018;Deumelandt et al. 2018;Kohanbash et al. 2017;Weller et al. 2017). There is evidence that 1p/19q codeletion correlates with significantly improved glioma prognosis. However, it remains unclear whether the codeletion's impact on outcomes are mediated via immune regulation. Here, we find that an immune-related prognostic signature associated with 1p/19q codeletion might influence glioma prognosis. Unlike a previous study (Zhang et al. 2019c), our prognostic signature, based on a phenotype, decreased heterogeneity and increased prediction accuracy.
Immunotherapy has generated a lot of interest as a treatment for gliomas (Srinivasan et al. 2017). Deng et al. (2019) reported an IDH1 mutation prognostic signature and its association immune-related GO terms. Here, we find that highly expressed genes in high-risk group correlated with various immune-related pathways, including, B cell receptor signaling, lymphocyte-mediated immunity, and humoral immune response. Analysis of the microenvironment has shown that immune-related pathways influence behavior of glioma cells (He et al. 2014). To evaluate the relationship between our Fig. 7 The expression of seven immune checkpoints (TIM3, PD1, PDL1, CTLA4, TIGIT, MIR155HG, CD48) in low-and high-risk groups in CGGA database (a) and TCGA database (b). The correla-tions between the immune checkpoints and risk score in CGGA database (c-i) and TCGA (j-p) database prognostic signature and immunobiology, we evaluate its correlation with well-established immune checkpoint genes, and found that TIM3, PD1, PDL1, CTLA4, MIR155HG, and CD48, but not TIGIT (Hung et al. 2018;Liu et al. 2018;Peng et al. 2019), correlates with risk score. These findings are to some extent consistent with those by Deng et al. The genes are significantly associated with the OS in CGGA dataset (2) and TCGA dataset (3). The genes are independent predictors of OS in CGGA dataset (1) and TCGA dataset (4). Genes have an AUC of > 0.7 (6, 7) and a correlation value of > 0.6 (5, 8) in CGGA dataset and TCGA dataset, respectively. The correlations between six candidate immune checkpoints and the risk score in CGGA dataset (c) and TCGA dataset (d). Sankey diagrams showing the internal and external correlations between avowed immune checkpoints and candidate immune checkpoints in CGGA dataset (e) and TCGA dataset (f) (2019). Recent studies have highlighted the potential of immune checkpoints therapeutic targets. Here, we identified 6 novel immune checkpoint genes (VAV3, GAS2L3, IGF2BP3, CRNDE, TNFRSF11B, and TMEM71) that correlate with prognosis. VAV3 and TNFRSF11B had already been annotated as immune-related genes on IMMPORT (https ://www.immpo rt.org/). The expression of these genes also highly correlates with risk score and expression of 7 well-established immune checkpoints. Most of the candidate immune checkpoint genes have been previously associated with glioma (Kryvdiuk et al. 2015;Pop et al. 2018;Salhia et al. 2008). Kiang et al. (2017) reported that CRNDE is elevated in glioma and might be modulated by EGFR signaling to promote gliomagenesis. However, there is little knowledge of the role of our candidate immune checkpoints in immune regulation of glioma. Further studies are needed to experimentally validate their involvement in glioma.
Although the phenotypes of glioma were classified according to molecular biomarkers (Reifenberger et al. 2017), the OS of LGGs with 1p/19q codeletion varies widely. It is clear that the 1p/19q codeletion-associated immune prognostic signature reduces variability, making prognosis more accurate. The immune response pathways associated with high-risk group raised several important questions, including, which immune checkpoint genes might regulate these immune response pathways. Glioma prognosis remains extremely poor, suggesting that the single immune therapy in use has failed to significantly improve OS. However, it should be noted that a diversified immune therapeutic strategy may be more effective. Thus, additional immune checkpoints for LGG treatment should be made first-line treatments along with surgical resection, radiotherapy or chemotherapy. The novel candidate immune checkpoint genes identified here, especially VAV3 and TNFRSF11B, are likely to become established immune checkpoint genes.
The purpose of this study was to establish a prognostic signature for LGGs with 1p/19q codeletion that can be used in clinical settings. However, LGGs including astrocytomas and oligodendrogliomas, have great tissue heterogeneity. The utility of this prognostic signature will likely be limited by such heterogeneity. The candidate immune checkpoint genes, are in fact, prognostic related genes and had a close correlation with the risk score and well-established immune checkpoints. The immune-related function and mechanisms of candidate immune checkpoints in LGGs in our study were hypothesized and their experimental validation is necessary. We contend that immunotherapy based on multiple immune checkpoints simultaneously may provide improved outcomes in glioma.