Survival-Associated Differentially Expressed Alternative Splicing Signatures Reveal Neuron Signaling and Metabolism-Related Processes in Glioblastoma

Alternative splicing (AS) is a molecular event that drives protein diversity by generating multiple mRNA isoforms. Increasing evidence shows that the differential expression of AS is related to tumorigenesis. However, a synthetic analysis for identifying the AS biomarkers attributed to glioblastoma (GBM) is AS percent-splice-in (PSI) data were obtained from the TCGA SpliceSeq database. We systematically and interactively analyzed differentially expressed alternative splicing (DEAS) events in GBM. Univariate and multivariate Cox regression analysis was successively performed to identify the overall survival (OS)-associated DEAS events, followed by the construction of DEAS predictor through different splicing patterns. Then, a nomogram that combines the nal DEAS predictor and clinicopathological characteristics was established. Finally, a splicing regulatory network was created according to the correlation between the DEAS events and the splicing factors (SF). Summary illustrations show Figure1. the role of DEAS events in the progression of GBM, providing a potential clinical biomarker and therapeutic reference for GBM.


Introduction
Glioblastoma (GBM) remains the most aggressive and malignant form of primary central nervous system (CNS) tumors in adults. According to the 2016 World Health Organization (WHO) molecular diagnosis guidelines for CNS, GBM has high heterogeneity and poor prognosis [1]. The current rst-line treatment for newly diagnosed GBM is the maximum safe resection followed by chemoradiation [2,3]. GBM almost always recurs, 2-year survival rate of GBM is 25%, and its 5-year survival rate is 5-10% [4]. Consequently, GBM, which is di cult to cure, puts challenging pressure on future work related to the exploration of new therapeutic goals.
Alternative splicing (AS) is the main driving force for the production of multiple proteins, allowing cells to produce a large amount of protein diversity from a limited number of genes [5]. Alternative splicing occurs in approximately 95% of transcripts [6]. Current research indicates that AS events are divided into seven types, and each has a different mechanism of action described as follows: alternate acceptor site (AA), alternate donor site (AD), alternate promoter (AP), alternate terminator (AT), exon skip (ES), mutually exclusive exons (ME), and retained introns (RI). Cancer cells exhibit remarkable transcriptome alterations, partly by adopting cancer-speci c splicing isoforms. The aberrations of AS in cancers primarily include four categories: (a) AS alterations in oncogenes or tumor suppressor genes, (b) splicing factor mutations, (c) alterations in the upstream signaling pathways that decontrol splicing factors, and (d) aberrations in cancer-speci c spliceosomal components [7]. Alternative splicing can be a target for cancer treatment, and targeting aberrant splicing has provided novel perspectives for clinical therapy strategies [8].
Although Xie et al. [9] and Chen et al. [10] performed a transcriptome-wide analysis of alternative mRNA splicing markers in GBM tissue data from The Cancer Genome Atlas (TCGA), both articles only focused on single AS events. In addition, there is a lack of systematic analysis of differentially expressed AS events (DEAS) between GBM and normal tissues. Although only protein-coding genes have been studied, DEAS can generate differentially expressed transcripts and protein variants.
In this study, we obtained RNA sequences from the Cancer Genome Atlas (TCGA) program and clari ed the role of different DEAS patterns in 154 GBM samples and 5 normal control samples and conducted an in-depth analysis of the potential impact of speci c DEAS events on the prognosis of GBM patients. Finally, we investigated how splicing factor (SF) gene-regulated DEAS events are involved in the occurrence and development of GBM. The purpose of this study was to explore the role of various DEAS splicing modes in GBM and how SF genes regulate DEAS events, thus providing new ideas for the detection and treatment of GBM.

Materials And Methods
Collecting AS events and GBM-related clinical data TCGASpliceSeq (http://bioin forma tics.mdand erson.org/TCGAS pliceSeq) is a web-based resource used for investigating mRNA AS patterns for 33 different tumor types in TCGA database; it was used to download the splicing pro les of GBM [11]. The pro le of the AS events was de ned by a percent-splicedin (PSI) value, which ranged from zero to one. Simultaneously, each splicing event was allocated a unique code consisting of the splicing type, gene symbol, and ID number to ensure accuracy. The splicing pro les of seven types of AS, including alternate acceptor site (AA), alternate donor site (AD), alternate promoter (AP), alternate terminator (AT), exon skip (ES), mutually exclusive exons (ME), and retained introns (RI) were freely available in TCGA database. The expression data for GBM and variable splicing data were integrated. Finally, 154 GBM samples and ve normal control samples were selected for subsequent analyses. The lists of immune-related genes were downloaded from the ImmPort (https://www.immport.org/shared/home) and InnateDB (https://www.innatedb.com/) databases.

Differentially spliced AS events analysis
The limma package was used for differential analysis. In this study, a differentially alternative splicing event was carried out between the GBM samples and normal brain samples, and P < 0.05, and | log FC|>1 was de ned as differential alternative splicing events.
Gene Ontology (GO) functional and KEGG pathway enrichment analyses of DEAS events GO and KEGG pathway enrichment analyses were implemented with the "enrichplot" R package to analyze the function of the DEAS [12]. GO terms included biological process (BP), cellular component (CC), and molecular function (MF). For the analysis results, both P-value and false discovery rate values < 0.05, were de ned as statistically signi cant.

Prognostic model construction
We downloaded the clinical parameters of the GBM cohort from TCGA database (https://tcgadata.nci.nih.gov/tcga/). A total of 559 patients with GBM with overall survival (OS) information were included in the study. To analyze the association between DEAS events and OS of patients, we divided patients into two groups based on the median PSI value of each splicing event and performed univariate Cox regression. Multivariate Cox regression was further conducted to determine splicing events that were independent prognostic factors and to build predictive models. Kaplan-Meier curves were used to verify whether the predictive models could differentiate patients with long OS from those with shorter OS. The e ciencies of each predictive model were further assessed by the receiver operator characteristic curves using the survival ROC package in R. To investigate the intersections between different types of AS, we applied UpSetR to visualize intersections and their aggregates, which is more e cient than the traditional Venn diagram when addressing more than ve sets [13]. Function annotation, pathway enrichment, and gene interaction network analyses were performed on survival-related AS genes using the Reactome FI plugin in Cytoscape (version 3.6.1) [14].

Splicing-factor-regulated network construction
To explore the regulation of splicing factors on prognostic DEAS events, we rst curated TCGA level 3 mRNA-seq data of splicing factors in primary mRNA splicing pathways and determined the survivalassociated splicing factors. The expression levels of survival-associated splicing factors in GBM and non-tumor tissues were also compared. The Spearman test was then conducted to evaluate the potential relationship between the expression level of splicing factors and PSI value of AS events. The regulation networks were plotted using Cytoscape (version 3.6.1).

Integrating alternative splicing events in GBM
Our research conducts data analysis in sequence according to the owchart(  Figure 2A). The exon skip was the most dominant splicing type, followed by the AP splicing type. The number of AS events was larger than the total number of genes, implying that a single gene may have undergone multiple splicing. Some genes can be expressed in up to six AS types. We have described the detailed information of genes of a speci c AS type in the Upset diagram ( Figure 2A). Compared with a traditional Venn diagram, it can more effectively prove the quantitative results of multiple interaction sets.

Identi cation of differentially expressed AS (DEAS) events
By comparing the differences in AS events between GBM and normal tissues, we identi ed key DEAS events involved in tumorigenesis. A total of 2697 DEAS events detected in 2206 gene symbols were collected, including 1054 differentially upregulated AS events and 1643 differentially downregulated AS events. These results comprised 107 AAs in 101 genes, 116 in 110 genes, 695 in 568, 634 in 552, 1008 in 748, 23 in 23, and 114 in 104, as shown in Figure 2B. These data indicated that in GBMs, ES was the most frequent DEAS event, and ME was the rarest. Furthermore, there were 6170 upregulated genes and 4638 downregulated genes in TCGA GBM dataset. We also found that the DEAS events could compensate for the shortcomings of differential expression analysis; for example, we compared the genes involved in the differential AS event with the differentially expressed genes, and found that 234 genes were recognized, as shown in Figure 3A. Through observation, we found that there were no common transcription factors in these 234 genes. However, we used FUNRICH software to analyze and found that 234 genes have some common transcription factors, such as GATA1, POU3F4, MSX1, LHX3, NHLH1, and HENMT1 (p<0.05) ( Figure 3B).

DEAS events mainly are enriched in neuron signaling and metabolism-related processes in GBM
Subsequently, 2697 DEAS events detected in the 2206 gene were pro led using GO functional enrichment analysis. In biological processes, the top three enriched terms were axonogenesis, modulation of chemical synaptic transmission, and regulation of trans-synaptic signaling. In the cellular component, the top three enriched terms were cell leading edge, asymmetric synapse, and neuron to neuron synapse. Regarding molecular function, the top three enriched terms were actin binding, cadherin binding, and tubulin binding. as shown in Figure 3C. Taken together, the DEAS events were mainly enriched in synaptic genesis and transmission.
A Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis revealed several important pathways. The top ve enriched pathways were the HIF-1a signaling pathway, Rap1 signaling pathway, axon guidance, regulation of actin cytoskeleton, and insulin signaling pathway ( Figure 3D). Taken together, the DEAS genes regulated by DEAS events were mainly involved in metabolism-related processes. These results indicate that splicing some core genes involved in neuronal signaling and metabolism-related processes may be important in GBM.

Identi cation of DEAS events associated with survival of GBM patients
Using the DEAS event pro les in the GBM cohort, we identi ed 135 DEAS events that were signi cantly associated with the OS of GBM patients (p<0.05) by univariate Cox regression analysis (Table 1). In particular, we found genes with potentially more than one DEAS event that was signi cantly associated with patient survival. To better visualize the intersecting sets, an UpSet plot was created, as shown in Figure 3E. Interestingly, our analysis revealed that genes can exhibit up to seven types of DEAS events that were all found to be signi cantly associated with GBM patient survival. In addition, via FUNRICH software analysis, we found that 135 DEAS events were associated with the OS of GBM patients are not only related to central nervous system development, but are also closely related to other kidney, stomach, breast, skin, hematopoietic, and lymphoid tissues in cosmic analysis. ( Figure 3F).
We drew a volcano map based on the obtained data, which clearly showed that DEAS events related to prognosis accounted for the majority ( Risk score as a low-risk factor can be an independent prognostic indicator of GBM. Next, we performed a LASSO regression analysis of all types of DEAS events related to prognosis ( Figure  5A, B). The number of the other seven AS events is too small, and it is not suitable for LASSO regression analyses. The left mode of each group is the LASSO coe cient curve, and the right mode of each group is the choice of the adjustment parameter (λ) in the LASSO model. The y-axis represents partial likelihood deviations. The lower x-axis indicates the logarithm (λ), and the upper x-axis represents the average number of predictors. The red dot represents the average deviation value for each model with a given lambda, where the location indicates the best data. As a result, the prognostic-relevant AS events were selected in all types of AS events, these AS events included 8 AP events, 5 AT events, 4 ES events, and there was NKIRAS2|40977|AP, TMEM63B|76352|AP, GSG1L|35696|AP, RPL39L|68070|AP, To determine the independent prognostic factors associated with GBM patients with DEAS events, we screened the most important DEAS events. A multiple Cox regression model with independent prognostic factors was performed for 17 DEAS events. In our data analysis, we found that the above types of DEAS events had a strong ability to predict the prognosis of GBM patients ( Figure 5C). The red line represents the high-risk subgroup, the blue line represents the low-risk subgroup. The AUC for all types of AS events was 0.826 ( Figure 5D). Figure 5E shows the distributions of survival time, status, and risk scores among the low-risk and highrisk groups. The image is divided into the upper, middle, and lower sections. The top portion indicates the distribution of risk scores, the middle part shows the survival time and status of GBM patients (green indicates alive, red indicates dead), and the bottom part is the heatmap associated with it. The black vertical dashed line in the middle indicates the best dividing line between the low-risk and high-risk groups. DEAS events, including the AP of GSG1L and AT of CDKL3, were associated with poor prognosis. DEAS events including AP of TMEM63, ES of CPEB4, AT of FAM13A, AT of OBSL1, AT of HNF1A, and AT of CCDC40 were associated with good prognosis.
As shown in Figure 5F, we performed univariate and multivariate Cox regression analyses of all groups of prognosis-related AS events in the forest plot. We analyzed the data using Univariate Cox regression analysis showed that age (reference low), IDH mutation (reference high), and risk score (reference low) of the four groups were used as independent prognostic indicators of GBM (P<0.05). Multivariate Cox regression analyses also demonstrated that the risk score (reference low) of the four groups was an independent prognostic indicator of GBM (P<0.05).

Construction of Potential SF-AS Regulatory Network
To explore the upstream mechanism of AS regulation, we compared the identi ed SF genes with the genes obtained from TCGA and extracted the expression data of SF genes from the latter. We screened a total of 49 SF-related genes and constructed a regulatory network of DEAS and SF genes (IcorI> 0.8, P<0.001) using Cytoscape, and integrated the analyses with the risk factors ( Figure 6A). Our analysis identi ed nine key SFs, including ELAVL4, CELF3, CELF5, RBFOX1, YWHAH, STXBP1, CELF4, ELAVL2, and DNM1. Interestingly, we found that 49 SF-related genes were mainly enriched in the regulation of nucleobase, nucleoside, nucleic acid metabolism, and protein metabolism in biological processes. Regarding molecular function, 49 SF-related genes were mainly enriched in the structural constituents of ribosomes and RNA binding. In the biological pathway, 49 SF-related genes were mainly enriched in gene expression, RNA metabolism, and insulin synthesis and processing( Figure 6B). Meanwhile, we use FUNRICH software to nd Biological pathway enrichment analysis of SF ( Figure 6C)and analysis SF Site of expression( Figure 6D).

Immune-related hub genes
By intersecting 135 OS-DEAS with the lists of immune-related genes obtained from ImmPort and InnateDB, two hub immune-related genes, NR1H3|15695|AP and PIK3R2|48396|AT, are related to prognosis( Figure 6D). We found that NR1H3 was highly expressed in GBM( Figure 6E). The cg07298473 methylation site was related to NR1H3 expression( Figure 6F). Among the 25 methylation sites, 18 methylation sites showed low expression and seven methylation sites showed high expression( Figure  6G).

Discussion
AS is the basic mechanism of gene expression regulation in eukaryotes, which increases the complexity of gene expression, promotes higher transcription e ciency, and protein diversity [15]. It has been found that the dysregulation of DEAS in cancer-related genes is involved in many biological processes of tumors, and these abnormally regulated genes can be used as molecular markers for cancer prognosis and treatment [16,17]. However, a comprehensive analysis of DEAS signatures in GBM remains unknown.
Through analysis of TCGA database, we identi ed AS events and regulatory splicing factors in GBM to fully understand their differential RNA splicing patterns. In total, 2697 differentially AS events involving 2206 genes were identi ed, of which 1054 were upregulated and 1643 were downregulated. We then evaluated the potential functions and pathways of DEAS-associated genes by performing enrichment analysis. The pathways involved in neuronal signaling and metabolism-related processes It is worth noting that immune-related pathways may be involved in the tumorigenesis of GBM.
To further evaluate whether a speci c DEAS can be used as a prognostic indicator of GBM, we established a prognostic model based on the individual AS pattern. A total of 135 DEAS events were found to be signi cantly associated with OS in GBM patients. Interestingly, among these DEAS events related to OS, our analysis also includes some splice variants that have been identi ed to play an important role in tumor biology. For instance, NKIRAS2, which was identi ed as an IκB-interacting small GTPase, has been reported to induce cytokine-induced NF-κB activation [18]. NKIRAS2 is involved in glioma apoptosis [19]. The protein encoded by the gene SETD6 encodes a methyltransferase that adds a methyl group to the histone H2AZ, which is involved in nuclear receptor-dependent transcription [20]. The protein encoded by CSPG5 is a proteoglycan that may function as a neural growth and differentiation factor [21,22]. GRIA1 is the main excitatory neurotransmitter receptor in the mammalian brain and is activated in a variety of normal neurophysiological processes. These receptors are heteromeric protein complexes with multiple subunits, each subunit has a transmembrane region, and all are arranged to form ligand-gated ion channels [23]. The protein encoded by this gene is an auxiliary subunit of the AMPA ionotropic glutamate receptor. AMPA receptors mediate rapid synaptic neurotransmission in the central nervous system. It has been reported that this protein interacts with the type I AMPA receptor regulatory protein isoform γ-8 to control the assembly of the hippocampal AMPA receptor complex, thereby regulating receptor gating and pharmacological effects. Alternative splicing leads to the development of multiple transcript variants [24,25]. Splicing events such as CCDC40 and UBXN11 play a key role in tumor biology [26,27].
In addition, we constructed a predictive model for each splicing type using multivariate Cox regression analysis. The AUC of all types of AS events was 0.826, which could provide patients with new predictive information and facilitate patient assessment. Given that the regulation of key splicing factors may lead to changes in a wide range of AS events, we further analyzed survival-related AS splicing factors and their related networks, and found that 49 splicing factors are signi cantly related to the overall survival of patients. Interestingly, it was found that most of the splicing factor expression levels associated with poor survival were upregulated.
Next, we took the intersection of 135 OS-DEAS and the immune-related gene list obtained from the immune port and innate DB, and obtained two hub immune-related genes. Interestingly, we found that NR1H3|15695|AP may affect the expression of NR1H3 through methylation modi cation.

Conclusion
Here, we describe the most comprehensive and up-to-date study to assess prognostic factors and survival outcomes for GBM by analyzing abnormal AS variants, splicing factor regulatory networks, and related molecular cluster subtypes. Although the prognostic signi cance of these potential therapeutic targets for GBM still requires further functional and clinical trials to validate the ndings, this research enhances the potential of new clinical biomarkers and therapeutic approaches to help implement new treatment strategies for GBM. Tables   Table 1: The detailed information of the top 20 OS-DEAS events. Figure 1 Flow chart for analyzing GBM alternative splicing in large-scale RNA-Seq data. The RNA sequence data and the corresponding clinical information of the GBM cohort can be downloaded from TCGA data portal website. After excluding patients with incomplete clinical data and repeated data, we combined these data sets into a large-scale GBM cohort, and then further analyzed them via SpliceSeq. Based on the PSI value of each AS event, we determined the DEAS between GBM and normal tissues, and further studied the parental genes of these AS events through enrichment analysis. Next, we visually analyzed the interaction network of these parental genes and the regulatory network of DEAS and splicing factors.

Figures
Finally, to evaluate the prognostic value of each DEAS event in GBM, Cox regression and Kaplan-Meier analysis were used to determine their impact on OS.