3.1 Overview of AS events profiling in GBM
To profile the AS events in GBM, seven types of AS events present in GBM patients were analyzed (Figure. 1A). A total of 45610 AS events from 21134 genes were detected, which indicated that one gene was associated with approximately two AS events. We detected 3827 AA events in 2684 genes, 3269 AD events in 2269 genes, 8686 AP events in 3476 genes, 8456 AT events in 3696 genes, 18360 ES events in 6933 genes, 184
ME events in 180 genes, and 2828 RI events in 1896 genes. ES events were the most frequently detected, accounting for nearly half of the total AS events. The frequency of ME events was lower than that of other types of AS events (Figure. 1B).
3.2 Identification of prognostic AS events in GBM
The relationship between AS events and the prognosis of GBM patients was examined by univariate Cox regression analysis. Among the 45610 AS events, 1643 were identified as survival-associated AS events (P < 0.05). Because one gene might have more than one event, we visualized the intersecting survival-associated AS events using UpSet PlotR. The results showed that one gene could have up to three types of AS events significantly associated with GBM survival (Figure. 1C). Construction of a gene network by Reactome of Cytoscape including the 300 most significant survival-associated genes identified several genes, such as UBC, STAT3, and FYN, that could serve as the hub genes in the network. The pathways mediated by these genes may play critical roles in regulating the AS network (Figure. 1D).
3.3 Cluster analysis based on prognostic AS events in GBM
To better predict the prognosis of GBM patients based on the scale of AS events, consensus unsupervised clustering was performed based on the 1643 prognostic AS events. We found that the optimal number of clusters was five (Figure. 2A). GBM patients were assigned into five clusters as follows: cluster 1 (n = 100, 62.5%), cluster 2 (n = 24, 15%), cluster 3 (n = 26, 16.25%), cluster 4 (n = 7, 4.375%), and cluster 5 (n = 3, 1.875%). Because of the small number of patients in clusters 4 and 5, we performed survival analysis including clusters 1–3. Patients in cluster 2 showed significantly better survival than those in the other two clusters (Figure. 2B). The results of proportional analysis to characterize the molecular subtype of AS clusters suggested that cluster 2 was robustly correlated with the proneural subtype (Figure. 2C, D), which may be related to the better prognosis of cluster 2. However, assessing the prognosis of the other four clusters was challenging.
3.4 Regulatory network between prognostic splicing factors and prognostic AS events
AS profiles are regulated by several splicing factors that affect exon selection and splicing site choice by binding to pre-mRNAs [18]. To explore the correlation between the prognostic AS events and known splicing factors, data on splicing factors was obtained from SpliceAid2 and TCGA databases. Univariate Cox regression analysis showed that only five splicing factors (KHDRBS2, TIA1, TIAL1, ZRANB2, and PTBP2) were significantly associated with overall survival (Supplementary Table S1). The relationship between the expression of the five splicing factors and the PSI value of prognostic AS events was determined to build a splicing regulatory network (Figure. 2E). The splicing regulatory network identified 57 survival-associated AS events, including 39 unfavorable events (orange dots) and 18 favorable events (green dots), that were significantly correlated with the five splicing factors (purple dots). Most of the favorable AS events were positively correlated with splicing factors (green lines), whereas most unfavorable AS events had a negative correlation with splicing factors. For example, ZRANB2, one the most significant survival-related splicing factors, was positively correlated with the ERCC5 RI event, whereas it exhibited a significant negative correlation with C19orf53 of AP, which were favorable and adverse prognostic AS events, respectively (Figure. 2F, G). Kaplan-Meier survival analyses indicated that ZRANB2 was marginally correlated with survival regardless of the cutoff value, such as the tripartite point and the quarter point (Figure. 2H–J). The remaining four splicing factors showed similar relationships (data not shown). These data indicate that the splicing factors were not accurate predictors of survival in GBM patients.
3.5 Development of an individualized prognostic AS signature
Because a signature is more accurate for predicting overall survival than a single event, individual signatures were constructed for seven AS events to improve prognosis prediction in GBM. For each type of AS event, the top three prognostic AS events were identified according to the univariate Cox regression results (Figure. 3A–G). The results of the analysis of the 21 AS events are summarized in Table 1. The AS signatures were calculated as follows: risk score = (lnHR of the top AS event × PSI value of the top AS event + lnHR of the top two AS events × PSI value of the top two AS events + lnHR of the top three AS events × PSI value of the top three AS events). We further combined the seven signatures into one model and calculated the risk score as follows = (lnHR of AA × AA score + lnHR of AD × AD score + lnHR of AP × AP score + lnHR of AT × AT score + lnHR of ES × ES score + lnHR of ME × ME score + lnHR of RI × RI score). The median risk score was used as the cutoff to divide the patients into high or low risk groups. The eight predictive models indicated that the survival status of the high-risk groups was worse than that of the corresponding low risk groups (Figure. 3A–G).
Table 1
Top 3 survival related events of seven types of AS.
AS Events_ID | Gene Symbol | Type | HR | Adj P value | LnHR |
ID_19897 | CHD4 | AA | 0.0381 | 0.000849 | -3.266 |
ID_30617 | DMXL2 | AA | 0.094 | 0.0025 | -2.364 |
ID_64840 | USP19 | AA | 0.0457 | 0.0086 | -3.085 |
ID_48995 | ZNF302 | AD | 0.0176 | 0.00027 | -4.04 |
ID_41811 | TMUB2 | AD | 0.0897 | 0.0044 | -2.41 |
ID_14562 | SERGEF | AD | 8.444 | 0.00526 | 2.133 |
ID_76351 | TMEM63B | AP | 0.0253 | 0.000317 | -3.679 |
ID_3657 | DDAH1 | AP | 40.938 | 0.000557 | 3.712 |
ID_74448 | MAT2B | AP | 0.0135 | 0.000933 | -4.307 |
ID_78181 | SYNE1 | AT | 0.0035 | 0.00032 | -5.642 |
ID_44016 | CCDC40 | AT | 88.91 | 0.0016 | 4.488 |
ID_72688 | RPS23 | AT | 0.0121 | 0.0019 | -4.413 |
ID_29531 | C14orf2 | ES | 6.337 | 0.0001 | 1.847 |
ID_46873 | HSD11B1L | ES | 36.32 | 0.00016 | 3.592 |
ID_127358 | MRPS28 | ES | 17.127 | 0.00035 | 2.841 |
ID_10258 | TTC13 | ME | 0.211 | 0.0133 | -1.554 |
ID_100824 | RPE | ME | 0.248 | 0.0167 | -1.394 |
ID_35718 | CLN3 | ME | 26.391 | 0.0199 | 3.273 |
ID_10337 | COA6 | RI | 0.0616 | 0.0018 | -2.787 |
ID_85333 | SLC45A4 | RI | 0.108 | 0.0035 | -2.222 |
ID_69370 | HOPX | RI | 7.352 | 0.0042 | 1.995 |
Abbreviations: AS, alternative splicing; AA, Alternate Acceptor Site; AD, Alternate Donor Site; AP, Alternate Promoter; AT, Alternate Terminator; ES, Exon Skip; ME, Mutually Exclusive Exons; RI, Retained Intron; HR, hazard ratio.
Kaplan-Meier survival analyses according to the eight predictive signatures showed that survival time was considerably shorter for high risk patients than for low risk patients (Figure. 4A–H). ROC curves for 1-year survival were drawn to estimate the specificity and sensitivity of each signature. The results showed that the AS signatures had good predictive ability, and the combined model including all signatures had a better predictive accuracy than each individual signature (Figure. 4I). The potential of the AS-associated signatures as independent predictive factors was assessed by univariate and multivariate Cox analyses. Classical survival-related clinical parameters were identified in the univariate Cox regression analysis (Supplementary Table S2). The results of multivariate Cox analyses indicated that the eight signatures were independently correlated with GBM survival after adjusting for age, IDH1 mutation status, MGMT methylation status, chemotherapy, and radiotherapy (Table 2). These findings suggest that AS-associated signatures can predict the survival of GBM patients in a stable and independent manner.
Table 2
Cox Regression Analysis of TCGA RNA seq database.
(Adjusted by Age,IDH1,MGMT,Chemotherapy and Radiotherapy) | |
|
Multi-Variate Regression | |
Variable | HR | P value | |
| | | |
AA signature | 1.939 | 0.017 | |
(High RiskScore VS Low RiskScore) | | | |
AD signature | 1.632 | 0.048 | |
(High RiskScore VS Low RiskScore) | | | |
AP signature | 2.332 | 0.00015 | |
(High RiskScore VS Low RiskScore) | | | |
AT signature | 1.725 | 0.0409 | |
(High RiskScore VS Low RiskScore) | | | |
ES signature | 1.736 | 0.0033 | |
(High RiskScore VS Low RiskScore) | | | |
ME signature | 3.299 | 5.50E-03 | |
(High RiskScore VS Low RiskScore) | | | |
RI signature | 2.895 | 3.20E-04 | |
(High RiskScore VS Low RiskScore) | | | |
ALL signature | 1.979 | 1.70E-07 | |
(High RiskScore VS Low RiskScore) | | | |
Abbreviations: TCGA, The Cancer Genome Atlas; AS, alternative splicing; AA, Alternate Acceptor Site; AD, Alternate Donor Site; AP, Alternate Promoter; AT, Alternate Terminator; ES, Exon Skip; ME, Mutually Exclusive Exons; RI, Retained Intron; HR, hazard ratio.
3.6 The AP signature was identified as an immune relevant signature
To assess the association between the seven predictive signatures and GBM molecular subtypes, the risk scores for four molecular subtypes were analyzed. The AP risk score was significantly increased in the mesenchymal subtype (Figure. 5A), which is an immune relevant subtype [19]. PCA based on immune related genes showed that the distribution pattern of the mesenchymal samples was different from that of non-mesenchymal samples, suggesting that the immune status of the mesenchymal subtype differed from that the other three subtypes (Figure. 5B). GO and KEGG analyses of significantly upregulated genes in the mesenchymal subtype were performed in the DAVID website, and the enrichment results showed that the most relevant biological processes were immune-related terms such as immune response and leukocyte chemotaxis (Figure. 5C).
The close association between the AP risk score and the mesenchymal subtype indicated that the AP signature could be considered as an immune-related AS signature. To validate this hypothesis, differentially expressed genes between high AP risk and low AP risk patients were identified, and the ClueGo plugin of Cytoscape was used to analyze the functional enrichment. The results showed that the most relevant biological processes were immune response and immune cell chemotaxis (Figure. 6A). Moreover, we depicted the most common somatic mutation events in GBM stratified by AP score (Figure. 6B). The results showed that the low AP score group occupied more IDH1 mutation, which was another indication of that AP score involved with immune response and microenvironment (Fig. 6C). These findings were confirmed by GSEA (Figure. 6D–F). Consistent with our previous study [4], tumor purity was a key factor affecting the prognosis of glioma patients. To explore the effect of the AP risk on the composition of the immune microenvironment, tumor purity was assessed for each patient, as well as the enrichment score of several hub non-tumor cells. Correlation analysis showed a significant negative correlation between the AP risk score and GBM purity (Figure. 6G). Analysis of the correlations between AP risk and the scores of eight hub non-tumor cells suggested that neutrophils were the most relevant components of the microenvironment involved in AP risk (Figure. 6H). These findings indicated that the AP signature may predict the prognosis of GBM through its effect on the immune response and the composition of the immune microenvironment.