Association between AS events and cervical cancer survival
In total, 41426 AS events were detected from 19696 genes of 256 CESC patients and classified into seven AS types (Table 1). The events were mainly enriched in ES, followed by AT and AP. In detail, we detected 15942 ESs in 6271 genes, 8395 ATs in 3663 genes, 8066 APs in 3256 genes, 3424 AAs in 2397 genes, 3017 ADs in2106 genes, 2373 RIs in 1801 genes, and 209 MEs in 202 genes (Fig. 1A). On average, one gene might possess up to six types of AS. To explore the relationship between AS events and OS of CESC patients, all AS events mentioned above were subjected to the univariate Cox regression analysis, and 3708 survival-associated AS events in seven AS types in 3355 genes (Table 2) which we identified as significant AS events (Fig. 1B) were screened out.
Table 1
AS events were detected and classified into seven AS types.
Type
|
AS events
|
Genes
|
AA
AD
AP
AT
ES
ME
RI
Total
|
3424
3017
8066
8395
15942
209
2373
41426
|
2397
2106
3256
3663
6271
202
1801
19696
|
Table 2
3708 survival-associated AS events in seven AS types in 3355 genes.
Type
|
AS events
|
Genes
|
AA
AD
AP
AT
ES
ME
RI
Total
|
3424
3017
8066
8395
15942
209
2373
41426
|
2397
2106
3256
3663
6271
202
1801
19696
|
Biological terms enriched in AS-related genes in cervical cancer
Since we identified 3708 survival-associated AS events in seven AS types in 3355 genes, to explore possible pathways for these AS events, the parent genes of survival-associated AS events were subjected to functional enrichment analyses. GO analysis was performed to investigate the biological functions and pathways enriched in these genes as well as the interaction network among them. In the aspect of biological processes, the genes were mostly enriched in Golgi to lysosome transport, negative regulation of mitotic nuclear division and negative regulation of nuclear division (Figure S1A). In the aspect of cellular component, the genes were mostly enriched in lysosome, mitochondrion and lysosomal lumen (Figure S1B). In the aspect of molecular functions, the genes were mostly enriched in death domain binding, nucleoside-diphosphatase activity and GTPase activator activity (FIGS1C). KEGG analysis showed that the genes were mostly enriched in glycosaminoglycan degradation, glyoxylate and dicarboxylate metabolism and NF-kappa B signaling pathway (Figure S1D). Molecular Complex Detection (MCODE) was used to screen out the modules in the protein-to-protein network. PPI network propped by these genes revealed that the AS events were engaged in nine MCODE components (Figure S2A). Among these components, the top related cluster was composed of 12 nodes and 36 edges (Figure S3A), the second of 9 nodes and 17 edges and the third of 4 nodes and 6 edges (Figure S3B-C). The top related cluster was transferred to functional analysis. In the aspect of biological process, the genes were mostly enriched in regulation of oxidative stress-induced neuron intrinsic apoptotic signaling pathway, regulation of osteoclast development and regulation of DNA endoreduplication (Figure S4A). In the aspect of cellular component, the genes were mostly enriched in AP-2 adaptor complex clathrin coat of endocytic vesicle and endolysosome membrane (Figure S4B). In the aspect of molecular function, the genes were mostly enriched in endocytic adaptor activity, clathrin adaptor activity and nitric-oxide synthase binding (Figure S4C). KEGG analysis showed that the genes were mostly enriched in ubiquitin mediated proteolysis, endocytosis and endocrine and other factor-regulated calcium reabsorption (Figure S4D). The top 10 hub genes by PPI network were screened according to their degree values, which were significantly associated with AS events (Figure S2B). Among them, UBA52 connecting 38 nodes showed the most prominence. The distributions of survival-associated AS events were displayed in Fig. 6A. Among them, the MEs have no statistical significance. The 20 most significant prognosis-related AS events were also shown (Fig. 2B-G).
Prognostic models constructed with seven types of AS events for CESC
To investigate the relationship between each AS event and the prognosis of CESC patients, we constructed proportional hazards regression analysis on genes related to AS events. With a cutoff of p < 0.05, 402 genes with possible prognostic significance were screened out, including 33 of AD, 155 of AP, 103 of ES, 71 of AT, 15 of RI, and 25 of AA (TABLE S1).
These prognosis-related AS-event genes were further analyzed with the least absolute shrinkage and selection operator (LASSO) Cox regression algorithm (Figure S5). Then,multivariate Cox proportional hazards regression analysis was further performed to build the risk signature. Using the seven types of AS events, we constructed seven prognostic models. The risk scores were calculated (Table 3). Based on the risk score, CESC patients were divided into low-risk group and high-risk group based on the median levels of risk score. In each type of model, Kaplan-Meier survival analysis indicated that low-risk patients had significantly better overall survival than high-risk patients (Fig. 3). We also performed the receiver operating characteristic curve (ROC) analysis. As shown in Fig. 4, PI-AA demonstrated the highest capacity of estimating the prognosis of CESC patients (AUC 0.92), followed by PI-RI (AUC 0.871). The risk score, survival status and the expression of 7 types of AS-event genes in prognostic model for each patients was displayed in Figure S6-7, Fig. 5. To assess whether these model was an independent predictor of CESC, univariate and multivariate Cox regression analyses were performed, including clinical factors and risk scores. The results showed that this prognostic model showed moderate and independent prognostic power for all seven AS events (Fig. 6–7).
Table 3
Seven prognostic models based on 7 types of AS events.
Type
|
AS events
|
Genes
|
AA
AD
AP
AT
ES
ME
RI
Total
|
319
289
698
658
1471
14
259
3708
|
310
279
653
620
1237
14
242
3355
|
Type
|
Formula
|
AUC
|
AA
|
-ENTPD6|58863|AA*9.39-GATAD2A|48635|AA*13.54-ENAH|9992|AA*4.58-S100PBP|1637|AA*6.26-LY6E|85377|AA*2.75 + SLC35A2|89036|AA*8.23 + NCKIPSD|64733|AA*2.08-STX5|16450|AA*24.42-SLC25A39|41837|AA*4.14-BDKRB2|29188|AA*1.61 + KANSL3|54567|AA*4.64-KLHDC4|37952|AA*4.60-H2AFY|73449|AA*14.92
|
0.92
|
AD
|
FCF1|28425|AD*3.04 + ZCCHC9|72670|AD*5.08 + FAM115A|82126|AD*2.93-PIGT|59553|AD*2.94-TLE3|31411|AD*3.72 + NPIPB5|35571|AD*4.39 + TFG|65836|AD*3.56 + POPDC2|66348|AD*2.60 + POLM|79464|AD*6.22-NFKB2|12949|AD*4.29 + PAPOLG|53670|AD*3.18
|
0.792
|
AP
|
C1QTNF1|43985|AP*2.11 + FOXRED2|62052|AP*2.54-STK25|58383|AP*2.23 + HNRNPK|86708|AP*8.33 + PREPL|53429|AP*1.93-COMMD5|85672|AP*6.51-NT5DC2|65224|AP*1.89
|
0.807
|
AT
|
DYX1C1|30732|AT*10.05-TNNI1|9376|AT*10.37 + ZCCHC4|68961|AT*5.06-GAPVD1|87563|AT*40.73 + DAPL1|55686|AT*1.07 + HGF|80248|AT*1.71-GLT8D2|24090|AT*16.48-PGAM5|25293|AT*16.41-C4orf36|69840|AT*5.67 + YJEFN3|48656|AT*3.40-CHCHD3|81833|AT*2.82
|
0.826
|
ES
|
-TBC1D22A|62728|ES*27.75-SLC39A6|45210|ES*15.41 + GEMIN7|50397|ES*5.26 + AP2M1|67841|ES*5.23-ERBB2IP|72262|ES*1.89-YAF2|21209|ES*12.61 + GBA|8042|ES*38.51-FAM76A|1344|ES*1.63-DNAJC12|11904|ES*2.30 + TMEM51|739|ES*1.79-THAP8|49333|ES*5.57
|
0.849
|
RI
|
-SLC25A3|23859|RI*3.66 + NEIL2|82632|RI*3.33-AMDHD2|33282|RI*3.69-SH3BP1|62136|RI*18.95 + SMEK1|28882|RI*2.94-TDRKH|7663|RI*2.27 + DUOXA1|30388|RI*6.54-MAX|27938|RI*2.34-TMCO6|73700|RI*3.24 + DRG2|39549|RI*2.17-ABCD4|28376|RI*2.76
|
0.871
|
All
|
FCF1|28425|AD*4.32 + C1QTNF1|43985|AP*2.10-TBC1D22A|62728|ES*17.75 + ECE1|963|AP*3.33-SLC39A6|45210|ES*20 + GEMIN7|50397|ES*4.63-ACSS1|58860|AT*4.59-ENTPD6|58863|AA*5.89 + DYX1C1|30732|AT*10.38
|
0.865
|
Association between AS events’ risk scores and clinical features
We analyzed the association between AS event risk score and the clinical features of CESC(Figure S8). Interestingly, we found that all the AS genes had significant correlation with tumor status in both the high- and low-risk groups. Among them, AT-related genes were associated with tumor status and grade, RI-related genes with tumor status and stage.
Association between splicing factors and AS events
SFs play an important part in the occurrence and development of AS events via changing exon selection and splicing site. Therefore, it is necessary to uncover the correlation between SFs and AS events. We got 385 SFs from SpliceAid 2, and downloaded the expression profiles of these 385 genes from the TCGA database. Then we used the Pearson’s correlation matrices to calculate the relationship between the SFs and AS-events. With a cutoff of p < 0.05 and |Cor|>0.5, we constructed the SF-AS regulatory network with 43 significant SFs (Fig. 8A) (TABLE S3). The relationship between 43 SFs and CESC prognosis was determined with univariate Cox proportional hazard regression analyses. With a cutoff of p < 0.05, seven genes (SNRPA, PQBP1, CCDC12, SNRPF, CELF2, HSPA8 and TXNL4A) were screened out (TABLE S2). Then, seven genes were further submitted to LASSO Cox regression algorithm. The optimal values of the penalty parameter lambda were determined through 10-times cross-validations. The simplest (smallest parameter) model of prognostic genes signature within one standard error of the best lambda value was screened out (Fig. 8B-C). Finally,Multivariate Cox proportional hazards regression analysis screened out three genes (SNRPA, CELF2 and HSPA8) (Fig. 8D). Risk score=-0.033* SNRPA-0.34*CELF2 + 0.003*HSPA8. Based on the median levels of risk score, CESC patients were divided into low-risk and high-risk groups. Kaplan-Meier survival analysis indicated that low-risk group had better overall survival than high-risk group (Fig. 9A). In ROC curve analysis, risk score of 0.685 reaped the highest specificity and sensitivity in predicting the 5-year survival (Fig. 9B).
Meanwhile, we analyzed the association between the clinical parameters and the risk score of three genes. The univariate and multivariate Cox proportional hazards regression showed that only the risk score calculated based on three genes was an independent prognostic indicator of CESC (Fig. 9C-D). Of the three genes, SNRPA and CELF2 were protective genes to CESC. Interestingly, the AS most closely related to CELF2 was TC2N, type AP. Our analysis showed that TC2N was negatively correlated with CELF2, and univariate Cox proportional hazard regression analysis showed that TC2N was a causative factor, and the two results were consistent. Similarly, CALCOCO2, PLS3 and FAM76A developed from ES, NGLY1 from AP, COX15 and PPOX from AT; all these AS-developed genes were positively correlated with SNRPA. The results of univariate Cox proportional hazard regression analysis showed that these AS-developed genes were protective factors. PLEKHB2, PPP1CC and GALNT4 developed from AP, LYRM5 and NPIPA7 from AT, and these genes were negatively correlated with SNRPA. Univariate Cox proportional hazard regression analysis proved that they were all pathogenic factors.
Association between AS event and tumor immune microenvironment
In order to explore the role of AS-related genes in the tumor immune microenvironment, We divide the samples into high- and low-risk groups, combined with comprehensive analysis of immune cells. The correlation of riskscore and tumor immune microenvironment was shown in (Fig. 10A). Figure 10B-G indicated that Dendritic cells activated, Dendritic cells resting, Macrophages M0, Mast cells resting, T cells CD4 memory activated and T cells CD8 were most correlative with riskscore.