Systematic Proling of Survival-Associated Alternative Splicing Events in Adrenocortical Carcinoma

Background: Aberrant alternative splicing (AS) is involved in many oncogenic processes and systematic analysis of survival-associated aberrant AS events has been reported in many cancers. This study aims to systematic proling the AS signature in Adrenocortical carcinoma (ACC). Methods: Data of ACC were downloaded from TCGA and TCGA SpliceSeq. Clinical information and AS events data were integrated with the same TCGA ID. Then, we performed univariate Cox analysis to identify survival-related AS events. Lasso regression and multivariate Cox analysis were used to establish prognostic model. In addition, several bioinformatics analyses were conducted to identify pathways enriched by genes of survival-associated AS events and construct splicing-factor-regulated network. Results: A total of 77 patients with complete clinical information and PSI values of AS events were included in the present study. We detected 3781 AS events in 2366 genes were associated with overall survival of ACC patients. All the predictive models showed eciency in distinguishing good and poor outcomes of ACC patients. All the AUCs of predictive models were greater than 0.7. Functional analysis genes with survival-associated AS events suggested that the POLR2H, TCEB2, PSMA1, PSMD11 and SKP2 ranked at the core. The splicing-factor-regulated network revealed the potential regulatory mechanisms of AS events in ACC. Conclusions: Our systematic proling of survival-associated AS events in ACC patients provides novel molecular alternations and contributes to decipher the underlying mechanisms of AS in oncogenesis of ACC.


Methods
Data collection and collation RNA-seq data and matched clinical data of ACC cohort were downloaded from TCGA GDC data portal (https://portal.gdc.cancer.gov/repository). There are seven types of AS events, including alternate acceptor site (AA), alternate donor site (AD), alternate promoter (AP), alternate terminator (AT), exon skip (ES), mutually exclusive exons (ME), and retained intron (RI) [31]. To quantify AS events, percent spliced in (PSI) (ranging from zero to one) values for the AS events of ACC were used [18,32]. Seven splice event types with a PSI value of more than 75% in ACC samples were downloaded from TCGA SpliceSeq (https://bioinformatics.mdanderson.org/TCGASpliceSeq/PSI download.jsp).
When obtained the data of PSI values, we removed the AS events with mean value < 0.05 or standard diversion < 0.01. We integrated clinical information and AS events data with the same TCGA ID.

Survival analysis
A total of 77 ACC patients with overall survival (OS) >90 days were included in this study. UpSet, a visualization technique [33], was used to quantitative analysis of intersecting sets between different types of AS. We performed univariate Cox analysis on all of AS events to calculate the association between the PSI value of each AS events and the OS of ACC patients. Those AS events with P-values < 0.05 were identi ed as prognosis-related AS events. Then, the selected AS events were screened by the least absolute shrinkage and selection operator (Lasso) regression. Lasso regression is a statistical method that determines the best factors to use for prediction models [34,35] by using the "glmnet" and "survival" packages in R. We established prediction model for ACC patients by using the multivariate Cox analysis with the "survival" package in R. To further assess the predictive accuracy and sensitivity of the prediction model, we performed the receiver operating characteristic (ROC) curves analysis with the survivalROC package in R. Univariate and multivariate Cox regression were performed to determine the splicing-based prognostic signature as an independent prognostic factor.
Bioinformatics analyses for the genes of survival-associated AS events To explore the molecular characteristics for the genes with survival-associated AS events, we used "STRING" version 11.0 (http://string-db.org/cgi/input.pl) [36] to conduct bioinformatics analyses, which including protein-protein interaction (PPI), Gene ontology (GO), and the Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis.

Splicing factors and AS regulatory network
To explore the regulation of splicing factors (SFs) on prognosis-related AS events, we obtained 404 SFs which was previously reported [37]. The level 3 mRNA-seq data of SFs genes were curated from the TCGA dataset. Univariate Cox analysis was performed on the prognosis-related AS events and SFs and then constructed the f SF-AS regulatory network according to the following conditions: P value less than 0.05 and Pearson correlation coe cient more than 0.65. The regulation network was plotted via Cytoscape version 3.7.1.

Statistical analysis
R version 3.6.1 was used for all statistical analysis. For all statistical methods, p values less than 0.05 were considered signi cant.

Results
General information of ACC patients in the TCGA cohort As shown in Table 1, 92 tumors with clinical information was downloaded from TCGA. In our study cohort, the median age at diagnosis was 47.16 years old, patient age ranged from 14 year to 83 years.
Overview of AS events and survival associated AS events in TCGA ACC cohort Comprehensive AS events in seven splicing types, including AA, AD, AP, AT, ES, ME, and RI were summarized for ACC. A total of 34,420 AS events in 8994 genes were detected from TCGA SpliceSeq dataset, indicating that one gene may undergo multiple AS events simultaneously. In detail, 2707 AAs in 1960 genes, 2382 ADs in 1688 genes, 6342 APs in 2575 genes, 8201 ATs in 3575 genes, 12,269 ESs in 5337 genes, 124 MEs in 122 genes and 2395 RIs in 1605 genes (Fig. 1A).
Univariate Cox analysis revealed that 3781 AS events in 2366 genes were associated with ACC OS rates (P<0.05). In detail, as shown in Fig. 1B, 199 AAs in 186 genes, 248 ADs in 221 genes, 679 APs in 423 genes, 1224 ATs in 724 genes, 1184 ESs in 937 genes, 8 MEs in 8 genes and 239 RIs in 209 genes were identi ed as survival-associated AS events. The UpSet plot (Fig. 1C) vividly revealed that one gene could undergo multiple AS events simultaneously.

Molecular characteristics of survival-associated AS events
The distributions of AS events signi cantly associated with patient survival are shown in Fig. 2A. We displayed the top 20 (if available) most signi cant survival-associated AS events for each AS type in Fig.  2B-H. To explore the molecular characteristics of genes with the top 50 most signi cant survivalassociated AS events (if available), we performed several bioinformatics analyses. Firstly, we established a PPI network to reveal the relationships among these genes. As shown in Fig. 3A-B, POLR2H, TCEB2, PSMA1, PSMD11 and SKP2 ranked at the core in the network. According to the functional enrichments of these genes, "intracellular membrane-bounded organelle", "mitochondrion", "membrane-bounded organelle", "organelle part" and "intracellular organelle part" were the ve most signi cant cellular component terms (GO) (Fig. 4A). For biological process terms (GO), "metabolic process", "cellular metabolic process", "nitrogen compound metabolic process", "intracellular transport" and "organic substance metabolic process" were the ve most signi cant enrichments (Fig. 4B). There were no signi cant pathway enrichments observed in molecular function (GO). Finally, we observed that "Thermogenesis" was the only signi cant pathway (FDR=0.032) correlated with these genes in KEGG pathway analysis.

Prognostic predictors for ACC patients
We used the Lasso regression and multivariate Cox regression analysis to generate prognostic models  Table 2) following univariate Cox. Then, we divided ACC patients into low and high risk groups based on median values to analyze the e cacy of prognostic models by using Kaplan-Meier (K-M) method. As shown in Fig. 6A-H, all the prognostic models could predict good and poor outcomes of ACC patients. ROC curves validated the e ciency of these prognostic models (Fig. 6I). To further elucidate the independent prognostic signi cance of PM-ALL, univariate and multivariate Cox regression analyses were performed. After adjusting for the clinical factors, the PM-ALL remained an independent prognostic factor for ACC patients, with an HR of 1.012 (95%CI: 1.003-1.020, P=0.007) ( Table 3).

Network of survival-associated AS genes and SFs expression
SFs are RNA-binding proteins that recognize cis-regulatory elements within the pre-mRNA to in uence exon selection and splice site choice (38). SF alternations are a hallmark of cancer. Therefore, we explored the interaction networks of survival-associated AS genes and SFs. Firstly, we found 20 SFs related to survival that could be used as independent prognostic factors by using K-M method and Cox regression analysis (Table 4). Next, correlation analyses between the expression of these 20 SFs and the PSI values of survival-associated AS events were performed by using Pearson's correlation analysis (cor > 0.6, P < 0.001). Correlation plots were then generated using Cytoscape 3.7.1. The results showed that the expression of 19 survival-related SFs (triangular nodes) were correlated with 206 survival-associated AS events (Fig. 7). Overall, 97 AS events were correlated with favorable OS (rea ovals) and 109 AS events were correlated with poor OS (green ovals).

Discussion
Aberrant AS is involved in the development process of cancer [23][24][25]. TCGA RNA sequencing data and TCGA SpliceSeq data have enabled investigation of AS patterns in many different kinds of cancers, such as breast cancer (28), ovarian cancer [29]. Growing evidence has shown that AS has emerging potential therapeutic targets and biomarkers in cancer therapy [16,22,26]. However, comprehensive information concerning dysregulation AS in ACC, which is an endocrine malignancy with rapid progress and poor prognosis and lacks selective and e cacious treatment options, is lacking.
In this study, we rst found that 34,420 AS events in 8994 genes in the TCGA ACC cohort, among which 3781 AS events in 2366 genes were related to survival (P < 0.05). Among these survival-associated AS events, some splice variants may play critical roles in oncogenic processes, such as the AA variant of ZFAND6, the AD variant of ZSCAN18, the AP variant of CMC2, the ES variant of CIRBP, the ME variant of THNSL2 and the RI variant of CIRBP. Many splicing isoforms of these genes have been found to be related to survival in various types of tumors, such as papillary thyroid cancer [39], ovarian cancer [29], esophageal adenocarcinoma and esophageal squamous cell carcinoma [40].
Next, given the molecular function of AS events is partly described by the downstream functional impact, we conducted PPI network analysis. We found POLR2H, TCEB2, PSMA1, PSMD11 and SKP2 ranked at the core in the network. As THE HUAMAN GENE DATABASE GeneCards displays that POLR2H encodes an essential and highly conserved subunit of RNA polymerase II that is shared by the other two eukaryotic DNA-directed RNA polymerases, I and III and alternative splicing of POLR2H generates multiple transcript variants. To our knowledge the related study of POLR2H was rare, but it has been shown to be associated with the occurrence and progression of prostate cancer [41]. TCEB2 (also known Elongin B, ELOB) encodes the protein elongin B, which is a subunit of the transcription factor B (SIII) [42,43]. Deng et al. reported that TCEB2 Confers Resistance to VEGF-targeted Therapy in ovarian cancer [44]. As many problems of ACC are still unresolved, such as disease prevention and earlier detection, these new discovered core genes may provide new insights for us.
In addition, we proposed the predictive model for each splice type and all available splice types of AS events using Lasso regression and multivariate Cox regression. All the predictive models showed e ciency in distinguishing good and poor outcomes of ACC patients. All the AUCs of predictive models were greater than 0.7, suggesting good power and potential in application of prognosis prediction for ACC patients. However, the e ciency should be veri ed by another independent cohort.
Alterations in activity of regulatory SFs are an important mechanism of aberrant AS in cancer [45].
Therefore, we identi ed 20 survival-associated SFs in ACC patients, such as KHSRP, SRSF7, SRSF2 and HNRNPA2B1. Serine/arginine (SR) proteins and heterogenous ribonuclear proteins (hnRNPs) are the two key families of SFs. Many studies have con rmed that SR and hnRNPs involve in tumorigenesis, such as isoforms PKM2 or TP53β of SRSF3 alters cell metabolism and induces cellular senescence [46,47]. SRSF7 is upregulated in lung cancer, and its knockdown impacts cell proliferation [48]. In glioblastoma, hnRNPA2/B1 mediates its tumorigenic effect through alternative splicing of key oncogenes and tumor suppressors [49]. Moreover, splicing correlation network revealed that multiple AS events were correlated with SFs expression in ACC. These ndings might provide new insight into the mechanisms underlying the development and progression of ACC.

Conclusions
The current study conducted a comprehensive analysis base of survival-associated AS events in ACC patients, which might contribute to uncover the function of AS events in ACC. Moreover, the identi cation of prognostic SFs and construction of the correlation networks between SFs and AS events will pave the way for in-depth exploration of splicing-related mechanisms in the oncogenic process of ACC. This comprehensive analysis AS events and SFs also provided valuable therapeutic targets that require further validation.

Declarations Ethics approval and consent to participate
The study was approved by the Institutional Review Board of the A liated Hospital of Southwest Medical University, and was performed following the TCGA publication guidelines.

Consent for publication
Not applicable.

Availability of data and materials
Not applicable.

Competing interests
The authors declare that they have no competing interests.