The Correlation Between Novel Survival-related Alternative Splicing Signature and Prognostic Prediction and Immune Microenvironment in Clear Cell Renal Cell Carcinoma


 BackgroundAlthough extensive researches related alternative splicing (AS) as the prognostic markers of patients in cancers, which remains unknown in clear cell renal cell carcinoma (ccRCC), especially in immunotherapy. Therefore, a novel survival-associated AS signature was established to predict prognosis of patients and explored its correlation with immune cell infiltration or immune checkpoint expression to explain the phenomenon of resistance to immunotherapy in ccRCC.Methodsccording to AS data, clinical information and gene expression data in ccRCC, overall survival-related AS events was identified and further the AS-related prognostic risk model established by LASSO regression and multi-Cox regression analysis was evaluated by Kaplan-Meier survival analysis, the ROC curves and a nomogram model. Then we clarified the biological processes and pathways by GSEA, and further measured the immune cell infiltration by ESTIMATE, CIBERSORT and ssGSEA. Finally we analysed the clinical features and immune features of different parental genes, and quested the splicing factors regulating riskScore-related AS events by Spearman correlation analysis.ResultsWe obtained the most significant 5 AS events, including C4orf19|69001|AT, UACA|31438|AP, FAM120C|89237|AT, TRIM16L|39629|AP and SEC31A|100881|ES, to establish the prognostic risk model, and further illustrated the stability and importance of the riskScore prognostic signatures. Then we found that in high risk group, most of the top 10 GO enrichments and the KEGG pathway were closely related to the immunity, and the higher immune cell infiltration, and higher expression of classic immune checkpoints such as PD1 and CTLA4. In addition, 6 different parental genes were obtained, including C4orf19, ARHGAP24 DNASE1L3, P4HA1, SLC39A14 and TAF1D. These 6 genes could not be the independent prognostic signatures, but the expression of these genes was closely related to immune cells infiltration and the expression of immune checkpoints. Finally, we got aberrant 52 splicing factors regulating riskScore-related AS events.ConclusionOur study discovered that overall survival-related AS events mediated by aberrant splicing factors can be constructed a prognostic risk model to predict prognosis of patiens and utilized to index the situation of immune cell infiltration and immune checkpoint expression that impact tumor immune microenvironment in ccRCC.


Introduction
Renal cell carcinoma (RCC) is a malignant tumor originated from the renal tubular epithelium, as accounting for about 90% of all malignant kidney tumors, of which is the most common form [1]. The carcinoma encompasses ten subtypes in histopathology, of which clear cell RCC (ccRCC) is the most prevalent and accounts for about 80% [2]. After partial or radical nephrectomy for patients with stage I or II in ccRCC, of which the 5-year survival rate is more than 90%, but less than 10% in those with stage IV due to lack of targeted drugs and the poor sensitivity to radiotherapy and chemotherapy [3]. Even if the combination of immunotherapy and/or targeted therapy has prolonged reasonably the effective survival time of ccRCC patients, the bene ciary group remains extremely limited and even develops into drug resistance [4]. Hence, many researchers have devoted themselves to exploring the prognostic markers of ccRCC [5,6], especially in immunotherapy, but unsatisfactorily, those results remain completely unascertained.
The prominent biodiversity and functional complexity of human body are fundamentally caused by protein diversity, of which one of the molecular mechanism is alternative splicing (AS) and processing modi cation for pre-mRNA [7]. As a post-transcriptional process, AS enables the organized generation of multiple mRNAs from pre-mRNA of a single gene, which as molecular templates are translated into diverse proteins in mitochondria [8], indicating that it plays an important role in regulating the development and metabolism of cells and even provides potential opportunities for tumorigenesis or other disease when it expresses abnormally [9]. The discovery of mutations or aberrant expression in genes encoding splicing factors (SFs), a protein binding and splicing to the pre-mRNA, provides genetic evidence of a direct link between splicing misregulation and cancer pathogenesis [10]. According to recent reports, the phenomenon of tumorigenesis caused by mutations in SFs has been discovered and aon rmed in multiple cancers, especially the most common in hematologic malignancies [11]. For instance, SF3B1, SRSF2 and U2AF1, as the most extensively studied SFs, their mutations are closely related to a variety of tumors, such as chronic lymphocytic leukemia [12], uveal melanoma [13] and many solid tumors including breast carcinoma and cutaneous melanoma [14]. Aberrant expression of SFs, by reason of copy number variation or modi cation by transcription factors such as MYC ampli ed in cancers, possess oncogenic properties by bringing about abundant AS events and engendering the generation of different protein isoforms [15,16]. So far, AS events regulated by SFs mainly involve the following seven types: alternate acceptor site (AA), alternate donor site (AD), alternate promoter (AP), alternate terminator (AT), exon skip (ES), mutually exclusive exons (ME) and retained intron (RI) [17].
Recently, increasing number of researchers pay attention to and lucubrate the AS events, as an essential regulatory mechanism, for which even have developed the effective targeted drugs in the eld of tumors, such as ovarian cancer [18] and prostate cancer [19]. In RCC, it has been reported that some AS events, such as PKM and HnRNPL, can promote the occurrence or development of tumors [20,21] and even resistance to chemotherapy drugs [22], however, mainly are focused on as the novel prognostic factors [23][24][25]. Besides, AS events are closely related to the immune microenvironment, meanwhile the feasibility and safety of splicing isoforms produced by AS as potential tumor immunotherapy targets are in-depth discussed, of which the clinical challenges are in-detail analysed [26]. Due to the limited research conditions, in the eld of tumors, it is still more researched and reported as an indicator or a marker of tumor immune microenvironment (TIME) or immune cell in ltration [27][28][29].
In this study, of which details were shown in the work ow ( Figure 1A), we constructed the prognostic risk models based on survival-related AS events in ccRCC samples from the TCGA database, then found that in high-risk patient groups, more abundance of immune cell in ltration such as regulatory T cells (Treg cells), M0 macrophages, etc., but no signi cant difference in in ltrative CD8 + T cells, and the higher expression of immune checkpoints such as PD-1, CTLA-4, etc. These comprehensive results indicate the heterogeneity of TIME and the limited effect of immune checkpoint inhibitors (ICIs) in ccRCC.

Methods And Materials 1: Data Acquisition and Processing
We downloaded gene expression data and corresponding clinical information of KIRC (kidney renal clear cell carcinoma, same as ccRCC) samples from The Cancer Genome Atlas (TCGA) data portal website (https://portal.gdc.cancer.gov/) [30]. Alternative splicing (AS) data from KIRC samples were acquired from SpliceSeq data website (https://bioinformatics.mdanderson.org/TCGA SpliceSeq/PSIdownload), which provides a resource for investigation of tumor-normal alterations in mRNA splicing patterns of TCGA RNASeq data [31]. The following screening criteria were used: (1) patients pathologically diagnosed with KIRC; (2) samples with complete clinical characteristics, including age, gender, survival time, survival state, histological grade and pathological stage, including T, N,and M; (3) samples with integral mRNA expression data and corresponding AS data; (4) downloads of AS events where the percentage of samples with percent-spliced-in (PSI) value is greater than 75%; (5) deletions of samples where the percentage of AS events with PSI value is less than 70%; (6) deletions of AS events of which the mean of PSI value is greater than 0.95 or less than 0.05; (7) deletions of AS events of which the standard deviation (SD) of PSI value is less than 0.05. For minimizing the possible bias caused by the NA values of PSI, the "Knn" algorithm of "impute" (version: 3.13) package in R (version: 3.4.2) was used to perform multiple interpolations of the missing values. The "UpSetR" (version: 1.3.3) package was used to crate the UpSet diagram, which shows the gene interaction sets in 7 types of all AS events.

2: Identi cation of survival-associated AS events in KIRC
To combine the clinical information of samples and the PSI value of AS events of the corresponding samples by Perl (version: 5.30.0), which was used to estimate the correlation (P<0.05) between AS events and overall survival (OS) by implementing the "survival" (version: 3.2-3) package and univariate Cox regression analysis in R. The "UpSetR" package was used to crate the UpSet diagram of the AS events relevant to OS. The volcano plot and the bubble graphs were drawn by the "ggplot2" (version: 3.2.1) package.
3: Establishment and Evaluation of AS-related prognostic risk models Based out of the OS-related AS events identi ed by univariate Cox regression analysis, we selected the top 50 of these AS events to employ the LASSO (least absolute shrinkage and selection operator) regression to identify the nal elimination of potential predictors with nonzero coe cients by the "glmnet" (version: 1.0.6) package, which avoids model over tting to obtain the smallest parameter model. Then the multivariate Cox regression analysis was adopted to further comprehensively evaluate the contribution of each OS-related AS events ltrated by the LASSO regression. Finally, the riskScores were calculated using the following equations: where PSIi was the PSI value of OS-related AS event, Coe was the correspondent regression coe cient.
Based on the riskScore, ccRCC patients were divided into two groups (high/low risk), and the riskScore and survival state of these patients were severally showed through the diagrams. The "pheatmap" (version: 1.0.12) package was used to display the OS-related AS events in the low-risk and high-risk groups. The "survival" and "survminer" (version: 0.4.9) packages were employed to perform Kaplan-Meier survival analyse with log-rank tests. Besides the univariate and multivariate Cox regression analysis by the"survival" packages were further used to analyze whether the riskScore can be an independent prognostic indicator. The 1-, 3-and 5-year ROC analysis were performed by the "survivalROC" (version: We extracted the expression pro le data of each ccRCC sample, and calculated the tumor microenvironment (TME) socre by the "estimate" package, which included the stromal cell score, immune cell score and tumor purity. Then the "ggpubr" (version: 0.4.0) package was used to plot between the risk clusters and TME scores. CIBERSORT was used to evaluate the fractions of 22 types of immune cell in ltration in each ccRCC sample based on expression pro le data. Then the "ggpubr" and "vioplot" (version: 0.3.7) packages was used to plot between the risk clusters and the fractions of 22 types of immune cell in ltration. On the other hand, the scores of immune cells and their functions in each ccRCC sample were obtained by ssGSEA which need to use the "GSVA" (version: 1.40.1) and "GSEABase" (version: 0.3.7) packages. The "ggpubr", "reshape2" (version: 1.4.4) and "pheatmap" packages was used to plot between the risk clusters and the scores of immune cells and their functions. The "corrplot" package was used to display the correlation between the riskScore and signi cant immune checkpoint which included PD-1 (PDCD1), PD-L1 (CD274), CTLA4, HAVCR2, IDO1 and PDCD1LG2, meanwhile the "ggpubr" and "reshape2" packages was used to display the expression of immune checkpoint related genes in the risk clusters. 6: Identi cation of different parental genes of riskScore-related AS events in ccRCC The "limma" package was implemented to obtained different parental genes of riskScore-related AS events in ccRCC, |logFC|>0.585 and P<0.05 as the cutoffs. The clinical features and immune features of these genes were analysed by the preceding methods.

7: The regulatory network between riskScore-related AS events and splicing factors
A total of 404 human splicing factors (SFs) were downloaded from the reference [32]. A total of 212 SFs related to OS were testi ed by univariate Cox regression analysis. The correlation coe cient between the PSI values of riskScore-related AS events and the SFs expression values which were extracted from the RNA sequencing data of ccRCC samples in TCGA database, were calculated by Spearman correlation (r>0.6, P<0.05). The regulatory network of SFs and AS events was constructed and plotted by Cytoscape (version 3.9.0).

8: Statistical analysis
All statistical analyses were conducted in R software (version: 3.4.2). All statistical tests with p < 0.05 (two-sided test) were considered statistically signi cant.

1: Overview of AS events and identi cation of the OS-related AS events in ccRCC
According to the screening and ltering criteria described in the methods, nally, a total of 525 ccRCC samples and 72 matched normal samples were enrolled in our study cohort, and the clinical characteristics of 525 patients were summarized in Supplementary Table 1. Meanwhile,this research retained a total of 31407 AS events and 9734 parental genes, which were used to crate the UpSet diagram (Figure 2A), showing the interaction sets in 7 types (AA, AD, AP, AS, ES, ME, RI) of all AS events, and respectively displayed the type and number of AS events and parental genes through a histogram ( Figure 2B). In addition to this, it could be clearly seen that ES (AS=12142, Gene=5592) was the most common AS events, and ME (AS=151, Gene=145) was the rarest ( Figure 2B). Proverbially, one type of AS events, such as ES, can exist in multiple genes, but interestingly, at the most ve different AS events occurred simultaneously in one single gene, for instance, shows of 39 genes of this circumstance in the UpSet which just included the top 50 gene intersection sets in number such as the top one was the ES (n=1779) (Figure 2A).
To identify the OS-related AS events in ccRCC, we conducted a univariate Cox proportional hazards regression analysis (P<0.05), and then detected 9339 survival-associated AS events in 4480 parental genes, of which the types and number were respectively recorded in a histogram ( Figure 2D). The interaction between these OS-related AS events and parental genes were shown in an UpSet plot ( Figure  2C). At this point, the AP (AS=2838, Gene=1461) and ES (AS=2517, Gene=1723) events were the highest in number among them, and the ME (AS=54, Gene=54) events remained the least ( Figure 2D). The volcano plot ( Figure S1A) intuitively showed the all AS events, including the signi cant prognostic AS events (P value<0.05) and no signi cant AS events. The top 20 OS-related AS events of each of the 7 types, respectively, were clearly shown in the bubble graphs ( Figure S1B-H).

2: Establishment and evaluation of the riskScore prognostic signatures about OS-related AS events in ccRCC
In order to nd the most signi cant prognostic signatures, we then conducted a LASSO regression and a multivariate Cox regression according to the OS-related AS events. We obtained the most signi cant 17 events from the top 50 OS-related AS events through the LASSO regression ( Figure 3A and 3B), and further ltrated a list of prognostic signatures (Supplementary Excel 1) by the multivariate Cox regression and calculated the riskScore of each ccRCC samples based on the weight of each gene (Supplementary Excel 2). Now, the 525 KIRC patients were classi ed into the low-risk (n=263) cohort and the high-risk (n=262) cohort ( Figure 3C) depended on the median of those riskScores, and then renewedly veri ed the survival state of each ccRCC patients with riskScore ( Figure 3D). The most signi cant OS-related AS events after the multivariate Cox regression were also shown in the heat map ( Figure 3E). It was demonstrated that ccRCC patients with high riskScore were associated with poor OS (P< 0.001) by the Kaplan-Meier analysis with the log-rank tests ( Figure 3F). We performed the univariate and multivariate independent survival tests to evaluate the prediction ability of the riskScore as an independent prognostic risk factor. The univariate independent survival analysis indicated that the riskScore, tumor grade and stage were closely correlated (P< 0.001) to the prognosis of ccRCC patients in risk cohorts ( Figure S2A). The multivariate independent survival analysis demonstrated that the riskScore could be used as the independent prognostic indicator with signi cant differences (P< 0.001) ( Figure 3G).The ROC (receiver operator characteristic) plot of survival time of the riskScore showed the AUC (area under the curve) values were all greater than 0.7 (1 year=0.790, 2 year=0.745, 3 year=0.761) ( Figure 3H), declaring the excellence of the sensitivity and speci city of the riskScore-related AS events as prognostic signatures. Meanwhile, the AUC values of the riskScore, the grade and the stage were all greater than 0.7 (risk=0.790, grade=0.715, stage=0.793) in the ROC plot of clinical characteristics ( Figure S2B). These statistics provided su cient evidence to prove the important value of the OS-related AS events as prognostic signatures in ccRCC patients.
When further analyzed separately the top 20 of other 7 types of OS-related AS events in the same way in turn, we found that the signi cant prognostic signatures for each type in these 7 types can be still established ( Figure S3A-G), illustrating the stability and importance of the riskScore prognostic signatures created with the OS-related AS events in ccRCC patients.

3: The predictive value of overall survival of ccRCC patients
To reveal the correlation between AS-related prognostic signatures and clinical characteristics of ccRCC patients, we measured the correlation between the riskScore and different clinical pathological parameters, including age, gender, grade, and stage-T, M, and N ( Figure S4A-G), illustrating the signi cant positive correlation (P<0.05) between the riskScore and these clinical characteristics, excepting the age and gender (P>0.05). In addition, we constructed a nomograph model, which incorporates eight prognostic factors (such as riskScore, age, gender, grade, stage, T, M and N), to predict 1-, 2-and 3-year survival rate of patients with ccRCC ( Figure 4A). As showed in the calibration curves ( Figure 4B-D), the red lines were very close to the ideal gray lines, which proved the authenticity and effectiveness of the predictive power of the nomograph model. Collectively, these data demonstrate that the AS events-related prognostic signatures can be applied to predict OS in ccRCC patients.
When further analyzed separately the other 7 types of OS-related AS events in all above way ( Figure S5A-G), respectively, we testi ed the stability of these prognostic signatures in different types of OS-related AS events.

4:
The biological processes and pathways of differential gene sets in ccRCC subgroups with high/low risk.
In order to further clarify the biological processes and pathways in the high/low risk subgroups, we analyzed the differential gene sets (DGSs) by the GSEA, including GO analysis which composed of BP, MF and CC, and KEGG analysis which was KEGG pathway. We found that in high risk group, the top 10 GO enrichments in the DGSs were all the BP, most of which were closely related to the immunity, such as GOBP_activation_of_immune_response, GOBP_B_cell_activation, GOBP_B_ cell_mediated_immunity, GOBP_B_cell_receptor_signaling_pathway, etc. ( Figure 5A). However, in low risk group, most of the top 10 GO enrichments in the DGSs were the CC, such as GOCC_apical_part_of_cell, GOCC_apical_plasma_membrane, GOCC_cajal_body, etc. ( Figure 5B). Meanwhile, in the terms of KEGG pathway, the DGSs in high risk group were also enriched in immune-related pathways, sunch as KEGG_chemokine_signaling_parhway, KEGG_cytokine_ cytokine_receptor_interactin and KEGG_intestinal_immune_network_for_iga_production ( Figure 5C). However, the DGSs in low risk group tended to be enriched in metabolism-related pathways, sunch as KEGG_citrate_cycle_TCA_cycle, KEGG_fatty_acid_metabolism, KEGG_glycolysis_ gluconeogenesis, KEGG_propanoate_metabolism and KEGG_retinol_metabolism ( Figure 5D). Hence, it was a hypothesis that tumor immune microenvironment (TIME) might play an important role in high risk group. Those subgroups classi ed according to the 7 risk models built based on the other 7 types of OS-related AS events were further analyzed by the GSEA (Figure S6A-G). Interestingly and fortunately, these GO enrichments and KEGG pathways enriched in 7 high risk subgroups were all consistent with those above results, which tended to be closely related to the immunity, so TIME in the high/low risk subgroups was focused on the later research. 5 The immune features in ccRCC cohorts with high/low risk.
To investigate thoroughly the effect of TIME in ccRCC subgroups with high/low risk, we calculated the stromal cell score, immune cell score and tumor purity to re ect TME of each ccRCC patients with the riskScore, and found interestingly that ccRCC subgroup with high riskScore tended to manifest the higher immune cell score (P<0.05) ( Figure 6B), though, no relationship between these clusters and the stromal cell score (P>0.05) (Figure 6A), inferring a hypothesis of the higher immune cell in ltration in high riskScore subgroup. To verify the hypothesis, we estimated a comparison of the otherness of the 22 immune cells in high/low riskScore cohorts ( Figure 6C), discovered amazedly that these immune cells (such as naive B cell, CD8 + T cell, naive CD4 + T cell, activated NK cell and activated Mast cell, which were regulating immunological lethality) were indifference relation, and the immune cells (such as resting memory CD4 + T cell and macrophages M1/M2) were fewer in high riskScore cluster ( Figure S7A-C), but more the cells regulating immunological suppression such as regulatory T cells (Tregs) ( Figure S7D). Further to evaluate the immune cell in ltration and their functions in each ccRCC sample by ssGSEA, as shown in Figures 6D and 6E, the results were roughly consistent with those above results. Then there was a positive correlation between high-risk patients and classic immune checkpoints such as PD-1 and CTLA4 ( Figure 6F and S7E-F), and the vast majority of genes related to immune checkpoints were increased in the high riskScore patients group ( Figure 6G). There results implied that TIME affected by immune cell in ltration and immune checkpoints may be one of the most important reasons that contribute to poor prognosis in the high risk subgroups differentiated based on AS events.
When further analyzed separately the other clusters based on the different riskScore in the 7 types of OSrelated AS events in turn ( Figure S8A-G), we testi ed the stability of relationships between ccRCC patients and immune features.

6: The clinical features and immune features of different parental genes of riskScore-related AS events in ccRCC
To combine all parental genes of AS events uesd to eatablish these risk models and analyse the differences of these genes between ccRCC tumor samples and normal samples in TCGA database (Supplementary Excel 3), the C4orf19 was con rmed as the only crossed different gene (|logFC|>0.585, P<0.05) between the model built with top 50 AS events and other independent 7 models built with top 20 AS events in each type. Firstly, compared with normal samples, the expression of C4orf19 was lower in ccRCC tumor samples ( Figure 7A). And according to the median of the C4orf19 expression, these ccRCC samples were divided into the low-expression group and the high-expression group to reaserch the clinicopathological features of C4orf19, showing that except for age, gender and lymph node metastasis, the expression of C4orf19 was negatively correlated with grade, stage, T and M. (Figure 7B-C and S9A-E). Meanwhile, patients with low expression of C4orf19 had a poorer prognosis than patients with high expression of C4orf19 ( Figure 7D), but of which the sensitivity and speci city (AUC: 0.661) were ordinary as a prognostic signatures from the ROC curve ( Figure 7E). In addition to the C4orf19, other different parental genes (|logFC|>1, P<0.05), including ARHGAP24, DNASE1L3, P4HA1, SLC39A14 and TAF1D (Supplementary Excel3), were from their respective models and were analyzed respectively their clinicopathological features in ccRCC data by the preceding similar way. We discovered that the clinicopathological features and trends of ARHGAP24 and DNASE1L3 were similar to those of C4orf19, while those of P4HA1and TAF1D were opposite to those of the former three genes ( Figure S10A-E). As a special case, SLC39A14 was obviously high expression in tumor samples compared with normal, but no signi cant difference in clinicopathological characteristics ( Figure S10D). What is noteworthy was that the sensitivity and speci city of these 5 genes were weak as a prognostic signatures by itself ( Figure  S10A-E), similar to C4orf19, indicating that these 6 genes can not be the independent prognostic signatures.
To explore the immune features of different parental genes, it was shown that patients with low expression of C4orf19 had more immune cell in ltration ( Figure 7F-G), less purity of tumor ( Figure S9F-G), more in ltration of Treg cell ( Figure 7H-I), and more expression of these immune checkpoint related genes ( Figure 7J). However, in the perspective of immune cell in ltration, except that the results of ARHGAP24 were coincident with the C4orf19, the results of DNASE1L3, P4HA1, SLC39A14 and TAF1D were no signi cant ( Figure S10A-E). Similarly, the expression of immune checkpoint related genes of patients with low expression of ARHGAP24 or DNASE1L3 were consistent with those of patients with low expression of C4orf19, as well as patients with high expression of P4HA1, SLC39A14 or TAF1D ( Figure S10A-E). In sum, the expression of these genes can cause more immune cell in ltration in ccRCC, especially Treg cells, and further lead to abnormal expression of immune checkpoints by regulating the expression of genes related to immune checkpoints, which may contribute to ineffective immunotherapy or drug resistance. 7: The splicing factors regulating riskScore-related AS events.
As we all know, the irregular occurrence of AS events is caused by the SFs which is a protein binding and splicing to the pre-mRNA. In order to explore the SFs which regulates riskScore-related AS events, a total of 212 SFs related to OS were testi ed by univariate Cox regression analysis. The correlation coe cient between the PSI values of riskScore-related AS events (n=46) and the SFs (n=212) expression values were calculated by Spearman correlation, r>0.6 and P<0.05 as the cutoffs. As shown in the network, which was constituted of the 12 AS events and 52 SFs (Figure 8), an AS event could be regulated by multiple SFs, for instance TAF1D|18313|RI and TMEM104|217418|ME, and a SF could regulate multiple AS events, such as SNRNP70, SER31B, etc. There SFs were divided into high expression cluster (n=37, blue, rhombus) and low expression cluster (n=15, azure, triangle) in ccRCC samples. And there AS events were divided into high frequency cluster (n=10, red, round) and low frequency cluster (n=2, lime, round) based on their PSI values in ccRCC samples. Red line represents the positive correlation, for instance, most of highly expressed SFs (blue) and all high-frequency AS events (red) were connected by the red line. Azure line represents the negative correlation, for example, most of lowly expressed SFs (azure) and the high-frequency AS events (red) such as GIT2|24375|AA were connected by the azure line, menwhile the low-frequency AS events (lime) such as SERP2|25779|ME and most of highly expressed SFs (blue) were connected by the azure line. These results expound that the unbalance of the AS events related to high-risk model may be caused by the abnormal expression of SFs, contributing to poor prognosis in ccRCC patients.

Discussion
Both of initiation and progression of tumor are a complicated process, which is an intricate regulatory network involving many genes or proteins. Recently, the pathogenesis of ccRCC has been elaborated from the perspective of tumor evolution by some scholars, who believed through the TRACERx Renal studies that in ccRCC, driver gene mutations including those in VHL, SETD2, PBRM1 and BAP1, and chromosomal copy number variationincluding chromosome 3p loss, may engender genomic instability and promote defects in DNA repair pathways, which drive tumor initiation events [33]. Then with emergence of these driver events, the enormous irregular changes of downstream genes regulated by these driver genes at the transcription and translation levels, which promote tumor progression, and even confer lethality [34]. Thus it can be seen that AS, as an indispensable part of the process of posttranscriptional modi cations, of which it is particularly important to clarify function and molecular mechanism of ccRCC.
With the tremendous advance of high-throughput sequencing technology, the potential signi cance of AS pro ling obtained from RNA-seq datas of tumor has been great breakthroughs in cacner [8,9], including in ccRCC. In previous research, Ge Y, et al. kept on this view that DCLK1-ASVs, which is an alternate splice variant of DCLK1, assistantly maintains the stemness of ccRCC cell, and potentially heightens the curative effect of immunotherapy by regulating PD-L1 or CTLA4 [35]. In the early time, PKM2, rather than PKM1, was discovered to play a tumorigenic role in ccRCC [36], and recently, four transcript isoforms of the PKM type were combined to evaluate the survival outcome of ccRCC patients [37]. Moreover, the splicing factor SF3B3 promoted cell growth, migration and proliferation in ccRCC cell lines and even tumorigenicity in a xenograft model through inhibiting expression of EZH2 Δ14 and promoting expression of EZH2 [38]. Similarly, the alternative splicing events of CCDC50 regulated by HnRNP A1 have been demonstrated to contribute to cancer progression through the ZNF395 pathway in ccRCC [39]. In addition, AS events, as prognostic signatures, are concerned about and researched. For instance, Song J, et al. established a robust prognostic prediction model based on the seven hub genes, including KRT222, LENG8, APOB, SLC3A1, SCD5, AQP1, and ADRA1A, who believed that this model can be as the potential prognostic biomarkers for ccRCC [40]. Synchronously, Meng T, et al. revealed that RHOT2-32938-RI and TCIRG1-17288-RI regulated by DDX39B, two AS events in RI type, may be intimately associated with the metastasis and poor prognosis of ccRCC [41]. Furthermore, Chen T, et al.
systematically analysed the survival-associated AS signatures from the ccRCC samples in TCGA database, and then created prognostic predictors based on AS events and uncovered key SFs in splicing networks [42]. But unsatisfactorily, previous research results remain completely unascertained about the prognostic biomarkers of ccRCC. Therefore, likewise, in our study, a total of 525 ccRCC samples and 72 matched normal samples from the TCGA database were enrolled in our study cohort, and then 9339 survival-associated AS events in 4480 parental genes were detected from among 31407 AS events and 9734 parental genes by a univariate Cox regression analysis (P<0.05) (Figure 2). Secondly, we obtained the most signi cant 5 events, including C4orf19|69001|AT, UACA|31438|AP, FAM120C|89237|AT, TRIM16L|39629|AP and SEC31A|100881|ES, from the top 50 OS-related AS events through the LASSO regressionn and the multivariate Cox regression, which were used to establish the prognostic risk model (Figure 3A-E). These results, which were evaluated by the Kaplan-Meier analysis, the univariate and multivariate independent survival tests, the ROC curves ( Figure 3F-H and S2A-B) and a nomograph model ( Figure 4A-D), provided su cient evidence to prove the availability of the OS-related AS events obtained by our research as prognostic signatures in ccRCC patients. Differ from the results of other previous studies, we further analyzed separately the top 20 of other 7 types of OS-related AS events in the same way in turn, and found that the signi cant prognostic signatures for each type in these 7 types can be still established ( Figure S3A-G and S5A-G). Hence, our research results further illustrated the stability and importance of the riskScore prognostic signatures created with the OS-related AS events in ccRCC patients. And so far, the practicability and effectiveness of AS events as a prognostic biomarker for predicting the prognosis of ccRCC patients requires more clinical trials to verify.
In order to further clarify the biological processes and pathways in the high/low risk subgroups, GO analysis and KEGG analysis were carried out, and we found that in high risk group, most of the top 10 GO enrichments and the KEGG pathway were closely related to the immunity ( Figure 5A-D), so it was a hypothesis that TIME might play an important role in high risk group. This hypothesis had been successfully con rmed by those results of those GO enrichments and KEGG pathways enriched in 7 high risk subgroups ( Figure S6A-G). Therefore, we further reviewed the literature to understand the role of AS events in the TME, especially in the TIME. Firstly, Dery KJ, et al. constructed transcriptome sequencing in cell lines of breast carcinoma, to nd that IRF-1 can take part in regulation of growth and differentiation of tumor cells through inducing the global AS of genes, as well as augmentation of the immune response by global AS changes of genes of the cytokine family [43]. Nextly, a study of Sotillo E, et al. showed that alternative splicing of CD19 can reinforce the resistance to CART-19 immunitherapy by compromising expression of the CART-19 epitope [44]. Again, the research results of Georgilis A, et al. manifested that AS events mediated by PTBP-1, a splicing factor, may regulate the in ammatory secretome and accelerate tumorigenic effects of senescent cells [45]. These studies demonstrate that AS events can play a very important role in the TIME. What's more, the relationship between AS events and TIME has been con rmed in a variety of tumor diseases, such as hepatocellular carcinoma [46], esophageal squamous cell carcinoma [47], gastric carcinoma [48], endometrial carcinoma [49], triple negative breast carcinoma [50] and glioma [51]. However, this relationship has not been studied inccRCC.
In this study, we amazedly found that the higher immune cell in ltration such as regulatory T cells (Tregs), rather than immune cell regulating immunological lethality such as CD8 + T cell and activated NK cell, etc., and higher expression of classic immune checkpoints such as PD-1 and CTLA4 involved in high riskScore subgroup ( Figure 6A-G and S7A-F), which implyed that TIME affected by immune cell in ltration and immune checkpoints may be an important reason that contributes to poor prognosis in the high risk subgroups differentiated based on AS events. This phenomenon has been con rmed again by those results between immune features and ccRCC patients who were divided into different risk clusters based on the 7 types of OS-related AS events ( Figure S8A-G). In addition, we further explored the clinical features and immune features of different parental genes of riskScore-related AS events in ccRCC, and discovered that 6 different parental genes were obtained, of which 3genes presented low expression in tumor samples, including C4orf19, ARHGAP24 and DNASE1L3, as well as other 3 genes of high expression, such as P4HA1, SLC39A14 and TAF1D. These 6 genes can not be the independent prognostic signatures ( Figure 7A-E and S10A-E). Similarly, the expression of these genes was closely related to immune cells in ltration and the expression of immune checkpoints in TIME of ccRCC ( Figure 7F-J and S10A-E), which was consistent with the former results.
Not only that, we also in surprise found that immune cells, such as CD8 + T cell, resting memory CD4 + T cell and macrophages M1/M2, themselves in ltration scores were very high in all ccRCC samples ( Figure  6C-E and S7A-G), which makes us mistakenly believe the signi cant augment of these immune cells with lethality in ccRCC. However, there is a recent viewpoint from the research of Braun DA, et al. scholars, who performed single-cell RNA and T cell receptor sequencing on 164,722 individual cells from tumor and adjacent non-tumor tissue in patients with ccRCC, then proposed that exhausted CD8 + T cells and macrophages M2 co-occurred in advanced tumor, and expressed ligands and receptors facilitating T cell dysfunction [52]. The results indicate that immune cells signi cantly in ltrating in ccRCC do not always have effective immunity lethality, and the possible factors of resistance to immunotherapy is explained.
Furthermore, Bi K, et al. believed that immune checkpoint blockade (ICB) might remodel the TIME in ccRCC and modify the interplay between tumor cells and immune cell populations, which explained the reasons of response and resistance to ICB [53]. Meanwhile, Krishna C, et al. performed paired single-cell RNA and T cell receptor sequencing of 167283 cells, and discovered that the enrichment of tissueresident CD8 + T cells in a patient responsive to ICB but tumor-associated macrophages in a resistant patient, which emphasizes the existence of heterogeneity of the immune microenvironment within and between ccRCC patients [54]. Accordingly, those study will provide a crucial resource for the discovery of new immunotherapies and the advance of combination regimens in ccRCC.
In this study, we illustrated the stability and importance of the riskScore prognostic signatures from these prognostic risk models which were repetitiously constructed based on survival-related AS events in ccRCC samples from the TCGA database. After dividing into high/low risk groups based on these riskScores, the results from the GO and KEGG analyses showed that TIME might play an important role in high risk group. Hence, we further found that in high-risk patient groups, more abundance of immune cell in ltrationsuch as Treg cells, etc., but no signi cant difference in in ltrative CD8 + T cells, and the higher expression of immune checkpoints such as PD-1, CTLA-4, etc. These comprehensive results indicate the heterogeneity of TIME and the limited effect of ICIs in ccRCC.

Conclusion
In summary, our study suggested that OS-related AS events mediated by aberrant SFs can be constructed a prognostic risk model to predict prognosis of patiens and utilized to index the situation of immune cell in ltration and immune checkpoint expression that impact TIME in ccRCC.