Gene Expression and Regulatory Webwork of POLR2K in Bladder Carcinogenesis by Integrated Bioinformatics Approaches

Background: RNA polymerase II subunit K (POLR2K) is one of the multiple subunits of RNA polymerase II (Pol II), whose biological function is to synthesize mRNA. Aberrant POLR2K expression is related to carcinogenesis. However, the underlying role of POLR2K in bladder cancer has not been explored. In the current study, we analyze the function of POLR2K and its regulatory network in bladder cancer.
Methods: Public sequencing data were obtained from GEO and TCGA to investigate POLR2K expression and its regulatory network in bladder cancer (BLCA) using GEPIA, Oncomine and the cBioPortal online tool. LinkedOmics was employed to identify genes displaying significantly differential expression patterns and to perform GO and KEGG analyses. After the differential genes were assigned and ranked, GSEA was performed to obtain target networks of transcription factors, miRNAs and kinases that could regulate the POLR2K-associated gene network. Subsequent functional network analyses were used to identify cancer-relevant pathways. Moreover, the POLR2K gene was verified, by ChIP-seq in the MCF-7 cell line, to have transcription factor binding evidence in the ENCODE Transcription Factor Binding Site Profiles dataset.
Conclusions: The current study implies that the POLR2K gene is overexpressed and often amplified in BLCA, providing the first evidence that POLR2K deregulation, in particular increased transcription, may promote BLCA. These findings uncover unique expression patterns of POLR2K and its potential regulatory networks in BLCA, contributing to the study of the role of POLR2K in cancer development.

In this study, we found that POLR2K was overexpressed in BLCA patient samples from The Cancer Genome Atlas compared with normal bladder samples. In addition, Kaplan-Meier survival analysis showed that poor progression-free survival and poor overall survival of BLCA patients were related to POLR2K overexpression (Supplementary Figure 1). Therefore, we further explored the role of POLR2K in BLCA using various public databases. Genomic alterations and functional networks relating to POLR2K in BLCA were analyzed with multi-dimensional analysis methods. These results reveal a novel function of POLR2K as a regulator of cancer proliferation and suggest a potential treatment strategy for BLCA. POLR2K expression might serve as a novel therapeutic target for bladder cancer.

Oncomine analysis
Oncomine 4.5 (www.oncomine.org), one of the largest oncogene chip databases, currently includes 264 independent datasets covering various cancer types and analysis methods [12]. It was used to analyze the DNA copy number and mRNA expression of POLR2K in bladder cancer; the underlying data are publicly accessible through the Gene Expression Omnibus (www.ncbi.nlm.nih.gov/geo). POLR2K expression was evaluated in bladder carcinoma tissues relative to normal bladder tissues, and differences were considered significant at a p-value threshold of 0.01.

Pathway Commons
Pathway Commons is a collection of biological pathway databases from multiple sources, covering interactions in various organisms and functional correlations between genes in signaling pathways [13]. Its data are derived from partner databases, and the pathway data are represented in BioPAX files. POLR2K was used as the seed gene to retrieve its neighboring associated genes.

UALCAN analysis
UALCAN is a comprehensive transcriptome database whose data are collected from TCGA and from which correlations between clinicopathological features and gene expression can be retrieved [14]. UALCAN (ualcan.path.uab.edu) was used to analyze cancer transcriptome data based on the TCGA database.

cBioPortal analysis
cBioPortal (cbioportal.org) hosts more than 200 cancer genomics studies and is often used to explore, visualize and analyze multi-dimensional cancer genomic and clinical data through its multiple analysis features [15]. The mRNA expression, CNVs and mutations of POLR2K were chosen as search parameters, and the Network tab was then used to visualize its biological interaction network based on alteration frequencies. Cytoscape was further used to analyze GO and KEGG pathways.

LinkedOmics analysis
LinkedOmics (www.linkedomics.org/login.php), which provides abundant multi-omics and clinical data for various cancers from TCGA-associated multidimensional datasets [16], was used to investigate mRNA sequencing data from 408 BLCA patients in the Cancer Genome Atlas database. Genes whose expression correlated with POLR2K were studied using the LinkFinder module, with Pearson's correlation coefficient chosen as the statistical method. Statistical plots for individual genes are presented as heat maps or scatter plots. After the results were ranked, pathway enrichment analysis of GO and KEGG, as well as target enrichment analysis of transcription factors, miRNAs and kinases, was performed by GSEA. Target enrichment for transcription factors and miRNAs was primarily derived from the Molecular Signatures Database.
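The correlation-and-ranking step described above can be illustrated with a minimal Python sketch. It is not the LinkedOmics implementation; the helper names `pearson_r` and `rank_by_correlation`, the placeholder genes `GENE_A` to `GENE_C` and all expression values are hypothetical and serve only to show how genes might be ranked by their Pearson correlation with a query gene.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


def rank_by_correlation(expression, query):
    """Rank all genes except `query` by correlation with it, highest first."""
    reference = expression[query]
    scores = {
        gene: pearson_r(reference, values)
        for gene, values in expression.items()
        if gene != query
    }
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical toy expression matrix (gene -> value per sample).
    toy_expression = {
        "POLR2K": [1.0, 2.0, 3.0, 4.0],
        "GENE_A": [2.0, 4.0, 6.0, 8.0],
        "GENE_B": [4.0, 3.0, 2.0, 1.0],
        "GENE_C": [1.0, 3.0, 2.0, 4.0],
    }
    for gene, r in rank_by_correlation(toy_expression, "POLR2K"):
        print(f"{gene}\t{r:+.3f}")
```

A ranked list of this kind, ordered from the most positive to the most negative correlation, is the type of input that a downstream enrichment analysis such as GSEA operates on.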

GEPIA analysis
GEPIA, an interactive web application for analyzing RNA sequencing expression data from the GTEx and TCGA projects [17], enables users to perform a variety of gene expression and survival analyses, including overall survival and progression-free survival. For the survival analysis, POLR2K was entered as the gene symbol and BLCA was added as the dataset; all other settings, except the Methods tab, were left at their defaults.
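For readers unfamiliar with the survival analysis mentioned above, the following Python sketch shows the Kaplan-Meier product-limit estimator on toy data. It is an illustration only and not the GEPIA implementation; the function names and all follow-up values are hypothetical, and the median expression split in `split_by_median` is an assumption, since the group cutoff is not stated in the text.

```python
from statistics import median


def split_by_median(expression):
    """Label each sample as high (True) or low (False) relative to the median.

    The median cutoff is an assumption made for illustration.
    """
    cutoff = median(expression)
    return [value > cutoff for value in expression]


def kaplan_meier(times, events):
    """Kaplan-Meier estimate as a list of (time, survival probability).

    `events` holds 1 for an observed event and 0 for a censored sample.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        current_time = data[i][0]
        n_events = 0
        n_leaving = 0
        # Group all samples that share the same follow-up time.
        while i < len(data) and data[i][0] == current_time:
            n_events += data[i][1]
            n_leaving += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((current_time, survival))
        at_risk -= n_leaving
    return curve


if __name__ == "__main__":
    # Hypothetical follow-up times (months) and event indicators.
    toy_times = [1, 2, 3, 4]
    toy_events = [1, 0, 1, 1]
    for t, s in kaplan_meier(toy_times, toy_events):
        print(f"t={t}\tS(t)={s:.3f}")
```

In practice one curve would be estimated per expression group and the groups compared with a log-rank test, which is omitted here for brevity.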

GeneMANIA analysis
GeneMANIA is used for generating hypotheses about gene function, analyzing gene lists and prioritizing genes for functional assays [18]. Given a query gene list, GeneMANIA (www.genemania.org) uses publicly available proteomics and genomics data to extend the list with functionally similar and interacting genes and thereby produce a functional association network. GeneMANIA was also used to visualize the gene network and predict gene functions.

Enrichr analysis
Enrichr is a collective annotation resource for curated human gene datasets and a search engine of accumulated multiple biological knowledge about billions of biological discoveries [19]. We queried Enrichr with the 50 neighboring genes of POLR2K using the Cell Types tab (amp.pharm.mssm.edu/Enrichr).

POLR2K expression in bladder cancer
Using TCGA and the Gene Expression Omnibus (GEO) database, we first assessed POLR2K transcription levels in BLCA studies. Analysis of the Oncomine database revealed an increase in DNA copy number variation (CNV) and mRNA expression of POLR2K in BLCA tissues compared with normal tissues (p-value threshold 0.01). POLR2K ranked within the top 17% of the transcriptome profile and within the top 9% of DNA CNVs (Figure 1) when the fold-change threshold was set to 2. In addition, correlation analysis of POLR2K with the clinicopathological features of 408 bladder cancer patients from the Cancer Genome Atlas database was conducted on the UALCAN website, and the results consistently suggested high transcription levels of POLR2K in BLCA. Compared with healthy people, the transcription level of POLR2K was much higher in bladder cancer patients across sub-groups such as disease stage, gender, smoking condition, histological subtype and nodal metastasis status (Figure 2). Furthermore, Kaplan-Meier survival analysis showed that a high level of POLR2K was related to poor progression-free survival (PFS) and overall survival (OS) in bladder cancer patients (Supplementary Figure 1). Therefore, POLR2K expression could become a potential diagnostic indicator in BLCA.

Type and frequency of POLR2K alterations in bladder cancer
To investigate the type and frequency of POLR2K changes in bladder cancer, we then used cBioPortal to analyze sequence data from BLCA patient tissues in the TCGA database. Compared with normal healthy tissues, POLR2K levels were altered in 141 of 408 (35%) BLCA patient tissues (Figure 3a). These alterations included mRNA up-regulation in 139 cases (34.5%), amplification in 59 cases (14.7%), multiple alterations in 55 cases (13.7%) and mRNA down-regulation in 2 cases (0.49%) (Supplementary Fig 2). Therefore, amplification was identified as the most prevalent type of POLR2K CNV in BLCA.

Interaction network of POLR2K alterations in bladder cancer
The biological interaction network of POLR2K in BLCA was investigated next. To this end, we used Pathway Commons to obtain POLR2K-neighboring genes and then used cBioPortal to query the alteration frequencies of these genes (Figure 3b and Supplementary Fig 2). Among the neighboring genes of POLR2K, KDMSA (35%), KMT2D (33%) and INTS8 (32%) were the most frequently altered in BLCA. Using Cytoscape, analysis of GO category annotations implied that these genes encode proteins that localize primarily to, or function at, the RNA polymerase complex, the spliceosomal complex and the MLL3/4 complex. These proteins are fundamentally related to posttranscriptional gene expression processes (e.g. snRNA and dsRNA processing and transcription-coupled nucleotide-excision repair), and they were also enriched in terms such as "U2-type spliceosomal complex" and "basal transcription machinery binding" (Figure 4a-4c). As expected, KEGG pathway analysis indicated that POLR2K is connected to Spliceosome, RNA polymerase, mRNA surveillance and Basal transcription factors (Figure 4d). As a result, the interaction network of POLR2K alterations is related to several RNA metabolic processes and to the regulation of gene expression.

GO annotation and KEGG enrichment analysis for co-expression genes related to POLR2K in bladder cancer
Using the LinkedOmics online tool, we explored TCGA mRNA sequencing data from 408 BLCA patients. As shown in the volcano plot in Figure 5a, 2395 genes (dark green dots) showed significant negative correlations with POLR2K, while 2635 genes (dark red dots) showed significant positive correlations (P-value < 0.01, false discovery rate FDR < 0.05). The heat maps display the 50 significant genes whose mRNA expression correlated positively or negatively with POLR2K in BLCA (Figure 5b, 5c). This result demonstrates that POLR2K has an extensive influence on the transcriptome. Supplementary Figure 3a-3c shows scatter plots for individual genes. Notably, POLR2K expression is positively related to the expression of ZNF706 (positive rank #1, Pearson correlation = 0.7887, p = 8.055e-88), YWHAZ (Pearson correlation = 0.6695, p = 2.224e-54) and COX6C (Pearson correlation = 0.6638, p = 3.552e-53), which mirrors alterations in mitochondrial composition as well as in apoptosis, DNA repair and transcriptional regulation. Further exploration by GSEA demonstrated that the differentially expressed genes were primarily related to the mitochondrial protein complex, spliceosome complex, replication fork and condensed chromosome, and that they are involved mainly in mitochondrial gene expression, mRNA/ncRNA processing, DNA replication and the cell cycle. These genes also serve as constituents of mitochondria and ribosomes (Figure 6a-6c and Supplementary Tables 1-3).
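The text reports an FDR cutoff without naming the adjustment procedure. As a sketch only, and assuming the widely used Benjamini-Hochberg procedure (an assumption, not confirmed by the source), the following Python code shows how raw p-values could be converted to FDR-adjusted values and filtered with the two thresholds quoted above. The p-values are hypothetical toy values.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        adjusted[index] = running_min
    return adjusted


def significant(pvalues, p_cutoff=0.01, fdr_cutoff=0.05):
    """Flag entries passing both the raw p-value and the FDR cutoffs."""
    adjusted = benjamini_hochberg(pvalues)
    return [p < p_cutoff and q < fdr_cutoff for p, q in zip(pvalues, adjusted)]


if __name__ == "__main__":
    # Hypothetical raw p-values for four genes.
    toy_pvalues = [0.01, 0.04, 0.03, 0.005]
    for p, q in zip(toy_pvalues, benjamini_hochberg(toy_pvalues)):
        print(f"p={p:.3f}\tFDR={q:.3f}")
```

Applying the raw p-value cutoff and the FDR cutoff together, as in `significant`, mirrors the dual threshold stated in the text.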
Interestingly, GSEA implied that these genes were also located in the MHC protein complex, secretory granule membrane, cell leading edge and extracellular matrix, where they participated in the positive regulation of cell motility, antigen binding and the immune response-regulating signaling pathway. They were also involved in the response to type I interferon, I-kappaB kinase/NF-kappaB signaling, the Toll-like/NOD-like receptor signaling pathway and the JAK-STAT signaling pathway, all of which are associated with tumor immune escape. In addition, GSEA of KEGG pathways showed highly significant enrichment in cell cycle, DNA replication, ribosome and spliceosome, as well as in Natural killer cell mediated cytotoxicity and Cytokine-cytokine receptor interaction (Figure 6d, 6e and Supplementary Table 4). Taken together, the POLR2K functional networks were mainly responsible for gene expression, mRNA surveillance, the cell cycle and DNA replication, while also being involved in tumor immune response and survival, which demonstrates significant deregulation of cancer-related pathways.
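The enrichment analyses above rely on GSEA, which scores a gene set by walking down a ranked gene list and tracking a running sum. The Python sketch below illustrates the standard weighted running-sum enrichment score on toy data. It is not the LinkedOmics or GSEA software implementation, the gene names and scores are hypothetical, and the permutation-based significance estimation and normalization are omitted.

```python
def enrichment_score(ranked_genes, scores, gene_set, weight=1.0):
    """Running-sum enrichment score of `gene_set` over a ranked gene list.

    `ranked_genes` is ordered from most positive to most negative score,
    and `scores` maps each gene to its ranking metric (e.g. a correlation).
    """
    in_set = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(in_set)
    n_misses = len(ranked_genes) - n_hits
    if n_hits == 0 or n_misses == 0:
        raise ValueError("gene set must cover some but not all ranked genes")
    hit_norm = sum(
        abs(scores[gene]) ** weight
        for gene, hit in zip(ranked_genes, in_set)
        if hit
    )
    if hit_norm == 0:
        raise ValueError("all gene-set members have a zero score")
    running = 0.0
    best = 0.0
    for gene, hit in zip(ranked_genes, in_set):
        if hit:
            # Step up, weighted by the magnitude of the ranking metric.
            running += abs(scores[gene]) ** weight / hit_norm
        else:
            # Step down by a constant amount for genes outside the set.
            running -= 1.0 / n_misses
        if abs(running) > abs(best):
            best = running
    return best


if __name__ == "__main__":
    # Hypothetical correlation scores with a query gene.
    toy_scores = {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": -1.0, "GENE_D": -2.0}
    toy_ranked = sorted(toy_scores, key=toy_scores.get, reverse=True)
    print(enrichment_score(toy_ranked, toy_scores, {"GENE_A", "GENE_B"}))
    print(enrichment_score(toy_ranked, toy_scores, {"GENE_C", "GENE_D"}))
```

A positive score indicates that the gene set is concentrated among positively correlated genes and a negative score that it is concentrated among negatively correlated genes.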

POLR2K networks of transcription factor, miRNA and kinase targets in bladder cancer
The transcription factor, miRNA and kinase target networks of the associated gene dataset generated by GSEA were used to examine the targets of POLR2K in BLCA. The significant kinase-target networks were associated mainly with mitogen-activated protein kinase 6 (Kinase_MAPK6), homeodomain interacting protein kinase 2 (Kinase_HIPK2) and mitogen-activated protein kinase 7 (Kinase_MAPK7) (Table 1 and Supplementary Tables 5-7); the miRNA-target network was generally related to AACTGGA_MIR145, TTGCCAA_MIR202, ACTTTAT_MIR507, GTGCAAT_MIR25_MIR32_MIR92_MIR363_MIR367 and TATTATA_MIR374; the transcription factor-target network primarily involved cell cycle regulation factors, including V$E2F1_Q6, V$FOXO1_02 and V$CDC5_01, and also involved the IRF Transcription Factor (IRF) family, including V$IRF7_01, V$IRF1_01 and V$ISRE_01.
Correlations among the genes for Kinase MAPK7, miR-145 and E2F1_Q6, respectively, were uncovered by subsequent protein-protein interaction networks created with GeneMANIA. The gene set enriched for transcription factor E2F1 was linked primarily to the modulation of cell cycle checkpoints, DNA repair and DNA replication, and was associated with the MCM complex, the G1/S transition of the mitotic cell cycle and telomere maintenance (

Discussion
Differential expression and deregulation of DNA-directed RNA polymerases (Pol I, Pol II and Pol III) have been described in multiple cancers [20]. POLR2K, a subunit of these polymerases, participates in multiple steps of transcription [21]. In this paper, we initially discovered that POLR2K was overexpressed in 408 BLCA samples from the TCGA and that high POLR2K expression may serve as an indicator of poor survival. Further studies of the expression of POLR2K and its regulatory network will be necessary to obtain additional insight into the possible function of POLR2K in BLCA. Therefore, we conducted bioinformatic analyses of sequence data in the hope of motivating future research on bladder cancer.
Early detection of BLCA has been a long-standing problem that has eluded many clinicians. Cystoscopy and urine cytology are currently employed to diagnose BLCA [22]. However, cystoscopy is an invasive method and also has low sensitivity for bladder carcinoma in situ. Urine cytology is non-invasive and has higher specificity but lower sensitivity for low-grade urothelial tumors. Despite the search for urinary biomarkers for the early and noninvasive detection of BLCA, no biomarkers are currently employed in clinical practice. Thus, potential BLCA biomarkers are urgently required to improve the early diagnosis of BLCA. By analyzing transcriptional sequence data from clinical patient tissues, we also demonstrated that CNVs and mRNA expression of POLR2K are much higher in bladder cancer tissues than in normal tissues. We discovered that POLR2K overexpression exists in many cases of bladder carcinogenesis. We think that POLR2K research could move the field of liquid biopsy biomarkers forward and deserves additional clinical and research attention. Whether POLR2K is detectable in liquid biopsy remains to be verified.
It has been reported that CNVs can directly influence gene expression and have drastic phenotypic consequences by altering gene dosage or disrupting coding sequences [23,24]. This work found that amplification was the main type of POLR2K change and that POLR2K copy number was increased, which was related to poor survival, including overall survival (OS) and disease-free survival (DFS) (Supplementary Figure 1). In light of these findings, we inferred that alterations in chromosomal structure might be involved in the altered POLR2K expression as well as in POLR2K dysfunction in BLCA. Because POLR2K plays a critical role in multiple biological functions, POLR2K genetic change could lead to alterations in numerous downstream signals that may ultimately result in carcinogenesis. In fact, neighboring gene networks in close proximity to POLR2K display amplification of varying strength in BLCA. Meanwhile, the associated functional networks were found to be related to spliceosome signaling, mRNA surveillance and ribosome signaling. Therefore, the network of POLR2K alterations is related to posttranscriptional modulation, which is involved in protein translation as well as RNA splicing, consistent with several other published results on the biological roles of POLR2K [25,26]. Furthermore, related functional networks are also involved in the positive regulation of cell motility, antigen binding and the immune response-regulating signaling pathway.
Therefore, the networks based on POLR2K genetic change are also associated with the tumor immune response, which is closely related to the mechanism by which tumors escape from the host immune system.
To reveal the critical networks of transcription factors, miRNAs and target kinases, GSEA was performed. Our results imply that the functional network of POLR2K is involved generally in the ribosome, spliceosome, mRNA/ncRNA processing, cell cycle and DNA replication. As with the mutation network, the functional association network that integrates the effects of POLR2K transcription alteration participates in RNA metabolic processes and the regulation of gene expression. As described above, we conclude that POLR2K has a profound influence on the maintenance of short intron splicing [27,28].
Oncogenic kinases play a critical role in coupling intracellular signaling pathways with extracellular signals, which promotes cancer progression at all stages [29]. We discovered that POLR2K in BLCA is related to a network of kinases including Kinase_MAPK7, Kinase_MAPK6 and Kinase_HIPK2. All of these kinases regulate mitosis, gene expression and apoptosis. Indeed, HIPK2 has been recognized as a signaling molecule that acts in various signal transduction pathways and cellular processes such as cell proliferation and apoptosis, transcriptional regulation and antiviral responses [30,31], while MAPK7 could facilitate tumor escape from immune surveillance [32,33]. Deregulated activity of HIPK2 may affect genome integrity, leading to cancer development [34]. In BLCA, POLR2K might modulate DNA replication, repair and gene expression via HIPK2 kinase.
"Continuous proliferation" has been proposed as the foremost of the 10 hallmark features of tumors [35]. One crucial explanation is that cell cycle-associated proteins, if aberrantly expressed, could contribute to cell cycle disorder in tumor cells, which results in decreased differentiation, abnormal proliferation and rapid progression in cancer cells [36]. E2F1 is among the key links in the cell cycle modulation network. Abnormal E2F1 expression is related to tumor formation and progression in BLCA [37]. One study has reported that elevated levels of E2F1 are associated with shorter survival of bladder cancer patients [38], and another study showed that the POLR2K network of transcription factor targets is related to E2F1 [39]. Thus, these analyses suggest that E2F1 may be a critical target of POLR2K, which modulates the cell cycle and cancer progression in BLCA patients by acting through the E2F1 transcription factor. Interestingly, according to known transcription factor binding site motifs from TRANSFAC, we found that E2F1 could directly bind the promoter region of POLR2K, likely to modulate the expression of POLR2K.
Cancer and chronic inflammation could be linked through the activation of innate immune responses via NLR and TLR signals [40]. Chronic inflammation, mainly due to aberrant inflammasome or NF-κB activation, is closely coupled with cancer through TLR-mediated cytokine production [41]. The IFN-regulatory factor family proteins are transcription factors with varying biological functions, which might create a microenvironment for immune evasion and tumor progression [42]. Therefore, our analysis suggests that POLR2K might play a vital role in the tumor microenvironment, where POLR2K acts as a keeper of chronic inflammation and as a regulator of immune surveillance.
Our data mining also identified some miRNAs that were related to POLR2K. miRNAs are short non-coding RNA molecules that post-transcriptionally regulate protein expression. Distinct miRNA alterations could be used to characterize BLCA [43,44]. The particular miRNAs in this paper are associated with cancer occurrence, metastasis and invasion. Indeed, miR-145, miR-202 and miR-374 have been proposed as diagnostic, prognostic or therapeutic markers of BLCA [45,46]. miR-202 and miR-374 participate in invasion, metastasis and cancer progression [46,47], while miR-145 modulates suppressor of cytokine signaling 7 (socs7) to enhance IFN-β expression, thus contributing to BLCA apoptosis. We speculate that deregulation of these miRNAs would be consistent with the phenotype of POLR2K overexpression in BLCA, which should be further verified by experiments.
Our study presents striking evidence for the significance of POLR2K in urinary bladder carcinogenesis and demonstrates its potential as an early indicator in BLCA. This study implies that POLR2K overexpression in BLCA has profound impacts on tumor immune surveillance and on multiple steps of gene expression and of the cell cycle, thus ultimately contributing to the evasion of immune surveillance. POLR2K is particularly associated with some tumor-related transcription factors such as E2F1 and the IRF family, miRNAs such as miRNA-145 and kinases such as HIPK2.
Our study uses current online tools to conduct bioinformatics analysis of bladder cancer. This strategy has advantages in terms of simplicity and large sample size, which enable us to perform larger-scale POLR2K genomics research and functional studies free of charge, compared with classic chip screening.
Meanwhile, using the TCGA database also entails some limitations. The first is that the BLCA samples differ in their data or gene expression profiles; biological interpretation of the analysis results therefore requires considerable care given this sample heterogeneity, which could be the reason why the 50 most frequently altered neighbor genes were enriched in CD56+NKCells and 721_B_lymphoblasts (Supplementary Table 8).

Conclusion
This work reveals a novel function of POLR2K as a regulator of cancer proliferation and presents a potential treatment strategy for BLCA. POLR2K expression might serve as a novel therapeutic target for bladder cancer.
