Identification of Candidate Biomarkers Associated with Ovarian Cancer Progression and Prognosis by Integrated Microarray Analysis


Ovarian cancer (OC) is one of the most common gynecological malignancies, with high mortality and few prognostic biomarkers. This study aimed to identify differentially expressed genes (DEGs) relevant to the prognosis of ovarian cancer. First, we performed differential analysis on the GSE38666, GSE14407, GSE27651 and GSE18520 expression profile datasets from the Gene Expression Omnibus (GEO) database and obtained 636 DEGs using a Venn diagram tool. Second, we constructed a protein–protein interaction (PPI) network with the STRING database and used the Molecular Complex Detection (MCODE) plugin in Cytoscape software to perform gene cluster analyses for the 269 upregulated and 367 downregulated DEGs separately. Based on the results of the gene cluster analyses, we selected two gene clusters of upregulated DEGs with network scores of >10 and two gene clusters of downregulated DEGs with scores of >5 for further functional analysis with the clusterProfiler package. In addition, we analyzed the effect of genes from those clusters on the overall survival (OS) and progression-free survival (PFS) of OC patients based on the Kaplan–Meier Plotter and PrognoScan databases. The gene clusters of upregulated DEGs are involved mainly in cell division and cell proliferation, and those of downregulated DEGs primarily participate in the TGF-β signaling pathway and various metabolic processes. Via cross-validation of survival analysis, we identified four hub genes (AURKA, CDCA5, CEP55 and UBE2C), all derived from the upregulated DEGs, that correlated with ovarian cancer OS and PFS. Patients with high expression of these four hub genes trended toward worse OS and PFS than patients with low expression. Unfortunately, no significantly prognosis-related genes were detected among the downregulated clusters.
The mRNA expression of these four genes was analyzed using the Oncomine and GEPIA databases, and the results revealed that the mRNA expression of AURKA, CDCA5, CEP55 and UBE2C was greatly upregulated in OC tissues compared with normal tissues. The gene set/pathway primarily showed alterations in amplification and deep deletion, and correlation analysis indicated significant positive correlations between AURKA, CDCA5, CEP55 and UBE2C. To clarify the protein expression levels of the four prognostic genes, immunohistochemistry (IHC) staining images were obtained from the Human Protein Atlas, and they showed that the protein levels of these four genes were significantly elevated in patients with OC. Our results suggest that these four hub genes may mediate ovarian cancer development and have valuable clinical potential as prognostic biomarkers for OC patients.


Introduction
Ovarian cancer (OC) is one of the most common gynecological malignancies, with high mortality and few prognostic biomarkers [1,2]. These characteristics are due partially to the lack of observable symptoms, and no specific screening method is available for the early detection of OC. Cytoreductive surgery in combination with neoadjuvant chemotherapy is the primary treatment for ovarian cancer. The rapid development of targeted drugs has provided more opportunities for the treatment of ovarian cancer patients, but the overall survival of patients has not improved significantly [3,4]. Therefore, it is necessary to explore potential genetic changes and molecular mechanisms involved in the occurrence and development of OC, which can help identify more specific biomarkers for diagnosis and prognosis assessment. Recently, interest has increased in applying gene expression profiling to help understand the pathogenesis of OC, develop biomarkers for early detection, and elucidate drug resistance mechanisms in OC [5]. Connor et al. [6] found that Thy-1 was upregulated in OC and clearly correlated with poor prognosis. Han et al. [7] reported that SOX30 can be used as a prognostic biomarker and chemoresistance indicator in OC. However, these studies focused mainly on single genes to expose the correlation between gene expression and OC, and determining whether differentially expressed genes (DEGs) have predictive and prognostic roles in OC remains difficult.
In the present study, we screened four gene expression datasets to identify the DEGs and to evaluate the associations between DEGs and the prognosis of ovarian cancer patients.
Through this analysis, we identified four hub genes (AURKA, CDCA5, CEP55 and UBE2C) that may serve as effective biomarkers for diagnosis and prognosis assessment in OC patients.

Preprocessing of microarray datasets and identification of DEGs
The four gene datasets contained a total of 48 normal ovarian tissues and 125 ovarian cancer tissues (Table S1), and the expression data were normalized before differential gene analysis (Figure S1). We initially identified 5,870, 4,040, 4,532 and 18,520 DEGs from GSE27651, GSE38666, GSE14407 and GSE18520, respectively, with the limma package in R and the criteria |logFC| >1 and adjusted P value <0.01 (Figure 1A-D). Then, all DEGs were divided into up- and downregulated genes according to whether logFC >0 or logFC <0, respectively. A Venn diagram tool was applied to identify the DEGs shared across the datasets within the sets of up- and downregulated DEGs. A total of 636 DEGs, including 269 upregulated and 367 downregulated DEGs, were identified in the OC samples compared to the normal ovarian tissue samples (Figure 1E&F). All of these DEGs are listed in Table S2.
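The thresholding and overlap steps described above can be sketched as follows. This is a minimal illustration in Python rather than the R workflow used in the study; the gene names, logFC values and adjusted P values are made up, and the helper names `split_degs` and `common_degs` are hypothetical.

```python
from typing import Dict, Set, Tuple

# Hypothetical per-dataset differential expression results:
# gene -> (logFC, adjusted P value). All values are invented for illustration.
Results = Dict[str, Tuple[float, float]]


def split_degs(results: Results, lfc_cutoff: float = 1.0,
               p_cutoff: float = 0.01) -> Tuple[Set[str], Set[str]]:
    """Return (upregulated, downregulated) gene sets for one dataset."""
    up, down = set(), set()
    for gene, (log_fc, adj_p) in results.items():
        # Keep a gene only if |logFC| > cutoff and adjusted P < cutoff.
        if abs(log_fc) > lfc_cutoff and adj_p < p_cutoff:
            # The sign of logFC (not its absolute value) decides the direction.
            (up if log_fc > 0 else down).add(gene)
    return up, down


def common_degs(datasets: Dict[str, Results]) -> Tuple[Set[str], Set[str]]:
    """Intersect the up and down sets across all datasets (the Venn overlap)."""
    ups, downs = zip(*(split_degs(r) for r in datasets.values()))
    return set.intersection(*ups), set.intersection(*downs)


toy = {
    "dataset_A": {"GENE1": (2.3, 0.001), "GENE2": (-1.8, 0.002), "GENE3": (0.4, 0.0001)},
    "dataset_B": {"GENE1": (1.6, 0.004), "GENE2": (-2.1, 0.0005), "GENE3": (1.9, 0.003)},
}
shared_up, shared_down = common_degs(toy)
print(sorted(shared_up), sorted(shared_down))
```

In this toy input, GENE3 passes the thresholds in only one dataset, so it drops out of the overlap even though it is significant there.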

PPI network construction and MCODE analysis
The PPI network of the 269 upregulated DEGs contained 256 nodes and 773 edges, from which the MCODE plugin of Cytoscape software identified 6 gene clusters (Figure S2A, Table S3). Likewise, the PPI network of the downregulated DEGs contained 330 nodes and 94 edges, from which MCODE identified 7 gene clusters (Figure S2B, Table S3). Next, two gene clusters with network scores of >10 were selected from the upregulated DEGs (Figure 2A&B), and two gene clusters with scores of >5 were selected from the downregulated DEGs (Figure 2C&D), as the most important molecular modules in the respective results; these were further analyzed for function and prognosis.
GO enrichment analysis showed that the two gene clusters of upregulated DEGs were primarily related to the biological processes (BP) of nuclear division and sister chromatid segregation in the cell cycle. The primary molecular functions included microtubule binding and various kinase regulator activities (Figure 3A&B). KEGG pathway analysis indicated that the genes were enriched mainly in the oocyte maturation, cell cycle, p53 signaling and cellular senescence pathways (Figures 4A-F).
In GO enrichment analysis, the two gene clusters of downregulated DEGs were involved mainly in the endoplasmic reticulum lumen and mitochondrial outer membrane as cellular components, protein modification and regulation of various signaling pathways as biological processes, and hormone activity and alcohol dehydrogenase [NAD(P)+] activity as molecular functions (Figure 5A&B). No signaling pathways were significantly enriched for the first gene cluster (P<0.05). The second gene cluster was mainly involved in many metabolic functions, such as tyrosine metabolism, fatty acid degradation and retinol metabolism (Figure 6A-C).
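Over-representation analysis of the kind reported here is commonly based on a hypergeometric test. The sketch below shows that calculation for a single term, written in Python for illustration; the function name `enrichment_p_value` and the toy counts are assumptions, and it does not reproduce the clusterProfiler implementation or its multiple-testing adjustment.

```python
from math import comb


def enrichment_p_value(universe_size: int, term_size: int,
                       cluster_size: int, overlap: int) -> float:
    """Upper-tail hypergeometric probability P(X >= overlap).

    universe_size: number of annotated background genes
    term_size:     background genes annotated to the GO/KEGG term
    cluster_size:  genes in the cluster being tested
    overlap:       cluster genes annotated to the term
    """
    max_overlap = min(term_size, cluster_size)
    total = comb(universe_size, cluster_size)
    # Sum the probability of every overlap at least as large as the observed one.
    tail = sum(
        comb(term_size, k) * comb(universe_size - term_size, cluster_size - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


# Toy numbers: 20 background genes, 5 in the term, a 4-gene cluster, 3 of them in the term.
p = enrichment_p_value(universe_size=20, term_size=5, cluster_size=4, overlap=3)
print(round(p, 4))
```

In practice the raw P values for all tested terms would then be adjusted for multiple testing before applying a cutoff.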

Survival analysis for identification of prognostic genes
The OS and PFS of patients were stratified by the expression of genes in the gene clusters of up- and downregulated DEGs using the Kaplan-Meier Plotter database. Twenty-six upregulated DEGs were found to correlate with OS and PFS in ovarian cancer. The results were visualized with the forestplot package in R (Figures 7A-D). However, only the CHGB and MAOB genes from the downregulated DEGs were relevant to prognosis in OC, and higher expression of these genes predicted better OS and PFS (Table S4).
To further examine the prognostic potential of the twenty-eight genes from the up- and downregulated DEGs, we reanalyzed the survival differences using the PrognoScan database. Interestingly, two cohorts (GSE9891 and GSE17260) [8,9], which included 278 samples and 110 samples at different stages of OC, showed that higher expression of AURKA, CDCA5, CEP55 and UBE2C was significantly associated with poorer prognosis (Figures 8A-H). Therefore, it is conceivable that high AURKA, CDCA5, CEP55 and UBE2C expression is an independent risk factor and leads to a poor prognosis in OC patients.
Using the GEPIA database, we further confirmed the differential mRNA expression of the four genes between OC and normal tissues. The results showed that mRNA expression levels of AURKA, CDCA5, CEP55 and UBE2C were significantly increased in OC tissues compared with normal ovarian tissues (tumor samples: n = 426 vs. normal samples: n = 88) (Figure 10). Additionally, expression of the four genes was analyzed by ovarian cancer stage. In contrast to CDCA5, mRNA expression of AURKA, CEP55 and UBE2C was not significantly associated with the FIGO stage of OC (Figure S3). IHC staining images were downloaded from the Human Protein Atlas, and the protein levels of these four prognostic genes were significantly elevated in tumor tissues compared with normal tissues (Figure 11).

Gene alteration and correlation analyses of prognostic genes
Alterations in AURKA, CDCA5, CEP55 and UBE2C in 1,680 cases from three TCGA datasets of ovarian serous carcinoma (606 cases from TCGA, Firehose Legacy; 585 cases from TCGA, PanCancer Atlas; and 489 cases from TCGA, Nature 2011) were analyzed using the cBioPortal database. Among the 3 OC datasets analyzed, the alterations of the four genes were mainly amplifications and deep deletions, and the alteration frequency of the submitted gene set ranged from 17.67% of 583 cases to 7.77% of 489 cases (Figure 12A). The percentages of AURKA, CDCA5, CEP55 and UBE2C gene alterations in OC were 6%, 2.1%, 1.3% and 4%, respectively (Figure 12B). Kaplan-Meier plots and log-rank test results indicated no significant difference in OS and PFS between the cases with alterations in one of the queried genes and those without alterations in any of the queried genes (P-values of 0.0851 and 0.213, respectively; Figure 12C&D).
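The relationship between per-gene alteration percentages and the gene-set frequency can be illustrated with a short sketch. The case identifiers and gene labels below are invented, and `alteration_frequency` is a hypothetical helper, not a cBioPortal function; it assumes the gene-set frequency counts each case once if any queried gene is altered.

```python
from typing import Dict, Set


def alteration_frequency(altered_by_gene: Dict[str, Set[str]],
                         n_cases: int) -> Dict[str, float]:
    """Percentage of cases altered per gene, plus cases altered in any queried gene."""
    freq = {gene: 100.0 * len(cases) / n_cases
            for gene, cases in altered_by_gene.items()}
    # A case altered in several genes is counted once for the gene set,
    # so this value is at most the sum of the per-gene percentages.
    any_altered = set().union(*altered_by_gene.values())
    freq["any_queried_gene"] = 100.0 * len(any_altered) / n_cases
    return freq


toy_alterations = {
    "GENE_A": {"case01", "case02", "case03"},
    "GENE_B": {"case03", "case04"},
}
toy_freq = alteration_frequency(toy_alterations, n_cases=20)
print(toy_freq)
```

Here case03 carries alterations in both genes, which is why the gene-set frequency is lower than the sum of the two per-gene values.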
The GeneMANIA database was used for correlation analysis of AURKA, CDCA5, CEP55 and UBE2C at the gene level (Figure 13A). The 4 central nodes were surrounded by 20 nodes representing genes that were closely related to the query genes in terms of physical interactions, co-expression, predictions, co-localization, and genetic interactions. The results revealed relationships in physical interactions, co-expression and co-localization between AURKA, CDCA5, CEP55 and UBE2C. AURKA and UBE2C shared the same pathway, and a predicted relationship was shared among AURKA, CEP55 and UBE2C. However, no shared protein domains or genetic interactions were noted among the four genes. Further functional analysis revealed that these four genes mainly correlated with regulation of cell division, including mitotic cell cycle, mitotic cytokinesis, mitotic sister chromatid segregation, mitosis, nuclear division and cell cycle G2/M phase transition (Table S5).

Immune infiltrate analysis of prognostic genes
Tumor inflammatory cell infiltration levels play an important role in cancer progression and patient survival. We analyzed whether AURKA, CDCA5, CEP55 and UBE2C expression correlated with inflammatory cell infiltration levels in OC. The results showed that the mRNA levels of the four genes correlated weakly with inflammatory cell infiltration (Figure 14). AURKA expression was related to macrophages, CD4+ T cells and neutrophils, but no significant correlation with tumor purity or infiltrating levels of B cells, CD8+ T cells and dendritic cells was noted. Apart from a negative correlation with CD8+ T cells, CDCA5 expression was not significantly correlated with other immune inflammatory cells or with tumor purity. CEP55 was negatively related to CD8+ T cells and positively related to CD4+ T cells. UBE2C was only associated with tumor purity and CD4+ T cells. The results suggest that expression of these four genes does not correlate closely with the immune inflammatory cell infiltration and tumor purity associated with the occurrence and development of OC. The detailed mechanisms require further study.

Discussion
OC is a gynecological malignancy with high mortality that is partially attributed to tumor heterogeneity, and its pathogenesis remains unclear. Recently, interest has been increasing in applying gene expression profiling to help understand the pathogenesis of OC, develop biomarkers for early detection, and elucidate drug resistance mechanisms in OC [14]. In the present study, we attempted to systematically explore the prognostic values, expression patterns, genetic alterations, correlations, and potential functions of different DEGs in OC. In total, 636 DEGs, including 269 upregulated and 367 downregulated DEGs, were identified between normal ovary tissue and OC tissue by screening 4 gene datasets. Then, based on the results of the gene cluster analyses, we selected two gene clusters of upregulated DEGs with network scores of >10 and two gene clusters of downregulated DEGs with scores of >5 for functional and survival analyses.
The gene clusters of upregulated DEGs were mainly involved in cell division and cell proliferation, and the genes of downregulated DEGs primarily participated in the TGF-β signaling pathway and various metabolic processes.
We focused the prognosis analysis on two gene clusters each of up- and downregulated DEGs. The results showed that the mRNA expression of AURKA, CDCA5, CEP55 and UBE2C was significantly increased in OC tissues compared with normal ovarian tissues and, through cross-validation, correlated with poorer prognosis. Additionally, the gene set/pathway of the four genes primarily showed alterations in amplification and deep deletion, and the percentages of individual AURKA, CDCA5, CEP55 and UBE2C gene alterations in OC were 6%, 2.1%, 1.3% and 4%, respectively. Correlation analysis indicated significant positive correlations between AURKA, CDCA5, CEP55 and UBE2C. Further functional analysis revealed that these four genes mainly take part in regulation of cell division, including mitotic cell cycle, mitotic sister chromatid segregation, mitosis and cell cycle G2/M phase transition, indicating that these genes may play a vital role in the occurrence and development of OC.
Aurora kinase A (AURKA) is a member of a family of three serine-threonine kinases that are highly conserved throughout evolution and play important roles in cell cycle regulation [15][16][17]. AURKA has pleiotropic roles in centrosome maturation, spindle assembly, cytokinesis and kinetochore function, all of which are required for proper mitotic progression [15,16]. AURKA expression levels and kinase activity are upregulated and relevant to poorer prognosis in many human cancers [18][19][20], indicating that AURKA might serve as a potential biomarker and target in the development of anticancer drugs. AURKA mediates RPS6KB1 phosphorylation at T389 to promote gastrointestinal cancer cell proliferation and survival upon inhibition of KRAS [21]. Knockdown of AURKA significantly inhibited OC cell-induced angiogenesis of endothelial cells compared to controls and also significantly inhibited cell proliferation, migration, and invasion of OC cells in vitro and in vivo [22]. We found that increased AURKA mRNA expression was closely associated with worse OS and PFS in OC patients, and the underlying mechanisms need to be further elucidated.
Cell-division cycle-associated 5 (CDCA5), also known as sororin, is thought to play a critical role in embryonic development, in the establishment or maintenance of cohesion, and in ensuring the accurate separation of sister chromatids during the S and G2/M phases of the cell cycle through interactions with cohesin and CDK1 [23,24]. CDCA5 functions at the central region of the synaptonemal complex during meiosis in collaboration with SGO2-PP2A [25]. In addition, studies have indicated that the expression of CDCA5 correlates with tumorigenesis and development in several cancers, including non-small cell lung cancer, colorectal cancer, and esophageal squamous cell cancer [26][27][28]. However, the gene has not been reported in ovarian cancer and deserves further study.
Centrosomal protein 55 (CEP55) is dispensable for mouse embryonic development and adult tissue homeostasis [29]. CEP55 mediates the physical separation of daughter cells and completes cell division through binding GPPX3Y motifs of ALIX and TSG101 to block similar GPPX3Y motifs of ALIX and TSG101 from interacting and localizing to the midbody [30]. Recently, CEP55 has been studied in various cancers, including those of the colon, liver, breast and lung. High levels of CEP55 promoted migration and invasion and were closely related to poor prognosis in multiple tumor types [31,32]. Despite its roles as a biomarker and driver of tumorigenesis [33,34], how CEP55 contributes to OC development remains unclear, and this is a valuable avenue to explore in future studies.
Ubiquitin-conjugating enzyme 2C (UBE2C) is an integral component of the ubiquitin proteasome system and participates in destruction of the spindle assembly checkpoint and mitotic exit. UBE2C consists of a conserved core domain containing the catalytic Cys residue and an N-terminal extension. The UBE2C N-terminal extension regulates E3 enzyme activity as a part of an intrinsic inhibitory mechanism [35]. UBE2C mRNA and/or protein levels are aberrantly increased and correlated with poor clinical outcomes in many cancer types; UBE2C is therefore considered a potential cancer biomarker [35][36][37]. Increased UBE2C expression was noted in oral squamous cell carcinoma (OSCC) compared to control cells and was associated with poor cell differentiation and lymph node invasion. Moreover, suppression of UBE2C expression decreased cell proliferation, migration/invasion, and colony formation and reduced expression of the cancer stem cell markers ALDH1/A2, CD44, CD166 and EpCAM [38]. In patients with HR+/HER2- breast cancer, high UBE2C expression was associated with significantly inferior survival with pN0 and pN1 tumors but not pN2/N3 tumors (P < 0.05). UBE2C depletion markedly inhibited cell proliferation and increased the cytotoxicity of tamoxifen by inducing apoptosis [39]. The present study also revealed that UBE2C overexpression correlates with poor clinical outcomes. However, few reports have addressed the correlation of UBE2C with these characteristics in OC, and the underlying mechanism of UBE2C in ovarian tumors requires further exploration.
Our analysis showed that high expression of the AURKA, CDCA5, CEP55 and UBE2C genes correlates significantly with poor prognosis in OC patients. These four genes represent promising novel candidates and potential prognostic biomarkers for OC. However, the present study had certain limitations because the main sources of our clinical information were datasets from the GEO and TCGA databases, which are retrospective data. Therefore, the potential mechanisms of these genes should be elucidated further using other external datasets with complete clinical information and gene expression information to better understand the role these genes play in the pathogenesis and progression of OC. Moreover, the protein expression levels of the prognosis-related DEGs and their potential mechanisms in OC must be validated in further experimental studies.

Materials and Methods
Downloaded datasets and identification of differentially expressed genes (DEGs)
Four datasets associated with OC were identified and downloaded from the GEO database (https://www.ncbi.nlm.nih.gov/geo/): GSE38666 [40], GSE27651 [41], GSE14407 [42] and GSE18520 [43]. These datasets contain a total of 48 normal ovarian (OV) tissues and 125 OC tissues. The original GEO data were subjected to background correction and normalization [44] and then converted into expression measures using R software [45]. The DEGs were screened in each dataset with the limma package [46] in R software, and the selection criteria were defined as |log fold change (FC)| >1 and a P value of <0.05. The DEGs across the four datasets were identified by using a Venn diagram tool (http://bioinformatics.psb.ugent.be/webtools/Venn/).

Protein-protein interaction (PPI) network construction and molecular complex detection (MCODE) analysis
The up- and downregulated DEGs were used separately to construct a PPI network using the STRING (https://string-db.org/) database with the following specifications: (1) human species; (2) meaning of network edges was set as evidence with a minimum required interaction score of 0.4; (3) active interaction sources of text mining, experiments, and databases; and (4) hiding of disconnected nodes in the network. Then, we downloaded the PPI data and further explored gene cluster status for the up- and downregulated DEGs separately using the MCODE and cytoHubba plugins in Cytoscape software (version 3.7.1, https://cytoscape.org/). MCODE data were filtered according to the default parameters as follows: "Degree Cutoff = 2," "Node Score Cutoff = 0.2," "K-Core = 2" and "Max. Depth = 100" [47]. Based on the results of the gene cluster analyses, we selected the top 2 gene clusters as the most important molecular modules in the respective results and performed Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses with the clusterProfiler package [48]; the results were visualized with the GOplot package [49]. An adjusted P-value <0.05 was established as the selection criterion.
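The score threshold and the hiding of disconnected nodes can be sketched on an exported edge list as follows. This is an illustrative Python example with made-up protein labels and scores; `filter_ppi` is a hypothetical helper, and it assumes interaction scores have been rescaled to the 0 to 1 range.

```python
from typing import Iterable, List, Set, Tuple

# (protein_a, protein_b, combined interaction score on an assumed 0-1 scale)
Edge = Tuple[str, str, float]


def filter_ppi(edges: Iterable[Edge], queried: Set[str],
               min_score: float = 0.4) -> Tuple[List[Edge], Set[str], Set[str]]:
    """Keep edges at or above min_score; return (kept_edges, connected, hidden)."""
    # Drop low-confidence edges and self-interactions.
    kept = [e for e in edges if e[2] >= min_score and e[0] != e[1]]
    # Nodes that still have at least one edge remain visible in the network.
    connected = {p for a, b, _ in kept for p in (a, b)}
    # Queried proteins without any remaining edge are the "disconnected" nodes.
    hidden = queried - connected
    return kept, connected, hidden


toy_edges = [("P1", "P2", 0.92), ("P2", "P3", 0.41), ("P3", "P4", 0.15)]
kept, connected, hidden = filter_ppi(
    toy_edges, queried={"P1", "P2", "P3", "P4", "P5"}
)
print(len(connected), "nodes,", len(kept), "edges; hidden:", sorted(hidden))
```

This is also why the node count of a displayed network can be smaller than the number of genes submitted: genes with no edge above the threshold are not drawn.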

Survival analysis for identification of prognostic genes
We investigated the effects of genes from the most important molecular modules on the overall survival (OS) and progression-free survival (PFS) of OC patients using the Kaplan-Meier (KM) Plotter database, and the results were visualized with the forestplot package of R [50]. The Kaplan-Meier Plotter database is an online database containing microarray gene expression data and survival information derived from GEO, EGA and The Cancer Genome Atlas (TCGA), which can be used to assess survival in 21 cancer types. The OS and PFS of patients with OC were determined by dividing the patient samples into two groups based on median expression (high vs. low expression) and evaluated with a hazard ratio with 95% confidence intervals and a log-rank P-value.
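The median split and log-rank comparison described above can be sketched as follows. This is a minimal standard-library Python illustration on invented patient data, not the Kaplan-Meier Plotter implementation; the function names are hypothetical, and the hazard ratio and its confidence interval (typically obtained from a Cox model) are not computed here.

```python
from math import erfc, sqrt
from statistics import median
from typing import List, Tuple

Patient = Tuple[float, float, bool]  # (expression, follow-up time, event observed)
Group = List[Tuple[float, bool]]     # (follow-up time, event observed)


def median_split(patients: List[Patient]) -> Tuple[Group, Group]:
    """Split patients into high (> median expression) and low (<= median) groups."""
    cutoff = median(p[0] for p in patients)
    high = [(t, e) for x, t, e in patients if x > cutoff]
    low = [(t, e) for x, t, e in patients if x <= cutoff]
    return high, low


def logrank(high: Group, low: Group) -> Tuple[float, float]:
    """Two-group log-rank test; returns (chi-square statistic, P value, 1 df)."""
    event_times = sorted({t for t, e in high + low if e})
    obs_minus_exp, variance = 0.0, 0.0
    for t in event_times:
        n_high = sum(1 for ti, _ in high if ti >= t)  # at risk, high group
        n_low = sum(1 for ti, _ in low if ti >= t)    # at risk, low group
        d_high = sum(1 for ti, e in high if ti == t and e)
        d_low = sum(1 for ti, e in low if ti == t and e)
        n, d = n_high + n_low, d_high + d_low
        # Observed minus expected events in the high group at this time.
        obs_minus_exp += d_high - d * n_high / n
        if n > 1:
            variance += d * (n_high / n) * (n_low / n) * (n - d) / (n - 1)
    if variance == 0:
        return 0.0, 1.0
    chi2 = obs_minus_exp ** 2 / variance
    # Survival function of a chi-square distribution with 1 degree of freedom.
    return chi2, erfc(sqrt(chi2 / 2))


# Invented cohort: (expression, months of follow-up, death observed)
toy_patients = [
    (9.1, 5, True), (8.7, 8, True), (8.2, 12, True), (7.9, 20, False),
    (3.2, 15, True), (2.8, 25, True), (2.5, 30, False), (1.9, 32, False),
]
high_group, low_group = median_split(toy_patients)
chi2, p_value = logrank(high_group, low_group)
print(f"chi-square = {chi2:.2f}, log-rank P = {p_value:.3f}")
```

Censored patients (event flag `False`) still count as at risk until their last follow-up time, which is what distinguishes this test from a simple comparison of death counts.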
The PrognoScan database [51] was used to cross-validate the prognosis-related genes that were obtained from the KM Plotter database. The PrognoScan database is a large collection of publicly available cancer microarray datasets with clinical annotation as well as a tool for assessing the biological relationship between gene expression and prognosis, including OS and PFS. The threshold was set to a Cox P-value <0.05.

Analysis of the differential mRNA and protein expression of prognostic genes
To identify prognostic genes that play an important role in OC, we used the ONCOMINE database [52,53] to analyze the mRNA levels of DEGs relevant to prognosis for OC patients. ONCOMINE (www.oncomine.org) is a publicly accessible online cancer microarray database that facilitates research based on genome-wide expression analyses. The thresholds were restricted as follows: P-value = 0.001, fold-change = 1.5, a top 10% gene ranking and mRNA data.
To further determine the differential expression of prognostic genes, tumor and normal tissues were analyzed using the GEPIA database [54], which is an interactive web server that analyzes mRNA expression data based on 9,736 tumors and 8,587 normal samples from the TCGA and Genotype-Tissue Expression (GTEx) projects.
In addition, to clarify the protein expression levels of the hub genes, we downloaded immunohistochemistry (IHC) staining data and images from the Human Protein Atlas (HPA, http://www.proteinatlas.org/) [55], explored whether there were statistically significant differences in the expression of hub genes between the normal and OC tissues, and present representative IHC images.

Alteration and correlation analyses of prognostic genes
The cBioPortal for Cancer Genomics (http://www.cbioportal.org/) is an open-source platform that provides information regarding the integrative analysis of complex cancer genomics and clinical profiles from more than 200 cancer studies in The Cancer Genome Atlas (TCGA) [56]. The cBioPortal contains an extensive variety of data, including DNA copy numbers, DNA methylation, mRNA and microRNA expression levels, and nonsynonymous mutations. In this study, the cBioPortal database was used to analyze genetic variations in prognostic genes. In addition, the database was used to evaluate correlations between prognostic genes. GeneMANIA (http://www.genemania.org) provides a flexible web interface for deriving hypotheses based on gene functions [57]. GeneMANIA generates a list of genes with similar functions to the query gene and constructs an interactive functional-correlation network to illustrate relationships between genes and datasets. In this study, the database was utilized to construct a gene-gene interaction network for prognostic genes in terms of physical interactions, co-expression, predictions, co-localization, and genetic interaction as well as to evaluate their functions.

Tables
Table 1 is not available with this version.

Figure 1
We initially identified 5,870, 4,040, 4,532 and 18,520 DEGs from GSE27651, GSE38666, GSE14407 and GSE18520, respectively, with the limma package in R and the criteria |logFC| >1 and adjusted P value <0.01 (Figure 1A-D). Then, all DEGs were divided into up- and downregulated genes according to whether logFC >0 or logFC <0, respectively. A Venn diagram tool was applied to identify the DEGs shared across the datasets within the sets of up- and downregulated DEGs. A total of 636 DEGs, including 269 upregulated and 367 downregulated DEGs, were identified in the OC samples compared to the normal ovarian tissue samples (Figure 1E&F). All of these DEGs are listed in Table S2.

Figure 2
Two gene clusters with network scores of >10 were selected from the upregulated DEGs (Figure 2A&B), and two gene clusters with scores of >5 were selected from the downregulated DEGs (Figure 2C&D), as the most important molecular modules in the respective results; these were further analyzed for function and prognosis.

Figure 3
The primary molecular functions included microtubule binding and various kinase regulator activities (Figure 3A&B).

Figure 5
In GO enrichment analysis, the two gene clusters of downregulated DEGs were involved mainly in the endoplasmic reticulum lumen and mitochondrial outer membrane as cellular components, protein modification and regulation of various signaling pathways as biological processes, and hormone activity and alcohol dehydrogenase [NAD(P)+] activity as molecular functions (Figure 5A&B). No signaling pathways were significantly enriched for the first gene cluster (P<0.05).

Figure 6
No signaling pathways were significantly enriched for the first gene cluster (P<0.05). The second gene cluster was mainly involved in many metabolic functions, such as tyrosine metabolism, fatty acid degradation and retinol metabolism (Figure 6A-C).

Figure 7
Twenty-six upregulated DEGs were found to correlate with OS and PFS in ovarian cancer. The results were visualized with the forestplot package in R (Figures 7A-D).

Figure 8
To further examine the prognostic potential of the twenty-eight genes from the up- and downregulated DEGs, we reanalyzed the survival differences using the PrognoScan database. Interestingly, two cohorts (GSE9891 and GSE17260) [8,9], which included 278 samples and 110 samples at different stages of OC, showed that higher expression of AURKA, CDCA5, CEP55 and UBE2C was significantly associated with poorer prognosis (Figures 8A-H). Therefore, it is conceivable that high AURKA, CDCA5, CEP55 and UBE2C expression is an independent risk factor and leads to a poor prognosis in OC patients.

Figure 9
We compared the transcriptional levels of these four genes in cancers with those in normal tissue samples using the ONCOMINE database ( Figure 9).

Figure 10
Using the GEPIA database, we further confirmed the differential mRNA expression of the four genes between OC and normal tissues. The results showed that mRNA expression levels of AURKA, CDCA5, CEP55 and UBE2C were significantly increased in OC tissues compared with normal ovarian tissues (tumor samples: n = 426 vs. normal samples: n = 88) (Figure 10).

Figure 11
IHC staining images were downloaded from the Human Protein Atlas, and the protein levels of these four prognostic genes were significantly elevated in tumor tissues compared with normal tissues (Figure 11).

Figure 12
Alterations in AURKA, CDCA5, CEP55 and UBE2C in 1,680 cases from three TCGA datasets of ovarian serous carcinoma (606 cases from TCGA, Firehose Legacy; 585 cases from TCGA, PanCancer Atlas; and 489 cases from TCGA, Nature 2011) were analyzed using the cBioPortal database. Among the 3 OC datasets analyzed, the alterations of the four genes were mainly amplifications and deep deletions, and the alteration frequency of the submitted gene set ranged from 17.67% of 583 cases to 7.77% of 489 cases (Figure 12A). The percentages of AURKA, CDCA5, CEP55 and UBE2C gene alterations in OC were 6%, 2.1%, 1.3% and 4%, respectively (Figure 12B).

Figure 13
The GeneMANIA database was used for correlation analysis of AURKA, CDCA5, CEP55 and UBE2C at the gene level (Figure 13A). The Spearman correlations between AURKA, CDCA5, CEP55 and UBE2C in OC were determined by online analyses using the cBioPortal database (TCGA, PanCancer Atlas) (Figure 13B).

Figure 14
Tumor inflammatory cell infiltration levels play an important role in cancer progression and patient survival. We analyzed whether AURKA, CDCA5, CEP55 and UBE2C expression correlated with inflammatory cell infiltration levels in OC. The results showed that the mRNA levels of the four genes correlated weakly with inflammatory cell infiltration (Figure 14).
