Screening key genes related to ovarian cancer and exploring their possible molecular mechanisms

Background As one of the most common malignant tumors in women, ovarian cancer (OC) often presents with atypical clinical symptoms at an early stage. It is therefore particularly important to find more effective biomarkers for the early diagnosis of OC. In addition, although many sequencing and microarray studies have addressed the pathogenesis of OC, its pathogenesis and its clinical and genetic features remain unclear.
Methods In this study, 4 GEO datasets (GSE66957, GSE119054, GSE14407 and GSE54388) were selected for differentially expressed gene (DEG) analysis, and the most important of the genes shared by the 4 DEG sets were taken as Hub genes. GO and pathway enrichment analyses were then conducted to confirm the enrichment of these Hub genes, and key genes were identified among them. In addition, the transcriptional levels of these Hub genes in OC and their impact on the overall survival of OC were validated with the UCSC and TCGA datasets. Furthermore, cBioPortal, TargetScan, UCSC, DiseaseMeth and TIMER were used to explore the potential biological functions of these key genes in OC.
Results We screened out 10 Hub genes related to OC (VEGFA, ZWINT, CDKN2A, SLC2A1, TOP2A, MKI67, CCND1, KPNA2, FGF2 and SMC4) and further showed that they were most significantly enriched in protein binding, cytoplasm, nucleus, extracellular exosome, membrane, cell division, cell adhesion and pathways in cancer. CCND1, TOP2A, SMC4 and FGF2 were screened out as key candidate genes associated with OC. Further analysis suggested that these key candidate genes may regulate the occurrence and development of OC through gene mutation, miRNAs and epigenetic modifications such as methylation and acetylation.
Conclusion These data improve our understanding of the causes and underlying molecular events of OC, are of clinical significance for its early diagnosis and prevention, and may provide promising therapeutic targets in OC.
Background
OC is one of the most common malignant tumors among gynecological diseases, accounting for 5% of malignant tumors in women, and its incidence ranks third after cervical cancer and uterine body cancer. There are still more than 14 [3,,,]. However, these biomarker genes still lack good early diagnostic performance in clinical practice, and the incidence and mortality of OC remain high. It is therefore essential to understand the precise molecular mechanisms involved in the occurrence, proliferation and recurrence of OC and to formulate effective diagnosis and treatment strategies, thereby preventing the occurrence of OC and reducing its mortality. In this study, we used a series of bioinformatics analyses to screen out Hub genes that might be involved in the development of OC and explored the potential biological processes of the key candidate genes in OC.
The workflow of this study is divided into seven main steps: (1) select 4 raw microarray datasets (GSE66957, GSE119054, GSE14407 and GSE54388) related to OC from the GEO database (https://www.ncbi.nlm.nih.gov/geo/) for differentially expressed gene (DEG) analysis; (2) analyze the Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment of the genes shared by the 4 DEG sets with DAVID (https://david.ncifcrf.gov/); (3) construct the protein-protein interaction (PPI) network with the STRING online software (https://string-db.org/) and Cytoscape to analyze the correlation among these common DEGs; (4) screen out the 10 Hub genes related to OC; (5) validate the survival of OC and the transcriptional levels of these 10 Hub genes in OC with cBioPortal (http://www.cbioportal.org/) and GEPIA (http://gepia.cancer-pku.cn/), respectively; (6) identify the potential key candidate genes related to the occurrence and development of OC; and (7) use TargetScan (http://www.targetscan.org/vert_72/), TIMER (https://cistrome.shinyapps.io/timer/), UCSC (http://genome.ucsc.edu/index.html) and DiseaseMeth (http://biobigdata.hrbmu.edu.cn/diseasemeth/index.html) to explore the targeting miRNAs, gene mutation, acetylation and methylation of these key genes and their relationship with tumor immune infiltration.
In conclusion, we screened out the common DEGs, the Hub genes and the key candidate genes through a series of bioinformatics methods. The potential molecular mechanisms involved in the regulation of OC development were further explored on the basis of the TCGA and UCSC databases, which would improve our understanding of its causes and underlying molecular events. These candidate genes may offer promising therapeutic targets for future OC research. Fig. 1B.

Screening of common DEGs
We divided each microarray dataset into a malignant tumor group and a normal group and screened for DEGs between the 2 groups with GEO2R (https://www.ncbi.nlm.nih.gov/geo/geo2r/), an online analysis tool on the GEO database that automatically screens for DEGs between two sets of microarray data.
DEGs between the 2 groups were considered statistically significant at an adjusted P value <0.05 and |log2 fold change (FC)|≥1 (Table 2). Cluster analysis of the expression of the top 100 DEGs (50 up-regulated and 50 down-regulated) between the 2 groups is shown in Fig. 2 (plotted in R). Subsequently, we screened out 103 common DEGs (73 up-regulated and 30 down-regulated) with the Venn diagram online software (http://bioinformatics.psb.ugent.be/webtools/Venn/) (Table 3 and Fig. 3).
GO enrichment analysis (Fig. 4) showed that the common DEGs were mostly enriched in protein binding, cytoplasm, nucleus, extracellular exosome, membrane, cell division and cell adhesion. Moreover, the top 10 significantly enriched biological processes were all associated with the up-regulated genes; the down-regulated genes were enriched only in nervous system development, cell differentiation and somatic stem cell population maintenance. In addition, KEGG pathway enrichment analysis showed that the common DEGs were largely enriched in the p53 signaling pathway, pathways in cancer, bladder cancer and pancreatic cancer. Taken together, the up-regulated common DEGs were enriched in these pathways, whereas the down-regulated DEGs showed no significant pathway enrichment.
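The screening thresholds and the overlap step described in this section can be illustrated with a minimal Python sketch. It is not the GEO2R or Venn web-tool implementation: the `GeneStat` record, the function names and the toy gene labels are hypothetical, and the sketch assumes that each dataset has already been reduced to per-gene log2 fold changes and adjusted P values.

```python
from typing import Iterable, List, NamedTuple, Set, Tuple


class GeneStat(NamedTuple):
    """One row of a differential-expression table (hypothetical layout)."""
    gene: str
    log2_fc: float
    adj_p: float


def select_degs(
    rows: Iterable[GeneStat],
    adj_p_cutoff: float = 0.05,
    log2_fc_cutoff: float = 1.0,
) -> Tuple[Set[str], Set[str]]:
    """Return (up, down) gene sets with adj. P < cutoff and |log2 FC| >= cutoff."""
    up: Set[str] = set()
    down: Set[str] = set()
    for row in rows:
        if row.adj_p < adj_p_cutoff and abs(row.log2_fc) >= log2_fc_cutoff:
            (up if row.log2_fc > 0 else down).add(row.gene)
    return up, down


def common_degs(
    per_dataset: List[Tuple[Set[str], Set[str]]],
) -> Tuple[Set[str], Set[str]]:
    """Intersect the up sets and, separately, the down sets across datasets."""
    if not per_dataset:
        raise ValueError("need at least one dataset")
    ups, downs = zip(*per_dataset)
    return set.intersection(*ups), set.intersection(*downs)


if __name__ == "__main__":
    # Toy tables with made-up gene labels and values (not data from the study).
    dataset_1 = [
        GeneStat("GENE_A", 2.3, 0.001),
        GeneStat("GENE_B", -1.4, 0.01),
        GeneStat("GENE_C", 0.4, 0.0001),
    ]
    dataset_2 = [
        GeneStat("GENE_A", 1.1, 0.03),
        GeneStat("GENE_B", -2.0, 0.2),
        GeneStat("GENE_C", 3.0, 0.001),
    ]
    up, down = common_degs([select_degs(dataset_1), select_degs(dataset_2)])
    print("common up:", sorted(up), "common down:", sorted(down))
```

Intersecting the up-regulated and down-regulated sets separately is a design choice of this sketch: a gene that changes in opposite directions in two datasets is then not counted as a common DEG.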

Screening of Hub genes
We imported these 103 common DEGs into STRING and Cytoscape to construct the PPI network.

Analysis and validation of Hub genes
The 10 Hub genes were imported into cBioPortal, an online analysis tool for the TCGA database, to explore their co-expression and network in OC (Fig. 5C). The transcriptional levels of the Hub genes and their relation to tumor grade were validated in the TCGA database with GEPIA (Fig. 6). The results showed that only FGF2 was lowly expressed and the other 9 Hub genes were highly expressed in OC, which was consistent with the results obtained from the GEO database. The transcriptional levels of SLC2A1, VEGFA and ZWINT differed across OC grades, whereas the expression of the other genes showed no significant difference across grades. Moreover, we studied the effect of these Hub genes on the overall survival of OC. Four key candidate genes related to the survival and prognosis of OC, CCND1 (up), TOP2A (up), SMC4 (up) and FGF2 (down), were screened out (Fig. 7).
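To make the survival comparison concrete, the following sketch computes a Kaplan-Meier estimate after splitting samples by expression level. It is only an illustration: the text does not state the expression cut-off or the exact procedure applied by the web tools, so the median split, the function names and the input layout are assumptions.

```python
from statistics import median
from typing import List, Sequence, Tuple


def median_split(expression: Sequence[float]) -> List[str]:
    """Label each sample 'high' or 'low' relative to the median expression."""
    cut = median(expression)
    return ["high" if value > cut else "low" for value in expression]


def kaplan_meier(
    times: Sequence[float], events: Sequence[int]
) -> List[Tuple[float, float]]:
    """Return [(t, S(t))] at each distinct event time.

    times  -- follow-up time of each patient
    events -- 1 if death was observed at that time, 0 if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at the same time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def curves_by_group(
    expression: Sequence[float],
    times: Sequence[float],
    events: Sequence[int],
) -> dict:
    """Kaplan-Meier curves for the high- and low-expression groups."""
    labels = median_split(expression)
    result = {}
    for group in ("high", "low"):
        idx = [k for k, label in enumerate(labels) if label == group]
        result[group] = kaplan_meier(
            [times[k] for k in idx], [events[k] for k in idx]
        )
    return result
```

A formal comparison of the two curves would additionally need a test such as the log-rank test, which is omitted here to keep the sketch short.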

Functions of key candidate genes and their regulatory mechanisms in OC
At this point, our study has identified 4 key candidate genes for OC. However, why the expression of these key candidate genes changes significantly, and through which molecular mechanisms they regulate the occurrence of OC, remain unclear.
More in-depth study is carried out to explore the potential molecular and biological  8A). Interestingly, a pairwise correlation analysis between FGF2 and CCND1, TOP2A and SMC4 showed that they were not correlated with each other (Fig. 8B).
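As an illustration of a pairwise correlation analysis between the expression levels of two genes, the sketch below computes the Pearson correlation coefficient. The text does not say which coefficient the analysis used, so the choice of Pearson's r, the function name and the inputs are assumptions.

```python
from math import sqrt
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation between two equally long expression vectors."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two vectors of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    s_yy = sum((b - mean_y) ** 2 for b in y)
    if s_xx == 0 or s_yy == 0:
        raise ValueError("correlation is undefined for a constant vector")
    return s_xy / sqrt(s_xx * s_yy)
```

Values of r close to 0 indicate no linear relationship between two genes across samples, while values near 1 or -1 indicate strong positive or negative co-expression.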
To further explore the mechanisms by which these key candidate genes were differentially expressed in OC, we used cBioPortal to examine the proportions of mutation types and the clustered expression of the 4 key candidate genes in OC. These key candidate genes exhibited different mutation types and frequencies.
The results showed that there was no mutation in CCND1, whereas the other three key genes carried mutations, the most common of which were missense mutations (Fig. 8C).
Moreover, we used TIMER, UCSC and DiseaseMeth to analyze the tumor immune infiltration, acetylation and methylation of these genes, respectively.
The results showed no significant correlation between the expression of these key genes and tumor immune infiltration (Fig. 9A, plotted by TIMER). The epigenetic regulation (acetylation, methylation and lncRNA) of these genes was subsequently examined in the UCSC database. The results indicated that all 4 key genes had methylation and acetylation sites (Fig. 9B and Fig. 10A). However, there was no enrichment of lncRNA around the sequences of these 4 key candidate genes. Surprisingly, the methylation levels of CCND1, TOP2A and SMC4 were higher in OC than in normal tissue, whereas the methylation level of FGF2 showed the opposite result (Fig. 10B, plotted by DiseaseMeth). In addition, CCND1, TOP2A, SMC4 and FGF2 were predicted to bind miR-142-5p, miR-144-3p, miR-219-5p and miR-23-3p, respectively (Fig. 11, plotted by TargetScan).
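The idea behind a predicted miRNA binding site can be sketched as a search for canonical seed matches, that is, stretches of a transcript complementary to nucleotides 2-8 of the miRNA. This is a simplified illustration only and does not reproduce TargetScan, whose predictions also weigh site context and conservation; the function names and the toy sequences are hypothetical and are not real miRNA or gene sequences.

```python
from typing import List

# RNA base-pairing table (Watson-Crick pairs only).
_COMPLEMENT = str.maketrans("ACGU", "UGCA")


def seed_site(mirna: str) -> str:
    """Return the mRNA motif (5'->3') that pairs with miRNA nucleotides 2-8."""
    mirna = mirna.upper().replace("T", "U")
    if len(mirna) < 8:
        raise ValueError("miRNA sequence must be at least 8 nt long")
    seed = mirna[1:8]  # nucleotides 2-8, counted from the 5' end
    return seed.translate(_COMPLEMENT)[::-1]  # reverse complement


def find_seed_matches(utr: str, mirna: str) -> List[int]:
    """Return 0-based start positions of seed-match sites in a 3' UTR."""
    utr = utr.upper().replace("T", "U")
    site = seed_site(mirna)
    width = len(site)
    return [
        start
        for start in range(len(utr) - width + 1)
        if utr[start:start + width] == site
    ]


if __name__ == "__main__":
    # Made-up sequences for demonstration only.
    toy_mirna = "UAACGGCUAGCAUCGAUCGAUC"
    toy_utr = "GGGAGCCGUUCCC"
    print(seed_site(toy_mirna), find_seed_matches(toy_utr, toy_mirna))
```

Because a 7-nt motif occurs by chance fairly often in long sequences, a single transcript can contain several candidate sites for the same or different miRNAs.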

Discussion
OC is one of the most common malignant tumors of the female reproductive organs, and its incidence ranks third after cervical cancer and uterine body cancer. Currently, OC is
In this study, we combined the GEO database, TCGA data and a variety of bioinformatics methods to screen out the key genes related to OC and to further explore the possible molecular mechanisms involved. Through GEO2R, GO and KEGG pathway analysis, we found that VEGFA, ZWINT, CDKN2A, SLC2A1, TOP2A, CCND1, KPNA2 and SMC4 were significantly up-regulated in OC, while MKI67 and FGF2 were significantly down-regulated. These genes mainly functioned in protein binding, cytoplasm, nucleus, extracellular exosome, membrane, cell division and cell adhesion. Moreover, our analyses with TCGA, UCSC, TIMER and TargetScan suggested that CCND1, TOP2A, SMC4 and FGF2 might regulate the occurrence of OC through miRNAs, gene mutations, histone modifications (H3K4Me and H3K27Ac) and methylation. Although no lncRNA was associated with these key genes in our analysis, lncRNAs may be involved in ovarian cancer through other genes.
CCND1, a proto-oncogene associated with the cell division cycle, has been reported to be related to breast cancer, bladder cancer, parathyroid cancer, lymphoma and lung cancer, and it promotes cell proliferation by mediating the transition from G1 phase to S phase of the cell cycle. Some studies have also shown that CCND1 participates in the regulation of OC. CCND1 silencing can induce DNA double-strand breaks, thus inhibiting the growth of OC. However, the mechanism is still unknown. Therefore, the mutation, methylation and acetylation analyses that we performed through bioinformatics provide a reference for further in-depth study.
Topoisomerase (DNA) II alpha (TOP2A) can convert supercoiled DNA into a relaxed state. Previous studies have confirmed that TOP2A is generally overexpressed in OC tissues, and it has been speculated that TOP2A may function as a potential
On the other hand, the GO and KEGG pathway enrichment results showed that the common DEGs were most significantly enriched in protein binding, cytoplasm, nucleus, extracellular exosome, membrane, cell division, cell adhesion, pathways in cancer, bladder cancer, pancreatic cancer and the p53 signaling pathway. Given recent theories that OC exosomes can act as coordinators of pre-metastatic niche formation, as biomarkers suitable for liquid biopsy and as targets for chemotherapy, it is very important to determine whether these key genes are present in OC exosomes. P53 mutations are associated with poor prognosis, because p53 can suppress cancer progression by inducing cell cycle arrest or apoptosis and can respond to a variety of cell stress signals. Based on these data, we propose that CCND1, TOP2A, SMC4 and FGF2 may regulate the occurrence of OC through miRNAs, gene mutation and epigenetic modifications (methylation and acetylation). Unfortunately, no research has yet explored the regulatory roles of these key candidate genes in OC.
The histone modification of the 4 key candidate genes is particularly noteworthy. In our study, the UCSC results showed that CCND1, TOP2A, SMC4 and FGF2 had H3K4Me and H3K27Ac histone modification sites. Previous studies have shown that H3K4Me modification of these genes can upregulate the expression of BIRC3, thereby inhibiting the growth of OC by promoting apoptosis and inhibiting proliferation. Based on these data, a CCND1/TOP2A/SMC4/FGF2-H3K4Me-BIRC3 pathway may be involved in OC.

Conclusions
In this study, 103 common DEGs related to OC were screened out from the GEO database using R software and bioinformatics analysis. GO and KEGG pathway enrichment analysis showed that these DEGs were most significantly enriched in protein binding, cytoplasm, nucleus, extracellular exosome, membrane, cell division, cell adhesion, pathways in cancer, bladder cancer, pancreatic cancer and the p53 signaling pathway, which provides a theoretical basis for studying the biological processes of OC. We then screened out 10 Hub genes by comprehensive bioinformatics analysis and selected 4 key candidate genes related to OC through survival analysis and transcription level analysis. Finally, we explored the mechanisms of these 4 key candidate genes with various online tools. Further exploration would help clarify the interactions among these 4 key candidate genes, which might regulate the development of OC through molecular mechanisms such as gene mutation, regulation by targeting miRNAs, acetylation and methylation.
The results of this study improve our understanding of the pathogenesis of OC and of the molecular mechanisms underlying its occurrence and development. These findings might also have important clinical significance for the early diagnosis, treatment and prevention of OC and be helpful for subsequent experimental studies seeking potential key molecular targets in OC. However, these results are derived from large-scale data analysis and need to be confirmed by more direct experimental studies.

Declarations Not applicable
Funding This study was supported by the National Science Foundation of China (No. 8187060862).
The funder had no role in study design, data collection and analysis, except for bioinformatics training, writing the manuscript, and decision to publish.

Availability of data and materials
The expression data associated with this article are available in the GEO database (https://www.ncbi.nlm.nih.gov/geo/).

Authors' contributions
JZL performed comparative analysis using bioinformatics tools. ZXH, KH, XYL and BZ participated in data analysis and discussion. JZL and YHZ interpreted data and wrote the manuscript. QHC organized and supervised the project. All authors read and approved the final manuscript.

Table 4 The GO and KEGG pathway enrichment analysis of the common DEGs

The binding sites of key genes to miRNAs. Notes: Each gene sequence has multiple miRNA binding sites.