Background: In 2020, there were 313959 newly diagnosed cases of ovarian cancer (OC) and 207252 deaths from OC, and effective treatment options are still lacking. Identifying novel therapeutic targets is therefore urgent. Here, we used an integrated bioinformatics analysis to identify key genes involved in OC and to reveal potential therapeutic targets.
Methods: GSE105437, GSE14407 and GSE18520 were downloaded from the Gene Expression Omnibus (GEO) and used to screen differentially expressed genes (DEGs). Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were performed to predict the potential functions of the DEGs. A protein-protein interaction (PPI) network was constructed with the STRING database, and CDC20, one of the genes with the highest degree of connectivity, was selected as the potential therapeutic target. The Oncomine database and quantitative real-time RT-PCR (RT-qPCR) of ovarian tissues were used to validate the mRNA expression of CDC20. Gene Set Enrichment Analysis (GSEA) software was used to explore the potential biological function of CDC20 in OC.
Results: A total of 821 DEGs were obtained, including 497 upregulated and 324 downregulated genes. Functional and pathway enrichment analyses indicated that the DEGs were mainly involved in DNA-binding transcription activator activity, tubulin binding, microtubule binding, the cell cycle, the Wnt signaling pathway, the p53 signaling pathway, and metabolic changes. Oncomine database analysis and RT-qPCR showed that CDC20 was significantly upregulated in OC tissues. GSEA showed that CDC20 may regulate OC via the cell cycle, citrate cycle (TCA cycle), oxidative phosphorylation and ubiquitin-mediated proteolysis pathways.
Conclusion: The results of the present study suggest that CDC20 is overexpressed in OC and may be a promising therapeutic target for the treatment of OC.
According to recent cancer statistics, there were 313959 newly diagnosed cases of OC and 207252 deaths from OC in 2020. Approximately 22530 new cases are expected to be diagnosed and 13980 patients are expected to die from OC, owing to a lack of early diagnostic markers, diagnosis at an advanced stage with or without distant metastases, drug resistance and a tendency to relapse [2-6]. Therefore, identifying novel therapeutic targets to improve the prognosis of OC patients is urgent.
We used bioinformatics analyses of OC data from the GEO database to explore the molecular mechanisms of OC progression and therapy and to find reliable new biomarkers and specific targets for OC. In this study, GSE105437, GSE14407 and GSE18520 were chosen from the GEO database. First, R was used to merge the expression data of the three datasets and normalize them with batch correction for the subsequent DEG screening. Subsequently, GO was used for functional annotation and KEGG for pathway enrichment. Then, a PPI network built from the STRING database was used to explore the relationships among the DEGs and the molecular interactions between the DEGs and tumorigenesis. From the PPI network, CDC20 was chosen as one of the key genes. CDC20, an important factor in the cell cycle, controls chromosome segregation and mitotic progression through interactions between the spindle assembly checkpoint (SAC) and the anaphase-promoting complex or cyclosome (APC/C). Knockdown of CDC20 leads to different outcomes, including mitotic arrest followed by cell death, or mitotic slippage [8, 9]. The CDC20 expression level was also examined in the Oncomine database and verified in patients' OC tissues compared with normal ovarian tissues. In conclusion, CDC20 might be a novel anti-cancer therapeutic target in OC.
2.1 Data sources
NCBI-GEO (https://www.ncbi.nlm.nih.gov/geo/) is a free public database of microarray and gene expression profiles. From it, we obtained the gene expression profiles GSE105437, GSE14407 and GSE18520, which cover ovarian cancer and normal ovarian tissues. The details of each microarray study are provided in Table 1.
2.2 Integration of microarray data and differential expression analysis
Heterogeneity and latent variables are commonly recognized as major sources of bias and variability. The samples in the datasets selected for our multi-dataset analysis were processed on different days, in different groups or by different people. Therefore, we first integrated all samples from the three datasets to increase the sample size, and applied batch normalization in the R computing environment using the sva and limma packages to avoid generating unreliable results [10, 11]. Next, we performed differential expression analysis (|logFC| > 1, adjusted P value (FDR) < 0.05) by comparing tumor tissues with normal tissues using the limma R package. The integrated list of dysregulated genes was saved for subsequent analysis. The expression of these DEGs is shown in a heatmap and a volcano plot. Heatmap analysis of the resulting data matrix was performed with R (version 4.0.2) and the pheatmap package (version 1.0.12), which is available from https://cran.r-project.org/web/packages/pheatmap/.
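As an illustration only (the analysis in this study was run with limma in R, not with the code below), the screening thresholds can be sketched in Python. The sketch assumes that the FDR is a Benjamini-Hochberg adjusted P value, which is an assumption rather than something stated above; the gene names and statistics are hypothetical toy values.

```python
from typing import Dict, List, Tuple


def benjamini_hochberg(p_values: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, enforcing monotonicity.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def screen_degs(
    stats: Dict[str, Tuple[float, float]],
    lfc_cutoff: float = 1.0,
    fdr_cutoff: float = 0.05,
) -> Tuple[List[str], List[str]]:
    """Split genes into up- and downregulated DEGs.

    `stats` maps a gene to (logFC, raw P value); both cutoffs mirror the
    thresholds quoted in the text (|logFC| > 1, FDR < 0.05).
    """
    genes = list(stats)
    fdr = benjamini_hochberg([stats[g][1] for g in genes])
    up, down = [], []
    for gene, q in zip(genes, fdr):
        log_fc = stats[gene][0]
        if q < fdr_cutoff and abs(log_fc) > lfc_cutoff:
            (up if log_fc > 0 else down).append(gene)
    return up, down


# Hypothetical toy input: gene -> (logFC, raw P value).
toy_stats = {
    "GENE_A": (2.3, 0.0001),
    "GENE_B": (-1.8, 0.0004),
    "GENE_C": (0.4, 0.0002),
    "GENE_D": (1.5, 0.6),
}
up_genes, down_genes = screen_degs(toy_stats)
print(up_genes, down_genes)
```

In this toy input, GENE_C is excluded by the fold-change cutoff and GENE_D by the FDR cutoff, so both filters are needed to define a DEG.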
2.3 Functional enrichment analysis of DEGs
GO and KEGG pathway enrichment analyses were performed in R with the clusterProfiler package (version 3.16.1) to predict the potential functions of the DEGs. The top 10 GO terms and KEGG pathways were then presented in bubble plots, drawn according to P value with the ggplot2 package in R (version 4.0.2). P < 0.05 was considered statistically significant.
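Over-representation analyses of this kind are commonly based on a hypergeometric test. The following Python sketch shows that calculation and the selection of the top terms by P value. It is not the clusterProfiler implementation, and the term names, sizes and overlaps are hypothetical.

```python
from math import comb
from typing import Dict, List, Tuple


def enrichment_p_value(universe_size: int, term_size: int,
                       deg_count: int, overlap: int) -> float:
    """Hypergeometric upper tail: P(X >= overlap).

    universe_size: number of background genes
    term_size:     background genes annotated to the term
    deg_count:     number of DEGs tested
    overlap:       DEGs annotated to the term
    """
    max_overlap = min(term_size, deg_count)
    total = comb(universe_size, deg_count)
    tail = sum(
        comb(term_size, k) * comb(universe_size - term_size, deg_count - k)
        for k in range(overlap, max_overlap + 1)
    )
    return tail / total


def top_terms(term_stats: Dict[str, Tuple[int, int]], universe_size: int,
              deg_count: int, top_n: int = 10,
              alpha: float = 0.05) -> List[Tuple[str, float]]:
    """Keep terms with P < alpha and return the top_n by ascending P value."""
    scored = [
        (term, enrichment_p_value(universe_size, size, deg_count, hits))
        for term, (size, hits) in term_stats.items()
    ]
    significant = [item for item in scored if item[1] < alpha]
    return sorted(significant, key=lambda item: item[1])[:top_n]


# Hypothetical toy input: term -> (term size, DEGs annotated to the term).
toy_terms = {"TERM_A": (5, 4), "TERM_B": (5, 2), "TERM_C": (10, 1)}
ranked = top_terms(toy_terms, universe_size=20, deg_count=4)
print(ranked)
```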
2.4 Protein-protein Interaction
The online database STRING (v11.0, http://www.string-db.org/) was used to visualize the protein-protein interactions (PPIs) between the proteins encoded by the statistically significant DEGs. To avoid an inaccurate PPI network, we required an interaction confidence score ≥ 0.9 to obtain significant PPIs. We downloaded the high-resolution network from the STRING database. The 30 hub genes with the most interactions in the network were listed and shown in a bar plot drawn in R (version 4.0.2).
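The hub-gene ranking amounts to counting, for each protein, how many interactions pass the confidence threshold. A minimal Python sketch of that counting step follows (the study used R). It assumes each undirected edge appears once and that scores are on a 0 to 1 scale; the protein names and scores are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Edge = Tuple[str, str, float]  # (protein A, protein B, confidence score)


def hub_genes(edges: Iterable[Edge], min_score: float = 0.9,
              top_n: int = 30) -> List[Tuple[str, int]]:
    """Rank proteins by degree, using only edges with score >= min_score."""
    degree: Counter = Counter()
    for a, b, score in edges:
        if score >= min_score and a != b:
            degree[a] += 1
            degree[b] += 1
    return degree.most_common(top_n)


# Hypothetical toy network; the last edge falls below the threshold.
toy_edges = [
    ("PROT_A", "PROT_B", 0.99),
    ("PROT_A", "PROT_C", 0.97),
    ("PROT_B", "PROT_C", 0.95),
    ("PROT_A", "PROT_D", 0.95),
    ("PROT_B", "PROT_E", 0.50),
]
ranking = hub_genes(toy_edges)
print(ranking)
```

Low-confidence edges are dropped before counting, so a protein with many weak interactions does not appear as a hub.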
2.5 Gene Set Enrichment Analysis
Gene Set Enrichment Analysis (GSEA) software (version 4.0.3) (http://www.broad.mit.edu/gsea) was used to explore the potential biological function of CDC20 in OC. GSEA (version 3.0) was run for the KEGG gene sets (c2.cp.kegg.v.7.2.symbols.gmt). The number of permutations was set to 1,000, and the phenotype labels were CDC20-high and CDC20-low. FDR < 0.25 and NOM P < 0.05 were considered statistically significant.
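The text does not state how samples were assigned to the CDC20-high and CDC20-low phenotypes. One common convention, assumed here purely for illustration, is to split the samples at the median CDC20 expression. The sample identifiers and expression values in this Python sketch are hypothetical.

```python
from statistics import median
from typing import Dict


def label_by_expression(expression: Dict[str, Dict[str, float]],
                        gene: str = "CDC20") -> Dict[str, str]:
    """Label each sample '<gene>-high' or '<gene>-low' by a median split.

    `expression` maps sample id -> {gene symbol: expression value}.
    Samples strictly above the median are labelled high (an assumption).
    """
    values = {sample: profile[gene] for sample, profile in expression.items()}
    cutoff = median(values.values())
    return {
        sample: f"{gene}-high" if value > cutoff else f"{gene}-low"
        for sample, value in values.items()
    }


# Hypothetical toy expression matrix.
toy_expression = {
    "S1": {"CDC20": 5.0},
    "S2": {"CDC20": 7.0},
    "S3": {"CDC20": 9.0},
    "S4": {"CDC20": 11.0},
}
labels = label_by_expression(toy_expression)
print(labels)
```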
2.6 Oncomine database extraction
The Oncomine database (http://www.oncomine.org) is currently the world's largest oncogene-chip database and integrated data-mining platform for mining cancer gene information. To date, the database has collected 715 gene expression datasets and 86,733 cancer and normal tissue samples. The Oncomine database was used to compare differential expression between common cancer types and their respective normal tissues [17-19].
2.7 Ethical statement
This study was carried out in accordance with the standards of the Helsinki Declaration of the World Medical Association and was approved by the Ethics Committee of China Medical University. All clinical samples were collected from the Shengjing Hospital of China Medical University with informed consent from all patients.
2.8 Tissue collection
Thirty primary OC tissue samples and 30 normal ovarian tissue samples were used in this study. All samples were collected from patients undergoing surgical excision at the Department of Obstetrics and Gynecology, Shengjing Hospital of China Medical University. No patient received radiotherapy, chemotherapy, or hormone therapy before surgery. The histopathological diagnosis was obtained from the Pathology Department according to the criteria of the World Health Organization.
2.9 Total RNA extraction and real-time reverse transcription PCR
Total RNA was isolated from ovarian tissue samples with TRIzol Reagent (Invitrogen, USA). Complementary DNA (cDNA) was synthesized by reverse transcription using the PrimeScript™ RT reagent Kit with gDNA Eraser (Takara). Quantitative polymerase chain reaction for CDC20 and GAPDH was conducted in a volume of 20 μL using SYBR Premix Ex Taq (Takara) on the ABI 7500 Fast system (Life Technologies, Carlsbad, CA, USA). GAPDH was selected as the internal reference gene. The primer sequences were as follows: CDC20 forward 5'-AGCAGCAGATGAGACCCTGAGG-3'; CDC20 reverse 5'-CAGCGGATGCCTTGGTGGATG-3'; GAPDH forward 5'-CAGGAGGCATTGCTGATGAT-3'; GAPDH reverse 5'-GAAGGCTGGGGCTCATTT-3'. The relative levels of CDC20 expression were evaluated by the 2^-ΔΔCT method using GAPDH as the control.
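For readers unfamiliar with the 2^-ΔΔCT calculation, it can be sketched in Python as follows. The sketch assumes that the calibrator is the mean ΔCT of the normal tissues, which is a common choice but is not stated above. All CT values are hypothetical.

```python
from statistics import mean
from typing import List, Tuple


def delta_ct(target_ct: float, reference_ct: float) -> float:
    """ΔCT: target gene CT (CDC20) minus reference gene CT (GAPDH)."""
    return target_ct - reference_ct


def relative_expression(sample: Tuple[float, float],
                        controls: List[Tuple[float, float]]) -> float:
    """Fold change by the 2^-ΔΔCT method.

    `sample` and each entry of `controls` are (target CT, reference CT).
    The calibrator is the mean ΔCT of the controls (an assumption).
    """
    calibrator = mean(delta_ct(t, r) for t, r in controls)
    delta_delta_ct = delta_ct(*sample) - calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical CT values: (CDC20 CT, GAPDH CT).
normal_tissues = [(28.0, 18.0), (29.0, 19.0)]
tumor_tissue = (25.0, 18.0)
fold_change = relative_expression(tumor_tissue, normal_tissues)
print(fold_change)
```

In this toy example the tumor ΔCT is 7 and the calibrator ΔCT is 10, so ΔΔCT is -3 and the relative expression is 2^3 = 8.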
2.10 Statistical analysis
Statistical analysis was conducted using GraphPad Prism 8. The unpaired t test was used to compare continuous variables between two groups. A P value less than 0.05 was considered statistically significant.
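The test statistic behind an unpaired t test can be sketched in Python as below. The sketch assumes equal variances in the two groups (Student's pooled form), which may or may not match the option chosen in GraphPad Prism. It returns only the t statistic and degrees of freedom; the P value would be read from the t distribution. The input values are hypothetical.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def unpaired_t(group_a: Sequence[float],
               group_b: Sequence[float]) -> Tuple[float, int]:
    """Student's unpaired t statistic with pooled variance.

    Returns (t statistic, degrees of freedom). Equal variances in the
    two groups are assumed.
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_var = (
        (n_a - 1) * variance(group_a) + (n_b - 1) * variance(group_b)
    ) / df
    standard_error = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(group_a) - mean(group_b)) / standard_error, df


# Hypothetical relative expression values for two groups.
t_stat, dof = unpaired_t([4.0, 5.0, 6.0], [1.0, 2.0, 3.0])
print(round(t_stat, 4), dof)
```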
3.1 Integration of microarray data and identification of DEGs in ovarian cancer.
Three expression profiles (GSE105437, GSE14407 and GSE18520) were obtained from the GEO database. We selected the expression data of normal ovarian epithelial cells and OC epithelial cells. To increase the signal and reduce the false positive rate, the datasets were normalized and merged with batch correction to reduce variability.
To identify the DEGs across the three datasets, the limma package was used with thresholds of |logFC| > 1 and adjusted P value (FDR) < 0.05. In total, we obtained 821 DEGs, including 497 upregulated and 324 downregulated genes. Volcano (Figure 1A) and heatmap (Figure 1B) plots were drawn to show the expression levels of these DEGs.
3.2 GO function and KEGG pathway enrichment for the DEGs.
To further explore the function of the DEGs, GO function and KEGG pathway enrichment analyses were performed in R with the clusterProfiler package (version 3.16.1). The top 10 GO terms (Figure 2A) and KEGG pathways (Figure 2B) are presented in bubble plots. The detailed results are presented in Table 2 and Table 3.
3.3 PPI network and the selection of CDC20.
Based on the information in the STRING database, a PPI network of the proteins encoded by the DEGs in OC was constructed. The confidence score threshold was set to 0.9, and the high-resolution network image (Figure 3A) and the details of the network were exported. R was then used to count the number of interactions of each protein in the network, and a bar plot was drawn to show the 30 genes with the most interactions (Figure 3B). Of the top two proteins, CDK1 had already been well studied in OC according to literature mining. Therefore, we chose CDC20 as the focus of the following analyses.
3.4 Analysis of CDC20 gene in Oncomine database and ovarian cancer tissue.
The mRNA expression of CDC20 in OC tissue was evaluated with the Oncomine database (Table 4). The results indicated that the CDC20 expression level was significantly lower in normal ovarian tissues than in OC tissues (P<0.01; Figure 4A-I). To further verify these results, 30 OC tissues and 30 normal ovarian tissues were evaluated by RT-qPCR. In line with the above results, the expression of CDC20 mRNA was markedly higher in OC tissues than in normal ovarian tissues (P<0.001; Figure 4J).
3.5 The mechanisms of CDC20 expression in ovarian cancer.
Analysis of single-gene differential expression is of limited value in studying biological processes. The expression data of the three datasets were therefore used for GSEA to predict the gene sets and signaling pathways related to CDC20 and thereby reveal its biological function. CDC20 may function in the cell cycle, citrate cycle (TCA cycle), oxidative phosphorylation and ubiquitin-mediated proteolysis (Figure 5A-D).
Although the treatment of ovarian cancer has progressed greatly, it remains the seventh leading cause of death among gynecological malignant tumors [1, 4]. The main causes of OC death are the lack of early detection methods, a high metastatic tendency and chemotherapy resistance. Therefore, novel specific therapeutic targets should be identified to improve the treatment of OC.
In recent years, bioinformatics has developed rapidly, and microarray and sequencing data have become increasingly available, providing a platform for exploring the general genetic changes of tumors, identifying DEGs, and clarifying the molecular mechanisms of tumor diagnosis, treatment and prognosis. We therefore downloaded GSE105437, GSE14407 and GSE18520 from the GEO database and identified 821 DEGs. GO function and KEGG pathway enrichment analyses showed that the DEGs were mainly enriched in several vital pathways and functions whose abnormalities may cause tumor progression and drug resistance. Activation of gene transcription by factors such as HIF-1, STAT-3, PAX3, c-MYB and TGF-β can promote cancer progression through immune responses, hematopoiesis, neurogenesis, angiogenesis, cell survival, glucose metabolism and invasion, and these factors can be used as therapeutic targets in OC [24-27]. Tubulin is the main component of microtubules, which are an important therapeutic target in tumor cells. Paclitaxel, a tubulin inhibitor, is the first-line drug for treating OC, but as many as 80% of patients eventually relapse and become paclitaxel resistant, which may cause treatment failure and poor prognosis [29-32]. Finding new treatment targets is therefore urgent. The cell cycle is the basic process of cell division and is closely related to the orderly expression of related genes, including cyclin-dependent kinases (CDKs), cell division cycle proteins (CDCs) and the tumor suppressor p53. Abnormalities in these genes contribute to carcinogenesis and tumor progression, and drugs targeting them can be used to treat various cancers [33, 34]. The Wnt signaling pathway is implicated in OC stemness and in the carcinogenesis of many OC subtypes through the regulation of cell growth and apoptosis. It also plays an important role in chemoresistance, so it can be targeted for chemosensitization in OC [35, 36]. Chen, M.W., et al. reported that the Wnt signaling pathway regulates the growth, progression and migration of ovarian cancer cells through interaction with STAT3 and miRNA-92. P450 enzymes are a family of metabolic enzymes that regulate many processes, such as the pharmacokinetics of anticancer drugs. Recent studies have shown that individual forms of P450 play a role in resistance to anticancer drugs [38, 39]. Downie, D., et al. reported that P450 enzymes are overexpressed in OC and can serve as prognostic markers. p53 is a well-known tumor suppressor gene that regulates the cell cycle and prevents the occurrence of cancer; it is often called "the guardian of the genome". p53 gene mutations occur in more than 50% of cancer patients. Chen, Y.N., et al. reported that the microRNA let-7d-5p-HMGA1-p53 signaling pathway rescues OC cell apoptosis and restores chemosensitivity in ovarian cancer. Metabolic reprogramming is a hallmark of malignancy. There is substantial metabolic heterogeneity between cancer cells and normal tissues, but almost no metabolic activity is exclusive to tumors. The metabolic phenotype and metabolic dependencies keep changing as cancer develops from precancerous tissue to local invasion and metastasis [43, 44]. These findings help us better understand the possible mechanisms of OC development, progression, and therapy.
Furthermore, a PPI network was constructed from the DEGs, and the 30 hub genes with the most interactions in the network were identified. Literature mining showed that CDK1 had already been well studied in OC. Therefore, CDC20 was selected for further research as a key gene in OC. Previous studies have suggested that CDC20 can function as an oncogene [45-51]. However, little work has been done to explore the CDC20 mRNA expression level, protein level and related molecular mechanisms in OC. First, we used the Oncomine database to compare CDC20 mRNA expression between OC and normal ovarian tissues. In addition, we examined the levels of CDC20 mRNA in OC tissues and normal ovarian tissues from patients in our hospital. All results indicated that CDC20 expression was lower in normal ovarian tissues than in OC tissues. Next, we predicted the gene sets and signaling pathways associated with CDC20 using the expression data of the three datasets in GSEA software. CDC20 may function in the cell cycle, citrate cycle (TCA cycle), oxidative phosphorylation and ubiquitin-mediated proteolysis. As a target of the SAC and a co-activator of the APC/C E3 ubiquitin ligase involved in the metaphase-to-anaphase transition during mitosis, CDC20 has an obvious role in cell cycle regulation. The tricarboxylic acid (TCA) cycle is the final common metabolic pathway and the hub of carbohydrate, lipid and amino acid metabolism. Oxidative phosphorylation is the production of ATP coupled to biological oxidation, and includes two types of phosphorylation: metabolite-linked phosphorylation and respiratory chain-linked phosphorylation. The TCA cycle and oxidative phosphorylation are the two main energy-producing processes in cells, and metabolic reprogramming contributes to tumorigenesis. From this perspective, controlling cancer energy metabolism is a potential treatment strategy. In addition, APC/C-CDC20, as an E3 ubiquitin ligase, promotes the ubiquitination of substrates and their subsequent degradation by the proteasome, which can regulate metabolic signaling pathways, transcription factors and metabolic enzymes. These findings suggest that CDC20 plays an important role in OC and may be a specific target for treating OC.
In conclusion, we suggest that CDC20 may be a promising therapeutic target for the treatment of OC. However, further work will be necessary to determine whether CDC20 has a marked effect in OC and to define its exact role and mechanism. Despite the uncertainty about these mechanisms, we believe that CDC20 is an appealing potential therapeutic target for OC patients.
Ethics approval and consent to participate
This study was carried out in accordance with the standards of the Helsinki Declaration of the World Medical Association, and approved by the Ethics Committee of China Medical University. All clinical samples were collected from the Shengjing Hospital of China Medical University with informed consent from all patients.
Consent for publication
Availability of data and materials
NCBI-GEO: https://www.ncbi.nlm.nih.gov/geo/; accession number: GSE105437, GSE14407 and GSE18520
Oncomine database: http://www.oncomine.org
STRING v11.0: http://www.string-db.org/
The authors declare that they have no competing interests.
This work was supported by the National Natural Science Foundation of China (No. 81872125) and the Outstanding Scientific Fund of Shengjing Hospital (No. 201704).
Xiaocui Zhang downloaded the dataset, analyzed the data, performed the RT-qPCR, and was a major contributor in writing the manuscript. Fangfang Bi collected the ovarian tissues and reviewed the manuscript together with the corresponding author Qing Yang. All authors read and approved the final manuscript.
We thank the Gene Expression Omnibus and the Oncomine database for the open data, and the developers of the R language and the Gene Set Enrichment Analysis software.
Mentioned figures and table are not included with this version of the Manuscript.