Differences in Gene Expression Between High- and Low-Grade Serous Ovarian Cancers: Implications for Diagnosis and Prognosis

Jianhao Xu, Affiliated Kunshan Hospital of Jiangsu University
Qian Wang, Medical College of Soochow University
Fang Cao, Affiliated Kunshan Hospital of Jiangsu University
Zhiyong Deng, Affiliated Kunshan Hospital of Jiangsu University
Xiaojiao Gao, Affiliated Kunshan Hospital of Jiangsu University
Tingting Gu, Affiliated Kunshan Hospital of Jiangsu University
Tingting Liu, Affiliated Kunshan Hospital of Jiangsu University
Song Xu (xs19780116@163.com), Affiliated Kunshan Hospital of Jiangsu University
Wenjuan Gan, Dushu Lake Hospital Affiliated to Soochow University


Background
Serous carcinoma, mucinous carcinoma, endometrioid carcinoma, and clear cell carcinoma are the histological subtypes of epithelial ovarian cancer (OC) (1). The most frequent subtype, serous OC, is further subdivided into high-grade and low-grade serous OC (HGSOC and LGSOC, respectively) (2). The clinical characteristics of the two serous OCs differ (2). For example, the data set GSE151335 was used to verify the differences in progression-free survival (PFS) and overall survival (OS) between HGSOC and LGSOC. The platform for GSE151335 is GPL28589 [Oxford Classifier of Carcinoma of the Ovary].
HGSOC is associated with TP53 mutations, whereas LGSOC is associated with BRAF and KRAS mutations along the benign/borderline adenoma progression pathway (3,4). Because BRAF and KRAS mutations are usually detected by next-generation sequencing (5), the time and financial costs of LGSOC diagnosis are high. Immunohistochemistry (IHC) is more commonly used for differential diagnosis in real-world clinical settings. TP53 is a precise marker for the differential diagnosis of HGSOC (6), although it occasionally produces errors in clinical practice. Furthermore, the Gene Expression Profiling Interactive Analysis (GEPIA) website indicates that TP53 has little predictive value for the prognosis of OC patients (pHR = 0.93, Figure 2). Therefore, the primary clinical goal of this research is to screen for critical molecules that can support differential diagnosis and guide prognosis, based on the genes differentially expressed between HGSOC and LGSOC.

Identification of DEGs
GEO2R is an interactive online tool designed to compare two or more groups of samples in a GEO series to identify differentially expressed genes (DEGs) (7). DEGs between HGSOC and LGSOC tissues were identified using GEO2R, with cut-offs of adj.p < 0.05 and |logFC| > 1. Because these cut-offs yielded too few DEGs in GSE14001, DEGs in that dataset were identified with the threshold of adj.p < 0.05 alone to ensure sufficient DEGs for subsequent analyses. Therefore, we first detected the intersection DEGs between GSE73638 and GSE27651 and then used GSE14001 for a second confirmation. We used online Venn diagram software to detect the intersection DEGs between the datasets. Volcano plots and heat maps were drawn to visualize these DEGs using the R ggplot2 package and a heat map package.
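The selection logic described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the GEO2R implementation: the `GeneStat` record, the function names, and the toy values are hypothetical, and the sketch assumes that a gene must change in the same direction in every dataset to count as an overlapping DEG.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class GeneStat:
    """One row of a differential expression table (hypothetical layout)."""
    gene: str
    log_fc: float   # log2 fold change, HGSOC versus LGSOC
    adj_p: float    # multiple-testing adjusted p-value


def select_degs(rows, adj_p_cutoff=0.05, min_abs_log_fc=None):
    """Return (upregulated, downregulated) gene sets passing the cut-offs.

    If min_abs_log_fc is None, only the adjusted p-value is used, as
    described for the confirmation dataset.
    """
    up, down = set(), set()
    for row in rows:
        if row.adj_p >= adj_p_cutoff:
            continue
        if min_abs_log_fc is not None and abs(row.log_fc) <= min_abs_log_fc:
            continue
        if row.log_fc > 0:
            up.add(row.gene)
        elif row.log_fc < 0:
            down.add(row.gene)
    return up, down


def overlap_degs(*deg_sets):
    """Intersect (up, down) pairs, keeping direction (an assumption)."""
    up = set.intersection(*(pair[0] for pair in deg_sets))
    down = set.intersection(*(pair[1] for pair in deg_sets))
    return up, down


# Toy values for illustration only; they are not taken from any GEO series.
dataset_a = [GeneStat("G1", 2.1, 0.001), GeneStat("G2", -1.8, 0.01),
             GeneStat("G3", 0.4, 0.001), GeneStat("G4", 1.5, 0.20)]
dataset_b = [GeneStat("G1", 1.6, 0.02), GeneStat("G2", -2.2, 0.003),
             GeneStat("G3", 1.9, 0.01)]
confirmation = [GeneStat("G1", 0.6, 0.04), GeneStat("G2", -0.3, 0.03),
                GeneStat("G3", 0.9, 0.5)]

degs_a = select_degs(dataset_a, 0.05, 1.0)
degs_b = select_degs(dataset_b, 0.05, 1.0)
degs_c = select_degs(confirmation, 0.05, None)  # adj.p cut-off only
shared_up, shared_down = overlap_degs(degs_a, degs_b, degs_c)
```

Keeping the upregulated and downregulated sets separate avoids counting a gene that is significant in every dataset but changes in opposite directions.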

Functional enrichment analysis
Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analyses were used to analyze DEGs at the functional level with the WEB-based Gene Set Analysis Toolkit (WebGestalt, http://www.webgestalt.org/option.php, version 2019). WebGestalt is a functional enrichment analysis web tool, which has on average 26,000 unique users from 144 countries and territories per year, according to Google Analytics (8). The results were visualized with the R ggplot2 package. P < 0.05 was considered statistically significant.
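For readers unfamiliar with enrichment p-values, the following sketch shows the hypergeometric test that commonly underlies over-representation analysis. Whether this exact test was applied here is an assumption, since the text does not name the WebGestalt method used. The function name and the small numbers in the check are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(n_background, n_in_term, n_degs, n_overlap):
    """Upper-tail hypergeometric p-value for over-representation.

    n_background: genes in the reference set
    n_in_term:    reference genes annotated to the GO term or pathway
    n_degs:       DEGs submitted for analysis
    n_overlap:    DEGs annotated to the term
    Returns P(X >= n_overlap) when n_degs genes are drawn at random.
    """
    max_overlap = min(n_in_term, n_degs)
    total = comb(n_background, n_degs)
    tail = sum(
        comb(n_in_term, k) * comb(n_background - n_in_term, n_degs - k)
        for k in range(n_overlap, max_overlap + 1)
    )
    return tail / total


# Small hand-checkable case: 10 genes, 4 in the term, 3 DEGs, 2 or more hits.
# (C(4,2)*C(6,1) + C(4,3)*C(6,0)) / C(10,3) = 40 / 120
p_small = hypergeom_enrichment_p(10, 4, 3, 2)
```

In practice the resulting p-values are corrected for multiple testing across all terms examined, which this sketch does not do.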

GSEA analysis
Gene Set Enrichment Analysis (GSEA) is a computational method that determines whether an a priori defined set of genes shows statistically significant, concordant differences between two biological states (e.g. HGSOC and LGSOC) (9). The GSEA approach was applied to reconfirm the functional enrichment analysis in GSEA software version 4.1.0, which uses predefined gene sets from the Molecular Signatures Database (MSigDB v7.4).
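The idea behind the GSEA enrichment score can be illustrated with a simplified running-sum statistic. The sketch below is the unweighted variant and is not the GSEA software itself, which by default weights each hit by its ranking metric and assesses significance by permutation. The function name and toy gene labels are hypothetical.

```python
def enrichment_score(ranked_genes, gene_set):
    """Unweighted running-sum enrichment score.

    ranked_genes: genes ordered from most upregulated in one state to
                  most upregulated in the other
    gene_set:     a predefined set of genes
    Walks down the ranked list, stepping up at each gene in the set and
    down at each gene outside it, and returns the maximum deviation
    from zero (signed).
    """
    hits = [gene in gene_set for gene in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(hits) - n_hits
    if n_hits == 0 or n_miss == 0:
        raise ValueError("gene set must cover some, but not all, ranked genes")

    running = 0.0
    best = 0.0
    for is_hit in hits:
        running += 1.0 / n_hits if is_hit else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


# Toy ranking for illustration only.
ranking = ["A", "B", "C", "D", "E", "F"]
es_top = enrichment_score(ranking, {"A", "B"})     # set clustered at the top
es_bottom = enrichment_score(ranking, {"E", "F"})  # set clustered at the bottom
```

A positive score indicates that the gene set is concentrated near the top of the ranking and a negative score indicates concentration near the bottom.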

PPI network construction and module analysis
The Search Tool for the Retrieval of Interacting Genes (STRING, http://string-db.org, version 11.5) is a database of known and predicted protein-protein interactions (PPI), which was used to visualize the potential gene interaction network (10). The minimum required interaction score was set to 0.700 (high confidence). Cytoscape (https://cytoscape.org/, version 3.8.2) is an open-source software platform for visualizing complex networks and integrating these with any attribute data (11). The Cytoscape plug-in Molecular Complex Detection (MCODE, version 2.0.0) clusters a given network based on its topology to find densely connected regions, with the following criteria: degree cutoff = 2, node score cutoff = 0.2, K-Core = 2, Max. Depth = 100 (12).

Hub genes selection and analysis
Another Cytoscape plug-in, cytoHubba (version 0.1), predicts and explores important nodes and sub-networks in a given network using different topological algorithms (13). In the present study, the top nodes were ranked by the Radiality algorithm. The OS and stage analysis of the 10 hub genes in OC was conducted with the GEPIA database (http://gepia.cancer-pku.cn/index.html) (14). The protein level was verified using OC pathological tissue slices in the Human Protein Atlas database (HPA, https://www.proteinatlas.org/, version 20.1) (15).

Samples and histologic assessment
From January 2014 to September 2021, 20 borderline serous tumor (BST) cases, 20 LGSOC cases, and 38 HGSOC cases with complete clinicopathological data were obtained from the Department of Pathology, Kunshan Hospital, Jiangsu University. None of the patients had received previous adjuvant systemic therapy. All tumors archived in the pathology department were pathologically diagnosed according to the World Health Organization standards (2014) and were staged according to the International Federation of Gynecology and Obstetrics (FIGO) system. Two attending pathologists independently reviewed the tissue sections, and any disagreement was reviewed by a third pathologist at the deputy chief physician or chief physician level.

Immunohistochemistry
All specimens were fixed with paraformaldehyde, embedded in paraffin, and prepared as 4 µm thick serial sections. Paraffin sections were processed following the GTVision™ Detection System (Gene Tech company, Item No.: GK5007). The primary antibody was the BIRC5 antibody (Gene Tech company, Clone No.: EP2880Y). The results were analyzed in a double-blinded manner. According to the HPA database, BIRC5 immunoreactivity was localized in the nucleus/cytoplasm. Positive staining of BIRC5 was scored using the Allred score system (16). The Allred score was calculated by adding the proportion score for positive tumor cells (0, none; 1, <1%; 2, 1-10%; 3, 11-33%; 4, 34-66%; and 5, 67-100%) to the score for the average intensity of immunoreactivity (0, no staining; 1, weak; 2, moderate; and 3, strong), giving a range from 0 to 8.
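The scoring rule above can be written as a short function. This is a minimal sketch: the function names are hypothetical, the handling of non-integer percentages at category boundaries (for example 10.5%) is an assumption, and the check that proportion and intensity are either both zero or both non-zero is an added consistency assumption rather than part of the published scheme.

```python
def proportion_score(percent_positive):
    """Map the percentage of positive tumor cells to a 0-5 score."""
    if not 0 <= percent_positive <= 100:
        raise ValueError("percentage must be between 0 and 100")
    if percent_positive == 0:
        return 0
    if percent_positive < 1:
        return 1
    if percent_positive <= 10:
        return 2
    if percent_positive <= 33:   # assumption: values in (10, 11) fall here
        return 3
    if percent_positive <= 66:   # assumption: values in (33, 34) fall here
        return 4
    return 5


def allred_score(percent_positive, intensity):
    """Sum of the proportion score (0-5) and the intensity score (0-3)."""
    if intensity not in (0, 1, 2, 3):
        raise ValueError("intensity must be 0, 1, 2, or 3")
    proportion = proportion_score(percent_positive)
    # Assumption: no positive cells implies no staining, and vice versa.
    if (proportion == 0) != (intensity == 0):
        raise ValueError("proportion and intensity are inconsistent")
    return proportion + intensity


example_total = allred_score(50, 2)  # 34-66% positive cells, moderate staining
```

Under this consistency assumption a total of 1 cannot occur, so the attainable values are 0 and 2 through 8.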

Identification of DEGs between HGSOC and LGSOC
Among the 136 GSEs screened, we evaluated a total of four series that comprised HGSOC and LGSOC broad gene expression microarray data, namely GSE73638, GSE73551, GSE27651, and GSE14001. Because the series GSE73638 and GSE73551 belong to the same study, the current study included only GSE73638, which had the larger sample size. Table 1 summarizes the essential information for the three GSEs. We used the GEO2R online program to identify 1465, 9914, and 230 DEGs from GSE73638, GSE27651, and GSE14001, respectively (Figure 3A, adj.P < 0.01, |logFC| > 1). Because this threshold yielded too few DEGs in GSE14001, 9500 DEGs in GSE14001 were identified with a threshold of adj.P < 0.05 and used to confirm the overlapping DEGs between GSE73638 and GSE27651. According to the Venn diagram, the overlapping DEGs between GSE73638 and GSE27651 comprised 157 upregulated and 204 downregulated genes (Figure 3B). Following validation by GSE14001, 79 upregulated and 85 downregulated genes were chosen for the current study (Figure 3C). We used heat maps to depict the distribution of screened gene expression in each GSE between HGSOC and LGSOC (Figure 3D).

Enrichment analysis for DEGs
We used the WebGestalt web tool to perform GO and KEGG enrichment analyses to identify the most important biological processes (BPs) and pathways. In total, 164 DEGs were primarily enriched in the BPs associated with mitotic cell cycle, organelle fission, and nuclear division (Figure 4A) and pathways such as hepatitis C, microRNAs in cancer, and chronic myeloid leukemia (Figure 4B). In addition, we used GSEA to validate the GO biological processes (GOBPs), treating those with a normalized p-value of <0.05 as substantially represented. In the HGSOC group, 363, 8, and 31 BPs were enriched in GSE73638, GSE27651, and GSE14001, respectively, and three BPs were enriched in all three GSEs (Figure 4C). Meiotic cell cycle process, homologous chromosome segregation, and meiosis I cell cycle process were the BPs confirmed by the three GSEs (Figure 4D).

PPI network construction and significant module identification
We used the STRING database to estimate the protein-level connections of the overlapping DEGs (Figure 5A). We improved the visualization with Cytoscape software and constructed a PPI network with 115 nodes and 894 edges (Figure 5B). MCODE was used to divide the PPI network into four modules (Figure 5C); the first module had 38 nodes and 661 edges (MCODE score 35.730), the second module had 7 nodes and 21 edges (MCODE score 7.000), the third module had 5 nodes and 10 edges (MCODE score 5.000), and the fourth module had 3 nodes and 3 edges (MCODE score 3.000). We used cytoHubba with the Radiality topological algorithm to select the top 10 hub nodes, namely BIRC5, CDC20, CDK1, CDKN3, MKI67, NUSAP1, RRM2, TOP2A, TPX2, and UBE2C, for further investigation (Figure 5D).
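The reported module scores are consistent with the usual description of an MCODE cluster score as the cluster's edge density multiplied by its number of nodes. The sketch below checks this against the node and edge counts given above. It assumes density is the number of edges divided by the number of possible edges without self-loops, and the function name is hypothetical.

```python
def mcode_score(n_nodes, n_edges):
    """Cluster score as edge density times node count (no self-loops)."""
    if n_nodes < 2:
        raise ValueError("a cluster needs at least two nodes")
    max_edges = n_nodes * (n_nodes - 1) / 2
    density = n_edges / max_edges
    return density * n_nodes


# (nodes, edges, reported score) for the four modules described in the text.
modules = [(38, 661, 35.730), (7, 21, 7.000), (5, 10, 5.000), (3, 3, 3.000)]
recomputed = [round(mcode_score(n, e), 3) for n, e, _ in modules]
```

Modules two to four are complete graphs (density 1), so their scores equal their node counts, whereas the first module has 661 of 703 possible edges.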

Analysis of the hub genes
We used the GEPIA database to investigate the connection between the 10 hub genes and OS and staging in OC. Among the 10 hub genes, only BIRC5 was found to be favorably linked with OS in OC (pHR = 0.014, Figure 6A), and only RRM2 was found to be negatively correlated with OC staging (p = 0.0251, Figure 6B).
Based on the results above, BIRC5 and RRM2, which have potential therapeutic utility, were chosen from the list of 10 hub genes. We then used the HPA database to assess the ability of BIRC5 and RRM2 to distinguish between normal ovarian tissue and OC. Using antibodies HPA002830 and CAB004270, we found that the expression of BIRC5 in OC tissues was greater than that in normal ovarian tissues (Figure 7A-C). RRM2 expression could not be detected with HPA056994 in either OC or normal ovarian tissues (Figure 7D).

Verification of genes with potential clinical value using our cases
We analyzed the correlation between the BIRC5 IHC staining score and clinicopathological parameters (Table 2). We included three pathological types of ovarian tumors: BST, LGSOC, and HGSOC. Using ANOVA, we found that, in addition to pathological type (p <0.0001), age (p <0.0001), preoperative CA125 level (p =0.0175), FIGO stage (p =0.0079), TP53 (p <0.0001), and Ki67 expression (p <0.0001) were also related to BIRC5 expression. In Table 2, LGSOC denotes low-grade serous ovarian cancer and HGSOC denotes high-grade serous ovarian cancer; positive staining of BIRC5 was scored using the Allred score system.
We further analyzed the correlation between BIRC5 expression and various clinicopathological parameters. Figure 8A shows the proportion of high BIRC5 expression in the different clinicopathological parameter groups, and Figure 8B shows the statistical correlation between BIRC5 expression and the different clinicopathological parameters.

Discussion
HGSOC and LGSOC have diverse clinical presentations, and the prognosis differs even within the same group. However, our literature survey indicated that only a few studies had compared HGSOC with LGSOC. Thus, the goal of the present study was to identify and validate the essential molecules that can help distinguish diagnoses and guide prognoses of the two OC subtypes.
We identified 164 robust DEGs by using the three GSEs. Based on WebGestalt and GSEA analysis, we found that the GOBPs differing between HGSOC and LGSOC are mainly the cell cycle process and chromosomal segregation, which is compatible with HGSOC's proliferative activity (17,18). We then selected 10 hub genes out of the 164 DEGs. We found that only BIRC5 is favorably linked with OS, and only RRM2 is inversely correlated with staging, indicating the therapeutic significance of these two genes. Furthermore, because the chosen molecules should be usable in clinical IHC, as stated in the introduction, we used the HPA website to assess the feasibility of BIRC5 and RRM2 as IHC markers. The findings indicated that BIRC5 could differentiate between normal and OC tissues; however, RRM2 could not be detected. The core aim of this article is to explore IHC indicators for distinguishing HGSOC from LGSOC. Therefore, for BIRC5, in addition to predicting its ability to distinguish benign from malignant tissue through the HPA database, we also needed to verify its ability to distinguish HGSOC from LGSOC in our own pathological specimens. We collected 20 BST, 20 LGSOC, and 38 HGSOC specimens. The IHC experiments showed that BIRC5 can distinguish HGSOC from LGSOC and that its expression correlates positively with the patients' age, preoperative CA125 level, FIGO stage, TP53, and Ki67 expression. In conclusion, BIRC5 offers high clinical application value.
BIRC5 is a member of the inhibitor of apoptosis (IAP) gene family, which encodes negative regulatory proteins that suppress apoptotic cell death. BIRC5 expression is high during fetal development and in most cancers but low in adult tissues (19). Yin et al. used bioinformatics tools to investigate the OC OVDM1 cell line and confirmed the clinical significance of BIRC5, which is consistent with the findings of our study (20). A meta-analysis by He X et al. indicated that survivin, the protein encoded by BIRC5, is closely linked to FIGO staging and tumor grade of OC (21). It is also worth mentioning that BIRC5 has been regarded as a cancer treatment target for more than 20 years (22). For example, Ozretić et al. revealed that the Hedgehog signaling pathway is linked to OC pathogenesis and that BIRC5 might be a novel pathway target (23). Wang et al. used an orthotopic OC mouse model to confirm that miR-203 suppresses ovarian tumor metastasis by targeting BIRC5 to prevent EMT (24). In this study, we found that high BIRC5 expression is closely related to a high CA125 level, advanced FIGO stage, and the malignancy of OC. Taken together with the reports above, which indicate that BIRC5 regulates tumor cell proliferation, this suggests that BIRC5 can help in judging the prognosis of patients with ovarian cancer and may serve as a tumor treatment target.
The present study also has some limitations. The selection criterion for DEGs in GSE14001 was only adj.p < 0.05, and this dataset was used as a secondary verification of the DEGs summarized from the other two datasets.

Conclusion
BIRC5 is the crucial molecule identified in this investigation that can aid differential diagnosis, guide prognosis, and be applied in clinical settings. The importance of this study lies in providing a critical marker to guide clinical practice.

Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author on reasonable request.

Competing interests
The authors declare that they have no competing interests.