Gene Expression Profile of Active HE4 Stimulation in Epithelial Ovarian Cancer Cells: Microarray Study and Comprehensive Bioinformatics Analysis

Background. Human epididymis protein 4 (HE4) is a novel serum biomarker for the diagnosis of epithelial ovarian cancer (EOC), with high specificity and sensitivity compared with CA125. A growing body of research has addressed its roles in promoting carcinogenesis and chemoresistance in EOC in recent years; however, the underlying molecular mechanisms remain poorly understood. The aim of this study was to elucidate the molecular mechanisms of HE4 stimulation and to identify the key genes and pathways mediating carcinogenesis in EOC using microarray and bioinformatics analysis. Methods. We established a stable HE4-silenced ES-2 ovarian cancer cell line, labeled “S”, and stimulated these cells with active HE4 protein, labeled “S4”. Human whole-genome microarray analysis was used to identify differentially expressed genes (DEGs) from triplicate samples of S4 and S cells. The “clusterProfiler” package in R, DAVID, Metascape, and Gene Set Enrichment Analysis (GSEA) were used to perform gene ontology (GO) and pathway enrichment analysis, and cBioPortal was used for WFDC2 coexpression analysis. The GEO dataset GSE51088 and quantitative real-time polymerase chain reaction (qRT-PCR) were applied for validation. The protein–protein interaction (PPI) network and modular analyses were performed using Metascape and Cytoscape. Results. In total, 713 DEGs were found (164 upregulated and 549 downregulated) and further analyzed by GO, pathway enrichment and PPI analyses. The MAPK pathway accounted for a significant portion of the enriched terms. WFDC2 coexpression analysis revealed ten WFDC2 coexpressed genes (TMEM220A, SEC23A, FRMD6, PMP22, APBB2, DNAJB4, ERLIN1, ZEB1, RAB6B, and PLEKHF1) that were also markedly changed in S4 cells and validated by dataset GSE51088. Kaplan–Meier survival statistics revealed clinical significance for all 10 target genes. Finally, a PPI network was constructed, and sixteen hub genes and eight Molecular Complex Detection (MCODE) modules were identified; the seeds of the five most significant MCODEs were subjected to GO and KEGG enrichment analysis, and their clinical significance was evaluated. Conclusions. By applying microarray and bioinformatics analyses, we identified DEGs and determined a comprehensive gene network of active HE4 stimulation in EOC cells. We propose several possible mechanisms and identify candidate therapeutic and prognostic targets of HE4 in EOC.


Introduction
Human epididymis protein 4 (HE4), a member of the WFDC domain family encoded by the WFDC2 gene, features the characteristic WAP motif, which contains eight cysteines that form disulfide bonds [1]. It was initially discovered in human distal epididymal epithelial cells by Kirchhoff et al. in 1991 [2]. Under physiological conditions, HE4 is secreted into the blood, acts as a protease inhibitor and is involved in the maturation of sperm cells [2]. Several types of cancer are associated with HE4 overexpression in both serum and tissues, and HE4 is a promising biomarker for the diagnosis of ovarian cancer [3][4][5], primary fallopian tube carcinoma [6], endometrial cancer (combined with CA125) [7], lung cancer [8], breast cancer (combined with miR-127) [9], gastric cancer [1], colorectal cancer [10] and pancreatic adenocarcinoma [11]. In 2008, HE4 was cleared by the US Food and Drug Administration as a serum marker for monitoring disease recurrence or progression in patients with epithelial ovarian cancer (EOC), and it has attracted increasing attention since then. Multiple studies have found that HE4 has higher sensitivity and specificity than CA125 in the early confirmatory diagnosis of EOC and in the differentiation of pelvic masses, especially in combination with the risk of ovarian malignancy algorithm (ROMA) [12,13]. It also appears to be a good predictor of optimal tumor cytoreductive surgery [3], of residual disease after interval cytoreduction (assessed pre-operatively) [4], of resistance to adjuvant chemotherapy [14] and of the likelihood of ascites formation [15].
Nevertheless, most investigations of HE4 have focused on the clinical applications mentioned above; few studies have addressed its mechanism or function in EOC, and their results have not reached a consensus. Previous studies reported that HE4 overexpression significantly promotes tumor cell apoptosis and adhesion and inhibits cell proliferation, migration, and invasiveness [16,17], whereas other researchers reported that high expression of HE4 promotes cell migration, spreading and proliferation [18]. Furthermore, it was found in vitro that HE4 may regulate the mitogen-activated protein kinase (MAPK) and phosphoinositide 3-kinase/AKT (PI3K/AKT) signal transduction pathways to produce a tumor-suppressing effect [18,19]. More recently, studies have investigated the association between HE4 and tumorigenesis as well as chemotherapeutic resistance in EOC, but they remain inconclusive because their results are inconsistent [20][21][22][23][24].
Gene expression profiling combined with bioinformatic analysis can retrieve the large amount of biological information associated with a specific gene, providing fundamental data for investigations of molecular mechanisms and helping to identify new interaction targets. Until now, no microarray profiling study of active HE4 stimulation in epithelial ovarian cancer cells has been performed. In this work, we performed microarray analyses to comprehensively characterize the expression profile of active HE4 stimulation. In total, 713 DEGs, 10 HE4 coexpressed genes, 16 hub genes and 8 MCODEs were found and further analyzed by GO, pathway enrichment and PPI analyses. We found that the MAPK signaling pathway, TNF signaling pathway, PI3K-Akt signaling pathway, p53 signaling pathway and cell cycle may be crucial in HE4 stimulation. The verified coexpressed genes and hub genes may help to identify novel biomarkers and treatment targets that act synergistically with HE4 in ovarian cancer.

Materials And Methods
Cell culture, gene transfection and identification
ES-2 cells were purchased from the American Type Culture Collection (ATCC; Manassas, VA, USA) and maintained in RPMI-1640 medium with 10% fetal bovine serum (FBS) and 1% penicillin/streptomycin at 37 °C in a humidified atmosphere with 5% CO2. Cell culture, construction of the HE4 shRNA expression vectors, gene transfection and identification were performed as previously described [25]. Stable cell lines were established: the HE4 shRNA low-expressing cell line and its empty-plasmid transfected control were labeled "S" and "S_Mock", respectively. The untreated cells were labeled "S_Untreated". Quantitative real-time polymerase chain reaction (qRT-PCR) and Western blotting were performed as previously described [25]. Active HE4 protein (recombinant human HE4, rHE4, catalog: MBS355616, MyBioSource) was applied to stimulate S cells (serum-free medium containing 0.2 µg/ml recombinant HE4 protein for 24 h) [26], and the stimulated cells were labeled "S4".

Microarrays and bioinformatics analysis
Microarray analysis was performed using triplicate samples of S cells and S4 cells. Total RNA extraction and RNA quality control were performed and assessed as previously described [25]. The pass criteria for RNA purity and integrity were A260/A280 ≥ 1.8, A260/A230 ≥ 1 and RIN ≥ 6. gDNA contamination was evaluated by agarose gel electrophoresis. Target preparation and hybridization were performed as previously described [25]; the pass criterion for CyDye incorporation efficiency was > 10 dye molecules/1000 nt.
Purified RNA samples were hybridized to the Human Whole Genome OneArray® (Array Version: HOA6.1) with Phalanx hybridization buffer using the Phalanx Hybridization System; the hybridization process was described previously [25]. Fold-changes (FC) were calculated by Rosetta Resolver 7.2 with the error model adjusted by the Amersham Pairwise Ratio Builder for signal comparison between samples. DEGs were identified through volcano plot filtering. The thresholds for DEGs were |log2FC| ≥ 1 and p value < 0.05, or, where the log2 ratio was "NA", a difference in intensity between the two samples ≥ 1,000. Hierarchical clustering was performed using the "pheatmap" package in R, with the threshold of log2|FC| defined as ≥ 1.5. Gene Ontology (GO) analysis and pathway enrichment were performed using multiple tools, including the "clusterProfiler" package in R, DAVID (https://david.ncifcrf.gov), and Metascape [27] (https://metascape.org), using p value < 0.05 as the cut-off threshold. Gene Set Enrichment Analysis (GSEA, Version 4.0.3) was performed according to the software instructions on the complete microarray dataset to identify gene sets enriched in Group S4 versus Group S [28,29]. Because this study focused on GO biological processes and pathways, the gene sets "c5.bp.v7.1.symbols.gmt", "c2.cp.biocarta.v7.1.symbols.gmt", "c2.cp.pid.v7.1.symbols.gmt", and "c2.cp.reactome.v7.1.symbols.gmt", downloaded from the Molecular Signatures Database (MSigDB) (http://www.broadinstitute.org/gsea/), were used. Enrichment analysis was performed using 1000 phenotype permutations and the weighted scoring scheme, with the signal-to-noise metric used to rank genes; gene sets with a nominal p-value < 0.05 were considered enriched (31).
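The DEG selection rule described above can be sketched in a few lines of Python. This is only an illustration of the stated thresholds, not the Rosetta Resolver workflow used in the study; the function name, the gene names and the numeric values are hypothetical, and the sign convention (positive log2FC meaning higher in S4) is an assumption.

```python
def is_deg(log2_fc, p_value, intensity_diff=None,
           fc_cutoff=1.0, p_cutoff=0.05, intensity_cutoff=1000):
    """Apply the two DEG rules stated in the text to one gene.

    Rule 1: |log2FC| >= 1 and p value < 0.05.
    Rule 2: log2 ratio is NA (None here) and the intensity
            difference between the two samples is >= 1,000.
    """
    if log2_fc is None:
        return intensity_diff is not None and abs(intensity_diff) >= intensity_cutoff
    return abs(log2_fc) >= fc_cutoff and p_value < p_cutoff


def direction(log2_fc):
    """Assumed convention: positive log2FC means higher in S4 than in S."""
    return "up" if log2_fc > 0 else "down"


# Hypothetical rows: (gene, log2FC, p value, intensity difference)
rows = [
    ("GENE_A", 1.4, 0.003, None),
    ("GENE_B", -2.1, 0.010, None),
    ("GENE_C", 0.6, 0.001, None),    # fails the fold-change rule
    ("GENE_D", -1.2, 0.200, None),   # fails the p-value rule
    ("GENE_E", None, None, 1500),    # passes through the NA rule
]

degs = [gene for gene, fc, p, diff in rows if is_deg(fc, p, diff)]
print(degs)
```

Genes that pass only through the "NA" rule have no fold-change, so their direction would have to be taken from the raw intensities.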

HE4 (WFDC2) coexpression analysis and validation by bioinformatics
Genes coexpressed with WFDC2 were assessed using the cBioPortal database (http://www.cbioportal.org). The data were RNA-Seq V2 RSEM data from the TCGA PanCancer Atlas, which included 585 ovarian serous cystadenocarcinoma tissues. Spearman's correlation score was used to select WFDC2 coexpressed genes (a score ≥ 0.2 was considered positively correlated and ≤ −0.2 negatively correlated with WFDC2). To predict the target genes that were also changed in our microarray, we used Venny online (https://bioinfogp.cnb.csic.es/tools/venny/) to identify the genes overlapping between the DEGs and the WFDC2 coexpressed genes.
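The selection and overlap step amounts to a threshold on the correlation score followed by a set intersection. The sketch below illustrates this under stated assumptions: the gene names and correlation values are hypothetical placeholders, and the study itself used cBioPortal and Venny rather than custom code.

```python
def coexpression_class(rho, cutoff=0.2):
    """Classify a Spearman score using the +/-0.2 thresholds from the text."""
    if rho >= cutoff:
        return "positive"
    if rho <= -cutoff:
        return "negative"
    return None


# Hypothetical Spearman scores against WFDC2
spearman_scores = {"GENE_A": 0.35, "GENE_B": -0.27, "GENE_C": 0.05, "GENE_E": 0.41}

# Hypothetical DEG list from the microarray
deg_genes = {"GENE_A", "GENE_B", "GENE_C", "GENE_D"}

# Keep only genes whose score passes the threshold
coexpressed = {}
for gene, rho in spearman_scores.items():
    label = coexpression_class(rho)
    if label is not None:
        coexpressed[gene] = label

# Overlap between DEGs and coexpressed genes (what a Venn tool reports)
overlap = sorted(deg_genes & set(coexpressed))
print({gene: coexpressed[gene] for gene in overlap})
```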
For validation of the target genes, the gene expression profile GSE51088, submitted by Slamon D, et al. [30], was used. This dataset contains 152 epithelial ovarian cancer patients, 5 benign epithelial ovarian tumor patients and 15 normal healthy ovarian tissues. Based on these data, we calculated the Pearson correlation coefficient between each target gene and WFDC2, and compared the expression of target genes among malignant, benign and normal ovarian tissues (t test, p < 0.05 as the cut-off criterion) using the "ggplot2" and "ggpubr" packages in R. The clinical significance of the target genes was evaluated by online Kaplan-Meier survival analysis (http://www.kmplot.com/). A total of 1436 mRNA data samples for progression-free survival (PFS) and 1657 mRNA data samples for overall survival (OS) of epithelial ovarian cancer were interrogated. The patients were split into two groups (high vs. low) based on the expression level.
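As a minimal sketch of the two calculations in this paragraph, the Python below computes a Pearson correlation coefficient and splits samples into high and low expression groups. The median cut-off is an assumption made here for illustration, since the text does not state which cut-off the online tool applied, and all expression values are hypothetical.

```python
import math
import statistics


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mean_x, mean_y = statistics.fmean(x), statistics.fmean(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    ss_x = sum((a - mean_x) ** 2 for a in x)
    ss_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(ss_x * ss_y)


def split_high_low(values):
    """Label each sample 'high' or 'low' relative to the median (assumed cut-off)."""
    cut = statistics.median(values)
    return ["high" if v > cut else "low" for v in values]


# Hypothetical expression values for WFDC2 and one target gene
wfdc2 = [5.1, 6.3, 7.8, 8.4, 9.0]
target = [9.2, 8.1, 7.5, 6.0, 5.5]

print(round(pearson(wfdc2, target), 3))   # negative value: inverse correlation
print(split_high_low(target))
```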

Validation of target genes by qRT-PCR
For validation, qRT-PCR was conducted as previously described [25]. Ten target genes were selected; the primers were designed and purchased as previously described and are presented in Table 1. All reactions were performed in triplicate, and the specificity of PCR amplification was determined by melting curve analysis.

Protein-protein interaction (PPI) network and modular analysis
Metascape was used to establish a PPI network, and proteins with degree > 1 were selected. The network analyzer "CentiScaPe" in Cytoscape was used to analyze the topological properties of the network. Genes with a degree of connectivity ≥ 30 were defined as hub genes. "Molecular Complex Detection" (MCODE) in Metascape was used to analyze modules of the PPI network, with the degree cut-off set to 2. The seeds of the key modules were identified and subjected to GO and KEGG pathway analysis; finally, their clinical significance was evaluated.
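Hub gene selection by degree can be illustrated with a short Python sketch. It counts the distinct interaction partners of each node in an undirected edge list and keeps the nodes at or above a cut-off. The edge list and the small cut-off in the example are hypothetical; the study used Cytoscape on the Metascape network with a cut-off of 30.

```python
from collections import defaultdict


def node_degrees(edges):
    """Return the number of distinct neighbours of each node (undirected graph)."""
    neighbours = defaultdict(set)
    for a, b in edges:
        if a == b:          # ignore self-interactions
            continue
        neighbours[a].add(b)
        neighbours[b].add(a)
    return {node: len(partners) for node, partners in neighbours.items()}


def hub_genes(edges, min_degree=30):
    """Genes whose degree is >= min_degree, sorted by descending degree."""
    degrees = node_degrees(edges)
    hubs = [gene for gene, degree in degrees.items() if degree >= min_degree]
    return sorted(hubs, key=lambda gene: (-degrees[gene], gene))


# Hypothetical toy network; the duplicate edge B-A is counted once
toy_edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("B", "A")]
print(node_degrees(toy_edges))
print(hub_genes(toy_edges, min_degree=3))
```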

Results
HE4 gene transfection identification and RNA quantity assessment
As detected by qRT-PCR and Western blot, HE4 gene and protein expression levels were significantly lower in the HE4 shRNA-transfected cells than in the Untreated and Mock cells (Fig. 1A and B, all P < 0.01), and there was no statistical difference in HE4 between the latter two groups (P > 0.05). The RNA quantity and purity assessment showed that both samples passed the criteria for amplification yield and labeling efficiency (Table 2).

Gene expression analysis and clustering
After chip hybridization and data acquisition, volcano analysis displayed the distribution of the 18398 expressed genes (Fig. 1C). With log2|FC| ≥ 1 and P-value < 0.05 as cut-off criteria, 713 DEGs were identified, of which 549 were downregulated and 164 were upregulated (Fig. 1D; the raw data are available in Supplementary Table 1 and the full gene list in Supplementary Table 2). Among the DEGs with log2|FC| > 2, heatmap analysis revealed that 5 DEGs (EPS15, MSMO1, TMPO, ECT2 and ZMYND11) had higher expression levels in S4 cells than in S cells, and 21 DEGs, such as PABPC1, AP3S1, TMX2, PHF6, NR1D2, RAB23 and NEK7, had lower expression levels in S4 cells than in S cells (Fig. 1F).

Gene ontology function analysis of DEGs
We performed gene ontology (GO) enrichment analysis by submitting all the DEGs to the "clusterProfiler" package in R to determine their biological functions. The DEGs were classified into three functional groups: biological process (BP), cellular component (CC) and molecular function (MF). The most enriched BP terms were coenzyme metabolic process and cofactor biosynthetic process. For CC, nuclear speck and nuclear inner membrane were the most enriched. For MF, single-stranded RNA binding and translation factor activity were the most enriched (Fig. 2A). To gain more biological insight, we applied Metascape [27] to identify the BP terms in which the DEGs participated; the enrichment of the significant GO terms is shown as a heatmap (Fig. 2B, C). Among the diverse terms highlighted, several are related to oncogenesis, such as "cell division", "DNA repair", "regulation of growth", "regulation of DNA metabolic process", "regulation of mitotic cell cycle", "phosphatidylinositol phosphorylation", and "signal transduction by p53 class mediator". These results are consistent with a role for HE4 in the tumorigenesis and development of epithelial ovarian carcinoma. Interestingly, the terms "response to wounding" and "cellular response to glucose starvation", which were modulated by HE4, are associated with immune response regulation and autophagy, which hints at a possible capacity of HE4 to drive the production of immune mediators and to regulate autophagy. In the GSEA analysis with GO biological process as the gene set, a total of 132 items were enriched with NOM p-val < 0.05. "GO_MAP_KINASE_KINASE_KINASE_ACTIVITY" was the most significantly enriched, with the highest NES score (NES = 2.166, NOM p-value = 0) (Supplementary Table 3), indicating that MAP kinase signaling may participate in the tumorigenesis induced by HE4 activation (Fig. 2D).

Pathway enrichment analysis of DEGs
KEGG pathway enrichment analysis of the 713 DEGs was conducted using online Metascape [27]. Nineteen KEGG pathways were enriched among the DEGs with the criteria of minimal overlap ≥ 3, p value cutoff < 0.01, and minimal enrichment = 1.5. The MAPK signaling pathway, TNF signaling pathway, PI3K-Akt signaling pathway, p53 signaling pathway and cell cycle were highly enriched, and the MAPK signaling pathway was the most significantly enriched (Fig. 2E (1)). To analyze and integrate the pathways involved in different gene lists, we divided the DEGs into upregulated and downregulated groups and uploaded them to Metascape for a new pathway analysis. This analysis covered the pathway resources currently included in Metascape (KEGG, Hallmark Gene Sets, Reactome Gene Sets, Canonical Pathways, and BioCarta Gene Sets), with the same criteria as above. The MAPK pathway, cell cycle, and PI3K-AKT-mTOR pathways were highly enriched, and the MAPK signaling pathway was again the most significantly enriched in S4 cells compared with S cells (Fig. 2E (2)). To further explore enrichment of the MAPK signaling pathway across different gene set collections in the complete microarray dataset, we used GSEA and found that the MAPK signaling pathway was significantly enriched in the BioCarta, PID, and Reactome gene sets in S4 cells compared with S cells (all nominal p-val < 0.05, Fig. 2F). This analysis suggests that the MAPK pathway may be critical in HE4-driven oncogenesis in epithelial ovarian cancer.
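To make the three enrichment criteria concrete, the following Python sketch tests one pathway against a gene list using a cumulative hypergeometric p-value and an enrichment factor (observed overlap divided by the overlap expected by chance). It assumes that the online tool computes its p-values from the hypergeometric distribution; the pathway size and overlap in the example are hypothetical, while the background of 18398 expressed genes and the list of 713 DEGs come from the text.

```python
from math import comb


def hypergeom_sf(k, population, annotated, drawn):
    """P(X >= k) for X ~ Hypergeometric(population, annotated, drawn)."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    hits = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return hits / total


def enrichment_factor(k, population, annotated, drawn):
    """Observed overlap relative to the overlap expected by chance."""
    return (k / drawn) / (annotated / population)


def is_enriched(k, population, annotated, drawn,
                min_overlap=3, p_cutoff=0.01, min_enrichment=1.5):
    """Apply the three criteria quoted in the text to one pathway."""
    return (
        k >= min_overlap
        and hypergeom_sf(k, population, annotated, drawn) < p_cutoff
        and enrichment_factor(k, population, annotated, drawn) >= min_enrichment
    )


# 18398 expressed genes and 713 DEGs are from the text;
# a pathway of 300 genes with 25 DEGs in it is a hypothetical example.
print(enrichment_factor(25, 18398, 300, 713))
print(is_enriched(25, 18398, 300, 713))
```

A pathway can therefore fail even with a small p-value if fewer than three DEGs fall in it, which guards against enrichment calls driven by one or two genes.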

HE4 coexpression analysis and validation
Genes coexpressed with HE4 (encoded by WFDC2) in epithelial ovarian cancer were identified from cBioPortal, based on the TCGA PanCancer Atlas, which includes 585 ovarian serous cystadenocarcinoma tissues. In total, 870 WFDC2 coexpressed genes were selected with |Spearman's correlation score| > 0.2 and p-value < 0.05 as criteria (Supplementary Table 4). Twenty-six genes overlapped between the DEGs and the WFDC2 coexpressed genes, including 19 positively and 7 negatively correlated with HE4 (Table 3). To validate these target genes, the GEO dataset GSE51088 [30], which included 152 epithelial ovarian cancer patients (including 11 epithelial ovarian borderline tumors), 5 benign ovarian tumor patients and 15 normal healthy ovarian tissues, was used to calculate correlation coefficients. Ten genes were markedly correlated with HE4 expression, including 8 negatively correlated genes (TMEM220A, SEC23A, FRMD6, PMP22, APBB2, DNAJB4, ERLIN1, and ZEB1) and 2 positively correlated genes (RAB6B and PLEKHF1) (Fig. 3A). To further validate the microarray data, the ten target genes were subjected to qRT-PCR to detect their differential expression in S and S4 cells (Fig. 3B). For these genes, the qRT-PCR data were entirely consistent with the gene chip results (all P < 0.05). To evaluate the clinical significance of the 10 target genes, we examined their association with tumor type in GSE51088. All ten genes were significantly associated with tumor type: TMEM220A, SEC23A, FRMD6, PMP22, APBB2, DNAJB4, ERLIN1, and ZEB1 were downregulated in ovarian cancers compared with benign ovarian tumors and normal tissues, whereas RAB6B and PLEKHF1 were upregulated in ovarian cancers compared with benign ovarian tumors and normal tissues (Fig. 4A).
Kaplan-Meier survival analysis was performed for a large cohort of ovarian cancer patients. In total, data from 1657 epithelial ovarian cancer patients were interrogated, and hazard ratios (HR) and p-values for statistical significance were determined. The data are summarized in Table 4. Interestingly, all 10 target genes correlated with overall prognosis (Fig. 4B). These 10 target genes should be further investigated to explore their association with HE4 expression, which might reveal the mechanism of HE4 during the development of epithelial ovarian cancer.

Protein-protein interaction (PPI) network and modular analysis
The PPI enrichment analysis was carried out in Metascape online. The resultant network contains the subset of proteins that form physical interactions with at least one other member of the gene list. The "CentiScaPe" plug-in in Cytoscape was used to identify the hub genes. Proteins with degree > 1 were selected. In total, 289 nodes (40.4% of all 713 DEGs) and 2942 PPI relationships were obtained (Fig. 5A). Sixteen genes with a degree of connectivity > 30 were defined as hub genes for HE4 activation (Table 5). In order of degree rank, the sixteen hub genes were HSPA1B, HSPA1A, SUMO1, CDK1, MAX, PABPC1, MAGOH, HNRNPU, YWHAG, RANBP2, SRSF1, CNBP, U2AF2, RNPS1, SMAD3, and POLR2D. These hub genes could interact with each other, which suggests that they might play an important role in HE4 activation and should be further studied in EOC. The MCODE algorithm [31], as implemented in Metascape, was used to identify densely connected network components; 8 modules were identified and ranked by descending Score, from 8.65 to 1.00 (Supplementary Table 5). The five most significant MCODEs (Score > 2) were extracted (Fig. 5B).
To further explore the biological function of the seeds of the five key modules (SMNDC1, HSPA1A, FNBP1L, GAR1, and SKA2), functional enrichment analysis was performed in Metascape with Min Overlap = 2 and P value cutoff = 0.05. The main enriched GO terms were regulation of microtubule polymerization or depolymerization and positive regulation of cellular component biogenesis. The pathway analysis showed marked enrichment of cell cycle and spliceosome (Fig. 5C).
To determine the clinical significance of the 5 seed genes of the key modules, Kaplan-Meier survival analysis was performed for a large cohort of ovarian cancer patients. In total, data from 1657 epithelial ovarian cancer patients for OS and 1436 patients for PFS were interrogated, and hazard ratios (HR) and p-values for statistical significance were determined. As shown in Fig. 5D, all 5 seed genes correlated with both OS and PFS.

Discussion
Ovarian cancer is the seventh most common malignant tumor in the world. In 2012, there were an estimated 238,719 incident cases, and the age-standardized rate was 6.1/100,000 [32]. For 2020, the American Cancer Society estimated approximately 21,750 new ovarian cancer cases in the United States and 13,940 deaths from the disease. Owing to its occult onset and innocuous symptoms, most patients with ovarian cancer are diagnosed at an advanced stage. Despite the development of new anti-tumor drugs and improvements in surgical treatment, survival rates decline dramatically from 92% for patients with Stage I disease to 17-28% for those with advanced disease (Stages III-IV) [33]; most patients with advanced-stage disease eventually relapse and develop chemotherapeutic resistance. Although the serum biomarker CA125 is widely used in clinical practice for diagnosis and differentiation, population-based screening using serum CA125 assessment with the risk of ovarian cancer algorithm (ROCA) did not significantly reduce mortality and has proved ineffective [34]. Thus, it is urgent to clarify the mechanism underlying the oncogenesis of EOC and to identify tumor biomarkers that facilitate early diagnosis and targeted therapy or prevention.
As a new tumor biomarker, HE4 has attracted considerable attention in recent years. However, most research has focused on its clinical application in early and differential diagnosis, relapse, prognosis, chemotherapeutic resistance, and other clinical aspects of EOC [12,35], and few studies have addressed its mechanism in ovarian cancer. The lack of a unanimous conclusion on its roles in the tumorigenesis and progression of EOC may be why HE4 has not yet been established as a therapeutic target. As early as 2011, Gao L. et al. [16] reported that overexpression of HE4 promoted ovarian cancer cell apoptosis and adhesion and that HE4 may inhibit ovarian cancer cell proliferation, migration and invasiveness, as well as xenograft tumor formation in vivo; they concluded that HE4 might play a protective role in the progression of EOC. Further, in 2014, Kong et al. [19] found in vitro that this protective influence may be attained by regulating the MAPK and PI3K/AKT pathways. In contrast, other researchers noted that high HE4 expression promotes cell migration, adhesion, proliferation, and spreading, which may be associated with its effects on the EGFR-MAPK signaling pathway [18,36]. Moreover, HE4 carries a fucosylated modification (Lewis y antigen) [37], and Lewis y overexpression can promote HE4-mediated invasion and metastasis in ovarian cancer cells [38]. Overexpression of Lewis y antigen enhanced tyrosine phosphorylation of EGFR and HER/neu, which increased cell proliferation through the PI3K/Akt and Raf/MEK/MAPK pathways [39]; hence, Lewis y antigen and HE4 may affect similar signaling pathways that promote tumor growth and malignancy [40].
HE4 overexpression promotes ovarian cancer cell xenograft tumor growth in vivo, and antisense targeting of HE4 can suppress this effect. HE4 also interacts with constituents of the tumor microenvironment (EGFR, IGF1R, insulin) and with the transcription factor HIF1α. These results provide evidence that HE4 is tied to growth factor signaling and the MAPK/ERK pathway [41]. Annexin A2 (ANXA2) was identified as a robust interacting partner of HE4 by mass spectrometry and co-immunoprecipitation. The HE4-ANXA2 complex can promote ovarian cancer cell invasion and migration in vitro and distant metastasis to the lung in vivo, and downregulation of HE4 decreases the expression of MKNK2 and LAMB2, which are associated with the MAPK and focal adhesion signaling pathways [5].
Ovarian cancer is known to evade immunosurveillance and to orchestrate a suppressive immune microenvironment. A series of studies by James NE, et al. [23,44,45] found that, upon exposure of purified human peripheral blood mononuclear cells (PBMCs) to HE4, osteopontin (OPN) and DUSP6 were the most inhibited and the most upregulated genes, respectively. The proliferation of human ovarian carcinoma cells in conditioned media from HE4-exposed PBMCs was enhanced, and this effect was attenuated by adding recombinant OPN or OPN-inducible cytokines (IL-12 and IFN-γ). HE4 can compromise both OPN-mediated T cell activation [44] and cytotoxic CD8+/CD56+ cells through upregulation of self-produced DUSP6 [45], thus promoting the tumorigenesis of ovarian cancer [23,44,45,48]. Other researchers found that HE4 promotes carcinogenesis of ovarian cancer by combining with histone deacetylase 3 (HDAC3) to activate the PI3K/AKT pathway [46], and that HE4 knockdown suppresses invasive cell growth and malignant progression of ovarian cancer by inhibiting the JAK/STAT3 pathway [24]. To date, only a few studies have begun to delineate the role of HE4 in the chemoresistance of ovarian cancer. Overexpression of HE4 promotes the collateral resistance of ovarian cancer cells to cisplatin and paclitaxel, and downregulation of HE4 partially reverses resistance to multiple chemotherapeutic agents. HE4-mediated chemoresistance might be related to a variety of factors, including deregulation of MAPK signaling (EGR1 and p38 inhibition) and alterations of tubulin levels or stability; recombinant HE4 could upregulate the levels of α-tubulin, β-tubulin and microtubule-associated protein tau (MAPT) [41,43].
Similarly, in vitro, HE4 represses carboplatin-induced apoptosis, and recombinant HE4 increases BCL-2 expression and decreases Bax (Bcl-2 associated X protein) expression in carboplatin-treated ovarian cancer cells, which reduces the Bax/Bcl-2 ratio. In addition, HE4 suppresses EGR1 expression, which may contribute to the overall reduction of pro-apoptotic factors that leads to EOC chemoresistance [26]. HE4 upregulates DUSP6, and the two are positively correlated. DUSP6 deactivates extracellular signal-regulated kinase (ERK), and inhibition of DUSP6 can alter the expression of ERK pathway response genes (EGR1 and c-JUN) and sensitize ovarian cancer cells to chemotherapeutic agents (paclitaxel or carboplatin) [23,48]. The resensitization of ovarian cancer cells to cisplatin and paclitaxel caused by HE4 knockdown is attributed to the corresponding decreases in ERK and AKT upon knockdown, since activation of these pathways inhibits the apoptotic signal of tumor cells [21].

Conclusions
Taken together, the mechanism underlying the contribution of HE4 to tumorigenesis, progression and chemoresistance in EOC has not been sufficiently established. Microarray analysis of HE4, which can provide high-throughput data for molecular function research, is therefore particularly valuable. In this study, we analyzed the change in the gene expression profile after active HE4 protein stimulation in EOC cells and used online tools to delineate the enriched pathways and interaction network and thereby decipher the cellular biological processes involved. We found 713 DEGs (164 upregulated and 549 downregulated), and pathway enrichment showed that the MAPK pathway accounted for a significant portion of the enriched terms. WFDC2 coexpression analysis revealed ten WFDC2 coexpressed genes (TMEM220A, SEC23A, FRMD6, PMP22, APBB2, DNAJB4, ERLIN1, ZEB1, RAB6B, and PLEKHF1) that were also markedly changed in S4 cells and validated by dataset GSE51088. Kaplan-Meier survival statistics revealed clinical significance for all 10 target genes. Finally, a PPI network was constructed, and sixteen hub genes and eight Molecular Complex Detection (MCODE) modules were identified; the seeds of the five most significant MCODEs were subjected to GO and KEGG enrichment analysis, and their clinical significance was evaluated. These data have not been reported in previous studies and offer a new approach to further clarifying the mechanism of HE4 in the oncogenesis and chemoresistance of EOC.

Declarations
Availability of data and materials
The datasets used and/or analyzed in the present study are available from the supplementary files or the corresponding author on reasonable request.
Ethics approval and consent to participate
Not applicable.

Figure legend (fragment): (9) and (10), two genes (RAB6B and PLEKHF1) were upregulated in ovarian cancers compared with benign ovarian tumors and normal tissues. C. Online Kaplan-Meier survival analysis revealed that all ten correlated genes were significantly associated with overall survival.