Bioinformatic Analysis: Screening and Identication of Differentially Expressed Genes Modied by m6A in Ovarian Cancer

Background: N6-methyladenosine(m6A) is one of the most common RNA modications that occurs at the nitrogen-6 position of adenine. Emerging evidence has revealed that regulatory functions of m6A play an essential role in the development of cancer. However the study of m6A in ovarian cancer(OC) is still in our infancy. In this work ,we aimed to identify and analysis the differentially expressed genes(DEGs (cid:0) modied by m6A which can provide new therapeutic targets and key biomarkers in OC. Methods: We downloaded Microarray datasets GSE146553 and GSE124766 from Gene Expression Omnibus (GEO) database. The differentially expressed genes (DEGs) were identied by GEO2R analysis tools. Subsequently, The DAVID database was used to construct Enrichment analysis of GO and KEGG pathways. Next, the DEGs modied by m6A were identied by m6AVar database. Finally, the functional analysis and clinical sample validation of these genes were veried by ONCOMINE, GEPIA, cBioPortal online platform and Kaplan-Meier Plotter. Results:152 DEGs were selected ,and the DEGs were mainly enriched in extracellular exosome, spindle microtubule, response to hypoxia and cell cycle .And we identied 15 DEGs which were modied by m6A:MAPK10 GATM and ANGPTL1. After statistical analysis, two DEGs (SCN7A and GAMT) were selected for detailed study. We revealed that SCN7A and GAMT were expressed at a low level in OC. Afterwards, Survival analysis showed that SCN7A and GAMT expression were correlated with OC overall survival. And the expression of SCN7A


Background
Ovarian cancer (OC) is the most lethal gynecological cancer and one of the leading causes of cancerrelated death in women worldwide [1,2] .Aggressive frontline treatments with surgery and targeted chemotherapy are the main treatment of OC.However, due to late diagnosis, extensive metastasis and rapid development of chemotherapy resistance, the ve-year survival rate of OC has not been improved [3] .In order to reverse this situation, it is urgent to explore the molecular mechanism of ovarian cancer. Increasing evidences has revealed that abnormal gene expression and mutation play an important role in the pathogenesis of ovarian cancer. For example, overexpression of HOXC10 promotes the migration of ovarian cancer cells [4] . In addition,H2AX is highly expressed in OC which can be used as a new prognostic biomarker for OC [5] . Gene expression is regulated by various factors, including histone modi cation, DNA modi cation, RNA modi cation and so on. It has been recognized that epigenetic modi cation of DNA can regulate gene expression and chromatin tissue [6] . Recently, an additional regulatory layer called"epitranscriptomics" has been created, which depends on the chemical modi cation of RNA. Among them, N6methyladenosine(m6A) is the most common internal modi cation of mRNA [7,8,9,10] . It refers to the methylation of adenosine base at position 6 of nitrogen. This methylation is a dynamically reversible modi cation that can be installed, removed and recognized by methyltransferases, demethylases and readers, respectively [11] . According to their functions, they can be divided into three families( Fig. 1):"Eraser" protein (FTO and ALKBH5), which can remove the m6A modi cation from RNA [12] ."Reader" proteins (YTHDC1/2,YTHDF1/2/3 IGF2BP1 /2/3 hnRNPC hnRNPG, etc).Their function is to speci cally recognize and combine with the modi ed records of m6A [13,14,15] ; The main function of the last group of "Writer" proteins(METTL3, METTL14, WTAP, HAKAI, etc) is to catalyze the formation of m6A, which is a protein complex, named multicomponent m6A methyltransferase complex (MTC). METTL3 is the only catalytic subunit using S-adenosylmethionine (SAM) as a methyl donor [18,19,20] .It was reported that m6A modulator is a key participant in the malignant progression of hepatocellular carcinoma and a potential prognostic target [21] .Another study considered that m6A plays an important role in Gastrointestinal Cancer via PI3K/Akt and mTOR signaling pathway [22] . Furthermore, high expression of m6A methyltransferase METTL3 can maintain tumorigenicity of colon cancer cells by inhibiting SOCS2 [23] .However,the role of m6A in the development of OC needs further study, and the signi cance of its modi ed differential expression genes(DEGs) in OC needs to be explored. Therefore,in this study we downloaded and analyzed two mRNA microarray data sets from Gene Expression Omnibus (GEO) to obtain DEGs between ovarian cancer and non cancer tissues. In order to help us understand the molecular mechanism of cancer occurrence and development, we also analyzed the KEGG and GO enrichment of DEGs. After that, we analyzed the genes modi ed by m6A in OC by m6Var database, crossed with the selected DEGs, selected two DEGs modi ed by m6A(SCN7A and GAMT), and analyzed these two candidate genes in detail. The combined analysis of m6A and SCN7A/GAMT will help us to explore the potential molecular mechanism of OC. It can also provide new diagnostic markers and therapeutic targets for the clinical treatment of OC.

Methods
Microarray data. GEO (http://www.ncbi.nlm.nih.gov/geo) [24] is is a common database for storing chips, second generation sequencing, and other high-throughput sequencing data.. Two gene expression datasets (GSE146553 [25] and GSE124766 [26] ) were downloaded from GEO.By annotating the information, we rst obtain the genetic symbol which converted from the probe.The GSE146553 dataset included 26 OC tissue samples and 3 normal samples. The GSE124766 included 8 OC samples and 3 normal samples.

2.1.Identi cation of DEGs.
GEO2R(http://www.ncbi.nlm.nih.gov/geo/geo2r) is an online analysis tool for GEO database, which can analyze differentially expressed genes in two or more sample datasets in GEO.The DEGs between OC and normal samples were screened using GEO2R.logFC (fold change) > 1.5 or LogFC<-1.5 and P-value < 0.01 were considered statistically signi cant.Furthermore, the online analysis tool of Wenn chart(http://bioinformatics.psb.ugent.be/webtools/Venn/)was used to obtain the DEGs contained in both data sets.

2.2.PPI network construction.
Search Tool for the Retrieval of Interacting Genes (STRING; http://string-db.org) (version 10.0) [27] was used to predicted the PPI network.The purpose of this operation is to explore the functional interactions between proteins.Cytoscape (version 3.4.0) [28] is a graphical display of the network software, but also for analysis and editing. It supports a variety of network description formats.Simultaneously,it can use the own editor module directly to build the network.The PPI networks and up/down regulation of the DEGs were drawn by Cytoscape.

2.3.KEGG and GO enrichment analyses of DEGs.
The Database for Annotation, Visualization and Integrated Discovery (DAVID; http://david.ncifcrf.gov) [29] is a biometric database and free online analysis software. It integrates biological data and analytical tools to provide systematic and comprehensive biological functional annotation information for large-scale lists of genes or proteins to help us extract biological information from them.At present DAVID is mainly used for GO and KEGG pathway enrichment analysis.
KEGG [30] (Kyoto encyclopedia of gene and genomes)is a comprehensive database that integrates genomic, chemical and system functional information. It enables us to understand advanced functions and biological systems, because it can obtain molecular level information through high-throughput experimental techniques, especially genome sequencing generated by large molecular data sets.
GO(Gene Ontology)is regarded as a key bioinformatics tool for annotating and classifying genes according to biology process, molecular function and cellular location [31] . To analyze the function of DEGs, biological analyses were performed using DAVID online database. P < 0.05 was considered statistically signi cant.

2.4.Application of m6AVar database.
M6AVar database(http://m6avar.renlab.org/index.html) [32] is a kind of database which records functional variants involved in m6A modi cation.It contains the site and speci c location of m6A modi cation, related sequence, gene type, sample source, evidence con dence and so on. We performed m6AVar to identify the m6A modi ed DEGs and analyse the details of the modi cation process (modi cation sites, gene types, related binding proteins, etc).

3.1.Identi cation of DEGs in OC and PPI network construction.
Two standard microarray datasets GSE146553 and GSE124766 were selected from GEO database and analyzed online by GEO2R respectively. A total of 689 and 970 DEGs were screened out,.As shown in Venn Fig. 2: 152 DEGs overlapped in the two data sets. The function of genes in the body requires the interaction between genes or proteins which helps us to study the mechanism of the occurrence and development of diseases. So we built the PPI with STRING (Fig. 3A). In addition, we used the Cytoscape database to mark up and down regulated genes (Fig. 3B).

3.2.KEGG and GO enrichment analyses of DEGs.
We performed DAVID to nish the KEGG and GO enrichment analyses. GO Analysis is divided into three parts:biological processes (BP), molecular function (MF) and cell component (CC).The result showed that changes in BP of DEGs were mainly enriched activation of protein kinase activity ,response to hypoxia and signal transduction (Table 1).Changes in MF were noticeably enriched in binding ,structural molecule activity and protein homodimerization activity (Table 1).Changes in CC of DEGs were greatly enriched in extracellular exosome, spindle microtubule, membrane and mitotic spindle (Table 1). KEGG pathway analysis demonstrated that the downregulated DEGs were signi cantly enriched in tight junction and epithelial cell signaling in Helicobacter pylori infection ,whereas the upregulated DEGs were mainly enriched in cell cycle and Oocyte meiosis (Table 1) .

3.4.Function analysis and clinical sample certi cation of two identi ed DEGs.
Firstly, Oncomine analysis of cancer and normal tissues showed that SCN7A and GATM were dramatically lower expressed in OC in different data sets (Fig. 5A-F). This result was also con rmed in GEPIA database (Fig. 5G, H). Afterwards,to accurately investigate the relationship between the survival time of OC patients and the expression levels of SCN7A and GATM, we analyzed the TCGA data of ovarian cancer by Kaplan-Meier Plotter database. the results suggested that the overall survival time (OS) of patients with low SCN7A and GATM expression in OC was shorter than that in the high expression group (SCN7A p = 0.0048, GATM p = 0.043) (Fig. 6A,B). So we can see the expression levels of SCN7A and GATM had a signi cantly impact on the overall survival time of patients. Afterwards, we analyzed SCN7A and GATM by stages, GEPIA database analysis indicated that the expression of SCN7A and GATM was different in different TNM stages of OC, and the expression of SCN7A and GATM in stage II and III was higher than that in stage IV (Fig. 7A,B). Kaplan-Meier Plotter database explored that the expression levels of SCN7A and GATM could predict the prognosis of patients with stage I / II and III / IV (SCN7A p I-II = 0.01, SCN7A p III-IV = 0.0011, GATM p I-II = 0.0025, GATM p III-IV = 0.043) (Fig. 7C-F). 3.5.Analysis and hypothesis on the process of m6A modi cation In order to further investigate the effect of m6A modi ed DEGs on the pathogenesis of OC, we analyzed the modi cation process in detail. We analyzed and sorted out the occurrence site, speci c location, related binding protein, gene type, sample source and evidence credibility of m6A modi cation using m6AVar database (Table.4).These data imply that GATM has 11 (2 Mouse,9Human) research results, SCN7A has 94 (6 Mouse, 88 human) research results (due to the large number of SCN7A studies, the representative ones are selected for summary).The m6A modi cation of GATM mainly focused on chromosome 2,15,16. The related binding proteins were EIF4A3 and FUS. The former occurred mainly on the downstream of 3 'AG, while the latter occurred on the upstream of 5' GT. It can be hypothesized that M6A silented GATM via EIF4A3 or FUS to promote the occurrence and development of OC. The m6A modi cation of SCN7A mainly focuses on chromosome 2, whereas the protein pathway on which the m6A modi cation dependented has not been documented.

Discussion
As one of the three major gynecological malignancies, ovarian cancer is a highly destructive and heterogeneous gynecological malignancy. Even with the development of many surgical techniques and chemotherapy, the overall ve-year survival rate is still as low as 47% [37] . Besides, OC is the fourth leading cause of cancer-related deaths in China [38] . It is worth noting that the vast majority of ovarian cancer patients are in advanced stage when they are diagnosed, and the 5-year survival rate is less than 30% [39] .
These unsatisfactory data are mainly due to poor early diagnosis rate of OC [40,41] and poor prognosis [42] . At present, the internationally recognized treatment for ovarian cancer is cytoreductive surgery, followed by platinum based chemotherapy. However, primary and secondary drug resistance may occur during chemotherapy of ovarian cancer, which makes it fail to achieve the desired effect [43] . Recently,it was discussed that the unknown pathogenesis of OC is the main challenge to develop new diagnostic markers and therapeutic targets [44] . This prompted us to further explore the molecular mechanism of ovarian cancer progression, in order to nd valuable biomarkers and new treatment methods.
As one of the most common mRNA modi cations, m6A is closely related to the development of human beings.RNA m6A methylation is a post transcriptional modi cation of adenine at position 6. This modi cation has been found in most eukaryotic mRNA, tRNA, rRNA and other noncoding RNAs. Its important regulatory functions include the regulation of gene expression, biological development and cancer development.Especially the proliferation, apoptosis and metastasis of cancer. In breast cancer (BC), METTL3 ("Reader" protein of m6A modi cation) promotes HBXIP expression by increasing m6A modi cation, thus promoting BC proliferation [45] . Another study on glioblastoma stem cells (GSC) showed that the m6A modi cation of METTL3 and METTL14 in GSC inhibited the expression of ADAM19, EPHA3 and other oncogenes, inhibited the growth and self-renewal of GSC, thus inhibiting tumorigenesis [46] . In addition, it was reported that M6A modi cation also plays an important role in endometrial carcinoma(EC).
High expression of METTL3 promotes the development of endometrioid epithelial ovarian cancer (EEOC) by regulating abnormal methylation of m6A RNA, indicating malignant tumor and low survival rate of patients [47] . However, the study of m6A modi cation in ovarian cancer has only started in recent years. The fat mass and obesity associated protein (FTO) is a kind of m6A demethylase ("Eraser" protein of m6A modi cation). It can inhibit the self-renewal of ovarian tumor and cancer stem cells (CSC) and inhibit tumorigenesis in vivo, this process is completed by inhibiting cAMP signal transduction [48] . Overexpression of TLR4 activates the NF -κ B pathway to upregulate the expression of ALKBH5 and increase the level of m6A, which promotes the occurrence of ovarian cancer. This discovery provides clues for the invention of new targeted therapy methods in OC [49] . YTHDF1 ("Reader" proteins of m6A modi cation) can promote the occurrence and metastasis of ovarian cancer by binding with E6A modi ed EIF3C mRNA, and the up regulation of YTHDF1 is associated with poor prognosis of ovarian cancer patients [50] . Another study showed that METTL3 plays a carcinogenic role partly through AkT signaling pathway in the process of esophageal cancer, which indicates that METTL3 can be used as a potential therapeutic target for esophageal cancer treatment [51] . Thus, the above results demonstrated that m6A modi cation often depends on the related protein signaling pathway to abnormal expression of targeted genes, to participating in the proliferation, apoptosis and metastasis of cancer.Herein,t he purpose of our study is to identify the differentially expressed genes(DEGs) modi ed by m6A in OC and investigate the protein pathway that the modi cation depends on.
In this study, two mRNA microarray data sets were analyzed to obtain DEG between OC and non-cancerous tissues. A total of 152 DEG were identi ed in two data sets, including 75 down-regulated genes and 77 upregulated genes. GO and KEGG enrichment analysis was carried out to explore the interaction between DEGs. The over-expressed genes were mostly enriched in cell cycle, mitotic spin, activation of protein kinase activity, response to hydroxyia and oocyte meiosis, while the low-expressed genes were mostly enriched in signal transduction and protein domestication activity. It has been reported that the imbalance of cell cycle process and mitotic cell cycle play an important role in the occurrence or development of tumor [52,53,54] .
Furthermore, A large number of reports have suggested that complement activation is another way to promote tumor [55] . In addition, hypoxia is the common result of rapid growth of solid tumors and abnormal vascular structure, which has been recognized as one of the most important characteristics of tumor microenvironment (hallmark) [56] . More importantly, Hypoxia also induces the invasion of ovarian cancer to increase, and to chemoradiotherapy,including not sensitive or even resistant [57,58,59,60] . Consistent with these results, our results are consistent with all these theories. Subsequently,we also screened 15 m6A-modi ed DEGs through the m6AVar database. The candidate genes were analyzed byKaplan-Meier Plotter database,GEPIA and cBioPortal online platform, two DEGs were selected for detailed analysis: SCN7A and GATM. In different data sets, SCN7A and GATM were signi cantly low expressed in OC, and then the impact of the expression levels of the two genes on the prognosis of patients was studied. The results con rmed that the expression levels of SCN7A and GATM had a signi cant impact on the overall survival time of patients.Besides, the expression levels of SCN7A and GATM in OC of different TNM stages were different, and the expression of SCN7A and GATM in stage II and stage III was higher than that in stage IV. Mounting studies have shown that during the transcriptional regulation of BCL2 family and IAP family genes by ras-pi3k-akt-nf -κ B pathway, SCN7A is regulated and dramatically down regulated, leading to tumor proliferation and inhibiting tumor apoptosis [61] . SCN7A as a BM related gene carrying frequent brain metastasis (BM) speci c mutations. In colorectal cancer (CRC), SCN7A mutation leads to down-regulation of gene expression and promotes BM of CRC [62] . It can be seen that our study on SCN7A is consistent with the above conclusion. GATM is a proximal tubular enzyme involved in creatine biosynthesis [63] , In renal cell carcinoma (RCC), the expression of GATM RNA chimeras is increased, compared with benign adjacent kidneys, this up regulation of RNA chimeras may regulate the cellular mechanism and may affect the survival of patients [64] . All these revealed that the role of SCN7A and GATM in OC needs further research and exploration.
The analysis of the methylation of the two genes was completed through the m6AVar database. The results included 11 GATM studies (2 mouse and 9 human ) and 94 SCN7A studies (6 mouse and 88 human). The modi cation of SCN7A by m6A is mainly concentrated on chromosome 2, but the protein pathway dependent on m6A modi cation has not been recorded. On the other hand, the m6A modi cation of GATM mainly concentrated on chromosome 2, 15 and 16. The related binding proteins were EIF4A3 and FUS. EIF4A3 mainly occurred in the downstream of 3'AG and FUS occurred in the upstream of 5'GT. The present study aimed that CircPVRL3 can activate FUS, LIN28A, PTB and EIF4A3 binding proteins, promote the methylation of m6A, thus promoting the occurrence of gastric cancer (GC) [65] . Linc00667 up-regulated in non-small cell lung cancer (NSCLC) can promote the production of vascular endothelial growth factor A (VEGFA) by binding with EIF4A3, thus promoting the proliferation, migration and angiogenesis of NSCLC cells [66] .It has been previously shown that LncRNA EMX2OS binds directly to FUS protein, and overexpression of both can inhibit the proliferation, migration and invasion of prostate cancer cells and activate the GMP-PKG pathway [67] . In the case, combined with the above information, we can assume that m6A can promote OC cell development by reducing GATM gene expression through EIF4A3 or FUS. Of course, this hypothesis needs to be further veri ed in future research.

Conclusion
The aim of this study was to identify the m6A modi ed DEGs that may be involved in the carcinogenesis or progression of OC. A total of 152 DEGs were identi ed, of which 15 were modi ed by m6A. SCN7A and GATM selected from them can be regarded as diagnostic biomarkers of OC. Finally, it is hypothesized that m6A may be a new therapeutic target for OC by reducing GATM gene expression through EIF4A3 or FUS Page 9/20 protein pathway. Whereas, in order to illuminate the biological function of these genes in OC, further research should be carried out.

Author contributions statement:
H-DL conceived the study, wrote the manuscript, and completed the gures and tables. G-L conceived the study and organized and edited the text. All authors have read and approved the manuscript.

Con icts of interest:
The authors declare that the research was conducted in the absence of any commercial or nancial relationships that may be construed as a potential con ict of interest.
Ethics approval and consent to participate: Not Applicable Consent to publish:

Not Applicable
Availability of data and materials: Data sharing is not applicable to this article as no datasets were generated or analysed during the current study.