Advances in Research on the circRNA-miRNA-mRNA regulatory network in lung adenocarcinoma by analysis of microarray data

Background: The present researches indicate that circular RNAs (circRNAs) play pivotal roles in the generation and development of human cancers. However, function of circRNAs in lung adenocarcinoma is still unknown. Herein, we focused our study on the regulation mechanism of circRNAs in lung adenocarcinoma (LUAC) Methods: CircRNAs-related data can be downloaded from Gene Expression Omnibus(GEO) microarray databases. Relevant expression data of miRNAs and mRNAs were obtained from The Cancer Genome Atlas (TCGA) databases. The differentially expressed circRNAs (DEcircRNAs) were obtained by the robust rank aggregation method, and constructed a ceRNA network with circRNA-miRNA pairs and miRNA-mRNA pairs. The function and pathway enrichment of different genes were analyzed, and the protein-protein interaction (PPI) was predicted by String software, and the subnetwork management module was constructed by MCODE plugin. The differentially expressed mRNAs (DEmRNAs) were used to establish survival analysis by R software. Results: A total of 19 up-regulated circRNAs, 45 down-regulated circRNAs, 98 up-regulated miRNAs, 18 down-regulated miRNAs, 2002 up-regulated mRNAs, and 553 down-regulated mRNAs were obtained in lung adenocarcinoma. The circRNA-miRNA-mRNA network was constructed by six circRNAs (hsa_circ_0003528, hsa_circ_0004315, hsa_circ_0005699, hsa_circ_0002588, hsa_circ_0005777, hsa_circ_0027033), 19 miRNAs and 33 mRNAs. GO and KEGG pathway analysis indicated that DEmRNAs might be related to the progression of LUAC. The PPI network was constructed and four hub genes (CEP55, CHEK1, CDC25A, KIF23) were obtained from the network, and four hub genes were found to be associated with the same miRNA and circRNA, including hsa_circ_0004315/hsa-miR-195/CEP55, CHEK1, CDC25A, KIF23. in and normal samples a robust aggregation a circRNA-miRNA-mRNA regulatory network and PPI network based on DEmRNAs. constructed a circRNA-miRNA-hub gene based on regulatory modules in the genes obtained from the network were analyzed for survival. available microarray data to construct a circRNA-related ceRNA network. High expression of three hub (CEP55, CHEK1, KIF23) genes is associated with the survival rate of lung adenocarcinoma. Therefore the circRNA–miRNA-hub genes (hsa_circ0004315/hsa-mir-195/CEP55,CHEK1, CDC25A, KIF23) regulatory subnetwork reveals an important circRNAs that might be involved in carcinogenesis, providing new insight into the pathogenesis of LUAC, and deserves further study of potential therapeutic targets.


Background
Lung cancer is one of the most common disease of cancer-related death in the past ten years (2008-2017) [1], and the ve-year survival rate for patients with lung cancer is only 19%. Lung adenocarcinoma (LUAC) accounts for 40% of lung cancer types, with more than 1 million deaths annually worldwide [2,3]. In recent decades, circRNAs have great potential as prognostic and diagnostic biomarkers for LUAC [4]. Circular RNAs (circRNAs) contain a large number of non-coding RNAs, which are generated for non-canonical splicing events [5], have high stability because circRNAs are closed circular structures [6], and circRNAs have the function of acting as microRNA (miRNA) sponges to regulate gene expression [7]. Some studies have shown that circRNAs exist stably in some tissues, especially cancer [8], including LUAC [9]. Thus, circRNAs have become a new hot topic in cancer. Xu Y et al. [10] showed that high expression of hsa_ circ_0000326 was related to tumor size, regional lymph node status and differentiation of human lung adenocarcinoma. hsa_ circ_0000326 acts as a sponge for miR-338-3p and changes the function of miR-338-3p, which in turn upregulates the expression of downstream target RAB14 and affects the proliferation, migration and apoptosis of lung adenocarcinoma cells. However, mechanisms circRNAs are still unclear in the process of LUAC.
In this study, the expression pro les of circRNA were obtained from the GEO database and miRNAs and mRNAs were obtained from the TCGA database in LUAC tissues and adjacent normal tissues. Differentially expressed circRNA, miRNA, and mRNA were identi ed by a robust rank aggregation method. Construction of ceRNA network with circRNA-miRNA pairs and miRNA-mRNA pairs. Through functional enrichment of gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG), the functional pathway of mRNA in this network was analyzed. We then established a protein-protein interaction (PPI) network and extracted hub genes from the PPI network. To better understand the pathogenesis of LUAC, we also constructed a circRNA-miRNA-hub gene subnetwork regulation module and carried out a survival analysis of the hub gene. The ow chart of research ideas is presented in Fig. 1.

Material And Methods
Data acquisition GEO (http://www.ncbi.nlm.nih.gov/GEO) is a public functional genome database. CircRNA microarray data were obtained from GSE101684 and GSE112214 data in the GEO database. Sequencing data of miRNA and RNA were downloaded from the TCGA data portal (https://tcga-data.nci.nih.gov/tcga/). The miRNA sequencing data included 483 lung adenocarcinoma tissues and 45 adjacent normal lung adenocarcinoma tissues, and the mRNA sequencing data included 497 lung adenocarcinoma tissues and 54 adjacent normal lung adenocarcinoma tissues. It is not necessary to obtain approval from the Ethics Committee because all data were obtained from public databases (GEO and TCGA).

Acquisition of DEGs
The raw data were normalized and log2 transformed. Differentially circRNAs (DEcircRNAs) were obtained by the Limma package in normal samples and tumor samples. Then, all DEGs were integrated and ranked using the robust rank aggregation method [11]. Differentially expressed miRNAs (DEmiRNAs)and mRNAs (DEmRNAs) were obtained with edgeR software package, with Log [fold change] > 2 and adj.P value < 0.01.

Prediction of miRNA binding sites
Cancer-Speci c CircRNA (CSCD) (https://gb.whu.edu.cn/CSCD/) was used to predict binding sites of miRNA (MREs) [12]. The miRNAs obtained from DEcircRNA in CSCD are considered potential targets for miRNA. These targeted miRNAs were further intersected with the DEmiRNAs from The Cancer Genome Atlas (TCGA).
Predictions of miRNA target gene TargetScan, miRTarBase, and miRDB databases were used to predict the connections between miRNAs and mRNAs [13][14][15]. Only those three databases that identi ed the mRNAs were considered candidate mRNAs. The nal DEmRNAs were obtained from the intersection of candidate genes and DEmRNAs obtained from the TCGA database.

Construction of the ceRNA network
The circRNA-miRNA pairs and miRNA-mRNA pairs were constructed to circRNA-miRNA-mRNA regulatory networks.
Finally, the ceRNA network was established by CytoCope 3.7.2.
GO and KEGG functional enrichment analysis gene ontology (GO) is the main tool for annotating genes and analyzing their biological processes (BPs), molecular functions (MFs), and cellular components (CCs). To investigate the main functional pathway of LUAC, DEmRNAs in the network were evaluated through GO annotation and KEGG pathway with the cluster pro le package in R. P-value <0.01 was considered a statistically signi cant difference.

Construction and module analysis of the PPI network
The String online database (https://string-db.org/) was used to build the PPI network by DEmRNAs. The Molecular Complex Detection (MCODE) app was used to identify hub genes from the PPI network [16].

Survival analysis
Kaplan Meier curve was used to analyze the total survival rate of hub genes selected from the PPI network. The logrank test was used for statistical analysis. The threshold for survival prognosis signi cance was a P-value<0.05. A subnetwork was constructed based on based on mRNAs that were meaningful for overall survival.

Acquisition of DEGs in LUAC
The expression of circRNAs in LUAC and control tissues was evaluated in two microarray datasets (GSE101684, GSE112214), and the basic information is listed in Table 1. In the circRNA expression pro le data, a total of 64 DEcircRNAs, 19 up-regulated and 45 down-regulated were obtained from GEO. The DEcircRNA was selected from the up-regulation and down-regulation to select the highest ranked circRNA for analysis. There are a total of 6 DEcircRNAs, three up-regulation (hsa_circ_0003528, hsa_circ_0005699, hsa_circ_0004315) and three downregulations (hsa_circ_0005777, hsa_circ_0027033, hsa_circ_0002588) (p <0.01) (Fig.2). The basic characteristic of the six circRNAs is listed in Table 2. Their basic structure patterns of six circRNAs are shown (Fig.3). A total of 116 DEmiRNAs, 98 upregulated and 18 downregulated, and 2555 DEmRNAs, 2002 upregulated and 553 downregulated, were identi ed in the TCGA database (p<0.01, logFC>2).

Construction of the ceRNA network
To better understand the relationship between circRNA, miRNA, and mRNA, the circRNA-miRNA-mRNA (ceRNA) network was constructed by per DECs. CSCD database was used for the circRNA-miRNA prediction. The six highestranked circRNAs in the GEO database were found to intersect the corresponding miRNAs from the CSCD database with those obtained from TCGA and identi ed 19 pairs of interacting circRNAs and miRNAs. Then we identi ed mRNA targeted by these DEmiRNAs in three databases (miRBD, miRTarBase, and TargetScan). These targeted mRNAs were cross-checked with the DEmRNAs retrieved from the TCGA database, including 33 DEmRNAs. Finally, we used six circRNA nodes, 19 miRNA nodes, and 33 mRNA nodes in Cytoscape 3.7.2 to construct a ceRNA network. (Fig4) Functional and pathway enrichment analysis GO analysis revealed that above identi ed mRNAs were mainly concentrated in 'regulation of cell cycle', 'renal system development', 'DNA damage checkpoint'(BPs); 'nucleus' (CC); 'cyclin-dependent protein serine/threonine kinase regulator activity'(MF) (Fig 5a). KEGG pathway analysis revealed strong enrichment in the 'MicroRNA in cancer' 'PI3K-Akt signaling pathway' and 'small cell lung cancer' (Fig 5b).

Survival analysis of hub gene
In this study, 522 clinical specimens of lung adenocarcinoma were downloaded from TCGA database. Kaplan-Meier curves were used to analyze the overall survival rate of 33 DEmRNAs in 522 clinical samples, and 13 DEmRNAs were found to be signi cant for the overall survival analysis (p<0.05). Three of the four hub genes (CEP55, CHEK1, CDC25A, KIF23) obtained from the PPI network were included in 13 DEmRNAs of signi cance for the overall survival analysis.
Three genes (CEP55, CHEK1, KIF23) were most signi cantly related to the survival and prognosis of patients with LUAC(p<0.01), and the overall survival rate of lung adenocarcinoma decreased with the high expression gene compared with the low expression gene (Fig. 7).

Discussion
CircRNAs are highly stable and can be found in exosomes, cell-free saliva, and plasma [17]. So they may become potential biomarkers or therapeutic targets for cancer, including LUCA [18][19][20]. However, the exact role of circRNAs in LUAC remains largely unknown. To identify whether circRNA plays a role as ceRNA of the LUAC, we performed microarray data analysis to examine DEGs in LUAC samples and normal samples using a robust rank aggregation method. We constructed a circRNA-miRNA-mRNA regulatory network and PPI network based on DEmRNAs. We also constructed a circRNA-miRNA-hub gene subnetwork based on regulatory modules identi ed in the circRNA-miRNA-mRNA network. The hub genes obtained from the PPI network were further analyzed for survival.
Lung adenocarcinoma is prone to metastasis in the early stage, while brain metastasis is a common complication of lung adenocarcinoma, and its incidence increases with the prolongation of survival [21].The most common mutant gene types of lung adenocarcinoma are EGFR, KRAS, etc [22,23], which have corresponding targeted drug therapy, but are prone to drug resistance to targeted drugs. However, a large number of studies have shown that the dysregulation of circRNAs plays a crucial role in the development of LUAC; therefore, further understanding of the biological mechanism of abnormally expressed circRNAs is essential for the discovery of new, promising therapeutic targets for LUAC [24]. YU et al. [25] collected 36 paired LUAC and healthy tissues and concluded that circRNA CMRA was signi cantly downregulated in cancer tissues, and circRNA cMras overexpression inhibited LUAC growth and metastasis, and its expression was negatively correlated with tumor stage. Liu et al. [26] found that in vitro experiments suggested that hsa_circ_0005962 promoted LUAC cell proliferation.
In current study, six circRNAs (hsa_circ_0003528, hsa_circ_0004315, hsa_circ_0005699, hsa_circ_0002588, hsa_circ_0005777, hsa_circ_0027033) were involved in the ceRNA network. One of these, only hsa_circ_0003528 were studied by Huang Z et al. [27] This study was based on changes in circRNA in plasma samples from 3 patients with active tuberculosis and 3 healthy controls. Then, 61 DEcircRNAs were recorded. And the veri cation test showed that TB patients have 6 types of circRNA signi cantly increased in plasma levels, one of which is hsa_circ_0003528. No relevant studies have reported involvement of hsa_circ_0004315, hsa_circ_0005699, hsa_circ_0002588, hsa_circ_0005777, hsa_circ_0027033 in cancer.
The present study that a total of 33 DEmRNAs and 19 DEmiRNAs were identi ed and involved in the ceRNA network, some of which have been found as a biomarker for diagnosis and prognosis. To identi ed the key circRNAs participating in the regulatory network that was established by a PPI network, four genes were identi ed, including CEP55, CHEK1, CDC25A, KIF23. Previous work has identi ed these four genes, but there is only one gene linked to miR-195-5p which is CHEK1in LUAC. Zuo et al [28] found that Upregulation of miR-195 in LUAC cells was found to decrease CHEK1 and increase Bax expression.In addition, the overexpression of miR-195 enhanced the sensitivity of Lac cells to cisplatin, promoted the apoptosis of Lac cells, and inhibited cell proliferation. Fu F et al [29] showed that CEP55 was related to brain metastasis after LUAC operation. Takamochi et al [30] found that LUAC with EGFR mutations were biologically inert. In EGFR mutant LUAC, the cell cycle-related CDC25A gene is speci cally downregulated. Iltzsche et al. [31] found that RNA interference-mediated depletion of KIF23 inhibited lung tumor formation in vivo and induced cell apoptosis in lung cancer cell lines. The results indicated that inhibition of KIF23 may be a strategy for the treatment of lung cancer. However, there no report that the linkages between miR-195-5p and CEP55 and CDC25A and KIF23 in LUAC. Herein, we identi ed four circRNA-miRNA-hub gene axes, indicating competitive regulatory relationships of one circRNAs with the four genes in LUAC, and the overall survival rates of the three hub genes (CEP55, CHEK1, KIF23) were statistically signi cant. However, given that these results are based solely on bioinformatics models, further in-depth studies are critical to verifying the possible role of these four axes in LUAC.

Conclusions
We screened differentially expressed circRNAs, miRNAs, and mRNAs from publicly available microarray data to construct a circRNA-related ceRNA network. High expression of three hub (CEP55, CHEK1, KIF23) genes is associated with the survival rate of lung adenocarcinoma. Therefore the circRNA-miRNA-hub genes (hsa_circ0004315/hsa-mir-      The ceRNA network of circRNA-miRNA-mRNA in LUAC. Diamonds indicate circRNA, square indicate miRNA and round indicate mRNA. The nodes highlighted in red and blue represent up-regulation and down-regulation. GO and KEGG functional enrichment analyses of the differentially expressed mRNA in the ceRNA network. a The green represents biological processes; blue represents molecular fuction; purple represents molecular fuction. b The enrichment analysis of the KEGG pathway. A P<0.01 was considered to indicate a signi cant difference Figure 6 Identi cation of hub genes from the PPI network with STRING software and R software. a PPI network of 33 genes. b PPI network of four hub genes which are expressed in red Overall survival of the three hub genes, high expression in red and low expression in blue (p<0.05). a CEP55, b CHEK1, c KIF