Identification of cancer-promoting circRNAs and potential contributions of these circRNAs to the pathogenesis of hepatocellular carcinoma (HCC)

Purpose There is growing awareness of critical roles of circular RNA (circRNA) in tumor growth and development. This study aimed to discover new cancer-promoting circRNAs and to study their potential roles in the pathogenesis of hepatocellular carcinoma (HCC). Methods The expression profiles and clinical data of HCC-related circRNA, miRNA, and mRNA were obtained from the Gene Expression Omnibus (GEO) and The Cancer Genome Atlas (TCGA) databases. A model of the circRNA-miRNA-mRNA network was established based on analysis and interaction prediction. The main biological functions of the targeted mRNA were predicted by Gene Ontology (GO) analysis and Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis, and the protein-protein interaction (PPI) network was constructed to reveal important hub genes. The online tool OncomiR and GEPIA were used to analyze the correlation between miRNA and mRNA and clinicopathological features, and the Kaplan-Meier curve was constructed to reveal the relationship between mRNA and patient prognosis. Results Differential analysis indicated that twenty highly expressed circRNA in GEO microarray datasets GSE78520, GSE94508, and GSE97332. Ten miRNAs were targeted by 20 circRNAs. These miRNAs are expressed at lower levels in the TCGA liver hepatocellular carcinoma (LIHC) expression profile, and most of the functions of these genes are closely related to pathological staging. These 10 miRNAs predicted a total of 7310 downstream mRNAs, of which 169 mRNAs were more abundant than normal tissues in the TCGA LIHC expression profile. GO analysis, KEGG analysis, and protein-protein interaction network analysis showed that E2F1, H2AFX, TOP2A, and RAD51 are central genes of the competitive endogenous RNA (ceRNA) network, and are mainly involved in biological mechanisms such as cell cycle, cancer-related pathways, and blood vessel morphogenesis. In addition, E2F1, H2AFX, TOP2A, and RAD51 are also closely related to the pathological stage and survival of patients. Conclusions analysis construction of RNA network Genes involved in this network may provide potential therapeutic targets for the diagnosis and treatment of HCC. Data for circRNA expression profiles of HCC and human-matched non-tumor tissues are not currently available in the TCGA database, we selected the gene chip data of relevant circRNA expression in the GEO database (dataset screening HCC and (micrornas miRNA) and Homo sapiens). TCGA was the source of the expression profiles of miRNAs and mRNAs differentiation, blood vessel morphogenesis, G0 and Early G1, urogenital system development, connective tissue development, integration of energy metabolism, cell morphogenesis involved in differentiation, learning or memory, negative regulation of DNA binding, response to nutrients, positive regulation of cell development, resolution of D-loop structures through Synthesis-Dependent Strand Annealing (SDSA), brain development, and carbohydrate-derivative biosynthetic process. KEGG analysis showed two enriched terms: Small cell lung cancer and GABAergic synapse (Figure 5).


Introduction
3 Circular RNAs (circRNAs) are a class of non-coding RNAs with intact closed-loop structures [1]. In 1986, Sanger et al. first proposed circRNA, and with improvements in detection techniques, more and more circRNAs have been discovered in recent years [2]. Current research indicates that circRNA is formed by reverse shear processing of linear RNA precursors, which can be composed of an exon or an exon plus an intron [3]. Lacking a 5' end cap structure and a 3'-end poly(A) tail, circRNA can be stably present in living organisms and can be a key biomarker for cancer [4].
Several functions of circRNA have been described, including forming a complex with RNA polymerase II to promote gene expression; binding to an RNA binding protein (RBP) to form an RNA-protein complex (RPC) to inhibit the function of a specific RBP, thereby inhibiting transcription of the parental gene; and acting as a competitive endogenous RNA (ceRNA) to sponge miRNA through miRNA response elements (MREs), thereby regulating the abundance of target genes [5]. In hepatocellular carcinoma, the circular RNA ciRS-7 can promote the expression of CCNE1, PIK3CD, and EGFR and other proteins by adsorbing miR-7, and the circular RNA circMTO1 can competitively bind and promote the expression of cyclin-dependent kinase inhibitor 1A (p21) [6,7]. However, there has been no comprehensive analysis of circRNAs and functions for hepatocellular carcinoma (HCC).
In this study, we downloaded the HCC-related circRNA microarray dataset from the Gene Expression Omnibus (GEO) database and performed bioinformatics analysis to identify up-regulated circRNA, members of the ceRNA network, and potential functions. Finally, we constructed a circRNA-miRNA-mRNA network including six circular RNAs, four miRNAs, and four mRNAs.

Collection and selection of relevant data in the CEO and TCGA database
The gene chip and data used in this study were obtained from the GEO database (http://www.ncbi.nlm.nih.gov/gds/) and The Cancer Genome Atlas (TCGA) database (https://portal.gdc.cancer.gov/). Data for circRNA expression profiles of HCC and human-matched nontumor tissues are not currently available in the TCGA database, we selected the gene chip data of relevant circRNA expression in the GEO database (dataset screening keywords: HCC and (micrornas OR miRNA) and Homo sapiens). TCGA was the source of the expression profiles of miRNAs and mRNAs 4 in HCC (TCGA LIHC) and the clinical and pathological information of relevant patients. Survival analysis was performed on 364 patients with complete survival records, and correlation analysis was performed on 342 patients with complete clinicopathological features.

Differential gene analysis in HCC
First, differential circRNA identification was performed on each set of GEO gene chip data using the R software package (http://www.proproject.org), with log 2 values > 1 and P < 0.05. Venn diagram construction was then performed using the online tool Bioinformatics & Evolutionary Genomics (http://bioinformatics.psb.ugent.be/webtools/Venn/) to screen for up-regulated circular RNA in hepatocellular carcinoma. Next, TCGA expression profile data was pre-processed. First, the transcriptome expression data were converted from FPKM to TPM, and then the miRNA and mRNA data removed which with expression values equal to 0 in more than 25% of patients were removed.
Finally, miRNAs with log2< -1 and P< 0.05 and mRNAs with log2 > 1 and P< 0.05 were also screened using R software. Comparing to adjacent tissues, the differentially up-regulated circular RNA in tumor tissue was designated DUcircRNA, the differentially down-regulated miRNA in tumor tissue was designated DDmiRNA, and the differentially up-regulated mRNA was in tumor tissue designated DUmRNA.

MRE prediction and construction of DUcircRNA-DDmiRNA-DUmRNA network
The circular RNA Interactome (CircInteractome) (https://circinteractome.nia.nih.gov/) network tool was used to predict DUcircRNA that targeted miRNAs and their binding sites [8]. The predicted miRNA and DDmiRNAs were overlapped to obtain the final DDmiRNAs and then displayed as a heat map. The  [9]. The following criteria were used for targeting mRNA selection: 1) mRNA were predicted by at least 10 different algorithms; 2) mRNA with miRDIP prediction score equal to 0.5 or higher; 3) before calculation, confidence class in miRDIP was set to "Very High". Next, the set of screened mRNAs were overlapped with the set of DUmRNA to obtain the final DUmRNA dataset for further analysis. Finally, Cytoscape (version 3.6.1) (http://cytoscape.org/) was used to map the DUcircRNA-DDmiRNA-DUmRNA network [10].

Functional enrichment analysis of mRNA
We used Metascape (http://metascape.org/gp/index.html#/main/step1) to perform gene ontology (GO) and Kyoto Gene and Genomic Encyclopedia (KEGG) analysis of the final DDmRNA and screened for annotations with P < 0.05 [11].

Construction of a PPI network and identification of hub genes
STRING (http://string-db.org/) is a database of known and predicted protein-protein interactions [12].
The interactions include direct (physical) and indirect (functional) associations, and are based on computational prediction, interactions in other organisms, and interactions included in other (primary) databases. This analysis was used to construct a PPI network for final DUmRNA with an interaction score greater than 0.7 as the cutoff value.

Statistical analysis of the correlation between miRNA and clinical features
OncomiR (http://www.oncomir.org/oncomir/search_miR_clinical.html) is an online resource to explore miRNA dysregulation in cancer [13]. This database can be queried for miRNA association with: tumor formation, tumor stage and grade, cancer patient survival, and gene targets. Analysis of variance (ANOVA) compares miRNA expression data between the different cohorts within each clinical parameter. Multivariate Cox analysis is used to evaluate the combined effect of the miRNA data, clinical parameters, and patient survival. In the above analysis, a value of P < 0.05 was considered statistically significant.

Survival analysis of hub genes and its relevance to stage
GEPIA (http://gepia.cancer-pku.cn/) is an online analysis software based on RNA sequencing data for 9736 tumors and 8587 normal samples in the TCGA and GTex databases [14]. First, statistical analysis of the expression of hub genes in HCC and normal tissues was performed using GEPIA and the Student two-tailed t-test. Subsequently, using TCGA LIHC patient clinical information and the GEPIA 6 online tool, the χ2 test was used to analyze the association of hub genes and tumor stage. Finally, the survival and prognostic significance of hub genes were analyzed using the survival analysis tool and the Kaplan-Meier method. In the above analysis, a value of P < 0.05 was considered statistically significant.

Discussion 9
The ceRNA mechanism is one of the important functions of non-coding RNA, by which mRNA translation can be regulated by competitively shared miRNAs, especially by circRNA [15,16]. Circular RNA cSMARCA5 can promote DHX9 and TIMP3 expression by targeting miR-17-3p [17]. CircHIPK3 sponges miR-7 to increase the expression levels of FAK, IGF1R, EGFR, and YY1 and promote colorectal cancer growth and metastasis [18].
In this study, because circular RNA data was not included in TCGA, we searched the GEO database to find HCC-related circRNA microarrays, GSE7850, GSE94508, and GSE97332. We performed differential analysis of these three microarrays using R software, and then entered the DUcircRNAs data into the online tools Bioinformatics & Evolutionary Genomics to identify 20 circRNAs that were highly expressed compared to adjacent normal tissues. The ceRNA mechanism of circZFR and hsa-circ-0000673 was previously reported. In colorectal cancer, circZFR affects the expression of the FOXO4 protein by targeting miR-532-3p; in lung cancer, circZFR acts as a ceRNA on miR-4302, promoting ZNF121 expression [19,20]. Hsa-circ-0000673 can promote HCC proliferation and metastasis by indirectly regulating the expression of SET by competitive binding to miR-767-3p [21]. So, in this study we mainly explore the ceRNA mechanism of hsa-circRNA-103948, hsa-circRNA-104640, hsa-circRNA- E2F1 is a tumor suppressor in a variety of cancers, binds to gene promoter regions, and is involved in numerous cellular pathways [27]. Its roles include regulating cell cycle, apoptosis, aging, and DNA damage responses [28]. The different biological functions of E2F1 depend on its binding partner [29].
H2AFX is one of the common variants in the H2A family, and is a histone that is closely related to DNA double-strand breaks [31]. When the C-terminus of H2AX is phosphorylated, there are increased numbers of DNA double-strand breaks, leading to gene mutation [32].

11
TOP2A gene is a molecular target of anthracycline antibiotics. TOPA2 affects DNA topological structure, chromosome segregation, and cell cycle progression, processes important for the development and progression of tumors [33]. Previous studies have shown that TOP2A is overexpressed in HCC and is closely related to HBsAg and Ki-67 [34].
The RAD51 protein is essential for homologous recombination repair processes [35]. Genetic variation in RAD51 may lead to the development of cancer, such as liver cancer, ovarian cancer, breast cancer, prostate cancer, and colorectal cancer [36].

Conclusions:
We allowed the construction of a competitive endogenous RNA network associated with cancerpromoting circRNAs. Genes involved in this network may provide potential therapeutic targets for the diagnosis and treatment of HCC.

Competing interests
The authors declare that they have no competing interests

Funding
This study was supported by the Leading Clinical Department Sponsored By '136' Healthcare Improvement Project of Shanxi Province.

Authors' contributions
ZSZ and JFH designed the project. MQY and CRR analyzed and interpreted the relevant TCGA and GEO data. DLZ, XML and JZ perform part of the data processing and were major contributor in writing the manuscript. All authors read and approved the final manuscript. Tables Table 1 Overview of the 20 DUcircRNAs  CircRNA-miRNA-mRNA network for up-regulation of circRNA in HCC. CircRNA-miRNA-mRNA network based on hub genes E2F1, H2AFX, TOP2A, and RAD51