Bioinformatics Analysis Predicts hsa_circ_0026337/miR-197-3p as a Potential Oncogenic ceRNA Network for Non-small Cell Lung Cancers

Circular RNAs (circRNAs) play an essential role in tumor development, but their role in non-small cell lung cancer (NSCLC) is unclear. Thus, the present study explored the possible molecular mechanism of circRNAs in NSCLC. Three circRNA microarray datasets were downloaded from the Gene Expression Omnibus (GEO) database. Differentially expressed circRNAs (DECs) were identified by comparing NSCLC tissue with adjacent healthy tissue. The online cancer-specific circRNA database (CSCD) was used to analyze the function of the DECs. Protein-protein interaction (PPI) network, Kyoto Encyclopedia of Genes and Genomes (KEGG), Gene Ontology (GO), Cytoscape, and UALCAN were used to predict the critical nodes and perform patient survival analysis, respectively. The interaction between the DECs, the predicted miRNAs, and hub genes was also determined. Finally, the circRNA-miRNA-mRNA network was established.


Introduction
Despite significant advances in targeted therapy and immunotherapy, lung cancer remains the malignant tumor with the highest morbidity and mortality worldwide [1][2][3]. Non-small cell lung cancer (NSCLC) accounts for about 85% of all lung cancers and includes the main pathological types, lung adenocarcinoma and lung squamous cell carcinoma [4]. Therefore, the identification of biomarkers for early diagnosis, prognosis, and monitoring of the therapeutic response is an urgent requirement [5][6][7]. Non-coding RNAs, including long non-coding RNAs (lncRNAs), microRNAs (miRNAs), and circular RNAs (circRNAs), play an essential role in regulating tumorigenesis [8][9][10][11]. circRNAs are produced by alternative splicing and are abundant in the cytoplasm of eukaryotic cells, whereas only a small amount of intron-derived circRNAs is present in the nucleus; they exhibit tissue, temporal, and disease specificity [12][13]. In addition, circRNA molecules contain miRNA response elements (MREs). By binding miRNAs and acting as miRNA sponges, they function as competitive endogenous RNAs (ceRNAs) that relieve the inhibitory effect of miRNAs on target genes and thereby upregulate the expression of those genes [14][15].
Accumulated data indicate that circular ANRIL (cANRIL) is an antisense transcript of INK4/ARF (CDKN2a/b), and the expression of cANRIL may be closely associated with the transcription of INK4/ARF and the risk of cardiovascular sclerosis. Some studies demonstrated that hsa_circ_002059 is downregulated in gastric cancer, making it a potential biomarker for the diagnosis of gastric cancer [16]. Shenglin et al. sequenced circRNAs in the exosomes of liver cancer cells and found that circRNAs are enriched in exosomes and differ significantly from those of normal cells. The degree of tumor circRNA enrichment in serum was found to be related to tumor size [17]. In recent studies, many circHIPK3-related cancers have been identified, including nasopharyngeal carcinoma, gallbladder cancer, lung cancer, and chronic myeloid leukemia, making circHIPK3 a potential biomarker [18]. Li et al. reviewed the literature on circRNAs and NSCLC from PubMed and focused on the roles and mechanisms of circRNAs in regulating the cell cycle and epithelial-mesenchymal transition [19]. Cai et al. detected that hsa_circ_0001947 and hsa_circ_0072305 are abnormally expressed in patients with NSCLC, and bioinformatic analysis identified that the networks hsa_circ_0001947/hsa-miR-637/RRM2 and hsa_circ_0072305/hsa-miR-127-5p/DTL may be related to the occurrence and development of NSCLC [20]. These findings indicate that circRNAs are closely related to the occurrence and development of diseases and may be potential targets for future diagnosis and treatment. However, most studies on circRNAs and NSCLC describe only a few genes, and the mechanism of systematic molecular regulation is yet to be elucidated [21][22].
The present study aimed to detect the differentially expressed circRNAs (DECs) in NSCLC using microarray datasets from the Gene Expression Omnibus (GEO) database and to predict the function of these circRNAs through their associated miRNAs and mRNAs. Then, the selected target genes were used to establish the protein-protein interaction (PPI) network. In addition, survival analysis was carried out to identify the genes with critical roles in the occurrence and development of NSCLC and to lay a foundation for the discovery of potential molecular markers of the disease.

Source of circRNA array data
Three NSCLC circRNA expression array datasets (GSE158695, GSE112214, and GSE101684) were selected by searching the GEO database (https://www.ncbi.nlm.nih.gov/geo/). The first and second datasets (GSE158695 and GSE112214) included data on three NSCLC and three normal tissue samples. The last circRNA dataset (GSE101684) included four NSCLC samples and four samples of tumor-adjacent tissue.

Identification of DECs
R software (version 4.0.3) and the limma package were used for differential analysis of the three datasets, with the criterion for differential expression being an adjusted p-value <0.05 and |logFC|≥1. Four potential DECs were obtained by taking the intersection of the three datasets using the online platform VENNY 2.1 (https:// venny/index.html).
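
The selection rule above (adjusted p-value <0.05, |logFC|≥1, then the intersection across datasets) can be sketched in a few lines. This is only an illustration in Python, not the R/limma workflow used in the study. The `ProbeResult` type, the function names, and every value in `toy_datasets` are hypothetical and invented for the example.

```python
from typing import Dict, List, NamedTuple, Set


class ProbeResult(NamedTuple):
    """One row of a per-dataset differential-expression table (hypothetical layout)."""
    circ_id: str
    log_fc: float  # log2 fold change, tumor vs. normal
    adj_p: float   # multiple-testing adjusted p-value


def select_decs(rows: List[ProbeResult],
                adj_p_cutoff: float = 0.05,
                log_fc_cutoff: float = 1.0) -> Set[str]:
    """Keep circRNAs with adjusted p < cutoff and |logFC| >= cutoff."""
    return {r.circ_id for r in rows
            if r.adj_p < adj_p_cutoff and abs(r.log_fc) >= log_fc_cutoff}


def shared_decs(datasets: Dict[str, List[ProbeResult]]) -> Set[str]:
    """Intersect the per-dataset DEC sets, as a Venn diagram tool would."""
    per_dataset = [select_decs(rows) for rows in datasets.values()]
    return set.intersection(*per_dataset) if per_dataset else set()


# Invented toy values, NOT taken from GSE158695, GSE112214, or GSE101684.
toy_datasets = {
    "dataset_A": [ProbeResult("circ_1", -2.1, 0.001),
                  ProbeResult("circ_2", 0.4, 0.200),
                  ProbeResult("circ_3", 1.0, 0.040)],
    "dataset_B": [ProbeResult("circ_1", -1.5, 0.010),
                  ProbeResult("circ_2", 1.8, 0.030),
                  ProbeResult("circ_3", -1.2, 0.049)],
    "dataset_C": [ProbeResult("circ_1", -1.0, 0.049),
                  ProbeResult("circ_3", 1.3, 0.060)],
}

common = shared_decs(toy_datasets)
```

Note that this sketch intersects identifiers only. It does not check that the direction of change agrees across datasets, which a fuller analysis would likely want to verify.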

ceRNAs-based functional analysis of DECs
The miRNAs carrying MREs corresponding to the four DECs were retrieved to identify the miRNAs that may be sponged by the DECs. Predicted circRNA-miRNA interactions were also obtained from the online ENCORI database (http://starbase.sysu.edu.cn), and the miRNAs predicted by both databases (CSCD and ENCORI) were taken as those most likely to interact directly with the circRNAs. To further analyze the biological functions, miRNA target genes were predicted using three databases (miRDB, http://mirdb.org/; miRTarBase, https://bio.tools/mirtarbase; TargetScan, http://www.targetscan.org/vert_72/), and the mRNAs predicted by the three databases were selected for GO and KEGG enrichment analysis. These analyses were conducted in R using the BiocManager and clusterProfiler packages. P-values <0.05 were considered statistically significant [23].
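
As a rough illustration of the two steps described above, the sketch below first takes the consensus of several target-prediction lists and then computes an over-representation p-value with the hypergeometric test, the kind of test commonly used for GO/KEGG enrichment. It is written in Python rather than R and is not the clusterProfiler implementation. Reading "predicted by the three databases" as a strict intersection is an assumption, and the function names are hypothetical.

```python
from math import comb
from typing import Iterable, Set


def consensus_targets(*predictions: Iterable[str]) -> Set[str]:
    """Genes reported by every prediction list (assumed reading: strict intersection)."""
    sets = [set(p) for p in predictions]
    return set.intersection(*sets) if sets else set()


def enrichment_p_value(background_size: int, term_size: int,
                       list_size: int, overlap: int) -> float:
    """Upper-tail hypergeometric probability P(X >= overlap).

    background_size: number of genes in the background universe
    term_size:       background genes annotated to the GO/KEGG term
    list_size:       number of genes in the query list
    overlap:         query genes annotated to the term
    """
    upper = min(term_size, list_size)
    total = comb(background_size, list_size)
    tail = sum(comb(term_size, i) * comb(background_size - term_size, list_size - i)
               for i in range(overlap, upper + 1))
    return tail / total
```

For example, `enrichment_p_value(10, 5, 5, 5)` gives the probability of drawing all five annotated genes in a five-gene list from a ten-gene background. In practice the resulting p-values would also be corrected for multiple testing across terms.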

Construction of protein-protein interaction (PPI) networks and validation of the hub genes
The STRING database was applied to construct the PPI network. Cytoscape was used to identify the top 50 candidates, which were used to create the PPI network. Moreover, the top ten genes were selected as hub genes. UALCAN (http://ualcan.path.uab.edu/index.html) was employed to determine the expression of the predicted MRE-carrying miRNAs in NSCLC tissue and to assess the effect of these miRNAs, which are associated with the ten hub genes, on patient prognosis and survival. Thus, the circRNA-miRNA-mRNA subnetwork of interest was established. P<0.05 indicated statistical significance.
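
The text does not state which ranking method was used in Cytoscape to pick the hub genes. One common choice is node degree (the number of interaction partners), and the minimal Python sketch below assumes that criterion purely for illustration. The function name `top_hubs` and the gene labels in the usage example are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def top_hubs(edges: Iterable[Tuple[str, str]], n: int = 10) -> List[str]:
    """Rank nodes of an undirected PPI network by degree and return the top n.

    Duplicate edges and self-loops are ignored; ties are broken alphabetically
    so that the result is deterministic.
    """
    degree: Counter = Counter()
    seen = set()
    for a, b in edges:
        key = frozenset((a, b))
        if a == b or key in seen:
            continue
        seen.add(key)
        degree[a] += 1
        degree[b] += 1
    ranked = sorted(degree, key=lambda gene: (-degree[gene], gene))
    return ranked[:n]


# Placeholder edges for illustration only, not STRING output.
example_edges = [("G1", "G2"), ("G1", "G3"), ("G1", "G4"),
                 ("G2", "G3"), ("G3", "G2")]
example_hubs = top_hubs(example_edges, n=2)
```

Other centrality measures (for example betweenness or maximal clique centrality) can give a different top ten, so the ranking criterion is worth reporting alongside the hub list.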

Four circRNAs differentially expressed in NSCLC
Three circRNA datasets (GSE158695, GSE112214, and GSE101684) were selected from the GEO microarray database, and four differentially expressed circRNAs, all downregulated, were identified: hsa_circ_0049271, hsa_circ_0026337, hsa_circ_0043256, and hsa_circ_0008234 (Fig. 1). Functional analysis of the four circRNAs was performed using the CSCD database (http://gb.whu.edu.cn/CSCD/). The information and structure of these four circRNAs from the CSCD database are shown in Fig. 2 and Table 1. hsa_circ_0049271 was predicted to bind specifically to 62 miRNAs and may serve as a miRNA sponge that regulates their expression (Table S1). Similarly, hsa_circ_0026337 might bind to 47 miRNAs (Table S2), hsa_circ_0043256 to 47 miRNAs (Table S3), and hsa_circ_0008234 to 43 miRNAs (Table S4).

Identification of miRNAs sponged by DECs that influence patient survival
Expression data from the TCGA database were used to perform survival analysis using UALCAN (http://ualcan.path.uab.edu/index.html). The expression of miR-197 was significantly increased in lung adenocarcinoma and lung squamous cell carcinoma patients compared to normal samples (P1=3.56e-2, P2=1.08e-12). High expression was also significantly associated with shortened survival time in lung squamous cell carcinoma patients (P=0.037), but no statistical difference was found in lung adenocarcinoma (P=0.5) (Fig. 7A-D). Therefore, it is speculated that hsa_circ_0026337/hsa-miR-197-3p is involved in the occurrence and development of NSCLC and may exert a biological function by upregulating or inhibiting the expression of genes such as MAPK8, IGF1R, GRB2, and ITCH.

Discussion
Protein-coding RNAs in eukaryotes are abnormally spliced to form circRNAs, which are not easily degraded by enzymes and accumulate in the cytoplasm [24]. Advanced molecular biology techniques have revealed the structure and function of several circRNAs. The data are recorded in public databases and can be used for differential expression analysis, MRE identification, RNA-binding protein (RBP) analysis, and other functional analyses of circRNAs. In this study, we retrieved NSCLC data from the GEO database and selected three NSCLC circRNA expression datasets. The differential analysis showed that only four circRNAs differed significantly. The endogenous competition mechanism enables circRNAs to influence the function of miRNAs by sponging them. Therefore, we explored the potential ceRNA networks of the four circRNAs. The CSCD and ENCORI databases were used to predict the miRNAs that these circRNAs may bind. The results showed that only miR-197-3p, miR-3605-5p, miR-433-3p, and miR-653-3p might bind to hsa_circ_0026337, and three miRNAs (miR-1252-5p, miR-494-3p, and miR-558) might bind specifically to hsa_circ_0043256. The remaining seven miRNAs were predicted to combine with 100 mRNAs, and the ten hub genes in the PPI network were as follows: PTEN, MAPK8, MDM2, CDKN1A, IGF1R, RB1, GRB2, ATF3, ITCH, and SGK1. Based on the above analyses, an interaction network of two circRNAs, four miRNAs, and ten mRNAs was established.
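
The layered structure of such a ceRNA network can be made concrete with a small Python sketch that chains circRNA-miRNA and miRNA-mRNA pairs into circRNA/miRNA/mRNA paths. The circRNA-miRNA pairs below are those named in the paragraph above. The miRNA-mRNA pairs are limited to the four miR-197-3p targets named elsewhere in this article, because the full miRNA-to-hub-gene mapping is not listed in the text. The variable and function names are hypothetical.

```python
from typing import Dict, List, Tuple

# circRNA -> predicted sponged miRNAs, as listed in the Discussion.
CIRC_TO_MIRNA: Dict[str, List[str]] = {
    "hsa_circ_0026337": ["miR-197-3p", "miR-3605-5p", "miR-433-3p", "miR-653-3p"],
    "hsa_circ_0043256": ["miR-1252-5p", "miR-494-3p", "miR-558"],
}

# miRNA -> target hub genes. Only the miR-197-3p targets named in the text
# are included; other miRNA-mRNA pairs are deliberately left out.
MIRNA_TO_MRNA: Dict[str, List[str]] = {
    "miR-197-3p": ["MAPK8", "IGF1R", "GRB2", "ITCH"],
}


def cerna_paths(circ_to_mirna: Dict[str, List[str]],
                mirna_to_mrna: Dict[str, List[str]]) -> List[Tuple[str, str, str]]:
    """Enumerate every circRNA -> miRNA -> mRNA chain supported by both layers."""
    paths = []
    for circ, mirnas in circ_to_mirna.items():
        for mirna in mirnas:
            for mrna in mirna_to_mrna.get(mirna, []):
                paths.append((circ, mirna, mrna))
    return paths


paths = cerna_paths(CIRC_TO_MIRNA, MIRNA_TO_MRNA)
```

With only these inputs, every enumerated path runs through hsa_circ_0026337 and miR-197-3p, which mirrors the subnetwork highlighted in this study.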
GO and KEGG enrichment analyses found that these 100 mRNAs are involved in various cancer-related biological functions, including "histone deacetylase binding." Histone deacetylases (HDACs) regulate histone acetylation and are vital for epigenetic events, including the removal of acetyl groups from histone lysine residues, the formation of heterochromatin, and transcriptional gene silencing. In addition, this group of proteins plays a role in regulating gene expression, cell proliferation, cell migration, angiogenesis, and cell death; several studies showed that the abnormal expression of HDACs in different types of tumors is associated with the occurrence and progression of cancer [25][26][27]. SMAD binding was another enriched function; phosphorylation of SMAD3 is essential for TGF-β-induced epithelial-mesenchymal transition in NSCLC cells. Activation of TGF-β/SMAD signaling is accompanied by the formation of the SMAD complex, which inhibits E-cadherin and activates the transcription of Snail, Slug, and Twist, thereby enhancing the metastatic and invasive ability of the cells [28][29][30]. A further enriched function was DNA-binding transcription repressor activity specific to RNA polymerase II (RNAPII). RNAPII transcribes protein-coding genes through a transcription cycle that includes pausing, elongation, and termination stages. Each stage is associated with different transcriptional mechanisms and regulatory factors, with changes in the composition and activity of the transcription machinery. Previous studies have shown that RNAPII interacts with cyclin-dependent kinases (CDKs) and regulates the cell cycle [31]. The enrichment of cell cycle regulation, PI3K, and SMAD indicates that these mRNAs are closely related to the proliferation and inflammatory response of NSCLC. Moreover, hsa_circ_0043256 is differentially expressed in the serum of NSCLC patients and has been used as a diagnostic indicator.
According to the current analysis and a few previous studies [32], the predicted circRNAs and hsa_circ_0043256 induce NSCLC cell apoptosis, and their expression level could be used for patient diagnosis and prognosis prediction.
For example, miR-494-3p in endometrial cancer cells inhibits PTEN expression at the translational level, activates the downstream phosphoinositide 3-kinase/protein kinase B (PI3K/AKT) signaling pathway, and enhances tumor cell proliferation, migration, and invasion. Conversely, the restoration of PTEN protein levels or inhibition of the PI3K/AKT pathway eliminates the miR-494-3p-mediated pro-tumor effect [33]. The influence of miR-494-3p on PTEN can thus promote the proliferation and metastasis of tumor cells. Moreover, miR-494-3p has been shown to promote the proliferation and metastasis of liver cancer cells [34], glioma cells [35], and lung adenocarcinoma cells [36]. miR-433 also acts as a tumor suppressor in various tumors. It targets SMAD2 to inhibit the development of NSCLC [37], activates the MAPK signaling pathway, and thus inhibits the proliferation of breast cancer cells [38][39]. Finally, we searched the relative expression levels and survival analysis data of five miRNAs in the TCGA database and found that miR-197 was significantly increased in lung squamous cell carcinoma tissues, and the survival time of patients with high expression of this miRNA was shortened considerably (Fig. 7). Presently, only a few studies have confirmed that miR-197-3p is closely related to the occurrence and development of tumors and to chemotherapeutic drug resistance [40][41].
Tian et al. [42] demonstrated that miR-197-3p was overexpressed in NSCLC cell lines, and the proliferation ability and resistance to chemotherapy drugs of the cells were enhanced markedly. However, the role and mechanism of miR-197-3p in NSCLC are still unclear. The current analysis suggests that the low expression of hsa_circ_0026337 in NSCLC increases the expression of hsa-miR-197-3p, which might further regulate the expression of proteins (MAPK8, IGF1R, GRB2, and ITCH) and promote tumor growth. However, this finding is based on bioinformatics analysis and predicted cellular functions. Hence, further exploration and direct mechanistic experiments are required to confirm the role of these circRNAs and their networks.

Conclusion
Four circRNAs, hsa_circ_0049271, hsa_circ_0026337, hsa_circ_0043256, and hsa_circ_0008234, showed decreased expression in NSCLC tissues. MRE, ceRNA network, and mRNA enrichment analyses suggested that these circRNAs act as tumor suppressors via regulation of the cell cycle, inflammatory response, and cell proliferation ability. In addition, TCGA survival analysis indicated that patients with high expression of hsa-miR-197 in NSCLC had a short survival time. Therefore, we speculated that low expression of hsa_circ_0026337 increases the expression of hsa-miR-197-3p, thereby inhibiting the translation of proteins, such as MAPK8, IGF1R, GRB2, and ITCH, which are involved in the occurrence and development of NSCLC.

Declarations

Acknowledgements
We appreciate the support of our researchers.

Availability of data and materials
Source data of this study were derived from public repositories, as indicated in the "Materials and Methods" section of the manuscript. All data that support the findings of this study are available from the corresponding author upon reasonable request.

Ethics approval and consent to participate
Not applicable.

Consent for publication
Not applicable.

Authors' contributions
QZ, LK and XL conceived and designed the study. QZ and LK performed the experiments. QZ and LK wrote the manuscript. ZL, SW, and XF reviewed and edited the manuscript. All authors read and approved the manuscript.

Funding
There is no funding.

The potential specific binding miRNAs of DECs were predicted using the online ENCORI and CSCD databases. The miRNAs in red indicate identical results in both databases.
The circRNA-miRNA-hub gene network for the two circRNAs, five miRNAs, and ten mRNAs. Red indicates downregulation.