Identification of the Expression Signature and Potential Mechanisms of miR-493-3p in NSCLC using Bioinformatics Strategy: A Comprehensive Study from TCGA and GEO Datasets


Background

Recent evidence highlights that miR-493-3p serves as a crucial regulator of tumorigenesis. Nevertheless, the expression and clinical role of miR-493-3p in non-small cell lung cancer (NSCLC) have rarely been reported. This study therefore aimed to investigate the expression status and potential mechanism of miR-493-3p in NSCLC progression.
Methods

We first examined the expression of miR-493-3p in NSCLC using The Cancer Genome Atlas (TCGA) database and Gene Expression Omnibus (GEO) microarrays. Genes that were both predicted miR-493-3p targets and down-regulated in NSCLC in TCGA were identified as possible miR-493-3p target genes. Gene Ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) analyses were performed, and a protein-protein interaction (PPI) network was constructed, to explore the biological functions and hub genes of the miR-493-3p targets. The expression pattern and prognostic value of key hub genes were examined using the Genotype-Tissue Expression (GTEx) database, The Human Protein Atlas, and the Kaplan-Meier Plotter database.
Results

miR-493-3p was significantly increased in NSCLC tissues and was associated with tumor stage in the TCGA and GEO databases. A total of 46 genes were identified as miR-493-3p targets, and GO and KEGG analyses showed that they are involved in various key pathways. Furthermore, PH domain and leucine-rich repeat protein phosphatase 2 (PHLPP2) was identified as a key miR-493-3p target; it was expressed at low levels in NSCLC, and high PHLPP2 expression predicted better overall survival.
Conclusions

Our study indicates that up-regulated miR-493-3p may target PHLPP2 and predict a worse prognosis for NSCLC patients.

effects model to the pooling process. Sensitivity analyses were performed to assess the potential impact of each microarray on the heterogeneity. Publication bias was then examined using funnel plots, and the presence of asymmetry was assessed with Begg's and Egger's tests.

Bioinformatics prediction of miR-493-3p targets
miRWalk 2.0 (http://zmf.umm.uni-heidelberg.de/apps/zmf/mirwalk2/) is an open online miRNA target gene prediction website that integrates 12 prediction databases. Only genes predicted as miR-493-3p targets by more than six of these databases were retained. Moreover, we extracted the genes down-regulated in NSCLC in the TCGA database through Gene Expression Profiling Interactive Analysis (GEPIA) (http://gepia.cancer-pku.cn/). The genes shared between the miRWalk 2.0 predicted targets and the down-regulated genes in TCGA were identified as promising target genes of miR-493-3p and were used in the subsequent functional analyses.
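
The selection rule described above (support from more than six prediction databases, then intersection with the down-regulated gene list) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `select_targets`, the database labels `db1` to `db12`, and the gene symbols `GENE_A` to `GENE_C` are hypothetical placeholders, not data from this study or output of miRWalk 2.0 or GEPIA.

```python
from collections import Counter


def select_targets(predictions, downregulated, min_databases=7):
    """Keep genes predicted by at least `min_databases` databases
    (i.e. more than six) that are also in the down-regulated set.

    predictions: dict mapping a database label to a set of gene symbols.
    downregulated: iterable of gene symbols down-regulated in tumor tissue.
    """
    # Count in how many databases each gene is predicted as a target.
    votes = Counter(gene for genes in predictions.values() for gene in set(genes))
    supported = {gene for gene, n in votes.items() if n >= min_databases}
    # Intersect with the down-regulated genes and return a stable ordering.
    return sorted(supported & set(downregulated))


# Toy input (hypothetical): 12 databases, three placeholder genes.
toy_predictions = {f"db{i}": set() for i in range(1, 13)}
for i in range(1, 9):
    toy_predictions[f"db{i}"].add("GENE_A")   # predicted by 8 databases
for i in range(1, 4):
    toy_predictions[f"db{i}"].add("GENE_B")   # predicted by 3 databases
for i in range(1, 13):
    toy_predictions[f"db{i}"].add("GENE_C")   # predicted by all 12, not down-regulated
toy_down = {"GENE_A", "GENE_B"}

candidates = select_targets(toy_predictions, toy_down)
print(candidates)
```

In this toy input only `GENE_A` passes both filters: `GENE_B` has too little prediction support, and `GENE_C` is not in the down-regulated set.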

GO and KEGG clustering analysis of miR-493-3p target genes
The Database for Annotation, Visualization and Integrated Discovery (DAVID) (https://david.ncifcrf.gov/home.jsp), an online tool, was used to perform the Gene Ontology (GO) analysis and the Kyoto Encyclopedia of Genes and Genomes (KEGG) analysis of miR-493-3p target genes in NSCLC. GO analysis comprises three parts: biological processes (BPs), cellular components (CCs) and molecular functions (MFs). The enriched pathways were further optimized using an R package.

PPI network construction and hub target genes selection
The Search Tool for the Retrieval of Interacting Genes (STRING) (http://string-db.org/cgi/input.pl) was used to construct a PPI network of miR-493-3p target genes. In addition, the cytoHubba Java plugin in Cytoscape v3.7.1 was employed to filter the most likely hub genes in the PPI network.
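
As a rough illustration of hub selection, the sketch below ranks the nodes of an undirected interaction network by degree (number of distinct partners). The source does not state which cytoHubba scoring method was applied, so degree ranking is an assumption made here for illustration only; the function name `rank_by_degree` and the protein labels `P1` to `P4` are hypothetical.

```python
from collections import defaultdict


def rank_by_degree(edges):
    """Rank nodes of an undirected network by number of distinct neighbors.

    edges: iterable of (node_a, node_b) pairs; duplicate and reversed
    pairs are counted once, and self-interactions are ignored.
    """
    neighbors = defaultdict(set)
    for a, b in edges:
        if a == b:
            continue  # a self-loop does not add a distinct partner
        neighbors[a].add(b)
        neighbors[b].add(a)
    # Highest degree first; ties are broken alphabetically for reproducibility.
    return sorted(
        ((node, len(partners)) for node, partners in neighbors.items()),
        key=lambda item: (-item[1], item[0]),
    )


# Toy edge list (hypothetical), including one reversed duplicate.
toy_edges = [("P1", "P2"), ("P1", "P3"), ("P1", "P4"), ("P2", "P3"), ("P3", "P2")]
ranking = rank_by_degree(toy_edges)
print(ranking)
```

Taking the top entries of such a ranking gives one simple set of hub candidates; other centrality measures may rank the same network differently.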

Validating the expression and prognostic value of hub target genes
GEPIA was used to describe the expression status of the hub target genes of miR-493-3p in NSCLC and PCT, and the results are presented as box plots. Moreover, we investigated the protein expression status and clinical prognostic significance of the hub target genes in NSCLC and PCT through The Human Protein Atlas (https://www.proteinatlas.org/) and the Kaplan-Meier Plotter (https://kmplot.com/analysis/).
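
For readers unfamiliar with the survival curves produced by such tools, the following minimal sketch computes the Kaplan-Meier product-limit estimate from follow-up times and event indicators. It is not the implementation used by the Kaplan-Meier Plotter; the function name `kaplan_meier` and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier product-limit estimate.

    times: follow-up time for each patient.
    events: 1 if the event (death) was observed at that time, 0 if censored.
    Returns a list of (time, survival probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    n_at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Group all patients whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += 1 if data[i][1] else 0
            leaving += 1
            i += 1
        if deaths:
            # Multiply by the conditional probability of surviving past t.
            survival *= 1.0 - deaths / n_at_risk
            curve.append((t, survival))
        n_at_risk -= leaving
    return curve


# Toy follow-up data (hypothetical): five patients, two censored.
toy_curve = kaplan_meier([1, 2, 2, 3, 4], [1, 1, 0, 1, 0])
print(toy_curve)
```

In the toy data the estimate drops at times 1, 2 and 3, while the censored observations only reduce the number of patients at risk.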

Results
Validation of the clinicopathological value of miR-493-3p expression in NSCLC through the TCGA database
In total, 500 LUAD samples, 470 LUSC samples, and 89 PCT samples were collected from the TCGA database. miR-493-3p expression was markedly up-regulated in tumor tissues compared with normal controls in LUAD, in LUSC, and in NSCLC overall (Fig. 2A-C). We also used Kaplan-Meier curves to investigate the influence of miR-493-3p expression on OS. In neither LUAD nor LUSC was there a significant difference in OS between patients with high and low miR-493-3p expression, with all p-values greater than 0.05 (Fig. 2D-F). In addition, we analyzed the correlation between miR-493-3p expression and the clinicopathologic parameters of NSCLC, including age, gender, tumor stage, lymph node status and metastasis. As shown in Tables 1 and 2, female patients (2.2581±1.2951) had a higher expression level of miR-493-3p than male patients (2.0294±0.9327) in LUAD, and LUSC patients in stages I-II (2.5992±1.2113) had a higher expression level of miR-493-3p than those in stages III-IV (2.1763±0.9168). For NSCLC patients overall (LUAD and LUSC combined), patients in stages I-II (2.3717±1.1954) had a significantly higher expression level of miR-493-3p than those in stages III-IV (2.1703±1.0768) (Table 3).

Confirmation of miR-493-3p expression and meta-analysis in the GEO database
After applying the selection criteria, we identified 12 eligible microarrays from the GEO database for further analysis; the essential information and features of the included microarrays are shown in Table 4. The expression of miR-493-3p in NSCLC in each included GEO dataset is illustrated in Fig. 3. NSCLC patients had markedly higher miR-493-3p expression than PCT in GSE27486, GSE15008 and GSE93300 (Fig. 3A-C), but the difference was not significant in the other microarrays (Fig. 3D-L). To further confirm the expression of miR-493-3p across the 12 included GEO datasets, a meta-analysis was performed. The heterogeneity test indicated substantial heterogeneity among the included datasets (p < 0.05, I² = 70.2%). Hence, a random-effects model was applied. As shown in the forest plot (Fig. 4A), the pooled SMD was 0.31 (95% CI: 0.07 to 0.54), indicating that NSCLC tissues had higher miR-493-3p expression than PCT. Sensitivity analysis showed that no single GEO dataset caused a significant deviation from the overall pooled result (Fig. 4B). Regarding publication bias, Begg's test (p=0.150) and Egger's test (p=0.699) gave no evidence of publication bias among the included GEO datasets (Fig. 4C-D).
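
The pooling step can be illustrated with a short sketch of a random-effects meta-analysis. The source does not name the between-study variance estimator, so the DerSimonian-Laird estimator used here is an assumption; the function name `random_effects_pool` and the toy effect sizes are hypothetical and are not the values from the included GEO datasets.

```python
import math
from statistics import NormalDist


def random_effects_pool(effects, variances, level=0.95):
    """Pool per-study effect sizes (e.g. SMDs) with a random-effects model.

    Uses the DerSimonian-Laird estimate of the between-study variance
    (an assumption; the source does not name the estimator).
    """
    k = len(effects)
    w = [1.0 / v for v in variances]                      # inverse-variance weights
    sum_w = sum(w)
    fixed = sum(wi * yi for wi, yi in zip(w, effects)) / sum_w
    q = sum(wi * (yi - fixed) ** 2 for wi, yi in zip(w, effects))  # Cochran's Q
    df = k - 1
    c = sum_w - sum(wi ** 2 for wi in w) / sum_w
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0       # between-study variance
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0  # I^2 in percent
    w_re = [1.0 / (v + tau2) for v in variances]          # random-effects weights
    pooled = sum(wi * yi for wi, yi in zip(w_re, effects)) / sum(w_re)
    se = math.sqrt(1.0 / sum(w_re))
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    return {
        "pooled": pooled,
        "ci": (pooled - z * se, pooled + z * se),
        "Q": q,
        "I2": i2,
        "tau2": tau2,
    }


# Toy input (hypothetical): two studies with equal variance and different effects.
toy_result = random_effects_pool([0.0, 1.0], [0.1, 0.1])
print(toy_result)
```

When the studies disagree more than their sampling variances explain (Q larger than its degrees of freedom), the estimated between-study variance widens the confidence interval relative to a fixed-effect model.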
To assess the prognostic value of PHLPP2, we explored PHLPP2 protein expression in the Human Protein Atlas database and found that PHLPP2 was down-regulated in LUAD and LUSC compared with normal tissues (Fig. 8A).
Moreover, Kaplan-Meier curve analysis was performed to estimate the clinical prognostic significance of PHLPP2; the results indicated that high expression of PHLPP2 predicted significantly longer survival (Fig. 8B).

Discussion
NSCLC, consisting chiefly of LUAD and LUSC, remains the leading cause of cancer mortality worldwide. Until the last decade, the 5-year overall survival rate for patients with metastatic NSCLC was less than 5% [34]. Further understanding of the molecular mechanisms of NSCLC and the discovery of new biomarkers may provide novel therapeutic targets. Although a previous study documented that miR-493-3p is involved in the progression of NSCLC [35], the expression and molecular mechanism of miR-493-3p in NSCLC progression have not yet been clarified.
In this study, we used a comprehensive bioinformatics analysis to investigate the expression, target genes and potential molecular mechanism of miR-493-3p in NSCLC progression.
As one class of noncoding RNAs, microRNAs (miRNAs) are gene regulators that target the 3'-UTR of downstream mRNAs to accelerate their degradation and/or block their translation via seed region matching.
Current studies have reported that miR-493-3p is differentially expressed in multiple cancers and has been defined as a tumor suppressor that inhibits the progression of several types of cancer. For example, miR-493-3p was found to be down-regulated in leukemia cells and could affect leukemogenesis and clonogenic and stemness capacities [36]. Wang et al. revealed that miR-493-3p was down-regulated in laryngeal squamous cell carcinoma (LSCC) and that LINC01605 directly targets miR-493-3p to promote LSCC proliferation [37]. Xu et al. found that miR-493-3p was down-regulated in prostate cancer cells and regulated the expression of YTHDF2, which indirectly regulated N6-methyladenosine modification to inhibit the proliferation and migration of prostate cancer cells [38].
However, the expression pattern and specific molecular mechanisms of miR-493-3p in NSCLC were still unclear.
In the present study, up-regulated expression of miR-493-3p was identified in NSCLC in the TCGA and GEO databases.
This result hints that miR-493-3p may act as a tumor promoter in NSCLC progression. Although the Kaplan-Meier curves showed no significant difference between patients with low and high miR-493-3p expression in the TCGA database, the clinical correlation analysis revealed that female patients had higher miR-493-3p expression than male patients in LUAD, and that miR-493-3p expression was related to tumor stage in LUSC patients. Moreover, the meta-analysis indicated that NSCLC patients had higher miR-493-3p expression than non-tumor patients in the GEO database. Because miR-493-3p functions through its target genes, bioinformatics analyses were performed to discover the potential target genes and specific molecular mechanisms of miR-493-3p in NSCLC. Based on miRWalk 2.0 and GEPIA, an intersection of 33 potential target genes was selected, and a further 13 target genes of miR-493-3p were obtained through literature screening and extraction. In total, 46 candidate target genes were identified.
To confirm the functions of the 46 target genes, GO and KEGG enrichment analyses were performed. In the GO enrichment analysis, the main BP terms were hepatocyte proliferation, epithelial tube branching involved in lung morphogenesis, and regulation of microtubule polymerization or depolymerization. The main CC terms were cell surface; transcription factor complex; and chromosome, centromeric region. The three main MF terms were transcription factor activity and sequence-specific DNA binding; transcription factor activity and RNA polymerase II core promoter proximal region sequence-specific binding; and DNA binding. For the KEGG analysis, we identified 11 significant pathways that might
Based on the PPI network, nine hub genes (MKI67, MAD2L1, FEN1, SKA3, E2F1, SP1, ETS1, PHLPP2, TGFA) were identified. Since miR-493-3p is up-regulated in NSCLC, its target genes are more likely to be expressed at low levels in NSCLC. Using the GEPIA database, we found that ETS1 and PHLPP2 were expressed at lower levels in NSCLC patients than in normal tissues. However, the result for ETS1 contradicted previous studies reporting high ETS1 expression in NSCLC tissues [41][42][43], whereas the low expression of PHLPP2 in NSCLC tissues was consistent with previous studies [44,45]. We also found that PHLPP2 was expressed at lower levels in LUAD and LUSC tissues than in normal lung tissues in the Human Protein Atlas database. Moreover, Kaplan-Meier curve analysis indicated that NSCLC patients with high expression of PHLPP2 had a better OS. Therefore, PHLPP2 may be the most likely target gene of miR-493-3p.
PHLPP2 catalyzes the dephosphorylation of Akt kinase and reduces the activity and expression level of Akt to suppress tumor growth [46]. Previous studies have shown that PHLPP2 expression is ubiquitously lost in multiple cancers and plays a key role in a wide range of biological behaviors, such as cancer cell proliferation, metastasis, autophagy and apoptosis [47][48][49]. In NSCLC, Wang et al. found that PHLPP2 expression was lower in NSCLC tissue samples than in non-tumor lung tissues and was associated with the presence of lymph node metastasis [44]. Mei et al. found that up-regulated miR-141 can directly target and suppress PHLPP2 to promote the proliferation of NSCLC [50]. Therefore, miR-493-3p might target PHLPP2 to play an important role in NSCLC progression.

Conclusion
This study showed that miR-493-3p was highly expressed in NSCLC and might target PHLPP2 to promote NSCLC progression, based on a comprehensive analysis of large online databases. Further research is still needed to verify the functions of miR-493-3p in NSCLC progression, which is a challenging but promising task. These findings suggest that miR-493-3p might serve as a potential biomarker for prognosis prediction in NSCLC and may pave the way for clinical NSCLC treatment and future exploration of molecular mechanisms.

Declarations
Ethics approval and consent to participate
Not applicable. All data in this study are publicly available.

Consent for publication
Not applicable.

Availability of data and materials
All analyzed data are included in this published article.

Competing interests
The authors declare that they have no competing interests.

Funding
This research was supported by grants from The Third Affiliated Hospital of Chongqing Medical University (approval number: KY19030) to Hongbo ZOU. The funding bodies had no involvement in the design of the study, in the collection, analysis, and interpretation of data, or in writing the manuscript.

Authors' contributions
ZH performed database collection and analysis and drafted the initial manuscript. ZHB and LM made substantial contributions to conception and design, acquisition of data, and analysis and interpretation of data. In addition, XQC was involved in drafting the manuscript and revising it critically for important intellectual content. ZLJ acquired the data, performed the analysis and interpreted the data. All authors read and approved the final manuscript.

Overall flow chart. The current study was based on a comprehensive analysis of multiple online databases: validation of miR-493-3p expression in NSCLC using the TCGA and GEO databases, and collection of the possible miR-493-3p target genes and potential molecular mechanisms through the GEPIA, miRWalk 2.0, DAVID, and STRING databases.

Collection of the potential target genes of miR-493-3p and GO and KEGG enrichment analysis. A. Venn diagram of potential target genes of miR-493-3p, showing the intersection (n=33) between miRWalk 2.0 predicted target genes (n=2103) and TCGA low-expressed genes in LUAD (n=1108) and in LUSC (n=1920). B. GO enrichment analysis. C. KEGG enrichment analysis. The x-axis represents the ratio of involved genes, and the y-axis represents the GO and KEGG terms. Each bubble represents a term. The size of the bubble indicates the number of involved genes. Lighter colors indicate smaller P values.

Figure 6
The PPI network of possible miR-493-3p target genes in NSCLC. Each node represents a gene-encoded protein, while lines between the nodes represent protein associations.