Identifying of a Novel miR-98-5p/IGF1 Axis Contributes the Pathogenesis of Breast Cancer Using Comprehensive Bioinformatic Analyses Methods and Experiments Validation

Background: Breast cancer (BC) is a huge threat for the health of women worldwide. Although the numerous microRNAs (miRNA) have been identied to be aberrantly expressed in BC, the construction of a comprehensive miRNA-messenger RNA (mRNA) network is still needed. This study was aimed to identify BC-associated miRNAs through analyzing microarray datasets obtained from GEO database and to construct a miRNA-mRNA network for BC. Methods: Limma package was used to identify differentially expressed miRNAs (DEMs) in microarray datasets. Genes targeted by DEMs were analyzed with mirTarBase. Gene Ontology and pathway enrichment analysis for the predicted target genes were performed at DAVID. Correlation of DEMs and target genes was analyzed at ENCORI. Based on these results, a miRNA-mRNA regulatory network was constructed. Results: A total of 17 overlapping DEMs were identied at these two microarray datasets. Expression of DEMs in BC were further validated by ENCORI. By utilizing miRTarBase, a total of 167 target genes for DEMs were obtained. 10 hub genes (AKT1, MYC, VEGFA, CCND1, PTEN, IL6, CASP3, KRAS, IGF1, ESR1) were identied after network analysis at STRING and CytoScape. Through analyzing the effects of hub genes on overall survival of BC patients and correlation of DEMs and hub genes, we found hsa-miR-98-5p/IGF1 axis may play a crucial role in BC progression. The connections of hsa-miR-98-5p and IGF1 were further validated by luciferase activity reporter assay and functional assays. Conclusion: In this work, a miRNA-mRNA network related to BC progression was built, and identied one important miRNA-mRNA axis in BC.


Background
The estimated newly diagnosed cases for Breast cancer (BC) are 2,088,849 in 2018 at the worldwide range [1]. Hence, BC represents a huge health threat for women worldwide [2]. Cancer is characterized as genome mutation, abnormal cellular growth status, failed to immune response, and so on [3]. As a result, it was believed that identifying molecules related to these processes is crucial to understand the mechanisms behind BC progression for the identi cation of novel prognosis or treatment biomarkers.
The recent development in sequencing technology and bioinformatic analyses measures has identi ed numerous aberrantly expressed protein coding genes and non-coding genes related to cancer metastasis, drug resistance, initiation and development [4][5][6].
As reported, there are only about 2% protein-coding RNAs in genome according to the human genome project, while most of them belong to non-coding RNAs (ncRNAs) [7,8]. ncRNAs including pseudogene, circle RNA (circRNA), long non-coding RNA (lncRNA), microRNA (miRNA) were used to be regarded as "junk genes" but their roles in cancers have been gradually recognized in recent years [9]. miRNAs are a class of ncRNAs with the length of approximately 18-25 nucleotides that can directly regulate target gene expression mainly via 3'-untranslated region (3'-UTR) binding [10]. miRNAs were found closely associated with BC cancer cell growth, apoptosis, metastasis, and drug sensitive [11]. Yao et al. identi ed 218 differentially expressed miRNAs (DEMs) and 2222 differentially expressed messenger RNAs (DEGs) correlated with prognosis of BC patients using the data obtained from The Cancer Genome Atlas (TCGA) database, and based on that a miRNA-mRNA regulatory network was constructed [12]. In addition, He et al. reported 11 highly expressed glycolytic proteins, who have a closely association with poor prognosis of BC [13]. By furthering analyses, they identi ed miR-140-5p was downregulated in BC and targeted GLUT1 expression to regulate BC glycolytic and proliferation [13].
In light of this, we aimed to build a miRNA-mRNA regulatory network in BC based on the datasets obtained from Gene Expression Omnibus (GEO). By analyzing the targets for miRNAs, we constructed miRNA-mRNA network related to the progression of BC. Finally, we explored the correlation of miRNA and mRNA in BC to provide more evidence to strength the ndings in this work.

Microarray data analysis
Two microarray datasets were downloaded from GEO (https://www.ncbi.nlm.nih.gov/gds), a database contains gene expression data. Dataset GSE83270 contains gene pro le data from 6 health control tissues and 6 BC patients' tissues was analyzed at 7th generation of miRCURYTM LNA Array (v.18.0, Exiqon) and submitted by Ya-Wen Wang from Shandong University. Dataset GSE68085 contains gene expression data from 104 BC patients' tissues and 11 healthy normal tissues was analyzed at GPL10999 platform and submitted by Sambasivarao Damaraju from University of Alberta.

Identi cation of DEMs
Limma package in R software was employed to identify DEMs between cancer tissues and normal tissues at both datasets. Cut-off values were set as |log2 fold change (FC)| > 2 and P < 0.01. Venn diagram was utilized to identify the overlapping DEMs in these two datasets.

Validation of DEMs expression at ENCORI
The expression of these overlapping DEMs in BC patients was validated using ENCORI, a public accessed platform for studying RNA interactions [14].

miRNA targets prediction
The mRNA targets for DEMs were analyzed at miRTarBase, a database contains experimental validated targets [15]. Only the targets validated by strong methods including luciferase activity reporter assay, western blot, or quantitative real-time PCR (RT-qPCR) were selected for followingly analyses. Based on these results, a miRNA-mRNA regulatory network related to the development of BC was established and visualized at Cytoscape V_3.6.0 software [16].
Protein-protein interaction (PPI) network PPI network for the above predicted miRNA targets was analyzed at Search Tool for the Retrieval of Interacting Genes (https://string-db.org) and visualized at Cytoscape V_3.6.0 software. Hub genes in this network were analyzed at CytoHubba with the cut-off value of degree above 10.
Gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis Database for annotation, visualization, and integrated discovery (https://david.ncifcrf.gov/) was employed for GO and KEGG analyses of the predicted miRNA targets [17]. P value less than 0.05 was believed as signi cant enrichment.
Validation of hub gene expression at ENCORI, GEPIA, and UALCAN The expression level of these 10 identi ed hub genes in BC was analyzed at both ENCORI, GEPIA, and UALCAN [18,19]. The genes that revealed to be abnormally expressed in these three databases were selected for followingly analyses.

Survival analysis using Kaplan-Meier Plotter
The effects of hub genes on the overall survival of BC patients were analyzed at Kaplan-Meier Plotter [20].
The median value was used as cut-off value to classify patients into high or low expression groups.
Analysis of miRNA and mRNA correlation at ENCORI Subsequently, hub genes after expression and prognostic value validation were selected to analyze its connection of miRNA at ENCORI. The pairs that inversely correlated were selected for following analyses.
Cell transfection pcDNA3.1 contains the coding sequence of IGF1 (pIGF1) was designed by GenScript (Nanjing, Jiangsu, China). For miRNAs or siRNAs transfection, Lipofectamine 2000 obtained from Invitrogen was used according to the manufacturer's protocol.
Cell counting kit-8 assay Cell Counting Kit-8 (CCK-8) bought from Beyotime (Haimen, Jiangsu, China) was used to measure cell proliferation rate. Brie y, 2,000 cells were cultured into 96-well plates and incubated for indicated time.
Then, CCK-8 reagent was added to each well and the detected the optical density value at 450 nm after further incubation for 2 h.
Wound-healing assay 5 × 10 5 cells were plated in 6-well plates. After wound generation at cell surface, cells were washed with PBS and incubated in serum-free medium. After 48 h, cell images were captured to measure the migration rate.

Transwell invasion assay
Cell invasion ability was detected with Matrigel coated transwell chamber (8-μm pore size). Cells in serum-free medium were plated to upper chamber, while the serum contained medium was lled to lower chamber. Invaded cells were stained by crystal violet and counted under microscope.
Tumor formation in BALB/c nude mice BALB/c athymic nude mice (female, 4-6 weeks old and 16-20 g) purchased from Beijing Vital River Laboratory Animal Technology Co., Ltd. were used for animal experiments. MCF7 cells stable expressed has-miR-98-5p inhibitor and NC-inhibitor suspended in PBS were inoculated subcutaneously into nude mice (6 per group). Tumor length and width was measured every 5 days. The mice were euthanized via overdose of pentobarbital, and the tumors were weighed. Tumor size was calculated using the formula: (L × W 2 )/2 (L represents Length; W represents Width). Animal experiments were approved by animal care committee of The First A liated Hospital of Jinzhou Medical University.

Statistical analysis
Data obtained in this work were analyzed using GraphPad software (San Diego, CA, USA) and presented as mean ± SD. Differences in groups were analyzed with Student's t-test (two groups) or ANOVA and posthoc test (for three groups). P value less than 0.05 was considered as statistically signi cant.

Screening and analyzing DEMs between BC tumor tissues and normal tissues
Through limma package analyses, we identi ed 46 DEMs in GSE83270 dataset. In the meantime, we found 297 DEMs in GST68085 dataset. Using Venn diagram, we identi ed 17 overlapping DEMs in both datasets (Fig. 1a). The detailed expression information of these 17 DEMs in these two datasets were summarized in Table 1. Moreover, the expression of these 17 DEMs in BC patients was further validated at ENCORI and we showed 13 of them were abnormally expressed (Fig. 1b). Moreover, 11 of these 13 miRNAs t the expression trend as indicated in two GEO datasets (Fig. 1b).

GO and KEGG analyses
Next, GO and KEGG analyses were performed on these 167 mRNAs to understand the biological functions of these genes. GO analysis revealed that these genes were mostly located at nucleoplasm and cytoplasm (Fig. 3a). Molecular functions (MF) enrichment showed that these genes were associated with protein binding, protein kinase binding, and protein kinase activity (Fig. 3b). In the meantime, the enriched biological processes (BP) were those associated with regulation of cellular processes (Fig. 3c). KEGG analysis showed these genes were most enriched in pathways related to cancer including Pathways in cancer, Prostate cancer, Melanoma, Proteoglycans in cancer, Cell cycle, Small cell lung cancer, PI3K-Akt signaling pathway, Hepatitis B, HTLV-I infection, and Chronic myeloid leukemia (Fig. 3d).

PPI network construction and hub gene identi cation
To investigate whether these genes were essential for the development of BC, PPI network consists 177 nodes and 176 edges was built (Fig. 4a). Top 10 hub genes according to degree method were selected using CytoHubba and the genes were: AKT1, MYC, VEGFA, CCND1, PTEN, IL6, CASP3, KRAS, IGF1, and ESR1 (Fig. 4b). Network analysis showed the PPI network ts the characters of small-world network (Fig.  5).

Prognostic value of hub genes
In order to explore the signi cance of these hub genes, we analyzed their effects on overall survival of BC patients. We found MYC, VEGFA, CCND1, IL6, CASP3, KRAS, IGF1, and ESR1 were indicators for the poor overall survival of BC patients (Fig. 6).

Hub gene validation
Then, the expression level of hub genes in BC tissues was validated using ENCORI, GEPIA, and UALCAN. The genes that validated to be aberrantly expressed in all these three databases were selected for analyses. It was found MYC, CCND1, IL6, and IGF1 were abnormally expressed in BC tissues compared with normal tissues (Fig. 7). Moreover, miRNAs that can target these four genes were summarized in Table 3.

Correlation of miRNAs and hub genes in BC
The connection of miRNA and hub genes in BC was then analyzed at ENCORI. As presented in Fig. 8, we found hsa-miR-98-5p and IGF1 was negatively correlated in BC tissues (P = 1.58 e-9).
Validation of hsa-miR-98-5p and IGF1 axis in BC At length, we explored the expression level of hsa-miR-98-5p and IGF1 in BC cells. As presented in Fig. 9a, we showed hsa-miR-98-5p expression level was signi cantly increased in BC cells compared with the normal cell line. In the meantime, we found IGF1 expression level was signi cantly decreased in BC cells in comparison with normal cell line (Fig. 9b). The binding region between hsa-miR-98-5p and IGF1 was presented in Fig. 9c. Dual-luciferase activity reporter assay revealed that hsa-miR-98-5p mimic transfection decreased luciferase activity in cells harbor wt-IGF1 (Fig. 9d). These results validated IGF1 was a direct target for hsa-miR-98-5p.

hsa-miR-98-5p regulates tumor growth in vivo
To validate the roles of hsa-miR-98-5p in vivo, we established nude mouse xenograft model. As expected, both the tumor size and weight were decreased with has-miR-98-5p inhibitor relative to those treated with NC-inhibitor ( Fig. 11a and 11b).

Discussion
The 5-year overall survival for BC patients is still far away from satisfactory even though the recently improvements in identifying aberrantly expressed genes in BC [12,13,20,21]. For instance, CYP4Z1 and pseudogene CYP4Z2P was reported to target CDK3 expression to regulate the responses of BC cell to tamoxifen [20]. Besides that, lncRNA SNHG7 expression was reported to be elevated in BC tissues compared with normal tissues [21]. Moreover, the knockdown of SNHG7 was revealed to suppress BC cell proliferation and invasion via regulating miR-381 [21]. Hence, we believe it is essential to further investigate the mechanisms behind BC development.
Owing to the appearance of new measures to analyze RNA-seq data and microarray-based expression pro ling data, we are now more likely to understand the changes in molecular level during cancer progression process. In this study, we downloaded two datasets contains data from a total of 137 of either BC tumor tissues or normal tissues from GEO. We found that 17 overlapping miRNAs were aberrantly expressed in BC tumor tissues compared with the normal tissues. Combined with the targets of these miRNAs, we found 167 mRNAs that were further used for functional enrichment analyses. We showed these mRNAs were mostly enriched in pathways related to cancer progression and able to regulate numerous cellular processes.
Generally, the analysis of interaction connection of genes can be used to identify key genes in that network. Here, using the cut-off criteria value of degree above 10, we identi ed 10 hub genes in the PPI network. Of these 10 hub genes, 8 of them were revealed could be regarded as predictor for the prognosis of BC patients. By combining the expression results in ENCORI, GEPIA, and UALCAN, we validated 4 genes (MYC, CCND1, IL6, and IGF1) were abnormally expressed in BC. Through analyzing the miRNA and mRNA expression connection in BC patients, we identi ed a novel hsa-miR-98-5p/IGF1 axis that may play a crucial role in BC. hsa-miR-98-5p has a controversial role in human cancers as it can function as either tumor suppressor or oncogene [22,23]. For example, Fu and co-workers showed that downregulation of hsa-miR-98-5p could promote the development of pancreatic ductal adenocarcinoma via inversely regulating the expression of MAP4K4, indicating the tumor suppressive role of hsa-miR-98-5p [22]. On the contrary, Wang et al. showed hsa-miR-98-5p was enriched expression in epithelial ovarian cancer, and regulated the sensitive of cancer cells to cisplatin [23].
Subsequently, we validated the hsa-miR-98-5p/IGF1 axis in BC using in vitro experiments. Firstly, we showed hsa-miR-98-5p expression was increased, while IGF1 expression was decreased in BC cell lines compared with normal cell line, which is in consistent with the bioinformatic analyses results presented previously. Moreover, the luciferase activity reporter assay was used to validate the direct connection of hsa-miR-98-5p and IGF1. Not surprisingly, we found the introduction of hsa-miR-98-5p mimic could decreased luciferase activity in BC cells harboring wt-IGF1, indicating the direct connection of hsa-miR-98-5p and IGF1. Functional assays showed hsa-miR-98-5p overexpression could increase, while IGF1 overexpression could decrease BC cell proliferation, migration, and invasion. The overexpression of IGF1 could reversed the effects of hsa-miR-98-5p overexpression on BC cell behaviors, indicating IGF1 was a functional target for hsa-miR-98-5p. In vivo experiments validated the oncogenic roles of has-miR-98-5p in regulating BC progression.

Conclusion
To summary, our results demonstrated that 13 miRNAs were abnormally expressed in BC tissues compared with normal tissues. Moreover, we identi ed a novel hsa-miR-98-5p/IGF1 axis that may play crucial roles in BC development through bioinformatic analyses and experiment validation. Our work presented will help us to understand the molecular mechanisms of BC and aid to develop novel therapeutic target against BC. Figure 1 Screening abnormally expressed miRNAs in BC tissues compared with normal tissues. a Identi cation of overlapping miRNAs between dataset GSE68085 and GSE83270. b Validation the expression of overlapping miRNAs at ENCORI database. miRNA: microRNA; BC: breast cancer.

Figure 2
The miRNA-mRNA interaction network in BC. miRNA: microRNA; mRNA: messenger RNA; BC: breast cancer. Figure 4 PPI network of the 167 targets for miRNAs. a Using STRING, a PPI network was constructed for these 167 genes. b Top 10 hub genes derived from degree method were selected. PPI: protein-protein interaction; miRNA: microRNA.  Validation of hub genes expression on BC tissues using the ENCORI, GEPIA, and UALCAN databases.  Validation the hsa-miR-98-5p and IGF1 axis in BC cells. a hsa-miR-98-5p expression in BC cells and normal cell line. b IGF1 expression in BC cells and normal cell line. c Binding region between hsa-miR-98-5p and IGF1. d Luciferase activity in BC cells with synthetic miRNAs or luciferase vectors transfection.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download. NC3RsARRIVEGuidelinesChecklist2014.docx