STXBP5L and its related molecules as potential biomarkers in lymph node metastatic breast cancer


 BackgroundLymph node metastasis is a common problem in breast cancer, and its underlying molecular mechanism is still unclear. The purpose of this study is to research the molecular mechanism and to explore the key RNAs and pathways that mediate lymph node metastasis in breast cancer.MethodsGSE100453 and GSE38167 were downloaded from the Gene Expression Omnibus (GEO) database and 569 breast cancer statistics were also downloaded from the TCGA database. Differentially expressed miRNAs were calculated by using R software and GEO2R. Gene ontology and Enriched pathway analysis of target mRNAs were analyzed by using DAVID and R software. The PPI network was constructed by using Metascape and Cytoscape software.ResultsIn total, 6 differentially expressed miRNAs were selected, and 499 mRNAs were identified after filtering. The research of KEGG indicated that mRNAs enriched in many cancer pathways. Besides, some hub mRNAs were analyzed after constructing the PPI network. A total of 3 out of 6 miRNAs had a significant relationship with the overall survival (P<0.05) and showed a good ability of risk prediction model of over survival.ConclusionBy utilizing bioinformatics analyses, differently expressed miRNAs were identified and constructed a complete gene network. Several potential mechanisms and therapeutic and prognostic targets of lymph node metastasis were also demonstrated in breast cancer.


Background
By 2018, among 1762450 newly diagnosed cancer patients in the United States, and 606880 died. Of them, breast cancer accounted for 30% (268600) of new cancer cases and 15% (41760) of cancerrelated deaths in women (1). Although we have made great progress in the diagnosis and treatment of drugs, drug resistance and molecular heterogeneity have affected the effective treatment. With the increasing incidence rate of breast cancer in the United States, the related treatment of malignant 3 breast cancer has brought enormous pressure to the national economy and health care industry (2).
Recent studies have shown that radiotherapy and adjuvant chemotherapy can promote the overall survival of patients, especially for patients with early diagnosis of breast cancer (3). However, in these treatment methods, there are often side effects that affect the quality of life of patients, such as hematopoiesis system disorders, gastrointestinal discomfort, immune decline and so on. In addition, they also bear the risk of death such as bone loss, increased risk of secondary cancer and cardiotoxicity (4,5).
MicroRNAs (miRNAs) are small, noncoding RNAs, having the function of regulating after gene transcription (6). According to previous studies, miRNAs can regulate many target genes or one type of miRNAs can be regulated by many genes (7). In particular, Thomas et al found that some miRNAs can improve the therapeutic effect by improving the drug sensitivity of cancer cells (8). Xuan et al found that miR-374a, miR-92a, and miR-106a increase drug resistance and promote growth and metastasis of lung cancer (9). Kania et al also reported that miR-9-3p and miR-9-5p decrease DNA topoisomerase IIα expression levels in acquired resistance to etoposide and may act as biomarkers of responsiveness to TOP2-targeted therapy (10). However, the mechanisms of miRNAs in the occurrence and development of lymph node metastasis breast cancer remain unknown.
In this study, microarray data from GEO database and breast cancer sample data in the TCGA database promoted the research of differently expressed miRNAs in lymph node non-metastasis cancer tissues and lymph node metastatic cancer tissues. The functions of the target mRNAs were evaluated by using GO annotation, KEGG, PPI network and the association between identified miRNAs and the overall survival of patients with cancer. In summary, we conducted this study to identify key miRNAs and mRNAs for lymph node metastasis and to identify potential new breast carcinoma treatment targets.

Microarray data.
The GEO (https://www.ncbi.nlm.nih.gov/gds) database is a public functional genomics data set to help users download statistics and import gene expression. In our research, Gene expression profile data (GSE100453 and GSE38167) were obtained from GEO. 43 lymph node 4 non-metastasis breast cancer tissues and 25 lymph node metastasis breast cancer tissues were included. The array data were acquired from the Agilent-070156 Human miRNA [GPL20712; miRNA version] and Agilent-029297 Human miRNA [GPL14943; miRNA version]. Besides, we downloaded breast cancer statistics from the TCGA database which included 44 normal tissues and 525 tumor tissues. The flow of this study is shown in Figure 1.

2.2.
Differently expressed miRNAs analysis. R software was used to compare two groups of tissues. Besides, |log2FC|≥2 and P<0.05 were used as a cut-off criterion and a significant statistical difference would be considered if the statistics met our standards (11).

Targets of miRNA prediction.
To predict the miRNA and mRNA interaction, we used the database of miRDB (http://mirdb.org/) and TargetScan (http://www.targetscan.org/vert_72/). we used cumulative weighted context score ≤−0.4 and target score ≥65 as a cut-off criterion, and the intersection of two screening results was identified as our target mRNAs. We also constructed the network between the identified miRNAs and their target genes by using Cytoscape.

Functional and pathway enrichment analysis. KEGG pathway analysis and GO functional
analysis were performed by using DAVID (https://david.ncifcrf.gov/) and R software. GO analysis was divided into the cellular component (CC), biological process (BP) and molecular function (MF), and a Pvalue < 0.05 was thought that there was a statistical difference. For KEGG pathway analysis, P-value < 0.05 was also used as cut-off criterion.

PPI network analysis.
We conducted the PPI network by taking advantage of metascape (https://metascape.org/gp/index.html). It provides information about confirmed and predicted interactions between a large number of proteins. Differently expressed mRNAs would then input into the metascape. We would consider mRNAs to be significant which scored > 0.9 (12). Besides, molecular Complex Detection (MCODE) was then utilized to research the sub-modules of the PPI network, and the standards we set was the number of nodes >7 and MCODE score >6.

Analysis of mRNA expression in breast cancer. Hub genes expression in breast cancer
tissues and normal tissues were extracted from the human protein atlas (www.proteinatlas.org). mRNAs expression we identified were determined through analysis of TCGA databases, which are 5 available through TCGAportal (http://tumorsurvival.org/). High and low groups were defined as above average and below average respectively.

Analysis of the miRNAs and their relationship with breast cancer prognosis. The
Kaplan-Meier Plotter (www.kmplot.com/) is a non-parametric statistic that is used to research the overall survival statistics from recorded data. It was constructed from gene expression and survival data which were downloaded from GEO and TCGA database (13). Every miRNA which was identified would then be inputted into the online tool to assess the survival of patients with breast cancer for the Kaplan-Meier curve.

Independent Prognostic Ability of the miRNA Signature.
To better understand the relationship between the selected miRNAs and the prognosis of breast cancer patients, we constructed a risk prediction model. By using "survivalROC" package, we calculated AUC of 3 years and 5 years dependent ROC curve to assess the predictive power of identified miRNAs. A block diagram was also drawn to show the risk score of the model.

Quantitative reverse transcription polymerase chain reaction (qRT-PCR). 18
participants, including 11 lymph node non-metastasis breast cancer patients and 7 lymph node metastasis breast cancer patients, were recruited from The Second Affiliated Hospital of Soochow University. Breast cancer patients were diagnosed based on pathological results, who were excluded if he or she had the following diseases such as mammary tuberculosis, diabetes, mammitis, or other breast diseases. Before the study began, all the patients agreed. Plasma samples were collected from each participant before treatment began. After that, total RNA was extracted from per sample using TRIzol (Invitrogen), and its concentration and purity were evaluated by K5800 Microspectrophotometer (Kaiao). The reverse transcription was conducted using PrimeScript™ 1st Strand cDNA Synthesis Kit (Takara) at 42℃ for 60 minutes and then at 95℃ for 5 minutes. Next, based on LightCycler® 480 II real-time PCR system (Roche), PCR was performed with SYBR® Premix Ex Taq™ Kit (Takara) at the temperature of 95℃ for 2 minutes, followed by 38 cycles with the temperature of 95℃ for 30 seconds, 53℃ for 30 seconds and 72℃ for 30 seconds. U6 was applied as internal controls. The 2−ΔΔCt method was utilized to determine the relative expression of each selected 6 miRNA between case and controls.

Identification of the miRNAs between lymph node non-metastasis tissues and lymph
node metastasis tissues. R software was used to research the gene expression profiles from the GSE100453 and GSE38167. According to the cut-off criteria (P<0.05 and |log2FC|≥1), 30 differentially expressed miRNAs were identified. After analyzing 569 samples of the TCGA database, 80 differentially expressed miRNAs were identified and 6 of them exist in both filter results which were consisted of 3 downregulated and 3 upregulated miRNAs. The result is shown in Table 1.

Target prediction and GO analysis.
The target mRNAs of those 6 differently expressed miRNAs were downloaded from two miRNA target prediction websites (targetscan and miRDB). 499 mRNAs were identified after filtering. The Network between miRNAs (2 upregulated miRNAs and 2 downregulated miRNAs) and target mRNAs was shown in Figure 2. To learn more about the function of these mRNAs, we uploaded them into DAVID to perform GO and KEGG analysis. In the CC ontology,  Table 2). Based on the key mRNAs and related miRNAs, we constructed a miRNA-mRNA network (Figure 6), and It may become potential therapeutic targets and new biomarkers for lymph node metastasis breast cancer.

Analysis of the key mRNAs expression in normal tissues and cancer tissues. The human
protein atlas was utilized to research the expression of human proteins in different tissues. RBAK, TUBA1C, and HK1 were selected from 11 key mRNAs. After entering them into the database, we found that three mRNAs have a positive strong expression in breast cancer tissues and negative weak expression in normal tissues (Figure 7a). To verify our conclusion, TCGAportal was then used. It is a website that downloads statistics from the TCGA database and contains 1102 breast cancer sample tissues. After inputting relevant mRNAs, we also found that the three mRNAs level was higher in cancer tissues than that in normal tissues (Figure 7b).

Analysis of the miRNAs and their relationship with breast cancer prognosis. To research
the prognosis of patients with breast cancer, we used the Kaplan-Meier Plotter. After uploaded 6 miRNAs, we got 6 survival graphs. The results indicated that overexpression of hsa-miR-4274, hsa-miR-6880-3p, and hsa-miR-670-5p ( Figure 8) were related to worse overall survival in patients with breast cancer. However, the expression level of has-miR-149, has-miR-1-3p, and has-miR-30b-3p may 8 have no significant relationship with the overall survival. This suggested that the selected miRNAs may be potential targets.

Evaluation of the 6-miRNA Signature for overall survival. The AUC of 3 years survival for
the 6-miRNA signature achieved 0.809 and the AUC of 5 years survival achieved 0.981, which proved that the model has good performance in predicting the survival risk of breast cancer patients ( Figure   9a). Besides, the box diagram also proved our conclusion (Figure 9b).

Verification of potential biomarker expression by qRT-PCR. 3 miRNAs were verified to
have relationship with cancer prognosis, and it was found that miR-223-3p and miR-448 had high reliability, which all target STAT1. Then, the selected biomarkers including hsa-miR-4274, hsa-miR-6880-3p, and hsa-miR-670-5p were validated in breast cancer plasma samples using qRT-PCR analysis. Consistent with the prediction, the results showed that the expression levels of hsa-miR-4274 (P-value = 0.015) and hsa-miR-670-5p (P-value = 0.013) in plasma of lymph node nonmetastasis breast cancer patients were obviously lower than that of lymph node metastasis breast cancer patients and hsa-miR-670-5p (P-value = 0.013) were higher in lymph node non-metastasis breast cancer patients plasma ( Figure 10). And we also found that hsa-miR-4274 and hsa-miR-670-5p had high reliability, which all target STXBP5L.

Discussion
Lymph node metastasis is a common problem in breast cancer, which seriously affects the survival and prognosis of patients. Therefore, it is more and more urgent to understand mechanisms and find a specific and sensitive treatment method for patients. In this present research, gene expression data of GSE100453 and GSE38167 were searched from the GEO. 569 breast cancer statistics were searched from the TCGA database. 6 differently expressed miRNAs (hsa-miR-4274, hsa-miR-6880-3p, hsa-miR-4689, hsa-miR-1-3p, hsa-miR-670-5p, hsa-miR-30b-3p) were identified by combining two screening results. Among them, some miRNAs have been shown to affect tumor proliferation, migration, and prognosis. For example, a previous study reported miR-4274 promotes breast cancer cell proliferation, migration, and invasion through suppressing lncRNA SNHG15 (14), it also serves as a prognostic biomarker, or as a potential therapeutic target, in cutaneous melanoma patients (15). As 9 for hsa-miR-146a, it may play an important role in enhancing the proliferative activity related to liver cancer by regulating the expression of PROX1 at the post-transcriptional level (16). Another study also reported hsa-miR-1-3p was significantly down-regulated in gastric cancer and overexpression of miR-1-3p inhibited proliferation and invasion in gastric cancer by inhibiting stanniocalcin 2 expressions (17). Besides, it may play a tumor suppressor role by directly regulating ADAM9 and MMP7 in breast cancer (18). However, the regulation mechanism of some miRNAs in the tumor has not been reported yet, so more researches need to be conducted in the future, especially in breast cancer.
To learn more about the regulatory mechanism of the 6 miRNAs in breast cancer, we chose the intersection of two screening results from targetscan and miRDB, 499 mRNAs were identified after filtering. Function annotation indicated that these miRNAs were primarily related to adherens junction, regulation of cytoskeleton organization, and protein heterodimerization activity. KEGG According to previous studies, the NOD-like receptor signaling pathway participates in a diverse array of important cellular processes, including the survival, proliferation, differentiation, and activation of different cell types (19). Another study reported the treatment for influenza A has been long used as a treatment for Parkinson's disease (20). Also, recent researches showed that hepatitis C involved in Invasion, metastasis, and prognosis of metastatic melanoma and may be potential therapeutic targets (21).
Besides, PPI network analysis indicated that 11 hub genes including RBAK, TUBA1C, CCR4, HK1, FBXO17, INTS7, F2RL2, RPS29, FOXJ2, OBSL1, and GPI may be used as new targets in breast cancer with lymph node metastasis which had higher degrees of interaction. RBAK, RB-associated KRAB zinc finger, was found upregulated in non-small cell lung cancer (22) and taking part in the regulation of the cell cycle (23). A previous study reported that the high expression of RBAK is related to the short survival time of prostate cancer patients after radical prostatectomy (24), but its prognostic value and molecular function in breast cancer are not clear. As for TUBA1C, a component of tubulin is highly expressed in tumor tissues than the normal tissues according to previous studies (25). Ji Wang et al reported that TUBA1C may promote the migration and proliferation of hepatoma cells possibly through the cell cycle signal pathway (26). As for FBXO17, previous results showed that FBXO17 expression in tumor tissues of hepatocellular carcinoma (HCC) patients was markedly higher than that in adjacent tissues (27). It might promote the malignant progression of HCC by inhibiting the wnt/βcatenin pathway (28). In lung adenocarcinoma cells, FBXO17 was observed to promote cell proliferation through activation of Akt (29). Furthermore, the relationship between the 6 miRNAs and the overall survival curves for patients with breast cancer revealed that overexpression of hsa-miR-4274, hsa-miR-6880-3p, and hsa-miR-670-5p was related to worse overall survival. However, the expression level of has-miR-149, has-miR-1-3p, and has-miR-30b-3p may have no significant relationship with the overall survival of patients. Meanwhile, evaluation of the 6-miRNA signature for overall survival by the ROC curve displayed better predictive ability. After using qRT-PCR analysis, we found that hsa-miR-4274 and hsa-miR-670-5p were differently expressed in plasma of lymph node metastasis breast cancer patients, and we also found that hsa-miR-4274 and hsa-miR-670-5p had strong effects, which all target STXBP5L. However, there has been too little research on STXBP5L in breast cancer, and more research is needed to understand the mechanisms involved.
Nowadays, with the development of precision medicine, more and more attention has been paid to individual differences in treatment methods. Therefore, it is necessary to find new biomarkers and treatment methods for patients. Our findings approved that these miRNAs may play a vital role in lymph node metastasis breast cancer. Since all our data were achieved from the GEO database and TCGA by bioinformatics tools, as well as the limited number of relevant samples, more data analysis, and clinical experiments should be further performed for verification.

Conclusion
Our study not only concluded certain mechanisms of lymph node metastasis in breast cancer but also constructed a significant 6-miRNA risk prediction model for overall survival. Bioinformatics methods were utilized to identify the differently expressed miRNAs. The results of the enrichment analysis of GO and KEGG pathways showed that these miRNAs were significantly correlated with tumor progression. Also, 11 selected genes were identified according to the PPI network. Based on selected miRNAs and mRNAs, we constructed a miRNA-mRNA network. Compared with previous biomarkers, this network has an improved prediction of targeted therapy for patients. Meanwhile, we found that overexpression of hsa-miR-4274, hsa-miR-6880-3p, and hsa-miR-670-5p were related to the worse overall survival. Of them, hsa-miR-4274 and hsa-miR-670-5p were differently expressed through experiments. By using targetscan and miRDB, we found they all target STXBP5L. These conclusions from this study may provide a significant understanding of the mechanisms of lymph node metastasis in breast cancer. However, these findings need to be considered for future investigation.

Conflict of interest statement
The authors declared no conflict of interest.   Figure 1 Flow chart of this study.

Figure 2
The interactions between differentially expressed miRNAs and target mRNAs.
18  The association between miRNAs and breast cancer prognosis. 24