Identification of DEmiRNAs and DEmRNAs
Through differential analysis of the TCGA database, we obtained a total of 7531 DEmRNAs and 267 DEmiRNAs. Among the DEmRNAs, 3136 were down-regulated and 4395 were up-regulated (Fig. 1A/B). Among the DEmiRNAs, 82 were down-regulated and 185 were up-regulated (Fig. 1C/D). We randomly divided the LMNGC cases (n = 291) into a training group (n = 207) and a test group (n = 84). We then constructed a Cox risk prediction model with overall survival (OS) as the outcome variable. Univariate regression analysis showed that hsa-mir-139-5p, hsa-mir-141-3p, hsa-mir-143-5p, hsa-mir-328-3p, hsa-mir-96-5p, hsa-mir-100-5p, hsa-mir-4664-3p, hsa-mir-125b-5p, hsa-mir-7-5p and hsa-mir-504-5p were significantly associated with OS.
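The random division into training and test groups can be illustrated with a minimal sketch. The case identifiers, the seed and the helper name split_cases are placeholders introduced here for illustration; the source does not state how the randomization was performed.

```python
import random


def split_cases(case_ids, n_train, seed=0):
    """Randomly partition case IDs into a training and a test group."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    shuffled = list(case_ids)
    rng.shuffle(shuffled)
    return shuffled[:n_train], shuffled[n_train:]


# Placeholder identifiers standing in for the 291 LMNGC cases.
cases = [f"case_{i:03d}" for i in range(291)]
train, test = split_cases(cases, n_train=207)
print(len(train), len(test))
```

Fixing the seed keeps the split reproducible, which matters when the same partition is reused for model fitting and validation.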
Prediction of Target DEmiRNAs
We combined target prediction databases (TargetScan, miRDB, miRTarBase) with EMT-related genes and visualized the regulatory network of miRNAs and their target genes (Fig. 2). We obtained the intersection for the miRNAs and visualized it with a Venn diagram (Fig. 3). The results showed that a total of 103, 11, 13 and 83 overlapping genes were associated with hsa-mir-141-3p, hsa-mir-4664-3p, hsa-mir-125b-5p and hsa-mir-7-5p, respectively. In addition, 139 target genes were found to be differentially expressed, of which 108 were down-regulated and 31 were up-regulated. The differentially expressed miRNAs were then examined further and their prognostic value was assessed (Fig. 4A/B). Kaplan-Meier analysis showed that these four miRNAs can effectively distinguish high-risk from low-risk groups and serve as good prognostic indicators (Fig. 5).
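The overlap step can be sketched with plain set operations. This sketch assumes that a target is kept only if all three databases predict it and it also appears in the EMT gene list; the source does not specify whether the databases were combined by intersection or union. The gene symbols and the helper name emt_targets are placeholders, not real predictions.

```python
def emt_targets(predictions, emt_genes):
    """Return genes predicted by every database that are also EMT-related."""
    # Assumption: a gene must be predicted by all databases to be retained.
    shared = set.intersection(*(set(genes) for genes in predictions.values()))
    return shared & set(emt_genes)


# Toy inputs with placeholder gene symbols for a single miRNA.
predictions = {
    "targetscan": {"GENE_A", "GENE_B", "GENE_C", "GENE_D"},
    "mirdb": {"GENE_A", "GENE_B", "GENE_D", "GENE_E"},
    "mirtarbase": {"GENE_A", "GENE_B", "GENE_F"},
}
emt_genes = {"GENE_A", "GENE_F", "GENE_G"}

overlap = emt_targets(predictions, emt_genes)
print(sorted(overlap))  # ['GENE_A']
```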
Construction of EMT-related miRNA Prognostic Model
We combined the EMT-related miRNA predictors with other clinical characteristics in the analysis, and patients were divided into a high-risk group and a low-risk group according to their risk scores (Fig. 6A). The EMT-related miRNA model was defined as: risk score = 0.533 × (hsa-mir-141-3p) - 0.188 × (hsa-mir-7-5p) - 0.183 × (hsa-mir-4664-3p) - 0.188 × (hsa-mir-125b-5p).
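As a worked illustration of the formula, the following sketch computes the risk score from the reported coefficients and assigns risk groups. The median cutoff is an assumption made here, since the source does not state the threshold used, and the expression values are invented for the example.

```python
from statistics import median

# Coefficients as reported in the risk score formula.
COEFFICIENTS = {
    "hsa-mir-141-3p": 0.533,
    "hsa-mir-7-5p": -0.188,
    "hsa-mir-4664-3p": -0.183,
    "hsa-mir-125b-5p": -0.188,
}


def risk_score(expression):
    """Weighted sum of miRNA expression values for one patient."""
    return sum(coef * expression[mirna] for mirna, coef in COEFFICIENTS.items())


def assign_groups(scores):
    """Label each patient 'high' or 'low' (assumed cutoff: the median score)."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Invented expression values, for illustration only.
patients = {
    "p1": {"hsa-mir-141-3p": 2.0, "hsa-mir-7-5p": 1.0,
           "hsa-mir-4664-3p": 1.0, "hsa-mir-125b-5p": 1.0},
    "p2": {"hsa-mir-141-3p": 0.5, "hsa-mir-7-5p": 2.0,
           "hsa-mir-4664-3p": 2.0, "hsa-mir-125b-5p": 2.0},
    "p3": {"hsa-mir-141-3p": 1.0, "hsa-mir-7-5p": 1.0,
           "hsa-mir-4664-3p": 1.0, "hsa-mir-125b-5p": 1.0},
}
scores = {pid: risk_score(expr) for pid, expr in patients.items()}
groups = assign_groups(scores)
```

Under this formula, higher hsa-mir-141-3p expression raises the score, while higher expression of the other three miRNAs lowers it.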
Univariate Cox regression analysis indicated that both the EMT-related miRNA predictive model (Fig. 6B; HR = 1.701, p < 0.001) and lymph node metastasis (Fig. 6B; HR = 2.013, p = 0.007) were prognostic risk factors. Multivariate Cox regression analysis likewise showed that the EMT-related miRNA predictive model and lymph node metastasis were both prognostic risk factors (Fig. 6C; all p < 0.05).
Evaluation of the Validity of the Prognostic Model
Using Kaplan-Meier survival analysis, we evaluated all LMNGC patients (Fig. 7A, 13.77% vs 56.09%, p < 0.05), the training group (Fig. 7B, 11.89% vs 53.84%) and the test group (Fig. 7C, 18.97% vs 60.10%). Survival in the high-risk group was significantly worse than in the low-risk group in all three data sets (all p < 0.05). We also plotted the survival distribution for all patients (Fig. 7D), the training group (Fig. 7E) and the test group (Fig. 7F). We evaluated the model with the survivalROC package; the AUC values for all cases (Fig. 7G, 0.755), the training set (Fig. 7H, 0.750) and the test set (Fig. 7I, 0.778) were all above 0.7, indicating that the model has good predictive ability.
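For readers unfamiliar with the Kaplan-Meier estimator used above, a minimal pure-Python sketch follows. It is not the implementation used in the study, and the follow-up times and event indicators are invented for illustration. At each distinct time with at least one observed death, the running survival probability is multiplied by (1 - deaths / number at risk); censored patients leave the risk set without changing the estimate.

```python
def kaplan_meier(times, events):
    """Return [(time, survival_probability)] at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if death was observed at that time, 0 if censored
    """
    order = sorted(zip(times, events))
    at_risk = len(order)
    survival = 1.0
    curve = []
    i = 0
    while i < len(order):
        t = order[i][0]
        deaths = 0
        removed = 0
        # Collect every patient whose follow-up ends at time t.
        while i < len(order) and order[i][0] == t:
            deaths += order[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Invented toy data: five patients, two of them censored.
curve = kaplan_meier(times=[1, 2, 2, 3, 4], events=[1, 1, 0, 1, 0])
for t, s in curve:
    print(t, round(s, 3))
```

Comparing the curves of the high-risk and low-risk groups would additionally require a test such as the log-rank test, which this sketch does not cover.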
Gene Ontology and KEGG Pathway Analysis
Using the clusterProfiler package, we performed GO and KEGG enrichment analysis of all target genes involved in lymph node metastasis. The GO analysis showed that 483 GO terms were related to EMT. The dot plots show the top terms in the biological process (BP), cellular component (CC) and molecular function (MF) categories for all target genes (Fig. 8A/B/C).
Among these three categories, EMT-related target genes were mainly enriched in BP terms, including cellular calcium ion homeostasis, calcium ion homeostasis, and regulation of cytosolic calcium ion concentration. In the CC category, the top terms were presynaptic membrane, distal axon, and cation channel complex. In the MF category, channel regulator activity, ion channel regulator activity, and nuclear receptor activity were the top three enriched terms for the target genes.
In addition, KEGG analysis showed that 19 GC-related pathways were enriched, mainly the HIF-1 signaling pathway, the calcium signaling pathway and focal adhesion (Fig. 8D). We also constructed a pathway-gene network (Fig. 9A) and a pathway-pathway network (Fig. 9B) to confirm the relationship between the KEGG pathways and the target genes.
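The enrichment statistics behind GO and KEGG results of this kind are commonly based on over-representation analysis with a hypergeometric test. The source does not describe the test settings, so the following is only a sketch of that general technique, with a hypothetical helper name and invented counts; it does not reproduce the clusterProfiler implementation or its multiple-testing correction.

```python
from math import comb


def enrichment_p_value(n_background, n_term, n_targets, n_overlap):
    """Upper-tail hypergeometric probability P(X >= n_overlap).

    n_background : genes in the background set
    n_term       : background genes annotated to the term or pathway
    n_targets    : genes in the target list
    n_overlap    : target genes annotated to the term or pathway
    """
    upper = min(n_term, n_targets)
    total = comb(n_background, n_targets)
    tail = sum(
        comb(n_term, k) * comb(n_background - n_term, n_targets - k)
        for k in range(n_overlap, upper + 1)
    )
    return tail / total


# Invented counts: 100 background genes, 10 in the term, 5 targets, 3 overlapping.
p = enrichment_p_value(n_background=100, n_term=10, n_targets=5, n_overlap=3)
print(p)
```

A small p-value indicates that the target list contains more genes from the term than expected by chance; in practice the p-values are adjusted across all tested terms.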