Construction of WGCNA network and identification of key modules
In an attempt to explore the potential mechanisms that underpin the pathogenesis of RM, we performed a Weighted Gene Co-expression Network Analysis (WGCNA) on data from both RM and normal samples. The clustering trees of the RM and normal samples are displayed in Fig. 3A, demonstrating that all samples were included in the cluster under both conditions. To establish scale-free networks, we calculated the soft power threshold β using the function “SFT $power estimation”. This calculation informed our decision to use the power of = 9 as the soft threshold (refer to Figs. 3B, C for details).Interestingly, the module eigengenes (MEs) found within the green-yellow module displayed a significant correlation with the histological grading of RM (R2 = 0.74, P = 9e-04) as illustrated in Figs. 3D, E. This compelling evidence led us to focus on the green-yellow modules for subsequent investigation.Fig. 3F presents the connection between module connectivity and gene significance. Gene modules were identified using the Topological Overlap Measure (TOM) matrix, setting a foundation for the forthcoming steps in our research. This streamlined and logical methodology allows us to make solid inferences and contributes to the credibility of our findings.
Functional and pathway enrichment of gene module
To comprehend the biological roles and signal pathways involved in gene modules, the Metascape database was used for annotation and visualization. To evaluate specific genes, we performed Gene Ontology (GO) and the Kyoto Encyclopedia of Genes and Genomes (KEGG) path analysis. Minimum overlap≥3 and P≤0.01 were considered statistically significant. Based on the green-yellow module, the percentage of related genes in the important pathway of the module was determined. In this study, the GO biological process and KEGG pathways were selected. Fig.4A displays the top five most abundant terms, including receptor regulator activity, cilium, anterior/posterior axis specification, neural crest cell migration, and ubiquitin ligase complex. The cytokine-cytokine receptor interaction and Ras signaling were the enriched in KEGG pathway. Additionally, we built a network of enriched related terms (Figure 4B).
Identification of key genes in functional modules
To further explore the gene constitution of the particular module, which was primarily related to the RAS pathway, a PPI system was established to discover key genes in the green-yellow module. key genes were found significantly related to the nodes in the module and considered functionally significant. We submitted the genes enriched in RAS-correlated pathways to the STRING database and retrieved 419 PPI nodes and 628 edges with a confidence threshold > 0.4 in the green-yellow modules. Thereafter, the top five nodes in the PPI network with the highest connectivity were used as central agents to be analyzed by Cytoscape. Resultantly, five key genes (F2, EGF, NGF, IL13, and FOXP3) were identified (Fig. 5).
Evaluation the diagnostic accuracy of key genes
In clinical diagnostics, the AUC value is used to evaluate and compare different biomarkers. We used a ROC curve method to examine the diagnostic accuracy of key genes and found that the AUC value of F2, EGF, NGF, IL13, and FOXP3 was 0.828, 0.906, 0.898, 0.922, 0.922, respectively (Fig.6). Hence, all key genes had high diagnostic accuracy for assessing the potential application of metabolites in the diagnosis of RM [24]. To further validate the accuracy of diagnostic biomarkers, we were further validated for differences in the validation set GSE26787 and GSE22490. All five key genes had significantly higher expression levels in RM than in controls (p < 0.05) in the set GSE26787 (Figure 7A). In the validation set GSE22490, IL13 and NGF were expressed at significantly higher levels in RM than in controls (p <0.05) (Figure 7B). To further validate the expression of key genes in RM, we conducted RT PCR detection on both RM and normal tissues and found that all five key genes were specifically overexpressed in RM( Figure 8).
Immune cell infiltration analysis results
We employed the CIBERSORT algorithm to assess immune infiltration in 22 immune cells in RM. The relative proportions of immune cells in each sample are displayed in Fig. 8A. According to the correlation heatmap of the 22 kinds of cells in RM, activated mast cells correlated positively with neutrophil cells (r=0.98) and regulatory T cells (r= 0.95); regulatory T cells correlated positively with neutrophil cells (r=0.92); and activated CD4 memory T cells correlated positively with naive B cells (r= 0.81). Resting mast cells showed a strikingly negative connection with neutrophils (r= -0.73) and NK cells (r= -0.67). (Fig. 9B). The violin plot revealed that M0 macrophage infiltration was considerably lower in RM samples (p 0.05). (Figure 9C).
Spearman correlation analysis between the five key genes and infiltrating immune cells
The Spearman correlation between of the correlations between 5 key gene expressions and various immune cells, we discovered distinct patterns. F2, EGF, IL13, and FOXP3 gene expressions were positively associated with NK cells but negatively associated with a suite of immune cells including immature dendritic cells, Th1 cells, Th2 cells, and co-stimulation T cells. Intriguingly, NGF expression exhibited an opposite trend, being positively correlated with an array of immune cells and negatively correlated with NK cells. These findings contribute to our understanding of the complex interplay between these key genes and immune cell infiltration, providing a potential pathway for further investigation in the pathogenesis of RM.
Identification the correlation of immune factors and the five key genes
Utilizing the TISIDB database, we examined the relationship between key genes and a variety of immune factors, including immunosuppressive and immunostimulatory factors, chemokines, and receptors. EGF, F2, FOXP3, and IL13 demonstrated significant positive correlations with certain chemokines (e.g., CXCL3, CXCL2, CXCL1, CCL28, CCL22, CCL19, and CCL17), immune inhibitors (e.g., VTCN1, IL10RB, and BTLA), and immunostimulators (e.g., TNFSF9, TNFSF13, MICB, and IL6). Conversely, they negatively correlated with chemokines such as CXCL14, immune inhibitors like CD96 and CD244, and several immunostimulators. Notably, NGF exhibited inverse correlations compared to the other key genes. This compelling evidence suggests these key genes play a crucial role in the immune-mediated mechanisms of RM.
GSVA Analysis for key genes
Gene set variation analysis (GSVA) was used to explore the signaling pathways implicated in the five key genes to determine the potential molecular mechanisms of RM. Figure 12 demonstrates that several hallmark gene-sets, including “CHOLESTEROL-HOMEOSTASIS,” “NOTCH SIGNALING,” “PEROXISOME,” “HEDGEHOG SIGNALING” “PANCREAS-BETA-CELLS” “P53 PATHWAY,” “PI3K AKT MTOR SIGNALING,” “WNT-BETA-CATENIN SIGNALING,” “FATTY-ACID-METABOLISM,” were observed in all high expression groups of F2, EGF, NGF, IL13 and FOXP3. Moreover, genes in low expression groups of F2, EGF, NGF, IL13 and FOXP3 were enriched in the “KRAS-SIGNALING” and “COMPLEMENT” “ESTROGEN-RESPONSE-LATE” “EPITHELIAL-MESENCHY MAL-TRANSITON” “INTERFERON-GAMMA-RESPONSE” “ALLOGRAFT-REJECTION” pathways. Remarkably, IL13 and FOXP3 showed similar molecular mechanisms, while NGF had the opposite signaling pathway with the other four key genes. Similarly, some immune-related pathways, such as the IL2 and interferon response pathways, were found to be enriched in the high expression groups of these key genes, indicating that they may play a significant role in the RM-associated immune response.
Co-expression analysis and the molecular function-based relationship between key genes.
Thereafter, the Corrplot and Circlize packages were used to investigate the genetic interactions of key genes. Co-expression of five key genes was assessed. As shown in the circular diagram (Figure 13A), EGF, IL13, F2, FOXP3 showed closely positive relationships, but the NGF had negative associations with each other. Different processes required to be investigated further. Next, we predicted a potential interaction of all identified key genes and constructed a gene-gene functional interaction network by using the GeneMANIA algorithm (Fig. 13B). Results showed a condensed interaction and pathway relations between key genes and their 20 neighbors. NGF and EGF mainly participated in the neurotrophin signaling pathway and neurotrophin TRK receptor signaling pathways. F2 and FOXP3 mainly participated in the regulation of cell activation and negative regulation of cell activation signaling pathways. Besides, FOXP3 also took part in the regulation of B cell activation. Overall, the results provided new insights into the interaction of key genes and their co-operators mediating RM oncogenesis and development.