m6A RNA methylation regulators expression and clinical data in TCGA BC cohort
FPKM normalized gene expressions used as the RNA-seq results of 1109 BC samples and 113 normal samples were obtained from TCGA database. To explore the expression pattern of m6A RNA methylation regulators in BC, the sequencing data of 19 m6A related genes, shown in table 1, were extracted from TCGA BC cohort. Besides, there were 908 BC patients with complete survival information enrolled from TCGA dataset. The baseline characteristics of these BC patients was shown in table 2.
Differential expression and correlation of m6A RNA methylation regulators
In total, 19 m6A related genes were extracted from TCGA BC cohort to explore the expression pattern of m6A RNA methylation regulators. There were 11 upregulated genes (ALKBH5, HNRNPA2B1, YTHDC2, YTHDF1, HNRNPC, IGF2BP1, KIAA1429, RBM15, YTHDF2, YTHDF3 and IGF2BP3) and 8 downregulated genes (FTO, IGF2BP2, ZC3H13, METTL3, METTL16, METTL14, WTAP and YTHDC1) in tumor samples compared with normal samples of BC patients shown in Figure 1A and 1B. Moreover, the correlation analysis indicated that most parts of the different m6A RNA methylation regulators exhibited weakly to moderately positive correlation and some of them represent the weakly negative correlation. Among them, METTL14 had the strongest positive correlation with YTHDC1 and had the most negative correlation with IGF2BP2 combined with YTHDC2 (Figure 1C).
Consensus clustering of m6A RNA methylation regulators identified two clusters of BC patients with clinicopathological features
We clustered the tumor samples into different groups to investigate the expression similarity of m6A RNA methylation regulators based on the clinicopathological features of BC. According to consensus cluster analysis, k = 2 seemed to be a most optimal choice with clustering stability increasing from k = 2 to 9 in the TCGA BC cohort as shown in Figure 2A, B, C and Supplementary Figure 1. The tumor samples were divided into 2 different groups which were named cluster 1 and cluster 2. To explore the association between the two clusters and clinical outcomes, the heatmap was performed for further analysis. The relationship of clinicopathological characteristics between the two clusters demonstrated significant differences for the age, pathological stage, and T stage (p < 0.05). These results indicated that the two clusters were closely related to clinical outcomes and malignancy of BC patients.
Prognostic Network of m6A RNA methylation regulators
A prognostic network was constructed to depict the comprehensive landscape of 19 m6A regulators interaction, regulator connection and their prognostic significance for BC patients (Figure 3A). The results indicated that the expression of m6A regulators was significantly correlated not only in the same functional category, but also among writers, ereasers and readers. IGF2BP2 was negatively correlated with YTHDF1, YTHDF2, METTL16, METTL14, and FTO with HNRNPA2B1 and HNRNPC, respectively. Other m6A regulators were positively correlated, among which METTL14 had the strongest positive correlation with YTHDF1 and YTHDF2. Furthermore, we conducted a univariate Cox regression in patients with BC. The results viewed that 5 m6A regulators (YTHDF3, ZC3H13, METTL16, HNRNPC and KIAA1429) can be selected as prognostic factors with p < 0.1 (Figure 3B).
GO enrichment analysis of m6A RNA methylation regulators
GO enrichment analysis indicated that m6A RNA methylation regulators were enriched in multiple BP (biological process), MF (molecular function), and CC (cellular component) terms. The most enriched top terms of BP, MF and CC are regulation of mRNA metabolic process, catalytic activity (acting on RNA) and nuclear speck, respectively (supplementary table 1). As the top 10 enriched GO terms shown in Figure 4A, 6 terms (included regulation of mRNA metabolic process, regulation of mRNA stability, regulation of RNA stability, regulation of mRNA catabolic process, mRNA catabolic process, and RNA catabolic process) were upregulated and 4 terms (included negative regulation of translation, negative regulation of translation, negative regulation of cellular amide metabolic process, and RNA destabilization) were downregulated. The description of the top 10 enriched GO terms was shown in Figure 4B.
Construction and validation of the m6A related survival prediction model
After performed univariate Cox regression analysis, LASSO analysis was used to further screen the prognosis-related m6A regulators. Totally, 3 m6A RNA methylation regulators (YTHDF3, ZC3H13 and HNRNPC) were selected to establish the prognosis RS model (Figure 5A and 5B). The heatmap of risk distribution among the 908 BC patients with distinct clinicopathologic features in entire TCGA cohort was shown in Figure 5C. Then, 908 BC patients were randomly separated to the training cohort (n=636) and validation cohort (n=272). According to the calculated median RS, BC patients were divided into the high-risk group and the low-risk group in training cohort and validation cohort (Figure 6A1 and 6A2). The mortality rate was constantly increasing with the enhanced RS (Figure 6B1 and 6B2). And with the increase of RS, the expression levels of YTHDF3 and ZC3H13 were individually elevated, while the HNRNPC expressed decreasingly (Figure 6C1 and 6C2). Subsequently, we performed OS and ROC to evaluate and validate the prognostic value of the RS model. We found that the OS of the high-risk group was significantly shorter than that of the low-risk group based on KM analysis (p < 0.001, Figure 6D1 and 6D2). The AUC values of time-dependent ROC curves were 0.61 and 0.624 in the training cohort and validation cohort, respectively (Figure 6E1 and 6E2).
Independent Prognostic Factors Identification and the Prognostic Nomogram Construction and validation
The results of univariate and multivariate Cox regression analysis showed in table 3 indicated that age, pathological stage, and RS were significantly correlated with OS and prognostic models constructed with the m6A related genes may be better predictors of OS in BC. As a result, we constructed a nomogram to predict OS based on the RS combined age and pathological stage in entire TCGA BC cohort (Figure 7A). The calibration plots showed that the performance of the nomogram was the best in predicting the 1- and 3-year OS in the training cohort (Figure 7B and 7C) and validation cohort (Figure 7D and 7E). The C-index of our nomogram reached 0.757 (95% CI:0.7-0.814) in training cohort and 0.749 (95% CI:0.648-0.85) in validation cohort, respectively. These results demonstrated that our nomogram can be applied to predict the OS rate of BC patients based on their own conditions to improve the prediction efficiency and accuracy.