N6-Methyladenosine methylome landscape and a novel m6A-related gene risk signature associated with the prognosis of glioblastoma


 Background

N6-Methyladenosine (m6A) represents one of the most prevalent mRNA modification, strongly linked to the initiation and progression of glioblastoma (GBM). However, the global transcriptome-wide patterns of m6A in GBM are largely unknown.
Methods

RNA m6A modification patterns in GBM and normal brain tissues were described via m6A-sequencing (m6A-seq). Least absolute shrinkage and selection operator (LASSO) regression and multivariate Cox regression analyses were performed to identify genes for constructing an m6A-related gene risk signature. A nomogram was developed by integrating the risk signature with clinicopathological factors. Time-dependent receiver operating characteristic (ROC) curves were used to evaluate the performance of the prognostic model.
Results

Here, we reported transcriptome-wide m6A modification profiling in GBM and normal brain tissues for the first time, illuminating highly differences between these two groups, showing that m6A regulates large number of genes and cancer-related pathways. We demonstrated an m6A-regulated 8-gene risk signature that correlated with poor prognosis of GBM patients from the TCGA dataset, and validated its efficiencies in LeeY, Gravendeel and CGGA datasets. Furthermore, we developed a nomogram that integrated the risk signature with clinicopathological factors and validated its better performance for predicting the 1-, 3- and 5-year survival rates of GBM patients.
Conclusion

This study presents the first m6A transcriptome-wide map of human GBM and normal brain tissues, whose m6A landscape is greatly altered in GBM; In addition, an epigenetic m6A-related 8-gene risk signature and a nomogram are established, whose efficiencies have a high potential to further improve the individualised therapy in GBM patients.


Abstract
Background N 6 -Methyladenosine (m 6 A) represents one of the most prevalent mRNA modi cation, strongly linked to the initiation and progression of glioblastoma (GBM). However, the global transcriptome-wide patterns of m 6 A in GBM are largely unknown. Methods RNA m 6 A modi cation patterns in GBM and normal brain tissues were described via m 6 A-sequencing (m 6 A-seq). Least absolute shrinkage and selection operator (LASSO) regression and multivariate Cox regression analyses were performed to identify genes for constructing an m 6 A-related gene risk signature. A nomogram was developed by integrating the risk signature with clinicopathological factors. Timedependent receiver operating characteristic (ROC) curves were used to evaluate the performance of the prognostic model.

Results
Here, we reported transcriptome-wide m 6 A modi cation pro ling in GBM and normal brain tissues for the rst time, illuminating highly differences between these two groups, showing that m 6 A regulates large number of genes and cancer-related pathways. We demonstrated an m 6 A-regulated 8-gene risk signature that correlated with poor prognosis of GBM patients from the TCGA dataset, and validated its e ciencies in LeeY, Gravendeel and CGGA datasets. Furthermore, we developed a nomogram that integrated the risk signature with clinicopathological factors and validated its better performance for predicting the 1-, 3and 5-year survival rates of GBM patients.

Conclusion
This study presents the rst m 6 A transcriptome-wide map of human GBM and normal brain tissues, whose m 6 A landscape is greatly altered in GBM; In addition, an epigenetic m 6 A-related 8-gene risk signature and a nomogram are established, whose e ciencies have a high potential to further improve the individualised therapy in GBM patients. Background N 6 -Methyladenosine (m 6 A) is the most prevalent methylation modi cation of mRNAs in eukaryotes [1,2] , and it regulates almost every aspect of mRNA metabolism, including mRNA splicing [3] , translation [4,5] , stability [6] and decay [7] , regulating gene expression at the posttranscriptional level. RNA m 6 A modi cation is catalysed by the m 6 A methyltransferase complex (MTC), which is composed of methyltransferase-like 3 and 14 (METTL3 and METTL14, respectively) and their cofactors, Wilms tumour 1-associated protein (WTAP), KIAA1429 and RBM15. m 6 A is removed by m 6 A demethylases, such as FTO and ALKBH5, and detected by 'm 6 A readers', including YTHDF1/2/3, YTHDC1/2, IGF2BP1/2/3 and others, which interact with distinct subsets of m 6 A sites to mediate different effects on mRNA metabolism, keeping m 6 A modi cation in a dynamic balance [8] . An increasing number of validation studies show that dysregulation of the m 6 A modi cation and its associated enzymes also lead to the initiation and progression of tumours [9] . With the progress of high-throughput technologies, two independent studies reported the m 6 A RNA methylomes in mammalian genomes for the rst time by the m 6 A-RNA immunoprecipitation approach followed by high-throughput m 6 A-sequencing (m 6 A-seq) in 2012 [1,10] . These studies provided a method to elucidate the m 6 A modi cation pattern on a transcriptome-wide level and help to get further insight into the positions and roles of m 6 A in mRNA metabolism. However, the global transcriptome-wide patterns of m 6 A in most kinds of cancers, including glioblastoma (GBM), are largely unknown.
GBM is an intractable central nervous system tumour without effective therapy strategies. While the study of m 6 A in GBM is in its early stages, studies in this eld have drawn controversial conclusions.
Researches collected so far clearly demonstrate that the levels of mRNA m 6 A levels impacts multiple aspects of GBM [11][12][13][14][15] , including proliferation, glioma stem cell (GSC) self-renewal and tumorigenesis, prompting us that the mRNA m 6 A modi cation may be a promising target for GBM therapy. However, there is no comprehensive study on the expression of m6A RNA methylation patterns in GBM tissues.
Here, we report transcriptome-wide m 6 A pro ling in GBM and normal brain tissues via m 6 A-seq for the rst time, illuminating the highly diverse m 6 A modi cation differences between these two groups.
Through a systematic analysis, we found that abnormal m 6 A RNA modi cations in GBM regulate the expression of a large number of genes and cancer-related pathways. We demonstrated an m 6 A-regulated 8-gene risk signature with poor prognosis of GBM patients from the Cancer Genome Atlas (TCGA) dataset and validated the 8-gene risk signature in the LeeY, Gravendeel and Chinese Glioma Genome Atlas (CGGA) datasets. Furthermore, we developed a nomogram that integrated the m 6 A-related 8-gene risk signature with clinicopathological factors and validated its superior performance in predicting prognosis of GBM patients. We hope this study will facilitate further investigations of the potential roles of the m 6 A modi cation in GBM pathogenesis.

Results
Overview of N 6 -methyladenosine methylation within mRNAs in GBM and normal brain tissues To understand the role of the m 6 A modi cation in GBM, three human GBM tumour tissues versus three normal brain tissues from patients were selected for transcriptome m6A-sEq. A total of 15804 m 6 A peaks were identi ed by model-based analysis of MACS2 software in the tumour group, representing transcripts of 9228 genes (Table 1); in the normal group, 12356 m 6 A peaks were identi ed, which corresponded with the transcripts of 8320 genes ( Table 2). The protein-coding RNAs (mRNAs) constituted the majority of the m 6 A-modi ed RNAs (over 75% of all m 6 A-modi ed genes in the normal and GBM groups); other RNA types were also present, such as lncRNAs, pseudogenes and others (Fig. 1A). The validity and stringency of our m 6 A-seq data were con rmed by an unbiased search for motifs enriched m 6 A peak region, and the previously reported top m 6 A consensus motif 'GGAC' was identi ed in both tissue groups (Fig. 1B), similar to previous studies [16,17] . Further, we found that 68.6% of the m 6 A-methylated genes in the normal group (57.1% of the methylated genes in the GBM group) contained only one peak, while a relatively small number of genes contained two or more peaks (Fig. 1C), consistent with the proportion trend previously reported in mouse brains [2] . Current studies have found that the degree and pattern of methylation of mRNAs can affect their splicing, translation, decay and other RNA metabolism, regulating protein expression post-transcriptionally. To this end, we started systematically examining the distribution landmarks of m 6 A peaks along mRNAs. We found that there are m 6 A peaks throughout the genes but with a obvious enrichment around the stop codons, also extended to the 3' UTR region (Fig. 1D). To further analyse the distribution pro les of m 6 A peaks within mRNAs, peaks were categorized into ve transcript segments-the 5' UTR, TSS (100 nucleotides centred on the start codon), coding sequence (CDS), stop codon segment (100 nucleotides centred on the stop codon) and 3' UTR ( Fig. 1E, pie chart)and then normalized by the relative fraction each segment covered in the transcriptome (Fig. 1E, histogram). Our results showed that the m 6 A site most often occurred in the 3' UTR, with some near the stop codons, consistent with previous studies that reported mRNA m 6 A-seq data [1,2] ; the stop codon segment stood out as most enriched peak of the m 6 A peaks. From these results, we concluded that the m 6 A modi cation was particularly harboured around the stop codon, extending into the 3' UTR region in GBM tissues. Functional analysis of speci c genes associated with m 6 A peaks in normal brain tissue and GBM tissue RNA m 6 A-seq analysis of RNA derived from brain tissues revealed that, in the normal group, there were 9903 overlapping m 6 A peaks within 7565 mRNA transcripts, representing transcripts of 6565 genes. In the GBM group, there were 13033 overlapping m 6 A peaks within 9209 mRNA transcripts, corresponding to 7525 genes. The differences and overlaps in m 6 A genes between the individuals are shown by the Venn diagram in Fig. S1A. Among them, 4257 m 6 A-modi ed genes were detected within both groups.
Compared with the normal group, the tumour group had 3268 new genes that appeared with the disappearance of 2308 genes, indicating a signi cant difference in global m 6 A modi cation patterns between the tumour and normal groups. The Metascape database analysis [18] revealed that speci c genes with m 6 A in the normal group were signi cantly linked to the transmission of neural signals, such as the regulation of ion transport, synaptic signalling, signal release and potassium channels, as well as system development, such as neuronal system, regulation of system process, embryonic morphogenesis and brain development, and these genes interacted with each other signi cantly (Fig. S1B). The analysis of tumour-speci c genes indicated that they were signi cantly enriched in the metabolism of RNA and ribonucleoprotein complex biogenesis, playing key roles in regulation of the protein expression levels, as well as some terms related to cell growth, such as cell cycle, DNA repair and others (Fig. S1C).

RNA m 6 A methylation changes between GBM and normal brain tissues
To uncover the biological implications of m 6 A methylation in GBM, the global abundance of the m 6 A peaks between GBM and normal brain tissues were compared. In total, 9315 differentially methylated peaks were selected to study further. A total of 3130 hypermethylated m 6 A sites, corresponding to 2349 protein-coding genes, were found in the GBM group compared with the normal group, and 6185 hypomethylated m 6 A sites, representing 4472 genes, were discovered ( Fig. 2A, fold change ≥ 1.2, P ≤ 0.05). Consistent with the global distribution of m 6 A sites in GBM and normal brain tissues, the differentially methylated m 6 A sites were also preferentially distributed in the CDS and 3' UTR, while the hypomethylated m 6 A sites preferentially occurred in the 5' UTR, compared with normal brain tissues ( Fig. 2B). The genes containing signi cantly hypermethylated m 6 A peaks were analysed by the Metascape database, and they were signi cantly enriched in the cell cycle, cell division, DNA replication, DNA repair, mitotic sister chromatid segregation, RNA splicing, and system development, with the corresponding signalling pathways interacting with each other to form a protein-protein interaction network (Fig. 2C). In addition, the Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis revealed that hypermethylated m 6 A sites represented genes in the GBM group that were signi cantly enriched in many pathways involved in cancer pathogenesis, including miRNAs in cancer, cell cycle, and the Hippo, p53, Wnt, TGF-β and Notch signalling pathways (Fig. S2A). Further, genes containing signi cantly downregulated m 6 A peaks were analysed by performing Gene Ontology (GO) and KEGG pathway analyses. Similar to the normal-unique m 6 A-modi ed genes, these genes were also mainly involved in the transmission of signals and neuronal system development ( Fig. S2B-C). These results revealed that abnormally m 6 A-modi ed genes, especially hypermethylated m 6 A genes, were enriched in cancer-related signalling pathways in GBM tissues. Conjoint analysis of m 6 A-seq and RNA-seq data of GBM and normal brain tissue samples TCGA RNA-seq data were used to detect the differentially expressed genes between 154 primary GBM and normal brain tissues. Compared with normal samples, 6397 genes were differentially expressed in primary GBM tissues (fold change ≥ 2 and padj ≤ 0.05), including 3794 downregulated genes and 2605 upregulated genes ( Fig. S3A; Table 3). The differentially expressed genes were selected for Metascape analysis. Similar to the m 6 A-modi ed differentially expressed gene enrichment terms, abnormal upregulated genes in GBM samples were signi cantly enriched in biological processes involving the cell cycle, cell division and some other proliferation-related terms (Fig. S3B), with downregulated genes enriched in the transmission of signal and neuronal system development (Fig. S3C).
By conjoint analysis of m 6 A-seq and TCGA RNA-seq data, we discovered a positive correlation of differentially methylated m 6 A peaks and gene expression levels in GBM and normal tissues. All genes were mainly divided into four groups, including 957 hypermethylated as well as upregulated genes ('hyper-up'), 1975 hypomethylated as well as downregulated genes ('hypo-down'), 312 hypermethylated but downregulated genes ('hyper-down') and 173 hypomethylated but upregulated genes ('hypo-up'). The number of hyper-up and hypo-down genes were greater than the number of hyper-down and hypo-up genes ( Fig. 2D), indicating that m 6 A modi cation level of these genes tend to have a positive correlation with their corresponding mRNA expression in GBM. Furthermore, a large cohort of transcripts contained m 6 A peaks that were not regulated at the mRNA level, emphasizing that other RNA metabolic processing steps of these genes, such as transport or translation, may be ne tuned by m 6 A. Since hyper-up and hypo-down genes covered the vast majority of the number of differentially regulated genes, both in m 6 A modi cation and expression levels, we focused on these genes in the following work. The two parts of the RNA transcripts were divided into subgroups according to their m 6 A modi cation sites. We found that the m 6 A sites of the regulated genes were also mostly in the 3' UTR, followed by CDS, in accordance with the global m 6 A distribution. We found no signi cant correlations between the location of m 6 A sites and their corresponding transcript expression level (Fig. 2E), emphasizing the complexity of m 6 A functionality. Further, Metascape database analysis revealed that these hyper-up genes were mainly enriched in the regulation of cell biological functions, including cell division, cell cycle phase transition, extracellular structure organization, cell cycle checkpoints, chromosome segregation, regulation of cell adhesion and others. Additionally, these genes and enriched terms were of great importance to the protein-protein interaction network clusters (Fig. 2F). KEGG pathway analysis similarly showed that these terms were linked to cell proliferation regulation pathways, such as the cell cycle and DNA replication, as well as classical signalling pathways involved in cancer development, including p53 signalling pathways, PI3K-Akt signalling pathways and microRNAs in cancer and others (Fig. S4A). Similar to the function of the normal-speci c genes, the hypo-down genes were also mainly enriched in the transmission of neural signals and system development, and the results of the enriched terms and network of enriched terms are shown in Fig. S4B. These results suggested that the hyper-up m 6 A genes were closely involved in cancerrelated signalling pathways in GBM tissues.
Weighted gene co-expression network analysis (WGCNA) and key module identi cation To determine the phenotypic relevance of the hyper-up genes, we identi ed gene modules (groups of highly interconnected genes), highly associated with clinicopathological features of GBM via WGCNA method [19] . A hierarchical clustering tree, in which each leaf on the tree represented a single gene and genes with similar expression data were close linked and formed a branch of the tree, on behalf of a gene module, was constructed using dynamic hybrid cutting and merged dynamic cutting. Five modules were generated ( Fig. 3A-B). We were interested speci cally in the oncogenes inducing GBM tumorigenesis, and among the ve modules, the turquoise module was the most highly correlated with GBM tumorigenesis (r = 0.34, p = 2e-05), and the brown module was moderately correlated with carcinogenesis (r = 0.3, p = 1e-4).
The correlation between the other modules and GBM was less than 0.3 (Fig. 3C). Furthermore, only genes in the turquoise module had a greater signi cance with the corresponding module (Fig. 3D, correlation r = 0.43, p = 9.2e-13) and were considered to be more connected with disease progression, and the other modules did not have a signi cant correlation with their corresponding genes (Fig. S5). Thus, the turquoise module was identi ed as a hub module to explore in the following work. The highly correlated genes in the turquoise module were investigated as potential key genes associated with GBM tumours. Based on the cut-off standard (module membership > 0.8), a total of 58 genes that were highly correlated to the turquoise module were de ned as candidate hub genes (Fig. 3D). Next, 44 genes included in both the turquoise module and TCGA HG-UG133A platform dataset were analysed through the GO database. GO biological process analysis showed that almost all of them could be involved in cell cycle processing, and some of them played a key role in cell division and nuclear division. GO cellular component analysis showed that the cell cycle-regulating genes were mostly in the nucleus, with some of them located in chromatin. GO molecular function analysis revealed that some of the proteins coded by these genes can bind to ATP and enzymes, suggesting that they could participate in cell signal transduction and metabolism. We visualized the genes enriched in GO database annotation terms via circos string graphics, ranked by logFC in descending order (Fig. S6). These results indicated that the 44 genes played an important role in regulating GBM progression.
LASSO Cox regression identifying an m 6 A-related 8-gene risk signature The 44 genes from the above analysis were used to construct a gene expression pro le with a total of 522 primary GBM patients in the TCGA HG-UG133A platform, and LASSO regression analysis was performed on these genes to prevent over tting problems in the risk signature. A total of 8 m 6 A-related genes-aurora kinase B (AURKB), cell division cycle 25C (CDC25C), centromere protein E (CENPE), epithelial cell transforming 2 (ECT2), forkhead box M1 (FOXM1), kinesin family member C1 (KIFC1), minichromosome maintenance complex component 10 (MCM10) and thyroid hormone receptor interactor 13 (TRIP13)-were identi ed according to the optimal lambda value ( Fig. 3E-F). As described in the methods section, the risk signature was established based on their expression levels, and the regression coe cient parameter were calculated via the multivariate Cox regression model (formula in additional le 2 in the Supplementary methods section) was used to predict the risk score of patients, dividing genes into a high-risk group (n = 261) and a low-risk group (n = 261) according to the median cutoff value of the scores (Fig. 4A). The survival status is shown in Fig. 4B in ascending risk score order. The Kaplan-Meier curve analysis showed that the overall survival of patients in the low-risk group is signi cantly better than those in the high-risk group in quartile cut-off values (Fig. 4C, HR: 2.129 [1.597-2.839], P 0.0001). Further, ROC curves were performed to compare the e ciencies of the prognostic risk model. ROC curves showed that the areas under the curve (AUCs) of the risk signature for predicting the 1-, 3-and 5-year survival rates were 0.628, 0.763 and 0.844 (Fig. 4D), respectively.
To study the relationship between the risk signature and clinical factors, we compared the risk scores strati ed by molecular subtypes, IDH1 status, MGMT promoter status and G-CIMP status in the TCGA GBM cohorts, separately. As shown in Fig. S7A, the risk scores were lower in patients with a mutation in IDH1 than in those with wild-type IDH1. The risk scores of samples with MGMT promoter methylation were lower than those of patients with an unmethylated MGMT promoter. For G-CIMP status, the risk scores were lower in patients with G-CIMP than in those without G-CIMP. The risk scores in the proneural subtype were obviously lower than those in the classical and mesenchymal subtypes. Older patients (> 55 years) had higher risk scores than younger patients (≤ 55 years). However, there were no differences in risk scores between males and females. Subsequently, we also evaluated the prognostic e ciency of the risk signature in different cohorts strati ed by the MGMT promoter status, IDH1 status, G-CIMP status and molecular subtypes. As shown in Fig. S7B, Kaplan-Meier curve analysis showed that patients in high-risk groups with these clinical characteristics had more adverse outcomes than those in the low-risk group without these clinical characteristics according to the median or quartile group cut-off value.
AURKB, CDC25C, CENPE, ECT2, FOXM1, KIFC1, MCM10 and TRIP13 were the 8 m 6 A-related genes in our LASSO Cox model. Then, we analysed the protein interaction relationship via the STRING database and found that they can interact with each other. Further, GO analysis showed that all of these genes could be involved in mitotic cell cycle processing, and KIFC1, AURKB, CENPE, CDC25C and ECT2 played a key role in cell division. GO cellular component analysis showed that all of them located in the nucleus, with KIFC1, AURKB, CENPE, CDC25C, FOXM1 and ECT2 located in the spindle. GO molecular function analysis showed that some of them could bind with a kinase or ATP (Fig. 4E). The Gene Expression Pro ling Interactive Analysis (GEPIA) database analysis showed that all 8 genes had a higher expression level in GBM tumour tissues than in normal tissues in the GETx dataset (Fig. 4F). The m 6 A characteristics of the 8 genes are shown in Table S1, with some of them having two or more modi cation sites, and the gene plot of the methylation information of these genes is shown in Fig. S8. And the core component of the m 6 A methylase complex, METTL3, has a signi cant positive correlation with these 8 genes in TCGA GBM cohort (Fig. 4G). MeRIP qPCR data also showed that the 8 genes were highly m 6 A modi ed (Fig. 4H). Further, compared to the NC group, the expression level of these 8 genes decreased signi cantly after knocking down METTL3 (Fig. 4I). These results indicated that the 8 genes showed high e ciency in distinguishing tumours from normal tissues in terms of gene expression level and m 6 A methylation level, predicting unfavourable prognosis in GBM patients.
Validating the e ciencies of the 8-gene m 6 A-related epigenetic signature in the three datasets To validate the e ciencies of the 8-gene m 6 A-related epigenetic signature, we collected a total of 191 GBM samples from the LeeY microarray dataset, 155 GBM samples from the Gravendeel microarray dataset and 133 GBM samples from the CGGA RNA-seq dataset as three validation datasets to evaluate the e ciency of the risk signature; the clinicopathological characteristics of the validation patient cohorts are shown in Table S2. Using the same methods as for the TCGA discovery cohort, the risk scores of patients were calculated on the basis of the expression levels of the eight genes in the three validation cohorts and were then divided into a high-risk group and a low-risk group based on the median cut-off value of the risk scores (Fig. 5A), with the survival status in an ascending order of risk score shown in Fig. 5B. In agreement with the initial discovery of the TCGA cohort ndings, the Kaplan-Meier survival curves showed that patients in high-risk group had poorer prognosis than those in low-risk group (Fig. 5C Fig. S9A-B. All 8 genes showed relatively upregulated expression in the high-risk group compared with that in the low-risk group. The high-risk group had more elderly patients and classical and mesenchymal subtype patients, whereas samples with G-CIMP status were almost all in the low-risk group in the LeeY and Gravendeel datasets, with complete clinical information (Fig. S9B). Concomitantly, the Kaplan-Meier curve analysis results indicated that all 8 genes could be independent adverse prognostic factors in the CGGA and LeeY datasets (Fig. S10). These results demonstrated the robustness of our m 6 A-related 8-gene risk signature in predicting unfavourable prognosis in GBM patients.

Functional enrichment analysis of the risk signature
To explore the functional enrichment pathways related to the risk signature, 532 TCGA GBM patients were divided into the high-risk group (top quartile, n = 130) and low-risk group (bottom quartile, n = 130) on the basis of the quartile risk score. Then differential expression analysis was conducted, and compared with the genes in the low-risk group, a total of 4357 genes were differentially regulated in the high-risk group, out of which 4205 genes were upregulated and only 152 genes were downregulated (Fig. S11A-B, fold change ≥ 1.2 and Padj ≤ 0.05, Table 4). The MSigDb hallmark pathway analysis of the upregulated genes showed that signs of cancer-related hallmarks, e.g., epithelial-mesenchymal transition, glycolysis, mTOR signalling, apoptosis, E2F targets, G2M checkpoints and others, were signi cantly enriched (Fig.  S11C); the detailed information is shown in Table 5. To further understand the function of these high-riskrelated upregulated genes, KEGG pathway enrichment analysis was performed. As shown in Fig. S11D, cancer-related pathways, such as DNA replication, apoptosis, and cell cycle pathways, could be enriched signi cantly; the detailed information of all enriched clusters is shown in Table 6. Furthermore, Gene Set Enrichment Analysis (GSEA) suggested that the malignant hallmarks of tumours, including the apoptotic signalling pathway, cell cycle and pathway in cancer, were signi cantly associated with the high-risk score group (Fig. S11E). All these results suggested that the risk signature was closely related to the malignancy of GBM.
Construction of a nomogram for predicting the 1-, 3-and 5year survival rates of GBM To better apply the m 6 A-related 8-gene risk signature, we collected 334 GBM patients in the TCGA HG-UG133A platform with detailed clinicopathological parameters, including IDH1 status, G-CIMP status, age and therapy history (Table S3). Then, we randomly divided the 334 samples into a training cohort and a validation cohort. The training cohort was used to construct a nomogram to predict the survival rate of GBM patients, and the validation cohort was used to further evaluate the effectiveness of the nomogram. First, we performed univariate and multivariate Cox regression analyses in the training cohort, both of which indicated that the risk signature was an independent risk factor for patients with GBM (Fig. 6A, P < 0.001). IDH1 mutant status was a signi cant protective factor for GBM in the univariate analyses, while the multivariate analyses showed that IDH1 status was not a signi cant factor. Subsequently, a nomogram integrating the ve factors, without the IDH1 status factor, was constructed for predicting the 1-, 3-and 5-year survival rates of GBM patients. In the nomogram, the 1-, 3-and 5-year survival rates were estimated by combining the total points for each factor (Fig. 6B). Furthermore, this new combination signature also demonstrated that the m 6 A-related 8-gene risk signature was a signi cantly important diagnostic indicator compared to other classic clinicopathological features, including age, sex, G-CIMP status, and treatment. ROC curves were then used to evaluate the reliability of the nomogram. The AUCs for predicting the 1-, 3-and 5-year survival rates were 0.707, 0.904 and 0.966, respectively, in the training cohort and 0.742, 0.839 and 0.918, respectively, in the validation cohort (Fig. 6C). These results indicated that the nomogram demonstrated good accuracy in predicting survival rates in patients with GBM.

Discussion
GBM is an intractable central nervous system tumour without effective therapy strategies. As the most abundant internal modi cation in eukaryotic mRNAs, the m 6 A modi cation was reported to play key roles in various RNA metabolism pathways involved in the initiation and progression of cancers [9] .
Nevertheless, to date, the transcriptome-wide distributions of the m 6 A modi cation in GBM tumour tissues and normal brain tissues, as well as the speci c role of abnormal m 6 A modi cations in GBM pathogenesis, are largely unclear. In this study, we developed a global m 6 A transcriptome roadmap of GBM tissues versus normal brain tissues for the rst time, illustrating that the gene expression levels and cancer-related pathways were modulated by abnormal m 6 A RNA modi cations, and constructed a predictive m 6 A-related 8-gene risk signature that can better stratify and predict clinical outcomes of GBM patients. The nomogram further improves the predictive power and accuracy of the model.
We analysed the m 6 A-seq data and determined that the m 6 A modi cation pattern in GBM samples was distinct from that of normal brain controls, with 3268 new genes in the tumour group and 2308 genes that disappeared, indicating signi cant variability in global m 6 A modi cations between the tumour and normal groups (Fig. S1). Analysing the differentially methylated genes revealed that cancer-related biological processes and pathways were signi cantly enriched, indicating the relationship between abnormal m 6 A modi cations and GBM pathogenesis. Furthermore, the combined analysis of our m 6 Aseq and TCGA mRNA-seq data revealed 957 genes in the GBM group, which had hypermethylated m 6 A peaks along with signi cant upregulation changes in mRNA abundance compared with the normal brain group (Fig. 2D). These genes play a crucial role in the progression of GBM and are worth further investigation. Then, WGCNA, LASSO regression and multivariate Cox regression analyses were used to identify genes to construct a risk signature; a total of 8 m 6 A-related genes (AURKB, CDC25C, CENPE, ECT2, FOXM1, KIFC1, MCM10 and TRIP13), which were associated with GBM survival, were identi ed. A large number of studies suggest that AURKB [20] , CDC25C [21,22] , CENPE [23] , ECT2 [24] , FOXM1 [25][26][27] , KIFC1 [28] , MCM10 [29,30] , and TRIP13 [31,32] are overexpressed during cancer progression, can promote cell proliferation, and contribute to tumour formation, including the formation of GBM, consistent with our ndings that these genes can be independent risk factors for GBM survival (Fig. S10). Notably, the m 6 A demethylase ALKBH5 can demethylate FOXM1 nascent transcripts, leading to enhanced FOXM1 expression and to the malignant development of GBM [27] . ECT2, which belongs to the YTHDF class of YTH proteins and is characterized by several groups, can serve as an m 6 A reader to regulate RNA metabolism in plants [10,33] ; therefore, further exploration of these genes is worthy of future research.
Additionally, some of these genes exhibit a certain mutual regulation relationship [34,35] , rea rming the database analysis results (Fig. 4E). Further functional studies may help to clarify the molecular mechanisms of the abovementioned genes in the development of GBM. Moreover, the expression of these genes can be positively regulated by the m 6 A modi cation in high-throughput sequencing data, and our gene-speci c m 6 A qPCR assays also con rmed that all 8 genes can be highly modi ed by m 6 A. So far, enzymes or factors that catalyse, recognize, and remove m 6 A have been identi ed, illustrating the functional importance of the underlying molecular mechanisms of the m 6 A modi cation machinery.
Although, we showed that METTL3 could regulate the expression in a positive manner both in the database and our experiments (Fig. 4G-I), it also requires further veri cation in the future to clarify the speci c regulation mechanism. These ndings can aid in the search for therapeutic interventions against the dysregulated m 6 A machinery to treat cancers. In the present study, we only analysed the genes with mRNA expression level changes. It was previously reported that the m 6 A modi cation could also in uence the localization, transportation and translation of target mRNAs that depend on the recognition of different kinds of 'readers' [36] . Therefore, functional and mechanistic experiments are needed to further con rm the regulatory roles of m 6 A RNA modi cations on the other genes without expression changes in GBM, which may also provide theoretical guidance for the future treatment of GBM.

Conclusion
In conclusion, this study presents the rst m 6 A transcriptome-wide map of human GBM and normal brain tissues, whose m 6 A landscape is greatly altered in GBM, and an epigenetic 8 m 6 A-related gene signature and a nomogram are established, which can be used as independent prognostic markers and accurate clinicopathological parameter predictors. The comprehensive evaluation of individual GBM m 6 A modi cation pattern will contribute to enhancing our understanding of the tumor initiation and malignant progression, prompting us that modulation of epi-transcriptomic processes such as m 6 A methylation might be an interesting target for GBM therapeutic interventions.

Patients and specimens
Human GBM tissues and normal brain tissues (the cortex of decompressive surgery patients with brain trauma or hypertensive intracerebral haemorrhage) were obtained from patients admitted to Qilu Hospital from November 2017 to December 2019. All participants provided written informed consent, and the research was approved by the Ethical Committee on Scienti c Research of Shandong University Qilu Hospital (approval number: KYLL-2018-324).

Data acquisition
The TCGA GBM RNA-seq transcriptome data and corresponding clinicopathological information of GBM patients were obtained from the TCGA database (http://cancergenome.nih.gov/). The TCGA HG-UG133A platform microarray data were obtained from https://xena.ucsc.edu/public, and corresponding clinicopathological parameters of GBM patients were obtained from the Gliovis database (http://gliovis.bioinfo.cnio.es/). The CGGA GBM RNA-seq transcriptome data and corresponding clinicopathological parameters of GBM patients were obtained from the CGGA database (http://www.cgga.org.cn/). The LeeY dataset and Gravendeel dataset microarray data and corresponding clinicopathological parameters of GBM patients were obtained from the Gliovis database (http://gliovis.bioinfo.cnio.es/). The m 6 A-seq sequencing data have been deposited in SRA PRJNA661159 (the data are being processed; submission ID: SUB8069560, to be released when the paper is published). The processed data and basic association analyses will be made available in supplementary data or from the corresponding author on reasonable request.

Statistical analysis
Univariate and multivariate Cox regression models, least absolute shrinkage and selection operator (LASSO) regression, and receiver operating characteristic (ROC) curve analysis were performed using RStudio (version 3.5.2). Kaplan-Meier survival curves were drawn using GraphPad Prism 7.04, and signi cant differences were compared by the log-rank (Mantel-Cox) test between two groups. Quantitative data are presented as the mean ± SEM. Student's t-test was used for two-group comparisons, and oneway analysis of variance (ANOVA) was used for three-group comparisons. P ≤ 0.05 was considered statistically signi cant. (*P ≤ 0.05, **P ≤ 0.01, ***P ≤ 0.001, ****P ≤ 0.0001). All data processing with R packages was performed using R Studio (version 3.   Global m6A modi cation and expression changes in GBM tissues compared with normal brain tissues. (A) Count of differentially m6A-modi ed peaks and corresponding genes in the normal and GBM tumour brain; (B) The coverage rate of m6A peak distribution of hyper-and hypo-methylated genes; (C) Bar graph of enriched terms across upregulated genes, coloured by p-values, and corresponding network of enriched terms; (D) Dot plot of Log2FC (mRNA expression) against Log2FC (differential m6A methylation) showing a positive correlation between overall m6A methylation and mRNA expression level (Pearson r=0.5012; p 0.0001) and distribution of genes with a signi cant change in both m6A (FC≥1.2, p≤0.05) and mRNA levels in GBM samples compared with normal brain tissues (FC≥2, FDR≤0.05); (E) The coverage rate of m6A peak distribution of hyper-up and hypo-down genes; (F) Bar graph of enriched terms across hyper-up genes, coloured by p-values, and corresponding network of enriched terms.   GETx normal brain tissues (n=207) via the GEPIA database; (G) The correlation of METTL3 with the 8 m6A-related genes in TCGA GBM samples; (H) MeRIP-qPCR assays showed that the 8 genes could be modi ed by m6A. Data represent mean ± SEM from three independent experiments. P ≤ 0.05 was considered statistically signi cant (*P ≤ 0.05; **P ≤ 0.01; ***P ≤ 0.001, ****P ≤ 0.0001); (I) QPCR assays showed that all the 8 genes could be positively regulated by m6A methyltransferase METTL3. Data represent mean ± SEM from three independent experiments. P ≤ 0.05 was considered statistically signi cant (*P ≤ 0.05; **P ≤ 0.01; ***P ≤ 0.001, ****P ≤ 0.0001).