The aberrant expression of several m6A regulators in chRCC tissue
Hematoxylin–eosin (HE) staining indicated that chRCC tissues composed of large vegetable-like polygonal cells with eosinophilic cytoplasm, irregular nuclei, perinuclear clear halo, and prominent cell membrane (Fig. 1b). In order to determine whether m6A modification functions in chRCC, we first analyzed the expression levels of 8 m6A regulators, including 3 key writer subunits, 3 readers, and 2 erasers (m6A writer subunits: METTLE14, METTL3, and WTAP; m6A readers: YTHDF1, YTHDF2, and YTHDF3; m6A erasers: ALKBH5 and FTO) in six patients. The qPCR results showed that the expression levels of WTAP, YTHDF2, FTO, and ALKBH5 were downregulated markedly in chRCC tissues compared with corresponding tumor-adjacent normal tissues (termed normal tissues) (Fig. 1c). The aberrant expression of these m6A regulators in chRCC indicated that m6A modification might play a crucial role in the progression of chRCC.
Overview of m6A methylation feature in normal and chRCC tissues
To investigate whether m6A methylation landscape changes between the normal and chRCC tissues, we performed m6A-SEAL-seq [41] using chRCC tissues and normal tissues from three subjects. Approximately 87.1-13.2 million reads were generated from each library and 82.8-12.8 million reads were mapped to GRCh38 genome (Additional file 1: Dataset S1). m6A peaks were called in each sample using the published m6A peak caller MACS2 algorithm [44] (fold enrichment (IP/input) ³ 2 and false discovery rate (FDR) £ 0.05). The m6A peaks identified in all three replicates were classified as “confident m6A peaks”. We identified 15,024 confident m6A peaks corresponding to 11,396 transcripts/genes in normal tissues, and 12,841 confident m6A peaks corresponding to 10,102 transcripts/genes in chRCC tissue (Fig. 2a). To evaluate the reliability and performance of m6A-SEAL-seq, we compared our confident m6A peaks identified in normal tissues with the published m6A peaks identified in normal kidney tissues by MeRIP-seq (GSE122744) [48]. 5686 out of 7261 (78.3%) of the m6A peaks in MeRIP-seq were overlapped with our identified m6A peaks in m6A-SEAL-seq (Additional file 2: Fig. S1a). We then plotted the distribution of distance between m6A peaks from m6A-SEAL-seq and MeRIP-seq, and found that our confident m6A peaks were highly enriched around MeRIP-seq peaks (Additional file 2: Fig. S1b). We further calculated normalized read coverages from m6A-SEAL-seq around MeRIP-seq peaks by deepTools [49]. Their co-enrichment was further shown in Supplementary Figure 1C and 1D. These results suggest that m6A-SEAL-seq is accurate and reliable.
We next investigated the m6A distribution across transcripts in normal and chRCC tissues. The metagene profiles was used to display the distribution of m6A peaks across transcripts. The results showed that confident m6A peaks in normal and chRCC tissues were both highly located within coding sequences (CDS) and 3′ untranslated region (3′UTR) (Fig. 2b), which was consistent with the previous observation [40]. To further locate confident m6A peaks, we divided the transcripts into five non-overlapping regions and assigned the confident m6A peaks into these regions. The fraction of confident m6A peaks of normal and chRCC tissues in these five regions showed that they were dominantly enriched in 3′UTR (40.46%, 40.55%), CDS (29.01%, 26.54%) and stop codon (17.32%, 18.59%) (Fig. 2c). We clustered the confident m6A peaks in HOMER (Hypergeometric Optimization of Motif Enrichment) software [45] and found the motif GGACH (H=U>A/C) in normal tissue and WRAC (W=G>C, R=G>A), RAACW (R=G>A, W=U>A) in chRCC tissue (Supplementary Figure 3), which are similar to the known m6A motif, RRACH (R=G/A, H=A/C/U).
Further we asked which RNA molecules prefer to contain m6A modification. We assigned confident m6A peaks to GRCh38 genome and found that 76.24% and 77.78% were mRNA, 18.42% and 17.16% were long non-coding RNA (lncRNA) in normal and chRCC tissues, respectively (Fig. 2d). We noticed that the confident m6A peaks number in chRCC tissues were less than that in normal tissues (Fig. 2a). We subsequently assigned m6A peaks to chromosome, genes and the five non-overlapping regions. We found that the number of confident m6A peaks in chRCC tissues were decreased globally among each chromosome except chromosome Y, which didn’t have m6A peaks (Fig. 2e). By analyzing the distribution of m6A peaks per gene, we found that most of m6A-motified mRNAs contained one or two m6A peak, while a small number of them contained three or more (Fig. 2f), consistent with previous studies such as ccRCC [40]. In each group of m6A peaks per gene, chRCC tissues always contain less gene number than normal tissues (Fig. 2f). We also counted the number of m6A peaks among the five non-overlapping regions in normal and chRCC tissues and found that chRCC tissues contain less m6A peak numbers in 3′UTR, 5′UTR, CDS, and stop codon compared with normal tissues (Fig. 2g). These results suggest that m6A modification level decreased in chRCC tissues.
Differentially methylated m6A genes (DMMGs) participate in multi-cancer related pathways
To dissect the role of m6A modification, we subsequently identified differentially methylated m6A peaks (DMMPs) between normal and chRCC tissues using the MeTDiff R package software [46] (p £ 0.05). We identified 644 hypermethylated m6A peaks representing 593 transcripts and 1,304 hypomethylated m6A peaks representing 1,137 transcripts in chRCC tissues of all three subjects compared with normal tissues (Fig. 3a). Recall that our qPCR results in six subjects of chRCC showing that m6A writer subunit WTAP and m6A erasers FTO and ALKBH5 were significantly downregulated in chRCC (Fig. 1c), the identified hypomethylated m6A and hypermethylated m6A sites in chRCC could be directly induced by the aberrant expression of WTAP and FTO/ALKBH5, respectively. The identified hyper- and hypomethylated m6A peaks in chRCC tissues were respectively regarded as hyper and hypo group. Motif search analysis using HOMER revealed one overrepensented motif GGACH (H=U>C/A) in hypermethylated m6A peaks and two highly enriched motifs GGAC and GAACU in hypomethylated m6A peaks; all these three identified motifs resemble the canonical m6A motif RRACH sequence (Additional file 2: Fig. S3).
We performed metagene profiling to examine the distribution of DMMPs within transcriptomes and found that both hyper- and hypomethylated m6A peaks were highly enriched around the stop codon (Fig. 3b). Further examination of m6A fraction in the five non-overlapping segments of transcripts revealed that the m6A peaks in both hypo and hyper groups were dominantly enriched within 3′UTR (54.6% for hypo and 56.68% for hyper), CDS (25.92% and 23.14%) and around stop codon (13.19% and 11.02%) (Fig. 3c and Additional file 2: Fig. S2a). The majority of hypo- and hypermethylated transcripts were mRNAs (80.44% for hypo and 70.61% for hyper), and ~14-18% were lncRNA and the rest were other types of RNAs (Fig. 3d and Additional file 2: Fig. S2b).
The hypomethylated m6A peak number in chRCC is 2-fold more than the hypermethylated peaks (Fig. 3a), in line with the finding that total identified m6A peaks in chRCC are less than those in normal tissues (Fig. 2a). Therefore, we firstly focused on the hypomethylated m6A peaks in chRCC. To explore the potential role of hypo-methylated m6A peaks in chRCC, we took advantage of the algorithm DAVID to examine the most preferential expression tissues of m6A hypomethylated genes. The results showed that m6A hypomethylated genes were preferentially expressed in epithelium, followed by brain, placenta, and renal cell carcinoma (RCC) (Fig. 3e), indicating the correlation between these genes and RCC. We performed Gene Ontology (GO) enrichment analysis to uncover the functions of these genes. The results revealed that m6A hypomethylated genes were enriched in many biological processes involved in kidney development and cancer pathogenesis, including transcription, androgen receptor signaling pathway, GTPase activity, and cell-cell adhesion (Fig. 3f). Pathway analysis showed that m6A hypomethylated genes were mainly enriched in cancer-related pathways (Fig. 3g). These results suggested that the hypomethylated m6A genes may participate in various pathophysiologic aspects of chRCC through different pathways.
We also explored the potential function of hypermethylated m6A peaks in chRCC. The results showed that m6A hypermethylated genes were preferentially expressed in brain, followed by epithelium, duodenum, fetal kidney, and ovary (Additional file 2: Fig. S2c). GO biological process analysis revealed that m6A hypermethylated genes were significantly associated with protein phosphorylation, positive regulation of cholesterol efflux, ubiquinone biosynthetic process, regulation of mitophagy, and so on (Additional file 2: Fig. S2d). Pathway analysis showed that m6A hypermethylated genes were mainly enriched in ubiquitin mediated proteolysis, metabolic pathways, and adherent junction (Additional file 2: Fig. S2e). Collectively, the results reveal both m6A hypomethylated and hypermethylated genes are involved in many regulatory pathways, especially hypomethylated genes directly enriched in cancer pathways, indicating that dysregulation of m6A could be a regulatory factor in the pathogenesis of chRCC.
The expression dysregulated genes in chRCC impair the normal functions of kidney
We next investigated the global mRNA expression patterns in normal and chRCC tissues by using the RNA-seq dataset (m6A-SEAL-seq input library). The results showed that a total of 3,911 mRNAs were significantly dysregulated in chRCC of three subjects compared with normal tissues, including 2,344 downregulated mRNAs and 1,567 upregulated mRNAs (fold change ≥ 2, p < 0.05) (Fig. 4a). Hierarchical clustering depicted differential expression profiles in all the samples. (Fig. 4b).
We examined the preferentially expressed tissues of the dysregulated genes (3,911) using the algorithm DAVID. The result showed that the dysregulated genes were preferentially expressed in kidney, followed by liver and plasma (Fig. 4c), suggesting our data was reliable and these genes may participate in kidney development. We further performed GO analysis and KEGG pathway analysis. GO analysis revealed that the dysregulated genes were significantly enriched in metabolic process, transmembrane transport including sodium ion transport, oxidation-reduction process, excretion, kidney development, angiogenesis, and P450 pathway (Fig. 4d). In line with the result of GO, the KEGG analysis result also revealed that the dysregulated genes were significantly associated with multiple metabolic pathways including many amino acid metabolism or degradation, fatty acid degradation, cytochrome P450-related drug metabolism and xenobiotics metabolism (Fig. 4e). These results showed that the expression levels of around four thousand genes were dysregulated in chRCC, preliminarily illustrating that the dysregulated genes in chRCC impair the normal function of kidney, especially metabolic function.
New m6A regulatory signature in the pathogenesis of chRCC
Considering that m6A can either destabilize m6A-modified transcripts through the recognition of YTHDF2 or stabilize m6A-modified transcripts through the recognition of IGF2BP [14], we investigated the correlation between m6A methylation levels and transcript levels. We overlapped the m6A hypermethylated and hypomethylated genes with the differentially expressed genes. In 593 hypermethylated genes, 44 genes were downregulated and 68 genes were upregulated in chRCC tissues (Fig. 5a left). In 1,137 hypomethylated genes, 123 genes were downregulated and 51 genes were upregulated (Fig. 5a right). We next took the m6A hypermethylated and hypomethylated genes as two groups to analyze their transcript accumulation in chRCC and normal tissues. The result showed that the m6A hypermethylated genes (ie. transcripts with higher m6A levels) tended to preferentially exhibit upregulated transcription levels in chRCC (Fig. 5b), revealing the positive correlation between m6A methylation levels and transcript levels in chRCC. Note we found that the expression of YTHDF2 transcript was downregulated in chRCC (Fig. 1c). The positive correlation between m6A levels and transcript levels in chRCC suggests m6A in chRCC tends to affect gene regulation positively, potentially stabilizing m6A-modified transcripts or escaping from YTHDF2-mediated mRNA decay pathway due to the lower expression YTHDF2 in chRCC.
To further analyze the role of DMMGs in cancers, we intersected DMMGs with Cancer Gene Census (CGC) database [50], a database consists of genes with strong indications of a role in cancer, and found that 73 and 23 genes were annotated in the hypomethylated and hypermethylated genes separately (Additional file 1: Dataset S2 and Dataset S3). Among these cancer-related genes, 10 genes were differentially expressed genes, including NOCH1 and FGFR1. NOTCH1 plays distinct roles in different cancers: NOTCH1 functions as a tumor suppressor gene in mouse skin and oral squamous cell carcinoma (OSCC) [51], and the expression of NOTCH1 is decreased significantly in these cancer tissues [52]; whereas in Glioma and colon cancer, NOTCH1 functions as an oncogene and its expression level is increased in these two cancer tissues [53-55]. According to Integrative Genomics Viewer (IGV) software, the m6A modification level in NOTCH1 transcript was decreased significantly in chRCC compared to normal tissues (Fig. 5c left) and qPCR results in six patients showed the transcript expression level of NOTCH1 was reduced significantly in chRCC tissue (Fig. 5c right). The downregulated expression of NOTCH1 in chRCC is consistent with its expression pattern in mouse skin and OSCC cancers, suggesting NOTCH1 might act as a tumor suppressor in chRCC. Fibroblast growth factor receptor 1 (FGFR1) is a known oncogene. In breast cancer, the expression level of FGFR1 shows positive correlation with the amplification of cancer cell [56-58]. In lung cancer models, activation of FGFR1 promotes proliferation and migration of tumor cell while inhibition of FGFR1 suppresses tumor growth [59]. We found the m6A methylation level of FGFR1 was significantly decreased and the transcript expression level of FGFR1 was increased in chRCC tissue (Fig. 5d right), which strongly suggests FGFR1 is an oncogene in chRCC similar to that in breast cancer and lung cancer.
We conducted principal component analysis of the differential expressed DMMGs in 65 chRCC cases from The Cancer Genome Atlas (TCGA) database. Based on the expression of these genes, we could completely distinguish chRCC samples from normal samples (Fig. 5e). Cox regression screen and least absolute shrinkage and selection operator (LASSO) identified three m6A-dependent signatures (Additional file 1: Dataset S4) and defined a m6A-dependent cox model in the training set (Additional file 2: Fig. S4a). Concordance index (C-index = 0.96) showed that the proposed model has a high prognostic power. In this model, we separated patients into high-risk or low-risk group according to their risk score, and patients with different 5-years survival could be distinguished completely between the two groups (Log-rank p < 0.0001) (Fig. 5f). AUC of ROC curve also confirmed the prognostic power of the m6A-dependent model (Additional file 2: Fig. S4b). Then the proposed model was applied to the testing set for prediction. We calculated risk score of each patient in the testing set and assigned to high-risk or low-risk group according to the cut off value in the training set. The Kaplan-Meier survival curve and log-rank between the two groups showed significant difference (Fig. 5g), which demonstrated the high predictive ability of the m6A-dependent model. Furthermore, we also test the m6A-dependent model in ccRCC samples (a cohort of 602 cases from TCGA database) and PRCC dataset (a cohort of 318 cases from TCGA database). The results indicated that the m6A-dependent model is also suitable for ccRCC, but not PRCC (Additional file 2: Fig. S4c, d).