Identication of a m6A RNA methylation regulators-based signature for predicting the prognosis of esophageal adenocarcinoma

N6-methyladenosine (m6A) is an abundant modication in RNAs that affects RNA metabolism, and it is reported to be closely related to cancer occurrence and metastasis. The aim of this study was to identify novel prognostic biomarkers by using m6A RNA methylation regulators capable of improving the risk-stratication criteria of survival for esophageal adenocarcinoma patients. The gene expression data of 16 m6A methylation regulators and its relevant clinical information were extracted from The Cancer Genome Atlas (TCGA) database. The expression pattern of these m6A methylation regulators was evaluated. Consensus clustering analysis was conducted to identify clusters of esophageal adenocarcinoma patients with different prognosis. Univariate, least absolute shrinkage and selection operator (LASSO), and multivariate Cox regression analysis were performed to construct multiple-gene risk signature. A survival analysis was carried out to determine the prognosis signicance.

of esophageal cancer, esophageal adenocarcinoma (EAC) and esophageal squamous-cell carcinoma (ESCC), differ greatly in terms of epidemiology and risk factors. EAC is three to four times as common in men as it is in women whereas the gender distribution is relatively equal in ESCC [4]. Although EAC accounted for only about 10% of cases of esophageal cancer worldwide, its incidence has increased rapidly and even surpassed that of ESCC in several regions in North America and Europe [8]. In addition to the well-known risk factors in EAC like symptomatic gastroesophageal re ux disease, Barrett's esophagus, obesity and tobacco using, genetic variants have been associated with risk for EAC [9]. Over the last decade, genome-wide association studies (GWAS) have identi ed about 20 genetic risk loci for EAC, indicating that single nucleotide polymorphisms (SNPs) play a pivotal role in EAC susceptibility [10][11][12][13]. Identifying novel biomarkers for predicting EAC patients' long term survival is urgently to be addressed.
Previously, epigenetic research mainly focused on DNA and histone modi cations whereas mRNA was believed to only contribute to information transmission. However, with great progress made in highthroughput sequencing technology, it is agreed that there also exist various modi cations in mRNA, such as N 1 -methyladenosine (m 1 A), N 6 -methyladenosine (m6A), 5-methylcytosine (m 5 C) and pseudouridine methylation during the process of exon splicing, 5'-capping and 3'-tailing [14][15][16][17]. Among 171 different RNA modi cations that have been identi ed by the end of 2017 [18], N 6 -methyladenosine (m 6 A) modi cation was rstly identi ed and was the most abundant internal modi cation of RNA in eukaryotic cells [19]. In addition to mRNA, the m 6 A methylation can also be detected in transfer RNA (tRNA), ribosome (rRNA), microRNA and long non-coding RNAs (lncRNAs) [20][21][22]. Similar to DNA and protein modi cation, the regulatory effects of m 6 A methylation is a dynamic and reversible process modulated by methyltransferases called "writers" (KIAA1429, METTL3, METTL14, METTL16, RBM15, WTAP and ZC3H13), binding proteins called "readers" (ALKBH5 and FTO), and demethylases called "erasers" (HNRNPA2B1, HNRNPC, YTHDC1, YTHDC2, YTHDF1, YTHDF2 and YTHDF3) [23,24]. Accumulative evidence has indicated that m 6 A methylation is closely related to the tumor development and m 6 A related regulators play different roles in different types of cancer [25][26][27]. For instance, FTO has been reported to promote cell proliferation and migration in esophageal cancer through regulation of MMP13 [28].
Highlighting these bases, we hypothesized that m 6 A RNA methylation related regulator genes play a role in EAC. However, a comprehensive analysis of the expression pattern of m 6 A RNA methylation regulators in EAC is still lacking and the prognostic value of such regulators remains to be explored.
In this study, we systematically analyzed the expression pattern of sixteen widely studied m 6 A RNA methylation regulators in EAC. Then analysis by means of bioinformatics and statistical analysis were performed in order to explore the potential value of m 6 A methylation regulators in the prognosis of patients with EAC.

Dataset acquisition
The available RNA-seq transcriptome data and clinicopathological information from 78 esophageal adenocarcinoma samples and 9 normal samples were downloaded from the TCGA data (https://portal.gdc.cancer.gov/).

Selection, differential expression analysis and correlation analysis of m 6 A methylation regulators
According to latest published review on m 6 A RNA Methylation in human cancer, sixteen m 6 A methylation regulators (METTL3, METTL14, WTAP, KIAA1429, RBM15, ZC3H13, METTL16, YTHDC1, YTHDC2, YTHDF1, YTHDF2, YTHDF3, HNRNPC, HNRNPA2B1, FTO and ALKBH5) with available expression data in the TCGA datasets. We compared the expression level of those m 6 A related genes between tumor and normal samples by t-tests with a threshold of p<0.05. We also compared the expression level of those genes in esophageal adenocarcinoma patients with different clinical characteristics. In order to demonstrate the different expression patterns of m 6 A related genes, heatmaps and violin plot were made by "pheatmap" and "vioplot" R package. After that, the co-expression correlation analysis was performed to investigate the association among those m 6 A methylation regulators by means of "corrplot" R package.

Consensus clustering analysis
To investigate the expression characteristics of m 6 A methylation regulators in esophageal adenocarcinoma, we classi ed the tumor samples into different groups by "ConsensusClusterPlus" R package. A principal component analysis (PCA) was conducted to verify the different gene expression patterns in different groups. Subsequently, the OS of patients in different groups was analyzed by "survival" R package. The different expression pattern of m 6 A related genes and clinicopathological features in different groups were visualized by "pheatmap" R package. Chi square test was performed to compare the distribution of gender and AJCC stage between the groups.

Construction of prognostic signatures
We performed univariate Cox regression analysis of the expression of m 6 A RNA methylation regulators. Genes with p<0.1 were considered associated with esophageal adenocarcinoma patients' survival and further selected for performing LASSO Cox regression analysis. After that, a ve-gene risk signature and their corresponding coe cient were determined. By multiplying the gene's expression value and its corresponding coe cient, the risk score for each patient was calculated as the sum of each gene's score. Then patients were divided into high-risk and low-risk groups based on the median value of the risk score. The Kaplan-Meier method was made to analyze the OS difference between the high-risk and low-risk groups. The Receiver operating characteristic (ROC) analysis was conducted to evaluate the prediction e ciency of the ve-gene risk signature. Heatmaps was utilized to visualize the different expression pattern of those ve genes between high-risk and low-risk groups with "pheatmap" R package.

Statistical analysis
All data were analyzed using the R statistical package (R version 4.0.0). All assessments were considered statistically signi cant when the two-sided p-value is less than 0.05.

Results
The expression pattern of m 6

A RNA methylation regulators in esophageal adenocarcinoma
We extracted the sequencing data of sixteen m 6 A related genes from the TCGA esophageal adenocarcinoma cohort in order to investigate the expression pattern of m 6 A RNA methylation regulators in esophageal adenocarcinoma. Ten out of sixteen genes (HNRNPA2B1, HNRNPC, YTHDF1, METTL3, YTHDF2, RBM15, YTHDC1, WTAP, KIAA1429 and YTHDF3) represented signi cantly higher expression level (p<0.05) in 78 EAC tissues than in 9 normal tissues. Among these ten genes, the expression of HNRNPA2B1 was in the highest level in EAC tissues, which almost doubled the expression level in normal tissues (Fig. 1b). Five genes (METTL16, ALKBH5, ZC3H13, YTHDC2 and METTL14) showed relatively high expression in tumor tissues, while FTO represented relatively high expression in normal tissues; however, no signi cant difference was seen (p>0.05) ( Fig. 1a and b).
The correlation among the m 6 A RNA methylation regulators in esophageal adenocarcinoma The correlation analysis was performed to analyze the correlation of the m 6 A RNA methylation regulators ( Fig. 2). Among the chosen sixteen regulators, none of the four genes (ALKBH5, METTL16, YTHDC2 and WTAP) had signi cant correlation with any other 15 regulators (p<0.05). KIAA1429 had the strongest correlation with YTHDF3 (r=0.67). A relatively strong correlation was observed between HNRNPA2B1 and HNRNPC (r=0.62), METTL3 and KIAA1429 (r=0.59). A negative correlation value was seen in METTL16 and WTAP (r=-0.23); however, there was no signi cant difference (p>0.05).
Consensus clustering of m 6 A RNA methylation regulators identi ed three clusters of esophageal adenocarcinoma with distinct clinical outcomes Using the ConsensusClusterPlus package, we classi ed the tumor samples into different groups according to the similarity of the expression of the 16 m 6 A RNA methylation regulators. As shown in Fig.  3a and b, it seemed to be the most appropriate to divide the EAC cohort into three groups, namely cluster 1, cluster 2 and cluster 3. The principal component analysis (PCA) displayed the distinction among the three cluster subgroups (Fig. 3c).
After that, we compared the overall survival (OS) of EAC patients among the three cluster subgroups in order to investigate the association between the clustering result and clinical outcome. The survival analysis demonstrated that the patients in cluster 3 had a signi cantly shorter OS than the patients in cluster 1 or 2 (p<0.05) (Fig. 3d). Taken together, these results indicated that the clustering result was closely related to the clinical outcome in EAC.
Construction of a ve-gene risk signature with distinct prognostic value In order to investigate the prognostic role of m 6 A RNA methylation regulators in EAC, univariate Cox regression analysis was performed to identify regulators associated with OS in TCGA EAC cohort (n=88). The results exhibited that 5 out of 16 regulators were associated with OS (p<0.1), among which ALKBH5 played a role as a protective gene with HR=0.952, whereas the other four (HNRNPA2B1, KIAA1429, WTAP and METTL16) were risky genes with HR>1 (Fig. 4). HR for these four genes were 1.023 (95% CI 1.011-1.035), 1.280 (95% CI 1.116-1.468), 1.178 (95% CI 1.046-1.326), and 1.345 (95% CI 0.958-1.887), respectively. Subsequently, these ve genes were included in the LASSO Cox regression analysis so as to construct a better predicting model of m 6 A RNA methylation regulators on the clinical outcomes of EAC.
To evaluate the prognostic role of the ve-gene risk signature, the EAC patients were divided into high-risk group and low-risk group according to the median risk score. After that, we compared the OS between the two groups. Compared with patients in low-risk group, those in high-risk group had a signi cantly shorter OS (p<0.05) (Fig. 5c). The ROC curve displayed the acceptable prediction e ciency of the prognostic signature with the AUC value of 0.681 (Fig. 5d). These results, taken together, suggested that the ve-gene risk signature could effectively screen out high-risk EAC patients with relatively worse clinical outcome.

Discussion
RNA epigenetics has become a hot topic in recent years. Among different modi cations, the m 6 A is the most abundant RNA modi cation modulated by methyltransferases, demethylases and binding proteins, which regulates almost each step of mRNA metabolism and multiple biological processes [29]. Emerging data from recent studies have indicated that m 6 A RNA methylation regulators play a crucial role in most of cancers, contributing to the self-renewal of cancer stem cell, promotion of cancer cell proliferation, and so on [30]. For instance, METTL3, a methyltransferase of m 6 A, has been reported to be signi cantly upregulated in human hepatocellular carcinoma (HCC) and the overexpression of METTL3 is associated with poor prognosis in patients with HCC [31]. WTAP, a methyltransferase of m 6 A, has been found to regulate migration and invasion of glioblastoma cells [32]. The m 6 A demethylase ALKBH5 has been reported to demonstrate a high expression in glioblastoma stem-like cells (GSCs) and to maintain tumorigenicity of GSCs by sustaining FOXM 1 expression and cell proliferation program [33]. Esophageal cancer is one of the most aggressive malignant tumors with two main histological subtypes, EAC and ESCC. Several studies have explored the association between m 6 A modi cations and ESCC [28,34,35]; however, relevant data on EAC remains rare. Given the rising role of m 6 A modi cations in multiple tumors, its exact role in esophageal cancer, especially EAC, needs to be further investigated.
In this present study, we proved that 10 out of 16 m 6 A RNA methylation regulators were highly expressed in EAC. This is consistent with the results from previous studies focusing on gastric carcinoma, lung cancer, HCC and other malignant tumors [36][37][38]. Compared with the expression level in normal tissues, the expression of three genes, HNRNPA2B1, HNRNPC, and YTHDF1, were signi cantly up-regulated in EAC tissues with p<0.001. There was a similar phenomenon in a study on melanoma, where YTHDF1 and HNRNPA2B1 affected modi cation by genes related to p53-signaling, further up-regulating their expression and facilitated their roles in inhibiting p53 to suppress tumorigenesis [39]. The m6A demethylase FTO represented a relatively higher expression in normal tissues than in EAC tissues in this study (Fig. 1b). Interestingly, FTO has been reported to be highly expressed in ESCC and other malignant tumors such as melanoma and gastric cancer, implying its distinct role in EAC [28,40,41]. In the correlation analysis, KIAA1429 was found to present the strongest correlation with YTHDF3. A previous study found that YTHDF3 could enhance the mRNA stability of Zeb1, the downstream target of KIAA1429, thus leading to cell metastasis in HCC [42]. This may contribute to the strong correlation between the two genes. Furthermore, through consensus clustering, three clusters of EAC subgroups were identi ed based on the expression pattern of m 6 A RNA methylation regulators. A signi cant difference in OS was observed among the three subgroups, which indicated that the expression pattern of m 6 A RNA methylation regulators had a close relationship to the prognosis of patients with EAC. Taken together, these results unveiled the vital role of m 6 A RNA methylation in EAC.
In our study, we performed univariate, LASSO Cox regression analyses to develop a prognostic related risk signature with 5 genes, HNRNPA2B1, KIAA1429, WTAP, METTL16 and ALKBH5, and then divided the EAC patients into high-and low-risk group. Among the 5 genes, ALKBH5 was the only protective gene for the prognosis of EAC. Nonetheless, a previous study investigated the association between ALKBH5 and prognosis in 177 patients with ESCC, and identi ed ALKBH5 as the rst demethylase that accelerated cell cycle progression and promoted cell proliferation of ESCC cells, suggesting that the up-regulation of ALKBH5 was associated with poor prognosis in ESCC [35]. Conversely, ALKBH5 has been reported to inhibit pancreatic cancer motility by demethylating lncRNA KCNK15-AS1 [43], which agrees with the result in our study to some extent. Considering the opposite effect of ALKBH5 in different types of malignant tumors, more preclinical and clinical evidences are required to nd out the exact role of ALKBH5 in EAC. HNRNPA2B1, KIAA1429, WTAP, METTL16 were identi ed as protective genes, which was consistent with the results in many studies [44][45][46][47]. At present, the prognosis assessment mainly relies on TNM stage which includes limited risk factors [2]. Establishing an effective prognosis model with molecular biology risk factors becomes an issue of increasing importance. In the present study, a ve-gene risk score model was built (Risk score = (0.145) × WTAP + (0.257) × KIAA1429 + (0.108) × METTL16 + (0.019) × HNRNPA2B1+ (-0.043) × ALKBH5) and used to stratify the survial of EAC patients into high and low risk categories, which exhibited signi cant difference in OS (p=0.0016), showing a potential value for EAC treatment.
To our best knowledge, this is the rst study to compare the expression pattern of m 6 A RNA methylation regulators in EAC and normal tissues. Our analyses revealed the predictive value of ve identi ed m 6 A RNA methylation regulators in prognosis of EAC patients and built a risky signature. However, there are some limitations in our study. First, the sample size of normal tissues is limited, which may lead to some biases during the statistical analysis. Second, further molecular biology experiments are required to investigate the mechanisms by which the ve identi ed genes in uence the progression of EAC. In addition, the prognostic e cacy of the ve-gene signature constructed in the present study still needs further veri cation in a large-scale population of EAC patients.

Conclusions
In conclusion, our study demonstrates a dysregulated expression of m 6 A RNA methylation regulators between EAC and normal controls. 10 of 16 m 6 A RNA methylation regulators represented high expression in EAC. Furthermore, a ve-gene risky signature was constructed to distinguish EAC patients with different prognosis, indicating its prognostic value as a promising molecular biomarker.