The Cancer Genome Atlas Predicts the Prognosis of Lung Carcinoma Based on Genes Associated with m6A Methylation

Lung cancer has become a predominant cause of cancer-related death worldwide. N6-methyladenosine (m6A) is a common internal modification of mRNA that has a pivotal role in mRNA splicing, export, localization, translation, and stability. This study evaluated the expression pattern and prognostic value of m6A-related genes in lung cancer. Expression data of lung cancer samples with related clinical information were obtained from The Cancer Genome Atlas (TCGA). Then, R software was used in combination with several corresponding software packages to identify differentially expressed regulators of m6A RNA methylation. Three genes (METTL3, YTHDF1, and FTO) were overexpressed in lung cancer. High METTL3 expression was associated with a lower survival rate (P < 0.05). Significant differences in survival rate were observed among the subgroups with different expression levels of m6A regulators. Two latent predictive factors (METTL3 and KIAA1429) that met the criteria for independent predictive value were selected. m6A RNA methylation modulators may be involved in the malignant progression of lung cancer, and the risk signature based on the two selected m6A RNA methylation regulators may be a potential prognostic biomarker for guiding customized therapies in patients with lung carcinoma.


Introduction
Pulmonary carcinoma is a common malignancy with high morbidity, and its death rate ranks first in the world, with a 5-year survival rate below 20%. In recent decades, various therapeutic methods have been developed, such as surgical resection, chemotherapy, radiotherapy, and molecular targeted therapy, but the overall survival of patients with pulmonary carcinoma has not improved, probably because the early symptoms of lung cancer are hidden; thus, the majority of patients are diagnosed in the middle or terminal stages [1]. In addition, effective molecular markers for targeted therapy are lacking [2]. Therefore, the development of a reliable prognostic predictor for lung cancer patients is important for accurate individualization of treatment.
Post-transcriptional modification is a significant regulatory factor in many physiological processes and in disease development, and it has attracted increasing attention in biology. Among the many known RNA modifications, N6-methyladenosine (m6A) is the most prevalent internal modification of mRNA. m6A is involved in almost every step of RNA metabolism, including mRNA translation, degradation, splicing, export, and folding [3][4]. Dysregulation of m6A regulators results in reduced cellular proliferation, loss of self-renewal, developmental defects, and cellular death [5]. Regulators of m6A methylation are involved in the occurrence and development of various carcinomas, such as glioblastoma, hepatocellular carcinoma, and breast cancer [6-8]. Although m6A has recently become a focal point of research, understanding of this modification remains insufficient. Furthermore, few studies have examined m6A-methylation-related genes in combination, and the function of these genes in lung carcinoma remains unknown. In this study, the relation between the expression of m6A-methylation-related genes and clinical prognosis among lung carcinoma patients was analyzed. Through univariate Cox analysis and LASSO Cox regression analysis, regulators of m6A methylation were selected to construct a risk signature. Then, the role of the risk signature in lung cancer prognosis was analyzed. Finally, new prognostic biological markers and therapy targets were obtained.

Data collection
The transcriptomic information of lung cancer and the relevant clinical information were obtained from The Cancer Genome Atlas (TCGA) database [9] (https://portal.gdc.cancer.gov/). The mRNA expression information for 1037 tumor tissues and 108 healthy tissues was collected, and the clinical data of 1026 patients with pulmonary carcinoma, comprising age, gender, grading, clinical staging, and TNM staging, were collected.
Data preprocessing and differential expression analysis of m6A RNA methylation regulators
The differential expression of 15 m6A-methylation-related genes in lung carcinoma and normal control samples was assessed using R software with Bioconductor packages (Version 3.8; http://www.bioconductor.org/packages/release/bioc/html). R software was also used to plot the heatmap of the 15 genes.

Bioinformatic analysis/statistics
Pearson correlation analysis was performed to identify the correlation among genes, and an m6A methylation regulator interaction network was constructed with Cytoscape. Kaplan-Meier (KM) survival analyses were conducted with the UALCAN website (http://ualcan.path.uab.edu/index.html) to assess the impact of each gene on survival status. A P-value of 0.05 was used as the significance threshold for all tests.
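The correlation step can be illustrated with a minimal Python sketch. This is not the authors' R code: the function names `pearson_r` and `correlation_edges`, the `threshold` parameter, and the toy expression values are hypothetical, chosen only to show how pairwise Pearson coefficients could be turned into an edge list of the kind a network tool such as Cytoscape can import.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length samples."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two samples of equal length >= 2")
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sample")
    return sxy / math.sqrt(sxx * syy)


def correlation_edges(expression, threshold=0.0):
    """Return (gene_a, gene_b, r) for every gene pair with |r| >= threshold.

    `expression` maps a gene symbol to its expression values, with samples
    in the same order for every gene.
    """
    genes = sorted(expression)
    edges = []
    for i, gene_a in enumerate(genes):
        for gene_b in genes[i + 1:]:
            r = pearson_r(expression[gene_a], expression[gene_b])
            if abs(r) >= threshold:
                edges.append((gene_a, gene_b, r))
    return edges


if __name__ == "__main__":
    # Hypothetical toy values, not TCGA data.
    toy = {
        "GENE_A": [1.0, 2.0, 3.0, 4.0],
        "GENE_B": [2.0, 4.0, 6.0, 8.0],
        "GENE_C": [4.0, 3.0, 2.0, 1.0],
    }
    for gene_a, gene_b, r in correlation_edges(toy, threshold=0.5):
        print(gene_a, gene_b, round(r, 3))
```

Each returned tuple corresponds to one weighted edge; the sign of `r` distinguishes positive from negative associations.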

Consensus-clustering analysis
To determine whether the expression level of m6A RNA methylation regulators was associated with the prognosis of pulmonary cancer, the TCGA lung cancer cohort was clustered into distinct groups by consensus clustering of m6A RNA methylation regulator expression using "ConsensusClusterPlus" in R.
Differences in overall survival (OS) among the groups were tested using the KM method and the log-rank test.
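For readers unfamiliar with the two survival tools named above, the following Python sketch shows a Kaplan-Meier estimator and a two-group log-rank test in their textbook form. It is an illustration under stated assumptions, not the pipeline used in this study: the function names `kaplan_meier` and `logrank_test` are hypothetical, events are coded 1 for death and 0 for censoring, and the P-value uses the chi-square distribution with one degree of freedom.

```python
import math


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time with at least one event."""
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect every subject whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths > 0:
            survival *= 1.0 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


def logrank_test(times_a, events_a, times_b, events_b):
    """Two-group log-rank test; returns (chi-square statistic, P-value)."""
    event_times = sorted(
        {t for t, e in zip(times_a, events_a) if e}
        | {t for t, e in zip(times_b, events_b) if e}
    )
    observed_minus_expected = 0.0
    variance = 0.0
    for t in event_times:
        n_a = sum(1 for x in times_a if x >= t)
        n_b = sum(1 for x in times_b if x >= t)
        d_a = sum(1 for x, e in zip(times_a, events_a) if x == t and e)
        d_b = sum(1 for x, e in zip(times_b, events_b) if x == t and e)
        n = n_a + n_b
        d = d_a + d_b
        observed_minus_expected += d_a - d * n_a / n
        if n > 1:
            # Hypergeometric variance of the death count in group A.
            variance += d * (n_a / n) * (n_b / n) * (n - d) / (n - 1)
    if variance == 0:
        raise ValueError("log-rank statistic is undefined for these data")
    chi2 = observed_minus_expected ** 2 / variance
    # Survival function of a chi-square variable with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value
```

Calling `kaplan_meier` once per subgroup gives the step curves that are plotted, and `logrank_test` gives the P-value that is compared with the 0.05 threshold.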
Prognostic value of m6A-methylation-related genes
Univariate Cox regression analysis was used to screen out m6A regulatory factors with significant differences for further analysis. Subsequently, LASSO regression of the high-dimensional data was performed with the "glmnet" package in R to select the prognostic factors with the greatest value. The patients were divided into a high-risk group and a low-risk group according to the median expression of m6A-methylation-related genes. The KM survival method was used to analyze the relation of m6A-related genes with the survival rate, and the P value for the KM survival curve was calculated with the log-rank test. To verify the accuracy of the model, receiver operating characteristic (ROC) curves were plotted. The prognostic factors of lung cancer were then determined by univariate and multivariate Cox regression analyses.
Statistical analysis
R 3.5.2 and Cytoscape v.3.7.2 were used for all statistical analyses. Survival time was analyzed by univariate Cox regression. Hazard ratios and 95% confidence intervals (CIs) were calculated to identify genes associated with OS. KM survival analyses were performed with the log-rank test. P < 0.05 indicated statistical significance unless otherwise stated.
Results
mRNA expression pattern of m6A-related genes in pulmonary carcinoma
This study included 1037 pulmonary carcinoma samples and 108 adjacent normal lung samples from TCGA. Clinical information comprised age, gender, grading, clinical staging, and TNM staging. The expression profiles of all known m6A-related genes in lung cancer were analyzed on the basis of TCGA datasets, and a heatmap of m6A-related gene expression was constructed. As shown in Fig. 1, the expression of three genes (YTHDF1, FTO, and METTL3) was evidently increased in pulmonary carcinoma versus adjacent normal tissues. Pearson correlation analysis demonstrated weak-to-moderate positive correlations among these regulators, with YTHDF3 and KIAA1429 showing the strongest correlation. In addition, HNRNPC and HNRNPA2B1 were negatively associated, which indicated that HNRNPA2B1 expression might be downregulated when HNRNPC expression is upregulated (Fig. 2).
Survival analysis of m6A-methylation-related genes
The UALCAN online tool (http://ualcan.path.uab.edu/index.html) was used to obtain the survival information for the three genes. Patients with high expression of METTL3 showed an inferior survival status (P < 0.05, Fig. 3). By contrast, survival did not differ significantly for the remaining two genes (P > 0.05).
Based on the m6A-methylation-related gene expression levels, different subgroups of the 374 tumor samples were identified with R's ConsensusClusterPlus package. Moreover, cluster-consensus and item-consensus outcomes were calculated. As shown in Fig. 4(A-C), the output covered k = 2 to 4 subgroups. Among these, k = 2 gave the greatest stability, so all patients were classified into two subgroups (Fig. 4(A-E)).
Prognostic value of m6A-methylation-related genes and identification of prognostic features
Univariate Cox regression was applied to the 15 genes to comprehensively assess the prognostic value of m6A-methylation-related genes in pulmonary carcinoma. With P < 0.05 set as the screening condition, two candidate genes were selected (Figure 5(A)). The LASSO Cox regression model was then used to select the genes with the greatest predictive value as prognostic markers. The value of λ was selected where the median of the residual sum of squares reached its minimum. Two latent predictive factors, METTL3 and KIAA1429, were identified as prognostic factors for lung carcinoma (Figures 5(B-C)). The risk score based on the two genes was calculated for further univariate and multivariate Cox regression analyses.

Using the median expression of the two candidate genes in the combined model as the cutoff value, the patients were divided into a high-risk group and a low-risk group. The prognosis of the low-risk group was better than that of the high-risk group (Figure 5(D)).
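The stratification described above can be sketched in Python as a linear risk score followed by a median split. Several things here are assumptions rather than details from this study: the score is taken to be the sum of coefficient times expression over the selected genes (the usual form for a LASSO Cox signature), the cutoff is taken to be the median of that score, and the coefficients and expression values below are invented placeholders, because the fitted coefficients are not reported in the text. The names `risk_scores` and `split_by_median` are hypothetical.

```python
from statistics import median


def risk_scores(expression, coefficients):
    """Risk score per patient: sum of coefficient * expression over genes."""
    return {
        patient: sum(coefficients[gene] * values[gene] for gene in coefficients)
        for patient, values in expression.items()
    }


def split_by_median(scores):
    """Split patients at the median score; ties go to the low-risk group."""
    cutoff = median(scores.values())
    high = sorted(p for p, s in scores.items() if s > cutoff)
    low = sorted(p for p, s in scores.items() if s <= cutoff)
    return cutoff, high, low


if __name__ == "__main__":
    # Placeholder coefficients: NOT the values fitted in this study.
    coefficients = {"METTL3": 0.5, "KIAA1429": 0.25}
    # Placeholder expression values for four hypothetical patients.
    expression = {
        "P1": {"METTL3": 2.0, "KIAA1429": 4.0},
        "P2": {"METTL3": 4.0, "KIAA1429": 4.0},
        "P3": {"METTL3": 1.0, "KIAA1429": 2.0},
        "P4": {"METTL3": 6.0, "KIAA1429": 8.0},
    }
    cutoff, high, low = split_by_median(risk_scores(expression, coefficients))
    print("cutoff:", cutoff, "high-risk:", high, "low-risk:", low)
```

The two resulting groups are what a Kaplan-Meier comparison such as Figure 5(D) would take as input.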
Prognostic value of m6A-methylation-related genes
Based on univariate analyses, the T stage, M stage, N stage, and risk score of m6A-methylation-related genes affected patients' prognostic status (P < 0.05, Figure 6A). Multivariate regression analysis showed that age, T stage, and the risk score of m6A-methylation-related genes were independent factors for the prognostic status of pulmonary carcinoma (P < 0.05, Figure 6B).

Discussion
Lung cancer is highly aggressive and rapidly fatal. Despite remarkable technical advances over the past few decades, the 5-year survival rate of pulmonary carcinoma remains low. Therefore, identifying new therapies for pulmonary carcinoma and the factors affecting clinical prognosis is necessary. Numerous studies have shown that tumorigenesis and development can be facilitated through genomic and epigenetic changes such as DNA methylation [10]. m6A RNA methylation, the common internal modification of eukaryotic mRNAs, is important for predicting the malignant behavior and clinical prognosis of various carcinomas and for the early diagnosis and treatment of pulmonary adenocarcinoma [11][12][13].
In this study, we evaluated whether m6A-related genes could be used as new prognostic biological markers for lung carcinoma. Based on TCGA data, 15 genes were examined for their ability to forecast the prognostic status of pulmonary carcinoma: METTL3, METTL14, KIAA1429, RBM15, ZC3H13, WTAP, YTHDF1, YTHDF2, YTHDF3, YTHDC1, YTHDC2, HNRNPC, HNRNPA2B1, ALKBH5, and FTO. We also analyzed the relation of those regulators with clinicopathological characteristics and finally constructed a two-gene risk signature. These results indicated that regulators of m6A RNA methylation play a significant role in the pathogenesis of pulmonary cancer and can be used as reliable prognostic factors.
Our analysis showed that three of the 15 m6A RNA methylation regulatory genes (YTHDF1, FTO, and METTL3) were upregulated in pulmonary cancer tissues, indicating that these genes may have a major role in the development and progression of pulmonary cancer. In addition, m6A-methylation-related genes were closely related to one another in the regulatory network, suggesting that they act cooperatively in cancer development. Moreover, METTL3 may have a deleterious effect on lung cancer patients because of its association with poor survival status. The findings indicated that m6A modulators could be potential targets for pulmonary carcinoma treatment.
In the survival analyses, patients with high expression of METTL3 had significantly poorer survival, which identified METTL3 as a prognostic gene in the TCGA data. Visvanathan et al. [14] revealed that the upregulation of METTL3 was associated with inferior survival status in glioblastoma, which is consistent with our results. Next, we used R's ConsensusClusterPlus package to divide the lung cancer tissue data into two subgroups based on gene expression levels. Cluster analysis of gene expression profiles is an important research topic in the diagnosis of cancer subtypes and helps provide accurate treatment for cancer patients [15]. Principal component analysis showed that subgroup 1 and subgroup 2 were separated. Overall survival analysis revealed significantly longer survival time in subgroup 1, suggesting that survival time was related to the general expression level of m6A-methylation-related genes.
LASSO Cox analysis identified METTL3 and KIAA1429 as prognostic factors for pulmonary carcinoma. The predictive power of the m6A-methylation-related genes for the prognosis of pulmonary carcinoma was evaluated using the ROC curve, and these genes were found to play a role in the survival of lung cancer patients. METTL3 was initially identified as a methyltransferase involved in m6A modification [16]. Recent studies show that METTL3 has an important function in a variety of cancer types, either dependent on or independent of its m6A RNA methyltransferase activity. In most cases, METTL3 has been reported as an oncogene that promotes the initiation and development of various cancers by depositing m6A modifications on key transcripts. Vu et al. [17] found that METTL3 was more abundant in acute myeloid leukemia cells than in normal hematopoietic stem and progenitor cells, and that differentiation of hematopoietic stem and progenitor cells was inhibited when wild-type METTL3 was overexpressed.
In patients with hepatocellular carcinoma, METTL3 was significantly upregulated and OS was shortened, whereas METTL3 depletion significantly suppressed tumorigenicity and metastasis [18]. METTL3 expression was elevated in pulmonary adenocarcinoma, and METTL3 promoted the growth, survival, and invasion of human pulmonary carcinoma cells [19]. Lin et al. reported that METTL3 could promote cellular growth, survival, and invasion by increasing the expression of EGFR and TAZ [20], revealing the significant function of METTL3 in boosting the translation of oncogenes in human pulmonary carcinoma. KIAA1429 is an RNA-binding protein involved in m6A modification, mRNA splicing, and processing [21]. KIAA1429 is considered a scaffold that positions the core METTL3/METTL14/WTAP complex on an RNA substrate for specific m6A methylation near the 3′ UTR and stop codon [22][23]. KIAA1429 is significantly upregulated in hepatocellular carcinoma tissues, and its high expression is associated with poor prognosis in hepatocellular carcinoma patients. In addition, silencing KIAA1429 inhibits cellular proliferation and metastasis in vitro and in vivo [24].
KIAA1429 is also highly expressed in breast carcinoma tissues and is expressed at lower levels in noncancerous breast tissues. Moreover, subsequent in vivo and in vitro studies have shown that KIAA1429 is involved in the proliferation and metastasis of breast carcinoma [23]. Tang et al. found that KIAA1429 was highly expressed in gefitinib-resistant NSCLC cells and tissues and was closely associated with poor survival status. KIAA1429 also promoted gefitinib resistance in NSCLC in vitro. Furthermore, depletion of KIAA1429 inhibited the growth of NSCLC tumors in vivo [25]. These results indicated that METTL3 and KIAA1429 can be used as oncogenic markers in cancer patients, which is consistent with our findings in lung cancer.
Next, using LASSO regression, a prognostic model based on the two-gene m6A regulator signature (METTL3 and KIAA1429) was constructed, which was associated with the pathological stage and poor overall survival of pulmonary carcinoma. The LASSO-derived risk score based on this signature was an independent prognostic factor that could support prognostic evaluation and individualized management of pulmonary carcinoma patients by stratifying them into high-risk and low-risk groups. We hypothesized that pathology-specific regulators of m6A RNA modification could serve as biological markers with great scientific value in the study of pulmonary cancer.
In general, clinical parameters combined with m6A-methylation-related genes may be better predictors than single biomarkers. In recent years, m6A-methylation-related genes have shown great potential for predicting tumor prognosis. This exploration tentatively suggests that the expression level of m6A-methylation-related genes plays an important role in the progression of pulmonary carcinoma and could be used as a prognostic predictor of pulmonary carcinoma. However, this study has some limitations. First, the analysis was performed using a public database and lacks validation in our own cohorts; therefore, we will further study the two m6A-related genes in our own lung cancer cohort. Second, for selecting an appropriate model, the number of pulmonary carcinoma patients is still small, and the results need to be further validated.

Conclusion
Several m6A RNA methylation regulators were abnormally expressed in lung carcinoma, and these regulators were associated with pathological characteristics. In addition, the prognostic risk score based on KIAA1429 and METTL3 expression levels correlated strongly with clinical outcome and clinicopathological characteristics, and it could be used as an independent predictive factor for lung cancer prognosis. This exploration provides significant evidence on the function of m6A methylation in lung carcinoma.

Declarations
Acknowledgments
We thank all colleagues involved in the study for their contributions. We acknowledge the TCGA database for providing its platform and the contributors for uploading their meaningful datasets.

Author Contributions
Huanqing Liu performed statistical analysis and was responsible for the quality control of data and algorithms. Tingting Li performed literature research. Chunsheng Dong performed data interpretation. Jun Lyu contributed to the study concept and study design. All authors contributed to writing of the manuscript and approved the final version.

Funding
No funding was received.
Availability of data and material
The datasets analyzed during the current study are available from the corresponding author upon reasonable request.
Ethics approval and consent to participate
Ethics approval: TCGA is a public database. Ethical approval was obtained for the patients included in the database. Users can download the relevant data free of charge for research and publish related articles. Our study is based on open-source data, so there are no ethical issues or other conflicts of interest.