Construction and Analysis of a Prognostic Model of Osteosarcoma Based on m6A-Related lncRNA

N6-methylandenosine (m6A) methylation is one of the most common methylation modications in RNA. At present, a large number of studies have found that m6A methylation can regulate the occurrence and development of tumors by modifying mRNA. However, it is still unclear how m6A modies Long noncoding RNA (lncRNA) that regulates mRNA expression by interacting with miRNA to affect the occurrence and development of osteosarcoma(OS). Therefore, exploring the lncRNAs related to m6A methylation and identifying lncRNAs that have both prognostic effects and immune functions are things that need to be solved urgently.

This study clari ed the important role of m6A-related lncRNAs in the prognosis and immune microenvironment of patients with OS, and indicate that m6A-related prognostic lncRNA signals may provide new targets for the diagnosis and treatment of OS.

Introdution
Osteosarcoma(OS) is one of the most common malignant tumors that are commonly occurred at the metaphysis of long bones in children and adolescents. [1]. Although the 5-year survival rate of patients with OS has increased to 70-80% through surgery and chemotherapy [2], about 20% of patients have metastases at the time of diagnosis [3]. At the same time, another 30% of patients have metastases within 2-3 years after diagnosis [4]. In addition, when OS patients develop resistance to the drug, their 5-year survival rate is reduced to 20% [5]. Due to the high rate of distant metastasis of OS [6], it is necessary to nd new effective targets. m6A methylation is almost involved in the whole process of regulating mRNA biological behavior, including mRNA shearing, translation and degradation, etc [7]. It is composed of methyltransferase complex (writer), demethylase (erase), and a dynamic process maintained by the function manager (reader) [8]. At present, a large number of studies have found that m6A methylation can regulate the occurrence and development of tumors by modifying mRNA [9][10][11][12], and OS is no exception. METTL3 is one of the important members of the writer, When it is silenced, the biological functions of OS cell migration and invasion were decline [13]. Further studies have found that in OS, the decreased expression of METTL3 can lead to m6A methylation and a decrease in mRNA experssion levels of lymphoid enhancer-binding factor 1 (LEF1). Multiple studies have found that LEF1 is involved in the metastasis of different types of cancer and chemotherapy resistance through the WNT signaling pathway [14][15][16]. Furthermore, a recent study revealed that METTL3 acts as an oncogene in the development and progression of OS [17]. In addition, m6A methylation may affect the occurrence and development of tumors through the modi cation of non-coding RNA [18]. In one study, the stability of PVT1, a well-known oncogenic lncRNA, was improved by m6A methylation to promote and accelerate the growth of OS [19].
Long non-coding RNA (lncRNA) is an RNA that is not translated into a functional protein in the organism and whose transcript is more than 200 nucleotides in length. It has been found that lncRNA interacts with miRNA and participates in mRNA expression. Besides, both lncRNA and miRNA regulate the progress of OS through a variety of molecular mechanisms. The OS cell proliferation is enhanced through the GSK3 β/β-catenin signaling pathway mediated by lncRNA CCAT2 in vitro [20], At the same time, the proliferation and migration biological functions of OS cells under the action of lncRNA WWOX-AS1 decrease in vitro [21]. However, the overall function of m6A methylation-related lncRNA in OS is still unclear. Therefore, it is important to explore lncRNA related to m6A methylation and identify lncRNA that has both prognostic effects and immune functions.

Materials And Methods
Datasets and m6A RNA methylation regulators The published gene expression data of OS and complete clinical annotation les were obtained from the TARGET database(https://target-data.nci.nih.gov/Public/OS/gene_expression_array/). Patients without a survival status and survival time information were excluded. After screening, gene expression data from a total of 101 patients with OS were pulled into the study. Gene annotation les were obtained from the Ensembl database. In total of 14,589 lncRNAs were identi ed after annotating the expression data of OS.

Construction of co-expression network
In this study, we screened lnRNAs with the |Pearson R|> 0.5 and P <0.001 from the results of Pearson correlation coe cient analysis as m6A-related lncRNAs. The co-expression analysis between m6A-related genes and lncRNA was realized using the "igraph" and the "limma" package. The package realizes the visualization of the co-expression network.

Construction of prognostic model
The m6A-related lncRNA was integrated with the survival data of clinical patients, and the single-factor Cox regression analysis was used to screening prognostic-related lncRNA combined with the clinical information of patients. To reduce the deviation caused by the high degree of similarity between the prognostic genes and to prevent over-tting to the model, Lasso regression was further used to solve this problem. The prognostic signature of m6A-related lncRNA that were from the results single-factor Cox regression analysis was developed for OS patients. The patient's risk score can be obtained by the following calculation formula: Risk score = Coe *Xi. In the study, the patients with OS from the TARGET database were randomly divided into two groups (training set and a test set) . The scatter plots of the risk scores of the two data sets were drawn with the R package "ggplot2". In these two data sets, the risk scores is sorted using the median as the cut-off point, and the patients in the two data sets were divided into high-risk and low-risk groups. Moreover, the Kaplan-Meier curve and log-rank test were used to explore whether there is a difference in survival time of OS patients between high-risk and low-risk groups. The prognostic signature's predictive ability is an important indicator of the evaluation model, and it was evaluated through receiver operating characteristic (ROC) curves and the area under the curve (AUC) values. The expression of lncRNA for constructing a prognostic model in the high-risk and low-risk groups is displayed by heat map.

Bioinformatics
According to the expression of lncRNA with signi cant prognostic value, we divide patients with OS into different subgroups using the "Consensus Cluster Plus" package. The differences were calculated in the immune score, the stromal score, and the estimate score between different subgroups by estimate algorithm, then the visualization of the abundance of 22 immune cells is achieved through the "vioplot" package in each subgroup. The enrichment of differential genes between high-risk and low-risk groups in the KEGG pathway is explored through GSEA, and the pathways with signi cant differences to be screened was P <0.05 as the threshold. The bioinformatics analysis and statistical analysis are based on version 4.0 of R.

Results
Expressions of m6A methylation regulatory genes and identi cation of their lncRNA Two sets of gene expression matrices for OS were downloaded from the GEO database (GSE19276, GSE99671). The GSE19276 data set includes sequencing data of 41 cases of OS after chemotherapy.
The differential expressions of m6A methylation regulatory genes in good and poor responses to chemotherapy are shown in Figure 2a. The HNRNPA2B1 and RBM15 genes are differ between the good and poor chemotherapy groups. The GSE99671 data set includes sequencing data of 36 metastatic and non-metastatic patients. The expression levels of m6A methylation regulatory genes are shown in Figure  2b. The IGFBP1, YTHDF3, ALKBH5 and YTHDF1 genes are differ between the metastatic and nonmetastatic groups. At the same time, we obtained 14,589 lncRNAs from the TARGET database. Then, the expression data of 23 m6A methylation regulatory genes was extracted from the OS dataset in the TARGET database. When lncRNA is signi cantly associated with one or more m6A methylation regulatory genes, it is the m6A-related target lncRNA (correlation coe cient >0.5 and P <0.05). At the end, we obtained 706 lncRNAs from the TARGET database, and the co-expression network of m6A methylation regulatory genes and related lncRNAs are showed in Figure 2c. The work ow chart is shown as in Fig. 1.

Consensus Clustering Identi ed Two Clusters of OS
The above research results con rmed that m6A methylation regulatory genes were changed in the metastasis of OS due to chemotherapy. However, how m6A methylation modulates lncRNA to regulate the occurrence and development of OS? We tried to further explore it. We found that, in total, 706 lncRNAs are related to m6A methylation regulatory genes and 26 lncRNAs related to the prognosis of OS were  Figure 3a. In order to explore the role of m6A RNA methylation regulatory genes in OS metastasis, the consensus clustering method was used to divide patients with OS into subgroups based on the expressions of 26 prognosticrelated lncRNAs. When k = 2-9, the cumulative distribution function (CDF) of consensus clustering and the increment of AUC are shown in Supplementary Figures 1a and 1b. According to the similarity of m6Arelated prognostic lncRNA expression levels, it is found that k = 2 has the best cluster stability between k = 2 and 9 (shown in Figure 3b). Through a Kaplan-Meier survival analysis, it was identi ed that there are signi cant differences between the two subgroups (shown in Figure 3c), and suggested that changes in m6A lncRNA expressions affect OS prognosis.

Analysis of immune cell in ltration in OS
The immune microenvironment also affects the occurrence and development of tumors. We found that the immune score (Figure 4a, P = 0.02), stromal score (Figure 4b, P = 0.027), and estimate score ( Figure  4c, P = 0.015) were higher in cluster 1 than the cluster 2. The abundance of 22 immune cells in cluster 1 and cluster 2 is shown in Figure 4d, indicating cluster 1 has higher T cells follicular helpers (P = 0.031) and compared to cluster 2. After, the mRNA expression level of each subtype immune checkpoint were tested. We found that compared to cluster 2, HAVCR2, LAG3, and PDCD1 were highly expressed in cluster 1, while the SIGLEC15 was low in expression, as shown in Figure 4e.

Development of a Prognostic Signature
Based on 26 m6A-related prognostic lncRNAs that were from the results single-factor Cox regression analysis, the prognostic signature was constructed for OS patients by using Lasso Cox analysis. We get that the model has the good predictive ability when the prognostic model consists of 12 lncRNAs, and the coe cient and partial likelihood deviance of the prognostic signature are shown in Figures 5(a) and 5(b).
The risk score can be obtained by the following calculation formula: -0.0606×RP11-1017G21.5 expression +0.24817×LINC01534 expression-0.3080× CTC-444N24.8 expression-0.1920×RP1-92O14.6 expression+0.0007×SNHG6 expression-0.2496×RP11-876N24.5 expression +0.2154×TIPARP-AS1 expression+ 0.4944×CTC-268N12.3 expression-0.1731×CTC-428H11.2 expression+ 0.0343× RP11-24N18.1 expression+0.1530RP11-540B6.6 expression+0.0027× MIR210HG expression. Then, the patients with OS from the TARGET database were randomly divided into two groups (training set and a test set). In each group, OS patients were further divided into two data sets (high-risk and low-risk groups) based on the risk score using the median as the cut-off point. It can be found that there are signi cant differences between the high-risk and low-risk groups in the survival time of OS patients, and the OS patients in the low-risk group live longer from Figure 5(c) and Figure 5(d). The AUC in the training cohort was 0.871 (Figure 5e ), and the AUC in the test cohort was 0.870 (Figure 5f), it was showed that the prognostic model is robust in predicting the prognostic survival time of patients with OS.
To explore further the factors affecting OS prognosis, we explored the impact of gender and race on the prognosis of patients with OS combined with other clinical data. We found that the risk score was associated with the prognosis of OS in both sets (Figure 6a, 6b both P <0.05), but gender and race were not associated with the prognosis of OS. In addition, in the training and testing sets, the patients of OS in the high-risk group appear to was worse than the low-risk group (Figure 6e, 6f). LINC01534 and RP11-24N18.1 were all highly expressed in the high-risk group, while CTC-444N24.8 was all highly expressed in the low-risk group (Figure 6g, 6h).
GSEA and Risk Score Associated with Immune In ltration Based on the differential genes of the high-risk and low-risk groups, we found that in the high-risk group, the Linoleic acid metabolism and the glycine, serine and threonine metabolism pathway were mainly involved through GSEA (P <0.05), as shown in Figures 7a and 7b. In addition, we also found that two types of immune cells are associated with risk scores. As shown in Figures 7c and 7d, the abundance of activated mast cells (P = 0.024) and naive CD4 T cells (P = 0.0044) was positively associated with the risk score.

Discussion
OS is one of the most common malignant tumors, which are commonly occurred at the metaphysis of long bones in children and adolescents. Although patients with OS have a better prognosis than those with many other types of cancer thanks to surgery and chemotherapy, the negative impact of drug resistance and metastasis on the human life span and quality of life should not be underestimated. Some studies have shown that abnormal m6A methylation is closely related to the development of OS [13][14][15][16]. Although more and more researchers are beginning to pay attention to assessing the prognosis status of patients with OS from different angles, there are still relatively few studies to explore the prognostic status and immune microenvironment of patients with OS through m6A-related lncRNA.
As a result of our study, Firstly, we obtained 23 genes that play an important role in m6A modi cation from the literature and found that they differ signi cantly in the effects of chemotherapy and tumor metastasis. In total, 702 m6A-related lncRNAs were obtained through screening m6A-related lncRNAs with the |Pearson R|> 0.5 and P <0.001 from the results of Pearson correlation coe cient analysis. Then a single factor Cox regression analysis was used to screen out 26 lncRNAs related to the prognosis of OS combined with the clinical data of the patients. A total of 26 m6A-related lncRNAs were identi ed as signi cantly related to the overall survival of patients with OS. According to the expression of lncRNA with signi cant prognostic value, patients with OS were divided into two subgroups using the "Consensus Cluster Plus" package. We found that the immune score, matrix score, and estimate scores were higher in cluster 1 than those in cluster 2, and cluster 1 had a better prognosis than cluster 2. Furthermore, we found that HAVCR2, LAG3, SIGLEC15, and PDCD1G2 were highly expressed in cluster 1 compared with cluster 2.
Song et al. also found in their study that the high expressions of the immune checkpoints PDCD1LG2 and HAVCR2 are associated with an improved prognosis. At the same time, FAP-positive broblasts have a better chemotherapy response [22]. Ligon et al. found The concentration of LGA3 in OS pulmonary metastases was signi cantly higher than that in primary bone tumor [23]. Fan et al. found that it can affect the growth and migration of OS cells by changing the expression of SIGLEC15 in cells [24]. Zheng et al. found that the prognosis of patients with OS is worse when PD-L1 is highly expressed, but PDCD1LG2 is positively correlated with the overall survival rate of patients [25]. Based on the results of these previous studies, our results are consistent with them, revealing that patients with OS with a high expression of immune checkpoints have a better prognosis.
Subsequently, 12 key lncRNAs were selected by the Lasso Cox regression analysis, and the prognostic survival status of patients with OS were projected by a comprehensive risk score. The scatter plot of survival status of patients in high-risk and low-risk groups, K-M curve, ROC curve proved that reliable for 12 key lncRNAs. At the same time, we found that compared to the low-risk group, the high-risk group had a worse prognosis for OS patients. In 12 key lncRNAs, SNHG6 have been reported to demonstrated its role in OS in published studies. Zhu et al. found that SNHG6 is up-regulated in cell lines and tissues of OS, and the SNHG6 is often highly expressed in OS patients with tumor metastasis. Furthermore, when the expression of SNHG6 in OS is silenced, the growth of OS cells was signi cantly inhibited, the invasion ability of cells was reduced, the cell cycle was blocked in the G0/G1 phase, and the apoptosis of OS cells was induced. In addition, it has been found that SNHG6 and miR-26a-5p competitively bind to ULK1, thereby regulating the expression of ULK1, and cell apoptosis and autophagy were induced by targeting caspase3 and ATF3 [26,27].
Besides SNHG6, lncRNA miR210HG also has a regulatory effect on OS. Li et al. found that miR210HG is differentially expressed in tissues and adjacent tissues of OS patients and it is often highly expressed in tumor tissues. As well, the up-regulation of the miR210HG expression also indicates that patients have a poorer prognosis and lower survival rate. In vitro study, miR210HG knockdown can inhibit the proliferation and invasion of OS cells. In vivo study, tumor growth was inhibited by silencing miR210HG [28]. Published research found that miR210HG is not only highly expressed in OS, but also highly expressed in lung cancer [29], glioma [30], and drug-resistant pancreatic cancer [31]. The study of other lncRNAs among the 12 key lncRNAs in OS is still needed, but they provide us with powerful clues in the search for new targets for OS.

Conclusions
In summary, this study analyzed the expression of m6A in OS, screened m6A -related lncRNAs, and explored the prognostic value of m6A-related lncRNAs in OS and the impact on the immune microenvironment. This study used the single-factor Cox regression analysis and Lasso regression analysis to constructed a prognostic model, as well as conducted consistent clustering with m6A-related prognostic lncRNAs. The important role of m6A-related lncRNAs in the prognosis and immune microenvironment of patients with OS was clari ed. The results indicate that m6A -related prognostic lncRNA signals may provide new targets for the diagnosis and treatment of OS. Abbreviations OS, Osteosarcoma; lncRNA, Long non-coding RNA; TARGET database, Therapeutically Applicable Research To Generate Effective Treatments database TPM transcripts per kilobase million mapped reads GEO database GENE EXPRESSION OMNIBUS database ROC receiver operating characteristic 31. Li D, Qian X, Xu P, Wang X, Li Z, Qian J, Yao J: Identi cation of lncRNAs and Their Functional Network Associated with Chemoresistance in SW1990/GZ Pancreatic Cancer Cells by RNA Sequencing. DNA and cell biology 2018, 37(10):839-849. Figure 1