m6A-related lncRNAs Are Potential Biomarkers for Predicting the Prognosis of PAAD Patients

Background: Pancreatic cancer is a common malignant tumor of the digestive system, with insidious onset, dicult early diagnosis, easy metastasis and poor prognosis. N6-methyladenosine and long noncoding RNA play important roles in the prognostic value and immunotherapy response of PAAD. Therefore, it is crucial to recognize m6A-related lncRNAs in PAAD patients. Methods: in this study, m6A-related LncRNAs were obtained by co-expression analysis. Univariate, least absolute shrinkage and selection operator and multivariate Cox regression analyses were performed to construct m6A-related LncRNA prognostic models. Kaplan-Meier analysis, principal component analysis, feature-rich annotation, and nomogram were used to analyze the accuracy of risk models. Finally, potential drugs targeting this model are also discussed. Results: A prognostic model based on m6A-related LncRNA was constructed, potential drugs targeting this m6A-related LncRNA feature were discovered, and the relationship with immunotherapy response was studied. Finally, a nomogram was established to predict survival in PAAD patients. Conclusions: this m6A-based LncRNA risk prognostic model may be promising for clinical prediction of prognosis and immunotherapy response in PAAD patients.

the expression of post-transcriptional genes through three levels: transcriptional level, epigenetic modi cation level and post-transcriptional level. Promote or inhibit the development of cancer and play a role in the diagnosis and treatment of tumors [10,11]. Changes in RNA can affect a variety of biological processes, therefore, the role of m6A-regulated long non-coding RNAs (LncRNAs) may be crucial for the proliferation and migration of cancer cells [12], and studies have reported that LncRNAs can promote pancreatic cancer cell proliferation and inhibition of apoptosis [13].
The m6A methylation modi cation process is reversible and involves a variety of enzymes (adenosine methyltransferases, demethylases, and RNA-binding proteins). Knockout of METTL3 gene expression reduces mRNA m6A methylation modi cation, and cancer cell proliferation, invasion and migration are all attenuated [14]. The demethylase ALKBH5 is one of the important predictors of the overall survival rate of pancreatic cancer. Tang et al. [15] found that silencing ALKBH5 can signi cantly increase the proliferation, migration and invasion of pancreatic cancer cells in vitro and in vivo, while its overexpression leads to the opposite. the result of. Another study reported that the contents of Lnc RNAs KCNK15-AS1 and ALKBH5 in pancreatic cancer tissues were signi cantly lower than those in normal tissues. After overexpression of ALKBH5 in different cell lines, KCNK15-AS1 expression was subsequently increased, while epithelial-mesenchymal transition in pancreatic cancer cells was inhibited [16,17]. The speci c role of m6A regulators in lncRNAs remains unclear. Therefore, understanding the mechanism of m6A-related LncRNA in the development of PAAD may provide new ideas for the prognosis and treatment of pancreatic cancer patients.
In this study, we extracted the expression pro les of 14,056 LncRNA and 23m6A genes from The Cancer Genome Atlas (TCGA) dataset. Next, the limma package and BiocManager package in R studio software identi ed m6A-related LncRNAs. A prognostic model was constructed based on m6A-related LncRNA, which was used to predict the overall survival of PAAD patients. Next, using publicly available drug susceptibility databases, we discovered potential drugs targeting this m6A-related LncRNA signature. At the same time, we explored the relationship with immunotherapy response. Finally, we built a nomogram to predict survival in PAAD patients.

Data Sources
The RNA-seq transcriptome data, related clinical information and mutation data of PAAD patients were obtained from the TCGA (https://cancergenome.nih.gov/) database, and the data were organized. At the same time, in order to reduce the statistical error of this study, we deleted pancreatic cancer patients with no survival and incomplete data.

Construction and validation of prognostic models
The entire TCGA dataset was randomized into training and testing groups. A prognostic model was constructed using the training set, and the established model was validated. Subgroups including low-risk and high-risk groups were also subsequently established based on the median risk score. Combined with the survival information of PAAD patients in TCGA, we screened the m6A-related LncRNAs involved in model construction from 288 m6A-related LncRNAs in the TCGA dataset (p < 0.05). This study used univariate Cox regression. LASSO Cox regression was performed using the R package glmnet to nd m6A-related LncRNAs that were signi cantly associated with the survival of PAAD patients in the TCGA dataset. Multivariate Cox regression was used to analyze m6A-related LncRNAs, and nally a m6A-related LncRNA risk model was established. The formula for calculating the risk score is as follows: Risk score =m6A-related LncRNA1×coef +m6A-related LncRNA2×coef +…+m6A-related LncRNAn×coef, where coef represents the coe cient, which is the coe cient between LncRNA and survival. Risk curves for high and low risk were constructed using the pheatmap package in R studio software. ROC curves were constructed using the survival, survminer, and timeROC packages in R studio software. Finally, we performed model validation of clinical subgroups to nd out which clinical subgroups our model was applicable to.
Differential gene identi cation, functional analysis, and tumor mutational burden We identi ed differentially expressed genes in high and low risk groups and performed GO functional analysis on them. The ltering criteria of high and low risk differential genes were logFC lter>1 and fdrFilter<0.0. GO functional analysis was performed using the clusterPro ler package in the R studio software. The analysis threshold was determined by the p-value, with p<0.05 indicating signi cant enrichment of functional reviews. We also used the ggpubr package and the limma package in R studio software to analyze whether the tumor mutation burden of the high and low risk groups was different, and nally used the survival package and survminer package in the R studio software to analyze the survival of the high and low tumor mutation burden groups. analyze.
Model estimation of tumor immune microenvironment, principal component analysis and Kaplan-Meier survival analysis Differential immune function in high and low risk groups was screened by Limma package, GSVA package and GSEABase package in R studio software. The maftools package in the R studio software was used to assess the mutation frequencies of the high and low risk groups in the model. We performed principal component analysis (PCA), which is used for e cient dimensionality reduction, model identi cation and grouping of high-dimensional data of whole gene expression pro les, m6A genes, m6Arelated LncRNAs, and risk models based on gene expression patterns visualization. Finally, we used Kaplan-Meier survival analysis to assess the diversity of survival between high-and low-risk groups. The R packages survminer and survival are the tools used for this research.
Analysis of prognostic models and screening of potential drugs Multivariate and univariate Cox regression analyses were performed to test whether the prognostic model was an independent variable considering other clinical characteristics of PAAD patients (sex, age, tumor grade, and tumor stage). Analyses of immune evasion and immunotherapy were also performed to nd out whether there were differences between high and low risk groups when receiving immunotherapy. To obtain potential drugs for clinical use in PAAD treatment, we calculated the IC50 of compounds obtained from the GDSC website in the TCGA project of the PAAD dataset. IC50s of compounds obtained from the GDSC website were predicted in PAAD patients using the pRRophetic package in R studio software.

Build and validate the nomogram
The predictive power of nomogram and other predictors (age, gender, risk score, TNM stage, T stage, N stage, and M stage) for 1-, 3-, and 5-year survival was set. A calibration curve based on the Nomogrampredicted test was applied to illustrate the agreement between actual and model predicted results.

Result
Identi cation of m6A-associated LncRNAs in PAAD patients The matrix expression of 23 m6A genes and 14,056 LncRNAs was extracted from the TCGA database.
288 lncRNAs with a co-expression relationship with m6A were identi ed. m6A-associated LncRNAs were de ned as LncRNAs signi cantly associated with greater than or equal to one of the 23 m6A genes (|Pearson R| > 0.4 and p < 0.001). In Figure 1A, a Sankey diagram of m6A-related LncRNAs is shown. In Figure 1B, a heatmap of the correlations between 23 m6A genes and 5 m6A-related LncRNAs involved in model construction, with positive correlations in red and negative correlations in blue.

Construction and validation of risk models based on m6A-related LncRNAs in PAAD patients
The m6A-related prognostic LncRNAs were screened from 288 m6A-related LncRNAs in the TCGA training set using univariate Cox regression analysis. In Figure 2A, 24 m6A-related LncRNAs in the TCGA dataset were signi cantly associated with survival. LASSO-penalized Cox analysis is a common method for multiple regression analysis. The application of this method not only improves the prediction accuracy and interpretability of statistical models, but also enables variable selection and regularization to be performed simultaneously, which can effectively identify the most available predictive markers and generate prognostic indicators to predict clinical outcomes. The vertical dashed line illustrates the rst level value of logλ with the smallest segmentation error. Therefore, 9 m6A-related LncRNAs were selected for subsequent multivariate analysis. Next, we used multivariate Cox ratio hazard regression analysis to distinguish autologous prognostic proteins. Five m6A-related LncRNAs, which were prognostic proteins independently associated with survival in the training set, were used to construct risk models to assess prognostic risk in PAAD patients ( Figure 2B and Figure 2C). PAAD patients were divided into low-risk and high-risk groups according to the median prognostic risk grade. In Figure 3A, the distribution of risk levels for the entire set is shown. Figure 3B shows survival status and survival time. In Figure 3C, m6A-related LncRNAs are shown. In Figure 3D we performed a Kaplan-Meier survival analysis, which showed that the low-risk group survived longer than the high-risk group (P<0.001).
To test the prognostic power of this established model, we calculated a risk score for each patient in the training group and in the test group using a uni ed formula. Figure 4 depicts risk scores, survival status patterns, and risk heatmaps ( Figures 4A, 4B, 4C for the training group; Figures 4D, 4E, 4F for the test group), with increasing risk levels from left to right. Finally, in Figure 5, we performed model validation of clinical groupings to verify whether patients with different clinical characteristics were suitable for the model constructed in this study. Subgroups classi ed by age, sex, and tumor stage showed signi cantly higher survival in the low-risk group than in the high-risk group.
Principal component analysis further validates the prognostic model PCA was performed to test based on the whole gene expression pro le, 23 m6A genes, 5 m6A-related lncRNAs and model LncRNAs. In Fig. 6A, Fig. 6B and Fig. 6C, the distributions of high-risk and low-risk groups are shown relatively dispersed, while Fig. 6D based on the model we constructed shows that the high-and low-risk groups have different distributions, indicating that the model can distinguish between high-and low-risk groups of patients.
A prognostic model estimates the tumor immune microenvironment and cancer immunotherapy response First, in Figure 7A we performed immune function analysis. Look for differences in immune function between high and low risk groups. Next, to explore the underlying molecular mechanisms of the m6Abased model, we performed Gene Ontology (GO) enrichment analysis, revealing the involvement of many immune-related biological processes (Fig. 7B). In Figure 7C, we also performed an analysis of immune escape and immunotherapy to nd out whether there are differences between high and low risk groups when receiving immunotherapy. Mutational data were analyzed and summarized using the maftools package in R studio software. Mutations were strati ed according to variant effect predictors. Figure 7D and Figure 7E show the top 20 most frequently altered driver genes between high-risk and low-risk subgroups. We then performed a differential analysis of tumor mutational burden and calculated TMB scores from TGCA somatic mutation data. The low-risk group had lower TMB than the high-risk group (Fig. 7F). Finally, we performed a survival analysis of tumor mutational burden. In Figure 8A, it can be seen that the survival of the low tumor mutational burden group was better than that of the high tumor mutational burden group, and then combined the tumor mutational burden with the patient risk score for survival analysis, Figure 8B , patients with low tumor mutational burden and low risk score were found to have better survival.

Identi cation of potential drugs for prognostic models
To explore potential drugs for the treatment of PAAD patients with our prognostic model, we used the pRRophetic algorithm to estimate treatment response based on the half-maximal inhibitory concentration (IC50) of each drug in the Genomics of Cancer Drug Sensitivity (GDSC) database. We screened for 6 drugs with signi cantly different estimated IC50s between the high and low risk groups, and the low risk group was more sensitive to most of the potential drugs. Figure 9 shows 6 potential drugs that can be used for further analysis of PAAD patients.

Independent prognostic analysis of prognostic models and assessment of clinical features of PAAD
We performed univariate and multivariate Cox regression analyses to assess whether risk models for m6A-related LncRNAs had independent prognostic features of PAAD. In Figure 10A, rst in the univariate Cox regression analysis, the HRs for the risk score and 95% con dence interval (CI) were 1.181 and 1.097−1.271, respectively (p < 0.001). Figure 10B, HR was 1.162 in multivariate Cox regression analysis, 95% CI was 1.074−1.257 (p < 0.001). We then performed a concordance index analysis of the risk score and found that the concordance index of the risk score was consistently greater than other clinical factors over time, suggesting that the risk class could better predict the prognosis of PAAD patients (Fig.  10C). Finally, in Figure 10D and Figure 10E, we performed AUC analysis of risk grades, and the AUCs of risk score grades were also higher than those of other clinical features, indicating that the prognostic risk model constructed in this study was relatively reliable.

Construction and evaluation of prognostic nomograms
In Figure 11A, we constructed nomograms including risk classes and clinical characteristics to predict 1-, 3-, and 5-year survival in PAAD patients. Correlation plots show that the observed and predicted rates of survival in PAAD patients at 1, 3, and 5 years showed good agreement ( Figure 11B).

Discussion
Pancreatic cancer is the main cause of cancer-related death worldwide. It has been a serious threat to human life and health due to its insidious onset, strong invasiveness, poor prognosis, and high mortality rate [2,19,20]. Previous studies have mostly focused on the occurrence, development and treatment of pancreatic cancer. With the deepening of research, it has been found that the disorder of m6A methylation modi cation regulation may affect the processing, degradation and translation of mRNA, resulting in the activation of oncogenes and the inactivation of tumor suppressor genes, and the occurrence, development and drug resistance of malignant tumors. The occurrence of m6A is closely related, and m6A changes play a crucial role in carcinogenesis and tumor progression [8].
m6A plays a post-transcriptional modi cation role in eukaryotic mRNAs and lncRNAs, such as in regulating mRNA transcription, splicing and translation, as well as affecting the structure and function of lncRNAs with extensive regulatory roles [12]. m6A regulators can modify speci c LncRNAs, and LncRNAs can maintain malignancy in various tumors through transcriptional, epigenetic, and post-transcriptional levels [9,11]. The role of m6A-regulated long non-coding RNAs (LncRNAs) may be critical for the proliferation and migration of cancer cells [12]. Studies have reported that m6A methylation modi cation of LncRNA can affect the occurrence and development of tumors, and m6A modi cation can also affect the formation of RNA-DNA triple helix, in which one LncRNA binds to this series through the Hoogsteen base pair in the main groove of double-stranded DNA. , In addition, m6A may also affect the reciprocal site between lncRNA and speci c DNA [21,22]. Both m6A and LncRNA are important regulators of PAAD occurrence. However, studies on their roles and biological mechanisms in PAAD progression are still relatively lacking [14,17]. In this study, we constructed an independent prognostic model based on m6Arelated LncRNAs, inspired by the functions of m6A and LncRNAs in PAAD.
In this paper, we identi ed 14056 m6A-associated LncRNAs from the TCGA dataset to explore the prognostic functions of m6A-associated LncRNAs. After the TCGA dataset con rmed the prognostic value of m6A-related LncRNAs, ve of them were selected to construct a m6A-related LncRNA prognostic model to predict the survival of PAAD patients. We also performed model validation for clinical grouping, calculating risk scores for each patient in the training group and across the entire set, and principal component analysis to validate the prognostic model, all of which demonstrated the accuracy of the prognostic model. PAAD patients were divided into a low-risk group and a high-risk group according to the median prognostic risk level. It was found that the high-risk group had poorer survival than the low-risk group. The model validation results of clinical grouping showed that the model we constructed was more suitable for patients with age less than or equal to 65 years old, female patients, and PAAD patients with tumor stage I and II. Multivariate Cox regression analysis showed that the m6A-related LncRNA prognostic model was an own risk factor for survival. ROC analysis showed that the model outperformed traditional clinical features in predicting survival in PAAD patients. In addition, a nomogram was constructed showing the agreement between the 1-, 3-, and 5-year prognostic model prediction rates for PAAD patients. The accuracy of the prognostic model based on m6A-related LncRNAs in predicting patient survival, the prediction model can provide a certain basis for subsequent research to identify new biomarkers.
TMB is the total number of somatically encoded mutations associated with the emergence of neoantigens that trigger antitumor immunity [23,24]. Studies have reported that patients with low-risk endometrial cancer have higher tumor mutational burden and are more sensitive to chemotherapy than patients with high-risk scores [25]. Here, we found that TMB in the low-risk group was lower than that in the high-risk group, and then performed a survival analysis of tumor mutational burden, nding that the low-tumor mutational burden group had better survival than the high tumor mutational burden group, and then combined tumor mutational burden with patients Risk scores were used for survival analysis and found that patients with low tumor mutational burden and low risk scores had better survival.
Unfortunately, our prognostic model showed that the analysis of immune escape and immunotherapy showed no statistically signi cant difference between high and low risk groups when receiving immunotherapy, perhaps due to our limited sample, future studies can consider using more sample studies to test. The tumor microenvironment can regulate the biological properties of tumor cells such as chemotherapy resistance through metabolism and other means. We screened out 6 drugs with signi cantly different estimated IC50s between the high and low risk groups, and the low risk group was more sensitive to most potential drugs, the discovery of which may provide new insights into the subsequent treatment of patients with PAAD ideas. Pathological stage is a decisive factor for the diagnosis and prognosis of PAAD [26]. Current staging is not precise in providing reliable predictions and re ecting the heterogeneity of PAAD, therefore, exploring new potential predictive markers and immunotherapeutic agents is critical. Our established m6A-related LncRNA prognostic model provides a new idea for predicting the survival of PAAD patients. We are also aware of some shortcomings and limitations of this study, the biological mechanisms of m6A-related lncRNAs have not been fully elucidated. In the future, we will try to verify the accuracy of this model with more external experiments to explore the role of LncRNA and its interaction with m6A in our follow-up work.

Summarize
Understanding the mechanism of m6A-related LncRNA in the development of PAAD may provide new ideas for the prognosis and treatment of pancreatic cancer patients. Our study provides new clues and ways for survival prediction in PAAD patients, and may help to elucidate the process and mechanism of regulation between m6A and LncRNA.

DATA AVAILABILITY STATEMENT
The datasets presented in this study can be found in online repositories. The names of the repository/repositories and accession number(s) can be found in the article/Supplementary Material.

AUTHOR CONTRIBUTIONS
Yiyang Chen and Xi Ou conceived of the presented idea. Data mining and gene analysis were carried out by Yiyang Chen, and Yiju Gong. Yiyang Chen and Jikui Liu wrote the manuscript. Yiyang Chen, Yiju Gong contributed to the manuscript revision. All authors contributed to the article and approved the submitted version.     Kaplan-Meier curves for differences in survival between high-risk and low-risk groups by sex, age, and tumor grade.
Page 18/22     Demonstrates 6 potential drugs for further analysis of PAAD patients.