Identification of m6A-associated LncRNAs in PAAD patients
The matrix expression of 23 m6A genes and 14,056 LncRNAs was extracted from the TCGA database. 288 lncRNAs with a co-expression relationship with m6A were identified. m6A-associated LncRNAs were defined as LncRNAs significantly associated with greater than or equal to one of the 23 m6A genes (|Pearson R| > 0.4 and p < 0.001). In Figure 1A, a Sankey diagram of m6A-related LncRNAs is shown. In Figure 1B, a heatmap of the correlations between 23 m6A genes and 5 m6A-related LncRNAs involved in model construction, with positive correlations in red and negative correlations in blue.
Construction and validation of risk models based on m6A-related LncRNAs in PAAD patients
The m6A-related prognostic LncRNAs were screened from 288 m6A-related LncRNAs in the TCGA training set using univariate Cox regression analysis. In Figure 2A, 24 m6A-related LncRNAs in the TCGA dataset were significantly associated with survival. LASSO-penalized Cox analysis is a common method for multiple regression analysis. The application of this method not only improves the prediction accuracy and interpretability of statistical models, but also enables variable selection and regularization to be performed simultaneously, which can effectively identify the most available predictive markers and generate prognostic indicators to predict clinical outcomes. The vertical dashed line illustrates the first level value of logλ with the smallest segmentation error. Therefore, 9 m6A-related LncRNAs were selected for subsequent multivariate analysis. Next, we used multivariate Cox ratio hazard regression analysis to distinguish autologous prognostic proteins. Five m6A-related LncRNAs, which were prognostic proteins independently associated with survival in the training set, were used to construct risk models to assess prognostic risk in PAAD patients (Figure 2B and Figure 2C). PAAD patients were divided into low-risk and high-risk groups according to the median prognostic risk grade. In Figure 3A, the distribution of risk levels for the entire set is shown. Figure 3B shows survival status and survival time. In Figure 3C, m6A-related LncRNAs are shown. In Figure 3D we performed a Kaplan-Meier survival analysis, which showed that the low-risk group survived longer than the high-risk group (P<0.001).
To test the prognostic power of this established model, we calculated a risk score for each patient in the training group and in the test group using a unified formula. Figure 4 depicts risk scores, survival status patterns, and risk heatmaps (Figures 4A, 4B, 4C for the training group; Figures 4D, 4E, 4F for the test group), with increasing risk levels from left to right. Finally, in Figure 5, we performed model validation of clinical groupings to verify whether patients with different clinical characteristics were suitable for the model constructed in this study. Subgroups classified by age, sex, and tumor stage showed significantly higher survival in the low-risk group than in the high-risk group.
Principal component analysis further validates the prognostic model
PCA was performed to test based on the whole gene expression profile, 23 m6A genes, 5 m6A-related lncRNAs and model LncRNAs. In Fig. 6A, Fig. 6B and Fig. 6C, the distributions of high-risk and low-risk groups are shown relatively dispersed, while Fig. 6D based on the model we constructed shows that the high- and low-risk groups have different distributions, indicating that the model can distinguish between high- and low-risk groups of patients.
A prognostic model estimates the tumor immune microenvironment and cancer immunotherapy response
First, in Figure 7A we performed immune function analysis. Look for differences in immune function between high and low risk groups. Next, to explore the underlying molecular mechanisms of the m6A-based model, we performed Gene Ontology (GO) enrichment analysis, revealing the involvement of many immune-related biological processes (Fig. 7B). In Figure 7C, we also performed an analysis of immune escape and immunotherapy to find out whether there are differences between high and low risk groups when receiving immunotherapy. Mutational data were analyzed and summarized using the maftools package in R studio software. Mutations were stratified according to variant effect predictors. Figure 7D and Figure 7E show the top 20 most frequently altered driver genes between high-risk and low-risk subgroups. We then performed a differential analysis of tumor mutational burden and calculated TMB scores from TGCA somatic mutation data. The low-risk group had lower TMB than the high-risk group (Fig. 7F). Finally, we performed a survival analysis of tumor mutational burden. In Figure 8A, it can be seen that the survival of the low tumor mutational burden group was better than that of the high tumor mutational burden group, and then combined the tumor mutational burden with the patient risk score for survival analysis, Figure 8B , patients with low tumor mutational burden and low risk score were found to have better survival.
Identification of potential drugs for prognostic models
To explore potential drugs for the treatment of PAAD patients with our prognostic model, we used the pRRophetic algorithm to estimate treatment response based on the half-maximal inhibitory concentration (IC50) of each drug in the Genomics of Cancer Drug Sensitivity (GDSC) database. We screened for 6 drugs with significantly different estimated IC50s between the high and low risk groups, and the low risk group was more sensitive to most of the potential drugs. Figure 9 shows 6 potential drugs that can be used for further analysis of PAAD patients.
Independent prognostic analysis of prognostic models and assessment of clinical features of PAAD
We performed univariate and multivariate Cox regression analyses to assess whether risk models for m6A-related LncRNAs had independent prognostic features of PAAD. In Figure 10A, first in the univariate Cox regression analysis, the HRs for the risk score and 95% confidence interval (CI) were 1.181 and 1.097−1.271, respectively (p < 0.001). Figure 10B, HR was 1.162 in multivariate Cox regression analysis, 95% CI was 1.074−1.257 (p < 0.001). We then performed a concordance index analysis of the risk score and found that the concordance index of the risk score was consistently greater than other clinical factors over time, suggesting that the risk class could better predict the prognosis of PAAD patients (Fig. 10C). Finally, in Figure 10D and Figure 10E, we performed AUC analysis of risk grades, and the AUCs of risk score grades were also higher than those of other clinical features, indicating that the prognostic risk model constructed in this study was relatively reliable.
Construction and evaluation of prognostic nomograms
In Figure 11A, we constructed nomograms including risk classes and clinical characteristics to predict 1-, 3-, and 5-year survival in PAAD patients. Correlation plots show that the observed and predicted rates of survival in PAAD patients at 1, 3, and 5 years showed good agreement (Figure 11B).