- Identification of m6A-Related LncRNAs in PDAC Patients
Firstly, using the downloaded profile from the “Ensembl” website, we identified 4441 lncRNAs in the GTEx-TCGA database and 14181 lncRNAs in the ICGC-CA database based on recognizing the Ensemble IDs of the genes for the following analysis. In addition, we extracted the expression matrixes of 26 m6A-related genes from the GTEx-TCGA and the ICGC-CA databases. A lncRNAs whose expression value was correlated with one of the 26 m6A-related genes, with | Pearson R| > 0.6 and p < 0.001 as the criterion, was defined as an m6A-related lncRNA. We obtained 762 lncRNAs significantly correlated with m6A-related genes in the GTEx-TCGA database, while 1370 lncRNAs in the ICGC database. Combined with the survival information, univariate Cox regression and Log-rank test were executed to screen seven m6A-related prognostic lncRNAs in both databases (Figure 1A).
- Establish the m6a-Related LncRNAs Prognostic Model in the TCGA Database
All m6A-related lncRNAs showed positive co-expression, which presented an extensive synergy effect (Figure 1B, C). The lncRNA CASC19 and LINC02323 possessed the most significant correlation coefficient (0.68) in the GTEx-TCGA database, while the lncRNA ITGB1-DT and LINC02323, NRAV and PRECSIT had the largest correlation coefficient (0.68) in the ICGC database. It is worth mentioning that NRAV and PRECSIT also possessed a significant correlation coefficient (0.64) in the GTEx-TCGA database. In order to build the m6A-related lncRNAs prognostic model for forecasting the overall survival of PDAC patients, at the same time, avoid the expression overfitting possibility brought by high correlation, we performed a LASSO regression analysis based on the seven m6A-related prognostic lncRNAs in the TCGA cohort. Moreover, it generated the prognostic model containing seven m6A-related lncRNAs and each lncRNA coefficient (Figures 1D, E). For each patient in the TCGA database, a risk score was calculated based on the coefficient for each lncRNAs (Figure 1F). Patients in the TCGA training cohort were divided into low-risk and high-risk subgroups based on the median value of risk scores. K-M curves demonstrated that PDAC patients with higher risk scores had worse outcomes (Figure 1G). The ROC curves illustrated that the lncRNAs prognostic model has an excellent predictive ability to predict overall survival in the TCGA training cohort (1-year AUC = 0.710, 3-year AUC = 0.803, 5-year AUC = 0.887; Figure 1I). Risk score and survival status distributions are plotted in Figure 1K.
- Validation of the LncRNAs Prognostic Model in the ICGC Database
To validate the lncRNAs prognostic model’s predictive ability based on the TCGA training set, we calculated risk scores for patients in the ICGC cohort using the same equation. PDAC patients in the ICGC database were assigned to low-risk and high-risk subgroups according to the median risk score. The results were consistent with the TCGA database findings: PDAC patients with higher risk scores had lower overall survival rates and a shorter overall survival time in the ICGC dataset (Figure 1H). The ROC curves also demonstrated that m6A-related lncRNAs prognostic model had a robust prognostic value for PDAC patients in the ICGC database (1-year AUC = 0.641, 3-year AUC = 0.698, 5-year AUC = 0.711; Figure 1J). Risk score and survival status distributions are shown in Figure 1L, and it showed that patients with higher risk scores had shorter overall survival time and more death status. These results showed that the lncRNAs prognostic model was a robust and stable overall survival predictive tool.
- Prognostic Analysis of the Seven m6A-Related LncRNAs
Univariate Cox regression analysis was employed to evaluate seven m6A-related lncRNAs in the prognostic model and their prognostic roles. The forest plot shows that all of them are risk factors with Hazard Ratio (HR) >1 in PDAC patients (Figure 2A). The K-M survival curves confirmed that higher expression of CASC19, ITGB1-DT, LINC01094, LINC02323, NRAV, PRECSIT, and UCA1 were associated with worse overall survival in the TCGA database (Figures 2B–H).
- Pathway and Process Enrichment Analysis of Different Risk Subgroups
For investigating the potential biological process and pathway involving in the molecular heterogeneity between the low-risk and high-risk subgroups, we identified 1107 differential expression genes (DEGs) between the low-risk and high-risk subgroups in the TCGA cohort with the filter criteria |log2 (fold change)| > 0.5 and p < 0.05 (Figure 2I). These DEGs were primarily enriched in these GO terms: the biological process (BP) included the leukocyte migration, epidermis development, cell junction assembly, skin development, positive regulation of cell adhesion, etc; the cell component (CC) contains cell-cell junction, collagen-containing extracellular matrix, apical part of cell, external side of plasma membrane, apical plasma membrane, etc; the molecular function (MF) included cell adhesion molecule binding, receptor ligand activity, actin binding, cadherin binding, cell adhesion mediator activity, etc (Figure 2J). The KEGG analyses revealed that 16 tumor characteristics were enriched in the high-risk subgroup, such as hematopoietic cell lineage, cell adhesion molecules, insulin secretion, malaria, axon guidance, etc (Figure 2K). These results may disclose some perspectives into the cellular biological effects related to the m6A-related lncRNAs prognostic model.
- Stratification Analysis of the m6A-Related LncRNAs Prognostic Model
The heatmap exhibited that LINC01094, CASC19, LINC02323, PRECSIT, UCA1, ITGB1-DT, and NRAV were enriched in the high-risk subgroup, meanwhile, showed the association between each m6A-related lncRNAs expression and the clinicopathological features of PDAC patients (Figure 3A). We attempted to identify whether clinicopathological characteristics were connected with the risk score (Figure 3B-R). The results revealed that the PDAC patients with a higher risk score, the tumor issue have a worse pathological differentiation and showed a strictly increasing relationship (P<0.05). Besides, we found that the PDAC patients with higher risk scores were also more likely to have tumor recurrence after treatment (P<0.05). The patients with distant metastasis had the highest risk score, followed by the locoregional recurrence. The new primary tumor had the lowest risk score with only three samples. To better assess the m6A-related lncRNAs prognostic model’s prognostic capacity, we carried out a stratification analysis to verify whether it remains its ability to forecast overall survival in various subgroups. According to the clinicopathological characteristics, we performed K-M curves for subgroups with each risk stratification of more than five persons. In contrast with lower risk score patients, higher risk PDAC patients had worse overall survival in both the pathological G2 and G3 grades, while there was no significant difference in the G1 stratification (Figure 4A-C). Likewise, we confirmed that the m6A-related lncRNAs prognostic model retained its capacity to predict overall survival for whether PDAC patients have recurrence events or not. Moreover, the further detailed stratification was performed, we detected that the PDAC patients with distant metastasis had shorter survival time, On the contrary, the locoregional recurrence patients had no significant difference (Figure 4D-G).
- LncRNAs Prognostic Model Was an Independent Prognostic Factor for PDAC Patients
We used univariate and multivariate Cox analyses to assess whether the m6A-related lncRNAs prognostic model was an independent prognostic factor for patients with PDAC. Based on the data of PDAC patients in the TCGA database, univariate Cox analysis indicated that lncRNAs prognostic model was remarkably associated with overall survival [the hazard ratio (HR): 2.960, 95% confidence interval (CI): 1.807-4.849, p < 0.001; Figure 4H] and multivariate Cox analysis further showed that lncRNAs prognostic model was an independent predictor of overall survival, with the HR(95%CI) was 2.966(1.687-5.215) (p < 0.001, Figure 4I). The same results were verified in the ICGC database with less clinicopathological characteristics abundance (Figure 4J, K). These results indicated that the m6A-related lncRNAs prognostic model might help clinical prognosis evaluation as an independent prognostic indicator.
- The Differential Expression Level of Each m6A-Related LncRNAs
We analyzed the expression of each m6A-related lncRNAs in PDAC patients compared to normal pancreas tissues in the GTEx-TCGA database using the R package “limma.” We generated boxplots for seven critical lncRNAs and observed a statistically significant increased expression in tumor samples for all m6A-related lncRNAs based on the Bayesian algorithm (p < 0.001, Figure 5A-G).
- Construction of the ceRNA Network and Functional Enrichment Analysis
To further understand how the critical lncRNAs act on N6-methyladenosine regulators by sponging miRNAs in PDAC patients, we constructed a ceRNA network to explore the mechanism of m6A-related lncRNAs. Six lncRNAs were extracted from the Starbase and miRcode databases, and 162 pairs of interactions between the six lncRNAs and 153 miRNAs were identified. Then we excavated three databases (Starbase, miRDB, and miRTarBase) to search target N6-methyladenosine regulator based on the 153 miRNAs and a total of 890 pairs of interactions between the 153 miRNAs and 26 m6A regulators were identified in all three databases. Finally, 6 lncRNAs, 153 miRNAs, and 26 m6A regulators were included in the ceRNA network (Figure 5H). Furthermore, there are 17288 mRNAs we extract from the three databases the 153 miRNAs target, and we affirmed 1145 DEGs from the GTEx-TCGA database with the filter criteria |log2 (fold change)| > 2 and p < 0.01. These DEGs were wielded to implemented functional enrichment analysis and we found that these genes were enriched in the BP included extracellular matrix organization, extracellular structure organization, neutrophil activation, etc; the CC contains collagen-containing extracellular matrix, cell-substrate junction, focal adhesion, etc; the MF included cell adhesion molecule binding, extracellular matrix structural constituent, glycosaminoglycan binding, etc (Figure 5I). KEGG analysis showed that 31 signaling pathways were enriched in pancreatic cancer, some of which had tumor characteristics, including protein digestion and absorption, ECM-receptor interaction, focal adhesion, etc (Figure 5J). These data may provide medical workers clues for finding the potential pathways of these m6A-related lncRNAs in PDAC.
- Exploration of Small Molecule Drugs Related to Pancreatic Cancer
We put the 1145 DEGs mentioned above into the CMAP database for analysis. We set P< 0.01,∣mean∣> 0.6 as the filter criteria, and extract twelve small molecule drugs related to PDAC. The Thapsigargin, Adiphenine, Viomycin, and Nadolol negative control the m6A-related lncRNAs targeted mRNA expression. While the Mepacrine, Ellipticine, 8-azaguanine, DL-thiorphan, Proscillaridin, Trazodone, Bisacodyl, and Riboflavin positive regulated the targeted mRNA expression level (Table 2).