The differential expressions of m6A-associated lncRNAs.
A total of 19604 mRNAs and 14086 lncRNAs were screened from the TCGA database. 1590 m6A-associated lncRNAs were obtained (/R/>0.4 and p<0.05) according to 23 reported m6A-associated genes, of which, 866 differentially expressed m6A-associated lncRNAs in CRC were detected with a log/fold change (FC)/>0.5 and a p<0.001(Supplementary data 1).
Identification of m6A-associated lncRNAs with prognostic value
As shown in Figure1A, we annotated m6A-associated lncRNAs and clinical characteristics, then investigated the role of each lncRNA on the prognostic outcome of the patients with CRC. A total of 68 m6A-associated lncRNAs with obvious prognostic value were detected and used for further study.
Establishment of m6A-associated lncRNA clusters
To classify the different m6A clusters based on the lncRNAs, we mapped these 68 m6A-associated lncRNAs to expression profile of CRC samples to perform clustering using the Consensus Cluster Plus (CCP) tool. As shown in Figure1B, the number of clusters was sequentially set from 1 to 9 and CCP analysis indicated that the results were most stable when these m6A-related lncRNAs were separate into three clusters using Consensus Cluster Plus R package (Figure 1C, D). The OS data of each cluster was calculated using Kaplan-Meier method, and the results displayed that there was significant difference among the survival of CRC patients in these three clusters (Figure 1E).
Clinical characteristics and Immune Score of each cluster in CRC
As compared with cluster1 and cluster3, cluster 2 had the highest N stage, M stage and TNM stage (Figure 2A). ESTIMATE-algorithm was employed to evaluate the accurate estimate score (tumor purity), immune score and stromal score in accordance with the gene expression profiles of CRC patients. Our findings showed that compared with cluster 1 and 3, the cluster 2 had the lowest estimate score, immune score and stromal score (Figure 2B).
m6A-associated lncRNAs signature construction
As shown in Figure 3A, a total of 14 m6A-associated lncRNAs that had a co-expression relationship with 8 m6A-associated genes were recognized as effective independent prognostic factors. Among them, AC137932.3, AL391422.4, AC092123.1, AC156455.1, AC132192.2, AC008760.1, RPARP-AS1, LINC02657, AP001619.1, AC003101.2, AL161729.4, TNFRSF10A-AS1, AL121906.2 and AC074117.1 were found to be favorable prognostic factors (Supplementary data 2). The risk score of each CRC patient=AC137932.3*(-1.4041)+AL391422.4*0.9484+ AC092123.1* (-1.3865) +AC156455.1*0.1977+AC132192.2*(-0.4822)+ AC0a08760.1*0.5973+RPARP-AS1*0.3572+LINC02657*0.7205+ AP001619.1*0.8025+AC003101.2*1.0959+AL161729.4* 0.3047+TNFRSF10A-AS1*(-0.2329)+ AL121906.2*1.02629+AC074117.1*0.25582. Based on the median risk score, 426 CRC patients were classified into the low- and high- risk groups. The Kaplan-Meier curves and the distributions of survival status confirmed the poor outcome in the high-risk group (Figure 3 B-D). Our findings showed that the mortality was closely associated with the risk score. Moreover, the area under the curve (AUC) is measured and the value for prognostic risk score was 0.764 which is higher than AUCs of the other clinicopathological factors (Figure 4E). And the AUC values corresponding to 1-, 3-, and 5- year of OS were 0.764, 0.743 and 0.753, respectively (Figure 3F). These data indicated the good prediction accuracy of this model.
The validation of the signature in CRC
The prognostic value of the m6A-associated lncRNA signature was investigated in CRC patients from the TCGA dataset. The patients were classified by various clinical parameters, consisting of gender, age, T, N, M and TNM stage. In almost all subgroups, the patents with low- risk score trended to have higher OS rate than that of the high-risk group (Figure 4).
Next, we evaluated the independence and effectiveness of this model in predicting prognosis of CRC patients. Our findings showed that this m6A-associated lncRNA signature could be an effective and independent factor for predicting the outcome of the patients with CRC (Figure 5 A-B). Then, a nomogram was conducted to predict 1-, 3- and 5-year OS of the patients with CRC based on the results of univariate and multivariate Cox-regression analyses, including age, TNM stage and risk score (Figure 5C). The calibration curves demonstrated the well prediction accuracy of this nomogram in CRC patients (Figure 5 D-F).
Gene set enrichment analysis
Finally, we evaluated the potential biological mechanisms associated with the risk model by GSEA. As shown in Figure 6, the P53 signaling pathway (NOM p-val=0.0019, FDR q-val=0.155) and Natural Killer cells mediated cytotoxicity (NOM p-val=0.0172, FDR q-val=0.195) were more enriched in the low- risk group. Our study suggested that this risk-related model could be used for the personalized treatment for CRC patients.