3.1 M7G-Related lncRNAs in HCC patients.
The flow map of our study was presented in figure1.In TCGA LIHC project ,we got totally with 50 normal samples, 365 tumor samples( 9 samples was omitted).
Based on different gene expression and correlation analysis,we obtained 730 M7G related significantly different expressed LncRNAs in tumor and normal samples(| logFC |>1,FDR<0.05 ;correlation coefficient >0.5,p<0.001).The heat map,volcano plot and network map were presented in figure 2 separately (Figure 2A&B).The M7G gene list and net work list were available in supplementary 1.
3.2 Construction and Verification of the Risk model.
Univariate Cox regression and Lasso regression were utilized to identify target LncRNAs with prognostic value, multiple Cox regression was applied to well select LncRNAs so as to construct a reliable mode (cox p<0.01).Then using four target IncRNAs, we built an M7G-related IncRNA model. A forest map was intended to show uni-cox result, while a heatmap show such lncRNAs expression between tumor and normal samples (figure 3A&B ). To avoid overfitting the prognostic signature, we performed the Lasso cox regression and finally 4 lncRNAs were considerably associated with prognosis (Figures 3C and 3D). Sankey diagram and M7G-LncRNA net work was used to show the relationship between target LncRNAs (figure 3E). By taking the median risk score as the cut-off point, patients were divided into high and low risk group, respectively. The different PCA between two groups showed that our risk signature has good discrimination of the samples in our study (figure 3F).
And then the expression standard of 4 lncRNAs, survival status, survival time were compared between two groups. The survival analysis displayed a worse prognostic outcome in high risk group compared with low risk group. The above analysis in the entire set, train set and test set was performed and obtained similar results, which indicated that our model is stable and reliable (Figures 4A–L). Similarly, it also performed the same prognostic results after separated by the clinical parameters (Figure 4M).
3.3 Nomogram analysis.
We did uni-Cox and multi-Cox analysis to verify the risk score is an independent risk factor for HCC prognosis. According to uni-Cox and multi-Cox analysis, we can see that risk score (HR=1.149, p < 0.001) and stage (HR=1.680, p < 0.001) were independent factors for OS (Figure 5 A&B).
In order to explore the prognostic value of the risk model, nomogram and calibration were also implemented (Figure 5 C&D). Nomogram was established with gender, TNM grade and risk score to predict 1-, 2-, and 3-years OS rate of all HCC patients(Figure 5C). Furthermore, the calibration plots were also used to see whether the nomogram was well concordant with the prediction for 1, 2, and 3 years (Figure 5D). Considering of the two analyses, the model illustrated a good prognostic predictive ability.
3.4 Risk Model evaluation
We applied time-dependent ROC to evaluate the accuracy of risk model. It showed that the 1-, 3-, and 5-years area under the ROC curve were 0.757, 0.715,0.655, respectively (Figure 5E). At the 1-years ROC of risk model, it was observed that risk score had a better predictive ability compare with other clinical factors in all HCC samples (AUC=0.757, Figure 5F), which revealed a reliable result.
3.5 Functional Enrichment Analysis
According to the risk signature above, we split all of the patients into two risk groups, and then we screened 76 genes associated with m7G that showed differential expression using the criterion of |log2 FC| > 1 and p < 0.05. These genes were explained biologically using functional enrichment analysis. These genes were strongly linked to the cell cycle, chromosomal segregation, organelle fission, nuclear division, and other processes, according to the GO analysis (Figure 6A&B).
GSEA software were utilized to explore the biological pathways which were enriched in the high-risk and low-risk group. With the criteria of p < 0.05 and FDR <0.05, 5 most significant pathways enriched in the low-risk group and high-risk group were displayed, respectively. The full significant enriched pathways were displayed in the Supplementary 2. Immune related pathways were enriched among them. In the low-risk group, these genes were significantly associated with complement coagulation cascades, fatty acid metabolism, primary bile acid biosynthesis, retinol metabolism, valine leucine and isoleucine degradation. In the high-risk group, these genes were found to be significantly associated with transcription factors, cell cycle, oocyte meiosis, spliceosome, ubiquitin mediated proteolysis which may play a key role in the occurrence and development of tumors (figure 6C).
3.6 Immune inflation and function analysis.
Subsequently, we analyzed the correlation between tumor immune score, immune cell infiltration and risk score to explore the potential correlation between risk score and the efficacy of immunotherapy. The estimate score, immune score and stromal score were significantly lower in the high-risk group compared with the low-risk group (figure 6D-F). These results showed that samples with higher risk score pretend to have negative relationship with TME scores, which meant worse effective immunotherapy as reported. Moreover, different types of immune cells were associated with the risk score. For example, macrophages M1, NK cell, plasmacytoid dendritic cell, T cell CD4+ effector memory, T cell CD8+ central memory and T cell CD8+ naive were negative correlation with risk score. while, macrophage M2, macrophage/monocyte, monocyte and T cell regulatory (Tregs) were positive correlation with risk score (figure 7A). All of these showed the high-risk group had a lower immune infiltration status, signifying a different TME from the low-risk group. The TME analysis applied ESTIMAT score to evaluate tumor microenvironment. Besides, concerning the comparison of the single sample GSEA (ssGSEA) scores for immune cells and immune functions, the low-risk group has higher B cells, CD8+T cells, DCs, mast cells, neutrophils, NK cells, T helper cells and tumor infiltrating lymphocyte (TIL) (figure 7B. It was also upregulated for ACP co inhibition, cytolytic activity, HLA, T cell co-inhibition, T cell co-stimulation, type I IFN response and type II IFN response in the low-risk group (figure 7C). Immune checkpoint gene expression between risk groups were shown in box plot, almost all the immune checkpoints expressed more activity in high-risk group, such as TNFSF18,CTLA4, PDCD1 (PD-1) and CD276 (figure 7D ).
Analysis of IC50 Drug Database on GDSC
Due to results presented before, we considered it practicable that reliable immune related therapy could be selected. Thus treatment analysis drug index of IC50 based on the risk model using pRRophetic R package were underwent. We found that the high-risk patients had a lower IC50 of 13 immunotherapeutic drugs such as mitomycin, FTI-277, ABT-263 and so on (Figure 8A), while 16 candidate drugs in low risk groups had a lower IC50 such as Sorafenib, Temsirolimus, Methotrexate and so on (Figure 8B).