3.1 Identification of prognostic-related immune-related lncRNA in osteosarcoma
Firstly, we isolated lncRNA expression data from the RNA-SEQ data of 88 patients downloaded from the TCGA database. Subsequently, a total of 2498 immune-related genes were extracted from the ImmPort database. Supplementary Table 1 provides detailed information on these immune-related genes. As shown in Supplementary Table 2, 1986 immune-associated lncRNA were identified by constructing immune-lncRNA co-expression networks. We analyzed the relationship between these immune-related lncRNA and the survival of 85 patients using univariate Cox regression analysis (3 patients lack valid clinical data). Finally, 240 lncRNAs were identified as survival-related immune-related lncRNA and used for further analysis. Supplementary Figure 1 and Supplementary Table 3 shows the results of univariate cox analysis of these lncRNAs.
3.2 Establishing a risk score and testing its prognostic value in osteosarcoma
As mentioned above, these lncRNAs related to prognosis were subjected to 1000 iterations using LASSO analysis. Subsequently, the 29 lncRNA retained in more than 50 iterations were considered important lncRNA for further analysis. Supplementary Table 1 shows the details of these lncRNA. Incorporating these important lncRNA into the COX model, the optimal model consisting of 13 lncRNA was determined based on the 5-year survival AUROC. The risk score of each patient was calculated, and the cut-off value of the risk score was determined to be 0.186 using the time-dependent receiver operating characteristic (ROC) curve analysis. The results of the survival analysis showed that patients in the low-risk group had longer survival rate than patients in the high-risk group (Figure 1D, P<0.001). Figure 1 shows the process of building this model. Supplementary Figure 2 shows the relationship between 13 lncRNA and immune genes.
To verify the independent prognostic value of the risk score, we performed a multivariate regression analysis. As shown in Figure 2B, after adjusting for other variables (including age, gender, and metastatic information), we found that the risk score may be an independent predictor. The results of the forest plot show that the higher the risk score, the shorter the overall survival of the patient (hazard ratios: 2.974, 95% of confidence intervals: 2.164-4.088, P= 1.89e−11). The results of the time-dependent ROC analysis show that the risk score has good discriminative ability. As shown in Figure 2A, no matter how the patient's survival time changes, the risk score always has an excellent discriminating ability. On the contrary, as the patient's survival time prolongs, the discriminating ability of metastatic status continues to decline.
3.3 Relationship between risk score and clinical characteristics
To further explore the strength of the risk score, we divided patients into 6 subgroups based on their age, gender and metastatic status, and explored the prognostic value of risk scores in each subgroup. Among them, 48 were in the male group, 37 were in the female group, 66 were in the Age<18 years old group, 19 were in the Age>=18 years old group, 64 were in the non-metastasis group, and 21 were in the metastasis group. As shown in Figure 3A, the risk score shows a good prognostic value in each subgroup. Metastatic status is an important basis for formulating treatment plans for patients with osteosarcoma. Therefore, we further divided patients into 4 groups according to their metastatic status and risk status. Among them, group 1 has 45 patients, group 2 has 8 patients, group 3 has 19 patients, and group 4 has 13 patients. As shown in Figure 3B, there was no significant difference in overall survival rate between metastatic patients and non-metastatic patients in the low-risk group. Among patients in the metastasis group, patients in the low-risk group had a longer overall survival than those in the high-risk group. Since the prognosis of metastatic patients is affected by some clinical factors such as the metastatic location and treatment methods. We further analyzed whether the prognostic value of the risk score for metastatic patients is affected by the location of metastasis. According to the patient's clinical information, 6 out of 21 metastatic patients with complete survival data had bone metastases. Our results show that after adjusting the metastatic site, the risk score is an independent prognostic factor for metastatic patients.
3.4 Evaluation of differences in immune infiltration between high- and low-risk groups
We further evaluated the differences in immunological characteristics between the two groups of patients. As mentioned above, the abundance of 10 immune-related cells was calculated using the MCP-counter method. As shown in Figure 4A, a significant difference was observed between the two groups of patients. Compared with patients in the high-risk group, the abundance of the 8 cell populations in the patients in the low-risk group was higher (B-cell lineage, CD8+T cells, Cytotoxic lymphocytes, Endothelial cells, Monocytic lineage cells, Neutrophils, NK cells, T cells). Then, the estimate, immune and stromal scores were calculated using the ESTIMATE algorithm. Similarly, patients in the low-risk group had higher estimate scores, immune scores, and stromal scores than those in the high-risk group (p < 0.001, Figure 4B). We further used ssGSEA to evaluate the difference of 29 immune-related gene sets or immune cells between the two groups of patients as a supplement. The results of ssGSEA have reached similar conclusions. Most immune-related gene sets or immune cells are significantly enriched in the low-risk group (Figure 4C). Finally, we explored the relationship between abundance of 10 immune-related cells and risk score. As shown in Figure 5, with the increasing risk score, the abundance of immune cells kept decreasing, especially CD8 + T cells.
The differences in gene expression of some immune-related pathways were further evaluated and heatmaps were drawn to show the results. We found that genes related to activated T effector and IFNγ pathway such as STAT1, CCL4, CXCL9, CXCL10 were significantly up-regulated in the low-risk group (Figure 6B).
3.5 The relationship between signature and the expression of immune checkpoint gene, tumor microenvironment type
As mentioned above, the association between the expression of 7 potentially targetable immune checkpoint genes between the two groups was assessed. As shown in the figure, all immune checkpoint genes in the low-risk group were more highly expressed, although PDCD1 did not show statistical significance (Figure 6A). Further analysis showed that the CD8A gene is also highly expressed in the low-risk group. Similarly, tumor lymphocyte infiltration was higher in the low-risk group. However, the optimal cut-off values for PD-L1 and CD8A mRNA expression have not been determined. Therefore, we analyzed the relationship between the risk score value and PD-L1 gene expression, CD8A gene expression, and TIL. The results showe that as the risk score increased, the expression of PD-L1 gene and CD8A gene decreased (Figure 6C-D). Unfortunately, although TIL also showed a similar trend, it did not reach statistical significance (Figure 6E). However, the results of the box plot show that the TIL of the low-risk group is higher than that of the high-risk group (Figure 4C).
In addition, we used SubMap analysis to further study the relationship between MRGP signature and immunotherapy efficiency. Using subclass mapping, the expression profiles of the two groups of patients (high-risk group and low-risk group) were compared with a published immunotherapy data set. This data set records the expression data of 47 melanoma patients treated with programmed cell death protein 1 (PD-1) immune checkpoint inhibitors or cytotoxic T lymphocyte associated protein 4 (CTLA-4) immune checkpoint inhibitors. The results showed that the expression profiles of patients in the low-risk group were correlated with those in the PD-L1 response group (Figure 6F, Bonferroni P value=0.016). This indicates that patients in the low-risk group are more likely to benefit from PD-L1 therapy. We further evaluated the characteristics of patients in the lncRNA signature group. As shown in Figure 6G, there is a higher proportion of metastatic patients in the high-risk group.
3.6 Identifying the differences in biological pathways and processes between the two groups of patients
We used GSVA to study the differences in biological pathways between the two groups of patients. As shown in the figure 7, significant differences were observed in the biological pathways between the two groups of patients. We found that in the low-risk group, most immunization-related pathways had higher GSVA scores. It is worth noting that patients in the high-risk group had higher GSVA scores for certain metabolic-related pathways. Supplemental Table 4 shows detailed information on GSVA results.
3.7 Construction of Nomogram Based on Risk Score
We developed a nomogram that combined risk scores with traditional clinical factors (Figure 8), and tested the accuracy of the nomogram using a calibration curve. As shown in Figure 8B, the 3-year and 5-year overall survival predictions show that the nomograms have good accuracy. The C index of the nomogram is 0.924, which indicates that our nomogram has a good degree of discrimination. These results suggest that the new nomogram shows reliable performance in predicting patient prognosis. Finally, the results of the DCA analysis show that the model combined with the risk score can bring clinical net benefits.