Identification of immune-related lncRNAs correlated with OS prognosis
A total of 1126 immune-related lncRNAs and their expression were obtained through co-expression analysis with immune genes. Based on the training set, a total of 25 lncRNAs correlated with OS prognosis at P < 0.01 were detected in univariate Cox regression analysis with the R package Survival (Table 1).
Table 1 Univariate cox regression analysis for 25 immune-related lncRNAs were correlated with Osteosarcoma
id
|
HR
|
Lower 95% CI
|
Upper 95% CI
|
P value
|
ZFAS1
|
2.832085
|
1.378306
|
5.819248
|
0.00460853
|
RP11-69E11.4
|
2.01796
|
1.287032
|
3.163994
|
0.00221606
|
RP11-304F15.3
|
2.056553
|
1.203551
|
3.51411
|
0.00834548
|
EPB41L4A-AS1
|
2.919649
|
1.305154
|
6.531299
|
0.00910008
|
RP11-535M15.1
|
2.235202
|
1.246264
|
4.008885
|
0.00696358
|
SNHG7
|
2.843028
|
1.41978
|
5.692998
|
0.00318482
|
RP3-460G2.2
|
0.44072
|
0.244987
|
0.792835
|
0.0062419
|
GAS5
|
2.700873
|
1.667261
|
4.375268
|
0.0000542
|
OLMALINC
|
3.011264
|
1.318521
|
6.877184
|
0.0088915
|
ELFN1-AS1
|
2.148336
|
1.40772
|
3.278599
|
0.00039184
|
UNC5B-AS1
|
2.272839
|
1.403979
|
3.679398
|
0.00083625
|
RP5-894A10.2
|
4.0175
|
1.454702
|
11.09527
|
0.00729447
|
SNHG6
|
2.827236
|
1.511355
|
5.288806
|
0.00114413
|
MIR210HG
|
2.141462
|
1.25106
|
3.665579
|
0.00549075
|
RP11-774O3.3
|
0.255121
|
0.09958
|
0.653615
|
0.00442894
|
RP11-21L23.3
|
2.629617
|
1.320111
|
5.238108
|
0.00596264
|
RP11-750H9.5
|
0.302797
|
0.137799
|
0.665361
|
0.00293673
|
RP11-136I14.5
|
1.618776
|
1.135072
|
2.308607
|
0.00782557
|
CTD-2341M24.1
|
0.229614
|
0.075576
|
0.697613
|
0.00945731
|
RP11-45P15.4
|
0.268225
|
0.098707
|
0.728871
|
0.00987941
|
RP1-239B22.5
|
2.383866
|
1.249241
|
4.549015
|
0.00841514
|
RP11-81A22.5
|
1.439271
|
1.106446
|
1.872212
|
0.00665094
|
PARD6G-AS1
|
2.333596
|
1.31905
|
4.128477
|
0.00359924
|
RP5-894A10.6
|
3.627491
|
1.576219
|
8.348263
|
0.002446
|
CTD-2227E11.1
|
2.839231
|
1.422439
|
5.667188
|
0.00308429
|
Abbreviations: TARGET, Therapeutically Applicable Research to Generate Effective Treatment; HR, hazard ratio; |
CI, confidence interval |
Construction and evaluation of the risk score signature
For the 25 prognosis-related lncRNAs, LASSO Cox regression analysis was used to select appropriate lncRNAs to construct the model [17]. Eleven lncRNAs were obtained by 10-fold cross-validation, in which the partial likelihood deviance was at its minimum (Fig. 1). Subsequently, multivariate Cox regression analysis and KM survival analysis were conducted to construct the risk signature. A combination of five lncRNAs (RP11-69E11.4, SNHG6, MIR210HG, RP11-750H9.5 and CTD-2341M24.1) (Table 2) was selected as the most suitable predictor for prognosis. A risk score model was constructed to predict the prognosis of patients with OS as follows: RP11-69E11.4 × 1.219 + MIR210HG × 0.905 + MIR210HG × 1.000 − RP11-750H9.5 × 1.479 − CTD-2341M24.1 × 1.763.
Table 2
Multivariate cox regression analysis for RP11-69E11.4, SNHG6, MIR210HG, RP11-750H9.5, and CTD-2341M24.1
id
|
Coefficient
|
HR
|
Lower 95% CI
|
Upper 95% CI
|
P value
|
RP11-69E11.4
|
1.219242
|
3.384621
|
1.834678
|
6.243962
|
0.000095
|
SNHG6
|
0.904814
|
2.471473
|
1.102726
|
5.53916
|
0.027988
|
MIR210HG
|
0.999502
|
2.716928
|
1.409203
|
5.238208
|
0.002844
|
RP11-750H9.5
|
-1.4786
|
0.227956
|
0.080779
|
0.643281
|
0.005215
|
CTD-2341M24.1
|
-1.76314
|
0.171506
|
0.043114
|
0.68225
|
0.012325
|
Abbreviations: coef, coefficient |
Taking the median of risk scores as the cut-off, the OS patients were classified into high-and low-risk groups. Across the dataset, the KM survival analysis also showed a significant correlation of the 5 lncRNAs with survival and prognosis, that high-risk group patients had significantly poorer prognosis compared to low-risk ( P< 0.01 ) (Fig. 2).
The high-risk and low-risk groups was compared in survival analysis curves, indicating that the overall survival period of the patients in the low-risk group was much higher than that of the high-risk group, and the risk score and number of deaths in the high-risk group were also much higher than that of the low-risk group (Figs. 3.1, 3.2 and 3.3).
The time-dependent AUCs for the risk signature of training set were 0.925 at 1-year, 0.929 at 3-year, and 0.961 at 5-year (Fig. 4a), of validation set were 0.921 at 1-year, 0.898 at 3-year, and 0.878 at 5-year (Fig. 4b), of overall data set were 0.913 at 1-year, 0.881 at 3-year, and 0.882 at 5-year (Fig. 4c). All of sets were greater than 0.87, indicating good prediction accuracy of the model.
Furthermore, the validation set and the overall data set were well consistent with the training set that suggested by the calibration curves for 1-year, 3-year and 5-year survival rates (Fig. 5).
Construction and verification of the nomogram
Incorporating clinical factors of the OS dataset into the model of the overall dataset, the results of the univariate and multivariate Cox regression analysis showed that our model also played a significant role in clinical predictability (Table 2). Therefore, a nomogram integrating the risk score and four clinical factors was established to predict OS prognosis (Fig. 6). Each variable has its own score, and the 1-year, 3-year and 5-year predicted survival can be obtained by summing the scores of the different variables on the nomogram.
GO and KEGG enrichment analysis
In GO term, the results of immune genes co-expressed with modeled lncRNAs, showed that the biological process (BP) of co-expressed immune-related genes was enriched to the leukocyte migration and proliferation, cell chemotaxis and monocyte proliferation. The cellular component (CC) was enriched to the MHC protein complex, the outer side of the plasma membrane, the MHC class II protein complex and so on. The molecular function (MF) was enriched to cytokine binding, receptor-ligand activity and cytokine receptor activity (Fig. 7). All above of them were associated with the risk signature, which affect the proliferation, migration, chemotaxis and combination of Osteosarcoma-related immune cells.
In addition, co-expressed immune-related genes were involved in KEGG pathways, such as B-cell receptor signaling, NF-κB signaling pathway, MAPK signaling pathway, chemokine signaling pathway, JAK-STAT signaling pathway and some other cancer-related pathways (Fig. 8). The lncRNAs play a key role in the development of OS and contribute to it was indicated in above enrichment results.
The GSEA software provides a method for analyzing large gene expression data, interpreting the gene expression data, and aggregating and visualizing genes with common biological functions [18]. With the data of high- and low-risk groups were identified through GSEA, it was found that the high-risk group in the model significantly clustered into the immune-related gene set, while it gradually decreased when it came to the low-risk group, as shown in Fig. 9. Combining with the former overall survival analysis suggested that the survival of the low-risk group was significantly higher than that of the high-risk group, indicating that lncRNAs may play a crucial role in the prognosis of OS.