Baseline
According to the selection criteria, 255 soft tissue sarcoma patients were included in this study. Of these, 170 patients were assigned to the training set and the remaining 85 patients formed the testing set. The baseline information of all patients is shown in Table 1. The differences in clinical data between the training and testing sets were not statistically significant.
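The 170/85 allocation corresponds to a 2:1 partition of the 255 patients. The text does not state how patients were allocated, so the following is only a minimal sketch that assumes a seeded random shuffle; the function name `split_cohort`, the seed, and the placeholder patient IDs are hypothetical.

```python
import random

def split_cohort(patient_ids, n_train, seed=0):
    """Randomly partition patient IDs into a training and a testing set."""
    ids = list(patient_ids)
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    rng.shuffle(ids)
    return ids[:n_train], ids[n_train:]

# Placeholder IDs standing in for the 255 patients (not real identifiers)
patients = [f"P{i:03d}" for i in range(255)]
train_ids, test_ids = split_cohort(patients, n_train=170)
print(len(train_ids), len(test_ids))
```

Fixing the seed keeps the partition reproducible, which matters when the baseline comparison between the two sets is reported.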
Identification of prognostic IRGs in sarcoma patients
To identify prognostic IRGs in soft tissue sarcoma patients, a univariate Cox proportional hazards model was fitted in the 170 training-set patients. In total, 105 IRGs were identified as OS-related IRGs (Supplementary 1). GO and KEGG enrichment analyses were then performed, and the results are shown in Figure 1. For BP, the major enriched GO terms were defense response to other organism, positive regulation of cell adhesion, regulation of innate immune response, regulation of leukocyte activation, and positive regulation of leukocyte activation. For CC, the major enriched GO terms were adherens junction, focal adhesion, cell-substrate adherens junction, cell-substrate junction, and Schaffer collateral-CA1 synapse. For MF, the OS-related IRGs were mainly enriched in receptor ligand activity, receptor regulator activity, growth factor activity, cytokine activity, and steroid hormone receptor activity. KEGG pathway analysis identified many immune- or tumor-related pathways, such as T cell receptor signaling pathway, natural killer cell mediated cytotoxicity, Kaposi sarcoma-associated herpesvirus infection, PD-L1 expression and PD-1 checkpoint pathway in cancer, Th1 and Th2 cell differentiation, and NF-kappa B signaling pathway.
Construction of a TF regulatory network
To elucidate the regulatory mechanisms of the OS-related IRGs, we used a univariate Cox proportional hazards model to identify OS-related TFs. In total, 36 OS-related TFs were identified (Supplementary 2). We then analyzed the correlation between the expression of OS-related TFs and OS-related IRGs; the results are shown in Supplementary 3. Interestingly, all TFs were positively correlated with IRGs. To better illustrate the regulatory relationships between TFs and IRGs, a TF-based regulatory network was generated (Figure 2).
Development and validation of the IRG prognostic signature in the training cohort
Based on the 105 OS-related IRGs, LASSO regression was used to select the most appropriate genes as prognostic predictors for the model, and 35 genes were retained (Figure 3). A multivariate Cox proportional hazards model was then fitted to the genes selected by LASSO, and a prognostic signature was established based on 19 prognostic IRGs (Supplementary 4 and Figure 4). The time-dependent ROC curves at 1, 2, and 3 years are shown in Figure 4C. The corresponding AUC values were 0.938, 0.937, and 0.935, respectively, suggesting that the prognostic signature can serve as a valid tool for prognostic prediction in sarcoma patients (Figure 4C). In addition, the risk score of each patient in the training set was calculated, and the median risk score was used as the cutoff to stratify patients into high-risk (n=85) and low-risk (n=85) groups. Survival curves were generated, and the log-rank test indicated that patients in the low-risk group had a more favorable prognosis (Figure 4D).
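The risk score and median stratification described above can be sketched as follows. This assumes the usual linear-predictor form (the sum of each gene's multivariate Cox coefficient multiplied by its expression level); the actual coefficients are in Supplementary 4 and are not reproduced here, so the gene names, coefficients, and expression values below are hypothetical.

```python
from statistics import median

def risk_score(expression, coefficients):
    """Linear predictor: sum of (Cox coefficient x expression) over signature genes."""
    return sum(coef * expression[gene] for gene, coef in coefficients.items())

def stratify_by_median(scores):
    """Label a patient 'high' if the score exceeds the cohort median, else 'low'."""
    cutoff = median(scores.values())
    groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
    return groups, cutoff

# Hypothetical coefficients and expression values, for illustration only
coefs = {"GENE_A": 0.5, "GENE_B": -0.25}
cohort = {
    "P1": {"GENE_A": 2.0, "GENE_B": 1.0},
    "P2": {"GENE_A": 1.0, "GENE_B": 4.0},
    "P3": {"GENE_A": 4.0, "GENE_B": 0.0},
    "P4": {"GENE_A": 0.0, "GENE_B": 2.0},
}
scores = {pid: risk_score(expr, coefs) for pid, expr in cohort.items()}
groups, cutoff = stratify_by_median(scores)
print(scores, cutoff, groups)
```

With an even number of patients and no tied scores, a median cutoff yields two groups of equal size, which is consistent with the 85/85 split reported for the training set.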
External validation of the IRG signature
To further verify the stability and reliability of the IRG-based risk signature, the independent testing set was used. Using the same risk score formula as in the training set, the risk score of each patient in the testing set was calculated (Figure 5). Time-dependent ROC curves were generated to assess the discrimination of the signature (Figure 5C). The AUC values at 1, 2, and 3 years were 0.730, 0.717, and 0.647, respectively, indicating that the signature retained acceptable accuracy in predicting the OS of sarcoma patients, although the values were lower than in the training set. Furthermore, according to the median risk score in the testing set, the 85 patients were stratified into a low-risk group (n=43) and a high-risk group (n=42). Survival curves of the two groups were generated, and the results indicated that patients in the high-risk group had a worse prognosis (Figure 5D).
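To make the meaning of an AUC at a fixed time point concrete, the sketch below computes a simplified time-dependent AUC. It is not the estimator used in this study (the text does not name one): it simply treats patients who died by the horizon as cases and patients followed beyond the horizon as controls, and it drops patients censored before the horizon instead of weighting them. The function name and all data are hypothetical.

```python
def auc_at_time(times, events, scores, horizon):
    """Simplified time-dependent AUC at `horizon`.

    Cases: patients who died on or before `horizon`.
    Controls: patients still under follow-up after `horizon`.
    Patients censored before `horizon` are dropped (no censoring weights).
    """
    cases = [s for t, e, s in zip(times, events, scores) if e == 1 and t <= horizon]
    controls = [s for t, e, s in zip(times, events, scores) if t > horizon]
    if not cases or not controls:
        raise ValueError("need at least one case and one control")
    # Probability that a random case has a higher score than a random control
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))

# Hypothetical follow-up times (years), event flags (1 = death) and risk scores
times = [0.5, 0.8, 2.5, 3.0, 0.6, 4.0]
events = [1, 1, 1, 0, 0, 0]
scores = [2.1, 0.4, 1.0, 0.2, 1.5, 0.9]
auc_1y = auc_at_time(times, events, scores, horizon=1.0)
print(round(auc_1y, 3))
```

An AUC of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is the scale on which the training and testing values above should be read.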
Development of a nomogram based on the IRG signature and clinical data
To construct a prognostic nomogram combining the IRG signature and clinical data, we performed univariate and multivariate Cox regression analyses to identify independent prognostic variables for soft tissue sarcoma patients. In the univariate analysis, age, multifocal disease, confirmed metastatic disease, surgical margin resection status, and risk score were associated with the prognosis of sarcoma patients (Table 2). The variables that were significant in the univariate Cox analysis were then entered into the multivariate Cox analysis, and all five were identified as independent prognostic variables (Table 2). In the multivariate Cox analysis, the risk score had the greatest impact on OS. In addition, older age, multifocal sarcoma, tumor metastasis, and surgical resection status (R1-2) were also associated with a worse prognosis in soft tissue sarcoma patients (Table 2). To better predict the prognosis of soft tissue sarcoma patients, we constructed a nomogram based on the independent factors identified in the multivariate regression (Figure 6A). The C-index of the nomogram was 0.775 (95% CI: 0.751-0.799), indicating good accuracy in predicting the prognosis of soft tissue sarcoma patients. The calibration plots indicated that the OS predicted by the nomogram was highly consistent with the observed OS (Figure 6B-6D). In addition, DCA was performed, and the results indicated that the nomogram can serve as an effective prognostic model for soft tissue sarcoma patients (Figure 6E-6G).
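For readers unfamiliar with the C-index, the following is a minimal sketch of Harrell's concordance index. The text does not state which estimator produced the reported value, so Harrell's definition is an assumption here, and the data are hypothetical.

```python
def concordance_index(times, events, scores):
    """Harrell's C: fraction of comparable pairs in which the patient with the
    shorter observed survival has the higher risk score (score ties count 0.5)."""
    concordant, comparable = 0.0, 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable if i had an observed event strictly before j's time
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if scores[i] > scores[j]:
                    concordant += 1.0
                elif scores[i] == scores[j]:
                    concordant += 0.5
    return concordant / comparable

# Hypothetical survival times, event flags (1 = death) and predicted risks
times = [1, 2, 3, 4, 5]
events = [1, 1, 0, 1, 0]
scores = [0.9, 0.4, 0.6, 0.3, 0.1]
c_index = concordance_index(times, events, scores)
print(c_index)  # 7 of 8 comparable pairs are concordant -> 0.875
```

Pairs in which the earlier time is censored are not comparable, because it is unknown which of the two patients actually survived longer.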
Table 2. Univariate and multivariate Cox analysis in sarcoma patients

| Variable | Univariate HR | Univariate 95% CI | Univariate P | Multivariate HR | Multivariate 95% CI | Multivariate P |
|---|---|---|---|---|---|---|
| Age | 1.020 | 1.004-1.036 | 0.012 | 1.025 | 1.008-1.043 | 0.004 |
| Sex | | | | | | |
| Female | | | | | | |
| Male | 0.849 | 0.565-1.274 | 0.429 | | | |
| Race | | | | | | |
| Asian | | | 0.412 | | | |
| African American | 1.108 | 0.135-9.098 | 0.924 | | | |
| White | 0.810 | 0.111-5.907 | 0.836 | | | |
| Unknown | 2.004 | 0.207-19.378 | 0.548 | | | |
| Histological type | | | | | | |
| DLP | | | 0.903 | | | |
| LMS | 0.855 | 0.515-1.419 | 0.543 | | | |
| MYX | 0.730 | 0.339-1.571 | 0.421 | | | |
| UPS | 0.960 | 0.509-1.810 | 0.899 | | | |
| Other | 0.746 | 0.321-1.734 | 0.496 | | | |
| Multifocal indicator | | | | | | |
| No | | | 0.002 | | | 0.008 |
| Yes | 2.328 | 1.443-3.754 | 0.001 | 2.228 | 1.324-3.748 | 0.003 |
| Unknown | 1.220 | 0.606-2.459 | 0.577 | 1.135 | 0.486-2.651 | 0.770 |
| Metastasis | | | | | | |
| No | | | <0.001 | | | <0.001 |
| Yes | 2.962 | 1.797-4.880 | <0.001 | 2.926 | 1.721-4.972 | <0.001 |
| Unknown | 1.795 | 1.078-2.987 | 0.024 | 1.770 | 1.005-3.115 | 0.048 |
| Radiotherapy | | | | | | |
| No | | | 0.963 | | | |
| Yes | 1.012 | 0.633-1.620 | 0.959 | | | |
| Unknown | 1.075 | 0.638-1.812 | 0.786 | | | |
| Surgical margin resection status | | | | | | |
| R0 | | | <0.001 | | | 0.005 |
| R1-2 | 2.418 | 1.572-3.719 | <0.001 | 1.974 | 1.257-3.099 | 0.003 |
| Unknown | 2.194 | 1.156-4.165 | 0.016 | 2.099 | 0.960-4.589 | 0.063 |
| Tumor site | | | | | | |
| Extremity | | | 0.745 | | | |
| Other | 1.182 | 0.699-1.997 | 0.533 | | | |
| Retroperitoneum/Upper abdominal | 1.193 | 0.735-1.934 | 0.475 | | | |
| Risk | 5.192 | 3.215-8.382 | <0.001 | 5.362 | 3.241-8.868 | <0.001 |
LMS: Leiomyosarcoma; DLP: Dedifferentiated liposarcoma; UPS: Undifferentiated pleomorphic sarcoma; MYX: Myxofibrosarcoma
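The HR, 95% CI, and P columns of Table 2 are linked if the intervals are Wald-type, that is, symmetric around the coefficient on the log scale. That is an assumption here, since the text does not state how the intervals were computed. Under it, the coefficient, standard error, z statistic, and two-sided P value can be recovered from a reported row, as in this sketch using the multivariate Risk row (the function name `wald_from_hr` is hypothetical).

```python
import math

def wald_from_hr(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover coefficient, standard error, z and two-sided p from an HR and its 95% CI,
    assuming a Wald interval: exp(beta +/- z_crit * se)."""
    beta = math.log(hr)
    se = (math.log(ci_high) - math.log(ci_low)) / (2 * z_crit)
    z = beta / se
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return beta, se, z, p

# Multivariate Risk row of Table 2: HR 5.362, 95% CI 3.241-8.868
beta, se, z, p = wald_from_hr(5.362, 3.241, 8.868)
print(f"beta={beta:.3f} se={se:.3f} z={z:.2f} p<0.001: {p < 0.001}")

# Under the Wald assumption the HR is the geometric mean of the CI bounds
print(abs(math.sqrt(3.241 * 8.868) - 5.362) < 0.01)
```

Because the table reports values rounded to three decimals, P values recovered this way can differ slightly from the printed ones.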
Subgroup analyses of nomogram
To confirm that the nomogram performs stably across different histological types of soft tissue sarcoma, patients were divided into subgroups (Figure 7). The AUC values of the nomogram were at least 0.750 (range: 0.750-0.867) in all subgroups, indicating stable performance across histological types. In addition, in all subgroups, the Kaplan-Meier survival curves and the log-rank test indicated that patients in the high-risk group had a poorer prognosis than patients in the low-risk group (all p<0.05) (Figure 7).
Comparison of immune cell infiltration between the low- and high-risk groups
After the CIBERSORT analysis, 169 patients with complete immune cell infiltration data were included in this part of the study (Figure 8A). Among these 169 patients, 87 were in the low-risk group and 82 were in the high-risk group. Six immune cell types differed significantly between the two groups (Figure 8B). The infiltration levels of plasma cells and M0 macrophages were significantly higher in the high-risk group, while the infiltration levels of resting NK cells, activated NK cells, monocytes, and M1 macrophages were significantly higher in the low-risk group (Figure 8B).
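The text does not name the test used to compare infiltration levels between the risk groups. A rank-based test such as the Mann-Whitney U (Wilcoxon rank-sum) test is a common choice for cell fractions, so the sketch below assumes it. It uses a normal approximation without tie or continuity correction, and the infiltration fractions are hypothetical.

```python
import math

def mann_whitney_u(x, y):
    """Two-sided Mann-Whitney U test using the normal approximation
    (no tie or continuity correction; adequate only as an illustration)."""
    n1, n2 = len(x), len(y)
    # U counts pairs where the x value exceeds the y value (ties count 0.5)
    u = sum((a > b) + 0.5 * (a == b) for a in x for b in y)
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    z = (u - mean_u) / sd_u
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal tail probability
    return u, p

# Hypothetical fractions of one immune cell type in each risk group
low_risk = [0.10, 0.12, 0.15, 0.18, 0.20]
high_risk = [0.02, 0.04, 0.05, 0.08, 0.11]
u, p = mann_whitney_u(low_risk, high_risk)
print(u, p < 0.05)
```

In a real analysis the test would be repeated for each immune cell type, so some adjustment for multiple comparisons would usually be considered.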