Profiling of DEARGs in sarcoma patients
Differential expression analysis was performed between 911 normal samples and 259 sarcoma samples. In total, 5609 genes were dysregulated in sarcoma samples (Figure 1A). By matching these genes against the ARG list from HADb, 84 DEARGs were identified (Figure 1B). Of these 84 DEARGs, 40 were highly expressed in sarcoma samples. The chromosomal locations of the 84 DEARGs are shown in Figure 1C.
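The matching step amounts to intersecting the list of dysregulated genes with the ARG list. A minimal sketch follows, assuming both lists are available as gene symbols; the function name `match_args` and all gene names are hypothetical placeholders, not the study's data.

```python
def match_args(deg_symbols, arg_symbols):
    """Return the dysregulated genes that also appear in the ARG list."""
    # Normalise case and surrounding whitespace so that differently
    # formatted copies of the same symbol are treated as one gene.
    degs = {s.strip().upper() for s in deg_symbols}
    args = {s.strip().upper() for s in arg_symbols}
    return sorted(degs & args)


# Toy inputs for illustration only (placeholder symbols).
toy_degs = ["GENE_A", "gene_b", "GENE_C", "GENE_D "]
toy_args = ["GENE_D", "GENE_A", "GENE_E"]

deargs = match_args(toy_degs, toy_args)
print(deargs)
```

Normalising the symbols before intersecting guards against trivial formatting mismatches between sources, although it does not resolve gene aliases.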
To further understand the molecular mechanisms of the DEARGs, we performed GO and KEGG annotation analyses. The GO analysis showed that the DEARGs were mainly involved in “autophagy”, “vacuole”, “apoptotic signaling pathway”, “regulation of autophagy”, and “regulation of cellular response to stress” (Figure 1D and 1E). Furthermore, KEGG analysis demonstrated that the DEARGs were mainly enriched in the pathways “Autophagy – animal”, “Mitophagy – animal”, “Pathways in cancer”, “Apoptosis”, and “Protein processing in endoplasmic reticulum” (Figure 1F and 1G). Overall, both GO and KEGG analyses showed that these 84 DEARGs were significantly associated with autophagy and malignancy.
DEARGs are robust diagnostic biomarkers for sarcoma
Based on the 84 DEARGs and LASSO analysis, 15 DEARGs were selected (Figure 2A). The ROC curves and corresponding AUC values of these genes are shown in Figure 2B. The AUC values of these genes were all higher than 0.950, indicating excellent discrimination between sarcoma and normal samples. Furthermore, expression data for 12 of the 15 genes were obtained from the GSE2719 and GSE21122 cohorts. In the GSE2719 cohort, 9 of the 12 genes were confirmed as diagnostic biomarkers for sarcoma (AUC > 0.500 and P < 0.05), with AUC values ranging from 0.709 to 0.950 (Figure 2C). Similarly, in the GSE21122 cohort, 9 of the 12 genes were confirmed as diagnostic genes for sarcoma (AUC > 0.500 and P < 0.05), with AUC values ranging from 0.704 to 0.991 (Figure 2D). Notably, seven genes were successfully validated in both the GSE2719 and GSE21122 cohorts, and BIRC5 had the highest AUC value in both cohorts. Therefore, we speculated that BIRC5 may be a robust diagnostic and therapeutic biomarker for sarcoma.
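For readers less familiar with the AUC, the value for a single gene can be read as the probability that a randomly chosen sarcoma sample shows higher expression than a randomly chosen normal sample. The sketch below computes it by pairwise comparison; the function name `auc` and the expression values are invented for illustration and are not taken from the study, which may have used a dedicated ROC package.

```python
def auc(tumor_values, normal_values):
    """AUC as the fraction of (tumor, normal) pairs in which the tumor
    sample has the higher expression; ties count as one half."""
    if not tumor_values or not normal_values:
        raise ValueError("both groups need at least one sample")
    wins = 0.0
    for t in tumor_values:
        for n in normal_values:
            if t > n:
                wins += 1.0
            elif t == n:
                wins += 0.5
    return wins / (len(tumor_values) * len(normal_values))


# Toy expression values for one hypothetical gene.
tumor = [7.9, 8.4, 6.1, 9.0]
normal = [5.2, 6.1, 4.8, 7.0]

gene_auc = auc(tumor, normal)
print(round(gene_auc, 3))
```

Under this definition a gene that is down-regulated in tumors yields an AUC below 0.5, and its discriminative strength corresponds to one minus that value.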
ARG-based clusters associated with tumor microenvironment
Unsupervised cluster analysis divided the 259 sarcomas into two clusters (Figure 3A-3C). Tumor site, histological type, sex, and age differed significantly between the two clusters (Figure 3D). Moreover, tumor microenvironment analysis further revealed the heterogeneity between the two clusters. Compared with cluster 2, cluster 1 had lower immune and stromal scores (Figure 3E). Specifically, the fractions of naive B cells, resting memory CD4 T cells, activated NK cells, and resting mast cells were significantly higher in cluster 1, whereas the fractions of activated memory CD4 T cells, monocytes, M2 macrophages, and neutrophils were significantly lower in cluster 1 (Figure 3H). In addition to immune cells, we also found that the expression of common immune checkpoints differed significantly between the two clusters (Figure 3G), but there was no significant difference in MSI (Figure 3F).
Construction of two favorable ARG signatures for sarcoma patients
Univariate Cox regression found that 17 DEARGs were significantly related to OS and 4 DEARGs were related to DFS. Subsequently, in the LASSO analysis, seven OS-related DEARGs were excluded, but no DFS-related genes were excluded (Figure 4C and 4D). Finally, by multivariate Cox analyses, five and three DEARGs were incorporated into the OS and DFS signatures, respectively (Figure 4A and 4B). According to the median risk score, all patients were divided into low-risk and high-risk groups. Kaplan-Meier survival analyses showed that both OS and DFS were significantly worse in the high-risk group than in the low-risk group (Figure 5A and 5C). The ROC curves also indicated favorable discrimination of the signatures. The 3- and 5-year AUC values of the OS signature were both 0.744 (Figure 5B), and the AUC values of the DFS signature at 3 and 5 years were 0.644 and 0.668, respectively (Figure 5D).
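The risk grouping described above can be sketched as follows, assuming the conventional form of a Cox-based signature in which the risk score is the sum of each gene's expression weighted by its multivariate Cox coefficient. The formula is an assumption, since it is not stated in this section, and the gene names, coefficients, and patient values are hypothetical placeholders.

```python
from statistics import median


def risk_score(expression, coefficients):
    """Weighted sum of expression values over the signature genes."""
    return sum(coefficients[g] * expression[g] for g in coefficients)


def split_by_median(scores):
    """Label each patient 'high' if the score exceeds the cohort median,
    otherwise 'low' (scores equal to the median fall in the low group)."""
    cutoff = median(scores.values())
    return {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}


# Hypothetical coefficients and expression values, for illustration only.
coefs = {"GENE_A": 0.5, "GENE_B": -0.25}
patients = {
    "P1": {"GENE_A": 2.0, "GENE_B": 4.0},
    "P2": {"GENE_A": 6.0, "GENE_B": 2.0},
    "P3": {"GENE_A": 4.0, "GENE_B": 0.0},
    "P4": {"GENE_A": 1.0, "GENE_B": 2.0},
}

scores = {pid: risk_score(expr, coefs) for pid, expr in patients.items()}
groups = split_by_median(scores)
print(scores)
print(groups)
```

When such a signature is applied to an external cohort, the coefficients are normally kept fixed and only the scores and the cutoff are recomputed from the new expression data.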
ARG signatures showed stable prognostic value in independent cohorts
To verify the accuracy of the two prognostic signatures, we calculated the risk score of each patient in the corresponding validation cohort. The TARGET-OS cohort was used to validate the OS signature, and the survival analysis showed that low-risk patients had favorable OS (Figure 5E). The ROC curves at 3 and 5 years also showed favorable discrimination, with AUC values of 0.674 and 0.656, respectively (Figure 5F). Meanwhile, the GSE30929 cohort was used to validate the DFS signature, and the results confirmed that the ARG-based DFS signature was a stable prognostic prediction tool (Figure 5G and 5H).
Development of nomograms for predicting 3- and 5-year prognosis of sarcoma patients
The Cox analyses of clinical data and ARG signatures are illustrated in Figure 6. Both the OS and DFS signatures were independent prognostic biomarkers for sarcoma (Figure 6A and 6B). Among the clinical data, age, metastatic status, and margin status were confirmed as independent OS-related variables (Figure 6A), whereas tumor site and margin status were independent DFS-related variables (Figure 6B). Two nomograms were developed (Figure 7A and 7C). The C-index values of the OS and DFS nomograms were 0.818 and 0.636, respectively. The calibration plots of both nomograms showed that the nomogram-predicted outcomes were in good agreement with the observed outcomes (Figure 7B and 7D). Overall, these results showed that ARG-clinical nomograms can predict the OS and DFS of sarcoma patients.
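As an aid to interpreting the C-index values, the sketch below implements Harrell's concordance index in its basic form: among all patient pairs whose ordering can be determined, it is the fraction in which the patient with the earlier event was assigned the higher predicted risk. This is an illustrative implementation under that assumption, not necessarily the software or variant used in the study, and the data are invented.

```python
def concordance_index(times, events, risks):
    """Harrell's C-index.

    times  : follow-up time for each patient
    events : 1 if the event was observed, 0 if censored
    risks  : predicted risk (higher means worse expected outcome)
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        for j in range(n):
            # A pair is comparable when patient i had an observed event
            # strictly before patient j's follow-up time.
            if events[i] == 1 and times[i] < times[j]:
                comparable += 1
                if risks[i] > risks[j]:
                    concordant += 1.0
                elif risks[i] == risks[j]:
                    concordant += 0.5  # tied predictions count one half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy data: four patients, two observed events, two censored.
times = [5, 10, 15, 20]
events = [1, 0, 1, 0]
risks = [0.9, 0.4, 0.1, 0.2]

c_index = concordance_index(times, events, risks)
print(c_index)
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ordering, which gives context for the difference between the OS and DFS nomograms.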