Available datasets of sarcomas
Four datasets (GSE55625, GSE31045, GSE17674 and GSE18546) were obtained from GEO. The GSE55625 and GSE31045 datasets contained multiple zero and negative expression values and were therefore excluded. The GSE17674 dataset contains Ewing sarcoma patients and 18 normal muscle tissue samples and is based on the Affymetrix Human Genome U133 Plus 2.0 Array platform, from which we extracted probes for lncRNA annotation. The GSE18546 dataset contains two subtypes of bone sarcoma, Ewing sarcoma and synovial sarcoma, and 5 muscle tissue samples. The 5 Ewing sarcoma samples and the 10 synovial sarcoma samples were assigned to two separate case-control groups.
Identification of DEGs, DELs and DEMs in sarcomas
DEGs and DELs were identified in 32 Ewing sarcoma patients compared with 18 normal muscle tissue samples in the GSE17674 dataset. DEMs were identified in 10 synovial sarcoma samples and 5 Ewing sarcoma samples compared with muscle tissue samples in the GSE18546 dataset. A total of 3415 mRNAs (2554 upregulated and 861 downregulated genes) and 338 lncRNAs (234 upregulated and 104 downregulated lncRNAs) were differentially expressed in sarcoma patients of the GSE17674 dataset (Table S1; Table S2).
Because the GSE18546 dataset contains two sarcoma subtypes, Ewing sarcoma and synovial sarcoma, two sets of differentially expressed miRNAs were identified. Fifty-two miRNAs (39 upregulated and 13 downregulated) were differentially expressed in Ewing sarcoma compared with skeletal muscle, and 145 miRNAs (109 upregulated and 36 downregulated) were differentially expressed in synovial sarcoma compared with skeletal muscle. Intersecting the two DEM sets in a Venn diagram identified 26 upregulated and 10 downregulated miRNAs shared by both subtypes (Figure S1; Table S3). Hierarchical clustering of the identified DEGs, DELs and DEMs is displayed as a heatmap (Figure 2).
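The Venn intersection described above can be sketched as follows. This is a minimal illustration, assuming that a miRNA counts as shared only when it is deregulated in the same direction in both comparisons; the function name and the toy miRNA labels are hypothetical and do not come from the study.

```python
def shared_dems(ewing, synovial):
    """Return miRNAs deregulated in the same direction in both comparisons.

    Each argument maps a miRNA name to "up" or "down" relative to muscle.
    Assumption: direction must agree for a miRNA to count as shared.
    """
    return {
        mirna: direction
        for mirna, direction in ewing.items()
        if synovial.get(mirna) == direction
    }


# Hypothetical toy input, not data from the study.
ewing = {"miR-A": "up", "miR-B": "down", "miR-C": "up"}
synovial = {"miR-A": "up", "miR-B": "up", "miR-D": "down"}

common = shared_dems(ewing, synovial)
n_up = sum(1 for d in common.values() if d == "up")
n_down = sum(1 for d in common.values() if d == "down")
```

In this toy case miR-B is dropped because its direction differs between the two comparisons.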
GO and KEGG enrichment analysis of DEGs
GO enrichment analysis and KEGG pathway analysis of the 3415 DEGs were performed to identify potentially functional genes. The upregulated and downregulated genes were analyzed separately. The top 25 GO terms significantly enriched among the upregulated DEGs are presented in Figure 3A, including transcription, DNA-templated (GO:0006351; P=2.64×10⁻¹⁰⁴), cell division (GO:0051301; P=2.57×10⁻⁸²) and regulation of transcription, DNA-templated (GO:0006355; P=7.99×10⁻⁷⁵). The GO terms most significantly enriched among the downregulated DEGs are presented in Figure 3B, including muscle contraction (GO:0006936; P=1.89×10⁻⁵⁷), muscle filament sliding (GO:0030049; P=5.78×10⁻⁴⁶) and sarcomere organization (GO:0045214; P=8.40×10⁻³⁸). The main pathways for both the upregulated and the downregulated genes are metabolic pathways (Figure 3C, D). The significant GO terms and pathways of the up- and downregulated DEGs are presented in Table S4 and Table S5.
Finally, 1296 intersecting DEGs were extracted from the genes in the significantly enriched GO terms and KEGG pathways, covering both up- and downregulated genes (Table S6).
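For readers unfamiliar with how enrichment P values of this kind arise, the following is a minimal sketch of a one-sided hypergeometric test, a common basis for GO and KEGG over-representation analysis. The section does not state which tool or statistic was used, so this is an assumption for illustration only; the function name and the toy counts are hypothetical.

```python
from math import comb


def hypergeom_enrichment_p(population, annotated, selected, overlap):
    """One-sided hypergeometric P value, P(X >= overlap).

    population: number of background genes
    annotated:  background genes carrying the term
    selected:   genes in the DEG list
    overlap:    DEGs carrying the term
    """
    upper = min(annotated, selected)
    total = comb(population, selected)
    tail = sum(
        comb(annotated, k) * comb(population - annotated, selected - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


# Hypothetical toy counts: 20 background genes, 5 annotated to the term,
# 4 DEGs of which 2 are annotated.
p_value = hypergeom_enrichment_p(20, 5, 4, 2)
```

Real analyses additionally correct such P values for multiple testing across all tested terms.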
Target genes and lncRNAs of DEMs
Having identified 36 sarcoma-associated miRNAs, we examined whether these miRNAs target the 1296 DEGs identified by GO and KEGG pathway analyses and the 338 DELs between sarcoma samples and muscle tissue samples. Based on the predicted targets of the DEMs, 448 miRNA-mRNA interaction pairs (including 34 miRNAs and 269 mRNAs) and 454 lncRNA-miRNA interaction pairs (including 117 lncRNAs and 36 miRNAs) were obtained.
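The filtering step described above can be sketched as a simple set restriction: predicted miRNA-target pairs are kept only when the miRNA is a DEM and the target belongs to the candidate DEG or DEL set. The function name, the pair format and the toy identifiers are hypothetical, and the target prediction databases themselves are outside this sketch.

```python
def filter_pairs(predicted_pairs, dems, candidates):
    """Keep (miRNA, target) pairs whose miRNA is a DEM and whose target
    is in the candidate set; duplicates are removed and output is sorted."""
    return sorted(
        {(m, t) for m, t in predicted_pairs if m in dems and t in candidates}
    )


# Hypothetical toy input, not data from the study.
predicted = [
    ("miR-X", "GENE1"),
    ("miR-X", "GENE9"),  # target is not a candidate DEG
    ("miR-Z", "GENE1"),  # miRNA is not a DEM
    ("miR-X", "GENE1"),  # duplicate prediction
]
pairs = filter_pairs(predicted, dems={"miR-X"}, candidates={"GENE1", "GENE2"})
n_mirnas = len({m for m, _ in pairs})
n_targets = len({t for _, t in pairs})
```

Counting the distinct miRNAs and targets in the retained pairs gives summary numbers of the kind reported in the text.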
ceRNA network
According to the ceRNA theory, we used shared miRNAs as junctions to construct a lncRNA-miRNA-mRNA ceRNA network: within the miRNA-mRNA and miRNA-lncRNA interaction pairs, upregulated miRNAs were paired with downregulated lncRNAs and mRNAs, and downregulated miRNAs were paired with upregulated lncRNAs and mRNAs. The final ceRNA network consisted of 1440 interactions, including 69 lncRNAs, 29 miRNAs and 113 mRNAs (Figure 4; Table S7).
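The pairing rule above can be sketched as a join on the shared miRNA with an opposite-direction check. This is a minimal illustration under stated assumptions: the function name, the dictionary and tuple formats, and all toy identifiers are hypothetical, and the study's actual implementation is not described in this section.

```python
OPPOSITE = {"up": "down", "down": "up"}


def build_cerna_triples(mirna_dir, mrna_dir, lncrna_dir, mirna_mrna, mirna_lncrna):
    """Return sorted (lncRNA, miRNA, mRNA) triples in which the lncRNA and
    the mRNA share a miRNA and both change in the direction opposite to it."""
    triples = set()
    for mirna, lnc in mirna_lncrna:
        direction = mirna_dir.get(mirna)
        if direction is None:
            continue  # miRNA is not differentially expressed
        expected = OPPOSITE[direction]
        if lncrna_dir.get(lnc) != expected:
            continue
        for other_mirna, mrna in mirna_mrna:
            if other_mirna == mirna and mrna_dir.get(mrna) == expected:
                triples.add((lnc, mirna, mrna))
    return sorted(triples)


# Hypothetical toy input, not data from the study.
mirna_dir = {"miR-X": "up", "miR-Y": "down"}
mrna_dir = {"GENE1": "down", "GENE2": "up", "GENE3": "up"}
lncrna_dir = {"LNC1": "down", "LNC2": "up"}
mirna_mrna = [("miR-X", "GENE1"), ("miR-X", "GENE2"), ("miR-Y", "GENE3")]
mirna_lncrna = [("miR-X", "LNC1"), ("miR-Y", "LNC2"), ("miR-Y", "LNC1")]

triples = build_cerna_triples(mirna_dir, mrna_dir, lncrna_dir, mirna_mrna, mirna_lncrna)
```

Here the pair miR-X/GENE2 is rejected because both are upregulated, and miR-Y/LNC1 is rejected because both are downregulated.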
PPI network analysis
To further explore the most significant clusters of DEGs in the ceRNA network, we conducted a PPI network analysis using the STRING database (version 11.0) and visualized the network with Cytoscape. In the PPI network, the most significant upregulated hub proteins were IGF1, PRKCB and GNAI3, and the most significant downregulated hub proteins were AR, CYCS and PPP1CB (Figure 5; Table S8).
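One common way to nominate hub proteins is to rank nodes by degree, that is, by their number of distinct interaction partners. The section does not state which criterion was applied, so the sketch below is only an assumption; the function name and the placeholder protein labels P1 to P4 are hypothetical.

```python
from collections import Counter


def rank_hubs(edges):
    """Rank proteins by number of distinct partners (degree), highest first.

    Self-interactions are ignored and repeated edges are counted once.
    Ties are broken alphabetically so the ranking is deterministic.
    """
    unique_edges = {frozenset(edge) for edge in edges}
    degree = Counter()
    for edge in unique_edges:
        if len(edge) != 2:
            continue  # self-interaction such as (P4, P4)
        for protein in edge:
            degree[protein] += 1
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))


# Hypothetical toy edge list, not interactions from STRING.
edges = [
    ("P1", "P2"), ("P1", "P3"), ("P1", "P4"),
    ("P2", "P3"), ("P3", "P2"),  # same edge listed twice
    ("P4", "P4"),                # self-interaction
]
ranking = rank_hubs(edges)
```

Other centrality measures, such as betweenness, would give a different ranking on real networks.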
Survival analysis of miRNAs, lncRNAs and mRNAs
Furthermore, we performed survival analysis on the miRNAs, mRNAs and lncRNAs involved in the ceRNA network. Four miRNAs, seven mRNAs and one lncRNA were significantly associated with the overall survival of sarcoma patients (P<0.05). Among them, high expression levels of SMARCC1, SRSF10, PRPF38A, JARID2, GNAI3, miR-301a-3p, miR-106b-5p, miR-130b-3p, miR-423-3p and LINC01296 and low expression levels of ARF3 and PRKCB were associated with shorter overall survival in sarcomas (Figure 6).
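Survival curves of this kind are commonly drawn with the Kaplan-Meier estimator, computed separately for the high- and low-expression groups. The section does not name the method used, so the following is a minimal sketch under that assumption; the function name and the toy follow-up data are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate.

    times:  follow-up time for each patient
    events: 1 if death was observed at that time, 0 if censored
    Returns a list of (time, survival probability) at each death time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        leaving = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            leaving += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= leaving
    return curve


# Hypothetical toy follow-up data for one expression group.
curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 1, 0, 1, 0])
```

Comparing the curves of two groups for significance would require an additional test, such as the log-rank test, which is not shown here.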