An abundance of genes are differentially expressed between SS and adjacent normal tissues
To gain insights into the transcriptomic changes of SS patients, we collected the SS and adjacent normal tissues from 10 patients and then deeply sequenced them with total RNA sequencing (including both poly (A+) and ploy (A-) RNAs). The RNA-seq reads of each sample were first mapped to the human reference genome GRCh38 using HISAT2[16], then differential expression calling was conducted by employing DEseq2[18]. A total of 4,286 differentially expressed genes (DEGs) were detected using the threshold of |fold change| >2 and adjusted P-value < 0.01, of which 2,309 (including 432 lncRNA genes) and 1,977 (including 333 lncRNA genes) genes were separately upregulated and downregulated in SS compared to normal tissues (Figure 1A, Supplementary Table S1). Interestingly, we found that 340, 185, 124, and 7 of those DEGs are oncogenes, tumor suppressor genes (TSGs), transcription factors (TFs), and splicing factors, respectively (Figure 1B, Supplementary Figure S1). Specifically, 52 TFs (such as AES and BCL6) were down-regulated and 72 TFs (e.g. ARID3A and BRCA2) were up-regulated, suggesting that these TFs could influence the expression of their downstream target genes including related oncogenes and TSGs. Since oncogenes and TSGs are closely correlated with cancer, they may significantly contribute to the development of SS. Moreover, we further validated the expression profiles of the seven differentially expressed splicing factors (ELAVL2, HNRNPA1, HNRNPH2, MBNL1, PCBP1, QKI, and TIA1) using qPCR. As expected, the experimental results are consistent with that of our RNA-seq data analysis. Splicing factors are critical for the AS process[28], differential expression of splicing factors could result in the AS deregulation of corresponding genes in SS.
Gene ontology (GO) and KEGG pathway enrichment analyses showed that these upregulated and downregulated DEGs were mainly involved in a range of fundamental and tumor-related biological processes and pathways (Figure 1C, D, and E FDR < 0.05). For example, the up-regulated DEGs were primarily enriched in the cell-cycle-related biological processes (e.g. chromosome organization, chromatin organization and DNA conformation change) and pathways of systemic lupus erythematosus, cell cycle, DNA replication and P53 signaling (Figure 1C). Several previous studies also identified the cell-cycle-related genes in sarcoma as a major category of up-regulated genes[29, 30], which further supported our finding. The down-regulated DEGs were mainly involved in the metabolic-related biological processes (such as energy derivation by oxidation of organic compounds, muscle system process, and glucan metabolic process) and the pathways of oxidative phosphorylation, insulin signaling pathway and vascular smooth muscle contraction (Figure 1D). Thus, the result suggests that a multitude of genes prominently changed their expression levels in SS, which could be one of the main factors accounting for tumorigenesis through up-regulating and down-regulating a range of crucial pathways.
Deregulation of alternative splicing could contribute to the tumorigenesis of SS
Considering that the misregulation of AS can lead to the production of aberrant proteins that contribute to tumorigenesis[8], we further compared the AS profile between SS and adjacent normal tissues by employing rMATS[19]. Five classical splicing categories of exon skipping (ES), alternative 5’ donor sites (A5DS), alternative 3’ acceptor sites (A3AS), mutually exclusive exons (ME), and intron retention (IR) were analyzed. Remarkably, we identified 2511 (including 41 lncRNA genes) significantly differential AS genes (DASGs), of which 2018, 223, 242, 486, and 159 belong to the splicing mode changes of ES, A5DS, A3AS, ME and IR, respectively (Figure 2A, FDR < 0.05, Supplementary Table S2). As expected, ES is the most common differential splicing mode (80.37%, 2018 out of 2511 DASGs), whereas IR is the least (6.33%, 159 out of 2511 DASGs). Notably, the majority of those genes that underwent the five classical splicing mode changes are largely different, only a small portion of them simultaneously exhibited three or more distinct splicing types (Figure 2A).
Gene functional enrichment analysis indicated that those 2511 DASGs were mainly involved in the RNA splicing and cancer-related biological processes and KEGG pathways (Figure 2B and C, adjusted P-value <0.05), which is highly correlated with the AS process. For instance, the top enriched biological processes of those DASGs are mRNA processing, microtubule cytoskeleton organization, and RNA splicing, while the enriched pathways are endocytosis, RNA transport, proteoglycans in cancer and spliceosome. Interestingly, we observed that 21 splicing factor genes showed significantly differential AS between SS and adjacent normal tissues, such as HNRNPA1, PTBP2, QKI, RBFOX2, and TRA2A. It is well known that the splicing factors are crucial for AS regulation[28], the deregulation of those splicing factors could drastically disrupt the splicing process of many other genes and contribute to the tumorigenesis of SS[31]. Furthermore, we found that 346, 204, and 122 oncogenes, TSGs, and TFs were also differentially spliced (Figure 2D). The abnormal splicing of these TFs could influence the expression of many downstream target genes, which could be closely associated with the development and progression of SS. Only 368 genes shared between DASGs and DEGs, and, leaving the most of them are distinct (Figure 2E). These common 368 genes were enriched in the biological process of actomyosin structure organization and pathway of regulation of actin cytoskeleton (Figure 2E, adjusted P-value <0.05). Thus, the genes showed differential expression are quite distinct from those exhibited differential splicing, suggesting that expression level is complementary with AS in revealing the transcriptomic changes. Our results indicate that the abnormal AS changes of genes could be another important factor responsible for the tumorigenesis of SS.
Novel gene fusion events are identified in SS
Considering that the chromosomal translocation between X and 18 chromosomes is frequent in SS[32], we further explored the gene fusion events in SS patients. A total of 14 and 11 unique gene fusion pairs were identified in SS and adjacent normal tissues using TopHat-Fusion[20], respectively. No fusion events shared between SS and adjacent normal tissues. The 14 tumor-specific gene fusion pairs were from seven SS patients (Supplementary Table S3), most of which (11 out of 14) were resulted from the rearrangements within the same chromosome, while 3 of them were generated from the breakage and re-joining of two disparate chromosomes (Figure 3A). In total, 27 genes were involved in these tumor-specific gene fusions, only SS18 was fused with SSX1 and SSX2, other genes were only fused with one partner (Figure 3B).
Moreover, all the 14 tumor-specific gene fusion pairs were detected only once and the maximum number of gene fusion pairs detected in individual patients was four (Figure 3C), suggesting that gene fusion events in those SS patients are quite distinct. (Figure 3C). Intriguingly, these tumor-specific gene fusion events contain one TF of SSX2 and seven oncogenes of SS18, SSX1, SSX2, BCOR, CNOT1, HIST2H2AC, and TOP1 (Figure 3D). Oncogene SS18 was fused with the TF and oncogene of SSX2 as well as the oncogene SSX1, which is consistent with the known finding that such fusion events are common for SS[32]. Besides, other oncogenes of BCOR, CNOT1, HIST2H2AC, and TOP1 were formed the fusion events of BCOR-CCNB3, CNOT1-SETD6, HIST2H2AC-HIST2H2AB, and TOP1-PLCG1-AS1, respectively. Previous studies have shown that BCOR-CCNB3 fusion tends to occur in the undifferentiated small round-cell sarcomas like Ewing sarcoma and has the potential to drive sarcoma progression[33-35]. Other gene fusions could be novel for SS, and the involved genes are with critical functions. For example, CNOT1 encodes the CCR4-NOT transcription complex subunit 1, which mainly participates in deadenylating mRNAs[36]. HIST2H2AC and HIST2H2AB can generate the replication-dependent histones that are basic nuclear proteins responsible for the nucleosome structure of the chromosomal fiber. TOP1 encodes the enzyme of DNA topoisomerase for controlling and altering the topologic states of DNA during transcription[37]. Since TF could regulate the expression of many downstream target genes and oncogenes are closely associated with cancer, the fusion events of these TFs and oncogenes could contribute to the tumorigenesis/progression of SS. Interestingly, lncRNA genes of LINC00970, LOC105375787 and PLCG1-AS1 are also involved in the gene fusion events, but their functions are still unknown. Gene functional enrichment analysis showed that those fusion genes were significantly enriched in the KEGG pathway of transcriptional misregulation in cancer (Figure 3E, adjusted P-value < 0.05). Therefore, we not only identified the common gene fusions between chromosome 18 and X, but also detected some novel tumor-specific gene fusions that could contribute to the formation/progression of SS.
Circular RNAs are functionally important in SS
Emerging evidence shows that circRNAs can involve in various aspects of tumor biology[15, 38], thus we further investigated the expression profile of circRNAs in SS and adjacent normal tissues. We detected 49 differentially expressed circular RNAs by employing CIRI[21] with the threshold of |fold change| >2 and adjusted P-value < 0.01 (Supplementary Table S4). As shown in Figure 4A, 21 of them were significantly up-regulated in SS, whereas the other 28 were down-regulated. Furthermore, we found that the great majority (46 out of 49, 93.88%) of those differentially expressed circRNAs were formed by the circulation of exons of their parental genes, only two circRNAs of 10:24380869|24384423 (parental gene: KIAA1217) and 17:35168061|35168685 (parental gene: UNC45B) were produced from the intronic region and another one (5:137757867|137759020) was generated by an intergenic region (Figure 4B).
Intriguingly, 7, 5, and 3 of the parental genes for those differentially expressed circRNAs are oncogenes, TSGs, and TFs (Figure 4C). These circRNAs have the potential to affect the expression of their parental oncogenes, TSGs, and TFs, because previous studies have shown that circRNAs can form posttranscriptional regulators to regulate the expression of their parental genes[39, 40]. Strikingly, gene functional enrichment analysis showed that the parental genes of those differentially expressed circRNAs were mainly enriched in muscle system process (such as MAP2K4, HDAC4, TMEM38A, MYH1, MYH2, CAMK2G, TRDN, and SULF2) and muscle contraction (e.g. HDAC4, TMEM38A, MYH1, MYH2, TRDN and SULF2) (Figure 4D, adjusted P-value < 0.05), which is highly correlated with SS. Therefore, our result indicates that those differentially expressed circRNAs could play key roles in SS.
The genes involved in different types of transcriptomic changes are largely distinct
We further compared the four types of genes with significant changes in terms of expression level and AS, as well as the fusion genes and the parental genes of differentially expressed circRNAs. As shown in Figure 5, the genes in one type are largely distinct from that of other types, and no genes were common among these four categories. Only a fraction of them were involved in two or three types of changes (Figure 5). Intriguingly, 3 DEGs (BCOR, HIST2H2AB, and MEG8) and 2 DASGs (AKR1E2 and DCAF8) were overlapped with the fusion genes, suggesting that the fusion events may influence the expression and/or AS profile of these genes. BCOR is an oncogene, while MEG8 is an imprinted gene. Moreover, 18 DEGs (e.g. DNM3OS, ZNF730, DNAH14, and AFF2) shared with the parental genes of differentially expressed circRNAs, implying that expression changes of these genes could affect the expression of circRNAs as well. In addition, 17 DASGs (such as SUCO, VWA8, MTUS1, and USP53) were common to the parental genes of differentially expressed circRNAs. Since circRNAs are mainly formed by AS of pre-mRNAs through backsplicing[41], the AS changes of these DASGs has the potential to influence the expression of corresponding circRNAs. Accordingly, our result shows that all the four aspects of expression changes, AS, gene fusions, and circRNAs could be closely correlated with the tumorigenesis/progression of SS.
CircRNAs could regulate the expression of a multitude of genes by acting as miRNA sponges
An increasing number of studies suggested that the endogenous circRNAs can act as miRNA sponges to regulate corresponding gene expression[42, 43], we further constructed the interaction network among differentially expressed circRNAs, miRNAs and the miRNA target genes of DEGs, DASGs and fusion genes to elucidate the functional roles of those differentially expressed circRNAs. Based on the known miRNA-circRNA regulations, and the miRNA-targets relationships in starBase database[24] as well as the protein-protein interactions (PPIs) in String database[23], the resulting interaction network involved in 5 circRNAs (hsa_circ_0001699, hsa_circ_0000247, hsa_circ_0000246, hsa_circ_0000095, and hsa_circ_0000118), 44 miRNAs, 293 protein-coding genes, containing 57 miRNA-circRNA interactions, 789 miRNA-mRNA interactions and 350 PPIs (Figure 6). It is well known that circRNAs can regulate gene expression through influencing transcription, mRNA turnover as well as translation by sponging RNA-binding proteins (RBPs) and miRNAs[43]. Our resulting network showed that circRNAs hsa_circ_0001699, hsa_circ_0000247, hsa_circ_0000246, hsa_circ_0000095 and hsa_circ_0000118 could separately act as the sponges of 14, 13, 13, 12 and 5 miRNAs, where these miRNAs have the potential to regulate the expression of 119, 202, and 3 genes of DEGs, DASGs, and/or fusion genes. The expression of these miRNA target genes could be indirectly influenced by corresponding circRNAs through competing the interaction with miRNAs. Consequently, our finding suggests that circRNAs could function as miRNA sponges to regulate the expression of an abundance of genes.