Bioinformatics analysis predicts the association of SNRPD1 with the cell cycle
SNRPD1 expression is higher in malignant or highly proliferative cells than in normal cells in all cancer types except LAML (acute myeloid leukemia) according to TCGA mRNA data (Fig. 2A). Triple negative breast cancers (TNBCs), which are more malignant and grow faster than the other breast cancer subtypes [26], exhibited higher SNRPD1 expression than non-TNBCs in both the TCGA patient transcriptomic data (p = 8.4E-4, Fig. 2B) and the patient proteomic data protein_NC (p = 0.0016, Fig. 2D). Basal breast cancer cells, the counterpart of TNBCs at the cell line level, showed higher SNRPD1 expression than non-basal cells according to the CLM cell line gene expression data (p < 2E-16, Fig. 2C). Both OS and RFS analyses showed that high SNRPD1 expression was significantly associated with unfavourable clinical outcome (HR = 1.49, p = 0.0021 for OS; HR = 1.52, p = 1.6E-13 for RFS; Fig. 2E, 2F). ROC curves showed the performance of SNRPD1 and KI67 in prognosing triple negative breast cancers (AUC = 0.82 for SNRPD1, AUC = 0.8 for KI67, Fig. 2G).
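As a reading aid for the AUC values, the sketch below computes an AUC as the probability that a randomly chosen positive sample scores higher than a randomly chosen negative one (the Mann-Whitney formulation). It assumes that each sample is scored by marker expression and that TNBC status is the positive class; the function name and the expression values are hypothetical and are not taken from the study data.

```python
def auc_from_scores(positive_scores, negative_scores):
    """AUC as the probability that a positive sample outscores a negative one.

    Ties count as half a win (Mann-Whitney formulation of the ROC AUC).
    """
    wins = 0.0
    for p in positive_scores:
        for n in negative_scores:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positive_scores) * len(negative_scores))


# Hypothetical marker expression values, NOT taken from the study data.
tnbc = [7.9, 8.4, 8.1, 9.0]
non_tnbc = [7.2, 8.0, 7.5, 7.7, 8.2]

auc = auc_from_scores(tnbc, non_tnbc)
print(f"AUC = {auc:.2f}")
```

An AUC of 0.5 corresponds to a marker that ranks samples no better than chance, and 1.0 to perfect separation of the two groups.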
We defined genes differentially expressed between TNBC and non-TNBC cells and highly correlated with SNRPD1 expression as ‘spliceosome-related fast-growing cell identifiers’ (SRFGs) and identified 434 SRFGs from the proteomic data (Supplementary Table III).
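The SRFG definition combines two filters, which can be sketched as follows. This is a minimal illustration under stated assumptions: the set of differentially expressed genes is taken as given, correlation is measured with the Pearson coefficient, only positive correlation is counted, and the cut-off of 0.5 is a placeholder, since the thresholds are not stated in this section. The function names, gene labels and values are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def select_srfgs(expression, de_genes, anchor="SNRPD1", min_r=0.5):
    """Keep genes that are differentially expressed AND correlated with the anchor.

    expression: dict mapping gene -> list of expression values across samples.
    de_genes:   set of genes already called differentially expressed.
    min_r:      assumed correlation cut-off (placeholder, not from the paper).
    """
    anchor_profile = expression[anchor]
    return sorted(
        gene
        for gene in de_genes
        if gene != anchor
        and gene in expression
        and pearson(expression[gene], anchor_profile) >= min_r
    )


# Hypothetical toy data: GENE_C is correlated but not differentially expressed,
# GENE_B is differentially expressed but anti-correlated with the anchor.
expression = {
    "SNRPD1": [1.0, 2.0, 3.0, 4.0, 5.0],
    "GENE_A": [2.0, 4.0, 6.0, 8.0, 11.0],
    "GENE_B": [5.0, 4.0, 3.0, 2.0, 1.0],
    "GENE_C": [1.0, 2.1, 2.9, 4.2, 5.1],
}
de_genes = {"GENE_A", "GENE_B"}

srfgs = select_srfgs(expression, de_genes)
print(srfgs)
```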
GO and KEGG pathway enrichment analyses showed that SRFGs were enriched in ‘cell cycle’, ‘DNA replication’ and ‘mitosis’ using both the ‘protein_NC’ proteomic (Fig. 3A, B) and ‘gene_TCGA’ transcriptomic data (Fig. 3C, D). Besides ‘splicing’, the most significantly enriched GO terms were ‘DNA transcription’, ‘DNA repair’ and ‘Cell cycle’. A network depicting the relationship between the enriched GO terms was constructed using SRFGs from the protein_NC data, where each node represents an enriched GO term and nodes with similarities > 0.3 were connected by edges. The nodes in the GO term network were categorized according to their general functionalities, and ‘RNA processing’ and ‘cell cycle’ emerged as the two major clusters. The ‘cell cycle’ cluster primarily comprised five inter-connected sub-clusters (Fig. 3E).
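The edge rule of the GO term network (connect two terms when their similarity exceeds 0.3) can be sketched as below. The similarity measure is not specified in this section, so the Jaccard index over the gene sets annotated to each term is used here purely as an assumption; the gene-to-term assignments are illustrative and do not reproduce the enrichment results.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity of two gene sets (assumed similarity measure)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def build_term_network(term_genes, threshold=0.3):
    """Return edges (term1, term2, similarity) for term pairs above the threshold."""
    edges = []
    for t1, t2 in combinations(sorted(term_genes), 2):
        sim = jaccard(term_genes[t1], term_genes[t2])
        if sim > threshold:
            edges.append((t1, t2, round(sim, 2)))
    return edges


# Illustrative gene-to-term assignments, NOT the study's enrichment output.
term_genes = {
    "cell cycle": {"CDK1", "CCNB1", "CCNA2", "PCNA"},
    "DNA replication": {"PCNA", "MCM2", "POLA1"},
    "mitosis": {"CDK1", "CCNB1", "NDC80"},
}

edges = build_term_network(term_genes)
print(edges)
```

Raising the threshold yields a sparser network with more isolated terms, while lowering it merges clusters, so the choice of 0.3 directly shapes the clusters that are read off the network.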
To confirm the role of the cell cycle in breast cancer development, unsupervised hierarchical clustering of the protein and mRNA data was performed. TNBC patients clustered together when cell cycle genes from the SRFGs were used as the classifier, in both the proteomic data (Fig. 4A) and the transcriptomic data (Fig. 4C), suggesting an important role of the cell cycle in differentiating TNBC from non-TNBC patients. GSEA further confirmed the enrichment of cell cycle related genes in SRFGs using both the protein_NC and gene_TCGA datasets (Fig. 4B, 4D).
Correlation analysis showed that SNRPD1 expression was highly correlated with the cell cycle, with correlation scores of 0.44, 0.25 and 0.35 in the protein_NC, gene_TCGA and gene_CLM datasets, respectively.
Experimental validation confirms the role of SNRPD1 in cell cycle control
Two siRNAs were designed (Fig. 5A) and purchased (supplementary Table 1). The E value was used to assess the significance of the homology between two sequences: two sequences with E < 10E-5 were considered highly homologous, and the homology was considered nearly certain, without the need for further validation, if E < 10E-6. The siRNA-1 could target the NM_006938.4 transcript, and the siRNA-2 could target both the NM_006938.4 and the NM_001291916.2 transcripts. Neither of the two siRNAs could target the cell cycle related genes assessed in this study with a statistically significant E value (Table 2).
Table 2
Sequence alignment of the two designed SNRPD1 siRNAs against SNRPD1 and the cell cycle related genes experimentally assessed in this study.
Type | Gene | Accession | E value | Significance |
siRNA-1 | PCNA | NM_182649.2 | 0.18 | |
| CCND1 | NM_053056.3 | 2.4 | |
| CCNB1 | NM_031966.4 | 0.29 | |
| CDK1 | NM_001786.5 | 0.068 | |
| CDCA5 | NM_080668.4 | 1.4 | |
| NDC80 | NM_006101.3 | 0.077 | |
| CCNA2 | NM_001237.5 | 0.39 | |
| SNRPD1 | NM_006938.4 | 0.0000002 | * |
| SNRPD1 | NM_001291916.2 | 0.16 | |
siRNA-2 | PCNA | NM_182649.2 | 0.012 | |
| CCND1 | NM_053056.3 | 0.15 | |
| CCNB1 | NM_031966.4 | 0.29 | |
| CDK1 | NM_001786.5 | / | |
| CDCA5 | NM_080668.4 | 1.4 | |
| NDC80 | NM_006101.3 | 0.3 | |
| CCNA2 | NM_001237.5 | 0.39 | |
| SNRPD1 | NM_006938.4 | 0.0000002 | * |
| SNRPD1 | NM_001291916.2 | 0.0000002 | * |
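The specificity check summarized in Table 2 amounts to comparing each E value against the stated cut-off. The sketch below applies that rule to the values listed in the table. The dictionary layout and function name are illustrative; the threshold is entered exactly as written in the text (10E-5), and the ‘/’ entry is stored as None on the assumption that it denotes no reported alignment.

```python
# E values copied from Table 2; "/" is stored as None (assumed: no alignment reported).
E_VALUES = {
    ("siRNA-1", "PCNA", "NM_182649.2"): 0.18,
    ("siRNA-1", "CCND1", "NM_053056.3"): 2.4,
    ("siRNA-1", "CCNB1", "NM_031966.4"): 0.29,
    ("siRNA-1", "CDK1", "NM_001786.5"): 0.068,
    ("siRNA-1", "CDCA5", "NM_080668.4"): 1.4,
    ("siRNA-1", "NDC80", "NM_006101.3"): 0.077,
    ("siRNA-1", "CCNA2", "NM_001237.5"): 0.39,
    ("siRNA-1", "SNRPD1", "NM_006938.4"): 0.0000002,
    ("siRNA-1", "SNRPD1", "NM_001291916.2"): 0.16,
    ("siRNA-2", "PCNA", "NM_182649.2"): 0.012,
    ("siRNA-2", "CCND1", "NM_053056.3"): 0.15,
    ("siRNA-2", "CCNB1", "NM_031966.4"): 0.29,
    ("siRNA-2", "CDK1", "NM_001786.5"): None,
    ("siRNA-2", "CDCA5", "NM_080668.4"): 1.4,
    ("siRNA-2", "NDC80", "NM_006101.3"): 0.3,
    ("siRNA-2", "CCNA2", "NM_001237.5"): 0.39,
    ("siRNA-2", "SNRPD1", "NM_006938.4"): 0.0000002,
    ("siRNA-2", "SNRPD1", "NM_001291916.2"): 0.0000002,
}

# Cut-off as written in the text ("E < 10E-5"); as a Python float this is 1e-4.
THRESHOLD = 10e-5


def significant_targets(e_values, threshold=THRESHOLD):
    """Return (siRNA, gene, accession) entries whose E value is below the cut-off."""
    return sorted(
        key for key, e in e_values.items() if e is not None and e < threshold
    )


hits = significant_targets(E_VALUES)
for sirna, gene, accession in hits:
    print(sirna, gene, accession)
```

Only the three SNRPD1 entries marked with an asterisk in Table 2 pass the cut-off, and they would also pass the stricter 10E-6 threshold.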
Both siRNAs significantly silenced SNRPD1 (p = 0.002 for siRNA-1 and p = 8.6E-4 for siRNA-2 in MCF7; p = 0.0091 for siRNA-1 and p = 0.0093 for siRNA-2 in MDAMB231), and pooling the two siRNAs considerably improved the inhibitory effect on SNRPD1 expression (p = 1.22E-5 in MCF7, p = 1.64E-5 in MDAMB231, Fig. 5B). Similarly, SNRPD1 was effectively knocked down in MCF7 and MDAMB231 cells at the protein expression level (Fig. 5C). We therefore used pooled siRNAs in the following assays. In particular, upon pooled siRNA transfection, SNRPD1 expression was reduced to less than 10–25% of that of the control cells at the gene expression level (Fig. 5B) and to about 5%-11% of that of the control at the protein expression level (Fig. 5C).
Both MCF7 and MDAMB231 cells showed reduced cell viability on SNRPD1 knockdown (p = 7.56E-7 for MCF7, p = 1.1E-03 for MDAMB231, Fig. 5D). Significant differences in the G1/G2 proportion were observed between the control and si-SNRPD1 cells, suggesting an important role of SNRPD1 in the cell cycle. Cells were arrested at the G1 phase, resulting in a 35% increase of G1 phase cells and a 55.7% decrease of S phase cells, and a slight increase of G2 phase cells was observed in si-SNRPD1 cells (Fig. 6A, 6B). Similar results were observed in the MDAMB361 and HCC1937 cell lines (Fig. 6C, 6D).
Doxorubicin is an anthracycline drug that confers cytotoxicity through its antimitotic activity and is thus effective in killing cells with accelerated cell cycle progression, including malignant cells. On applying doxorubicin to SNRPD1-silenced cells, we observed a significant rightward shift of the IC50 in the triple negative breast cancer cells MDAMB231 and HCC1937 (Fig. 7B, 7D) but not in the luminal cells MCF7 and MDAMB361 (Fig. 7A, 7C).
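A rightward IC50 shift means that a higher drug concentration is needed to reduce viability to 50% of the control. The sketch below illustrates this with a simple log-linear interpolation between the two doses that bracket 50% viability. This is a simplified stand-in and not necessarily how the IC50 values in Fig. 7 were estimated (a fitted dose-response curve is the more common choice); the function name and the dose and viability values are hypothetical.

```python
from math import log10


def ic50_by_interpolation(doses, viability):
    """Estimate IC50 by linear interpolation on a log10 dose axis.

    doses:     increasing drug concentrations (arbitrary units).
    viability: fraction of untreated control (0 to 1) at each dose.
    Returns None if viability never crosses 50%.
    """
    points = list(zip(doses, viability))
    for (d_lo, v_lo), (d_hi, v_hi) in zip(points, points[1:]):
        if v_lo >= 0.5 > v_hi:
            frac = (v_lo - 0.5) / (v_lo - v_hi)
            log_ic50 = log10(d_lo) + frac * (log10(d_hi) - log10(d_lo))
            return 10 ** log_ic50
    return None


# Hypothetical dose-response data, NOT taken from Fig. 7.
doses = [0.01, 0.1, 1.0, 10.0]
control_viability = [0.95, 0.75, 0.25, 0.05]
knockdown_viability = [0.97, 0.85, 0.60, 0.10]

ic50_control = ic50_by_interpolation(doses, control_viability)
ic50_knockdown = ic50_by_interpolation(doses, knockdown_viability)
print(f"fold shift = {ic50_knockdown / ic50_control:.1f}")
```

A fold shift above 1 corresponds to the rightward shift described for the triple negative lines, and a value near 1 to the absence of a shift in the luminal lines.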
There were 92 SRFGs enriched in the cell cycle pathway (HSA-1640170, Supplementary Table IV) as predicted using STRING [27]. We classified these genes into four categories, i.e., ‘M’ (genes specific to M phase regulation), ‘M checkpoint’ (genes specific to M phase checkpoint regulation), ‘S’ (genes specific to S phase regulation) and ‘S checkpoint’ (genes specific to S phase checkpoint regulation), and identified CDCA5 as the sole gene specific to both M and S phase regulation, and 30 and 12 genes specific to M and S phase checkpoint regulation, respectively (Table 3). We chose one gene from each of the four categories, i.e., CDCA5 (representing both ‘M’ and ‘S’), NDC80 and CCNA2, together with three genes from the G1/S transition (CCNB1, CDK1 and PCNA, Table 4), to examine whether these cell cycle related genes could be significantly modulated by SNRPD1 silencing in vitro. All tested genes were significantly altered on SNRPD1 silencing in both MCF7 (p = 8.2E-4 for CDCA5, p = 0.027 for NDC80, p = 6.2E-4 for CCNA2, p = 0.013 for CCNB1, p = 5.1E-4 for CDK1, p = 0.011 for PCNA, Fig. 8A) and MDAMB231 cells (p = 3.56E-4 for CDCA5, p = 0.0062 for NDC80, p = 2.96E-4 for CCNA2, p = 0.009 for CCNB1, p = 0.001 for CDK1, p = 0.005 for PCNA, Fig. 8A).
Table 3
Classification of genes enriched in the cell cycle from Reactome pathways among the 434 SRFGs. STRING version 11.0 was used to conduct the enrichment analysis. Experimentally tested genes are highlighted in bold face. ‘M’ and ‘S’ represent genes specific to the M and S phase, respectively. ‘M checkpoint’ and ‘S checkpoint’ represent genes specific to M and S phase checkpoint regulation, respectively, and were obtained by taking the intersection of the ‘M phase’ or ‘S phase’ pathway with the ‘cell cycle checkpoint’ pathway. ‘Common’ and ‘Rest’ represent genes present in and absent from all of the ‘M’, ‘M checkpoint’, ‘S’ and ‘S checkpoint’ categories, respectively.
M | M (continued) | M checkpoint | M checkpoint (continued) | S | S checkpoint | Rest | Common |
AAAS | NCAPG2 | BUB1B | NDC80 | CDCA5 | CCNA2 | BLM | PSMD3 |
CDCA5 | NCAPH | CCNB1 | NUF2 | CUL1 | CDC45 | CDC7 | PSMD14 |
CEP152 | NCAPH2 | CDC20 | NUP107 | FEN1 | MCM2 | CHEK1 | RPS27A |
HAUS1 | NDC1 | CDCA8 | NUP133 | PCNA | MCM3 | GMNN | |
HAUS2 | NUP153 | CDK1 | NUP160 | POLA1 | MCM4 | MCM10 | |
HAUS3 | NUP155 | CENPF | NUP85 | POLA2 | MCM5 | MDC1 | |
HAUS4 | NUP205 | CENPH | PLK1 | POLE | MCM6 | MND1 | |
HAUS5 | NUP210 | CENPI | RCC2 | PRIM1 | MCM7 | TOP3A | |
HAUS6 | NUP50 | CENPO | SKA1 | PRIM2 | ORC6 | TOPBP1 | |
HAUS8 | NUP93 | CENPQ | SPC24 | | RFC2 | TPX2 | |
KIF20A | SMC2 | CENPU | SPC25 | | RFC4 | WHSC1 | |
MASTL | SMC4 | ERCC6L | XPO1 | | RFC5 | | |
NCAPD2 | VRK1 | INCENP | ZW10 | | | | |
NCAPD3 | | KIF2C | ZWILCH | | | | |
NCAPG | | KNTC1 | ZWINT | | | | |
Table 4
Classification of genes enriched in cell cycle transitions from Reactome pathways among the 434 SRFGs. STRING was used to conduct the enrichment analysis. Experimentally tested genes involved in the G1/S transition are highlighted in bold face.
G1/S Transition | G1/S Transition (continued) | G2/M Transition | G2/M Transition (continued) |
CCNA2 | MCM6 | CCNA2 | HAUS8 |
CCNB1 | MCM7 | CCNB1 | PLK1 |
CDC45 | ORC6 | CDK1 | PSMD14 |
CDC7 | PCNA | CENPF | PSMD3 |
CDK1 | POLA1 | CEP152 | RPS27A |
CUL1 | POLA2 | CUL1 | TPX2 |
GMNN | POLE | HAUS1 | XPO1 |
MCM10 | PRIM1 | HAUS2 | |
MCM2 | PRIM2 | HAUS3 | |
MCM3 | PSMD14 | HAUS4 | |
MCM4 | PSMD3 | HAUS5 | |
MCM5 | RPS27A | HAUS6 | |
In addition, we tested the expression of CCND1, whose down-regulation is associated with G0/G1 arrest [28] but which was absent from the dataset we used for SRFG identification. CCND1 was down-regulated to approximately 20% and 60% of the control in SNRPD1-silenced MCF7 and MDAMB231 cells, respectively, at both the gene and protein expression levels (p = 0.022 at the gene expression level and p = 1.53E-4 at the protein expression level in MCF7; p = 0.0023 at the gene expression level and p = 1.7E-166 at the protein expression level in MDAMB231; Fig. 8B), suggestive of a G0/G1 cell cycle arrest.
We next explored whether SNRPD1 directly interacts with cell cycle related genes. By constructing a protein-protein interaction network of SNRPD1 and the analyzed cell cycle proteins using STRING version 11.0 (https://string-db.org), we found that SNRPD1 is co-expressed with PCNA, with a potential direct interaction (Fig. 8C). We thus conducted immunoprecipitation to assess the interaction of SNRPD1 with PCNA in MCF7 and MDAMB231 cells, and the results showed that SNRPD1 physically interacts with PCNA in both cell lines (Fig. 8D).