Generation of mdig knockout cells by CRISPR-Cas9
To create mdig knockout (KO) cells, the human triple-negative breast cancer (TNBC) cell line MDA-MB-231 was transfected with a pSpCas9-2A-Blast vector containing an sgRNA that targets the third exon of the mdig gene. Blasticidin selection was then performed for two consecutive weeks, and the resulting colonies were screened for mdig expression by western blot (Fig. 1A). Altogether we obtained 5 wild-type (WT) and 12 KO clones, which were prepared for proteomic analysis after their mdig expression had been screened at the protein level. Each WT and KO clone was cultured and analyzed in duplicate. Two of the 34 samples were removed from further analysis for quality control reasons. In total, 5739 proteins were detected, 5711 were quantified in at least one sample, and 3569 were quantified in all samples.
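The two quantification counts reported above (quantified in at least one sample versus in all samples) correspond to two completeness filters on the protein-by-sample table. A minimal sketch follows, assuming missing values are stored as None; the protein names and intensities are hypothetical and are not taken from the data set.

```python
def completeness(table):
    """Count proteins quantified in at least one sample and in every sample.

    `table` maps a protein ID to its per-sample intensities; None marks a
    sample in which the protein was not quantified (an assumed encoding).
    """
    in_any = sum(1 for values in table.values()
                 if any(v is not None for v in values))
    in_all = sum(1 for values in table.values()
                 if all(v is not None for v in values))
    return in_any, in_all


# Hypothetical three-sample table: protA is complete, protB is partial,
# protC was detected but never quantified.
table = {
    "protA": [1.0, 2.0, 3.0],
    "protB": [None, 2.5, None],
    "protC": [None, None, None],
}
in_any, in_all = completeness(table)
```

Only the proteins counted in the second number can be compared across all samples without imputing missing values.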
Principal component analysis (Fig. 1B) and cluster analysis (not shown) indicated some within-group heterogeneity. One KO clone in particular, KO#3, appeared to be more similar to the WT samples than to the other KO clones. The gene knockout in that clone was confirmed by western blot and by the mass spectrometry data. Clones KO#10, KO#3 and WT#5 were removed from the dataset and not used in any further analysis. The protein-level quantitative data (MaxQuant) and the supporting peptide-level data (MaxQuant) are shown in Supplementary Table S1 and Supplementary Table S2, respectively.
Classification of the differentially expressed proteins by protein class and gene ontology annotation
LC-MS/MS data were analyzed to determine the fold change (FC), expressed as a normalized ratio of KO to WT control cells. This first screening of the raw data identified a set of proteins whose abundance increased or decreased in the MDA-MB-231 mdig KO cells. The analysis output consisted of the unique protein IDs with their fold change, p value and t statistic for KO relative to WT. The differentially expressed proteins were then classified by gene ontology designation (molecular function, cellular component and biological process) using the PANTHER classification system (Fig. 2). A total of 26 protein classes were identified at the p < 0.05 level. Those categories are: calcium binding, cell adhesion molecules, cell junction proteins, chaperones, cytoskeleton, immunity, enzyme modulator, hydrolase, isomerase, ligase, lyase, membrane traffic proteins, nucleic acid binding, oxidoreductase, receptor, signaling molecule, storage proteins, structural proteins, surfactants, transcription factor, carrier proteins, transferase, transmembrane receptor regulatory, transporter and viral proteins. Among them, the nucleic acid binding class (PC00171) was the most prevalent, with 340 genes assigned to it. By biological process, most of the proteins belonged to the subcategories of biological adhesion, biological regulation, cell proliferation, biogenesis, cellular process, developmental process, immune system process, localization, metabolic process, multicellular organismal process, reproduction and response to stimulus. Among these, cellular process and metabolic process had more genes assigned to them than the other subcategories. By molecular function, the majority of the proteins were assigned to binding, catalytic activity, molecular function regulator, molecular transducer activity, structural molecule activity and transporter activity, with binding and catalytic activity the most populated. Finally, by cellular component, most of the proteins were localized to the cell junction, cell, extracellular region, membrane, organelle and protein-containing complex. Among them, the most populated were the cellular compartment, organelle and protein-containing complex categories (Supplementary Fig. 1). These patterns of protein distribution suggest that mdig significantly affects families of proteins that are essential for important biological and molecular processes such as binding, metabolism, immunity and catalytic activity, which are implicated in triple-negative breast cancer. They also warrant a more detailed investigation of the individual genes and proteins related to these biological functions in breast cancer.
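The per-protein KO/WT comparison described above (fold change and t statistic) can be illustrated with a short sketch. It assumes log2-transformed intensities and uses an ordinary Welch t statistic; this is a stand-in for illustration and not the moderated t-test used for the peptide-level analysis reported later. The function names and intensity values are hypothetical.

```python
import math
from statistics import mean, variance


def log2_fold_change(ko, wt):
    """Mean log2 intensity in KO minus mean log2 intensity in WT."""
    return mean(math.log2(x) for x in ko) - mean(math.log2(x) for x in wt)


def welch_t(ko, wt):
    """Welch's t statistic on log2 intensities (positive = higher in KO)."""
    a = [math.log2(x) for x in ko]
    b = [math.log2(x) for x in wt]
    standard_error = math.sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


# Hypothetical intensities for a single protein (not from the data set).
ko = [1024.0, 2048.0, 4096.0]
wt = [256.0, 512.0, 1024.0]
lfc = log2_fold_change(ko, wt)   # difference of mean log2 intensities
t_stat = welch_t(ko, wt)
```

A moderated test additionally shrinks each protein's variance toward a value estimated from all proteins, which stabilizes the statistic when the number of replicates is small.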
Canonical pathway analysis reveals key signaling cascades affected by mdig
We identified the ten proteins with the greatest magnitude of change in abundance in KO over WT MDA-MB-231 cells (Table 1). With the differentially expressed proteins identified, the next step was to examine the role of those proteins in the pathogenesis of breast cancer.
Table 1
Top 10 proteins with the highest magnitude of change in KO over WT MDA-MB-231 cells (p < 0.005), as revealed by the proteomics data set obtained through mass spectrometry
Symbol | UniProt/Swiss-Prot accession | Description | Fold change
CTSD | P07339 | Cathepsin D | 11.063
MAGED2 | Q9UNF1 | Melanoma-associated antigen D2 | 10.443
FLNA | P21333 | Filamin-A | 9.662
ABHD16A | O95870 | Abhydrolase domain-containing protein 16A | 9.595
STMN1 | P16949 | Stathmin | 9.118
NPC2 | P61916 | Epididymal secretory protein E1 | 8.845
RACK1 | P63244 | Receptor of activated protein C kinase 1 | 8.820
HIST1H2BA | Q96A08 | Histone H2B type 1-A | 8.673
IQGAP1 | P46940 | Ras GTPase-activating-like protein IQGAP1 | 8.603
HUWE1 | Q7Z6Z7 | E3 ubiquitin-protein ligase HUWE1 | 8.473
RIOX2 | Q8IUF8 | Bifunctional lysine-specific demethylase and histidyl-hydroxylase MINA | -8.521
KRI1 | Q8N9T8 | Protein KRI1 homolog | -8.272
CCDC51 | Q96ER9 | Coiled-coil domain-containing protein 51 | -8.241
PLAUR | Q03405 | Urokinase plasminogen activator surface receptor | -8.203
HYOU1 | Q9Y4L1 | Hypoxia up-regulated protein 1 | -7.368
SOD2 | P04179 | Superoxide dismutase [Mn], mitochondrial | -7.173
RIN1 | Q13671 | Ras and Rab interactor 1 | -7.096
NOP58 | Q9Y2X3 | Nucleolar protein 58 | 7.083
NAMPT | P43490 | Nicotinamide phosphoribosyltransferase | -7.006
MCCC2 | Q9HCC0 | Methylcrotonoyl-CoA carboxylase beta chain, mitochondrial | -6.863
Characteristic alterations in signaling pathways and regulatory networks are expected between diseased and healthy cells. We used Ingenuity Pathway Analysis software (IPA; Ingenuity® Systems, Qiagen) to identify the major biological pathways perturbed in mdig KO cells. IPA was used to interpret the differentially expressed proteins in terms of predominant canonical pathways and to derive mechanistic networks. Canonical pathways are well-defined biochemical cascades that result in a unique functional biological consequence. Canonical pathway analysis of our dataset in IPA revealed 501 canonical pathways. The top 5 canonical pathways (p < 0.05), with the number of identified proteins in parentheses, were EIF2 Signaling (54), Isoleucine Degradation I (8), Unfolded Protein Response (12), Regulation of eIF4 and p70S6K Signaling (31) and Caveolar-mediated Endocytosis Signaling (14) (Fig. 3). The Regulation of eIF4 and p70S6K Signaling and Unfolded Protein Response pathways are detailed in Fig. 4, which shows the upregulated and downregulated proteins and their cellular localization.
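Pathway over-representation of the kind described above is commonly scored with a right-tailed Fisher's exact test, which is equivalent to the upper tail of a hypergeometric distribution. The sketch below assumes that this is the scoring applied here; the reference set size and the counts are hypothetical, because the values used internally by the software are not given in the text.

```python
from math import comb


def enrichment_p(total, in_pathway, selected, overlap):
    """Right-tailed hypergeometric p-value, P(X >= overlap).

    total      : proteins in the reference set
    in_pathway : reference proteins annotated to the pathway
    selected   : differentially expressed proteins
    overlap    : differentially expressed proteins in the pathway
    """
    upper = min(in_pathway, selected)
    favourable = sum(
        comb(in_pathway, k) * comb(total - in_pathway, selected - k)
        for k in range(overlap, upper + 1)
    )
    return favourable / comb(total, selected)


# Hypothetical toy numbers: 20 reference proteins, 5 in the pathway,
# 5 differentially expressed, all 5 of them in the pathway.
p_toy = enrichment_p(total=20, in_pathway=5, selected=5, overlap=5)
```

A small p-value indicates that the pathway contains more differentially expressed proteins than expected from random sampling of the reference set; it says nothing about the direction of change.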
Among the pathways that were overrepresented in mdig KO cells, EIF2 Signaling was the topmost canonical pathway in our analysis. Interestingly, PI3K and AKT were upregulated upon mdig knockout, while RAS and eIF4A were downregulated. Previous reports identified the PI3K-Akt pathway as an enhancer of the expression of transcription factors associated with the epithelial-mesenchymal transition (EMT), such as Snail, Slug, ZEB1 and ZEB2, which promoted EMT and increased cancer cell motility (27, 28). This suggests an increased motility potential of breast cancer cells upon loss of the mdig protein. Among the Unfolded Protein Response family of proteins, several heat shock proteins such as Hsp70 and Hsp40 were upregulated, while TNF receptor associated factor 2 was downregulated in mdig KO cells. Analysis of the protein profiles belonging to the canonical pathway Regulation of eIF4 and p70S6K Signaling revealed a plethora of ribosomal proteins that were upregulated in the KO cells, including ribosomal proteins S16, S8, S9, S26, S2, S15a, S3, S6, S7, S21, S24, S3a, S27a, S10, S17, S20, S23 and S4, X-linked. Because ribosome biogenesis is important in cancer, the upregulation of ribosomal proteins in response to mdig deletion is a striking observation that needs further investigation. The filamin family proteins filamin A, filamin B and filamin C were also upregulated in the KO cells. Notably, another protein of interest, flotillin 1, was found to be upregulated (Supplementary Fig. 2). Filamin proteins have been implicated in cancer progression, while increased levels of flotillin 1 promoted cell proliferation, migration, tumorigenicity and lymph metastasis in breast cancer studies (29, 30).
These data identify the signaling pathways implicated in breast cancer that are most affected by mdig knockout. The individual differentially regulated proteins in the top five canonical pathways are attractive targets for further investigation into how mdig is involved in ribosome biogenesis and in the metastasis of triple-negative breast cancers.
Cellular and molecular functions give insight into the differential biology of breast cancer cells affected by mdig
IPA-based protein network analysis was performed using all proteins identified upon mdig knockout in TNBC cells. We identified 500 molecular and cellular functions associated with mdig deletion. The top five scoring function categories were evaluated for the predicted effect of mdig deletion on their activation status. These included processes that are integral to cell growth and tumorigenesis: Protein Synthesis (177 associated proteins), RNA Damage and Repair (45 associated proteins), RNA Post-Transcriptional Modification (104 associated proteins), Cell Death and Survival (362 associated proteins) and Nucleic Acid Metabolism (74 associated proteins). These processes orchestrate vital molecular functions, namely protein expression, decay of mRNA, processing of rRNA, necrosis and metabolism of nucleic acid components or derivatives, respectively (Fig. 5A). Among them, an overall increase in protein synthesis and an overall decrease in RNA post-transcriptional modification and in cell death and survival were found in the KO cells (Fig. 5B), which also depicts the individual proteins belonging to these molecular and cellular functions together with their upregulation or downregulation status. Additionally, we found an overall decrease in inflammation with mdig loss (Supplementary Fig. 3A). This is of interest because our in vivo studies on mdig knockout mice suggested a decreased inflammatory status of the mice upon silica exposure (17), which corroborates the current results.
The disease and disorder categories with the most associated proteins were Cancer, Organismal Injury and Abnormalities, Tumor Morphology, Cardiovascular Disease and Developmental Disorder. The top network identified was associated with cancer and comprises 458 proteins in our proteomic data set (Fig. 6A). These results suggest the involvement of mdig in regulating the process of transformation in breast cancer. IPA also predicted the upstream regulatory molecules that are either activated or inhibited on the basis of the observed changes in protein expression, allowing us to understand the underlying causal network. Among the top 5 upstream regulators in our analysis were MYCN, NFE2L2, MYC and TCR. Moreover, MYCN was activated upon mdig knockout (Fig. 6B). Myc is also known as a classical upstream regulator of mdig (31).
Post-translational modification and disease-based protein network analysis reveal the catalytic activity of mdig in the oxidation and demethylation process
Post-translational modifications (PTMs) can change the dynamics and affinity of protein-protein interactions and often serve as the basis for modulation of signaling pathways implicated in breast cancer. Epigenetically relevant PTMs such as acetylation and methylation contribute to transcriptional regulation and have well-established roles in cancer.
Mdig catalyzes both histidine oxidation (32) and tri-methyl lysine demethylation (33). Therefore, spectra were searched for histidine oxidation and lysine acylation to quantify their changes in response to mdig knockout (Table 2). Changes in PTM abundance were assessed using the number of peptides that changed significantly (q < 0.1) and whether the mean t-statistic was different from 0. Mdig catalyzes histidine oxidation at His39 of the 60S ribosomal protein L27a (UniProt accession: P46776) (32). The tryptic peptide containing His39 from the 60S ribosomal protein L27a was detected in both the oxidized and native forms (Fig. 7A, Supplementary Fig. 3B). The peptide sequence (GNAGGLHHHR) has no residues that can be non-enzymatically oxidized, so the oxidized form must be the product of an enzymatic reaction. The abundance of the oxidized form was decreased in mdig KO samples (q = 0.00030, moderated t-test, n = 8). The native, non-oxidized form was detected only in mdig KO samples, demonstrating that the knockout removed a specific enzymatic activity from the cells.
Table 2
Evaluation of changes in PTM abundance between MDA-MB-231 KO and WT samples. * p-value of a permutation test for the mean t-statistic being different from 0
PTM | peptides quantified | peptides increased (q < 0.1) | peptides decreased (q < 0.1) | modification mean t-statistic | modification p-value*
oxidized histidine | 98 | 7 | 4 | 0.01 | 0.571
di-methyl lysine | 163 | 28 | 14 | 0.37 | 0.021
tri-methyl lysine | 84 | 11 | 9 | 0.33 | 0.104
acetyl lysine | 104 | 10 | 9 | 0.35 | 0.067
All peptides | 65281 | 6725 | 8513 | 0.07 | n/a
In total, 98 peptides with candidate histidine oxidation sites were quantified and 11 were found to be significantly different between KO and WT samples (q < 0.1, moderated t-test, n = 8). The mean t-statistic for histidine-oxidized peptides was near 0 (Table 2), indicating that they were not changed in a uniform direction by mdig knockout. To ensure that methionine oxidation did not interfere with our analysis, we limited the set of histidine-oxidized peptides to those that had no methionine residues or that had confident localization of the oxidation site to histidine by ptmRS (23). The mean t-statistic for that selected group was still approximately 0 (not shown). These data confirm the activity of mdig in catalyzing the oxidation of His39. However, they do not provide evidence that mdig oxidizes His residues other than His39 of the 60S ribosomal protein L27a.
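The q < 0.1 threshold applied to the peptides above refers to p-values adjusted for multiple testing. The text does not state which adjustment was used; the sketch below assumes the Benjamini-Hochberg step-up procedure, and the p-values are hypothetical.

```python
def bh_qvalues(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    qvalues = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum of
    # p * m / rank over all ranks at or above the current one.
    for rank in range(m, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * m / rank)
        qvalues[index] = running_min
    return qvalues


# Hypothetical per-peptide p-values (not from the data set).
pvals = [0.001, 0.02, 0.03, 0.5]
qvals = bh_qvalues(pvals)
n_significant = sum(q < 0.1 for q in qvals)
```

Counting the significant peptides separately by the sign of their t-statistic would give the increased and decreased columns of a table such as Table 2.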
In addition to catalyzing His oxidation, mdig catalyzes the demethylation of tri-methylated lysine 9 of histone H3 (33). Hence, lysine di-methylated, tri-methylated and acetylated peptides were evaluated for changes in abundance in response to mdig deletion. Di-methylated lysine-containing peptides showed an overall increase in abundance in mdig KO samples relative to WT (Table 2). This is supported by the number of di-methyl lysine peptides that were increased in abundance (28 increased vs 14 decreased; q < 0.1, moderated t-test, n = 8) and by the positive mean t-statistic for lysine di-methylated peptides (0.37, p = 0.021, permutation test for difference from 0). The change in abundance was confirmed in a smaller set of 63 very high-confidence peptides (Percolator posterior error probability (PEP) < 0.001). Those 63 very high-confidence lysine di-methylated peptides had a greater increase in abundance than the larger set (mean t-statistic of 0.83, p = 0.0011), demonstrating that the increase in abundance was not limited to low-quality peptide identifications. Tri-methyl lysine and acetylated lysine peptides also had positive mean t-statistics but did not meet our statistical threshold. These results suggest an important regulatory role of mdig on the 60S ribosomal protein L27a and on the methylation of lysine residues on histone proteins, which is likely to affect the transcription of critical genes implicated in TNBC.
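The permutation test for a mean t-statistic different from 0 can be sketched as follows. The permutation scheme is not described in the text; this example assumes random sign-flipping of the per-peptide t-statistics, whereas the original analysis may instead have permuted sample labels. The function name and the inputs are hypothetical.

```python
import random


def sign_flip_p(t_stats, n_perm=10000, seed=0):
    """Two-sided permutation p-value for the mean of t_stats being 0.

    Each permutation flips the sign of every t-statistic with probability
    0.5, which is valid if the statistics are symmetric about 0 under the
    null hypothesis (an assumption of this sketch).
    """
    rng = random.Random(seed)
    n = len(t_stats)
    observed = abs(sum(t_stats) / n)
    hits = 0
    for _ in range(n_perm):
        flipped = sum(t if rng.random() < 0.5 else -t for t in t_stats)
        if abs(flipped / n) >= observed:
            hits += 1
    # Add-one correction keeps the estimate strictly above zero.
    return (hits + 1) / (n_perm + 1)


# Hypothetical inputs: a set shifted in one direction and a balanced set.
p_shifted = sign_flip_p([1.0] * 12)
p_balanced = sign_flip_p([1.0, -1.0] * 5)
```

With this construction, a set of peptides that moves consistently in one direction yields a small p-value even when few individual peptides pass the q-value threshold.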
PTM peptides that were differentially abundant between KO and WT are shown in Supplementary Table S3. Peptide data for the histidine oxidation analysis (Proteome Discoverer) are shown in Supplementary Table S4, and peptide data for the lysine acylation analysis (Proteome Discoverer) are shown in Supplementary Table S5.
Individual peptides with their candidate PTM sites are also shown for Acetylated Peptides (Supplementary Table S6), Dimethylated Peptides (Supplementary Table S7), HisHydroxylated Peptides (Supplementary Table S8) and Trimethylated Peptides (Supplementary Table S9).
The next level of regulation is the interaction of signaling networks and regulatory pathways. IPA identified 25 interaction networks built with 35 focus molecules that were affected by mdig knockout. The five most affected gene networks as determined by IPA, and the detailed interactions in the most significant networks, are shown in Fig. 7B and C. Genes with different expression patterns predominantly mapped to the networks associated with protein synthesis, RNA post-transcriptional modification, and DNA replication, recombination and repair. This points to an important role of mdig in regulating the genes associated with genomic stability and cancer, further indicating its influence on the pathogenicity of TNBC.
Validation of top identified differentially regulated proteins and their relevance in breast cancer growth, motility and metastasis
Having established the changes in protein abundance in mdig KO cells, we selected the top five upregulated and top five downregulated proteins as determined by proteomic profiling and IPA. These proteins were selected from the top ready molecule list provided by IPA and for their specific relevance to breast cancer metastasis according to a literature survey. The upregulated proteins were CTSD, MAGED2, FLNA, STMN1 and RACK1, and the downregulated proteins were PLAUR, HYOU1, SOD2, RIN1 and NAMPT. To determine whether an association exists between these proteins and TNBC, we performed western blot analysis of these proteins in MDA-MB-231 cells expressing mdig (WT) and lacking mdig (KO) (Fig. 8A). Five WT clones and twelve KO clones were tested using tubulin as a loading control. We found a consistent pattern of altered protein expression in the TNBC cells: CTSD, MAGED2, STMN1 and RACK1 were upregulated in the KO cells, and PLAUR, HYOU1, SOD2, RIN1 and NAMPT were downregulated in the KO cells. Except for FLNA, the western blot results for all proteins agreed with the IPA findings and hence validated our results. This suggests that these proteins are potential candidate biomarkers for TNBC and are strongly associated with breast cancer. This analysis also demonstrates that mdig regulates the abundance of these proteins, which are implicated in motility, EMT and genomic stability, and thereby influences the overall malignant phenotype of aggressive breast cancers.
Evaluation of the identified proteins in predicting disease prognosis for the survival of breast cancer patients
To explore whether the above proteomics findings are clinically relevant for breast cancer patients, the top proteins identified as differentially abundant in mdig KO cells were evaluated for their performance in predicting disease prognosis and overall survival in breast cancer and TNBC. Survival data from 3951 breast cancer patients and 618 TNBC patients were obtained from an online gene profiling database (Kaplan-Meier plotter (34)). Patients were stratified by the expression level of each of the top identified and validated proteins, and high expression was evaluated for its association with survival (Fig. 8B). In breast cancer patients, high expression of STMN1, NAMPT, PLAUR and SOD2 predicted poor overall survival, whereas high expression of FLNA, MAGED2, RACK1, HYOU1 and RIN1 predicted better overall survival. In TNBC patients, however, high expression of MAGED2 and STMN1 predicted poor overall survival, whereas high expression of RACK1, HYOU1, PLAUR, RIN1 and SOD2 predicted better overall survival. The differential regulation of these proteins by mdig is an important finding. Their involvement in the regulation of cell proliferation, motility, invasiveness, cancer metabolism and ER stress makes them candidates that could be exploited therapeutically in breast cancer.
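Survival comparisons of this kind rest on the Kaplan-Meier estimator, computed separately for the high-expression and low-expression patient groups. The sketch below shows only the estimator for one group; it does not reproduce the expression cutoff selection or the significance testing performed by the online tool, and the follow-up times and event flags are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier survival estimate at each distinct event time.

    times  : follow-up time for each patient
    events : 1 if the event (death) was observed, 0 if censored
    Returns a list of (time, survival probability) pairs.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Group all patients whose follow-up ends at the same time.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            survival *= 1 - deaths / at_risk
            curve.append((t, survival))
        at_risk -= removed
    return curve


# Hypothetical follow-up data for five patients in one expression group.
curve = kaplan_meier(times=[2, 3, 3, 5, 8], events=[1, 1, 0, 1, 0])
```

Censored patients contribute to the number at risk until their last follow-up, which is why they lower later step sizes without producing a step of their own.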