Mitochondrial protein discovery using quantitative proteomics
To identify putative D. discoideum mitochondrial proteins, we searched for Dictyostelium homologs of the 1136 human mitochondrial proteins listed in Human MitoCarta 3.0 (Morgenstern et al., 2017; Rath et al., 2021) and retrieved 616 proteins (Figure 1A, Table S1). This number is far lower than the number of known mitochondrial proteins in humans (1136) and baker’s yeast (901) (Pagliarini et al., 2008; Rath et al., 2021). We posited that the mitochondrial proteome might be highly divergent between D. discoideum and humans, and that many Dictyostelium mitochondrial proteins might be missed by this bioinformatic curation. We therefore took a proteomic approach to directly identify mitochondrial proteins in D. discoideum (Figure 1A). From AX2 axenic cultures, we prepared mitochondrial isolates, both crude and highly purified, through Percoll gradient ultracentrifugation. We performed tandem mass tag (TMT) liquid chromatography-mass spectrometry (LC-MS) on both mitochondrial isolates and included AX2 whole-cell lysate as the control. A total of 6,892 proteins were captured in all samples (Table S2).
A limitation of identifying organellar proteins from their subcellular fractions alone is that high-abundance contaminants are often co-purified and result in false-positive hits. To address this issue, we assessed the probability of a protein localizing to mitochondria by comparing its relative enrichment in mitochondrial preparations to that of a list of 47 authentic mitochondrial proteins, which includes components of the electron transport chain complexes and conserved enzymes of the citric acid cycle (Table S3). We first calculated the ratio of a protein’s abundance in the mitochondrial isolates, both crude and highly purified, to its abundance in the whole-cell lysate. The resulting value, indicating its enrichment in mitochondrial preparations, was then normalized to the average enrichment ratio of the 47 reference mitochondrial proteins to compute the relative enrichment ratio (RER). Overall, a protein’s RER in the crude mitochondrial isolate agrees with that in purified mitochondria (Figure 2A). However, the distribution of RERs appears continuous in crude mitochondria (Figure 1B) but clusters into two distinct populations in purified mitochondria (Figure 1B), which allowed us to determine a proper RER threshold for mitochondrial proteins using mathematical modeling. Thus, we proceeded to analyze the RER for purified mitochondria only.
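The RER calculation described above can be sketched in a few lines of Python. This is a minimal illustration, not the pipeline used in the study: the function name, the dictionary inputs, and the toy abundances are hypothetical, and the "average" enrichment of the reference proteins is assumed to be an arithmetic mean on the linear scale.

```python
from statistics import mean

def relative_enrichment_ratios(mito, lysate, reference_ids):
    """Return {protein_id: RER} for proteins quantified in both samples.

    mito, lysate: dicts mapping protein ID -> abundance (hypothetical inputs).
    reference_ids: IDs of the reference mitochondrial proteins.
    """
    # Step 1: enrichment = abundance in the isolate / abundance in the lysate.
    enrichment = {
        pid: mito[pid] / lysate[pid]
        for pid in mito
        if pid in lysate and lysate[pid] > 0
    }
    # Step 2: average enrichment of the reference proteins
    # (arithmetic mean is an assumption of this sketch).
    ref_mean = mean(enrichment[pid] for pid in reference_ids if pid in enrichment)
    # Step 3: normalize every enrichment to the reference average.
    return {pid: value / ref_mean for pid, value in enrichment.items()}

# Toy abundances (made-up numbers, not data from the study).
mito = {"ref1": 80.0, "ref2": 120.0, "candidate": 50.0, "cytosolic": 5.0}
lysate = {"ref1": 10.0, "ref2": 10.0, "candidate": 10.0, "cytosolic": 10.0}
rer = relative_enrichment_ratios(mito, lysate, ["ref1", "ref2"])
```

By construction, the reference proteins average to an RER of 1.0, so a protein that co-purifies with them as efficiently as the reference set scores close to 1.0, while a depleted contaminant scores close to 0.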
In principle, a true mitochondrial protein would be co-purified with the reference mitochondrial proteins in pure mitochondrial isolates, and its RER should be 1.0. However, the RER distribution of the 47 reference proteins (Figure 2B) approximates a normal curve centered around 1.0, suggesting that many mitochondrial proteins may have an RER below 1.0. Among all proteins profiled using TMT-based LC-MS, only 259 have an RER higher than 1.0 in purified mitochondria (Figure 2C). We posit that different mitochondrial proteins might be degraded to different extents, depending on their intrinsic stability, during mitochondrial purification, which involves overnight ultracentrifugation. Therefore, it is necessary to determine a proper RER value to differentiate mitochondrial proteins from non-mitochondrial proteins. We applied the expectation-maximization (EM) approach to a Gaussian mixture model (GMM) to bin all proteins into two clusters, non-mitochondrial and mitochondrial, based on their RER values (Figure 2D). We chose an RER cutoff of 0.343 (Figure 2D) and assigned a total of 908 proteins with an RER higher than 0.343 as putative mitochondrial proteins (Table S4). The GMM predicts that fewer than 17% of proteins in the mitochondrial cluster, and only 0.1% of proteins in the non-mitochondrial cluster, would spill over to the other group (Figure 2D), which corresponds to an 83% recovery rate and a 7% false discovery rate, respectively.
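To illustrate how EM fits a two-component GMM to one-dimensional RER values, the following standard-library Python sketch may help. It is not the authors' implementation. The initialization, the synthetic data, and the rule used to derive a cutoff (the point between the two component means where both posterior probabilities equal 0.5) are assumptions made for this example; the text does not state how the cutoff of 0.343 was derived from the fitted model.

```python
import math
import random

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution N(mu, sigma^2) at x."""
    z = (x - mu) / sigma
    return math.exp(-0.5 * z * z) / (sigma * math.sqrt(2.0 * math.pi))

def fit_two_component_gmm(values, n_iter=200):
    """Fit a 1-D two-component Gaussian mixture by EM.

    Returns (weights, means, sigmas); index 0 is the lower-mean component.
    """
    xs = sorted(values)
    half = len(xs) // 2
    # Initialize the means from the lower and upper halves of the data.
    means = [sum(xs[:half]) / half, sum(xs[half:]) / (len(xs) - half)]
    sigmas = [max((xs[-1] - xs[0]) / 4.0, 1e-3)] * 2
    weights = [0.5, 0.5]
    for _ in range(n_iter):
        # E-step: posterior responsibility of each component for each point.
        resp = []
        for x in xs:
            p = [weights[k] * normal_pdf(x, means[k], sigmas[k]) for k in (0, 1)]
            total = (p[0] + p[1]) or 1e-300
            resp.append((p[0] / total, p[1] / total))
        # M-step: re-estimate weight, mean, and standard deviation.
        for k in (0, 1):
            nk = max(sum(r[k] for r in resp), 1e-12)
            means[k] = sum(r[k] * x for r, x in zip(resp, xs)) / nk
            var = sum(r[k] * (x - means[k]) ** 2 for r, x in zip(resp, xs)) / nk
            sigmas[k] = max(math.sqrt(var), 1e-3)
            weights[k] = nk / len(xs)
    order = sorted((0, 1), key=lambda k: means[k])
    return ([weights[k] for k in order], [means[k] for k in order],
            [sigmas[k] for k in order])

def posterior_upper(x, weights, means, sigmas):
    """Posterior probability that x belongs to the higher-mean component."""
    p0 = weights[0] * normal_pdf(x, means[0], sigmas[0])
    p1 = weights[1] * normal_pdf(x, means[1], sigmas[1])
    return p1 / ((p0 + p1) or 1e-300)

def equal_posterior_cutoff(weights, means, sigmas, tol=1e-6):
    """Bisect between the two means for the point where the posterior is 0.5."""
    lo, hi = means[0], means[1]
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if posterior_upper(mid, weights, means, sigmas) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0

# Synthetic RERs (not data from the study): a tight low cluster, a broad high one.
rng = random.Random(0)
toy_rer = ([rng.gauss(0.15, 0.04) for _ in range(300)]
           + [rng.gauss(1.0, 0.25) for _ in range(100)])
weights, means, sigmas = fit_two_component_gmm(toy_rer)
cutoff = equal_posterior_cutoff(weights, means, sigmas)
```

The overlap between the two fitted components on either side of the cutoff is what gives rise to the spill-over estimates: the tail of the mitochondrial component below the cutoff corresponds to missed proteins, and the tail of the non-mitochondrial component above it corresponds to false positives.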
Validation of mitochondrial protein discovery based on quantitative proteomics
To validate the accuracy of RER-based mitochondrial protein discovery, we first assessed the recovery rate of putative mitochondrial proteins in silico. Many mitochondrial proteins possess an N-terminal mitochondrial targeting sequence (MTS) that directs the import of nuclear-encoded mitochondrial proteins into the mitochondrial matrix (Backers, 2017). Overall, 24% of all proteins retrieved in the proteomics discovery experiment contain an MTS (Table S4). Importantly, 94% of these MTS-bearing proteins had an RER greater than 0.343 (Figure 3A). Conversely, 96% of proteins destined for other organelles such as the ER, Golgi, lysosomes, and vacuoles, or for the secretory pathway, had an RER less than 0.343. These analyses demonstrate that an RER cutoff of 0.343 effectively separates mitochondrial proteins from non-mitochondrial proteins.
We also surveyed the localization of 81 proteins recovered in LC-MS (Table S5) using fluorescence microscopy. These proteins were selected on the basis that their subcellular localization had not previously been annotated as mitochondrial and that their RERs are randomly distributed from 0.1 to 1.5. Each protein was tagged with GFP at its C-terminus and co-expressed with an MTS-mCherry fusion protein, which marks mitochondria in D. discoideum AX2 cells. Among the 81 proteins, 90% of those with an RER higher than 0.343 showed complete or partial mitochondrial localization (Figure 3B, 3D), whereas only 5% of those with an RER less than 0.343 showed mitochondrial localization (Figure 3B, 3D), demonstrating a strong positive correlation between RER value and probability of mitochondrial localization (Figure 3B). Moreover, logistic regression analysis of the localization pattern of these 81 proteins predicts that a protein has more than a 78% probability of localizing to the mitochondria if its RER is higher than 0.343 (Figure 3C).
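The logistic regression step can be illustrated with a small standard-library Python sketch that models the probability of mitochondrial localization as a function of RER. The fitting procedure (plain gradient ascent), the function names, and the toy observations are assumptions for illustration only; they are not the microscopy results in Table S5, and the software used for the published analysis is not specified in this section.

```python
import math

def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def fit_logistic(xs, ys, lr=0.5, n_iter=5000):
    """Fit P(y=1 | x) = sigmoid(b0 + b1 * x) by gradient ascent
    on the mean log-likelihood. Returns (b0, b1)."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(n_iter):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            err = y - sigmoid(b0 + b1 * x)  # residual for this observation
            g0 += err
            g1 += err * x
        b0 += lr * g0 / n
        b1 += lr * g1 / n
    return b0, b1

def localization_probability(rer, b0, b1):
    """Predicted probability of mitochondrial localization at a given RER."""
    return sigmoid(b0 + b1 * rer)

# Toy observations (made up): RER and whether the GFP fusion was mitochondrial.
toy_rer = [0.10, 0.15, 0.20, 0.25, 0.30, 0.30, 0.35, 0.40,
           0.45, 0.50, 0.60, 0.80, 1.00, 1.20, 1.50]
toy_loc = [0, 0, 0, 0, 0, 1, 1, 1,
           0, 1, 1, 1, 1, 1, 1]
b0, b1 = fit_logistic(toy_rer, toy_loc)
p_low = localization_probability(0.1, b0, b1)
p_high = localization_probability(1.0, b0, b1)
```

A positive slope `b1` means the predicted probability rises with RER, and evaluating the fitted curve at the chosen cutoff gives the kind of probability estimate reported for Figure 3C.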
A comprehensive mitochondrial protein compendium in D. discoideum
To further improve the coverage and accuracy of the mitochondrial protein discovery, we revised the list based on the in vivo microscopy validation by removing four proteins that did not localize to mitochondria and adding two that did. We also integrated three sources of mitochondrial protein candidates: the aforementioned list of mitochondrial proteins identified from quantitative proteomics analyses, those retrieved from homology detection, and those retrieved from a gene ontology search for mitochondrial genes. Among the 616 D. discoideum homologs of human mitochondrial proteins (Table S1), 352 proteins have an RER higher than 0.343 and hence were already included in the list, 223 proteins have an RER lower than 0.343, and 41 proteins were not captured in LC-MS. Among the 264 proteins that were not included in the list, 113 proteins do not have a predicted MTS (Table S1), whereas their human homologs have MTSs, suggesting that these proteins might localize to other cellular compartments in D. discoideum. An exception is ribosomal protein S14 (O21035), which is encoded in the nuclear genome in humans but in the mitochondrial genome in D. discoideum, and thus contains an MTS in human cells but lacks one in D. discoideum. We added the remaining 152 proteins to the list, as well as 32 proteins with mitochondrial gene ontologies that had not emerged during the proteomic or homology analysis. The final compendium consists of 1082 high-confidence mitochondrial proteins in D. discoideum (Table S6).
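The integration logic in this paragraph can be expressed as set operations. The Python sketch below is one possible reading of the text, not the authors' procedure: all argument names and toy identifiers are hypothetical, and the choice to keep microscopy-rejected proteins out of the final set even if another source nominates them is an assumption of this example.

```python
def build_compendium(proteomic_hits, microscopy_rejected, microscopy_confirmed,
                     homologs, dd_has_mts, human_has_mts, mito_encoded,
                     go_annotated):
    """Combine proteomic, homology, and gene-ontology evidence into one set."""
    rejected = set(microscopy_rejected)
    # Proteomic list revised by the microscopy validation.
    curated = (set(proteomic_hits) - rejected) | set(microscopy_confirmed)
    # Homologs not yet in the list are added unless they lack a predicted MTS
    # while their human homolog has one; mitochondrially encoded proteins
    # (such as ribosomal protein S14 in the text) are exempt from that rule.
    rescued = set()
    for pid in set(homologs) - curated:
        lost_mts = pid not in dd_has_mts and pid in human_has_mts
        if not lost_mts or pid in mito_encoded:
            rescued.add(pid)
    # Assumption: proteins rejected by microscopy stay excluded.
    return (curated | rescued | set(go_annotated)) - rejected

# Toy identifiers (hypothetical, for illustration only).
compendium = build_compendium(
    proteomic_hits={"A", "B", "C"},
    microscopy_rejected={"C"},
    microscopy_confirmed={"D"},
    homologs={"A", "E", "F", "G", "I"},
    dd_has_mts={"E"},
    human_has_mts={"E", "F", "G"},
    mito_encoded={"G"},
    go_annotated={"H"},
)
```

In the toy example, `F` is left out because it lacks an MTS while its human homolog has one, whereas `G` is retained despite lacking an MTS because it is mitochondrially encoded.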
Characterization of the D. discoideum mitochondrial proteome
Of the 1082 D. discoideum mitochondrial proteins, 627 and 458 have homologs in the mitochondrial proteomes of humans and Saccharomyces cerevisiae, respectively (Figure 4A), indicating that D. discoideum mitochondria are more closely related to mitochondria in metazoans than to those in fungi. Only 324 D. discoideum mitochondrial proteins have homologs in Rickettsia prowazekii (Figure 4A), an α-proteobacterium that is closely related to the mitochondrial ancestor. Overall, a total of 313 proteins, representing 28.9% of the D. discoideum mitochondrial proteome, have no homologs in the whole proteome of humans, S. cerevisiae, or R. prowazekii (Figure 4A, Table S6), indicating that a large fraction of D. discoideum mitochondrial proteins evolved de novo after the divergence of Amoebozoa. Moreover, 75 D. discoideum mitochondrial proteins (6.9%) have no homologs in D. purpureum (Figure 4A), a closely related species of social amoeba, further substantiating the fast-evolving nature of the amoeba mitochondrial proteome.
There are 89 D. discoideum mitochondrial proteins (8.2%) with human homologs that had not been annotated as mitochondrial proteins (Figure 4B, Table S6). Among these 89 proteins, 74 were also not annotated as mitochondrial proteins in yeast, including 32 that had homologs in S. cerevisiae. Given the estimated false discovery rate of our compendium, the localization of these proteins needs to be experimentally assessed. Nonetheless, there are a few examples, such as the RNB domain-containing protein (DDB_G0288469) and the tRNA-binding domain-containing protein (DDB_G0349377), both of which have predicted MTSs and are likely targeted to the mitochondrial matrix. The human homologs of DDB_G0288469, DIS3-like exonuclease 2, and DDB_G0349377, rhomboid-related protein 4, were not included in the human compendium (Rath et al., 2021), despite evidence that the yeast homolog of DIS3-like exonuclease 2 localizes to the mitochondria (Pagliarini et al., 2008), and that rhomboid-related protein 4 has been partially shown to localize to the mitochondria. The mitochondrial localization of their D. discoideum homologs supports the possibility that these two proteins indeed localize to the mitochondria and indicates that our compendium can complement previous studies toward a more comprehensive discovery of mitochondrial proteins in other organisms.
Additionally, we categorized the D. discoideum mitochondrial proteome using PANTHER biological function or protein family classifications (Figure 4B, Table S6). Proteins involved in mitochondrial gene expression and metabolism comprise the largest fractions of all mitochondrial proteins, over 20% for each category. Other proteins are involved in mitochondrial protein homeostasis, the electron transport chain, redox signaling and metabolism, and regulation of mitochondrial morphology and dynamics. A large fraction of D. discoideum mitochondrial proteins, approximately 15%, have no classified functions based on PANTHER analyses (Figure 4B).
D. discoideum-specific mitochondrial proteins
Proteins involved in gene expression constituted a large fraction of the D. discoideum-specific mitochondrial proteome (Figure 4B), reflecting that the D. discoideum mitochondrial genome is more complex than human mtDNA. In contrast, few metabolism proteins emerged in the list (Figure 4B), suggesting that metabolic processes are highly conserved between D. discoideum and metazoans. Here, we expand upon a few of the unique features of the D. discoideum mitochondrial protein compendium.
Mosaic nature of mitochondrial ribosomes
Mitochondrial ribosomes (mitoribosomes), ribosomal assembly factors, and other proteins involved in translation represented 8.9% and 9.3% of the overall and unique mitochondrial protein compendium, respectively. While mitoribosomes are thought to have evolved from bacterial ribosomes, the two differ greatly with regard to their structure, function, and composition of proteins and RNAs. We identified 51 proteins that are predicted to be mitoribosomal proteins, including 13 proteins that did not share significant homology with any H. sapiens, S. cerevisiae, or R. prowazekii proteins (Table S7). Interestingly, D. discoideum mitoribosomal proteins belong to families across several taxonomic groups: 35 proteins belong to mammalian mitoribosomal protein families (28S and 39S), 2 belong to eukaryotic cytosolic ribosomal protein families (60S), 9 belong to yeast mitoribosomal protein families (37S and 54S), 2 belong to chloroplast or bacterial ribosomal protein families (30S and 50S), 1 is from archaea, and 2 are universally conserved among prokaryotes and eukaryotes. It has previously been shown that cytosolic ribosomes tether to the mitochondrial outer membrane. Hence, the recovery of the 60S ribosomal protein L22 could result from the association of cytoplasmic ribosomes with the mitochondrial outer membrane rather than from its localization in the matrix (Gold et al., 2017). Nonetheless, the presence of proteins representing multiple mitoribosome lineages suggests that there may be D. discoideum-specific or protist-specific mechanisms to process mitochondrial transcripts and to regulate mitochondrial translation. Further validation of these findings is necessary, as the composition and structure of the D. discoideum mitoribosome have yet to be resolved.
Mitochondrial DNA and RNA processing factors
Among the list of unique proteins are 24 candidate mtDNA and mtRNA processing factors, including five endonucleases and pentatricopeptide repeat (PPR)-containing protein A (PtcA). Bioinformatic analysis suggests that PtcA belongs to the mitochondrial group I intron splicing family. PPR proteins, defined by tandem PPR domains, are implicated in several different mitochondrial gene expression processes, including translation initiation and ribosomal stabilization (Manna, 2015). The number of PPR proteins encoded in an organism varies greatly: terrestrial plants, such as Arabidopsis thaliana, have upwards of 450 PPR proteins, while humans have 7 (Lurin et al., 2004; Lightowlers and Chrzanowska-Lightowlers, 2013). D. discoideum has 12 PPR domain-containing proteins, including PtcA, reflecting the greater complexity of the D. discoideum mitochondrial genome compared to that of metazoans (Manna et al., 2013).
Divergent evolution path of lipid biosynthesis
The mevalonate pathway, which produces five-carbon building blocks for the synthesis of diverse biomolecules such as cholesterol and coenzyme Q10, is an essential and highly conserved process in eukaryotes, archaea, and some bacteria. In animals and fungi, the mevalonate pathway takes place in the ER, and 3-hydroxy-3-methylglutaryl (HMG)-coenzyme A (CoA) reductase (HMGR), a key enzyme in this pathway that converts HMG-CoA to mevalonate, localizes to the ER and peroxisomes (Chin et al., 1984; Keller et al., 1986; Burg and Espenshade, 2011). HMGR2, one of two HMG reductases in D. discoideum, is recovered in our compendium and contains a predicted MTS, suggesting that it likely localizes to the mitochondrial matrix. Additionally, HGSA, one of the two HMG-CoA synthases, also emerged as a mitochondrial protein. Our mitochondrial protein discovery suggests that mevalonate metabolism may take place in mitochondria in D. discoideum, highlighting the evolutionary divergence of some metabolic pathways that originated from the common mitochondrial ancestor.
Implication of mitochondrial function in multicellular development
D. discoideum cells with reduced mtDNA or a disruption of the gene encoding mt-ribosomal protein S4 display no defect in vegetative growth but have impaired starvation-induced development, suggesting that mitochondrial respiration is necessary for multicellularity (Chida, 2004; Chida et al., 2008). However, contrasting evidence has demonstrated a significant decrease in mitochondrial respiration after starvation and, accordingly, decreasing expression of many respiratory complexes (Kelly et al., 2021). To understand the potential regulation of mitochondrial function in multicellular development, we retrieved RNA sequencing data from the Dictyostelium gene expression database, dictyExpress (Parikh et al., 2010; Stajdohar et al., 2017).
Overall, there was a decrease in the expression of mitochondrial genes within our compendium over the 24-hr development time course (Figure 5A). A similar pattern is observed for genes encoding proteins involved in mitochondrial DNA maintenance and gene expression. Interestingly, despite the decrease in gene expression machinery (Figure 5B), over half of the mitochondria-encoded genes in the dataset (19 of 35) were upregulated (log2FC ≥ 1) after starvation induction (Figure 5C). Further, in examining all respiratory chain complexes, 12 nuclear-encoded ETC subunits had a higher expression level (log2FC ≥ 1) at or after 12 hours of starvation (Figure 5D), in addition to the 10 nuclear- or mitochondria-encoded subunits that show a burst of expression in the first 4 hours after starvation (Figure 5D). The complex pattern of mitochondrial gene expression, particularly the upregulation of electron transport chain complex subunits during development, suggests potential roles of mitochondrial respiration in Dictyostelium development and indicates that both nuclear- and mitochondria-encoded proteins are likely implicated in these processes.
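The log2FC ≥ 1 criterion used above can be made concrete with a short Python sketch. The baseline (the first time point, taken here as 0 h), the pseudocount, the function names, and the toy expression values are all assumptions of this example; the text does not specify how fold changes were computed from the dictyExpress data.

```python
import math

def log2_fold_changes(expression, baseline_index=0, pseudocount=1.0):
    """log2 fold change of each time point relative to the baseline.

    A pseudocount is added to avoid division by zero (an assumption here).
    """
    base = expression[baseline_index] + pseudocount
    return [math.log2((value + pseudocount) / base) for value in expression]

def upregulated_genes(time_courses, threshold=1.0):
    """Genes whose log2FC reaches the threshold at any post-baseline time point."""
    hits = []
    for gene, series in time_courses.items():
        lfc = log2_fold_changes(series)
        if any(value >= threshold for value in lfc[1:]):
            hits.append(gene)
    return sorted(hits)

# Toy expression time courses (made-up values, not dictyExpress data).
toy_courses = {
    "geneA": [10.0, 25.0, 40.0],  # rises more than twofold
    "geneB": [10.0, 12.0, 9.0],   # roughly flat
    "geneC": [30.0, 10.0, 5.0],   # declines
}
hits = upregulated_genes(toy_courses)
```

A threshold of 1 on the log2 scale corresponds to at least a twofold increase relative to the baseline, so only `geneA` is flagged in the toy data.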