Proteases are a large and divergent group of enzymes with various biological functions, including developmental processes, acquisition of nutrients and stress responses (Pérez-Lloréns et al. 2003). Calpains are well-conserved cysteine proteases that have been found in a wide range of eukaryotic organisms and in some bacteria, but not in Archaea. Most information about calpains comes from studies of humans, animals and plants, and little is known about their distribution, structure and function in bacteria. In this study, we investigated 50 species of cyanobacteria to determine whether calpains are present in this taxonomic group.
We identified calpains in 10 cyanobacterial species using a Hidden Markov Model of the catalytic CysPC domain typical of calpain proteins. The number of cyanobacterial species found to possess calpains is relatively low (10 of 50), but, as shown previously, cyanobacteria are a highly diverse group and their genome content varies significantly even at the species and strain levels (Mohanta et al. 2017). In cyanobacteria, the CysPC domain is often associated with a PPC domain (Table 1). The PPC domain is typically found at the C-terminus of bacterial secreted proteins (Yeats et al. 2003), whereas in cyanobacterial calpains it is found at the N-terminus. Transmembrane helical regions are absent from all putative cyanobacterial calpains, suggesting a cytosolic localization. These findings are consistent with studies of calpains in other bacteria, which also lack predicted transmembrane regions (Rawlings 2015).
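The domain arrangement described above (a PPC domain N-terminal to the CysPC domain) can be checked programmatically once domain coordinates are available. The following is a minimal sketch only: the `Domain` record, the function names and the coordinates are hypothetical and are not taken from Table 1, and the sketch assumes at most one PPC and one CysPC domain per protein.

```python
from typing import NamedTuple


class Domain(NamedTuple):
    name: str
    start: int  # 1-based position of the first residue of the domain
    end: int    # position of the last residue of the domain


def domain_order(domains):
    """Return domain names sorted from N- to C-terminus by start coordinate."""
    return [d.name for d in sorted(domains, key=lambda d: d.start)]


def ppc_precedes_cyspc(domains):
    """True if the protein has both domains and PPC starts before CysPC."""
    starts = {d.name: d.start for d in domains}
    if "PPC" not in starts or "CysPC" not in starts:
        return False
    return starts["PPC"] < starts["CysPC"]


# Hypothetical coordinates for illustration only (not real annotation data).
example_calpain = [Domain("CysPC", 210, 520), Domain("PPC", 35, 110)]

print(domain_order(example_calpain))
print(ppc_precedes_cyspc(example_calpain))
```

With real data, the coordinates would come from the domain annotation used to build Table 1; proteins with repeated domains would need a per-domain comparison rather than the dictionary lookup used here.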
Calpains are known to be involved in many cellular processes in eukaryotes, such as aleurone bilayer development and positional cell division in plants (Olsen et al. 2015), and brain function, memory formation and the development of many pathological processes in mammals (Ono et al. 2016). Calpains cleave a wide range of substrates, including protein kinases, receptor molecules and proteins involved in signal transduction. It has been proposed that calpains play a main role in the regulation of cell signalling rather than in protein digestion (Wang et al. 1989; Moriyasu and Wayne 2004). However, their function in bacteria remains unknown.
Our in silico interaction analysis predicted that cyanobacterial calpains from C. parasitica, F. muscicola, F. thermalis and S. hofmannii (S. hofmannii 1) interact with methionine synthase (Fig. 4). Methionine synthase catalyses the transfer of a methyl group from methyl-cobalamin to homocysteine (Deobald et al. 2020). S8 peptidase putatively interacts with calpains from A. minutissima, C. minutus and C. polymorphus. S8 endopeptidases are thermostable secreted proteins involved in nutrition that are present in almost all taxa except for fungi (Li et al. 2016).
SecA, TamB and collagen triple helix repeat protein were identified as putative calpain interaction partners in C. minutus and C. polymorphus. SecA couples the hydrolysis of ATP to the transfer of proteins into and across the bacterial plasma membrane (Fröderberg et al. 2004). Cyanobacterial SecA is mainly found as a soluble homodimer in the cytosol, but a small fraction of this protein is associated with the cytoplasmic as well as the thylakoid membranes, suggesting that SecA is involved in protein translocation across these membranes in cyanobacteria (Nakai et al. 1994). TamB is a component of the translocation and assembly module autotransporter complex. It functions in the translocation of autotransporters across the outer membrane of Gram-negative bacteria. Collagen triple helix repeat protein consists predominantly of Gly-X-Y repeats, where X and Y can be any amino acid, and its polypeptide chains form a triple helix. Collagens are generally extracellular structural proteins involved in the formation of connective tissues (Mayne and Brewton 1993) and are mostly known from eukaryotes, but some bacterial species have been shown to possess collagen-like proteins as well. Among cyanobacteria, a collagen-like protein was first identified in the filamentous cyanobacterium Trichodesmium erythraeum (Layton et al. 2008; Price and Anandan 2013). This protein can bind an antibody against human collagen. It is localized between adjacent cells of filaments and was shown to play an important role in the structural development of filaments (Layton et al. 2008; Price and Anandan 2013). Chamaesiphon spp. are currently not known to form filaments; however, according to phylogenetic analysis based on 16S rDNA, the genus is closely related to Gomontiellaceae, which cluster phylogenetically with other groups of filamentous cyanobacteria belonging to Oscillatoriales (Kurmayer et al. 2018).
Three annotated putative calpain interaction partners were identified in A. minutissima: BPSL0067 family protein, cadherin repeat-containing protein and S8 peptidase. Cadherins are adhesion molecules responsible for cell-cell interactions and the maintenance of appropriate intercellular spacing (Guan et al. 2014). They also play a key role in stress responses (Ivanov et al. 2001; Tripathi and Sowdhamini 2008).
In C. parasitica, five annotated putative interaction partners were predicted: glycosyltransferase, glycoside hydrolase family 3 protein, methionine synthase, tandem 95 repeat protein and transposase ISAzo13. Proteins that belong to glycoside hydrolase family 3 are widely distributed in bacteria, fungi and plants. They often cleave a broad range of substrates and function in a variety of cellular processes, including cellulosic biomass degradation, remodelling of the cell wall in both bacteria and plants, energy metabolism and pathogen defence (Harvey et al. 2000; Lee et al. 2003).
Although F. thermalis and F. muscicola are closely related filamentous cyanobacteria, the proteins predicted to interact with their calpains differ, except for methionine synthase. In F. thermalis, calpain could also interact with beta-N-acetylhexosaminidase, glycosyltransferase and a radical SAM (S-adenosyl-L-methionine) protein (Fig. 4). Beta-N-acetylhexosaminidase might also interact with the calpain found in S. hofmannii. In bacteria, this enzyme is known to play a role in the peptidoglycan recycling pathway, which is important for cell wall development (Litzinger et al. 2010). N-acetylhexosaminidase is classified in glycoside hydrolase family 3 (McDonald et al. 2015), another member of which potentially interacts with calpain from C. parasitica.
A WD40-like protein was identified as a putative calpain interaction partner in F. muscicola (Fig. 4). WD40-repeat proteins belong to a large protein family and are implicated in a variety of functions in eukaryotes, ranging from signal transduction and transcription regulation to cell cycle control, autophagy and apoptosis. All WD40-repeat proteins play a role in coordinating multi-protein complex assemblies. The repeating units serve as a rigid scaffold for protein interactions (Mishra et al. 2014; Stirnimann et al. 2010). In bacteria, over 4000 WD40 proteins have been identified, which is more than 6% of all known WD40 proteins. Only a small number of bacterial genomes encode WD40 proteins, and these proteins are most abundant in cyanobacteria and Planctomycetes (Hu et al. 2017). In bacteria, WD40 proteins function as signal transducers and are also involved in nutrient synthesis (Hu et al. 2017).
The putative interaction partners of calpain 1 identified in S. hofmannii play roles in stress responses, including expression of specific genes, activation of specific proteins, transport of proteins between membranes and altering the progress of DNA replication (Matelska et al. 2017; Imamura and Asayama 2009; Paget and Helmann 2003).
Calpain from F. muscicola was also predicted to interact with a SAM-dependent methyltransferase; such methyltransferases are responsible for nitrogen sequestration (Sofia 2001). Another interaction partner is Can B-type protein, a calcium-dependent, calmodulin-stimulated protein phosphatase containing a regulatory subunit of calcineurin with calcium sensitivity (Li et al. 2016). Our results show that the predicted interaction partners of the identified cyanobacterial calpains differ significantly among the studied cyanobacterial species.
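The degree to which predicted partners differ between species can be made explicit by comparing partner sets. The sketch below is illustrative only: the sets are transcribed from the partners named in the text for four species and may not be exhaustive, and the function names `shared` and `jaccard` are our own. Partner names are treated as plain labels, so related annotations (for example beta-N-acetylhexosaminidase and glycoside hydrolase family 3 protein) are not merged.

```python
# Partner sets transcribed from the text above; possibly incomplete.
partners = {
    "A. minutissima": {
        "BPSL0067 family protein",
        "cadherin repeat-containing protein",
        "S8 peptidase",
    },
    "C. parasitica": {
        "glycosyltransferase",
        "glycoside hydrolase family 3 protein",
        "methionine synthase",
        "tandem 95 repeat protein",
        "transposase ISAzo13",
    },
    "F. thermalis": {
        "methionine synthase",
        "beta-N-acetylhexosaminidase",
        "glycosyltransferase",
        "radical SAM protein",
    },
    "F. muscicola": {
        "methionine synthase",
        "WD40-like protein",
        "SAM-dependent methyltransferase",
        "Can B-type protein",
    },
}


def shared(species_a, species_b):
    """Partners predicted for both species (set intersection)."""
    return partners[species_a] & partners[species_b]


def jaccard(species_a, species_b):
    """Jaccard index: size of the intersection divided by size of the union."""
    a, b = partners[species_a], partners[species_b]
    return len(a & b) / len(a | b)


print(shared("F. thermalis", "F. muscicola"))
print(jaccard("F. thermalis", "F. muscicola"))
```

For the two Fischerella species, the only shared label is methionine synthase, which matches the statement that their predicted partners otherwise differ.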
We also conducted a phylogenetic analysis of the calpain core CysPC domain to infer the phylogenetic position of cyanobacterial calpains. The analysis revealed the monophyly of bacterial as well as of eukaryotic CysPCs, with bootstrap support of 98 and 97, respectively (Fig. 5). No horizontal gene transfers of the CysPC domain from bacteria to eukaryotes or vice versa were detected with our taxon sampling. The tree, however, does not reflect the real evolutionary relationships within either bacteria or eukaryotes. CysPC is thus unlikely to be a suitable marker for inferring the evolutionary relationships between organisms, and it is also possible that several horizontal transfers of calpains have occurred within bacteria as well as within eukaryotes.
With the exception of S. hofmannii 2, all cyanobacterial CysPC domains form a monophyletic group within the bacterial CysPC domains (Fig. 5). The alignment of cyanobacterial CysPC domains also confirms that CysPC domain 2 from S. hofmannii is the most divergent of the cyanobacterial CysPC domains (Fig. 3). Because cyanobacterial CysPC domains group with those of other bacteria rather than with eukaryotic ones, the tree topology also contradicts the hypothesis that cyanobacteria, from which the chloroplasts of Archaeplastida evolved, were the endosymbiotic donors of archaeplastidial calpains.
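The notion of monophyly used above can be stated operationally: a set of sequences is monophyletic in a rooted tree if some clade contains exactly those sequences and no others. The following minimal sketch illustrates this test on a toy tree written as nested tuples. The topology and leaf names are hypothetical and greatly simplified; they are not the tree in Fig. 5, and the functions are our own illustration rather than the software used for the analysis.

```python
def leaves(tree):
    """Return the set of leaf names under a nested-tuple tree."""
    if isinstance(tree, str):
        return {tree}
    result = set()
    for child in tree:
        result |= leaves(child)
    return result


def is_monophyletic(tree, group):
    """True if some clade of the rooted tree contains exactly the leaves in group."""
    group = set(group)
    if leaves(tree) == group:
        return True
    if isinstance(tree, str):
        return False
    return any(is_monophyletic(child, group) for child in tree)


# Hypothetical rooted toy tree: a bacterial clade and a eukaryotic clade.
toy_tree = (
    (("cyano_A", "cyano_B"), ("S_hofmannii_2", "other_bacterium")),
    ("eukaryote_1", "eukaryote_2"),
)

print(is_monophyletic(toy_tree, {"cyano_A", "cyano_B"}))
print(is_monophyletic(toy_tree, {"cyano_A", "cyano_B", "S_hofmannii_2"}))
```

In this toy topology, the cyanobacterial sequences are monophyletic only when the divergent sequence is excluded, which mirrors the pattern described for S. hofmannii 2. The result depends on the rooting, so an unrooted tree would first have to be rooted.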
The explanation of the origin of eukaryotic calpains depends on the view taken of the origin of eukaryotes themselves. The most popular hypothesis suggests that eukaryotes evolved by the endosymbiosis of an alphaproteobacterial ancestor of mitochondria in an archaeal host (Martin and Müller 1998), probably from the group Asgard archaea (Spang et al. 2019; Liu et al. 2021). Since archaea do not possess calpains, while some alphaproteobacteria do, under this scenario the archaeal host cell could have obtained a calpain gene from the alphaproteobacterial endosymbiont. This scenario would be supported if alphaproteobacterial CysPC domains were placed at the base of eukaryotic CysPCs in the phylogenetic tree with high bootstrap support. Since this is not the case (Fig. 5), our tree does not support an alphaproteobacterial origin of eukaryotic calpains. Nevertheless, the hypothesis that an archaeal ancestor of eukaryotes or the last common ancestor of eukaryotes obtained the calpain gene from an unknown bacterial donor, e.g. via an ancient horizontal gene transfer, cannot be rejected.
An alternative, currently less popular but still plausible, hypothesis suggests that Archaea and Eukarya are sister groups sensu Woese et al. (1990). Under this scenario, Archaea and Eukarya had a common undefined ancestor. This ancestor might have been even more complex than all contemporary archaea. The domain Archaea might have arisen via reductive evolution of this archaeo-eukaryotic ancestor, and the differences between the genome contents of contemporary archaeal lineages could be explained by differential gene losses (Vesteg and Krajčovič 2011; Vesteg et al. 2012; Forterre 2015). Under this scenario, the calpain gene could have been present already in the last universal common ancestor, lost in the ancestor of Archaea, and retained in the ancestors of Bacteria and Eukarya. Since calpain genes are not universally distributed in either bacteria or eukaryotes, all of the scenarios mentioned would require multiple independent losses of calpain genes in various bacterial and eukaryotic lineages.