CDPKs are characterized by the presence of an N-terminal domain, a kinase domain, an auto-inhibitory domain, and a regulatory domain [10, 11]. The regulatory domain contains four calcium-binding EF-hands [12, 13]. These EF-hands are conserved and contain a D-x-D motif, with conserved aspartate (D) residues at the 14th and 16th positions that are responsible for binding Ca2+ ions [14]. In addition, CDPKs contain N-terminal palmitoylation and myristoylation sites [14]. Hence, it was important to check the annotated sequences for all four domains and for the N-terminal palmitoylation and myristoylation signal sequences. However, neither the N-terminal signal sequences nor the regulatory domain were found in these sequences, which raises the question of how the genes/proteins came to be annotated as calcium/calmodulin-dependent protein kinases in the complete absence of a calcium-binding EF-hand domain. Even homology-based annotation should not produce such a misleading result when the EF-hand-containing regulatory domain is completely lacking. It is well known that fungi do not encode CDPKs in their genomes, whereas CDPKs are present in the plant and animal kingdoms [10, 14, 15]. Calcium-dependent protein kinases play diverse roles in plants and animals [10, 14, 15]. In plants, CDPKs regulate growth, development, and biotic and abiotic stress tolerance [16–19]. In fungi, calcium-signaling events are regulated by calmodulins, calcineurin B-like proteins, calmodulin-like proteins, calcineurin-responsive zinc finger transcription factor, Ca2+-ATPase, Ca2+/H+ exchangers, the high-affinity calcium system, the low-affinity calcium system, transient receptor potential (TRP)-like calcium channels, and the mitochondrial calcium channel [20, 21]. However, fungi do not encode a CDPK gene family for calcium signaling.
The basal cytoplasmic calcium level in fungi ranges from 50 to 200 nM, and fungi store most of their Ca2+ (approximately 95%) in the vacuole. Calmodulin, calcineurin B-like proteins, and other components maintain Ca2+ homeostasis independently of CDPKs [20]. Cellular Ca2+ channels and cation/proton exchangers regulate filamentous growth in fungi, which is associated with cell division, hyphal tip growth, and hyphal branching [22]. The vacuolar Ca2+-ATPase in fungi is closely related to the plasma membrane Ca2+-ATPase (PMCA)-type pump. The PMCA contains a cytosolic auto-inhibitory domain at the C-terminal end, and its inhibition is relieved by the binding of calmodulin [20, 23]. The auto-inhibitory domain of the fungal PMCA is functionally similar to the auto-inhibitory domain of CDPKs in plants and animals. It may therefore act as a structural and functional counterpart of the CDPK for conducting calcium-signaling events in fungi, which could explain why fungi do not encode CDPKs.
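The domain criteria described above for CDPKs can be expressed as a simple screening check. The following is a minimal sketch and not the procedure used in this study. It assumes that domain annotations have already been obtained from some domain-prediction tool, that the domain labels (`n_terminal`, `kinase`, `auto_inhibitory`, `regulatory`) are hypothetical names chosen for illustration, and that positions 14 and 16 are counted from the first residue of each supplied EF-hand segment.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Hypothetical domain labels; a real domain-prediction tool will use its own names.
REQUIRED_DOMAINS = ("n_terminal", "kinase", "auto_inhibitory", "regulatory")
EXPECTED_EF_HANDS = 4  # the regulatory domain is described as having four EF-hands


@dataclass
class ProteinAnnotation:
    name: str                      # annotation name as it appears in the database
    domains: Set[str]              # domain labels predicted for the protein
    ef_hands: List[str] = field(default_factory=list)  # EF-hand segment sequences


def has_dxd(ef_hand: str) -> bool:
    """Return True if residues 14 and 16 (1-based) of the segment are both Asp (D).

    Assumption: positions are counted from the start of the supplied segment.
    """
    return len(ef_hand) >= 16 and ef_hand[13] == "D" and ef_hand[15] == "D"


def is_cdpk_like(protein: ProteinAnnotation) -> bool:
    """Check the four-domain architecture and the D-x-D residues in every EF-hand."""
    has_all_domains = all(d in protein.domains for d in REQUIRED_DOMAINS)
    has_four_hands = len(protein.ef_hands) == EXPECTED_EF_HANDS
    return has_all_domains and has_four_hands and all(has_dxd(e) for e in protein.ef_hands)


def flag_suspect_annotations(proteins: List[ProteinAnnotation]) -> List[str]:
    """List names annotated as calcium-dependent kinases that fail the CDPK check."""
    flagged = []
    for p in proteins:
        label = p.name.lower()
        if "calcium" in label and "kinase" in label and not is_cdpk_like(p):
            flagged.append(p.name)
    return flagged
```

A sequence annotated as a calcium-dependent kinase that has only a kinase domain and no EF-hands would be returned by `flag_suspect_annotations`, which mirrors the situation described for the fungal sequences examined here.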
Selenoproteins contain the amino acid selenocysteine (Sec), which is encoded by the UGA codon. Proteins associated with the reactive oxygen species signaling machinery, such as glutathione peroxidase, contain Sec [24]. Sec has been reported in plants [25], animals [26], and bacteria [27]. However, it has not been reported in fungi, and Mariotti et al. (2015) reported that fungi do not contain Sec [7]. The presence of ambiguous gene/protein annotation names containing “selenocysteine” in fungal genomes/proteomes is therefore a concern for the research community. It is important to examine the annotation strategy used for fungal genomes, and researchers should treat this as a serious problem that requires immediate attention. Genome annotation errors have also been reported previously in bacteria [28, 29]. Several reasons may account for the misannotation of gene/protein sequences. The parameters that set the lower limit for coding sequences may be one of them. Another likely reason is the lack of a “gold standard” set of reference sequences. However, misannotation at the super-family level, as with the CDPKs, is particularly concerning. A recent comparative study of protein-coding and lncRNA transcripts in the RefSeq and Gencode human gene databases found that only 27.5% of the Gencode transcripts had introns that exactly matched the positions of the introns in the corresponding RefSeq genes [30]. Even after 19 years of continuous effort, the exon-intron boundaries of the human genome are not yet settled. The problem in yeast and Arabidopsis is even worse than in human [30]. Advances in RNA sequencing could partly rectify the problem, as full-length transcripts can be aligned directly to the genome to reveal the exon-intron structure.
The Mammalian Gene Collection, which includes genes from humans and a few other species, could reduce the error rate through the RNA-seq approach. The modern annotation pipeline MAKER uses RNA-seq data together with alignments against databases of other proteins as evidence for assigning annotations. Although RNA-seq has its own limitations, it remains a viable way to reduce annotation errors. Errors in assembly can also lead to errors in annotation. Although automated genome annotation is fast enough to keep pace with the sequencing of many large genomes, any minor error in an existing annotation can propagate directly to other species.