Targeted Genome-Wide Data Enrichment for Phylogenomics of Amoebozoa
Despite the large amount of RNA-Seq data generated in recent studies 4,24−26, only a small fraction of these data has been utilized in phylogenomic analyses. To increase this fraction, we compiled a total of 1559 markers using genome-derived protein-coding genes from 113 amoebozoan genomes and transcriptomes. Using putative single-copy markers, primarily derived from Amoebozoa genomes, enabled us to introduce highly conserved markers with phylogenetic signal corroborating morphology- and phylogenomics-based amoebozoan hypotheses 4,24. While single-copy genes identified in some genomes might not always apply to others, a previous phylogenomic study of seed plants based on single-copy markers resulted in a better resolved phylogeny at both shallow and deep nodes 32. In this study, we followed a stringent approach aided by automated and manual curation of markers, selected from the above-mentioned dataset, to build the largest supermatrix (823 genes) in the Amoebozoa. With this approach, we substantially increased the total number of genes used in Amoebozoa phylogenomics. Our analysis yielded consistent and well-corroborated topologies, regardless of whether we included or excluded fast-evolving sites (Figs. 1, S2). The robustness of our phylogeny is also corroborated by the high support values from internode certainty analysis (Figs. S3, S4). Two evident results of this approach are the first phylogenomic recovery of the monophyly of the taxon Stygamoebida, earlier supported only at the morphological level 22,23, and the recovery of a novel deep divergence within Amoebozoa.
Unraveling deep divergence of Amoebozoa
A recent phylogenomic study by Kang et al. 4, though based on a slightly smaller taxon sampling, proposed a split of the Amoebozoa supergroup into two major subclades: Tevosa (Evosea + Tubulinea) and Discosea. By contrast, in our study Evosea robustly groups as the sister clade to Discosea (Figs. 1, S1, S2). Both phylogenetic hypotheses, ‘Tevosa’ and Divosa (Evosea + Discosea), receive high statistical support, in their study 4 and in ours (Fig. 1), respectively. In phylogenomic analyses, it is common for deep nodes subtended by short branches to receive high statistical support 33. Amoebozoan deep nodes are characterized by very short branch lengths, an indication of limited supporting characters or of possible ancient rapid diversification. Strong statistical support at such nodes does not necessarily mean that the inferred relationships are correct. Statistical indices such as bootstrap values and Bayesian posterior probabilities only assess sampling effects, and give an indication of tree reliability that is dependent on the data and the method 34. This can partially explain why these short-branch, deep nodes in Amoebozoa phylogenomic studies tend to collapse, or vary, depending on the method of analysis or the composition of the gene and taxon sampling 4,24−26. Caution must therefore still be taken when interpreting ancient divergences, because results can be muddied by noise (e.g., gene history 35 or lack of signal due to rapid radiation 29). However, the support for the split recovered in the present study is high and originates from different lines of evidence.
It is notable that in many lineages the trophozoites of Discosea and Variosea are more similar to each other than to those of Tubulinea. Certainly, the morphology of extant amoeboid organisms is derived and adaptive, but in general members of the Divosa lineage share more morphological similarity with each other than with the Tubulinea lineage. For example, amoebae of the genus Flamella, belonging to the class Variosea, may easily be confused morphologically with some discosean amoebae (e.g., 36); the same is true for individual trophozoites of many mycetozoan species, which show a flattened body shape and pointed subpseudopodia 37,38. Cells of amoebae belonging to the genus Squamamoeba (Cutosea) sometimes resemble Korotnevella (Discosea) in their overall morphology, although they are organized differently at the cytological level 39. At the same time, none of the discosean or variosean lineages shows a morphology resembling that of, e.g., Amoebida, or the alteration of the locomotive morphology from flattened to tubular that is a general characteristic of Tubulinea 20,22. To a certain extent, a return to a tubular body shape, subcylindrical in cross-section, occurs among amoeboid representatives of Archamoebea; however, this might be mostly related to their specific lifestyle (parasites or pelobionts). In addition, their pattern of pseudopod formation (e.g., the tendency to show eruption of the hyaline cytoplasm in the frontal area of the cell) differs significantly from that in Tubulinea (see 40).
Mid-Proterozoic environment – the driving force for the origin of Amoebozoa
The flagellum (cilium) is a highly conserved complex structure that is believed to have originated only once and to be ancestral to all eukaryotes 2,41,42. Amoebozoa are remarkable in that the two basal phylogenetic lineages, Tubulinea and Discosea, have entirely lost cilia, kinetosomes (basal bodies) and associated root structures, while a derived major clade, Evosea, contains a handful of ciliated lineages in a few branches intermingled among amoeboid lineages 21,22. The loss of cilia and associated structures in the majority of the members of Amoebozoa is one of the biggest mysteries pertaining to their origin and evolution.
In ciliated members of Amoebozoa, the ciliary apparatus is characterized by a specific arrangement of root structures, which includes an incomplete (Variosea and Mycetozoa) or complete (Archamoebea) cone of microtubules extending from the kinetosome to the nucleus 43. In early interpretations, this conical arrangement of microtubules was considered to be homologous to the ciliary root system of Opisthokonta, which, together with other morphological and molecular evidence, gave rise to the “Unikonta” hypothesis 2,44,45. In this model, the hypothetical ancestor of Amoebozoa was considered to be an organism with a single emergent cilium, resembling Phalansterium or Mastigamoeba in cellular organization 46,47. This lineage, combining Amoebozoa and Opisthokonta, was proposed as an alternative to that of the bikonts, which have two emergent cilia and included the rest of the eukaryotic groups. Cavalier-Smith 2 argued that among unikonts, paired kinetosomes (when present) resulted from convergent evolution rather than common ancestry with bikonts. Molecular and morphological analyses later provided certain indications that the microtubular structures in Amoebozoa and Opisthokonta may not be homologous 43,48. Moreover, further development of molecular phylogeny provided evidence for the basal position of bikont organisms in the tree of eukaryotes 3,49,50. Accordingly, the current consensus is that the hypothetical common ancestor of Amoebozoa was a bikont organism 43,51,52. Several authors (e.g., 3,43,49,50) hypothesised that this presumed common ancestor was a ventrally grooved, biciliate, gliding flagellate capable of producing filose ventral pseudopodia and possessing a relatively complex cell organization: a cell with two cilia with kinetosomes and root structures, a ventral groove supported by microtubules, and a dorsal pellicle, the so-called “sulcozoan ancestor”.
Its name originates from Sulcozoa, a phylum of protists established by Cavalier-Smith 43 that combines a heterogeneous assemblage of early evolving eukaryotic lineages. Cavalier-Smith suggested that “opisthokonts and Amoebozoa evolved from sulcozoan ancestors by two independent losses of the pellicular dense layers and of the ventral groove, which in both cases would allow pseudopods to develop anywhere on the cell surface” (op. cit.).
Under this hypothesis, the origin and further evolution of Amoebozoa presume the loss of both cilia and kinetosomes in Lobosa (Tubulinea and Discosea), and of the posterior cilium and one kinetosome in most of the ancestors of Conosa (Archamoebae, Variosea and Eumycetozoa); Cutosea were not known at that time (e.g., 3,49,50). This evolutionary scenario was rather logical and is illustrated in Fig. 2A. However, the Lobosa/Conosa dichotomy was doubted based on some 18S gene phylogenies 27, and it subsequently failed to garner support in wide-scale phylogenomic studies 4,24,25, as well as in the present study. This makes the model of multiple losses more complicated, because under the new tree configuration we have to assume successive partial or complete losses of cilia and related structures in all but one branch of Amoebozoa. This hypothetical scenario is illustrated in Fig. 2B. It remains unclear why the hypothesized ancestor of Amoebozoa, initially a quite complex biciliated organism, underwent such a massive loss (or substantial simplification) of cilia-related structures in almost all evolutionary lineages of Amoebozoa, and what the driving force for such a reduction was.
Several studies based on molecular dating analysis consistently placed the origin of Amoebozoa in the Mesoproterozoic, i.e., 1250–1624 mya 31,53. This means that the early evolution of Amoebozoa took place in a period when the biosphere was dominated by microbial biofilms: sheets of bacteria, embedded in extracellular polymeric substances, covering almost every possible substrate 54. Initially rather simple, biofilms further evolved into complex microbial mats comprising different prokaryotic organisms that show concerted activities and intimate interactions between various microbial metabolisms 55. The oldest mats are dated to approximately 3.5 billion years ago, and the heyday of mats covers the mid-Proterozoic period 56,57, which roughly corresponds to the estimated potential age of Amoebozoa.
Formation of a microbial biofilm, among other structural and biogeochemical features, can be explained as an adaptation that increases the survival of bacteria by helping them avoid predation 58,59. The bacterivorous biflagellate ancestor of Amoebozoa was probably relatively small, likely no larger than the existing representatives of the CRuMs clade (e.g., Mantamonas) or ‘Excavates’ (metamonads or Malawimonas), which fall within the general size range of 2–20 µm. These organisms were able to phagocytize solitary bacteria, but consumption of microorganisms embedded in an intact microbial mat was probably beyond their capacity, just as it is beyond the capacity of modern flagellates of comparable size 60,61. Feeding on the bacteria that are the major constituents of microbial mats (the dominant food source in the mid-Proterozoic environment) required an increase in body size and the acquisition of special adaptations for ingesting filamentous food. The latter was again related to body size, because a filament, even if compacted in some way, must be ingested, i.e., must fit inside the cell.
Because of the Reynolds number limitation 62,63, an increase in body size makes ciliary motility less efficient and therefore less adaptive. Thus, from an adaptive perspective, an amoeboid lifestyle might be a way to increase body size while retaining a motility function that no longer depends on cilia. An amoeboid organization could also confer the capacity to disrupt microbial mats and graze on the bacteria within them. This adaptation would provide access to the dominant food source in the mid-Proterozoic biosphere. Indeed, naked amoebae are presently known as one of the primary grazers of bacterial biofilms 64–66. Moreover, they not only graze and phagocytize prey in the mats, but also disrupt them, making their content available to other organisms 67,68. Finally, in addition to the advantage of feeding on bacterial mats 69,70, it is also possible that an increase in body size alleviated the pressure of predation by other organisms on the last Amoebozoa common ancestor (LACA), which for some time provided it with an adaptive advantage and allowed rapid proliferation and differentiation of Amoebozoa in the mid-Proterozoic environment.
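As a rough illustration of the physical scale involved in this argument, the Reynolds number Re = ρvL/μ can be computed for cells of different sizes. The sketch below is not part of the original analysis: the fluid constants are approximate values for water, the swimming speed is a hypothetical round number, and the 200 µm cell is a hypothetical larger cell added for comparison with the 2–20 µm range quoted above. It shows only that cells across this size range remain at Re far below 1, where viscous forces dominate; it does not by itself quantify the loss of ciliary efficiency with increasing size.

```python
# Illustrative only: Reynolds number Re = rho * v * L / mu for a swimming cell.
# Fluid constants approximate water at about 20 C; the swimming speed is a
# hypothetical round number, not a measurement from this study.

WATER_DENSITY = 1000.0    # kg / m^3 (approximate)
WATER_VISCOSITY = 1.0e-3  # Pa * s (approximate)


def reynolds_number(length_m, speed_m_per_s,
                    density=WATER_DENSITY, viscosity=WATER_VISCOSITY):
    """Return the dimensionless Reynolds number for a body of given size and speed."""
    return density * speed_m_per_s * length_m / viscosity


if __name__ == "__main__":
    assumed_speed = 100e-6  # m/s, hypothetical swimming speed
    # 2 and 20 um bracket the size range quoted in the text;
    # 200 um is a hypothetical larger cell for comparison.
    for length_um in (2, 20, 200):
        re = reynolds_number(length_um * 1e-6, assumed_speed)
        print(f"L = {length_um:>3} um  ->  Re = {re:.1e}")
```

At a fixed speed, Re scales linearly with cell length, so a tenfold increase in size gives a tenfold increase in Re while the cell stays firmly in the viscosity-dominated regime.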
Hence, we hypothesise that the adaptive value of amoeboid locomotion, and the concomitant potential to graze on the dominant food source in the mid-Proterozoic biosphere (the microbial mats), favoured the evolution of the Amoebozoa. They probably solved this task by increasing body size. At the same time, however, the efficiency of flagellar locomotion was highly reduced or lost, and this resulted in multiple suppressions of the flagellar apparatus, which is completely absent in two major extant amoebozoan lineages, Tubulinea and Discosea (Fig. 2). The modern configuration of the amoebozoan tree, which rejects the Lobosa/Conosa dichotomy and suggests a successive branching of lineages (with either Tubulinea or Discosea at the base), leaves open a major question: was the last Amoebozoa common ancestor an amoeboflagellate, with the domination of amoeboid movement based on the microtubular cytoskeleton, or were the flagellum-related structures and the microtubular locomotive system entirely suppressed? If the latter is true, then this suppression probably drove the ancestral amoebozoan to switch to actomyosin-based movement, as found in modern representatives of naked and testate lobose amoebae. The answer to this question may be obtained by analysing gene content and the level of flagellum-related gene expression in amoebozoan genomes. However, the dataset available for quality analysis remains limited in this group of protists and requires further accumulation before a conclusive study is possible.