Thor-Rab
To establish the authenticity of Asgard Rab-like small GTPases at the protein level, we selected a predicted small GTPase encoding gene (GenBank accession number KXH73347.1) from the Candidatus Thorarchaeota SMTZ1-45 archaeon MAG (GenBank accession number LRSL01000056.1) based on sequence considerations and solubility on heterologous expression in E. coli. The protein (Thor-Rab) was purified and crystallized in the presence of GTP and GDP (5 mM each). The crystals contain two molecules of Thor-Rab in the asymmetric unit. The resulting crystal structure, refined to 1.5 Å resolution, revealed that GDP/Mg2+ was bound to both copies of Thor-Rab in the crystallographic asymmetric unit (Fig. 1a, Supplementary Fig. 1, 2 and Supplementary Table 1). Subsequently, we co-crystallized Thor-Rab with the slowly hydrolyzing GTP-γS in the absence of Mg2+. The structure, refined to 1.75 Å resolution, contains mixed occupancy for GDP/GTP-γS in one copy of Thor-Rab and full occupancy for GTP-γS in the second copy (Supplementary Fig. 1, 2 and Supplementary Table 1). Finally, a GTP-γS/Mg2+ soak (5 mM each, 10 min) of a Thor-Rab/GDP crystal, refined at 1.95 Å resolution, contained mainly GTP-γS at both sites with only one site displaying robust density for Mg2+ (Supplementary Fig. 1, 2 and Supplementary Table 1). Molecule 1 in each crystal contains a partially ordered Switch I loop and a fully traceable Switch II loop in the GDP-bound structure (Supplementary Figs. 1 and 2g). Molecule 2 in the GDP-bound structure contains a mostly traceable Switch I loop and a partially ordered Switch II loop (Supplementary Figs. 1 and 2h). The conformations of the Switch I and Switch II loops did not change substantially in the presence of different nucleotides (Supplementary Fig. 1i,j). We interpret this to indicate that the conformational state of Thor-Rab is dominated by the crystal packing (Fig. 2), which holds it in a state that allows nucleotide exchange.
Thor-Rab contains the typical core domain that is found in eukaryotic small GTPase proteins. We further characterized the structural relatedness to other small GTPases via structural superimposition. Among the current entries in the Protein Data Bank (PDB), Thor-Rab shows the highest structural homology to human Rab-1B, characterized by an RMSD of 1.04 Å over 159 aligned residues (Fig. 1b). The Thor-Rab structure is similar to, but more distant from, prokaryotic MglA small GTPases, such as that from the Gram-negative bacterium Thermus thermophilus HB8 (2.38 Å over 144 residues, Fig. 1c). Since the structural relationships may be skewed by the nucleotide-induced conformations of the Switch loops (Fig. 3), we compared the structural relatedness of the core Thor-Rab domain after the removal of the Switch loops (Supplementary Table 2). Of the 69 top hits, as ranked by the program Dali29, 49 were Rab structures, 19 were K-Ras structures and 1 was an H-Ras structure. This confirmed that Thor-Rab is closest in structure to Rab and Ras family proteins, more distant from other classes of eukaryotic GTPases (Arf), and even more distant from the bacterial GTPases (MglA and EngB). Inspection of structure-based sequence alignments revealed that Thor-Rab shows a high level of sequence homology in regions that have been used to define the eukaryotic Rab family (RabF1-F5, Fig. 1d)3,26. Thor-Rab has an arginine residue (Arg37) in a position equivalent in the sequence alignment to MglA Arg53 in Switch I (red triangle, Fig. 1d and Supplementary Fig. 2g). This residue in MglA has been predicted to act as an intrinsic “Arg finger” in stabilizing the GTP γ-phosphate during hydrolysis8. Other Rab paralogs from the Thor SMTZ1-45 genome do not have an arginine residue in this position (Supplementary Fig. 3). Thus, the MglA mechanism of γ-phosphate self-stabilization by arginine is not a common feature in Thor Rabs, implying that they may require GAPs to enhance hydrolysis.
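The RMSD values quoted above are computed over paired, aligned residues after optimal superposition. As an illustration only, the following minimal sketch shows how such a value can be obtained with the Kabsch algorithm, assuming the residue pairing has already been made and that one coordinate (for example Cα) is supplied per residue. The function name and inputs are hypothetical and this is not the procedure used by Dali or by the programs in this study.

```python
import numpy as np

def kabsch_rmsd(p, q):
    """RMSD (same units as input) between paired N x 3 coordinate sets
    after optimal rigid-body superposition of p onto q (Kabsch algorithm)."""
    p = np.asarray(p, dtype=float)
    q = np.asarray(q, dtype=float)
    if p.shape != q.shape or p.ndim != 2 or p.shape[1] != 3:
        raise ValueError("p and q must both have shape (N, 3)")

    # Remove translation by centering both sets on their centroids.
    pc = p - p.mean(axis=0)
    qc = q - q.mean(axis=0)

    # Covariance matrix and its singular value decomposition.
    h = pc.T @ qc
    u, _, vt = np.linalg.svd(h)

    # Guard against an improper rotation (reflection).
    d = 1.0 if np.linalg.det(vt.T @ u.T) >= 0 else -1.0
    rot = vt.T @ np.diag([1.0, 1.0, d]) @ u.T

    # Rotate p onto q and measure the residual deviation.
    p_rot = pc @ rot.T
    return float(np.sqrt(np.mean(np.sum((p_rot - qc) ** 2, axis=1))))
```

Two coordinate sets that differ only by a rotation and translation give an RMSD of approximately zero, so the value reports genuine differences in fold over the aligned residues rather than differences in placement.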
A phylogenetic tree calculated from a structure-based sequence alignment based solely on experimentally determined structures placed SMTZ1-45 Thor-Rab closest to the Rab and Ras clades (Fig. 1e). Taken together with the high structural homology, this indicates that Thor Rabs are likely to be functionally related to Rab-like Ras GTPases. However, one of the defining characteristics of eukaryotic Rabs is the presence of one or two C-terminal cysteine residues which can be geranylgeranylated (Fig. 1d). These cysteine residues are absent from Thor-Rab and the C-terminus is significantly truncated relative to eukaryotic Rabs. Similarly, we could not find homologs for the eukaryotic Rab Escort Protein 1 and Rab geranylgeranyltransferase subunits in BLAST searches of sequence databases. This indicates that Thor-Rab is not modified for insertion into membranes in the same manner as eukaryotic Rabs. As far as we know, Asgard archaea only have one membrane, the cell membrane27,28. Thus, the Rab cysteine-containing C-terminal extension likely arose in proto-eukaryotes in conjunction with the acquisition and differentiation of internal membranes.
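The absence of C-terminal cysteines can be checked directly on a protein sequence. The sketch below is a simple illustration that counts cysteine residues in the last few positions of a one-letter sequence. The function name, the five-residue window and the example sequences are assumptions made for illustration and are not taken from the analysis described here.

```python
def c_terminal_cysteines(sequence, window=5):
    """Count cysteine residues in the last `window` residues of a
    one-letter protein sequence (window size is an illustrative choice)."""
    if window < 1:
        raise ValueError("window must be at least 1")
    # Normalize case and drop a trailing stop-codon marker if present.
    tail = sequence.strip().upper().rstrip("*")[-window:]
    return tail.count("C")

# Hypothetical sequences, for illustration only.
prenylatable_like = "MSKEYDYLFKLLGSGGCC"   # ends in a CC motif
truncated_like = "MSKEYDYLFKLLGSGGKL"      # no C-terminal cysteine

print(c_terminal_cysteines(prenylatable_like))
print(c_terminal_cysteines(truncated_like))
```

A non-zero count marks a sequence as a candidate for geranylgeranylation, while a count of zero is consistent with the truncated, cysteine-free C-terminus described for Thor-Rab.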
To measure the GTPase activity of SMTZ1-45 Thor-Rab, we employed a phosphate release assay30. SMTZ1-45 Thor-Rab generated similar levels of phosphate in comparison to human Rab11B (Fig. 1g,h) over two hours, whilst the heat-denatured proteins produced substantially less phosphate under the same conditions. This indicates that the unassisted GTPase activity of SMTZ1-45 Thor-Rab is similar to eukaryotic Rab proteins, implying that a GAP and/or GEF may be needed to accelerate GTP hydrolysis and nucleotide exchange for signaling.
Asgard small GTPases
To get a broader picture of the diversity of small GTPases in a single Asgard species, we searched the MKD1 genome, the first Asgard genome to be fully sequenced, and found 74 related sequences, for which we predicted the structures using AlphaFold2 (AF2, Fig. 1f). Of these sequences, 35 consist of single domain GTPases and 26 are combined with RB or longin domains (Fig. 1f). We calculated a phylogenetic tree from a structure-based sequence alignment of the small GTPase domains and included a variety of eukaryotic and prokaryotic structures (Supplementary Fig. 4 and Supplementary Table 3). Onto this tree, we mapped the domain architectures and the GTPase classes calculated from the superposition of AF2 models of the GTPase domains onto experimentally determined structures (Supplementary Fig. 4). This revealed the major class of GTPases to be Rab-like GTPases, as predicted from sequence annotation (Supplementary Figs. 4 and 5). The second largest class was most similar to Rag-like GTPases, but also close to Arf GTPases in structural homology (Supplementary Figs. 4 and 6). Many of these proteins had been predicted to be Arf GTPases from sequence annotation. The architectures in this class include the GTPase domain fused to a longin or to an RB domain, or fused between a longin domain and an RB domain. Interestingly, an insert was observed in Switch II from GTPase-RB architectures but not in RB-GTPase architectures, indicating potential differences in nucleotide regulation between these designs (Supplementary Fig. 6a). Finally, a smaller group contained AF2-predicted MglA bacterial-like GTPases (Supplementary Figs. 4 and 7). One pair of MKD1 Rag-like GTPase genes exists in an operon in the genome. The AF2-predicted structure reveals a potential heterodimer connected via the RB domains in a similar way to eukaryotic Rag-like complexes, such as the yeast GTR1/GTR2 heterodimer (Supplementary Fig. 8a-d)31.
AF2-predicted structures of homodimers of other Rag-like GTPase architectures (Supplementary Fig. 8e-h) indicate variation in the position of the GTPase domain relative to the longin/RB domains. Thus, from AF2 modelling, the MKD1 genome is predicted to encode multiple paralogs of Rab, Rag and MglA GTPases, proteins which have membrane-associated roles in eukaryotes and in bacteria.
RB
Subsequently, we investigated a predicted RB protein (KXH72322.1) from Thor SMTZ1-45, and a second RB protein (WP_147663254.1) from Loki MKD1 for comparison with the Loki profilin structures32. These proteins were chosen based on sequence homology to eukaryotic RB domains. The proteins (Thor-RB and MKD1-RB) were purified and crystallized, and their structures were determined by X-ray crystallography to 2.14 and 2.69 Å, respectively (Fig. 4a,b and Supplementary Table 4). Both proteins are homodimers formed from protomers composed of 5-stranded β-sheets sandwiched between a single α-helix (formed between strands 2 and 3) and a pair of α-helices (formed from the N- and C-termini) (Fig. 4a-c, f).
We compared the structural similarities of Thor-RB and MKD1-RB with MglBs and eukaryotic RB proteins. Both proteins displayed similar levels of homology with the RB proteins M. xanthus MglB and H. sapiens LAMTOR2 (Fig. 4g,h,i,j and Supplementary Table 5), matching ~110 residues with an RMSD of 1.5–1.6 Å. Lower levels of homology were seen in comparison with longin domains (86–94 residues, RMSD of 2.5–3.0 Å) and profilins (96–102 residues, RMSD of 1.7–2.7 Å). These homologies indicate that the RB fold is highly conserved between all domains of life and that the profilin and longin domains share a core structure with RB. Despite the similarity in structures, Thor-RB and MKD1-RB display different surface charge distributions (Fig. 4d,e). MKD1-RB has a basic patch on the same face as observed for MglB, which in the case of MglB is used to recruit the MglA/MglB complex to membranes (Fig. 4e, i, k)8. Thus, we hypothesize that MKD1-RB may be a membrane-interacting module, whereas Thor-RB, which lacks the basic patch, may be a scaffold protein for protein:protein interactions.
RBLC7
Next, we explored truncated RB sequences within the Asgard sequence databases. One potential RBLC7 sequence from Odin LCB_4 (OLS18093.1, Odin-RBLC7) was amenable to protein expression, purification and structure determination by X-ray crystallography, refined against 1.83 Å data (Fig. 5a,b and Supplementary Table 4). The Odin-RBLC7 structure shares a similar homodimeric structure to Thor-RB and MKD1-RB but lacks the terminal pair of α-helices observed in the RB structures (Fig. 4a,b). However, the interpretable electron density starts at Gln13 and the preceding 12 residues are predicted by AF2 to form an α-helix (Fig. 6a,b and Supplementary Fig. 9a). Thus, Odin-RBLC7 adopts an RBLC7 conformation, which lacks the C-terminal α-helix found in the longer RB architectures.
We compared the Odin-RBLC7 structure to other RB, RBLC7, longin and profilin structures (Supplementary Table 5). Odin-RBLC7 was most similar to MglB and dynein light chain RB domain 1 (DLRB1) homodimers (77–78 residues, RMSD of 1.3–1.4 Å, Fig. 5c) and showed good homology to the LAMTOR4/5 heterodimer (71 residues, RMSD of 1.7–1.8 Å, Fig. 5e) and Odin profilin32 (64 residues, RMSD of 1.8 Å). Comparison of the RBLC7 structures in their larger complexes reveals that the absent C-terminal α-helix, relative to RB structures, allows for the association of an α-helix from a binding partner, DC1I2 for DLRB1 (Fig. 5d) and LAMTOR1 for LAMTOR2/3 (Fig. 5f). We propose that the Odin-RBLC7 homodimer will act in a similar manner in providing an α-helix binding site to assemble larger complexes.
Longin
We were not successful in solving the structure of a longin domain; however, many such sequences are predicted in the Asgard genomes22,23. We used AF2 to explore the structure of a longin domain protein from MKD1 (Supplementary Fig. 9b), which was predicted to have a similar fold to eukaryotic longin domains.
Comparison of RB/longin/profilin folds
Comparison of the structures and topologies of the RB, longin and profilin families of proteins indicates the adaptations in these proteins that surround the core domain (Figs. 6 and 7). RBLC7 has lost the C-terminal helix relative to RB, while the longin domain has lost the N-terminal helix and instead placed an additional C-terminal helix at the same location (Fig. 7a-c). In profilin, by contrast, half of the central helix is replaced by a 3-strand motif (Fig. 7d). This region is responsible for dimerization in RB proteins (Fig. 7e-f) and for actin binding in profilin (Fig. 7g-h). Thus, the core common fold is composed of a 5-stranded β-sheet surrounded by adaptable α-helices that mediate interactions with different binding partners. It is likely that differentiation in these proteins occurred in the ancestors of Asgard archaea, since these proteins are found in all Asgard archaea phyla33, and that expansion in numbers of each fold continued in the subsequent Asgard lineages, leading to the variations between lineages, as documented for profilins34,35.
TRAPPC3
Finally, we investigated the TRAPPC3-like proteins from Thor. We expressed, purified and crystallized two potential TRAPPC3 proteins (Thor-TRAPPC3), from Thor SMTZ1-45 (KXH75250.1) and Thor AB25 (OLS30461.1). The crystals were pink in color, indicating potential Zn2+ binding. We scanned Zn2+ fluorescence for AB25 Thor-TRAPPC3 using X-rays to confirm the identity of the cation and collected multi-wavelength anomalous dispersion (MAD) diffraction data around the Zn2+ edge to solve the structure at 1.91 Å (Supplementary Table 6 and Supplementary Fig. 10a,b). Subsequently, SMTZ1-45 Thor-TRAPPC3 was solved at 1.7 Å resolution by molecular replacement using the AB25 Thor-TRAPPC3 structure (Supplementary Table 6). We concentrated on the analysis of the SMTZ1-45 Thor-TRAPPC3 structure since it was refined against higher resolution data.
SMTZ1-45 Thor-TRAPPC3 forms a homodimer that closely resembles the eukaryotic heterodimer of TRAPPC3/C6 (Fig. 8a-c). The SMTZ1-45 Thor-TRAPPC3 subunit is most closely structurally related to TRAPPC3, characterized by 148–151 matching residues with RMSDs of 2.1–2.3 Å (Supplementary Table 7). The SMTZ1-45 Thor-TRAPPC3 architecture forms two layers, composed of 4 α-helices and a 4-stranded β-sheet, with the two N-terminal helices forming the dimerization interface. Bacterial V4R domains, such as the PoxR homodimer, are also structurally similar to SMTZ1-45 Thor-TRAPPC3 (Fig. 8b,d and Supplementary Table 7, 130 matching residues, RMSD of 3.3 Å), albeit more distant than eukaryotic TRAPPC3/C6. SMTZ1-45 Thor-TRAPPC3 also shows structural homology to other ligand-binding proteins, such as the bacterial NO-binding heme-dependent sensor protein (H-NOX) and cellulose synthase subunit D (AxCeSD), and to human soluble guanylate cyclase (GUCY1A/B) (Supplementary Table 7). Besides sharing a common fold and dimerization geometry, SMTZ1-45 Thor-TRAPPC3 also contains internal cavities (Fig. 8b). Such cavities bind hydrophobic ligands in mouse TRAPPC3 (palmitic acid, Fig. 8c)36 and PoxR (phenols, Fig. 8d)37, indicating that the ligand-binding function has been maintained during evolution. We speculate that Thor-TRAPPC3 may also bind small molecules, potentially being involved in ligand transport. The cavities appear to be accessible from the exterior close to the dimerization interfaces (Fig. 8e-g). SMTZ1-45 Thor-TRAPPC3 Zn2+ binding occurs through 4 cysteine residues (Fig. 8b), similarly to PoxR (Supplementary Fig. 10c). We used this feature to search for TRAPP domains in other Asgard phyla. We found sequences in MKD1 and Heimdall that were predicted by AF2 to adopt the TRAPPC3/V4R fold and to form homodimers (Fig. 9 and Supplementary Fig. 10d-f)38. Interestingly, we identified two MKD1 proteins, one of which was predicted by AF2 to be more similar to Thor-TRAPPC3 (Fig. 9a,b) and the other more similar to the PoxR V4R domain (Fig. 9e,f) in their topologies and potential Zn2+ binding. We further analyzed the homology around the Thor-TRAPPC3 Zn2+-binding site (Fig. 9). We found that the predicted Zn2+-binding sites (Fig. 9b,d-e) and the experimentally determined sites (Fig. 9a,f) are not completely conserved in the positions of the coordinating residues; rather, they appear in similar regions and appear to serve the same function in tethering the β-sheet to the D α-helix to create cavities. Interestingly, mouse TRAPPC3 has evolved to replace the Zn2+-binding site by a hydrogen bond (Fig. 8c and Fig. 9c). Taken together, our structural and sequence analyses of Asgard TRAPP/V4R proteins suggest that the potential role of these proteins will be in ligand binding.
Finally, we expressed the Asgard proteins from this study as GFP fusion proteins in human HeLa cells. Under a fluorescence microscope, the GFP signal was diffuse throughout the cytoplasm and nucleus for all Asgard proteins tested (Supplementary Fig. 11). We did not observe targeting of the Asgard proteins to membranes.