Overall structure of the ArnA-ArnB complex and structural comparison with SavWA2
We solved the crystal structure of the vWA2 paralog 5 and archaellum-regulating factor ArnB (UniProt: Q4J9H3, saci_1211) in complex with ArnA (UniProt: Q4J9H4, saci_1210). The ArnAB cocrystals comprise two complexes per asymmetric unit; the structure was solved by molecular replacement and refined at 2.5 Å resolution. The ArnAB complexes are defined by electron density for residues T2-S380 of ArnB and the zinc-finger (ZnF) domain of ArnA (P16-K42, Fig. 1A, PDB: 8S05). The latter implies a loss of the FHA domain and the linker region of ArnA, possibly by nonspecific proteolysis, as observed before when solving the structure of the ArnA FHA domain 5. The overall architecture of ArnB largely corresponds to that of its paralog, the von Willebrand factor A (vWFA)-containing protein SavWA2. It includes the vWFA domain, an eight-stranded β-sandwich whose topology is split by the vWFA domain, and the C-terminal four-helix bundle motif that has been classified as the ArnB_C domain (InterPro entry IPR040929) (Fig. 1A) 2,5. In contrast to SavWA2, the ArnB_C domain of ArnB harbors an elongated helix pair at the terminal region of the motif, which is one of the most substantial structural differences between ArnB and SavWA2 (Fig. 1D). This extension presents additional threonine residues that allow for a potential interaction of ArnB with the forkhead-associated (FHA) domain of ArnA, which is known for its ability to bind phospho-threonines (Fig. 1D, right panel) 9. Moreover, ArnB features an additional helix (P95-Q103) in the vWFA domain, at the position between β5 and α2 of SavWA2, increasing the total helix count to ten (Fig. 1D). The vWFA domain of ArnB also harbors a Na+ ion coordinated by D46, S50, T110, T135 and D136 in the metal ion-dependent adhesion site (MIDAS, Fig. 1C).
While ArnA binds strongly to phosphorylated ArnB via its FHA domain (see below), the ZnF domain alone is sufficient to mediate an interaction without post-translational modification of ArnB. This ZnF domain belongs to the RanBP2 type (IPR001876) and is characterized as a ZnF ribbon domain by two consecutive, distorted β-hairpin motifs, which together coordinate a zinc ion via C21, C24, C35 and C38. The interface of the ZnF domain of ArnA with ArnB is of moderate size, with 504 and 574 Å2 for ArnAB chains A/C and B/D, respectively. The interactions are mostly hydrophobic and involve the C-terminal β-hairpin motif (D31-Q41) of the ZnF domain as well as the N-terminal β1-β2 loop (H12-K21) of the β-sandwich domain, its α-helical linker to the ArnB_C domain (V286-I293) and adjacent residues of the ArnB_C domain facing the ZnF domain. Accordingly, the ZnF-based ArnA-ArnB interaction appears to be rather weak, as indicated by pulldown assays (Figure S1A), but is detectable by mass photometry 7. Further support for the specificity of the ZnF-mediated ArnA-ArnB interaction comes from AlphaFold2 (AF2) multimer models, which were generated without input from the ArnAB structure. For 4 of the 5 predicted ArnA-ArnB models, they show an interaction almost identical to that in the ArnA-ArnB cocrystal structure (Figure S2), with displacement r.m.s.d. values of 1.57–1.65 Å for M1-Q25 of the ZnF domain. This indicates that the intrinsic sequence covariation of the ArnA and ArnB domains is already strong enough to provide a robust indicator for the relatively small ArnB/ZnF domain interface. Moreover, an ArnA-ArnB interaction is also observed in solution by SAXS (Fig. 1E), as the ab initio envelope derived from the SAXS data can be fitted with the ArnA-FHA and ArnAB crystal structures.
Here, the pair distance distribution function P(r) suggests an overall elongated shape (Figure S3) and thereby supports an ArnA-ArnB interaction based on the ZnF domain, as in the crystal structure, with a flexible region followed by an unbound FHA domain due to a lack of pThr anchor points.
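The r.m.s.d. values quoted above for the comparison of predicted models with the crystal structure depend on first superposing the two coordinate sets. The following Python sketch illustrates one way such a value can be computed for paired atoms. It is not the procedure used in this study: the function name, the choice of Horn's quaternion formulation with a simple power iteration, and the example coordinates are all assumptions made for illustration, and in practice a structural biology package would normally be used instead.

```python
import math


def superposed_rmsd(coords_a, coords_b, iterations=500):
    """RMSD between paired 3D points after optimal rigid-body superposition.

    Illustrative sketch (hypothetical helper, not the authors' software).
    Horn's quaternion method: the largest eigenvalue of a symmetric 4x4
    matrix built from the cross-covariance of the centred coordinates
    gives the minimal residual without constructing the rotation.
    """
    n = len(coords_a)
    if n == 0 or n != len(coords_b):
        raise ValueError("need two equally sized, non-empty coordinate lists")

    def centre(points):
        c = [sum(p[k] for p in points) / n for k in range(3)]
        return [[p[k] - c[k] for k in range(3)] for p in points]

    a, b = centre(coords_a), centre(coords_b)

    # Cross-covariance: s[i][j] = sum over atoms of a_i * b_j
    s = [[sum(pa[i] * pb[j] for pa, pb in zip(a, b)) for j in range(3)]
         for i in range(3)]
    (sxx, sxy, sxz), (syx, syy, syz), (szx, szy, szz) = s
    m = [
        [sxx + syy + szz, syz - szy, szx - sxz, sxy - syx],
        [syz - szy, sxx - syy - szz, sxy + syx, szx + sxz],
        [szx - sxz, sxy + syx, -sxx + syy - szz, syz + szy],
        [sxy - syx, szx + sxz, syz + szy, -sxx - syy + szz],
    ]

    # Power iteration on (m + shift * I); the shift (Frobenius norm) makes
    # all eigenvalues non-negative so the largest one of m dominates.
    # The start vector is arbitrary; a start exactly orthogonal to the
    # dominant eigenvector would not converge, which is ignored here.
    shift = math.sqrt(sum(v * v for row in m for v in row))
    q = [1.0, 0.7, 0.4, 0.2]
    for _ in range(iterations):
        w = [sum(m[i][j] * q[j] for j in range(4)) + shift * q[i]
             for i in range(4)]
        norm = math.sqrt(sum(v * v for v in w))
        if norm == 0.0:  # all points coincide; nothing to rotate
            break
        q = [v / norm for v in w]
    largest = sum(q[i] * m[i][j] * q[j] for i in range(4) for j in range(4))

    sum_sq = sum(v * v for p in a for v in p) + sum(v * v for p in b for v in p)
    residual = max(sum_sq - 2.0 * largest, 0.0)  # guard against rounding
    return math.sqrt(residual / n)


if __name__ == "__main__":
    # Hypothetical coordinates: the second set is the first one rotated by
    # 90 degrees about z and translated, so the RMSD should be close to 0.
    model = [(0, 0, 0), (1, 0, 0), (0, 2, 0), (0, 0, 3)]
    target = [(5, -3, 2), (5, -2, 2), (3, -3, 2), (5, -3, 5)]
    print(round(superposed_rmsd(model, target), 6))
```

The sketch assumes that the atom pairing (for example, equivalent Cα atoms of the ZnF domain) has already been established; choosing that pairing is the part that is specific to a given comparison.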
Promiscuity of ArnB phosphorylation-dependent interaction sites
To map the complete site of the strong, phospho-threonine-based ArnA-ArnB interaction, we performed a comprehensive mass spectrometry-based analysis. Notably, ArnB contains many potential interaction sites, especially in the C-terminal HTH motif (Fig. 2A), where interaction appears most likely based on structural analysis. Accordingly, we first investigated the phosphorylation pattern generated by the kinase ArnC (UniProt: Q4J9J0, Saci_1193) under the previously reported phosphorylation conditions 7. Interestingly, we found that many threonines in the ArnB_C domain were phosphorylated during our in vitro phosphorylation, raising questions about the native phosphorylation conditions. However, as 60 minutes of incubation at 55°C yielded many different potential interaction sites, we set out to find the ones that are phosphorylated first. Hence, we investigated the effect of different incubation times on the phosphorylation pattern of ArnB. At time points of 5 min, 15 min, 30 min and 60 min (the latter corresponding to our reference sample), the phosphorylation pattern of ArnB was analyzed in a tryptic digest MS experiment. Interestingly, ArnA could still be efficiently pulled down along with ArnB during purification with a phosphorylation time of as little as 5 min. Moreover, phosphorylation of threonines located closer to the N-terminal side of the ArnB_C domain takes place only after T353, T354, T359, T363 and T375 are phosphorylated and extends only as far as T322, which agrees with the known phenomenon that multiple phosphorylations usually occur in clusters within a protein 10. Accordingly, these main threonines are already phosphorylated after 5 minutes, while other phosphorylated threonines that have potential counterparts in vWA2 are found phosphorylated only after 15 minutes or more of incubation.
This may indicate ordered processivity of hyperphosphorylation, as in the cyclin–Cdk1–Cks1 system 11, and is consistent with these threonines, with the exception of T375, being located in the extension of the C-terminal HTH motif of ArnB and therefore being absent from vWA2. Additionally, vWA2 is reported not to interact with ArnA 6, supporting the assumption that the C-terminal HTH extension is the main interaction site for the ArnA FHA domain.
As this narrowed down the possible interaction sites, we attempted to identify the exact position of the ArnB-FHA interaction by a comprehensive alanine mutagenesis study. When either T353 and T354 or T359 and T363 were replaced by alanine, the FHA domain could no longer interact with ArnB that had been phosphorylated for 5 min or less. However, when the phosphorylation time was increased to 60 min, ArnA could again be pulled down with ArnB without issues. Additionally, even when threonines that were found to be phosphorylated only after 5 min were mutated as well, ArnA could still be pulled down along with ArnB, as shown for the ArnB-T343A-T344A-T353A-T354A-T359A-T363A-T371A-T375A mutant (Figure S1B). Overall, this points to the C-terminal elongation of the ArnB_C domain as the more likely location of the FHA domain interaction site, but leaves open whether multiple FHA interaction sites play a role in the interaction of ArnA and ArnB. However, the hyperphosphorylation we observed leads to multiple structural rearrangements, as discussed in the following, so the phosphorylated residues are not necessarily interaction sites themselves but may be structural factors important for a correct ArnB-FHA interaction.
HDX-MS reveals phosphorylation-dependent structural changes of ArnB
In order to get insights into the interaction site(s) between ArnA and phosphorylated ArnB in solution, we made use of the property of amide protons to exchange for protons from the aqueous solvent, the extent of which was traced upon incubation of the proteins in deuterated buffer. This hydrogen/deuterium exchange (HDX) was then, after digestion of the proteins into peptides, quantified by mass spectrometry (MS).
We subjected ArnA, ArnB, phosphorylated ArnB, and the ArnA/phosphorylated ArnB complex to HDX-MS experiments, allowing us to resolve, i) the conformational changes that ArnB undergoes upon phosphorylation, and ii) the regions of ArnA and phosphorylated ArnB establishing the interaction interface in their complex.
For ArnA and ArnB, we identified 80 and 145 peptides, respectively, which covered more than 90% of their corresponding amino acid sequences (Figures S1A and S2A, Supplementary Dataset 1). The HDX profile of ArnA corroborated its predicted domain topology, in particular the disordered nature of the linker (S28-N99, where maximal HDX was reached after 10 s of deuteration) joining the N-terminal ZnF domain to the C-terminal FHA domain (Figure S4B). Upon complex formation of ArnA with phosphorylated ArnB, HDX reduction became apparent in both the ZnF and FHA domains, thus marking the major sites of interaction for ArnA (Figure S1C-D). Specifically, the phosphate recognition module of the FHA domain constituted by R132 and R147 exhibited the strongest HDX reduction, consistent with its role in FHA-mediated phosphate recognition for strengthening the ArnA/ArnB interaction 5.
The HDX profile of ArnB itself primarily showed regions of higher-order structure (low HDX at 10 s of deuteration and progression in HDX over the time course) and only small disordered areas coinciding with short linkers of the crystal structure (Figure S2B, Fig. 1A). However, phosphorylation of ArnB induces widespread conformational changes, as indicated by increased HDX in the ArnB_C domain and in nearby parts of the β-sandwich and vWFA domains (Fig. 3B). These changes may reflect either a partial unfolding event or a disentanglement of the three associated domains. Binding of ArnA to phosphorylated ArnB, in turn, reduced the observed HDX rates in proximity of the ArnA-ZnF domain binding site (Fig. 3A). Notably, ArnB residues D295-A310, constituting the N-terminal portion of helix α7, incorporate more deuterium upon phosphorylation, whereas a reduction was apparent upon ArnA binding (Figure S6). It may hence be speculated that the phosphorylation-induced conformational change is a prerequisite for tight ArnA binding. Likely due to the phosphorylation of the threonine side chains, no peptides covering the C-terminal residues from T353 onwards could be retrieved, which precludes further conclusions on this presumed ArnA-FHA domain interaction site by HDX-MS of ArnB (Figure S5A). Overall, HDX-MS corroborates the binding site of the ArnA-ZnF domain on ArnB observed in the complex structure (Fig. 1) and provides evidence for a secondary interaction site established with the ArnA-FHA phosphate recognition module.
timsTOF proteomics data for ArnA and ArnB deletion strains
Although the roles of ArnA and ArnB in the archaellum regulatory network are established 5,6, the mechanism by which regulation occurs under nutrient limitation remains unclear. Consequently, we analyzed the Sulfolobus acidocaldarius proteome of the ΔarnA and ΔarnB strains in two nutritional states in comparison with the respective wild type (WT). For this analysis, biological triplicates of each strain were grown, and samples were taken under nutrient-rich and starved conditions. Samples were measured on a timsTOF (trapped ion mobility spectrometry time-of-flight) mass spectrometer and quantified via label-free quantification; the technical duplicates of each sample were averaged before further analysis. These measurements identified 1,699-1,710 proteins per sample and 1,723 proteins overall, out of the 2,222 gene products known for the respective Sulfolobus acidocaldarius strain (Supplementary Dataset 2). The comparison of the knockout strains with their respective WT sample reveals 1,713-1,716 identified proteins per comparison, of which 1,694-1,704 (98.7–99.4%) overlap. As the overall proteome composition is essentially unaffected by the deletions, we investigated the effects of the deletions under higher stringency, i.e. a two-tailed t-test with a p-value cutoff of < 0.05. This analysis revealed 325 (rich) / 384 (starved) significantly changed proteins compared to the WT proteome for the ArnA knockout and 398 (rich) / 510 (starved) for the ArnB knockout. In addition, we applied a fold-change cutoff of at least 50% as a further stringency criterion, to check not only for statistical relevance but also for biological effects (Fig. 4A/B). As displayed in the volcano plot analysis (Fig. 4A/B, Figures S8-S9), 115 (rich) / 243 (starved) proteins for the ΔarnA strain and 185 (rich) / 279 (starved) proteins for the ΔarnB strain passed this high-stringency test.
This shows that the effect on the proteome is more prominent under nutrient-starved conditions for both knockouts. Moreover, the ArnB knockout alters slightly more proteins overall, both after t-testing and after applying the additional fold-change cutoff of 50%. To see whether these statistically relevant changes at the proteome level are also reflected in the biology of S. acidocaldarius, we conducted an extensive gene ontology (GO) term analysis (Fig. 4C/D). Interestingly, this analysis showed that both knockouts preferentially affect metabolic enzymes, especially those in pathways of amino acids that contain additional nitrogen. Nevertheless, purine/nucleotide, acetyl-CoA and carbohydrate metabolism enzymes were also significantly enriched after knockout of either ArnA or ArnB. Apart from small differences in statistical significance and enrichment score, the GO term analysis delivered quite similar results for both knockouts. Notably, when comparing both knockouts with each other, only 43–61% of the significantly altered proteins are found in both deletion strains (Fig. 4G). However, those significantly altered proteins that are found in both knockouts share a very high correlation of 82–86% (Fig. 4E/F). Together, this leads to the conclusion that around half of the impact of knocking out either ArnA or ArnB is apparently based, to some degree, on their interaction with each other. Conversely, this means that the other half of the affected protein levels are apparently independent of the ArnA-ArnB interaction, or at least not directly correlated with it. A comparison of these proteomics data with those of a deletion variant of S. acidocaldarius that lacks the GPN-loop GTPase SaGPN and exhibits diminished motility 12 shows that the proteome changes of the hypermotile ArnA and/or ArnB deletion variants 13 are more modest.
This suggests a more intimate involvement of ArnA and ArnB in the regulation of motility than of SaGPN, whose deletion causes large-scale changes in the proteome network. Accordingly, there is no correlation for significantly altered protein levels between the SaGPN knockout and the ArnA or ArnB knockouts (Figure S10). This suggests that the role of ArnA and ArnB in the archaellum regulatory network is independent of SaGPN.
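The two-step stringency filter described above (a two-tailed t-test with p < 0.05 followed by a fold-change cutoff of at least 50%) can be illustrated with a short, self-contained Python sketch. Several points are assumptions made only for this illustration and are not taken from the analysis itself: the use of Welch's unequal-variance form of the t-test, log2-transformed intensities as input, the reading of "at least 50%" as a ratio of at least 1.5 in either direction, the function names, and the example intensity values.

```python
import math
from statistics import mean, variance


def two_tailed_p(t, df, steps=2000):
    """Two-tailed p-value of Student's t distribution (df >= 1).

    With x = sqrt(df) * tan(theta), the central probability becomes
    k * integral of cos(theta)**(df - 1) from 0 to atan(|t| / sqrt(df)),
    which is integrated here with Simpson's rule (steps must be even).
    """
    k = math.exp(math.lgamma((df + 1) / 2) - math.lgamma(df / 2)) / math.sqrt(math.pi)
    upper = math.atan(abs(t) / math.sqrt(df))
    h = upper / steps
    total = 1.0 + math.cos(upper) ** (df - 1)  # integrand at both end points
    for i in range(1, steps):
        weight = 4 if i % 2 else 2
        total += weight * math.cos(i * h) ** (df - 1)
    central = k * total * h / 3  # probability mass between 0 and |t|
    return min(max(1.0 - 2.0 * central, 0.0), 1.0)


def welch_p(group_a, group_b):
    """p-value of Welch's t-test (assumption: unequal variances allowed)."""
    na, nb = len(group_a), len(group_b)
    va, vb = variance(group_a) / na, variance(group_b) / nb
    if va + vb == 0.0:  # no spread at all in either group
        return 1.0 if mean(group_a) == mean(group_b) else 0.0
    t = (mean(group_a) - mean(group_b)) / math.sqrt(va + vb)
    df = (va + vb) ** 2 / (va ** 2 / (na - 1) + vb ** 2 / (nb - 1))
    return two_tailed_p(t, df)


def is_changed(ko_log2, wt_log2, alpha=0.05, min_ratio=1.5):
    """True if a protein passes both the p-value and the fold-change cutoff."""
    log2_fc = mean(ko_log2) - mean(wt_log2)
    return (welch_p(ko_log2, wt_log2) < alpha
            and abs(log2_fc) >= math.log2(min_ratio))


if __name__ == "__main__":
    # Hypothetical log2 intensities for biological triplicates.
    examples = {
        "protein_A": ([22.0, 22.1, 21.9], [20.0, 20.1, 19.9]),
        "protein_B": ([20.3, 20.31, 20.29], [20.0, 20.01, 19.99]),
    }
    for name, (ko, wt) in examples.items():
        print(name, is_changed(ko, wt))
```

In this sketch, the second hypothetical protein would pass the t-test but fail the fold-change cutoff, which is the kind of case the additional stringency criterion is meant to exclude. The proportion of hits shared between two knockouts could then be obtained from the intersection of the two resulting sets of protein identifiers.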
Evolutionary context of ArnB, which contains the Sec23/Sec24-core motif
In the current literature, the function of ArnB, a homolog of the vWA2 protein in S. acidocaldarius 5, has been associated with the archaellum regulatory network. ArnB has been reported to be a negative regulator of the archaellin arlB, regulated by the interaction with its partner ArnA. However, during our investigation of ArnB we found structural similarities with the membrane curvature proteins Sec23/24, which are central components of the assembly machinery for eukaryotic COPII vesicles. Sec23/24 harbor a domain arrangement highly reminiscent of ArnB and even include an N-terminal ZnF domain like ArnA. Accordingly, Sec23/24 contain as a core motif the vWFA, β-sandwich and C-terminal helical domains, the latter being similar to the ArnB_C domain, in addition to a further C-terminal domain (Fig. 5A). Superposition of the Sec24 core motif (PDB: 1m2v, P301-I749 of chain B) and ArnB reveals a structural deviation of 5.7 Å for 302 Cα atoms. Here, differences between the Sec24 core motif and ArnB are mostly found in the length of the helices in the helix pair α9/α10 of the ArnB_C domain and in the relative orientation of the β-sandwich. Notably, the ZnF domain of Sec23/Sec24 and that of ArnA in the ArnAB complex pack against different sites of the β-sandwich of the core motif. In the ArnAB complex structure, the ZnF domain occupies a site formed by an edge between the β-sandwich and the ArnB_C domains, whereas in Sec23/Sec24 the ZnF domain associates with the opposite side of the β-sandwich domain (Fig. 5B). Another feature in this context is the packing between the Sec23/Sec24 core motif and the C-terminal domain. Although ArnB lacks the gelsolin-type C-terminal domain of Sec23/24, the FHA domain of ArnA is predicted to interact in a phosphorylation-dependent manner with the α-helical domain, in a way that resembles the packing of the Sec23/Sec24 core motif against the C-terminal domain.
The close relationship between the ArnAB assembly and the eukaryotic Sec23/24 core motif prompted us to ask whether the Sec23/Sec24 core motif is more widely distributed across the domains of life. Using Foldseek and the ArnB crystal structure, we found a wider occurrence of the ArnB/Sec23 core motif (Supplementary Dataset 3) than previously suggested by the structural relationship between vWA2 from S. acidocaldarius and a vWA protein from the actinobacterium Catenulispora acidiphila 5. Structural orthologs are also found in bacterial phyla other than the actinobacteria. For example, Escherichia coli has a predicted structural ortholog, YfbK, whose domain topology is found in ~5,900 gene products (status 01/24, Figure S11A). Besides matching AlphaFold2 models from Archaea, including Euryarchaeota like Halorubrum, we also find Sec23/Sec24 core motifs in other archaeal groups such as Korarchaeota and Heimdallarchaeota (Figure S11B). More surprisingly, the closest structural hits outside the archaeal domain are found in the plant and fungal kingdoms despite marginal sequence identities. Orthologs in plants like Arabidopsis thaliana and Zea mays (Figure S11C) currently lack an assigned function. Notably, Foldseek hits in the animal kingdom, apart from the expected Sec23/Sec24 orthologs, lack a direct structural relationship to the Sec23/Sec24 core motif and ArnB (Figure S11D).