Sequencing and comparative analysis of venom insulins from fish-hunting cone snails
Sequencing of the venom gland transcriptomes of the two fish hunters, Conus laterculatus and Conus mucronatus, from the Phasmoconus clade, led to the identification of four new venom insulins, two from each species. Molecular phylogenetics closely grouped these sequences with other cone snail venom insulins, particularly with those isolated from other fish-hunting species (Fig. 1a, red lines). In line with previous observations12, endogenous snail signaling insulins group separately and are less diversified (Fig. 1a, gray lines). According to the nomenclature introduced for cone snail venom insulins3, the new sequences were named Con-Ins La1 and Con-Ins La2 for insulins identified from C. laterculatus, and Con-Ins Mo1 and Con-Ins Mo2 for insulins from C. mucronatus. All four precursor sequences have the canonical organization defined by human preproinsulin, with an N-terminal signal sequence for relocation into the endoplasmic reticulum and secretory pathway, followed by three regions encoding the B chain, C peptide(s) and A chain (Fig. S1). Proteolytic processing of venom preproinsulin is predicted to yield mature venom insulins with the same cysteine framework and disulfide connectivities as vertebrate insulin (Fig. 1b-c). All four sequences lack residues at the C terminus of the B chain that are critical for receptor activation in vertebrate insulin and the aromatic residues previously shown to be important for receptor binding of other venom insulins, such as the C. geographus venom insulin Con-Ins G13 (Fig. 1d).
Strikingly, all of the new venom insulin sequences have C-terminal extensions in their A chains with diverse amino acid composition (-PSLL#, -GSLL#, -GSLLD, -PVQ, -HTLQ#, and -ASLLGL; Fig. 1c), where # represents a C-terminal amide, a common and bioinformatically predictable modification in cone snail toxins13. This pattern suggests that C-terminal A-chain elongations may play a functional role in IR activation by this family of venom insulins and serve as a substitute for the missing B-chain residues of human (and fish) insulin. To investigate this hypothesis, we synthesized a panel of venom-human hybrid analogs (Vh-Ins) for functional and structural studies.
Design and functional evaluation of insulin analogs with elongated C-terminal A-chain residues
Because the six venom insulins all display anionic B10 and hydrophobic B20 residues (Fig. 1c), we incorporated the GluB10 and LeuB20 mutations into a human des-octapeptide insulin (DOI), which lacks the C-terminal eight residues of the B chain, and attached the respective A-chain elongation motifs from the six venom insulins onto the thus-modified DOI backbone to create six Vh-Ins analogs (Fig. 2a). We measured the extent of AKT phosphorylation in IR-overexpressing NIH 3T3 cells as an indicator of insulin potency. Strikingly, four of the six Vh-Ins molecules with elongated A chains display potency comparable to native human insulin (Fig. 2b) and are 400- to 800-fold more potent than DOI (Fig. S4). These four potent Vh-Ins molecules all have serine at position A22 and leucine at position A23 within their elongation motifs. On the other hand, the analog containing the A-chain elongation sequence from C. kinoshitai venom, Vh-Ins-HTLQ, which has threonine instead of serine at position A22, has an 11-fold reduction in potency with respect to human insulin. To determine whether ThrA22 is responsible for the reduced potency, we first mutated it to serine and found that Vh-Ins-HSLQ has equal potency to human insulin (Fig. 2c), further demonstrating the importance of this position. To better understand the role of A-chain elongation residues in signaling potency, we performed alanine scanning mutagenesis on the additional residues, A21-A24, in Vh-Ins-HTLQ. This revealed that individual AlaA21 or AlaA24 substitution results in slightly lower potency than Vh-Ins-HTLQ (Fig. 2c). In contrast, AlaA23 substitution led to greatly reduced bioactivity, while the AlaA22 substitution displayed bioactivity comparable to that of human insulin. Thus, two of the analogs, Vh-Ins-HALQ and Vh-Ins-HSLQ, showed potency similar to native insulin (Fig. 2c).
Structure determination of a Vh-Ins-HSLQ:receptor ectodomain complex
To elucidate the molecular interactions between the elongated A chain of Vh-Ins-HSLQ and the insulin receptor, we used a receptor isoform A (IR-A) ectodomain construct purified from suspension-adapted HEK 293-F cells, as described previously14. The purified receptor ectodomain (hereafter “receptor”) comprises wildtype residues 1 to 917 with a C-terminal linker and 8xHis tag. To prepare samples for cryo-EM structure determination, the receptor was incubated with Vh-Ins-HSLQ and applied to holey-carbon Cu grids. Movies were collected on a Titan Krios equipped with a Gatan K2 detector and energy filter. Our analysis focuses on three reconstructions: one for the symmetric insulin-binding “head” region (3.3 Å resolution), one from a subset of those particles that additionally shows an ordered C-terminal “stalk” (4.1 Å resolution), and one for an asymmetric conformation (4.4 Å resolution) (Figs 3, S2-4 and Table 1).
Symmetric structure
The C2 symmetric head structure, which is represented by most of the particles, explains our biochemical and biological findings with venom-derived insulins. This reconstruction is essentially as reported previously for the insulin receptor in complex with two or more human insulin molecules14-16. Density is apparent for four Vh-Ins-HSLQ molecules, one at each of the two symmetry-related site 1 positions and the two site 2 positions, although the site 2 Vh-Ins-HSLQ had weaker density and did not contribute notable high-resolution information in the final reconstructions, possibly due to greater flexibility (Fig. 3c-d). Initial 3D reconstructions of the receptor resolved only one of the two receptor “stalks” comprised of the FnIII-2 and -3 domains, indicating conformational heterogeneity. The subset of particles subsequently reconstructed with both stalks resolved in a close-approaching conformation matches much more closely with the chimeric IR-leucine zipper construct used by Weis et al.17 than with other previously reported human insulin:IR complexes14,16.
Binding of Vh-Ins-HSLQ at site 1 and site 2 resembles that seen in previously reported cryo-EM structures of IR:insulin complexes14,16. Following structural overlay based on surrounding receptor residues, the relative displacement of Vh-Ins-HSLQ Cα atoms at the site 1 positions ranges from 0.3 to 0.9 Å (B5-B18; A1-A20) compared to insulin-receptor complex structures (PDB entries 6HN5, 6PXW and 6SOF)14,16,17. Essentially all of the IR contacts with Vh-Ins-HSLQ residues that are common to those with native insulin are retained, although several residues unique to native insulin or Vh-Ins-HSLQ are at the site 1 interface. Similarly, alignment of IR residues surrounding site 2 shows a Cα overlap of insulin versus Vh-Ins-HSLQ of 0.5-2.6 Å (PDB entries 6PXW and 6SOF)14,16, indicating that contacts between insulin and IR at site 2 are also largely conserved. In notable contrast to site 1, however, there is almost no change in residue identity between insulin and Vh-Ins-HSLQ at the site 2 interface (Fig. S5). Consequently, our analysis of the Vh-Ins-HSLQ interaction focuses primarily on binding at site 1.
Vh-Ins-HSLQ binding at site 1
Vh-Ins-HSLQ, like insulin, binds site 1 through contacts with receptor surfaces formed by L1, αCT, and a loop near the periphery of FnIII-1 (Fig. 4a). The structure reveals how the A-chain C-terminal elongation of Vh-Ins-HSLQ compensates for loss of the C-terminal B-chain residues of native insulin. In particular, the new LeuA23 side chain projects into the receptor pocket otherwise occupied by insulin PheB24, with LeuA23 aligning with one side of the PheB24 benzyl ring (Fig. 4b-d). Despite the resulting difference in docking residue coordination, the conformations and positions of the residues that form this pocket are virtually unchanged compared to native insulin complexes16,18 (Fig. 4c).
The role of PheB24 and surrounding residues in receptor binding has been characterized through extensive mutagenesis18-20. The equivalent roles seen here for Vh-Ins-HSLQ LeuA23 and insulin PheB24 align with the broader set of hydrophobic side chains that are compatible with receptor recognition at this site. An insulin analog with PheB24 substituted by cyclohexylalanine—a non-natural amino acid with a non-planar, six-member alicyclic side chain—retained full affinity for IR in competition binding assays, as did substitution of PheB24 by methionine20. These findings contradicted the hypothesis that an aromatic residue that interacts with the amino group of receptor residue Asn16 and/or with the sulfurs of the insulin A20-B19 disulfide is required to achieve full binding affinity19. Substitution by other hydrophobic residues at the B24 position showed a preference for side chains larger than alanine, which gave 300-fold weaker affinity than the native phenylalanine. LeuB24 and IleB24 substitutions had similar (~2-3 fold lower) affinities to PheB24, whereas the larger hydrophobic residues tyrosine and tryptophan each had ~20-fold lower affinity than native insulin20. Consistent with structures of insulin-receptor complexes14-18, these data indicate that shape complementarity at B24 is important for binding and that this binding pocket behaves as a “delimited non-polar cavity”20. PheA23 might be expected to mimic more exactly the binding of PheB24; however, Vh-Ins-HAFQ showed comparable activity to Vh-Ins-HALQ (Fig 1c, Fig 5c). Consistent with the leucine consensus of the venom sequences at this position (Fig 1c), LeuA23 in Vh-Ins-HSLQ fits the hydrophobic pocket normally occupied by insulin PheB24.
In addition to the extended A chain, the LeuB20 and GluB10 substitutions were identified as important compensatory mutations during development of the Vh-Ins analogs, with GluB10 providing a three-fold improvement to the EC50 of Vh-Ins-HTLQ as assessed by AKT phosphorylation (Fig. S6, Table S1, comparing Vh-Ins-HTLQ, B20Gly with Vh-Ins-HTLQ, B10His, B20Gly). The mechanism behind the increased insulin receptor affinity seen for GluB10/AspB10 in the context of insulin X10 and related analogs21 (and presumably in the Vh-Ins analogs presented here) was proposed to be the formation of a salt bridge between GluB10 and Arg53917. Indeed, the Vh-Ins-HSLQ GluB10 carboxylate is situated near (~4 Å) Arg539 in our atomic model (Fig. 4a), indicating a moderate charge-charge interaction. The proximity of GluB10 and Arg539 in this interaction is consistent with the modest increase in binding energy expected to drive a three-fold change in EC50.
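The link between a fold change in potency and a "modest" binding energy can be made concrete with the standard relation ΔΔG = RT ln(fold). The following is a minimal sketch, not part of the reported analysis: it assumes that the EC50 shift reflects a proportional shift in binding affinity and uses an assumed temperature of 298.15 K; the function name `ddg_from_fold_change` is hypothetical.

```python
import math

# Gas constant in kcal/(mol*K).
R_KCAL = 1.987204e-3


def ddg_from_fold_change(fold, temp_k=298.15):
    """Free-energy difference (kcal/mol) equivalent to a fold change in affinity.

    Assumption: the fold change in EC50 is taken as a proxy for the fold
    change in the dissociation constant, which need not hold in a cellular
    signaling assay. The default temperature is also an assumption.
    """
    return R_KCAL * temp_k * math.log(fold)


if __name__ == "__main__":
    # A three-fold change is the GluB10 effect quoted in the text.
    for fold in (2, 3, 10):
        print(f"{fold}-fold -> {ddg_from_fold_change(fold):.2f} kcal/mol")
```

Under these assumptions, a three-fold change corresponds to roughly 0.65 kcal/mol, which is small relative to the energy typically attributed to a well-formed, buried salt bridge and is in keeping with a moderate charge-charge interaction.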
Substitution of native GlyB20 with LeuB20 also enhances activity of Vh-Ins-HTLQ by providing a further approximately two-fold improvement to the EC50 (Fig. S6, Table S1). The site 1 Vh-Ins-HALQ LeuB20 side chain excludes 27 Å² of solvent-accessible surface area at the receptor interface. Furthermore, LeuB20 might stabilize the helical binding conformation of B9-B20 due to its more restricted main chain and through side-chain contacts with TyrB16 (Fig. 4e). In native insulin, the conformational range of GlyB20 is important for the formation of a type-II β turn that allows B23-B30 to fold back against the B-chain helix when insulin is not bound to the receptor18. Because B23-B30 are not present in Vh-Ins-HSLQ, there is no functional requirement to maintain a glycine at B20. These observations suggest that this region of Vh-Ins may provide opportunity for further optimization of receptor contacts and stabilization of analog conformation.
Vh-Ins-HSLQ binding at site 2
To better visualize interactions at site 2, we used symmetry expansion and focused 3D classification to enrich for complexes displaying insulin at this position (Fig. S7). Approximately 25% of the sub-particles showed occupancy of Vh-Ins-HSLQ within a mask surrounding site 2, and subsequent 3D refinement resulted in a reconstruction with an overall resolution of 3.9 Å and recognizable density for Vh-Ins-HSLQ (Fig. S5). Docking of FnIII-1 and site-2-bound insulin from published insulin-receptor structures14,16 into the site 2 density in the asymmetric reconstruction convincingly places insulin into the map, indicating that there is no discernible difference at this resolution in the positioning of Vh-Ins-HSLQ at site 2 relative to native insulin.
Residues previously determined to be important for interactions at site 2—namely, LeuA13 and LeuB1714,16—are not mutated in Vh-Ins-HSLQ, and their interactions with the receptor do not appear to be altered from the native insulin interactions. Neither the extended A chain nor LeuB20 of Vh-Ins-HSLQ approach the receptor at this site. The only other substitution relative to native insulin—GluB10—lacks side-chain density but may approach receptor residues Lys494 and Asp483. The impact of GluB10 on binding affinity at site 2 is unclear, although it is apparent that binding geometry is not substantially altered and that residues in the vicinity of GluB10 are poorly ordered in the structure. These observations support the inference that insulin substitutions to Vh-Ins are highly relevant for binding to site 1 but much less relevant for binding to site 2.
Structure-guided analysis of the Vh-Ins extended A-chain residues
Guided by the structural insights, Vh-Ins-H(S/A)LQ-specific residues were further investigated by mutagenesis and by cellular signaling assays that monitored the level of AKT phosphorylation. Substitution of HisA21 by proline had almost no effect on signaling, which is consistent with the A-chain extended helix being kinked 24° at this residue (Fig. 4b) and the absence of receptor contacts by the HisA21 side chain. Glutamine, lysine and glutamate substitutions at A21 each led to a slightly reduced (2-4-fold) potency (Fig. 5a), although the reason for this modest reduction in potency is not apparent from inspection of the structure.
The side chain of residue SerA22 approaches the backbone of receptor αCT residues Val713, Phe714 and Val715. Inspection of our Vh-Ins-HSLQ complex structure suggested that glycine, serine and alanine are the only natural amino acid residues capable of accommodation at this position without significant steric hindrance. Indeed, when SerA22 was subjected to mutagenesis, there was a negative correlation between the size of the A22 side chain and AKT signaling activity (Fig. 5b). Consistent with the modeling, SerA22 and AlaA22 both showed activity comparable to native insulin. While SerA22 is capable of forming hydrogen bonds with either an amide or carbonyl on the αCT backbone (at Val713 and Val715), the equivalent activity of AlaA22 indicates that a water molecule may substitute for the serine hydroxyl in formation of these hydrogen bonds. In contrast, GlyA22 showed two-fold reduced activity, likely because it destabilizes the helical conformation of the remaining extended A-chain residues. ValA22, LeuA22, PheA22, GluA22 and LysA22 all resulted in greater than ten-fold reductions in activity (Fig. 5b).
As discussed above, LeuA23 plays a key role in receptor binding by docking into a hydrophobic pocket on the receptor surface that is otherwise occupied by PheB24 of native insulin (Fig 4b,c). We evaluated hydrophobic substitutions by leucine, isoleucine, valine and phenylalanine at this position, and found that only PheA23 led to comparable potency to LeuA23 (Fig. 5c). Both the ValA23 and IleA23 substitutions led to reduced potency, which may be due to the unfavorable nature of β-branched amino acids in α helices22, or due to geometric incompatibility with the binding pocket. The preference for Leu at position A23 is consistent with the observation that LeuA23 is almost completely buried in the Vh-Ins-HSLQ-receptor complex and with our finding that LeuA23 is conserved in potent Vh-Ins sequences (Fig. 2b).
The C-terminal residue of the Vh-Ins A chain, GlnA24, does not contact the receptor in our structure. We therefore evaluated residues that naturally occur at high frequency at the C-terminal end of helices23 for their potential to increase activity further. All A24 substitutions tested had at least modest activity; however, the native Con-Ins K1 venom glutamine residue was the most potent (Fig. 5d). The effects of the A24 mutants tested were subtle, consistent with GlnA24 not directly engaging IR.
Having ascertained the similar behavior of Vh-Ins-HALQ and Vh-Ins-HSLQ, we performed fluorescence-based competition binding assays with Vh-Ins-HALQ to determine its relative affinity for both IR (Fig. 5e) and IGF-1R (Fig. 5f) that were detergent-solubilized and immobilized. These assays revealed that Vh-Ins-HALQ has full, native-insulin-like affinity for both IR and IGF-1R (Table S2). The IGF-1R affinity is notable in the context of the GluB10 mutation present in Vh-Ins-HALQ because previous investigations of some insulin variants containing anionic sidechains at B10 found a higher affinity for IGF-1R relative to native insulin21. In contrast, we find that Vh-Ins-HALQ has native-insulin-like binding preference for both IR and IGF-1R.
Binding was also investigated using isothermal titration calorimetry to determine the affinity of Vh-Ins-HALQ for a minimized model of receptor site 1 assembled from IR485 (a construct comprising IR domains L1, CR and L2)24 and the IR-A αCT peptide (receptor residues 704-719). Consistent with published work4,8, binding of human insulin was ~60-fold weaker in this assay than in the previous assay with immobilized full-length receptor. The inability of the model construct used in this assay to recapitulate the GluB10-Arg539 interaction (due to the absence of domain FnIII-1) might underlie the 10-fold weaker binding of Vh-Ins-HALQ relative to human insulin. Nevertheless, consistent with the compensating interaction seen in the structure, Vh-Ins-HALQ displays 24-fold tighter binding than DOI (Table S3).
Dynamic conformations of Vh-Ins-HSLQ-receptor complexes
Three-dimensional classification of the particles in our cryo-EM dataset indicated the presence of a subset of particles that exhibited increased conformational heterogeneity relative to the 4:1 Vh-Ins-HSLQ-receptor complex described above, appearing as a blurring of the head region of one of the two receptor protomers. CryoSPARC 3D variability analysis indicated that this subset displayed a range of conformations (Fig. S2, right side). To visualize snapshots along the conformational trajectory, the particles were split into eight groups based on their latent coordinates. Subsequent 3D reconstructions produced a series of maps of 6-7 Å resolution (Fig. 6a), in which most of the variability is displayed by just one of the two receptor protomers. At one extreme, conformations in this trajectory approach our symmetric state (Fig. 3) and published insulin receptor complex structures with two or more insulins14-16 (Fig. 6c). The other, most asymmetric extreme of the trajectory bears some resemblance to other previously reported structures15, including an “intermediate state” for the interaction between human receptor ECD and native insulin (EMD-10311)14 (Fig. 6b), with one protomer closely resembling the apo receptor crystal structure25 and the other protomer resembling the symmetric complex14-16 (Fig. 6f). Remarkably, unlike other reported structures, this asymmetric conformation is ordered and reveals an intriguing novel coordination state of Vh-Ins-HSLQ bound at a composite site that includes features of both site 1 and site 2 (Fig. 6e). As the trajectory progresses towards the symmetric conformation, the site-1 and site-2 surfaces diverge toward their ~40 Å-separated positions in the symmetric state, with Vh-Ins-HSLQ binding at both sites and overlapping density indicating partial occupancy throughout most of the trajectory (Fig. 6b-e, Video S1).
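The step of splitting particles into eight groups by latent coordinate can be illustrated with a short sketch. The text does not state how the groups were defined, so this example assumes a single latent coordinate per particle and equal-population bins; equal-width bins would be an equally plausible choice. The function name `split_by_latent` is hypothetical and is not a CryoSPARC call.

```python
def split_by_latent(coords, n_groups=8):
    """Split particle indices into n_groups ordered along one latent coordinate.

    Assumptions (not taken from the source): one latent coordinate per
    particle, and groups of near-equal population rather than equal width.
    Returns a list of lists of particle indices, lowest coordinate first.
    """
    # Particle indices sorted by their latent coordinate.
    order = sorted(range(len(coords)), key=lambda i: coords[i])
    # Distribute any remainder over the first groups.
    base, extra = divmod(len(order), n_groups)
    groups, start = [], 0
    for g in range(n_groups):
        size = base + (1 if g < extra else 0)
        groups.append(order[start:start + size])
        start += size
    return groups


if __name__ == "__main__":
    # Toy coordinates standing in for per-particle latent values.
    toy = [0.9, -1.2, 0.3, 2.1, -0.4, 1.5, -2.0, 0.0, 0.7, -0.8]
    for k, members in enumerate(split_by_latent(toy, n_groups=4)):
        print(k, members)
```

Each group would then be reconstructed independently, so that the resulting series of maps samples the trajectory from one extreme to the other.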
The reconstruction of the asymmetric conformation was further improved to an overall resolution of 4.4 Å by using Topaz26 to increase the number of particles picked followed by focused 3D classification in Relion27 to obtain a particle set with reduced conformational heterogeneity in the dynamic protomer (Fig S2). This revealed that the site-2 interface is indistinguishable between the combined site and the canonical site 2 of the symmetric structure (Fig. 6g). In contrast, although the combined and canonical site-1 interactions are similar, some differences are apparent in the relative positioning of Vh-Ins-HSLQ/insulin and αCT with respect to the L1 domain. In particular, the orientation of Vh-Ins-HSLQ relative to L1 is rotated approximately 70 degrees along the axis of the αCT helix (Fig 6h). Moreover, the αCT density is shorter than seen in site-1-bound structures, and is more consistent with αCT seen in the apo-IR crystal structure25. Unfortunately, the resolution is insufficient to conclusively assign the register of αCT, which also differs between the apo and bound states of receptor site 18.
A previously reported insulin receptor complex structure using the same receptor ectodomain preparation shows some resemblance to the most asymmetric state that we observe. Gutmann et al.14 reported this low-occurrence conformation that resembles maps near the center of the conformational trajectory described here and, although the details present were insufficient for unambiguous modelling, a 3:1 insulin:receptor state was proposed as an intermediate between the 2:1 and 4:1 states. A low-occurrence receptor conformation reported by Scapin et al.15 also has some overall similarity to our asymmetric state but lacked sufficient resolution to visualize relevant details. Although the asymmetric IRΔβ-Zip construct used by Weis et al.17 displays some similarity near the insulin-occupied site 1 and the C-terminal regions of the stalks, which show the same close approach as in our symmetric and asymmetric reconstructions, the organization of the unoccupied site-1 domains (L1, CR, L2, αCT) is distinctly different and the human insulin:IRΔβ-Zip complex does not display a combined site-1/site-2 architecture nor any density for more than the single site-1 insulin molecule.
Vh-Ins-HALQ signaling response
Insulin is capable of stimulating both metabolic and mitogenic responses through the PI3K/AKT and Ras/MAPK/ERK pathways, respectively. To characterize the signaling profile of Vh-Ins-HALQ, the relative phosphorylation of AKT and ERK induced by Vh-Ins-HALQ administration in L6 myoblasts overexpressing IR-A was determined (Fig. 7a). We found that the overall ratio of AKT/ERK phosphorylation induced by Vh-Ins-HALQ was the same as that induced by human insulin, indicating a native-like signaling profile with no bias towards AKT or ERK. To evaluate the metabolic efficacy of Vh-Ins-HALQ, we compared Vh-Ins-HALQ and human insulin (Humulin R) in vivo in an insulin tolerance test. Subcutaneous administration of human insulin or Vh-Ins-HALQ (0.017 mg·kg⁻¹) in streptozotocin-induced diabetic rats lowered blood glucose levels, which reached similar nadir levels (~60 mg·dL⁻¹) (Fig. 7b). These observations indicate that the metabolic potency of Vh-Ins-HALQ is similar to that of human insulin. As a final assay of signaling response, the cell-proliferative potency of Vh-Ins-HALQ was assessed by DNA synthesis in L6 myoblasts overexpressing IR-A (Fig. 7c). We found that human insulin was slightly more potent than Vh-Ins-HALQ in its ability to induce DNA synthesis, indicating that Vh-Ins-HALQ may have the desirable property of being slightly less mitogenic than human insulin (insulin EC50 4.9 nM vs Vh-Ins-HALQ EC50 7.3 nM; 95% C.I.s 4.2-5.5 nM and 6.3-9.5 nM, respectively; p < 0.001).
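The size of the mitogenic difference can be read directly from the reported EC50 values. The sketch below uses only the numbers quoted above; the helper names `relative_potency` and `intervals_overlap` are hypothetical, and the interval check is a rough visual aid, not the statistical test behind the reported p value.

```python
def relative_potency(ec50_ref, ec50_test):
    """Fold increase in EC50 of the test compound relative to the reference.

    Values above 1 mean the test compound is less potent than the reference.
    """
    return ec50_test / ec50_ref


def intervals_overlap(a, b):
    """True if two closed intervals (low, high) share at least one point."""
    return a[0] <= b[1] and b[0] <= a[1]


if __name__ == "__main__":
    # EC50 values (nM) and 95% confidence intervals as reported in the text.
    insulin_ec50, insulin_ci = 4.9, (4.2, 5.5)
    halq_ec50, halq_ci = 7.3, (6.3, 9.5)

    fold = relative_potency(insulin_ec50, halq_ec50)
    print(f"Vh-Ins-HALQ EC50 is {fold:.2f}-fold that of human insulin")
    print("95% C.I.s overlap:", intervals_overlap(insulin_ci, halq_ci))
```

With the reported values, the Vh-Ins-HALQ EC50 is about 1.5-fold that of human insulin and the two confidence intervals do not overlap, which fits the description of a slight reduction in mitogenic potency.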