Availability of X-ray crystal structures
All three diaminopyrimidine-based inhibitors were co-crystallized with MTHFD2: compound 1 (PDB code: 6S4A), compound 2 (PDB code: 6S4E) and compound 3 (PDB code: 6S4F). With regards to MTHFD2L, compound 2 is the only co-crystallized inhibitor (PDB code: 7QEI).5 The X-ray structure of MTHFD1 in complex with LY345899 and NADP (PDB code: 6ECQ)22 was also used in this study for docking and subsequent post-docking analysis. The X-ray structure of MTHFD2 in complex with a tricyclic coumarin-based selective inhibitor (PDB code: 6KG2, compound 5)18 was employed in the present work to compare with the crystallographic binding mode of compound 1.
Comparison of MTHFD1, MTHFD2 and MTHFD2L substrate binding sites
As discussed earlier, MTHFD2 is exclusively expressed in several cancer types such as breast cancer2, colorectal cancer3, acute myeloid leukaemia4,5, renal cell carcinoma6 and hepatocellular carcinoma7, while absent in healthy adult human tissues, which makes MTHFD2 an important target for cancer drug discovery. On the other hand, the two close homologs: MTHFD1 and MTHFD2 are present in healthy adult human tissues, thus rendering the development of selective MTHFD2 inhibitors more challenging. To understand the non-selective binding modes of the reported diaminopyrimidine-based inhibitors5, the X-ray structures of MTHFD1, MTHFD2 and MTHFD2L were superposed and compared with each other at the substrate binding site level. In Fig. 2, the X-ray structure of MTHFD2 in complex with compound 1 (PDB code: 6S4A) is superposed with the X-ray structure of MTHFD1 bound to LY345899 and NADP (PDB code: 6ECQ), and the X-ray structure of MTHFD2L bound to compound 2 (PDB code: 6S4E). The central pyridine ring and the diaminopyrimidine scaffold of 1 accommodates well into the substrate binding site of all three while its terminal tetrazole unit is projected towards the solvent-exposed region, in the vicinity of loop 1. As reported in our previous work20, Arg43 in MTHFD2 is a key residue that can be exploited for selectivity because the corresponding MTHFD1 and MTHFD2 residues (Lys10 and Thr57, respectively) are projected outward from the substrate binding site and are inaccessible for H-bond contact with the inhibitor. The butanoic acid unit of the pyrimidine-based compound 1 forms H-bond interactions with the backbone nitrogens of Gly273 (MTHFD1), Gly310 (MTHFD2) and Gly324 (MTHFD2L). The two nitrogens from the tetrazole moiety of 1 establish salt-bridge contacts with the sidechain of Arg278 in MTHFD2, however, no H-bond/salt-bridge interaction is observed with MTHFD1 and MTHFD2L at the same position due to the fact that the longer and flexible Arg278 is replaced by structurally different residues Tyr240 in MTHFD1 and Tyr92 in MTHFD2L, providing promising further scope of selective MTHFD2 binding. The pyridine ring of 1 demonstrates π–π stacking with Tyr52 (MTHFD1), Tyr84 (MTHFD2) and Tyr98 (MTHFD2L), contributing to the high binding affinities for the three isoforms. Interestingly, the hydrophilic asparagine in MTHFD2 (Asn87) and MTHFD2L (Asn101) are replaced by lipophilic valine (Val55) in MTHFD1. The amide carbonyl of 1 H-bonds to both Asn87 and Asn101 whereas no interaction was noted between the inhibitor and Val55 in MTHFD1 which renders selectivity potential. The urea carbonyl of compound 1 establishes H-bond interaction with Lys56 and Gln100 in MTHFD1, Lys88 and Gln312 in MTHFD2, and Lys102 and Gln146 in MTHFD2L, respectively. Finally, the diaminopyrimidine core of 1 seems to be an important structural element and is projected inward of the substrate binding site, forming identical H-bonds to all three MTHFD isoforms. One of the amino groups of the compound 1 diaminopyrimidine scaffold H-bonds to the backbone oxygen of Leu101, Leu133 and Leu147 of MTHFD1, MTHFD2 and MTHFD2L respectively, while its other amino group engages in H-bond contacts with Val99 and Asp125 (MTHFD1), Val131 and Asp155 (MTHFD2), and Val145 and Asp169.
Despite that the diaminopyrimidine scaffold seems to contribute crucially to the binding affinity for the MTHFD isoforms, it is not advisable to use in selective MTHFD2 inhibitor design due to its interior placement in the substrate binding site where all MTHFD residues are identical and do not provide any scope for selectivity. Taken together, avoiding scaffolds like diaminopyrimidine in the rational structure-based design of new MTHFD2 inhibitors, instead targeting Arg43, Asn87 and Arg278 appears to open up new opportunities to attain selectivity (Table 1).
Table 1
Residue comparison at the substrate binding site of MTHFD1, MTHFD2 and MTHFD2L.
Entry | MTHFD1 | MTHFD2 | MTHFD2L |
1. | Lys10 | Arg43* | Thr57 |
2. | Tyr52 | Tyr84 | Tyr98 |
3. | Val55 | Asn87** | Asn101 |
4. | Lys56 | Lys88 | Lys102 |
5. | Val99 | Val131 | Val145 |
6. | Gln100 | Gln132 | Gln146 |
7. | Leu101 | Leu133 | Leu147 |
8. | Asp125 | Asp155 | Asp169 |
9. | Gly273 | Gly310 | Gln324 |
10. | Tyr240 | Arg278* | Tyr292 |
*MTHFD2 residues different to MTHFD1 and MTHFD2L, important for selectivity |
**MTHFD2 residue different to MTHFD1, however, similar to MTHFD2L, important for selectivity |
Docking and MD simulations of compounds 1–3 bound to MTHFD1
Due to the unavailability of X-ray structures of MTHFD1 in complex with diaminopyrimidine analogues, we first performed molecular docking of compounds 1–3 in the substrate binding site of MTHFD1. As illustrated in Table 2, all compounds show high binding affinity with MTHFD1, possessing Glide docking scores of better than − 10.0 kcal/mol, further correlating with the experimental results. Similar to the aforementioned discussion on the superposed MTHFD structures (with the co-crystallized pose of compound 1 in MTHFD2), the suggested docking pose of 1 shows the desirable binding mode and existence of multiple protein-ligand contacts in the substrate binding site of MTHFD1 (Figure S2A). The butanoic acid unit of inhibitor 1 H-bonds to the backbone nitrogen of Gly273 while the central pyridine ring forms π–π stacking with Tyr52. Lys56 of MTHFD1 is involved in H-bond interactions with the urea carbonyl and diaminopyrimidinone carbonyl of 1. As expected, the diaminopyrimidine scaffold of compound 1 is placed inward to the substrate binding site of MTHFD2, constituting multiple H-bond contacts with Val99, Leu101, Asp125. Only the tetrazole unit of 1 is oriented away from the binding site, towards Lys10 but not forming any interactions. Compound 2, which is characterized by the presence of a central phenyl ring instead of the pyridine ring in 1, displays a similar binding mode and identical protein-ligand interactions in the MTHFD1 binding site (Figure S2B). In comparison with compound 1, the central 3-fluororpyridine ring in compound 3 is placed slightly away from Tyr52, resulting in loss of lipophilic contact; however, this is compensated by the presence of two additional H-bond interactions; one between glutamic acid sidechain of 3 and Gly274, and another between the amide carbonyl of 3 and Gln100, which overall contribute to the improved docking score of -10.75 kcal/mol (Figure S2C).
Table 2
Glide scores (in kcal/mol) of 1–3 in MTHFD1, MTHFD2 and MTHFD2L.
Entry | MTHFD1 | MTHFD2 | MTHFD2L |
Compound 1 | -10.51 | -12.66 | -11.90 |
Compound 2 | -10.65 | -12.65 | -11.43 |
Compound 3 | -10.75 | -12.50 | -12.22 |
In order to evaluate the conformational dynamics, binding mode and stability of protein-ligand interactions, 200 ns molecular dynamics (MD) simulations were performed of MTHFD1 in complex with each inhibitor starting from their suggested docking poses. As a measure of protein mobility, the Root Mean Square Deviation (RMSD) of the α-carbons of the protein was calculated throughout the simulation. RMSD of the ligand heavy atoms during the course of the simulation was also calculated as a measure of ligand mobility. The RMSD depicts the deviation of the ligand/protein α-carbon atoms from their initial suggested docking pose during the simulations. All MD simulations with respect to the 3 isoforms (MTHFD1, MTHFD2 and MTHFD2L) were performed in triplicates with the purpose of re-evaluating the stability of the MTHFD – inhibitor complexes. The triplicate RMSD plots of the MTHFD1 – compound 1 complex indicate significant stability over the course of the simulation, showing marginal differences among all replicas, with respect to both MTHFD1 α-carbons and compound 1 movement (Figure S3). The overall average RMSDs calculated from the triplicate MD simulations of the MTHFD1 – compound 1 complex were found to be 2.2Å and 2.5Å for the MTHFD1 α-carbons and compound 1, respectively (Table S1). Replica 2 was selected as the representative replica for further individual analysis. In this replica, compound 1 demonstrated a decent stability in the substrate binding site of MTHFD1, with an average RMSD of 2.2Å, while the MTHFD1 α-carbons possess mildly higher average RMSD of 2.8 Å (Figs. 3A, 3B). MD analysis furthermore verified the stability of the docking pose of compound 1. The initial minor fluctuations between 0–50 ns of the MD trajectory correspond to the dynamic movement of the butyl-tetrazole unit of 1. The interactions with Lys56, Leu101 and Asp 125 of MTHFD1 were maintained for 100% of the simulation time, while the interactions with Tyr52, Val99, Gln100 and Gly274 account for 94%, 68%, 67% and 70% of the simulation time, respectively. In addition to the aforementioned protein-ligand interactions observed from the docking pose, the MD analysis also revealed lipophilic contact between the central pyridine ring of the inhibitor and Val280 for 50%, and H-bond/salt-bridge interaction between the compound 1 tetrazole unit and Arg250 for 36% of the simulation time (Figure S4).
Interestingly, loop 1 of MTHFD1 displayed some conformational changes when inhibitor 1 was bound to the substrate binding site (Figs. 3C, 3D, S5). The initial docking pose and the energy minimized structure suggest that loop 1 is oriented away from 1 without establishing any interactions. However, the loop was found to fluctuate towards 1 already from the first 15 ns onwards, establishing H-bond and salt-bridge contacts between Arg250 and the tetrazole unit of the inhibitor. We also superposed the MD structure of the MTHFD1 – compound 1 complex with the X-ray structure of MTHFD1 without the inhibitor (PDB code: 6ECR), with the purpose of verifying the observed conformational changes. As shown in Figure S6, in the 6ECR MTHFD1 structure, loop 1 seems to be notably away from the position in the MTHFD1-compound 1 complex, which verifies the crucial conformational changes taking place. We furthermore hypothesize that the butyl-tetrazole unit of 1 is an important element in facilitating the conformational changes in loop 1 by interacting with Arg250, overall correlating with the 0.5 nM IC50 of inhibitor 1 against MTHFD1.
As observed from the triplicate MD simulations of MTHFD1 – compound 2, the MTHFD1 α-carbons show decent stability and similar pattern of convergence, with an overall average RMSD of 2.2Å. Furthermore, an overall average RMSD of 3.2Å was noted for ligand replicas. Replicas 1 and 2 of compound 2 superpose almost perfectly during the simulation, however, replica 3 demonstrates drastic fluctuation (Figure S7, Table S2). Replica 1 was selected as representative for further discussion. The MTHFD1 – compound 2 complex possesses notable stability over the course of the simulation, with mild initial fluctuations and 1.9Å and 2.0Å average RMSDs for ligand and protein, respectively (Figs. 4A, 4B). Similar to the MTHFD1 – compound 1 complex, engagement with the MTHFD1 loop1 is noted when the diaminopyrimidine-based compound 2 is bound, establishing H-bond/salt-bridge interactions with Tyr240 and Arg250, that were maintained for 39% and 61% of the simulation time, respectively (Figures S8, S9). However, the dynamic movement of the MTHFD1 loop1 in presence of 2 is limited compared to what was observed for compound 1. It can thus be hypothesized that the conformational changes at loop 1, particularly the movement towards the substrate binding site, are more favored in the proximity of tetrazole (compound 1) whereas in presence of glutamic acid (compound 2), the inward projection of loop 1 is relatively restricted, which is in good agreement with 89 nM IC50 of 2 against MTHFD1 (as compared to 0.5 nM IC50 of 1). Interactions of 2 with Lys56, Leu101 and Asp125 of MTHFD1 exist during the whole simulation while the interactions with Tyr52, Val99, Gln100 and Gly273 account for 89%, 70%, 84% and 86% of the simulation time, respectively.
The triplicate MD simulations of the MTHFD1 – compound 3 complex show an overall average RMSD of 2.6Å and 3.4Å for MTHFD1 α-carbons and compound 3, respectively, demonstrating significant stability (Figure S10, Table S3). Only replica 2 of compound 3 exhibits some fluctuations between ∼150ns to 180ns, but otherwise algins well with other replicas. Replica 3 was chosen as representative from these simulations. The MTHFD1 – compound 3 complex possesses significant stability during the 200ns simulation, with both the MTHFD1 α-carbons and compound 3 showing a high convergence, maintaining an average RMSD of 2.1Å and 2.2Å, respectively (Figs. 4C, 4D). Unlike 1 and 2, compound 3 does not interact with loop 1. Instead, it is located somewhat deeper into the substrate binding site of MTHFD1, forming additional H-bond interactions that were not observed with compounds 1 or 2; the backbone oxygen of the glutamic acid of 3 forms H-bond/salt-bridge contact with Arg17 for 90% of the simulation time while its carboxy sidechain forms an H-bond with Lys10 during 37% of the simulation (Figs. 4E, S11). On comparing the MD pose of the MTHFD1 – compound 3 complex with the MTHFD1 – compound 1 complex (Figs. 4E, 4F), we note that the acyclic glutamic acid unit of 3 facilitates interaction with Lys10 and Arg17 instead of interacting with Tyr240 and Arg250, thereby limiting the interaction with loop 1. The flexible butyl-tetrazole unit of 1, on the other hand, is projected away from the Lys10 and Arg17 substrate site residues and engages with Arg250, facilitating the notable movement and conformational changes in loop 1. The distinct interactions of 3 with Lys10 and Arg17 thus compensate for the absence of interactions with loop1 and are hypothesized to crucially contribute to its high binding affinity (IC50 16 nM) in addition to the essential interactions with Tyr52 and Lys56 (100% of the simulation time), and interactions with Gln100, Leu101, Asp125, Gly273 that exist for 96%, 80%, 97% and 37% of the simulation, respectively. The central 3-fluoropyridine ring system establishes lipophilic contacts with Val55 and Val280 for 32% and 44% of the simulation trajectory, respectively. The presence of the 3-fluoro atom also facilitates interactions with Val55 and Val280 of MTHFD1, overall contributing to the improved binding affinity.
MD simulations of compounds 1–3 bound to MTHFD2
The co-crystallized binding poses of compounds 1–3 in MTHFD2 (Figure S13, PDB codes: 6S4A, 6S4E and 6S4F for compound 1, 2 and 3, respectively) were subjected to triplicate 200ns MD simulations. Both the protein α-carbons and compound 1 possess remarkable stability during the triplicate MD simulations, demonstrating overall average RMSDs of 1.7Å and 2.0Å, respectively, (Figure S12, Table S4). Replica 1 was selected as representative. The MTHFD2 – compound 1 complex exhibits significant stability over the course of the simulation, with the MTHFD2 α-carbons and the diaminopyrimidine-based compound 1 possessing average RMSDs of 1.5Å and 2.1Å, respectively (Figs. 5A, 5B). Compound 1 illustrates some fluctuations throughout the simulation which correspond to its flexible butyl-tetrazole unit.
As observed from the simulation trajectory, loop 1 of MTHFD2 shows limited movement in presence of 1 (as compared to loop1 of the MTHFD1 – compound 1 complex). Tyr240 in the MTHFD1 loop1 is replaced by the longer, bulkier and flexible Arg278 in loop 1 of MTHFD2, which is more accessible to the diaminopyrimidine-based inhibitors (Fig. 2B). The tetrazole unit of 1 H-bonds with Arg278 for 63% of the simulation, indicating a significant engagement to loop 1, however, with less movement (Figs. 5A, 6A, S14). Furthermore, the H-bond contacts between the amide carbonyl of 1 and Asn87, and between the diaminopyrimidine ring of 1 and Leu133 and Asp155, were maintained for the whole simulation time. Lys88 of MTHFD2 H-bonds to the urea carbonyl of compound 1 for 90%, and further engages its diaminopyrimidine scaffold in a π-cation interaction for 92% of the simulation. In addition, the urea carbonyl of 1 is engaged in H-bond interaction with Gln132 for 80% of the simulation trajectory. The diaminopyrimidine ring of 1 H-bonds to Val131 for 88% of the simulation time and the backbone oxygen of the compound 1 glutamic acid H-bonds to Gly310 for 81% of the simulation time. Finally, lipophilic contacts of the central pyridine ring of 1 with Tyr84 and Val317 exist for 85% and 58% of the simulation. The overall strong interaction profile along with engagement to Arg278 of loop 1 contribute to the high binding affinity for MTHFD2 (IC50 16 nM).
As shown in the triplicate MD simulations of the MTHFD2 – compound 2 complex (Figure S15, Table S5), the overall average RMSD of MTHFD2 α-carbons was found to be 2.0Å, facilitating remarkable stability and similar pattern of convergence. Replicas 1 and 2 of compound 2 aligns perfectly, while replica 3 completely dissociates from the substrate binding site of MTHFD2. Replica 2 was chosen as the representative replica for further analysis. Compound 2 also displays a great stability at the substrate binding site of MTHFD2 during the simulation, possessing an average RMSD of 1.6Å; albeit with a sudden jump in RMSD between 167–172 ns. The MTHFD2 α-carbons were found to be consistently stable over the course of the simulation, possessing an average RMSD of 1.4Å (Figs. 5C, 5D). The aforementioned RMSD fluctuations in the interval between 167–172 ns correspond to a couple of conformational changes: (a) movement of the glutamic acid sidechain of 2, contributing to H-bond interaction with Asn87, and (b) rotation of the central phenyl ring of 2, thereby forming π-cation interaction with Lys88. In addition, apart from the glutamic acid sidechain, the amide carbonyl of 2 H-bonds to Asn87 resulting in an overall H-bond interaction with Asn87 for 100% of the simulation. Likewise, the urea carbonyl of compound 2 H-bonds to Lys88, including the above-mentioned π-cation interaction, leads to a resulting interaction between these for 85% of the simulation time. The H-bond interactions between the diaminopyrimidine ring of 2 and Leu133 and Asp155 exist for 100% of the simulation. H-bond interaction of the diaminopyrimidine system of compound 2 with Val131 furthermore accounts for 80% of the simulation while π-π stacking between the central phenyl ring of 2 and Tyr84 was maintained for 73% of the simulation. Finally, the H-bond contact between the urea carbonyl and Gln132 is maintained for 90% of the simulation trajectory (Figure S16).
Despite a desirable interaction profile at the substrate binding site of MTHFD2, compound 2 exhibited negligible contact with loop1 of MTHFD2, possessing H-bond interaction with Arg278 just for 17% of the simulation time (Figs. 6B, 6C). The limited interaction of compound 2 with loop1 of MTHFD2 can be thus correlated with the lower IC50 of 254 nM as observed from the biochemical screening.
Average RMSDs of 1.6Å and 2.7Å were observed from the triplicate MD simulations of MTHFD2 – compound 3 for protein α-carbons and ligand, respectively (Figure S17, Table S6). The MTHFD2 α-carbons in all replicas demonstrated great stability and identical convergence, whereas replica 1 and replica 2 of compound 3 show sudden jump at ~ 30ns and ~ 100ns, respectively. Replica 3 was selected as representative from the triplicate MD simulations of MTHFD2 – compound 3. Compound 3 displayed remarkable stability at the MTHFD2 substrate binding site over the course of the simulation, with an average RMSD of 1.5Å. The protein α-carbons also showed very low average RMSD of 1.8Å (Figs. 5E, 5F). The fluctuations of 3, as observed from the RMSD plots, correspond to the dynamic movement of its glutamic acid sidechain. The H-bond interactions between the diaminopyrimidine scaffold of 3 and Val131, Leu133 and Asp155 account for 91%, 100% and 100% of the simulation time, respectively (Figure S18), and π-π stacking between the central 3-fluoropyridine ring of 3 and Tyr84 exists during 80% of the simulation. Lys88 engages the amide carbonyl of compound 3 (near to the diaminopyrimidine system) via H-bond interaction for 80%, and with the diaminopyrimidine ring via π-cation interaction for 60% of the simulation trajectory, thus significantly contributing to the binding affinity towards MTHFD2. Moreover, the glutamic acid sidechain of 3 engages with loop1 of MTHFD2, establishing H-bond interaction with Arg278 for 43% of the simulation time (Fig. 6D). Unlike compounds 1 and 2, the H-bond contact between the amide carbonyl (near the glutamic acid) of compound 3 and Asn87 was reduced to only 35% of the simulation time. The interaction profile of 3 with MTHFD2 including the reduced interaction with Asn87 and moderate engagement with loop 1 thus results in its 47 nM inhibitory activity against MTHFD2.
Docking and MD simulations of compounds 1–3 bound to MTHFD2L
The X-ray structure of MTHFD2L in complex with compound 2 is available in the protein data bank (PDB code: 7QEI) whereas for 1 and 3, we performed molecular docking to obtain their putative binding poses (Figure S19). The suggested docking pose of inhibitor 1 shows the existence of plenty of H-bond contacts in the MTHFD2L binding site, similar to the binding sites of the other two isoforms. The triplicate MD simulations of the MTHFD2L – compound 1 complex demonstrated remarkable stability, showing overall average RMSDs of 1.9Å and 1.7Å for protein α-carbons and 1, respectively, further selecting replica 3 as being representative (Figure S20, Table S7). Both the protein α-carbons and compound 1 were highly stable during the simulation, possessing average RMSD’s of 1.7Å and 1.6Å, respectively (Figs. 7A, 7B). The terminal tetrazole unit of 1 H-bonds to Tyr292 of MTHFD2L loop 1; however, the said interaction was maintained for just 24% of the simulation time (Figure S21). Interestingly, Asn101 engages both the amide carbonyl and the tetrazole unit of 1 via H-bond interactions which altogether lead to the existence of H-bond between the two during 100% of the simulation. π-π stacking between the pyridine ring of 1 and Tyr98 exists for 88% of the simulation. Lys102 H-bonds to the urea carbonyl of compound 1 for 90%, and further establishes π-cation interaction with its diaminopyrimidine ring for 71% of the simulation trajectory. The urea carbonyl of 1 furthermore H-bonds to Gln146 during 52% of the simulation time. H-bond contacts exist between the diaminopyrimidine scaffold of 1, and Leu147 and Asp169, throughout the simulation, while interaction with Val145 is maintained for 89% of the simulation time. Despite the limited contact with the loop 1 of MTHFD2L and due to a sequence identity of 72% between MTHFD2 and MTHFD2L, the aforementioned interactions indicate a significant contribution to the improved binding affinity, correlating with the IC50 value of 27 nM for compound 1 against MTHFD2L as obtained in the biochemical screening.
Replica 3 was selected as the representative replica from the triplicate MD simulations of MTHFD2L – compound 2 complex. All replicas of MTHFD2 α-carbons demonstrated similar RMSD pattern with minor fluctuations, possessing an overall average RMSD of 2.2Å. Only replica 3 of compound 2 seems to be reasonably stable, while replica 2 fluctuates significantly and replica 1 dissociates completely, leading to an overall average RMSD of 5.0Å (Figure S22, Table S8). In replica 3, compound 2 demonstrated significant stability for the first 114ns of the simulation, followed by some notable fluctuations, giving an average RMSD of 1.9Å. Correspondingly, a fluctuation between 113–125ns was noted among the protein α-carbons, resulting in an average RMSD of 2.1Å (Figs. 7C, 7D). The ligand fluctuations mainly corresponds to the dynamic movement of the flexible glutamic acid unit of 2 and to some extent the central phenyl ring, whereas the RMSD fluctuations of the MTHFD2L α-carbons represents dynamic movements of loop 1 and α-helix 1 (Figure S23). The observed ligand instability and the limited loop 1 interaction are in correlation with the weaker MTHFD2L inhibitory activity of 2 against MTHFD2L with an IC50 of 126 nM. Compound 2 interacts with Asn101 for the whole duration of the simulation via H-bonds to its amide carbonyl and its glutamic acid moiety (Figure S24). The aforementioned conformational changes allow Lys102 of MTHFD2L to interact with the urea carbonyl of 2 via H-bond formation (88%) and with the diaminopyrimidine scaffold via π-cation interaction (41%). In addition, the urea carbonyl of 2 forms H-bond interaction with Gln146 of MTHFD2L during 42% of the simulation. Tyr98 interacts with the central phenyl ring of 2 by π-π stacking for 68%, while it additionally H-bonds to the glutamic acid sidechain of 2 for 20% of the simulation time. H-bond interactions between the diaminopyrimidine scaffold of compound 2 and Val145, Leu147 and Asp169 account for 90%, 100% and 100% of the simulation trajectory, respectively. The H-bond contact between 2 and loop 1 of MTHFD2L is limited to only 31% of the simulation time, and is seen between the glutamic acid and Tyr292, whereas interaction between the backbone oxygen of compound 2 and Gly324 was observed for 55% of the simulation time.
Overall average RMSDs of 2.3Å and 3.2Å for protein α-carbons and inhibitor, respectively, were observed from the triplicate MD simulations of the MTHFD2L – compound 3 complex, indicating a stable protein and ligand movement (Figure S25, Table S9). The protein replicas exhibit minor fluctuations, albeit some notable jumps were seen for the ligand in replicas 2 and 3. Replica 1 was selected as representative replica in this study. The MTHFD2L – compound 3 complex displayed significant stability during the course of the simulation, possessing average RMSDs of 2.5Å and 2.4Å for protein α-carbons and ligand, respectively (Fig. 7E, 7F). The minor fluctuations noted from the RMSD plot correspond to the dynamic movement of the compound 3 glutamic acid and loop 1 of MTHFD2L. The essential protein-ligand interactions were maintained similarly for compound 3 at the MTHFD2 binding site, as compared to compounds 1 and 2; however, a few new and additional interactions were noted that are believed to altogether contribute to the binding affinity for MTHFD2L with an experimental IC50 value of 47 nM. The H-bond interactions between the diaminopyrimidine scaffold and Val145, Leu147 and Asp169 account for 80%, 51% and 100% of the simulation time, respectively (Figure S26). The amide carbonyl and the backbone oxygen of 3 H-bond to Asn101 of MTHFD2L, for a total of 100% of the simulation time. The amide carbonyl near to the diaminopyrimidine ring forms H-bond contact with Lys102 for 70% of the simulation time. Interestingly, no π-π stacking is observed between the central 3-flurophenyl ring and Tyr98; instead the glutamic acid sidechain of 3 H-bonds to Tyr86 for 44% of the simulation trajectory. The flexible glutamic acid sidechain of 3 was furthermore able to constitute two new H-bond interactions: with Lys302 and Leu303 that were maintained for 26% and 32% of the simulation trajectory, respectively. The existence of these two interactions furthermore pinpoints the occupancy of compound 3 deeper into the substrate binding site of MTHFD2L, correlating with its experimental activity.
Insights into the MTHFD loop1 conformational changes in presence of compounds 1–3
In order to shed further light on loop 1 conformational changes of the MTHFD isoforms when the diaminopyrimidine-based inhibitors 1–3 are bound, we investigated the distance between these and loop 1 of MTHFD1, MTHFD2 and MTHFD2L during the course of the simulations. The information gathered shows a good correlation with the reported experimental activity. The average distance between the tetrazole unit of 1 and Arg250 of the MTHFD1 loop 1 during the simulation, was 7.50Å (Table 3, Fig. 8A) which correlates with the observed loop 1 movement (Figs. 3C, 3D) and the maintenance of H-bond interaction between compound 1 and Arg250 for 36% of the simulation time, as discussed earlier, leading to the 0.5 nM MTHFD1 inhibitory activity of observed for 1. Despite the H-bond contact between compound 2 and Arg250 of the MTHFD1 loop 1 that exists during 61% of the simulation time (as discussed earlier), limited movement of loop 1 was observed when 2 is bound. Moreover, the average distance between the glutamic acid unit of 2 and Arg250 of the MTHFD1 loop 1 was found to be 9.52Å (Table 3, Fig. 8B) which denotes reduced proximity between 2 and the MTHFD1 loop 1, corresponding to the comparatively weaker MTHFD1 inhibitory activity with an IC50 of 89 nM. The comparative analysis furthermore verifies the fact that conformational changes are more favored when the tetrazole unit (compound 1) rather than the glutamic acid moiety (compound 2) is in the proximity of the MTHFD1 loop1.
Table 3
Average distance (in Å) between 1–3 and loop 1 of the MTHFD isoforms during the 200ns MD simulations, along with their reported IC50 values.
Entry | MTHFD1 | MTHFD2 | MTHFD2L |
Avg. distance (Å) with Loop 1 (Arg250) | IC50 (nM) | Avg. distance (Å) Loop 1 (Arg278) | IC50 (nM) | Avg. distance (Å) with Loop 1 (Tyr292) | IC50 (nM) |
1 (Tetrazole) | 7.50 | 0.5 | 6.37 | 11 | 5.73 | 27 |
2 (Glutamic acid) | 9.52 | 89 | 7.97 | 254 | 8.31 | 126 |
3 (Glutamic acid) | 16.75 | 16 | 7.46 | 47 | 9.31 | 47 |
Compound 3 is characterized by the presence of a central 3-fluoropyridine ring whereas 1 and 2 have pyridine and phenyl rings at the same position, respectively. As mentioned above, compound 3 does not interact with loop 1 of MTHFD1, as evidenced by the 16.75Å average distance between the glutamic acid of 3 and Arg250 (Table 3, Fig. 8C). However, two additional H-bond interactions (with Lys10 and Arg17) were observed which underline the placement of compound 3 much deeper into the substrate binding pocket of MTHFD1 and further substantiates the remarkably potent activity of 3 against MTHFD1 (IC50 = 16 nM). It can thus be hypothesized that the deeper accommodation of compound 3 in the substrate binding site of MTHFD1, anchored by Lys10 and Arg17 H-bond contacts, prevents the interaction of the glutamic acid unit with loop 1 of MTHFD1, and further improves the binding affinity.
Lys10 in MTHFD1 is replaced by Arg43 in MTHFD2 (Fig. 2B). Arg43 is more accessible to the diaminopyrimidine-based inhibitors due to the projection of its flexible guanidine group toward the substrate binding site, whereas Lys10 in MTHFD1 is oriented deeper into the substrate binding site. Arg17 in MTHFD1 is similarly replaced by Lys50 in MTHFD2. Compounds 1–3 all form H-bond interaction with Arg43 for 25–40% of the simulation time, demonstrating promising selectivity potential; however, presence of the diaminopyrimidine scaffold in compounds 1–3 limits the opportunities for MTHFD2 selectivity. Besides this, the tetrazole unit of 1 similarly H-bonds to Arg278 of MTHFD2 loop 1 for 63% of simulation time, correlating with an average distance of 6.37Å (Table 3, Fig. 8D) and an 11 nM IC50 of compound 1 towards MTHFD2. The glutamic acid unit of compound 2 displays a larger average distance of 7.97Å (Table 3, Fig. 8E) to Arg278 of the MTHFD2 loop 1, and a corresponding existence of just 17% of H-bond contact during the simulation. This agrees with the negligible contact of the glutamic acid of 2 with loop 1 of MTHFD2 and furthermore correlates with the relatively poor MTHFD2 inhibitory activity with an IC50 of 254 nM. The average distance between the glutamic acid unit of compound 3 and Arg278 was found to be 7.46Å during the simulation (Table 3, Fig. 8F). Unlike the MTHFD1 – compound 3 complex, the glutamic acid unit of 3 is limited by the H-bond contact with Arg43 and thus cannot access Lys50 (Arg17 in MTHFD1) that is placed much deeper into the substrate binding pocket. This is in turn compensated by the formation of H-bond interaction with Arg278 of loop 1 during 43% of the simulation time, further justifying the 47 nM MTHFD2 inhibitory activity of compound 3 (relative to its 16 nM MTHFD1 inhibitory activity).
In comparison with the MTHFD1 and MTHFD2 structures, the diaminopyrimidine-based inhibitors 1–3 show limited interaction with loop1 of MTHFD2L. The tetrazole unit of compound 1 was able to H-bond to Tyr292 of MTHFD2L loop1 for just 24% of the simulation time, however, the average distance of 5.73Å (Table 3, Fig. 8G) indicates a closer proximity and reduced dynamic movement of the MTHFD2L loop1. Despite this, 1 displays an exceptional interaction profile with all essential residues at the substrate binding site of MTHFD2L which leads to its 27 nM potent inhibitory activity. The glutamic acid unit of compound 2 H-bonds to Tyr292 of the MTHFD2L loop 1 for 31% of the simulation time. The average distance between the two was calculated to be 8.31Å over the course of the simulation (Table 3, Fig. 8H). Despite a reasonable contact with Tyr292 of the MTHFD2L loop 1, the higher average distance can be justified by the fact that the glutamic acid unit and the central phenyl ring of 2 undergo some peculiar conformational changes, as discussed earlier, which in turn affect the dynamic movement of α-helix 1 of MTHFD2L (Figure S15), establishing a reasonable correlation with the weaker MTHFD2L inhibitory activity observed (IC50 = 126 nM). The glutamic acid unit of 3, finally, showed no interactions with loop 1 of MTHFD2L which correlates with the average distance of 9.31Å during the course of the simulation (Table 3, Fig. 8I). Apart from maintaining essential protein-ligand contacts at the substrate binding site of MTHFD2L, two new interactions were observed (with Lys302 and Leu303) that enable the 47 nM inhibitory activity of compound 3 against MTHFD2L.
Selective coumarin-based inhibitor vs non-selective diaminopyrimidine-based inhibitor
As mentioned in the introduction section, the two classes of MTHFD2 inhibitors, the diaminopyrimidine-based compounds (1–3)5 and the tricyclic coumarin-based compounds (4–6)18 bind to the substrate binding site of MTHFD2. The diaminopyrimidine-based inhibitors bind non-selectively whereas the tricyclic coumarin-based inhibitors bind selectively to MTHFD2. We tried to extend our insights on the aforementioned selectivity preferences and differences at the substrate binding site level involving the two classes of MTHFD2 inhibitors. Taking compounds 1 and 5 as examples, the diaminopyrimidine-based inhibitor 1 has demonstrated IC50 values of 0.5 nM and 11 nM against MTHFD1 and MTHFD2L isoforms5, respectively, confirming the non-selective mode of MTHFD2 inhibition, whereas the tricyclic coumarin-based inhibitor 5 has shown IC50 values of 6.4 µM and 48 nM against MTHFD1 and MTHFD2, respectively, confirming the selective mode of MTHFD2 inhibition.18 We thus superposed the X-ray structure of MTHFD2 in complex with 1 (PDB code: 6S4A) with the X-ray structure of MTHFD2 bound to 5 (PDB code: 6KG2).
As shown in Fig. 9A, the tetrazole unit of 1 (supported by the butyl linker) is accessible to the MTHFD2 loop 1 forming H-bond/salt-bridge interactions with Arg278. The same tetrazole-butyl unit is also in close proximity of loop 1 of the MTHFD1 and MTHFD2L isoforms, capable of forming H-bond/salt-bridge interactions with Tyr240, Arg250 (MTHFD1) and Tyr292 (MTHFD2L). Compound 5, on the other hand, is characterized by the presence of a terminal N-(2-chlorophenyl)methanesulfonamide group at the same position, which is a significantly smaller and less flexible group relative to the tetrazole-butyl unit of 1. The N-(2-chlorophenyl)methanesulfonamide of 5 is not accessible to loop 1 of MTHFD2 (and correspondingly in MTHFD1 and MTHFD2L) and is limited to interacting only with the substrate site residues, forming H-bond/salt-bridge contacts with Arg43 and Gly310 that play key roles in the MTHFD2 selectivity (Fig. 9B).20 Despite the high nanomolar activities, the incorporation of longer and flexible moieties such as tetrazole-butyl (compound 1) and glutamic acid (compounds 2–3) at the termini, are therefore not recommended when designing selective MTHFD2 inhibitors, in order to limit the interactions with and/or conformational changes of loop 1 in all three MTHFD isoforms. Instead smaller scaffolds at the same position such as benzoic acid (compound 4) and N-(2-chlorophenyl)methanesulfonamide (compound 5) restrict the binding to only the substrate site residues that are important for selectivity.
Another significant difference between the binding modes of 1 and 5 involves the presence of the diaminopyrimidine ring (compound 1) which extends deeper into the substrate binding site of MTHFD2 forming multiple H-bond interactions as compared to the solvent exposed N-methylpiperazine linked to the tricyclic coumarin ring system of compound 5. The diaminopyrimidine scaffold is proven to be an important structural element in improving the binding affinity for MTHFD2 as it engages Lys88, Val131, Leu133 and Asp155 via H-bond interactions. However, these residues are identical in the other two MTHFD isoforms (Lys56, Val99, Leu101, Asp125 in MTHFD1, and Lys102, Val145, Leu147, Asp169 in MTHFD2L), further narrowing the possibilities of selective MTHFD2 inhibition with compounds bearing the diaminopyrimidine-based scaffold. In contrast to this, compound 5 comprises of the solvent exposed N-methylpiperazine ring which significantly improves the solubility and pharmacokinetic properties, and further broadens the scope of selective MTHFD2 inhibition as it does not engage the aforementioned substrate site residues of either MTHFD isoform. Moreover, compound 4 has a solvent-exposed tricyclic coumarin ring system as it lacks any aliphatic monocyclic ring such as the N-methylpiperazine of 5. Despite that the binding affinity of 4 for MTHFD2 was reduced to an IC50 of 1.6 µM, it was able to show > 18-fold selectivity for MTHFD2.18 The information gathered on correlating the binding mode and the enzyme activities of compounds 4–5 against MTHFD2 signifies how the solvent-exposed aliphatic monocyclic systems and the central tricyclic coumarin-based scaffold play pivotal roles in facilitating selectivity, and should thus be taken into consideration when developing new and selective MTHFD2 inhibitors.