3.1. ESMACS results
A moderate correlation, r = 0.33, is obtained for the entire set of the compounds (Fig. 3). The predicted binding free energies appear to be more negative for the charged compounds. These compounds have no obvious differences with the electrostatically neutral ones in their structural and dynamic behaviours. The inclusion of receptor adaptation energy in the 1traj-ar approach causes the calculated binding free energies to be scattered more widely, with a reduced correlation coefficient (r = 0.22).
The scaffold is aligned well for all the compounds in their initial structures; it has the same pose as shown in Fig. 2. For the compounds with similar experimental binding affinities, the free energies of the charged ones are more favourably predicted than the electrostatically neutral ones (Fig. 3). A good correlation is obtained for the charged compounds, with a correlation coefficient of 0.84. This is largely due to the correct predictions of the ranking for the two compounds with less favourable binding affinities (Fig. 3). The correlation coefficient exhibits no improvements when the charged compounds are excluded, having the same correlation coefficient of 0.33 but a reduced covariance from 1.37 to 1.11 kcal/mol. The weak correlation is due to the wide spread of the predictions for the compounds with less favourable binding affinities. More negative predictions are made for some of these compounds (Fig. 3), which are caused by more negative van der Waals contributions.
Based on the positions and the chemical properties of the modifications made to the scaffold (Fig. 1, Table 1), the compounds are clustered into different groups. The grouping is based on the following categories: (1) ether variations, (2) pyridyl variation, (3) central aryl variation, (4) lactam substituent variation, (5) central aryl and lactam, (6) central aryl and lactam and ethers (Fig. 4). The subgroup within category 6 shows a moderate correlation (r = 0.52), while the subgroup with the combination of 1 and 6 (subgroup 16) manifests a strong correlation (r = 0.66). The combination of the two subgroups, containing about half of the neutral compounds, yields a similar correlation to that for subgroup 16. The other subgroups show weak or very weak correlations/anticorrelations (Fig. 4, Table 1).
3.2. TIES results
The TIES study has been performed for a total of 103 compounds pairs, of which 29 are simulated by OpenMM and 74 by NAMD. Our previous studies have shown that consistent results are generated from different MD engines21,34 or on different computational platforms30, provided that the same force field is used and the ensemble approach is employed. Here we evaluate the performance of TIES protocols in reproducing the experimental data for the dataset.
Comparison of predictions with experimental data. A very good correlation, with a correlation coefficient of 0.90, is obtained between the calculated free energy differences and the experimental data (Fig. 5a). In the subgroup with compound L30 as the reference, most of the predicted binding free energy differences are negative, indicating that the modifications to the compound are likely to improve the binding potencies (ΔΔG < 0). The calculations correctly predict compound L2 (Fig. 1) as one of the best binders. As L2 already has a very favourable binding free energy (-10.49 kcal/mol), it is not surprising that most of the modifications from it deteriorate the binding affinities (ΔΔG > 0). For all of the 103 compound pairs, 94% of the TIES predictions agree directionally with the experimental observations, that is, the calculated binding free energy differences have the same sign as those from experimental measurements.
The relative binding free energy differences from TIES agree well with those from ESMACS (Fig. 5c), with a correlation coefficient of 0.62. The predicted ΔΔG values for the two subgroups, paired with L30 and L2, are clearly distinct from both of the protocols. The predicted ΔΔG results from ESMACS also have a good correlation with the experimental data, with a correlation coefficient of 0.67 (Fig. 5b).
It should be noted, however, that an apparent overestimation is observed for some of the compound pairs involving L30, making the predicted free energy differences larger than those from experiments. An MSE of -1.36 kcal/mol is obtained for the L30 subgroup, comparing with an MSE of 0.16 kcal/mol for the subgroup with compound L2 as the reference. The apparent overestimations are investigated further in the following sections.
Overestimation from closed cycle mutations. There are 17 compounds which are paired with both L30 and L2 (Fig. 6). This makes it possible to check the performance of the cycle closure method. For the transformations starting from L2 (L2 → L* in Fig. 6a), the average free energy change from TIES calculations agrees well with the experimental data (Fig. 6b). The transformation L30 → L2, however, overestimates the difference by 1.28 kcal/mol. The two-step transformation for all 17 compounds (L30 → L2 → L*) generates an averaged overestimation of 1.17 kcal/mol, similar to that observed in L30 → L2 transformation. Compounds L* and L2 (Fig. 6) have the same/similar alkyl isopropyl substituent at the hydrophobic position (Fig. 1), while L30 does not; transformations L30 → L* therefore share the same change as that in L30 → L2 transformation. There is a similar overestimation in L30 → L* (-0.99 kcal/mol) as that in L30 → L2, which is likely to be rooted in the same alchemical change of growing an alkyl group from L30.
The legs L30 → L2, L2 → L* and L* → L30 generate an average difference of 0.17 kcal/mol within the closed cycle (L30 → L2 → L* → L30, Fig. 6) between the calculations and the experiments. As we have stated previously34, a hysteresis value of 0 is a necessary but not sufficient condition for convergence of predictions. An overall hysteresis near 0 for a closed cycle is not necessarily an indication of convergence for each leg; it may be a result of cancellation from individual legs. It is understandable that extending an alkyl group into a hydrophobic binding pocket (Fig. 2b) makes the binding more favourable, about which TIES and experiment agree with each other directionally. It is likely that the force field and/or the sampling attribute to the differences between the calculations and the experimental measurement, but we cannot pin down the source of the apparent overestimation from the simulations.
Ligand pairs sharing the common change. 50 pairs have been identified which involve a common variation between the two compounds in the pairs, changing from a -CH group to -C-CH-(CH3)2 at the hydrophobic site. This is the same change as that between L30 to L2 (Fig. 1). Simulations for the pair get an overestimation of 1.28 kcal/mol for the binding free energy difference: -4.16 kcal/mol from TIES vs -2.88 kcal/mol from experiment. For these 50 pairs, an averaged binding free energy difference of -3.82 kcal/mol is obtained from TIES calculations, while an average of -2.42 kcal/mol from experimental data. A recent study has proposed a combination of machine learning (ML) and alchemical free energy calculation to improve the accuracy of free energy predictions36. The ML method is used to learn the differences between the calculations and the experimental data and derives a correct term which brings the two closer. Such an approach may identify and subsequently correct the discrepancies as observed here. It should be noted that the ML approach used in the study36 only learns the differences between the calculations and experiments, with no attempts to identify the source of the discrepancies. Studies have shown that a large set of typical chemical modifications lead to a nearly normal distribution of binding energy differences37. For individual modifications, however, a wide and non-Gaussian distribution has been observed for experimental binding free energies from a large number of independent measurements38. Such non-normality has also been reported in numerous ensemble simulations when large sample sizes are used20,22,32. These observations indicate that the assumption of Gaussian statistics may not be justified20. If, on average, these 50 compound pairs have the same discrepancy as that in L30 → L2 transformation, the systematic differences might be collectively “corrected”. If we apply an offset and shift up all of the TIES result by 1.28 kcal/mol, the calculations agree well with the experimental data, with an MSE reducing from − 1.40 kcal/mol to -0.12 kcal/mol (Fig. 7). This offset could be associated with incomplete sampling of the protein conformational movement that is required to accommodate the change from a -CH group to -C-CH-(CH3)2 in the region of the DFG loop. DFG loop sampling is known to require longer timescales than studied here39.