The thermodynamic context
When two immiscible or partially immiscible solid compounds mix under isobaric conditions, the melting points will generally decrease, as characterized by the Eq. 8,9,18,19:
$$ln\left({x}_{i}{\gamma }_{i}\right)=\frac{{Δ }_{fus}{H}_{i}}{R}\left(\frac{1}{{T}_{m,i}}-\frac{1}{{T}_{i}}\right) \text{(1)}$$
In this equation, \({x}_{i}\), \({\gamma }_{i}\), \({Δ }_{fus}{H}_{i}\), \({T}_{m,i}\), and \({T}_{i}\) represent the mole fraction, activity coefficient, fusion enthalpy (J mol–1), pure compound melting point (K), and melting point in the mixture (K) of component \(i\), respectively. \(R\) is the ideal gas constant, given as 8.3145 J mol− 1 K− 1. Calculating for \({T}_{i}\) across a range of values for \({x}_{i}\) from 0 to 1 produces a melting curve for each mixture component. In a binary mixture, the eutectic point emerges where the melting curves of the two constituents intersect, with the x-axis and y-axis delineating the associated eutectic mole fraction \({x}_{E}\) and eutectic temperature \({T}_{E}\), respectively (refer to Fig. S1). Conventionally, if intercomponent interactions match the intracomponent interactions, the mixture exhibits ideal behavior, allowing \({\gamma }_{i}\) to be set at 1. However, real eutectic mixtures can display either a positive (\({\gamma }_{i}\) >1) or negative (\({\gamma }_{i}\) <1) deviation from this ideality. The latter characteristic defines a DES, as proposed by Martins et al9. Accordingly, SLE phase diagrams for both ideal and real mixtures can be derived by inputting the parameters \({T}_{m,i}\), \({Δ }_{fus}{H}_{i}\), and \({\gamma }_{i}\) in Eq. (1). It should be noted that Eq. (1) disregards the value of the molar heat capacity, \({Δ }_{m}{C}_{i}\) because its impact is minor compared with that of the other parameters (see equation S1).
Prediction of melting properties
One of the main challenges in employing ML techniques for DES research is the perceived scarcity of extensive datasets, especially those relating to SLE phase diagrams. This perception largely arises from an approach that views DESs as a novel class of compounds, rather than as mixtures. When building ML models tailored to DESs, researchers have often depended on training data sourced solely from previous DES-specific studies, which are relatively limited. In contrast, the melting curves in SLE phase diagrams can be charted individually for each component. Thus, the associated melting properties, \({T}_{m,i}\) and \({Δ }_{fus}{H}_{i}\), can be predicted using ML models trained on datasets of pure compounds, which are abundantly available in the literature20,21.
Figure 1a–c shows parity plots of the training and test data of three ML models developed in this study: random forest (RF); extreme gradient boosting (XGB); and multilayer perceptron (MLP), to predict the \({T}_{m,i}\) parameter. All these models demonstrated a robust learning ability using the prescribed melting point dataset, as evidenced by the low root mean square error (RMSE) and high coefficient of determination (R2) values. For the training data, XGB recorded the smallest RMSE value (0.71 K), followed by RF (14.67 K) and MLP (36.16 K). A similar trend emerged for the R2 values: XGB (1.0) > RF (0.98) > MLP (0.85). When evaluating the test data, all the models consistently gave low RMSE (≈ 40 K) values and high R2 (≈ 0.83) figures, indicating their good predictive ability. Cross-validation analysis further substantiated the reliability of the RF and XGB models, both returning RMSE and R2 scores of 40 K and 0.82, respectively (Fig. S2a,b). For the MLP model, a sharp decline in the loss function throughout the learning history suggested a swift adaptation to data patterns (Fig. S2c). These superior evaluation scores across the board indicated the aptness of the selected dataset, molecular descriptor, and model design, supporting their validity for use in predicting the melting points of novel or uncharted compounds.
Figure 1e–g shows parity plots of ML models designed to predict the \({Δ }_{fus}{H}_{i}\) parameter. In training data performance, XGB had the best performance with RMSE = 0.35 kJ mol− 1 and R2 = 1.0, then RF (RMSE = 1.92 kJ mol− 1, R2 = 0.98) and MLP (RMSE = 4.55 kJ mol− 1, R2 = 0.89). However, this order shifted when evaluating the test data: RF (RMSE = 4.96 kJ mol− 1, R2 = 0.84); followed closely by MLP (RMSE = 5.40 kJ mol− 1, R2 = 0.81); and XGB (RMSE = 5.71 kJ mol− 1, R2 = 0.79). Notably, the performance of RF and XGB was similar based on cross-validation analysis, while MLP demonstrated a rapid data learning ability (Fig. S2e–g). Given the outstanding results across all the models, there exists the potential to confidently predict the fusion enthalpy of novel or as-yet-unstudied compounds.
Traditionally, mixtures comprising a hydrogen-bond acceptor (HBA) and a hydrogen-bond donor (HBD) have been considered to be likely candidates for forming a DES. Thus, we identified 60 potential HBAs and 50 potential HBDs. The names and simplified molecular-input line-entry system (SMILES) representations of these compounds can be found in Table S1. Our selection of HBAs contained both strong (phosphine oxide, sulfinyl, and urea) and weaker (thiourea) HBA groups, with varied alkyl chain lengths, for a systematic study. The HBDs were largely natural organic compounds, including amino acids, sugars, and fatty acids. Merging these 60 HBAs with the 50 HBDs yielded a total of 3,000 possible mixtures. It was anticipated that most of these combinations would result in the formation of type V DES, i.e., DESs obtained from the mixtures of non-ionic compounds22–24.
Figure 1g,h shows the predictions for the \({T}_{m,i}\) and \({Δ }_{fus}{H}_{i}\) values using RF, XGB, and MLP models. A discernible trend in the melting point and fusion enthalpy values was evident for the HBAs (compounds 1–60). This trend was aligned with the length of the alkyl chains in each group, specifically, longer alkyl chains corresponded to higher \({T}_{m,i}\) and \({Δ }_{fus}{H}_{i}\) values. This observation was in accordance with basic chemistry principles, indicating the ML models were likely to predict the real values. In contrast, for the HBDs (compounds 61–110), a clear trend was not observed, likely because the chosen HBD structures were not systematically selected. Moreover, the predicted \({T}_{m,i}\) and \({Δ }_{fus}{H}_{i}\) values from each of the RF, XGB, and MLP models largely coincided, as evidenced by the frequent overlaps in the scatterplots. This convergence further bolstered our confidence in the ML predictions. Because of the consistent performance of the model across different datasets and evaluative methods, we opted to use the RF model to derive the \({T}_{m,i}\) and \({Δ }_{fus}{H}_{i}\) values.
Calculation of activity coefficients
To quantify deviations from thermodynamic ideality, QC methodologies such as density functional theory (DFT) and the COSMO-RS model are valuable tools as these methods have been shown to predict \({\gamma }_{i}\) values with good accuracy25–29. In the present study, we used ORCA software for DFT calculations and the OpenCOSMO-RS package for COSMO-RS calculations. The OpenCOSMO-RS, introduced by Gerlach and his team, offers an open-source variant of the COSMO-RS model, with the codebase available in both Python and C + + languages30.
Figure 2 shows the calculated \(\text{ln}{\gamma }_{i}\) values of HBAs and HBDs displayed in panels (b) and (d), respectively, in the mixtures. An illustration of the calculation process is displayed in panels (a) and (c). As anticipated from the nature of real mixtures, the calculated \(\text{ln}{\gamma }_{i}\) values tended to be closer to 0 as the \({x}_{i}\) values approached 1. Negative\(\text{ln}{\gamma }_{i}\) values indicate a preference for interactions between HBAs and HBDs in a mixture. The calculated \(\text{ln}{\gamma }_{i}\) values were predominantly negative, suggesting a high likelihood of DES formation from these combinations. However, a sizeable number of the \(\text{ln}{\gamma }_{i}\) values were positive, indicating that certain mixtures exhibited positive deviations from ideality. It is interesting to note that while the presence of H-bonds was anticipated across all the HBA–HBD combinations, such bonds do not necessarily guarantee negative deviations from ideality. This observation highlights that the correlation between HBA–HBD H-bond formation and the activity coefficients in a mixture is not straightforward. Consequently, relying solely on H-bond metrics as an indicator for DES formation might be misleading. To truly understand the intricacies of DES formation, SLE phase diagrams need to be analyzed across a vast chemical spectrum.
Estimation of SLE phase diagrams
Having determined the parameters \({T}_{m,i}\), \({Δ }_{fus}{H}_{i}\), and \({\gamma }_{i}\) from ML predictions and QC calculations, we estimated SLE phase diagrams for the proposed mixtures. To generate these diagrams, the melting curves of HBAs and HBDs were determined individually by solving for \({T}_{i}\) in Eq. (1) at \({x}_{i}\) = 0.1 to \({x}_{i}\) = 1.0, with increments of 0.1. These curves were then integrated into a single graph, with \({x}_{HBA}\) defining the primary x-axis and \({T}_{i}\) designated as the y-axis. The curve estimations were carried out for both the ideal mixture model (\({\gamma }_{i}\) = 1) and real mixture model (\({\gamma }_{i}\) ≠ 1), resulting in the collection of 6,000 graphical images and 2 animated videos (Video 1: ideal, Video 2: real) of SLE phase diagrams. The detailed results can be accessed from an external data repository as described in the Methods section. Essential analyses of the resulting SLE phase diagrams are given in Fig. 3.
Figure 3a shows the mole fraction of HBA corresponding to the eutectic mole fraction (\({x}_{E}\)) for each mixture. A considerable percentage of the \({x}_{E}\) values derived from the ideal mixture model appeared at high HBA fractions (\({x}_{HBA}\) > 0.8). The real mixture model showed a broader range of \({x}_{E}\) values (\({x}_{HBA}\) > 0.3), yet most observations were still clustered around high HBA fractions. This pattern underscored the fact that eutectic mixtures typically occur when the molar fractions of the HBAs are comparable to, or surpass, those of the HBDs. This data trend might also explain why numerous reports have claimed successful DES formation at HBD: HBA molar ratios of 1:1, 1:2, and 1:45,31,32. Nevertheless, designing DESs at fixed molar ratios, as is typically the case in IL synthesis, cannot be justified. Such a trend for \({x}_{E}\) values concentrated in a specific mole fraction was not observed in the investigated chemical space. This finding is in line with a previous report debunking “magic compositions” in DESs using ab initio MD simulations33. We recommend that DES compositions are consistently described by the molar fractions (\({x}_{HBA}\) or \({x}_{HBD}\)), as is common for mixtures, instead of by the molar ratios (\({mol}_{HBD}\): \({mol}_{HBA}\)) to avoid any ambiguities with pure compounds, such as ILs34.
Figure 3b shows the distribution of eutectic temperatures (\({T}_{E}\)) for the investigated mixtures. Interestingly, the data trend appeared to resemble the patterns of the predicted \({T}_{m,i}\) and \({Δ }_{fus}{H}_{i}\) values for the HBAs (Fig. 1g,h, compounds 1–60). This observation suggested that the magnitude of the \({T}_{E}\) value was greatly affected by the melting properties of the HBAs, which was strongly correlated with the alkyl chain lengths. In the real mixture model, numerous \({T}_{E}\) values were found to be lower than those in the ideal mixture model, except for the mixtures 2,300–3,000. This observation was aligned with the patterns of the \(\text{ln}{\gamma }_{i}\) values (Fig. 2b,d), indicating a favorable interaction between the HBAs and HBDs in most of the mixtures. Further analyses of the correlations between the \({T}_{m,i}\), \({Δ }_{fus}{H}_{i}\), and \(\text{ln}{\gamma }_{i}\) values and the \({T}_{E}\) values is shown in Fig. S3, which indicated the HBA melting properties had a considerable effect on the resulting eutectic temperatures.
Figure 3c shows the coordinates of the eutectic points (\({x}_{E}\), \({T}_{E}\)) of the investigated mixtures. The majority of the eutectic points of the real mixture model were positioned lower than their counterparts from the ideal mixture model. Hence, several mixtures within the investigated chemical space had negative deviations from the thermodynamic ideality and thus can be classified as DESs. It should be noted, however, that many eutectic points of the ideal mixture model were located below 298 K, i.e., a liquid at room temperature. Therefore, assessing DES formation by merely observing solid-to-liquid transformations would result in the misclassification of ideal eutectic mixtures, or even regular solutions, as DESs, as has been the case in previous studies12,13. Basic information from the eutectic point data is summarized in Table 1.
Table 1
Basic information from the eutectic point data.
Description | Ideal | Real |
(\({x}_{E}\), \({T}_{E}\)) count | 2444 | 2579 |
0.1 < \({x}_{E}\) < 0.9 | 1473 | 2239 |
\({T}_{E}\) < 298 K | 902 | 1594 |
\({T}_{E}\) max (K) | 344 | 485 |
\({T}_{E}\) min (K) | 232 | 153 |
Figure 3d, e, and f show examples of SLE phase diagrams with melting curves that behave ideally, exhibit positive deviations, and show negative deviations, respectively. Specifically, Fig. 3d displays the SLE phase diagram for Mixture 2996, a combination of 1-hexyl-1,3,3-trimethylthiourea and stearic acid. The melting curves of the real mixture were aligned with those of the ideal model, suggesting that the interactions between 1-hexyl-1,3,3-trimethylthiourea and stearic acid were the same strength as the interactions between the individual compounds. Figure 3e shows the SLE phase diagram for Mixture 2650, comprising 1,3-diheptyl-1,3-dimethylthiourea and glycine. The melting curves showed a positive deviation from ideality, indicating less favorable interactions between the two components than between the individual compounds. Figure 3f shows the SLE phase diagram for Mixture 171, formed by tributylphosphine oxide and erythrose. The melting curves here display a negative deviation, signifying a strong affinity between the two constituents. This particular diagram (Fig. 3f) typifies the SLE phase behavior of a DES. Key details for these mixtures are summarized in Table 2. Additional examples of SLE phase diagrams with varying melting-curve behaviors are shown in Fig. S4 and Table S2.
Table 2
Key information on the investigated mixtures.
Description | Mixture 2996 | Mixture 2650 | Mixture 171 |
HBA | 1-hexyl-1,3,3-trimethylthiourea | 1,3-diheptyl-1,3-dimethylthiourea | tributylphosphine oxide |
SMILES | S = C(N(CCCCCC)C)N(C)C | S = C(N(CCCCCCC)C)N(C)CCCCCCC | O = P(CCCC)(CCCC)CCCC |
HBD | stearic acid | glycine | erythrose |
SMILES | CCCCCCCCCCCCCCCCCC(O) = O | NCC(O) = O | O = C[C@@H]([C@@H](CO)O)O |
Behavior | Ideal | Positive | Negative |
\({x}_{E}\) ideal | 0.88 | 0.60 | 0.69 |
\({x}_{E}\) real | 0.86 | 0.98 | 0.44 |
\({T}_{E}\) ideal (K) | 306 | 319 | 315 |
\({T}_{E}\) real (K) | 306 | 328 | 253 |
Simulation of DES interactions
MD simulations were carried out for Mixture 2996, Mixture 2650, and Mixture 171 at the respective \({x}_{E}\) and \({T}_{E}\) values. The preliminary stages encompassing energy minimization and system equilibration are shown in Fig. S5. Molecular motions during the simulation can be seen in Video 3 (Mixture 2996), Video 4 (Mixture 2650), and Video 5 (Mixture 171), available on an external data repository as described in the Methods section.
In the investigated mixtures, H-bonds associated with HBA–HBA interactions were not present because of the absence of hydrogen-donor sites. In Mixture 2996, only approximately 60 H-bonds associated with HBD–HBD interactions could be observed at the beginning of the simulation and this number increased to almost 120 over 10 ns (Fig. 4a). This observation suggested that while the interaction between the HBD (stearic acid) molecules in Mixture 2996 was somewhat favorable, it remained fairly modest. The number of H-bonds associated with HBA–HBD interactions was very low (< 3), indicating there was a poor interaction between 1-hexyl-1,3,3-trimethylthiourea and stearic acid molecules in the mixture. The radial distribution function (RDF) analysis showed a relatively weak interaction between the HBA molecules and a modest interaction between the HBD molecules, as indicated by the peaks at 0.11 nm (Fig. 4b). In Mixture 2996, even though the strength of the HBA–HBD interactions was weak, the HBA–HBA and HBD–HBD interactions were not particularly strong either. Therefore, the ensemble of these weak interactions allowed Mixture 2996 to behave as an ideal mixture (Fig. 3d).
In Mixture 2650, a considerable amount of H-bonds (approximately 500) associated with HBD–HBD interactions could be observed at the beginning of the simulation (Fig. 4c). Within 10 ns, the number of hydrogen bonds had increased to approximately 600. This increase in H-bonds suggested the formation of HBD regions in the mixture, indicating favorable HBD–HBD interactions and less favorable HBA–HBD interactions. Moreover, there were hardly any H-bonds associated with HBA–HBD interactions. The RDF analysis showed a relatively weak interaction between HBA molecules, as indicated by the small peak at 0.11 nm (Fig. 4d). In contrast, the interactions between the HBD molecules were quite strong as indicated by the three large peaks at 0.09–0.11 nm. The strength of the HBA–HBD interactions was substantially lower than that of the HBD–HBD interactions, and thus Mixture 2650 showed a positive deviation from the thermodynamic ideality (Fig. 3e).
Mixture 171 exhibited a markedly high number of H-bonds associated with both HBD–HBD (approximately 800) and HBA–HBD (approximately 600) interactions. The number of H-bonds associated with HBA–HBD interactions appeared to increase over the course of 10 ns, indicating a favorable interaction between the HBA and HBD molecules. The P = O moiety of the tributylphosphine oxide molecule is a good acceptor for H-bonds with the O–H moieties of the erythrose molecule. The RDF analysis showed a few small peaks associated with HBA–HBA and HBD–HBD interactions at 0.09–0.11 nm, indicating that these interactions were not dominant. Although the initial HBD–HBD interactions produced a considerable amount of H-bonds, the ensemble HBA–HBD interactions were favorable and thus led Mixture 171 to undergo a negative deviation from the thermodynamic ideality (Fig. 3f).