Drug-induced liver injury (DILI) remains a significant challenge for the pharmaceutical industry and regulatory agencies. Although many toxicological approaches have been developed to estimate DILI risk, their ability to predict DILI in humans remains limited. This has prompted the exploration of new approaches to improve the accuracy of DILI risk prediction for drug candidates in development. This study aimed to address this gap by leveraging a large human dataset to develop machine learning models for assessing DILI risk. The performance of the models was evaluated using 10-fold cross-validation and two external test sets. The Random Forest (RF) and Multilayer Perceptron (MLP) models were among the most effective in predicting DILI. RF outperformed the other machine learning methods in accuracy, reaching an average prediction accuracy of 63.10% in cross-validation, while the MLP achieved the highest Matthews Correlation Coefficient (MCC), 0.245. These two models were further validated externally on a set of drug candidates that failed in clinical development due to DILI. Both models correctly classified 90.9% of these drug candidates as toxic. Our study suggests that in silico machine learning approaches have the potential to significantly improve the identification of DILI liabilities associated with drug candidates in development.
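For readers less familiar with the two metrics reported above, the following minimal sketch shows how prediction accuracy and the MCC are computed from the confusion matrix of a binary classifier (1 = DILI-positive, 0 = DILI-negative). It is an illustration only, not the evaluation code used in this study: the function names and the toy labels are hypothetical and do not correspond to any data from the study.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels coded as 1 (toxic) / 0 (non-toxic)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all predictions that are correct."""
    total = tp + tn + fp + fn
    return (tp + tn) / total if total else 0.0


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient; defined here as 0.0 when the denominator vanishes."""
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return (tp * tn - fp * fn) / denom if denom else 0.0


# Hypothetical toy labels, chosen only to exercise the functions.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 1, 1]

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
acc_value = accuracy(tp, tn, fp, fn)
mcc_value = mcc(tp, tn, fp, fn)
print(f"accuracy={acc_value:.3f}, MCC={mcc_value:.3f}")
```

Unlike accuracy, the MCC uses all four cells of the confusion matrix, so it penalizes a model that scores well only by favoring the majority class. This is one reason a model can rank first on accuracy while another ranks first on MCC.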