In silico prediction of toxicity and ADMET properties
Various toxicity and ADMET-related properties of the investigated compounds were predicted in silico using the toxicity module of ADMET Predictor™ (version 9.5, Simulations Plus, Lancaster, CA, USA), which covers a broad range of toxicities, including cardiac toxicity, hepatotoxicity, endocrine toxicity, carcinogenicity, and sensitization (see Figure 1) [17][18].
Several toxicity parameters were used for the evaluation: 1. allergenic skin sensitization (TOX_SKIN), 2. allergenic respiratory sensitization (TOX_RESP), 3. reproductive toxicity (Repro_Tox), 4. cardiotoxicity (TOX_hERG_Filter), 5. cardiotoxicity IC50 [mol/L] (TOX_hERG), 6. hepatotoxicity (elevation of five liver enzymes: Ser_Alkphos, Ser_GGT, Ser_LDH, Ser_AST, Ser_ALT), 7. phospholipidosis (PLipidosis), 8. chromosome aberration (Chrom.Aberr), 9. acute toxicity in rats (Rat_Acute), 10. carcinogenicity in rats (Rat_TD50), 11. carcinogenicity in mice (Mouse_TD50), 12. estrogen (Estro_Filter) and androgen (Andro_Filter) binding, 13. maximum recommended therapeutic dose (Max_RTD). Among the 90 compounds, some ligands showed no toxicity for any of the selected parameters, whereas the remaining ligands exhibited toxicity for only a few parameters (see Table 1). Additional details for each model are presented in the Materials and Methods section.
Allergic skin and respiratory sensitization
A compound or substance that stimulates dermal allergic reactions is referred to as a skin sensitizer. The TOX_SKIN model is based on the murine local lymph node assay (LLNA), which has proven to be a successful tool for evaluating the relative potency of compounds as skin sensitizers and for assessing the associated risks (https://ntp.niehs.nih.gov/whatwestudy/niceatm/index.html). Recently, the assay has been endorsed for calculating the relative potency of skin-sensitizing chemicals. We also report the results of the TOX_RESP model, which predicts respiratory sensitization (see Table 1) [18] [19].
Reproductive Toxicity
Reproductive toxicity is an essential regulatory endpoint that also encompasses developmental toxicity. It refers to any effect that disrupts an organism’s reproductive capacity, such as adverse effects on sexual organs, sexual performance, or ease of conception, as well as any developmental toxicity experienced by the offspring. ADMET Predictor uses data from the FDA/TETRIS database, which were originally collected from the literature. The qualitative evaluation from the reproductive toxicity (TOX_REPR) model is presented in Table 1, where compounds are classified as toxic “T, red” or nontoxic “NT, green” [19].
Cardiac toxicity (Affinity towards hERG-Encoded Potassium Channels)
Cardiovascular diseases continue to be a leading cause of morbidity and mortality. Medicines can contribute significantly to the burden of cardiovascular risk factors and thus deserve special attention. The human Ether-a-go-go Related Gene (hERG) encodes potassium channels that mediate the repolarizing current in the cardiac action potential. Drug-induced blockade of the transmembrane flux of K+ ions and inhibition of channel trafficking in heart cells can lead to life-threatening ventricular arrhythmias [20] [21] [22]. The ADMET Predictor toxicity module uses two neural network models, TOX_hERG_Filter and TOX_hERG, to assess whether COVID-19 drugs may induce cardiovascular toxicity by blocking the hERG channel. The first is a classification model that determines whether a compound is expected to have an affinity for the hERG K+ channel. The results for TOX_hERG_Filter are presented in Table 1. Compounds with IC50 values below 10 μM are shown as “T, red” (toxic), while those with values greater than 10 μM are labeled “NT, green” (nontoxic). A compound predicted to be “T” will likely block the hERG channel. The second model, TOX_hERG, predicts the IC50 value of the compound in molar units and outputs it as pIC50 (-log(IC50 [M])). The model’s performance is shown in Table 1.
As discussed above, ADMET Predictor uses two ANNs, a classification model and a regression model, to evaluate the probability that a compound blocks the hERG channel. In this paper, we also used a QSAR/ML model that we recently developed to assess the possibility of hERG channel blockade by the compounds under study (see Table 1 and Table 4) [19].
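The relationship between IC50 and pIC50 and the 10 μM cutoff described above can be illustrated with a minimal Python sketch. The function names are hypothetical and are not part of ADMET Predictor; the treatment of a value of exactly 10 μM is an assumption, since the text only specifies "below" and "greater than".

```python
import math

# Cutoff quoted in the text for the hERG filter (10 micromolar).
HERG_IC50_CUTOFF_UM = 10.0


def pic50_from_ic50_molar(ic50_molar: float) -> float:
    """Convert an IC50 in mol/L to pIC50 = -log10(IC50 [M])."""
    if ic50_molar <= 0:
        raise ValueError("IC50 must be a positive concentration")
    return -math.log10(ic50_molar)


def ic50_um_from_pic50(pic50: float) -> float:
    """Convert pIC50 back to an IC50 expressed in micromolar."""
    return 10.0 ** (-pic50) * 1e6


def herg_filter_label(ic50_um: float) -> str:
    """Label a compound "T" (toxic) below the cutoff, otherwise "NT".

    Assumption: a value of exactly 10 uM is labeled "NT".
    """
    return "T" if ic50_um < HERG_IC50_CUTOFF_UM else "NT"
```

Under this convention, an IC50 of 10 μM corresponds to a pIC50 of 5, so larger pIC50 values indicate stronger predicted hERG blockade.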
Hepatotoxicity (Human Liver Adverse Effects)
For over three decades, the US FDA CDER (Center for Drug Evaluation and Research) has accumulated reports on the adverse effects of drugs on the human liver. Two databases developed from this work, the Spontaneous Reporting System (SRS) and the Adverse Event Reporting System (AERS), are used by ADMET Predictor to model the hepatotoxicity of many popular pharmaceuticals. This procedure yields five separate models that correspond to the individual liver enzymes used in hepatotoxicity diagnostics:
alkaline phosphatase (Ser_AlkPhos) increase, gamma-glutamyltransferase (Ser_GGT) increase, lactate dehydrogenase (Ser_LDH) increase, aspartate aminotransferase (Ser_AST) increase, and alanine aminotransferase (Ser_ALT) increase. Compounds classified as elevated enzyme level “EL, red” and normal enzyme level “NL, green” are shown in Table 1 [19].
Phospholipidosis
In individuals with lysosomal storage disorders, phospholipids accumulate in tissues instead of being metabolized normally by lysosomes. Lysosomes are cellular organelles carrying specific enzymes that metabolize waste materials to promote their elimination. Such metabolic disorders can be either hereditary or drug-induced; the latter manifests as phospholipidosis. Phospholipidosis is considered to have a significant impact on the nervous system: accumulated phospholipids may disrupt neuronal cell signaling, as seen in genetic diseases such as Niemann-Pick disease. In drug discovery, development may be delayed or halted when phospholipidosis is identified, as extra testing is needed to satisfy regulatory requirements. ADMET Predictor provides a classification model named TOX_PHOS, built from a data set of chemicals with known phospholipidosis profiles obtained from the literature. Electron microscopy was used to identify all non-inducers and some inducers, while the presence of foamy macrophages or vacuolation was used to identify the remaining inducers. In Table 1, non-inducers are labeled as nontoxic “NT, green”, whereas inducers are labeled as toxic “T, red” [19].
Chromosomal Aberrations
An ANN ensemble model named TOX_CABR, provided by ADMET Predictor, is used to assess the genotoxic potential of chemicals and drugs. The model was trained on a data set of observed chromosomal aberration (CA) results with a well-balanced distribution of toxic “T” and nontoxic “NT” compounds. Compounds classified as toxic “T, red” and nontoxic “NT, green” are shown in Table 1 [19].
Acute Rat Toxicity
The acute rat toxicity model, referred to as TOX_RAT, is built on the LD50, the amount of an orally administered chemical substance (in mg per kg body weight) that results in the death of half of the rats in a given study. Building a QSAR model on such a diverse data set poses a major challenge. ADMET Predictor uses data from the following resources: the Registry of Toxic Effects of Chemical Substances (RTECS) data set (the version associated with the CDC’s NIOSH) and the ChemIDplus database. Predicted LD50 values (mg/kg) for the compounds are shown in Supplementary Table 1. According to the risk criteria in the Risks section, a compound with TOX_RAT < 300 (acute toxicity in rats, code ra) is considered high risk. Supplementary Table 1 is color-coded such that the most dangerous drugs are shown in red and safe drugs in green [19].
Endocrine Toxicity
Drug compounds can compete with sex hormones for binding to the estrogen and/or androgen receptors, which can disrupt endocrine signaling, for example by blocking normal hormonal signals, and cause toxicity. Androgens, for instance, play a significant role in developing and maintaining the male phenotype, as well as in the pathology and treatment of prostate cancer.
ADMET Predictor uses two models to predict endocrine toxicity, qualitatively assessing estrogen receptor toxicity in rats (TOX_ER_Filter) and androgen receptor toxicity in rats (TOX_AR_Filter). The qualitative estimates are shown in Table 1 as NT ‘Nontoxic’ and T ‘Toxic’ [19].
Maximum Recommended Therapeutic Dose
The US FDA’s CDER has compiled a maximum recommended therapeutic dose (MRTD) database to shed light on the relationship between the structure, toxicity, and no-effect level (NOEL) of chemicals in humans and to assess their health-related effects. ADMET Predictor uses ANN ensemble models to predict the MRTD of compounds in units of mg/kg body weight/day. A prediction higher than 3.16 mg/kg-BW/day indicates an “inactive” compound (color-coded green) that is unlikely to cause side effects, whereas predictions below 3.16 mg/kg-BW/day are color-coded red and indicate significant potential for side effects. The results for MRTD are presented in Table 1.
Chronic Carcinogenicity and Mutagenicity
ADMET Predictor uses the Carcinogenic Potency Database (CPDB), made available through the EPA’s DSSTox program, to develop two quantitative chronic carcinogenicity models: Rat_TD50 and Mouse_TD50. Rat_TD50 predicts the TD50 value of a selected compound in rats. The TD50 is the dose of a substance, given orally to rats throughout their lifetimes, that results in half of the population developing tumors. Likewise, Mouse_TD50 predicts the TD50 value in mice. Both models predict TD50 values in units of mg/kg/day. According to the risk criteria in section 2.1.12.1, compounds with Mouse_TD50 < 25 (carcinogenicity in chronic mouse studies, Xm) or Rat_TD50 < 4 (carcinogenicity in chronic rat studies, Xr) are considered high risk. Supplementary Table 1 is color-coded such that the most dangerous drugs are shown in red and safe drugs in green.
The outcomes of the 10 models estimating Ames mutagenicity in five different strains of Salmonella, with or without metabolic activation (m-labeled), are summarized in Table S1. Developed by Ames et al. using strains of Salmonella typhimurium as a time- and cost-effective alternative to testing in rodents, the Ames test measures the mutagenic potential of chemical compounds. The 10 ANN ensembles named TOX_MUT* are qualitative models that predict chemical compounds to be either “+” (i.e., mutagenic) or “-” (non-mutagenic). The ADMET Risk rule MUT_Risk, described in the Risks section, predicts overall mutagenicity by summing the instances of “+”. The results related to mutagenicity are presented in Supplementary Table 2 [19].
Risks
Mutagenicity Risk
ADMET Predictor summarizes the output of the mutagenicity models using the ADMET Risk and ADMET Code for mutagenicity in S. typhimurium (MUT_Risk and MUT_Code), which represent the results of “virtual Ames testing.” There are ten TOX_MUT models, each of which contributes to the assessment of the mutagenicity anticipated for five strains of Salmonella typhimurium with and without microsomal activation (e.g., TOX_MUT_102 and TOX_MUT_m102). The risk codes and criteria for mutagenicity are listed below:
S1: (TOX_MUT_97+1537 = “+”)
m1: (TOX_MUT_m97+1537 = “+” AND NOT TOX_MUT_97+1537 = “+”)
S2: (TOX_MUT_98 = “+”)
m2: (TOX_MUT_m98 = “+” AND NOT TOX_MUT_98 = “+”)
S3: (TOX_MUT_100 = “+”)
m3: (TOX_MUT_m100 = “+” AND NOT TOX_MUT_100 = “+”)
S4: (TOX_MUT_102+wp2 = “+”)
m4: (TOX_MUT_m102+wp2 = “+” AND NOT TOX_MUT_102+wp2 = “+”)
S5: (TOX_MUT_1535 = “+”)
m5: (TOX_MUT_m1535 = “+” AND NOT TOX_MUT_1535 = “+”)
SU: (TOX_MUT_97+1537 = Undecided OR TOX_MUT_98 = Undecided OR TOX_MUT_100 = Undecided OR TOX_MUT_102+wp2 = Undecided OR TOX_MUT_1535 = Undecided) (weight = 0.5)
mU: (TOX_MUT_m97+1537 = Undecided OR TOX_MUT_m98 = Undecided OR TOX_MUT_m100 = Undecided OR TOX_MUT_m102+wp2 = Undecided OR TOX_MUT_m1535 = Undecided) (weight = 0.5)
The MUT_Risk rule codes for mutagenicity from Table 3 are as follows: risk of positive Ames test results with (m*) or without (S*) microsomal activation for Salmonella typhimurium strains, where * = 1 to 5 denotes TA97 or TA1537; TA98; TA100; TA102 or the WP2 uvrA strain of E. coli; and TA1535, respectively. NIHS panel predictions are not separated with respect to S9 activation or lack thereof.
The results related to the mutagenicity risk are presented in Table 2. We highlighted all compounds with a MUT_Risk of 2 or higher in red.
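The rule set above can be sketched as a small scoring function. This is only an illustration under stated assumptions, not the ADMET Predictor implementation: the function name and the dictionary-based input are hypothetical, the S* and m* rules are assumed to carry a weight of 1 (the text gives a weight only for SU and mU), and each of SU and mU is assumed to contribute 0.5 once, regardless of how many strains are Undecided.

```python
# Pairs of (without activation, with microsomal activation) model names,
# in the order of rule codes 1..5 listed in the text.
STRAIN_PAIRS = [
    ("TOX_MUT_97+1537", "TOX_MUT_m97+1537"),
    ("TOX_MUT_98", "TOX_MUT_m98"),
    ("TOX_MUT_100", "TOX_MUT_m100"),
    ("TOX_MUT_102+wp2", "TOX_MUT_m102+wp2"),
    ("TOX_MUT_1535", "TOX_MUT_m1535"),
]


def mut_risk(preds: dict):
    """Return (score, codes) from predictions "+", "-", or "Undecided"."""
    score = 0.0
    codes = []
    for i, (plain, activated) in enumerate(STRAIN_PAIRS, start=1):
        if preds[plain] == "+":
            # S rule: positive without activation (assumed weight 1).
            score += 1.0
            codes.append(f"S{i}")
        elif preds[activated] == "+":
            # m rule: positive only with activation (assumed weight 1).
            score += 1.0
            codes.append(f"m{i}")
    # Undecided rules, weight 0.5 each as stated in the text.
    if any(preds[plain] == "Undecided" for plain, _ in STRAIN_PAIRS):
        score += 0.5
        codes.append("SU")
    if any(preds[act] == "Undecided" for _, act in STRAIN_PAIRS):
        score += 0.5
        codes.append("mU")
    return score, codes
```

Because each m rule requires that the corresponding S rule has not fired, a strain pair contributes at most one point in this sketch.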
Toxicity Risk
The ADMET Risk and ADMET Code for toxic liability are TOX_Risk and TOX_Code, respectively. The TOX_Risk model includes seven rules, each with an associated weight of one. The risk codes and criteria for potential hERG liability, acute toxicity in rats, carcinogenicity in chronic rat studies, carcinogenicity in chronic mouse studies, hepatotoxicity, SGOT and SGPT elevation, and mutagenicity are as follows, respectively:
hERG = (TOX_hERG > 6)
ra = (TOX_RAT < 300)
Xr = (Rat_TD50 < 4)
Xm = (Mouse_TD50 < 25)
Hp = ((TOX_AlkPhos = Toxic OR TOX_GGT = Toxic OR TOX_LDH = Toxic) AND (TOX_SGOT = Toxic OR TOX_SGPT = Toxic))
SG = (TOX_SGOT = Toxic AND TOX_SGPT = Toxic)
Mu = (TOX_MUT_Risk > 2)
The possible value range for TOX_MUT_Risk is 0-11 and it is 0-7 for TOX_Risk.
The results related to the toxicity risk are presented in Table 2. We highlighted all compounds with a TOX_Risk of 2 or higher in red.
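As an illustration, the seven TOX_Risk rules can be evaluated for a single compound as follows. This is a minimal sketch that assumes hard thresholds exactly as written above and a dictionary of predicted values keyed by the model names used in the rules; the function name is hypothetical, and the actual software may combine the rules differently.

```python
def tox_risk(p: dict):
    """Return (TOX_Risk, TOX_Code list) using the seven rules from the text.

    Each rule has weight 1, so the score is the number of rules that fire.
    """
    other_liver = any(
        p[name] == "Toxic" for name in ("TOX_AlkPhos", "TOX_GGT", "TOX_LDH")
    )
    sgot = p["TOX_SGOT"] == "Toxic"
    sgpt = p["TOX_SGPT"] == "Toxic"
    rules = {
        "hERG": p["TOX_hERG"] > 6,       # pIC50 above 6
        "ra": p["TOX_RAT"] < 300,        # rat LD50, mg/kg
        "Xr": p["Rat_TD50"] < 4,         # mg/kg/day
        "Xm": p["Mouse_TD50"] < 25,      # mg/kg/day
        "Hp": other_liver and (sgot or sgpt),
        "SG": sgot and sgpt,
        "Mu": p["TOX_MUT_Risk"] > 2,
    }
    codes = [code for code, fired in rules.items() if fired]
    return len(codes), codes
```

For example, a compound with a predicted pIC50 of 6.5, a rat LD50 of 250 mg/kg, and elevated AlkPhos and SGOT (but normal SGPT) would trigger the hERG, ra, and Hp rules in this sketch.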
Metabolism Risk
The CYP_Risk model in the Metabolism module of ADMET Predictor comprises seven rules, each with a weight of 1. “Substr” indicates that the compound is expected to be a substrate of a given isoenzyme. “CLint” is the intrinsic clearance constant for that isoenzyme. Ki_Mid and Ki_tes are the inhibition constants for midazolam and testosterone, respectively (see List of Abbreviations and Table 3). The codes and criteria for metabolism risk (excessive clearance by CYP 1A2, 2C19, 2C9, 2D6, or 3A4, as well as Ki_Mid and Ki_tes) are listed below.
1A2 = (CYP_1A2_Substr = Yes AND MET_1A2_CLint > 30)
2C19 = (CYP_2C19_Substr = Yes AND MET_2C19_CLint > 30)
2C9 = (CYP_2C9_Substr = Yes AND MET_2C9_CLint > 30)
2D6 = (CYP_2D6_Substr = Yes AND MET_2D6_CLint > 30)
3A4 = (CYP_3A4_Substr = Yes AND MET_3A4_CLint > 30)
Mi = (MET_3A4_Ki_Mid < 1.5 AND (MET_3A4_I_mid = Yes OR MET_3A4_Inh = Yes))
Ti = (MET_3A4_Ki_tes < 1.0 AND (MET_3A4_I_tes = Yes OR MET_3A4_Inh = Yes))
CYP_Risk is 2 or greater for a little over 10% of the compounds in the focused World Drug Index (WDI). We highlighted all compounds with a CYP_Risk of 2 or higher in red.
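The seven CYP_Risk rules can be sketched in the same way. The code below is an assumption-laden illustration rather than the software's implementation: the function name is hypothetical, thresholds are applied as hard cutoffs, and the input is a dictionary keyed by the model names that appear in the rules.

```python
ISOFORMS = ("1A2", "2C19", "2C9", "2D6", "3A4")


def cyp_risk(p: dict):
    """Return (CYP_Risk, codes); each rule that fires adds a weight of 1."""
    codes = []
    # Clearance rules: predicted substrate AND intrinsic clearance > 30.
    for iso in ISOFORMS:
        if p[f"CYP_{iso}_Substr"] == "Yes" and p[f"MET_{iso}_CLint"] > 30:
            codes.append(iso)
    # 3A4 inhibition rules for midazolam (Mi) and testosterone (Ti).
    general_inhibitor = p["MET_3A4_Inh"] == "Yes"
    if p["MET_3A4_Ki_Mid"] < 1.5 and (
        p["MET_3A4_I_mid"] == "Yes" or general_inhibitor
    ):
        codes.append("Mi")
    if p["MET_3A4_Ki_tes"] < 1.0 and (
        p["MET_3A4_I_tes"] == "Yes" or general_inhibitor
    ):
        codes.append("Ti")
    return len(codes), codes
```

A compound would be highlighted in red under the convention above when the returned score is 2 or higher.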
ADMET global Risk
Finally, ADMET Predictor summarizes the main outcomes and generates a global classification (ADMET_Risk). The global ADMET_Risk combines the rules from S+Absn_Risk, CYP_Risk, and TOX_Risk with two additional rules for low fraction unbound in plasma and high steady-state volume of distribution. The codes and criteria for the additional rules are as follows, respectively.
fu = (S+PrUnbnd < 3.5%)
Vd = (S+Vd > 5.5)
In total, 24 different rules contribute to the ADMET_Risk model. The full ADMET Risk rule codes are given in the abbreviations section (see Table 2). We highlighted all compounds with an ADMET_Risk of 3 or higher in red.
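One way to picture the global score is as a sum of the component risks plus the two additional rules. The sketch below assumes simple additive combination and a weight of 1 for the fu and Vd rules; neither assumption is stated in the text, and the function and parameter names are hypothetical.

```python
def admet_risk(absn_risk, cyp_risk, tox_risk, pr_unbnd_percent, vd):
    """Combine component risk scores with the fu and Vd rules.

    Assumptions (not confirmed by the text): the components are simply
    added, and the fu and Vd rules each carry a weight of 1.
    """
    fu_rule = pr_unbnd_percent < 3.5   # fu: S+PrUnbnd < 3.5%
    vd_rule = vd > 5.5                 # Vd: S+Vd > 5.5
    return absn_risk + cyp_risk + tox_risk + int(fu_rule) + int(vd_rule)
```

With seven CYP_Risk rules, seven TOX_Risk rules, and these two additional rules, the stated total of 24 implies that the remaining eight rules come from S+Absn_Risk.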
In silico study of cytochrome P450 enzymes (CYPs) to understand drug-drug interactions (DDI)
In silico tools are broadly used to predict the substrates and inhibitors of metabolic enzymes and the sites in a molecule where the metabolic reaction occurs. These predictions facilitate the multidimensional drug discovery process, helping to improve metabolic stability, extend in vivo half‐life, and avoid toxic metabolites. The most important enzymes in Phase I metabolism belong to the cytochrome P450s (CYPs), since they produce most first-generation metabolites and a high proportion of toxic/reactive metabolites. They are a family of heme‐containing enzymes, of which at least 57 isoforms have been identified in humans. Changes in CYP enzyme activity can influence the metabolism and clearance of drugs; therefore, inhibition of cytochrome P450 is the most prominent cause of drug toxicities. ADMET Predictor employs different models, such as substrate classification, site of metabolism (SoM), and kinetic predictions, for different CYP isoforms to predict the metabolites that are most likely to occur. Finally, ADMET Predictor can estimate the contribution each isoform will make to CYP metabolism in vivo.
The uridine 5’-diphosphate-glucuronosyltransferase (UGT) enzymes are distributed across various organs in the human body and are abundantly expressed in the liver, the central metabolic organ. UGT enzymes catalyze glucuronidation, the primary Phase II metabolic pathway, which facilitates the clearance of xenobiotics. In humans, UGT enzymes are expressed predominantly in the liver, except for UGT1A8 and UGT1A10, which are expressed in the gastrointestinal tract. ADMET Predictor built nine classification QSAR models from literature data for the following UGT isozymes responsible for Phase II drug metabolism: UGT1A1, UGT1A3, UGT1A4, UGT1A6, UGT1A8, UGT1A9, UGT1A10, UGT2B7, and UGT2B15. These models predict whether a compound will be metabolized by one or more of these enzymes. The probability of metabolism by human UGT enzymes is summarized in Supplementary Table 3 [23].
Qualitative and Quantitative Prediction of Drug Blockade of hERG1 channel based on QSAR machine learning (ML) model
The cardiotoxicity potential of the compounds listed in Table 4 was assessed using our recently reported machine learning algorithm for predicting drug-induced blockade of the hERG channel [24]. This model is based on the eXtreme gradient boosting (XGBoost) algorithm [25]. Briefly, molecular and pharmacophoric descriptors (float, integer, and binary values) were generated from each compound’s SMILES string using the RDKit open-source toolkit for cheminformatics [26]. These descriptors were used as input to predict the inhibitory potency of each compound (pIC50 of inhibition). The model also provides metrics to assess compliance with its applicability domain (AD = True/False) in terms of the Minimum Distance to Training set (MDT), which is based on the Tanimoto similarity to the compounds in the model’s training set. pIC50 has its usual meaning: a large positive number corresponds to a high-affinity blocker. Compounds with a predicted pIC50 of 5.5 or higher are considered potentially dangerous.
Since this is an ML model, we also report the applicability domain, as measured by the similarity matrix and the distance to the training set. This matters because many compounds have unique scaffolds that are not represented in the training set of the model. If the AD (applicability domain) is “False”, the confidence in the pIC50 prediction is low. This is a well-known issue for any QSAR/ML model facing unknown molecular scaffolds. We currently use receptor map models (SILCS) to obtain somewhat more realistic estimates.
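The idea behind the applicability domain check can be sketched with plain Python sets standing in for binary fingerprints. This is not the published model's implementation: the fingerprint representation, the function names, and the distance cutoff of 0.5 are all hypothetical choices made for illustration, and the actual descriptors are generated with RDKit.

```python
def tanimoto(a: set, b: set) -> float:
    """Tanimoto similarity between two sets of on-bit indices."""
    if not a and not b:
        return 0.0
    shared = len(a & b)
    return shared / (len(a) + len(b) - shared)


def min_distance_to_training(query: set, training: list) -> float:
    """Minimum Tanimoto distance (1 - similarity) to any training compound."""
    return min(1.0 - tanimoto(query, fp) for fp in training)


def in_applicability_domain(query: set, training: list,
                            max_distance: float = 0.5) -> bool:
    """AD flag: True if the nearest training compound is close enough.

    The 0.5 cutoff is a placeholder, not the value used by the model.
    """
    return min_distance_to_training(query, training) <= max_distance
```

A query compound whose nearest training neighbor shares few fingerprint bits receives a large distance and an AD flag of False, signaling that its predicted pIC50 should be treated with low confidence.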