3.1. QSAR analysis
3.1.1. MLR modeling
First, various descriptors were used in stepwise multiple linear regression (MLR) analyses, then the MLR equation was obtained from the calculated descriptors. As shown in Table 2, the obtained model provided an acceptable predictive performance as confirmed by statistical parameters such as R2 value of 0.73, R2p value of 0.67 for the test set, standard error of regression (SE) value of 0.59, variance ratio (F) value of 14.7 at specified degrees of freedom, Q2 value of 0.67, Cvcv value of 10.08 and root mean square error of cross-validation (RMScv) value of 0.7. According to the MLR model, the connectivity indices (X5AV, X3A), 2D autocorrelation (MATS6m, MATS1e), geometrical (G(F.i .F)), geometrical descriptor (SPAM), atom-centered fragments (H-046) and constitutional indices (nN) descriptors influenced the kinase inhibitory activity of the studied structures. The obtained MLR equation and its results are shown in Table 2.
Table 2. MLR analyses results
3.1.2. GA-PLS method
In the GA-PLS method, unlike MLR analysis, PLS analysis ignores the multicollinearity problem in the descriptors. In GA-PLS modeling, a genetic algorithm was used to find the most helpful set of descriptors. It is an essential step in QSAR analysis because the selection of relevant descriptors directly relates to the predicted inhibitory activity values. In our work, the GA–PLS method was able to select the best descriptors for building the QSAR model. The details of the descriptors and the statistical results of the model obtained from GA-PLS method are shown in Table 3. As can be seen, a combination of 2D autocorrelation (ATS1m, GATS5v), information content index (IC1), topological index (MAXDN), geometrical descriptor (SPAM) and quantum chemical descriptor (softness), selected by GA-PLS approach, were responsible for the kinase inhibitory activity of oxindole analogs.
The obtained GA-PLS model showed very high statistical quality parameters (i.e., R2 value of 0.8284 and Q2 value of 0.7175). The predictive ability of the obtained model was evaluated by five external test set molecules. The predictive R2 value for the test set was found to be 0.79, which indicated the great predictive ability of the model. A detailed statistical results of the two generated models by GA-PLS and MLR methods are shown in Table S2. Comparison of statistical results of these two models indicates the higher predictability and accuracy of the GA-PLS model to the MLR model.
Table 3: GA-PLS results
3.2. Application domain of the selected GA-PLS model
The William plot for the applicability domain of GA-PLS model is displayed in Figure 3. This plot was acquired using the leverage approach to define the applicability domain of the model. As shown in Figure 3, all values of the training set and test set compounds were lower than the warning h* value of 0.6, indicating the good predicting ability of the final model. Hence, new compounds with enhanced tyrosine kinase inhibitory activity could be designed using GA-PLS model.
3.3. Docking Studies
With the aim of exploring the possible binding modes of TKIs to the tyrosine kinase active site and obtaining their crucial binding interactions, 44 TKIs were docked into the active site of the tyrosine kinase enzyme (PDB ID: 4AW5). Before docking, the reliability of the docking method and parameters was evaluated by redocking the original co-crystallized ligand into the binding pocket.
As shown in Figure 4A, the best redocked pose and the original structure of the co-crystallized ligand were superimposed well with each other. Additionally, the RMSD value between them was 0.33 Å, confirming the reliability and accuracy of the docking method employed in this study. An H-bond was found between the oxindole carbonyl group of co-crystallized ligand and Gly98 residue (Figure 4B). Hydrogen bond and electrostatic interactions were also found between the piperidine ring and Glu140 and Asp234, respectively.
All TKIs were then docked into the active site applying the same docking parameters. The ranking order of the inhibitory activity (pIC50 values) were found to be in accordance with the docking scores (Figure 5), which further indicated the reliability of the employed docking algorithm. The estimated free binding energy values (ΔGbind; kcal/mol) and the possible binding mode of this compound with the key amino acid residues at the active site of the enzyme are summarized in Tables 1, S5 and S6.
As can be seen from Table1, the studied molecules were classified into four different classes based on their structures. Accordingly, their docking results have changed with these structural changes. The binding energy values (ΔGbind) for the best-docked poses of the class 1 (6a-6w), class 2 (7a-7j), class 3 (8a-8e) and class 4 (8f-8k) compounds were within the range of -7.34 to -10.53 kcal/mol, -9.29 to -10.77 kcal/mol, -7.53 to -11.05 kcal/mol and -9.63 to -10.46 kcal/mol, respectively. Based on the results, class 2 and 3 compounds possessed the lowest docking binding energies. Therefore, the docking results of compounds 7g, 7j, 8b, and 8e with the lowest binding energy among the 44 compounds were chosen for further binding mode analysis between protein and ligands.
It was noted that the nitrogen and oxygen atoms of oxindole core and the nitrogen atom of benzimidazole group of all four compounds formed hydrogen bonding contacts with Glu170, Phe171 and Met172, respectively, which indicated the remarkable contribution of these interactions for the binding affinity (Figures 6 and S1). For compounds 7g and 7j, other CH---O interactions were observed between the sidechain group on the nitrogen atom of piperidine ring and the oxygen atom of Arg220, Asn221 and Asp234. Additionally, the piperidine ring in the three studied compounds 7g, 7j, and 8b formed CH---O interactions with Ser233.
3.3. Design of novel potent compounds
According to the information obtained from the developed QSAR models (MLR and GA-PLS) and the binding mechanisms of molecular docking, 44 compounds were proposed as novel tyrosine kinase inhibitors. The main strategy in the design of these novel inhibitors was the skeleton modification of the most potent compounds (8a-8e) and (8f-8k) via bioisosterism principle. The inhibitory activities of the newly designed compounds were predicted by the derived MLR and GA-PLS models and the results are listed in Table S3. From the designed compounds, molecules 9i, 9j, 9k, 9m, 10j, 11b, 11e, 11h, 11k, and 11m showed equal to or higher predicted pIC50 values than compounds (8a-8e) and (8f-8k) (Figure 7).
To explore the binding modes of the novel designed compounds, they were docked into the binding site of tyrosine kinase enzyme. The docking score of all newly proposed compounds is shown in Table S3 as well as the binding mode details of 12 of them are listed in Table S4. Based on their structures, the designed compounds can be divided into three groups (9a-9p), (10a-10k), and (11a-11m). Hence, the docking results were affected by structural changes in these compounds. As expected, compounds 11a-11m showed better docking scores than those of other designed compounds. Moreover, all proposed compounds of group 11a-11m showed lower binding energy values than that of sunitinib as the control drug and some of them displayed equal and even better binding energy values than that of co-crystalized ligand.
The binding interactions of Sunitinib and new compounds 9i, 9j, 9m, 11b, 11e, 11j, and 11k in the active pocket of tyrosine kinase are shown in Figures 8 and S2-S4. The docking results suggested that most of the designed compounds had a common binding mode similar to that of the most potent compounds (8a-8e) and (8f-8k). As shown in Figure 8A, there is a CH---O interaction between Sunitinib and Glu170, further indicating the crucial role of Glu170 in stabilizing ligands in the active site of enzyme. The designed compound 9i formed two H-bond interactions with Glu170 and Met172. Similar to compounds 7a-7j, the piperidine ring and the side chain group on the nitrogen atom of the piperidine ring of compound 9i participated in CH---O interactions with Arg220 and Asn221. The docking findings also indicated that, in addition to H-bonding, hydrophobic interactions with nonpolar amino acids and arene-H interactions play a significant role in the stability of the ligand-protein complex.
To investigate the specificity of the newly proposed compounds toward the tyrosine kinase enzyme, the target identification was carried out using SwissADME online tool. Table S5 summarizes the results of this prediction. Among these new structures, compounds 9j, 9k, 9m, 10j, 11b, and 11e demonstrated the highest probability to act as kinase inhibitors with similarity of 93.3%, 66.7%, 80%, 60%, 66.7% and 60% to previously reported kinase inhibitors, respectively. It was notable that SwissADME showed 80% similarity with sunitinib as a control drug, which could indicate the acceptable specificity of our proposed compounds.
3.4. In Silico Drug-Likeness, Bioavailability, Toxicology Prediction and, Synthetic accessibility
The druggability of the newly designed compounds was analyzed by assessing their physiochemical properties and by applying Lipinski's rule of five using SwissADME web tool and compared with that of sunitinib (Table 4). It could be seen that most of the selected compounds successfully passed Lipinski's five rules. Compounds 9i, 11h, 11k, and 11m had one violation of Lipinski's rule of five, and that was their molecular weight.
Table 4
In silico Lipinski’s rule of five results for 10 selected designed compounds
Compound
|
Molecular
weight
(≤500Da)
|
Oral
Bioavailability:
TPSA(≤140Å)
|
H- Bond
Donor
(≤5)
|
H- Bond
Acceptor
(≤10)
|
logP
(≤5)
|
Rotatable
Bond
(≤10)
|
Rule of 5
violation
|
Sunitinib
|
398.47
|
77.23
|
3
|
4
|
3.55
|
8
|
0
|
9i
|
536.62
|
96.55
|
2
|
6
|
4.01
|
8
|
1
|
9j
|
463.57
|
73.05
|
3
|
3
|
3.47
|
5
|
0
|
9k
|
480.51
|
96.55
|
2
|
5
|
3.26
|
5
|
0
|
9m
|
473.53
|
102.93
|
4
|
5
|
2.16
|
6
|
0
|
10j
|
434.56
|
114.18
|
3
|
4
|
2.85
|
5
|
0
|
11b
|
446.52
|
70.25
|
2
|
5
|
3.44
|
5
|
0
|
11e
|
476.52
|
78.09
|
2
|
6
|
2.86
|
5
|
0
|
11h
|
538.59
|
78.09
|
2
|
6
|
3.32
|
6
|
1
|
11k
|
587.50
|
98.32
|
3
|
5
|
3.39
|
6
|
1
|
11m
|
504.58
|
103.87
|
2
|
6
|
2.99
|
6
|
1
|
Table 5 shows the physiochemical properties of the selected newly designed compounds. As shown, most of the compounds have moderate water solubility and some of them are poorly soluble in water. Although all compounds except compound 11k displayed high gastrointestinal absorption, most did not show brain penetration. Half of the compounds depicted that they could act as cytochrome CYP2D6 inhibitors like the control drug, indicating that these compounds might not easily undergo oxidation and hydroxylation in the phase-1 metabolism. The synthetic accessibility of all new compounds were lower than 5, suggesting that they were relatively easy to be synthesized.
Table 5
Pharmacokinetic parameters of the best selected compounds
Compounds
|
Water solubility1
|
GI absorption
|
BBB permeant
|
CYP2D6 inhibitor
|
P-gp substrate
|
Lipophilicity
Log Po/w
|
Synthetic accessibility2
|
Sunitinib
|
Soluble
|
High
|
Yes
|
Yes
|
Yes
|
3.55
|
3.58
|
9i
|
Moderately
|
High
|
No
|
Yes
|
Yes
|
4.01
|
4.08
|
9j
|
Poorly
|
High
|
Yes
|
Yes
|
Yes
|
3.47
|
4.21
|
9k
|
Moderately
|
High
|
No
|
Yes
|
Yes
|
3.26
|
3.64
|
9m
|
Poorly
|
High
|
No
|
No
|
No
|
2.16
|
3.61
|
10j
|
Moderately
|
High
|
No
|
Yes
|
Yes
|
2.85
|
3.83
|
11b
|
Moderately
|
High
|
Yes
|
Yes
|
Yes
|
3.44
|
3.71
|
11e
|
Moderately
|
High
|
No
|
No
|
Yes
|
2.86
|
3.74
|
11h
|
Poorly
|
High
|
No
|
No
|
No
|
3.32
|
4.11
|
11k
|
Poorly
|
Low
|
No
|
No
|
No
|
3.39
|
4.15
|
11m
|
Moderately
|
High
|
No
|
No
|
Yes
|
2.99
|
3.98
|
1-log S scale: insoluble < -10 < poorly < -6 < poorly < moderately < -4 < soluble < -2 very < 0 < highly |
2-range: 1= very easy, 10= very hard |
Bioavailability radar plots depicted the oral bioavailability of the selected compounds (Figure 9). Orally bioavailable compounds were predicted to be 9i, 9j, 10j, 11b, 11e, 11h, and 11m. compounds 9k and 9m showed one off-shoot relative to insaturation (INSATU), and compound 11k showed four off-shoot relatives to insaturation (INSATU), insolubility (INSOLU), size and Lipophilicity (LIPO) indicated that they might have poor physicochemical properties for oral bioavailability.
The toxicity risk calculation is critical in the early drug design phase for screening the safest and most effective drugs or compounds. The toxicity of 10 selected compounds was predicted using the admetSAR online tool [37]. Table S6 shows the toxicity prediction results. None of the compounds showed carcinogenicity, eye corrosion, or eye irritation. Compound 9m was predicted to cause Ames mutagenesis. All Compounds and control drug (Sunitinib) were determined to be hepatotoxic, however, the toxicity range of the selected compounds was better than that of Sunitinib. According to the US EPA classification, all compounds were predicted to have class III acute oral toxicity, which means that their LD50 values are greater than 500 mg/kg but less than 5000 mg/kg. These computational pharmacokinetics and bioavailability information could improve the success rate of newly proposed compounds.