2.1 Library construction and filtering
A compound library was constructed with 1,011,637 azole compounds retrieved from ZINC database [22]. Hydrogens were added and three-dimensional structures were generated with Babel [23]. The library was “cleaned” by using SMARTS (SMiles ARbitary Target Specification) patterns implemented in Discovery Studio Client 2016 [24]. The patterns that were used for filtering are provided in the Supplementary Material Table S1. Next, the entire library was filtered according to Lipinski’s rule of five [25]. Briefly, the rule states that the molecular weight of an oral drug should be lower than 500Da, the number of hydrogen bond acceptors should be at most 10, the number of hydrogen bond donors should be at most 5, and log P should be lower than 5, allowing only one violation. Therefore, filtered ligands were decreased to 971,040.
2.2. Design of the workflow and evaluation of selectivity of the scoring functions
Design of the workflow in VS is critical as capabilities of computational tools are different. For the purpose of discarding inactive compounds early, a scoring function that is good at distinguishing between actives and inactives (selectivity) would be the best option for the first stage of screening. rDock is known to eliminate inactive molecules at an early stage [26]. However, scoring functions are target-dependent. For this reason, 3 different docking programs were evaluated for their use at a specific stage. Receiver Operator Characteristic (ROC) curve for each program was obtained and Area Under the Curve (AUC) was calculated. Toward this end, 50 known active compounds were downloaded from cHEMBL database and 36 decoys for each ligand [27], 1800 decoys were retrieved from DUD-E database [28] (A database of useful decoys: Enhanced, http://dude.docking.org/). rDock was selected as the first screening tool according to the AUC profile. Moreover, Autodock Vina was used for the second screening. Next, pharmacophore-based screening was performed to identify potential mPGES-1 inhibitors. Finally, Autodock4 was used to rank compounds according to their binding affinities before MD simulations. The workflow followed in this study is demonstrated in Figure 1.
2.2.1 Screening with rDock
Prior to screening with rDock [26], the cavity definition was made by reference molecule method as described in the manual. Toward this end, structure of a known inhibitor [29] with a high in vitro inhibition activity (0.241 µM) was used (Figure 2). Crystal structure of mPGES-1 (PDB ID: 4AL0) was downloaded from PDB Database [30] and prepared in Discovery Studio Client 2016.
The inhibitor was docked using rDock to the crystal structure of mPGES-1 enzyme in the presence of glutathione to ensure a minimized ligand-receptor interaction energy. Number of Genetic Algorithm (GA) runs were kept at 50. The binding pose with the lowest energy was used for cavity mapping. Furthermore, parameter file was prepared where the following parameters were set: Radius: 6.0 Å; Small sphere: 1.5 Å; Minimum volume: 100 Å (default); Maximum cavities: 1 (default); Grid step: 0.5 (default); RECEPTOR_FLEX: 3.0 Å (default).
Considering the size of the chemical space (971,040 compounds), High-Throughput Virtual Screening (HTVS) protocol for rDock was followed. Toward this end, the first docking with all compounds were performed with 5 runs. Later, ligands exhibiting binding score of -10 or below were subjected to further docking with 10 runs. Next, for ligands which scored below -20 were filtered for further docking simulations with Autodock Vina. Filtering at every step was performed according to SCORE.INTER, which refers to the ligand’s binding energy.
2.2.2. Further structure-based virtual screening with Autodock Vina
According to ROC profile, Autodock Vina ranks the second best at distinguishing active and inactive compounds. Furthermore, the scoring function of Vina is known for its speed and accuracy [31]. For this reason, screening was pursued with Autodock Vina. Filtered ligands according to their scores in rDock were further subjected to docking with Autodock Vina. Prior to this, tautomers and protonation states were generated using Prepare Ligands in Discovery Studio Client 2016. Ligands which raised problems of generating 3D coordinates, bad valences or being duplicates were eliminated. Consequently, 44,366 compounds were subjected to molecular docking simulations with Vina. A grid box was prepared with center coordinates of x: 11.749, y: -12.53, z: -12.147 and dimensions of x: 60 Å, y: 60 Å and z: 60 Å. The center was chosen at the sulfur atom of GSH.
2.2.3. Pharmacophore generation
Remaining compounds in the library were screened with pharmacophore to recover more drug-like compounds. In order to generate a pharmacophore model, a training set comprising known active mPGES-1 inhibitors was prepared. Toward this end, 20 molecules were retrieved from the literature and cHEMBL database. The entire set was docked to the crystal structure of mPGES-1 enzyme with Autodock Vina to determine their binding affinities (Table 1).
Next, the training set was subjected to Common feature pharmacophore generation protocol in Discovery Studio Client 2016. 10 hypotheses were generated by HipHop algorithm. Conformation generation was selected BEST, with an energy threshold of 20 kcal/mol and maximum conformation of 250. Maximum interference distance was set to 2.97 and the number of maximum pharmacophores to be generated was set to 10.
2.2.4. Validation of the pharmacophore model
Each generated pharmacophore model was validated by using a test set including 380 decoys and 20 compounds that are known to be active mPGES-1 inhibitors (the training set). Decoys were downloaded from DUD-E database (A database of useful decoys: Enhanced, http://dude.docking.org/). The quality of the pharmacophore often refers to its power to differentiate between inactive compounds and active compounds. Towards this end, the test set was screened against each pharmacophore and several statistical parameters such as percentage yield of actives, ratio of actives and enrichment factor (EF) were estimated. Güner-Henry score [32], which determines the quality of a pharmacophore model, was also calculated.
$$\%A=\frac{Ha}{\text{A}} \times 100\%{(Eq.1)}$$
(Eq. 2)
(Eq. 3)
\(\)\(GH=\frac{Ha\left(3Ha+Ht\right)}{4\text{H}\text{t} \times \text{A}}\times \left(1-\frac{Ht-Ha}{D-A}\right)\) (Eq. 4)
Where %A is the ratio of actives in the entire hit list, %Ya is the fraction of active hits relative to total hits, Ht refers to total hits acquired during database search, Ha refers to active compounds in the hit list, D is the total number of compounds, and A is the number of active compounds in the test set.
2.2.5. Screening the ligand library with pharmacophore model
After validating each pharmacophore model, hypothesis 01 was selected to further screen the ligand library. Upon filtering with this model, 48 compounds remained in the library. 10 compounds having the lowest Autodock Vina scores and Fit Value ≥3 were selected for further investigation.
2.4. Evaluation of scoring power and docking with Autodock4
Experimental data (IC50, nM) of 50 known active compounds (2.2. Design of the workflow and evaluation of selectivity of the scoring functions) were obtained from cHEMBL database plotted against the highest score estimated for each compound. Pearson’s correlation coefficient was calculated for each docking program and Autodock4 was found to be better at ranking binding affinities of ligands. Optimized structures were subjected to docking using Autodock4 [34]. Toward this end, the crystal structure of mPGES-1 enzyme was cleaned and prepared. Grid box was defined as described in section 2.2.2 and Autogrid4 was employed for grid calculation. Conformational space was searched by Lamarckian Genetic Algorithm [35], which was implemented in AutoDock4. Maximum number of energy evaluations was set to 2,500,000 and 100 runs were performed for each ligand. All calculations were performed on Linux platform.
2.5. Molecular dynamics simulation
The azole derivative with the highest affinity in complex with mPGES-1 enzyme was selected for MD simulations. mPGES-1 enzyme was defined with CHARMM36 force-field while GSH and ligand were parameterized by using CGenFF server (https://cgenff.umaryland.edu). The complex (enzyme with GSH and ligand) was solvated in TIP3P water [36] using a periodic cubic box and was minimized by using steepest descent algorithm with GROMACS 5.1.4 package [37]. Minimization was complete as the maximum force of 10 kJ/mol was reached. Next, the system was subjected to NVT equilibration for 100 ps with leap-frog integrator, V-rescale thermostat [38] and T=300 K. This was followed by NPT equilibration to equilibrate volume for 100 ps, also with leap-frog integrator, V-rescale thermostat and T=300 K. Finally, the equilibrated complex was submitted to MD simulations for 100 ns. For both stages of equilibration and MD simulation, LINCS algorithm [39] was used for constraining bond lengths and Particle Mesh Ewald (PME) [40] was employed for long-range electrostatic interactions (with a cut-off of 1.2 nm). V-rescale algorithm was selected for temperature coupling and Parrinello-Rahman barostat [41] for pressure coupling. The simulation time-step was set to 2 fs and data were recorded every 10 ps. Clustering of conformations was performed based on root-mean-square deviation using gmx cluster tool. GROMOS clustering algorithm [42] and a cut-off of 0.16 nm was selected. Optimal cut-off value was chosen according to the plot provided in the Supplementary Material Figure S1. Central complex structure was generated from the most populated cluster.