Discovery of Novel Inhibitors Targeting HDM2: Virtual Screening, Molecular Dynamics Simulation and Binding a nity predication


 HDM2(human double-minute 2 protein) is an important drug target for tumor treatment. The main physiological function of HDM2 is to negatively regulate the tumor suppressor protein p53 and inhibit the normal biological function of p53, which leads to the occurrence and development of tumors. Therefore, the discovery of small molecules that can inhibit the activity of HDM2 is a promising cancer therapy. In this study, we reported a virtual screening as an effective strategy to discover potential HDM2 inhibitors from the Specs database. Then 100 ns all-atom molecular dynamics (MD) simulation and binding free energies calculated were carried out on the selected compounds. In conclusion, we reported seven prospective candidates with novel skeletons among them Hit4 is a potential HDM2 inhibitor and provided an insight into the mechanism of interaction of the compounds to HDM2 target.


Introduction
Cancer has always a icted humans with increasing incidence and mortality, with an estimated 9.6 million deaths in 2018 and the number expected to rise to more than 14 million annually over the next decade [1,2]. In 2020, there are 4.569 million new cancer cases and 3.003 million cancer deaths in China for the Cancer Today website (https://gco.iarc.fr/today). Cancer has become the leading cause of death in China and is a major public health problem.
The tumor suppressor protein p53 is known as the "guardian of the genome" and plays a central role in preventing tumor development through biological functions such as cell cycle arrest, DNA repair, cell senescence, and cell apoptosis [3]. In about half of cancer cases, p53 is inactivated by mutations or deletions. In many other cancer cases, it is inactivated by overexpression of HDM2 [4].
HDM2 (also known as MDM2) is the main negative regulator for p53. In normal cells, HDM2 and p53 form a negative feedback loop that maintains both p53 and HDM2 at very low levels [5]. In this process, HDM2 is transcriptionally activated by p53, and HDM2 regulates p53 activity through three main mechanisms. (i) HDM2 inhibits p53 transcriptional activity by binding to the transactivation domain of p53 and thereby inhibits p53-mediated transactivation. (ii) HDM2 exports p53 out of the cell nucleus, leaving p53 inaccessible to targeted DNA and reducing its transcriptional ability. (iii) HDM2 functions as an E3 ubiquitin ligase that promotes the degradation of p53 proteasome [3,6,7]. Consequently, HDM2 is considered as an effective inhibitor of p53. HDM2 is abnormally up-regulated in human tumors through gene ampli cation, increased transcription level and enhanced translation [8], and the overexpression of HDM2 can impair the stability and activities of p53 and have carcinogenic effects. An analysis of nearly 4000 samples from tumors or xenografts from 28 tumor types shows that the frequency of HDM2 ampli cation in these human tumors was 7% [8]. Therefore, the HDM2 as a drug target that the inhibition of HDM2 can restore the activity of p53.
The HDM2-p53 interaction has been con rmed to be the interaction between the rst 120 N-terminal residues of HDM2 and 30 Nterminal residues of p53 [9]. Unlike many other protein-protein interfaces, the p53-HDM2 interface forms a narrow and deep cleft [10]. In 1996, the rst cocrystal structure of HDM2(residues 17 − 125) with p53(residues 15 − 29) was reported (PDB code 1YCR) [4]. The interaction between HDM2 and p53 is mainly that the three hydrophobic residues of p53 (Phe19, Trp23 and Leu26) interact with the surface of HDM2 to form three hydrophobic cavities. The key residues of these three mimic ligands of p53 occupy the hydrophobic cleft of HDM2 in a "thumb − index − middle nger" conformation. [11]. And some of the other ligands were designed for ubiquitin ligase. The binding modes of all ligands are highly conservative and occupying three hydrophobic groups with a strict arrangement.
Therefore, designing small molecule inhibitors to block HDM2-p53 interaction has become a possible cancer treatment strategy. In order to develop new HDM2 inhibitors, we have adopted the computer-aided drug design method in this work. Compared with the experimental method, it has become a reliable, cost-effective, and time-saving strategy that can be used to discover new lead compounds. In this study, considering the structural diversity and availability of compounds, the specs database (http://www.specs.net) was rst optimized according to the Lipinski and Veber rules, then the specs database was used to virtual screening. Moreover, in order to further understand the binding mode of the HDM2 and the potential modulators, molecular dynamics simulations and MM-PBSA was carried out to calculate the binding free energy [19].

Page 3/18
The work ow for this study which included Activity models, Pharmacophore modeling, Molecular docking, Molecular dynamics (MD) simulations and Binding free energies calculation is shown in Fig. 1.

Data Preparation
For the screening database, the specs database(http://www.specs.net) containing 212,736 compounds were rst ltered by Lipinski and Veber rules and 75671 compounds obeyed the Lipinski's cut-off values. Subsequently, the remaining molecules were prepared with LigPrep module provided in the Maestro 12.1 (www.schrodinger.com) using the OPLS_2005 force led and the protonation states were generated with Epik at PH of 7.4. The prepared database was used to generate 3D conformational set for each molecule using CAESAR in Discovery Studio 4.0 (http://www.accelrys.com) [20] and saved them as Database 1.
For the known ligand database, in order to have a reliable and representative set of active ligands with wide structural diversity, we collected known ligands from literature and patent [11,[21][22][23][24][25][26][27][28][29][30]. In this study, compounds whose IC 50 values (here reported as pIC 50 ) were determined by the same procedure ( uorescence polarization assay) were considered. Moreover, compounds with de ned chirality only were selected in order to have a univocal match between structure and activity. The nal the known ligand database was constituted by For the known ligand database, the molecules in the database need to be properly prepared. Therefore, prepared the structures by washing, adding hydrogen atoms, and energy minimizing in MOE (2019.01). In Database Wash, set the Protonation Field to Dominant (Dominant protonation state at speci ed pH) and set the Coordinates Field to Rebuild 3D. Other settings with default values. After preparation, the molecules were stored in the mol le with properly protonation. The known ligand data set after preparation was named KLD.

Preparation of the 3D-Structure of HDM2
Induced-t theory emphasizes that the ligand can induce the conformational change of the receptor during the binding process. Therefore, it is very important to select a good receptor structure. On this basis, HDM2 protein with a PDB ID of 5OC8 (https://www.rcsb.org/structure/5OC8) [15] was selected as the receptor of this study according to the good resolution. The crystal structure resolution is 1.56 Å and binding of protein HDM2 with small molecule ligand NVP-HDM2, which is a second generation p53-HDM2 inhibitor that had entered phase I clinical trial. Then Discovery Studio 4.0 (DS 4.0) was employed to the prepare protein by adding missing residues, hydrogen atoms as well as deleting water molecules, and then the structure was minimized by using SYBYL-X software (version 2.0) with AMBER99SB force eld[31].

Activity Models
Active models such as QSAR and ngerprint were constructed based on known compounds can be used to lter the initial database.

QSAR Modeling
The QSARs are statistical models that attempt to correlate the structure and experimental properties (activity, selectivity, toxicity, etc.) with computed properties such as molecular descriptors and ngerprints. The QuaSAR suite of applications in MOE is used to build numerical models of the data for prediction and interpretation purposes. For instance, if we get a series of ligands with known activity, we can build a numerical models from the training set to predict the activity of molecules not in the training set.
In this study, the QSAR modeling calculations were performed by using QuaSAR Model in MOE. First, QSAR models used three molecular descriptors to calculate the molecular descriptors for 48 compounds with already known activity in KLD. Then, the AutoQSAR function was employed for the model building using the Partial Least Squares (PLS) regression models. PLS is often used in the main regression analyses [32,33]. A reasonable QSAR model requires R 2 > 0.8 and XR 2 > 0.5. If the model was unreasonable, we need to remove outliers from the database, then rebuilt and cross-validated the model. Finally, the optimized QSAR model was used to lter the database.

Fingerprint Models
A vector of Fingerprint keys contains information on the presence or absence of certain properties or molecular features. According to the design philosophy, different ngerprint schemes emphasize different molecular attributes. In MOE, the fundamental idea of the Fingerprint Model is to encapsulate certain properties directly or indirectly in the ngerprint and then use the ngerprint as a surrogate for the chemical structure. Then comparisons between molecules are reduced to comparing sets of features and measuring the degree of overlapping of the sets.
MOE-Fingerprints can be used for: (i) Fingerprint Clustering to a group of molecules based on the similarity of their ngerprint. (ii) Fingerprint Models to study Quantitative Structure-Activity Relationships using ngerprint descriptions of the molecules. (iii) Similarity Searching to nd molecules with similar ngerprints. In this study, we used the ngerprints of a set of active compounds to search databases for similar active compounds by building a ngerprint model.
We built a ngerprint model with KLD mentioned above. The code FP: GpiDAPH3 refers to the Pharmacophore Graph Triangle (GpiDAPH3) ngerprints, and these are three-point pharmacophore ngerprints calculated from the 2D-molecular graph. Therefore, we selected the FP:GpiDAPH3 ngerprint model. Set ngerprint score to Maximum (Maximum of similarities scores) and saved the ngerprint model [34].

Pharmacophore Modeling
MOE's pharmacophore modeling tools determine the pharmacophoric feature characterizes and their spatial arrangement in 3D that are essential to the binding of ligands to their receptors and the drug activity of the ligands. Pharmacophore models can be generated from the structural data of protein-ligand complexes as well as from ligands when no receptor information is available and from the receptor structure when no ligands are available. The generated models can be used to screen virtual compound libraries for potentially active molecules.
In this study, we used Ligand-based and Complex-based Pharmacophore Modeling for virtual screening.

Ligand-Based Pharmacophore
The common feature pharmacophores were performed by using the pharmacophore Elucidation module in MOE. The HDM2 inhibitor dataset KLD contains 48 known active compounds, was divided into high active compounds(pIC ≥ 8) and low active compounds(pIC50 < 8). At rst, Pharmacophore Elucidation was employed to identify common features necessary for active HDM2 inhibitors. Set conformation to Clustered to reduce the total number of conformations, the conformation method was set to Conformation Import, the Activity Field was set to pIC50, the value was set to 8, and the other parameters of Pharmacophore Elucidation were set to default values. Then, the selected pharmacophore model with the highest alignment score and manually add ligand shape constraints. Finally, the pharmacophore model was saved for subsequent database screening.

Complex-Based
The Protein-Ligand Interaction Fingerprints (PLIF) tool was used for summarizing the interactions between ligands and proteins using a ngerprint scheme. The interaction between the receptor and the ligand can be observed more intuitively, and the pharmacophore obtained in this way is more reliable. First, selected ve crystal structures (PDB ID:4HG7, 4QO4, 4WT2, 4ZYF, 5OC8), which the crystal structure resolution was less than 2.0 Å and binding with good HDM2 inhibitor, imported ve HDM2 complex structures into MOE, performed QuickPrep, unchecking the Re ne checkbox, using default values for the rest of the settings. The QuickPrep application can correct problems in the protein-ligand complex, adding hydrogen and assigning the protonation states. Tethers were then added to the receptor, heavy atoms and distant atoms are xed.
The Protein Structure Superposition application was used to superpose the set of 3D protein structures to maximize the spatial overlap of a common atom subset, highlighting the regions of conserved structure and modi ed regions in the protein collection. It can also be used to overlay and study active sites of related proteins. Finally, the PLIFs were generated and analyzed, and a Pharmacophore Query was generated.

Molecular Docking
Docking is a computer simulation technique used to predict the binding mode of two interacting chemical species. Dock application from MOE can be used to search for favorable binding modes between ligands and macromolecular targets, which is usually a protein.
After the docking is completed, for each ligand, a number of poses are generated and scored. The nal highest-scoring poses, along with their scores and conformation energies, were recorded to a database where they are ready for further analysis.
The molecular docking in this study was performed using the crystal structure of HDM2 (PDB ID: 5OC8). First, the structure was prepared using the Protonate 3D module in the MOE to add hydrogens to the entire system and distribute ionization states. The structure was then optimized using energy minimization. Then, the active site of the receptor was checked using SiteView in MOE. The docking site was selected as the ligand-binding site in 5OC8, the initial scoring methodology selected London dG, poses were set to 5, the nal scoring methodology selected GBVI/WSA dG, poses were set to 1, while initial settings were adopted for the other conditions. The docking simulation output used score S while considering the binding energy.
Some potential inhibitors with diverse skeletons were selected manually after molecular docking, according to the principle of structural diversity.

Molecular Dynamics Simulations
Molecular dynamics can dynamically analyze microscopic interactions between compounds from the atomic scale. By analyzing the trajectory of the balanced system, a large amount of kinetic data can be analyzed to predict the macroscopic properties of matter.
The potential inhibitors bound to the HDM2 models were investigated by MD simulations for 100 ns with GROMACS 5.2.0 [35]. In this study, the size of the octahedral solvent box matches the size of the protein, and the edge distance is 1 nm. The corresponding Cl − are added to the solvation system to neutralize the system. Before the molecular dynamics simulations, the energy of the protein-ligand complex is minimized, so that the potential energy (Epot) of the system is as small as possible.
During the rst stage, a simulation was carried out for 200ps at a temperature of 300K in the Nose-Hoover thermostat (NVT) (constant number of particles, volume, and temperature) ensemble with solute heavy atoms restrained by a force constant of 100 kcal· mol − 1 ·Å −1 .
Then the system was kept to 300K during a 200ps MD simulation in the NPT ensemble, with the same force constant applied to the solute atoms. Finally, in the production phase, the relaxed systems were simulated without restrained under NPT ensemble conditions for 100ns. Within this simulation time, the total energy and RMSD of the protein backbone reached a plateau, and the systems were considered equilibrated and suitable for statistical analysis. The Nose-Hoover thermostat (NVT) was utilized to maintain a constant temperature in all simulations, and the Martyna-Tobias-Klein method was used to control the pressure [31].

Binding Free Energies Calculated By MM/PBSA
In this study, the molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) method[36] was used to estimate the binding free energy between protein HDM2 and potential inhibitors and used the g_mmpbsa tool of GROMACS 5.0.2 [19]. In MM/PBSA calculation, the solvent is regarded as a uniform continuous medium, and based on the force eld and implicit continuous medium model, the dynamic equilibrium trajectory of the protein-ligand complex is sampled, and the free energies of the complex, protein and ligand are calculated respectively.

QSAR model generation and validation
In this study, three calculating descriptors (TPSA, SMR, SlogP) and PLS were selected to construct a QSAR model. Then self-crossvalidation was performed, and the QSAR model was further analyzed (PLS: R 2 = 0.289,XR 2 = 0.174).The cross-validation results were not good. Then removed outliers from KLD, rebuilt and cross-validation the model[37], and got a new QSAR model with 3 descriptors (PLS:R 2 = 0.825, XR 2 = 0.752). Finally, the model was evaluated using Model Evaluate in MOE on the database 1. The model predictions were put to a new database eld PRED_qsar which represented the predicted value of the model. Selected the compound whose PRED_qsar value is greater than 5.5, screened 29700 compounds, and saved them as Database 2.

Fingerprint models lter
We used the ngerprint model to lter the database 2 and Entered $PRED_ GpiDAPH3_MAX as the Fingerprint prediction eld. Selected the compound whose $PRED_ GpiDAPH3_MAX value was greater than 0.5, screened 6729 compounds, and saved them as Database 3.

Ligand-based pharmacophore model generation and validation
The generation of the Ligand-Based Pharmacophore model was based on KLD. Initially, 68 possible pharmacophore features were discovered by using a feature mapping protocol, and three kinds of common features were included. Then, selected the top ten models of the Alignment Score from the generated pharmacophore model for analyses (Table 1). The coverage of active molecules of the ten models were all 11 compounds. By comparing their separation of actives/inactives and the rationality of the molecular arrangement in the model, model 1 (Fig. 3) was selected. The optimized pharmacophore contained three Hydrophobic center (H) and one Aromatic center (R) feature. Finally, added Ligand Shape Volumes (no atoms allowed outside the volume). After assessing the pharmacophore query, virtual screening was carried out E against the database 3. 458 compounds were selected in Database 4.

Complex-based pharmacophore model generation and validation
The Protein-Ligand Interactions Fingerprints (PLIF) were generated in MOE, and it was a ngerprint that can characterize protein-ligand interactions. The PLIF display modes of ligands were shown in Fig. 4. We were able to visually analyze the interactions and perform statistical analyses to generate pharmacophore. It was observed that the interaction between ligands and proteins mainly includes the H-bond acceptor and Anionic atom. Therefore, the modi cation of modify the generated pharmacophore was to include two common features (F1: Acc; F2: Ani&Acc) (Fig. 5). Finally, the generated pharmacophore was used to screen database 4, 486 compounds were obtained, which were saved as Database 5. The pharmacophore was obtained in different ways and compounds were screened in the database so that the result was more reliable.

Docking-based virtual screening
Redocking was performed for the preprocessed receptor using the 5OC8 ligand. The result showed that an almost similar binding style was identi ed (RMSD = 0.1369 Å). In this procedure, database 5 was docked with the preprocessed 5OC8. Docking simulation was performed using the initial condition in the application and the scoring function was adopted as GB/VI. After 5 trials, compounds with the highest docking scores (-S) were output and 246 compounds were obtained by docking. Finally, visually inspect the top compound molecules in the docking results, the top 16 compounds with the highest docking scores (-S) and good combination model were selected as Database 6.

Visual screening
After molecular docking, the top 16 molecules were further analyzed to visually inspect the binding mode of ligands and proteins, and pharmacophores. Finally, 7 compounds (Fig. 6) with structural diversity and appropriate binding modes were selected as potential inhibitors for HDM2 inhibitors. The binding mode of ligands and proteins and pharmacophores are shown in Fig. 7 and Fig. 8.

Molecular Dynamics Simulation Analyses
In this study, we performed 100 ns MD simulation analysis on the seven Hits and NVP-HDM2 complexes.
The backbone RMSD describes the change of the structure compared with the initial structure, and the simulation can be used to view the stability of convergence and composites. The RMSD of the complexes were shown in Fig. 9. Except Hit2 reaches equilibrium at 40ns, other systems reached equilibrium within the simulation time, and the overall uctuation was small in the simulation process. The RMSD of all the eight systems showed that the binding of the compound to the receptor was stable. The ligands binding led to a small conformational change in the HDM2 structure from its native one.
The Radius of gyration (Rg) is a structural parameter associated with the overall conformation and three-dimensional (3D) structure of a protein, which has been utilized to get insights into its compactness and folding behavior. The Rg of the complexes were shown in Fig.  10  With the aim of determining the mobility and exibility of each residue in the MD simulation, the Root Mean Square Fluctuation (RMSF) values were calculated. The RMSF of the seven potential active compounds and the original ligands were shown in Fig.11. The uctuations of the HDM2 backbone were compared with each residue after the ligands binding. The uctuations were found to be minimum at several key residues of THR49, LEU54, LEU57, TYR60, ILE61, GLY83, VAL93, TYR100. The RMSF suggested thatthe uctuations of these residues were reduced in the region whereligands were binding.

MM-PBSA Binding Free Energy Calculations
For the seven-candidate compounds and the reference molecule NVP-HDM2, frames were extracted from 40-60 ns trajectory with stable RMSD value to calculate the MM-PBSA. The dielectric constant is set to 2 in the solute and 80 in the solvent. As shown in Table 2, the binding free energies (ΔG binding ) of NVP-HDM2 and the seven potential inhibitors were 108.231 kJ/mol, -98.286 kJ/mol, -46.231 kJ/mol, -41.902 kJ/mol, -120.891 kJ/mol, -95.943 kJ/mol, -59.729 kJ/mol, and − 94.296 kJ/mol. The calculated results of the binding free energy showed that Hit 4 was lower than the reference molecule, which means the binding was more stable with HDM2 than NVP-HDM2, and it shows that Hit4 is more likely to be a potential inhibitor. Except the solvation free energy (∆Gpolar), the other energies such as van der Waals energy (∆Gvdw), electrostatic interaction energy (∆Gele), and non-polar solvation free energy (∆Gnonpola) had a signi cant contribution in the results of binding free energy. The polar interaction (∆G ele +∆G polar ) of the eight systems were greater than the nonpolar interactions (∆G vdw +∆G nonpolar ), indicating that the nonpolar interactions were more conducive to the binding of ligands and HDM2.

Conclusion
In this study, we rst used Lipinski and Veber rules to lter out compounds that do not meet the properties of a drug. Consequently, we used virtual screening strategy as well as visual inspection to target the HDM2 using three different algorithms (Activity Model, Pharmacophore Model, and Molecular docking of MOE). Finally, seven Hits from the Specs database were selected for further evaluation. In addition, to analyze the binding mode between the complexes, we calculated the molecular dynamics simulation and binding free energies for seven hits and reference molecules. The results showed the binding between the seven Hits and HDM2 was strong, and the seven complex systems were stable. Therefore, seven Hits with novel skeletons can serve as potential candidates for inhibiting HDM2. The binding free energy of Hit4 was lower than the reference molecule NVP-HDM2, predicting Hit4 is more likely to be a potential inhibitor.
In summary, this virtual screening strategy as well as detailed binding mechanisms of complexes reported herein may provide a new direction for the future discovery and development of potent modulators targeting HDM2. The protocol for virtual screening of HDM2 inhibitors. KLD represents the known ligand database.

Figure 2
The structures and activity of known ligands.

Figure 3
The pharmacophore consists of three Hydrophobic center (in green), one Aromatic center (in brown) and a volume Figure 4 Interaction between ligand and protein.

Figure 6
Structure of seven potential inhibitors. The binding mode of Hits and proteins.

Figure 8
The binding mode of Hits and pharmacophores. The green represents the hydrophobic center and the brown represents Aromatic center.
Page 16/18 RMSD values along with the whole simulations of the protein-ligand complexes.

Figure 10
The Radius of gyration (Rg) values of Cα atoms of HDM2 were calculated in the process of MD simulations.