Computational investigation of pyrrolidin derivatives as novel GPX4/MDM2–p53 inhibitors using 2D/3D-QSAR, Molecular docking, Molecular dynamics simulations and MM-GBSA free energy


 The p53 is a tumor suppressor protein that adjusts cell cycle and growth arrest as well as genes that restore DNA damage and apoptosis. Murine double minute 2 (MDM2) is a main p53 antagonist. We created a novel QSAR model using a series of highly active spiro [pyrrolidin-3,2-oxindoles] that consisted of 29 compounds that were experimentally validated to inhibit the MDM2-p53 interaction. Three optimal models have been developed CoMFA/E+S, CoMSIA/S+H+A and HQSAR have revealed good statistical results, but the CoMSIA mode only which validates all the external validation tests applied successfully. Based on the CoMSIA/S+H+A model was carefully chosen to design four compounds with values ​​of inhibitory activity greater than the highly active compound in the data set. The Newly designed compounds were docked in the target receptor binding site (ID: 4LWU). The newly designed compound Pred 01 showed the highest affinity with a value of -9.4kcal / mol, while compound N°04 which represents the data set and control compound (Nutlin-3) showed binding energies of the order of -8.8 kcal / mol and -8.2kcal / mol, respectively. In addition, the roles of lipinski and veber were estimated, the results obtained demonstrate that the proposed molecules involve good oral bioavailability and an ability to diffuse through different biological barriers. For in-depth study, The Pred01 / receptor, N°04 / receptor and Nutlin-3 / receptor complexes were selected via dynamic simulation analyzes with a simulation time of 100 ns and, also their free binding energy was examined operating the MM-GBSA approach. The molecular docking results obtained accentuate the crucial residues responsible for the ligand / protein interaction, providing insight into the mode of interaction. The MD simulation analysis confirms the conformational stability of the selected complexes during the MD trajectory, and the fluctuations recorded are insignificant. The results of MM-GBSA reveal that the new compound Pred 01 exhibits the lowest free energy, which confirms the result of molecular docking.


Introduction
The term ferroptosis was rst coined in 2012 to characterize a form of programmed cell death induced by the accumulation of lethal iron-dependent lipid peroxidation [1]. Ferroptosis can be affected by nutritional factors, pharmacological agents, and physiological functions that directy affect these factors. Its may have potential uses in the treatment of cancer, induction of ferroptosis in vivo may have therapeutic value. Multiple pathways can initiate ferroptosis, most of which act directly through the mechanism of using covalent inhibitors to bind glutathione peroxidase 4 (GPX4). It is a uniquely designed selenoprotein that is able to reduce lipid hydroperoxides (L-OOH) in fatty alcohols (L-OH) and thus neutralize their toxicity [2,3].
In recent years, some studies have shown that the p53 protein may be involved in the control of ferroptosis through the accumulation of lipid hydroperoxides or toxic iron [4][5][6]. The p53 transcription factor is one of the most usually mutated in cancer, and approximately 50% of human cancers are linked to its mutations or deletions [7,8]. In 2015, the rst report of p53 response to ferroptosis was published, the authors discovered that p53 promotes ferroptosis in broblasts and some cancer cells that are similar to human breast cancer MCF7 [9].
On the other hand, the activity of p53 can be regulated by overexpression of double minute mice 2 (MDM2) and may induce a tendency for tumor growth [10]. Hence, the use of a small molecule inhibitor to reactivate p53 function by disrupting MDM2-p53 interactions is now considered a promising strategy for cancer treatment. BDP (benzodiazepinediones) and nutlines have been identi ed as the most potent p53-MDM2 interaction abrogators so far Fig. 1 [11].
In recent years, the combination of spirooxindole and p pyrrolidine has garnered substantial interest. McConnell et al.
developed a series of spirooxindoles fused with 3,2-pyrrolidine, which promotes the formation of functional groups to form p-p interactions and hydrogen bond types and provided great properties to these compounds [12]. The mentioned properties of the spirooxindole scaffold fused to 3,2-pyrrolidine reveal the possibility of introducing it synchronously to construct a novel dual GPX4 / MDM2 inhibitor. New strategies for designing novel molecules against protein targets are currently in vogue for the development of anticancer treatments.
The QSAR protocol is an process to quanti e a relationship between activity of inhibitors and the structure using mathematical and computer tools for predict the activity biologic of new compounds [13,14]. In the present research, we performed 2D / 3D QSAR analysis on series of nitroisoxazole derivatives as MDM2-p53 antagonists. In the case of 3D / QSAR modeling, the CoMFA (comparative molecular eld analysis) methods and CoMSIA (comparative molecular similarity index analysis) methods were applied to establish the 3D / QSAR model. In addition, a 2D model has been established using the Halogram QSAR method, which is a recent method which. On the contrary 3D-QSAR, avoids the calculation of the molecular descriptors of the compound and it takes a long time therefore this method does not take into account the complexations of the 3D structure and the alignment of molecules to predict their activity. The best model built was based to discover new molecules to disrupt p53-HDM2 interactions. Moreover these proposed molecules, the most active compound and the control compound Nutlin-3 were docked at the MDM2 protein binding site to understand the mode of ligand / protein interaction as well as to identify the amino acids that govern activity. An MD simulation analysis were performed to access the stability of the complexes studied and validate the molecular docking results. The ADME / Tox properties of the modi ed compounds were also predicted to con rm the biodispobinility of the populated molecules as a drug candidate.

Research Method
Data set source In this research, we selected a database of 29 compounds nitroisoxazole containing spiro [pyrrolidine-oxindole] derivatives from the literature [15]. The basic structure used is replaced by an isoxazole in a small molecule inhibiting GPX4 for the development of new anti-tumor drugs. A uorine substitution has also been found in the basic structure studied, where it reveals an essential role in improving the e cacy of drugs which has become widespread in recent decades [17,18]. In particular at the level of the optimization of the molecular properties of a drug and of in uencing the selectivity thereof. target by modulating its conformation, permeability, hydrophobicity, metabolism and lipophilicity [19,29]. In addition, in 2019, eleven uoride-containing drugs were authorized by the FDA, more than half of which contain a tri uoromethyl group [20]. This suggested to us that the strategic integration of a tri uoromethyl group can enhance the inhibitory activity of a drug to some level.
The database structures were designed and optimized and again saved in mol2 format utilizing SYBYL X2.0 software [21]. Biological activity data of nitroisoxazole derivatives in these literatures are expressed as Ki (10-6 M), and we have converted them to logarithmic pki (-Log ki). The structures and activity (pKi) used in this work for were collected in Table 1. The 29 compounds were haphazardly divided into 23 compounds for the training set and 6 compounds for the test set in order to facilitate the 3D-QSAR analysis [22]. Minimization and alignment of molecular structure The structures of the selected compounds were minimized using the conjugate gradient under the standard Tripos Field force eld implanted in SYBYL-X2.0 [23] with partial atomic charges of Gasteiger-Hückel by the method of Powell with a convergence criterion of 0.05 Kcal / mol. The maximum iterations number is 10.000. Molecular alignment is an essential stage in 3D-QSAR for associated the orientation of molecules in 3D space. The highest activity compound N°04 was used as a template for alignment. The result of the structural alignment of all inhibitor molecules and the template compound structure is presented in Fig. 2.
Construction 3D-QSAR models 3D-QSAR study included the usage of CoMFA and CoMSIA, They are based on the idea that a compound of similar structure interacting with the receptor site in the same mode implies that the molecular elds involving them must also be similar, which shows that the biological activity of the compounds is intimately connected to the molecular eld that neighborhoods them. Using the Lenard-Jones function to compute the steric eld (S), as well as the Coulomb function to determine the electrostatic eld (E), the CoMFA model focuses primarily on van der Waals interactions [24].
Since most of interactions are very complex, and sometimes the steric eld and the electrostatic eld are insu cient to build the model. For this reason, we had to integrate the CoMSIA method [25]. The essential principle of the CoMSIA model is similar to CoMFA, with the addition of a hydrophobic eld (H), a hydrogen bond donor eld (D) and a hydrogen bond acceptor eld (A). The grid 3D used the spacing grid 2.0Å with the three Cartesian directions. The Gaussian function is used to calculate the energy of the eld molecular. The attenuation factor was set at 0.3.

PLS method
The relation between the activity and the descriptors CoMFA, CoMSIA and HQSAR was quanti ed by the method of the partial least squares (PLS) integrated in SYBYL X2.0 [26]. We carried out a Leave-one-out (LOO) to select the best value of validation coe cient cross Q² and the best principal component number (NOC). The value of the correlation coe cient (R²), the value F, the standard deviation of estimate (SEE) and the percentage of contribution were selected by non-validation. The 3D-QSAR model is predictive and reliable if Q²> 0.5, r 2 > 0.6, the SEE value should be as little as possible and the F value should be as large as possible [27].

Construction HQSAR model
The HQSAR method is a novel two-dimensional QSAR method that provides an unequaled advantage in molecular design. HQSAR can certainly display where optimization is required, as well as which groups of atoms contribute positively or negatively to molecular activity and nd the appropriate structural substitutions in molecule design. The Validation of the CoMFA, CoMSIA and HQSAR models The selected 3D-QSAR and HQSAR models must also be validated by external validation by using the test set reserved, The external validation was carried out according to Golbraikh-Tropsha criteria and using Roy metrics [30][31][32]. The conditions that the 3D-QSAR models and the HQSAR model must meet certain conditions for a more reliable external predictive capacityare as follows: -Q 2 > 0.5, R 2 > 0.6, (r 2 -r 2 0 ) / r 2 <1, (r² -r'² 0 ) /r² <0.1, 0.85 ≤ k' ≤1.15, 0.85 ≤k ≤ 1.15. Where: k and k′ are respectively the slope of the predicted values to those observed and the observed values to those predicted for the test set with zero intercepts/ -Δr m ² <0. 2, Δr' m ² <0. 2, r2 m > 0.5, in Roy's criteria.
Where -r 2 is the square correlation coe cient of the predicted and observed activity values of the test phase.
-r 2 0 and r' 2 0 are respectively the coe cients for determining predicted values and those observed and the observed values and those predicted for the test set with zero interception.

Molecular docking study
In this work, molecular docking was used to understand the mechanism of interactions between nitroisoxazole derivatives and the chosen receptor code was obtained from the RCSB protein database (https : //www.rcsb.org/) PDB: 4LWU. The AutoDock program performed all molecular docking steps [32]. Within a week of removing all water molecules. The polar hydrogen atoms and Gasteiger charges were added to the receptor protein chain. A grid box spacing of 0.708 Å was taken, with a number point in x, y, and z-dimensions of -11.639, 21.472, and -4.528, respectively. Molecular docking generated nine conformations based on docking a nity, the conformation with the lowest binding energy was then selected.
Furthermore, re-docking of X-ray bound ligand (20U) into their own active site was executed to validate the accuracy of the molecular docking protocol. A molecular docking protocol is reliable if the root mean square deviation (RMSD) is less than 2Å [33].

Dynamic molecular analysis
The Desmond program, a 2005 explicit solvent MD package with the OPLS xed force eld, was used to run MD simulations on the best docked ligand-protein complexes [34]. To simulate the system, an orthorhombic box with periodic boundary conditions was utilized as the solvent and a preset SPC (Simple Point Charge) water model as the solvent. An electrically neutral simulation system was constructed of 0.15 M NaCl was developed. Using the Nose-Hoover thermostat algorithm and the Martyna-Tobias-Klein Barostat method, we stabilized the temperature at 300 K and the pressure at 1 atm throughout the simulation[36-38]. For 100 ns, the simulation was run under TNP conditions, and the trajectory information was acquired using the Berendsen thermostat and barostat methods throughout the remaining 2 ps[38].

Simulation binding free energy analysis
In order to calculate binding free energies of ligand-protein complexes, molecular mechanics and generalized Born surface area (MM-GBSA) were operated. A simulation with the VSGB solvation model and the OPLS3e force eld was run with a 100-step sample size on frames 0-501 to estimate the Prime MM-GBSA binding free energy. For Prime MM-GBSA's free binding energy (kcal/mol), the additivity principle was employed in which each component's separate energy modules were combined to estimate the binding free energy of the protein and ligand.

Lipinski rule
The molecular properties of a compound in uence pharmacodynamics and its pharmacokinetics. Hence optimize their absorption, distribution, metabolism and excretion (ADME) in the human body as a drug. Lipinski's rule validation demonstrate good oral bioavailability and a balance between the aqueous solubility of a compound and its ability to passively diffuse through different biological barriers. √ furthermore, Veber's rule indicates that molecular exibility and polar surface area (PSA) are crucial factors of oral bioavailability [39]. This molecular exibility governed by the number of rotating bonds. Reduced molecular exibility and low PSA are important predictors of good oral bioavailability that molecular exibility determined by the number of rotating bonds. The reduction in molecular exibility and the low PSA are important predictors of good oral bioavailability [40].

3D-QSAR statistical results
A CoMFA analysis was generated with static and electrostatic elds and, 31 possible combinations of ve descriptors for the CoMSIA analysis, as the dependence on the ve CoMSIA descriptors results in a decrease in the signi cance of the model [41]. We selected the models with values Q² and R² were respectively greater than 0.5 and 0.6 Table 2.

HQSAR statistical results
The construction of the HQSAR model was adopted by three factors, containing the fragment size, the fragment distinction and hologram length. The statistical parameters of the established HQSAR models are presented in Table 3-4, the parameters R², Q² and SEE were used to display the validity of the models. The most excellent hologram model was generated using a histogram length of 257 having four optimal components.
The descriptors used to create the model were B (bonds) and H (hydrogen atom), the addition of descriptor Ch  Validation of QSAR models The predictive capacities of the QSAR models were validated externally using the six molecules reserved for the set of tests [29]. As it was mentioned after that the parameters Q 2 and R 2 , obtained from internal validation, are necessary but insu cient to con rm the stability and the predictive capacity of the models.
The results of the external validation are listed in Table 5. The best QSAR models for the test set gave the R 2 pred , r² m , r'² m ,  Considering the values of the performance parameters provided by the external validation (Table 5), it is obvious that among the three models generated, only the CoMSIA model respected all the criteria of Tropsha and Glorbaikh, and the metrics of Roy. these results validated the predictive power of the CoMSIA / S+H+A model to estimate the biological activities of new compounds.
The correlation plots between the experimental pki and predicted pki and the residuals of the CoMFA and CoMSIA and HQSAR models are displayed in Fig. 3a, 3b and 3c, respectively.

Visualization contour carts CoMFA and CoMSIA
The ability to visualize the results as three-dimensional contour plots is an intriguing feature of CoMFA and CoMSIA modeling. The contour maps were created by multiplying the coe cients and standard deviations associated with each CoMFA or CoMSIA column by a scalar product. The resulting maps depict regions with scaled coe cients greater than or equal to 80% (favored/unfavored). They will give us with su cient information about the regions that are favorable or unfavorable for the compounds' activity, allowing us to tweak and strengthen the inhibitors previously reported.
The steric and electrostatic elds for CoMFA analysis are illustrated by contour plots on the most active compound N°04 as seen in Fig. 4. In the case the steric eld CoMFA, green contours indicate regions where bulky groups improve biological activity, while yellow contour indicate regions where bulky groups decrease activity. The presence of two signi cant zones of green contours around the chlorine atom of R 1 substitution and the CF 3 group of R 3 substitution indicates that the substituents at these locations have favorable steric interactions. A small green contour between the para and ortho phenyl ring positions of the R 4 substitution suggests that the bulky substituents will rise proportionally biological activities. Three yellow polyhedrons concentrated around the meta position of the R 3 substitution phenyl ring suggest that less bulky substituents in the position will improve biological activity (Fig. 4a).
The Fig. 4b was the contour map of the electrostatic eld, the electronegative and electropositive regions were represented by blue and red contours, respectively. Red and blue contours indicate regions where bulky groups improve biological activity, while yellow contour indicate regions where bulky groups decrease activity. Three red contours around the CF 3 group, as well as the NO 2 and methyl of the isoxazole group, signal that these positions are appropriate for electronegative groups to considerably increase biological activity (Fig. 4b). Two small blue contour regions between pyrrole and isoxazole and near the core of R 3 substituent phenyl indicate positively charged substituents may improve activity.
In the case of the CoMSIA model, the steric, hydrophobic, hydrogen bond acceptor contours are presented in Fig. 5. In the steric eld, green contours indicate regions where bulky groups improve biological activity, while yellow contour indicate regions where bulky groups decrease activity. In the map of steric contours of CoMSIA, we noticed two large contours one near the CF 3 group of R 4 substitution, and the other covers the ortho and meta position of R 3 substitution phenyl which indicates that a bulky substitution is required in this region. Furthermore, we could also see two large yellow contours covering the two faces of the isoxazole ring showing that the substitution in this position is unfavorable to the increase in activity Fig. 5a. In the hydrophobic eld, the yellow contour indicates the regions where the hydrophobic areas are favorable, while the gray contour indicate the regions where the hydrophile regions are favorable. the hydrophobic grey contours covering the R 4 substitution CF 3 group, the phenyl para position of the R 3 substitution and also NO 2 group, suggest that the substitution of hydrophilic groups in these positions could enhance biological activity. while yellow outlines localize near the chlorine atom of group R 1 , and around oxindole show that the substitution of the hydrophobic groups could enhance the activity Fig. 5b.
The hydrogen bonds acceptor eld (A) plays a major role in the bioactivity compared to the steric and hydrophobic elds (percentage contribution 53%). The cyan colored contour and the purple colored contour denote the hydrogen bond acceptor favorable and unfavorable region. As shown in Fig. 5c, a large cyan contour near the NO 2 group indicates that the hydrogen bond acceptor group at this position suitable for activity. The unfavorable hydrogen bond acceptor red contour was noticed near the ring of the phenyl ring, indicating that the hydrogen bond acceptor group is not favorable at this position.

Hologram QSAR Atomic Contribution Plot Interpretation
On the most active compound N°04, as well as the least active compound N°20, an atomic contribution graph generated by the HQSAR model is shown Fig. 6. The different colors indicate the degree which each atom contributes to the total biological activity. The negative contributions are illustrated by the red, red-orange, and orange patterns, while the positive contributions are displayed by the yellow, green, and blue-green patterns. Grey colors have been used to indicate intermediate contributions. The most common substructure has been indicated in cyan. The appearance of red or orange phenyl hydrogen indicates that it contributed negatively to the activity. The hydrogen atoms of isoxazole's methyl group were whitish, indicating an intermediate contribution.

Designed molecular docking results and analysis
The results of established 3D-QSAR models were used for the design of new potent inhibitors, as shown in the Table 6.
Compound N°04 exhibited the highest activity and was taken as a template to design new compounds. In the phenyl ring at the R 3 substitution, there was the favorable steric contour between the para and meta position, suggesting that bulky substitutions increased biological activity. As a result, cyclohyptane was introduced at this position, and the Pred01 compounds were designed. Table 6 Structure, predicted pKi and a nity of newly designed compounds The four designed compounds, the most potent inhibitor (N°04), as well as the control were docked in the binding site of protein, for the purpose of detecting the key amino acids in the biological activity studied, as well as to acquire indepth insight into crucial interaction modes. We selected the conformation with the highest a nity score value among nine output conformations. According to the docking result, the newly designed compound Pred01 had the highest binding a nity -9.4 kcal / mol, while compound N°04 and Nutlin showed an a nity of -8.4 kcal / mol and -9.2 respectively Table 6. The amino acid His92 is in a suitable position to be involved in the formation of hydrogen bond of compound NO 2 (2.48 Å) and compound Pred01 (2.83Å) with the oxygen atom of NO 2 group and the nitrogen atom of isoxazole respectively. These observations are precisely in agreement with the contour maps of the H-binding acceptor in the CoMSIA odels. In addition, the compound Pred03 form a hydrogen bond with the same His92 residue on distance 2.56 Å. These results indicating this amino acid plays an essential role in controlling activity (Fig. 7). Thus, these residues may be critical indications for rational design to nd out more potent MDM2-p53 inhibitors.
The docking methodology was validated by the redocking technique by redecking the co-crystallized ligand, then calculating the RMSD between the reference co-crystalline and the re-docked co-crystallized as shown in Fig. 8. The result of the docking showed that the docking of the co-crystallized ligand shows an RMSD value equivalent to 0.87 Å, indicates that the methodology is accurate in predicting the binding a nity for ligands.

Lipinski rules and veber parametrs
The lipinski rules are widely used to assess drug-like compounds. According to these rules, compounds that do not meet at least two of the following criteria are very likely to have problems with absorption or oral permeability [42]. The Veber and lipinski properties of the newly designed compounds are collected in Table 7. According to Table 7, the most soluble compound in aquatic and lipid solutions is Pred 01, followed by Pred03, N°04, Pred01, and Pred02. Giving to Lipinski's rules, these compounds should have no di culty with oral bioavailability. We note that all compounds have PSA and Nrot Bond values of less than 140A 2 and 10 respectively, indicating that they are well predicted for oral bioavailability and membrane transport. We can convince these compounds to comply with Veber's rule. As shown in Table 7, the synthetic accessibility values of the new compounds are located within a range closest to the representative molecule of the dataset N°04 and can be considered encouraging.

Molecular dynamics analysis
In this part, we have selected three complex N°04/receptor, Nutlin-3/receptor and Pred 01/receptor. For a study molecular dynamics simulations were performed in 100 ns to assess conformational stability very much closer to physiological environmental conditions .The RMSD trajectories of free protein and ligand-protein complexes were used to assess the stability of the selected systems as were illustrated in Fig. 10. respectively, indicating that compound Pred 01 could form a more stable complex with the target protein binding site Fig. 9.

Page 15/23
The evolution of RMSF for the three complexes, indicating the local changes in the protein chains, generally the three complexes uctuate weakly such that does not exceed 3Å, but the complex relating to compound Pred 01 uctuates more weakly than the others and does not exceed 2Å Fig. 10.
The interactions of protein/ligand and the graphs of contacts were monitored throughout the simulation are also illustrated in Fig. 11. During the MD simulation equilibrium steps the protein-ligand interactions were evaluated to identify the crucial residues involved in the interaction for the three ligands. For compound N°04, the most stable interactions were hydrophobic conctates with His92, Leu50, Ile95 and Ile 57 accounting for 30, 28, 26, 17% of the whole Simulation Trajectory, respectively. Other hydrogen bond interactions formed with His92 and Gln55 were also observed, His92 formed a hydrogen bond with the nitrogen dioxide group, with percentages of 12 and 5% respectively, indicating the hydrophobic interaction formed with these residues played an important role in the inhibitory activity interacting with the protein.
In addition, the most signi cant water bridges were observed with Gln55, Tyr51 and Gly54 residues accounting for 23, 21 and 13% respectively of all MD simulation time, indicating the important role of these bindings in the receptorligand interaction.  Binding free energy calculation using prime molecularmechanics/generalized born surface area (MM-GBSA) approach The binding free energy was estimated by Coulomb energy (Coulomb), covalent bond (Covalent), hydrogen bond (Hbond), lipophilic bond (Lipo), π-π Packing interaction, the solvent the generalized binding and the binding of Vander Waals (VDW) the values of the contribution of each element on the free energy of total binding between the compounds N°04, Nutlin-3, Pred 01 and the binding pocket of the target protein were has been shown in Table 8. In addition, the newly predicted pred01 compounds, N°04 and Nutlin-3 as the control compound were selected for the molecular docking study and the 100 ns MD simulations to analyze the mode of ligand interaction in the binding site of the receptor and assess the stability of the docked complexes respectively. To obtain more information, the binding free energy was examined using the MM-GBSA method. The value of the free energy con rms the stability of the Pred 01 ligand showing the lowest energy. All of these interesting results can be invested in and used to design a nitroisoxazole derivative drug containing spiro [pyrrolidine-oxindole] as more potent inhibitors of GPX4 / MDM2 -p53. Figure 2 Molecular alignment, (a) most active compound N°04 among the data set (b) 3D-QSAR structure aligned compounds of the training set  Hologram quantitative structure activity relationship model contribution map of (a) the most compound N°04 and (b) the least compound N°20  Histoframe showing the protein/ligand interaction and A timeline representation of the interactions and contacts of three compounds, (a) compound N°04, (b) Nutlin-3 and (c) Pred01 throughout the simulation time of 100 ns.

Figure 12
Free energy binding of the studied complexes