6-Ethoxy-4- N-(2-morpholin-4-ylethyl) -2-N-propan-2-yl-1,3, 5-triazine-2, 4-diamine endows herbicidal activity against Phalaris minor a weed of wheat crop field: An in -silico and experimental approaches of herbicide discovery

Phalaris minor is a major weed of wheat crop which has evolved resistance against herbicides. Isoproturon is the most accepted herbicide developed resistance in 1992. Later, introduced herbicides also developed resistance and cross-resistance to their respective binding sites. Isoproturon binds at the QB binding site of the D1 protein of photosystem-II (PS-II), which blocks the electron transfer in photosynthesis. In this work, we have carried out a series of computational studies to prioritize the promising herbicides against D1 protein of P. minor. Through the computational studies, twenty-four lead molecules are prioritized which have shown a higher binding affinity and inhibition constant than the reference ligand molecule. The binding and conformational stability of docked complexes was evaluated by molecular dynamics simulations and binding free energy calculations i.e., MM/PBSA. A list of amino acids such as Ala225, Ser226, Phe227, and Asn229 present in the binding site of protein is obtained to be playing an important role in the stability of the protein-lead complex via hydrogen bond and π-π interactions. Binding free energy calculation revealed that the selected lead molecule binding is energetically favorable and driven by electrostatic interactions. Among 24 leads, computational results have uncovered eight promising compounds as potential herbicides which have shown comparable physiochemical profile, better docking scores, system stability, H-bond occupancy, and binding free energy than terbutryn, a reference molecule. These prioritized molecules were custom synthesized and evaluated for their herbicidal activity and specificity through whole plant assay under laboratory-controlled conditions. The lead molecule ELC5 (6-ethoxy-4-N-(2-morpholin-4-ylethyl)-2-N-propan-2-yl-1,3,5-triazine-2,4-diamine) has shown comparable activity to the reference herbicide(isoproturon) against P. minor.


Introduction
P. minor is the most serious grass weed of wheat, widely distributed over the rice-wheat cropping belt of northwest India [1,2]. The most threatened states of India are Punjab, Haryana, Himachal Pradesh, and also a serious cause of concern in Uttar Pradesh, Uttaranchal, Madhya Pradesh, and some parts of Bihar. Isoproturon has worked as the most effective herbicide to control P. minor growth, but it has developed resistance and thus withdrawn in 1992. Henceforth, some new alternate herbicides viz. clodinafop, fenoxaprop, sulfosulfuron, and tralkoxydim were introduced in the market in 1997-1998. Application of these alternate herbicides brought the P. minor infestation under control for some period and helped to resume wheat production. However, cross-resistance to these four alternate herbicides had been also reported on a few biotypes [3,4]. Singh et al. has reported difference in RAPAD profiles of susceptible and resistant biotypes of P. minor [5]. Tripathi et al. have reported four point mutations in the D1 protein of P. minor, which indicate target site alteration [6].
Isoproturon herbicide binds at the Q B site of the D1 protein of PS-II located in the thylakoid membrane of the chloroplast. PS-II reaction center is a heterodimer of D1 and D2 protein encoded by the psbA and psbD gene [7]. PS-II complex binds with the reaction center pigment P680, and the primary acceptor pheophytin, as well as the secondary acceptors Q A and Q B ,as shown in Fig. 1. The D1 and D2 proteins of the reaction center together bind with two pheophytin molecules and four chlorophyll-α molecules. The herbicide and Q B binding niche are prone to mutation lie in the transmembrane helix IV, and V. Horovitz et al. have reported that due to mutation of serine at the 264th position, it is replaced by glycine in the mutant D1 protein [8]. Because of this, the herbicide molecule is deprived of one hydrogen bond as it cannot form H bonds with glycine. This results in a considerable decrease in affinity with herbicide molecules towards the D1 niche and the normal PQ B molecule easily replaces them from the niche; thus, the normal electron transport continues in the mutant even in the presence of herbicide [9,10]. In our earlier in silico studies, few analogs against ACCase protein of P. minor were proposed and designed [11] and tested against D1 protein, wherein triazol derivative of isoproturon was the most effective [12,13]. Likewise, other researchers have employed rational approaches for the identification of inhibitor/herbicides against the number of targets [14][15][16][17]. Kishore and Singh have reported that D1 protein of P. minor and wheat share high sequence similarity; therefore, designing selective herbicides against this weed imposes a huge challenge [18]. Using rationale approach of herbicide discovery, our group is focused to design and develop new herbicide against the P. minor weed of wheat crop [11][12][13]. In this study, we have modeled D1 protein, and then high-throughput virtual screening, docking, de novo design of chemical molecules, molecular dynamic simulation, and binding free energy are carried out and compared with ref_lig to identify the promising herbicides against the D1 protein.
By virtual screening, large datasets of retrieved small molecules have been screened to identify potential lead molecules and a de novo approach was followed up to design new analogs, against the D1 protein. The binding affinity of screened hit molecules has been tested by docking studies, and then docked conformations were evaluated with MD simulations followed by binding free energy calculations. The implemented computational methods in the field of drug design provide the finest blueprint for in vivo experimental validation. Later in this study, a whole plant assay [19] has been performed under laboratory-controlled conditions, to test and determine the biological activity, efficacy, specificity, and sensitivity of new molecules.

Homology modeling
The structure of the D1 protein was modeled by Modeller v9.12, a homology modeling program [20]. The modeling and validation approach is being described briefly as follows: (i) target template alignment, (ii) derivation of the model by MODELLER v9.12, (iii) structure refinement, and (iv) validation of the model. The spatial restraint method has been Fig. 1 Architecture of D1 and D2 protein of photosystem-II and Q A site on D2 protein showed in a black color circle employed to construct the model which satisfies restraints derived from the template structure. The restraints, including bond lengths and bond angles, van der Waals contact distances, and dihedral angles, are then mapped on the target structure based on the alignment. Derived models were evaluated based on their discrete optimized protein energy (DOPE) scores and root mean square deviation (RMSDs). Steepest descent energy minimization of modeled D1 protein was carried out for 1000 steps and, subsequently, conjugate gradient energy minimization was done for 500 steps using Swiss PDB Viewer 4.10 [21]. Harmonic constraints were used for energy minimization in SPDBV. Further, modeled structure was validated using structural analysis and verification server (http:// nihse rver. mbi. ucla. edu/ SAVES/) inbuilt tools such as PROCHECK, WHAT_CHECK, WHATIF, PROSA, ERRAT, PROVE, and VERIFY-3D. Superimposition between modeled D1 protein and template structure was performed using TM-align server (http:// zhang lab. ccmb. med. umich. edu/ TM-align/) [22][23][24].

Preparation of virtual screening dataset
Herbicides, known for PS-II inhibition, were retrieved from the international survey of herbicide-resistant weed database (http:// www. weeds cience. com/ summa ry/ home. aspx). The retrieved molecules were classified into three different classes i.e., C1 (phenyl-carbamates, pyridazinones, triazines, triazinones, triazolinone, and uracils), C2 (ureas and amides), and C3 (nitriles and Phenyl-pyridazines). To maintain the possible diversity of the selected molecules of the aforementioned classes, all possible analog shaving 70% 2D structural similarity or above were retrieved from ZINC [25] and PubChem [26] databases.

Virtual high-throughput screening
Virtual-high throughput screening has been performed by Molegro [27], a method to identify new pharmacologically active compounds on massive trials. The virtual highthroughput screening (vHTS) has been employed to predict the putative binding affinities of small molecules with D1 protein of PS-II at the Q B binding site. The bond length, bond order, explicit hydrogen, and flexible torsion have been checked and assigned in both the protein and ligand data set. To screen the prepared ligand dataset, the following parameters were taken into consideration i.e., number of runs 50, population size 50, crossover rate 0.9, scaling factor 0.5, maximum iteration 5,000, and grid resolution 0.30 Å. The prepared data set was taken for screening for both rigid and flexible approaches at the center of ref_lig i.e., X: 20.63; Y: 65.50; and Z: 35.64 with 10-Å diameter from the center of terbutryn binding. A total of 43 amino acid residues were kept flexible, which lies under the 10-Å diameter of the grid to carry out the flexible docking. The flexible amino acids were Leu172, Phe173, Ser174, Ala175, Met176, HIS177, Ser179, Leu180, Val181, Ser183, Ser184, Tyr208, Asn209, Ile210, Val211, Ala212, Ala213, His214, Tyr216, Phe217, Arg219, Leu220, Ile221, Phe222, Gln223, Tyr224, Ala225, Ser226, Phe227, Asn228, Asn229, Ser230, Arg231, Leu233, His234, Phe235, Phe236, and Leu237. In both rigid and flexible approaches, molecules have been selected based on reranking, H bond, and MolDock scores.

De novo design of inhibitors
De novo design of new analogs has been performed by Lig-Builder v2 package [28], a tool for structure-based drug discovery. It has a CAVITY and BUILD module for pharmacophore modeling, de novo synthesis of ligands, and processing of synthesized molecules. The ligand detection mode of the CAVITY module was used to find the key interaction residues at the binding site with the minimal distance between any two pharmacophore features (not to be less than 3 Å) and the binding site was selected based on ligandability less than or equal to 6.0Kd (equivalent to 1 µM). The EXTRACTION submodule was used to design seed molecules to ensure the binding affinity that decides the manner in which the ligand would grow, and growing mode of the BUILD module was used to construct molecules in the binding pocket. The whole construction process was controlled by a genetic algorithm. Protein-ligand binding affinity was evaluated by using an empirical scoring function: chemscore [28]. Besides binding affinity, the synthesizability of the ligands was also taken into account by applying certain sets of chemical rules (LigBuilder, http:// www. ligbu ilder. org/).

Clustering of small molecule
Ligands obtained from vHTS and de novo design have been clustered separately in a hierarchical fashion by the ChemAxon'sJKlustor Suite. The MCS tool of JKlustor has been used to search for the maximum common substructures (MCSs) of a compound library hierarchically. Clustering in a hierarchical fashion identifies the largest substructure shared by several molecular structures (JKlustorhttps:// www. chema xon. com/ produ cts/ jklus tor/). All the vHTS and de novo synthesized ligands were clustered into 13 and 11 groups respectively. These molecules were clustered to retain at least one typical molecule of each class of herbicides. For further discussions, vHTS and de novo clusters are named as existing ligand clusters (ELCs) and de novo synthesized ligand clusters (DLCs) respectively.

Prioritization of leads
Molecular docking was performed for both ELC and DLC lead molecules at the binding site (terbutryn binding site) of D1 protein using AutoDock [29]. AutoDockTools (ADT) was used to select the parameters and to prepare the ligand and protein files. The screened molecules (screened through vHTS) were subjected to rigid docking followed by flexible docking. The grid parameter file was prepared by setting the dimensions of the grids for resistant D1 proteins, 40:40:40 Å, and spacing 0.375 Å with the center coordinates (20.63, 65.50, and 35.64). Grid point spacing was calculated for each of the atom types of protein and ligand (protein: A (aromatic C), C, HD, N, NA, OA, and SA; ligand: C, A, HD, N, NA, e (electrostatic), and d (desolvation)). AutoDock-run parameters were set as maximum iterations of energy evaluations, iterations of genetic algorithm, and number of GA per run were set to 2,500,000; 27, 000; and 100 respectively. The Lamarckian Genetic Algorithm (LGA) was selected for the ligand conformational search. The best molecule of each ELC and DLC cluster was selected based on binding energy, ligand efficiency, and inhibition constant. Finally, twentyfour leads were selected (13 from ELC and 11 from DLC).

Lead validation through molecular dynamics simulation
Molecular dynamics (MD) simulation was performed using Gromacs4.6.5 [30] to validate the 24 prioritized leads. The stability of D1 protein, complexed with ref_lig (docked complex), was evaluated in the lipid bilayer. MD simulations for the prioritized 24 lead complexes were evaluated in two sets, such as, in the first set, all 13 ELC, and, in the second set, all 11 DLC complexes were evaluated. The topology file for ELCs and DLCs was generated using PRODRG server [31] in the framework of GROMOS 53A6 force field. In order to impersonate the real environment, simulation of protein-lead complexes was carried out in DPPC (dipalmitoylphosphatidylcholine) lipid bilayer environment [32]. The desired orientation of the D1 protein-ligand complex in lipid bilayer was taken from the Orientations of Proteins in Membranes (OPM) database (http:// opm. phar. umich. edu). The bilayer was then expanded by translating lipid molecules and scaling the lateral dimensions of the box size using a scaling factor of 4 and subsequently shrinking step was performed to scale down the lipids by a factor of 0.95 to avoid the overlapping of the atoms in the system. During the scaling of the lipid bilayer, 22-23 steps of shrinking and energy minimization were performed to reach an area per lipid of ~ 71 Å 2 (close to the experimental value of ~ 62 Å 2 ). Lipid parameters and topology files were obtained from Tieleman's website (http:// moose. bio. ucalg ary. ca) [33]. Afterward, the DPPC head groups were hydrated by adding sufficient solvent molecules on either side of the lipids (unit cell dimension 8.30626 Å × 8.33877 Å × 6.59650 Å). The resulting simulation system contained 125 DPPC molecules (63 molecules in the upper leaflet and 62 in the lower leaflet) solvated on both sides with SPC water. The complete system contains ~ 21,000 atoms. A periodic boundary condition was employed to avoid edge effects. At physiological pH, the protein was having a negative charge; therefore, to make the system electrically neutral, 10 Na + ions were added to the simulation box. The system was relaxed by energy minimization (10,000 steps of the steepest descent and 10,000 steps of the conjugated gradient) where protein and ligands were harmonically restrained to their original position with an isotropic force constant of 1000 kJ mol −1 nm −2 . The above restraints allow for further equilibration of the bilayer and solvent while keeping the conformation of protein and ligand unchanged. The bilayer, solvent, and protein-ligand were separately coupled to the temperature and pressure bath. Temperature and pressure were maintained constant at 300 K and 100 kPa using weak coupling technique with coupling and relaxation parameters of 0.1 and the 1.0 Ps respectively [34]. The resulting systems were subjected to the production run for 240 ns (10 ns for each complex) using GROMOS 53A6 force field [35]. Linear constraint solver (LINCS) algorithm [36] was employed for covalent bond constraints. Results were analyzed by extracting information from trajectory files obtained from simulation. The root mean square deviation (RMSD), root mean square fluctuation (RMSF), and hydrogen bonds were analyzed using the Gromacs modules and graphs and images were produced using Xmgrace and Visual Molecular Dynamics [37].

Binding free energy calculation
Binding energy (ΔG bind ) of D1 protein complexes of ELCs and DLCs leads was estimated from the trajectory file of each complex obtained in MD simulation using molecular mechanics-Possion-Boltzmann surface area (MM-PBSA) method [38]. The polar solvation energy values were calculated using the implicit solvation Poisson-Boltzmann model (PB) and nonpolar solvation energy values were calculated based on the solvent-accessible surface area (SASA). Finally, these energy components were combined to obtain ΔG bind value as shown in Eq. 1.
where ∆E vdw and ∆E elec represent van der Waals interactions and coulombic interactions respectively. ΔG PB and ΔG sur represent polar and nonpolar solvation energies, respectively. The entropic term TΔS was computed in the normal mode approximation [39].

Theoretical study of physiochemical properties of known herbicides of similar group
Most of the molecules fail during the experimental testing due to their poor physiochemical properties. Therefore, the properties of existing herbicides were investigated using Drug Likeness Tool, DruLiTo, (http:// www. niper. gov. in/ pi_ dev_ tools/ DruLi ToWeb/ DruLi To_ index. html).

Testing the bioactivity of test molecules
The proposed prioritized molecules were synthesized and characterized by Alba Nova (Alba Nova Life Science Private Limited, Hyderabad). The molecules were high in purity which was directly employed for testing. Some of the prioritized molecules were excluded due to the high-cost index of synthesis. The efficacy and specificity of custom synthesized molecules were evaluated through whole plant assay. The experiment was designed in triplicates under laboratorycontrolled conditions. The test was conducted in plastic Petri plates (9-cm diameter). Ten seeds of P. minor, as well as wheat, were arranged on the moist double-layered Whatman filter paper in each plate. The prepared plates were placed in the plant growth chamber for germination of seeds, at 17 °C in dark conditions, and relative humidity of 45% for 6 days. An aliquot of distilled water was continuously supplied to Petri plates to keep the germinating seeds hydrated. The proposed herbicides were evaluated on P. minor at a concentration of 600 μM, along with control, isoproturon treated, and untreated control in triplicates. Isoproturon herbicide was taken as control/reference for bioactivity testing since terbutryn has an adverse effect (non-selective) over wheat plantlets: a non-selective herbicide. After 6 days (post germination of seeds), the seedlings in Petri plates were subjected to test as accordingly mentioned factors. The seedlings were provided with added 2 mL of respective solutions (distilled water in control, isoproturon in treated control, and test molecules in test-treated population) at an interval of 2 days, until a total 10 mL of the solution was supplied to each plate, followed by adding an aliquot amount of distilled water for hydration. Petri plates have again been incubated in plant growth chambers under the same optimal conditions but with altering light conditions (8/16 h day/ night cycle) for the next 15/16 days. Afterward, observations regarding the effect of the compound upon plantlet growth, biomass accumulation, and other physical changes that occurred in plantlets due to treatment have been noted. Further data of wet and dry weight of plantlets have also been tabulated to compare the effect of compounds through biomass accumulation. Later, the molecule with the most effective efficacy and specificity towards the said weed was determined by comparing the results with the reference molecule isoproturon.

Testing the efficacy and selectivity of proposed molecule in laboratory controlled condition
The different subjects considered in this test were the same as earlier mentioned (untreated control population, treated control population with isoproturon as a reference, and treated populations with synthesized lead molecules). Further, the effective molecule was tested in triplicate to determine the concentration at which the growth rate of weed plantlets gets reduced by 50% (GR 50 ). The testing was performed on a concentration of 0-600 μM at a regular interval of 40 μM in order to determine the meticulous GR 50 dose of the new effective herbicide molecule.

Homology modeling
Resistance D1 protein (gi|30,413,115|gb|AAP33145.1) was taken as a query sequence to find out the most suitable template and it has shown significant alignment with chain A of the photosynthetic oxygen evolving center (1S5L) having a score: 503; query coverage: 99%; identity: 90%; and E value: 1e-178 and resolution 3.5 Å. Cyanobacterial D1 protein (pdb|3PRQ) complex with terbutryn at the QB binding site has also shown significant alignment i.e., query coverage of 100%, identity 90%, similarity 95.2%, and E-value 1e-178 without any gap in the alignment and resolution 3.2 Å (Fig. 2). The crystal structure of Cyanobacterial PS-II consists of 20 polypeptides along with 88 ligands and 10 cofactors [40] which was selected as a template protein to model the D1 protein. A total of 10 models were built and the best model was selected based on their discrete optimized protein energy (DOPE) scores and root mean square deviation (RMSDs). The selected model was subjected to a short 1000 and 500 steps of steepest descent and conjugate gradient energy minimization respectively and then subjected to the structural validation using SAVES server and other approaches.
The Ramachandran plot showed phi and psi torsion angle of 93.6% amino acid residues in core regions and 6.4% amino acid residues in the allowed region for the modeled structure whereas 82.3% amino acid residues in the core region and 17.0% amino acid residues in the allowed region for template structure (Table S1 and Figure S1). The quality of the model was also validated by the ProSA server, a web server for protein structural analysis. The ProSA Z-score was − 4.0 which indicates the overall quality and measures the deviation from the total energy of the structure with respect to an energy distribution derived from random conformations. A negative PROSA score indicates the correctness of the modeled protein. Verify 3D was used for validation of the modeled protein and it showed an average 3D-1D profile score − 0.09 and 0.52. TM-align server was used to further validate the modeled protein which showed TM score 0.99125 (0.5 < TM Score > 1, indicates about the same fold), suggesting the modeled protein to be of good quality, as shown in Fig. 3. Thus, results suggest that the comparative modeled structure of D1 protein is of good quality, and could be used for further studies. The homology modeling was performed to get the blueprint of the binding mode of herbicide terbutryn complexed with D1 protein (PDB ID3PRQ/4V82) which is precisely located at the QB site of D1 protein for the design and development of herbicides. The target template identity is ~ 90% without any gap so the model structure could be considered equivalent to crystallographic structure except for a few side chains [41].
In a previous study, our group has built a chimeric homology model of the PS-II reaction center of P. minor. The D1 protein is truncated at both N and C terminals, so both missing ends were concatenated with ref sequence of T. aestivum's D1 protein. Sequence for D2 protein was taken from its nearest homolog i.e., T. aestivum (from Ref Seq Database). The build reaction center was having one β-carotene, one bicarbonate ion, five chlorophyll-a, two pheophytin-a, two plastoquinone and cofactors (iron and OEC) along with D1 and D2 protein. To check the integrity and stability of the build chimeric model, it was subjected for MD simulation, and we reported that the complete reaction center deviates by 1.7175 Å RMSD from its equivalent bacterial reaction center i.e., template; PDB ID 2AXT [13].

Virtual high-throughput screening
The stringent criteria were applied in vHTS to narrow down the number of molecules as shown in Fig. 4. vHTS for all retrieved 33,016 ligands (Table S2) have been performed at the ref_lig binding site of modeled D1 protein and 7250 ligands were selected based on rerank, H bond, and Mol-Dock score either more or equal to ref_lig molecule score  (Table S4). To enrich the screening, selected ligands were passed through three times flexible docking, and at least one molecule from each class and subclass of the C1, C2, and C3 groups was retained. Best screened ligands were selected based on the rerank score (best 5% ligands), H bond, and MolDock score which were having more or equal to ref_lig score.

Designing of new analogs at the binding site of D1 protein
The de novo drug design program Ligbuilder v2 was used to design the new analogs of selected ligands iteratively using an inbuilt library of organic fragments. Overall, 28 ligands i.e., 16, 7, and 5 ligands from class C1, C2, and C3 respectively, were set for the de novo synthesis (Table S3). Ligand detection mode of the CAVITY module was used to detect the binding site of the protein and the best binding site was decided based on ligandability (Fig. 5). These were compared with ref_lig binding sites in D1 protein.
The automatic extractor module was used which has extracted 71, 29, and 19 seeds from C1, C2, and C3 classes respectively (Table S3). Extracted seed molecules were initially used for growing to design new ligand molecules using BUILD module. BUILD module uses a pocket and grid file generated by the CAVITY module which acts as pharmacophore for de novo synthesis. Here, molecules were synthesized considering the Lipinski rule of five, chemscore, binding affinity, synthesizability, and toxicity parameter for all molecules as well as the molecules were

Fig. 4 Flow chart is showing complete methodology of the vHTS and selection of ligands in each step of this study
Retrieved dataset were taken for vHTS.
Ligand molecules were screened on the basis of rerank, H-bond and MolDock score either more or equal to ref_lig molecule.
Flexible screening was performed to screen above selected ligands on the basis of rerank (best 5% ligands), H-bond and MolDock score.
Above selected ligands were screened according to their subclasses to keep at least one molecule from each subclasses.
More stringent parameters were used to reduce the number of ligands according to classes of ligands i.e. C1, C2 and C3 class.
Ligand preparation and gridgeneration.

Clustering
The hierarchical clustering of both vHTS ligands (230) and de novo synthesized ligands (282) was performed to get 13 and 11 clusters respectively (Table S3). The vHTS and de novo clusters were named as existing ligand clusters (ELCs) and de novo synthesized ligand clusters (DLCs) respectively. Screened leads from individual clusters of ELC and DLC (Table S3) were selected for the docking study.

Docking
The ELC and DLC hits were docked at the center coordinates (X: 20.63; Y: 65.50; and Z: 35.64) of the binding site of modeled D1 protein using AutoDock4.2 as well as the Molegro tool. The criteria chosen to select the best pose of each hit were AutoDock term (binding energy, ligand efficiency, and inhibition constant) and Molegro term (MolDock score, H bond score, and rerank score). One hundred poses per ligand were generated and clustered with an RMSD tolerance limit of 1.5 Å. Binding poses of all hits of the ELC and DLC were selected from the most populated cluster and a summary of these hits are tabulated in Table S5 and  Table S6. Thirteen hits from ELC and 11 hits from DLC were selected from each cluster of the ELC and DLC respectively after employing the docking score filter. Three amino acids i.e., His177, Phe227, and Phe236, in binding sites were observed to be playing a crucial role in interaction with hits. Docking results have shown that ref_lig has two H bonds and one π-π interaction in the binding site of protein with ligand efficiency of − 0.25, and inhibition constant of 1.08 mM (Table S4 and Figure S2).
The ref_lig docking scores were set as criteria to select the best hits from each cluster, as a result, 13 and 11 hits were selected from ELC and DLC clusters. In addition to ref_lig-based criteria, two or more H bond and one π-π interaction with better binding energy, ligand efficiency, and inhibition constant were employed as selection criteria to select the molecule (Table S5-S6 and Figure S3-S6).

Molecular dynamics simulations
To find the binding affinity and binding mode of ligand, rigid docking followed by flexible docking was applied, but it does not properly sample the conformational space that could lead to a variety of different binding modes of the ligands [42]. The goals of these simulations were to observe the structural behavior of protein (flexibility of Fig. 5 Isoelectronic surface of herbicide binding site of D1 protein (pdb|3PRQ); cavity and key sites of interactions and receptor-based pharmacophore the D1 protein with ref_lig, ELC, and DLC leads) and ligands (the behavior of leads inside the binding site of D1 protein). MD simulations were performed for each protein-lead complex for 10 ns using Gromacs4.6.5 in an environment of a lipid bilayer embedded in a periodic box solvated with water, shown in Fig. 6. PS-II complex has 20 peptides and 95 co-factors (PDB ID 4V82). D1 protein complexes were with at least one β-carotene, one pheophytin, two α-chlorophylls, two plastoquinone, at least two Fe 2+ ions, and one bicarbonate. The PS-II reaction center protein and associated molecules of the complex were in a tight-knit providing stability to each other, and having interlinked function. In our previous attempt, a chimeric D1/D2 protein was constructed from its nearest homolog and subjected to MD simulation. The chimeric D1/D2 protein has shown RMSD of 1.7175 Å from its template protein [13]. Therefore, we have refrained from doing any long simulation with the D1 subunit in this study. RMSD curves of the protein backbone, energy profile, and H bond profile of each protein-lead complexes were calculated and compared with the ref_lig protein complex. Variation in the total energy plot of the ref_lig protein complex has shown that the overall energy tends to stabilize after 3 ns of simulation (Fig. 6B), which confers the average energy remains constant, thus suggesting structural stabilization. The total energy patterns were similar for all ELC and DLC lead protein complexes (data are not shown).
Throughout the simulation period, no significant fluctuations were observed in the backbone of protein with ref_lig. It implies that the binding of ref_lig at the binding site of protein is not only stable and strong but also does not perturb the protein backbone stability. The binding stability of the ELC leads (ELC2, ELC3, ELC4, ELC5, ELC6, ELC10, ELC11, and ELC12) and DLC leads (DLC4, DLC6, DLC7, and DLC8) in the binding site of the protein have shown comparable fluctuation with ref_lig protein complex throughout the simulation in the size of the noticeable window of 0.30-0.40 nm (Fig. 7). It ensures and suggests that these leads are well accommodated inside the binding site throughout the simulation. H bonds at the binding site of the ref_lig protein complex were determined which revealed that it encompassed four Fig. 6 A Snapshot of the simulation system after solvation. Water molecules are shown in red; phospholipids groups are shown in gray; protein is shown in orange and ligand is shown in blue. B Total energy variation for the ref_lig protein complex throughout the simulation Fig. 7 Time dependence root means square deviation (RMSD) of the backbone of protein throughout the simulation. A RMSD of protein with ELC leads and compared with RMSD of protein with ref_lig. B RMSD of protein with DLC leads is compared with an RMSD of protein with ref_lig H bonds throughout the simulation period. The amino acid residues involved in the key H-bond formation at the binding site were Ala225, Ser226, and Phe227. The succession of frames for ELC and DLC leads inside the binding site of the protein has shown that ELC leads ELC2, ELC3, ELC5, ELC7, ELC8, ELC11, and ELC13 and DLC leads DLC3, DLC4, DLC6, DLC8, and DLC11 comprise three to four H bonds throughout the simulation as shown in Fig. 8. The amino acid residues such as His177, Ala225, Ser226, Phe227, Asn229, and Ser232 were observed to be holding the lead molecules at the binding site via H bonds. These interacting patterns advocate that leads are also interacting in a similar fashion as ref_lig. Hence, MD results confirm that these leads are accommodating well into the binding site of the D1 protein.

Amino acid fluctuation analysis
The root mean square fluctuation (RMSF) analysis revealed the possible movements of amino acid residues Phe201, Phe222, Tyr224, and Phe227 (inside the binding pocket) and Arg26, Gln127, and Arg187 (out the binding site) in the ref_lig ligand-protein complex (Fig. 9). In the case of selected ELC and DLC lead complexes, RMSF appears to be almost similar to the ref_lig except for the other small fluctuations. This signified that protein-ELC and protein-DLC lead complexes are also stable like the ref_lig-protein complex. The amino acid residues inside the binding sites have shown slightly less fluctuation because of the formation of a π-π interaction between the leads and the aromatic ring of Tyr224 and Phe227. Moreover, residues of the binding site showed lower fluctuation, revealing to be a more stable protein-ligand system.

Analysis of hydrogen bond occupancy
The MD trajectories were analyzed for the H-bond distance and the H-bond occupancy. H-bond distance is determined by the bond length (< 3.5 Å) and the H-bond occupancy is defined as the time percent existence of a  specific H bond during the simulation. The H bonds predicted for the ref_lig, ELC, and DLC lead and amino acid residues of the binding site of protein were observed consistently for the major part of trajectory with high occupancy for all the proposed lead molecules. H-bond interactions were categorized as strong (> 60%), medium (40-60%), and weak (< 40%) based on H-bond occupancy with ref_lig and the binding site of the protein.
Four H bonds were found between ref_lig and protein i.e. O@Ala225…HAG@ref_lig, HG@Ser226…NAL@ ref_lig, O@Phe227…HAG@ref_lig, and H@Phe227… NAL@ref_lig with occupancy 63.0%, 59.9%, 55.2%, and 60.3%, respectively, and the distance of H bonds were fluctuating in windows of 1 Å to 3.5 Å. Occupancies of the H bonds between protein, ELC, and DLC lead molecules showed a diverse range of fluctuation which is comparable to ref_lig. ELC leads ELC2, ELC3, and ELC8 have shown four H bonds, where two H bonds are having occupancy > 70% and another two H bonds having occupancy 70 to 30%. ELC4 and ELC7 showed three stable H bonds with significant occupancy > 40%. DLC leads DLC3, DLC6, and DLC9 showed four H bond occupancy > 40%. DLC4 and DLC8 ligands also have shown two stable H bonds with significant occupancy. Other ELC and DLC lead molecules (except the abovedescribed ELC and DLC lead molecules) have not shown a significant stable H bond as compared to ref_lig. The occupancies of the H-bond analysis also showed that amino acid residues His177, Ala225, Ser226, Phe227, and Asn229 were involved in the H bonding with ref_lig, ELC, and DLC lead molecules. However, the residues Ser226 and Phe227 were obtained as key residues to hold the molecules at the binding site as these residues showed higher occupancy in most of the ELC and DLC lead molecules (Fig. 10).

Free energy calculations
Binding free energy of D1 protein complexed with ref_lig, ELC, and DLC molecules was determined by the MM/ PBSA method. In the case of ref_lig, van der Waals interactions (∆G vdw ), showed the most significant contributions among the van der Waals interactions (∆G vdw ), the electrostatic energy (∆G ele ), and the nonpolar solvation free energy (∆G nonpol , soil ). The results reveal that leads from ELC class (ELC1, ELC2, ELC4, ELC5, ELC8, ELC11, ELC12, and ELC13) and lead from the DLC class (DLC1, DLC3, DLC4, and DLC6) displayed higher binding free energy compared to ref_lig binding free energy (− 104.627 ± 10.792 kJ/Mol). The order of binding free energy of ELC's and DLC's is ELC4 > ELC 11 > ELC 8 > ELC 13 > ELC 5 > ELC 12 > ELC1 > ELC2 and DLC6 > DLC4 > DLC1 > DLC3 respectively. The binding energies of the ELC and DLC lead molecules are shown in Table 1, which reveals that ∆G vdw has the most significant contribution to the binding energy.
From Table 1, it can be seen that ELC1, ELC13, DLC1, and DLC3 showed better binding affinity than ref_lig. However, these lead molecules are discarded in the broad-spectrum purpose as the number of H bonds was decreased from the initial structure, and occupancy was lower than the other lead molecules. Therefore, the lead molecules ELC2, ELC4, ELC5, ELC8, ELC11, ELC12, DLC4, and DLC6 were selected. These selected ELC and DLC lead molecules and the other known herbicides for D1 protein of PS-II protein were undertaken to predict the physiochemical properties (molecular weight, logP, AlogP, number of hydrogen donors and acceptors, total polar surface area (TPSA), number of rotatable bonds, number of acidic groups, number of rigid bonds, number of aromatic rings, and number of hydrogen bonds) (Fig. 11, Table S7).
From the data presented in Fig. 11 and Table S7, it can be observed that the properties of all the proposed molecules are comparable to known PS-II herbicides and have no toxic functional groups. Therefore, proposed molecules (ELC2, ELC4, ELC5, ELC8, ELC11, ELC12, DLC4, and DLC6) are considered potential lead molecules for experimental testing and further authentication. The 2D structures of these selected molecules are shown in Fig. 12, and ZINC, PubChem ID, and IUPAC name are shown in Table S8. However, only four molecules (ELC8, ELC5, ELC11, and ELC12) were synthesized among the proposed eight molecules, as shown in Table 2. Other molecules were not synthesizable due to the high polarity of intermediates, thereby increasing the cost of synthesis which would not be viable as an herbicide.

Testing the bioactivity of synthesized molecules
The tabulated data of untreated control and treated control with isoproturon (ref, and treated with synthesized test molecules) ( Table 2) were compared to interpret the effect of test molecules against P. minor and wheat with respect to the solvent (2% DMSO) and distilled water. Non-treated seedlings have a healthy growth rate as compared to treated seedlings. Besides late germination in treated seeds, yellowing of leaflets of treated plantlets has been observed after 1 week of treatment. Progression in the yellowing of leaflets has been observed in the treated plantlets, reflecting the visible chlorosis in treated plantlets which could be an indirect reflection of hindrance in photosynthesis thereby loss in biomass accumulation. At the end of the experiment, biomass accumulation in plantlets has been tabulated. Biomass accumulation has been compared among the test molecules and isoproturon (ref), to determine the most efficient molecule among the four test molecules graph that has been plotted (Fig. 13, Table S9). From the graph in Fig. 13, it has been recorded that there is no effect of either control isoproturon or test molecules upon wheat plantlets. However, it has been observed that there is a selective effect of test molecules upon P. minor, with different activity range i.e., ELC5 > ELC12 = ELC8 > ELC11. The efficacy and selectivity of ELC5 (6-ethoxy-4-N-(2-morpholin-4-ylethyl)-2-N-propan-2-yl-1,3,5-triazine-2,4-diamine) is comparable as isoproturon. Thus, it has been further tested and evaluated to determine the GR 50 of ELC5 molecule.

Testing the efficacy and sensitivity of ELC5 molecule in laboratory-controlled condition
The ELC5 has been further tested to evaluate its activity at different concentrations (0-600 μM at a regular interval of 40 μM), to determine its GR 50 . The plot (Table S10) deciphers that there is no effect of either isoproturon or test molecule ELC5 upon wheat at any concentration, thus promising the selective activity of ELC5 against P. minor only. The tabulated data of biomass has been statistically analyzed using a descriptive statistics (confidence level: 95%) approach to verify the truthfulness of the results. Upon further evaluation of tabulated data (Table S10), it has been interpreted that upon the treatment of plantlets with ELC5, there is ~ 50% reduction in biomass of P. minor plantlets at the concentration of ~ 40 μM (Fig. 14). ELC5 (6-ethoxy-4-N-(2-morpholin-4-ylethyl)-2-N-propan-2-yl-1,3,5-triazine-2,4-diamine) molecule has shown   comparable activity to isoproturon a reference molecule against P. minor under laboratory-controlled conditions. It could be further taken for pot assay and field assay (pilotscale synthesis and testing for various parameters) to test its effectiveness, efficacy, selectivity, and sensitivity in a natural uncontrolled environment. The outcome of the assays could suggest whether the molecule is ready to use or need some chemical modification and optimization.

Conclusion
P. minor resistance to herbicides is one of the major causes of reduced wheat production, in the wheat cropping belts in India as well as in other countries. Due to the development of resistance to existing herbicides, it is of utmost importance to find a novel herbicide to restore wheat production. In silico studies i.e., molecular modeling, virtual screening, de novo synthesis, and docking as well as molecular dynamics simulation and MM/PBSA calculation, were performed to find promising molecules with a high binding affinity than ref_lig (terbutryn; complexed with D1 protein PDB:3PRQ) at the Q B site of D1 protein. The screened small molecules from large datasets and de novo synthesized molecules were taken for docking studies, molecular dynamics simulation, and MM/PBSA calculation. From the result obtained, ELC2, ELC4, ELC5, ELC8, ELC11, ELC12, DLC4, and DLC6 have shown better features in all aspects in comparison to ref_lig. These molecules were selected based on protein backbone stability, the number of H bond and their occupancy, non-covalent interactions, and binding free energy. These theoretical results suggested that the selected molecules accommodate well inside Fig. 13 The effect on the biomass of plantlets upon treatment with new molecules on A wheat and B P. minor plantlet under controlled laboratory conditions the binding site throughout the MD simulation, which reveals the stability of the system, later confirmed by the RMSD analysis. The strength of binding of selected molecules has been evaluated by H-bond occupancy, and binding energy, and then physiochemical profiles were also checked which has shown comparable to the other PS-II herbicides. Finally, four prioritized molecules were synthesized and tested in whole plant assay in laboratory-controlled conditions. The whole plant assay showed that ELC5 (6-ethoxy-4-N-(2-morpholin-4-ylethyl)-2-Npropan-2-yl-1,3,5-triazine-2,4-diamine) is equally effective as isoproturon (GR 50 at 40uM) and does not have an adverse effect over wheat plantlets. The newly discovered lead molecule mentioned formerly is proposed for pilotscale testing in pot and crop fields.