Identification of the Potential Hits for Hindering Interaction of SARS-CoV-2 Main Protease (M pro ) from the Pool of Antiviral Phytochemicals utilizing Molecular Docking and Molecular Dynamics (MD) Simulations

The novel SARS-CoV-2 is an etiological factor that triggers Coronavirus disease in 2019 (COVID-19) and tends to be an imminent occurrence of a pandemic. Out of all recognized solved complexes linked to SARS-CoV, Main protease (M pro ) is considered a desirable antiviral phytochemical that play a crucial role in virus assembly and possibly non-interactive capacity to adhere to any viral host protein. In this research, SARS-CoV-2 M Pro was chosen as a focus for the detection of possible inhibitors using a variety of different analytical methods such as molecular docking, ADMET analysis, dynamic simulations and binding free energy measurements. Virtual screening of known natural compounds recognized Withanoside V, Withanoside VI, Racemoside B, Racemoside A and Shatavarin IX as future inhibitors of SARS-CoV-2 M Pro with stronger energy binding. Also, simulations of molecular dynamics for a 100 ns time scale showed that much of the main SARS-CoV-2 M Pro interactions had been maintained in the simulation routes. Binding free energy calculations using the MM/PBSA method ranked the top five possible natural compounds that can act as effective SARS-CoV-2 M Pro inhibitors.


Introduction
Coronavirus, identified as SARS-CoV-2 and COVID-19 (Coronavirus disease), was declared a pandemic by the WHO (World Health Organization) in March 2020 [1]. Severe Acute Respiratory Syndrome (SARS) was first identified in 2002 while Middle East Respiratory Syndrome (MERS) in 2012 [2] [3]. It induces respiratory and gastronomic disease with symptoms such as influenza, fever, and shortness of breath in severe conditions [4]. The first diagnostic case of SARS-CoV-2 was identified on 17 November 2019 in Wuhan, China, as an extreme pneumonia cluster since then, the disease has spread globally with a 50 percent rise in the incidence of infection and a lower diagnosis rate of less than 9.2 percent [5] [6]. Till now 1,04,956,439 (dated 6 February 2021, https://www.who.int/data) confirmed cases in 213 COVID-19 countries, no apparent licensed or scientifically validated medication has been produced for COVID-19, and it has become critical to find an appropriate treatment for COVID-19 [7].
Twelve Coronavirus proteins have been identified so far as to be a possible drug target commonly categorised as: structural proteins including Spike protein (S protein), S2 protein, Envelop small membrane protein (E protein), Membrane protein (M protein) and Nucleocapsid protein (N protein) and other non-structural proteins including Main protease (M pro ), Papain-like protease, Non-structural protein 13 (nsp13, helicase), Non-structural protein 14 (nsp14, N-terminal exoribonuclease and C-terminal guanine-N7 methyltransferase), Non-structural protein 12 (nsp12, RNA-dependent RNA polymerase, RdRp), Non-structural protein 15 (nsp15, Uridylate-specific endoribonuclease), Non-structural protein 16 (nsp16, 20-O-methyltransferase) and Non-structural protein 10 [13]. Structural proteins are mainly present on the membrane and Non-structural proteins are responsible for the replication and packaging of the virus nuclear material into the capsid protein coat [14] [15]. Blocking any of these proteins can lead to arresting viral growth.
There are two main strategies to control CoVs in this pandemic situation, mainly the discovery of (i) inhibitor to prevent the entry of virus in the host cell, and (ii) bioactive compounds that restrict viral replication and transcription. pp1a and pp1ab proteins are required for viral replication and translation, active pp1a and pp1ab are released after proteolytic processing by M pro highly conserved protein. The stereotypical role of M pro and no similarity with any human protein makes M pro an attractive antiviral drug target [16] [17] [18] [19] [20].  [21].
Plants are the source of numerous phytochemicals such as alkaloids, flavonoids, phenols, chalcones, coumarins, lignans, polyketides, alkanes, alkenes, alkynes, basic aromatics, peptides, terpenes and steroids used in antiviral, antifungal and antibacterial medicines [22]. With that focus, this study was done to identify a potential phytochemical inhibitor of M pro . 110 phytochemicals [23] [24] were screened based on their binding energy. Further in silico analysis was done through molecular docking, Molecular dynamic, ADMET (Absorption-Distribution-Metabolism-Excretion-Toxicity).

Virtual screening of SARS-CoV-2 M pro target
Virtual screening was conducted to identify the best natural compounds with the ability to inhibit SARS-CoV-2 M pro . Re-docking of the co-crystal ligand, N3 was performed to gauze the exact match into the catalytic site (Cys-His catalytic dyad) of the SARS-CoV-2 M pro and it was confirmed using the RMSD between bound and docked conformations. SARS-CoV-2 M pro contains three domains in which 8 to 101 residues are being found from residues 8 to 101 while antiparallel β-pleated sheets forms through 102 to 184 residues of domain II.
Domain residues are generated from residues 201 to 303 and have five a-helices grouped into a mostly antiparallel globular cluster, which is linked to Domain II comprises of a long loop structural complexity from residues 185 to 200. Domain I and II contains the binding site of SARS-CoV-2 M pro . The original position and orientation of N3 in I the co-crystallized structure of the structure with PDB ID 6LU7 and (ii) the docking projections were shown in Fig-1. The best dock pose (binding energy = 7.55 kcal/mol) with an RMSD of 1.70 Å was chosen with the potency of the docking technique to establish native poses. The cavity described in Fig-1 is the main site to be aimed at 3CL pro to inhibit its activity. N3 is a comparatively large molecule in which its backbone generates an antiparallel layer containing residues 164 to 168 (His164, Glu166, Met165, Leu167, Pro168) of SARS-CoV-2 M pro when interacting with residues 189 to 191 (Gln189, Thr190, Ala191) (Fig-2).
The co-crystallized SARS-CoV-2 M pro protein with N3 also indicates a tight covalent bond developed by Cys145 of pro with N3 of 1.8 Å length, which, in turn, makes the interaction with this inhibitor covalently linked to the protein. The re-docked pose had captured 7 H-bonds (with Gly143, Phe140, His163, His164, Glu166 and Thr190), 1 amide pi-stacked (Leu141), 5 pi-alkyl (His41, Met49, Leu167, Pro168 and Ala191), 2 carbon-hydrogen bonds (Met165 and His172) and 1 van der Walls (with Asn142) crystal contacts with SARS-CoV-2 M pro . This re-docking confirmation with appropriate dock pose-ability of N3 and its binding coordinates were encouraged the activity of docking 110 lead-like natural compounds within the binding site of the SARS-CoV-2 M pro target. This protein contains the four sites S1´, S1, S2, and S4 in its active site region. The inhibitory effect of the SARS-CoV-2 M pro targeted protein was prompted by the thiol of a cysteine residue in the S1´ site. The association of Cys145 with the ligand is therefore essential to inhibit the activity of this protease.
The plant phytochemicals were obtained with higher binding affinity as well as key receptor contacts better than co-crystal ligand (N3). The threshold for interaction affinity was estimated at 8 kcal/mol to identify Phytochemicals better than the co-crystal (7.55 kcal/mol) ligand. Among were found in all the top five compounds (Fig-S1 to Fig-S4). Withanoside V was found as the preeminent inhibitor for SARS-CoV-2 M pro (Fig-4).

In silico ADME/T prediction of top-five compounds
The drug likelihood of the top-5 compounds was assessed in terms of physicochemically significant descriptors and key pharmacokinetic properties via the Schrodinger QikProp software.
N3 was subjected to QikProp ADME/T profiling and a few of the ADME/T parameters are shown in Table-

Molecular dynamics simulation
The best-docked complexes were derived from the molecular docking study to investigate the residue regions as some of the amino acid residues found in this window were pocket residues that promote ligand binding (Fig-S5 to Fig-S10). The secondary structural elements contributing to pocket residues were intact in the β-sheets and loop regions (Fig-S11 to Fig-S16). A thorough examination of fluctuating residues the RMSD plots in this window are enriched with loop components that showed up peaks about ~ 3 Å in its plots. The ligand RMSD plot revealed the fluctuations for all compounds were ranged between 0.5 Å to 4 Å (Fig-6a). The compactness of the gyration radius calculated showed similar notions for all compounds excluding Withanoside V and Shatavarin IX (Fig-6b). Intramolecular hydrogen bonding was detected in the compounds Withanoside VI, Racemoside A and Shatavarin IX (Fig-6c) which accounted for stronger persisted contacts in their distinct simulation pathways. The molecule surface area (MSA) of Withanoside V was around 650 Å 2 in most of the frames in the trajectory (Fig. 6d). The solvent-accessible surface area (SASA) and polar surface area (PSA) for Racemoside B and Shatavarin IX were more than 600 Å 2 in most of the intervals (Fig-6e, f). Both compounds were best in the SARS-CoV-2 M pro target, while co-crystal ligand did not establish strong contacts, and its PSA did not lead to strong hydrogen bonding and is always introduced to solvents in most frames.

Preservation of intermolecular contacts in molecular dynamics simulations
The crystal structure of SARS-CoV-2 M pro with N3 indicated that its ligand-binding site consists and Met 49, respectively. One polar and negatively charged contact were developed which retained 37% of docked interactions (Fig-8e). Shatavarin IX had produced seven hydrogen bonds, one alkyl and one unfavorable donor-donor interaction during its docking profile which was not reserved in the MD simulation event.

Data set Preparation
A library of 110 natural compounds having antiviral properties were retrieved and compiled having demonstrated in vitro anticancer activities from the plants Withania somnifera, Asparagus racemosus, Zinziber officinalis, Allium sativum, Curcuma longa, Adhatoda vasica. All the ligands were prepared for docking by adding charges and hydrogen. These structures were prepared energy minimized and aligned for molecular docking experiment through the Amber03 force field algorithm of YASARA Structure (academic license). Also, crystallographic waters were removed, polar hydrogens and charges to titratable amino acids were also applied which followed by atom typing using Amber03 force field, and geometry optimization using the steepest gradient approach  [25]. YASARA Structure (academic license) was employed for the docking experiment which comprised of water removal, the addition of polar hydrogens, steepest descent method was also used for the energy minimization [12].

Virtual library screening upon SARS-CoV-2 M pro target
The library of 110 natural compounds was screened against SARS-CoV-2 M Pro using AutoDock Vina molecular dock pose search and AMBER03 force field for the scoring algorithm of YASARA software. The re-docking of the N3 co-crystal ligand was performed to validate the scoring mechanism. Whereas, the dock site/grid box size was kept as a 20×20×20 Å with root mean square deviation (RMSD) evaluation. The selected library compounds were sorted because of the highest docking score to pick the top-scoring best compounds for further analysis due to the scoring function of YASARA software [26] which employs high positive as a better affinity of the ligand to the receptor [27]. The binding energy (ΔGbind) was calculated using the following empirical equation:

In silico physico-chemical and ADME/T studies
Physico-chemical and ADME/T properties, that predict both physico-chemical signal descriptors and pharmacokinetically significant molecular properties, have been determined using the Qikprop module of Schrodinger suite 2020-4 [28]. QikProp identifies the comparative ranges of the data features of a molecule with that of recognized medicines. In silico physico-chemical and ADME/T studies of the top 5 docked compounds were predicted to understand its pharmacokinetic nature [29].

Molecular dynamics simulations of the top-scoring molecule with SARS-CoV-2 M pro target
Physical motions of atoms and molecules of protein-ligand docked complexes have been described all-atom force field with 1000 iterations of the steepest descent technique for system minimization.
After equilibrium, an unrestrained production phase took place in the NPT assembly (atoms, pressure and temperature were kept constant) for 100 ns at 300 K and 1.01325 bar. Nosé-Hoover thermostat (relaxation time = 1 ps) and isotropic Martyna-Tobias-Klein barostat (relaxation time = 2 ps) were used. Short-range interactions (cut-off = 9 Å) and long-range Columb interaction were assessed using the smooth particle mesh Ewald (PME) method (PME) with a smooth particle network with the RESPA integrator. Confirmed seizures were recovered at the rate of 5 ps. After the simulation was completed, the stability of the system was evaluated by histogram for RMSD, root mean square fluctuations (RMSF), Hydrogen bond analysis, radius of gyration (Rg), torsional Bonds [30][31] [32].

Binding free energy calculations of top-scoring molecules with SARS-CoV-2 M pro target
The single trajectory method was used to determine Free energy binding using molecular mechanics/Poisson-Boltzmann Surface Area (MM/PBSA) with YASARA Structure. AMBER14 APBS (Adaptive Poisson-Boltzmann Solver) force field was employed to measure the solvent energy and to handle electrostatics [12] [33]. The 100 ns long simulation trajectory of the top five MD natural compounds and the co-crystal ligand was supplied as data.

Conclusion
The lack of an appropriate medicinal drug or vaccine has already exacerbated the condition of a can be used to scan and refine natural molecules that need experimental confirmation by in vitro and in vivo studies.

CONFLICTS OF INTEREST
The authors declare that there are no conflicts of interest.