Molecular dynamics and docking studies on potentially active natural phytochemicals for targeting SARS-CoV-2 main protease

Abstract In the present study, we screened eighty seven novel phytochemical compounds from four popular herbs, such as, Aegle Marmelos, Coleus Amboinicus, Aerva Lanata and Biophytum Sensitivum and identified the best three for targeting the main protease (Mpro) receptor of SARS-CoV-2. After categorizing all the phytochemicals based upon LibDock scores and hydrogen bonding interactions, the top ranked 26 compounds were further subjected for detailed Molecular dynamics (MD) study. From these screening we identified that Aegelinosides B leads the list with a high LibDock value of 142.50 (binding energy: −8.54 kcal/mol), which is better than several popular reference compounds namely, Tipranavir (LibDock score, 141.50), Saquinavir (125.34), Zopicole (122.9), Pirenepine (122.70), (115.37), Metixene (109.18), Oxiconazole Pimozide (138.00) and Rimonabant (91.88). Detailed analysis for structural stability (RMSD), Cα fluctuations (RMSF), intermolecular hydrogen bond interactions, effect of solvent accessibility (SASA) and compactness (Rg) factors were performed for the best six compounds and it is found that they are very stable and exhibit folding behavior. Apart from the docking and MD tests, through further drug-likeness and toxicity tests, three compounds, such as, Aegelinosides B, Epicatechin, and Feruloyltyramine (LibDock scores, respectively, 142.50, 124.33 and 129.06) can be suggested for fighting SARS-CoV-2. Communicated by Ramaswamy H. Sarma


Introduction
Controlling and defeating the deadly pandemic, COVID 19, is the most challenging issue the world is facing today. The work of Yan et al. (Yan et al., 2020) provided the early understanding that "human angiotensin converting enzyme 2" (hACE2) is a potential receptor for SARS-CoV-2. The SARS-CoV-2 pathogen, once reached inside the lungs, easily gets attached with hACE2 and undergoes in-situ integration rapidly (V'kovski et al., 2021). Reports reveal that SARS-CoV-2 can mutate its own protein structure upon human to human transmission (Gupta, 2021); as for example, P1 (B.1.1.28), SARS-CoV-2 VOC 202012/01 (B.1.1.7), 501Y.V2 (B.1.351) and B.1.617.2 are the four recent mutated fast spreading variants (Mascola et al., 2021). The world is striving hard for ensuring an absolutely effective and affordable vaccine is available at the earliest but the issues related with large scale production, affordability and global accessibility remain elusive (Kyriakidis et al., 2021;Xie et al., 2021). As far as the existing vaccines are concerned, still considerable skepticism exists regarding the improved immunity for humans without any risks of having complications after administration (Moore & Offit, 2021).
It is widely accepted that for fighting a new disease, adopting a "repurpose" strategy through commercially available drugs (e.g., antimalarial drugs, anti HIV protease inhibitors or interferon beta-1b, etc) can reduce the time and manpower (Mohamed et al., 2021). With the help of computers one can screen virtually over millions of small molecules and also model the drug target known as "homology modelling" strategy. Several SARS-CoV-2 main protease (M pro ), PL pro , RdRp, nucleocapsid, helicase and spike protein have been reported experimentally which triggered screening of available drugs or small molecules using different molecular docking approaches (Mouffouk et al., 2021). Especially, M pro plays a dominant role in the maturation of functional polypeptides involved in the viral replication and transcription, hence making it an attractive drug target for the development of COVID19 therapeutics (Lockbaum et al., 2021). Moreover, the crystal structure of M pro contains several substrate binding sites, making it an opportunity for testing wide varieties of inhibitors. Lu studied the anticorona activities of four anti HIV protease inhibitor drugs namely, nelfinavir, pitavastatin, perampanel, and praziquantel and found that nelfinavir is the best for fighting SARS-CoV-2 (Lu, 2020). In another study, Wang et al. reported that the FDA approved drugs such as chloroquine and remdesivir can be effectively used for fighting COVID-19 (Wang et al., 2020). However, the preliminary studies have not yet been approved as a consolidated treatment strategy for the COVID-19 infected patients since many drugs have low efficiency level in terms of cure reports. Another important concern for many drugs is severe side effects after consumption, hence a best alternative option is to scrutinize the safe natural products for the treatment of COVID19. In view of this, phytochemicals receive considerable attention, as they are a rich source of several flavonoids, alkaloids, terpenoids, fibers, limonoids, glucosinolates, peptides, polyphenols etc. Good numbers of reports are available for the docking of small molecules extracted from Pubchem, ChEBI, ZINC, drug bank, CEMBL, chemspider, etc. Hasanain et al. studied the molecular dynamicity of the real time movement of M pro in the presence of conivaptan and azelastine ligands and found the best binding pockets of M pro (Odhar et al., 2020). Large numbers of experiments and in-silico molecular docking studies were reported recently (Dotolo et al., 2021;Rismanbaf, 2020;Tarighi et al., 2021). Goswami et al. employed several FDA approved drugs for studying the anti corona activities and suggested that triterpene saikosaponin and simeprevir are potential inhibitors (Bouchentouf & Missoum, 2020;Goswami & Bagchi, 2020). Based on molecular docking study, nigellidine and a-hederin compounds containing herbs were recently suggested as inhibitors for SARS-CoV-2 (Bouchentouf & Missoum, 2020).
Many traditional herbs with Indian origin have been proven effective for cough, asthma and bronchitis (Warrier, 1993). Inspired by this, for the current study, we selected four different herbal species namely, Aegle marmelos, Coleus Amboinicus, Aerva Lanata and Biophytum Sensitivum. These herbs are being used as a first aid home remedy and especially it is abundantly available throughout Kerala state of India. The phytochemicals present in these herbs are well documented using various spectroscopic methods and reported in the literature (Ajithkumar & Seeni, 1998;Arumugam et al., 2020;Mariswamy et al., 2011;Sarkar et al., 2020). Apart from that, these medicinal herbs are clinically proven for various pharmacological activities and now emerged as an unavoidable part for treating various life threatening infections. For instance, Aegle marmelos is proven for its anticancer, antipyretic, radioprotective and anti-inflammatory properties (Baliga et al., 2011). Aromatic perennial herb Coleus Amboinicus is used for malarial fever, colic, epilepsy and helminthiasis, etc (Singh et al., 2002). Aerva Lanata is known for its anthelmintic, antibacterial, antiinflammatory effects (Chowdhury et al., 2002). Biophytum Sensitivum exhibits excellent antioxidant, anti-inflammatory, anti cancer, anti diabetic and antioxidant activities (Khare, 2008;Sakthivel & Guruvayoorappan, 2012).
The main objective of the present work is to explore the best phytochemical constituents from four medicinal herbs for neutralizing M pro and to understand the mechanism of its interaction. We identified 87 phytochemicals from these herbs and then screened them by performing docking study. For the best ranked phytochemicals, molecular dynamics simulation studies were performed. Subsequently the ADMET properties and drug likeness were analyzed and from these important insights were explored for drug design.

Preparation of protein
The structure of SARS-CoV-2 M pro was used as a receptor for the present study. The corresponding PDB ID: 6LU7 is downloaded from Protein Data Bank. This protein structure is the first crystal of M pro of SARS-CoV-2. The structure does not contain any missing residue as well as repetitions in the amino acid residue. The experimental data shows that this protein has got good resolution of 2.16 Å and R value of 0.235 using Xray diffraction method. The 3 D crystal structure of M pro consists of three domains (Domain I, II and III). with 8-101, 102-184 and 201-306 amino acid residues, respectively. Domain I and II consist of b-barrels while domain III mainly consists of a-helices. The active binding site is located at the cleft of domain I and II. The inhibitor N3 and water molecules were deleted from it. Visualization and docking were performed on licensed discovery studio software (DISCOVERY, 2018). All pdb files were first opened by the discovery studio and then added polar hydrogen. The protein is minimized using CHARMm forcefield and smart minimizer algorithm with a maximum step of 2000. All these options were set to be default in discovery studio. Chimera software was used for visualization.

Preparation of ligand
Aegle Marmelos (in the local language of Kerala (Malayalam), it is called Koovalm) consists of 20 phytochemicals, whereas Coleus Amboinicus (Navara) contains 24 chemical constituents. Aerva Lanata (in malayalam it is known as Cherrula) and Biophytum Sensitivum (also known as little tree plant, in malayalam it is called as Mukkutti) consists of 17 and 26 phytochemicals, respectively. Chemical identity information of all these compounds were taken from the previously published literature (Warrier, 1993) and the corresponding files were downloaded from ChEBI (Hastings et al., 2016) and Pubchem (Kim et al., 2021) (the obtained files (totally 87) were in .sdf format, which needs to be converted to .dsf format). The phytochemicals were prepared by adding polar hydrogen and then minimized using input CHARMm forcefield with smart minimizer algorithm (maximum step: 2000. The RMS gradient was 0.01). The partial charge estimation is done by Momany-Rone method.

Druglikeness and ADMET analysis
The druglikeness of screened drugs was carried out by Lipinski rule of five and Veber rule as implemented in the Discovery studio. All compounds need to show molecular weight <500 g/mol, number of hydrogen donors < 5, number of hydrogen acceptors <10 and AlogP98 < 5 and no more than one violation of above Lipinski rule criteria is permitted (Jia et al., 2020) Molecular weight of compounds greater than 500 g/mol were avoided before analysis. Rotatable bond <10 and polar surface area (PSA) <140 and total hydrogen bond donors and acceptors <12 are criteria for the Veber rule (Veber et al., 2002). 87 candidates satisfied Lipinski rule criteria out of the 98 phytochemicals taken from the literature. Absorption, distribution, metabolism and excretion and toxicity (ADME/T) are the five pharmacokinetic properties of all drugs which must be checked out. Aqueous solubility, blood brain barrier penetration (BBB), cytochrome P450 inhibition (CYP250), hepatotoxicity, human intestinal absorption (ADMET absorption level) and plasma protein binding (PPB) are the ADMET properties calculated using the QSAR. Calculations were done under the calculated molecular properties protocol and ADMET properties tool. The ADME/T properties are given in the supporting information.

Molecular docking
Binding site of M pro was first determined using the option tool "define and edit binding site" under "ligand receptor interaction" tool panel of discovery studio. The active site was chosen based on the location of co-crystallized peptide inhibitor N3. The spherical transparent binding sphere was created around the centroid of the selected inhibitor and then taken for the binding site attribute. After preparation and minimization, binding site is defined for the docking of protein with ligands. In the process of molecular docking, the predicted active binding site radius is set to be 10 Å and centered at 12.355, 14.273, 71.259, XYZ coordinates for M pro ( Figure SI 1). The aligning ligand conformations to polar and nonpolar receptor interaction site, also known as hotspot was done by high throughput virtual screening (HTVS) using LibDock protocol. Various parameters such as max conformation hit, final cluster radius, final score cut off and maximum BFGS were set to be default under high quality modes. LibDock protocol allows generating several modes of ligand conformations for docking. Number of Hotspot and docking tolerance are 100 and 0.25, respectively. LibDock score can be taken directly from the output files which are saved in .dsf format. Highest value of docking score and maximum number of hydrogen bonds are the two criteria for the selection of LibDock score.

Binding energy calculation
All phytochemicals from each herb were converted into mol2 format. Protein must be in pdb format to be compatible with DOCKTHOR online server for binding energy calculations (Guedes et al., 2021). DOCKTHOR utilizes the multiple solution steady state genetic algorithms used for the search method. Here, we fix the same XYZ coordinate which is defined for the protein. The default grid box is 20 Å Â 20 Å Â 20 Å. Number of maximum evaluations is 500000 and number of runs is set to be 12. Soft docking is performed for the protein flexibility using MMFF94S Buf-14-7 potential. The output files obtained consist of topranking ligands with binding energy (kcal/mol).

Molecular dynamics simulations
Three different forcefield parameters such as CHARMM27, GROMOS96 43al and GROMOS96 54a7 available in the web server were used for M pro . Based on the RMSD versus 100 ns simulation time plot, we proceed with GROMOS96 43al forcefield. Other parameters such as water model and box are TIP4P and Triclinic, respectively. The molecular dynamics (MD) simulations of top docked protein-ligand complexes were performed using WebGRo for macromolecular simulations. The server uses Gromacs protocol for carrying out many parameters such as RMSD, RMSF, ligand RMSD, Rg, SASA (total) total and M pro -ligand hydrogen bond plots, total volume and density (Oostenbrink et al., 2004). The force file was GROMOS96 43al applied for all six top docked complexes. The ligand topology was created by the PRODRG tool. TIP4P was selected as a solvent model with triclinic box. The system was neutralized by adding sufficient sodium and chlorine ions (0.15 M salt) based on the total charges. The steepest descent algorithm with 5000 steps was applied for energy minimization. For equilibration and MD run parameters were set at 300 K and pressure, 1 bar. Modified Berendsen thermostat named V-rescale was used for temperature coupling. The MD integrator was set to be of leaf frog method for updating position and velocities. MD simulation was performed for 100 ns with approximate number of frames, 50000 per simulation. Three replica simulations per system were performed in order to ensure the reproducibility. First set calculations were discussed in the text and remaining plots were shown in supplementary information [SI]. Further, 50 ns MD simulations for next best 20 docked systems have been performed.

Results and discussion
A brief summary of four herbs (Aegle Marmelos, Coleus Amboinicus, Aerva Lanata and Biophytum Sensitivum) referred for the present study and their biological/therapeutic value are given in Table 1. Initially by using molecular docking study we scrutinized 87 phytochemicals obtained from these herbs for finding the most effective among them. Aegelinoside B Feruloyltyramine and Feruloyltyramine show binding energy (BE) greater than 8 kcal/mol (Table 2). It is noted that Discovery studio software and DOCKTHOR server are consistent in their performance in terms of LibDock score and BE even though exceptions can be seen. Different algorithm used for the two methods could be the reason for the variation. Nonetheless, in general, highest LibDock values of chemical constituents are always seen with good BE obtained in DOCKTHOR. The same procedure has been used for finding binding energy for the reference ligands. Binding energies and LibDock score of these compounds can be seen in the   et al., 2020) reported a binding energy value of À7.2 kcal/ mol for Imperatorin-M pro complex (also known as Marmelosin) obtained from autodock which is lower than that obtained from the current study (-8.39 kcal/mol). Aegelinoside B, an alpha glucosidase inhibitor, is the phytochemical that can be extracted from the leaves of Aegle marmelos. The 3 D interaction diagram shown in Figure 1A indicates the hydrogen bond donors and hydrogen bond acceptors around Aegelinoside B. We have observed that Aegelinoside B interacts with GLU166, GIN192, THR190, ARG188 and GLN189 through conventional hydrogen bonds. The hydrogen of GLN192 interacts with two O3 and O5 of the ligand, while oxygen of THR 190 residue interacts with H35 and H34 of ligands. Altogether seven hydrogen bonds can be seen in the 2 D figure ( Figure 1B). Numerous carbon-hydrogen bonds and two psulphur interactions, one p-p stacked and one p alkyl interactions can be observed. The hydrogen bond interactions of native ligands (Inhibitor N3) in the crystal structure show GLY143, GLU166, HIS164, PHE140, GIN 189 and THR 190. However, the docking studies on the native ligand with M pro reveals that hydrogen bond interactions involving the residues HIS41, HIS163, HIS172, GIN189, THR190 and GLU166. The 2 D plot can be seen in Figure   hydrogen bonds with ASN142 (here, two hydrogen bonds), THR190 and GLU166. panion interactions were found between GLU166 and the aromatic part of the ligand and other palkyl interactions can be seen with CYS145 and HIS41. The third best, Marmin, exhibits 4 hydrogen bonds (Figure 3). The hydrogen bonded interacting residues in this case are, TYR54, GLU166, GLY143 and SER144. GLY143 is the common active residue seen in Epoxyaurapten and Aegelinoside B whereas GLU166 active interacting residue can be seen in all three complexes and the native ligand. For further understanding, molecular dynamics (MD) studies were performed for Aegelinoside B, Epoxyaurapten and Marmin, and the results are highlighted in section 3.6.

Docking of the phytochemical constituents from Aerva lanata
Aerva Lanata contains many alkaloids and flavonoids and hence is quite popular for various biological activities. It is used for antiurolithiatic, astringent, diuretic, antimicrobial, anti-inflammatory and hepatoprotective drugs. One of the major chemical constituents of Aerva Lanata is Ervoside, which is a biologically active canthin-6-one alkaloid (Goyal et al., 2011). Our study reveals that, out of 17 phytochemicals obtained from Aerva Lanata, Ervoside docks better with M pro target, with a LibDock score of 129.69 kcal/mol (Figure 4). The TYR54, HIS172, CYS145, SER144 and MET165 residues interact with Ervoside through hydrogen bonds. A sulphur p-interaction can be seen for MET49 residue. The second best Libdock scored (123.22) phytochemical is Feruloyltyramine. This compound contains three hydrogen bonds with the protein (Figure 5). The active residues of M pro are ASP776 and THR556. It is interesting to note that the binding site is slightly changed and due to this reason common active residues have not involved in the interaction. Methergine, Ervolanine, Kaempferol, Quercetin and 4-Methoxykaempferol exhibit LibDock score in the range of 110 to 130 kcal/mol. Interestingly, Quercetin has been studied for protective effects from COVID19 induced acute kidney injury (Gu et al., 2021). In our study, Quercetin shows five Hydrogen bonds with GLY143, SER144 ARG and MET165 residues while GLY143 and SER144 residues make three hydrogen bonds with Kaempferol. Even though higher number of hydrogen bonds seen in Qucercetin, it appears that the LibDock score is close to the score of Feruloyltyramin. In this category, we have selected best two phytochemicals based on the LibDock score. Hiremath et al. reported in-silico docking analysis of Kaempferol and Quercetin obtained from the leaves of phyllanthus amarus. The binding affinity of these compounds with M pro in their study was À7.70 kcal/ mol and À7.50 kcal/mol, respectively (Hiremath et al., 2021). DOCKTHOR provides closely similar binding energies of À8.10 kcal/mol for Kaempferol and À7.97 kcal/mol for Quercetin. However, Kaempferol with spike protein indicates a significant improvement in binding affinity (-9.6 kcal/mol) (Hiremath et al., 2021). Proceeding with the docking studies, MD studies were also performed by us for Ervoside and Feruloyltyramine, and the results are highlighted in section 3.6.

Docking of the phytochemical constituents from Biophytum sensitivum
Maximum number of phytochemicals is seen for Biophytum Sensitivum herbs, however most of them exhibit binding energy values less than À7 kcal/mol. Attractive among them is Epicatechin which exhibits a LibDock score of 124.33. Reports indicate that Epicatechin mediates reverse transcriptase inhibiting activity for HIV (Chang et al., 1994;Gour et al., 2021). The green tea also contains Epicatechin and its derivatives such as, epigallocatechin gallate epicatechin gallate and gallocatechin-3-gallate. These compounds and their interactions with M pro were studied previously by Ghosh et al. (Ghosh et al., 2021). The binding energy of Epicatechin was found to be À7.20 kcal/mol which is similar to our estimation (-7.69 kcal/mol). The hydrogen bond interactions occurred between HIS164, HIS163, SER144, PHE140 and ASN142 residues with the ligand. Numerous van der Waals and two p alkyl interactions can also be seen in Figure 6. The other phytochemical is Stigmast-4-en-3-one, which exhibits a LibDock score of 118.74. It has only two hydrogen bonds between SER144 and HIS163 residues with oxygen ligands.

Docking of the phytochemical constituents from Coleus amboinicus
Coleus amboinicus is also known as Indian borage and is used for respiratory diseases like sore throat, cold, bronchitis, asthma, etc (Arumugam et al., 2016). Recent studies indicate its antiproliferative effect against cancer cell lines (Bhatt et al., 2013). Maste and Saxena studied the possibility of multi target response of the chemical constituents of Coleus Amboinicus (Maste & Saxena, 2020). They considered only four ligands and all these exhibits BE less than À8.00 kcal/ mol which agrees well with our observations. Here, we considered a maximum of 24 chemical constituents. None of the phytochemicals of the Coleus Amboinicus plant except N-benzoyl-L-phenyl alaninol show > 100 LibDock score and > À8 kcal BE. Among all the phytochemical constituents from this set, only N-benzoyl-L-phenyl alaninol shows a highest binding energy of À8.22 kcal/mol. However, as compared to chemical constituents of the other three herbs, our observation is that the docking values are not high enough. Only N-benzoyl-L-phenylalaninol exhibits over 100 LibDock score. Conventional hydrogen bonding occurs between residue GLN189 and two OH and NH bonds from the ligands and residue GLU 166 with O of N-benzoyl-L-phenylalaninol. None of the constituents are attractive candidates for the further studies since most of them show less than LibDock score of 100 (except two candidates showing LibDock score of 111 and 104).

Drug likeness and ADMET screening
Assessment of Lipinski rule and pharmacokinetics are important for optimizing drugs (Hastings et al., 2016). Owing to toxicity, many drugs fail at clinical trial stage, hence it is necessary to find lead compounds with good pharmacokinetics properties (Hashida, 2020). Lipinski rule, also known as rule of thumb, is an algorithm based model which predicts the drug likeness of small molecules. In this study, 87 constituents did not violate more than one criterion. Stigmast-4-en-3-one and Gamma Sitosterol display highest lipophilicity (8.319 and 8.084, respectively) whereas (13 R)-8,13-epoxylabd-14-ene, Aurapten and 3-Methoxyamphetamine are in the border line. Only 3-Hydroxy-4-methoxybenzoic acid violates the Veber rule (PSA 2 D of 151.42) whereas the remaining compounds were accepted as oral bioavailability of potential drugs by means of Veber's rule. Compounds were scrutinized for their pharmacokinetic properties and toxicity using in-silico method as implemented in Discovery studio. After administrations all the phytochemicals show good absorption and moderate absorption except 3-Hydroxy-4-methoxybenzoic acid, Gamma Sitosterol and Stigmast-4-en-3-one. Bioavailability of drugs depends on the solubility level and the majority of the constituents tested in this study exhibits good and optimal solubility in water at 25 C. (13 R)-8,13-epoxylabd-14-ene, Tigogenin, Gamma Sitosterol exhibits very low solubility. BBB model describes the blood brain barrier penetration level of the drug to the central nervous system of the brain and spinal cord. Note that, small lipophilic potential drug molecules should enter the nervous system and stay there a long time for the desired action. In this study, majorities of drugs are in very high, high, medium and low level of BBB except those for 4 0 ,7-Dimethoxykaempferol, Gamma Sitosterol, Stigmast-4en-3-one, 3-Hydroxy-4-methoxybenzoic acid, dl-Phenylephrine, Ervoside and Methoxykaempferol. Aegelinoside B is at the border level and all constituents are non-inhibitors except Aeglemarmaelosine, oct-1-en-2-ol and 3-Butylindolizidine. Eight, twelve, eight and eleven constituents of Coleus Amboinicus, Aerva Lanata, Aegle Marmelos and Biophytum Sensitivum, respectively, are toxic in nature. The binding of a drug with plasma protein was described using PPB Tight binders, as shown by 49 compounds (<1) remaining indicating the weak binders. Gamma Sitosterol, Anhydromarmeline and Stigmast-4-en-3-one show a value of greater than 7. All the well docked ligands are acceptable as a drug in terms of pharmacokinetics. All chemical constituents provided in the Table SI 11-14 exhibits good druglikeness and passed pharmacokinetics properties of majority of compounds.

Molecular dynamics simulation
The three different force field parameters available in the website namely, CHARMM27, GROMOS96 43al and GROMOS96 54a7, are compared before conducting all calculations. We used M pro protein for the benchmark studies and based up on the RMSD results further studies have been performed. Other parameters are same for all cases and we found that protein using GROMOS96 54a7 force field exhibit a maximum fluctuation in the RMSD plot throughout the simulations (0.51 ns). On the other hand, while using CHARMM 27 it displays a slight structural change after 85 ns. It is also found that the M pro using GROMOS96 43al force field is less flexible. It attains stability after 10 ns and its average RMSD is 0.30 ns (Figure SI 3).
In order to evaluate the stability of ligand bound protein, conformations, flexibility and compactness, we performed MD simulations using GROMOS96 43al forcefield. For this, we selected the top six ligand docked protein complexes with M pro and also evaluated the ligand induced changes on the protein structure. Figure 7A represents the root-mean-square deviation (RMSD) profile of docked complexes, which indicates the stability of the protein ligand complex. RMSD values were gradually increased till 7 ns time scale except Epicatechin, which stabilized quickly after about 4 ns for the first run. Proteins with Aegelinosides B, Ervoside, Feruloyltyramine, Epicatechin and Marmin were quite rigid with RMSD < 0.45 nm after 17 ns. On the other hand, Mpro-Epoxyaurapten complex show rigidity with RMSD < 0.55 nm. Initially all complexes with ligands were flexible in nature and then they reached a stable state. Epicatechin bound complex show some marginal fluctuation compared to others in the first run. However, it attains stability quickly in second and third run simulation. Three replicas of MD simulation have been performed for the all six docked protein complexes for 100 ns. All complexes show a similar trend in RMSD for at least two run simulations whereas, Epoxyaurapten and Marmin exhibit similar RMSD plot for three replica calculations. The complexes achieved stability after 25 ns ( Figure SI 4). We observed that this system is highly stable as compared to the M pro -Lopinavir complex (Bhardwaj et al., 2021) (Figure SI 7). Hence, these complexes could be regarded as stable systems. Solvent accessible surface area (SASA) and radius of gyration (Rg) are used to analyze solvent accessibility of all complexes and structural compactness of protein. We observed that the average value of Rg for M pro and M pro -Marmin has almost similar Rg value (2.13 nm) and others show values in the range of 2.10 to 2.16 nm for three MD run. M pro -Feruloyltyramine (Rg ¼ 2.15 ns) shows highest Rg and this suggests slightly less compactness in comparison to other complexes and free M pro (Figure 9 A). Khan et al. reported that Remdesivir, Saquinavir and Darunavir provide average Rg score values (2.2 ± 0.1 nm) which are at the upper range of obtained score of our drugs (2.10 nm À 2.16 nm) . M pro -Epoxyaurapten shows highest compactness and stable folding due to being lowest in the Rg value, which was maintained till the end of the simulation for three MD runs ( Figure SI 8)  . Moreover, Rg of all complexes and M pro (Figure SI 6) decrease over the simulation time, meaning that binding of ligands helps stabilization of the whole complex (Alazmi & Motwalli, 2021). The effect of solvent molecules on the residues of M pro when it is   In addition, we evaluated the intermolecular hydrogen bonds formed between protein and ligands during the simulation. In fact, H-bond is a main factor in determining selectivity, stabilization and binding affinity. The M pro -Aegelinoside B contains a maximum of five hydrogen bonds, out of which two hydrogen bonds were consistently seen during the simulation. Moreover, within 0.35 nm we identified a maximum of 8 hydrogen bonds. M pro -Epicatechin, M pro -Epoxyaurapten, M pro -Ervoside, M pro -Feruloyltyramine and M pro -Marmin contains an average of 1, 1, 2, 2, and 1 hydrogen bonds, respectively, for MD runs ( Figure SI 10-12). The constant range of intermolecular hydrogen bonding of M pro -Epoxyaurapten persists throughout the simulation indicating highest stability compared to other ligands and no change in the conformation (Chowdhury, 2021). Total volume of all systems and M pro lies in between 56 nm 3 to 59 nm 3 and the density lies between 951 g/l to 1200 g/l ( Figure SI 12). Similar volume and density of all systems were seen for other two MD runs. M pro -Marmin, M pro -Epicatechin and M pro -feruloyltyramine show a slight increment of 200 g/l as compared to other M pro -Ligands. In the case of M pro also we observed an increment of 1050 g/l towards the end of the MD.
We have evaluated the interacting residues of protein with six ligands after MD simulations. Table SI 15 indicates the interacting residues of protein with ligands after the MD simulation. Aegelinoside B shows two hydrogen bonds with two pi-alkyl interactions, one anion-pi and one cation-pi interactions. The crucial residues of proteins which involve hydrogen bonds with ligands are, SER46 and GLY143. However, the docked systems show seven hydrogen bonds and the pocket atoms of proteins are changed after MD. In Epoxyaurapten and Marmin, no hydrogen bonds could be observed. Interestingly, Ervoside shows seven hydrogen bonds and other non-bonds are pi-sulphur, pi-pi stacked and pi-alkyl interactions. Hydrogen bonds are found in ASN119, ASN142, SER144, SER46, THR24, THR26 and CYC44 with Ervoside ligands (All show H-bond distances less than 3 Å). Epicatechin and Feruloyltyramin exhibit three and two hydrogen bonds, respectively. Epicatechin exhibits Hydrogen bond interaction with GLU166, GLN192 and GLN189 residues of protein, whereas Feruloyltyramin shows hydrogen bond interaction at GLY143 and HIS165 residues of protein. Upon comparing structures of docked systems and systems after MD simulations ( Figure SI 14-19) it is vividly observable that a slight rearrangement in terms of confirmation of ligands and pockets of proteins occurs in order to interact each of them effectively during the movement.
From our theoretical studies, it is clear that out of the 87 phytochemcials present in the 4 herbal medicines, the many of the ligands passed ADMET, Lipinsky rule and viber rule. Out of the 6 best docked systems, Epoxyaurapten, Marmin and Ervoside exhibit toxicity. Interestingly, our best docked system Aegelinoside B has passed all five pharmacokinetic properties compared to the other five phytochemicals. Furthermore, it is observed that this particular phytochemical only shows a maximum of seven conventional hydrogen bond interactions with M pro which in turn always increases the docking score. In contrast, Remesivir shows four hydrogen bonds and high LibDock score of 169. MD simulation of Aegelinoside B also reveals its high stability with M pro . Interestingly, after dynamics the binding site changed slightly and the ligand orientation is also quite different. Based upon this, it is understandable that Aegelinoside B is the best phytochemical out of the 87 phytochemicals analyzed in the current study.

Conclusions
The current study explores the bioactive phytochemicals from four popular herbs for targeting the M pro receptor of SARS-CoV-2 through molecular docking and molecular dynamics studies. Among the 87 phytochemicals chosen for docking study, 6 active phytochemicals can be suggested as the M pro inhibitors based upon the highest docking score. They are, namely, Ervoside and Feruloyltyramine (from Aerva L), Epicatechin (from Biophytum Sensitivum), Epoxyaurapten, Marmin, and Aegelinoside B (from Aegle Marmelos). Aegelinoside B is identified to be the top ranked for docking/ binding over the other phytochemicals tested in the current study (Libdock score: 142.50 and binding energy: À8.54 kcal/ mol). Based upon the best ranking in terms of dock score values, 26 ligands were chosen for molecular dynamics simulations for further understanding of structural features such as fluctuations, stability and ligand binding ability, conformational analysis and hydrogen bonds. The interesting results observed in this study paves way for follow up studies on various medicinal herbs found in the locality of the authors and make them familiar to the scientific community. These overlooked herbs have traditionally been used for asthma/ bronchitis for years and systematic docking and molecular dynamics studies with M pro will provide the required attention from the international scientific community.