In silico screening of potential antiviral inhibitors against SARS-CoV-2 main protease

ABSTRACT Respiratory illness due to SARS-CoV-2 emerged in 2019 and has a significant morbidity and mortality rate. The main protease (Mpro) is mainly responsible for viral replications, which acts as a good drug target to inhibit SARS-CoV-2-related diseases. Chemical compounds obtained from various herbal plants are showing potent antiviral activity against numerous viral diseases. Initial screening was performed with the phytochemicals against Mpro using molecular docking. This result shows that there is a strong interaction exhibited between active sites (His-41 and Cys-145) of Mpro with chemical compounds. In addition, ADME prediction and Lipinski’s rule of five (RO5) calculations demonstrated that the selected compounds have potential drug-like properties. Further, molecular dynamics (MD) simulations were performed to understand the stability and structural changes of protein–ligand complexes for the top five compounds. MM/PBSA studies strongly suggested that compounds, β-spinasterol, and asarinin form stable complexes with Mpro. The most significant hot spot residues such as Thr-25, Met-49, Cys-145, Met-165, and Gln-189 have strongly interacted with the selected chemical compounds. Our calculations suggest that asarinin is the best inhibitor to the Mpro, which supports these candidates and could be potent antiviral agent against SARS-CoV-2. GRAPHICAL ABSTRACT


Introduction
The novel coronavirus (CoV) is a large family of viruses that causes more severe diseases such as severe acute respiratory syndrome (SARS in 2002(SARS in -2003 [1] and middle east respiratory syndrome (MERS in 2012) [2]. In December 2019, an unknown pneumonia etiology emerged in Wuhan city, China. They observed symptoms such as dry cough, sore throat, dyspnea, and fever [3]. World Health Organization (WHO) declared that the Public Health Emergency of International Concern of COVID-19 is a pandemic disease by January 2020 [4]. The report states that SARS-CoV-2 is transmitted from bats to humans (i.e. Zoonotic). Later, SARS-CoV-2 transmissions from human to human were confirmed. It causes common cold, lung failure and leads to death [5]. The number of confirmed cases and deaths hikes significantly.
Coronaviruses belong to the subfamily of Coronaviridae and this subfamily includes four genera: (i) Alpha coronavirus (ii) Beta coronavirus (iii) Gamma coronavirus (iv) Delta coronavirus. The genome of the CoV is a single-stranded positivesense RNA virus ((+)ssRNA), it has ∼30 kilobase which encodes multiple structural and nonstructural proteins (Nsp1-16) [6]. The structural proteins are Spike (S), Envelope (E), Membrane (M), and Nucleocapsid (N), whereas nonstructural polyproteins (Nsp) are responsible for viral functions, replications, and survival in the host cell. Two proteases 3-chymotrypsin-like-protease (3CL pro ) or main protease (Mpro) and Papain-like-protease (PL pro ) undergo autocleavage [7]. The report states that human angiotensin-converting enzyme 2 (hACE2) [8] is the key receptor and main entry point into the cell for SARS-CoV-2 [9]. The spike glycoprotein can bind and then fuse with human cells. Cryo-EM structure analysis has revealed a higher binding affinity for SARS-CoV-2 with human cells of ACE2 than SARS-CoV [10]. It concluded that SARS-CoV-2 possessed more transmissibility. Selvam et al. collected a review of targets, structures, and inhibitors of SARS-CoV-1, which is helpful to understand the inhibition mechanism of SARS-CoV-2 [11]. Recently, the mutation occurred in genomes of SARS-CoV-2, the most prevalent one is D614G. It is more highly contagious than previous infections [12].
PLpro is responsible for the formations of Nsp1-3 and Mpro is responsible for the formations of Nsp4-16. Mpro causes the catalytic cleavage of 11 polyproteins, whereas PLpro causes only 3 polyproteins [13,14]. Therefore, Mpro is playing a vital role in replications of viral polyproteins and hence it is a potential drug target to inhibit SARS-CoV-2 [15][16][17][18][19]. The active site of Mpro has a catalytic dyad, which is formed by His-41 and Cys-145 located in domain I (residues 8-101) and domain II (residues 102-184), respectively. The initial drug addressed in this pandemic is hydroxychloroquine (HCQ), which is an antiviral drug and is used in the treatment of malaria whose action is reinforced with the addition of azithromycin which was reported by the French research society [20]. Also, some of the other antiviral drugs have been used at present in both preventive as well as clinical drugs to treat COVID-19 [21]. Unfortunately, the efficacy and the side effects of these drugs are still not yet known.
Thus, the design and development of antiviral drug molecules to inhibit the functions of Mpro are essential. Naturally occurring [22] (i.e. herbal plants and foods) chemical compounds have antiviral activity, used as a medicine worldwide. Herbal plants and their phytoconstituents [23,24] (extracted from stems, roots, seeds, barks, foods and flowers) result in potential antiviral activity for various diseases and enhance the immune system [25]. In this pandemic COVID-19, the screening of phytoconstituents and their derivatives is helpful in controlling the infections of SARS-CoV-2 [26]. Computeraided drug design (CADD) approaches are used to screen and identify suitable antiviral drugs against Mpro of SARS-CoV-2 [27]. This method is affordable and decreases the time to discover a new antiviral drug candidate. Singam et al. screened the novel inhibitors which are obtained from the Nuclei of Bioassays, Ecophysiology, and Biosynthesis of Natural Products Database (NuBBE DB ) against the Mpro [28]. Nhung and his co-workers studied the organosulfur compounds from garlic essential oil using gas chromatography-mass spectroscopy (GC-MS) and molecular docking techniques. They found that organosulfur compounds strongly interacted with the human ACE2 and Mpro of SARS-CoV-2 [29]. Kulkarni et al. selected the compounds from various essential oils and their docking analysis reveals that monoterpenes, terpenoid phenols, and phenyl propanoids have a stronger binding affinity with the spike receptor-binding domain (RBD) [30]. Gutierrez-Villagomez et al. studied the antiviral activity of alkamides and piperamides against the Mpro, RdRp (RNAdependent RNA polymerase), and ACE2. They calculated Lipinski's rule and ADME properties and evaluated the drug-like properties [31]. Recently, Wang et al. reported that the known anticancer drugs act as potential inhibitors against SARS-CoV-2 Mpro. These results unravel the antiviral activity of naturally occurring compounds and strongly inhibit the replication process of proteases from SARS-CoV-2. Natural products and biocompatible ionic liquids strongly inhibit Mpro activity [32].
Even though there are numerous in vivo and in vitro approaches present in the design and development of potential antiviral inhibitors for SARS-CoV-2 [33], still there is a lack of evidence to understand the stability, structural changes and active site interactions of protein-ligand complexes. Our study is mainly focused to find potent antiviral drugs from natural compounds against Mpro and unravel the interaction mechanism of the same. For this, we have selected a series of natural products from various herbal plants (non-toxic with no side effects) and performed docking-based virtual screening, molecular dynamics (MD) simulations and predicted ADME properties. In this work, computational investigations were carried out to evaluate the strong antiviral activity of phytochemicals against Mpro. Molecular-level insights will be helpful to design and develop natural inhibitors to control coronavirus-related disease. Our study will provide potential antiviral drug molecules to strongly inhibit the Mpro of SARS-CoV-2.

Materials and methods
The crystallographic 3D structure of Mpro-apo (PDB ID: 6Y2E; resolution 1.75 Å) [34] was retrieved from the protein data bank (www.rcsb.org). The PDB structure of Mpro was used for structure-based virtual screening and considered catalytic dyads of His-41 and Cys-145 are the active sites of Mpro. The Mpro structure is displayed in Figure 1. All the water molecules were removed and polar hydrogen atoms were added to protein preparation of Mpro. We have selected 25 chemical compounds from various herbal plants, and their coordinates were obtained from PubChem. 2D structures of all chemical compounds were shown in Figure 2. All these structures were optimised with B3LYP/6-31 + G* level of theory and their chirality was maintained. This method of optimisation is adequate for organic compounds, there are no structural changes when using a higher method of optimisation. Subsequently, frequency calculations were carried out to confirm the energy minima in the potential energy surface (PES). All the compounds were optimised using the Gaussian16 program [35]. The optimised compounds were considered ligands for the screening process.

Molecular docking simulations
Molecular docking [36] was employed to predict the predominant binding modes of chemical compounds with SARS-CoV-2 Mpro. The best binding pose is considered the lowest binding energy of the complex. Molecular docking was performed using Autodock 4.2.6 (rigid docking), which is the most cited free software and has significant accuracy and performance [37]. The receptor of Mpro and ligands was prepared by adding atomic charges and atom types, then saved as PDBQT. The active sites of Mpro contain His-41 and Cys-145, which are covered by the grid box. The dimensions of an active site box were 60 × 126 × 126 Å 3 with a default grid spacing of 0.375 Å was used for the docking. The selected 25 natural compounds from various herbal plants were docked with active sites of Mpro. The most stable 10 binding conformations were obtained, and we investigated the interaction sites of protein-ligand complexes. The ADME properties (absorption, distribution, metabolism, and excretion) and Lipinski's rule of five (RO5) were calculated using SwissADME [38] and PKSCM [39]. The predicted properties include molecular weight, H-bond donors, H-bond acceptors, gastrointestinal (GI) absorption, rotatable bonds, topological surface area (TPSA), hepatotoxicity, number of violations to rule of five and PAINS (Pan-assay interference). This will help to understand the drug-like properties of all chemical compounds.

Molecular dynamics simulations
MD simulations were performed to understand (i) the relative stability and structural changes of Mpro-apo and Mpro bound with ligand (6Y2E + inhibitors) (ii) binding free energy calculation and residue-ligand decomposition analysis. The protein-ligand complexes were placed in a cubic box with the dimensions of 97.5 Å × 97.5 Å × 97.5 Å. Mpro of SARS-CoV-2 was parameterised with Amber99SB-ILDN force field [40], and the box was solvated with SPC/E water model. The ligands were parameterised with generalised amber forcefield (GAFF) by using AmberTools18 [41] and ACPYPE [42] protocol. The partial atomic charges were assigned for ligands using the restrained electrostatic potential (RESP) method [43], which is computed by quantum chemical calculation at B3LYP/6-31 + G(d,p) basis set. Four Na + ions were added to neutralise the system. The protein-ligand complexes were minimised by using the steepest descent algorithm. Further, the entire system was equilibrated under an NVT and NPT ensemble for 5 ns using velocity-rescaling and Berendsen coupling to maintain the temperature at 300 K and pressure at 1 bar, respectively. Periodic boundary condition (PBC) was applied for all three dimensions. The electrostatic interactions were calculated using the Particle Mesh Ewald (PME) method with a cutoff of 1.2 nm. We employed the LINCS algorithm to constrain all bonds involving hydrogen atoms. In the production run, all the system was simulated for 100 ns and the integration time step of 2 fs was used. MD simulations were performed by using Gromacs 2016.3 package [44]. The binding free energy was calculated between the Mpro and ligands using the g_mmpbsa tool [45]. The trajectories and structural conformations are visualised by using pyMOL [46] and VMD software [47].

Results and discussion
The pandemic coronavirus causes severe respiratory illness, and to date, the available drugs are not sufficient to control the infections. Currently used drugs causes an adverse effect, and the inhibition mechanism is also not clear. Our study aims to identify the potential antiviral inhibitors for COVID-19. Mpro of SARS-CoV-2 cleaves more polyproteins than PLpro, and it plays a significant role in the process of viral replication. So Mpro acts as a good drug target for the inhibition of SARS-CoV-2. We unravel the chemical activity of various natural compounds with active sites of Mpro using CADD approaches [48]. In silico approaches will be helpful to get more insights about ligand interactions and structural changes of Mpro-complexes. In this study, we performed the two-step hierarchical virtual screening (HVS) of natural inhibitors against Mpro. First, molecular docking has been widely used to screen a large number of chemical compounds based on their binding affinity with active sites of Mpro. Next, the docked hits were subjected to MD simulations to understand the stability and structural changes of protein-ligand complexes. The final natural inhibitors were selected based on the MM-PBSA binding free energy calculation. Thereby, potent antiviral drugs can be identified to inhibit the SARS-CoV-2 Mpro.

Molecular docking
Molecular docking is an economical and fast pre-screening tool used to find out the ligand conformations, positions, orientations of binding sites, and binding affinities of drug molecules. We employed Autodock 4.2 to rapidly estimate the binding pose of Mpro with our selected ligands. The success rate of docking affinity for this package is 77%, according to the previous benchmark studies [49]. Huynh et al. reported that docking results of curcumin (−7.1 kcal/mol) and quercetin (−6.6 kcal/mol) with Autodock tools, these results are well correlated with our docking results and interacted with the active sites of Mpro [50]. Molecular docking is the first step of the HVS process. The predicted binding affinity (in kcal/ mol) of the chemical compounds is considered as the antiviral activity against SARS-CoV-2 Mpro. The binding affinity of the top five docked hits and their type of residue interactions were provided in Table 1. The obtained binding affinity falls in the range from −4.7 to −10.6 kcal/mol. Five chemical compounds show a larger binding affinity to Mpro, where the range is more than −8.0 kcal/mol. Our docking results show stronger binding affinity compared with FDA-approved drugs, this will lead to find potent antiviral drugs against SARS-CoV-2 Mpro [51]. The complete details of selected compounds, respective sources, binding affinity, and their potential active site residue interactions are shown in supplementary Materials Table S1.

ADME properties
ADMET profiles (absorption, distribution, metabolism, excretion and toxicity) were evaluated to understand the theoretical drug-like properties of our selected ligands. Lipinski's Rule of five (RO5) was assessed for all the compounds to identify the oral activity of drugs in humans [59]. The pharmacokinetic properties of all compounds were shown in Table 2. Some of the compounds violate MLogP but other properties of RO5 are in the acceptable range. The number of acceptable violations of Lipinski's rule is one. From the top five compounds, glabridin and asarinin exhibit high GI absorption and abides by the RO5.

Molecular dynamics simulations
The second step of HVS is the all-atom molecular dynamics simulations. There are several factors such as ligand protonation state, conformational changes, ions, and solvation effects are involved in the MD simulations. It is the refinement of docking results and to get more insight into the structural changes of the protein-ligand complexes. MD simulations were performed for the selected Mpro-complexes and Mproapo structures to understand the structure and stability of the complexes. This will be helpful in the optimisation of screened ligands and rational drug design. Plots of total energies with time will be helpful to confirm the stability of the complexes during the entire MD simulations, which is shown in Supplementary Materials ( Figure S2). Root mean square deviation (RMSD) determines the stability of the protein-ligand system, Root mean square fluctuation (RMSF) depicts ligand effects on system fluctuations, Radius of gyration (R g ) shows compactness of global structure, and binding free energy (MM-PBSA) calculations are helpful to determine the binding affinity between protein and ligand. The free energy decomposition analysis was also carried out  to find the contributions of specific amino acids on proteinligand interactions. These free energy methods quantify the most efficient inhibitors and thereby lead to find potent Mpro inhibitors.

Structural stability of Mpro-apo and Mprocomplexes
The trajectories of Mpro-apo and Mpro-complexes are analyzed to understand the stability and conformational changes (RMSD) of the system. Results reveal that the backbone of Mpro-complexes attains stable deviations to entire MD simulations, and the average deviation is within 2Å. The ligands strongly interacted with Mpro and stabilised the proteinligand complexes, RMSD plots were displayed in Figure 4. In summary, the RMSD shows that ligands are strongly stabilised the protein during the overall MD simulations. In the first ∼5 ns, lupeol was located in its initial position, after that the position was slightly changed. It forms stronger H-bonding interactions with Thr-26. All ligands remain in stable positions in the active sites of Mpro. The calculated RMSD backbone and C-alpha values for Mpro-apo and Mpro-complexes were shown in Supplementary Materials (Table S2). The calculated residue-wise fluctuations (RMSF) for each amino acid of Mpro-apo and Mpro-complexes were shown in Supplementary Materials ( Figure S3 and Table S2 Lupeol (also known as Fagarsterol) belongs to the triterpene group which is enriched in vegetables such as pepper, curcumin, strawberry and red grapes and some medicinal plants such as American ginseng and sea butter plant. It exhibits significant anti-inflammatory activity, which is equal to dexamethasone. The triterpene group directly inhibits the growth of human cancer cells under in vivo and in vitro systems. Lupeol act as a therapeutic and chemopreventive agent for the treatment of cancer and diabetes [65,66]. The -OH group of lupeol, completely rotates their structure and interacts with the other amino acids. Thr-26, His-41,  Cys-145, and Glu-166 of Mpro made hydrophobic interactions with lupeol while Thr-26 forms a strong H-bond interaction (O … H-N) at the distance of 1.8 Å (Figure 5(C,  D)). Licorice is widely available for herbal treatment used in Traditional Chinese Medicine (TCM). Glabridin is the main active compound, which has antiviral and antimicrobial activity [67,68]. It is a type of isoflavonoid that interacts with residues Thr-26, Arg-188 forms non-conventional Hbond and forms π-π stacking interactions with His-41 Mpro (Figure 6(A,C)). The interacting sites were slightly changed during the MD simulations by the flipping of glabridin. But still, maintains their interactions with the Mpro active sites. Licorice is used for thousands of years in Chinese medicine as a life-enhancing agent [68]. Asarinin which contains a furofuran ring, extracted from the roots of Asarum sieboldii. It exhibits cytotoxicity activity against human ovarian cancer cells [69].

MM-PBSA calculation
After MD simulations, free energy methods can be used to obtain the binding free energy of protein-ligand complexes. Free energy methods are diverse and have high accuracy for estimating Mpro with ligands. A total of 1000 snapshots were extracted evenly from the MD simulation. From the snapshots, the energy of molecular mechanical (E MM ) and solvation accessible (PBSA) were calculated for our system. The energy terms of electrostatic, van der Waals (vdWs), polar and solvent-accessible surface area and total binding energies were displayed in Figure 7. The electrostatic, vdWs, and nonpolar solvation-free energy are dominantly driving the binding energy. The polar solvation-free energy destabilises the binding interactions. The vdWs interactions predominantly enhance the binding energy between protein and ligands. Compounds β-sitosterol [70], spinasterol [71] and asarinin [72] possess stronger binding interactions with Mpro structure. The respective binding energies are −24.56, −26.07 and −25.76 kcal/mol obtained from MM/PBSA. Especially, asarinin shows higher vdW, electrostatic, and polar interactions with Mpro. The average binding energy for all the simulated complexes was plotted with time which is displayed in Supplementary Materials Figure S4. The binding free energy plots of lupeol and glabridin complexes show large fluctuations and their binding energies are −21.08 and −18.17 kcal/mol, respectively. From the energy calculations, we concluded βsitosterol, spinasterol and asarinin exhibit stronger interactions with Mpro.
Further, we calculated free energy decomposition analysis to understand the contribution of specific amino acid interaction on binding free energy using MM-PBSA. Free energy decomposition analysis to identify the hot spot residues that participate in protein-ligand binding interactions. The hot spot residues are helpful to rationalise the drug design of potent inhibitors to the drug target. We extracted 1000 snapshots for each system and the average ligand-residue interaction energies (ΔE lig-res ) were calculated. The hot spot residue is defined by the residues that possess (ΔE lig-res ) more than −1.0 kcal/mol, which is listed in Supplementary materials Table S3. The interaction energies between ligands and residues of Mpro are displayed in Figure 8. We considered the most significant hot spot residue is (ΔE lig-res ≥ −3.0 kcal/mol), and the common significant hot spot residues in proteins are Thr-25, Met-49, Cys-145, Met-165, and Gln-189. These residues constantly interact with the ligands through hydrophobic interactions. Mostly, asarinin has stronger interactions with Mpro residues. It forms stronger per-residue interactions of −6.7 and −7.7 kcal/mol with Met-49 and Met-165, respectively. β-Sitosterol, spinasterol and asarinin have stronger interactions with Mpro, and asarinin exhibits high GI absorption.

Conclusions
The contagious disease of SARS-CoV-2 spread worldwide, and the mortality rate also hikes. It is an urgent need to find potential antiviral drugs against SARS-CoV-2. CADD approaches are helpful to identify the potential drug against COVID-19. Available drugs cause adverse effect and their interaction mechanisms are still elusive. The selected phytochemicals from various herbal plants strongly interacted with the Mpro of SARS-CoV-2. We have selected 25 chemical compounds, which are involved in the apoptosis process and exhibit antiviral and anti-inflammatory activities. Molecular docking was used to screen the chemical compounds, according to their binding energies. ADME properties help to understand the drug-like properties of all chemical compounds. The five compounds whose binding affinities are greater than −8.0 kcal/mol are selected for further screening where MD simulation was incorporated. As a result, protein-ligand complexes are maintained stable throughout the simulation time.
Thereby, the stability was improved in the presence of chemical compounds. MM-PBSA calculation provided the binding free energy for all compounds. Compounds β-sitosterol, spinasterol, and asarinin show stronger interactions with Mpro of SARS-CoV-2. The most significant hot spot residues Thr-25, Met-49, Cys-145, Met-165, and Gln-189 made important contributions to protein-ligand binding, which can help drug design for target interactions of Mpro. We found out that among these three compounds, the chemical scaffold of asarinin is attractive and the predicted GI absorption is high and made stronger interactions with Mpro. In conclusion, we screened the natural inhibitors, which is helpful to proceed with further drug development against the Mpro of SARS-CoV-2.

Code availability
Gaussian16, Gromacs and Amber packages.

Disclosure statement
No potential conflict of interest was reported by the author(s).

Data availability statement
The data are available from the corresponding author upon reasonable request.