Inhibition of multiple SARS-CoV-2 proteins by an antiviral biomolecule, seselin from Aegle marmelos deciphered using molecular docking analysis

Abstract Our earlier experimental and computational report produced evidence on the antiviral nature of the compound seselin purified from the leaf extracts of Aegle marmelos against Bombyx mori Nuclear Polyhedrosis Virus (BmNPV). In the pandemic situation of COVID-19 caused by the SARS-COV-2 virus, an in silico effort to evaluate the potentiality of the seselin was made to test its efficacy against multiple targets of SARS-COV-2 such as spike protein S2, COVID-19 main protease and free enzyme of the SARS-CoV-2 (2019-nCoV) main protease. The ligand seselin showed the best interaction with receptors, spike protein S2, COVID-19 main protease and free enzyme of the SARS-CoV-2 (2019-nCoV) main protease with a binding energy of −6.3 kcal/mol, −6.9 kcal/mol and −6.7 kcal/mol, respectively. Docking analysis with three different receptors identified that all the computationally predicted lowest energy complexes were stabilized by intermolecular hydrogen bonds and stacking interactions. The amino acid residues involved in interactions were ASP1184, GLU1182, ARG1185 and SER943 for spike protein, SER1003, ALA958 and THR961 for COVID-19 main protease, and for SARS-CoV-2 (2019-nCoV) main protease, it was THR111, GLN110 and THR292. The MD simulation and MM/PBSA analysis showed that the compound seselin could effectively bind with the target receptors. The outcome of pharmacokinetic analysis suggested that the compound had favourable drugability properties. The results suggested that the seselin had inhibitory potential over multiple SARS-COV-2 targets and hold a high potential to work effectively as a novel drug for COVID-19 if evaluated in experimental setups in the foreseeable future. Communicated by Ramaswamy H. Sarma


Introduction
The outbreak of coronavirus disease 2019  recently has transitioned into a pandemic state and consumed numerous human lives due to its contagious nature. This dreadful disease has infected more than 182 million people, and the reported deaths have been over 4 million, till date [World Health Organization (WHO), Coronavirus, COVID-19 update -https://covid19.who.int/]. The antiviral drugs treated for many diseases have been tested onto COVID-19 as well, reckoning by its potentiality. But these current medications, which include administration of repositioned Food and Drug Administration (FDA) approved drugs, namely lopinavir/ritonavir (experimental, retrovirus protease inhibitor), remdesivir (experimental, viral polymerase inhibitor) and favipiravir [experimental, viral ribonucleic acid (RNA) polymerase inhibitor] etc., are based on in vitro studies reported for COVID-19 and recommended by testing batches of patients across the world. Nevertheless, it is still in a biased and anecdotal state (Scavone et al., 2020).
The genome of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) comprises single-stranded (þ) RNA that encodes for structural and non-structural proteins. Spike glycoprotein, envelope protein, membrane protein, and nucleocapsid protein comes under structural proteins. The non-structural proteins are the enzymes involved in the replication of virus. Those are polymerase, helicase and endoribonuclease (Jogalekar et al., 2020). In a short period, documentations on molecular docking of naturally occurring plant compounds with SARS-CoV-2 proteins have been reported with plausible lower binding energy. Among these, particular attention has been given to spike protein and proteases of SARS-CoV-2 (Hall & Ji, 2020). This is because, upon the molecular studies on virus transmission, it is evident that i) the spike protein binds to the host cellular receptor angiotensin converting enzyme-2 (ACE2) and responsible for the fusion between the viral envelope and the cellular membrane ii) the proteolytic cleavage of polyprotein by a papainlike cysteine protease and 3 C-like serine protease forms the replication enzymes of virus (Hofmann & P€ ohlmann, 2004;Wan et al., 2020;Wrapp et al., 2020;. The resulted information of reported docking studies highly facilitates the categorization of drugs and takes a call on proceedings with potential candidates for in vitro and follow up studies. Plants especially possessing medicinal values, have been a preferable source for human healthcare purposes, particularly by discovering lifesaving drugs (Jimenez-Garcia et al., 2013). Therefore, the ideal approach of ethnopharmacology is to classify the evidence of medicinal plants and isolate the drugs that take the edge off human illness (Farnsworth, 1994). In screening for bioactive compounds, the assortment of the plant species to be studied is a decisive factor for the eventual success of the investigation. In that way, Aegle marmelos, commonly called as bael (Family: Rutaceae), reported to be rich in several bioactive compounds such as aegelin, lupeol, cuminaldehyde, eugenol, skimmianine, cineol, citral, marmesinin, marmelide, b-sitosterol, flavone, glycoside and marmeline, was chosen to the study for the following reasons (Arul et al., 1999;Farooq, 2005;Govindachari & Premila, 1983;Manandhar et al., 1978). A. marmelos exhibited antiviral activity against Human coxsackieviruses B1-B6, an enterovirus (that also includes polioviruses and echoviruses) belongs to the picornaviridae which causes clinical manifestations such as respiratory illness (types 2-5), meningoencephalaphitis (types 1-5), myocarditis (types 1-5), myocardiopathy, pancreatitis etc., (Badam et al., 2002;Melnick, 1984). Interestingly, a high throughput screening of approximately 6800 small molecules identified that proteases of picornavirus and coronavirus have a common inhibitor (Kuo et al., 2009). Significantly, in our earlier report, we have successfully purified the molecule seselin from the leaf extracts of A. marmelos (Somu et al., 2019). It is noteworthy that it exhibited antiviral activity against Bombyx mori nuclear polyhedrosis virus (BmNPV). The process of drug development is a high risk associated platform with numerous evaluations but a high payoff endeavour ultimately (Chadwick & Marsh, 2008). Amid pandemic scenario, rapid and prompt report analysis of drug's eligibility using in silico approach would serve as an initiating factor for a study to pursue on its basis. In this context, on presuming the potentiality of seselin against the SARS-CoV-2 based on its antiviral nature, this study aims to analyse the in silico structural inhibition of SARS-CoV-2 proteins by seselin.

Preparation of ligand
The structure of the ligand seselin, an anti-BmNPV compound obtained from the purified crystal in our earlier report (Somu et al., 2019) was drawn using MarvinSketch program (MarvinSketch, ChemAxon). PRODRG server was used to describe the ligand molecules, and a variety of topologies for use with GROMACS, WHAT IF, Autodock, HEX etc., were generated. CORINA-SMILES notations verified the atomic coordinate and quality of structures. To prepare the ligand for docking, Open Babel (O'Boyle et al., 2011) in PyRx 0.8 (Dallakyan, 2008) was used. The Graphical User Interface was used to change energy minimization parameters, and the energy of the ligand was minimized. Finally, the ligand was converted into Autodock ligand (PDBQT) format for docking.

Preparation of receptor
The structures of Spike protein S2 (PDB-ID: 6LXT) (Zhu & Sun, 2020), the crystal structure of COVID-19 main protease in complex with an inhibitor N3 (PDB ID: 6LU7)  and the free enzyme of the SARS-CoV-2 (2019-nCoV) main protease (PBD ID: 6Y2E)  were downloaded from the RCSB protein data bank (https://www.rcsb. org/search). The receptors preparation was performed by optimizing protein model geometry, removing ligands and other heteroatoms using Biovia Discovery studio 20.1.0.

Prediction of potential ligand binding site
Before docking analysis, the ligand-binding sites on protein surface were identified using Metapocket 2.0 (https://projects.biotec.tu-dresden.de/metapocket/index.php). MetaPocket 2.0 uses a consensus method in which the predicted binding sites from eight methods such as LIGSITE cs , PASS, QSiteFinder, SURFNET, Fpocket, GHECOM, Concavity and POCASA are combined to improve the success rate of prediction. A z-score is calculated separately for each pocket site in different predictors. Top 10 potential ligand-binding sites were retrieved to analyze active site binding residues and comparison of docking results.

Virtual screening and visualization
PyRx 0.8 was used for virtual screening that uses Vina and AutoDock 4.2 as docking softwares with Lamarckian Genetic Algorithm (LGA) as the scoring function. Docking analysis of seselin with three different receptors was carried out using AutoDock vina (Trott & Olson, 2010), PyRx 0.8. Binding site amino acids were selected, and the docking site on the protein target was defined by generating a grid box that allows selecting the search space. All bonds of ligands were set to be rotatable. Charges were added for both ligand and receptors. The calculations for protein-fixed ligand-flexible docking were done using the Lamarckian Genetic Algorithm (LGA) method. The best conformation for each receptor was chosen with the lowest docked energy once after the docking search was completed. The interactions of protein-ligand conformations, including hydrogen bonds and the bond length, were analyzed using Discovery Studio 20.1.0 and PyMOL Molecular Graphics system, Schrondinger, LLC. The 2 D structure of the protein-ligand interaction was visualized using Biovia Discovery Studio 20.1.0.

Molecular dynamics simulations
Finally, the more profound insights about molecule binding were explored by the use of Molecular dynamics (MD) simulations studies. Here, we use Gromacs version 5.1.2 to carry out the MD simulation studies at GROMOS 43a1 force field. The simple point charge (SPC) model was used to solvate the system (Singh et al., 2019). The counter ions of 4 Na þ were added to neutralize the charge of the 6LU7 and 6Y2E systems. On the other hand, 5 Na þ ions were supplied as counter ions to the 6LXT system. All the three systems were subjected to energy minimization using the steepest descent algorithm for 10000 nsteps. The system was exposed to periodic boundary conditions, and the canonical ensemble (NVT) and isothermal-isobaric ensemble (NPT) were employed for 50000 nsteps at normal temperature (300 K) and 1 bar of pressure. Berendsen temperature coupling and standard pressure coupling were used for keeping the system in a stable environment (300 K, 1 bar). To deal with long-range Coulomb interactions of the system, the particle mesh Ewald method was used. The LINCS algorithm was used to constrain all the bonds. For each system, MD simulation was performed for 130 nanosecond (ns), and trajectories were retrieved for further analysis. For instance, the different structural parameters such as Root Mean Square Deviation (RMSD), Root Mean Square Fluctuations (RMSF), Inter-molecular Hydrogen Bonding and Free Energy Landscape (FEL) were computed as a function of time to explore the dynamic behavior of the protein-ligand binding pattern.
In essence, the binding energy calculations of the complexes were calculated using molecular mechanics Poisson-Boltzmann surface area (MM/PBSA) method. A protocol developed by the Rashmi kumari and GROMACS utilities were used to calculate the different energy terms of the system (http://rashmikumari.github.io/g_mmpbsa/) (Kumari & Kumar, 2014). The equilibrated trajectory information, the last 5 ns of each MD trajectories, were used as an input for MM/PBSA analysis. The binding free energies of each system were calculated as the amount of free energy difference between the unbound component and the complex structures. The MM/PBSA analysis provides an individual contribution to the binding energy and a per-residue contribution that provides the key amino acid residues, which might be helpful in better understanding the binding pattern of the compound.

Pharmacokinetic analysis
The drug-likeness of the ligand was predicted by Lipinski filter at pH 7.0, which helps distinguish drug-like molecules from non-drug like molecules. According to this, a drug should comply with a minimum of four of the five laid down criteria, which include molecular mass (< 500 Dalton), lipophilicity (< 5), hydrogen bond donors (< 5), hydrogen bond acceptors (< 10) and molar refractivity (between 40 and 130) (Lipinski, 2004).
The ligand properties concerning absorption, distribution, metabolism and excretion (ADME) were analyzed using SwissADME (Daina et al., 2017). In which the user can either draw their ligand or include SMILES. This website allows computing the ADME parameters, pharmacokinetic properties, drug-likeness that can support drug discovery. pkCSM online machine-learning platform (Pires et al., 2015) was used for toxicity prediction. It can provide information on water solubility, intestinal absorption (human), skin permeability, the volume of distribution, total clearance, maximum tolerated dose, skin sensitization, chronic toxicity, hepatotoxicity etc. To estimate the most probable targets of the compound SwissTargetPrediction server (Gfeller et al., 2014) was used. This online tool can predict the macromolecular targets (proteins from humans, mouse and rats) of small bioactive molecules.

Docking analysis
Docking of the molecule, seselin (Figure 1) identified and purified from the leaves of A. marmelos with three different receptors using docking procedure showed that all the computationally predicted lowest energy complexes were stabilized by intermolecular hydrogen bonds and stacking interactions. The ligand seselin showed the best interaction with target receptors based on the RMSD values. In addition to RMSD clustering, AutoDockVina was also used to calculate the binding free energies of these interactive molecules to find the best binding mode. The calculated final docked energies were À6.3 kcal/mol for Spike protein S2 (PDB-ID: 6LXT), À6.9 kcal/mol for COVID-19 main protease (PDB ID: 6LU7) and for the free enzyme of SARS-CoV-2 (2019-nCoV) main protease (PDB ID: 6Y2E), it was À6.7 kcal/mol. Docking results identified that the molecule seselin could enter the substrate-binding region of the active site ( Figure 2). The results demonstrated clearly that the molecule seselin accurately interacted with all three receptors. High binding affinity was predicted for all the receptors. The molecule seselin observed the formation of hydrogen bonding with ARG1185 for spike protein S2 (Figure 3).
In COVID-19 main protease, seselin formed hydrogen bonding with the residue SER1003, a p-alkyl interaction with ALA958 and pr interaction with the residue THR961 ( Figure 4). The ligand showing one hydrogen bond formation with the residue THR111, a Van der Waals interaction with GLN110 and a Pi-donor hydrogen interaction with THR292 was observed with free enzyme of the SARS-CoV-2 (2019-nCoV) main protease ( Figure 5). Information about the atoms involved in bonding with ligands, bond lengths and docking energies of all the three receptors were given in Table 1.

Molecular dynamics simulations
MD simulation studies were performed to identify the structural and dynamic behaviour of the complex system. The RMSD (Root-mean-square deviation) were calculated for the backbone atoms for all the complexes. The results of RMSD were plotted in Figure 6. It was evident from the figure that all the complexes were showing up drift during the initial stage of MD simulation. It was important to note that all the systems were started to maintain a plateau after 100 ns of simulation. The RMSD value of the 6LU7 system was ended with $0.45 nm, the 6Y2E system ended with $ 0.40 nm, and the 6LXT system were ended with $2 nm.
Further, the fluctuation of Ca residues of the target structure was monitored using RMSF (Root-mean-square fluctuation) analysis, and the results were represented in Figure 7. Notably, all the complexes were showing RMS fluctuation between $ 0.05 and $ 0.15 nm. Further, the intermolecular hydrogen bonding was assessed during the dynamic simulation using GROMACS's g_hbond tool. The results were plotted in Figure 8. Importantly, all the complexes had a maximum of 3 hydrogen bond interactions with the respective target receptor.
The binding energy values obtained from the MM/PBSA analysis were represented in Table 2 and Figure 9. It is evident from the table that the van der Waals interaction energy majorly contributed binding free energies of all three systems. The compound seselin exhibited a total binding free energy of À35.994 þ/-54.785, À50.697 þ/-59.195 and À26.121 þ/-39.985 kJ/mol for 6LU7, 6Y2E and 6LXT, respectively. Moreover, the energy contributed by each amino acid residue with the respective target receptors were plotted in Figure 10. It was observed from the figure that CYS-22, PHE-66, ASN-65 and HIS-64 were the major contributing residues of the 6LU7 system. ASN-1178, ASN-953 and VAL-1176 major contributing residues of the system 6LXT.

Pharmacokinetic analysis
Analysis of pharmacokinetic properties showed that the molecule seselin obeyed the Lipinski rule of five and had efficient ADME properties. The biological molecule used in this present study was found to pass all the five criteria mentioned in Lipinski's rule of five (Table 3). This suggested that the seselin had the great potential to work effectively as a novel drug. This information about the ADME properties of the seselin would be highly advantageous during the early process of drug discovery indeed intended to be the first step. Physicochemical descriptors such as ADME parameters,   pharmacokinetic properties, drug-like nature and medicinal chemistry friendliness that support drug discovery for the molecule used in the study were computed. Properties including intestinal absorption and blood-brain barrier penetration were also identified. It was predicted that the molecule could be transported across the intestinal epithelium, could cross the blood-brain barrier and soluble in an aqueous medium. The physicochemical properties computed by swiss ADME were presented in Table 4. Toxicity of the molecule predicted using the pkCSM online machine-learning platform suggested an optimal pharmacokinetic profile. It seemed to be characterized by better water solubility and metabolic stability. In addition to this, there was no AMES toxicity, and the maximum tolerated level in a human was about 0.033 log/mg/kg/day. It was found that human ethera-go-go-related protein (hERGI) and hERGII were not inhibited by this compound. The molecule was not hepatotoxic and caused no skin sensitization. Swiss Target Prediction was used to validate and estimate the most probable targets of the bioactive molecule, seselin ( Figure 11). The probability score computed for off-targets suggested that the molecule seselin had a high affinity towards the binding site that it was directed to. These details help understand the molecular mechanism underlying bioactivity, rationalizing possible sideeffects, and assessing the possibility of repurposing therapeutically relevant compounds.
MD simulations are beneficial for studying the conformational space of proteins and visualizing dynamic structural changes . The lower fluctuations during the RMSD and RMSF analysis indicated the higher stability of the seselin in the binding pocket of target structures. It is worth noting that throughout the binding process, all three systems yielded minimal fluctuations (± $0.05 nm) except the 6LXT system during the RMSD calculation. Also, the compound seselin displayed minimal RMS fluctuations at the binding site of respective target receptors. For instance, ASP1184, GLU1182, ARG1185 and SER943 showed RMSF  fluctuation of $ 0.075 nm in the 6LXT system. In such a way that SER1003, ALA958 and THR961 in the 6LU7 system were displayed $0.075 nm RMSF fluctuation and THR111, GLN110 and THR292 were present in the 6Y2E system exposed fluctuation of $0.05 nm. The higher binding affinity of the system was also indicated by more H-bonds between the protein and the ligand molecule (Surti et al., 2020). Interestingly, all three systems produce three stable inter-molecular hydrogen bond interactions throughout the simulation period. We believe that this increased number of hydrogen bonds with a consistent interaction mode may result in tightly bound complex systems. Moreover, the various energy terms calculated during the MM/PBSA analysis produce additional support to understand the system's binding affinity. Each system has a large number of amino acid residues that contribute to the binding free energy. Notably, high contributing residues are more than the weakly contributing residues in all the systems. In particular, the 6Y2E system has 9 highly contributing residues which are better than the other two systems investigated in this study. Overall, these MD simulations and MM/ PBSA results produce more incredible support for the effective binding of seselin with the SARS-CoV-2 target receptors.
The discovery and development of drugs are exorbitant and time-investing. Accurate predictions of the in vivo pharmacokinetics of the drug under study earlier in the process are paramount since this can prevent clinical phase drug development failures. The Lipinski rule assesses five factors that determine the possible and potential interactions between a drug and the target. It appraises the tendency of the desired compound to fall under specific essential categories. It states that an orally active drug should comply with a minimum of four of the five laid down criteria that include (a) molecular mass <500 Dalton (b) high lipophilicity (expressed as LogP < 5) (c) less than 5 hydrogen bond donors (d) less than 10 hydrogen bond acceptors (e) molar refractivity between 40-130 (Lipinski, 2004). It assures the recognition of drug absorption, tissue distribution, the fate of metabolism, its excretion and toxicity (ADMET), which can optimize the selection of suitable drug candidates for development (Boobis et al., 2002). Our results on this concern showed that the seselin has potentially cleared all the criteria put forth.
Drug safety assessment is critical which should be evaluated in preclinical and clinical trial phases. Prediction of toxicity of the compound rationalizes possible side-effects and assesses the possibility of repurposing therapeutically relevant compounds (Yang et al., 2018). Cytochrome p450 (CYP) enzymes are essential oxidases that help in the metabolism of drugs. Induction or inhibition of the enzyme, CYP3A4 mainly found in the liver and intestine, oxidizes small foreign organic molecules such as toxins or drugs that may influence the drug's pharmacokinetics altering their efficacy or toxicity. Another important target of many drugs is hERG (human ether-a-go-go-related protein responsible for K v 11.1, the alpha subunit of a potassium ion channel), the blockage of which can lead to sudden death (Feng et al., 2018).
An optimum pharmacokinetic profile is desired because it can prevent unforeseen toxic effects on humans (Bugrim et al., 2004). In-depth knowledge of the toxicological profile of the drug molecule is crucial as the toxicity is the reason for the failure of the drug in many clinical trials (Cronin, 2001). One of the most decisive steps in rationalizing a biomolecule is predicting or mapping its targets. This can, in turn, provide molecular insights into the mode of action of the drug candidate, possible side effects or cross-reactivity. For many proteins, including kinases and phosphatases, hundreds of ligand molecules are identified. Understanding the bioactive molecule's potential targets can also help understand how a molecule can be chemically modified to improve its bioactivity towards a particular target protein.
The probability score calculated can thus provide information on how specific the ligand is to the target that it is directed to (Gfeller et al., 2014). Water solubility (LogS), Caco-2 permeability and topological polar surface area (TPSA) all fell within   Figure 11. Probable targets of the compound predicted by Swiss Target Prediction server.
the reference range. Seselin as an anti-SARS-COV-2 molecule achieved with good binding affinity and ADMET. The outcome of pharmacokinetic analysis suggests that the compound had favourable drugability properties that further help eliminate expensive reformulation later. Along with the antiviral nature of 'seselin' in our earlier report, it has also been proven to be an inhibitor of indole acetic oxidase and peroxidase enzyme (Goren &Tomer, 1971), ovicidal (Tanaka et al., 1985), tumour-suppressive and anti-HIV (Huang et al., 1994), cytotoxic (Gunatilaka et al., 1994), antinociceptive and vasodilatory (Lima et al., 2006), anti-fungal (Ortega et al., 2007), antifeedant and larvicidal (Mukandiwa et al., 2013(Mukandiwa et al., , 2015 and DNA binding specific (Parveen et al., 2016), that by showcasing itself in a broad manner of acquiring bioactive potentials. Thus, the findings of this study suggest that the in silico multiple target inhibition of seselin on SARS-CoV-2 virus proteins predicted with higher binding affinity, non-toxicity, solubility and stability, is proficient and competent for pursuing the experiments to prove at in vitro and in vivo studies to hold its candidacy in therapeutics and drug discovery for COVID-19.