Computational approach for the design of potential spike protein binding natural compounds in SARS- CoV2


 Angiotensin converting enzyme 2 (ACE2) (EC:3.4.17.23) is a transmembrane protein which is considered as receptor for spike protein binding of novel coronavirus (SARS-CoV2). Since no specific medication is available to treat COVID-19, designing of new drug is important and essential. In this regard, in silico method plays an important role as it is rapid, cost effective, compared to the trial and error methods using experimental studies. Natural products are safe and easily available to treat coronavirus effected patients, in the present alarming situation. In this paper five phytochemicals which belong to flavonoid and anthraquinone subclass, selected as small molecules in molecular docking study of spike protein of SARS-CoV2 with its human receptor ACE2 molecule. From the detail analysis of their molecular binding site on spike protein binding site with its receptor, hesperidin, emodin and chrysin are selected as competent natural products from both Indian and Chinese medicinal plants, to treat COVID-19.


I. Introduction
COVID-19 is caused by novel coronavirus named SARS-CoV-2. Virus particles are spherical in shape having spike proteins around them. These proteins are responsible for virus replication in human host cells. Spike proteins latch onto human cells and undergo a structural change, which results in the fusion of viral membrane with human host cell membrane. Thus, the viral genes enter into the host cell and produces more viruses after coping its genome. SARS-CoV-2 spike proteins bind to the receptor proteins, on the human cell surface, known as angiotensin converting enzyme 2 (ACE2). Atomic level structure of SARS-CoV-2 spike proteins have a Receptor Binding Domain (RBD) for binding to host human cells. Receptor Binding Domain (RBD) of spike glycoprotein (RBD-S) can bind to the ACE2 receptor at the Protease Domain (PD) of the host human cell, causing viral infection.
Considering the preliminary data, it has been suggested that ACE2 is a receptor for the novel coronavirus (SARS-CoV-2), that was identi ed as the cause of the respiratory disease outbreak in Wuhan in late 2019 [1], [2]. SARS-CoV-2 is a beta coronavirus, having similarity with SARS-CoV virus, in binding with human ACE2 receptor and spike glycoprotein for viral entry [3]. Tai et al, 2020, suggested that RBD fragment (from amino acid residues 331 to 524 of spike protein) in SARS-CoV-2 strongly binds with to human ACE2 (hACE2) and as well as bat ACE2 (bACE2) receptors. Thus, this spike protein fragment can block the entry of SARS-CoV-2 and SARS-CoV into their respective hACE2-expressing cells, resulting in that it may serve as a viral attachment inhibitor against SARS-CoV-2 and SARS-CoV infection.
Every coronavirus contains four structural proteins, for example spike (S), envelope (E), membrane (M), and nucleocapsid (N) proteins. Among them, S protein is the most important protein which controls the biological processes such as viral attachment, fusion and entry into the host cell. As a result, it can be considered as a target for development of antibodies, entry inhibitors and vaccines, similar to SARS-CoV infection [4] [5]. The S protein facilitates viral entry into human host cells by rst binding to a host receptor (ACE2) through the receptor-binding domain (RBD) and then fusing with the viral and host membranes. But SARS-CoV-2 spike protein is 10-20 times more likely to bind with ACE2 on human cells, compared to that of spike protein from the SARS-CoV infection (occurred in 2002). This may enable SARS-CoV-2 to spread more easily than SARS-CoV infection. Despite very much similarity (76.5%) in sequence [6] and structure between the spike proteins of two viruses, three different antibodies against the 2002 SARS virus (SARS-CoV) cannot be successfully administered against SARS-CoV-2, which is popularly known as COVID-19.
ACE2 is a functional receptor for both SARS-CoV and SARS-CoV-2. For SARS-CoV infection, ACE2 is con rmed as receptor in both in vitro and in vivo studies [6]. Similarly, Zhou et al, 2020, has con rmed that SARS-CoV-2 uses ACE2 as a cellular entry receptor in human host [7]. ACE2 enzyme having catalytic activity in maturation of angiotensin, a peptide hormone. ACE2 is a type I membrane protein, expressed in many extrapulmonary tissues including heart, kidney, endothelium, and intestine. ACE2-expressing epithelial cells have high levels of multiple viral replication related genes, [8], signifying that the ACE2expressing epithelial cells facilitate coronaviral replication in the lung [9]. The presence of ACE2 receptor in other tissues, can explain the cause of kidney damage, heart failure and liver damage in COVID-19 infected patients. Different activities of ACE2 protein and inhibitory role of spike protein, are depicted in Figure 1.
Schematic diagram for structure of transmembrane ACE2 protein is shown in Figure 2. There are three topological domains in ACE2 such as extracellular domain (from 18-740), a transmembrane helical domain (from 741-761) and cytoplasmic domain (from 762-805) [10].
Several potential therapeutic approaches have been investigated for the treatment of SARS-CoV-2 infection such as protein-based vaccine design, blocking of ACE2 receptor and effect of phytochemicals on spike protein binding with its ACE2 receptor. Among the various therapeutic strategies that have been proposed for the treatment of SARS-Co V 2 treatment, drug designing with phytochemicals is a well-known method. Several phytochemicals for example, Ocimum sanctum extract on main protease protein [11], 5,7,3′,4′tetrahydroxy-2'-(3,3-dimethylallyl) iso avone from Psorothamnus arborescens on 3-chymotrypsin-like protease [12] and curcumin, brazilin, and galangin from Curcuma sp., Citrus sp., Alpinia galanga, and Caesalpinia sappan on both SARS-CoV-2 protease and RBD (Receptor Binding Domain) of spike glycoprotein (RBD-S) [13] and Belachinal, Maca avanone E & Vibsanol B on envelop protein [14] are analyzed with the help of molecular docking and molecular dynamics simulation studies. In the last study, hesperidin, one of common avonoids in Citrus sp., has selected as potent inhibitor with the lowest docking score for protein receptors resulting the highest a nity to bind the receptors. C Wu et al, 2020 [15] have used homology modeling technique to model 18 viral proteins and 2 human target proteins. They have screened potential small-molecule compounds from a ZINC Drug Database (2924 compounds) and a small in-house database of traditional Chinese medicine and natural products (including reported common anti-viral components from traditional Chinese medicine) and derivatives (1066 compounds) to identify small molecules to treat SARS-CoV-2 infection. Hesperidin molecule, which is known for its anti-in ammatory, anti-oxidant effect, is obtained from Citrus aurantium. This is the only compound that could bind the interface between Spike and ACE2. So, they have suggested hesperidin may disrupt the interaction of ACE2 with RBD. But during molecular docking analysis, they used PDB le SARS_CoV-2_Spike_RBD_homo_Hesperidin considering RBD-S (PDB ID: 6LXT) and PD-ACE2 (PDB ID: 6VWI).
To study the effect of Indian phytochemicals on spike protein fragment, molecular docking study is used for spike glycoprotein fragment with human ACE2 receptor. Bound structure of spike glycoprotein with human ACE2 receptor is considered here as target molecule for treatment of COVID-19.
Some phytochemicals, which have been reported earlier as spike protein inhibitor for SARS [16], [17] are considered here as small molecules for protein -ligand molecular docking study. These phytochemicals are present in Indian medicinal plants. Name, source, chemical class and structures of phytochemicals e.g. hesperidin, emodin, anthraquinone, rhein and chrysin are enlisted in Table 1 and Figure 3. This information is collected from IMPPAT: Indian Medicinal Plants, Phytochemistry And Therapeutics a curated database [18]. Ii. Methodology 1. Protein molecular modeling of spike protein fragment 3D structure of RBD fragment (from amino acid residues 331 to 524 of spike protein) in SARS-CoV-2 is considered in this paper as responsible fragment for strongly binding with to human ACE2 (hACE2) receptor protein. Before molecular docking analysis, following steps are performed with the primary sequence of spike protein fragment.

1 Retrieval of protein sequence for spike protein fragment
The protein sequence of spike glycoprotein from Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV 2) containing 193 amino acid residues from positions 331 to 524 is retrieved from GenBank database (https://www.ncbi.nlm.nih.gov/protein/QHR63250.2) in FASTA format and considered as spike protein fragment in this study.
1. 2 3D structure homology modeling and validation of modeled structure In modeling 3D structure of the spike protein fragment by using sequence homology approach, rst of all sequence alignment method is used. Thus, the best matching PDB structures of other proteins are identi ed with the help of following steps:
The target sequence is searched with BLAST [19] against the primary amino acid sequence contained in the SMTL. A total of 63 templates are found.
An initial HHblits pro le has been built using the procedure outlined in [20], followed by 1 iteration of HHblits against NR20. The obtained pro le has then been searched against all pro les of the SMTL. A total of 110 templates are found.

2. 2 Template Selection
For each identi ed template, the template's quality has been predicted from features of the target-template alignment. The templates with the highest quality have then been selected for model building.

2. 3 Model Building
Models are built based on the target-template alignment using ProMod3. Coordinates which are conserved between the target and the template are copied from the template to the model. Insertions and deletions are remodeled using a fragment library. Side chains are then rebuilt. Finally, the geometry of the resulting model is regularized by using a force eld. In case loop modelling with ProMod3 fails, an alternative model is built with PROMOD-II [21].

2. 4 Model Quality Estimation
The best model among obtained models by using two types of selection methods are estimated by QMEAN4 scores [22] and Ramachandran plot [23,24]. The global and per-residue model quality has been assessed using the QMEAN scoring function [22] for both models while the Ramachandran plot for two models are obtained using PROCHECK [23] and MolProbity [24]. Evaluation of backbone conformation of protein molecule is assayed by Ramachandran plot dividing the percentage of amino acid residues of the model in the allowed and disallowed regions [23,24].

Molecular docking between spike protein fragment and human ACE2 receptor
Molecular docking studies between spike protein fragment and human ACE2 receptor are performed using ClusPro [25]. In ClusPro 2.2 web server [25], Cluster scores for lowest binding energy prediction are calculated using the formula-E = 0.40E_{rep} + -0.40E_{att} + 600E_{elec} + 1.00E_{DARS}. Here, repulsive, attractive, electrostatic as well as interactions extracted from the decoys as the reference state, are considered for structure-based pairwise potential calculation in docking [26].

Molecular docking study of phytochemicals from Indian medical plants
Docking of bound structure (spike protein fragment and its receptor ACE2) with phytochemicals are carried with SWISSDOCK web server based on EADock DSS [27]. Many binding modes are generated in the vicinity of all target cavities (blind docking). Simultaneously, their CHARMM energies are estimated on a grid with CHARMM force eld [28] on external computers from the Swiss Institute of Bioinformatics.
The binding modes with the most favourable energies are evaluated with FACTS [29] and are therefore clustered. Molecular complexes are ranked by the most favourable binding energies. Among those, we select the one structure representing the best binding mode for each phytochemical, based on an energy average value corresponding to the rst ve ranked structures. The most favourable clusters are visualized by the USCF Chimera software [30].
Overall, 101 templates are found.
QMEAN value is intended as a linear combination of four statistical potential terms and transformed to a Z score relating it to high resolution X-ray structures of similar size. Higher Z score is related to more favorable model. Ramachandran plots are drawn for this model by using two web servers e.g. Molprobity [24] and PDBsum [31], are shown in Figure 4 (b) and 4 (c). For this model the overall average value of Gfactors is -0.18 which is not unusual for dihedral angles and main-chain covalent forces. The value of Gfactors provides a measure of how unusual or out-of -the ordinary, a property is. From MolProbity version 4.4 [33] it is calculated that, modelled structure has 94.44% residues in favored regions, 0.56% residues in outlier region and 3.18% in rotamer outlier region. Ramachandran plot statistics from PDBsum [34] for

Molecular docking between spike protein fragment and human ACE2 receptor
Human ACE2 receptor (PDB ID 1R42) [35] is considered as receptor protein for molecular docking study of spike protein fragment with its receptor in human host.
By using ClusPro [25] web server, docking structure of A chain of human ACE2 receptor, binds with spike protein fragment, is obtained. SARS CoV2 spike protein binds with human ACE2 receptor protein with binding energy -779.8 Kcal/mole. A conformational change occurs in ACE2 receptor protein after binding with spike protein fragment ( Figure 5).
Amino acids present in distorted site of ACE2 are ASP136, ASN 137, PRO 138, GLN139 and interacting amino acids of spike protein fragment are GLN 403, LYS 451 and ASP 416 ( Figure 6).

Spike protein binding with ACE2 in presence of emodin
The phytochemical emodin, obtained from Rheum emodi or Himalayan rhubarb [36], binds with spike protein fragment and its receptor human ACE2 protein [37], at the same cleft (Figure 8), same to that of hesperidin. But binding energy is less for emodin binding (-6.19 Kcal/mole) compared to that of that of hesperidin (-8.99 Kcal/mole).

Spike protein binding with ACE2 in presence of anthraquinone
Though anthraquinone can bind with bound structure of spike protein fragment and its receptor ACE2 molecule, with releasing binding energy -6.15 Kcal/mole, but the binding site of this phytochemical is totally different from that of hesperidin and emodin ( Figure 9).

Rhein binding with bound spike protein and ACE2 receptor protein
The phytochemical rhein binds with docked structure of spike fragmented protein and human ACE2 receptor with Δ G value -8.73 Kcal/mole. But the binding site of this chemical totally different from earlier substances ( Figure 10). Rhein can bound with only spike protein fragment. It has no interaction with human ACE2 receptor protein molecule.

Chrysin binding with bound spike protein and ACE2 receptor protein
Chrysin binds with the spike protein fragment and its ACE2 receptor with binding energy -6.87 Kcal/mole ( Figure 11). This phytochemical binding site is almost similar with that of spike protein fragment molecule and its receptor. A conformational change occurs in ACE2 receptor molecule after spike protein fragment binding. Chrysin binding cleft is nearly located to that site as shown in Figure 12.
Energy parameters of bound structure of phytochemicals with spike protein fragment and ACE2 receptor are shown in Table 3. Considering the lowest binding energy, the phytochemical hesperidin is considered as most suitable ligand for target molecule, which is formed by binding with spike protein fragment and its human host ACE2 receptor.
In six docking structures interacting amino acids of ACE2 receptor and spike protein fragment are summarized in Table 4. Among three interacting amino acids of spoke protein fragments GLN 403 and ASP 416 are well conserved among all sequences. But LYS 451 is conserved among SARS-CoV2 spike proteins and differed with ARG in SARS-CoV spike proteins. Though arginine is a positively charged, polar amino acid, it can be substituted with the other positively charged amino acid lysine. But a change from arginine to lysine is not always neutral. Arginine contains a complex guanidium group on its positively charged sidechain and shows a geometry and charge distribution for ideal binding with negatively charged amino acid residues. It can also form multiple hydrogen bonds. But lysine also can interact with negatively charged amino acid residues, but it is more limited in the number of hydrogen bonds it can form [38]. Hesperidin is a major avonoid compound, present in orange and lemon fruits. Orange juice contains 470-761 mg/l of hesperidin [38]. These phytochemical exhibits various medicinal uses. According to oral toxicity study of hesperidin, it can be concluded that this phytochemical can be safely used in herbal formulations with its LD 50 value is more than 2000mg/kg [39]. This avanone glycoside, has a long medicinal history in both Indian and Chinese herbal medications [40]. This phytochemical alone or in combination with chemicals, often be used in various diseases.
Emodin is a polyphenol found in the roots, leaves and bark of several plants including aloe vera, cascara, rhubarb, senna etc. In traditional medicine, emodin has been used for cardiovascular diseases, osteoporosis. It has been suggested earlier that emodin can inhibit In enza Avirus replication and in uenza viral pneumonia [41] via several cell signaling pathways.
Chrysin a natural avonoid, is commonly found in propolis and honey and traditionally used in herbal medicine. As reported earlier, chrysin can act as inhibitor during enterovirus 71 (EV71) growth and replication [42]. Similarly, Song et al, 2015 have described antiviral activity of chrysin against coxsackievirus B3 (CVB3) [43].
Considering the results obtained from molecular docking studies, phytochemicals hesperidin, emodin and chrysin can be recommended for the treatment of COVID-19, after in -silico mutagenesis study and experimental veri cation.

Declarations
Authors' contribution statement   Distorted amino acids after spike protein binding in ACE2 receptor  Spike protein binding with ACE2 in presence of anthraquinone Figure 10 Rhein binding with bound spike protein and ACE2 receptor protein Chrysin binding with bound spike protein and ACE2 receptor protein Figure 12 Chrysin binding cleft