Shortlisting Phytochemicals Exhibiting Inhibitory Activity against Major Proteins of SARS-CoV-2 through Virtual Screening

Severe Acute Respiratory Syndrome Coronavirus 2 (SARS-CoV-2) infection, declared a pandemic by the WHO, has affected more than 40 lakh (4 million) people and caused the death of more than 2 lakh (200,000) individuals across the globe. Limited availability of genomic information on SARS-CoV-2 and the non-availability of vaccines and effective drugs are major problems responsible for the ineffective control and management of this pandemic. Several attempts have been made to repurpose existing drugs known for their antiviral activities and to test traditional herbal medicines known for their health-benefiting and immune-boosting activity against SARS-CoV-2. In this study, efforts were made to examine the potential of 721 phytochemicals from 37 plant species to inhibit major protein targets of SARS-CoV-2, namely spike glycoprotein, main protease (Mpro), NSP3, NSP9, NSP15, NSP10-NSP16 and RNA-dependent RNA polymerase, through a virtual screening approach. Results of our experiments revealed that SARS-CoV-2 Mpro showed significant dissimilarities from SARS-CoV Mpro and MERS-CoV Mpro, indicating the need for discovering novel drugs. This study has identified the phytochemical cyanin (Zingiber officinale) as exhibiting broad-spectrum inhibitory activity against the main proteases of all three coronaviruses. Amentoflavone, agathisflavone, catechin-7-o-gallate and chlorogenin were shown to exhibit multi-target inhibitory activity. This study has identified Mangifera indica, Anacardium occidentale, Vitex negundo, Solanum nigrum, Pedalium murex, Terminalia chebula, Azadirachta indica, Cissus quadrangularis, Clerodendrum serratum and Ocimum basilicum as potential sources of phytochemicals for combating nCOVID-19. More interestingly, this study has generated evidence for the antiviral properties of the traditional herbal formulation “Kabasura kudineer” recommended by AYUSH, a unit of the Government of India. Testing of shortlisted phytochemicals


Introduction
Novel coronavirus disease (nCOVID-19), caused by the SARS-CoV-2 virus, has become a global threat, and the WHO has declared it a pandemic [1]. SARS-CoV-2 is the third life-threatening virus in the SARS family of viruses, after SARS-CoV, which occurred during 2002-03, and MERS-CoV, which occurred during 2012 [2][3][4][5]. As of April 29, 2020, a total of 216,563 deaths have been reported globally due to nCOVID-19. It is named a novel coronavirus because it shows significant dissimilarity to other members of the SARS family of viruses, viz., SARS-CoV (30%) and MERS-CoV (60%) [6]. This unique genetic makeup has made it unresponsive to available vaccines and drugs, which necessitates the search for novel targets for vaccine development and for drugs for the effective prevention and treatment of nCOVID-19.
The explosive increase in nCOVID-19 cases has brought the globe to a halt. The scientific community is trying to unravel the genome complexity of nCOVID-19 in order to identify novel targets for the development of vaccines, to screen available antiviral drugs for effective management and to shortlist effective botanicals for therapeutic interventions. This has resulted in the accumulation of enormous genomic information on nCOVID-19 in the public domain (https://www.ncbi.nlm.nih.gov/genbank/sarscov-2-seqs/). Genomic analysis of nCOVID-19 revealed that its genome is approximately 30 kb in size (NCBI Accession # NC_045512), and further investigations identified three key genes, viz., 1) coronavirus main protease (3CLpro)/papain-like protease (PLpro); 2) RNA-dependent RNA polymerase (RdRp); and 3) spike glycoprotein (S protein), as potential targets for drug design [7][8][9].
Screening of existing antiviral drugs including interferon α (IFN-α), lopinavir/ritonavir, chloroquine phosphate, ribavirin, chloroquine, hydroxychloroquine and arbidol is in progress, and many of these experiments require pre-clinical and clinical validation [7]. The non-availability of vaccines and the ineffectiveness of existing antiviral drugs have led doctors to resort to traditional medicines in nCOVID-19 treatment [8, 10]. Several attempts have been made to identify herbal products with the potential to inhibit the main protease (Mpro)/chymotrypsin-like protease (3CLpro) using molecular modelling and docking studies [11][12][13]. SR et al. [7] screened 27 different ligands present in herbs commonly used in Indian cuisine against the SARS-CoV-2 main protease and identified 15 ligands effective in binding the viral protease. A systematic review [10] of herbal drugs used in the effective treatment of SARS-CoV and MERS-CoV emphasized the urgent need for evolving procedures involving complementary and alternative treatments in managing nCOVID-19. Studies conducted so far have used a limited number of ligands, which may hinder the discovery of effective viral inhibitors in the herbal gene pool. In this context, shortlisting potential herbal drugs effective against nCOVID-19 through in silico docking of globally available ligands, and validating them through laboratory and clinical trials, is one of the viable approaches to managing this pandemic. India is one of the richest biodiversity centers in the world and is known for its vast repository of medicinal plants.
Considering India's rich biodiversity of medicinal plants and the regular use of such plants in the Indian health care system, the present study was undertaken to screen 721 ligands, including small molecules and phytochemicals from 37 different Indian medicinal plants, against 7 different protein targets of nCOVID-19 through molecular docking. Protein-ligand interactions were analyzed carefully to shortlist potential small molecules and phytochemicals for drug development.

Materials And Methods
Phylogenetic analysis of main protease of nCOVID-19
The protein sequence of the SARS-CoV-2 main protease was used for a PSI-BLAST (NCBI) [14] search to identify its homologs and to understand its evolutionary relationship with the main proteases of other viruses. Multiple sequence alignment and phylogenetic analysis of the SARS-CoV-2 main protease with other viral proteins were done using the MAFFT server [15].
Virtual screening of herbal ligands against potential targets of nCOVID-19

Protein targets
The coronavirus genome has been reported to encode 29 proteins, of which the main protease is considered an important drug target. ORF1ab of the coronavirus genome contains 15 polypeptide chains corresponding to the non-structural proteins (NS proteins). The remaining part of the genome encodes envelope and coat proteins. The availability of X-ray crystal structures for most of the proteins in the SARS-CoV-2 genome facilitates virtual screening to search for potential inhibitors. We performed molecular docking of 721 ligands against seven different target proteins of nCOVID-19 (Table 1). In addition, virtual screening was performed against Mpro of SARS-CoV and MERS-CoV, with a view to identifying both inhibitors active against the main proteases of all three viruses and inhibitors specific to nCOVID-19.

Ligand Library Preparation
Chemical structures of all the small molecules were retrieved from Duke's database [16], PubChem [17] and DrugBank [18]. From DrugBank, chemical structures of drugs approved for the treatment of respiratory diseases and of compounds exhibiting antiviral activity were collected. Structures of phytochemicals belonging to 37 different herbs and spices used in South Indian traditional medicine were also used for virtual screening (Table 2). Known active ingredients of 8 herbal plants included in the Tamil traditional medicine "Kabasura Kudineer" (meaning water capable of boosting immunity) were also included in the screening. Overall, a total of 721 small molecules/ligands (Table S1) were used for virtual screening against 7 different protein targets.

Virtual Screening
Virtual screening was performed using the Python Prescription virtual screening tool (PyRx 0.8), which contains the AutoDock Vina module [19]. Protein structures were prepared in Swiss-PdbViewer by adding hydrogen atoms and performing energy minimization. Each prepared protein structure was fed into PyRx along with the structures of the 721 ligands. Both the ligands and the protein molecules were converted to pdbqt files using the AutoDock module of PyRx. Binding sites were predicted using the CASTp server [20] and were used to set the grid (XYZ dimensions: 25 × 25 × 25) in AutoDock Vina for the virtual screening experiment, with an exhaustiveness value of 8. Furthermore, phylogenetic analysis of SARS-CoV-2 Mpro was carried out using PSI-BLAST (NCBI) [14] and the MAFFT server [15]. The top 10 ligand hits against each of the 7 protein targets were taken for further analysis. 2D and 3D interactions between protein and ligand were analysed using the Schrödinger Maestro visualizer [21]. Properties of the top 10 ligands against individual protein targets are given in Table S2.
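The docking settings described above can be illustrated with a minimal Python sketch. It assumes a command-line AutoDock Vina workflow driven by a plain-text configuration file, whereas the screening in this study was run through PyRx; the file names, the grid centre and the example scores are hypothetical placeholders, and only the grid size (25 × 25 × 25), the exhaustiveness (8) and the top-10 cut-off come from the text.

```python
def vina_config(receptor, ligand, center, size=(25, 25, 25), exhaustiveness=8):
    """Build the text of an AutoDock Vina configuration file."""
    lines = [f"receptor = {receptor}", f"ligand = {ligand}"]
    # Grid box centre; in practice this would come from the predicted binding site.
    for axis, value in zip("xyz", center):
        lines.append(f"center_{axis} = {value}")
    # Grid box dimensions along X, Y and Z.
    for axis, value in zip("xyz", size):
        lines.append(f"size_{axis} = {value}")
    lines.append(f"exhaustiveness = {exhaustiveness}")
    return "\n".join(lines) + "\n"


def top_hits(affinities, n=10):
    """Return the n ligands with the most negative binding affinity (kcal/mol)."""
    return sorted(affinities.items(), key=lambda item: item[1])[:n]


# Hypothetical file names and grid centre, for illustration only.
config_text = vina_config("target.pdbqt", "ligand.pdbqt", center=(0.0, 0.0, 0.0))

# Placeholder scores, not values from this study.
example_scores = {"ligand_a": -7.1, "ligand_b": -9.0, "ligand_c": -8.2}
best_two = top_hits(example_scores, n=2)
```

Ranking by the most negative score reflects the convention that a lower binding energy indicates stronger predicted binding.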

Results And Discussion
Phylogenetic analysis of coronavirus main proteases
Main protease (Mpro, also called 3CLpro) is considered one of the important molecular targets for designing novel drugs against coronaviruses [22]. With a view to designing drugs/inhibitors specifically targeting the main protease of nCOVID-19, in silico analysis was performed using main protease sequences of SARS-CoV-2, SARS-CoV and MERS-CoV. Multiple sequence alignment identified 12 significant differences between the main proteases of SARS-CoV and SARS-CoV-2 (Fig. 1). Of the 12 differences, S45 to A45 was found to reside within the binding site of the SARS-CoV-2 main protease. This may play a crucial role in determining the differential binding affinity of the two proteases.
Phylogenetic analysis performed using main protease sequences sharing >50% similarity with the SARS-CoV-2 homolog revealed its significant genetic relatedness to the main proteases of SARS-CoV (96.08% similarity) and bat coronavirus (76.84% similarity) (Fig. 2). Next to this, it shared significant similarity with ORF1ab of Rousettus bat coronavirus. The main protease of nCOVID-19 shared only 50.65% similarity with the main protease of MERS-CoV. These results clearly indicate the need for a highly specific novel drug inhibiting the main protease of SARS-CoV-2.
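To make the comparison of aligned protease sequences concrete, the following Python sketch computes percent identity and lists substitutions from two rows of a pairwise alignment. It is an illustration under stated assumptions: the text does not specify how the similarity percentages were calculated, percent identity over ungapped columns is only one common measure, and the short sequences used here are invented toy strings rather than real protease sequences.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent of identical residues over columns where neither row has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a == gap or res_b == gap:
            continue  # skip gapped columns
        compared += 1
        if res_a == res_b:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0


def substitutions(aligned_a, aligned_b, gap="-"):
    """List substitutions as (position in sequence A, residue in A, residue in B)."""
    found = []
    position = 0  # 1-based residue index in the ungapped sequence A
    for res_a, res_b in zip(aligned_a, aligned_b):
        if res_a != gap:
            position += 1
        if res_a != gap and res_b != gap and res_a != res_b:
            found.append((position, res_a, res_b))
    return found


# Toy alignment rows (hypothetical, not real protease sequences).
row_a = "ACDE-F"
row_b = "ACDQGF"
identity = percent_identity(row_a, row_b)
changes = substitutions(row_a, row_b)
```

A substitution list of this kind is the form in which a difference such as S45 to A45 would be reported.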
Virtual screening of potential herbal ligands against major protein targets of SARS-CoV-2
Virtual screening of 721 ligands, comprising small molecules and active compounds from 37 medicinal herbs, against 7 major protein targets of nCOVID-19 identified potential inhibitors. Information on the binding site residues predicted using the CASTp server is provided in Table 3. The top 10 hits with the highest binding affinity for each target protein were considered for downstream analysis (Table 4). The seven molecular targets of SARS-CoV-2 are the main protease, RNA-dependent RNA polymerase (RdRp), NSP3, NSP9, NSP10-NSP16, NSP15 and the spike protein.

Small molecules/herbal compounds exhibiting significant inhibitory activity against Main Protease
Despite significant structural similarity (RMSD: 0.71 Å) and binding site volume similarity between SARS-CoV and SARS-CoV-2, they showed differential binding affinity for different inhibitors (Table 1). Virtual screening of small molecules against Mpro identified agathisflavone as the best inhibitor, with a binding affinity of -8.2 kcal/mol. Of the 721 ligands screened against the SARS-CoV main protease, rutin, which is abundant in Terminalia chebula, Azadirachta indica and Ocimum basilicum, exhibited the highest binding affinity of -9.0 kcal/mol. In the case of the MERS-CoV main protease, amentoflavone, predominantly found in Mangifera indica and Garcinia species, showed the maximum binding affinity of -8.6 kcal/mol. Interestingly, the cytotoxic biflavonoid agathisflavone, found in cashew nut (Anacardium occidentale), exhibited a significant binding affinity of -8.0 kcal/mol against the main protease of SARS-CoV-2. Agathisflavones have been reported for their cytotoxicity against malignant cell lines [23]. Agathisflavone is a biflavonoid derived from plant sources and has been found to possess several biological activities [24]. Various studies have found that agathisflavone possesses antioxidant, anti-inflammatory, antiviral, antiparasitic, cytotoxic, neuroprotective and hepatoprotective activities. It has also been suggested that agathisflavone could be used in the treatment of oxidative stress, inflammatory diseases, microbial infection, hepatic and neurological diseases and cancer [25]. This compound was found to form 3 hydrogen bonds, at ASP 187, PRO 52 and ARG 40. This was followed by Rubusic acid

Spike Protein
The X-ray crystal structure of the spike glycoprotein (PDB ID: 6M71) was chosen for performing virtual screening. Virtual screening was performed by choosing the ACE-interacting region as the binding site (Fig. 4a).
1,8-Dichloro-9,10-diphenylanthracene-9,10-diol from Carica papaya was found to exhibit significant binding affinity for the spike glycoprotein (-8.2 kcal/mol). The GLY 496 residue was found to be involved in the formation of a hydrogen bond with 1,8-Dichloro-9,10-diphenylanthracene-9,10-diol. Earlier, leaf extracts of Carica papaya were reported to have a significant effect in combating dengue virus infection [26], although their exact role in increasing platelet counts is not clear. 1,8-Dichloro-9,10-diphenylanthracene-9,10-diol was found buried in the binding site of the spike glycoprotein, exhibiting hydrophobic interactions with residues such as LEU 39, TYR 41, TYR 449, TYR 453, TYR 495, PHE 497 and TYR 505 (Fig. 3e). This was followed by other small molecules, viz., agathisflavone, amentoflavone, ivermectin, agnuside (Vitex negundo), taraxerol (Cissus quadrangularis) and nimbinene (Azadirachta indica), which exhibited significantly higher binding affinity towards the spike protein of nCOVID-19.

Non-Structural Proteins (NSPs)
Apart from the four major structural proteins (S, E, M and N proteins), non-structural proteins, namely NSP3 (cleavage of the N-terminal replicase polyprotein), NSP9 (ssRNA binding), NSP10-NSP16 (co-factor in activating the replicating enzyme) and NSP15, which are involved in the transcription and replication of nCOVID-19, can also serve as potential targets for containing the virus using inhibitory herbal molecules [27].
Another small molecule, friedelin, from Vitex negundo and Acorus calamus, was also found to exhibit a significant binding affinity of -9.6 kcal/mol against NSP9 (Table 4). Even though many hydrophobic interactions were observed, no hydrogen bond interaction was found in the binding site of NSP9. Interestingly, five of the top ten inhibitors are from a single plant source, Solanum nigrum. As evidenced by other studies, Solanum nigrum is a traditionally known medicinal plant used in the treatment of seizure, pain, ulcer, inflammation, diarrhea, eye infections, jaundice and oxidative stress [30][31][32].
Molecules exhibiting inhibitory activity against multiple protein targets of nCOVID-19
Phytochemicals exhibiting inhibitory activity against multiple targets of a virus are expected to confer durable protection on patients. This will be more beneficial in situations where the virus develops mutations in one of the targets. Small molecules, namely amentoflavone, agathisflavone, catechin-7-o-gallate and chlorogenin, exhibited significant binding affinity towards multiple targets of nCOVID-19.
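The idea of a multi-target inhibitor can be expressed as a simple tally over the per-target hit lists. The Python sketch below is illustrative only: it assumes that a ligand counts as multi-target when it appears among the top hits for at least two targets, a threshold not stated in the text, and the target and ligand names are placeholders rather than results of this study.

```python
from collections import defaultdict


def multi_target_ligands(top_hits_by_target, min_targets=2):
    """Map each ligand to the sorted targets it hits, keeping those with >= min_targets."""
    targets_for = defaultdict(set)
    for target, ligands in top_hits_by_target.items():
        for ligand in ligands:
            targets_for[ligand].add(target)
    selected = {
        ligand: sorted(targets)
        for ligand, targets in targets_for.items()
        if len(targets) >= min_targets
    }
    # Order by number of targets (descending), then by ligand name.
    return dict(sorted(selected.items(), key=lambda item: (-len(item[1]), item[0])))


# Placeholder hit lists, not data from this study.
example_hits = {
    "target_1": ["ligand_a", "ligand_b"],
    "target_2": ["ligand_a", "ligand_c"],
    "target_3": ["ligand_a", "ligand_c", "ligand_d"],
}
shared = multi_target_ligands(example_hits)
```

Raising `min_targets` narrows the list to ligands with broader coverage across the screened proteins.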
Many of these plants have been used in traditional medicine for several thousand years in different parts of the world. Several studies have reported that amentoflavone possesses anti-inflammatory, anti-oxidative, anti-diabetic, anti-tumor, anti-viral and anti-fungal activities [35]. Evidence has been reported for amentoflavone exhibiting anti-senescence activity in the cardiovascular and central nervous systems [36]. Further, amentoflavone isolated from Torreya nucifera was demonstrated to possess inhibitory activity against SARS-CoV 3CLpro [37].

Effect of FDA approved drugs on SARS-CoV-2 protein targets
Hydroxychloroquine, chloroquine and ivermectin were selected as positive controls, as they were reported to possess antiviral activity [38,39]. Hydroxychloroquine was reported to show promising inhibitory activity against the nCOVID-19 spike protein [40,41]. Our results revealed that hydroxychloroquine and chloroquine showed lower binding affinity than ivermectin against all 7 targets of nCOVID-19 (Table 6). Ivermectin exhibited significant binding affinity values of -9.4 kcal/mol and -8.2 kcal/mol against RNA-dependent RNA polymerase (RdRp) and spike protein, respectively (Fig. 6). Ivermectin also exhibited significant binding affinity against NSP9 (-7.5 kcal/mol) and spike glycoprotein (-8.2 kcal/mol) (Table S4).

Analysis on top reported plants with best ligand hits
Among the 721 phytochemicals originating from 37 plant species, 36 (approx. 5%) phytochemicals from 22 plants (Fig. 7) were found to be the best hits, with higher binding affinities against all seven targets (Table 5). Among those 27 plants, 6 plants were found to be ingredients of a traditional Siddha herbal formulation, namely "Kabasura kudineer", recommended by the AYUSH Board of the Government of India.
Intensive genomics and proteomics research may lead to the identification of novel drugs against this pandemic disease.
