Exploring structurally diverse plant secondary metabolites as a potential source of drug targeting different molecular mechanisms of Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) pathogenesis: An in silico approach

Plants are endowed with a large pool of structurally diverse small molecules known as secondary metabolites. The present study aims to virtually screen these plant secondary metabolites (PSM) for possible anti-SARS-CoV-2 properties by targeting four proteins/enzymes that determine viral pathogenesis. Molecular docking and data analysis revealed a distinct pattern of structurally similar PSM interacting with each target protein. Among the top-ranked PSM with lower binding energy, >50% were triterpenoids against the viral spike protein, >32% were flavonoids and their glycosides against human transmembrane serine protease, >16% were flavonol glycosides and >16% were anthocyanidins against the viral main protease, and >13% were flavonol glycosides against the viral RNA-dependent RNA polymerase. The primary concern about these PSM is their bioavailability. However, several PSM recorded a higher bioavailability score and fulfilled the drug-likeness criteria of Lipinski's rule. The natural occurrence, biotransformation and bioavailability of selected PSM, and their interaction with the target sites of the selected proteins, are discussed in detail. Further, we hypothesize the use of selected PSM to cure COVID-19 by inhibiting viral host cell recognition and replication in the host cell. However, these PSM need thorough in vitro and in vivo evaluation before being taken to clinical trials.


Introduction
Coronaviruses (CoV) are spherical or pleomorphic, enveloped particles containing a non-segmented, positive-sense single-stranded RNA genome (Zhu et al., 2020). Several low-pathogenic CoV cause mild respiratory symptoms. In general, CoV are classified into four genera: α, β, γ and δ. α- and β-CoV are reported to cause fatal respiratory tract infections in mammals, and Severe Acute Respiratory Syndrome-CoV (SARS-CoV) is grouped under β-CoV, whereas γ- and δ-CoV infect birds (Yin and Wunderink, 2018). Based on genomic sequence evidence, bat CoV RaTG13 shares 96.2% similarity with SARS-CoV-2. Hence, bats are considered the primary source of SARS-CoV-2, which reached humans through several intermediate hosts (Zhou et al., 2020).
Coronavirus disease 19 was first spotted in a seafood market of Wuhan city, Hubei [...] and Gang, 2000). The variability of metabolites can be seen across different plant families, genera, species and different parts of the same plant species (Holeski et al., 2012). The concentration of PSM varies with growth stage and in response to the biotic and abiotic stresses to which the plant is exposed. The development of drugs from phytopharmaceuticals is a trending approach to finding eco-friendly therapeutic molecules with no or minimal side effects. The current time-bound situation requires an efficient and effective method for developing therapeutics that disable the viral molecular machinery. Given the need to ensure user safety, conventional drug discovery is a time-consuming process that sometimes takes decades to complete. Thus, repurposing already available FDA-approved drugs and using plant-based herbal medicines or edible plant parts rich in antiviral PSM are alternative strategies that appear promising under the current situation.
In silico or computational approaches are algorithm-based virtual screening methods developed to screen a large number of molecules in a short time and to identify probable potent drug candidates. In a recent study, 1903 approved drugs were virtually screened through molecular docking, and binding free energy calculations suggested nelfinavir as a potential inhibitor of SARS-CoV-2 (Xu et al., 2020). Similarly, phytomolecules such as kaempferol, quercetin, luteolin-7-glucoside, demethoxycurcumin, naringenin, apigenin-7-glucoside, oleuropein, curcumin, catechin and epicatechin-gallate have been reported as potential inhibitors of viral Mpro (Khaerunnisa et al., 2020). Elfiky (2019) reported Ribavirin, Remdesivir, Sofosbuvir, Galidesivir and Tenofovir as potent drug candidates against the RdRp of SARS-CoV-2 through molecular docking studies.
Most earlier PSM-based virtual screening studies were limited by the number of PSM, the structural diversity of the test ligands or the number of target proteins. Hence, the present study aimed at (i) creating a structurally diverse PSM library, (ii) finding potent PSM that bind to the target site of the selected proteins/enzymes with low binding energy (BE), (iii) analyzing the structural and functional relationship of the top-scored PSM, and (iv) analyzing the bioavailability of selected PSM using SwissADME.

Materials And Methods

Preparation of plant secondary metabolites library
To prepare the PSM library, an extensive literature survey was conducted on selected plants, and the general and species-specific PSM, including alkaloids, phenolics and terpenoids, were listed. The 3D and 2D structures (SDF files) and canonical SMILES of the selected PSM were retrieved from online databases such as PubChem (https://pubchem.ncbi.nlm.nih.gov) and ChemSpider (http://www.chemspider.com). The 2D structures were converted into 3D coordinates, and geometries were optimized using the free offline tool Marvin Sketch (http://www.chemaxon.com/products/marvin/marvinsketch). As several PSM are present in multiple plant species, a 6% duplication was allowed in the main PSM library. Additionally, PSM isomers were each considered as a separate ligand. All the files were coded and used for further studies.
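The duplication allowance mentioned above can be checked with a few lines of code. The following is a minimal sketch, assuming the library is available as a list of canonical SMILES strings generated by a single tool (so that identical molecules produce identical strings); the function name and the toy library are hypothetical and not part of the original workflow.

```python
from collections import Counter


def duplication_fraction(smiles_list):
    """Return the fraction of entries that repeat an earlier SMILES string.

    Assumption: all strings were canonicalised by the same tool, so plain
    string equality is a usable proxy for molecular identity. Isomers with
    distinct isomeric SMILES stay separate, as in the library described.
    """
    if not smiles_list:
        return 0.0
    counts = Counter(s.strip() for s in smiles_list)
    # Every occurrence beyond the first counts as one duplicate entry.
    duplicates = sum(n - 1 for n in counts.values())
    return duplicates / len(smiles_list)


# Hypothetical four-entry library with one repeated molecule.
toy_library = ["CCO", "c1ccccc1", "CCO", "CC(=O)O"]
print(f"{duplication_fraction(toy_library):.0%} of entries are duplicates")
```

A value at or below 0.06 would correspond to the 6% duplication allowed in the main library.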

Target Proteins
In the present study, we selected four target proteins, one from human (human transmembrane serine protease 2, TMPRSS2) and three from SARS-CoV-2 (spike protein, Mpro and RdRp). The crystal structure of the SARS-CoV-2 spike receptor-binding domain bound with ACE2 (PDB ID: 6M0J; 2.45 Å) (Lan et al., 2020) was retrieved from the Research Collaboratory for Structural Bioinformatics (RCSB) Protein Data Bank (PDB) (www.rcsb.org); ACE2 was removed, and the structure was processed and used for docking studies. The 14 amino acid residues (AAR) of the spike protein that are key for binding hACE2 (THR415, ASN439, TYR449, TYR453, LEU455, PHE486, ASN487, TYR489, GLN493, GLN498, THR500, ASN501, GLY502 and TYR505) (Walls et al., 2020) were considered as the active site for the molecular docking process. The TMPRSS2 sequence (NP_001128571.1) was retrieved from the National Center for Biotechnology Information (NCBI) protein database. The structure of TMPRSS2 was generated using the SWISS-MODEL online server (Biasini et al., 2014). The structures were marked, superimposed and visualized using Chimera (Pettersen et al., 2004). Model 1 from our work and entry O15393 in UniProt were found to be 100% similar. Three amino acid residues of the catalytic site (HIS296, ASP345 and SER441) (Meng et al., 2020) were considered as the key residues of TMPRSS2 in the molecular docking process. The crystal structure of SARS-CoV-2 Mpro (PDB ID: 6LU7; 2.16 Å) (Jin et al., 2020) was retrieved and used for docking studies after processing. Mpro has three domains, and the active site is located between domains I and II. Here, CYS145-HIS41/SER144-HIS163 can act as nucleophilic agents, and GLY143 and GLU166 can form hydrogen bonds with the "CO-NH-Cα-CO-NH-Cα" structure of the backbone of the substrate protein. These 6 residues were considered as the key residues of Mpro in the molecular docking process.
The SARS-CoV-2 RdRp protein sequence (YP_009725307.1) was retrieved from the NCBI protein database. The structure of RdRp was generated using the SWISS-MODEL online server (Biasini et al., 2014). The structures were marked, superimposed and visualized using Chimera (Pettersen et al., 2004). The amino acid residue VAL557 in motif F (Gao et al., 2020) was considered as the key residue of RdRp in the molecular docking process.
Removal of water molecules, metal ions and cofactors, and addition of charges and hydrogen atoms, were done using the UCSF Chimera tool (Pettersen et al., 2004). Energy minimization, reconstruction of missing atoms and stereochemical quality checks, performed to arrive at the best possible three-dimensional structures, were done using Discovery Studio software (Dassault Systèmes BIOVIA, Discovery Studio, Version 3, San Diego: Dassault Systèmes, 2019).
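The key residues listed above for the four targets lend themselves to a simple lookup that reports which of a ligand's contact residues fall inside the defined target site. The sketch below is only illustrative: the dictionary keys and the function name are hypothetical, and the contact list must come from a separate interaction analysis (for example, the H bonds reported by a visualisation tool).

```python
# Key residues per target, as listed in the Target Proteins section.
KEY_RESIDUES = {
    "spike": {
        "THR415", "ASN439", "TYR449", "TYR453", "LEU455", "PHE486", "ASN487",
        "TYR489", "GLN493", "GLN498", "THR500", "ASN501", "GLY502", "TYR505",
    },
    "TMPRSS2": {"HIS296", "ASP345", "SER441"},
    "Mpro": {"HIS41", "GLY143", "SER144", "CYS145", "HIS163", "GLU166"},
    "RdRp": {"VAL557"},
}


def key_contacts(target, contact_residues):
    """Return the sorted subset of contact residues that are key residues."""
    contacts = {residue.upper() for residue in contact_residues}
    return sorted(KEY_RESIDUES[target] & contacts)


# H-bond partners reported for Coagulin K in the spike protein results.
coagulin_k_hbonds = ["ARG403", "GLY496", "GLN498", "TYR505"]
print(key_contacts("spike", coagulin_k_hbonds))
```

Residues such as ARG403 and GLY496 lie outside the 14-residue definition, so a check of this kind distinguishes contacts with the defined site from contacts in its vicinity.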

Molecular docking
The ligands were energy minimized using the conjugate gradient optimization algorithm for a total of 200 steps with default universal force field (UFF) parameters (Rappe et al., 1992). The capability of the ligands to interact with the target site of each selected protein was studied following a computational ligand-target docking approach. Molecular docking was carried out using the AutoDock Vina option of PyRx, which is based on scoring functions (Trott et al., 2010; Dallakyan et al., 2015). The conformation with the lowest binding energy (BE, kcal/mol) was considered the most favourable docking pose. The interactions between ligand and protein were analyzed using PyMOL (De Lano, 2020) and Discovery Studio 3.5 (Accelrys Software Inc., San Diego, CA, US).
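Selecting the most favourable pose amounts to taking the minimum score over all poses of a ligand. The sketch below assumes the docking output is an AutoDock Vina PDBQT file in which each pose carries a `REMARK VINA RESULT:` line whose first number is the score in kcal/mol; how PyRx stores or exports these files is not described in this paper, so the parsing step and the sample text are assumptions for illustration.

```python
def best_vina_score(pdbqt_text):
    """Return the lowest (most favourable) score found in Vina output text.

    Returns None when the text contains no result lines.
    """
    scores = []
    for line in pdbqt_text.splitlines():
        if line.startswith("REMARK VINA RESULT:"):
            # Fields: REMARK, VINA, RESULT:, score, RMSD l.b., RMSD u.b.
            scores.append(float(line.split()[3]))
    return min(scores) if scores else None


# Hypothetical excerpt with two poses; coordinates are omitted.
sample_output = """MODEL 1
REMARK VINA RESULT:    -9.1      0.000      0.000
ENDMDL
MODEL 2
REMARK VINA RESULT:    -8.7      1.532      2.104
ENDMDL
"""
print(best_vina_score(sample_output))
```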

Structural activity relationship Analysis
All ligands were ranked according to their BE and arranged in ascending order. For ease of study, the top 250 molecules (268 for Mpro) were subjected to ligand structure similarity analysis (based on canonical SMILES) and BE analysis using DataWarrior software (version 5.2.1).
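The ranking step and the per-class percentages reported later can be sketched as follows. This is a simplified stand-in, not the DataWarrior similarity analysis itself: it assumes each ligand already carries a chemical class label, and the function names and class labels are hypothetical. The BE values in the example are those reported for the spike protein.

```python
from collections import Counter


def top_ranked(results, n):
    """Sort (name, chem_class, be) tuples by ascending BE and keep the top n."""
    return sorted(results, key=lambda row: row[2])[:n]


def class_share(ranked):
    """Return the percentage of ranked ligands belonging to each class."""
    counts = Counter(chem_class for _, chem_class, _ in ranked)
    return {c: 100.0 * k / len(ranked) for c, k in counts.items()}


# BE values (kcal/mol) from the spike protein results; labels are illustrative.
spike_results = [
    ("Graecunin E", "steroidal saponin", -8.6),
    ("Coagulin K", "steroidal lactone", -8.9),
    ("Tannic acid", "hydrolysable tannin", -8.9),
    ("Coagulin N", "steroidal lactone", -9.1),
    ("Yuccagenin", "steroidal sapogenin", -7.6),
]
top = top_ranked(spike_results, 4)
print([name for name, _, _ in top])
print(class_share(top))
```

Because Python's sort is stable, ligands with equal BE keep their input order; a real analysis would need an explicit tie-breaking rule.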

Physicochemical properties and Bioavailability of PSM
The drug-likeness and the physicochemical properties were studied using SwissADME (www.swissadme.ch). The canonical SMILES of the selected PSM were subjected to SwissADME analysis. The PSM were analyzed for their drug-likeness following Lipinski's rule of five (Lipinski et al., 2001). The bioavailability radar charts obtained were analyzed for drug-like properties, i.e., lipophilicity (XLOGP3 between −0.7 and +5.0), molecular weight (between 150 and 500 g/mol), topological polar surface area (between 20 and 130 Å²), solubility (log S not higher than 6), flexibility (no more than 9 rotatable bonds) and saturation (fraction of carbons in sp3 hybridization not less than 0.25) (Daina et al., 2017). The bioavailability score (BAS) of the selected PSM was also recorded (Martin, 2005).

Results And Discussion
The PSM library contains 4704 molecules collected from 203 plant species belonging to diverse plant families (Sup. File 1). Over 22000 docking runs (including replicates run to confirm the activity of several molecules) were performed using 4750 ligands against the four selected target proteins/enzymes, which are involved in host cell recognition, entry and replication of SARS-CoV-2. Upon molecular docking, a wide range of BE was obtained for all four target proteins. The results were arranged in ascending order of BE. For ease of study, we selected the top 268 molecules against Mpro and the top 250 molecules for each of the other target proteins for further analysis (Sup. File 2), viz., structural similarity and activity relationship, and physicochemical characterization to evaluate drug-likeness.
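The bioavailability radar ranges quoted from Daina et al. (2017) in the Methods can be written as a small range check. The sketch below is an illustration only: the property keys, the function name and the example descriptor values are hypothetical, and the solubility criterion is interpreted as −log S ≤ 6, which is an assumption about how "log S not higher than 6" should be read. Descriptor values would in practice come from SwissADME.

```python
# (lower bound, upper bound); None means the side is unbounded.
RADAR_RANGES = {
    "xlogp3": (-0.7, 5.0),           # lipophilicity
    "mw": (150.0, 500.0),            # molecular weight, g/mol
    "tpsa": (20.0, 130.0),           # topological polar surface area, A^2
    "neg_log_s": (None, 6.0),        # solubility, assumed to mean -log S <= 6
    "rotatable_bonds": (None, 9),    # flexibility
    "fraction_csp3": (0.25, None),   # saturation
}


def radar_violations(props):
    """Return the names of radar properties that fall outside their range."""
    violations = []
    for key, (low, high) in RADAR_RANGES.items():
        value = props[key]
        if (low is not None and value < low) or (high is not None and value > high):
            violations.append(key)
    return violations


# Hypothetical descriptor values for a large, polar glycoside-like molecule.
example = {
    "xlogp3": 1.2, "mw": 822.9, "tpsa": 267.0,
    "neg_log_s": 4.1, "rotatable_bonds": 7, "fraction_csp3": 0.81,
}
print(radar_violations(example))
```

An empty list indicates that a molecule lies entirely inside the radar's optimal zone; the check says nothing about the bioavailability score, which SwissADME reports separately.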

SARS-CoV-2 Spike Protein
The spike protein is a class I fusion protein present within the envelope as a homotrimer consisting of three S1-S2 heterodimers. The receptor-binding domain (RBD) is located on the head of S1 (Gui et al., 2017) and binds to the cellular receptor hACE2 (Kong et al., 2020). Any PSM that interacts with the selected AARs of the spike protein by forming multiple H bonds and other interactions with lower BE may interfere with the spike protein-hACE2 interaction, thereby preventing the initial recognition of the host cell by SARS-CoV-2.
Interestingly, a large pool of the PSM found interacting with the exposed surface of the spike protein belonged to the class triterpenoids. More than 50% of the PSM among the top 250 molecules were triterpenoids and their derivatives, largely triterpenes and sterols. Sterol lactones alone represented >14% of the total PSM (Fig. 1a). A few biflavonoids, flavonoid glycosides and hydrolysable tannins also showed promising results.
Triterpenoids are a large pool of diverse phytomolecules grouped under terpenoids, which include triterpenes, steroids, limonoids, quassinoids, and triterpenoidal and steroidal saponins (Mahato et al., 1992; Dzubak et al., 2005), and are widely distributed across the plant kingdom. Yin et al. (2012) reported the bioavailability of eight triterpenes (oleanolic acid, ursolic acid, arjunolic acid, asiatic acid, boswellic acid, corosolic acid, madecassic acid and maslinic acid) from different vegetables and fruits, in their intact form, in different organs of mice. A similar study demonstrated the bioavailability of betulinic acid exerting its anti-tumor properties (Godugu et al., 2014). The data obtained in our study indicate the possibility of developing triterpenoid-based drug molecules targeting the SARS-CoV-2 spike protein. Coagulins from Withania coagulans recorded lower BE with the target AARs of the spike protein. Similar observations were made with structurally similar triterpenes and steroids, i.e., steroidal lactones, steroidal saponins, steroidal glycoalkaloids, triterpene glycosides, triterpene saponins and triterpene sterols. Coagulin N recorded the lowest BE of -9.1, followed by Coagulin K (BE -8.9). Coagulin N forms H bonds with ARG403, TYR449 and GLY496 of the spike protein, whereas Coagulin K forms H bonds with ARG403, GLY496, GLN498 and TYR505, which may interfere with the viral host cell recognition process. Also, both Coagulins recorded a BAS of 0.55 and satisfied all of Lipinski's drug-likeness rules except MW.
Withanolides are naturally occurring C28 steroidal lactone triterpenoids built on an intact or rearranged ergostane framework (Christen, 1986; Glotter, 1991; Mirjalili et al., 2009). Saponins are naturally occurring plant glycosides found in a wide range of plant species. They are high molecular weight amphiphilic compounds having a triterpenoid or steroid aglycon as the lipophilic moiety and sugars as the hydrophilic moiety (Vicken et al., 2007). Another class, the basic steroidal saponins, contains nitrogen analogues of steroid sapogenins as aglycones. Almost all saponins have physicochemical properties that are unfavourable under drug-likeness rules. Glycosylation, which increases the number of sugars, makes the molecule more complex and poorly bioavailable. The biotransformation of saponins occurs mainly in the intestine with the aid of gut microbes, generating rare low molecular weight saponins containing no or fewer sugar moieties (He et al., 2019). These hydrolyzed products have higher bioavailability and bioactivity than their parent compounds (Gao et al., 2012; Ramasamy et al., 2015; Del Hierro et al., 2018).
Graecunins are a group of spirostanol saponins isolated from the leaves of Trigonella foenum-graecum (fenugreek) (Varshney et al., 1994). In our study, Graecunin E recorded the lowest BE of -8.6 and was found interacting with the THR415 and GLN493 AARs of the spike protein (Fig. 1b; Sup. File 2). Other Graecunin-related compounds, Trigofoenoside E1 (BE -8.2), Uttronin B (BE -8.3), Stigmasteryl glucoside (BE -7.9) and Yuccagenin (BE -7.6), also showed promising results. The BE of the abovementioned molecules was related to their number of sugar moieties: as the number of sugar moieties decreased, the magnitude of BE also decreased (from -8.6 for Graecunin E to -7.6 for Yuccagenin). However, with the loss of sugar moieties, bioavailability increased, as observed between Graecunin E and Yuccagenin (the aglycon form) (Fig. 1b; Sup. File 2 and 3). Also, Yuccagenin formed H bonds with the GLY496 and ASN501 AARs of the spike protein, which are crucial for interaction with the AARs of human ACE2, and fulfilled all drug-likeness criteria of Lipinski's rule (Fig. 1b).
Alkaloids are low molecular weight, heterocyclic, nitrogen-containing basic compounds widely distributed across the plant kingdom, with >20000 compounds identified (Yang and Stockigt, 2010). Based on their biosynthetic origin, they are broadly classified as indole, pyrrolizidine and quinolizidine alkaloids (Seigler, 1998). They are preferably used at a lower dosage for pharmacological applications; at higher doses they may adversely affect consumers. Additionally, several unfavourable pharmacokinetic behaviours of alkaloids, such as low oral absorption/bioavailability, short half-life and rapid clearance, have been reported to significantly reduce their bioavailability and bioactivity (Li et al., 2004; Zaho et al., 2011; Pirillo and Catapano, 2015). Bismahanine is a carbazole alkaloid isolated from the leaves of Murraya koenigii (Husson, 1985; Chakraborthy and Roy, 1991) and some other Rutaceae members (Chakraborty, 1993). Bismahanine and related compounds are reported to have wide biological activities, such as anti-oxidant, anti-diabetic, anti-inflammatory, anti-microbial, anti-cancerous and anti-viral activity (Knolker and Kethiri, 2008; Nalli et al., 2016; Yu et al., 2020). In our studies, the physicochemical properties of Bismahanine revealed it to be a low-polarity, insoluble, high-MLOGP and high molecular weight PSM with respect to Lipinski's rule of drug-likeness, which reduces its bioavailability (BAS 0.17). The molecule interacted with residue GLU406 through H bonds and showed van der Waals interactions with other residues (TYR495, LEU455, ASN501 and GLY502) of the spike protein. Alkaloids of Veratrum rhizome have been reported for their hypotensive, anti-thrombotic and anti-tumor activity (Xu et al., 2001; Berman et al., 2002; Thayer et al., 2003). In the present study, [...]
Tannic acid is a polyphenolic compound well known for its protein precipitation properties. It is a specific tannin that contains 10 galloyl (3,4,5-trihydroxyphenyl) units surrounding a glucose centre. It is mainly used for industrial applications and as a raw material for gallic acid and pyrogallol production through hydrolysis by the enzyme tannase (Lekha and Lonsane, 1997). As it is a high molecular weight compound that is highly soluble in water (low lipophilicity), its GI absorption is almost nil in its original form. However, it is hypothesized that its biotransformed products may enter the plasma and exert biological activities, which still needs thorough study. In our study, Tannic acid recorded a lower BE of -8.9, and some structurally related hydrolysable tannins, such as Strictinin (BE -8.6), Tercatain (BE -8.3), Punicalagin (BE -8.5), Terchabulin (BE -8.4) and Terflavin (BE -8.1), also showed promising results (Sup. File 2). Details of the biotransformation and bioavailability of hydrolysable tannins are discussed elsewhere in this paper.
Arecatannins are a class of condensed tannins (procyanidins) that contain catechin or epicatechin oligomers. Arecatannin A3 has epicatechin-epicatechin-epicatechin-epicatechin-catechin as its basic structure and, in its original form, may be poorly bioavailable because of its unfavourable physicochemical properties (Sup. File 3). However, its biotransformed products, especially monomers, dimers or trimers, may show varied degrees of bioavailability and bioactivity. The monomers catechin and epicatechin are reported to be more bioavailable than their parent molecules (Baba, 2001). However, the dimers and trimers are poorly bioavailable (Serra et al., 2010). In our studies, [...]

Human Transmembrane serine protease (TMPRSS2)
Structural similarity and BE correlation analysis indicated that a large number of flavonoid glucosides (>32%) interacted with the target site of TMPRSS2, followed by ellagitannins and triterpenoids. Interestingly, two triterpenoid saponins (Liquorice and Glycyrrhizic acid) and a stilbenoid (Cis-Miyabenol C) recorded the lowest BE (Fig. 2a). The data obtained from this study indicate the possibility of developing flavonoid glycoside or triterpenoid based drug molecules targeting human TMPRSS2, which processes the SARS-CoV-2 spike protein and facilitates the entry of the virus into the host cell. [...] (Kumar and Pandey, 2013; Panche et al., 2016).
In plants, most flavonoids occur in their β-glycosidic form, bound to one or more sugar molecules. The exceptions are catechin and proanthocyanidins (Willimson, 2004). In our study, a large number of flavonoid glycosides were found to bind efficiently to the target site of the protein with lower BE. Similarly, in most earlier in vitro studies, flavonoid glycosides proved to be better candidates for enzyme inhibition activity related to health benefits. However, a major concern about flavonoid glycosides is their bioavailability. Upon consumption, they are absorbed in the small intestine mostly in their aglycon form at the cellular level. Absorption of the glycosidic form depends mainly on the number of sugar moieties. Many studies have confirmed that flavonoid glycosides are converted to aglycones by gut microbes and decomposed to yield two different phenolic products, whereas catechin and procyanidins are biotransformed into 5-(hydroxyphenyl)-γ-valerolactone, found in blood plasma and urine as sulfate and glucuronide metabolites (Monagas et al., 2010). However, several earlier studies supported the hypothesis that glycosidic forms are more bioavailable and more resistant to microbial degradation, and that they deliver the aglycone moiety better when administered in glycosidic form. The active uptake of flavonoid glycosides by enterocytes has been studied in detail. During this process, flavonoid glycosides are converted into aglycones by membrane-bound β-glucosidase (reviewed by Kumar and Pandey, 2013). Here, Agathisflavone, a biflavonoid, recorded a BE of -8.9 and was found interacting with the catalytic site AAR SER441 of TMPRSS2 through hydrogen bonds. Nirurin, a prenylated flavanone glycoside found in [...] Though flavonoids dominated the group in number, the lowest BE was recorded by two triterpenoid saponins, i.e., Liquorice (-9.7) and Glycyrrhizic acid (-9.5), followed by the stilbenoid Cis-Miyabenol C (-9.4). Here, Liquorice formed an H bond with SER441 of the TMPRSS2 catalytic site through its glucose moiety. Similarly, Glycyrrhizic acid formed an H bond with HIS296 of TMPRSS2 through an oxygen involved in a glycosidic bond.
Liquorice and Glycyrrhizic acid are isomers, triterpenoid glycosides obtained from the roots of Glycyrrhiza glabra (liquorice). Roots of this plant are traditionally used to alleviate jaundice, gastritis and bronchitis. Gancao, a Chinese herbal decoction of dried plant roots and stem, contains 3.63-13.06% Glycyrrhizin and is well known for its therapeutic properties, including antiviral activity (Pompei et al., 1979). Glycyrrhizic acid was reported as an active component from the roots of Glycyrrhiza glabra that inhibited the growth and cytopathology of both DNA and RNA viruses, viz., Herpes simplex type I, Newcastle disease virus, Vesicular stomatitis virus and Polio type I virus, without affecting host cell activity and replication. Research on the bioavailability of Glycyrrhizic acid revealed that, after oral administration, only biotransformed Glycyrrhetic acid was detected in plasma at higher concentration (Stormer et al., 1993; Takeda et al., 1996). This is because of the complete biotransformation of Glycyrrhizic acid to Glycyrrhetic acid by gut microbes. Hence, to assess the capability of the aglycon form, Glycyrrhetic acid (Enoxolone, not present in the main PSM library) was docked; it also formed an H bond with SER441 of TMPRSS2, but with a weaker BE of -7.7 (Fig. 2b).
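To give a feel for what the difference between -9.5 and -7.7 kcal/mol could mean, the scores can be pushed through the thermodynamic relation ΔG = RT ln Kd. This is a rough illustration only: it assumes that docking scores behave like binding free energies, which they approximate at best, and the function name and the temperature of 298.15 K are assumptions introduced here.

```python
import math

R_KCAL = 1.987e-3  # gas constant in kcal/(mol*K)


def kd_fold_change(be_strong, be_weak, temperature=298.15):
    """Return Kd(weak) / Kd(strong) implied by two binding energies.

    Uses dG = RT ln(Kd), so the ratio equals exp((dG_weak - dG_strong) / RT).
    Treating docking scores as free energies is an approximation.
    """
    return math.exp((be_weak - be_strong) / (R_KCAL * temperature))


# Glycyrrhizic acid (-9.5) versus its aglycon Glycyrrhetic acid (-7.7).
ratio = kd_fold_change(-9.5, -7.7)
print(f"Implied fold change in Kd: {ratio:.1f}")
```

Under these assumptions the 1.8 kcal/mol gap corresponds to roughly a 20-fold difference in the implied dissociation constant, which is why the aglycon is described as the weaker binder even though it retains the H bond with SER441.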
Cis-Miyabenol C is a stilbenoid (resveratrol trimer) found in fennel (Foeniculum vulgare). Stilbenoids are characterized by two phenyl groups linked by a trans-ethene bond and are reported to exhibit a wide range of biological activities and pharmacological properties (Akinwaumi et al., 2018). In the present study, Cis-Miyabenol C recorded a lower BE of -9.4 and was found interacting with the catalytic site AAR ASP345 of TMPRSS2 through an H bond (Fig. 2b; Sup. File 2). This compound was low in GI absorption, highly lipophilic, insoluble in water and violated Lipinski's rule of drug-likeness. The low bioavailability of stilbenoids is largely due to their rapid, extensive metabolism in the intestine and liver during and after absorption, giving rise to a lower level of the free parent compound (Walle et al., 2011). Pharmacokinetic studies showed higher levels of sulphates and glucuronides (metabolites of stilbene) in plasma. Therefore, it was hypothesized that these metabolites might act as a reservoir for the [...] proving itself as a better drug candidate to inhibit the activity of TMPRSS2.
Plumbagin is a naphthoquinone derivative from the roots of Plumbago zeylanica and is well studied for its anti-cancerous properties (Hazar et al., 2002; Sugie et al., 1998). In our study, 3,3'-Biplumbagin recorded a BE of -8.9 and was found interacting with HIS296 and SER441 of the catalytic site of TMPRSS2 and with the VAL280 and GLY439 AARs in the close vicinity, showing its strong affinity towards the catalytic domain of TMPRSS2. Also, it recorded a higher BAS of 0.55 and fulfilled all the necessary physicochemical criteria for a drug-like molecule as per Lipinski's rule (Fig. 2b; Sup. File 2 and 3).
Quinic acid is a carboxylated cyclohexanepolyol found in several plants, such as coffee, tomato and carrot, and exists either in free form or as esters (de Maria et al., 1999). Quinic acid is a starting compound used to synthesize "Tamiflu", an antiviral drug used to treat Influenza A and B virus (Federspiel et al., 1999). Derivatives of Quinic acid were reported to be antiviral in nature against [...] Structurally, they are a complex class of polyphenols characterized by one or more hexahydroxydiphenoyl (HHDP) moieties esterified to a polyol (Khanbabaee and Van Ree, 2001; Niemetz and Gross, 2005). Ellagitannins are among the most potent anti-oxidant agents because of their highly polyphenolic nature. In the present study, a group of bulky ellagitannins, such as Terflavin A, Terflavin B, Punicalin, Strictinin, Pedunculagin, Punicafolin, Tellimagrandin I, Tercatain, Emblicanin A and Phyllanemblinin B, recorded lower BE, indicating their capability to inhibit the activity of the TMPRSS2 enzyme. Similarly, Granatin B, an ellagitannin commonly found in the pericarp of Punica granatum, recorded a lower BE of -9.1, followed by Granatin A (-8.9). Granatin B formed an H bond with HIS296 of TMPRSS2. The bioavailability score of Granatin B was 0.11, and it violated Lipinski's rule (Fig. 2b; Sup. File 2 and 3). However, all of these are high molecular weight compounds and, according to previous studies, are poorly or not at all bioavailable. Further, they hydrolyze in the intestine into Ellagic acid and its complex derivatives, which can cross the gastrointestinal barrier into the bloodstream (Gil et al., 2000), whereas gut microflora is reported to convert ellagitannins to urolithins, which are anti-cancerous compounds (Seeram et al., 2004; Heber, 2011). The bioavailability of ellagitannin metabolites was successfully demonstrated in humans in the form of Ellagic acid in blood plasma (Seeram et al., 2008).

SARS-CoV-2 Main Protease (Mpro)
The key enzyme in the proteolytic processing required for SARS-CoV-2 replication is Mpro. It is initially released by the auto-cleavage of pp1a and pp1ab. Mpro, in turn, cleaves pp1a and pp1ab to release the functional proteins necessary for viral replication (Krichel et al., 2020). Any PSM binding to the AARs of the catalytic site or pocket through H bonds and other interactions may interfere with viral replication in the host cell, thereby reducing the severity of COVID-19.
When the BE and canonical SMILES structural similarity of the top 250 PSM were analyzed, molecules from two major groups, i.e., flavonol glycosides and anthocyanidins, were found to dominate, with >16% and >16% of the PSM, respectively. Other flavonoids and triterpenes also recorded promising results (Fig. 3a). However, the lowest BE was recorded by Hypericin, a naphthodianthrone, and Amentoflavone, a biflavonoid.
Hypericin is a naturally occurring chromophoric naphthodianthrone compound, a derivative of anthraquinone, found in common St. John's Wort (Hypericum species) and in some fungi. Hypericum perforatum, a source of Hypericin, has been used in folk medicine. Hypericin is also reported to be antidepressive, anti-tumor, anti-viral and antineoplastic (Kubin et al., 2005). Here, Hypericin recorded the lowest BE of -10.4. Further, it formed an H bond with the GLU166 residue in the catalytic site of Mpro (Fig. 3b; Sup. File 2). The physicochemical parameters of Hypericin showed that it is poorly soluble in water and violates Lipinski's rule of drug-likeness (Fig. 3b; Sup. File 2), with a BAS of 0.17. It is a naturally occurring photosensitizer reported to accumulate in tumour cells and, upon illumination, to release ROS that kill the cancerous cells (Watson et al., 2014). Recent in silico studies revealed its potential to bind the SARS-CoV-2 spike protein (Smith and Smith, 2020). Additionally, its antiviral properties against Bronchitis virus (Chen et al., 2019), Hepatitis C virus (Shih et al., 2018) and Human Coronavirus OC43 (Kim et al., 2019) make it a potential antiviral drug material.
Amentoflavone, a biflavonoid (a dimer of two apigenin units), occurs naturally in more than 120 plants belonging to the families Selaginellaceae, Cupressaceae, Euphorbiaceae, Podocarpaceae and Calophyllaceae (Yu et al., 2017). Biflavonoids are reported to have a wide spectrum of biological activities. Sotetsuflavone, a biflavonoid from Dacrydium balansae (a gymnosperm), was found to be the most potent inhibitor of Dengue virus NS5 RdRp, with an IC50 of 0.16 µM, among 23 biflavonoids screened. Further, their enzyme inhibitory activity was related to the number and position of methyl groups on the biflavonoid moiety and the degree of oxygenation of the flavonoid monomers (Couleric et al., 2013). The gastrointestinal absorption of Amentoflavone was poor, and its physicochemical parameters violated Lipinski's rule (Sup. File 3). However, a higher bioavailability (>70%) following intravenous administration was reported in rats (Liao et al., 2015). In our studies, Amentoflavone recorded the lowest BE of -9.7 against Mpro and was found interacting with the target AAR by forming H bonds with GLU166 and other residues in the vicinity of the catalytic site (Fig. 3b; Sup. File 2). Similarly, another biflavonoid, Agathisflavone, recorded a lower BE of -9.3, while Ginkgetin and Isoginkgetin, derivatives of Amentoflavone (from Ginkgo biloba), each recorded a BE of -9.5.
The bioavailability of Cyanidin-3-glucoside was studied in human trials by Czank et al. (2013) using the isotopically labelled compound. They detected the label in plasma, urine and breath, indicating absorption of the labelled compound. However, the study did not reveal the existence of bioactive metabolites; the label may have been present in degraded forms of the parent compound.
Terflavin B,an ellagitannin (found in Terminalia chebula and Terminalia catapa) recorded BE of -9.7. Also, it was found forming H bonds with AARs HIS41, GLY143, SER144, CYS145 and GLU166 present in the catalytic site of SARS-CoV-2M pro . It is water-soluble, low in GI absorption and violates Lipinski's rule (Fig 3b; Sup. File 2). Aqueous extracts of Terminalia chebula was reported to be inhibitory to Hepatites B virus infection in Hep G 2.2.15 cells (Kim et al., 2001). Not much research studies on their bioavailability and antiviral properties have been done with purified Terflavin B. Vescalagin, and Castalagin are ellagitannins found in oak and chestnut wood (Peng et al., 1991;Tanalka et al., 1996).
They are water-soluble, high oxidizable and astringent (Vivas et al., 2004). Here, Viscalagin recorded -9.6 BE and found interacted with GLU166 residues of M pro catalytic site through H bond, and Castalagin showed -8.8 BE ( Fig. 3b; Sup. File 2). Viscalagin is high molecular weight, and high polar molecule recorded lower BAS 0.17. Vilhelmova et al. (2011) demonstrated the antiviral activity of Vescalagin and Castalagin against Herpes simplex virus type I and II. Also, these PMS synergistically inhibited the multiplication of test virus along with antiviral compound Acyclovir.
Mudanpioside J is a monoterpene glucoside. It is one of the compounds identified in Paeonia delavayi (a Chinese medicinal plant), which was found to inhibit influenza virus neuraminidase (Li et al., 2016). In our study, Mudanpioside J recorded a BE of -9.6 and showed strong interaction with the GLY143, SER144 and CYS145 residues of the Mpro catalytic site through H bonds (Fig. 3b; Sup. File 2).
Studies on the pharmacokinetic properties of monoterpene glycosides are limited. Paeoniflorin and albiflorin, compounds related to Mudanpioside J, were reported to have low oral bioavailability owing to poor membrane permeability and metabolism by gut microbes (Takeda et al., 1997;Liu et al., 2006). SwissADME analysis of Mudanpioside J also indicated that it is a P-glycoprotein substrate and violates Lipinski's rule, with a BAS of 0.17 (Sup. File 3).
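The drug-likeness screening referred to throughout this section can be illustrated with a minimal sketch of a Lipinski rule-of-five check. The thresholds below are the classic ones (molecular weight ≤ 500, logP ≤ 5, H-bond donors ≤ 5, H-bond acceptors ≤ 10); the `Descriptors` class, the function name and the example values are hypothetical and are not taken from the supplementary data. Web tools may use a different logP estimate and cut-off, so results need not match theirs exactly.

```python
from dataclasses import dataclass


@dataclass
class Descriptors:
    """Physicochemical descriptors of one compound (hypothetical container)."""
    mol_weight: float       # molecular weight in g/mol
    logp: float             # octanol-water partition coefficient (log P)
    h_bond_donors: int      # number of OH and NH groups
    h_bond_acceptors: int   # number of N and O atoms


def lipinski_violations(d: Descriptors) -> list[str]:
    """Return the list of classic rule-of-five criteria that the compound breaks."""
    violations = []
    if d.mol_weight > 500:
        violations.append("molecular weight > 500")
    if d.logp > 5:
        violations.append("logP > 5")
    if d.h_bond_donors > 5:
        violations.append("H-bond donors > 5")
    if d.h_bond_acceptors > 10:
        violations.append("H-bond acceptors > 10")
    return violations


if __name__ == "__main__":
    # Illustrative values only: a large, polar glycoside-like molecule.
    example = Descriptors(mol_weight=650.0, logp=-1.5,
                          h_bond_donors=9, h_bond_acceptors=15)
    for violation in lipinski_violations(example):
        print(violation)
```

A large glycoside typically fails on molecular weight and on both H-bond counts, which is consistent with the pattern reported here for the glycosylated PSM.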
Arylnaphthalene lignan glycosides are PSM widely distributed in the family Acanthaceae. Lignans have been reported to possess a wide range of biological activities, including antiviral activity against Hepatitis B, Hepatitis C, Herpes simplex virus types 1 and 2, Epstein-Barr virus and Cytomegalovirus (Ye et al., 2005;Charlton, 1988).

SARS-CoV-2 RNA Dependent RNA Polymerase (RdRp)
The central component of the coronaviral replication/transcription machinery is the RNA-dependent RNA polymerase (RdRp, also named nsp12), which synthesizes copies of the viral RNA genome and plays the key role in replication and transcription of SARS-CoV-2 in the host cell (Subissi et al., 2014;Gao et al., 2020).
Because of its high sequence and structural conservation, it remains the target of choice for the prophylactic or curative treatment of several viral diseases. Analysis of the structure-activity relationship and BE of the PSM revealed that a large number of flavonol glycosides (>13%), followed by hydrolysable tannins, anthocyanins and triterpenes, interacted with the target site of RdRp with lower BE (Fig. 4a). Though the number of flavonol glycosides was high, the lowest BE was recorded by Eriodictyol-7-O-glycoside and Narirutin, which belong to the flavanone glycoside group. Interestingly, none of the PSM analyzed interacted with VAL557 of RdRp, but they formed H bonds with AARs in the catalytic pocket, which probably hinders substrate interaction with the catalytic site sterically.
Eriodictyol-7-O-glycoside is a flavanone glycoside commonly found in lemons (Miyake et al., 1970) and is also known as a lemon or citrus flavonoid. It recorded a BE of -9.9 and interacted with ARG553, ALA554, THR556, ASP618, TYR619 and ASP623 in the active site of RdRp. Both the glycone and aglycone moieties of this molecule were involved in H bond formation with the target site of RdRp.
Similarly, Narirutin, another flavanone glycoside that occurs naturally in sweet oranges, recorded a low BE of -9.7 and showed a similar pattern of H bond formation with the active site of RdRp. Both molecules recorded a BAS of 0.17, indicating low bioavailability, and violated Lipinski's rule (Fig. 4b; Sup. File 2). The structurally similar compounds Nirurin from Phyllanthus niruri and Naringin from grapefruit also showed promising results, with BE of -9.0 and -8.9, respectively.
Though they recorded lower BE, Rhoifolin and Isorhoifolin did not interact with AARs close to the active site of RdRp. Isoginkgetin, a biflavonoid from Ginkgo biloba, recorded a low BE of -9.5, followed by Agathisflavone (BE -9.4) and Ginkgetin (BE -9.2). Isoginkgetin interacted with ARG553, THR556, THR687 and ALA685 of the RdRp catalytic pocket. Compared to flavonoid glycosides, aglycones are more frequently reported to be bioavailable. Hence, Myricetin, a monomeric flavonoid aglycone, was analyzed and recorded a BE of -8.4. Further, Myricetin formed H bonds with ARG553, ARG555, ARG624 and THR680 of the RdRp active site, suggesting that it is a better candidate for developing a drug targeting viral RdRp. Similarly, Quercetin glycosides were found promising, but Quercetin itself did not appear in the list of top-ranked PSM.
Here, Rotundioside B recorded a BE of -9.5 and formed H bonds with ASP452, SER501, ALA554, ASP623, ARG624, ASN691, SER759 and ASP760 of the RdRp catalytic pocket, making it a strong candidate antiviral drug. However, this molecule recorded a poor BAS of 0.11 and violated Lipinski's rule of drug-likeness (Fig. 4b; Sup. File 2). Similarly, Ginsenoside Ro, a triterpene saponin from the roots of Panax ginseng and related plants, recorded a BE of -9.4. Trigofoenoside G, a steroidal saponin from the plants and seeds of Trigonella foenum-graecum, interacted with RdRp with a BE of -9.3 and formed H bonds with ASP452, LYS500, ALA554, ALA558, LYS621 and ALA685 of the catalytic pocket (Fig. 4b; Sup. File 2).
Next, Trigofoenoside F and Trigofoenoside A recorded BE of -8.9 and -8.4, respectively (Sup. File 2).
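The H-bond contacts reported above for RdRp can be compared across ligands by counting how often each residue recurs. The following is a minimal sketch of such a tally, not the analysis pipeline used in this study; the residue lists are transcribed from the text of this section, while the dictionary and function names are hypothetical.

```python
from collections import Counter

# H-bond contact residues in the RdRp pocket, as reported in this section.
RDRP_HBOND_CONTACTS = {
    "Eriodictyol-7-O-glycoside": ["ARG553", "ALA554", "THR556",
                                  "ASP618", "TYR619", "ASP623"],
    "Isoginkgetin": ["ARG553", "THR556", "THR687", "ALA685"],
    "Myricetin": ["ARG553", "ARG555", "ARG624", "THR680"],
    "Rotundioside B": ["ASP452", "SER501", "ALA554", "ASP623",
                       "ARG624", "ASN691", "SER759", "ASP760"],
    "Trigofoenoside G": ["ASP452", "LYS500", "ALA554", "ALA558",
                         "LYS621", "ALA685"],
}


def residue_frequency(contacts: dict[str, list[str]]) -> Counter:
    """Count the number of ligands that contact each residue."""
    counts = Counter()
    for residues in contacts.values():
        counts.update(set(residues))  # set() guards against duplicate entries
    return counts


def shared_residues(contacts: dict[str, list[str]], min_ligands: int = 2) -> list[str]:
    """Return residues contacted by at least `min_ligands` ligands, sorted by name."""
    counts = residue_frequency(contacts)
    return sorted(res for res, n in counts.items() if n >= min_ligands)


if __name__ == "__main__":
    for residue, n in residue_frequency(RDRP_HBOND_CONTACTS).most_common():
        print(f"{residue}: contacted by {n} ligand(s)")
```

Residues that recur across structurally unrelated ligands are natural candidates for closer inspection in follow-up docking or mutational analysis.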
Hippomanin A, an ellagitannin found in the Hippomane mancinella plant, is known for its toxic properties.
It causes oropharyngeal and gastrointestinal tract lesions, hypotension and bradycardia (Boucaud-Maiter et al., 2019;Rao, 1974). Here, Hippomanin A recorded a BE of -9.6 and formed H bonds with ARG553, TYR619, ARG624, THR680, SER682 and ASP760 at the target site of RdRp. A structurally similar molecule, Tellimagrandin I, is found widely in fruits, nuts and vegetables and has been reported to have a wide spectrum of biological activities (reviewed by Zheng et al., 2012). Here, it recorded a BE of -9.5, followed by Punigluconin (BE -9.3). Other hydrolyzable tannins, viz. Emblicanin A and Phyllanemblinin C, found in fruits of Indian Gooseberry (Emblica officinalis), recorded low BE of -9.4 and -9.3, respectively. As the hydrolyzable tannins discussed above are highly water-soluble and large in molecular weight, their bioavailability in the original form is a major concern. Some molecules that are structurally similar to their biotransformed metabolites and have higher bioavailability, such as Ellagic acid (BE -8.3), 3′-O-methyl ellagic acid-4-xyloside (BE -8.4) and 3,3′-Di-O-methyl ellagic acid (BE -8.2), also recorded promising results, indicating the possible involvement of ellagitannin biotransformation products in inhibiting RdRp activity and thereby reducing COVID-19 severity.
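The class-wise pattern described for RdRp can be summarized with a short grouping routine. The sketch below is illustrative only: it uses the subset of compounds whose class and BE are stated in this section, so the class shares it prints do not reproduce the percentages reported for the full screened library, and the variable and function names are hypothetical.

```python
from collections import defaultdict

# (compound, structural class, BE against RdRp) as stated in this section.
RDRP_HITS = [
    ("Eriodictyol-7-O-glycoside", "flavanone glycoside", -9.9),
    ("Narirutin", "flavanone glycoside", -9.7),
    ("Hippomanin A", "hydrolyzable tannin", -9.6),
    ("Emblicanin A", "hydrolyzable tannin", -9.4),
    ("Phyllanemblinin C", "hydrolyzable tannin", -9.3),
    ("Isoginkgetin", "biflavonoid", -9.5),
    ("Ginkgetin", "biflavonoid", -9.2),
    ("Ginsenoside Ro", "triterpene saponin", -9.4),
    ("Trigofoenoside G", "steroidal saponin", -9.3),
]


def summarize_by_class(hits):
    """Group hits by structural class and report count, share and best (lowest) BE."""
    groups = defaultdict(list)
    for name, cls, be in hits:
        groups[cls].append((be, name))

    summary = []
    for cls, members in groups.items():
        best_be, best_name = min(members)  # lowest BE = strongest predicted binding
        summary.append({
            "class": cls,
            "count": len(members),
            "share_pct": 100.0 * len(members) / len(hits),
            "best_be": best_be,
            "best_compound": best_name,
        })
    # Rank classes by their best-scoring member.
    summary.sort(key=lambda row: row["best_be"])
    return summary


if __name__ == "__main__":
    for row in summarize_by_class(RDRP_HITS):
        print(f'{row["class"]}: n={row["count"]} ({row["share_pct"]:.1f}%), '
              f'best BE {row["best_be"]} ({row["best_compound"]})')
```

Ranking classes by their best member rather than by their mean keeps a single strong binder from being masked by weaker relatives of the same class.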

Conclusion
From the results obtained, it can be concluded that virtual screening through molecular docking is a promising preliminary step towards developing an effective drug against a desired target protein/enzyme. Here, more than the BE of a PSM, its bioavailability plays a crucial role in determining its biological activity under actual conditions. Triterpenoid-based PSM structures (Coagulins, Withanolides, Pseudojervine, Kamalachalcone, etc.) are hypothesized to be potent drug molecules that can block the surface AARs of the spike protein that interact with hACE2, thereby preventing host cell recognition by SARS-CoV-2. In the case of TMPRSS2, Mpro and RdRp, molecules belonging to the flavonoid glycoside, biflavonoid, ellagitannin, anthocyanidin and triterpene groups, among others, can be explored. Though a large number of PSM violated Lipinski's rule and recorded lower BAS, these PSM cannot be ignored, because several of their biotransformed structures are highly bioavailable and may retain structural moieties of the parent compound. These biotransformed molecules may further interact with the target site of the protein and exert effects similar to those observed in the molecular docking studies. With regard to safety, although these PSM are of natural origin, they should be tested thoroughly under in vitro and in vivo conditions for their biological activity, biotransformation, bioavailability and toxicity before being taken into clinical trials.