3.1. Evolutionary relationship of SARS-CoV-2 variants
Phylogenetic analysis showed that the SARS-CoV-2 genome shares high nucleotide homology (> 90%) with SARS-CoV, MERS-CoV and SARS-related coronaviruses (SARSr-CoVs) found in pangolin, bat and civet, placing it in the subgenus Sarbecovirus of the genus Betacoronavirus (Fig. 1). The Delta, Alpha, Beta, Gamma and Omicron variants evolved from the same clade as the SARS-CoV-2 Wuhan strain, together with the pangolin lineage. However, the Omicron variant differed from the other strains: it did not evolve from Delta and instead formed a strongly supported monophyletic group with the Gamma variant. The matrix of estimated evolutionary distances among the strains was calculated using the maximum likelihood (ML) approach (supplementary file 2). The SARS-CoV-2 genome showed a distinct evolutionary lineage with pathogenic HCoV isolates derived from a SARS bat CoV. Within this clade, a recent pangolin coronavirus clustered closely with SARS-CoV-2 isolates, whereas civet CoV was more closely related to SARS-CoV isolates and to parallel ancestors of other SARSr-CoVs. This suggests that the pangolin may be associated with the evolution of subsequent SARS-CoV-2 outbreaks. Furthermore, other beta CoVs (canine CoV, bovine CoV, murine CoV and porcine hemagglutinating encephalomyelitis virus (PHEV)) exhibited lower genetic similarity with SARS-CoV-2 strains, indicating little evolutionary relatedness between the human virus and those of animals such as dog, cow and mouse, so that viral transmission from these animals to humans appears unlikely.
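To make the idea of a pairwise evolutionary distance matrix concrete, the following is a minimal sketch and not the authors' pipeline. It swaps in the simpler Jukes-Cantor (JC69) correction in place of the maximum likelihood estimate used in the study, and the three aligned fragments (`strain_A`, `strain_B`, `strain_C`) are hypothetical toy strings, not real viral sequences.

```python
import math
from itertools import combinations


def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned sequences (gap sites skipped)."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(a, b) for a, b in zip(seq_a.upper(), seq_b.upper())
             if a != "-" and b != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(a != b for a, b in pairs) / len(pairs)


def jc69_distance(seq_a: str, seq_b: str) -> float:
    """Jukes-Cantor distance: d = -3/4 * ln(1 - 4p/3)."""
    p = p_distance(seq_a, seq_b)
    if p == 0:
        return 0.0
    if p >= 0.75:
        return math.inf  # the correction is undefined at saturation
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)


def distance_matrix(aligned: dict) -> dict:
    """Symmetric matrix of JC69 distances keyed by (name, name)."""
    names = list(aligned)
    matrix = {(n, n): 0.0 for n in names}
    for a, b in combinations(names, 2):
        matrix[(a, b)] = matrix[(b, a)] = jc69_distance(aligned[a], aligned[b])
    return matrix


# Hypothetical toy alignment, for illustration only.
toy_alignment = {
    "strain_A": "ATGCATGCAT",
    "strain_B": "ATGCATGCAA",
    "strain_C": "ATGGATGCTA",
}
toy_matrix = distance_matrix(toy_alignment)
```

Smaller entries in `toy_matrix` correspond to more closely related sequences; a real analysis would start from a whole-genome multiple sequence alignment and a fitted substitution model.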
3.2. SARS-CoV-2 target proteins
Through PSI-BLAST analysis, proteins with high sequence identity and their available structural information were obtained. The available 3D structures of the spike glycoprotein, 3CL protease, PL pro, RdRp and helicase (PDB IDs 6VSB, 6LU7, 6W9C, 6M71 and 6YJT, respectively) were retrieved from the RCSB Protein Data Bank.
3.3. Pharmacophore modeling
The pharmacophore validated for the N3 inhibitor consisted of 6 hydrogen bond donors, 2 hydrogen bond acceptors and 2 aromatic rings (Fig. 2). The tea compounds aligned with more than three features, with pharmacophore fit scores ranging from 36.13 to 95.16 (Table 2). A higher fit score in Table 2 indicates a better fit to the model. Among all compounds, theaflavin-3,3'-digallate, rutin, tannic acid, (-)-epigallocatechin gallate, epitheaflagallin 3-o-gallate, theaflavin-3-gallate and epicatechin had the highest pharmacophore fit scores (the first four above 70 and the remaining three above 65), which indicates that these compounds fit the model effectively. The lead compound theaflavin-3,3'-digallate was aligned to the pharmacophore model, and the aligned theaflavin-3,3'-digallate and SARS-CoV-2 Mpro structures were merged and are shown in Fig. 2.
Table 2
Hit compounds arranged in order of decreasing pharmacophore fit score. The compounds were used for further downstream analysis including molecular docking.
S.No | Name | Pharmacophore fit score |
1 | Theaflavin-3,3'-Digallate | 95.16 |
2 | Rutin | 75.67 |
3 | Tannic acid | 75.27 |
4 | (-)-Epigallocatechin Gallate | 75.18 |
5 | Epitheaflagallin 3-O-Gallate | 66.5 |
6 | Theaflavin-3-Gallate | 66.14 |
7 | Epicatechin | 65.31 |
8 | Caffeine | 56.51 |
9 | Epicatechin Gallate | 55.61 |
10 | Procyanidin B2 | 55.49 |
11 | Catechin | 55.32 |
12 | Quercimeritrin | 55.31 |
13 | Nicotiflorin | 55.29 |
14 | Strictinin | 55.28 |
15 | Epicatechin 3,5-Di-O-Gallate | 55.21 |
16 | Epigallocatechin 3-O-P-Coumarate | 55.19 |
17 | Epigallocatechin 3,4',-Di-O-Gallate | 54.97 |
18 | Delphinidin 3-O-Beta-D-(6-O-(E)-P-Coumaryl)Galactopyranoside | 46.07 |
19 | Quercetin | 46.05 |
20 | Epigallocatechin | 45.93 |
21 | Isoquercitrin | 45.89 |
22 | Epigallocatechin 3,3',-Di-O-Gallate | 45.75 |
23 | Theaflavin | 45.66 |
24 | Epitheaflagallin | 45.64 |
25 | Theaflagallin | 45.64 |
26 | Isoschaftoside | 45.51 |
27 | Epigallocatechin 3,5,-Di-O-Gallate | 45.46 |
28 | Epicatechin 3-O-P-Hydroxybenzoate | 45.43 |
29 | Myricetin | 45.42 |
30 | Erythromycin | 45.22 |
31 | Gallocatechin 3'-O-Gallate | 44.81 |
32 | Theophylline | 36.32 |
33 | Gallocatechin 3-O-Gallate | 36.13 |
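The ranking and cutoff logic applied to Table 2 can be sketched as follows. This is an illustrative snippet only: the dictionary holds the first eight rows of Table 2, and the helper name `hits_above` and the cutoffs passed to it are assumptions rather than part of the original workflow.

```python
# First eight rows of Table 2 (pharmacophore fit scores).
fit_scores = {
    "Theaflavin-3,3'-Digallate": 95.16,
    "Rutin": 75.67,
    "Tannic acid": 75.27,
    "(-)-Epigallocatechin Gallate": 75.18,
    "Epitheaflagallin 3-O-Gallate": 66.5,
    "Theaflavin-3-Gallate": 66.14,
    "Epicatechin": 65.31,
    "Caffeine": 56.51,
}


def hits_above(scores: dict, cutoff: float) -> list:
    """Return compound names with a fit score above the cutoff, best first."""
    ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
    return [name for name, score in ranked if score > cutoff]


strong_hits = hits_above(fit_scores, 70)    # hypothetical cutoff
moderate_hits = hits_above(fit_scores, 65)  # hypothetical cutoff
```

With these rows, four compounds pass a cutoff of 70 and seven pass a cutoff of 65.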
3.4. Validation of ADMET
To assess the clinical significance and toxicity of the compounds, physicochemical descriptors and ADMET (Absorption, Distribution, Metabolism, Excretion, Toxicity) properties were computed in order to gain a deeper understanding of the efficiency of the optimized compounds. Specifically, physicochemical properties, drug-likeness properties, human gastrointestinal absorption (HGA), blood-brain barrier (BBB) penetration, Caco-2 permeability, CYP inhibitory promiscuity, AMES toxicity, hERG toxicity, carcinogenicity and rat acute toxicity LD50 were calculated and are shown in supplementary file 3. All compounds showed good Caco-2 permeability (ranging from -3.58 to 1.22 log Papp in 10^-6 cm/s) and were predicted not to be ligands of CYP2D6. These compounds also indicated BBB penetration capability, with values from -8.62 to 0.64 log BB. Interestingly, epicatechin, quercetin, catechin, theophylline, caffeine and epicatechin 3-o-p-hydroxybenzoate exhibited excellent ADMET properties for oral availability, satisfied the Lipinski rule and possessed a high human gastrointestinal absorption rate. Furthermore, no toxicity was predicted for the compounds, as they were inactive in the AMES and hERG models. Rat acute toxicity LD50 values of all compounds ranged from 2.308 to 3.662 mol/kg. Taken together, the tea compounds exhibit a largely favourable ADMET profile with low predicted toxicity.
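The Lipinski screen mentioned above can be illustrated with a minimal sketch of the rule of five. The descriptor values for caffeine and rutin below are approximate, commonly cited figures entered by hand as placeholders; they are not taken from supplementary file 3, and the `Descriptors` class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Descriptors:
    name: str
    mol_weight: float      # g/mol
    log_p: float           # octanol-water partition coefficient
    h_bond_donors: int
    h_bond_acceptors: int


def lipinski_violations(d: Descriptors) -> int:
    """Count violations of Lipinski's rule of five."""
    checks = (
        d.mol_weight > 500,
        d.log_p > 5,
        d.h_bond_donors > 5,
        d.h_bond_acceptors > 10,
    )
    return sum(checks)


def passes_rule_of_five(d: Descriptors, max_violations: int = 1) -> bool:
    """Common convention: at most one violation is tolerated."""
    return lipinski_violations(d) <= max_violations


# Approximate placeholder descriptors (assumptions, replace with computed values).
caffeine = Descriptors("caffeine", 194.19, -0.07, 0, 6)
rutin = Descriptors("rutin", 610.52, -1.3, 10, 16)
```

Under these placeholder values caffeine has no violations, while rutin exceeds the molecular weight, donor and acceptor limits; descriptors computed by a cheminformatics toolkit should replace the hand-entered numbers in practice.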
3.5. Molecular docking and visualization
In molecular docking, the tea compounds bound the SARS-CoV-2 targets with binding energies ranging from -12.9 kcal/mol to -4.8 kcal/mol (Table 3). Lead interactions are shown in Table 4. The five protein-ligand docking complexes are visualized in 2D and 3D representations in Fig. 3, including hydrogen-bonding residues and van der Waals, pi-alkyl and pi-anion interactions.
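For intuition about what these energies mean, a docking score can be converted to a rough dissociation constant through the thermodynamic relation ΔG = RT ln Kd. The sketch below assumes a temperature of 298.15 K and treats the docking score as if it were a true binding free energy, which is a strong simplification; the function name `estimated_kd` is hypothetical.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal/(mol*K)


def estimated_kd(delta_g_kcal_per_mol: float, temperature_k: float = 298.15) -> float:
    """Estimate Kd (mol/L) from a binding free energy via dG = RT * ln(Kd)."""
    return math.exp(delta_g_kcal_per_mol / (GAS_CONSTANT_KCAL * temperature_k))


# The two extremes of the binding-energy range reported in the text.
kd_strongest = estimated_kd(-12.9)
kd_weakest = estimated_kd(-4.8)
```

Under these assumptions the strongest score corresponds to a sub-nanomolar estimate and the weakest to a sub-millimolar one, but scoring-function errors of a few kcal/mol translate into orders of magnitude in Kd, so the values are only indicative.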
Table 3
Binding affinity (kcal/mol) of tea compounds to SARS-CoV-2 target proteins from AutoDock Vina docking results
Compounds | 3CL protease (6LU7) | Glycoprotein (6VSB) | PL pro (6W9C) | Helicase (6YJT) | RdRp (6M71) |
Strictinin | -11.6 | -10.8 | -10.7 | -12.7 | -12 |
Epigallocatechin 3,3',-Di-O-Gallate | -11.1 | -10.2 | -9.5 | -10.4 | -11.3 |
Epicatechin 3,5-Di-O-Gallate | -10.4 | -9.6 | -9.8 | -10.1 | -11.2 |
Tannic Acid | -9.3 | -9.6 | -9.8 | -12.9 | -11.1 |
Gallocatechin 3-O-Gallate | -9.2 | -10 | -8.8 | -10.5 | -10.7 |
Epitheaflagallin 3-O-Gallate | -11.4 | -9 | -9 | -10.9 | -10.2 |
Epicatechin 3-O-P-Hydroxybenzoate | -8.5 | -8.1 | -8 | -9.1 | -10.1 |
Rutin | -8.8 | -8.2 | -8.3 | -11.3 | -10 |
Theaflavin-3-Gallate | -8 | -7.5 | -9.4 | -8.4 | -9.9 |
Gallocatechin 3'-O-Gallate | -10 | -9.4 | -10.2 | -10.7 | -9.6 |
Nicotiflorin | -8 | -7.6 | -7.3 | -8.3 | -9.6 |
Theaflavin-3,3'-Digallate | -8.4 | -8.5 | -8 | -10.5 | -9.5 |
Epitheaflagallin | -9.9 | -8.9 | -8.7 | -10.7 | -9.3 |
Lupeol | -8.7 | -8.5 | -7.9 | -10.2 | -9.1 |
Epigallocatechin 3,5,-Di-O-Gallate | -7.4 | -8.5 | -7.6 | -8.6 | -9.1 |
(-)-Epigallocatechin Gallate | -8.2 | -7.2 | -7.9 | -8.6 | -9.1 |
Procyanidin B2 | -8.2 | -8 | -7.6 | -8.6 | -9 |
Epigallocatechin 3,4',-Di-O-Gallate | -8.5 | -8.8 | -7.6 | -8.8 | -8.9 |
Erythromycin | -7.9 | -7.5 | -6.8 | -8.6 | -8.8 |
Myricetin | -9.1 | -8.7 | -8 | -9 | -8.7 |
Theaflavin | -8.2 | -7.9 | -7.6 | -9.4 | -8.5 |
Epicatechin Gallate | -8.1 | -8.4 | -8 | -8.3 | -8.3 |
Isoschaftoside | -7.6 | -6.4 | -7 | -8.2 | -8.3 |
Isoquercitrin | -8.1 | -6.8 | -6.9 | -6.9 | -8.3 |
Epigallocatechin 3-O-Caffeate | -7.7 | -7.3 | -7.4 | -9.3 | -8.2 |
Theaflagallin | -8.2 | -6.8 | -6.9 | -8.3 | -8.2 |
Epigallocatechin 3-O-P-Coumarate | -7.3 | -8.1 | -7.4 | -9 | -8.1 |
Quercimeritrin | -7.5 | -7.3 | -7.4 | -8.1 | -8.1 |
Delphinidin 3-O-Beta-D-(6-O-(E)-P-Coumaryl)Galactopyranoside | -8.9 | -7.6 | -8.1 | -8.3 | -7.7 |
Epigallocatechin | -7.1 | -6.9 | -6.6 | -8.6 | -7.1 |
Quercetin | -7.2 | -7 | -6.6 | -7.6 | -7 |
Epicatechin | -7.2 | -6.7 | -7.2 | -7.5 | -6.8 |
Catechin | -6.7 | -6.3 | -7.3 | -7.2 | -6.8 |
Caffeine | -5 | -4.8 | -4.9 | -5.7 | -5.1 |
Theophylline | -5.2 | -4.9 | -5 | -5.5 | -5 |
Grazoprevir | -8.9 | -7.9 | -8.6 | -11.6 | -8.8 |
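The comparison against the grazoprevir control in Table 3 can be sketched as below. Only five rows of the table are copied in, the short target labels in `TARGETS` are abbreviations introduced here, and `beats_control` is a hypothetical helper.

```python
# Column order follows Table 3; labels are shortened for this sketch.
TARGETS = ("3CLpro", "Spike", "PLpro", "Helicase", "RdRp")

# Selected rows of Table 3 (binding affinity in kcal/mol).
docking = {
    "Strictinin": (-11.6, -10.8, -10.7, -12.7, -12.0),
    "Tannic Acid": (-9.3, -9.6, -9.8, -12.9, -11.1),
    "Rutin": (-8.8, -8.2, -8.3, -11.3, -10.0),
    "Caffeine": (-5.0, -4.8, -4.9, -5.7, -5.1),
    "Grazoprevir": (-8.9, -7.9, -8.6, -11.6, -8.8),
}


def beats_control(table: dict, control: str = "Grazoprevir") -> dict:
    """For each compound, list the targets where it scores lower (better) than the control."""
    reference = table[control]
    result = {}
    for name, scores in table.items():
        if name == control:
            continue
        result[name] = [
            target
            for target, score, ref in zip(TARGETS, scores, reference)
            if score < ref  # more negative means stronger predicted binding
        ]
    return result


better_than_control = beats_control(docking)
```

For these rows, strictinin and tannic acid score better than grazoprevir on all five targets, rutin on the spike glycoprotein and RdRp only, and caffeine on none.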
3.6. Molecular docking of RdRp
RdRp is involved in the replication and transcription of the SARS-CoV-2 genome and is produced by cleavage of polyproteins 1a and 1ab, which are encoded by ORF1a and ORF1ab. In RdRp molecular docking, strictinin, epigallocatechin-3,3',-di-o-gallate, epicatechin-3,5-di-o-gallate, tannic acid, gallocatechin-3-o-gallate, epitheaflagallin-3-o-gallate, epicatechin-3-o-p-hydroxybenzoate and rutin exhibited the highest binding affinity for RdRp, ranging from -10 to -12 kcal/mol (Table 3.1). Of these, the polyphenol strictinin exhibited the lowest binding energy (-12 kcal/mol) against the RNA binding channel of SARS-CoV-2 RdRp and formed strong hydrogen bonds with LYS621, ASP452, ASP761 and GLU811 of the RdRp binding site (Fig. 3.1). These sites are involved in positioning the priming nucleotide and stabilizing the core structure of the RdRp domain, and have been proposed as a potential therapeutic target for inhibition of coronavirus (50),(51),(52).
3.7. Molecular docking of Glycoprotein
The spike glycoprotein is the main structural protein that interacts with the host by binding to host cell receptors to mediate viral entry and determine viral tissue or host tropism (53),(54),(55). Blocking the surface glycoprotein is therefore a valuable strategy against viral entry into the host. For the spike glycoprotein, seven active compounds (strictinin, epigallocatechin-3,3',-di-o-gallate, gallocatechin 3-o-gallate, tannic acid, epicatechin 3,5-di-o-gallate, gallocatechin 3'-o-gallate and epitheaflagallin 3-o-gallate) could bind tightly to the binding interface, with binding energies of -10.8, -10.2, -10, -9.6, -9.6, -9.4 and -9 kcal/mol, respectively. All top-ranked compounds bound more strongly (more negative binding energy) than the control compound grazoprevir (-7.9 kcal/mol). Remarkably, epicatechin 3,5-di-o-gallate, with a binding energy of -9.6 kcal/mol, formed a larger number of hydrogen bond interactions, namely with ILE896, ILE882, PHE898, ASP796, ALA879, SER884 and GLN895. Moreover, these interactions destabilize the binding pockets, which are located on single protomer surfaces. Targeting these pockets may counteract the function of the spike protein and block membrane fusion with the host (56).
3.8. Molecular docking of PL pro
PL pro is a key protein responsible for cleaving the N-terminus of the replicase polyprotein to release Nsp1, Nsp2 and Nsp3, a crucial process in SARS-CoV-2 replication and infection of the host (57). Based on the molecular docking results, we identified 10 compounds with higher binding affinity to PL pro, ranging from -8.1 to -9.1 kcal/mol. Gallocatechin 3'-o-gallate and strictinin showed promising interactions with PL pro, with binding energies of -10.2 and -10.7 kcal/mol, respectively. In addition, hydrogen bonds were formed at residues LYS217, TYR251, THR259 and TYR305 of PL pro. These binding sites have been reported to be crucial in the attachment of the papain-like protease (58),(59).
3.9. Molecular docking of 3CL protease
3CL protease is encoded by Open Reading Frame-1 of the viral genome and plays an important role in generating other non-structural proteins from the polyproteins (pp1a and pp1ab), which is vital for viral replication. These non-structural proteins contribute to the viral replication, generation and infection mechanisms. Hence, SARS-CoV-2 3CL protease is considered an important target for inhibition in COVID-19 (60). Docking against 3CL protease gave binding energies ranging from -11.6 kcal/mol to -5 kcal/mol. Promisingly, strictinin, epitheaflagallin 3-o-gallate, epigallocatechin 3,3',-di-o-gallate, epicatechin 3,5-di-o-gallate and gallocatechin 3'-o-gallate had binding affinities of -11.6, -11.4, -11.1, -10.4 and -10 kcal/mol, respectively (binding affinity threshold ≤ -10 kcal/mol). Among the screened compounds, epitheaflagallin 3-o-gallate showed a distinct binding affinity of -11.4 kcal/mol, with potential interactions at the viral target sites GLU166, LEU167, PRO168, GLY143, THR190 and GLN192 through conventional hydrogen bonds. Van der Waals interactions were observed at MET165, LEU167, THR190, GLN192, HIS163 and GLY143, and a pi-alkyl bond was observed between CYS145 and the indole moiety. These binding sites are key catalytic sites for polyprotein cleavage and for viral replication mechanisms (61),(62),(63). Intriguingly, epigallocatechin 3,3',-di-o-gallate showed a comparable binding affinity and could potentially bind to the target binding sites THR111, GLN110, PHE294, PRO108 and HIS246 through stable, strong hydrogen bond interactions.
3.10. Molecular docking of helicase
The SARS-CoV-2 helicase (nsp13) is an NTP-dependent protein involved in essential processes such as viral genome replication, transcription and translation (64). In addition, nsp13 harbors RNA 5′-triphosphatase activity for RNA capping and shares high sequence identity (> 99%) with its ortholog in SARS-CoV, indicating a conserved helicase mechanism across Coronaviridae (65). Molecular docking revealed that the tea compounds could interact effectively with the SARS-CoV-2 helicase, with binding affinities ranging from -12.9 to -5.5 kcal/mol. Tannic acid, strictinin, rutin, epitheaflagallin 3-o-gallate, gallocatechin 3'-o-gallate, epitheaflagallin, gallocatechin-3-o-gallate, theaflavin-3,3'-digallate, epigallocatechin-3,3',-di-o-gallate and lupeol showed notable binding affinities of -12.9, -12.7, -11.3, -10.9, -10.7, -10.7, -10.5, -10.5, -10.4 and -10.2 kcal/mol, respectively (binding affinity threshold < -10 kcal/mol; Table 3). Of these, tannic acid, a tannin compound, showed a distinct binding affinity towards the SARS-CoV-2 helicase and formed strong hydrogen bonds with residues ASN177, ARG560, ARG178, ASP207, ARG409, ASN516 and HIS554. These binding sites have been implicated in blocking viral replication and disease progression (66), (67), (68).
Table 4
Summary of top seven ranked tea compounds screened against SARS-CoV-2 target receptors with their respective binding affinity, interacting residues, hydrogen bond distance and hydrophobic residues
SARS-CoV-2 targets | Compounds | Binding affinity (kcal/mol) | Residues interacting with the ligand | Hydrogen bond distance | Hydrophobic residues |
RdRp | Strictinin | -12.6 | LYS621, LYS621, SER814, ASP761 | 2.54, 2.00, 2.22, 2.48 | PRO620, TYR619, LYS798, TRP800, PHE812, CYS813, ASP452 |
Helicase | Tannic acid | -12.3 | ASN177, ARG178, ASN179, TYR180, PHE200, ASN516, HIS554, ASN557, ARG560 | 2.84, 3.32, 2.81, 2.37, 3.16, 2.46, 2.56, 2.5, 3.1 | TYR149, PRO175, PRO175, ASN177, TYR515 |
Helicase | Epigallocatechin 3,3',-di-o-gallate | -12.3 | ARG560, PRO514, THR410, SER486 | 2.58, 2.18, 2.47, 2.10 | PRO406 |
3CL protease | Epitheaflagallin 3-o-gallate | -11.4 | SER144, ASN142, GLU166 | 2.45, 2.68, 2.26 | MET165, LEU167, PRO168, GLY143, THR190, GLN192 |
3CL protease | Epigallocatechin 3,3',-di-o-gallate | -11.1 | GLN107, GLN110, GLN110, THR111, ASN151, HIS246, PHE294 | 3.3, 3.05, 2.24, 2.28, 3.37, 2.1, 2.71 | GLN107, PHE294, PHE112, ASP153 |
Glycoprotein | Strictinin | -10.8 | TYR251, LYS306 | 2.86, 2.27 | GLU214, THR257, LYS306 |
PL pro | Gallocatechin 3'-o-gallate | -10.2 | LYS217, TYR251, TYR251, THR259, TYR305 | 2.67, 2.22, 1.97, 1.96, 1.84 | GLU214, TYR305, LYS306 |
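Because short hydrogen bond distances are used as an argument for strong binding, the distances in Table 4 can be summarized per complex, as in this minimal sketch. Three rows of Table 4 are copied in; the unit is assumed to be ångström (the table does not state it), and the 2.5 cutoff for a "short" bond and the helper name `summarize_hbonds` are hypothetical choices.

```python
from statistics import mean

# Hydrogen bond distances from three rows of Table 4 (unit assumed to be angstrom).
hbond_distances = {
    ("RdRp", "Strictinin"): [2.54, 2.00, 2.22, 2.48],
    ("3CL protease", "Epitheaflagallin 3-o-gallate"): [2.45, 2.68, 2.26],
    ("PL pro", "Gallocatechin 3'-o-gallate"): [2.67, 2.22, 1.97, 1.96, 1.84],
}


def summarize_hbonds(distances: dict, short_cutoff: float = 2.5) -> dict:
    """Return (mean distance, number of bonds shorter than the cutoff) per complex."""
    summary = {}
    for complex_key, values in distances.items():
        short_count = sum(1 for value in values if value < short_cutoff)
        summary[complex_key] = (mean(values), short_count)
    return summary


hbond_summary = summarize_hbonds(hbond_distances)
```

With these rows, the PL pro complex has the shortest mean hydrogen bond distance (about 2.13) and the largest number of bonds under the assumed cutoff.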
Discussion
SARS-CoV-2, the virus causing the COVID-19 respiratory disease pandemic, has resulted in high mortality worldwide. Global efforts have been made to study the detailed mechanism of SARS-CoV-2 pathogenesis. Traditional practices and herbs used in our diet also possess excellent medicinal applications. Natural products have shown possible inhibitory effects on human coronaviruses and may have a prophylactic action that stops or at least slows down SARS-CoV-2 transmission (69). Tea is a common source of polyphenolic flavonoids: catechin, epicatechin, epigallocatechin, epicatechin gallate and epigallocatechin gallate account for 80–90% of total flavonoids, and kaempferol, quercetin and myricetin glycosides account for about 10% (70). Therefore, in this study, we evaluated the therapeutic activity of tea (Camellia sinensis) compounds that can potentially inhibit SARS-CoV-2 targets using an integrative computational approach.
Phylogenetic analysis of an emerging virus is a crucial part of understanding virus characterization, molecular evolution and viral lineages among families (71). First, we elucidated the evolutionary relationship between SARS-CoV-2 and 10 different beta coronaviruses. Based on the phylogenetic analysis, the whole genome of the Wuhan strain showed high homology (sequence similarity > 90%) with SARS-CoV and MERS-CoV, and shared lineage with the recent SARS-CoV-2 strains Delta, Alpha, Beta, Gamma and Omicron. The Omicron variant evolved directly from the Gamma variant and was also closely related to the Delta, Alpha and Beta strains. It is possible that the Omicron variant was in circulation for a long time before it was identified. Remarkably, SARS-CoV-2 and SARS-CoV possess high genomic similarity with pangolin coronavirus and civet coronavirus, respectively, which cluster in parallel with other beta coronaviruses. In addition, pangolin and civet are potential intermediate species for SARS-CoV-2 and SARS-CoV, respectively, which suggests them as potential reservoir species (72). Our phylogenetic analysis corroborated that SARS-CoV-2 has a distinct evolutionary relationship within the subgenus Sarbecovirus, as well as a close evolutionary ancestry with the subgenera Merbecovirus and Embecovirus.
ADMET evaluation and drug-likeness properties indicated that the compounds possess good gastrointestinal permeability and low toxicity. In addition, the tea compounds have feasible pharmacophore features that make them capable of binding to the 3CL protease binding sites. Molecular docking results showed that derivatives of catechin and theaflavin possess strong binding affinities towards SARS-CoV-2 targets. Specifically, nine potent natural compounds (strictinin, epigallocatechin 3,3',-di-o-gallate, epicatechin 3,5-di-o-gallate, gallocatechin 3'-o-gallate, epitheaflagallin, theaflavin-3-gallate, epitheaflagallin 3-o-gallate, rutin and tannic acid) had the lowest binding energies to SARS-CoV-2 targets, ranging from -12.6 kcal/mol to -8 kcal/mol. Strong binding affinity, short hydrogen bond distances and interaction at potential catalytic sites suggest that these compounds could be used as specific inhibitors against SARS-CoV-2. Further, the positive control compound grazoprevir displayed low binding energies to all five SARS-CoV-2 target receptors (-7.9 to -11.6 kcal/mol), comparable to those of the other compounds in the docking results. Additionally, common binding site interactions were found for grazoprevir and the catechin derivatives, suggesting that the tea compounds share the same potential interactions as grazoprevir in inhibiting SARS-CoV-2 targets. Epigallocatechin 3,3',-di-o-gallate is a potential catechin derivative that showed binding affinity for 3CL protease (-11.1 kcal/mol), glycoprotein (-10.2 kcal/mol), helicase (-10.4 kcal/mol), papain-like protease (-9.5 kcal/mol) and RdRp (-11.3 kcal/mol). Surprisingly, the compounds that showed high bioavailability in the ADMET analysis (epicatechin, quercetin, epigallocatechin, catechin, theophylline and caffeine) also docked with SARS-CoV-2 targets.
These outcomes indicate that natural compounds from tea can potentially interact with crucial targets of SARS-CoV-2 and may serve as candidates for COVID-19 prophylaxis. Taken together, this study provides potential leads for the clinical application of tea compounds in the prophylaxis of SARS-CoV-2.