3.1. Putative drug targets
14 drug target genes such as protein S, M, E, rep, 1a, 3a, 7a, 9b, nsp6, nsp7, nsp8, nsp10, nsp12 and RBD were identified based on extensive literature survey and database search. 11 targets possessed experimentally solved structures (Table1) and 3 targets were computationally modeled (Table 2). Spike glycoprotein, S (PDB ID: 1WYY, 6VXX, 6VYB) was identified to be one of the most important targets due to its involvement in mediating the host transfer by attaching the virion to the cell surface and interacting with the host leading to infection 64, 65. The post-fusion conformation of the spike glycoprotein is a hairpin structure comprising two heptads repeat regions that are stabilized by Asparagine and Glutamine residues forming hydrogen bonds between the two heptad region domains. The open and closed conformation of the surface spike glycoprotein act as important part of interaction with the host cell surface and transfer mediation, and in the closed conformation, three human Angiotensin-Converting Enzyme 2 (hACE2) recognition motifs were observed, therefore the open conformation is important for interaction with the hACE2 at the host cell surface initiating the viral entry 10. A receptor binding domain of the membrane protein (PDB ID: 6M17) was identified to bind to the human cellular receptor hACE2 that recognizes SARS CoV-2 29. The membrane protein M (PDB ID: 3I6G) is involved in morphogenesis and its interactions, envelope membrane protein E (PDB ID: 2MM4) induces apoptosis by activating host inflammasome and replicase protein rep (PDB ID: 1Q2W) is a multifunctional protein involved in transcription and replication of viral RNA, it also contains proteinases that cleave the polyprotein 66, 67. Replicase polyprotein 1a (PDB ID: 5RE4) is a proteinase that cleaves the N and C- terminal of replicase polyprotein and is also involved in suppressing the host gene expression 68. RNA-dependent RNA polymerase (PDB ID: 6M71) consists of three non-structural proteins nsp7 (PDB ID: 6M71_C), nsp8 (PDB ID: 6M71_B, D) and nsp12 (PDB ID: 6M71_A) 30. Protein3a that induces apoptosis 69, protein 7a (PDB ID: 1XAK) was identified to be dispensable for virus replication in cell culture 70, Protein9b (PDB ID: 2CME) is a lipid-binding protein, non-structural protein6, nsp6 initiates induction of autophagosomes, non-structural protein8 (nsp8) acts as primase in viral replication and non-structural protein10 (nsp10) is essential for viral mRNA cap methylation 71, 72, 73. Therefore, based on the relevant functions in the mechanism of virulence these proteins played, these were considered to be putative molecular targets in the present study for drug screening.
3.2. Hypothetical model building and validation of selected targets
The target proteins such as Protein3a, non-structural protein6 and non-structural protein10 did not possess experimentally solved structures; the 3D structures were computationally predicted. The sequence size of Protein3a, 6 and 10 are 274, 63 and 139 respectively and the 3D structures of these sequences were predicted by C-I-TASSer, which utilizes deep convolutional neural-network-based contact-map predictions 32
The theoretical model of Protein 3a is depicted in Fig. 1a. The energy of the modelled protein was initially predicted to be 1716e/kt by Anolea and was further minimized using ModRefiner to 879e/kt (Table 2). The secondary structure of the modelled protein comprised of 26.18% alpha-helices, 29.82% beta sheet, 10.18% beta turns and 33.82% coils as predicted by STRIDE web server depicted in Fig.1b. ProCheck analysis suggested that the Ramachandran plot of the model demonstrated 69% of the residues in the most favoured region, 35.8% residues in the allowed region and 5.2% residues in the outlier region (Fig. 1c). The stereo-chemical validation in terms of z-score obtained from ProSA was found to be -4.88, which is similar and comparable to the z-score of experimental structures (Fig. 1d). The VERIFY3D plot of this model indicated that 63.64% of residues possessed 3D-1D score>=0.2 that indicate good quality of the model (Fig.1e). The Z-score predicted by WHATIF showed -6.042 (Table 2). The overall quality of the model was validated by ERRAT, which predicted 30.71 as the quality factor (Fig. 1f). Thus, based on the computational validation, it is suggested that the hypothetical model of Protein 3a showed good stereo chemical validity, secondary structural alignment and potentai energy, which probably can be used as molecular target.
The hypothetical model of nsp6 visualized by PyMol is depicted in Fig. 2a. The energy of the protein model was minimized using ModRefiner from 366e/kt to 359e/kt as predicted by Anolea. The secondary structure of the modeled protein comprised of 70.49% alpha-helices, 9.84% beta sheet, 8.20% beta turns and 11.48% coils that was predicted by STRIDE web server is shown in Fig. 2b. ProCheck analysis suggested that the model showed good stereo-chemical stability. Ramachandran plot of the model is shown in Fig. 2c, suggested that 89.3% of the residues were present in the most favoured region, 9.1% residues located in the additionally allowed region and 1.6% in the outlier region. The stereo-chemical validation by ProSA in terms of Z-score found to be -3.33, which are reliable as they are close to the z-score of native structures (Fig. 2d). The plot obtained by VERIFY3D for nsp6 indicated that 50.82% of the residues showed 3D-1D score>=0.2 (Fig. 2e), which indicate the model quality. The Z-score predicted by WHATIF was found to be -5.67 (Table 2). The overall model quality predicted by ERRAT was found to be 39.62 (Fig. 2f). From the computational prediction, it is evident that the hypothetical model demonstrated structural and stereo-chemical quality and the model can be considered as potential molecular target in the absence of an experimental structure.
Similarly, the modelled structure of nsp10 is shown in Fig. 3a. The energy of the protein model minimized from -636e/kt to -740e/kt, which was predicted by Anolea (Table 2). The secondary structure of the modelled protein comprised of 19.42% alpha-helices, 25.90% beta sheet, 11.51% beta turns and 43.17% coils as predicted by STRIDE web server, which is depicted in Fig. 3b. Ramachandran plot of this model predicted by ProCheck showed 68.8%, 27.9% and 3.2% residues in the favored, allowed and outlier regions respectively (Fig. 3c). The stereo-chemical validation obtained from ProSA was found to be -4.05 in terms of Z-score, which is reliable as it is in the range of the z-score of the 3D experimental structures (Fig. 3d). The plot of theoretical model of nsp10 obtained by VERIFY3D indicated that 70.5% of the residues showed 3D-1D score>=0.2 (Fig. 3e) that indicated the model quality. The Z-score predicted by WHATIF showed the value of 0.334 (Table 2). The quality of the model further validated by ERRAT showed quality factor of 93.6 (Fig. 3f). Therefore, the computational validation showed the 3D model of nsp10 is ideal model in terms of stereo- chemical and structural stability and the model can be considered a target for the lead screening.
3.3. Computational screening of natural lead molecules
The post virtual screening of 92 natural compounds from PuBChem and ChemSpider databases showed that 48 of them qualified the drug-likeliness predicted by various filters available at PreADMET and SwissADME. The predicted drug likeliness features of the natural lead molecules are shown in Supplementary material, Table S2. When 48 lead molecules were further subjected to ADME prediction, 16 compounds demonstrated good skin and Caco2 cell permeability, human intestinal absorption, blood-brain barrier concentration, gastrointestinal absorption, lipophilicity, water-solubility essential for drug development. The ADME features of the selected molecules are shown in Supplementary material, Table S3. Further, when 16 molecules that qualified the ADME features were subjected to toxicity prediction, 4 molecules qualified toxicity features required for lead development that included carcinogenicity, mutagenicity, algae and fish toxicity, and hERG inhibition. Thus, the computer aided screening of the lead molecules suggested that out of 92 molecules elected, 04 molecules were qualified drug likeliness, ADME and toxicity features. The molecules selected were Hyoscyamine, Rotiorinol-C, Scutifoliamide-A and Tamaridone present in Datura stramonium, Chaetorium cupreum, Piper scutifolium and Tamarix dioica respectively. These natural lead molecules were prioritized as the lead molecules and the interaction of these molecules towards the 14 molecular targets of SARS-CoV-2 were computationally modeled and validated.
3.4. Molecular docking studies
The binding of four selected natural lead molecules such as Hyoscyamine, Rotiorinol-C, Scutifoliamide-A and Tamaridone towards 14 molecular targets (S, M, E, rep, 1a, 3a, 7a, 9b, nsp6, nsp7, nsp8, nsp10, nsp12 and RBD) of SARS-CoV-2 were predicted by molecular docking studies by AutoDock and Autodock vina. The best conformations generated were analyzed and scrutinized based on binding affinity, number of hydrogen bonds and other stabilizing interactions, and cluster RMSD. The binding of four selected natural molecules towards the prioritized molecule targets of SARS-CoV-2 predicted by the molecular docking studies are shown in Table 3.
Hyoscyamine ([(1S,5R)-8-methyl-8-azabicyclo[3.2.1]octan-3-yl](2S)-3-hydroxy-2-phenyl-propanoate), an alkaloid derivative and anticholinergic, derived from Datura stramonium (Jimson weed), which is known for inhibiting receptors in smooth and cardiac muscle, the sino-atrial and atrio-ventricular node for a slow ventricular response during atrial fibrillation [74,75]. Molecular docking studies suggested that this lead molecule showed good binding potential towards the spike glycoprotein in the post-fusion conformation (PDB ID: 1WYY), closed state (PDB ID: 6VXX) and open state (PDB ID: 6VYB) with the binding energy of -8.14, -6.0 and -5.7 kcal/mol respectively. The major interacting residues present in the binding cavity of post-fusion conformation of spike protein and the lead molecules included Ser924, Thr925, Gly928, Asp932, Gln1161, ILe1164, Asn1168 (Chain A), Ser924, Thr925, Gly928, Gln1161, Ile1164 (Chain B). The interaction is stabilised by two hydrogen bonds with Asn1168 of Chain A (Fig. 4a). It is observed that these residues located at one of the binding sites of the molecule as predicted by the Castp and Depth servers. Similarly, Trp104, Ile119, Val126, Ile128, Tyr170, Ser172, Ile203, Val227 were identified as the major residues present in the binding site of open state of spike protein where the ligand interacted and the interaction is stabilised by a hydrogen bond (Fig. 4b). The major interacting residues present at the closed state were identified to be Leu335, Pro337, Phe338, Gly339, Asp364 and Val367 (Fig. 4c). Most of the residues interacting with hyoscyamine were predicted to be the residues forming the binding pockets. It is clear that the conformation plays a major role in the binding of the ligand and maximum interaction is seen between the post-fusion conformation of the spike glycoprotein and the lead molecule in compassion with other conformation. The binding energy of the docked complex of lead molecules and replicase polyprotein 1ab (PDB ID: 1Q2W) was estimated to be -6.1 kcal/mol and the interacting residues such as Glu156, Met165, Asp187, Arg188, Gln189 and Gln192 were stabilised the binding along with a hydrogen bond (Fig. 4d). Similarly, the lead showed a binding energy of -5.4 kcal/ mol towards the RBD of the membrane protein (PDB ID: 6M17). The major interacting residues present in the binding cavity were Thr376, Val407, Ala411, Val433 and Tyr508 (Fig. 4e). Hyoscyamine showed binding energy of -0.7 kcal/mol (interacting residue: Gly) towards membrane protein (PDB ID: 3I6G) (Fig. 4f) and the lead molecule exhibited the binding energy of -2.2 kcal/mol towards the envelope protein (PDB ID: 2MM4) in which Lys63 and Asn64 were found to be the major interacting residues (Fig.4g). The binding energy of the docked complex of replicase polyprotein1a (PDB ID: 5RE4) and Hyoscyamine was estimated to be -6.0 kcal/mol. The major interacting residues at the binding site of receptor included Trp218, Asn221and Leu271 forming a hydrogen bond (Fig.4h). The binding energy of the docked complexes of Protein 3a, 7a (PDB ID: 1XAK) and 9b (PDB ID: 2CME) were estimated to be -5.3, -4.4 and -4.4 kcal/mol respectively. The major interacting residues of protein 3a included Phe56, Ile63, Lys66, Tyr189 and Glu191 (Fig. 4i). Similarly, interacting residues of protein 7a included Tyr3, Tyr5, His47, Gln61 (Fig. 4j) and the major residues present to the binding cavity of protein 9b were found to be Arg68, Ala69 and Phe70 (Fig. 4k). The binding affinities towards RNA dependant RNA polymerase i.e, towards nsp7 (PDB ID: 6M71_C), nsp8 (PDB ID: 6M71_B, D) and nsp12 (PDB ID: 6M71_A) were estimated to be -5.3, -5.9 and -5.4 kcal/mol respectively. The nsp7 interacted with the lead molecule via Thr46, Phe49 and Val53 (Fig. 4l). Similarly, the major interacting residues present at the binding site of nsp8 was observed to be Leu128, Val130, Thr141 and Tyr149 (Fig. 4m) and the interacting residues were found at the binding pocket of nsp12 were Phe412, Phe415, Phe440 and Phe843 acted as the binding residues (Fig. 4n). The binding affinities towards non-structural proteins 6 and 10 are estimated to be -5.2 and -6.1 kcal/mol respectively. The major interacting residues present at the binding site of non-structural protein 6 were identified to be Asp6, Phe7, Leu15, Leu40 and Ile60 (Fig. 4o). Similarly, the interacting residues present in the binding pocket of non-structural protein 10 were identified to be Ile55, Cys74, Tyr76, His83, Asp91, Leu112, Thr115 and Val116 along with a hydrogen bond (Fig. 4p). Thus, the interaction modelling of natural lead Hyoscyamine towards all the selected targets of SARS-CoV-2 suggested that, the highest binding potential of Hyoscyamine was observed towards the spike glycoprotein in the post-fusion conformation with a binding energy of -8.14 kcal/mol and several stabilising forces and interactions in comparison with the other selected molecular targets.
Rotiorinol-C ((9R,9aR)-3-acetyl-9-hydroxy-6-[(1E,3E,5S)-7-hydroxy-3,5-dimethylhepta-1,3-dienyl]-9a-methyl-9H-furo [3,2-g]isochromen-2-one), an azaphilone derived from the fungus Chaetomium cupreum (Dark-walled mold) is known to exhibit antimicrobial activity [76]. It showed a good binding potential towards the spike glycoprotein in the post-fusion conformation (PDB ID: 1WYY), closed state (PDB ID: 6VXX) and open state (PDB ID: 6VYB) with the binding energy of -6.5, -6.2 and -6.3 kcal/mol respectively. The interacting residues present in post-fusion conformation included Gln931, Asn936, Asn942, Val1157, Asn1159, Gln1161, Asn1168 and Lys1172 forming two hydrogen bonds with the lead molecule (Fig. 5a). The interacting residues in the closed state were observed to be Arg34, Thr208, Pro209, Leu212, Pro217, Gln218 and Phe220 forming a hydrogen bond between Phe220 and the lead molecule (Fig. 5b). Similarly, the major interacting residues present at the open state of spike glycoprotein were identified to be Thr33, Phe59 and Asp287 (Fig. 5c). The superior interaction of the Rotiorinol-C was observed in the post-fusion conformation of the spike glycoprotein compared to that of open and closed state conformations. The binding energy of the docked complex of replicase polyproteins 1ab (PDB ID: 1Q2W) and the lead molecules was measured to be -9.82 kcal/mol and the major residues involved in the binding were observed to be Arg4, Lys5, Tyr126, Gln127, Arg131, Asp289 and Glu290. The interactions stabilised by five hydrogen bonds (Fig. 5d). The lead showed a binding energy of -6.1 kcal/mol towards the receptor binding domain of the membrane protein (PDB ID: 6M17) and the major residues involved in the binding are Cys336, Phe338, Gly339, Ala363, Asp364, Val367, Ser371, Ser373 and Phe374, forming a hydrogen bond with Ser373 (Fig. 5e). The lead demonstrated binding energy of -0.6 and -2.2 kcal/mol towards the membrane protein (PDB ID: 3I6G) (Fig. 5f) and envelope protein (PDB ID: 2MM4; residues: Lys63 and Asn64) (Fig. 5fg) respectively. The binding affinity of lead towards the replicase polyprotein1a (PDB ID: 5RE4) was estimated to be -6.7 kcal/mol. The major interacting residues involved in the binding were Lys137, Thr199, Tyr239, Leu286, Leu287, Glu288, and Asp289 forming three hydrogen bonds (Fig. 5h). The binding energy towards Protein 3a, 7a (PDB ID: 1XAK) and 9b (PDB ID: 2CME) showed -6.3, -5.4 and -5.0 kcal/mol respectively by the lead molecule. The major interacting residues present at the cavity of protein 3a included Phe28, Val29, Arg68, Ala72 and Val90 (Fig. 5i). Similarly, the major interacting residues present in the protein 7a were identified to be Tyr3, His47, Arg57, Thr59 (Fig.5j) and the interacting residues available in protein 9b included Ser56, Leu65, Glu66, Ala67, Arg68, Ala69, Phe70 and Ser72, and the interaction is stabilised by a hydrogen bond (Fig. 5k). The binding affinities of the lead molecules towards RdRp- nsp7 (PDB ID: 6M71_C), nsp8 (PDB ID: 6M71_B, D) and nsp12 (PDB ID: 6M71_A) were estimated to be -5.9, -6.0 and -6.6 kcal/mol respectively. The major residues interacting with nsp7 included Lys2, Asp5, Thr9, Thr46, Phe49, Met52 and Val53 (Fig. 5l). Similarly, the interacting residues present in the cavity of nsp8 were found to be Pro133, Asp134, Tyr135, Trp182 and Pro178 forming a hydrogen bond with Tyr135 (Fig. 5m) and the interacting residues observed in nsp12 were included Arg181, Gln184 and Asn213 (Fig. 5n). The binding energy of non-structural proteins 6 and 10 with the lead molecules were estimated to be -5.6 and -6.8 kcal/mol respectively. The major interacting residues present in the non-structural protein 6 were Gln8, Lys38, Lys42 and Leu44 forming a hydrogen bond (Fig. 5o). Similarly, the main interacting residues available in the non-structural protein10 included Ile55, His83, Lys96 and Val116, forming a hydrogen bond with His83 (Fig.5p). The molecular docking studies suggested that highest binding affinity of Rotiorinol-C was observed towards the replicase polyprotein 1ab with a binding energy of -9.82 kcal/mol and stabilizing interactions when compared with the binding of Rotiorinol-C and prioritised targets of SARS-CoV-2.
Scutifoliamide-A ((2E, 4Z)-5-(1, 3-Benzodioxol-5-yl)-N-isobutyl-2, 4-pentadienamide) derived from Piper scutifolium (Pepper) is known for its antimicrobial activity 77. It showed good binding potential towards the spike glycoprotein in the post-fusion conformation (PDB ID: 1WYY), closed state (PDB ID: 6VXX) and open state (PDB ID: 6VYB) with the binding affinity of -6.4, -5.4 and -6.6 kcal/mol respectively. The interacting residues present in the post-fusion conformation were identified to be Asn910, Ala940, Thr943, Leu944, Gln947, Asn951, Leu1178, Leu1181 and Glu1183 (Fig. 6a). The interacting residues present in the binding site of closed state were observed to be Leu118, Val120, Phe135, Leu141 and Leu241 (Fig. 6b). Similarly, the main interacting residues present in the binding cavity of open state were found to be Phe823, Asn824, Val826, Pro863, Pro1057 and His1058 forming a hydrogen bond (Fig. 6c). The binding energy of the lead molecule towards the replicase polyproteins 1ab (PDB ID: 1Q2W) was identified to be -6.9 kcal/mol and the main interacting residues included Arg4, Lys5 and Phe291 (A chain), Lys 5 (B chain) (Fig. 6d). The lead showed a binding energy of -5.9 kcal/mol towards the receptor binding domain of the membrane protein (PDB ID: 6M17) interacting with four residues such as Gln493, Tyr495, Gly502 and Tyr505 and forming hydrogen bonds with Gln493 and Gly502 (Fig. 6e). The lead showed binding energy of -0.7 and -0.8 kcal/mol towards the membrane protein (PDB ID: 3I6G) (Fig. 6f) and envelope protein (PDB ID: 2MM4) (Fig. 6g) respectively. The binding affinity of the lead molecules towards the replicase polyprotein1a (PDB ID: 5RE4) was identified to be -6.8 kcal/mol and the main interacting residues included Val212, Arg217, Leu220, Gln256, Ile259, and Asp263 (Fig. 6h). The binding energy towards Protein 3a, 7a (PDB ID: 1XAK) and 9b (PDB ID: 2CME) showed -5.4, -5.6 and -4.4 kcal/mol respectively. The main interacting residues present at the protein 3a included Cys130, Trp131, His150, His204 and His227, and the interaction is stabilised by a hydrogen bond (Fig. 6i). Similarly, the major interacting residues present in the protein 7a included Gln6, Cys8, Val9, Thr12, Leu16 and Lys17 and two hydrogen bonds were formed (Fig. 6j) and the major binding residues available in the protein 9b included Ala58, Arg68, Ala69, Phe70 and Ser72 (Fig. 6k). The binding affinities of lead towards RdRp- nsp7 (PDB ID: 6M71_C), nsp8 (PDB ID: 6M71_B, D) and nsp12 (PDB ID: 6M71_A) were estimated to be -5.9, -6.0 and -6.1 kcal/mol respectively. The main residues available in nsp7 included Asp5, Thr9, Thr46, Phe49, Glu50 and Val53 and one hydrogen bond formed with Thr9 (Fig. 6l). Similarly, the interacting residues present in nsp8 were observed to be Leu128, Met129, Pro133, Thr141 and Tyr149 (Fig. 6m) and the major amino acid residues involved in the binding of nsp12 included Asp484, Ile488, Gln573, Ser578 and Ala581 (Fig. 6n). The binding affinities towards non-structural proteins 6 and 10 were estimated to be -5.5 and -5.6 kcal/mol respectively. The interacting residues present in the non-structural protein 6 were observed to be Leu16, Ile17, Arg20 and Thr21 and a hydrogen bond formed with Arg20 (Fig. 6o). Similarly, the major interacting residues present in the non-structural protein10 included His83, Leu92, Asn114, Thr115 and Val116 and formed two hydrogen bonds with Val116 (Fig. 6p). The computational modelling suggested that highest binding potential of Scutifoliamide-A was observed towards the replicase polyprotein 1ab with a binding energy of -6.9 kcal/mol and the interaction stabilised by several forces when compared with the binding of Scutifoliamide-A and other selected targets of SARS-CoV-2.
Tamaridone (5,7-dihydroxy-2-(2-hydroxy-4-methoxyphenyl)-6-methoxychromen-4-one), a flavone derived from Tamarix dioica (Lal jhau / Ban jhau) showed good binding potential towards the spike glycoprotein in the post-fusion conformation (PDB ID: 1WYY), closed state (PDB ID: 6VXX) and open state (PDB ID: 6VYB) with the binding energies of -7.2, --7.3 and -6.8 kcal/mol respectively. The interacting residues present in the binding pocket of post-fusion conformation of spike protein included Asp931, Asn935, Lys1162, Asp1165, Asn1168 and Lys1172 (Fig.7a). The interacting residues at the binding pocket of the closed state were observed to be Ala520, Phe559, Phe562, Gln563, Gln564, Phe565, Gly566 and Arg567 that formed 4 hydrogen bonds with the lead molecule (Fig. 7b). Similarly, the major residues available at the binding site of open state were Thr732, Thr778, Ser780, His1058, Pro863, that form a hydrogen bond with the ligand (Fig. 7c). The binding affinity of the lead molecule towards the replicase polyprotein 1ab (PDB ID: 1Q2W) was estimated to be -6.3 kcal/mol and the main interacting residues were included Gly11, Lys12, Glu14 and Gly15 forming a hydrogen bond (Fig. 7d). The binding affinity towards the receptor binding domain of the membrane protein (PDB ID: 6M17) was identified to be -6.5 kcal/mol and the major interacting residues identified to be Thr345, Arg346, Ser349, Lys441, Lys444, Asn448, Asn450 and Tyr451 and formed a hydrogen bond with Lys444 (Fig. 7e). The ligand showed binding energy of -0.8 and -3.1 kcal/mol towards membrane protein (PDB ID: 3I6G) (Fig. 7f) and envelope protein (PDB ID: 2MM4) (Fig. 7g) respectively. The binding potential of lead towards the replicase polyprotein 1a (PDB ID: 5RE4) was estimated -7.0 kcal/mol. The major interacting residues present in the binding site of replicase polyprotein 1a were found to be Leu87, Lys137, Thr190, Tyr237, Tyr239, Leu285 and one hydrogen bond formed with the ligand (Fig. 7h). The binding energy towards protein 3a, 7a (PDB ID: 1XAK) and 9b (PDB ID: 2CME) showed -6.6, -5.7 and -5.6 kcal/mol. respectively. The main interacting residues present in protein3a included Gln213, Ile263, Val255, and Thr270 (Fig.7i). Similarly, the major interacting residues available in the binding cavity of protein 7a included Val9, Thr12, Val14, and Arg65 (Fig. 7j) and the main amino acids that interacted with the protein9b were Arg68, Gln71 and Ser72 (Fig. 7k). The binding affinities of RdRp-nsp7 (PDB ID: 6M71_C), nsp8 (PDB ID: 6M71_B, D) and nsp12 (PDB ID: 6M71_A) were estimated to be -5.9, -5.7 and -6.7 kcal/mol respectively. The major residues present at the binding site of nsp7 were Lys2, Val6, Thr9, Phe49, Met52 and formed one hydrogen bond with Thr9 (Fig. 7l). Similarly, the main interacting residues present in nsp8 were identified to be Leu128, Val130, Val131, Thr141 and Tyr149 (Fig. 7m) and the main residues interacted with nsp12 were Lys47, His133, Asn138 and Lys780 (Fig. 7n). The binding affinities towards non-structural proteins 6 and 10 were estimated to be -5.7 and -6.7 kcal/mol respectively. The major interacting residues of non-structural protein6 were observed to be Lys48, Ser50, Leu52, Asp53 and Gln56 and two hydrogen bonds with Leu52 were formed (Fig. 7o). Similarly, the main amino acids that were interacted with non-structural protein 10 included Pro37, Asn105, Asp106, Phe110, Lys113, Asn114 and Tyr126 and the interaction also stabilised by two hydrogen bonds formed with Asn105 and Asn114 (Fig.7p). Thus, the interaction modelling of Tamaridone towards 14 prioritised targets of SARS-CoV-2 suggested that, the lead showed highest binding potential to the spike glycoprotein with a binding energy of -7.2 kcal/mol and several stabilising forces and interactions were evident in comparison with the binding of this lead with other selected molecular targets.
Similarly, the binding affinities of two standard drugs, currently in practice against COVID-19, towards their usual targets were modelled by molecular docking studies. The binding affinity of the conventional drugs Chloroquine and Hydroxychloroquine against their usual drug targets such as glutathione S transferase of Plasmodium falciparum and human angiotensin-converting enzyme 2 (hACE2) respectively were predicted by molecular docking. The interaction modelling suggested that when Chloroquine (DrugBank ID: DB00608) docked against its usual target Glutathione S transferase (PDB ID: 1OKT), the complexes showed a binding energy of -3.71 kcal/mol. The major residues that interacted were Glu90, Leu91, Glu93, Phe94 and Asp97 (Fig. 8a & b). The number of intearctig residues at the binding pockets was comparatively less than the binding of natural molecules to the targets selected in this study. Similarly, when Hydroxychloroquine (DrugBank: DB01161) docked against its usual target human angiotensin-converting enzyme 2 (hACE2) (PDB ID: 1R42), the best docked conformation showed a binding energy of -1.0 kcal/mol. It is revealed that only Gln287 interacted with the ligand (Fig. 8c & d). Further, there were no hydrogen bonds formed. Thus, from the molecular docking studies, it is clear that the natural lead molecules prioritized in the current study showed better binding towards the selected molecular targets in terms of binding energy, number of hydrogen bonds and other weak interactions and number of amino acids involved in the binding when compared with the binding of two currently used antiviral drugs towards their respective molecular targets. Thus, the present study suggested that the natural molecules probably have superior binding and interaction potential towards the molecular targets of SARS-CoV-2 when compared with the comparative control, the binding of conventional drugs and their targets.
3.5. Molecular dynamic simulation studies
Among all the docked complexes, four best docked complexes in term of the minimum binding energy (kcal/mol), cluster RMSD, and number of stabilizing interaction selected from the docking studies (The best conformation of spike glycoprotein and Hyoscyamine, replicase polyprotein 1ab and Rotiorinol-C, replicase polyprotein 1ab and Scutifoliamide A, Spike glycoprotein and Tamaridone) were simulated in order to confirm the dynamics and stabilities of the complexes at 100ns using NVT ensemble. Similarly, the docked complex of Chloroquine and its target glutathione-S-transferase was simulated to compare the binding and stability in comparison with the stability of the natural lead molecules and selected targets.
The trajectories obtained from the molecular dynamic simulation of the best docked conformer of Hyoscyamine and the post-fusion conformation of the spike glycoprotein (PDB ID: 1WYY) complex are shown in Fig. 9. The MD simulation studies suggested that the RMSD of the protein showed a deviation from 8 to 16 Å in100ns indicated substantial change in conformation of the target protein during the simulation period (Fig. 9a). The variation in the RMSD of the protein is due to the binding of Hyoscyamine, which indicate better binding and stability in comparison with comparative control. The protein RMSF deviating from 2.0-14.0Å throughout the protein and more than 60% of the residues showed >10Å, which indicated high fluctuations in the protein along with the C- and N-terminals is depicted in Fig. 9b. The ligand RMSF ranging from 4.0-6.0Å indicated some fluctuations with respect to their entropic role and the interactions with the target protein during the binding is shown in Fig. 9c. The protein–ligand interactions throughout the simulation run time during which prominent water bridges, hydrogen bonds and hydrophobic interactions were observed as shown in figure Fig. 9d. The molecular interactions that occured during more than 10% of the MD simulation period are illustrated in Fig. 9e. The MD studies revealed that the ligand binds to the protein with hydrophobic interaction with Ile1164 of chain B and a polar bond with Gln1161 of chain A. The interactions between the ligand and the protein are shown in two panels (Fig. 9f). The panel at the top represented the trajectory of total contacts such as hydrogen bonds, ionic bonds, water bridges and hydrophobic interactions of the protein with the ligand during the simulation which was observed to be constant throughout, whereas the bottom panel depicted that the number of residues involved in the interaction with the ligand. It was observed that the 24 amino acids were interacted with the ligand throughout the trajectory frame in which 13 residues belonged to chain A and 9 residues belonged to chain B of the protein, that indicated the stability in binding thereby probably inhibit the function of the target protein. The protein and ligand structural conformational changes during the MD simulation studies are shown in Supplementary materials Fig. S1 which represent the structural changes of the post-fusion conformation of the spike glycoprotein and Hyoscyamine that was undergone MD simulation respectively. The secondary structure information of the spike glycoprotein during MD simulation is depicted in Fig S1a, showed the monitoring of 51.14% secondary structural elements (SSE) of each of the residues over time (Fig. S1b) and Fig. S1c showed the percentage of SSE composition throughout the simulation. Fig. S1d showed the 2D representation of the ligand Hyoscyamine, Fig. S1e showed the 2D representation with the colour-coded rotatable bond, Fig. S1f showed various properties of ligands such as ligand RMSD, radius of gyration (rGyr), intra-molecular hydrogen bonds (intraHB), molecular surface area (MolSA), solvent accessible surface area (SASA) and polar surface area (PSA) monitored throughout the simulation and Fig. S1g showed the torsion profiles of the ligand corresponding to the rotatable bond. The dial and bar plot were used to summarize the conformational evolution of every rotatable bond in the ligand throughout the simulation. Thus, the MD simulation studies revealed the stability of the docked complexes of Hyoscyamine-spike glycoprotein of SARS-CoV-2 id better than the stability of the docked complexes of the control chloroquine and its usual target (Fig. 13). Further, the interaction of the natural ligand and spike glycoprotein is much better and stable than the interaction of the control and its target. Thus, the present study suggested that natural molecule Hyoscyamine probably act as potential lead molecule against spike glycoprotein of SARS-CoV-2.
The MD simulation trajectories of Rotiorinol-C and the replicase polyprotein 1ab (PDB ID: 1Q2W) are depicted in Fig. 10. The simulation studies suggested that the target protein showed a deviation from 1.8 to 2.7 Å in its RMSD during simulation at 100ns run time (Fig.10a). This variation in RMSD probably due to the binding of Rotiorinol-C to the protein target. The protein RMSF was observed to be deviated from 0.6 to beyond 4.8 Å with majority of the residues at 1.2 to 2.8 Å, around 40% of the residues at 3.6 Å and above, that indicated the fluctuations in the protein structure during the simulation period (Fig. 10b). Similarly, ligand RMSF observed to be fluctuating between 0.5 – 2.0 Å, due to the variation in entropy caused by binding to the target protein (Fig. 10c). The protein- ligand interactions throughout the simulation are shown in Fig. 10c depicted that hydrogen and hydrophobic interactions are the major stabilizing forces. The molecular interactions that occurred during more than 10% of the MD simulation run time are shown in Fig. 10d and the ligand interacted with the protein with Phe3 of chain B, electrostatic negative interactions with Glu288, Asp289, Glu290 of chain A and Glu288 of chain B, electrostatic positive interactions with Arg4, Arg131, Lys137 of chain A and Arg4, Lys5 of chain B, polar interactions with Gln127 of both chain A and B. The interactions between the ligand and the protein are shown in two panels (Fig. 10f). The top panel represented the trajectory of major contacts such as hydrogen bonds, ionic bonds, water bridges and hydrophobic interactions with the ligand during the simulation, which was observed to be constant up to 80ns with minor increase subsequently, whereas the bottom panel showed the number of residues involved in the interaction with the ligand and several amino acids were observed to interact with the ligand throughout the trajectory frame in which 10 residues belong to chain A, and 10 residues to belonged to chain B, indicating stable binding thereby probably inhibiting the target protein. Arg4, Gln127 of chain A and Lys5, Gln127 of chain B of the target showed constant interaction with the ligand throughout the simulation. The protein and ligand structural information during the MD simulation studies are shown in Supplementary material Fig. S2. The figure represents the structural changes of the replicase polyprotein 1ab and Rotiorinol-C during MD simulation. The secondary structure information of the target protein is shown in Fig S2a, the total 39.92% SSE of each of the residues over time was monitored in which 15.61% were identified to be helices and 24.31% was sheet are shown in Fig S2b and the percentage of SSE composition throughout the simulation is shown in Fig S2c. The 2D structure of Rotiorinol-C is represented in Fig S2d and the rotatable bonds associated with the ligands are shown in Fig S2e. The major properties of ligands monitored throughout the simulation are shown in Fig S2f and the torsion profiles of the ligand with respect to the rotatable bond are shown in Fig S2g. The dial and bar plot showed the conformational evolution of every rotatable bond in the ligand throughout the simulation. Thus, the MD simulation suggested that Rotiorinol-C demonstrated a better binding and stability in its interaction with replicase polyprotein 1ab in comparison with MD simulation studies of chloroquine and its usual target. Thus, this study revealed that natural lead Rotiorinol-C probably act as one of the ideal lead molecule against replicase polyprotein 1ab of SARS-CoV-2.
The trajectories obtained when simulating the complex of Scutifoliamide-A and replicase polyprotein 1ab (PDB ID: 1Q2W) of SARS-CoV-2 are depicted in Fig. 11. The analysis of the RMSD trajectory of the protein suggested that the protein showed a substantial deviation from 5.0 to 45.0 Å in RMSD during 100 ns that indicated major conformational changes of the target protein (Fig. 11a). The high variation in the RMSD of the protein was probably due to the binding of Scutifoliamide-A with the target protein. The protein RMSF of 4.0-28.0Å observed throughout the protein and more than 40% of the residues showed >20Å, indicated that there were substantial fluctuations in the protein as depicted in Fig. 11b. The ligand RMSF ranged between10.0-25.0Å, which indicates that the fluctuations corresponding to their high variation in entropy due to the binding with the target protein (Fig. 11c). The protein–ligand contacts throughout the simulation run time are represented as a histogram (Fig. 11d), in which water bridges, hydrogen bonds and fewer hydrophobic interactions were observed. The analysis of the molecular interactions occurred during more than 10% of the MD simulation period revealed that the ligand binds to the protein by a polar bond with Gln127 of chain A and positive electrostatic interaction with Arg4 of chain A and Lys5 of both chain A and B (Fig. 11e). The interactions between the ligand and the protein are shown in Fig. 11f. The top panel represented the trajectory of the major stabilizing forces such as hydrogen bond, water bridges and hydrophobic interactions of the protein with the ligand during the simulation, whereas, the bottom panel represented the number of residues involved in the interaction with the ligand throughout the trajectory frame were 11 of which 8 residues belonged to chain A and 3 belonged to chain B. Prominent interactions of Scutifoliamide-A were observed with Lys5 of chain B, followed by Arg4, Lys5 and Gln127 of chain A. The protein and ligand conformational changes in the structures during the MD simulation studies are shown in Supplementary material Fig. S3. The secondary structure information of the target protein is depicted in Fig. S3a, the total 38.57% SSE of each of the residues over time was observed in which 16.47% were helices and 22.10% was sheets (Fig. S3b) and the percentage of SSE throughout the simulation are shown in Fig. S3c. Scutifoliamide-A is represented in 2D (Fig. S3d) and rotatable bonds of the ligand shown in Fig. S3e. The features of the ligand monitored throughout the simulation are shown in Fig. S3f and the torsion profiles of the ligand corresponding to the rotatable bond are shown in Fig. S3g. The MD simulation suggested that, although there are some structural and conformations variation for the docked complex of Scutifoliamide-A- replicase polyprotein 1ab, the complex showed stability during the simulation. Further, in comparison with the binding of the control and its targets, the binding of this complex showed better binding, thus, the natural molecule can be considered as one of the probable lead molecules against replicase polyprotein 1ab of SARS-CoV-2.
The MD simulation trajectories of Tamaridone and the spike glycoprotein in its post-fusion conformation (PDB ID: 1WYY) are shown in Fig. 12. The MD simulation studies suggested that the protein showed a deviation from 8.0 to 15Å in its RMSD during the 100ns run time (Fig. 12a). This variation was probably due to the binding of Tamaridone to the spike glycoprotein and undergone conformational changes. The protein RMSF was observed to be deviated from 3.0 to 16.0Å with majority of the residues below 6Å, residues between 50 and 100are around12, residues between 175 and 200 showed 12 to 16 Å RMSF variations indicating the fluctuations in the conformation of the target during the simulation period (Fig. 12b). Similarly, ligand RMSF showed that there were fluctuation between 7.5 – 10.0 Å, due to the change in entropy caused by binding to the target protein (Fig. 12c). The protein and ligand interactions throughout the simulation are represented in Fig. 12d, which showed water bridges, hydrogen and hydrophobic interactions are the major stabilizing forces between the receptor and ligand. The molecular interactions that occurred during more than 10% of the MD simulation run time are shown in Fig. 12e and Tamaridone was observed to be interacted with protein forming hydrophobic interactions with Val1170 of chain A and Tyr1187 of chain B, and polar interactions with Thr923 of chain A. The contact between the ligand and protein throughout the simulation is shown in Fig. 12f. The panel on the top illustrated the trajectory of total contacts such as hydrogen bonds, water bridges and hydrophobic interactions of the protein with the ligand during the simulation, whereas the panel at the bottom revealed that the number of residues involved in the interaction with the ligand throughout the trajectory frame were 30 of which 22 residues belong to chain A and 8 residues to chain B. The major interacting residues of chain A included Ile916, Ser919, Asp1165, Val1170 and the residues make contact with chain B included Asp932, Asn935 and Tyr1187. The protein and ligand structural information during the MD simulation studies are shown in Supplementary material Fig. S4, which represented the structural variation of the spike glycoprotein and Tamaridone observed during the simulation. The secondary structure information of the target protein is shown in Fig. S4a, the total 58.96% SSE of each of the residues over time was monitored of which 58.96% identified to be helices (Fig. S4b) and the percentage of SSE throughout the simulation are shown in Fig. S4c. The 2D structure of Tamaridone and its rotatable bonds are represented in Fig. S4d and Fig. S4e respectively. The major features of the ligands during the MD simulation are shown Fig. S4f and the torsion profiles of the ligand inline with the rotatable bond are shown in Fig. S4g. Hence, the MD simulation studies suggested that the docked complexes of Tamaridone and spike glycoprotein of SARS-CoV-2 are stable throughout the simulation, and the lead molecules demonstrated better binding and dynamics in comparison with the binding of the control antiviral drug to its usual targets (Fig. 13). Thus, the present study prioritized that natural molecule Tamaridone also can be used as potential lead molecule against spike glycoprotein of SARS-CoV-2.
The trajectories obtained from the molecular dynamic simulation of the docked complex of the control drug Chloroquine and its target glutathione-S-transferase complex is shown in Fig. 13. The simulation results suggested that the RMSD of the protein showed a deviation from 1.6 to 3.2 Å in 100ns indicating a less or no conformational change of the target protein during the simulation period (Fig. 13a). The variation in the RMSD of the protein is highly likely to be due to the binding of the conventional drug Chloroquine. Protein RMSF was found to be deviating from 0.8-2.3Å throughout the protein but the residues between 35 and 55 showed a rapid increase from 2.3 to 4.1Å which indicates fluctuations in the protein (Fig. 13b). Ligand RMSF deviated from 8.5-6.0Å indicated fluctuations corresponding to their entropic role in the binding event (Fig. 13c). The protein–ligand interactions throughout the simulation run time during which the water bridges, ionic bonds, hydrogen bonds and hydrophobic interactions were observed as shown in Fig. 13d. The molecular interactions that occur during >10% of the MD simulation run time are depicted in Fig. 13e. The analysis revealed that it binds to the protein with hydrophobic interactions with Phe94 and Tyr95, and an electrostatic interaction with Lys141. The interactions between the ligand and the protein depicted in Fig. 13f in two panels. The top panel represented the trajectory of total contacts such as hydrogen bonds, ionic bonds, water bridges and hydrophobic interactions of the protein with the ligand during the simulation which was observed to be 2 and rises to 4 after 60ns whereas the bottom panel showed that the number of residues involved in the interaction with the ligand. It was observed that the amino acid Phe94 showed maximum interaction during the first 50ns and Tyr95 showed maximum interaction during the second 50ns of the simulation. The protein and ligand structural conformational changes during the MD simulation studies are shown in Supplementary material, Fig. S5 which represent the structural features observed in the MD simulation of glutathione-S-transferase and chloroquine respectively. The secondary structure information of the target protein is illustrated in Fig. S5a, the monitoring of the SSE of each of the residues over time is shown in Fig. S5b and the percentage of secondary structural elements (SSE) composition throughout the simulation are shown in Fig. S5c. The 2D structure of the ligand is represented in Fig. S5d, the 2D representation with the colour-coded rotatable bond is shown in Fig. S5e, various properties of ligands monitored throughout the simulation are depicted in Fig. S5f and the torsion profiles of the ligand corresponding to the rotatable bond are represented in Fig. S5g. The torsion plot (dial plot and bar plot) summarized the conformational evolution of every rotatable bond in the ligand throughout the simulation.
3.6. Binding potential of natural lead calculated by MM-PBSA
Further, the binding potential of the complexes of natural lead molecules and the prioritized targets are confirmed by the energy calculations by MM-PBSA approaches. Gromacs tool estimated and calculated binding affinities of molecular targets and ligand complexes. The interaction energy of the Chloroquine and its usual target glutathione-S-transferase estimated to be -33.12 kcal/mol. Analogously, the interaction energy of the complexes of Hyoscyamine and the spike glycoprotein, Rotiorinol-C and replicase polyprotein 1ab, Scutifoliamide-A and replicase polyprotein and Tamaridone and the spike glycoprotein were calculated by MM-PBSA approaches were estimated to be -48. 25 kcal/mol, -38. 40 kcal/mol, -45. 97 kcal/mol and -38. 79 kcal/mol respectively. Thus, the energy calculations by MMPBSA showed that the interaction of four natural lead molecules towards the prioritized targets of COVID-19 demonstrated better binding interaction compared to that of the binding of control drug with its target.
From the molecular docking, MD simulation and energy calculation studies, it is clear that the four potential natural lead molecules demonstrated better binding energy, stability and dynamic during MD simulations towards the prioritized drug targets of SARS-CoV-2 than the binding of Chloroquine and its target glutathione-S-transferase. Thus, the present study revealed that the natural molecules and the prioritized drug targets are potential candidates for the development of therapeutic agents against COVID-19.
There are several recent studies emphasised on the application and scope of computer aided molecular design towards the lead development and molecular target identification towards COVID-19. Recent studies suggested that natural lead compounds screened against the targets of SARS-CoV-2 genes such as spike glycoprotein (S), non-structural proteins (nsp), envelope protein (E), membrane protein (M) provide new insights for drug repositioning to treat COVID-19 infection 78. The COVID-19 spike binding site to the cell-surface receptor (GRP78) was predicted and it was identified that the binding is more favorable between region 480-488 of the spike protein model and GRP78 with the predicted binding affinity of -9.8 kcal/mol 79. Mpro identified as a potential drug target by Jin et al (2020) revealed a mechanism-based inhibitor, N3 by virtual screening, which establishes a new paradigm for treating infectious diseases 80. Mpro was utilised to screen novel lead molecules by virtual screening and the drug Beclabuvir, with a predicted binding energy of -10.4 kcal/mol by Autodock vina is currently being suggested for clinical trials 81. Similarly, recent studies showed that the phytochemicals from medicinal plants such as Psorothamnus arborescens, Myrica cerifera and Hyptis atrorubens showed good binding potential towards the target Chymotrypsin like protease (CL-pro) 82. SARS-CoV nsp6 has been shown to activate autophagy, inducing vesicles and is known to interact with nsp2, nsp8, nsp9 and accessory protein 9b via yeast two-hybrid assays 83. Therefore, nsp6 is one of the important targets that have been identified in order to suppress the virulent genes of the pathogen. Envelope (E) proteins are reported to form ion channels, which is mainly associated with pathogenesis, therefore, inhibiting these channels provided insights for drug discovery. Belachinal, Macaflavanone-E and Vibsanol-B were reported as potential inhibitors of the such ion channels 84. Recent study suggested that 63 viral peptides were modelled by computational virtual screening for vaccine development 85. Similarly, the current predicted the binding potential of four natural lead molecules namely Hyoscyamine, Rotiorinol-C, Scutifoliamide-A and Tamaridone towards selected molecular targets of COVID-19 and the hypothesis and prediction probably provide profound insight for the development of therapeutic agents against COVID-19 targets.
The present study provides insights on the molecular mechanism of the binding of natural photochemical towards several potential molecular targets of SARS-CoV-2 for future investigation. Since, there are no standard drugs/ vaccine available for the treatment of COVID-19 to compare our studies, the molecular docking results of Chloroquine and Hydroxychloroquine were used as the comparative control, and therefore, this prediction could have a variation when it implement in experimental level. Further, the study initially modelled the binding of four lead molecules towards 14 targets prioritised in the study, however, due to the time and resource constrains, the detailed dynamics, simulation and energy calculations were carried out only for the best lead docked conformation with minimum binding energy and maximum stabilising interaction among the selected 14 targets, thus, only 04 complexes were selected for MD simulation studies. Thus, the dynamics and simulation of selected natural compounds for other targets can also be conducted to obtain significant breakthrough for development of COVID-19 targets and lead molecules. Thus, the study provides ample foundation in future investigation and developing natural lead molecules against probable targets of SARS-CoV-2 causing COVID-19.