3.1 Phylogenetic analysis
Viral genomes can be distinguished from other replicons by their survival strategies. A viral genome encodes all the information needed to complete its infectious stages within the host, and viruses follow a tripartite survival strategy. Viruses also differ from cellular organisms in their inability to maintain or replicate themselves independently [42]. In the 21st century, as globalization has changed, pathogens have also evolved to cope with new environments. Moreover, the evolution of host immune responses imposes selection pressures, which result in unequal survival abilities even among concurrent strains. These developments have widened the scope of a new branch of science called “phylodynamics”. Phylodynamics deals with how the phylogenetic properties and the epidemic dynamics of viruses are interconnected [43, 44].
We conducted phylogenetic and sequence alignment analyses to investigate the relationships between the taxa. The organisms that form each sub-cluster within a cluster are likely to share homology, whereas differences in branch length among taxa, within a cluster or between clusters, indicate differences in the mutations acquired during the course of evolution. The longer the branch, the greater the number of mutations accumulated by the organism. In all cases, we found a very close relationship between Bat coronavirus RaTG13 and SARS-CoV-2. A similar relationship has been reported by other researchers, which supports the conclusion that SARS-CoV-2 is genetically very similar to RaTG13 (isolated from a bat in Yunnan in 2013) [45, 46]. From the phylogenetic tree based on the entire length of the ORF1ab polyprotein, we observed that SARS-CoV-2 is homologous to Bat coronavirus RaTG13. Moreover, the branch length for Bat coronavirus RaTG13 is slightly greater than that for SARS-CoV-2. In the other phylogenetic trees, however, the branch length pattern for Bat coronavirus RaTG13 and SARS-CoV-2 was found to be reversed (Figure 1(A-C)). This suggests that, along the length of the ORF1ab polyprotein, Bat coronavirus RaTG13 has acquired slightly more mutations than SARS-CoV-2.
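The link between branch length and accumulated substitutions can be illustrated with a minimal sketch. It computes the uncorrected p-distance (the fraction of aligned sites that differ) between two aligned sequences. The function name `p_distance` and the toy sequences are hypothetical, and this simple measure is not necessarily the distance or substitution model used to build the trees in Figure 1.

```python
def p_distance(seq_a: str, seq_b: str, gap: str = "-") -> float:
    """Fraction of differing sites between two aligned sequences.

    Columns in which either sequence has a gap are skipped.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differing = 0
    for a, b in zip(seq_a, seq_b):
        if a == gap or b == gap:
            continue  # ignore gapped columns
        compared += 1
        if a != b:
            differing += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differing / compared


# Hypothetical toy alignment (not taken from the ORF1ab data set).
toy_a = "MKVLDE-TIR"
toy_b = "MKILDEGTNR"
distance = p_distance(toy_a, toy_b)
```

A larger p-distance between a taxon and its nearest relative corresponds, loosely, to a longer branch; model-based tree methods correct this raw proportion for multiple substitutions at the same site.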
Multiple sequence alignment showed amino acid differences at 12 positions within the ORF1ab sequence (in the region corresponding to RdRp). At site 4495 (corresponding to the 90th amino acid of RdRp), valine is the most frequent amino acid, but in some cases it is replaced by leucine (Bat_coronavirus_RaTG13/1-7095 and SARS-CoV-2) or by isoleucine. In both cases the substitution is within the same amino acid group, i.e., neutral and non-polar. Site 4497 (corresponding to the 92nd amino acid residue of RdRp) is almost always occupied by glutamate (negatively charged, polar and hydrophilic), except in Bat_SARS-like_coronavirus/1-7092, where glutamate is replaced by glycine (non-polar, aliphatic). Moreover, in Bat_coronavirus_RaTG13/1-7095 and SARS-CoV-2, glutamate is replaced by aspartate; both glutamate and aspartate belong to the same group. In some cases, we also found glutamate substituted by asparagine, a neutral, polar amino acid. Sites 4560 (corresponding to the 155th amino acid of RdRp) and 4576 (corresponding to the 171st amino acid of RdRp) are mostly occupied by aspartic acid (a negatively charged, hydrophilic amino acid) and isoleucine (non-polar) respectively, with some exceptions. Site 4589 (corresponding to the 184th amino acid of RdRp) is occupied by glutamine (polar, uncharged), except in some Bat_SARS_coronaviruses, where it is replaced by arginine, a charged and polar residue. Sites 4654 (corresponding to the 249th amino acid of RdRp) and 4671 (corresponding to the 266th amino acid of RdRp) are mostly occupied by isoleucine (non-polar, aliphatic), with some exceptions.
Sites 4698 (corresponding to the 293rd amino acid of RdRp), 5016 (611th amino acid of RdRp), 5042 (637th amino acid of RdRp), 5048 (643rd amino acid of RdRp) and 5212 (807th amino acid of RdRp) are mostly occupied by threonine, threonine, valine, serine and lysine respectively, with some exceptions. The detailed analysis of these mutated sites is shown in Figure 1(D).
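The correspondence between ORF1ab polyprotein numbering and RdRp numbering used above can be checked with a short sketch. The offset of 4405 is not stated explicitly in the text; it is inferred here from the quoted position pairs (for example, ORF1ab 4495 and RdRp 90), and the names `RDRP_OFFSET`, `orf1ab_to_rdrp` and `rdrp_to_orf1ab` are illustrative.

```python
# ASSUMPTION: constant offset inferred from the position pairs quoted in the
# text (e.g. ORF1ab 4495 <-> RdRp 90); it is not stated by the authors.
RDRP_OFFSET = 4405


def orf1ab_to_rdrp(orf1ab_pos: int) -> int:
    """Convert an ORF1ab polyprotein position to RdRp numbering."""
    if orf1ab_pos <= RDRP_OFFSET:
        raise ValueError("position lies upstream of the RdRp region")
    return orf1ab_pos - RDRP_OFFSET


def rdrp_to_orf1ab(rdrp_pos: int) -> int:
    """Convert an RdRp position back to ORF1ab polyprotein numbering."""
    if rdrp_pos < 1:
        raise ValueError("RdRp positions start at 1")
    return rdrp_pos + RDRP_OFFSET


# A subset of the (ORF1ab, RdRp) pairs reported in this section.
REPORTED_PAIRS = {
    4495: 90, 4497: 92, 4560: 155, 4576: 171, 4671: 266,
    4698: 293, 5016: 611, 5042: 637, 5048: 643,
}

# Pairs that do not agree with the assumed offset (empty if all agree).
inconsistent = {
    orf: rdrp for orf, rdrp in REPORTED_PAIRS.items()
    if orf1ab_to_rdrp(orf) != rdrp
}
```

Running the same check on any newly quoted pair is a quick way to catch transposed digits in position numbers.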
Furthermore, of the twenty-five mutations observed along the length of RdRp, seventeen lie outside the polymerase domain of RdRp and fall within the NiRAN domain. For these seventeen mutations, SARS-CoV-2 and Bat coronavirus RaTG13 show identical amino acids at all equivalent positions except positions 4603 and 4654 of the ORF1ab polyprotein, which correspond to positions 198 and 249 of RdRp respectively; the two organisms also show striking amino acid changes from all of the other taxa. At position 198 of RdRp, a change from Asp to Asn is observed from Bat coronavirus RaTG13 to SARS-CoV-2. Likewise, at position 249, a change from Ile to Arg is observed from Bat coronavirus RaTG13 to SARS-CoV-2 (Figure 1(B-D)). Moreover, at all of the mutable positions within the polymerase domain of RdRp, these two organisms carry identical amino acids at equivalent positions. Comparing all twenty-five positions between SARS-CoV-2 and other SARS-CoVs (SARS coronavirus Shanghai QXC2, SARS coronavirus TJF, SARS coronavirus HKU and SARS coronavirus Urbani), we observed that SARS-CoV-2 shares amino acids with its SARS counterparts more frequently within the conserved NiRAN domain, at Asp4497 (position 92 of RdRp), Asp4560 (position 155 of RdRp), Ile4576 (position 171 of RdRp), Gln4589 (position 184 of RdRp), Ile4671 (position 266 of RdRp) and Thr4698 (position 293 of RdRp). In contrast, within the variable polymerase domain, SARS-CoV-2 and its SARS counterparts share identical amino acids at only two of the eight substituted positions: Val5042 (position 637 of RdRp) and Lys5212 (position 807 of RdRp).
The variability in amino acid character within the polymerase domain between SARS-CoV-2 and its SARS counterparts may account for the efficiency of the SARS-CoV-2 RdRp.
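Whether a substitution stays within the same physicochemical group, as discussed for the sites above, can be expressed as a small lookup. The sketch below assumes one common textbook grouping of the twenty amino acids (non-polar, polar uncharged, positively charged, negatively charged). Other classifications exist and may assign some residues differently, and the names `GROUPS`, `group_of` and `is_conservative` are illustrative.

```python
# ASSUMPTION: one common textbook grouping; other schemes (for example, a
# separate aromatic class) would change some assignments.
GROUPS = {
    "nonpolar": {"Gly", "Ala", "Val", "Leu", "Ile", "Met", "Pro", "Phe", "Trp"},
    "polar_uncharged": {"Ser", "Thr", "Cys", "Tyr", "Asn", "Gln"},
    "positive": {"Lys", "Arg", "His"},
    "negative": {"Asp", "Glu"},
}


def group_of(residue: str) -> str:
    """Return the physicochemical group of a three-letter residue code."""
    for name, members in GROUPS.items():
        if residue in members:
            return name
    raise ValueError(f"unknown residue: {residue!r}")


def is_conservative(original: str, replacement: str) -> bool:
    """True if both residues fall in the same group."""
    return group_of(original) == group_of(replacement)


# Substitutions mentioned in this section.
substitutions = [("Val", "Leu"), ("Glu", "Asp"), ("Glu", "Gly"),
                 ("Asp", "Asn"), ("Ile", "Arg")]
report = {f"{a}->{b}": is_conservative(a, b) for a, b in substitutions}
```

Under this grouping, Val to Leu and Glu to Asp are conservative, whereas the Asp to Asn change at position 198 and the Ile to Arg change at position 249 of RdRp both cross group boundaries.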
3.2 Molecular docking studies
Based on our analysis of previously published reports, we defined the aim of our study. To fulfil this aim, we conducted multiple molecular docking studies with SARS-CoV-2-nsp12. The polymerase activity of SARS-CoV-2-nsp12 is activated by nsp7 and nsp8. To approximate the in vivo situation, we conducted molecular docking to investigate the RdRp activity of SARS-CoV-2-nsp12 in complex with the nsp7-8 hexadecamer. Our results showed that, within the SARS-CoV-2-nsp7-8-12 complex, the RNA template binds to the nsp7-8 complex with higher affinity than to SARS-CoV-2-nsp12. Previous studies reported that nsp7 takes part in polymerase activity and that nsp8 also possesses non-canonical RdRp activity [47]. This leads to the conclusion that nsp7 and nsp8 within this complex must carry an RNA-binding domain. The NTP entrance channel within nsp12 (formed by hydrophilic residues such as Lys545, Arg553 and Arg555 of Motif F) facilitates the entry of incoming NTPs [48]. After the initial binding of the template or parental RNA to nsp7, the RNA is expected to mediate its entry into the active site of the nsp12 polymerase domain (formed by Motif A and Motif C), where synthesis of the new RNA strand takes place [28]. Moreover, the nsp7-8 complex has been reported to interact with nsp12 at Thr409, Lys411, Trp509, Gly510, Gly897 and Met899 of nsp12, which fall within the polymerase domain of the latter. Therefore, it may also be concluded that the nsp7-8 complex occupies the nsp12 polymerase domain, as a result of which the viral polymerase domain becomes unavailable to many drugs. The binding affinity of nsp12 alone for RNA is slightly greater (-328.84, with an RMSD value of 133.19) than that of the nsp7-8-12 complex for the same RNA template (-317.09, with an RMSD value of 83.68), but the lower RMSD value indicates that the nsp7-8-12 complex has a better conformation for binding the template RNA.
Subsequent molecular docking studies were conducted to investigate interactions between nsp12 and the chosen drugs (which might act as potential inhibitors of nsp12). These studies examined how binding between nsp12 and the RNA template is altered when the drug is introduced to nsp12 prior to RNA binding. Our results showed that, with two exceptions described below, the drugs bind to residues of nsp12 that fall within its polymerase domain. The drug IDX184 forms hydrogen bonds with the residues Thr817, Leu819, Tyr831 and His872 of the nsp12 polymerase domain (as indicated by Chimera and PLIP), hydrophobic interactions with the residues Lys807 and Tyr831, and pi-stacking with the residue Tyr831, as indicated by PLIP. Ribavirin forms multiple hydrogen bonds with Tyr530 and Val535 of the nsp12 polymerase domain, as indicated by Chimera and PLIP, but no pi-stacking or salt bridge interactions (although numerous hydrogen bonds are also found outside the polymerase domain of nsp12). Sofosbuvir forms hydrogen bonds with Arg914 and Tyr915 of the nsp12 polymerase domain, as indicated by Chimera and PLIP; additional hydrogen bonds between Sofosbuvir and Tyr915 and Glu919 of the nsp12 polymerase domain are indicated by PLIP. Sofosbuvir also forms multiple hydrophobic interactions with the residues Tyr595, Tyr903 and Tyr915, pi-stacking interactions with Tyr595, and salt bridge interactions with Arg914. The exceptions are Galidesivir and Tenofovir. Galidesivir forms hydrogen bonds with the residues Asn52, Arg116, Lys121, Tyr217 and Asp218 of nsp12 and pi-stacking interactions with the residue Tyr217 of nsp12, as indicated by PLIP; no salt bridge interaction is found in this case. Tenofovir forms hydrogen bonds with the residues Thr120 and Thr123 of nsp12, as indicated by both Chimera and PLIP, and with Cys53 and Cys54 of nsp12, as indicated by Chimera.
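The distinction drawn above between contacts inside and outside the polymerase domain can be made explicit with a short sketch. The residue range 367-920 used for the nsp12 polymerase domain is an assumption taken from commonly cited structural annotations, not a boundary given in this text, and should be replaced by whichever annotation is in use. The function and variable names are illustrative.

```python
import re

# ASSUMPTION: residue range of the nsp12 polymerase (RdRp) domain.
# The text does not give boundaries; adjust to the annotation you use.
POLYMERASE_DOMAIN = (367, 920)


def residue_number(label: str) -> int:
    """Extract the residue number from a label such as 'Thr817'."""
    match = re.fullmatch(r"[A-Za-z]{3}(\d+)", label)
    if match is None:
        raise ValueError(f"unrecognised residue label: {label!r}")
    return int(match.group(1))


def outside_domain(residues, domain=POLYMERASE_DOMAIN):
    """Return the residues whose number falls outside the given range."""
    start, end = domain
    return [r for r in residues if not start <= residue_number(r) <= end]


# Hydrogen-bond contacts as listed in the paragraph above.
HBOND_CONTACTS = {
    "IDX184": ["Thr817", "Leu819", "Tyr831", "His872"],
    "Ribavirin": ["Tyr530", "Val535"],
    "Sofosbuvir": ["Arg914", "Tyr915", "Glu919"],
    "Galidesivir": ["Asn52", "Arg116", "Lys121", "Tyr217", "Asp218"],
    "Tenofovir": ["Thr120", "Thr123", "Cys53", "Cys54"],
}

# For each drug, the contacts that lie outside the assumed domain range.
outside = {drug: outside_domain(res) for drug, res in HBOND_CONTACTS.items()}
```

With the assumed range, only Galidesivir and Tenofovir have hydrogen-bond contacts outside the polymerase domain, which matches the two exceptions noted in the text.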
The Chimera results also indicated that, upon introduction of the chosen drugs (except Setrobuvir) to nsp12, the binding affinity between nsp12 and the RNA template is lowered only to a negligible extent, and the template can still form multiple hydrogen bonds with different amino acid sites within the nsp12 polymerase domain (Lys500, Ser501, Arg569, Lys577, Tyr689 and Tyr903). In the case of Setrobuvir, the binding affinity between nsp12 and the template RNA is likewise lowered to a negligible extent but, unlike the other drugs, the template RNA forms hydrogen bonds with amino acid sites of nsp12 that are mostly positioned outside its polymerase domain (Tyr38, Asp40, Lys41, Thr76, Ser78, Asn79, Gly220 and Asp221). The molecular docking results also show that when Sofosbuvir binds to nsp12, it increases the binding affinity of nsp12 for the RNA template. Moreover, the template RNA can still establish multiple contacts with the polymerase domain of nsp12. From this we conclude preliminarily that Sofosbuvir cannot be treated as a potential inhibitor of nsp12.
Keeping the in vivo situation in mind, we conducted another series of molecular docking experiments with our drugs of interest to investigate whether these drugs affect the RNA-binding ability of the nsp7-8-12 complex upon binding to it. Our results indicated that whenever the nsp7-8-12 complex forms, the polymerase domain of nsp12 is protected by the cofactor and, as a result, the drugs are unable to access the nsp12 polymerase domain. Additionally, the template RNA is still able to bind to the nsp7 and nsp8 chains within the nsp7-8-12 complex through the formation of hydrogen bonds with the residues Lys27, Arg21 and Glu23 of nsp7 and Arg75 and Gln73 of nsp8, as indicated by Chimera. PLIP also showed that the template RNA forms hydrogen bonds with Arg21 and Ser26 of nsp7 and Gln73, Arg75 and Arg80 of nsp8 within the nsp7-8-12 complex. None of these drugs inhibits the binding of the template RNA to the nsp7 and nsp8 chains within the nsp7-8-12 complex. Meanwhile, when Sofosbuvir was docked against the nsp7-8-12 complex and RNA was then docked against the resulting nsp7-8-12-Sofosbuvir complex, Sofosbuvir increased the binding affinity of the nsp7-8-12 complex for the RNA template, which indicates that this drug cannot be treated as an inhibitor of nsp12.
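Because Chimera and PLIP report partly different hydrogen-bond contacts for the template RNA, it can be useful to separate the contacts supported by both tools from those reported by only one. The sketch below does this with set operations on the residues listed above; the function name `compare_contacts` and the data layout are illustrative and are not part of either tool's interface.

```python
# Hydrogen-bond contacts of the template RNA within the nsp7-8-12 complex,
# as listed in the paragraph above, keyed by chain.
chimera = {
    "nsp7": {"Arg21", "Glu23", "Lys27"},
    "nsp8": {"Gln73", "Arg75"},
}
plip = {
    "nsp7": {"Arg21", "Ser26"},
    "nsp8": {"Gln73", "Arg75", "Arg80"},
}


def compare_contacts(first, second):
    """Per chain, split contacts into shared and tool-specific sets."""
    result = {}
    for chain in sorted(first.keys() | second.keys()):
        a = first.get(chain, set())
        b = second.get(chain, set())
        result[chain] = {
            "both": a & b,          # reported by both tools
            "first_only": a - b,    # reported only by the first tool
            "second_only": b - a,   # reported only by the second tool
        }
    return result


comparison = compare_contacts(chimera, plip)
```

For the residues listed here, Arg21 of nsp7 and Gln73 and Arg75 of nsp8 are reported by both tools, which makes them the most consistently supported RNA contacts in the cofactor chains.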
Lastly, we performed a final series of docking studies in which nsp12 was complexed with each of the eight drugs individually, then allowed to interact with nsp7-8, and finally docked against the template RNA. We found that the RNA template is still able to bind to the nsp7 and nsp8 chains of the nsp7-8-12 complex, although the binding sites may vary for each of the selected drugs. In addition, upon introducing Remdesivir to nsp12, the template RNA makes one contact at the Ala406 site of the nsp12 polymerase domain. The template RNA is also able to bind to the polymerase domain of nsp12 through hydrogen bonds with the residues Thr402, Asn403, Gln408 and Gly670 when Remdesivir is introduced to nsp12, which is then complexed with nsp7-8 prior to the RNA binding that brings about polymerization by nsp12. Similarly, when Sofosbuvir is introduced to nsp12 before formation of the nsp7-8-12 complex, the RNA template forms hydrogen bonds with Sofosbuvir-bound nsp12 at the residues Asp499, Lys500, Ser501, Tyr521, Arg569, Lys577, Asp684, Tyr689, Ser814, Arg836 and Tyr903 of the nsp12 polymerase domain, as indicated by Chimera and PLIP, and salt bridge interactions are observed between the RNA template and nsp12 at Arg513, Lys593 and Arg836 of nsp12. As a result, these drugs do not seem to alter the binding affinity of the template RNA when they bind to nsp12 that subsequently forms a complex with its cofactor, the nsp7-8 complex. The results of the docking experiments are shown in Figures 2-4 and Supplementary Table 1a-f.