3.1. The Binding of Remdesivir Triphosphate and ATP to RdRp
Adenosine triphosphate was docked against the crystal structure of SARS-CoV-2 RdRp (PDBID: 7BV2) at the i-binding site which formed hydrogen pairing with the template strand of 14 nucleotide bases and the primer strand of 11 nucleotide bases [44]. The conformational pose of the docked RTP compared to the covalently bound RDV in the crystal structure was nearly identical (Figure S2). We then generated a docking grid from the docking pose of RTP to dock ATP into the same catalytic site of RdRp. The conformational pose of ATP compared to the covalently bound crystal RDV was nearly identical. It was also seen that the triphosphate tail of RTP was positioned near one of the Mg+ ions. These conformations were then used to perform long (1.0 µs) simulations for each complex to further elucidate more about the relaxed conformations of each and to obtain a stable conformation for our high-throughput virtual screening portion of the study.
Table 1. Survey of the current studies on the discovery and/or repurposing of approved drugs for the inhibition of SARs-COV-2 RdRp.

The trajectory convergence of the simulation for both systems was confirmed through analyzing the RMSD during the 1000 ns (1.0 µs) simulations shown in Figure 2A and Figure 2B. The trajectory RMSD is of the protein receptor backbone Cɑ atoms and the ligand main atoms for both the ATP and RTP complexes. It is clearly seen in Figure 2A that the RTP system reached stability very early in the simulation (~50 ns) while the ATP system in Figure 2B reached stability around 550 ns. Importantly, both protein complexes maintained average RMSD values of approximately 2.0 Å. This is indicative that the differences between ATP and RTP binding had no significant change in the overall RdRp conformation through the 1.0 µs simulation.
The ligand RMSD average values for the ATP system are within 1 Å of the protein RMSD values and reached stability at approximately 550 ns. The ligand RMSD for the RTP system also matched the average distance as in the protein with a convergence very early in the simulation; approximately 50 ns. It is clearly seen that the complex containing ATP possessed higher RMSD values compared to the RTP complex indicating that the binding of RTP was more stable during the simulation.
We additionally wanted to confirm the stability of the template-primer RNA in the complex through simulation. The RMSD of the RNA main atoms is shown in Figure S3, both ATP and RTP systems show clear stability and early convergence at around 50 ns with values both approximately 2.0 Å which is also consistent with the protein Cɑ RMSD. Likewise seen in the ligand main atoms RMSD plot, the complex containing RTP showed slightly lower average values compared to the ATP system indicating more stable binding occurred during the simulation. To observe the fluctuation of each individual amino acid in the protein structure, we recorded the RMSF of the RdRp complexes during the simulation.
Ligand main atoms RMSF (Figure S5) was also recorded to observe the fluctuation of the individual atoms of each ligand in order to better understand the regions of stability of ATP and RTP. Clearly, RTP in complex saw less fluctuation per atom over the simulation as compared to ATP. Short peaks at atoms 1-4 indicate the slight instability seen in the gamma phosphate on the phosphate tail which is not abnormal. Interestingly, small peaks in atoms 16 and 30 indicate a slight increase in fluctuation seen at the 3’ and 4’ hydroxyl groups on the ribose ring of RTP. This specificity at only these two polar functional groups is also seen in atoms 29 and 30 of the ATP system and likely indicates positioning for nucleophilic attack during nucleotide addition.
3.2. Clustering Analysis
We then performed cluster analysis of both MD systems after the simulation to generate the most abundant conformational pose during the 1000 ns duration. In the most abundant conformational pose of the simulation (Figure 5), it was observed that the adenine motif of both ATP and RTP maintained H-Bond pairing with U10 on the template strand and partial pi-pi stacking of A11 on the primer strand. The RTP system was more able to sustain salt bridge interactions with positive residues R555, R553, R624, K621. The RTP system also saw interaction with S759 on the nitrile group while Y619 was seen to form H-Bonds with negative oxygen on the gamma phosphate of ATP. In addition, the RTP system also saw stabilization of the 3’ hydroxyl group on the ribose ring by H-Bonds with residues N691 and T680.
During the 1.0 µs simulation of both the ATP and RTP systems, several amino acid contacts were observed to be maintained throughout with high occupancy (Figures S7). For both the ATP and RTP systems, there was significant occupancy in the H-Bond interactions between the triphosphate tails and a positive region of amino acids such as K551, K621, K798, R553 and R555 of ≥ 30% of the simulation time. For the RTP complex, residues K621, R553 and R555 formed H-bonds with negatively charged oxygen on alpha, beta and gamma phosphate with all possessing an occupancy of above 90%, indicating that these contacts could be crucial for the stability and positioning of RTP in the catalytic site of RdRp. These interactions were seen in previous study where both ATP and RTP were docked to RdRp [61]. Both systems also maintained the maximum occupancy interaction with Mg+ ions with the triphosphate tails, suggesting that at least one ion is responsible for the stabilization and positioning of incoming nucleotides. There was modest occupancy of hydrogen bond formation with hydroxyl groups located on the ribose ring portions with residues D623, D760, and N691, indicating secondary stabilizing roles for these amino acids potentially related to the initiation of nucleotide addition. Negative residue D623 saw an average occupancy of 56% bonding with both hydroxyl groups of the ribose ring portion of RTP. In the ATP complex, D760 saw 34% occupancy with the 3’ hydroxyl group located on the ribose ring which is involved in nucleotide addition.
In Figures S6 are the histogram of protein-ligand contacts occupancy during the simulations for ATP and RTP complexes. These show the smaller contacts seen with occurrence of ≤ 30% of the simulation as well as contacts that occurred at over 100% which indicates multiple ligand atoms contacting same amino acid. In the ATP complex, the ligand contacted several more protein residues with brief occupancy compared to the RTP complex which is likely due to lower relative ligand bound stability seen in the former. Also, more water bridge interactions occurred with the ATP complex indicating weaker overall affinity compared to the RTP complex. Furthermore, RTP saw higher occupancy of H-bonds compared to others confirming the relative favorability of bonding to RdRp.
Previous studies have also reported similar findings on the critical amino acids involved in RDV binding. Koulgi et al. described an ensemble approach of the free form binding of prodrug RDV in complex with SARS-CoV-2 RdRp contacted amino acids Y451, T540, M542, R548, K551, R553, R555, A558, D618, S674, D761 and E811 [62]. Amino acid R555 was observed to form H-bonds with the alpha phosphate group in RDV in all five ensemble representatives [62]. Other studies have aimed to observe the active triphosphate form interaction upon RdRp binding. Zhang et al. found that the nitrile group was positioned in a pocket including K545, Y546 and A547 with the phosphate tail interacting with K551, R553 and R555. Additionally, positive amino acids near the palm subdomain were found to be crucial for RTP activity [63].
3.3. High-Throughput Virtual Screening finds 14 compounds.
Once we validated that our simulations with ATP and RTP bound to RdRp met convergence during the simulation, we then used these complexes to perform our initial screening. Firstly, pharmacophoric screening of the ZINC database was performed by selecting seven pharmacophoric features, i.e. two negative ionizable, one aromatic ring, and four H-bond acceptors of RTP. Shown in Table 2 is the results from HTVS docking against RdRp to retrieve the top fourteen hits with the highest GlideXP scores and lowest #stars as the determining factors. The GlideXP scores measured in kcal/mol are tabulated based on decreasing favorability. A more negative GlideXP score typically indicates a good initial binding affinity to RdRp. The initial conformational pose of each ligand in complex with RdRp and RNA as well as their detailed 2D ligand contacts is shown in Table S2. These fourteen compounds were then subjected to classical 200 ns MD simulations and further MM-GBSA energy calculations for each.
Table 2
Summary of docking results and pharmacokinetic information of the top 14 ZINC candidates.
Molecule
|
Docking score (kcal/mol)
|
# Stars
|
ZINC000014651456
|
-13.5
|
0
|
ZINC000257306096
|
-12.9
|
0
|
ZINC000238950253
|
-12.8
|
0
|
ZINC000299798705
|
-12.7
|
0
|
ZINC000067790716
|
-12.4
|
0
|
ZINC000089920955
|
-11.9
|
0
|
ZINC000097971592
|
-11.8
|
1
|
ZINC000065742965
|
-11.7
|
0
|
ZINC000016040970
|
-11.6
|
0
|
ZINC000408592119
|
-11.6
|
0
|
ZINC000237948681
|
-11.5
|
0
|
ZINC000069492350
|
-11.4
|
0
|
ZINC000002146610
|
-11.2
|
0
|
ZINC000084651559
|
-11.2
|
0
|
* Docking score is empirically calculated in kcal/mol from rigid receptor GlideHTVS protocols which help define compounds with good binding affinity. The scoring function is comprised of lipophilic, hydrogen bonding, hydrophobic terms as well as a rotable bond penalty. The #stars is a parameter defined by QikProp module and scores compounds based on their similarity to known medicines. |
The MM-GBSA energies and average protein-ligand RMSD of all fourteen MD simulations are summarized in Table S3. Within Figures 6A and 6B show the ligand RMSD and RMSF values of all fourteen ZINC candidates with ATP and RTP systems calculated over the last 100 ns of simulation. The free energy of binding gives a more accurate insight into both the simulation efficacy and flexible binding affinity of ligands to their target as opposed to rigid ligand docking. The summary of the 200 ns simulation runs and MM-GBSA energy calculation in Table S3 shows that several compounds possessed more favorable free energy of binding compared to the ATP and RTP systems. From this subset of ligands, we selected the top four with the most favorable MM-GBSA calculated energies for further convergence analysis to ensure that the conformational stability was reached and maintained. Figures 2C-F shows the RMSD of protein Cα atoms and ligand main atoms over the entire duration for these selected compounds. The protein portions reached convergence early in the simulation for each system containing ZINC097971592, ZINC002146610, ZINC069492350 and ZINC408592119. It was observed that the RMSD values of the protein Cα of the four ZINC systems were lower than that of the RTP system. Also, the ligand RMSD values for ZINC097971592 and ZINC408592119 were higher than that of the RTP complex.
We additionally wanted to observe the RMSF of the RNA main atoms during simulation to gain some more insight into the individual fluctuation of each nucleotide in the RNA sequence. In Figure 4A, this displays the template-primer RNA RMSF obtained from the O5’ atoms on the RNA backbone for ATP and RTP systems and the four ZINC systems with Figure 4B indicating the nucleotide sequence indexing used during the recording. The RMSF values were high at nucleotides G1-U3 on the 5’ end of the primer strand as well as high peaks around A22-C25 towards the 3’ end of the template strand for all 6 systems. This indicates high fluctuation induced from the exposure to solvent. In the middle of the plot is a slight peak around U12 on the template strand indicating that this portion did not interact with either ligand. Nucleotides U13-A15 on the template strand show a sharp decrease in fluctuation which is most likely due to the binding and hydrogen pairing of both ATP and RTP.
The average RMSD values over the last 50 ns of simulation for receptor and ligand is also summarized in Table 3. The four ZINC systems maintained similar average receptor RMSD values were approximately 3.3 Å which is near the value of the ATP system of 2.4 Å. This result suggests the RdRp complex system remained in similar conformational state regardless of their structure dissimilarity against ATP. Table 3 is also a breakdown of the individual MM-GBSA terms such as the Van der Waals, electrophilic and lipophilic energies of the top four selected ZINC compounds as well as ATP and RTP for comparison. The summation of these term equates to the relative free energy of binding measured in kcal/mol. Clearly, the four selected compounds from the ZINC database resulted in lower overall MM-GBSA values over the ATP and RTP systems of -35.9 ± 3.1 kcal/mol and -21.3 ± 5.9 kcal/mol, respectively. The RdRp complex containing ZINC097971592 had the most negative MM-GBSA value of -54.5 ± 13.4 kcal/mol, with ZINC002146610, ZINC069492350 and ZINC408592119 having values of -47.5 ± 3.4, -43.9 ± 3.8 and -37.6 ± 4.6 kcal/mol, respectively. The four ZINC systems maintained similar average receptor RMSD values were approximately 3.3 Å which is near the value of the ATP system of 2.4 Å. This result suggests the RdRp complex system remained in similar conformational state regardless of their structure dissimilarity against ATP.
Table 3
MM-GBSA energy values and the average receptor/ligand RMSD from the last 50 ns of simulation time.
Ligand
|
ΔGVDW
|
ΔGELE
|
ΔGLIP
|
MM-GBSA (kcal/mol)
|
Receptor RMSD (Å)
|
Ligand RMSD (Å)
|
ATP
|
-14.4 ± 5.0
|
-15.3 ± 4.4
|
-6.2 ± 0.7
|
-35.9 ± 3.1
|
2.4
|
4.5
|
RTP
|
-19.1 ± 4.1
|
6.9 ± 6.2
|
-9.1 ± 1.4
|
-21.3 ± 5.9
|
1.9
|
0.9
|
ZINC097971592
|
-42.3 ± 5.4
|
6.2 ± 15.9
|
-18.3 ± 3.1
|
-54.5 ± 13.4
|
3.9
|
4
|
ZINC002146610
|
-52.8 ± 3.0
|
23.1 ± 3.0
|
-17.8 ± 1.3
|
-47.5 ± 3.4
|
3
|
1.5
|
ZINC069492350
|
-42.2 ± 3.1
|
14.4 ± 3.3
|
-16.1 ± 2.1
|
-43.9 ± 3.8
|
3.1
|
2.8
|
ZINC408592119
|
-55.6 ± 3.1
|
35.9 ± 3.7
|
-17.9 ± 1.9
|
-37.6 ± 4.6
|
3.3
|
4.4
|
*VDW energies was calculated with the summation of ligand ΔGvdW energies, pi-pi packing correction energies and the self-contact correction energies. |
*Electrophilic terms are calculated by the summation of coulombic, hydrogen-bonding and generalized born electrostatic solvation energies. Hydrophilic term comprised of lipophilic energies. Total MM-GBSA calculated in kcal/mol is summation of ΔGvdW, ΔGele and ΔGhyd terms. |
The detailed 2D receptor-RNA-ligand contacts diagrams of the most abundant conformational pose for each of the top four ZINC candidates are shown in Figure 7A-L. The 2d ligand-protein interactions diagram for the most abundant conformational pose of ZINC097971592, or estriol 3 glucoronide, in complex with RdRp is shown in Figure 7C. There were several residue contacts conserved from the ATP and RTP pose. Positive residues R624, R555, and R553 formed H-bonds with mainly the polar negative carboxylate portion of the molecule, indicating that this compound is situated with the polar end in the positive catalytic region of RdRp. Additionally, ZINC097971592 also came into H-Bond contact with S682 on the hydroxyl groups on the aromatic portion. The ligand-protein contacts summary diagrams for these four ZINC candidates are shown in Table S7. In this, several of the residues seen in contact with the most abundant pose were also maintained with large occupancy through the simulation. Residues R555 and R624 established interactions with ZINC097971592 with 89% and 74% occupancy, respectively. Interestingly, K798 formed a salt-bridge with the negative oxygen with a relatively high occupancy value of 85%. Contacts with lesser occupancy were S682 with a 44% occupancy and R551, R553 of 50% or less. This compound’s enantiomers were identified in a previous VirtualFlow in that there was activity towards plpro, nucleoprotein, nsp10, nsp14 and orf7a in addition to nsp12 [64].
In Figure 7F, the psoralen derivative ZINC002146610 also shows some similarity with the RTP pose and with the other ZINC hits in that the polar carboxylate group is facing inwards towards the positive amino acid region where triphosphate tail usually lays. Also, this compound formed pi-pi stacking interactions on the psoralen ring with nucleotide U20 on the primer strand and A11 of the template strand of RNA. Additionally, the polar end formed salt bridge and H-Bond interactions with R555 and the nearby Mg+ ion. The ligand-protein contacts summary for ZINC002146610 shows similar contacts seen in the most abundant pose. Positive amino acid R555 maintained an occupancy of 113% with the polar region of the molecule, meaning there were additional contacts seen with residue in addition to the full simulation time. Like the most abundant pose, the magnesium ion maintained full contact with the polar end via salt-bridge interactions which was seen to be further stabilized by D618. Additional stabilizing water-bridges were established with occupancies of 47% and 48% with D623 and R555, respectively. Interestingly, this compound as well as its derivatives have seen previous attention in being low toxicity potential selective and reversable inhibitors of (B5i) or the chymotrypsin-like subunit of human immunoproteasome which is associated with the treatment of autoimmune diseases and various types of cancer [65].
Table 4. Ligand-protein contacts during simulation from ATP, RTP and the four ZINC compound systems.

*Ligand-protein contacts may consist of hydrogen bonding, lipophilic, hydrophobic and electrostatic interactions. Contacts shown are present in at least 30% of trajectory time. Green highlight are residues that interacted with at least 3 of the six tested ligands.
In Figure 7I, conserved residue R555 is seen in contact with the polar end of ZINC69492350 in similar fashion to the previous ZINC compounds in addition to the RTP system. Amino acid R555 has been seen to be crucial in the activity of small molecule binding to RdRp in this study as well as others [7–11, 31, 39–41, 62, 66, 67]. In this binding pose of ZINC69492350, R555 forms a salt bridge to the negatively charged oxygen on the carboxyl group, with an additional hydrogen bond contact with the partially negative doubly bonded oxygen on the carboxyl end. Likewise, the aromatic portion of the molecule faces in direction with the hydrogen pairing nucleotide U10 on the template strand RNA. Surrounding amino acids are also conserved which includes D760, D623, S682 and K545. Lastly, the magnesium ions are seen to stabilize ZINC69492350 with pi-cation interactions with the thiophene portion. In the ligand-protein contacts summary diagram in the supporting information, R555 maintained full occupancy with the addition of other interactions less seen with a value of 150% with the polar carboxyl end of the molecule. Like the other selected ZINC compounds, D623 formed stabilizing water-bridge interactions with the same polar end with a value of 33%. Polar residue T687 contacted ZINC69492350 55% of the simulation, unlike the other compounds where it served as a spectator.
In Figure 7L, ZINC408592119 lies in a similar binding pose to ATP and RTP. According to the amino acid contacts seen in the most abundant conformational pose of the simulation. Positive residues K551, K545, K621 and R555 were all seen to form hydrogen bonds and pi-cation interactions with partially positive groups on the polar ends of the molecule. Negative residues such as D618 and D760 are seen to form hydrogen bonding with partially positive portions such as on the imidizolidine group. The significance of D618 is consistent with previous studies [39–41, 62, 66, 67]. In the ligand-protein contacts summary diagram in the supporting information, positive residue K551 maintained an occupancy of 34% in the same fashion as the most abundant pose. Magnesium also contributed full occupancy with the polar carboxyl end of the molecule consistent in Table S7 and other ZINC candidates. Lastly, in addition to forming hydrogen bonds with the imidazolidine portion, also formed a water-bridge 76% of the simulation. Table 4 is the receptor-ligand contacts with an occurrence of greater than 30% tabulated for the ATP, RTP and four ZINC hit complexes. Clearly, positive residues K621, R555, R553 and K551 all induced contact with more than two-thirds of the ligands selected, with R555 meeting all 5 ligands. Likewise, negative residues D623, D798 and D760 saw interaction with at least half of the ligands. Lastly, hydrophobic residue N691 interacted with all the selected ligands apart from ZINC408592119.
The absorption, distribution, metabolism and excretion (ADME) properties were also explored for these four compounds in addition to ATP and prodrug RTP using the SwissADME webserver [68] (Table 5). Within the four non-nucleoside analogs, three of them were reported to have high predicted GI absorption and cytochrome P450 inhibition for CYP1A2, CYP2C19, CYP2C9 and CYP3A4. With these compounds being very early drug hits selected using in silico methods, obviously there is the need for experimental evaluation needed to evaluate their biological attributes. Additionally, it is still unclear how these compounds would behave and be metabolized when introduced to a biological system. Based on the predicted bioavailability, three of the four compounds possess high (56%) availability in rats.
Table 5. Summary of drug property predictions of ATP, RTP and top four ZINC candidates obtained from the SwissADME webserver.
