Dataset of potential therapeutic targets
Non-structural protein 3 (Nsp3, papain-like protease - PLpro)
The multi-domain non-structural protein 3 (Nsp3) is the largest protein produced by the coronavirus, comprising 16 different domains and regions that regulate viral infection; the papain-like protease domain (PLpro) is the most widely targeted domain of Nsp3. Since the outbreaks of SARS-CoV in 2003 and MERS-CoV in 2012, the three-dimensional structure of Nsp3 has been solved by X-ray crystallography and nuclear magnetic resonance (NMR) spectroscopy. Currently, we provide in the DockThor-VS web server monomeric structures and genomic variations of the PLpro domain. Structural information and selected mutations regarding the Macrodomain I/II/III (MacI/II/III), or active ADP-ribose-1″-phosphatase domain (ADRP, Appr-1″-pase), will be available soon.
PLpro is a cysteine protease that processes the amino-terminal end of the replicase polyprotein (pp1a), generating mature Nsp1, Nsp2 and Nsp3 proteins through a self-catalyzed cleavage reaction 25. This protein also aids the coronavirus in its invasion by counteracting host innate immunity. Indeed, PLpro is a multifunctional enzyme that cleaves the viral polyprotein and also functions as a deubiquitinase (DUB) and deISGylating enzyme (deconjugating the interferon-stimulated gene 15 [ISG15] molecule from modified substrates), using identical catalytic residues 26. Thus, the therapeutic inhibition of PLpro would have two antiviral effects: restoration of the antiviral effect of ubiquitination/ISGylation and inhibition of viral replication by blocking polyprotein cleavage 27.
The SARS-CoV-2 PLpro catalytic site is composed of a classic triad Cys111-His272-Asp286. Cys111 performs the nucleophilic attack on the peptidic substrate, while His272 and Asp286 stabilize the reaction intermediate 25. Trp106 forms the oxyanion site, participating in the stabilization of the negatively charged intermediate. Many non-covalent inhibitors interact at an allosteric site near the catalytic site. This allosteric site is mainly composed of the residues Asn267-Tyr268-Gln269, which form a β-turn secondary structure 25. Its flexibility is well described in the literature and is mainly characterized by distinct conformations of the residue Tyr268, which usually makes stacking interactions with aromatic groups of some inhibitors. The protonation state of the residues of the catalytic triad was defined based on the reaction mechanism proposed in the literature for SARS-CoV 25: neutral Cys111, His272 neutral at NE2 and Asp286 negatively charged.
To date, there are 16 PLpro crystal structures in the PDB. Given the flexibility observed for Tyr268 from the allosteric site and for Leu162, located at the entrance of the catalytic site, we provide to the users two prepared structures corresponding to the PDB codes 6W9C (apo structure) and 6WX4 27 (solved in complex with a covalently bound peptide inhibitor) (Figure 1). In both structures, Tyr268 is in an open conformation, allowing the binding of ligands of different sizes. The recently solved structure of PLpro complexed with a non-covalently bound compound (PDB code 7JIW) will be provided soon in DockThor-VS.
Non-structural protein 5 (Nsp5, Mpro, 3CLpro)
As is well-known in coronaviruses, the two overlapping polyproteins pp1a and pp1ab, produced first after infection, are further proteolytically processed into 16 non-structural proteins (Nsp1–16). This proteolytic process is carried out in a coordinated manner by PLpro and Mpro 28. Mpro, also known as 3-chymotrypsin-like cysteine protease (CCP or 3CLpro), is first auto-cleaved from polyprotein pp1a to yield the mature enzyme and then digests the remaining pp1a (at no fewer than 11 conserved sites) to produce the downstream non-structural proteins (Nsps 6 to 16) 29. Given its pivotal role in the viral life cycle, Mpro is an attractive target for the design of anti-SARS drugs.
The Mpro consists of a homodimer with each polypeptide composed of the domains I (residues 8-101), II (residues 102-184) and III (residues 201-303). The substrate-binding site is located in a cleft between the domains I and II and has the Cys145-His41 catalytic dyad as the reaction center, following a mechanism similar to other coronaviruses. Currently, there are 191 structures of the Mpro deposited in the PDB, many of them complexed with covalent or noncovalent inhibitors. At this moment, we provide to the users the dimeric structure of the Mpro in two distinct conformations (PDB codes 6LU7 28 and 6W63), which are complexed with covalent and noncovalent inhibitors, respectively. The superposition of the structures highlights some important conformational changes within the ligand binding site, mainly in the residues Met49, Asn142 and Gln189, reinforcing the importance of considering multiple protein conformations in virtual screening experiments to accommodate distinct compounds (Figure 2). The 6LU7 conformation was the first structure experimentally solved for SARS-CoV-2 with an inhibitor, while 6W63 contains a drug-like reversible inhibitor at the binding site. In the preparation process, the protonation states and flips of key residues were manually adjusted to provide to the users the Mpro structures with neutral His41 at ND1, the catalytic Cys145 protonated (i.e., neutral), neutral His163 at NE2 and neutral His164 at NE2.
Non-structural protein 12 (Nsp12, RdRp)
To replicate and transcribe positive ssRNA, the RNA-dependent RNA polymerase (RdRp, also known as Nsp12) of coronaviruses has evolved to form an intricate complex with several non-structural proteins (Nsps) produced as cleavage products of the ORF1a and ORF1ab viral polyproteins. Nsp12 catalyzes the synthesis of viral RNA, possibly with the assistance of Nsp7 and Nsp8, which function as cofactors 30. It has been demonstrated that the overall architecture of the SARS-CoV-2 Nsp12-Nsp7-Nsp8 complex is similar to that of SARS-CoV, with a root mean square deviation (RMSD) value of 0.82 Å for 1078 Cα atoms 31.
RdRp is considered an interesting target for therapeutic solutions against COVID-19, for which the inhibitor remdesivir (RDV, GS-5734), a nucleoside analogue prodrug that targets the Ebola virus (EBOV) RdRp, has already been approved 32. Since nucleoside analogues have a high structural similarity, other similar drugs such as favipiravir, which was effective in clinical trials, can be used as inhibitors 33.
The conserved architecture of the Nsp12 core consists of a right-hand RdRp domain (residues Ser367 to Phe920) and a nidovirus-specific N-terminal extension domain (residues Asp60 to Arg249) that adopts a nidovirus RdRp-associated nucleotidyltransferase (NiRAN) architecture 31. The average length of the core RdRp domain is less than 500 amino acids, and it is folded into three subdomains, namely thumb, palm and fingers, resembling a right-handed cup 34. The NiRAN and RdRp domains are connected by an interface domain (residues Ala250 to Arg365). In addition, SARS-CoV-2 Nsp12 possesses a newly identified β-hairpin domain at its N terminus 31.
The active site of the SARS-CoV-2 RdRp domain is formed by the conserved polymerase motifs A to G in the palm subdomain and is configured like that of other RNA polymerases 31. In particular, motifs A and C have conserved residues that are characteristic of +ssRNA viruses, such as the catalytic aspartates in motifs A (Asp618) and C (Asp760) 35. Motif B has the highly conserved Ser682, which is crucial for the recognition of the 2'-OH group of the NTP ribose, as well as Gly683, which is conserved in all RdRps 35. Motif D and motif A both guide the structural change of the active site during catalysis 36. Regarding nucleotide (NTP) selection by RdRp, motif D has a prime role in the efficiency and fidelity of NTP addition. Indeed, NMR studies have indicated the inability of motif D to achieve its optimal conformation for catalysis when an incorrect nucleotide is incorporated, thereby demonstrating its role in the selection of NTPs 37. Motif E, together with motif C, interacts (in the upstream position) with the newly synthesized backbone of the second and third nucleotides 35, while motif F establishes the upper limit for the entry path of NTPs 35. In motif F of SARS-CoV-2, the residue Ala547 of the N-terminal region is equivalent to the highly conserved glutamate in almost all +ssRNA viruses 35. This amino acid change leads to structural and possibly dynamic differences in this region, which in turn can interfere with RNA synthesis 38. Motif G uses the residues S96 and A97, which interact with residues +1 and +2 of the backbone of the template strand to move it vertically 35,39.
To date, eight RdRp structures have been deposited at the PDB. We provided at the DockThor-VS platform the RdRp conformation found in the RdRp-RNA-remdesivir complex (PDB code 7BV2 40, Figure 3) without the RNA primer and the inhibitor remdesivir to allow the virtual screening experiments with the free binding site.
Non-structural protein 15 (Nsp15, endoribonuclease, NendoU)
The Nsp15 of SARS-CoV-2 is a nidoviral RNA uridylate-specific endoribonuclease (NendoU) whose RNA endonuclease activity (specific for uridine) acts on both single-stranded and double-stranded RNA 41. Recently, Susan Baker’s Lab revealed for the first time the molecular mechanism of Nsp15, in which the NendoU activity limits the generation of 5′-polyuridines from negative-sense viral RNA, termed PUN. The PUN can act as a CoV MDA5-dependent pathogen-associated molecular pattern (PAMP), which in turn can activate the type I interferon (IFN) response in macrophages. The authors found that NendoU cleaves the polyU sequence on the PUN RNA, limiting the length and abundance of the polyU extension. These studies revealed that the function of NendoU during replication is to reduce the length of polyU sequences, thus limiting the potential to generate PAMPs and activate the host sensor MDA5. Consequently, the NendoU activity delays recognition by the host innate immune sensors, making Nsp15 a highly conserved virulence factor and a potential target for antiviral and vaccine strategies 42.
The Nsp15 endoribonuclease from SARS-CoV-2 is composed of 347 amino acid residues (sequence from Met1 to Gln347) 41. The SARS-CoV-2 Nsp15 monomers assemble into a functional hexamer, composed of a dimer of trimers 41. The hexameric form is pivotal for the enzymatic activity. Each monomer presents three domains: (i) the N-terminal domain (Nsp15-NTD, residues 1-62), formed by an antiparallel β-sheet wrapped around two α-helices; (ii) the central middle domain (residues 63-191), formed by β-strands and short helices; and (iii) the C-terminal catalytic NendoU domain (NendoU, residues 192-347), formed by two antiparallel β-sheets. Currently, there are seven high-resolution crystal structures of the Nsp15 endoribonuclease from SARS-CoV-2 available at the PDB containing the three domains.
The active site is located at the CTD, flanked by five α-helices in its concave surface, in a shallow groove between two β sheets, and contains six highly conserved residues: His235, His250, Lys290, Thr341, Tyr343 and Ser294. Based on the similar arrangement of its active site with that of Ribonuclease A, the residues His235, His250 and Lys290 are suggested to be the catalytic triad of the NendoU 41. The prepared Nsp15 structure is based on the conformation of the protein complexed with tipiracil (PDB code 6WXC) prepared considering the pH of 6.2 and consists of His235 and His250 neutral at NE2 and ND1, respectively, and Lys290 positively charged (Figure 4). New Nsp15 conformations will be available soon at the DockThor-VS platform.
Nucleocapsid phosphoprotein (N protein)
The nucleocapsid protein has an essential structural function in CoVs. It is a multifunctional phosphoprotein that associates with genomic RNA to form the ribonucleoprotein (RNP) complex and plays a critical role during transcription, virus assembly and antagonism of the host’s innate immunity. As such, the N protein can form a helical filament structure that is assembled into virions by interactions with the viral membrane (M) protein 43. Despite its location within the virion rather than on its surface, it has been identified as highly immunogenic and abundantly expressed during viral infection 44. Interestingly, it has been demonstrated that the antibody to the SARS-CoV-2 N protein is more sensitive than the Spike protein antibody during early infection 45. In the context of viral infection, the nucleocapsid protein acts as a viral suppressor of RNAi (VSR), and thereby antagonizes one of the cell-intrinsic antiviral immune defence mechanisms of the host 46. In particular, during RNA viral infection, virus-derived dsRNAs (vi-dsRNAs) are generated, which can be recognised and cleaved by the host endonuclease Dicer into virus-derived siRNAs (vsiRNAs). These vsiRNAs are ultimately integrated into the Argonaute protein within the RNA-induced silencing complex (RISC), directing the destruction of cognate viral RNAs in infected cells 47. Jingfang Mu and collaborators (2020) showed that the nucleocapsid of SARS-CoV-2 associates with dsRNA and suppresses RNAi by sequestering viral dsRNA in cells, which probably prevents its recognition and cleavage by the host endonuclease Dicer 46. Therefore, the N protein also represents a prime immune evasion factor of SARS-CoV-2, contributing to the pathogenicity of this novel coronavirus. Consequently, the nucleocapsid phosphoprotein can be an attractive target, for example, to inhibit stages of the viral life cycle or to recover the host's immunity mediated by an antiviral RNAi system.
The structure of N protein from coronavirus is composed of the domains N-terminal RNA binding (N-NTD), C-terminal dimerisation (N-CTD) and central Ser/Arg (SR)-rich linker 48. At the time of writing of this manuscript, there are 12 experimentally solved structures for the N protein, where three of them are related to the N-NTD. We provide to the users experimentally solved structures of the N-NTD already prepared for docking. The preparation of the N-CTD will be available soon. Currently, there are no drugs or potential compounds experimentally validated as SARS-CoV-2 N protein inhibitors.
Herein, we provide five monomeric structures of N-NTD obtained by NMR experiments (PDB code 6YI3) to account for the protein flexibility, especially of the basic finger moiety, which is commonly locked in one conformation in the available X-ray structures due to crystal lattice contacts 49. According to studies with the N protein from Influenza virus, antiviral drugs targeting N proteins should stabilise the monomeric form, induce abnormal oligomerisation or interfere with RNA binding 50. These studies also suggested that the monomeric form binds to the replicating viral RNA in infected cells.
Recently, surface plasmon resonance (SPR) analysis experiments of SARS-CoV-2 N-NTD showed low binding affinities for different kinds of ribonucleotides (AMP/UMP/CMP), except GMP, suggesting potentially distinct ribonucleotide-binding mechanisms between the SARS-CoV-2 and HCoV-OC43 N proteins 48. Some important characteristics observed in the experimentally solved structures of the SARS-CoV-2 N-NTD that may explain these findings are: (i) the N-terminal tail is highly flexible and adopts more open conformations than in HCoV, probably allowing the interaction with high-order structures of the viral RNA genome; (ii) the replacement of Tyr102 in HCoV by Arg89, located near the nitrogenous base recognition site; and (iii) a phosphate-binding site containing Thr54 and Ala55 in SARS-CoV-2 instead of Ser67 and Gly68 in HCoV.
According to NMR-based titration experiments of N-NTD with a short double-stranded RNA (5'-CACUGAC-3' and 5'-GUCAGUG-3'), the amino acid residues Ala50, Thr57, His59, Arg92, Ile94, Ser105, Arg107, Arg149 and Tyr172 were proposed to form the molecular interface of the N-NTD:RNA complex 49. Curiously, Tyr109 and Tyr111, key residues involved in RNA recognition in other CoV N proteins such as that of HCoV, were not affected by RNA binding in the NMR titration experiments. However, they are well conserved among coronaviruses and occupy the same spatial region in the SARS-CoV-2 structures as in the HCoV-OC43 structure (PDB code 4LI4 51).
Thus, we provide to the users five distinct conformations of the N-NTD solved by NMR, obtained after clustering the 31 conformations containing Glu174 in an open conformation and selecting representative structures according to the flexibility of the residues Arg102 and Tyr109 (Figure 5). The suggested binding site for docking experiments is centred on the hotspot located at the surface of the N-NTD between the finger and palm subdomains, which has been claimed to be essential for RNA binding and a target site for small molecules 49.
Spike glycoprotein
The Spike (S) protein is a class I virus fusion protein 52. It is the limiting factor for the virus to enter the host cell 53 and uses Angiotensin-Converting Enzyme 2 (ACE2) and human dipeptidyl peptidase-4 (hDPP4) as the main receptors 54.
Spike is a homotrimer in which each monomer has about 180 kDa and contains approximately 1,255 residues 55. It consists of the N-terminal domain (NTD) S1 subunit, which covers residues 1-667 and mediates binding to the receptor, and the C-terminal domain (CTD) S2 subunit, which covers residues 668-1255 and is responsible for the fusion between the virus and host membranes 55. The S1 subunit is the main target for the development of new drugs because it contains the region that forms the interaction interface with the host receptor, called the Receptor Binding Domain (RBD), located between residues 333 and 527. The RBD region that makes direct contact with the receptor is called the Receptor Binding Motif (RBM) and is located between residues 438 and 506 56.
By describing the conformation states "up" and "down" (6-8) of Spike's S1 structure, it is possible to illustrate the states of interaction with the receptor 57. In the “down” configuration the RBD is in a receptor-inaccessible state, while in the “up” configuration it is in a receptor-accessible state. Since the ACE2 receptor only interacts with the RBD when it is in the “up” conformation, the “down” conformation would leave the RBD inaccessible to ACE2 or to any possible inhibitor at this interface 58. For this reason, all the structures that we provide to the users are in the “up” conformation.
To date, there are 69 structures available in the PDB related to the SARS-CoV-2 Spike protein, of which nine are complexed with ACE2 and four with neutralising antibodies. Currently, we provide at the DockThor-VS platform three Spike structures: the Spike-ACE2 complex, the same Spike structure without ACE2 (PDB code 6M0J 56), and the Spike conformation found in the PDB code 7BZ5 59 without the neutralizing antibody (Figure 6).
In the preparation process, we kept the Asn and Gln flips predicted by the PrepWizard/PROPKA tool, since some of them are part of the Spike protein-protein interaction interface and may be influenced by the interacting partner. For example, Gln493 and Asn501 were predicted with distinct flips when Spike is complexed with ACE2 (PDB code 6M0J) or the neutralizing antibody (PDB code 7BZ5).
Non-synonymous variations in the selected targets
The ongoing pandemic spread of SARS-CoV-2 is generating thousands of genome sequences (available in the GISAID repository, https://www.epicov.org, 140,000 sequences on 07/10/2020). This massive sequencing of SARS-CoV-2 genomes allows innumerable comparative, evolutionary and epidemiological analyses, as well as the identification of genetic mutations, such as synonymous or non-synonymous variations (NSVs), deletions and nucleotide insertions. For rational drug design in particular, more attention is given to the study of NSVs in the coding regions, since the substitution of amino acids can affect fold, binding affinity, post-translational modification, protein-protein interaction (PPI) and other protein characteristics 60. Even so, some NSVs may not produce visible changes in the structure of the protein; in that case, the mutation may not have a biological impact (neutral). Alternatively, with the intra- and inter-host viral evolution in infected humans (quasispecies dynamics), purifying selection can eliminate over time deleterious mutations, which are more detrimental to the pathogen's fitness, while positive selection promotes the spread of beneficial ones 61.
The estimated mutation rate underlying the global diversity of SARS-CoV-2 is approximately 6×10⁻⁴ nucleotides/genome/year 62, which is considered moderate for coronaviruses that have the Nsp14 proofreading correction mechanism. Currently, the genomic analysis of more than 55 thousand circulating genomes from patient samples shows more than 16 thousand non-synonymous substitutions among 26 out of the 29 proteins encoded in the SARS-CoV-2 genome in comparison with the reference genome sequence of isolate Wuhan-Hu-1 (NC_045512.2) (CoV-GLUE 63, accessed on July 7, 2020). The distribution of this genomic diversity shows high allele frequencies for five amino acid replacements in just four proteins, namely Spike (D614G, 75.81%), Nsp12 (P323L, 75.62%), N (R203K, 29.95%; G204R, 29.88%) and ORF3a (Q57H, 22.93%). The remainder corresponds to numerous NSVs with low allele frequencies (~11% to 0.002%). Ultimately, this can be explained by the positive selection that acts at a higher rate after the zoonotic transfer, suggesting an increasing mutant load in the circulating strains of SARS-CoV-2 in the epidemiological scenario 64.
Considering 55,189 SARS-CoV-2 genomes, the total number of variants per target exceeded one thousand replacements for Nsp12 (1,131), Spike (2,078) and Nsp3 (3,196), while this value was lower than 1,000 for N (848), Nsp15 (707) and Nsp5 (377) (CoV-GLUE 63, accessed on July 7, 2020). We then chose a total of 16 NSVs among the six targets (Table S1), whose corresponding variant structures are available to the users through the DockThor-VS web server. For the selection of these mutations, we assessed the occurrence of the residue in the catalytic region and its possible interaction with a ligand, as well as the difference in amino acid properties (hydropathy, charge and side chain) between the residue in the reference genome and its replacement in the patient sample. We also took into account the effect on the biological function (neutral vs deleterious) and searched the literature for mutagenesis experiments with evidence of alteration in the protein’s molecular function and/or viral fitness in CoVs involving the focused residue.
For PLpro, we chose three amino acid substitutions with neutral functional effect, whose corresponding residues fall into the Peptidase C16 domain (Table S1). For the Mpro, we have so far selected only the replacement M165I; this residue lies on a beta-sheet and is directly part of the ligand binding site, with its side chain oriented towards the ligand (Table S1). For the RdRp, we selected one non-synonymous variation, namely G683V, with deleterious functional effect (Table S1). The replacement G683V increases the volume of the side chain of a highly conserved glycine 35 and has already been described in vitro as a deleterious NSV 31. For Nsp15, we selected four non-synonymous variations, S293A, S293T (Ser294 in PDB 6VWW), Y342C and Y342H (Tyr343 in PDB 6VWW), whose residues fall directly on the ligand binding site. The NSVs S293A and S293T are particularly interesting since Ser293 is the key residue for enzyme discrimination between uracil and cytosine, or adenine and guanine bases 41. For the nucleocapsid phosphoprotein, we selected the NSVs A50V, R92S and R149L, whose residues fall on the RNA-binding surface of its cognate domain 65. These substitutions have a neutral (A50V and R92S) or deleterious (R149L) predicted functional effect (Table S1).
Finally, for the Spike glycoprotein of SARS-CoV-2, we selected four amino acid substitutions (N439K, F456L, G476S and V483A) (Table S1), whose residues are located on the receptor-binding motif (RBM), which comprises residues 437 to 508 56. The residues Phe456 and Asn439 are both important for the interaction interface with the human receptor ACE2. Regarding Phe456, it is interesting to mention that a single amino acid substitution on the equivalent residue in the SARS-CoV Spike glycoprotein (Leu443) affected both antibody binding and neutralisation 56. Similarly, mutagenesis assays in SARS-CoV on the equivalent residue of Asn439 (Arg426) demonstrated that at least two amino acid substitutions significantly reduced binding to ACE2 66. Concerning the residue Gly476, deletion mutagenesis of the equivalent positively charged region in the RBD of the SARS-CoV Spike (SΔ, 422-463) abolished the ability to induce potent neutralising antibodies in vivo as well as to mediate viral entry 67. On the other hand, studies with the equivalent position of the residue Val483 in MERS-CoV (Ile529) showed that a single amino acid substitution reduced the host’s receptor affinity, with a consequent increase in resistance to antibody-mediated neutralisation 68.
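The comparison of amino acid properties used as one of the selection criteria can be illustrated with a minimal sketch. It assumes the Kyte-Doolittle hydropathy scale and a simplified formal-charge assignment at neutral pH (histidine treated as neutral); neither choice is stated in the text, and the function name `describe_substitution` is hypothetical. The NSV labels are taken from the selections described above.

```python
# Kyte-Doolittle hydropathy index (assumed scale; the text does not name one).
KD_HYDROPATHY = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

# Simplified formal side-chain charge at neutral pH (assumption: His neutral).
SIDE_CHAIN_CHARGE = {"D": -1, "E": -1, "K": 1, "R": 1}


def describe_substitution(nsv):
    """Parse a label such as 'G683V' and report the change in properties."""
    reference, position, replacement = nsv[0], int(nsv[1:-1]), nsv[-1]
    delta_hydropathy = KD_HYDROPATHY[replacement] - KD_HYDROPATHY[reference]
    delta_charge = (SIDE_CHAIN_CHARGE.get(replacement, 0)
                    - SIDE_CHAIN_CHARGE.get(reference, 0))
    return {
        "position": position,
        "delta_hydropathy": round(delta_hydropathy, 1),
        "delta_charge": delta_charge,
    }


if __name__ == "__main__":
    for label in ("M165I", "G683V", "R149L", "N439K"):
        print(label, describe_substitution(label))
```

Under these assumptions, R149L stands out as a loss of a positive charge combined with a large gain in hydrophobicity, whereas N439K introduces a positive charge with almost no change in hydropathy. Such descriptors only complement, and do not replace, the functional-effect predictions listed in Table S1.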
Virtual screening for drug repurposing
We performed virtual screening experiments with DockThor-VS for the e-Drug3D dataset at the reference pH (6.6 to 7.4) for all SARS-CoV-2 targets available at the platform so far (i.e., PLpro, Mpro, RdRp, NendoU, Spike and N protein) using the wild type genomic variant. When the protein target has more than one conformation, we adopted an ensemble docking strategy to select the top-scored binding pose according to the predicted affinity for each drug (see Section 3.6 for details).
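The selection step of the ensemble docking strategy can be sketched as follows. This is only an illustration of keeping, for each drug, the conformation with the lowest (most favourable) predicted score; it is not the DockThor-VS implementation. The function name and data layout are hypothetical. The two imatinib scores are those reported in this work for the Mpro conformations 6LU7 and 6W63, while the `drug_X` entry is invented for illustration.

```python
def best_pose_per_drug(scores_by_conformation):
    """Return {drug: (conformation, score)} keeping the lowest score per drug.

    scores_by_conformation maps each drug name to a dictionary of
    {receptor conformation: predicted affinity in kcal/mol}.
    """
    best = {}
    for drug, per_conformation in scores_by_conformation.items():
        # More negative scores mean stronger predicted binding.
        conformation = min(per_conformation, key=per_conformation.get)
        best[drug] = (conformation, per_conformation[conformation])
    return best


# Hypothetical input layout; only the imatinib values come from the text.
mpro_scores = {
    "imatinib": {"6LU7": -10.09, "6W63": -8.80},
    "drug_X": {"6LU7": -7.10, "6W63": -7.90},  # invented example values
}

if __name__ == "__main__":
    for drug, (conformation, score) in best_pose_per_drug(mpro_scores).items():
        print(f"{drug}: best in {conformation} ({score:.2f} kcal/mol)")
```

Taking the minimum over conformations means that a drug is ranked by its most favourable receptor state, which is why a compound that scores poorly in one conformation is not discarded.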
The virtual screening results of the drugs currently under clinical trials against COVID-19 are presented in Table 1. Considering all targets and all drugs evaluated, we found that the majority of the predicted binding affinities are in the high to low micromolar range (scores higher than -6.8 kcal/mol correspond to binding affinity values higher than ~10 µM, whereas scores lower than -8.2 kcal/mol correspond to submicromolar affinities). This result is expected since many drugs under clinical trials exhibited no activity or only modest inhibitory effects in some experimental studies reported in the literature 69–71. However, we found some interesting results that deserve to be highlighted. We identified ledipasvir, imatinib, lopinavir and daclatasvir as the most promising drugs under clinical trials. They all show a good multi-target profile and exhibit some predicted binding affinities in the low nanomolar range (Table 1).
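The correspondence between docking scores and affinity ranges quoted above can be checked with a short calculation. The sketch below assumes that the score is interpreted as a binding free energy ΔG and converted with the standard relation ΔG = RT ln(Kd) at 298.15 K; the temperature and the function name `score_to_kd` are assumptions, not values taken from the text.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal/(mol*K)


def score_to_kd(score_kcal_mol, temperature_k=298.15):
    """Convert a binding free energy (kcal/mol) into a dissociation constant (mol/L).

    Uses dG = RT * ln(Kd); the temperature is an assumed default.
    """
    return math.exp(score_kcal_mol / (GAS_CONSTANT_KCAL * temperature_k))


if __name__ == "__main__":
    for score in (-6.8, -8.2, -10.59):
        kd_micromolar = score_to_kd(score) * 1e6
        print(f"{score:7.2f} kcal/mol -> Kd of about {kd_micromolar:.3g} uM")
```

With these assumptions, -6.8 kcal/mol corresponds to roughly 10 µM and -8.2 kcal/mol to roughly 1 µM, in line with the thresholds stated above, while the best score in Table 1 (-10.59 kcal/mol) falls in the nanomolar range.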
Ledipasvir is an antiviral used to treat chronic Hepatitis C that targets the non-structural protein 5A from HCV (NS5A). It was predicted as the most potent Mpro inhibitor (score = -10.59 kcal/mol) among the drugs currently undergoing clinical trials against COVID-19. Ledipasvir was also predicted as the most potent drug under clinical trials against Spike and RdRp (scores of -9.66 kcal/mol and -9.16 kcal/mol, respectively).
Daclatasvir was the top-ranked drug against the N protein (score = -9.50 kcal/mol). Daclatasvir is also an inhibitor of NS5A from HCV and might also be a promising Nsp5 inhibitor, with a predicted binding score of -9.43 kcal/mol.
Lopinavir is another antiviral drug inhibitor of the HIV-1 protease administered in combination with other antiretrovirals in the treatment of AIDS. Herein, it was predicted as the most potent drug against Nsp15 in the virtual screening experiments.
The anticancer imatinib is an Abl kinase inhibitor with anti-SARS-CoV effect through blocking fusion of viral envelope with the cell membrane 73. According to the virtual screening results, it also potentially inhibits the SARS-CoV-2 Mpro target with a docking score of -10.10 kcal/mol, interacting at the binding site with key residues such as His163 and Met49.
Ivermectin, montelukast, posaconazole, ritonavir and telmisartan also show an interesting multi-target profile, with predicted binding affinities in the sub-micromolar range for at least five targets.
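The multi-target profile can be verified directly from the scores in Table 1. The sketch below counts, for each drug, the targets scored below the -8.2 kcal/mol sub-micromolar threshold quoted earlier. The scores are copied from Table 1; posaconazole is omitted because it does not appear in that table, and the function name is hypothetical.

```python
TARGETS = ("PLpro", "Mpro", "RdRp", "NendoU", "N protein", "Spike")

# Scores (kcal/mol) copied from Table 1, in the order given by TARGETS.
TABLE1_SCORES = {
    "Ivermectin": (-8.37, -9.26, -8.62, -8.94, -9.44, -8.78),
    "Montelukast": (-8.38, -9.08, -8.65, -8.52, -9.02, -9.28),
    "Ritonavir": (-8.69, -8.70, -7.63, -9.00, -8.85, -8.61),
    "Telmisartan": (-8.55, -9.38, -8.12, -8.42, -9.09, -8.60),
}

SUBMICROMOLAR_CUTOFF = -8.2  # kcal/mol, threshold quoted in the text


def submicromolar_targets(scores, cutoff=SUBMICROMOLAR_CUTOFF):
    """List the targets whose predicted score is below the cutoff."""
    return [target for target, score in zip(TARGETS, scores) if score < cutoff]


if __name__ == "__main__":
    for drug, scores in TABLE1_SCORES.items():
        hits = submicromolar_targets(scores)
        print(f"{drug}: {len(hits)} targets ({', '.join(hits)})")
```

For the four drugs checked, ivermectin and montelukast pass the threshold for all six targets, while ritonavir and telmisartan pass it for five, with RdRp as the exception in both cases.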
Prazosin is an alpha-1 antagonist used to treat hypertension that is currently under clinical trials to evaluate its efficacy and safety in preventing the COVID-19 cytokine storm. In our virtual screening experiments, it was predicted as the best Nsp3 drug among those currently under clinical trials with a moderate docking score of -8.84 kcal/mol.
Table 1. Virtual screening results of drugs undergoing clinical trials for the drug targets available at DockThor-VS. Affinity predictions (kcal/mol) are given for the top-energy pose according to the ensemble docking strategy. The top-scored drug for each target is underlined and the most promising target for each drug is highlighted in bold.
Name
|
# Studies1
|
PLpro
|
Mpro
|
RdRp
|
NendoU
|
N protein
|
Spike
|
Acetylcysteine
|
5
|
-6.13
|
-6.28
|
-6.48
|
-6.32
|
-6.28
|
-6.74
|
Amodiaquine
|
1
|
-7.57
|
-8.23
|
-7.23
|
-7.64
|
-7.96
|
-7.79
|
Atorvastatin
|
4
|
-7.65
|
-8.73
|
-7.37
|
-6.92
|
-7.97
|
-8.71
|
Atovaquone
|
2
|
-7.68
|
-8.28
|
-7.91
|
-8.50
|
-8.33
|
-8.12
|
Azithromycin
|
71
|
-7.90
|
-7.73
|
-7.73
|
-7.88
|
-7.75
|
-8.40
|
Baricitinib
|
12
|
-7.53
|
-7.96
|
-6.43
|
-8.26
|
-7.98
|
-7.61
|
Chloroquine
|
29
|
-7.92
|
-8.06
|
-7.49
|
-7.88
|
-8.10
|
-7.65
|
Chlorpromazine
|
2
|
-7.68
|
-8.41
|
-6.74
|
-7.71
|
-7.45
|
-7.61
|
Ciclesonide
|
4
|
-8.04
|
-8.68
|
-7.42
|
-8.45
|
-8.92
|
-8.00
|
Cobicistat
|
2
|
-8.36
|
-9.46
|
-8.58
|
-8.82
|
-9.16
|
-9.31
|
Daclatasvir | 6 | -8.83 | -9.43 | -8.63 | -9.02 | -9.50 | -9.02
Darunavir | 3 | -8.07 | -8.45 | -7.84 | -8.27 | -8.32 | -7.87
Deferoxamine | 3 | -7.56 | -7.86 | -7.43 | -7.87 | -8.20 | -7.81
Dexamethasone | 18 | -7.28 | -7.83 | -6.90 | -8.01 | -7.74 | -7.60
Disulfiram | 1 | -7.23 | -7.80 | -6.74 | -7.75 | -7.52 | -7.22
Eltrombopag | 1 | -7.96 | -8.48 | -7.78 | -7.60 | -8.56 | -8.40
Emtricitabine | 3 | -6.47 | -7.03 | -6.51 | -6.55 | -6.64 | -6.60
Fingolimod | 1 | -7.96 | -7.46 | -6.83 | -8.10 | -7.64 | -7.94
Hydroxychloroquine | 185 | -7.76 | -7.83 | -7.42 | -7.66 | -8.10 | -7.70
Ibuprofen | 2 | -6.86 | -7.43 | -6.70 | -7.26 | -7.19 | -6.89
Icatibant | 1 | -6.95 | -7.58 | -8.97 | -7.50 | -8.34 | -9.04
Imatinib | 4 | -8.79 | -10.10 | -7.69 | -8.78 | -8.66 | -8.61
Isotretinoin | 5 | -7.04 | -7.20 | -6.93 | -7.03 | -8.03 | -7.72
Ivermectin | 36 | -8.37 | -9.26 | -8.62 | -8.94 | -9.44 | -8.78
Ledipasvir | 3 | -8.47 | -10.59 | -9.16 | -9.25 | -9.36 | -9.66
Leflunomide | 2 | -7.15 | -7.83 | -7.13 | -7.79 | -7.42 | -7.17
Lopinavir | 43 | -8.80 | -8.47 | -7.77 | -10.12 | -9.06 | -8.76
Losartan | 12 | -7.56 | -8.42 | -7.51 | -8.51 | -8.39 | -8.03
Mefloquine | 1 | -7.77 | -7.46 | -7.03 | -7.49 | -7.17 | -7.47
Methylprednisolone | 17 | -7.76 | -7.99 | -6.81 | -8.05 | -7.74 | -7.68
Montelukast | 1 | -8.38 | -9.08 | -8.65 | -8.52 | -9.02 | -9.28
Niclosamide | 3 | -7.44 | -8.04 | -7.28 | -7.29 | -7.47 | -7.68
Nitazoxanide | 19 | -7.41 | -7.77 | -6.87 | -7.29 | -7.57 | -7.18
Oseltamivir | 9 | -7.12 | -7.03 | -6.83 | -7.33 | -7.25 | -7.64
Prazosin | 2 | -8.84 | -8.57 | -7.56 | -8.41 | -8.53 | -7.38
Ribavirin | 7 | -6.91 | -6.45 | -6.10 | -6.80 | -6.84 | -6.54
Ritonavir | 48 | -8.69 | -8.70 | -7.63 | -9.00 | -8.85 | -8.61
Ruxolitinib | 19 | -8.16 | -8.41 | -7.06 | -7.83 | -7.95 | -7.89
Sildenafil | 2 | -8.26 | -8.64 | -7.79 | -8.08 | -8.62 | -8.07
Sofosbuvir | 7 | -7.92 | -8.49 | -7.29 | -8.10 | -7.40 | -7.48
Telmisartan | 9 | -8.55 | -9.38 | -8.12 | -8.42 | -9.09 | -8.60
Tenofovir | 1 | -6.86 | -6.89 | -6.86 | -6.65 | -6.73 | -6.90
Thalidomide | 2 | -7.15 | -7.40 | -6.66 | -7.17 | -6.71 | -7.21
Tranexamic acid | 4 | -6.34 | -6.59 | -6.15 | -6.18 | -6.54 | -7.12
|
1 Number of clinical trials related to COVID-19 reported by Mapped Drug Intervention at ClinicalTrials.gov (accessed on 2020-09-03, https://clinicaltrials.gov/ct2/covid_view/drugs).
|
We also evaluated the top-20 drug candidates for repurposing that are not in ongoing clinical trials for each SARS-CoV-2 target (Table S2). One of the most interesting results is associated with the antiviral drug elbasvir, which was predicted to interact with both Mpro and N protein with docking scores lower than -10 kcal/mol, suggesting a multi-target effect of this drug. In addition to Mpro and N protein, elbasvir was also predicted to be within the top-20 drugs for other SARS-CoV-2 targets, i.e., Spike (score = -9.79 kcal/mol), NendoU (score = -9.61 kcal/mol) and RdRp (score = -9.21 kcal/mol).
Among the top-ranked drugs predicted to target Nsp3, none was predicted with a binding affinity lower than -10 kcal/mol. However, we can highlight bazedoxifene (score = -9.65 kcal/mol) and menaquinone (score = -9.50 kcal/mol) as interesting findings. Bazedoxifene is a selective estrogen receptor modulator used to prevent postmenopausal osteoporosis, and strong antiviral effects against SARS-CoV-2 have been reported for it 70,74. Due to its inhibitory effect on IL-6 signalling, bazedoxifene has been proposed as a promising drug to prevent the cytokine storm, ARDS and mortality in severe COVID-19 patients 75–77. Menaquinone (Vitamin K2) is one of the three types of Vitamin K, and a recent preliminary study suggested that Vitamin K deficiency was associated with poor prognosis in patients 78. However, there is no evidence that the administration of menaquinone helps to treat or prevent COVID-19.
The Mpro screening predicted six antiviral drugs within the top-20 best-affinity compounds, two of which (ledipasvir and velpatasvir) are currently in ongoing clinical trials. The anti-HCV drugs elbasvir, pibrentasvir, velpatasvir and ombitasvir are still not being evaluated in clinical trials but could be promising drugs for repurposing against COVID-19. The comparison between the experimental structures of Mpro suggests important conformational changes of amino acid side chains within the binding site, especially for the residues Met49, Asn142, Met165 and Gln189. Imatinib is an example of a compound whose affinity and binding pose predictions are both affected by the receptor conformation: it interacts better with the 6LU7 conformation (score = -10.09 kcal/mol, ranked 5th best compound) than with 6W63 (score = -8.80 kcal/mol, ranked at the 100th position). In the 6LU7 conformation, the pyridine group of imatinib is able to interact deep in the Mpro binding site, making a hydrogen bond with His163, whereas in the 6W63 conformation this pyridine moiety is exposed to the solvent (Figure 7). Other examples in which the virtual screening ranking was strongly affected by the receptor conformation are posaconazole against PLpro (ranked 7th in 6W9C and 1054th in 6WX4) and elbasvir against the NMR-derived conformations (ranked 1st in state 12 and 10th in state 10). These results show the importance of using several carefully selected conformations of a protein target in virtual screening experiments. Using only one particular receptor conformation can generate false-negative results: (i) by discarding promising ligands due to a bad ranking position; (ii) by generating an incorrect ligand binding mode that could harm further fully flexible interaction analysis of the protein-ligand complexes through molecular dynamics simulations, currently used by many groups 20,79.
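The point about multiple receptor conformations can be illustrated with a minimal Python sketch. It keeps, for each ligand, the best (most negative) score over all conformations and reports how much the score varies between them. The imatinib values are the ones quoted above for Mpro; the dictionary layout and the function name `ensemble_summary` are assumptions made for illustration and are not part of the DockThor-VS platform.

```python
# Docking scores (kcal/mol) per receptor conformation. The imatinib values are
# quoted in the text; the data layout itself is a hypothetical choice.
scores_by_conformation = {
    "imatinib": {"6LU7": -10.09, "6W63": -8.80},
}

def ensemble_summary(per_conf):
    """Summarize one ligand over several receptor conformations."""
    best_conf = min(per_conf, key=per_conf.get)  # most negative score is best
    best_score = per_conf[best_conf]
    worst_score = max(per_conf.values())
    return {
        "best_conformation": best_conf,
        "best_score": best_score,
        # Spread between the worst and best score across conformations.
        "spread": round(worst_score - best_score, 2),
    }

summary = {lig: ensemble_summary(s) for lig, s in scores_by_conformation.items()}
print(summary)
```

Ranking ligands by the best score over the ensemble, rather than by a single structure, is one simple way to avoid discarding a compound such as imatinib because of one unfavorable conformation. Other aggregation rules (for example, averaging) are equally possible.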
The best-scored drug against RdRp was the anticancer dactinomycin (score = -9.88 kcal/mol), a macrocyclic drug that binds to DNA and inhibits the synthesis of RNA. Together with the screening result, its mechanism of action might suggest that dactinomycin is able to interact at the RNA binding site of RdRp, which is currently available in the DockThor-VS platform in the free form (i.e., without the primer and metal ions observed in the 7BZ5 structure). Ribavirin, a known RdRp inhibitor, was predicted with a weak affinity (score = -6.10 kcal/mol), probably due to its concerted mechanism of action with the primer and metal ions.
The antiviral lopinavir was the only drug that achieved a docking score lower than -10 kcal/mol against NendoU. Despite this, we highlight lomitapide (score = -9.96 kcal/mol), a drug widely used to treat familial hypercholesterolemia that was recently found to exhibit anti-SARS-CoV-2 activity in a traditional CPE-based antiviral assay in Vero E6 cells 80. Additionally, lomitapide was also predicted to inhibit Spike with a docking score of -9.26 kcal/mol, suggesting a multi-target potential.
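The multi-target reasoning used for elbasvir and lomitapide amounts to counting the targets on which a drug scores below some cutoff. A minimal Python sketch under stated assumptions: the scores are the ones quoted in the text, the elbasvir Mpro score is omitted because only "lower than -10 kcal/mol" is reported, and the cutoff of -9.0 kcal/mol is a hypothetical value chosen purely for illustration.

```python
# Docking scores (kcal/mol) quoted in the text. The elbasvir Mpro score is left
# out because its exact value is not given.
scores = {
    "elbasvir": {"N protein": -10.21, "Spike": -9.79, "NendoU": -9.61, "RdRp": -9.21},
    "lomitapide": {"NendoU": -9.96, "Spike": -9.26},
}

CUTOFF = -9.0  # hypothetical cutoff, not a threshold used by the authors

def targets_below(drug_scores, cutoff):
    """Return the targets whose score is at or below the cutoff."""
    return sorted(t for t, s in drug_scores.items() if s <= cutoff)

multi_target = {drug: targets_below(s, CUTOFF) for drug, s in scores.items()}
for drug, targets in multi_target.items():
    print(f"{drug}: {len(targets)} targets -> {targets}")
```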
Elbasvir and fidaxomicin were predicted to interact with N protein with docking scores of -10.21 kcal/mol and -10.00 kcal/mol, respectively. As mentioned before, elbasvir is an antiviral drug predicted to interact favorably with multiple SARS-CoV-2 targets. Fidaxomicin is a macrocyclic lactone antibiotic with activity against C. difficile that targets the bacterial RNA polymerase 81. If its antiviral effect against SARS-CoV-2 is confirmed, fidaxomicin could be a promising drug to be repurposed, since it has almost no effect on normal colonic microflora and is approved for pediatric use in patients over the age of 6 months 82.
Ombitasvir was the top-ranked compound against Spike and the only compound achieving a docking score better than -10 kcal/mol. It is an antiviral drug targeting NS5A, used in combination with other antiviral compounds in the treatment of chronic Hepatitis C. To date, there are no experimental studies reporting anti-SARS-CoV-2 effects for this drug. However, it might be a promising candidate for repurposing because its mechanism of action is similar to that of other NS5A inhibitors already reported to have anti-SARS-CoV-2 activity. Furthermore, ombitasvir was predicted to interact with Mpro and NendoU at nanomolar concentration, suggesting that it can also exhibit a multi-target effect on SARS-CoV-2.
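To relate a docking score in kcal/mol to a "nanomolar" affinity, one can use the standard thermodynamic relation ΔG = RT ln(Kd). The sketch below assumes that the docking score can be read as an estimate of the binding free energy at 298.15 K. This is an approximation, since scoring functions only estimate ΔG, and the function name `score_to_kd` is hypothetical.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)

def score_to_kd(score_kcal_mol, temperature_k=298.15):
    """Convert a binding free energy estimate (kcal/mol) to Kd in mol/L.

    Assumes the docking score approximates the binding free energy, so that
    Kd = exp(dG / (R * T)).
    """
    return math.exp(score_kcal_mol / (R_KCAL * temperature_k))

kd_nM = score_to_kd(-10.0) * 1e9  # Kd in nanomolar for a -10 kcal/mol score
print(f"Kd for -10.0 kcal/mol: {kd_nM:.1f} nM")
```

Under these assumptions, a score of -10 kcal/mol corresponds to a dissociation constant of a few tens of nanomolar, which gives a sense of why -10 kcal/mol serves as a convenient reference value throughout this section.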
Impact of non-synonymous variations in virtual screening
One of the possible impacts of non-synonymous variations is to confer resistance to some drugs by changing the interaction profile between the compound and key amino acid residues at the binding site. The NendoU-lopinavir complex is an interesting example for evaluating the influence of the non-synonymous mutation Y343C on the predicted binding mode and affinity (Figure 8A). Lopinavir was predicted to interact with wild-type NendoU deep in the binding site with a docking score of -10.12 kcal/mol, making interactions with key residues such as the 𝜋-stacking between its phenyl ring and the Tyr343 side chain. As expected, the Y343C variation led to the loss of this interaction and a worse predicted affinity (-9.33 kcal/mol), with lopinavir interacting superficially in the binding cavity without reaching key residues such as Tyr343 and Ser294, while still making favorable interactions with the receptor. The N protein-elbasvir complex is another interesting example: the R92S variation weakened the predicted binding affinity (scoreWildType = -10.21 kcal/mol versus scoreR92S = -9.68 kcal/mol) and led to the loss of the hydrogen bond formed between the oxygen-containing heterocycle of elbasvir and the Arg92 side chain in the wild-type N protein, although without significantly changing the predicted binding mode (Figure 8B). In this context, the availability of both the wild type and selected non-synonymous variations, mainly those present in the binding sites targeted by small molecules, could be very useful for the design of more potent and effective compounds.
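The effect of each variation can be expressed as the difference between the variant and wild-type scores, where a positive shift means a weaker predicted affinity. A minimal Python sketch using only the values quoted above; the function name `score_shift` and the data layout are illustrative assumptions.

```python
def score_shift(wild_type, variant):
    """Variant minus wild-type score (kcal/mol); positive means weaker binding."""
    return round(variant - wild_type, 2)

# (target, drug, variation) -> (wild-type score, variant score), from the text.
pairs = {
    ("NendoU", "lopinavir", "Y343C"): (-10.12, -9.33),
    ("N protein", "elbasvir", "R92S"): (-10.21, -9.68),
}

shifts = {key: score_shift(wt, var) for key, (wt, var) in pairs.items()}
for (target, drug, variation), shift in shifts.items():
    print(f"{target}-{drug} {variation}: {shift:+.2f} kcal/mol")
```

Both variations shift the predicted score by less than 1 kcal/mol, so the change in the interaction pattern (loss of the 𝜋-stacking or of the hydrogen bond) is at least as informative as the score difference itself.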