Multiple sequence alignments of the SARS-CoV-2 S proteins show three variants present on Mexican population (H49Y, T573I, and D614G), the rest of the residues are conserved (Fig. 1). These variants above mentioned are far from the RBD domain of S protein.
D614G change is caused by an A/G nucleotide mutation at position 23,403 in the reference strain32, this variant is being associated with higher viral loads and enhances viral effectivity in patients8,32,33 despite the higher viral loads, G614 is neutralizing similarly as D614 by polyclonal antibody32. To date, this variant becomes the dominant form displacing the WT, according to the levels of mutation in the world, presented on the nextstrain database (www.nextstrain.org). H49Y variant is produced by C/T change at position 21,70710. The properties of residues H/Y changes from positive to neutral charge9, this variation causes a reduction in the total free energy, while the mutants with D614G substitutions showed stabilizing structure, suggesting a prevalence role in S protein evolution13. Finally, regarding T573I mutants, until now, there is no information available, but the change by T/I implies a modification from polar residue to non-polar hydrophobic residue changing the chemical environment to more hydrophobic site. Even though these are punctual modifications due to the chemical nature of the substitution, modification at the structural level is expected.
Molecular dynamics simulations
The mutations on the 3D structures of S were done using PDB: 6VSB crystal on PyMol (Fig. 2), and proceed to develop the Molecular dynamic simulation with Amber. RMSD was calculated to determine the average deviation in the atomic stability under MD simulations. All RMSD trajectories reached the equilibrium after 30 ns of MD simulation (Fig. 3A). The Rg values for WT showed expansion at the last 30 ns of MD with values from 47.5 to 50 Å. D614G and T573I show compactness since 20 ns of trajectory with values from 49.5 to 44.8 Å and 50.4 to 43.79 Å, respectively. H49Y presents compactness from 48.47 to 43.9 Å in the whole simulation (Fig. 3B). Rg values, suggests that variants have a similar degree of compactness while the WT shows an opposite behavior. We explored protein flexibility by RMSF values of the Cα through MD simulations of the S proteins and its variants (Fig. 3C). Seven principal peaks of fluctuation were located on WT showed up as the most flexible areas situated at 223-288, 688-729, 730- 776, 823- 869,
946-1024, and at the C- terminus from 1060-1143; for D614G: 57-77, 395-594, 722- 780, 823-870, 948-1032, and the C- terminus from 1064-1143; H49Y presented a different pattern of flexibility showing punctual fluctuations, where the most flexible residues were 67, 112, 47, 182, 255, 444, 476, 557, 632, 701, 838, 845 and the C-terminus from 1135-1143 T573I showed three flexible regions located from 55-94, 401-525, 947- 1026 plus the C-terminus from 1138-1143.
Clustering analysis
Representative ensembles were calculated from the last 30 ns of the MD simulation. 70% of the most populated conformations were grouped into the first 10, 13, 15, and 13 clusters for WT, D614G, H49Y, and T573I, respectively (Table 1), this clustering dispersion indicates that proteins possess a complex structural behavior. Although, it was observed that WT protein depicted lesser clustering dispersion than mutated protein, which pointed out that mutated spike is more complexed in their structural behavior. Only the most populated cluster conformation was retrieved from each one of the systems, and it is represented in Figure 4. The most populated cluster conformation of the WT showed more difference regarding the native structure (RMSD = 9.625 Å, Fig. 4A), D614G (RMSD = 9.998 Å, Fig.4B) showed a higher RMSD than WT while, the other mutated proteins depicted lesser difference in decreasing order as follow, H49Y (RMSD = 6.965 Å, Fig. 4C) >T573I (RMSD = 6.858 Å, Fig. 4D). Specifically, WT structural difference was observed in most of the structure, and little differences were observed at 270-318 that belongs to N- terminal domain (NTD) and 531-700, that belongs to S1; however, D614G mutant depicted difference in this region which indicate that a single residue mutation affected the stability of this “central portion of the WT protein. In both cases, WT and D614G difference into secondary structures were observed as well as displacements in loops, α-helices, and β- sheets structures. While in H49Y variant, not greater structural changes were observed in comparison to WT, thus it seems that H49Y induces lesser structural changes that are in agreement with RMSD and RMSF observations, where punctual fluctuations were observed. These observations are in agreement with in silico predictions of energetic stability induce by mainly by H49Y mutations and in lesser grade by D614G13. In T573I variant, fluctuation from 535-580 was observed reduced, therefore T573I mutation seems to induce structural stability into WT protein as well as in the zone around where the mutation is found.
Table 1. Cluster analysis using RMSD cut-off of 2.0 A.
Cluster population (%)
|
Cluster
|
WT
|
D614G
|
H49Y
|
T531I
|
1
|
10.6
|
10.4
|
6.8
|
8.2
|
2
|
9.6
|
7
|
6
|
7.4
|
3
|
9
|
6.6
|
6
|
7.4
|
4
|
7
|
5.4
|
5.4
|
7.2
|
5
|
6.6
|
5.2
|
5.4
|
5.8
|
6
|
6.6
|
5.2
|
5.4
|
5.4
|
7
|
6.4
|
4.6
|
4.6
|
5.4
|
8
|
6.4
|
4.6
|
4.4
|
5
|
9
|
6
|
4.6
|
4.2
|
4.6
|
10
|
6
|
4.2
|
4.2
|
4.6
|
11
|
5.2
|
4.2
|
4.2
|
4.4
|
12
|
4.8
|
4.2
|
4
|
4.4
|
13
|
4.2
|
4
|
3.6
|
4
|
14
|
3.8
|
4
|
3.6
|
4
|
15
|
3.4
|
3.8
|
3.6
|
3.6
|
16
|
2.6
|
3.4
|
3.4
|
3
|
17
|
1.8
|
3.4
|
3.2
|
3
|
18
|
|
3.4
|
3.2
|
3
|
19
|
|
3.2
|
2.8
|
3
|
20
|
|
3.2
|
2.6
|
2.8
|
21
|
|
3
|
2.6
|
2
|
22
|
|
2.4
|
2.4
|
1.8
|
23
|
|
|
2.4
|
|
24
|
|
|
2.4
|
|
25
|
|
|
1.8
|
|
26
|
|
|
1.8
|
|
Principal component analysis
Exploration of the principal components that contribute to the global motions of WT protein and mutant systems was performed (Fig. 5A) on the MD simulations. According to PCA analysis, the first 15 eigenvectors captured 92-96% of the total motions (94.8, 96.2, 92.8, and 96.1 % for WT, D614G, H49Y, and T563I, respectively) (Fig. 5B). Whereas, the projections of the first two eigenvectors (PC1 vs PC2) contributed to 48-71% of the collective motions (67.7, 74.4, 48.1, and 71.3 % for WT, D614G, H49Y, and T563I, respectively) (Fig. 5 C-F). By projections onto the essential subspace of PC1 vs. PC2, it was observed that the WT system showed different mobility behavior in comparison to the variant systems. Similar cluster distribution was observed for WT, D614G, and T563I variants, having slightly higher conformational mobility for D614G and T563I than WT (Fig. 5 C, D, and F). In comparison, H49Y showed a more compact cluster distribution than the other systems, which suggest a reduction in conformational mobility due to single residue mutation (Fig 5D). T563I mutant showed dissimilarly conformed distribution along the subspace in comparison to the others systems, this behavior suggest that the trajectories sampled different regions of the phase space with different minima but small energy barrier, in this sense T563I mutant affects the structural behavior of the protein with is also reflected into the conformations observed during MD simulation (Fig 5F). These results can be corroborated from the analysis of the graphical representation of the full mobility along PC1 and PC2 (Figure 6), which allows us to study the direction and magnitude of the motions that contribute to the total system's mobility. According to this projection, the movement on WT was uniformed, and by regions, they showed the same direction, i.e., RBD region and S1 and S2 (Fig. 6 A). However, for D614G the direction of the movements change mainly on S2 region and in NTD segment (Fig. 6 B), which is in agreement with the cluster distribution of PC1 vs PC2 projection (Fig. 5D), this modification not only increases the flexibility of the hinge but also it changes the direction which might contribute to the increase virulency of this mutant because of might alter the binding to ACE protein13. Previously, it was found that D614 formed a hydrogen bond with T859 and a salt bridge with K854 placed on the subunit two of the other protomer, thus the change of D by G might provide a flexible space between the monomers and more conformational flexibility as our MD simulations suggest which might lead to improved accessibility to ACE2, which are consistent with more cell entry of the virus that carries this mutation 8,34. This motion perturbation is explained because D614 residue is located on a loop exposed side chain, has a negative charge, and that will be a loss on the variant G614, this causes loss of interactions with other molecules or residues. The G614 has a neutral charge, is smaller, and introduces a more hydrophobic residue at this position, and this can result in loss of hydrogen bonds and/or disturb correct folding. Glycines are very flexible and can disturb the required rigidity of the protein at this position 13. In the case of H49Y mutant, the whole mobility decreased, which was reflected in the magnitude of the porcupine representation in Figure 6C, besides the direction of the movement was completely altered in comparison to WT. It was significantly decreased on S2 region (816-1106), suggesting that this mutant not only confers more structural stability as we found but also energetic stability as it was suggested by other authors13. Aside, it was found that the H49Y mutant virus has increased cell entry in comparison to WT S protein8.
Finally, on the T563I mutant at S1 region, an increment on the magnitude of the movements was observed, while on S2 region, the direction of the movements was on the contrary direction to that observed on WT (Fig 6D). So, this mutation altered both magnitudes a direction of the mobility of the protein. In this case, until now, there is not much more evidence about the behavior of the virus that possesses this mutation, so it should be explored to correlate with our findings.
Molecular docking with known ligands
Finally, to explore how these punctual mutations affect the protein-ligand binding, a molecular docking with Cepharenthine, Hydroxychloroquine, and Nelfinavir which are experimental reported spike ligands were performed 29–31
Cepharanthine showed potent antiviral activity In vitro against SARS-CoV-2, and by molecular docking, spike protein was suggested as its target. It is suggested that cepharanthine is bound to RBD, and might interfere with human ACE2 binding avoiding the anchorage of the virus to the cell30,35. In wild type complex, cepharanthine formed a hydrogen bond with E484 and S494 and hydrophobic interactions with L455, F456, Y489, F490, L492, and Q493 which are residues involved in the Interactions between viral spike and ACE2 (7A & D)36. Cepharanthine was bound to WT spike protein with the highest binding free energy (-6.57 kcal/mol), followed by H49Y ( -6.42 kcal/mol) which also interacted with residues belonging to RBD by only hydrophobic interactions with Y449, N450, Y451, L452, F490, L492, S494 and Q493 (Fig. 7A and F). Cepharanthine was a little bit displaced regarding wild type-cepharanthine complex, which might be responsible for the binding energy detriment. While D614G and T5631 showed lesser binding free energy (-5.95 and -5.39, respectively). Cepharanthine also interacted with residues of the RBD domain of D614G mutant, by hydrophobic interactions with Y449, N450, Y451, L452, E484, F490, L492, Q493, and S496 (Fig. 7E). While within the T563I mutant, Cepharanthine was moved away from the typical site where the other complexes interacted, and formed hydrogen bond with R509 and hydrophobic interactions with F342, N343, A344, W436, N437, S438, N439, and N440 none of these residues belong to ACE2-RBD interacting residues (Fig. 7G). It seems that mutation on T563I might affect the binding and maybe the biological effect of cepharanthine as anti-SARS-COV-2 agent37.
Hydroxychloroquine is also a compound with effective in vitro inhibition activity against SARS-CoV-2 infection, besides it elevates the endosomal pH and affects the ACE-2 terminal glycosylation avoiding the virus binding38. Two independent computational work suggested that hydroxychloroquine might interact with the RBD domain of spike protein, through it can disrupt the viral-host recognition to avoid viral infection 29,30. Therefore, in this study, we selected hydroxychloroquine to performed docking with WT spike protein and with mutants. Similar behavior with cepharanthine was observed, hydroxychloroquine reaches the binding site on RBD WT spike protein as well as D614G and H49Y, while in T563I mutant hydroxychloroquine was moved away from this site, and it was reflected on the detriment of the binding free energy (Table 2, Fig. 7B). Hydroxychloroquine in wild type formed a hydrogen bond with Y489 and L492, and hydrophobic interactions with Y449, L452, F456, E484, G485, C488, F490, E493 and S494 (Fig. 7H). In the interaction with D614G, hydroxychloroquine formed hydrogens bond with E484, F490, L492, and Q493 and hydrophobic interactions with S494, Y449, L452, L455, and P491 (Fig. 7I). And with H49Y formed hydrogen bonds with S349, Y449, N450, and S494 and hydrophobic interactions with R346, F347, A348, Y351, A352, Y451 and L452 (Fig. 7J). in wild type spike protein and both D614 and H49Y mutant Hydroxichloroquine interacted with residues of the RBD that participate in ACE-RBD recognition36,37. At the same time, Hydroxychloroquine in T563I mutant interacted with F342, W436, and R509, and hydrophobic interactions with S438, L441, N343, A344, S373, F374, N437, S438, which are residues that are not involved in ACE2-RBD interaction36 (Fig. 7K).
Nelfinavir is currently marketed as an anti-HIV drug, and recently it was demonstrated that inhibit the cell fusion mediated by spike protein of SARS-CoV-2, by molecular docking, it was identified that nelfinavir is placed between fusion peptide and HR1 helix near the S1/S2 cleavage site31. Therefore, nelfinavir was assayed by molecular docking against WT and mutants forms of the spike protein. Nelfinavir was bound to wild type spike protein with the highest binding free energy (-7.88 kcal/mol) followed by H49Y (-7.52 kcal/mol), D614G (-6.10 kcal/mol) and finally, T563I (-5.38 kcal/mol). Regarding the binding site, all ligands were allocated around HR1, FP region, and S1/S2 cleavage site, but Nelfinavir reaches different binding sites in every protein studied (Fig. 7C). Nelfinavir in WT spike protein was placed in the HR1 region and formed hydrogens bond with K310, I312, and Q314, and hydrophobic interactions with L303, E309, G311, Y313, I664, P665 and I666 as previously reported (Fig. 7L)31. While Nelfinavir with D614 mutant was also placed on the HR1 region but interacted with some different residues than in the wild type, formed a hydrogen bond with N317, S596, K947 and Q1010 and hydrophobic interactions with Q314, F592, G593, I1013 (Fig. 7M). Whereas Nelfinavir with H49Y only interacts with a couple of residues of HR1 region by hydrophobic interactions (A942 and S943), and with residues of the N terminal domain (S13-L303) by hydrogens bond T302, L303, K304 or hydrophobic interactions V47, L48, Y49, S305, in this case, the displacement outer from the HR domain was more evident than with D614G. Still, these interactions were energetically favorable (Table 2, Fig. 7N). In the case of Nelfinavir-T563I mutant complex, Nelfinavir interacted mainly with residues of the HR1 region by hydrogen bonds S943, D950 and D954, and hydrophobic interactions with K947, V951,R1014, it also formed hydrogen bond with K310 and hydrophobic ineraction with P665 (Fig. 7O), eventhoug this binding mode was energetically lesser favorable than those observed with mutants and WT protein.
Therefore, the ligand-protein interaction depicted a similar energy behavior than the observed with cepharanthine and hydroxychloroquine, where WT protein-ligand complex has the highest energy, while T563I-ligand complex has the lowest energy of binding, which could indicate that mutation on T563I position could be relevant for drug-spike protein interaction affecting not only the binding mode but also the binding free energy.
Table 2. binding free energy of compounds docked with Wild type spike protein and mutants.
|
Binding free energy
(Kcal/mol)
|
Cepharanthine
|
|
D614G
|
-5.95
|
H49Y
|
-6.42
|
T563I
|
-5.39
|
WT
|
-6.57
|
Hydroxychloroquine
|
D614G
|
-4.55
|
H49Y
|
-4.66
|
T563I
|
-4.19
|
WT
|
-4.46
|
Nelfinavir
|
|
D614G
|
-6.1
|
H49Y
|
-7.52
|
T563I
|
-5.38
|
WT
|
-7.88
|