Impact of the SARS-CoV-2 S-protein G476S mutation on the interaction with human ACE-2 receptor and neutralizing antibodies


 The G476S mutation of the SARS-CoV-2 S-protein occurs in the receptor binding domain (RBD), the region that binds to the human angiotensin-converting enzyme 2 (hACE-2) receptor and also the main target for neutralizing antibodies. The 476S variant was first reported in the USA. Emerging evidence show that the 476S variant resists neutralization by antibodies such as S2E12 and CC6.29. The impact of the mutation on the interactions with hACE-2 receptor and the dynamics of the S-protein, has not been not fully explored. Here, we provide insights into the structure dynamics of the 476S variant and investigate the impact of the mutation on interactions with hACE-2 and selected neutralizing antibodies. We report that the mutation induces a destabilization effect in the RBD and an increased flexibility for most of the receptor binding residues. The mutation, however, does not affect the interactions with the hACE-2 receptor. Both Gly-476 and Ser-476, although located within the hACE-2 interacting residue hotspot, do not contribute to the stabilization of the RBD-hACE-2 complex. Our findings suggest that both H014 and P2P-2F6 antibodies neutralize the 476G and 476S S-proteins with similar efficacy.


Introduction
The coronavirus disease 2019 (COVID- 19), which has been a global threat since November 2019 is caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). Symptoms at the early stages of the disease include fever, fatigue and dry cough 1,2 , but can progress to a severe state with symptoms of pneumonia and severe complications 3 . SARS-CoV-2 is an enveloped positive-strand RNA virus belonging to the genus Betacoronavirus 4 . The SARS-CoV-2 infection is initiated when the viral transmembrane spike (S) glycoprotein (S-protein) binds to the human angiotensin-converting enzyme 2 (hACE-2) receptor through a receptor-binding domain (RBD), causing membrane fusion and entry into host cells 6,7 . The RBD of the S-protein is also the major target for the robust host immune responses elicited against the virus, including neutralizing antibodies 8 . Neutralizing antibodies bind to the receptorbinding domain (RBD) of the S-protein in order to disrupt interactions between the S-protein and the receptor protein, preventing viral entry into host cells and subsequent infection [9][10][11] .
Recently, several variations of interest of the SARS-CoV-2 have been reported, a phenomenon common to RNA viruses 12 . The 614G variant, which became the dominant circulating strain globally, was described to favor the open conformation state of the S-protein, resulting in increased transmissibility and higher viral loads in patients 13,14 . In addition, the 480A/G variants of the SARS-CoV-1 S-protein have been reported to escape neutralization by the antibody 80R 15 . The current circulating strains of high concern (B.1.1.7, B.1.351 and P.1) contains mutations such as N501Y, E484K and K417N/T [16][17][18] . Emerging evidence show that the sensitivity to antibody neutralization vary for these variants [16][17][18] .
The 476S variants was rst reported in the USA and persisted for over 11 weeks 19,20 . This mutation, like those in the strains of concern, occurs in the RBD, the region responsible for hACE-2 binding and also the main target for neutralizing antibody binding. The G476S mutation attenuates a linear and a discontinuous B-cell epitope in the RBD 21 , which could be a possible escape mechanism to host immune responses. Other studies have shown that the 476S variant resist neutralization by antibodies such as S2E12 22 and CC6.29 23 .
The impact of the mutation on interactions between the S-protein and hACE-2 receptor, has not been fully explored yet. It is also unclear whether the mutation has an impact on the molecular dynamics of the S-protein. To provide insights into these issues, the present study aimed to investigate the in uence of the mutation (G476S) on the interactions between the SARS-CoV-2 S-protein and hACE-2 receptor as well as some selected neutralizing antibodies (H014 and P2P-2F6). We further aimed to study how the mutation affect the structure dynamics of the S-protein.

Method Building and Validating Protein Systems
The sequence of the SARS-CoV-2 S-protein belonging to the Wuhan Reference sequence was obtained from the UniProt database (https://www.uniprot.org/uniprot/A0A679G9E9). We also retrieved the human ACE-2 sequence from the same database. We built tertiary structures of the proteins using the SWISS-MODEL tool 24 . The S-protein and hACE-2 models were generated using the crystallographic coordinates of PDB entries 7DK3 and 6LZG, respectively. SWISS-MODEL is a homology modeling tool that relies on the OpenStructure computational structure biology framework 25 and ProMod3 26 modeling engine to build protein structures based on coordinates of templates 24 .
Moreover, the tool checks modeling errors and assess structure quality using the qualitative model energy analysis (QMEAN) scoring function 27 . SWISS-MODEL ranks among the top-performing modeling servers 24 . All structure quality assessments were performed with the SWISS-MODEL quality assessment tool. Having con rmed the suitability of the protein models, we generated the 476S variant from the wild-type S-protein structure using the PyMol v2.4 28 mutagenesis function. For the antibody-S-protein complexes, we superimposed the S-protein model to the PDB entries of the antibodies and removed the S-protein in the template. The H014 and P2B-2F6 templates were PDB entries 7CAK and 7BWJ.
Determining the degree of evolutionary conservation of S-protein residues The S-protein structure was submitted to the ConSurf tool 29 to calculate the degree of evolutionary conservation of the SARS-CoV-2 S-protein residues. ConSurf is widely used to determine the evolutionary dynamics of protein or nucleic acid residues. The algorithm performs BLAST searches of query inputs (amino acids or nucleic acid residues) against the UNIREF-90 database and generate phylogenetic trees based on derived multiple sequence alignment 29 .
Based on an empirical Bayesian methodology, ConSurf uses the Rate4Site algorithm 31 to compute evolutionary rates of each residue. These rates are normalized and ranked into grades from 1 (highly variable residue) to 9 (highly conserved residue). We used the CSI-BLAST as homologous search algorithm and a total of three number of iterations. The E-value cut-off was 0.0001. Total number of reference sequences to be used was 150 with 95% maximum sequence identity. The minimum identity for counterparts was set to 35%. The default values for alignment method, calculation method and evolutionary substitution model were used.
Simulating the molecular dynamics of protein systems All simulations were performed with GROMACS 31 version 2021. The parameters from the CHARMM36 32 force eld and TIP3P water model were used in all systems. To mimic the full length of both S-protein and hACE-2 receptor, we capped the N-termini and C-termini with acetyl group and N-methyl amide, respectively. The canonical states of all titrable amino acids were assigned at physiological pH. Van der-Waals interactions were treated with a single cut-off of 1.4 nm. Long-range electrostatics were treated with an advanced Particle-Particle Particle-Mesh algorithm (P3M-AD) and a cut-off of 1.4 nm. Neighbor searching was performed every 10 steps. We also applied periodic boundaries in all directions.
Protein systems were placed in a cubic box such they were 2 nm from the box edges. The systems were fully solvated and neutralizing counter-ions (NaCl) were added. To obtain a suitable starting structure for the simulation, the systems were energy minimized using the steepest descent algorithm for 20,000 steps. We then simulated the proteins systems with position restraints on heavy atoms under constant volume (NVT) ensemble for 400 ps and constant pressure (NPT) ensemble for 400 ps. Protein and non-protein components of the systems were independently coupled to V-rescale thermostat and isotropic Berendsen algorithm for pressure coupling. For molecular simulation analysis, production simulations were performed for 100 ns in triplicates without any constraints at a time step of 2 fs. Simulations for binding energy analysis were performed for 15 ns in triplicates. A summary of the simulation and protein systems is provided in Table 1.

Trajectory preparations and MD analysis
All trajectory preparations, including re-centering, tting, periodicity treatments, and concatenation of the trajectories, were performed with in-built GROMACS tools. The dynamics of molecular systems were assessed using root-meansquare deviation (RMSD) and root-mean-square uctuation (RMSF). The RMSD was calculated over all backbone atoms after least-squares t to the reference backbone, while the RMSF was calculated per residue after least-squares t to C-α atoms. All graphical representations were generated with GraphPad Prism v8.0. RMSD and RMSF plots were averaged over three independent simulations. Molecular visualizations of structures were done with PyMol v2.4 28 .

Binding energy calculations
The binding energy and individuals forces of interactions were calculated with g_mmpbsa v5.1.2 33 Table 1. For each complex, calculations were performed on 225 frames sampled over three independent 15 ns simulation. Binding energy and interaction energies were computed as previously described 33 . Brie y, the total binding energy is the summation of four individual energy terms; van der Waals, electrostatics, polar solvation and non-polar solvation energies. The current g_mmpbsa free energy of binding. The tool is however appropriate for measuring relative binding energies, such as when comparing different ligands/proteins that bind to the same receptor. The binding energy, E binding is represented as; where the bonded interactions (E bonded ) consist of bond, angle, dihedral and improper interactions. The non-bonded interactions include electrostatic (E elec ) and van der Waals (E vdW ) interaction forces which are calculated using Coulomb and Lennard-Jones potential functions, respectively. The polar solvation does not calculate contributions from entropic terms, i.e. does not give the absolute energy (G polar ), which is the electrostatic contribution to solvation is computed using the Poisson-Boltzmann equation. The non-electrostatic contribution to the solvation energy, which is estimated from the solvent accessible surface area (SASA) is represented by the non-polar energy (G nonpolar ).
Parameters for calculating energy terms and values for constants were adopted from a previous study 33 .

Results
Validation of the SARS-CoV-2 and human ACE-2 models  (Figure 1a). Comparison of the built models to a set of non-redundant protein structures of high quality in the PDB database also showed that the models were suitable for this study (Figure 1b).
We then performed a pair-wise alignment of the model structure and their respective templates. The alignments of the model-template structures are shown in Figure 1c. Having assessed the suitability of the model for this study, we used the PyMol v2.4 mutagenesis tool to create the 476S variant by replacing Gly-476 with Ser-476 in the RBD (Figure 1d).

Effect of the mutation on RBD structure dynamics
To determine the impact of the mutation on S-protein structure dynamics, we rst characterized the degree of evolutionary conservation of Gly-476 using the ConSurf tool 29 . Usually, the natural tendency of an amino acid mutating largely depends on its importance to the protein's structure and functions. The ConSurf conservation pro le for the S-protein show that most of the conserved residues are in the S2-subunit, with the most variable residues in the RBD and the NTD (Figure 2). The pro le shows a ConSurf score of 1 for Gly-476, suggesting that it is highly variable and easily mutable.
To assess whether G476S mutation induces any conformation change in the S-protein, we used root-mean-square deviation (RMSD), RMSD distribution and root-mean-square uctuation (RMSF) to compare the structural dynamics of the wild-type and 476S variant S-proteins (Figure 3a-3e). The RMSD describes the conformation stability of the protein system during the simulation by indicating the average displacement of backbone atoms in the protein structure with respect to the starting structure (equilibrated structure, t = 0 ns). The backbone-RMSD value of the wild-type S-protein averaged 1.08 nm, which increased to 1.18 nm in the case of the 476S variant S-protein, indicating a destabilization impact of the mutation (Figure 3a). To determine the most affected regions of the S-proteins, domainspeci c backbone-RMSD pro les were generated (Figure 3b). The destabilization effect of the mutation was largely observed in the RDB than the NTD and the S2-subunit. The backbone-RMSD of the wild-type RBD averaged 0.39 nm, which increased to 0.45 nm for the 476S variant RBD.
The structures from the simulation trajectories were clustered using RMSD cut-off of 0.1 nm and the gromos method, resulting in several cluster groups. The group-centre structure of the most populated cluster for both wild-type and 476S variant S-proteins were selected and superimposed. The structure superposition shows that the wild-type and 476S variant S-proteins differ by RMSD of 5.3 Å (Figure 3d). The difference in the RBD is also emphasized. The

Impact of the mutation on stabilization of the RBD-hACE-2 complex
To investigate the potential impact of the mutation on the interaction between the S-protein and hACE-2 receptor, we rst examined the dynamics of the RBD upon binding to the hACE-2 receptor. Comparing RMSD of the RBD in bound and unbound states, we observed that the RBD was destabilized upon binding to the hACE-2 receptor (Figure 4a), for both the wild-type and 476S variant RBD-hACE-2 complexes. Further, the residue-speci c exibility was signi cantly affected upon binding to hACE-2, most especially in the RBD. The residues in the RBD demonstrated less exibility upon binding to the hACE-2 receptor (Figure 4b). The most affected regions were residues 330-395 and the least affected region ranged from residues 473-503, including Gly-476 and Ser-476. This suggests a major participation in the interactions with hACE-2 for the residues 330-395.
We examined the interacting interface of the RBD-hACE-2 complex to determine whether Gly-476 and Ser-476 were located within the hACE-2 interacting residue hotspot during the simulation. The middle-structure of the most populated cluster group was selected for this activity. All amino acids that are within 5 Å of the hACE-2 receptor were considered, assuming that most forces of interaction are captured within this distance. Both Gly-476 and Ser-476 were located in the interacting residue hotspot of the RBD-hACE-2 complex during the simulation ( Figure 5).
To characterize how the mutation affect RBD-hACE-2 interactions, the binding energy and the individual energy terms governing the RBD-hACE-2 stabilization were calculated. The energy terms for each complex were calculated using 225 structure frames sampled over three independent 15 ns simulations. The estimated binding energies were highly comparable for the wild-type (-2068.268 ± 138.127 kJ/mol) and 476S variant (-1953.245 ± 128.816 kJ/mol), suggesting that the interactions with hACE-2 is not largely affected by the G476S mutation in the RBD. The computed individual energy terms governing the complex stabilization also show high similarity for the wild-type and 476S variant complexes (Table 2). To determine the contribution of the Gly-476 and Ser-476 to the overall binding energy, we computed the residue contribution to the stabilization of the interaction between the S-protein and hACE-2. The energy contribution pro le of the RBD residues are shown in Figure 6a. The contributions of Gly-476 and Ser-476 to the overall stabilization of the RBD-hACE-2 complexes were -5.7 kJ/mol and -2.7 kJ/mol, respectively. These contributions are insigni cant to the overall stabilization of the RBD-hACE-2 complexes. Based on their energy contributions, the residues ranging from 475-503, play no signi cant roles in the RBD-hACE-2 interactions. The largest contributors to the complex stabilization were from residues 346-424. Although Gly-476 is replaced with a polar residue in the 476S variant, individual energy terms such as electrostatic and van der Waals energy contributions were larger in the wild-hACE-2 complex than the 476S-hACE-2 complex. We, therefore, generated a distance plot for the Gly-476/Ser-476 and hACE-2. The minimum distance between the hACE-2 receptor and Gly-476 averaged 0.23 nm, which increased to 0.43 nm for Ser-476 ( Figure   6b).

Impact of the mutation on interactions with H014 and P2B-2F6
Previous studies have established that the 476S variant resists neutralization by particular antibodies such as S2E12 22 and CC6.29 23 . Thus, we studied the impact of the mutation on interactions with two neutralization antibodies that target the RBD but have different mechanisms of neutralization. The antibodies H014 and P2B-2F6 were selected for this particular study. Although both antibodies target the RBD, H014 neutralizes by inducing sterical clashes to the RBD and interfere with hACE-2 binding 34 . P2B-2F6, however, directly competes with the RBD for hACE-2 binding 35 (Figure 7a).
Analysis of the individual residue contributions to the complex interactions revealed that both Gly-476 and Ser-476 have insigni cant contributions to the stabilization of both RBD-H014 and RBD-H014 complexes (Figure 7b). Moreover, residue contributions for both RBD-H014 and RBD-P2B-2F6 complexes were similar for the wild-type and 476S variant. The energy terms governing the RBD and neutralizing antibody interactions are summarized in Table 3.
The estimated binding energy and individual energy terms did not vary signi cantly between the wild-type and 476S variant interactions with the neutralizing antibodies. These suggest that the mutation has no effect on the interactions between the RBD and the neutralizing antibodies studied.

Discussion
Here, we investigated the structure dynamics of the 476S variant of SARS-CoV-2 and characterize the features of interaction with hACE-2 receptor and some selected neutralizing antibodies. We report that the 476S variant has a destabilized RBD and increased residue-speci c exibility in the domain when compared to the wild-type. Despite the observed changes in conformation particularly in the RBD, interactions with hACE-2 were not signi cantly affected by the mutation. The wild-type and 476S variant RBDs interacted with the hACE-2 receptor with highly similar interaction energies. We studied the S-proteins complexed to the antigen binding fragments of neutralizing antibodies H014 and P2B-2F6. Our results suggest that both antibodies have equivalent neutralizing potency to the 476G and 476S Sproteins.
The G476S mutation occurs within a cluster of amino acids that de ne the receptor interacting interface in the RBD 36 . The current virus strains that are classi ed as variants of interest contain mutations such as N501Y and E484K, all occurring in the said cluster of amino acids. Usually, the tendency of a residue mutating largely depends on its degree of evolutionary conservation. The ConSurf evolutionary conservation pro le show that the RBD residues that interact with the hACE-2 receptor are highly variable with a high tendency of mutation.
Despite the replacement of Gly-476 with a more polar serine residue in the RBD, the interactions with the hACE-2 receptor were not affected signi cantly. Particularly, both Gly-476 and Ser-476 were not critical to the stabilization of the RBD-hACE-2 complex. We observed that RBD residues ranging from 473-503 were least affected in terms of residue-speci c exibility, upon binding to the hACE-2. The same cluster of residues, including residue 476, were the least contributors to the stabilization of RBD-hACE-2 complex.
The G476S mutation attenuates a linear and a discontinuous B-cell epitope in the RBD 21 , which could be a possible escape mechanism to host immune responses. The 476S has also been shown to resist antibody neutralization by S2E12 22 and CC6.29 23 . We studied the S-proteins complexed to the antigen binding fragments of neutralizing antibodies H014 and P2B-2F6. The mutation had no signi cant impact on the interactions with the antibodies. Both Gly-476 and Ser-476 provided no contributions to the interactions between the RBD and the antibodies. Thus, resistance of antibody neutralizing by the 476S variant largely depends on the contribution of the Ser-476 to the antibody-RBD interactions and the epitopes recognized by the antibody.
There could be several variants with mutations in the RBD having mechanisms of antibody neutralizing resistance or improved virus survival but are present at low frequencies in SARS-CoV-2 infected populations 37 . The ndings of this study suggest that the binding of the S-protein RBD, H014 Fab and P2B-2F6 Fab to the hACE-2 receptor are not affected by the mutation. However, previous reports suggest that the 476S variant has mechanisms of escaping the host immunity. Robust surveillance should be performed to track the distribution and spread of the 476S variant.

Conclusion
In this study, we investigated the structure dynamics of the 476S variant of SARS-CoV-2 S-protein, and describe the features of interaction with the hACE-2 receptor and selected neutralizing antibodies. Compared to the wild-type, the 476S variant has a destabilized RBD and high residue-speci c exibility in the domain. Despite the observed changes in conformation, especially in the RBD, the mutation had no effect on interactions with hACE-2. The wild-type and 476S variant RBDs had very similar energies when they interacted with the hACE-2 receptor. S-proteins bound to antigen binding fragments of neutralizing antibodies H014 and P2B-2F6 were also studied. Our ndings suggest that both antibodies neutralize the 476G and 476S S-proteins with similar e cacy.  The degree of evolutionary conservation of Gly-476.

Figure 3
Impact of the G476S mutation on the structural dynamics of the S-protein.

Figure 5
The interacting interface of the complex showing the amino acids within 5 Å of each protomer. Gly-476 and Ser-476 have been circled. The interface was produced from the middle-structure of the most populated cluster group.