Computational analyses of the G476S variant of SARS-CoV-2: A focus on the interaction with human ACE-2 and neutralizing antibodies


 Recently, several mutations in the SARS-CoV-2 genome have been identified and reported. However, little is currently known about the influence of these mutations on the infectivity, transmissibility and antigenicity of the virus. Here, using an integrative computational approach, we characterized the G476S variant of SARS-CoV-2 focusing on interactions with ACE-2 and neutralizing antibodies. The substitution of Gly-476 to Ser-476 in the SARS-CoV-2 Receptor-binding domain (RBD) largely affected the structural dynamics of the S-protein leading to significant influence on the interactions with ACE-2 and neutralizing antibodies. Structural properties of the S-protein such as conformation changes, residual fluctuations and residue surface area largely varied between the wild-type and G476S variant, especially in the RBD. Analyses of the interaction energies between S-protein and ACE-2 suggest that the G476S variant may have enhanced interactions with ACE-2 compared to the wild-type. The G476S variant was found to have weaker interactions with the neutralizing antibody H014 compared to the wild-type. Collectively, our findings have implications for the infectivity and antigenicity of the G476S variant of SARS-CoV-2.


Introduction
The coronavirus disease 2019 (COVID- 19) has caused global damage with over 40 million reported cases and more than 1 million deaths, as of 19 th October, 2020 (https://coronavirus.jhu.edu/map.html). The disease is caused by the severe acute respiratory syndrome coronavirus-2 (SARS-CoV-2). Symptoms at the early stages of the disease include fever, fatigue and dry cough 1,2 , but can progress to a severe state with symptoms of pneumonia and severe complications 3 . SARS-CoV2 is an enveloped positive-strand RNA virus belonging to the genus Betacoronavirus 4 . The virus has four structural proteins namely; spike (S) glycoprotein, envelope (E), membrane (M) and nucleocapsid (N) proteins, which perform important roles during entry and survival of the virus in the host cell 5 . The S-protein is attached to the surface of the virion as homotrimeric protein which facilitates viral entry into host cells by using the angiotensinconverting enzyme-2 (ACE-2) as receptor 6 . The S-protein is composed of S1 and S2 subunits. The S1 subunit binds to ACE-2 through its receptor binding domain (RBD) whiles the S2 is responsible for membrane fusion 7 . Most of the robust immune responses elicited against the virus, including neutralizing antibodies, target the RBD of the S-protein 8 . Neutralizing antibodies usually bind to the Receptor-binding domain (RBD) with the aim of interfering with interactions between the S-protein and receptor protein, thus, providing a mechanism to prevent viral entry into host cells and subsequent infection [9][10][11] . Mutation rates in RNA viruses are extremely high and may give the virus an advantage over its host 12 . Single nucleotide polymorphisms in the protein coding regions of the viral genome cause amino acid substitutions, which may affect viral protein structure and functions. For example, D614G variant of SARS-CoV-2 favors the open conformation state of the S-protein (Fig. 1), resulting in increased transmissibility and higher viral loads in patients 13,14 . In addition, D480A/G variant of the SARS-CoV-1 Sprotein variant has been reported to escape neutralization by the antibody 80R 15 .
Several amino acid substitutions in the S-protein of SARS-CoV-2 have been reported and some of these mutations have been investigated to determine the impacts on viral infectivity, transmissibility and interactions with neutralizing antibodies [16][17][18][19][20] . One mutation, G476S, has been recently reported based on analysis of SARS-CoV-2 genomic sequences obtained from the USA and was found to persist for over 11 weeks 16,20 . This mutation occurs in the RBD, which is the region responsible for ACE-2 binding and also the main target for antibody binding. It is largely unknown as to whether this mutation could interfere with interactions with neutralizing antibodies or in uence the interaction with ACE-2.
In this study, we investigated the in uence of the mutation (G476S) on the virus interactions with ACE-2 and neutralizing antibodies. We used an integrative computational approach involving protein modeling, molecular dynamics simulations and Molecular Mechanics with Poisson-Boltzmann Surface Area (MM-PBSA) interaction energy analysis to study how the G476S variant interacts with ACE-2 and neutralizing antibodies. We report that the mutation results in enhanced interaction with ACE-2 and the G476S variant of SARS-CoV-2 may have lower sensitivity to the antibody H014 compared to the wild-type. We also report the impact of the mutation on the molecular dynamics of the S-protein.

Preparation of protein structures
The UniProt database (https://www.uniprot.org/uniprot/) was used to access the sequence of SARS-CoV-2 S protein and human ACE-2. Protein sequences of all neutralizing antibodies were obtained from RCSB Protein Data Bank (https://www.rcsb.org/). Structural models of proteins were generated and validated using the SWISS-MODEL server (https://swissmodel.expasy.org/). Given a protein sequence as input and a structural co-ordinate as template, the server runs several algorithms to predict and model the 3D conformation of the query protein based on the co-ordinates provided and ranks them based on protein quality 21 . The templates used for the homology modeling were as follows; S-protein wild-type in closed state (PDB ID: 6XR8), S-protein wild-type in open state (PDB ID: 7CAI), ACE-2 (PDB ID: 6M0J) C105 (PDB ID: 6XCN), H014 (PDB ID: 7CAH), P2B-2F6 (PDB ID: 7BWJ). The amino acid substitution was performed using the PyMol mutagenesis tool. Molecular systems comprising a monomeric S-protein complexed to ACE-2 or the antigen binding fragment (Fab) of neutralizing antibodies C105, H014 and P2B-2F6 were studied. The complexes were generated through structural superposition analysis with the PyMol's structural alignment module. Brie y, the models were superimposed to their respective templates and the unwanted chains, fragments and residues removed.

Molecular dynamics simulation
The wild and G476S S-protein molecular systems were simulated using GROMACS 2020.3 package 22 and CHARMM36 force eld 23 . The proteins were centered in a cubic box, placed 1 nm from the box edges and solvated with spc216 water model. Appropriate ions were added to the system to electrostatically neutralize the molecular system. Energy minimization was performed on the system using the steepest descent algorithm for 10,000 steps with a maximum force threshold of 100 kJ/mol/nm. Van der Waals interactions were treated with a single cut-off of 1.216 nm and long-range electrostatics were treated with the Particle-Mesh Ewald (PME) method with 0.16 FFT grid spacing and 4 th order B-spline interpolation. Neighbour search was performed every 20 steps using the grid method with Verlet cut-off scheme.
Protein and non-protein components of the system were independently coupled to v-rescale thermostat and an isotropic Berendsen algorithm for pressure coupling. The pressure was maintained by weak coupling to a reference pressure of 1 bar, with an isothermal compressibility of 4.6x10-5 bar-1. The bond lengths within the protein were constrained using the LINCS algorithm. Trajectories for the Molecular Mechanics with Poisson-Boltzmann Surface Area (MM-PBSA) energy analysis were generated with GROMACS 5.1.5 and CHARMM36 force eld. A 10-ns simulation was performed for each complex, following a 400 ps NVT and NPT equilibrations. Trajectories were sampled every 0.5 ns and the nal 8 ns of each run was used for the energy calculations. Details of the simulations and molecular systems are summarized in Table 1.  In-built GROMACS tools were used for all trajectory processing, which included re-centering, tting, periodicity treatments, clustering and concatenation of the trajectories before subjected to analysis. The rst 20 ns of the 100 ns and the rst 2 ns of the 10 ns simulation trajectories were discarded and the remaining used for the analyses. The dynamics of the molecular systems were evaluated based on properties such as root-mean-square-deviation (RMSD), root-mean-square-uctuation (RMSF) and solvent accessible surface area over the period of the simulation production. RMSD was calculated over all backbone atoms after least-squares tting to the reference backbone, while the RMSF was calculated per residue after least-squares tting to reference Cα atoms. The initial equilibrated structures were used as the reference structures for all calculations. Molecular visualizations were accomplished with PyMol and data plots were generated with the GRACE plotting tool.

Clustering of structures
The simulation produces several structural frames which can be clustered based on the structural deviations from the reference structure. The backbone atoms of the protein structures were used for the structure superimposition and clustering. We used the gromos method and a clustering RMSD cutoff of 0.20 nm. The algorithm counts number of neighbors using the RMSD cut-off, takes structure with largest number of neighbors with all its neighbors as cluster and eliminate it from the pool of clusters. This is repeated for remaining structures in the pool. For each trajectory, the middle structure of the top ranked cluster group was selected as the cluster representative and used for further analyses.

Molecular Mechanics with Poisson-Boltzmann Surface Area (MM-PBSA) energy calculations
The GROMACS tool g_mmpbsa 5.1.2, which is based on MM-PBSA 24 was used for the energy calculations. The individual binding free energy of each component in the system is expressed as: where, E mm represents the molecular mechanics energy term, G solv represents the solvation energy term. It is worth noting that the entropic term is exempted from the calculations particularly due to the high computational demand and there are evidence that the net contribution of the entropic term is often minimal [25][26][27] . This is why the binding energy is designated as E binding instead of delta G. The E mm is made up of all bonded and non-bonded energies in the system, thus, can be expressed as: where E bonded is the bonded interactions consisting of bond, angle, dihedral and improper interactions. E nonbonded represents the non-bonded interactions which include both electrostatic (E elec ) and van der Waals (E vdW ) interactions which are calculated using a Coulomb and Lennard-Jones potential function, respectively. The bonded energy term is taken as zero when the single trajectory approach is used 24 . The solvation energy term (G solv ) is expressed as: G solv = G polar + G nonpolar where G polar represents polar solvation energies and G nonpolar is the non-polar solvation energies. G polar , which is the electrostatic contribution is calculated from solving the Poisson-Boltzmann equation. The non-electrostatic term of solvation energy, G nonpolar , includes repulsive and attractive forces between solute and solvent that are generated by cavity formation and van der Waals interactions, respectively 24 .

Results And Discussion
Molecular dynamics of wild-type and G476S S-proteins To determine differences in the structural dynamics of the wild and mutant S-proteins during the simulation, RMSD of protein structures were computed. RMSD measures the overall deviation of certain group of atoms in the protein with respect to a reference structure, thereby providing details on structural changes and stability during the simulation 28 . The structural deviation of the proteins after least square t to their corresponding reference structures (equilibrated initial structures) over the last 80 ns of the simulation were used to generate a RMSD distribution curve, shown in Fig. 2a. The RMSD distribution pro le show that backbone atoms of the wild-type S-protein undergo relatively smaller changes in conformation compared to those of G476S. In the case of the wild-type backbone atoms, structural deviations ranged from ≈0.05 nm -≈0.25 nm while the structural deviations of the G476S backbone ranged from ≈0.05 nm -≈0.5 nm. Cluster representatives of the two proteins were structurally superimposed and their pair-wise alignment as well as the resulting structural deviation are shown in Fig.  2b. A structural deviation of 11.5 Å indicate a signi cant difference in structural conformation of the two proteins, as observed from the RMSD distribution.
Realizing the difference in conformation of the two proteins, we determined and compared the residue uctuations in the wild and G476S S-proteins. RMSF analyses were performed on the Cα-atoms of the proteins using the last 80 ns of the simulation. The equilibrated structural con gurations were used as references for the calculations. As shown in Fig. 3a, the results reveal that there are signi cant differences in the uctuation patterns of residues corresponding to the RBD (residues 330-510) of wildtype and G476S S-proteins. The uctuation patterns were also projected as beta factors in 3D-structures of the proteins and we observed that residue uctuations were relatively higher in the G476S RBD than in the wild-type RBD (Fig. 3b). The data suggest that the mutation largely in uences the dynamics of residues in the RBD.
We then computed and plotted solvent accessible area (SASA) of protein structures to further characterize the in uence of the mutation on protein dynamics. SASA calculates the surface area of protein that is available for access by solvents, thus, gives a measure of residue exposure and a rough idea of surface expansion. Whole protein and residue accessible surface area were computed and plotted, shown in Fig. 4. The average SASA for wild and G476S S-proteins were 101 ± 12 nm 2 and 102 ± 13 nm 2 , respectively (Fig. 4a). This suggests that the mutation does not alter the overall solvent accessible surface area. However, decomposition of SASA on residue basis revealed that there are large differences in residue area of wild and G476S S-proteins. The regions most affected are illustrated in

Impact of mutation on interactions with ACE-2
Having observed wide differences in the structural properties of wild and G476S S-proteins such as residue uctuations and surface exposure, we next investigated the in uence of the mutation on the interaction with the receptor ACE-2. This was achieved by subjecting the S-protein-ACE-2 complexes to 10 ns simulations and using the resulting trajectories to calculate the energies governing their interactions. The MM-PBSA tool, g_mmpbsa, used for the binding energy calculations does not address contributions from the entropic terms and therefore in principle does not provide the absolute free energy of binding. Nonetheless, the tool is appropriate for determining relative binding energies to compare the interactions between different ligands binding to the same receptor. The results, shown in Table 2, show large differences in the interaction energies for the wild-ACE2 complex and G476S-ACE2 complex. Interestingly, the total binding energy for the G476S-ACE2 complex (-1928.8 ± 222.9 kJ/mol) was relatively higher compared to the wild-ACE2 complex (-1670.9 ± 166.3 kJ/mol). Moreover, contribution from each energy term was higher for the G476S-ACE2 complex than the WILD-ACE2 complex. The molecular mechanics energies (E elec and E vdW ) and polar solvation energy (E polar ) dominated the contributions to the overall binding energy for the complexes.
Analyzing the interacting interface dynamics of the S-ACE-2 complexes revealed that the number of residues that have polar contacts with the receptor differ between the wild-type and G476S S-proteins. It is well known that key residues in the receptor binding motif of SARS-CoV-2 are critical to recognition and binding of the S-protein to ACE-2 [29][30][31] . The residues of the wild-type S-protein that had polar contacts with the receptor protein were restricted to the receptor binding motif, including Gly-476 (Fig. 5a). Notably, we observed that Ser-476 in the G476S variant was not involved in any polar interaction with residues of the receptor protein. However, the total number of residues involved in polar interactions with the receptor was higher in the G476S variant compared to wild-type (Fig. 5b). These included charged amino acid residues (Glu-484, Lys-458, Phe-456, Tyr-449, Gln-414, Arg-408, Asp-405, Glu-406, Ser-375, Phe-374, and Phe-377) that were absent from the wild-ACE-2 interactions.  Table 2: Energy terms governing the S-ACE-2 interaction. Values are represented as mean ± standard deviation.

Energies
These charged residues are capable of forming salt bridges and hydrogen bonds with receptor residues, contributing to better interactions for the G476S variant. This supports the higher contributions from molecular mechanics energy terms seen for the G476S-ACE-2 complex relative to the wild-ACE-2 complex. Binding of the S-protein to ACE-2 is mainly governed by hydrogen bonds and electrostatic interactions in the S-ACE-2 complex 32 . We computed and plotted the hydrogen bonds between the S-protein and ACE-2, as shown in Fig. 5c. We observed that ≈34 hydrogen bonds were involved in the interaction between the G476S variant and the receptor while ≈15 hydrogen bonds were involved in the wild-ACE-2 interactions.
Having observed that the mutation altered the interaction with ACE-2, we compared the structural evolution of the wild-ACE-2 and G476S-ACE-2 complexes to determine the dynamics of the interface residues during the interactions. As shown in Fig. 6, we observed that residues at the interacting interface of the G476S RBD are more exposed compared to the wild RBD. In the case of the wild-type, these residues have limited contacts with the receptor protein resulting in weaker interactions between the Sprotein and ACE-2. Therefore, the substitution of Gly-476 to Ser-476 largely in uenced the interaction between the S-protein and host receptor ACE-2, which was consistent with the changes in RBD dynamics such as residue uctuation and surface area.

Impact of mutation on the interaction with neutralizing antibodies
The impact of the mutation on the interaction between neutralizing antibodies and the RBD was investigated, following our observation of large differences in the dynamics of the two proteins. MM-PBSA analysis was used to determine the energy contributions to the overall interaction between the RBD and neutralizing antibodies. Three antibodies reported to have potent neutralizing activities against the SARS-CoV-2 (H014, C105 and P2B-2F6) were used for the study.
The neutralizing potential of H014 has been well-characterized and in-vivo studies show that the antibody can cause signi cant reduction in SARS-CoV-2 titers 9 . H014 neutralizes the virus by binding to the RBD and interfering with receptor interactions through heavy protein clashes (Fig. 7a). We mapped the epitopes recognized by H014 to residues on the side of the RBD as shown in Fig. 7b. These residues appear to be very close to the receptor binding motif which is the region where ACE-2 binds, explaining the occurrence of protein clashes and competitive binding between H014 and ACE-2 9 .
The data from the energy calculations suggest that wild-type RBD has better interactions with antibody H014 than the G476S variant. The contributing energy terms were relatively higher for the wild-H014 complex compared to the G476S-H014 complex, which resulted in a higher overall binding energy (-464.0 ± 141.2 kJ/mol) for the former compared to the latter (-318.8 ± 117.3 kJ/mol). As reported in Table 3, the difference in the overall binding energy for the wild-H014 and G476S-H014 was mainly from the electrostatic energy term, which is calculated from coulomb interactions. We next explored the cause of the large difference in electrostatic energies between the wild-H014 and G476S-H014 complexes. This was achieved by comparing the interface dynamics of the complexes during the simulation. 246.6 ± 90.3 Table 3: Energy terms governing the interaction between RBD and neutralizing antibodies. Values are represented as mean ± standard deviation.
We observed that ve charged amino acid residues (Asn-370, Ser-371, Arg-408, Asn-501, and Gln-506) involved in the interaction between wild-type RBD and H014 were not present in the interaction between the G476S variant and antibody H014 (Fig. 8). These charged residues could be involved in the formation of hydrogen bonds and salt-bridges with receptor residues contributing to the electrostatic interactions.
We next compared the structures of the complexes from the simulations to determine how these residues differ among the wild and G476S during interactions with H014. Interestingly, we observed that residues Asn-370, Ser-371, Arg-408, Asn-501, and Gln-506 are more exposed in wild-type compared to the G476S variant ( Fig. 9) in the H014-bound state. The residues are buried within the interface in the wild RBD, thereby reducing contacts with the H014 residues.
Based on the results, it can be predicted that the G476S variant of SARS-CoV-2 has lower sensitivity to the antibody H014 compared to the wild-type. Indeed, several other mutations in the RBD have been reported to have effective resistance to the antibody H014 which include Y508H and D614G+A435S 17 .

Page 10/12
The interactions between the RBD and antibodies C105 and P2B-2F6 were comparable for wild-type and G476S. As reported in Table 3, the following overall binding energies were obtained from the energy calculations: WILD-C105 = 79.5 ± 126.7 kJ/mol, G476S-C105 = 85.4 ± 155.9 kJ/mol, WILD-P2B-2F6 = 227.1 ± 104.7 kJ/mol, and G476S-P2B-2F6 = 246.6 ± 90.3 kJ/mol. Other mutations in the RBD such as V483A and F490L have been reported to resist neutralization by the antibody P2B-2F6 17 . However, our results suggest that the neutralizing activities of antibodies P2B-2F6 and C105 against G476S variant of SARS-CoV-2 would be similar to the wild-type. The in uence of the mutation (G476S) on antibody neutralization therefore varies and may depend on the type of antibody and epitopes recognized the antibody. The neutralizing antibodies C105 and P2B-2F6 recognized epitopes in the receptor binding motif, which is entirely different from the H014 binding region (Fig. 6b). As noted from the residual uctuation and surface area analysis of the S-protein, the dynamics of the residues in the receptor binding domain were widely different.

Conclusion
Using integrative computational analyses, we investigated the structural dynamics of the G476S variant of SARS-CoV-2 and assessed the in uence of this mutation on the interactions with ACE-2 and neutralizing antibodies. The substitution of Gly-476 to Ser-476 in the RBD of SARS-CoV-2 S-protein has serious implications with respect to the infectivity and antigenicity of the virus. In the case of viral infectivity, which is initiated by an interaction with the host receptor ACE-2, the G476S variant showed stronger interactions with the ACE-2 compared to the wild-type. On the neutralization of the virus by antibodies, the results suggest that the in uence is dependent on the epitopes recognized by the antibody. The surveillance of the G476S mutations should be increased to track the distribution and spread of the G476S variant.