Enhancement of SARS-CoV-2 Receptor Binding Domain -CR3022 Human Antibody Binding Affinity via In silico Engineering Approach

CR3022 and ACE2 would lead to a more efficient blockade of virus entry. Methods : In this regard, the amino acids with central roles in the binding affinity of CR3022 antibody to spike protein were substituted. The best mutations to increase the affinity of antibodies were also selected based on protein-protein docking and molecular dynamics simulations. Result : The variants 45 (H:30I/G, H:55D/F, H: 103S/Y, L:59T/F, L:98Y/A), 60(H:31T/D, H:55D/E, H:103S/Y, L:59T/D, L:98Y/F), 67(H:30I/G, H:55D/F, H:103S/Y, L:56 W/L, L:59T/Y, L:61E/G), 69(H:31T/D, H:55D/F, H:103S/Y, L:59T/F, L:98Y/A), and 71(H: 31T/D, H:55D/F, H:103S/Y) with respective binding affinities of -167.3, -167.5, -161.6, -173.0, and -169.8 Kcal/mol had higher binding affinities against the RBD of the SARS-CoV2 spike protein compared to the wild-type Ab. Conclusion : The engineered antibodies with higher binding affinities against the target protein can improve specificity and sensitivity. Thus, a more successful blockade of the ACE2 is achieved, resulting in a better therapeutic outcome. In silico studies can pave the way for designing these engineered molecules avoiding the economic and ethical challenges.


INTRODUCTION
Coronaviruses, a highly diverse family of enveloped positive-sense single-stranded RNA viruses, are generally classified into four major genera: Alpha, Beta, Gamma, and Delta coronavirus. Severe acute respiratory syndrome coronavirus (SARS-CoV) and Middle East respiratory disorder coronavirus (MERS-CoV) are classified as βcoronaviruses. The novel coronavirus SARS-COV-2, the causative agent of coronavirus disease 2019 (COVID- 19), is also in the β-coronaviruses group. This disease initially emerged in China in late 2019, and a global pandemic started in March 2020 (1,2).
A primary human receptor for 2019-nCoV is angiotensin-converting enzyme 2 (ACE2). As a type I membrane protein, the enzyme is expressed on the surface of different cells such as lung, kidney, intestine, and heart cells. It reduces blood pressure and plays a role in vasoconstriction. Furthermore, decreased ACE2 expression was related to several cardiovascular diseases (3)(4)(5)(6). Two separate studies recently reported that ACE2 was involved in the new SARS-CoV-coronavirus entry into the human body (7). Since the virus entry is mediated by the receptor-binding domain (RBD) of spike (S) glycoprotein, it becomes a suitable target for generating neutralizing antibodies that can prevent virus entry. RBD of 2019-nCoV spike protein has six amino acid substitutions compared to the RBD in SARS spike protein, enabling interaction with ACE2 receptor for SARS-CoV-2 entry (3,4,8). Recent studies identified several potent monoclonal antibodies against SARS coronavirus spike protein. Neutralizing antibodies (nAbs) against SARS-CoV-2 produced by experimental vaccination or monoclonal antibody (mAb) technology in passive transfer experiments protected mice from infection by reducing the viral load (9)(10)(11)(12)(13).
Among several expressed and purified SARS-CoVspecific antibodies, CR3022 antibody (Ab) targeting RBD showed a stronger neutralizing activity. This antibody in ELISA and Battleford Light Infantry (BLI) assays bound potently to 2019-nCoV RBD (14). The CR3022 Ab could be developed as a candidate therapeutic, alone or in combination with other neutralizing antibodies, to prevent or treat 2019-nCoV infections (14,15).
This study aimed to improve the binding affinity of neutralizing CR3022 Ab against RBD of S protein via in silico protein engineering. Higher binding affinity could improve the diagnostic and therapeutic performance of CR3022 Ab.

MATERIAL AND METHODS
CR3022 complementarity-determining Region Prediction. The Ab binding activity is principally mediated by the complementarity-determining region (CDR).
The Paratome web server (http://ofranservices.biu.ac.il/site/services/paratome/) (16) was used to predict the antigen-binding regions (ABRs) of Ab. The amino acid sequence or 3D structure of the protein was submitted to do the analysis. The Paratome was designed by the structural alignment of a non-redundant set of all established Ab-antigen (Ag) complexes in the PDB. DiscoTope (http://tools.iedb.org/discotope) was used to predict discontinuous epitopes from 3D structures of proteins in PDB format.
CR302 conservation of amino acid positions evolution. Estimating the evolutionary conservation of amino acid (AA) positions in protein sequences was assessed by ConSurf server (http://consurf.tau.ac.il) (17). The estimation of AA position was performed based on the phylogenetic relations among homologous sequences. The degree of AA is evolutionary conservation (i.e., its evolutionary rate) and heavily depends on its structural and functional value. Therefore, the importance of each AA position for structure or function was assessed by conservation analysis of AAs' position among the family members. The rate of evolution was calculated based on the evolutionary similarity between the protein and its homologs.
The Consurf parameters to perform the analysis were as follows: the homolog search algorithm was set to HMMER, the E-value cutoff was set to 0.0001, and the proteins database was set to UniRef90. This database clusters sequences and sub-fragments with 11 or more residues that have at least 90% sequence identity (from any organism) into a single UniRef entry, displaying the representative sequence. CR3022 Interfaces prediction. Potential proteinprotein interaction sites were identified by iterative "mapping" of the interaction sites of each structural neighbor involved in a complex to individual residues in the query protein. We used PredUs (18) web server, which is a flexible, interactive, template-based tool. The other tool we used for predicting protein-protein interaction sites was cons-PPISP (consensus Protein-Protein Interaction Site Predictor) [18]. The cons-PPISP uses a consensus neural network method and predicts the residues that will probably form the binding domain for another protein.
The solvent accessibility and deleterious mutations sites were predicted by WESA (https://pipe.rcc.fsu.edu/wesa/) with an accuracy of 80% (19). The decisive prediction for each residue relies on a weighted total of the specific predictions. The final prediction for each residue is based on a weighted sum of the individual predictions. Residues are predicted as buried or exposed with an expected accuracy of 80%. The exposed residues are defined as having a surface area greater than 20% of the maximum area for amino acid type.
The novel method for predicting partner-specific protein interfaces from .pdb files or input sequences was assessed by Extreme Gradient Boosting (xgBoost) (20). This method relies on Interface Prediction of Specific Partner Interactions (BIPSPI) available at http://bipspi.cnb.csic.es/xgbPredApp/. In this model. The .pdb coordinate format was used as the input file format.
CR3022 binding sites and pocket detection. To identify multiscale pockets on the protein surface, we used mathematical morphology via the Grid-based HECOMi finder) program (GHECOM) at http://strcomp.protein.osaka-u.ac.jp/ghecom/ (21). The structure file was fed in the .pdb format to do the analyses. The maximum radius for the large probe was set as 10.0 angstrom. We calculated the non-grid spherical probes (for UCSF DOCK/sievgene) as off.
The protein-protein interaction site was predicted by Meta-PPISP (https://pipe.rcc.fsu.edu/meta-ppisp.html) (22,23). Several matching methods have been established for calculating protein-protein interaction sites. Meta-PPISP sought to enhance prediction reliability and accuracy by integrating results from separate predictors and reports. The meta-PPISP is developed on three separate web servers, including Promate, cons-PPISP, and PINUP. The meta-PPISP outperforms all three separate servers.
Significant residues selection. Some AAs were selected as significant residues in the CR3022 structure by utilizing the outcome of different software such as Paratome, BIPSPI, Cons PPISP, PredUS, Meta-PPISP, and WESA. These residues were sited in one of three 2021 Vol. 9 No. 3 CDR regions and estimated by the Paratome server. The residues in BIPSPI software with a score above 0.5 have a score above 0.00 in Cons PPISP software and above 4 in GHecom software. In connection with this, PredUS, Meta-PPISP, and WESA predicted residues exploration to pick the major AAs.
These residues, located in one of three CDR regions, were predicted by the Paratome server. Selected residues in BIPSPI software had a score above 0.5, Cons PPISP software a score above 0.00, and GHecom software a score above 4. In this regard, PredUS, Meta-PPISP, and WESA predicted residues research to select the significant amino acids.
SIFT analyses. The Sorting Intolerant From Tolerant (SIFT) server (http://sift.jcvi.org/) was used to find if an AA replacement affects the protein function. The server predicts the conservation degree of AA residues in the sequence alignments that originated from closely linked sequences gathered through PSI-BLAST.
CR3022variants sketching. Seventy-one variants containing mutations in at least one of three ABRs were designed. To design new variants, we replaced the identified essential residues with the tolerable amino acids introduced by the SIFT server (http://blocks.fhcrc.org/sift/SIFT.html).
Residues confirmed by bioinformatic analyses were randomly mutated in the suggested variants.
The three-dimensional structure of all submitted variants was determined by structure-based antibody prediction (SAbPred) from the Oxford Protein Informatics Group (OPIG) (24). All 3D models of the mutated antibodies were predicted and assessed for quality by the PROSA server (https://prosa.services.came.sbg.ac.at/prosa.php).
Molecular docking analyses. The 3D structure of each variant and spike antigen provide the inputs for High Ambiguity Driven protein-protein Docking (HADDOCK) (25). HADDOCK is an information-driven flexible docking method for the simulation of biomolecular complexes. To do the docking analyses between the receptor and ligand proteins, we set the H:30I, H:31T, H:33W, H:54G, H:103S, L:56W, L:58S, L:59T, L:61E, and L:98Y as HADDOCK active residues in antibody structure.
Molecular dynamics simulations. Molecular dynamics simulations were performed using the CABS Flex server (http://biocomp.chem.uw.edu.pl/CABSflex2/) for 50 cycles, 50 trajectory frames, and 10 ns, with some additional distance restraints with a global C-alpha restraints weight of 1.0.
The predicted discontinuous epitopes (as a chart of DiscoTope score vs. residue ID) are shown in Fig. 1. The predictions with scores above the threshold (red line) are positive (displayed in green), and predictions with scores below the threshold are marked as negative predictions (displayed in orange). In the 3D view, Jmol displays the structure with positive predictions (highlighted in yellow). The side chain of each predicted residue is shown in Figure 1.
CR302 conservation for the evolution of amino acid positions. The nine-color conservation scores are projected onto the three-dimensional structure of the Ab, and FirstGlance displays colored protein structure in Jmol ( Fig. 2.). The normal score calculated for each amino acid position was calculated by the Consurf server. The color scale represented by the conservation scores (9conserved, 1 -variable) is shown in Table 1.
CR3022 Interface predictions. Potential interfacial residues identified through PredUs are presented in Table  1. Residue 33W in ABRs I of a heavy chain (H chain ABR I), residues W47, I50, R59, and Y60 in ABRs II of a heavy chain (H chain ABR II), residues I102, S103, and D107 in ABRs III of a heavy chain (H chain ABR III), Residue Q27 in ABRs I of a light chain (L chain ABR I), residues L52, Y55, W56 and S58-E61 in ABRs II of a light chain (L chain ABR II), and residues Y97-S99, P101 and Y102 in ABRs III of a light chain are predicted as possible interfacial residues.
The cons-PPISP calculates a score of neural network for every residue. This score estimates if a residue is involved in the protein-protein interaction or not. The interred and not predicted residues will presume as zero scores. A score above zero is considered an interaction residue, and a score below zero, a non-interaction residue ( Table 1).
Weighted Ensemble Solvent Accessibility predictor (WESA) identified some AAs as solvent-accessible residues (Table 1). BIPSPI predicted the partner-specific protein-protein interfaces. Residues with a predicted score higher or equal to the precision threshold (0.500) were highlighted in green. The interactive visualization of predicted residues in the Ab structure is shown in Figure  3, and the predicted interface scores for the residues of Ab are listed in Table 1.  (Table 1) [24]. Pockets contribute to the prediction of binding sites and active sites from protein structures. A residue in a deeper and larger pocket had a larger value. The pockets of small-molecule binding sites and active sites were higher than the average value; specifically, the values for the active sites were much higher.
The residues H: S103, H: D107, H: V108, L: L52, and L: E61 had a GHECOM score above 4. The H stands for the heavy chain and the L for the light chain. These residues situated in one of three CDR regions were predicted by the Paratome. At least four softwares confirmed the specially selected residues. The cons-PPISP scores above 0.00, BIPSPI scores above 0.5, and GHecom scores above 4 were considered thresholds. In this regard, PredUS, Meta-PPISP, and WESA predicted research residues to select the significant AAs (Table 2).
SIFT analyses. SIFT predicted the tolerated and detrimental alterations in each position of the submitted sequence. Positions with normalized possibilities lower than 0.05 are expected to be deleterious; those higher than or equal to 0.05 are expected to be tolerated (Table 3).   (26) as interactive residues in the crystal structure of CR3022 in complex with SARS-CoV-2 RBD. In this regard, we also mutated H:57E, H:55D, and H:102I to access a more diverse variant. Mutated sequences were aligned and illustrated (Fig. 5).
SAbPred determined the three-dimensional structure of all submitted variants. The quality estimation of the 3D models by the Prosa web server revealed that the Z-score of all predicted models was within the range of scores typically found for native proteins of similar size. The Zscore indicates overall model quality. The value is displayed in a plot that contains the z-scores of all experimentally determined protein chains in the current PDB. In this plot, groups of structures from different sources (X-ray, NMR) are distinguished by different colors. The Z-score can be used to check whether the Zscore of the input structure is within the range of scores typically found for native proteins of similar size (Fig. 6).
Protein-protein docking based upon biochemical or biophysical data. HADDOCK server evaluates ligand and receptor integration based on biochemical and/or biophysical data. Table 4 represents the information of variants with the HADDOCK scores that are more than the control. The Van der Waals and electrostatic energy values, in addition to the interred surface between the two complexes, are shown. HADDOCK scores of all variants are shown in Figure 7.

Light chain
Heavy chain Fig. 2. CR302 conservation of amino acid positions evolution by the Consurf server. The schematic structure of the colored protein displayed by FirstGlance in Jmol.Conservation scores projected onto the three-dimensional structure of the Ab with nine colors. The light chain analysis is shown on the left, and the heavy chain analysis is on the right.

Molecular dynamics simulations.
Molecular dynamics simulations provide an accurate ranking of the potential ligands and binding sites. We obtained ten different models by computationally exhaustive exploratory molecular dynamics simulation using CABS-Flex 2.0. Three-dimensional structures of 10 final models were scrutinized. Visualization of the models showed the structural heterogeneity of the final models. Two options, including the surface and cartoon representation of the trajectory (all final models in superposition), are shown in Figure 8. We selected the first model because of its best structural heterogeneity, optimum free energy, and highly stable configuration.  Figure 9 provided by Cabs-Flex 2.0.

DISCUSSION
Protein engineering is a powerful method for developing ideal therapeutic proteins (27). Contemporary bioinformatics techniques are commonly used in protein engineering (28,29). In diseases like COVID-19, where time is critical, these tools could save a lot of time and effort.
For virus entry into the cells, binding of ACE2 and RBD in spike protein is essential. Therefore, many studies have focused on inhibiting the virus attachment to the binding sites.
Recently Changhai et al. (2020) published a recombinant ACE2-Ig with potential applications for the 2019-nCoV diagnosis, prophylaxis, and treatment (30). Other researchers have produced an engineered ACE2 protein using bioinformatics techniques expected to bind RBD with higher affinity while being more thermostable and lacking enzymatic activity (31) .  Fig. 3. BIPSPI interactive visualization of predicted residues in the antibody structure. Residues whose score has an expected precision greater or equal than the precision threshold (0.500) are green. Interface residues prediction by BIPSPI whose score has an expected precision greater or equal than the precision threshold (0.500) are listed below the picture. The light chain is blue, and the heavy chain is red.
Incoming research has shown that human Ab CR3022 binds to SARS-CoV-2 RBD in a cryptic epitope (14). In addition, we found no studies that designed these antibodies against different epitopes of SARS-CoV-2 proteins, i.e., targeting the spike protein and subsequently preventing its binding to human ACE2. Since passive immunization with polyclonal antibodies has been shown to curb outbreaks of the hepatitis A virus and prevent varicella-zoster infection (32), mAb prophylaxis can be an effective way to contain the SARS outbreak. This computational research aimed to advance the binding affinity of CR3022 Ab neutralization to ACE2 (RBD) in the SARS-CoVs2 virus by focusing on AA mutagenesis. We believed that affinity maturation increases the performance of diagnostic antibodies due to enhanced specificity at reduced Ab concentrations. We used sitedirected mutagenesis approaches for this order. Sitedirected mutagenesis (SDM) methods are used to produce cloned DNAs with changed sequences to investigate the significance of different residues in the structure and function of the proteins (33,34). Our first goal was to find structurally and functionally essential AAs. We ought to identify sites in the complementaritydetermining regions (CDRs) permissive to mutagenesis while maintaining Ag binding. Next, we selected several residues by various methods: 1) predicting the antigenbinding regions of CR3022 Ab using the Paratome web server, 2) predicting the protein-protein interaction sites using ConSurf, PredUs, and Meta-PPISP 3) and finally, partner-specific protein interface prediction by BIPSPI and WESA.   BIPSPI predictions from two input structural models (Ab and Ag) were performed using PREDICT from the structural data option. By employing the results of different softwares, we selected H:30I, H:31T, H:33W, H:54G, H:103S, L:56W, L:58S, L:59T, L:61E, and L:98Y residues. Based on the prediction by the Paratome web server, these residues were placed in one of the CDR regions. At least four softwares confirmed the specially selected residues. A cons-PPISP score above 0.00, BIPSPI score above 0.5, and GHecom score above four were considered thresholds. In this regard, PredUS, Meta-PPISP, and WESA predicted residues to choose the major AAs. Finally, the SIFT server was used to predict whether an AA switch interferes with the protein act. Various studies have used PREDICT and SIFT servers to make the necessary predictions (28,29,35).
Based on these results, we substituted the AAs. Then, the docking analyses were done between the structure of engineered antigens and the receptor structure. Molecular docking plays a significant role in targeted medicine designing. It is the only theoretical method that explicitly models physical interactions between proteins. HADDOCK 2.1 performs the docking of CR3022 Ab variants and SARS-CoV 2 spike RBD, totally flexible, utilizing molecular dynamics simulations. Docking between wild-type and mutant antibodies indicated that the engineered mutations had strengthened the binding affinity between mutated antibodies to the receptor molecule. One of the criteria for the comparison of docking results is the Gibbs free energy (ΔG). The smaller ΔG has higher stability and binding affinity.
In this research, CABS-Flex 2.0 software was deployed for the flexible molecular dynamic simulation of our docked complexes. CABS-Flex could present the stable arrangement of the antigen-antibody complex. The highest RMSF value indicates more fluctuations within the complex structure during the simulation process. Fluctuations in the structure of the antigen-antibody complex show its high flexibility and a potential structure of the complex. As per our findings, the complex had many fluctuations in chain A; the highest amplitude was residue 42 of ~2.44Å.
Based on our results, the variants 45, 60, 67, 69, and 71 antibodies had smaller bonding energy, and their stabilities had also increased compared to the wild type. higher affinity for binding to SARS-CoV2 RBD compared to the wild type Ab. The buried area between the two complexes in these variants was more positive than the natural state, meaning that these mutations have modified and increased the Ab binding properties relative to wild-type antibodies. Consistent with our study, in silico analysis of the interaction between viral S protein and ACE2 receptors showed two neutralizing mouse antibodies, F26G19, and D12, against SARS-CoV2 designed using simulation and antibody-antigen docking could bind to SARS-CoV S protein with high affinities [36]. Also, investigating the affinity maturation of CR3022 convalescent antibody towards RBD of SARS-CoV-2 S protein, alongside in silico designed antibodies, showed that three candidates antibodies, SAM1, SAM2 (post-MDS), and SAM3 (pre-MDS), had higher binding affinities than their counterparts and CR3022 antibody. Besides better binding affinity, they also demonstrated greater target specificity towards SARS-CoV-2 S RBD due to blocking the human ACE-2 receptor binding site, as predicted in the study (37).  I  T  W  G  D  E  I  S  W  S  T  E  Y  Variant1  H  T  D  G  D  E  I  S  W  S  T  E  Y  Variant2  D  N  W  G  D  E  I  S  W  S  T  E  Y  Variant3  G  A  E  G  D  E  I  S  W  S  T  E  Y  Variant4  I  R  K  G  D  E  I  S  W  S  T  E  Y  Variant5  I  T  W  W  F  E  I  S  W  S  T  E  Y  Variant6  I  T  W  R  D  N  I  S  W  S  T  E  Y  Variant7  I  T   Its value is displayed in a plot that contains the Z-Score of all experimentally determined protein chains in the current PDB. In this plot, groups of structures from different sources (X-ray, NMR) are distinguished by different colors. Z-Score can be used to check whether the Z-Score of the input structure is within the range of scores typically found for native proteins of similar size. Fig. 7. HADDOCK Score of all variants. The variants with scores below were predicted to have an enhanced affinity toward the CR3022 human Ab. The 5-best mutated CR3022 human Ab variants are colored in red on the graph.
Our results suggest that the CR3022 Ab could be developed as a candidate therapeutic alone or combined with other neutralizing antibodies to prevent or treat nCoV-2019 infections. Optimizing the CR3022 properties to boost their binding characteristics and affinity maturation to avoid cross-reactivity and selecting suitable antibodies by adding new mutations to create the desired Ab is advantageous.

ACKNOWLEDGMENT
The authors wish to thank the Immunology Research Center, Tabriz-Iran, and the Shahid Sadoughi University of Medical Science, Yazd, Iran, for their support by unrestricted free access to the website for data collection.

CONFLICT OF INTEREST
The authors declare that there are no conflicts of interest associated with this manuscript.