Analysis of COVID-19 Spike protein mutations and their effects on its affinity towards human cell receptors

The novel Coronavirus SARS-CoV-2 (2019-nCoV) is a member of the family Coronaviridae and contains a single-stranded, positive-polarity RNA genome. To reveal the evolution mechanism of the SARS-CoV-2 genome, in particular of its spike protein, the main driving force for host recognition, we conducted a comparative analysis with coronaviruses of different strains, including MERS-CoV, SARS-CoV-1 and Pangolin Coronavirus. In addition, a comparative analysis of newly sequenced SARS-CoV-2 strains from different regions of the world was carried out in order to understand the evolution of this novel virus throughout its transmission. Among all sequenced strains, the latest France HCoV was the least identical to the reference. Further investigations were therefore performed, and it was concluded that this strain has undergone mutations that increased its binding affinity to the Angiotensin-Converting Enzyme 2 (ACE2) receptor, thus hypothetically increasing its infectivity.

This new virus is the pathogen responsible for the infectious respiratory disease called Covid-19 (Coronavirus Disease). Globally, by July 1, 2020 (10.00 pm CET), according to the WHO, 10,357,662 cases of Covid-19 have been confirmed and 508,055 patients have died. In addition to confirmed cases, there are also suspected cases of Covid-19, the definition of which is evolving as time passes and the epidemic propagates.
The spike protein is a crucial recognition factor for virus attachment and entry into the host cell. Also, its amino-acid sequence is shorter than the whole genome, which saves time and storage when processing it in Python. Therefore, our study is based only on spike proteins 3.
In this study, we focused on determining the strain responsible for the new COVID-19 pandemic and its origin, through a comparative analysis with different strains of animal and human coronaviruses. We also put forward a new hypothesis concerning the 'Best-Fit' activity of this virus, by comparing the binding affinities of the France H-CoV and the reference H-CoV from China (Wuhan), each complexed with Angiotensin-Converting Enzyme 2 (ACE2).

Materials And Methods
All spike proteins were downloaded from NCBI. MSA of all sequences was performed using ClustalW's Python application and visualized with ESPript 3. The sequence identity between the COVID-19 spike protein and each of the other coronavirus spike proteins was calculated by pairwise sequence identity. The France strain's spike protein was modelled using SWISS-MODEL 6,7,8,9,10.
The sequences presented in Table 1 correspond to spike proteins of animal coronaviruses 2 infecting humans (camel, bat and pangolin). Using ClustalW's application in Python, a Multiple Sequence Alignment was performed, and ESPript 3 was used for the visualization (Figure 1).
Table 2: Pairwise alignment results. Saudi HCoV (July 2020): 99.92%
MSA, the phylogenetic tree (Figure 2) and the percentages of sequence identities (…
2. Comparative analysis of COVID-19 strains emerging from different regions in the world:
The MSA (Table 3) and the phylogenetic tree (Figure 3) resulting from the comparative analysis of COVID-19 strains emerging from different regions of the world have shown that the France strain has the greatest root length and is only 99.76% identical to the reference S protein; thus the France strain underwent mutations that made it unique (Table 2). Therefore, we studied its affinity to ACE2 and compared it to the reference's affinity through analysis methods such as docking.
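The pairwise sequence identity mentioned in the methods can be illustrated with a minimal sketch. It assumes two sequences that have already been aligned (equal length, gaps written as `-`) and counts identical columns over all columns that are not gaps in both sequences. The source does not state which denominator was used, so this convention, the function name `percent_identity`, and the toy sequences are assumptions made for illustration only.

```python
def percent_identity(aligned_a: str, aligned_b: str) -> float:
    """Percent identity between two pre-aligned sequences.

    Assumption: columns where both sequences have a gap are ignored;
    every other column counts towards the denominator.
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")

    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == "-" and b == "-":
            continue  # column carries no information for this pair
        compared += 1
        if a == b:
            matches += 1

    if compared == 0:
        raise ValueError("no comparable columns in the alignment")
    return 100.0 * matches / compared


# Toy aligned fragments (hypothetical, not real spike sequences).
reference = "ACDEFGHIKL"
variant = "ACDEFGHIKV"
print(f"{percent_identity(reference, variant):.2f}% identical")
```

With real data, the two strings would be rows taken from the multiple sequence alignment, so that residues are compared column by column.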

Affinity Analysis:
a) Spike protein homology modelling: Determining the affinity required several steps. First, the spike glycoprotein was modelled using SWISS-MODEL 6,7,8,9,10, which suggested different models. After comparing the QMEAN, the GMQE (total coverage, Z score) and the sequence identity, the model presenting the highest score was selected.
Table 4: Binding residues for docking
A structure analysis between the spike protein's E chain and ACE2's A chain was performed on a pre-modelled S protein and ACE2 complex 11, using Python.
The goal of this analysis is to determine the binding residues between these two entities. The complex and the residues were visualized with NGLviewer in Python, as shown in Figure 5: red illustrates ACE2's A chain, blue the spike protein's E chain, and yellow the binding residues (Table 4).
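One common way to determine binding residues between two chains is a distance cutoff: a residue is considered part of the interface if any of its atoms lies within a given distance of an atom in the other chain. The sketch below illustrates that idea in plain Python. The source does not say which criterion or cutoff was used, so the 5.0 Å cutoff, the `Atom` record, the function name `interface_residues`, and the toy coordinates are all assumptions, not the authors' method.

```python
from math import dist
from typing import NamedTuple


class Atom(NamedTuple):
    chain: str      # chain identifier, e.g. "A" (ACE2) or "E" (spike)
    res_name: str   # three-letter residue name
    res_id: int     # residue number within the chain
    xyz: tuple      # (x, y, z) coordinates in angstroms


def interface_residues(atoms, chain_a, chain_b, cutoff=5.0):
    """Return residues of each chain with an atom within `cutoff` of the other chain.

    Assumption: a simple all-atom distance criterion; the cutoff is illustrative.
    """
    side_a = [atom for atom in atoms if atom.chain == chain_a]
    side_b = [atom for atom in atoms if atom.chain == chain_b]

    contacts_a, contacts_b = set(), set()
    for a in side_a:
        for b in side_b:
            if dist(a.xyz, b.xyz) <= cutoff:
                contacts_a.add((a.res_name, a.res_id))
                contacts_b.add((b.res_name, b.res_id))

    by_number = lambda residue: residue[1]
    return sorted(contacts_a, key=by_number), sorted(contacts_b, key=by_number)


# Toy atoms with made-up residues and coordinates (not taken from a real structure).
toy_atoms = [
    Atom("A", "GLY", 1, (0.0, 0.0, 0.0)),
    Atom("A", "ALA", 2, (20.0, 0.0, 0.0)),
    Atom("E", "SER", 10, (3.0, 0.0, 0.0)),
    Atom("E", "LYS", 11, (40.0, 0.0, 0.0)),
]
ace2_side, spike_side = interface_residues(toy_atoms, "A", "E")
print("ACE2 chain A interface:", ace2_side)
print("Spike chain E interface:", spike_side)
```

In practice the atom records would be read from the PDB file of the complex, and the double loop would be replaced by a neighbor search for large structures.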
c) Docking and determination of binding affinities: Having determined the residues within our binding site, the next step was to perform a docking. In docking, only the conformation of the ligand, in our case ACE2's A chain, is fully explored. This docking produced multiple clusters for each complex (the reference's S protein docked with ACE2 and the France strain's S protein docked with ACE2) 12.
In fact, the smaller the dissociation constant (Kd) and the more negative the binding free energy (ΔG), the more tightly bound the ligand is, that is, the higher the affinity between ligand and protein.
Table 5: Docking results
The docking of the spike proteins of the France and China strains with the ACE2 receptor, and the determination of the binding affinity of each complex (Table 5), have shown that the France strain S protein has a higher affinity to the ACE2 receptor (ΔG = -9.3 kcal·mol⁻¹, Kd = 2.6 × 10⁻⁷ M) than the reference strain S protein has with the same receptor (ΔG = -8.6 kcal·mol⁻¹, Kd = 8 × 10⁻⁷ M).
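The link between the two quantities reported above follows from the standard thermodynamic relation ΔG = RT ln(Kd), with Kd expressed relative to a 1 M standard state. The sketch below converts between them. The source does not state the temperature used, so the 25 °C default and the function names are assumptions; values computed this way will differ somewhat from Table 5 depending on temperature and on rounding of ΔG.

```python
import math

GAS_CONSTANT_KCAL = 1.987204e-3  # kcal / (mol * K)


def kd_from_delta_g(delta_g_kcal: float, temperature_k: float = 298.15) -> float:
    """Dissociation constant (M) from binding free energy (kcal/mol).

    Assumption: 1 M standard state; the temperature default is illustrative.
    """
    return math.exp(delta_g_kcal / (GAS_CONSTANT_KCAL * temperature_k))


def delta_g_from_kd(kd_molar: float, temperature_k: float = 298.15) -> float:
    """Binding free energy (kcal/mol) from dissociation constant (M)."""
    return GAS_CONSTANT_KCAL * temperature_k * math.log(kd_molar)


# Free energies reported in the text for the two complexes.
for label, delta_g in [("France strain", -9.3), ("Reference strain", -8.6)]:
    kd = kd_from_delta_g(delta_g)
    print(f"{label}: dG = {delta_g} kcal/mol -> Kd = {kd:.2e} M")
```

Because Kd depends exponentially on ΔG, a difference of 0.7 kcal/mol between the two complexes corresponds to roughly a threefold difference in Kd at room temperature.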
These results support the hypothesis concerning the 'Best-Fit' activity of COVID-19. In fact, the France strain's S protein presented several mutations, making it less identical to the common COVID-19 S protein. According to the theoretical results presented above, these mutations seem to affect its affinity towards the ACE2 receptor.

Conclusion
Many host cell receptors are targets for this virus, including ACE2. The study of binding affinities with the ACE2 receptor is essential to better understand its viral mechanism. The France strain S protein presented several mutations that influenced its affinity to ACE2 receptors. Future work involving the modeling of spike proteins and experimental validation is needed to study the 'Best-Fit' activity of COVID-19.

Declaration
Competing interests