SARS-CoV-2 spike protein NTD retains an extended loop region structurally analogous to that of MERS-CoV
Receptor binding to the host cell is the initial step in virus infection and determines tissue tropism and cell-to-cell spread. Coronaviruses use complex patterns of receptor recognition to infect diverse host cells. COVID-19, caused by SARS-CoV-2, has increased the global disease burden, with high co-morbidity and mortality. Because the range of tissues targeted by the virus and its potential receptors remain poorly understood, there is an immediate need to understand the entry mechanism and pathogenesis of SARS-CoV-2 in order to develop effective therapeutics. It was recently shown that the trimeric transmembrane spike glycoproteins of SARS-CoV-2 and SARS-CoV, which share a high degree of similarity within the RBD, bind the same human ACE2 receptor, yet SARS-CoV-2 is more infectious than SARS-CoV. To investigate the differences between these highly similar spike proteins, we analyzed the phyletic relatedness of the spike proteins from coronaviruses known to infect humans. As expected, the spike proteins of SARS-CoV and SARS-CoV-2 are highly similar and group together in the same clade (Fig. 1a). The spike protein closest to the SARS clade is that of MERS-CoV, which suggests that MERS-CoV shares higher similarity with the SARS clade than the other coronaviruses do. The spike proteins of HCoV-229E and HCoV-NL63 form a separate clade (Fig. 1a). Even though the spike proteins of HCoV-OC43 and MERS-CoV are distantly related, both bind host sialic acids as an alternative host receptor during infection. To date, nothing is known about the interaction of the SARS clade with host sialic acid receptors. Despite the 76% homology between their spike proteins, SARS-CoV-2 is more infectious than SARS-CoV, which suggests a possible structural or mechanistic difference. One stark difference between the two spike proteins is the presence of a furin-like cleavage site in the SARS-CoV-2 spike protein.
SARS-CoV-2 has 12 extra nucleotides upstream of the single Arg↓ cleavage site, forming the sequence PRRAR↓SV, which is a canonical furin-like cleavage site. The presence of this furin-like cleavage site has been proposed as a possible reason for the more efficient spread of SARS-CoV-2 compared with other betacoronaviruses. In addition, by comparative sequence analysis of the NTD of the spike protein, we identified three extended regions that are present in SARS-CoV-2 and MERS-CoV but not in SARS-CoV (Fig. 1b). To determine whether these three regions form part of a domain or a functional module of the NTD, we modeled the full-length SARS-CoV-2 spike glycoprotein, biasing the model strongly toward the cryo-EM structure of the SARS-CoV-2 spike protein (Fig. 1c). The cryo-EM structures of the SARS-CoV-2 spike protein display a well-ordered, β-strand-rich NTD, the RBD and the core helical domain. Owing to their flexibility, all β-β loops except β14-β15 display almost no cryo-EM density, even after B-factor sharpening. The missing β-β loops were modeled ab initio, and the model with the best DOPE score was further energy-minimized and used for computational analyses, as described in the Methods section. Comparing this model with SARS-CoV revealed a major difference in loop lengths: SARS-CoV-2 has longer β4-β5, β9-β10 and β14-β15 loops than SARS-CoV (Fig. 1b and c), whereas these loop lengths are comparable to those of MERS-CoV. The β14-β15 loop is particularly interesting owing to its length and to the flexibility conferred by interspersed glycine residues and a flanking poly-alanine region (Fig. 1b). These features make the β14-β15 loop reminiscent of the β6-β7 loop (Thr129-Thr136) of MERS-CoV and suggest a similar putative function. The MERS-CoV β6-β7 loop is a similarly long, arm-like loop that forms critical electrostatic anchor points for the engagement and stabilization of host sialoside receptors.
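The furin-like site described at the start of this paragraph can be illustrated with a simple motif scan. The sketch below is not the analysis performed in this study: it assumes the minimal furin-like pattern R-X-X-R (X being any residue), and the two short test fragments are hypothetical strings built around the PRRAR↓SV motif quoted above, not full spike sequences. The function name `find_furin_like_sites` is likewise introduced here only for illustration.

```python
import re

# Assumption: a minimal furin-like motif R-X-X-R (X = any residue).
# Real predictors score longer, position-weighted windows.
# The lookahead lets overlapping matches be reported as well.
FURIN_LIKE = re.compile(r"(?=(R..R))")


def find_furin_like_sites(seq):
    """Return (start, motif, cleavage_index) for each R-X-X-R match.

    Indices are 0-based. cleavage_index is the position of the residue
    immediately after the last Arg of the motif, i.e. just past the
    scissile bond.
    """
    seq = seq.upper()
    return [(m.start(), m.group(1), m.start() + 4)
            for m in FURIN_LIKE.finditer(seq)]


# Hypothetical fragments for illustration only.
with_insert = "NSPRRARSVAS"   # contains the PRRAR|SV motif
without_insert = "NSRSVAS"    # same fragment with the four residues PRRA removed

print(find_furin_like_sites(with_insert))
print(find_furin_like_sites(without_insert))
```

With the four-residue insertion the R-X-X-R pattern is present and the reported cleavage index points at the Ser that follows the final Arg. Without the insertion only a single Arg remains and no match is reported.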
SARS-CoV-2 NTD motifs share a loop region and are predicted to bind sialosides
To test the capacity of the SARS-CoV-2 spike protein to engage host-cell sialic acid receptors, we docked Neu5Ac, 2,3-SLN, 2,6-SLN, Neu5Gc and sLeX (Fig. 2) onto the SARS-CoV-2 NTD. The selected sialosides represent a large family of more than 500 human sialosides and have previously been shown to bind the S1A domain of MERS-CoV. A recent study by Milanetti et al. also predicted a sialoside-binding pocket in the NTD of SARS-CoV-2 by surface iso-electron density mapping, which further highlights the importance of the NTD interaction with sialic acid. Residues Leu18-Gln23, His66-Thr78 of the β4-β5 loop and Gly252-Ser254 of the β14-β15 loop form the sialic acid-binding site in the SARS-CoV-2 spike protein (Figs. 2 and 3). While the β4-β5 loop is involved in engaging all sialosides, the β14-β15 loop is specific to larger sialosides such as sLeX (Figs. 2 and 3), which suggests a possible preferential interaction with α2,3-linked sialosides, as in MERS-CoV [25, 26]. The predicted interacting sites of the tested sialosides are mapped in Fig. 3. The presence of key electrostatic and hydrophobic interactions with each of these sialosides suggests the possibility of a physiological interaction with the NTD of SARS-CoV-2. Molecular dynamics simulations of the SARS-CoV-2 NTD-sialoside complexes highlight the flexibility of the β14-β15 loop and its induced ability to accommodate larger sialosides (Fig. 2f). Superimposition of all the SARS-CoV-2 NTD-sialoside complexes shows an outward movement of the β14-β15 loop, which allows the sialic acid-binding site to accommodate larger sialosides such as sLeX (Fig. 2f). In contrast, the spike protein of SARS-CoV features a shorter, 9-amino-acid β14-β15 loop (Fig. 1b), which offers fewer degrees of freedom and a reduced capacity to engage host sialosides. In addition, a single-turn α-helix (Thr20-Leu24) formed key interactions with all sialosides tested.
Interestingly, the NTD of MERS-CoV also displays a single-turn helix (Gln37-Phe40), which is important for sialoside binding. However, both the SARS-CoV and HCoV-OC43 spike proteins lack this element [7, 10, 13, 18, 26]. Taken together, these findings suggest that the SARS-CoV-2 spike protein might have independently evolved to recognize sialosides through its NTD. This acquired ability to accommodate and engage diverse sialosides points to a potential role for sialosides as alternative, reciprocal host-cell receptors that support SARS-CoV-2 pathogenesis with broad tissue tropism. The differential distribution of sialic acids in the respiratory tract and other organs, together with limited ACE2 expression in human airway epithelia, may explain the differential infectivity, transmission and tropism of SARS-CoV-2. In line with this, a recent preprint reports that the spike glycoprotein also recognizes different Siglecs (sialic acid-binding Ig-like lectins) and C-type lectins, which indicates that the spike protein can engage immune cells through ACE2-independent infection pathways. In addition, the association between the human ABO blood group and COVID-19 susceptibility [29, 30] may relate to modulation of the sialoside distribution pattern on target membranes, possibly regulating SARS-CoV-2 transmission and tropism despite the high affinity of the spike protein for ACE2. The comprehensive in silico structural analysis reported in this study provides a basis for further research into the functional role of the reciprocal interaction of SARS-CoV-2 with host sialic acids during virus entry and spread.
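Interacting residues such as those discussed in this section are commonly read off a docked pose with a simple distance criterion. The following is a minimal sketch of that idea and not the procedure used in this study: the 4.0 Å cutoff, the function name `contact_residues` and the coordinates are all assumptions made for illustration. The residue labels are borrowed from the text, but the coordinates are invented toy values and do not come from any structure.

```python
import math


def contact_residues(protein_atoms, ligand_atoms, cutoff=4.0):
    """Return residue labels with any atom within `cutoff` Å of the ligand.

    protein_atoms: iterable of (residue_label, (x, y, z))
    ligand_atoms:  iterable of (x, y, z)
    Labels are returned in alphabetical order.
    """
    ligand_atoms = list(ligand_atoms)
    hits = set()
    for label, coord in protein_atoms:
        # A residue counts as a contact if any of its atoms is close enough
        # to any ligand atom.
        if any(math.dist(coord, lig) <= cutoff for lig in ligand_atoms):
            hits.add(label)
    return sorted(hits)


# Toy coordinates (invented for illustration, one atom per residue).
protein = [
    ("Leu18", (0.0, 0.0, 0.0)),
    ("His66", (3.0, 0.0, 0.0)),
    ("Gly252", (10.0, 0.0, 0.0)),
    ("Ser254", (6.5, 0.0, 0.0)),
]
ligand = [(1.0, 0.0, 0.0), (4.0, 0.0, 0.0)]

print(contact_residues(protein, ligand))
```

In this toy case the atom labeled Gly252 lies 6 Å from the nearest ligand atom and is excluded, while the other three fall within the cutoff. Applying the same function to each docked sialoside would give per-ligand contact lists that could be compared across ligands.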