Altered N-/O-glycosylation sites on the receptor-binding domain (RBD) in COVID-19 are related to coronavirus infection and pathogenesis.

Objective: To discuss the possible significance of N-linked and O-linked glycosylation on the RBD of COVID-19. Methods: Multiple alignments of RBD amino acid sequences were performed with Clustal Omega (v.1.2.4) using default parameters. Potential N-linked glycosylation sites were predicted with the NetNGlyc 1.0 server, and potential O-linked glycosylation sites with the NetOGlyc 4.0 server. Results: The COVID-19 spike glycoprotein has 22 potential N-linked glycosylation sites and 3 potential O-linked glycosylation sites. Two of the 22 N-linked glycosylation sites are located in the RBD. None of the 3 O-linked glycosylation sites is located in the RBD, which is markedly different from SARS-CoV and other bat coronaviruses that use ACE2 as a receptor. Compared with closely related coronaviruses that cannot use ACE2 as a receptor, COVID-19 has fewer N- and O-linked glycosylation sites. Conclusion: We show clear differences in RBD glycosylation sites between COVID-19 and other coronaviruses. We speculate that altered N-/O-glycosylation sites on the RBD in COVID-19 are related to its infection and pathogenesis.


Introduction:
A novel coronavirus disease (COVID-19) emerged in December 2019 and subsequently developed into a global epidemic, seriously threatening human health [1,2]. This is the third coronavirus outbreak in about 20 years. Only seven years had passed since the previous outbreak, that of the Middle East respiratory syndrome coronavirus (MERS-CoV) in the Arabian Peninsula in 2012. The first was the severe acute respiratory syndrome coronavirus (SARS-CoV), which broke out in China in 2002. Nevertheless, there are so far no effective treatments or vaccines. The study of coronaviruses is therefore urgently needed to prevent another direct threat to human health.
The coronavirus spike (S) glycoprotein, a trimeric transmembrane glycoprotein distributed around the viral envelope, mediates viral entry into host cells and affects viral stability and infectivity. All four genera of coronavirus (Alphacoronavirus, Betacoronavirus, Gammacoronavirus, and Deltacoronavirus) contain envelope-anchored spike glycoproteins. The S glycoprotein is also the primary target of neutralizing antibodies and of vaccine design [3][4][5]. It consists of three segments: a large ectodomain, a single-pass transmembrane anchor, and a short intracellular tail. The ectodomain consists of two basic functional subunits, S1 and S2. The S1 subunit drives receptor association, whereas the S2 subunit mediates membrane fusion [6,7]. The receptor-binding domain (RBD), located in the carboxy (C)-terminal domain (CTD) of S1, displays a concave surface during interaction with the receptor in SARS-CoV and MERS-CoV [8]. The entire receptor-binding loop, known as the receptor-binding motif (RBM), is located on the RBD and is responsible for complete contact with ACE2 [9].
Protein glycosylation is one of the most abundant post-translational modifications and affects almost every aspect of a protein, from synthesis to physiological function [10]. According to the amino acid atom to which the sugar moiety is attached, glycosylation is classified into different types, including N-, O-, and C-linked glycosylation [11]. Glycosylation also occurs extensively in many viruses, including HIV-1, Lassa virus, hepatitis C virus, and Epstein-Barr virus [12][13][14][15]. It has long been known that glycosylation is critical for synthesis of the coronavirus spike glycoprotein [16]. Moreover, inhibiting glycan processing can prevent SARS infection [17]. In infectious bronchitis virus (IBV), glycosylation is necessary for the RBD to interact with the appropriate receptors [18], and N-linked glycans at different positions may differentially affect S protein-mediated fusion, including ligand binding, viral infectivity, and replication [19]. Sialic acid, a 9-carbon monosaccharide, on human HKU1 glycoproteins has significant effects on receptor determinants [20]. Interestingly, carbohydrate-carbohydrate interactions between virus and host have also been observed [16].
More importantly, significant effects of glycosylation on S glycoproteins have also been confirmed in MERS-CoV, SARS-CoV, and other coronaviruses that infect humans, including human coronavirus NL63 (HCoV-NL63), which generally causes mild respiratory disease in humans; human coronavirus 229E (HCoV-229E), which belongs to Alphacoronavirus; and HCoV-OC43 and HKU1, which belong to Betacoronavirus [5]. First, using cryo-EM reconstructions, Walls, A.C. et al. found that N-linked glycans decorate the surface of the S trimer in both MERS-CoV and SARS-CoV [4]. Yanchen Zhou et al. demonstrated that glycans on the SARS-CoV S protein are critical for the interaction between SARS-CoV and DC-SIGN, an alternative receptor for SARS-CoV [21]. The HCoV-NL63 S trimer is covered by an extensive glycan shield consisting of 102 N-linked oligosaccharides that obstruct the protein surface. The authors of that study proposed that coronavirus S glycans mask the protein surface to limit access by neutralizing antibodies and thwart the humoral immune response [22]. Swatantra Kumar et al. reported that some new glycosylation sites were introduced in the spike glycoprotein of the new coronavirus, which they related to receptor binding, transmission, and pathogenesis [23]. The amino acid substitutions that distinguish the new coronavirus from SARS-CoV have been summarized in [24]; some of those sites are involved in glycosylation.
In sum, glycosylation plays critical roles in coronaviruses, yet the significance of glycosylation on the S protein remains unclear. In this article, we analyze the number and location of N-linked and O-linked glycosylation sites on the S glycoprotein, especially on the RBD, which interacts directly with the corresponding receptors and affects virus entry and pathogenesis, and we further discuss their possible influence on coronaviruses.

Sequence analysis of COVID-19
1, Position of potential N- and O-linked glycosylation sites in COVID-19 Spike glycoprotein
The COVID-19 spike glycoprotein has 1273 amino acids, with 22 potential N-linked glycosylation sites (Table 1) and 3 potential O-linked glycosylation sites (Table 2). According to the fully annotated S glycoprotein entry in Swiss-Prot, almost all N-linked glycosylation sites are located on the extracellular domain of the S glycoprotein. Two of the 22 sites are located in the RBD, fewer than in SARS-CoV and MERS-CoV (Table 3). All three O-linked glycosylation sites are clustered close together (Table 2) and may lie around the polybasic cleavage site of the new coronavirus [25].

2, Variation in glycosylation sites in the RBD among COVID-19, SARS-CoV, and previously reported bat SARS-like coronaviruses that use human ACE2 as a receptor
To discuss the effect of glycosylation on receptor binding, we analyzed differences among the RBD amino acid sequences of coronaviruses that all use human ACE2 as a receptor. Based on a previous investigation from China, we take human SARS-CoV GZ02, BJ01, and Tor2 to represent the early, middle, and late phases of the 2002/2003 epidemic, respectively [26]. COVID-19 and SARS-CoV may both have originated from bats [27], so we collected several closely related bat SARS-like coronaviruses. Bat SARS-like coronavirus WIV16 (SL-CoV-WIV16) represents the closest relative of the epidemic SARS-CoV strains; SHC014 and WIV1 are also closely related. All of them are able to bind human ACE2 [28][29][30].
The most obvious feature of COVID-19 is that the N-linked glycosylation site at N370 and the O-linked glycosylation site at S349 are absent (Figure 1). As a result, COVID-19 has fewer glycosylation sites in the RBD than SARS-CoV and the bat SARS-like coronaviruses; in particular, there is no O-linked glycosylation site on the COVID-19 RBD. The N-linked glycosylation sites and motif (Asn-X-Ser/Thr, where X is any amino acid except proline) are conserved among the human and bat viruses (Figure 1). The binding affinity of COVID-19 for ACE2 is markedly higher (10- to 20-fold) than that of SARS-CoV [31], a finding that another recent study also documented experimentally [32]. This may contribute to its higher infection rate and wider spread. In addition, COVID-19 can use ACE2 from humans, Chinese horseshoe bats, civets, and pigs, but not from mice [27]. This may indicate that the loss of an N-linked glycan at N370 and an O-linked glycan at S349 is beneficial to transmission of the new coronavirus. At the same time, this is necessary for efficient receptor binding. A recent result from another team also supports this [33]. More details are shown in Table 3.
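The Asn-X-Ser/Thr motif described above can be located in a sequence with a simple pattern scan. The following is a minimal sketch, not the NetNGlyc method used in this study: NetNGlyc scores each sequon with neural networks, whereas this code only reports where the motif occurs. The function name `find_sequons`, the `offset` parameter, and the toy peptide are illustrative assumptions and do not come from the spike sequence.

```python
import re

# N-linked sequon: Asn, then any residue except Pro, then Ser or Thr.
# The lookahead keeps the match to the Asn itself, so overlapping
# sequons (for example "NNTS") are all reported.
SEQUON = re.compile(r"N(?=[^P][ST])")


def find_sequons(seq, offset=1):
    """Return positions of Asn residues that begin an N-X-S/T sequon.

    `offset` is the residue number of the first character in `seq`
    (1 for a full-length protein, larger for an excised domain).
    """
    seq = seq.upper()
    return [m.start() + offset for m in SEQUON.finditer(seq)]


if __name__ == "__main__":
    # Hypothetical toy peptide, used only to exercise the function.
    toy = "MNGTANPSQNNTSK"
    print(find_sequons(toy))  # [2, 10, 11]
```

In the toy peptide, the Asn at position 6 is skipped because it is followed by proline. A motif scan of this kind lists candidate sites only; whether a site is actually glycosylated has to be predicted or measured separately.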
To discuss the effect of glycosylation on cross-species transmission, we analyzed RBD amino acid sequences that are closely related but belong to viruses that use different receptors. The bat SL-CoVs ZC45 and ZXC21, which were found in 2005 in Zhoushan, China, in Chinese horseshoe bats (Rhinolophus sinicus) [34], are considered closely related to COVID-19 at the whole-genome level, but they cannot interact with human ACE2 [35]. The most remarkable difference in glycosylation between them is that the new coronavirus lacks an N-linked glycan at N370 and an O-linked glycan at S349 that are present in ZC45 and ZXC21 (Figure 2). This may indicate that glycosylation is involved in cross-species transmission.
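A comparison of this kind, in which a glycosylation site is present in one aligned sequence and missing in another, can be sketched as follows for N-linked sequons. This is an illustration under stated assumptions, not the procedure used to produce Figures 1 and 2: it expects two rows of an existing pairwise or multiple alignment with `-` as the gap character, it considers only the Asn-X-Ser/Thr motif (O-linked sites have no comparably simple motif), and the helper names and toy sequences are hypothetical.

```python
import re

# Asn, then any residue except Pro, then Ser or Thr.
SEQUON = re.compile(r"N(?=[^P][ST])")


def sequon_columns(aligned):
    """Map alignment column (0-based) to ungapped residue number (1-based)
    for every Asn that begins a sequon in one aligned row."""
    aligned = aligned.upper()
    ungapped = aligned.replace("-", "")
    # Alignment column of each residue of the ungapped sequence.
    cols = [i for i, c in enumerate(aligned) if c != "-"]
    return {cols[m.start()]: m.start() + 1 for m in SEQUON.finditer(ungapped)}


def lost_sites(reference, query):
    """Alignment columns where `reference` has a sequon and `query` does not."""
    if len(reference) != len(query):
        raise ValueError("aligned rows must have equal length")
    ref = sequon_columns(reference)
    qry = sequon_columns(query)
    return sorted(col for col in ref if col not in qry)


if __name__ == "__main__":
    # Hypothetical toy alignment rows, not real RBD sequences.
    ref_row = "ANST-KNATG"
    qry_row = "ANSTQKNAAG"
    print(lost_sites(ref_row, qry_row))  # [6]
```

Searching the ungapped sequence and mapping matches back to alignment columns keeps a gap inside a sequon from hiding the site, and comparing by column rather than by residue number avoids false differences caused by insertions or deletions upstream.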

Discussion
COVID-19 has rapidly become a global public health crisis and is spreading devastatingly around the world [36]. The novel coronavirus has a greater capability for human-to-human transmission than SARS-CoV and MERS-CoV [37], which may be a potential cause of the large global pandemic. So far, because of the lack of effective medicines, the main measures adopted have aimed at preventing human-to-human transmission [38].
The coronavirus S glycoprotein is highly glycosylated by the host cell, generally carrying 23-38 N-linked glycosylation sites on each spike trimer. The number of O-linked glycosylation sites on the spike trimer is not fully clear. In this study, three potential O-linked glycosylation sites were predicted on the S glycoprotein of COVID-19. Although the spike glycoprotein of the novel coronavirus is longer than those of the bat SARS-like coronaviruses, SARS-CoV, and MERS-CoV [39], it has fewer glycosylation sites. The current analysis shows that the N-linked glycosylation sites and motif on the novel coronavirus RBD are conserved and located in the N-terminus, which is consistent with a previous study [40]. However, it is striking that no O-linked glycosylation sites are predicted on the novel coronavirus RBD, which may reflect an important mutation.
It seems that a higher density of N-linked glycans is beneficial for viral immune escape. N-linked glycans on the spike glycoprotein help to shield certain epitopes [41]. The RBDs of SARS-CoV and MERS-CoV are further divided into a core subdomain and a receptor-binding subdomain [42], and the N-linked glycans are mainly located on the core subdomain [43]. In NL63, an alphacoronavirus, the S2 subunit has a higher N-linked glycan density than S1 [22]. Different shielding patterns on the S glycoprotein may relate to different host immune responses [44], and all of them may help the virus escape the host immune system. Notably, compared with HIV and influenza viruses, which evade immunity more strongly, coronaviruses have a less dense glycan shield [41]. Watanabe et al. also suggested that there is a strong correlation between immune escape and glycan density [45].
The loss of the N-linked glycosylation site (N370) on the novel coronavirus RBD may help to induce neutralizing antibodies. The spike glycoprotein is the main target of coronavirus vaccines, and the RBD is also a strong vaccine candidate because the major neutralizing epitopes of the S glycoprotein are located on it [9]. Monoclonal antibodies against the RBD can effectively block virus entry [46]. Exploring the structure and distribution of glycans on the surface of the S glycoprotein aids immunogen design and the development of treatments [4]. Yuan, M. et al. proposed that the N-glycosylation site at residue N370 accounts for the difference in antigenicity between COVID-19 and SARS-CoV [33]. Indeed, glycosylation may affect the antigenicity of the RBD, which is relevant to vaccine production [47]. This article also observed differences in glycosylation among the RBDs of different coronaviruses.
N-linked glycosylation may be involved in binding of the S glycoprotein to host receptors. In the current analysis, a higher density of glycosylation appears to be detrimental to binding affinity. In bat coronaviruses, the glycan sites are generally at the apex of the spike trimer, oriented towards target cells [8]. It is therefore understandable that glycans on the S protein or on the receptor sterically hinder binding between the RBD and the corresponding receptor [47,48]. Similarly, glycans on human ACE2 also work against binding: the hot spots on human ACE2 generally lie in regions without glycosylation, where they are easily accessible to viruses [49]. Proteins, in general, have advantages over sugars as viral receptors because they provide higher affinity and specificity for viral attachment [56]. This further supports our view.
How coronaviruses cross the species barrier is still an intriguing puzzle. Nevertheless, understanding this mechanism is critical for combating coronavirus outbreaks and developing new drugs. Tissue-specific glycosylation has a significant influence on viral transmission [50]. Two mutations, S746R and N762A, in the HKU4 spike glycoprotein enable HKU4 to enter human cells, probably because the mutations disrupt the N-linked glycosylation sites in the human protease motif of HKU4 [51]. The glycosylation sites N227 and N699 on the S protein, outside the RBD, are beneficial for coronavirus transmission whether the virus comes from humans or animals [52]. In this article, we speculate that removing glycosylation sites on the RBD benefits coronavirus cross-species transmission.
Studies of ACE2, which is also the receptor of COVID-19, further support this hypothesis. When residues 90-93 of civet ACE2 are introduced into the human receptor, spike-mediated binding to human ACE2 is remarkably increased, probably because a glycosylation site at position 90 of human ACE2 is removed [7]. Another study showed that rats are resistant to SARS-CoV because a glycosylation site introduced at residue 82 of rat ACE2 interferes with binding between SARS-CoV and rat ACE2 [27]. Like ACE2, dipeptidyl peptidase 4 (DPP4), which MERS-CoV uses as an entry receptor, is affected by glycosylation. Two N-linked glycans, N410 and N487, are located on the core and the RBM, respectively [28]. Disruption of the glycosylation sites in human DPP4 significantly alters the affinity with which human DPP4 acts as a receptor for MERS-CoV [29]. The authors therefore proposed that glycosylation is an important barrier to virus transmission. Aminopeptidase N (APN) serves as a receptor for HCoV-229E. However, HCoV-229E cannot interact with human APN when an N-glycosylation site is introduced by substitution at amino acids 291 to 293, which means that glycosylation at amino acid 291 of human APN blocks infection by HCoV-229E [30].
The N-linked glycans on the SARS-CoV S glycoprotein are mainly high-mannose (30%), hybrid (28%), and bi-, tri-, and tetra-antennary complex glycans (42%) [17]. By comparison, 8 of the 22 sites on the novel coronavirus S protein carry oligomannose-type glycans, and the remaining sites carry complex-type glycans [53]. On the HKU1 S glycoprotein, the N-linked glycans are mainly high-mannose glycans [16]. The different glycan patterns may also play an important role in the use of ACE2 proteins from different species by SARS-CoVs [54]. In HIV, high-mannose-type N-glycans are thought to be more beneficial for antigen presentation and virus degradation [55]. Complex-type glycans are more relevant to immunogen engineering in coronaviruses [53]. The patterns of N-/O-linked glycans in other coronaviruses remain unclear.
However, there has been very little research on O-linked glycosylation in coronaviruses, and many details remain unclear. According to previous studies, O-glycans on enveloped viruses are also important for receptor binding and entry [11]. The functions of O-linked glycans in virus biology seem to be similar to those of N-glycans. We were surprised to find no predicted O-linked glycosylation site on the COVID-19 RBD. All three O-linked glycosylation sites (residues S673, T678, S686) on the S glycoprotein of the novel coronavirus lie near the S1/S2 boundary and may be associated with the polybasic cleavage site, which is critical to viral infectivity and host range [25]. Our analysis suggests that the removal of O-linked glycans from the RBD is beneficial to coronavirus transmission.

Conclusion
In conclusion, the number and location of glycosylation sites on the RBD may influence the immune escape, affinity, host range, and pathogenesis of coronaviruses, although strong evidence is currently lacking. Much more experimental work is needed. The study of the effects of glycosylation on the RBD will provide clues for further study of coronaviruses. The functions of glycosylation remain a huge challenge because of the heterogeneity and complexity of glycosyltransferases and glycosylation sites [41]. The analysis in this article focuses only on primary structure. The functions of glycans or glycosylation sites at the level of secondary and tertiary structure are also not yet fully understood. Another possible scenario is that the glycans have no direct effect but instead affect the conformational stability of the spike glycoprotein. Fortunately, mass spectrometry, chromatography, X-ray crystallography, cryo-electron microscopy (cryo-EM), and other techniques will be beneficial for studies of glycosylation.

Conflict of Interest:
All authors declare that they have no conflict of interest. Consent for publication: Written informed consent for publication was obtained from all participants. Availability of data and materials: The authors confirm that the data supporting the findings of this study are available within the article. Ethical approval: Not applicable. This article does not contain any studies with human participants or animals performed by any of the authors.