SARS-CoV-2 poses a serious global health emergency. Since its emergence in Wuhan city, Hubei province, China, in December 2019, the viral outbreak has resulted in over 2.8 million confirmed cases and 195,000 deaths worldwide [1]. While it is evident that SARS-CoV-2 can transmit efficiently from person to person, it is less clear which other mammalian species are at high risk of being infected. Both SARS-CoV and SARS-CoV-2 genomes encode a spike (S) protein that uses the human ACE2 receptor to mediate viral entry into the host cell [2-7]. Mechanistically, two S protein domains are involved during Betacoronavirus infection of mammalian cells that express ACE2 receptors: the S1 domain interacts with the ACE2 receptor [8,9], and the S2 domain undergoes structural rearrangements to mediate membrane fusion [4]. The ACE2 regions that interact with the S1 domain are primarily located in α-helix 1 and β-sheet 5 [4]. Specifically, previous SARS-CoV studies have identified many ACE2 genetic hotspots that affect viral infectivity, and the efficacy of the interaction between S proteins and ACE2 receptors is a good predictor of the severity of Betacoronavirus infection [4] and potentially plays an important role in subsequent viral replication [10,11]. Hence, determining the key sites on the human ACE2 receptor involved in binding SARS-CoV-2 S proteins, and how they differ from those of other mammalian ACE2 receptors, is crucial for insights into viral infectivity, host range, and the patterns of viral evolutionary adaptation that have allowed interspecies transmission and infection of cells expressing human ACE2. Answering these questions is important to 1) improve our ability to predict and control future pandemics, 2) facilitate efforts in the development of a potential SARS-CoV-2 vaccine, and 3) manage and protect wildlife and domesticated animals.
A potential source of the 2002-2003 SARS-CoV outbreak was the masked palm civet. Palm civet ACE2 receptors bind efficiently to S proteins of Civet-CoV isolated from infected palm civet hosts (strain SZ3), of SARS-CoV from patients in the severe 2002-2003 outbreak (strain TOR2), and of SARS-CoV from patients in the less severe 2003-2004 outbreak (strain GD) [12-15]. In human cells, by contrast, ACE2 receptors bind efficiently only to S proteins from SARS-CoV TOR2, and not to those from Civet-CoV SZ3 or from the less severe GD variant [4]. Furthermore, early studies found that differences in binding domains explain why S protein-mediated SARS-CoV infection is efficient in humans and palm civets but not in rats [4], and that introducing point mutations in the ACE2 gene variably changed the efficiency of receptor binding in palm civets, humans, and rats. For example, experimentally mutating rat ACE2 His353 to human Lys353 changed the rat ACE2 receptor from one that binds S protein poorly into one that binds it efficiently; introducing human ACE2 residues 82 to 84 into rat ACE2 increased S1 protein binding; and changing human ACE2 Met82 to rat ACE2 Asn82 interfered with S protein-mediated entry. Changing human ACE2 Lys31, Tyr41, Asp355, and Arg357 also interfered with S protein binding. In contrast, other ACE2 site differences, such as glycine 354 in human versus aspartic acid 354 in palm civet, and differences in residues 90-93 that potentially interact with S protein residue 479, did not contribute to a difference in the efficacy of S protein binding [10]. Hence, although major sequence differences are found between palm civet and human ACE2 in α-helix 1 (residues 30-40) and in α-helix 3 (residues 90-93), they do not contribute to differences in the efficacy of S protein binding. In brief, mutation experiments by Li, et al. [3] show that conservation of Lys31 and Tyr41 on α-helix 1, residues 82-84 in the vicinity of α-helix 3, and residues 353-357 in β-sheet 5 of human ACE2 may be crucial for efficient S protein binding in SARS-CoV, because their replacement alters the binding dynamics between the S protein and the ACE2 receptor.
To illustrate the importance of comparing human and other mammalian ACE2 at key sites crucial for S protein binding, Figure 1 compares ACE2 gene similarities in a sample of mammalian species in two ways: global similarity, reflected by the phylogenetic tree built from the ACE2 alignment (Fig. 1A), and local similarity at the ACE2 key sites shown to be crucial for S protein binding in the mutation experiments by Li, et al. [3] (Fig. 1B). Figure 1A shows that the human ACE2 gene shares higher global similarity with the ACE2 genes of other Primates and of species in the Rodentia order than with those of species in the Carnivora, Artiodactyla, and Chiroptera orders. Global ACE2 gene comparison is therefore not a good predictor of which species are at high risk of being infected by SARS-CoV, because it cannot recover two pieces of information: that the progenitor of SARS-CoV is likely a bat (Chiroptera order) virus, and that the probable intermediate host is the masked palm civet (Carnivora order). In contrast, by comparing the ACE2 gene at key residues involved in human SARS-CoV infection, we found that certain species in the Carnivora, Artiodactyla, and Chiroptera orders are at higher risk of being infected by SARS-CoV-2, even though they share less global ACE2 similarity with humans than other Primates and Rodentia species do. Indeed, global similarity between palm civet ACE2 and human ACE2 does not identify the palm civet as a high-risk species, but key-site similarity does suggest the palm civet as a high-risk intermediate host.
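The contrast between global and key-site similarity can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three aligned fragments and the key columns below are hypothetical toy data, not real ACE2 sequences or the sites from Li, et al., and `percent_identity` is a helper name introduced here, not a function from this study.

```python
def percent_identity(reference, other, columns=None):
    """Percent of compared alignment columns where `other` matches `reference`.

    `columns` is an optional list of 0-based alignment columns (the "key sites").
    When it is None, every column is compared (global identity).
    Columns where the reference carries a gap ('-') are skipped.
    """
    if len(reference) != len(other):
        raise ValueError("sequences must come from the same alignment")
    if columns is None:
        columns = range(len(reference))
    pairs = [(reference[i], other[i]) for i in columns if reference[i] != "-"]
    if not pairs:
        raise ValueError("no comparable columns")
    return 100.0 * sum(a == b for a, b in pairs) / len(pairs)


# Hypothetical aligned fragments (toy data, NOT real ACE2 sequences).
human = "MSKLTYEDRQ"
species_a = "MSHLTYEDRQ"  # one substitution, but it falls on a key column
species_b = "ASKVTYQDRA"  # four substitutions, none on a key column

# Hypothetical key-site columns (0-based), standing in for binding hotspots.
key_columns = [2, 5, 8]

for name, seq in [("species_a", species_a), ("species_b", species_b)]:
    global_id = percent_identity(human, seq)
    key_id = percent_identity(human, seq, key_columns)
    print(f"{name}: global {global_id:.1f}%, key sites {key_id:.1f}%")
```

In this toy case the sequence with the higher global identity is the one that loses a key site, which is the situation the paragraph describes for global versus local ACE2 comparison.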
Similarly, given that key binding sites in human ACE2 form contacts with the SARS-CoV-2 S protein, it is important to understand how well the ACE2 of different mammals retains these binding sites, in order to find species at high risk of SARS-CoV-2 infection that may act as potential intermediate hosts. ACE2 binding by SARS-CoV and SARS-CoV-2 shares overall structural similarity, likely constrained by the structure of ACE2 [6]. However, the spike protein differs notably between SARS-CoV and SARS-CoV-2, with only about 75% homology [5,16]. Based on two recent crystal structure studies [6,7], different residues on the ACE2 receptor are involved in SARS-CoV-2 and SARS-CoV infections; these differences are summarized in Table 1. In brief, ACE2 Ser19 forms a new hydrogen bond with SARS-CoV-2 S protein Ala475, an interaction that SARS-CoV lacks [7]. ACE2 Asp30 forms a salt bridge with the SARS-CoV-2 S protein at Lys417 but does not interact with the SARS-CoV S protein [6]. Other noteworthy differences are ACE2 Glu35 and Arg393, which interact only with the SARS-CoV-2 S protein (Table 1). These differences may contribute to a difference in binding affinity between the two viruses and suggest distinct viral infectivity and host range. Additionally, because the S proteins of SARS-CoV-2 and SARS-CoV differ structurally, the same ACE2 residues can form different contacts with each. For example, human ACE2 Gln24 interacts with the SARS-CoV-2 S protein at Asn487 but with the SARS-CoV S protein at Asn473 [6] (Table 1), and ACE2 Leu79 and Met82 interact with the SARS-CoV-2 S protein at Phe486 but with the SARS-CoV S protein at Leu472. While Gln24 was not previously documented to affect S protein binding in SARS-CoV [10], the difference in its binding to SARS-CoV-2 may contribute to a difference in binding affinity between the two viruses [7].
Furthermore, the conservation of several ACE2 hotspots may be critical for efficient S protein binding; examples are Lys31 and Lys353 [7], both of which form salt bridges with the S proteins of SARS-CoV-2 and SARS-CoV. In total, 15 residues in the ACE2 receptor’s binding region are potentially crucial for binding the SARS-CoV-2 S protein, four of which interact only with the SARS-CoV-2 S protein (Table 1). At the time this manuscript was submitted, studies that mutate human ACE2 interface residues to assess their binding potential to SARS-CoV-2 were scarce. Hence, we cannot rule out the possibility that all ACE2 interface residues listed in Table 1 are important in SARS-CoV-2 binding. A comparative gene analysis of these 15 binding residues on mammalian ACE2 receptors may provide crucial insights into 1) the identification of mammals at high risk of SARS-CoV-2 infection, and 2) the potential intermediate hosts that facilitated the zoonotic transmission of SARS-CoV-2 to humans.
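The residue counts in this paragraph can be checked against Table 1 with simple set arithmetic. The sketch below transcribes the human ACE2 columns of Table 1 by hand; the variable names are illustrative and are not taken from the study's own code.

```python
# Human ACE2 residues listed in Table 1 as contacting each S protein
# (transcribed by hand from the table; one-letter code plus position).
ACE2_CONTACTS_SARS_COV_2 = {
    "S19", "Q24", "D30", "K31", "H34", "E35", "E37", "D38",
    "Y41", "Q42", "L79", "M82", "Y83", "K353", "R393",
}
ACE2_CONTACTS_SARS_COV = {
    "Q24", "K31", "H34", "E37", "D38", "Y41", "Q42",
    "L79", "M82", "Y83", "Q325", "E329", "N330", "K353",
}


def residue_number(residue):
    """Return the sequence position of a residue label such as 'K353'."""
    return int(residue[1:])


# Residues contacting only one of the two S proteins, and those shared by both.
only_sars_cov_2 = sorted(
    ACE2_CONTACTS_SARS_COV_2 - ACE2_CONTACTS_SARS_COV, key=residue_number
)
only_sars_cov = sorted(
    ACE2_CONTACTS_SARS_COV - ACE2_CONTACTS_SARS_COV_2, key=residue_number
)
shared = sorted(
    ACE2_CONTACTS_SARS_COV_2 & ACE2_CONTACTS_SARS_COV, key=residue_number
)

print(len(ACE2_CONTACTS_SARS_COV_2), "ACE2 residues contact SARS-CoV-2")
print("SARS-CoV-2 only:", only_sars_cov_2)
print("SARS-CoV only:", only_sars_cov)
print("shared:", shared)
```

Under this transcription, the SARS-CoV-2 set has 15 members and the four SARS-CoV-2-only residues are Ser19, Asp30, Glu35, and Arg393, in agreement with the text.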
Besides comparing ACE2 genes to understand the host range of SARS-CoV-2 and SARS-CoV, another important question is how bat SARS-like coronaviruses were transmitted to humans. Given that the S1 receptor-binding domains of SARS-CoV-2 and SARS-CoV differ at distinct sites, we wish to know which mammalian coronaviruses share these sites with the two human SARS coronaviruses. Answering this question facilitates the identification of 1) a progenitor that preceded the common ancestor of SARS-CoV-2 and SARS-CoV, and 2) mammalian coronaviruses that may evolve to exploit these similarities to infect humans. SARS-CoV-2 might have evolved from the bat RaTG13 coronavirus because the two share high S protein sequence similarity. Notably, unlike SARS-CoV, the SARS-CoV-2 S protein contains a four-residue motif (Gly482, Val483, Glu484, and Gly485) in the binding ridge that may facilitate contact with the N-terminal helix of the human ACE2 receptor. The bat RaTG13 virus also contains this four-residue motif [7]. This may explain why RaTG13 could use human ACE2 as its receptor [7], and the simplest assumption is that there was no intermediate host. However, the S protein binding domain of the pangolin Betacoronavirus Pangolin-CoV also shares high sequence similarity with that of SARS-CoV-2, and it also contains the 482-485 four-residue motif [7]. Additionally, the ACE2 receptor is more similar between humans and pangolins than between humans and bats. Indeed, pangolins have been proposed as an intermediate host [17].
If the intermediate host for SARS-CoV-2 is distinct from that of SARS-CoV, then it should meet two criteria. First, the candidate species should share high similarity with humans in the ACE2 binding domains and thus be at high risk of SARS-CoV-2 infection. Second, the SARS-like coronaviruses specific to such high-risk mammals should share similarities in the S1 domain with human SARS-CoV-2 and with bat RaTG13, but not with SARS-CoV. Many residues in the RaTG13 S protein are not fine-tuned for binding human ACE2 [6,7]. Changing bat RaTG13 S protein Lys486 and Tyr493 to SARS-CoV-2 S protein Phe486 and Gln493, respectively, enhanced human ACE2 recognition [7]. Similarly, changing RaTG13 Lys479 to SARS-CoV Asn479 also increased human ACE2 binding. Moreover, residues Leu455 and Asn501 favor human ACE2 binding and are conserved between RaTG13 and SARS-CoV-2 [7], and residues Tyr442, Leu472, Asn479, Thr487 and Tyr505 were also identified as critical sites involved in ACE2 binding [18]. Table 1 details all other S protein residues of SARS-CoV-2 and SARS-CoV that interact with the human ACE2 receptor. Although the S protein amino acid sites may differ, many interact with the same ACE2 residues; some of these are identical amino acids (in blue, Table 1), whereas others are different amino acids with similar chemical properties (underlined, Table 1) [6]. In total, 15 SARS-CoV-2 S protein residues and 13 SARS-CoV S protein residues are critical to ACE2 binding (Table 1). Determining the differences in S protein binding sites among SARS-CoV-2, SARS-CoV, and SARS-related viruses in other mammalian species may shed light on the evolution of the Betacoronaviruses that mediated animal-to-human transmission.
Table 1. Contacting residues at ACE2/SARS-CoV-2 (columns 1 and 2) and at ACE2/SARS-CoV (columns 3 and 4) crystal structure interfaces, retrieved from Shang, et al. [7] and Lan, et al. [6]. Each row specifies the contacts formed between human ACE2 residues and S protein residues (e.g., row 2; ACE2 S19 – SARS-CoV-2 A475). In red are distinct residues on the ACE2 receptor interface for binding with SARS-CoV-2 and SARS-CoV S proteins. In blue are shared S protein amino acids between the two SARS coronaviruses that interact with the same ACE2 residues, although the position of the S protein residue differs between the two viruses. Underlined are different S protein amino acids interacting with the same ACE2 residues, but these interactions share similar biochemical properties.
Human ACE2 | SARS-CoV-2 | Human ACE2 | SARS-CoV
--- | --- | --- | ---
S19¹ | A475¹ | Q24² | N473²
Q24² | N487² | K31¹,² | Y442¹,²
Q24², M82¹,², L79¹,², Y83¹,² | F486¹,² | H34² | L443², N479²
D30² | K417² | E37² | Y491¹,²
K31¹,² | L455¹,², Q493² | D38¹,² | Y436², Y484²
H34², E35¹,² | F456², Q493¹,² | Y41² | Y484², T486², T487²
E37², R393² | Y505² | Q42² | Y436², Y484²
D38¹,² | Y449², Q498² | L79², M82¹,² | L472¹,²
Y41² | Q498², T500², N501² | Y83² | Y475², N473²
Q42² | G446², Y449², Q498² | Q325², E329² | R426²
Y83² | Y489², N487² | N330² | T486²
K353¹,² | N501¹, G502² | K353¹,² | T487¹, G488²

¹ Interacting residues retrieved from Shang, et al. [7]
² Interacting residues retrieved from Lan, et al. [6]
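As a cross-check on the S protein residue totals quoted in the text, Table 1 can be transcribed row by row and queried. The sketch below is a hand transcription with illustrative names (`spike_residues`, `listed_partners`); it assumes that residues sharing a table row are simply "listed together" and does not claim that every ACE2 residue in a multi-residue row contacts every S protein residue in that row.

```python
# Each entry is one table row: (human ACE2 residues, S protein residues).
SARS_COV_2_ROWS = [
    (["S19"], ["A475"]),
    (["Q24"], ["N487"]),
    (["Q24", "M82", "L79", "Y83"], ["F486"]),
    (["D30"], ["K417"]),
    (["K31"], ["L455", "Q493"]),
    (["H34", "E35"], ["F456", "Q493"]),
    (["E37", "R393"], ["Y505"]),
    (["D38"], ["Y449", "Q498"]),
    (["Y41"], ["Q498", "T500", "N501"]),
    (["Q42"], ["G446", "Y449", "Q498"]),
    (["Y83"], ["Y489", "N487"]),
    (["K353"], ["N501", "G502"]),
]
SARS_COV_ROWS = [
    (["Q24"], ["N473"]),
    (["K31"], ["Y442"]),
    (["H34"], ["L443", "N479"]),
    (["E37"], ["Y491"]),
    (["D38"], ["Y436", "Y484"]),
    (["Y41"], ["Y484", "T486", "T487"]),
    (["Q42"], ["Y436", "Y484"]),
    (["L79", "M82"], ["L472"]),
    (["Y83"], ["Y475", "N473"]),
    (["Q325", "E329"], ["R426"]),
    (["N330"], ["T486"]),
    (["K353"], ["T487", "G488"]),
]


def residue_number(residue):
    """Return the sequence position of a residue label such as 'N501'."""
    return int(residue[1:])


def spike_residues(rows):
    """Unique S protein residues across all rows, ordered by position."""
    unique = {spike for _, spikes in rows for spike in spikes}
    return sorted(unique, key=residue_number)


def listed_partners(rows, ace2_residue):
    """S protein residues listed in any row that mentions `ace2_residue`."""
    found = set()
    for ace2_group, spikes in rows:
        if ace2_residue in ace2_group:
            found.update(spikes)
    return sorted(found, key=residue_number)


print(len(spike_residues(SARS_COV_2_ROWS)), "SARS-CoV-2 S protein residues")
print(len(spike_residues(SARS_COV_ROWS)), "SARS-CoV S protein residues")
print("K353 partners:", listed_partners(SARS_COV_2_ROWS, "K353"),
      "vs", listed_partners(SARS_COV_ROWS, "K353"))
```

With this transcription the unique S protein residues number 15 for SARS-CoV-2 and 13 for SARS-CoV, matching the totals stated in the text, and a shared hotspot such as Lys353 can be seen to pair with different S protein positions in the two viruses.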
We performed comparative gene analyses to trace differences in the ACE2 binding regions of ACE2 genes from 131 mammalian species across 18 orders. We found that ACE2 receptors are most strongly conserved among Primates. Among other mammals, the key residues of the human ACE2 receptor for SARS-CoV-2 binding are as conserved, or more conserved, in specific species from the Artiodactyla, Rodentia, Carnivora, Perissodactyla, Pholidota, Lagomorpha, Proboscidea, Sirenia, and Tubulidentata orders than in bat species of the Chiroptera order. Next, we compared the S protein key sites of SARS-CoV-2 and SARS-CoV with those of SARS-like coronaviruses that infect 12 high-risk mammalian species. These include species from the Primates, Perissodactyla, Artiodactyla, Carnivora, Chiroptera, Rodentia and Pholidota orders; the Lagomorpha, Proboscidea, Sirenia, and Tubulidentata orders were excluded because there are no records of species-specific SARS-like coronaviruses for them. Based on sequence similarity, we found that the key S protein residues of SARS-CoV are most similar to those of palm civet Civet-CoV, whereas those of SARS-CoV-2 are most similar to those of Pangolin-CoV, and both human coronaviruses share S protein similarities with bat RaTG13 but not with other mammalian coronaviruses. Together, our results suggest that the progenitors of both SARS-CoV-2 and SARS-CoV are of bat origin. The bat coronavirus infected two distinct species from different orders, palm civets and pangolins, because both share high similarity at key sites with human ACE2 and with the S1 binding sites of the human SARS coronaviruses. The rapid evolution of the bat virus in the intermediate hosts allowed it to adapt separately in each host toward high binding potential between the S protein and human-like ACE2 receptors. Indeed, the S protein binding sites are highly similar between SARS-CoV-2 and Pangolin-CoV but not Civet-CoV, and highly similar between SARS-CoV and Civet-CoV but not Pangolin-CoV.