The emergence of a novel coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), resulted in a pandemic. Here, we used X-ray structures of human ACE2 bound to the receptor-binding domain (RBD) of the spike protein (S) from SARS-CoV-2 to predict its binding to ACE2 proteins from different animals, including pets, farm animals, and putative intermediate hosts of SARS-CoV-2. Comparing the interaction sites of ACE2 proteins known to serve or not serve as receptors allows the definition of residues important for binding. From the 20 amino acids in ACE2 that contact S, up to 7 can be replaced and ACE2 can still function as the SARS-CoV-2 receptor. These variable amino acids are clustered at certain positions, mostly at the periphery of the binding site, while changes of the invariable residues prevent S binding or infection of the respective animal. Some ACE2 proteins even tolerate the loss or acquisition of N-glycosylation sites located near the S interface. Of note, pigs and dogs, which are not infected or are not effectively infected and have only a few changes in the binding site, exhibit relatively low levels of ACE2 in the respiratory tract. Comparison of the RBD of S of SARS-CoV-2 with that from bat coronavirus strain RaTG13 (Bat-CoV-RaTG13) and pangolin coronavirus (Pangolin-CoV) strain hCoV-19/pangolin/Guangdong/1/2019 revealed that the latter contains only one substitution, whereas Bat-CoV-RaTG13 exhibits five. However, ACE2 of pangolin exhibits seven changes relative to human ACE2, and a similar number of substitutions is present in ACE2 of bats, raccoon dogs, and civets, suggesting that SARS-CoV-2 may not be especially adapted to ACE2 of any of its putative intermediate hosts. These analyses provide new insight into the receptor usage and animal source/origin of SARS-CoV-2.
IMPORTANCE SARS-CoV-2 is threatening people worldwide, and there are no drugs or vaccines available to mitigate its spread. The origin of the virus is still unclear, and whether pets and livestock can be infected and transmit SARS-CoV-2 is an important open question. Effective binding to the host receptor ACE2 is the first prerequisite for infection of cells and determines the host range. Our analysis provides a framework for the prediction of potential hosts of SARS-CoV-2. We found that ACE2 from species known to support SARS-CoV-2 infection tolerates many amino acid changes, indicating that the species barrier might be low. Exceptions are dogs and especially pigs, which revealed relatively low ACE2 expression levels in the respiratory tract. Monitoring of animals is necessary to prevent the generation of a new coronavirus reservoir. Finally, our analysis also showed that SARS-CoV-2 may not be specifically adapted to any of its putative intermediate hosts.
As of 30 April 2020, the ongoing pandemic of a novel coronavirus, severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), has developed into a global challenge, with the number of total confirmed cases exceeding 3 million, including more than 200,000 fatalities, thereby causing a major loss to global public health and the world economy. This disease is referred to as the 2019 coronavirus disease (COVID-19) by the World Health Organization (WHO) and was defined as a public health emergency of international concern (PHEIC) on 30 January 2020. Its main clinical symptoms include fever, fatigue, and dry cough. A rather large proportion of patients become critically ill with acute respiratory distress syndrome, similar to patients with severe acute respiratory syndrome (SARS) caused by SARS coronavirus (SARS-CoV) (1–3). SARS emerged in China in 2002-2003 and also rapidly spread worldwide but was contained by public health measures. It is thought that bats and palm civets are the natural and intermediate reservoirs of SARS-CoV (4, 5). Likewise, research suggests that SARS-CoV-2 might have also originated from bats and that pangolins might be the potential intermediate host (6–10). Specifically, SARS-CoV-2 has a high nucleotide sequence identity with bat coronavirus strain RaTG13 (Bat-CoV-RaTG13) except for the middle part of its genome, which encodes the spike protein, which might have derived via recombination from a pangolin coronavirus (Pangolin-CoV)-like virus (6, 11–13). A previous study showed that SARS-CoV-2 replicates poorly in dogs and pigs but that cats are permissive to infection (14). However, whether pets can become new hosts of SARS-CoV-2 needs to be clarified further.
The structure of the trimeric spike protein (S) of SARS-CoV-2, the major factor that determines cell and host tropism, has been determined (15–18). It is cleaved by host proteases into the S1 subunit, which contains the receptor-binding domain (RBD), and S2, which mediates fusion of the virion with cellular membranes (19). Proteolytic cleavage of S at two sites (S1/S2 and S2′) is required to prime the protein to execute its fusion activity. Cleavage is performed by the transmembrane serine protease TMPRSS2, but in its absence, the lysosomal protease cathepsin L can substitute (20, 21). S of SARS-CoV-2 contains an insertion of four amino acids (aa) that creates a cleavage site for the ubiquitous protease furin (17, 22, 23).
SARS-CoV-2 and SARS-CoV share the same membrane protein, angiotensin-converting enzyme 2 (ACE2), as the cellular receptor (17, 24). In this study, we used a comparative bioinformatics, structural, and experimental approach to better understand the source and host range of emerging SARS-CoV-2. We compared recently released X-ray structures of human ACE2 bound to the receptor-binding domain of S from SARS-CoV-2 with the structure of S of SARS-CoV bound to human ACE2 (25–28). Using ACE2 sequences of species that can or cannot serve as a receptor for SARS-CoV-2, we propose amino acids that are crucial for binding (13). We report that pets (cats) and domestic animals are at risk of infection with SARS-CoV-2 since they have fewer amino acid changes at the ACE2-S interface than ACE2 proteins from animals that are known to serve as the SARS-CoV-2 receptor. In China, Europe, and elsewhere, pets have long accompanied humans, and thus they may be a neglected source worthy of further research. In addition, using RBD sequences of viruses from putative intermediate hosts, we considered which of them is more likely to bind to human ACE2. We also compared O- and N-glycosylation sites and a putative integrin binding motif in the S proteins of these viruses.
RESULTS AND DISCUSSION
Interacting amino acids of human ACE2 and SARS-CoV-2 and SARS-CoV S proteins.
Three crystal structures of the receptor-binding domain of S from SARS-CoV-2 in a complex with its receptor, human ACE2, have been resolved recently (26–28). The protein interaction surface (PDB file 6M0J) is depicted in Fig. 1A (28). A total of 17 residues in the spike protein are in contact with 20 amino acids in ACE2, 8 of which (shown in orange) form hydrogen bonds with 13 residues in S (see Table 1). The other interactions are mostly hydrophobic, involving many tyrosine residues in the viral spike. A substantial part of the binding energy might be provided by the formation of a salt bridge between the negatively charged Asp30 in ACE2 and the positively charged Lys417 in the spike protein, which is located in the middle of the interaction surface.
A second crystal structure (PDB file 6LZG) identified one more contact site: Ser19 in ACE2 interacts with Ala457 and Gly458. Ser19 is the first amino acid in ACE2 of all the crystals, and hence its side chain might be more flexible (26). The third crystal structure (PDB file 6VW1) identified three more sites, Leu45, Gln325, and Glu329, which two other papers had identified as unique for binding to S of SARS-CoV, but it did not identify the salt bridge involving Asp30 (26, 27). However, the authors used a chimeric S protein that contains the receptor-binding domain of SARS-CoV plus the loop that maintains a salt bridge between Arg426 and Glu329 in ACE2 (see below). Thus, this chimeric spike might not display the authentic binding properties of SARS-CoV-2.
It has been reported in some studies that S of SARS-CoV-2 binds with higher affinity to its receptor than does S of SARS-CoV (15, 17, 26, 29). Wang et al. and Lan et al. identified more amino acids in SARS-CoV-2 than in SARS-CoV that interact with ACE2 and that also form more hydrogen bonds and van der Waals contacts (26, 28). To visualize the amino acids involved in binding of both S proteins, we used the crystal structure of S from SARS-CoV bound to human ACE2 (25). The contact surface of S is formed by a similar number of residues (16), which interact with 20 amino acids in ACE2, but three of the latter residues are unique for binding to S of SARS-CoV (Fig. 1B, labeled in cyan; Table 1). Surprisingly, two of the unique residues (Gln325, Glu329) are involved in the formation of the only salt bridge with Arg426 in S, which is located at the periphery of the binding site. The side chain of Asp30, which forms the salt bridge with SARS-CoV-2 S, is pointing away from the interacting surface and is thus not available for binding. In summary, although most of the residues in ACE2 involved in binding to S of SARS-CoV-2 and SARS-CoV are identical, some are unique, suggesting that coronaviruses can adapt in multiple ways to the human ACE2 receptor.
Comparison of the interacting surfaces of ACE2 proteins that serve and do not serve as SARS-CoV-2 receptors.
It has been shown that transfection of HeLa cells with genes encoding human, pig, civet, or bat ACE2, but not mouse ACE2, makes them susceptible to infection with SARS-CoV-2 (13). To estimate which of the interacting amino acids in ACE2 are essential for binding of S, we compared the ACE2 sequence from humans with those from the other species. Table 1 shows the amino acids that make contact with S from both SARS-CoV-2 and SARS-CoV (red numbers), only with S from SARS-CoV (blue numbers), only with S from SARS-CoV-2 (green numbers), and some variable amino acids in the vicinity (black numbers), as well as some N-glycosylation sites (highlighted in gray).
Pig ACE2 contains five amino acid substitutions in the interacting surface with S from SARS-CoV-2 relative to human ACE2 (Fig. 2A). Three of the exchanges are located at the periphery of the binding site. Leu79 and Met82, which interact with Phe486 in S, are conservatively replaced by Ile and Thr, respectively. Gln24, which forms a hydrogen bond with Asn487 in S, is replaced by Leu, which cannot form hydrogen bonds. His34 in the center of the binding site is also replaced by a Leu residue, which is larger but cannot form a hydrogen bond. Furthermore, Asp30, which forms the central salt bridge, is exchanged for a Glu residue. Since Glu is also negatively charged, the salt bridge to Lys417 most likely remains intact. It might even become stronger, since the side chain of Glu is longer and hence the distance between the negatively and positively charged residues becomes smaller. In addition, the N-glycosylation site Asn90 in human ACE2 is destroyed by an exchange to Thr.
Three amino acid sequences encoded by the ACE2 gene from the bat species Rhinolophus sinicus are present in the database. The bats were sampled from R. sinicus colonies in three Chinese provinces, including Guangxi (R. sinicus-GX), Hubei (R. sinicus-HB), and Yunnan (R. sinicus-YN) (30, 31). Strikingly, although their overall amino acid identity is very high (99%), they exhibit large amino acid differences in the N-terminal amino acids that contact the S protein (Table 1). Since the accession number of the bat ACE2 sequence is not specified in the publication that demonstrated that it confers susceptibility to SARS-CoV-2 infection, it is presently unclear which of the ACE2 proteins is recognized by the viral spike protein (13). In any case, bat ACE2 proteins contain at least five changes relative to human ACE2. Three of the 5 amino acids are replaced in all bat ACE2 sequences and involve the same residues as in pig ACE2, but they are replaced by different amino acids in pig ACE2. In the bat sequences, Gln24 is replaced by the negatively charged Glu or by a positively charged Arg, His34 is replaced by Thr or Ser, and Met82 is replaced by an Asn. In one bat sequence (R. sinicus-YN), the Pro residue 84 is also replaced by a Ser, which creates a new N-glycosylation site (NXS) at position 82. In addition, in two bat sequences (R. sinicus-YN and R. sinicus-GX), Thr27 is replaced by Met or Ile, and Tyr41, which forms a hydrogen bond with Thr500 in S, is replaced by the smaller His residue. The most drastic substitution in these two bat ACE2 sequences is the replacement of Glu35, which forms a hydrogen bond with Gln493 in S, by a positively charged Lys residue. The ACE2 sequence obtained from a bat in the Hubei province, which exhibits more amino acid substitutions than either of the two other ones, does not contain these last three changes, but instead Arg31 is replaced by a negatively charged Glu and Asp38 by Asn. 
It is striking that bat ACE2 proteins exhibit so many changes in residues that interact with S, and the reason requires further investigation. However, it is tempting to speculate that local coevolution between bats and coronaviruses drives these amino acid changes.
To gain further insight into the amino acids not essential for binding to S, we analyzed ACE2 from civets (Paguma larvata), which has been shown to serve as the receptor for SARS-CoV-2 (13). Civet ACE2 contains seven amino acid changes relative to human ACE2 (Fig. 2B). Three of them (Gln24Leu, Asp30Glu, and Met82Thr) are identical to the substitutions in pig ACE2. Another change is at the same position as that in pig ACE2 (residue 34), but His is exchanged for a different amino acid, Tyr instead of Leu. Another unique but conservative substitution, Leu45Val, is located at the periphery of the binding site, whereas the other three are in the center. Asp38Glu is a conservative change, but Lys31Thr and Glu37Gln replace a charged with an uncharged amino acid.
Mouse ACE2, which does not support infection of cells with SARS-CoV-2, has eight amino acid substitutions in the interacting surface with S of SARS-CoV-2 (Fig. 2C). Three of the sites, Gln24, His34, and Met82, are also replaced in the ACE2 proteins from the two other species and are thus unlikely to be the decisive elements that prevent binding. Leu79 interacts with Phe486 in S (Fig. 6A) and is replaced by a Thr. In contrast to bat ACE2, the substitution at position 82 does not create an N-glycosylation site in mouse ACE2 since it is exchanged for Ser. Note also that the two used N-glycosylation sites near the interacting surface in human ACE2, Asn322 and Asn90, are lost in mouse ACE2 due to exchanges of the Asn residues. The other four exchanges, Asp30Asn, Tyr83Phe, Lys31Asn, and Lys353His, are more important for preventing binding to mouse ACE2, as discussed in more detail below.
In summary, binding of S of SARS-CoV-2 to ACE2 proteins tolerates a surprisingly large number of amino acid changes in the interaction surface, five in pig ACE2 and seven in civet ACE2.
Comparison of the interacting surfaces of human ACE2 and ACE2 from pet animals.
We then asked whether receptor binding might present a species barrier for infection of pets with SARS-CoV-2. Dog ACE2 contains five amino acid changes in the amino acids in contact with S (Fig. 3A). Residues 24, 30, 34, and 82 are also replaced in pig ACE2, at three of the sites (24, 30, and 82) even by the same amino acid. A unique change in dog ACE2 is Asp38Glu, but since this is a conservative change, it is unlikely to affect the binding properties of S. Like pig ACE2, dog ACE2 lacks the glycosylation site at position 90: Asn90 is replaced by Asp in dogs but by Thr in pigs. Note that the sequence of dog ACE2 in the database contains a nonconservative exchange at position 329 (Glu to Gly), which contacts the S protein of SARS-CoV. This exchange is not present in the ACE2 from a beagle that we sequenced (GenBank accession number MT269670).
Cat ACE2 contains only four changes (Gln24Leu, Asp30Glu, Asp38Glu, Met82Thr), all of which are also present in dog ACE2 (Table 1). After submission of this paper, results of infection experiments with animals in close contact with humans were published. That study showed that SARS-CoV-2 replicates poorly in dogs but efficiently in cats and was transmitted by droplets to naive cats (14). The only contact residue that is changed in dog ACE2 but not in cat ACE2 is His34, which interacts with Tyr453, Leu455, and Gln493 in the center of the interaction surface. It is replaced by the slightly larger Tyr residue in dog ACE2, which is still able to interact with the same residues in S (Fig. 3A, inset). The other difference is the loss of the N-glycosylation site at position 90 in dog ACE2.
There is anecdotal evidence that tigers and lions in the Bronx Zoo of New York City have been infected by SARS-CoV-2 (https://www.aphis.usda.gov/aphis/newsroom/news/sa_by_date/sa-2020/ny-zoo-covid-19). Therefore, we analyzed the ACE2 gene from these wild cats. One amino acid difference was detected between ACE2 from cat and tiger, but the residues contacting S are identical, explaining why tigers are also susceptible to SARS-CoV-2 infection. ACE2 from a lion has another conservative change relative to cat ACE2: His34 is replaced by Ser, and it exhibits a loss of the N-glycosylation site at position 90, like dog ACE2.
Comparison of the surfaces of human ACE2 and ACE2 from species suitable as animal models.
There is an urgent need for animal models in which to explore potential therapies for COVID-19. Recent studies showed that ferrets are highly susceptible to infection with SARS-CoV-2 and transmit the virus to naive animals by direct contact and also by droplets, although the latter route was less efficient (14, 32). Ferret ACE2 exhibits the same five sequence changes as dog ACE2 and, in addition, a replacement of Leu79 by His. A further, drastic change occurs at position 354, where the small glycine residue is replaced by a large and positively charged Arg residue (Fig. 3B). However, the Arg residue avoids clashing with large amino acids in S (Tyr505) by protruding backwards (Fig. 3B, inset).
The Syrian hamster (Mesocricetus auratus) has also been shown to be susceptible to experimental infection and to transmit SARS-CoV-2 to close contact animals (33). Its ACE2 protein contains only two amino acid substitutions relative to human ACE2. His34 is exchanged for Gln, and Met82 is replaced by Asn. Since residue 84 is also exchanged for a Ser, it creates an N-glycosylation site at position 82. A glycan attached to the same position prevents binding of S of SARS-CoV to rat ACE2.
Guinea pigs (Cavia porcellus) might also serve as an animal model but are also common pets, especially of children. Guinea pig ACE2 contains seven amino acid changes, at positions 24, 27, 31, 34, 38, 82, and 354, and thus more than ACE2 from ferrets or dogs. The glycosylation site at position 90 is preserved, but the site at position 322 is lost due to an Asn-to-Pro change (Table 1).
Comparison of the interacting surfaces of human ACE2 and ACE2 from farm animals.
Farm animals are also in close contact with humans and thus represent another risk group that might become infected by SARS-CoV-2. The ACE2 protein from chickens contains 10 amino acid changes compared with human ACE2 and has lost the N-glycosylation site at position 90 (Fig. 3C). Some of the affected positions (Gln24, His34, Leu79, Met82, and Gly354) are also exchanged in ACE2 proteins that serve as the SARS-CoV-2 receptor, albeit often to different residues. Unique to all ACE2 proteins is the change of Gln24 to the negatively charged Glu residue, but the interaction with Gly446 and Tyr449 in S is probably preserved (Fig. 3C, inset). Most of the other exchanges are likely to be more critical for binding to S. Tyr at position 83 is replaced by a Phe, which is not able to form a hydrogen bond with S. Two of the changes reverse the polarity of a charged amino acid: Lys31 is replaced by a negatively charged Glu, and Glu35 is exchanged for a positively charged Arg. Finally, Asp30, which forms the salt bridge with S, is replaced by the uncharged residue Ala, which makes the ACE2 proteins from chickens and mice the only ones that are not able to form a salt bridge with S.
Duck ACE2 also contains 10 amino acid substitutions, 9 of them at the same positions as in chicken ACE2 and 8 to the same amino acid, including all the presumably important ones just discussed. Thus, it seems likely that the lack of susceptibility of chickens and ducks to experimental SARS-CoV-2 infection is due to the inability of the virus to bind to the avian ACE2 receptor (14).
The ACE2 protein of pigs can serve as a SARS-CoV-2 receptor, although it contains five changes in amino acids contacting the S protein (Fig. 2A). ACE2 proteins from cattle and sheep contain only two amino acid changes (Asp30Glu and Met82Thr), both of which are also present in pig ACE2, and even retain the N-glycosylation site at position 90 of human ACE2 that is lost in pig ACE2. It thus seems highly likely that ACE2 proteins of both species can function as SARS-CoV-2 receptors, and experimental infection of these animals and surveillance are required to show whether they are susceptible to SARS-CoV-2. Camels are the animal reservoir of the Middle East respiratory syndrome coronavirus (MERS-CoV), which, however, uses dipeptidyl-peptidase 4 as the protein receptor. We analyzed the ACE2 gene from Camelus bactrianus and found, in addition to the two changes present in cattle and sheep, another substitution: the positively charged Lys31 located at the center of the binding site is replaced by a negatively charged Glu residue, a change which is also present in guinea pig.
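The species comparisons above can be tabulated with a short script. The following is only a minimal sketch: the substitution lists are transcribed by hand from the preceding paragraphs (not from Table 1), restricted to residues that contact S of SARS-CoV-2, and the dictionary and function names are illustrative assumptions rather than part of the analysis pipeline used in this study.

```python
import re

# Substitutions at ACE2 residues contacting SARS-CoV-2 S, relative to human ACE2,
# transcribed from the text (one-letter code: human residue, position, replacement).
SUBSTITUTIONS = {
    "pig": ["Q24L", "D30E", "H34L", "L79I", "M82T"],
    "civet": ["Q24L", "D30E", "K31T", "H34Y", "E37Q", "D38E", "M82T"],
    "dog": ["Q24L", "D30E", "H34Y", "D38E", "M82T"],
    "cat": ["Q24L", "D30E", "D38E", "M82T"],
    "ferret": ["Q24L", "D30E", "H34Y", "D38E", "L79H", "M82T", "G354R"],
    "hamster": ["H34Q", "M82N"],
    "cattle": ["D30E", "M82T"],
}


def position(code: str) -> int:
    """Return the residue number from a substitution code such as 'H34Y'."""
    match = re.fullmatch(r"[A-Z](\d+)[A-Z]", code)
    if match is None:
        raise ValueError(f"not a substitution code: {code!r}")
    return int(match.group(1))


def shared_changes(a: str, b: str) -> list[str]:
    """Substitutions found in both species with the same replacement residue."""
    return sorted(set(SUBSTITUTIONS[a]) & set(SUBSTITUTIONS[b]), key=position)


def same_site_different_residue(a: str, b: str) -> list[int]:
    """Positions changed in both species but to different amino acids."""
    by_pos_a = {position(c): c for c in SUBSTITUTIONS[a]}
    by_pos_b = {position(c): c for c in SUBSTITUTIONS[b]}
    return sorted(p for p in by_pos_a.keys() & by_pos_b.keys()
                  if by_pos_a[p] != by_pos_b[p])


if __name__ == "__main__":
    # List species from fewest to most changes in the binding site.
    for species, changes in sorted(SUBSTITUTIONS.items(), key=lambda kv: len(kv[1])):
        print(f"{species:8s} {len(changes)}  {', '.join(changes)}")
    print("dog/pig identical changes:", shared_changes("dog", "pig"))
    print("dog/pig same site, other residue:", same_site_different_residue("dog", "pig"))
```

Such a table only counts changes; it does not weigh how disruptive an individual substitution is.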
In summary, almost all mammalian species known to be susceptible to SARS-CoV-2 infection (cats and ferrets) have mutations in many amino acids in their ACE2 proteins. This suggests that these species, especially those in contact with humans, are at risk of contracting the virus and that SARS-CoV-2 might establish itself in one of these animals, thereby creating an additional animal reservoir. An exception is apparently pigs, which cannot be infected with SARS-CoV-2, although their ACE2 proteins can function as a SARS-CoV-2 receptor (13). In addition, dogs do not transmit the virus to naive animals in close contact. We therefore investigated the level of ACE2 expression in different organs by quantitative real-time PCR (qRT-PCR). We found that pigs and dogs have the highest mRNA levels in kidney, but the level is also high in other internal organs, such as heart (pigs and dogs) and duodenum and liver (pigs) (Fig. 4). The relative mRNA levels in tissues relevant for SARS-CoV-2 infection, lung, trachea and turbinate, are very low, 1,000-fold (dogs) or 10,000-fold (pigs) lower than in kidney. Among these organs, pigs exhibit the highest mRNA levels in trachea, around 20-fold less in lungs, and very little in turbinates. In dogs, the mRNA level is around 20-fold higher in lungs than in trachea; no mRNA was detected in turbinates. It is thus tempting to speculate that the low ACE2 levels in the respiratory tract prevent the infection of pigs and the efficient replication and hence spread of the virus to contact animals in dogs. However, efficient receptor binding is only the first step in virus infection. Beyond that, numerous restriction factors have been identified in cells from various species that prevent successful virus replication. Our results mean that more experiments and monitoring should be done to explore the source of animal infection.
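To put the reported fold differences into perspective, they can be converted into differences in qRT-PCR threshold cycles. The sketch below assumes an idealized amplification efficiency of 100% (template doubling in each cycle); how the relative mRNA levels were actually calculated is not described in this section, so the function name and the efficiency parameter are illustrative assumptions only.

```python
import math


def fold_to_delta_ct(fold: float, efficiency: float = 1.0) -> float:
    """Threshold-cycle difference that corresponds to a fold difference in template.

    efficiency = 1.0 is the idealized case in which the amount of product
    doubles in every cycle (amplification factor 1 + efficiency = 2).
    """
    if fold <= 0 or efficiency <= 0:
        raise ValueError("fold and efficiency must be positive")
    return math.log(fold) / math.log(1.0 + efficiency)


if __name__ == "__main__":
    for label, fold in (("dog, respiratory tract vs. kidney", 1_000),
                        ("pig, respiratory tract vs. kidney", 10_000)):
        print(f"{label}: about {fold_to_delta_ct(fold):.1f} cycles")
```

Under this assumption, a 1,000-fold difference corresponds to roughly 10 cycles and a 10,000-fold difference to roughly 13 cycles.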
Amino acids in ACE2 essential for binding of SARS-CoV-2.
Based on an amino acid comparison of ACE2 proteins from animals that do not serve as SARS-CoV-2 receptors (mice) or are not susceptible to SARS-CoV-2 infection (chickens) with ACE2 proteins from animals that can be infected (cats, dogs, ferrets) or that carry an ACE2 protein that confers susceptibility to SARS-CoV-2 infection (civets), the amino acids essential for binding of S can be deduced (Fig. 5; Table 1). Tyr83, which forms hydrogen bonds with Asn487 and Tyr489 in S, is replaced by Phe in mice, chickens, and ducks. Phe has the same size and hydrophobicity as Tyr but is not able to form hydrogen bonds with its side chain. All other animals exhibit a Tyr residue at position 83. Note, however, that Tyr489 might be shielded by the acquisition of an N-glycosylation site at position 82, as occurs in ACE2 of Syrian hamsters and in some bat sequences. The other residues are located in the center of the interaction surface. Lys31, which forms van der Waals contacts with Tyr489 and Phe456, is replaced by a neutral Asn residue in mice and even by a negatively charged Glu in chickens. Glu35 forms hydrogen bonds with Gln493 in S. It is replaced by a positively charged Arg in chicken and duck ACE2 and by a positively charged Lys in some bat sequences. Lys353, which forms hydrogen bonds with its side chain to Tyr495, Gly496, and Gly502, is replaced by the smaller His residue in mice, which presumably cannot form these hydrogen bonds. ACE2 of all other animals (including chickens) contains a Lys at this position. In support of this hypothesis, a single Lys353Ala mutation has been shown to abolish the ACE2-S interaction (34). Finally, also important is the centrally located Asp30, which is replaced by Asn in mice and by Ala in chickens. Asn has the same size as Asp, but both Asn and Ala are uncharged and thus unable to sustain the salt bridge with Lys417 in S. ACE2 proteins from all other animals either retain Asp or it is replaced by the negatively charged but slightly larger Glu residue, which is probably also able to form the salt bridge.
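The reasoning in this paragraph can be expressed as a simple rule: flag every one of the five presumably essential positions (30, 31, 35, 83, and 353) at which a species carries a residue other than the human one, treating Asp30Glu as tolerated. This is a minimal sketch under stated assumptions: the residues are transcribed from the text (Ala30 in chickens as given in the section on farm animals), and the names and the rule itself are illustrative, not a validated predictor of receptor function.

```python
# Human residues at the five positions discussed as essential for binding of S.
HUMAN_KEY = {30: "D", 31: "K", 35: "E", 83: "Y", 353: "K"}

# Replacements described in the text as likely compatible with binding
# (Glu at position 30 probably preserves the salt bridge with Lys417).
TOLERATED = {30: {"E"}}

# Residues that differ from human ACE2 at the key positions, per species.
OBSERVED = {
    "mouse": {30: "N", 31: "N", 83: "F", 353: "H"},
    "chicken": {30: "A", 31: "E", 35: "R", 83: "F"},
    "civet": {30: "E", 31: "T"},
    "pig": {30: "E"},
    "cat": {30: "E"},
}


def disrupted_positions(species: str) -> list[int]:
    """Key positions at which the species carries a non-tolerated replacement."""
    changes = OBSERVED[species]
    flagged = []
    for pos, human_residue in HUMAN_KEY.items():
        residue = changes.get(pos, human_residue)
        if residue != human_residue and residue not in TOLERATED.get(pos, set()):
            flagged.append(pos)
    return sorted(flagged)


if __name__ == "__main__":
    for species in OBSERVED:
        print(f"{species:8s} {disrupted_positions(species)}")
```

Note that civet ACE2 is flagged at position 31 although it serves as a receptor, which illustrates that a single change at one of these positions need not abolish binding; the accumulation of several changes, as in mouse and chicken ACE2, appears to matter.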
Note that positions 330, 355, and 357 are conserved in all ACE2 proteins we analyzed here, and thus their relevance for binding to S cannot be estimated (Table 1). In addition, binding of S to ACE2 might not follow a simple “lock-and-key” principle. The mouse-adapted SARS-CoV strain MA15 contains a single amino acid exchange in the S protein relative to the Urbani strain; Tyr at position 346 is replaced by a His residue (35). Tyr346 forms hydrogen bonds with Asp38 and Gln42 in human ACE2 (Fig. 1B), but mouse ACE2 also contains Asp and Gln at positions 38 and 42, respectively. His is also able to serve as a hydrogen bond donor or acceptor, but its side chain is shorter, and it is not obvious how this exchange enhances binding to mouse ACE2.
Comparison of amino acids in the receptor-binding motif (RBM) of SARS-CoV-2 with bat and pangolin coronaviruses.
It has been reported that SARS-CoV-2 derived from Bat-CoV-RaTG13, but parts of the S protein exhibit a higher nucleotide similarity to a virus from pangolin. Therefore, we aligned the whole amino acid sequence of S from SARS-CoV-2 (GenBank accession number QHD43416.1) with that of Bat-CoV-RaTG13 (QHR63300.2) and Pangolin-CoV (EPI_ISL_410721). The Pangolin-CoV nucleotide sequence was obtained from GISAID and translated into protein. As noted before, S from SARS-CoV-2 contains an insertion of four amino acids (PRRA) that creates a polybasic cleavage site recognized by the ubiquitous protease furin (22, 34). Insertion of amino acids at the S1-S2 junction can also occur in bats, since the novel bat-derived virus RmYN02 contains the insertion of amino acids PAA, which does not create a polybasic cleavage site (36). The following S2 subunit is almost completely conserved among all three viruses; it exhibits nine mostly conservative amino acid substitutions in S of Pangolin-CoV and only two in S of Bat-CoV-RaTG13. The N terminus of the S protein up to residue 400 is also highly similar between S proteins of SARS-CoV-2 and Bat-CoV-RaTG13. There are only six amino acid exchanges in Bat-CoV-RaTG13, two of them affecting N-glycosylation sites, whereas S from Pangolin-CoV contains 101 amino acid differences compared with S of SARS-CoV-2. However, from residue 401 to 518, which contains the receptor-binding domain of S, the homology is reversed (Table 2).
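Region-wise comparisons of this kind amount to counting mismatched columns of a pairwise alignment within a window defined in reference numbering. The following is a minimal sketch that assumes the two sequences have already been aligned with gaps written as "-"; the function name and the toy sequences are hypothetical and are not the S protein sequences analyzed here.

```python
def region_differences(ref_aln: str, other_aln: str, start: int, end: int) -> tuple[int, int]:
    """Count (substitutions, indel columns) between two aligned sequences.

    start and end are 1-based, inclusive residue numbers of the reference
    (first) sequence; gap characters in the reference are not numbered.
    """
    if len(ref_aln) != len(other_aln):
        raise ValueError("aligned sequences must have the same length")
    substitutions = indels = 0
    pos = 0  # number of the most recent reference residue
    for r, o in zip(ref_aln, other_aln):
        if r != "-":
            pos += 1
        if pos < start or pos > end:
            continue
        if r == "-" or o == "-":
            if r != o:  # a gap in exactly one of the two sequences
                indels += 1
        elif r != o:
            substitutions += 1
    return substitutions, indels


if __name__ == "__main__":
    # Toy alignment: the reference carries a three-residue insertion.
    ref = "MKTAYRRAGD"
    other = "MKSAY---GE"
    print(region_differences(ref, other, 1, 5))
    print(region_differences(ref, other, 6, 10))
```

Counting substitutions and insertions or deletions separately keeps an insertion such as the one at the S1-S2 junction from inflating the number of amino acid exchanges.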
Bat-CoV-RaTG13 contains 18 amino acid substitutions, 5 of which involve amino acids that contact human ACE2 (Fig. 6A, white sticks). Located at the periphery of the interaction surface is Phe486, which interacts with Leu79 in ACE2. It is replaced by a Leu, which is also a large and hydrophobic residue. At the other side of the interaction surface is located Asn501, which forms a hydrogen bond with Tyr41 in ACE2 and is replaced by the negatively charged Glu residue. The other three exchanged amino acids are located in the center of the ACE2 binding site. Gln493, which forms a hydrogen bond with Glu35, is replaced by a Tyr residue. Tyr449, which forms a hydrogen bond with Gln42, is replaced by Phe, which has the same size but cannot form (or forms only weak) hydrogen bonds. The most drastic exchange is probably Tyr505, which forms a hydrogen bond with Gln42 and is replaced by a much smaller His residue.
In contrast, the S sequence from Pangolin-CoV contains only three exchanges relative to SARS-CoV-2 in this region, and only one of them involves an amino acid that contacts ACE2, the replacement of Lys417 by an Arg (Fig. 6A, red stick). Since both are basic amino acids, the important salt bridge is probably preserved. In summary, the RBD from Pangolin-CoV is much better suited to bind to human ACE2 than is that from Bat-CoV-RaTG13. From this point of view, it seems possible that SARS-CoV-2 might have acquired the RBD from a Pangolin-CoV to achieve bat-to-human transmission.
Comparison of the interacting surfaces of ACE2 proteins of putative intermediate hosts.
We therefore asked whether ACE2 of pangolin (Manis javanica) contains amino acids on its interaction surface that might closely resemble those of humans. In that case, a precursor of SARS-CoV-2 might have acquired an RBD from a Pangolin-CoV by recombination that is then already adapted to replicate both in pangolins and in humans. However, pangolin ACE2 contains seven changes compared with human ACE2 (Fig. 6B). In addition, the N-glycosylation site at position 322 of human ACE2 is lost due to a change of Asn to Lys. Some of the changes, Asp30Glu, His34Ser, Asp38Glu, and Leu79Ile, also occur in ACE2 proteins from animals that can interact with S. The three other variable positions are also exchanged in other animals but mostly to other residues. Gly354 is replaced by a small His residue and not by the larger Arg, which is present in ferret ACE2. Gln24, which interacts with Asn487 in S, is replaced by a negatively charged Glu residue. Finally, Met82, which interacts with Phe486 in S, is replaced by the larger Asn residue. Although none of the amino acid changes might prevent binding to S, it nevertheless appears that pangolin ACE2 is not especially equipped to serve as a receptor for SARS-CoV-2.
Other potential intermediate hosts are civets and raccoon dogs (Nyctereutes procyonoides). ACE2 of civets has been shown to confer susceptibility to SARS-CoV-2 infection in cell culture, although it contains seven changes relative to human ACE2 in the amino acids contacting S (Fig. 2B). The ACE2 protein from raccoon dogs contains only five substitutions, Gln24Leu, Asp30Glu, His34Tyr, Asp38Glu, and Met82Thr, which are also present in civet ACE2. It is also identical in these residues to ACE2 from dogs, which are susceptible to SARS-CoV-2 infection but do not spread the virus.
In summary, none of the ACE2 proteins of the discussed intermediate hosts seems to be especially equipped to attach to S of SARS-CoV-2, but the lowest number of changes occurs in ACE2 of raccoon dogs.
Glycosylation sites and a putative integrin binding motif in S of SARS-CoV-2, Bat-CoV-RaTG13, and Pangolin-CoV.
It has previously been shown that an RGD motif (positions 403 to 405, Arg-Gly-Asp) is present in the receptor-binding domain of the spike proteins of all SARS-CoV-2 sequences (37). This sequence mediates attachment of several viruses to integrins, which thus might serve as an additional receptor for SARS-CoV-2. Two RGD motifs are present in S of Pangolin-CoV, one at the same site and another at amino acids 246 to 249, but S of Bat-CoV-RaTG13 contains no RGD motif (Fig. 7A).
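Locating such a short linear motif is a plain substring search. The sketch below reports 1-based start positions of every occurrence; the function name and the toy peptide are hypothetical and do not represent the S protein sequences compared here.

```python
def find_motif(sequence: str, motif: str = "RGD") -> list[int]:
    """Return the 1-based start positions of all occurrences of motif."""
    hits = []
    index = sequence.find(motif)
    while index != -1:
        hits.append(index + 1)  # convert 0-based index to residue numbering
        index = sequence.find(motif, index + 1)
    return hits


if __name__ == "__main__":
    toy_peptide = "AARGDKKRGDL"  # hypothetical sequence with two RGD motifs
    print(find_motif(toy_peptide))
```

Whether a motif found in this way is accessible on the folded protein and actually binds integrins cannot be concluded from the sequence alone.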
Bulky carbohydrates attached to the S protein might mask antibody epitopes and interfere with receptor binding and/or with proteolytic cleavage, which is required to prime the protein to execute its membrane fusion activity. S of SARS-CoV-2 contains 22 N-glycosylation sites (NXS/T), 9 in S2 and 13 in the S1 subunit, all of which are glycosylated with almost 100% stoichiometry if the ectodomain of S is expressed in mammalian cells (38).
A total of 21 glycosylation sites are conserved among all three viruses. S of SARS-CoV-2 contains a unique glycosylation site at position 74, in a loop of the N-terminal S1 domain (Fig. 7B). This region (aa 60 to 80) is dissimilar to that of Pangolin-CoV but identical to S of Bat-CoV-RaTG13, except for the glycosylation site (NGI). Two sites are present only in S of the bat and pangolin coronaviruses, not in SARS-CoV-2. One is also located in the N-terminal domain, at position 30, in a region (aa 15 to 49) that is identical to S of SARS-CoV-2 except for the glycosylation sequon, which is NSS in S of bat but NSF in S of SARS-CoV-2. The N-terminal domain has a galectin-like fold and is known in other coronaviruses to bind to carbohydrates on the cell surface, i.e., using them as an attachment factor in the first step of virus entry (18). Most interesting is the second site, at position 370, which localizes near the ACE2 receptor-binding domain. This residue is located in a region (aa 275 to 400) of high amino acid identity among all three viruses. Thus, it is tempting to speculate that this site was lost during adaptation of SARS-CoV-2 to humans in order to gain better access to the ACE2 receptor.
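The NXS/T sequon search underlying this comparison can be sketched in a few lines. This is an illustration under stated assumptions, not the tool used in this study: the function names are hypothetical, X is taken to be any residue except proline (the usual definition of the sequon, not stated in the text), and sites from different viruses are assumed to have been mapped to a shared alignment numbering before they are compared.

```python
import re

# N-X-S/T with X != Pro. The lookahead reports overlapping sequons as well.
SEQUON = re.compile(r"(?=N[^P][ST])")


def is_sequon(tripeptide):
    """True if the three residues form an N-glycosylation sequon."""
    return re.fullmatch(r"N[^P][ST]", tripeptide) is not None


def find_sequons(sequence):
    """Return the 1-based position of the Asn of every sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence)]


def site_differences(sites_a, sites_b):
    """Return (sites only in a, sites only in b) for aligned positions."""
    only_a = sorted(set(sites_a) - set(sites_b))
    only_b = sorted(set(sites_b) - set(sites_a))
    return only_a, only_b


# The two tripeptides mentioned in the text for position 30.
print(is_sequon("NSS"), is_sequon("NSF"))
```

The check on NSS versus NSF reproduces the point made in the text: a single Ser-to-Phe change at the third position removes the sequon.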
As suggested before, the insertion of amino acids at the cleavage site between the S1 and S2 subunits of SARS-CoV-2 creates three potential GalNAc O-glycosylation sites, which are not predicted for S of the bat or pangolin coronavirus (39). Attachment of GalNAc to serine and threonine residues is catalyzed by up to 20 different GalNAc transferases with different, partially overlapping substrate specificities. The sugar residues are then elongated by other glycosyltransferases, thereby creating long carbohydrate chains. Interestingly, whether a certain O-glycosylation site is used is cell type dependent, and O-glycans attached near the furin cleavage site have been shown to affect processing (40, 41). It is thus tempting to speculate that usage of an O-glycosylation site might determine whether S is cleaved in a certain cell. Since processing by furin is essential for entry of SARS-CoV-2 into cells that lack cathepsin proteases, O-glycosylation might affect cell tropism and hence the virulence of SARS-CoV-2 (42). This is somewhat reminiscent of hemagglutinin (HA) of an avian influenza A virus, where the loss of an N-glycosylation sequon near a polybasic cleavage site allowed processing of HA, thereby generating a highly virulent strain (43). However, the majority of S proteins in transfected or infected cultured cells are cleaved, indicating that the O-glycosylation sites have little (if any) effect on S processing in vitro (17, 20, 21).
MATERIALS AND METHODS
The S proteins of SARS-CoV-2 and Bat-CoV-RaTG13 were obtained from GenBank (accession numbers QHD43416.1 and QHR63300.2, respectively). The Pangolin-CoV nucleotide sequence was obtained from GISAID (EPI_ISL_410721) and translated into protein. Sequences were aligned using MEGA7.0. A total of 23 ACE2 protein sequences of 20 animal species, including human, dog, cat, guinea pig, hamster, mouse, pig, rabbit, cattle, sheep, ferret, raccoon dog, bat (Rhinolophus sinicus), civet, pangolin, tiger, lion, camel, chicken, and duck, were downloaded from GenBank. The ACE2 protein sequences of dog (MT269670) and cat (MT269671) were determined in this study (Table 1). All sequences were aligned using MEGA7.0.
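The translation step mentioned above can be illustrated with the standard genetic code. The sketch below is a stand-in written for illustration, not the software used in this study; the names are hypothetical, translation is assumed to start at the first base of the supplied sequence, and the short input is an invented example rather than viral sequence.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = ("FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
               "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG")
CODON_TABLE = {
    "".join(codon): amino_acid
    for codon, amino_acid in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(nucleotides):
    """Translate from the first base; stop at the first stop codon."""
    sequence = nucleotides.upper().replace("U", "T")
    protein = []
    for index in range(0, len(sequence) - 2, 3):
        # Codons with ambiguous bases are reported as X.
        amino_acid = CODON_TABLE.get(sequence[index:index + 3], "X")
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Invented example: three sense codons followed by a stop codon.
print(translate("ATGTCAGGCTAA"))
```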
The software PyMOL was used to create the figures of the cryoelectron microscopy (cryo-EM) structure of S from SARS-CoV-2 (PDB file 6VSB) and of the crystal structure of its receptor-binding domain bound to human ACE2 (PDB file 6M0J). The amino acids which are mentioned in this publication to mediate contact between S and ACE2 are shown as sticks in the figures, whereas the remainder of the molecules are shown as cartoons. An integrated measuring wizard was used to determine the distance between two atoms; the distances are shown as dotted lines in the figures and are given in angstroms. Exchange of certain amino acids was performed with a mutagenesis tool. Among the different possible rotamers of the mutated amino acid side chain, the one that was chosen exhibited no clashes with neighboring amino acids.
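The two geometric operations described above, measuring an interatomic distance and selecting a rotamer without clashes, can be sketched in plain Python. This is not PyMOL code and does not reproduce its wizards. The function names, the coordinates, and the 2.0 Å clash cutoff are all hypothetical choices made for illustration.

```python
import math


def distance(atom_a, atom_b):
    """Euclidean distance in angstroms between two (x, y, z) coordinates."""
    return math.dist(atom_a, atom_b)


def has_clash(rotamer_atoms, neighbor_atoms, cutoff=2.0):
    """True if any rotamer atom lies closer than cutoff to a neighbor atom.

    The default cutoff of 2.0 angstroms is an illustrative assumption.
    """
    return any(distance(a, b) < cutoff
               for a in rotamer_atoms for b in neighbor_atoms)


def first_clash_free(rotamers, neighbor_atoms, cutoff=2.0):
    """Return the index of the first rotamer without clashes, or None."""
    for index, atoms in enumerate(rotamers):
        if not has_clash(atoms, neighbor_atoms, cutoff):
            return index
    return None


# Hypothetical coordinates: the first rotamer clashes, the second does not.
neighbors = [(1.0, 0.0, 0.0)]
rotamers = [[(0.0, 0.0, 0.0)], [(5.0, 0.0, 0.0)]]
print(first_clash_free(rotamers, neighbors))
```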
Sample collection and nucleic acid extraction.
Sample collection and processing were done at the Institute of Military Veterinary Medicine and Nanjing Agricultural University in compliance with animal welfare requirements. All the experiments were approved by the committee on the ethics of animal experiments. Samples from dog and pig, including heart, liver, spleen, lung, kidney, duodenum, bronchus, and turbinate, were used for quantitative real-time PCR (qRT-PCR) to determine the content of mRNA encoding ACE2. The kidney of a beagle or Felis catus was used for PCR amplification of the full length of ACE2. A total of 50 mg of each sample was ground into powder in liquid nitrogen and transferred to a diethyl pyrocarbonate (DEPC)-treated EP tube before the liquid nitrogen had evaporated. Total RNA was extracted using 1 ml TRIzol (Vazyme Biotechnology, Nanjing, China), and 1 μg cDNA was synthesized using a RevertAid first-strand cDNA synthesis kit (Thermo Fisher Scientific) according to the manufacturer’s instructions.
Amplification and detection of ACE2 by qRT-PCR.
The primers used to amplify the full length of the ACE2 gene sequence of cat and dog by PCR included the following: Cat-ACE2-F, 5′-AAGAGCTCATGTCAGGCTCTTTCTGGCTC-3′; Cat-ACE2-R, 5′-TTGGTACCCTAAAATGAAGTCTGAACATCATCA-3′; Dog-ACE2-F, 5′-AAGAGCTCATGTCAGGCTCTTCCTGGCT-3′; Dog-ACE2-R, 5′-TTGGTACCCTAAAATGAAGTCTGAGCATCATC-3′. The 25-μl PCR mixtures contained 1 μl each primer, 1 μl cDNA template, 12.5 μl 2× Phanta Max buffer, 0.5 μl deoxynucleoside triphosphate (dNTP) mix (10 mM each), 0.5 μl Phanta Max super-fidelity DNA polymerase (Vazyme Biotechnology, Nanjing, China), and 8.5 μl double-distilled water (ddH2O). PCR conditions were as follows: predenaturation at 95°C for 5 min, 35 cycles of 95°C for 30 s, 62°C for 30 s, and 72°C for 2 min 30 s, and a final extension at 72°C for 10 min. Positive amplicons were sequenced by Comate Biosciences Company Limited (Changchun, China) using an ABI 3730 sequencer, and consensus sequences were obtained from at least three independent determinations.
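The composition of the 25-μl reaction can be checked, and scaled to a master mix, with a few lines of arithmetic. The sketch below uses the component volumes given above. The function names and the default 10% pipetting excess are hypothetical and are not part of the protocol.

```python
# Component volumes (in microliters) per 25-ul reaction, as given in the text.
PCR_MIX_UL = {
    "forward primer": 1.0,
    "reverse primer": 1.0,
    "cDNA template": 1.0,
    "2x Phanta Max buffer": 12.5,
    "dNTP mix (10 mM each)": 0.5,
    "Phanta Max polymerase": 0.5,
}
FINAL_VOLUME_UL = 25.0


def water_volume(components, final_volume):
    """Volume of water needed to bring the reaction to its final volume."""
    water = final_volume - sum(components.values())
    if water < 0:
        raise ValueError("components exceed the final reaction volume")
    return water


def master_mix(components, final_volume, reactions, excess=0.1):
    """Scale all volumes for several reactions.

    The 10% default excess is an illustrative assumption.
    """
    factor = reactions * (1.0 + excess)
    mix = {name: volume * factor for name, volume in components.items()}
    mix["ddH2O"] = water_volume(components, final_volume) * factor
    return mix


print(water_volume(PCR_MIX_UL, FINAL_VOLUME_UL))
```

The computed water volume agrees with the 8.5 μl of ddH2O stated in the protocol.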
qRT-PCR was used to determine the content of ACE2 mRNA in different tissues of dog and pig. The total volume of each qRT-PCR mixture was 20 μl, including 10 μl 2× Aceq qPCR mixture (Vazyme Biotechnology, Nanjing, China), 0.4 μl each of forward and reverse primers, 2 μl of template, and nuclease-free water to the final volume. The following procedure was used for amplification on a Roche LightCycler 96 instrument: 95°C for 10 min, followed by 45 cycles of 95°C for 10 s, 55°C for 10 s, and 72°C for 20 s. β-Actin was selected as the internal reference gene. The primers for ACE2 genes of pig and dog were as follows: Sus-ACE2-F, 5′-TTGATGGAAGATGTGGAGCG-3′; Sus-ACE2-R, 5′-CCACATATCGCCAAGCAAATG-3′; Sus-β-actin-F, 5′-CTCCATCATGAAGTGCGACGT-3′; Sus-β-actin-R, 5′-GTGATCTCCTTCTGCATCCTGTC-3′; Dog-ACE2-F, 5′-GGTGGATGGTCTTTAAGGGTG-3′; Dog-ACE2-R, 5′-ACATGGAACAGAGATGCAGG-3′; Dog-β-actin-F, 5′-GGACTTCGAGCAGGAGATGG-3′; Dog-β-actin-R, 5′-TTCCATGCCCAGGAAGGAAG-3′.
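The text does not state how ACE2 signals were normalized to β-actin. One widely used option is the comparative threshold cycle (Ct) method, sketched below as an assumption rather than as the calculation performed in this study. The function names and the Ct values are hypothetical, and amplification efficiency is assumed to be close to 100% for both genes.

```python
def relative_expression(ct_target, ct_reference):
    """Return 2^-dCt, the target level relative to the reference gene.

    Assumes that both amplicons double in every cycle.
    """
    return 2.0 ** -(ct_target - ct_reference)


def fold_change(ct_target_a, ct_reference_a, ct_target_b, ct_reference_b):
    """Return 2^-ddCt, expression in sample a relative to calibrator b."""
    level_a = relative_expression(ct_target_a, ct_reference_a)
    level_b = relative_expression(ct_target_b, ct_reference_b)
    return level_a / level_b


# Hypothetical Ct values: ACE2 at cycle 25, beta-actin at cycle 20.
print(relative_expression(25.0, 20.0))
```

If primer efficiencies differ markedly between target and reference, an efficiency-corrected model would be more appropriate than this simplified form.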
Data are expressed as mean values ± standard deviations (SD) or standard errors of the mean (SEM). Statistical analyses were performed using Student's t test with GraphPad Prism 5 software.
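For readers who want to reproduce such summary statistics outside GraphPad Prism, the quantities named above can be computed as follows. This is a minimal sketch with hypothetical function names and invented data. It assumes an unpaired, two-sample Student's t test with pooled variance, which the text does not specify, and it returns the t statistic and degrees of freedom without a P value.

```python
import math
import statistics


def summarize(values):
    """Return (mean, sample SD, SEM) for a list of measurements."""
    mean = statistics.mean(values)
    sd = statistics.stdev(values)
    sem = sd / math.sqrt(len(values))
    return mean, sd, sem


def student_t(group_a, group_b):
    """Unpaired Student's t statistic with pooled variance.

    Returns (t, degrees of freedom).
    """
    n_a, n_b = len(group_a), len(group_b)
    df = n_a + n_b - 2
    pooled_variance = ((n_a - 1) * statistics.variance(group_a)
                       + (n_b - 1) * statistics.variance(group_b)) / df
    standard_error = math.sqrt(pooled_variance * (1.0 / n_a + 1.0 / n_b))
    difference = statistics.mean(group_a) - statistics.mean(group_b)
    return difference / standard_error, df


# Invented measurements, used only to show the call pattern.
print(summarize([1.0, 2.0, 3.0]))
print(student_t([1.0, 2.0, 3.0], [2.0, 3.0, 4.0]))
```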
ACE2 protein sequences were deposited in GenBank under accession numbers MT269670 (dog) and MT269671 (cat).
We thank Jiyong Zhou, Zhejiang University, and Changchun Tu, Institute of Military Veterinary Medicine, Academy of Military Medical Sciences, for their guidance and help.
This work was financially supported by the National Key Research and Development Program of China (grant no. 2017YFD0500101), the Fundamental Research Funds for the Central Universities (grant no. Y0201900459), the China Association for Science and Technology Youth Talent Lift Project (2017–2019), Six Talent Peaks Project of Jiangsu Province of China (grant no. NY-045), the Natural Science Foundation of Jiangsu Province (grant no. BK20170721), the Young Top-Notch Talents of National Ten-Thousand Talents Program, Sino-German cooperation and exchange project in international cooperation and Cultivation Project in 2019, and the Bioinformatics Center of Nanjing Agricultural University. Experimental work in the lab of M.V. is financed by the German Research Foundation (DFG).
Li Q, Guan X, Wu P, Wang X, Zhou L, Tong Y, Ren R, Leung KSM, Lau EHY, Wong JY, Xing X, Xiang N, Wu Y, Li C, Chen Q, Li D, Liu T, Zhao J, Li M, Tu W, Chen C, Jin L, Yang R, Wang Q, Zhou S, Wang R, Liu H, Luo Y, Liu Y, Shao G, Li H, Tao Z, Yang Y, Deng Z, Liu B, Ma Z, Zhang Y, Shi G, Lam TTY, Wu JTK, Gao GF, Cowling BJ, Yang B, Leung GM, Feng Z. 2020. Early transmission dynamics in Wuhan, China, of novel coronavirus-infected pneumonia. N Engl J Med 382:1199–1207.
Wang D, Hu B, Hu C, Zhu F, Liu X, Zhang J, Wang B, Xiang H, Cheng Z, Xiong Y, Zhao Y, Li Y, Wang X, Peng Z. 2020. Clinical characteristics of 138 hospitalized patients with 2019 novel coronavirus-infected pneumonia in Wuhan, China. JAMA 323:1061–1069.
Chan JF, Yuan S, Kok KH, To KK, Chu H, Yang J, Xing F, Liu J, Yip CC, Poon RW, Tsoi HW, Lo SK, Chan KH, Poon VK, Chan WM, Ip JD, Cai JP, Cheng VC, Chen H, Hui CK, Yuen KY. 2020. A familial cluster of pneumonia associated with the 2019 novel coronavirus indicating person-to-person transmission: a study of a family cluster. Lancet 395:514–523.
Song HD, Tu CC, Zhang GW, Wang SY, Zheng K, Lei LC, Chen QX, Gao YW, Zhou HQ, Xiang H, Zheng HJ, Chern SW, Cheng F, Pan CM, Xuan H, Chen SJ, Luo HM, Zhou DH, Liu YF, He JF, Qin PZ, Li LH, Ren YQ, Liang WJ, Yu YD, Anderson L, Wang M, Xu RH, Wu XW, Zheng HY, Chen JD, Liang G, Gao Y, Liao M, Fang L, Jiang LY, Li H, Chen F, Di B, He LJ, Lin JY, Tong S, Kong X, Du L, Hao P, Tang H, Bernini A, Yu XJ, Spiga O, Guo ZM, Pan HY, He WZ, Manuguerra JC, Fontanet A, Danchin A, Niccolai N, Li YX, Wu CI, Zhao GP. 2005. Cross-host evolution of severe acute respiratory syndrome coronavirus in palm civet and human. Proc Natl Acad Sci U S A 102:2430–2435.
Hu B, Zeng L-P, Yang X-L, Ge X-Y, Zhang W, Li B, Xie J-Z, Shen X-R, Zhang Y-Z, Wang N, Luo D-S, Zheng X-S, Wang M-N, Daszak P, Wang L-F, Cui J, Shi Z-L. 2017. Discovery of a rich gene pool of bat SARS-related coronaviruses provides new insights into the origin of SARS coronavirus. PLoS Pathog 13:e1006698.
Xiao K, Zhai J, Feng Y, Zhou N, Zhang X, Zou JJ, Li N, Guo Y, Li X, Shen X, Zhang Z, Shu F, Huang W, Li Y, Zhang Z, Chen RA, Wu Y, Peng SM, Huang M, Xie WJ, Cai QH, Hou FH, Chen W, Xiao L, Shen Y. 2020. Isolation of SARS-CoV-2-related coronavirus from Malayan pangolins. Nature.
Wahba L, Jain N, Fire AZ, Shoura MJ, Artiles KL, McCoy MJ, Jeong DE. 2020. An extensive meta-metagenomic search identifies SARS-CoV-2-homologous sequences in pangolin lung viromes. mSphere 5:e00160-20.
Wu F, Zhao S, Yu B, Chen YM, Wang W, Song ZG, Hu Y, Tao ZW, Tian JH, Pei YY, Yuan ML, Zhang YL, Dai FH, Liu Y, Wang QM, Zheng JJ, Xu L, Holmes EC, Zhang YZ. 2020. A new coronavirus associated with human respiratory disease in China. Nature 579:265–269.
Zhou P, Yang XL, Wang XG, Hu B, Zhang L, Zhang W, Si HR, Zhu Y, Li B, Huang CL, Chen HD, Chen J, Luo Y, Guo H, Jiang RD, Liu MQ, Chen Y, Shen XR, Wang X, Zheng XS, Zhao K, Chen QJ, Deng F, Liu LL, Yan B, Zhan FX, Wang YY, Xiao GF, Shi ZL. 2020. A pneumonia outbreak associated with a new coronavirus of probable bat origin. Nature 579:270–273.
Shi J, Wen Z, Zhong G, Yang H, Wang C, Huang B, Liu R, He X, Shuai L, Sun Z, Zhao Y, Liu P, Liang L, Cui P, Wang J, Zhang X, Guan Y, Tan W, Wu G, Chen H, Bu Z. 2020. Susceptibility of ferrets, cats, dogs, and other domesticated animals to SARS-coronavirus 2. Science.
Hoffmann M, Kleine-Weber H, Schroeder S, Krüger N, Herrler T, Erichsen S, Schiergens TS, Herrler G, Wu N-H, Nitsche A, Müller MA, Drosten C, Pöhlmann S. 2020. SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor. Cell 181:271–280.e8.
Ou X, Liu Y, Lei X, Li P, Mi D, Ren L, Guo L, Guo R, Chen T, Hu J, Xiang Z, Mu Z, Chen X, Chen J, Hu K, Jin Q, Wang J, Qian Z. 2020. Characterization of spike glycoprotein of SARS-CoV-2 on virus entry and its immune cross-reactivity with SARS-CoV. Nat Commun 11:1620.
Coutard B, Valle C, de Lamballerie X, Canard B, Seidah NG, Decroly E. 2020. The spike glycoprotein of the new coronavirus 2019-nCoV contains a furin-like cleavage site absent in CoV of the same clade. Antiviral Res 176:104742.
Wang Q, Zhang Y, Wu L, Niu S, Song C, Zhang Z, Lu G, Qiao C, Hu Y, Yuen KY, Wang Q, Zhou H, Yan J, Qi J. 2020. Structural and functional basis of SARS-CoV-2 entry by using human ACE2. Cell 181:894–904.e9.
Tai W, He L, Zhang X, Pu J, Voronin D, Jiang S, Zhou Y, Du L. 2020. Characterization of the receptor-binding domain (RBD) of 2019 novel coronavirus: implication for development of RBD protein as a viral attachment inhibitor and vaccine. Cell Mol Immunol.
Ge X-Y, Li J-L, Yang X-L, Chmura AA, Zhu G, Epstein JH, Mazet JK, Hu B, Zhang W, Peng C, Zhang Y-J, Luo C-M, Tan B, Wang N, Zhu Y, Crameri G, Zhang S-Y, Wang L-F, Daszak P, Shi Z-L. 2013. Isolation and characterization of a bat SARS-like coronavirus that uses the ACE2 receptor. Nature 503:535–538.
Hou Y, Peng C, Yu M, Li Y, Han Z, Li F, Wang LF, Shi Z. 2010. Angiotensin-converting enzyme 2 (ACE2) proteins of different bat species confer variable susceptibility to SARS-CoV entry. Arch Virol 155:1563–1569.
Chan JF, Zhang AJ, Yuan S, Poon VK, Chan CC, Lee AC, Chan WM, Fan Z, Tsoi HW, Wen L, Liang R, Cao J, Chen Y, Tang K, Luo C, Cai JP, Kok KH, Chu H, Chan KH, Sridhar S, Chen Z, Chen H, To KK, Yuen KY. 2020. Simulation of the clinical and pathological manifestations of Coronavirus Disease 2019 (COVID-19) in golden Syrian hamster model: implications for disease pathogenesis and transmissibility. Clin Infect Dis.
Wang Q, Qiu Y, Li J-Y, Zhou Z-J, Liao C-H, Ge X-Y. 2020. A unique protease cleavage site predicted in the spike protein of the novel pneumonia coronavirus (2019-nCoV) potentially related to viral transmissibility. Virol Sin.
Roberts A, Deming D, Paddock CD, Cheng A, Yount B, Vogel L, Herman BD, Sheahan T, Heise M, Genrich GL, Zaki SR, Baric R, Subbarao K. 2007. A mouse-adapted SARS-coronavirus causes disease and mortality in BALB/c mice. PLoS Pathog 3:e5.
Zhou H, Chen X, Hu T, Li J, Song H, Liu Y, Wang P, Liu D, Yang J, Holmes EC, Hughes AC, Bi Y, Shi W. 2020. A novel bat coronavirus reveals natural insertions at the S1/S2 cleavage site of the Spike protein and a possible recombinant origin of HCoV-19. bioRxiv.