INTRODUCTION
The recently emerged severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) is responsible for the ongoing pandemic of coronavirus disease 2019 (COVID-19), a respiratory disease with an estimated 2 to 5% mortality (
1–7). The SARS-CoV-2 spike (S) glycoprotein mediates the entry of the virus into the host cell and influences tissue tropism and pathogenesis (
8–13). The S glycoprotein trimer in the viral membrane is the target for neutralizing antibodies, which are important for vaccine-induced protection against infection (
9,
11,
12,
14–18). Monoclonal neutralizing antibodies directed against the S glycoprotein are being evaluated as treatments for SARS-CoV-2-infected individuals (
14,
15,
19–26). In the virus-producing cell, the S glycoprotein is synthesized in the endoplasmic reticulum, where it assembles into trimers and is initially modified by high-mannose glycans (
27,
28). Each of the three SARS-CoV-2 S glycoprotein protomers possesses 22 canonical sequons for N-linked glycosylation (
11,
29–35). Coronavirus virions bud into the endoplasmic reticulum-Golgi intermediate compartment (ERGIC), and S glycoprotein trimers on the surface of these virus particles are thought to be processed further during trafficking through the Golgi complex (
28,
36–39). In the Golgi complex, some of the glycans on the S glycoprotein are modified to complex carbohydrates; in addition, the trimeric S glycoprotein is cleaved by furin-related proteases into S1 and S2 glycoproteins, which associate noncovalently in the virus spike (
26–35). During virus entry, the S1 subunit binds the receptor angiotensin-converting enzyme 2 (ACE2) (
9,
11–13,
40–42). The S2 subunit is further processed by host proteases and undergoes extensive conformational changes to mediate the fusion of the viral and target cell membranes (
42–46). Following the insertion of the S2 fusion peptide into the host cell membrane, the interaction of two helical heptad repeat regions (HR1 and HR2) on the S2 subunit brings the viral and cell membranes into proximity (
43).
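The canonical sequon for N-linked glycosylation mentioned above is the motif Asn-X-Ser/Thr, where X is any residue except proline. The following is a minimal Python sketch of how such sequons can be located in a protein sequence. The function name and the short toy peptide are illustrative assumptions and are not taken from this study; applying the function to the full S glycoprotein sequence would require supplying that sequence.

```python
import re

# Asn-X-Ser/Thr with X != Pro. A lookahead is used so that overlapping
# sequons (for example "NNTS") are each reported.
SEQUON = re.compile(r"(?=N[^P][ST])")


def find_sequons(sequence):
    """Return 1-based positions of Asn residues that start an N-X-S/T sequon."""
    return [match.start() + 1 for match in SEQUON.finditer(sequence.upper())]


# Toy peptide (hypothetical, for illustration only).
toy_peptide = "MNGTANPSQNNTSK"
print(find_sequons(toy_peptide))  # N6 is skipped because X is proline
```

A sequon is only a potential glycosylation site; whether it is occupied, and by which glycan type, must be determined experimentally.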
The SARS-CoV-2 S glycoprotein trimer is modified by glycosylation, which in other coronaviruses has been suggested to modulate accessibility to neutralizing antibodies as well as host proteases involved in S processing (
11,
13,
29–31,
47,
29). Glycans camouflage S glycoprotein peptide epitopes, shielding them from potentially neutralizing antibodies. Glycans can also contribute to epitopes for antibody recognition; for example, the S309 neutralizing antibody interacts with the glycan on Asn 343 of the SARS-CoV-2 S glycoprotein (
49).
Virus entry inhibitors and therapeutic or prophylactic neutralizing antibodies must recognize the mature SARS-CoV-2 spike with its natural glycan coat, as it exists on the viral membrane. The glycosylation of the SARS-CoV-2 spike has been studied using soluble or detergent-solubilized versions of the uncleaved S glycoprotein trimer, modified to retain a pretriggered conformation (
29,
32–35,
50). Fewer studies of the glycosylation of S glycoproteins on SARS-CoV-2 virion preparations have been conducted (
51,
52). Experience with human immunodeficiency virus (HIV-1) indicates that native, membrane-anchored viral envelope glycoproteins can exhibit glycosylation profiles that differ from those of soluble glycoprotein trimers (
53–57). Here, we elucidate the glycosylation and disulfide bonding profile of a wild-type SARS-CoV-2 S glycoprotein trimer and evaluate the importance of naturally occurring variation in O-linked glycans and disulfide bonds. This information enhances our understanding of the complete, functional SARS-CoV-2 S glycoproteins and could assist the development and improvement of efficacious therapies, including monoclonal antibodies and vaccines.
DISCUSSION
As the extensive glycosylation of the spike (S) glycoprotein can potentially influence SARS-CoV-2 infectivity and sensitivity to antibody inhibition, an understanding of the glycosylation profile of the native S glycoprotein trimer is valuable. Glycosylation of the SARS-CoV-2 S glycoprotein apparently can be influenced by subcellular localization and the coexpression of viral proteins (
52). Because proteolytic activation of the S glycoprotein can occur at the target cell surface or in endosomal compartments during virus entry, the uncleaved S glycoprotein, as well as the cleaved S glycoproteins, on virions can support virus infection (
28,
44–46,
94). A SARS-CoV-2 S glycoprotein mutant with an altered site of furin cleavage replicated efficiently in animals but exhibited attenuated pathogenicity (
94). Therefore, understanding the glycosylation of the uncleaved and cleaved S glycoproteins is relevant to SARS-CoV-2 biology. The uncleaved S glycoprotein precursor is initially modified in the endoplasmic reticulum by high-mannose carbohydrates; some of these uncleaved/immature S glycoproteins appear on the surface of expressing cells, perhaps by bypassing the Golgi apparatus (
28). On SARS-CoV-2 virions, both uncleaved and cleaved S glycoproteins are extensively modified by complex carbohydrates, indicating passage through the Golgi compartment (
28,
69,
95). We found that coexpression of the SARS-CoV-2 E protein led to a further enrichment of complex glycans on the uncleaved S glycoprotein on VLPs. These observations provided a rationale for focusing on the S glycoproteins that have been modified during transit through the Golgi compartment. By including a lectin, AAL, that recognizes fucose in the purification scheme, we attempted to increase the representation of S glycoproteins that passed through the Golgi complex, where fucosylation occurs (
66–69). This purification strategy allowed an evaluation of the glycan composition of a Golgi-enriched subset of the S glycoproteins synthesized in the expressing cell. We show that the vast majority of the uncleaved and cleaved S glycoproteins on SARS-CoV-2 VLPs can be recognized by AAL, supporting the relevance of the S glycoproteins purified by using this lectin. It is conceivable that forms of the S glycoproteins with lower levels of glycan processing might also be present on virions, depending on host cell types, production levels, and VLP characteristics.
SARS-CoV-2 S glycoproteins produced for use as vaccine immunogens have been designed to allow secretion of soluble trimers, to inhibit furin cleavage, and to stabilize prefusogenic conformations (
11,
13,
50,
96–100). The glycosylation profiles of virion S glycoproteins and several of these modified S glycoproteins have been characterized (
29,
32,
35,
50–52,
71,
86,
87). Our results agree with the overall predominance of complex carbohydrates of the SARS-CoV-2 S glycoprotein trimer seen in these previous studies. The glycosylation profile of our S glycoprotein preparation most closely resembles that of the S glycoproteins purified from SARS-CoV-2 virions propagated in Vero cells (
51). However, compared with these and the other characterized trimers, the wild-type S glycoproteins purified in our study exhibited more glycan processing. Differences among the particular S glycoprotein constructs might account for variation observed in the glycosylation profiles (
52). Except for the carboxy-terminal 2×Strep tag, the S glycoprotein that we analyzed is wild type in sequence. Our purified S glycoproteins are trimeric and, in the cleaved fraction, the S1 and S2 subunits maintain their association. The purified trimers preferentially bind MAbs that recognize a closed prefusogenic conformation with all three RBDs in the down position. Nonetheless, we expect that the native, wild-type S glycoprotein trimer is dynamic (
70) and might exhibit greater flexibility than S glycoprotein constructs that have been engineered to favor the prefusogenic conformation. This natural S glycoprotein flexibility could increase the access of glycans to processing enzymes in the Golgi apparatus.
We considered the possibility that the uncleaved S glycoproteins in our purified trimer preparation are conformationally heterogeneous and therefore predisposed to complex sugar addition. Functional uncleaved S glycoproteins have been suggested to be more triggerable than cleaved S glycoproteins (
28). We observed that the CR3022 MAb, which recognizes open spike conformations with RBDs in the up position (
12,
82,
83), binds only the uncleaved S glycoproteins on the cell surface or in cell lysates (
Fig. 4A and
B). However, these CR3022-reactive uncleaved S glycoproteins are not modified by complex glycans and, therefore, are not expected to be present in our purified S glycoprotein preparation. The uncleaved S glycoprotein in our purified preparation, like that in VLPs (
28), is modified by complex glycans and is not well recognized by the CR3022 MAb (
Fig. 4C and
D). Future studies will be required to determine whether the cleaved and uncleaved S glycoproteins trafficking through the Golgi compartment acquire different glycan structures.
The source of the S glycoproteins and purification strategy could also influence the glycosylation profile (
52). The full-length, wild-type S glycoprotein trimers studied here are distinct from those analyzed by other groups. The inclusion of a fucose-specific lectin in our purification scheme should have increased the representation of S glycoproteins that passed through the Golgi compartment, where complex carbohydrates are added (
69). Hypothetically, soluble or virion-associated S glycoproteins passing through the Golgi apparatus might be processed less efficiently than our S glycoproteins, which are anchored in the Golgi membrane. Although Brun et al. (
52) also used cells expressing a wild-type, nonstabilized S glycoprotein as their source, they analyzed the glycans only on the S1 glycoprotein monomer that was shed into the cell culture medium. The observed differences in S glycosylation are particularly noteworthy for the Asn 234 glycan, which is predominantly of the high-mannose type in the soluble/modified S glycoproteins, but mostly processed in the wild-type S glycoprotein trimers that we studied. Specific down-selection of the N234 high-mannose glycans by our AAL purification strategy is possible but not likely, given the abundance of multiple fucose-containing glycans on all S glycoforms produced in the 293T-S cells. Asn 234 in the S1 N-terminal domain is near the receptor-binding domain (RBD), and molecular dynamics simulations have suggested that the N234 glycan can modulate the conformational changes that the RBD undergoes in the process of binding ACE2 (
101). Moreover, changes in Asn 234 have been reported to affect virus sensitivity to several neutralizing antibodies (
102). We note that these phenotypes were revealed by changing Asn 234 to an alanine residue, completely removing the potential N-linked glycosylation site (
101,
102). Currently, no experimental evidence exists indicating that the particular type of glycan modifying Asn 234 might influence the binding of ACE2 or neutralizing antibodies directed against nearby epitopes. Another molecular dynamics simulation suggested that the low accessible surface area of oligomannose-type glycans at residues like Asn 234 might limit processing to complex carbohydrates (
87). Our results indicate that the N234 glycan on the N-terminal domain is accessible on the unliganded wild-type S glycoprotein trimer for modification to complex carbohydrates. Flexibility in the S glycoprotein N-terminal domain could increase the accessibility and processing of the N234 glycan. The formation of a noncanonical disulfide bond in the N-terminal domain of a subset of our purified S glycoproteins could reflect conformational heterogeneity. Nonetheless, our purified S glycoproteins were precipitated by MAb 4–8, which recognizes an epitope dependent on the tertiary conformation of the N-terminal domain (
80). Whether flexibility between the N-terminal domain and RBD could potentially increase the accessibility of the Asn 234 glycan requires further study.
O-linked glycans, which are added in the Golgi compartment (
69), were detected on four glycopeptides in the purified S glycoprotein. The occupancy of O-linked glycosylation sites can provide clues to the accessibility of the Ser/Thr residues in the folded S glycoprotein trimer (
52). The low occupancy associated with the O-linked glycosylation sites on our purified S glycoprotein is similar to that reported for virion S1 and soluble stabilized S trimers and contrasts with the high O-linked occupancy observed for S1 shed from S-expressing cells or recombinant soluble S1 glycoprotein (
52). The low occupancy of the O-linked glycosylation sites is consistent with the antigenicity data (
Fig. 4C and
D) indicating that the unliganded S glycoproteins that we analyzed largely maintain closed trimer conformations.
We changed three amino acid residues (Thr 323, Thr 676, and Ser 1170) that are potentially O-glycosylated to those amino acid residues found in less common natural SARS-CoV-2 variants. In all three cases, these changes resulted in entry-competent S glycoproteins. However, the infectivity of the T323I mutant was more sensitive to freeze-thawing than that of viruses pseudotyped with the wild-type S glycoprotein. On the other hand, even though the S1170F change altered posttranslational modification of the S2 glycoprotein, this mutant exhibited wild-type levels of infectivity and resistance to freeze-thawing. Alteration of Thr 676 or Ser 1170 did not significantly change the sensitivity of the pseudotyped viruses to neutralization by sACE2 or convalescent-phase sera.
Of the three naturally observed variants in S glycoprotein cysteine residues, a change in Cys 15 was compatible with an entry-competent S glycoprotein. This implies that the disulfide bond between Cys 15 and Cys 136 within the N-terminal domain is not absolutely required for folding and function of the SARS-CoV-2 S glycoprotein. However, we noted that the infectivity of the C15F mutant virus was compromised after freeze-thawing more than that of the wild-type virus. Apparently, some flexibility in the N-terminal domain can be tolerated in functional S glycoprotein trimers, although the ability of the virus to withstand environmental stress may be affected. Of note, some of the expressed S glycoproteins formed a disulfide bond between Cys 131 and Cys 136 and therefore lacked two of the canonical disulfide bonds (Cys 15-Cys 136 and Cys 131-Cys 166) in the N-terminal domain. Such S conformers, with presumably less stable N-terminal domains, might contribute to viral pathogenesis or to evasion of the host immune response. As discussed above, conformational flexibility in the N-terminal domain could also result in an increase in the accessibility and processing of particular glycans like that on Asn 234.
These studies should assist understanding of the nature and contribution of glycans on the wild-type SARS-CoV-2 S glycoprotein trimer and provide some insight into the impact of natural variation in sites that are glycosylated or disulfide bonded.
MATERIALS AND METHODS
Reagents.
Trizma hydrochloride, Trizma base, ammonium bicarbonate, urea, Tris(2-carboxyethyl) phosphine hydrochloride (TCEP), iodoacetamide (IAM), ethanol, 4-vinylpyridine (4-VP), and glacial acetic acid were purchased from Sigma. Other reagents used in this study included Optima liquid chromatography (LC)-MS-grade acetonitrile, water, and formic acid (Fisher Scientific), sequencing-grade trypsin (Promega), chymotrypsin (Promega), glycerol-free peptidyl-N-glycosidase F (PNGase F) (New England Biolabs), endoglycosidase Hf (Endo Hf) (New England Biolabs), O-glycosidase (New England Biolabs), neuraminidase (New England Biolabs), and fetuin (New England Biolabs). All reagents and buffers were prepared with deionized water purified with a Millipore Direct-Q3 (Billerica, MA) water purification system.
The 2–4, 4–8, and 2–43 monoclonal antibodies (MAbs) were a kind gift from the laboratory of David D. Ho (Columbia University Vagelos College of Physicians and Surgeons) (
25,
80,
81). The CR3022 MAb was purchased from Abcam (
12,
82,
83).
Plasmids.
The wild-type and mutant SARS-CoV-2 S glycoproteins were expressed transiently by a pcDNA3.1(−) vector (Thermo Fisher Scientific) (
28). The wild-type SARS-CoV-2 spike (S) gene sequence, which encodes an aspartic acid residue at position 614, was obtained from the National Center for Biotechnology Information (NC_045512.20). The gene was modified to encode a Gly3 linker and His6 tag at the carboxyl terminus. The modified S gene was codon optimized, synthesized by Integrated DNA Technologies, and cloned into the pcDNA3.1(−) vector. S mutants were made by site-directed mutagenesis using Q5 high-fidelity 2× master mix and KLD enzyme mix (New England Biolabs), according to the manufacturer’s protocol, and One-Shot TOP10 competent cells.
Inducible expression of the wild-type SARS-CoV-2 S glycoprotein was achieved using a self-inactivating lentivirus vector comprising TRE3g-SARS-CoV-2-Spike-PSP-StrepII×2.IRES6A.Puro-T2A-GFP (K5650) (
28). Here, the codon-optimized S gene is under the control of a tetracycline response element (TRE) promoter and encodes the wild-type S glycoprotein with a carboxy-terminal 2×Strep tag. The internal ribosome entry site (IRES6A) allows expression of puro.T2A.EGFP, in which puromycin N-acetyltransferase and enhanced green fluorescent protein (eGFP) are produced by self-cleavage at the Thosea asigna 2A (T2A) sequence.
The plasmid expressing sACE2-Fc was provided by Bing Chen (Boston Children’s Hospital) (
103); sACE2 was produced as described previously (
28). Plasmids expressing the SARS-CoV-2 M, E, and N proteins are described in reference
104.
Cell lines.
The wild-type SARS-CoV-2 S glycoprotein, with Asp614, was inducibly expressed in Lenti-x-293T human female kidney cells from TaKaRa Bio (catalog number 632180). Lenti-x-293T cells were grown in Dulbecco’s modified Eagle’s medium (DMEM) with 10% fetal bovine serum (FBS) supplemented with L-glutamine and penicillin-streptomycin (Pen-Strep).
Lenti-x-293T cells constitutively expressing the reverse tetracycline-responsive transcriptional activator (rtTA) (Lenti-x-293T-rtTA cells [D1317]) (
28) were used as the parental cells for the 293T-S cell line. The 293T-S (D1483) cells inducibly expressing the wild-type SARS-CoV-2 S glycoprotein with a carboxy-terminal 2×Strep-Tag II sequence (
28) were produced by transduction of Lenti-x-293T-rtTA cells with the K5650 recombinant lentivirus vector described above. The packaged K5650 lentivirus vector (60-μl volume) was incubated with 2 × 10⁵ Lenti-x-293T-rtTA cells in DMEM, with tumbling at 37°C overnight. The cells were then transferred to a 6-well plate in 3 ml DMEM–10% FBS–Pen-Strep and subsequently selected with 10 μg/ml puromycin.
The GALE/GALK2 cells are 293T cells in which the genes encoding UDP-galactose-4-epimerase (GALE) and galactokinase 2 (GALK2) were knocked out by CRISPR/Cas9 technology (
88). The GALE/GALK2 cells were obtained from Kerafast (
88).
Expression and processing of S glycoprotein variants.
293T cells were transfected with plasmids expressing the wild-type and mutant SARS-CoV-2 S glycoproteins. On the day prior to transfection, 293T cells were seeded in 6-well plates at a density of 1 × 10⁶/well. Cells were transfected with 1 μg of the S-expressing plasmid, using Lipofectamine 3000 according to the manufacturer’s instructions. Two days after transfection, cells were lysed with lysis buffer (1× phosphate-buffered saline [PBS], 1% NP-40, and 1× protease inhibitor cocktail) and the cell lysates analyzed by Western blotting. Samples were Western blotted with 1:2,000 dilutions of either rabbit anti-SARS-Spike S1 or mouse anti-SARS-Spike S1, rabbit anti-SARS-Spike S2 (Sino Biologicals), or a 1:5,000 dilution of mouse anti-β-actin as the primary antibodies. Horseradish peroxidase (HRP)-conjugated anti-rabbit or anti-mouse antibodies at a dilution of 1:5,000 were used as secondary antibodies in the Western blots. The adjusted integrated volumes of S, S1, and S2 bands from unsaturated Western blots were calculated using Fiji ImageJ. The values for the processing of mutant S glycoproteins were calculated and normalized to the values for the wild-type S glycoprotein (WT) as
For production of VLPs, the SARS-CoV-2 S glycoprotein was coexpressed with M, E, and N proteins, individually or in combination. One day before transfection, 293T cells were seeded into 10-cm dishes at a density of 5.5 × 10⁶/dish. The next day, the cells were transfected with 3 μg of each expressor plasmid or with an empty vector plasmid to keep the total amount of DNA transfected at 12 μg. Two days after transfection, cell lysates were prepared as described above. Cell supernatants were cleared at 900 × g for 15 min, followed by centrifugation at 110,000 × g for 1 h at 4°C. Pellets were washed with 800 μl of 1× PBS and then resuspended in 90 μl lysis buffer for 5 min on ice. In some cases, cell lysates and pellets prepared from cell supernatants were treated with PNGase F, endoglycosidase Hf (Endo Hf), or O-glycosidase plus neuraminidase (New England Biolabs) according to the manufacturer’s instructions. Samples were analyzed by Western blotting as described above.
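The DNA balancing used for the VLP transfections (3 μg of each expressor plasmid, with empty vector added to hold the total at 12 μg) can be sketched in Python as follows. The function name and dictionary layout are illustrative assumptions; only the 3-μg and 12-μg values come from the text.

```python
def transfection_mix(expressors, ug_each=3.0, total_ug=12.0):
    """Return micrograms of each plasmid, topping up with empty vector.

    expressors: names of the expressor plasmids in this condition.
    ug_each and total_ug default to the amounts stated in the text.
    """
    used = ug_each * len(expressors)
    if used > total_ug:
        raise ValueError("expressor plasmids exceed the total DNA amount")
    mix = {name: ug_each for name in expressors}
    filler = total_ug - used
    if filler > 0:
        mix["empty vector"] = filler
    return mix


# S coexpressed with E only: 6 ug of expressors plus 6 ug of empty vector.
print(transfection_mix(["S", "E"]))
# S coexpressed with M, E, and N: no empty vector is needed.
print(transfection_mix(["S", "M", "E", "N"]))
```

Keeping the total DNA constant means that differences among conditions are less likely to reflect differences in transfection load.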
S1 shedding from S glycoprotein-expressing cells.
293T cells were transfected with pcDNA3.1(−) plasmids expressing the wild-type and mutant SARS-CoV-2 S glycoproteins, using Lipofectamine 3000 according to the manufacturer’s protocol. Cell supernatants were collected, cleared by centrifugation at 1,800 × g for 10 min, and incubated with a 1:100 dilution of NYP01 convalescent-phase serum and protein A-agarose beads for 1 to 2 h at room temperature. Beads were washed three times and samples were Western blotted with a mouse anti-S1 antibody. Band intensity was determined as described above. The subunit association index of each mutant was calculated as
Recognition of cell surface S glycoproteins by monoclonal antibodies.
For immunoprecipitation of cell surface S glycoproteins, doxycycline-induced 293T-S cells were washed with washing buffer (1× PBS plus 5% FBS). The cells were then incubated with 6 μg/ml antibody for 1 h at 4°C. After washing three times in washing buffer, the cells were lysed in NP-40 lysis buffer (1% NP-40, 1× PBS, 1× protease inhibitor cocktail) for 5 min on ice. The lysates were cleared by centrifugation at 13,200 × g for 10 min at 4°C, and the clarified supernatants were incubated with protein A-agarose beads for 1 h at room temperature. The beads were pelleted (1,000 rpm for 1 min) and washed three times with final wash buffer (1× PBS, 0.5% NP-40). The beads were suspended in lithium dodecyl sulfate (LDS) sample buffer, boiled, and analyzed by Western blotting as described above. In some cases, the precipitated proteins were treated with PNGase F or Endo Hf prior to SDS-PAGE and Western blotting.
For analysis of total glycoprotein expression in the cell, some of the clarified lysates were saved before the addition of protein A-agarose beads and analyzed by Western blotting as described above. These samples are referred to as input.
AAL recognition of the S glycoproteins on SARS-CoV-2 VLPs.
One day before transfection, 1.8 × 10⁷ 293T-S cells were seeded in a 15-cm tissue culture dish. The next day, the cells were transfected with 12 μg of each plasmid expressing the SARS-CoV-2 M, E, and N proteins, using Lipofectamine 3000 (Thermo Fisher Scientific). Following transfection, the cells were incubated in medium containing 1 μg/ml doxycycline. Two days later, cells were lysed in lysis buffer (1× PBS, 1% Cymal-5, 1× protease inhibitor cocktail). The cell lysates were used to confirm expression of M, E, N, and S proteins.
Cell supernatants were cleared at 900 × g for 15 min, followed by centrifugation at 110,000 × g for 1 h at 4°C. The pellets were resuspended in 660 μl lysis buffer. After clearance by centrifugation at 16,100 × g for 30 min at 4°C, 60 μl of the lysate was set aside as the input sample. The remaining lysate was divided into two halves, which were incubated for 1 h at room temperature with 20 μl of either AAL-agarose resin (number AL-1393-2; Vector Laboratories) or protein A-agarose beads. The suspensions were then applied to Econo-Pac columns (Bio-Rad) with gravity flow. The flowthrough fractions were incubated as above with fresh AAL-agarose resin or protein A-agarose beads, and the process was repeated. The final flowthrough fractions were retained. The columns were washed with 2 ml washing buffer (1× PBS, 0.5% Cymal-5). The beads were resuspended in 300 μl 1× LDS buffer and, along with the final flowthrough, analyzed by Western blotting for the S glycoproteins.
Purification of the S glycoproteins.
To express the SARS-CoV-2 S glycoprotein for purification, 293T-S cells were induced with 1 μg/ml doxycycline for 2 days. The cells were resuspended in 1× PBS and spun at 4,500 × g for 15 min at 4°C. Cell pellets were collected and lysed by incubating in lysis buffer (20 mM Tris-HCl [pH 8.0], 150 mM NaCl, 1% Cymal-5, 1× protease inhibitor cocktail [Roche]) on ice for 10 min. Cell lysates were spun at 10,000 × g for 20 min at 4°C, and the supernatant was incubated with Strep-Tactin XT superflow resin (IBA number 2-4030-010) by rocking end over end at room temperature for 1.5 h in a 50-ml conical tube. After incubation, the supernatant-resin suspension was applied to a Bio-Rad Econo-Pac column allowing flowthrough by gravity, followed by washing with 20 bed volumes of washing buffer (IBA number 2-1003-100, containing 0.5% Cymal-5) and elution with 10 bed volumes of elution buffer (IBA number 2-1042-025, containing 0.5% Cymal-5 and 1× protease inhibitor cocktail). For the second step of purification, the eluate was incubated with AAL-agarose resin (number AL-1393-2; Vector Laboratories) at room temperature for 1 h in a 10-ml conical tube. The eluate-AAL resin suspension was applied to a Bio-Rad Econo-Pac column for gravity flowthrough. The column was washed with 20 bed volumes of washing buffer (20 mM Tris-HCl [pH 8.0], 150 mM NaCl, 0.5% Cymal-5, 1× protease inhibitor cocktail [Roche]), after which the sample was eluted with 10 bed volumes of elution buffer (9 parts elution buffer [Vector Laboratories number ES-3100-100], 0.5 parts 1 M Tris-HCl [pH 8.0], 0.5 parts 10% Cymal-5). The eluate was buffer exchanged by ultrafiltration three times to remove fucose; this was accomplished using a 15-ml ultrafiltration tube (number UFC903024; Thermo Fisher Scientific) at 4,000 × g at room temperature with a buffer consisting of 20 mM Tris-HCl (pH 8.0), 150 mM NaCl, and 0.5% Cymal-5.
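The final concentrations contributed by the two additives in the AAL elution buffer (9 parts commercial elution buffer, 0.5 parts 1 M Tris-HCl, 0.5 parts 10% Cymal-5) follow from simple dilution arithmetic, sketched below in Python. The calculation assumes that the commercial elution buffer itself contributes no Tris or Cymal-5, which is not stated in the text; the variable names are illustrative.

```python
def final_concentration(stock, parts, total_parts):
    """Concentration after diluting `parts` of a stock into `total_parts`."""
    return stock * parts / total_parts


# Recipe from the text, expressed in parts by volume.
parts = {"commercial_elution_buffer": 9.0, "tris_hcl_1M": 0.5, "cymal5_10pct": 0.5}
total = sum(parts.values())  # 10 parts

# Stock concentrations: 1 M Tris-HCl = 1000 mM; Cymal-5 stock = 10%.
tris_mM = final_concentration(1000.0, parts["tris_hcl_1M"], total)
cymal5_pct = final_concentration(10.0, parts["cymal5_10pct"], total)

print(f"Tris-HCl added: {tris_mM:.0f} mM")
print(f"Cymal-5: {cymal5_pct:.1f}%")
```

Under this assumption the detergent concentration in the elution buffer is 0.5%, the same as in the washing and buffer-exchange buffers described above.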
Precipitation of the purified S glycoproteins by monoclonal antibodies and sACE2-Fc.
The S glycoproteins, purified as described above, were incubated with monoclonal antibodies (6 μg/ml) or sACE2-Fc (30 μg/ml). In some cases, antibody precipitation was carried out in the absence or presence of sACE2 (27 μg/ml). After incubation with protein A-agarose, the precipitates were analyzed by Western blotting with mouse antibody against S1 and rabbit antibody against S2, as described above.
Proteolytic digestion of SARS-CoV-2 spike glycoproteins for glycosylation analysis.
The purified SARS-CoV-2 S glycoprotein samples (30 μg) at a concentration of ∼0.03 mg/ml were denatured with 7 M urea in 100 mM Tris buffer (pH 8.5), reduced at room temperature for 1 h with TCEP (5 mM), and alkylated with 20 mM IAM at room temperature for another hour in the dark. The reduced and alkylated samples were buffer exchanged with 50 mM ammonium bicarbonate (pH 8) using a 50-kDa molecular weight cutoff filter (Millipore) prior to protease digestion. The resulting buffer-exchanged sample was aliquoted into two portions, one digested with trypsin and the other with chymotrypsin. All protease digestions were performed according to the manufacturer’s suggested protocols. Digestion with trypsin was performed with a 30:1 protein-enzyme ratio at 37°C for 18 h; chymotrypsin digestion was performed with a 20:1 protein-enzyme ratio at 30°C for 10 h; and the combination of both proteases (a mixture of trypsin and chymotrypsin) was performed using the same protein-enzyme ratio as that used for single enzyme digestion and incubated overnight at 37°C. Ten-microliter aliquots from each digest were treated with PNGase F and incubated at 37°C. The digests were either directly analyzed or stored at −20°C until further analysis.
Chromatography and mass spectrometry.
High-resolution LC-MS experiments were performed using an Orbitrap Fusion Lumos Tribrid (Thermo Scientific) mass spectrometer equipped with an electron transfer dissociation (ETD) option that is coupled to an Acquity UPLC M-Class system (Waters). Mobile phases consisted of solvent A (99.9% deionized H2O plus 0.1% formic acid) and solvent B (99.9% CH3CN plus 0.1% formic acid). Three microliters of the sample was injected onto a C18 PepMap 300 column (300 μm inner diameter by 15 cm, 300 Å; Thermo Fisher Scientific) at a flow rate of 3 μl/min. The following CH3CN/H2O multistep gradient was used: 3% B for 3 min, followed by a linear increase to 45% B in 50 min and then a linear increase to 90% B in 15 min. The column was held at 90% B for 10 min before reequilibration. All mass spectrometric analyses were performed in the positive ion mode using data-dependent acquisition with the instrument set to run in 3-s cycles for the survey and two consecutive MS/MS scans with collision-induced dissociation (CID) and ETD (either EThcD or ETciD). The full MS survey scans were acquired in the Orbitrap in the mass range 400 to 1,800 m/z at a resolution of 120,000 at m/z 200 with an automatic gain control target of 4 × 10⁵. Following a survey scan, MS/MS scans were performed on the most intense ions with charge states ranging from 2 to 8 and with intensity greater than 5,000. CID was carried out with a collision energy of 30%, while ETD was performed using the calibrated charge-dependent reaction time. Resulting fragments were detected using rapid scan rate in the ion trap.
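The multistep gradient described above can be written as a piecewise-linear function of time, as in the following Python sketch. It assumes that "in 50 min" and "in 15 min" denote the duration of each ramp (so the ramps end at 53 and 68 min and the hold ends at 78 min); the breakpoint table and function name are illustrative.

```python
# (time in min, % solvent B) breakpoints, assuming each stated time is a
# segment duration: 3-min hold, 50-min ramp, 15-min ramp, 10-min hold.
GRADIENT = [(0.0, 3.0), (3.0, 3.0), (53.0, 45.0), (68.0, 90.0), (78.0, 90.0)]


def percent_b(t_min):
    """Return % solvent B at time t_min by linear interpolation."""
    if t_min < GRADIENT[0][0] or t_min > GRADIENT[-1][0]:
        raise ValueError("time is outside the programmed gradient")
    for (t0, b0), (t1, b1) in zip(GRADIENT, GRADIENT[1:]):
        if t0 <= t_min <= t1:
            return b0 + (b1 - b0) * (t_min - t0) / (t1 - t0)


for t in (0, 3, 28, 53, 60.5, 68, 78):
    print(f"{t:>5} min: {percent_b(t):.1f}% B")
```

With these assumptions the programmed run lasts 78 min before reequilibration, and the midpoint of the first ramp (28 min) corresponds to 24% B.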
Glycopeptide identification and disulfide bond analysis.
Glycopeptide compositional analyses were performed as described previously (
105,
106). Briefly, glycopeptide compositions were determined manually from both MS and tandem MS data of a glycopeptide-rich region of the LC-MS data. Glycopeptide peaks from high-resolution MS data in this region were identified from a cluster of peaks whose mass difference corresponds to the masses of monosaccharide units (hexose, HexNAc, NeuAc, and Fuc). The compositions for the set of glycopeptides were then determined from fragment mass information from CID and ETD data. This information consists of the Y1 ion for identifying the peptide portion, the glycosidic bond cleavages resulting from the losses of the monosaccharide units from CID data, and the peptide backbone information from ETD data. Once the peptide portion was determined, plausible glycopeptide compositions for the set of peaks for the glycopeptide-rich region were obtained using the high-resolution MS data and GlycoPep DB (
107). The putative glycopeptide composition for each glycopeptide-rich region in the LC-MS data was confirmed manually from CID and ETD data. The full list of glycoforms is provided in supplemental material, and the percentage of each type of composition, i.e., high-mannose or processed, is reported here. These percentages are obtained by tallying the relative proportion of high-mannose or processed glycoforms, where each glycoform is weighted equally. The percentages are not meant to correspond to the absolute glycan abundance of high-mannose or processed glycoforms, a quantity that is not precisely knowable, since different glycopeptide glycoforms have different ionization efficiencies (
108).
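Two steps of the analysis described above lend themselves to a short illustration: recognizing peaks separated by a monosaccharide mass, and tallying glycoform types with equal weighting. The Python sketch below is not the procedure used in this study, which was performed manually. It assumes that peaks have already been converted to neutral monoisotopic masses, uses standard monoisotopic residue masses, and applies a hypothetical 0.01-Da matching tolerance. All input values are invented for illustration.

```python
from collections import Counter

# Monoisotopic residue masses (Da) of the monosaccharide units.
MONOSACCHARIDE_MASS = {
    "Hex": 162.0528,
    "HexNAc": 203.0794,
    "Fuc": 146.0579,
    "NeuAc": 291.0954,
}


def monosaccharide_steps(neutral_masses, tol_da=0.01):
    """List (lighter, heavier, unit) for peak pairs one monosaccharide apart."""
    masses = sorted(neutral_masses)
    steps = []
    for i, lighter in enumerate(masses):
        for heavier in masses[i + 1:]:
            delta = heavier - lighter
            for name, unit in MONOSACCHARIDE_MASS.items():
                if abs(delta - unit) <= tol_da:
                    steps.append((lighter, heavier, name))
    return steps


def glycoform_percentages(glycoforms):
    """Percentage of each category per site, each glycoform weighted equally.

    glycoforms: iterable of (site, category) pairs, one per distinct glycoform.
    """
    per_site = {}
    for site, category in glycoforms:
        per_site.setdefault(site, Counter())[category] += 1
    return {
        site: {cat: 100.0 * n / sum(counts.values()) for cat, n in counts.items()}
        for site, counts in per_site.items()
    }


# Invented masses: the first three form a Hex then HexNAc ladder.
print(monosaccharide_steps([3000.0, 3162.0528, 3365.1322, 3400.0]))

# Invented glycoform list for a hypothetical site.
observed = [("site_A", "high-mannose")] + [("site_A", "processed")] * 3
print(glycoform_percentages(observed))
```

Because every glycoform counts once regardless of signal intensity, the resulting percentages describe the diversity of glycoform types at a site rather than their absolute abundance.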
Disulfide bond patterns of the SARS-CoV-2 spike glycoprotein were determined by mapping the disulfide-linked peptides. Data analysis was performed using the Mascot (v 2.7) search engine (
109) for peptides containing free cysteine residues, and disulfide bond patterns were analyzed manually as described previously (
110,
111). Briefly, to determine peptides containing free cysteine residues, raw data generated from LC-MS/MS experiments were converted to MGF format using an open-source tool, msConvert. The MGF files were then searched against the UniProt SARS-CoV-2 database (
https://covid-19.uniprot.org/; 106 sequences, 69,061 residues; April 2021) concatenated with a custom database (182 sequences, 110,199 residues) and the Swiss-Prot database (2021_02 release, 564,638 sequences, 203,519,613 residues; taxonomy, viruses) using Mascot v. 2.7.0 (MatrixScience). The following search parameters were used: enzyme, either trypsin alone or a combination of trypsin and chymotrypsin; a maximum of 2 missed cleavages per peptide; and a mass tolerance of 10 ppm for precursor ions and ±0.6 Da for fragment ions. Amino acid modifications were the following: fixed, pyridylethyl (Cys); variable, deamidation (N/Q), Gln->pyro-Glu (N-term Q), and oxidation (M). An automatic decoy search was applied to determine the false discovery rate (FDR) and, when possible, peptides were evaluated at 1% FDR. Mascot ion score cutoffs of 40 and 38 were used for samples digested with trypsin alone and for samples digested with a combination of trypsin and chymotrypsin, respectively. Using these parameters, no peptide containing a free cysteine was identified, which indicates that the cysteine-containing peptides are all disulfide bonded. Accordingly, all disulfide-bonded peptides were analyzed manually.
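To make the acceptance criteria above concrete, the following sketch applies the 10 ppm precursor tolerance and the digest-specific ion score cutoffs to a single peptide match, and computes a simple decoy-based FDR estimate. It does not call Mascot or reproduce its internal scoring. The function names are hypothetical, treating the cutoffs as inclusive (score >= cutoff) is an assumption, and the FDR formula (decoy hits divided by target hits) is one common estimate that may differ from the exact calculation used by the search engine.

```python
PRECURSOR_TOL_PPM = 10.0  # precursor mass tolerance stated in the text
ION_SCORE_CUTOFF = {      # Mascot ion score cutoffs stated in the text
    "trypsin": 40,
    "trypsin+chymotrypsin": 38,
}


def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def passes_filters(observed_mz: float, theoretical_mz: float,
                   ion_score: float, digest: str) -> bool:
    """True if a match meets the precursor tolerance and score cutoff.

    Assumption: the cutoff is inclusive (score >= cutoff).
    """
    within_tol = abs(ppm_error(observed_mz, theoretical_mz)) <= PRECURSOR_TOL_PPM
    return within_tol and ion_score >= ION_SCORE_CUTOFF[digest]


def decoy_fdr(n_target_hits: int, n_decoy_hits: int) -> float:
    """Simple decoy-based FDR estimate: decoy hits / target hits."""
    if n_target_hits <= 0:
        raise ValueError("need at least one target hit")
    return n_decoy_hits / n_target_hits
```

For example, a match with a 5 ppm error and an ion score of 39 would be rejected for a trypsin-only digest but accepted for the trypsin plus chymotrypsin digest under these cutoffs.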
VSV pseudotyped by S glycoproteins.
VSV was pseudotyped with S glycoproteins expressed stably in 293T-S cells or transiently in 293T cells. 293T-S cells in 6-well plates were induced with 1 μg/ml doxycycline or, as a control, incubated in standard medium without doxycycline. For transient expression, subconfluent 293T cells in a T75 flask were transfected with 15 μg of the SARS-CoV-2 S expression plasmid using 60 μl of 1 mg/ml polyethylenimine (PEI). Twenty-four hours later, cells were infected at a multiplicity of infection of 3 to 5 for 2 h at 37°C with rVSV-ΔG pseudovirus complemented in trans with the G glycoprotein and bearing a luciferase gene (Kerafast). Cells were then washed 6 times with DMEM plus 10% FBS and returned to culture. Cell supernatants containing S-pseudotyped VSV were harvested 24 h later, clarified by low-speed centrifugation (900 × g for 10 min), and either characterized immediately or stored at −80°C for later analysis.
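The transfection and infection quantities above involve two small calculations that are easy to check in code. The sketch below is illustrative only; the function names are hypothetical, and the text does not state a cell number, so any cell count passed to `infectious_units_needed` is the user's own assumption.

```python
def pei_to_dna_mass_ratio(dna_ug: float, pei_volume_ul: float,
                          pei_conc_mg_per_ml: float) -> float:
    """Mass ratio of PEI to plasmid DNA in a transfection mix.

    1 mg/ml equals 1 ug/ul, so volume (ul) x concentration (mg/ml)
    gives the PEI mass in micrograms.
    """
    if dna_ug <= 0:
        raise ValueError("DNA mass must be positive")
    pei_ug = pei_volume_ul * pei_conc_mg_per_ml
    return pei_ug / dna_ug


def infectious_units_needed(n_cells: float, moi: float) -> float:
    """Infectious units required to infect n_cells at a given MOI."""
    if n_cells < 0 or moi < 0:
        raise ValueError("cell number and MOI must be non-negative")
    return n_cells * moi
```

With the stated amounts (15 μg of plasmid and 60 μl of 1 mg/ml PEI), `pei_to_dna_mass_ratio(15, 60, 1.0)` gives a PEI:DNA mass ratio of 4:1.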
Syncytium formation assay.
293T-S cells in 6-well plates were cotransfected with 1 μg each of an eGFP-expressing plasmid and a plasmid expressing hACE2 with Lipofectamine 3000 (Thermo Fisher Scientific). Cells were then incubated in either standard (control) medium or medium containing 1 μg/ml doxycycline. Twenty-four hours after transfection, cells were stained with BioTracker NIR694 nuclear dye (Sigma-Aldrich) and imaged using a fluorescence microscope with green and red filters. In parallel, cell lysates were collected for Western blotting as described above.
Virus infectivity.
VSV-ΔG vectors pseudotyped with SARS-CoV-2 S glycoprotein variants were produced as described above. The recombinant viruses were incubated with 293T-ACE2 cells, and 24 h later, luciferase activity in the cells was measured.
Virus neutralization by sACE2 and sera.
Neutralization assays were performed by adding 200 to 300 50% tissue culture infectious doses (TCID50) of rVSV-ΔG pseudotyped with SARS-CoV-2 S glycoprotein variants to serial dilutions of sACE2 and sera. The mixture was dispensed onto a 96-well plate in triplicate and incubated for 1 h at 37°C. Approximately 4 × 10^4 293T-ACE2 cells were then added to each well, and the cultures were maintained for an additional 24 h at 37°C before luciferase activity was measured. Neutralization activity was calculated from the reduction in luciferase activity compared to controls using GraphPad Prism 8 (GraphPad Software Inc.).
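The calculation of neutralization from the reduction in luciferase activity can be sketched as follows. The analysis in the text was done in GraphPad Prism; this Python example is only an illustration of one common formula. It assumes that the control is a virus-only well (no sACE2 or serum) and that an optional cells-only background is subtracted, neither of which is specified in the text. The function name and arguments are hypothetical.

```python
from statistics import mean


def percent_neutralization(sample_rlu, virus_only_rlu: float,
                           cells_only_rlu: float = 0.0) -> float:
    """Percent reduction in luciferase signal relative to a virus-only control.

    sample_rlu     : replicate luciferase readings (e.g., triplicate wells)
                     at one inhibitor dilution
    virus_only_rlu : mean reading without inhibitor (assumed control)
    cells_only_rlu : optional background from uninfected cells (assumption)
    """
    signal = mean(sample_rlu) - cells_only_rlu
    max_signal = virus_only_rlu - cells_only_rlu
    if max_signal <= 0:
        raise ValueError("virus-only control must exceed background")
    return 100.0 * (1.0 - signal / max_signal)
```

Applying this function at each dilution gives the points of a dose-response curve, which could then be fitted to estimate a half-maximal inhibitory concentration or dilution.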
Data availability.
The raw MS data have been deposited in the Mass Spectrometry Interactive Virtual Environment (MassIVE) repository, along with search parameters and assignment criteria. MassIVE may be found at
https://massive.ucsd.edu/ProteoSAFe/static/massive.jsp. The dataset identifier is MSV000087606, and it can be accessed at
ftp://[email protected]. Summaries of the glycopeptide compositions and annotated MS/MS spectra for key novel glycopeptide assignments are available in Table S1 and Fig. S1, respectively.
ACKNOWLEDGMENTS
We thank Elizabeth Carpelan for preparation of the manuscript. We thank Peihui Wang (Shandong University), Yuan Liu (Cornell University), Lihong Liu and David D. Ho (Columbia University Vagelos College of Physicians and Surgeons), Michael Farzan (Scripps Florida), and Bing Chen (Harvard Medical School) for reagents.
This study was supported by the University of Alabama at Birmingham Center for AIDS Research (NIH P30 AI27767), by grants from the National Institutes of Health (AI125093 to H.D., J.C.K., and J.S. and R35 GM103054 to H.D.), and by funding to J.S. from the late William F. McCarty-Cooper.
S.Z., H. Desaire, and J.S. conceived the study. H. Ding and J.C.K. established the SARS-CoV-2 S glycoprotein-expressing cells. S.Z. and S.A. analyzed the expression and function of S glycoprotein variants. S.Z. purified the SARS-CoV-2 S glycoproteins and characterized the antigenicity of the purified glycoproteins. E.P.G. and H. Desaire analyzed the glycosylation and disulfide bonding of the purified S glycoproteins. S.Z., E.P.G., H. Desaire, and J.S. wrote the manuscript. All authors contributed to data analysis and editing of the manuscript.