INTRODUCTION
Genomic surveillance of viral mutations is the first step in detecting viral changes that could impact public health by interfering with diagnostics, modifying pathogenicity, or altering susceptibility to existing immunity or treatments. In many countries, the challenge of detecting new mutations of interest in severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) involves sequencing representative genomes from circulating viruses, sharing sequence information on public databases (e.g., GISAID [
1]), and analyzing them in real time using platforms such as Nextstrain (
2). While mutations appear randomly, their fate in the population depends on a combination of the conferred fitness advantage and stochastic and demographic processes. A first step in assessing the potential public health impact of newly observed mutations is to determine whether their increase in frequency is due to chance or adaptation. If they are found to be adaptive, it is important to evaluate whether their adaptation is linked to an improved ability to replicate, colonize, transmit, or evade antiviral host defenses (
3). An important challenge in the field is to decipher which of all the variants that appear should be monitored to implement measures that mitigate their risk to public health.
Mutations in SARS-CoV-2 have been reported since the early stages of the coronavirus disease 2019 (COVID-19) epidemic (
4–6). The most common mutations described are single nucleotide polymorphisms (SNPs) and small deletions (
7–9). Genomic surveillance of mutations has been mostly focused on the spike (S) protein because of its key role in viral entry and immunity (
10), as well as the fact that this protein constitutes the basis of numerous SARS-CoV-2 vaccines (
11). S is a homotrimeric protein whose heavily glycosylated ectodomain protrudes from the viral membrane, showing a bat-like shape with an N-terminal globular head connected to the membrane by an elongated stalk (
12). The S protein is proteolytically processed by the cellular furin protease into the S1 and S2 subunits (
13,
14). Additional proteolytic cleavage occurs following the binding of the S protein to host receptors, facilitating S1 subunit release. The C-terminal S2 subunit remains trimeric in the viral membrane but undergoes conformational changes that promote fusion with the host cell (
15).
The first mutation identified as potentially concerning was a change from aspartic acid to glycine in the S1 subunit of the S protein at position 614 (S:D614G). S:D614G emerged early in the epidemic, becoming predominant in most countries within 2 months, and completely dominated the epidemic by August 2020 (
16). As with any mutant, the initial spread of this mutation could have resulted from stochastic events, the dynamics of the epidemic, or an intrinsically higher viral fitness. More than 6 months after the initial report of this mutation, several studies reported evidence in favor of higher transmission efficiency in animal models and human populations (
4,
17–19). S:D614G replicates better in some cell culture and animal models (
17,
18,
20) and is associated with higher viral loads in infected individuals (
16); importantly, however, it does not impact diagnostics or vaccine efficacy.
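The replacement labels used throughout this text (for example S:D614G, S:A222V, or S:N501Y) pack a protein name, the reference residue, the position, and the new residue into one string. As an illustration only, a minimal Python sketch of splitting such a label into its parts might look like the following. The `Replacement` type and the `parse_replacement` function are hypothetical names introduced here and are not part of the workflow described in this work.

```python
import re
from typing import NamedTuple

# Single-letter codes for the 20 standard amino acids.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


class Replacement(NamedTuple):
    """Hypothetical container for one amino acid replacement."""
    protein: str    # protein name, e.g. "S" for spike
    reference: str  # residue in the reference sequence
    position: int   # 1-based position within the protein
    variant: str    # residue observed in the variant


# Label layout assumed here: <protein>:<reference><position><variant>
_PATTERN = re.compile(rf"^(\w+):([{AMINO_ACIDS}])(\d+)([{AMINO_ACIDS}])$")


def parse_replacement(label: str) -> Replacement:
    """Split a label such as 'S:D614G' into its four components."""
    match = _PATTERN.match(label)
    if match is None:
        raise ValueError(f"not a full replacement label: {label!r}")
    protein, reference, position, variant = match.groups()
    return Replacement(protein, reference, int(position), variant)


d614g = parse_replacement("S:D614G")
print(d614g.protein, d614g.reference, d614g.position, d614g.variant)
```

Under these assumptions, abbreviated labels that omit the reference residue (such as S:501Y) are rejected rather than guessed at, which keeps the reference and variant residues unambiguous.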
Following the first wave of the pandemic, additional variants have been reported from many countries. Among the first spike protein changes associated with these variants was the amino acid replacement S:A222V, located in the N-terminal domain (NTD) of the S1 subunit, which arose in the background of S:D614G. The variant containing this change, termed 20E, was first sequenced in Spain and expanded throughout Europe (
5). Other variants have been reported since, including the so-called cluster 5 variant, which harbors a combination of 3 SNPs and a single deletion related to mink farms in Denmark (
21). One of the SNPs is in the S protein of this variant, S:Y453F; it occurs in the receptor-binding domain (RBD) and may increase binding to cell receptors in mink (
22). By late 2020 or early 2021, three variants of concern (VOC) were described, all of which share the S:N501Y amino acid replacement in the RBD of the S protein: Alpha (also called 20I/501Y.V1 or lineage B.1.1.7) was originally described in the United Kingdom (
9), Beta (20H/501Y.V2; B.1.351) in South Africa, and Gamma (20J/501Y.V3; P.1) in Brazil. Recently, in March 2021, a new VOC known as Delta (21A/478K.V1, B.1.617.2) emerged in India. The Delta variant does not contain S:501Y (
23) and is displacing the predominant variant, Alpha (
24). These variants are of particular concern because of their rapid spread, likely due to increased transmissibility (
25–27). Reduced neutralization has been associated with replacements at several amino acid positions of the spike protein. VOC with the amino acid replacement S:N501Y (Alpha, Beta, and Gamma) show the greatest impact on immune evasion, followed by lineages harboring S:L452R, which include the Delta variant (B.1.617.2) (
28). While the effect of these variants on the immune response in convalescent and vaccinated individuals is still unclear, current data do not provide evidence of immune escape or compromised vaccine efficacy (
29). Nevertheless, new mutations could emerge that hamper efforts to control the epidemic at regional or global scales by increasing transmissibility and/or reducing vaccine efficacy in the future.
The dominance of a lineage in a geographical region is sometimes determined by the number of introductions and mobility among regions (
30) rather than by a change in a biological trait that confers a selective advantage (
5). Nearly all VOC thus far have spread outside the country where they were initially identified and are estimated to spread faster than other cocirculating genotypes, becoming dominant for a period (
25,
31,
32) and eventually being replaced locally by other variants (
24). The current work describes the workflow for investigating the risk of emerging mutations in the spike protein of SARS-CoV-2, starting from genomic epidemiology and leading up to a biological and immunological characterization of SARS-CoV-2 mutations in terms of viral infectivity, virion stability, and neutralization by sera from convalescent and vaccinated individuals.
DISCUSSION
The success of SARS-CoV-2 is linked to its ability to infect and be transmitted. Mutations that emerge independently several times and increase in frequency are likely to confer enhanced viral infectivity, transmission, or immune evasion. The identification of such mutants is of great importance, as they can significantly impact public health. On the other hand, the appearance of mutations can also be driven by stochastic events, and the ability to evaluate the potential risk posed by new variants is of key importance to appropriately tailor public health responses. In this work, we identified two amino acid replacements at positions 1163 and 1167 of the S protein that appeared to be potentially beneficial for the virus based on several lines of evidence. First, these positions are highly variable within SARS-CoV-2 but conserved across the closely related coronaviruses. Second, the vast majority of sequences harboring these mutations appeared in clusters (
Fig. 1a and
b). Third, both positions have been reported as positively selected multiple times throughout the SARS-CoV-2 phylogeny, indicating a fitness advantage (
57). Finally, the largest cluster containing either of these mutations, and therefore the most successful in terms of transmission, harbored both mutations together (
Fig. 1a and
b). This infection cluster was sustained for more than 6 months across Europe, suggesting that both mutations together could increase viral fitness.
For these reasons, we conducted a series of experiments to assess whether the two mutations conferred a biological advantage to the virus
in vitro. Analysis of the mutation in the context of available structures suggested that G1167V could alter the flexibility of the S protein stalk both by restricting the conformational freedom normally conferred by the wild-type glycine residue and by introducing a hydrophobic side chain that would favor burial in the HR2 coiled-coil leucine zipper of the prefusion state (
Fig. 3). This extensive flexibility of the S prefusion stalk seems to be unique to SARS-CoV-2 (
43) and has been suggested to increase avidity for the host receptors by allowing the engagement of multiple S proteins (
43). Therefore, stalk stabilization by G1167V is likely to result in a reduced ability of S to bind receptors in the target cell. In agreement with this, we found reduced infectivity upon introduction of both changes D1163Y and G1167V into the spike protein (
Fig. 4a and
b). In addition, we found no indication of resistance to heat inactivation that could facilitate environmental transition between hosts (
Fig. 4c), and the viral load in clinical specimens showed no difference due to the presence of these two mutations compared to the 20E S genotype (
Fig. 4d).
We examined if these two mutations conferred evasion of preexisting immunity, which could compromise vaccine efficacy and/or result in reinfection. For this, we used sera from both the first (April 2020) and second (October 2020) epidemic waves of the infection in Spain, because an almost complete replacement of SARS-CoV-2 S genotypes of different variants occurred between these two time points in Spain (
30). Using sera from donors infected during the first wave of the pandemic in Spain, we found a modest but statistically significant reduction, of approximately 6-fold, in the susceptibility to neutralization of the 1163.7 S genotype compared to the 20E S genotype (
Fig. 5a). However, no difference in neutralization was observed between the two variants when sera from patients infected during the second wave were used (
Fig. 5b). Overall, the magnitude of the observed reduction in neutralization susceptibility to sera from individuals infected during the first wave was much less pronounced than that observed for other genotypes implicated in immune evasion (
54), although the degree of reduced neutralization required to confer a biologically relevant fitness advantage
in vivo has not been established. Importantly, we also found no evidence for reduced neutralization of the 1163.7 variant by sera from donors immunized with the BNT162b2 vaccine (
Fig. 5c). Since all currently available vaccines, including BNT162b2, are based on the Wuhan S genotype, it is expected that these mutations will not reduce the effectiveness of the other vaccines either.
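To make the notion of a fold reduction in neutralization concrete, the following minimal Python sketch computes the ratio of geometric mean neutralization titers between a reference and a variant S genotype. This is an illustrative assumption and not necessarily how the approximately 6-fold figure above was derived: the function names are hypothetical, and the titers are invented values chosen only to show the arithmetic, not data from this study.

```python
import math
from typing import Sequence


def geometric_mean(titers: Sequence[float]) -> float:
    """Geometric mean of positive reciprocal neutralization titers."""
    if not titers or any(t <= 0 for t in titers):
        raise ValueError("titers must be a non-empty sequence of positive values")
    return math.exp(sum(math.log(t) for t in titers) / len(titers))


def fold_reduction(reference_titers: Sequence[float],
                   variant_titers: Sequence[float]) -> float:
    """Ratio of geometric mean titers, reference genotype over variant.

    Values above 1 mean the variant is neutralized less efficiently.
    """
    return geometric_mean(reference_titers) / geometric_mean(variant_titers)


# Invented example titers (NOT study data): each serum neutralizes the
# variant at half the titer observed for the reference genotype.
reference = [320.0, 640.0, 160.0, 1280.0]
variant = [160.0, 320.0, 80.0, 640.0]

print(f"fold reduction: {fold_reduction(reference, variant):.2f}")
```

The geometric mean is used in this sketch because titers from serial dilutions are usually compared on a logarithmic scale. A paired statistical test on the log-transformed titers would be needed in addition to judge significance.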
Both S amino acid positions 1163 and 1167 are embedded in experimentally confirmed T- and B-cell epitopes. Interestingly, for T-cell epitopes, a predicted HLA-II epitope including positions 1163 and 1167 has been experimentally verified to bind to HLA DRB1*01:01, the prototype molecule for the DR supertype (epitope identifier in Immune Epitope DataBase: 9006 [
58]). Additionally, amino acid S:D1163 is included in a SARS-CoV-2 T-cell linear epitope eliciting T-cell responses in convalescent COVID-19 cases (
59) as well as in SARS-CoV-2-naive individuals (
52), indicating cross-reactivity in epitopes involving these regions. B-cell linear epitopes that span D1163 and G1167 have also been reported (
51), with D1163 belonging to a dominant linear B-cell epitope recognized by more than 40% of the COVID-19 patients included in the assay (
53). Hence, it is possible that these mutations could play a role in modulating T-cell responses. However, at the time cluster 1163.7 appeared and transmitted in Europe, large-scale vaccination had not been implemented and the majority of the population had not been infected by SARS-CoV-2. Therefore, there was likely little selection of SARS-CoV-2 variants that evade existing immunity.
Overall, clinical and experimental data do not support the idea that D1163Y and G1167V in the S protein confer temperature resistance, higher infectivity in vitro, higher viral load in vivo, or significant escape from antibody neutralization. The biological consequences of these mutations are therefore unlikely to confer a significant fitness advantage. Indeed, these early findings are in agreement with the subsequent observation that these mutations ceased to circulate in Europe as VOC Alpha increased in frequency.
ACKNOWLEDGMENTS
We acknowledge the patients and the Consorcio Hospital General de Valencia Biobank integrated in the Valencian Biobanking Network for their collaboration, as well as the patients and hospital staff at Hospital Universitario y Politécnico La Fe de Valencia. In addition, we thank Gert Zimmer (Institute of Virology and Immunology, Mittelhäusern/Switzerland), Stefan Pohlmann, and Markus Hoffmann (German Primate Center, Infection Biology Unit, Goettingen/Germany) for providing the reagents required for the generation of VSV-pseudotyped viruses and the codon-optimized S plasmid. We acknowledge all the efforts from different laboratories and authorities submitting all possible sequences of SARS-CoV-2 worldwide and making them available on the GISAID platform.
M.C. and R.G. are supported by the Ramón y Cajal program from Ministerio de Ciencia. This research work was supported by the European Commission–NextGenerationEU, the Instituto de Salud Carlos III projects COV20/00140 and COV20/00437, and the Generalitat Valenciana (SEJI/2019/011 and Covid_19-SCI). Action was cofinanced by the European Union through the Operational Program of the European Regional Development Fund (ERDF) of the Valencian Community 2014-2020.
P.R.-R.: Conceptualization, Methodology, Formal Analysis, Investigation, Visualization, Writing Original Draft. C.F.-G.: Conceptualization, Methodology, Formal Analysis, Review and edit draft. A.C.-O.: Formal Analysis, Review and edit draft. M.G.L.: Formal Analysis, Review and edit draft. S.J.-S.: Software, Validation, Review and edit draft. I.C.-M.: Methodology, Resources, Review and edit draft. P.R.-H.: Investigation, Review and edit draft. M.T.-P.: Methodology, Resources, Review and edit draft. M.A.B.: Investigation, Methodology, Review and edit draft. G.D.: Methodology, Software, Data Curation, Review and edit draft. L.M.-P.: Methodology, Project Administration, Review and edit draft. M.G.: Resources, Review and edit draft. M.M.-A.: Resources, Review and edit draft. M.D.G.: Resources, Review and edit draft. J.L.P.: Resources, Review and edit draft. F.G.-C.: Funding, Project Administration, Supervision, Review and edit draft. I.C.: Funding, Project Administration, Supervision, Review and edit draft. A.M.: Formal Analysis, Writing Original Draft. R.G.: Conceptualization, Methodology, Formal Analysis, Writing Original Draft, Supervision, Funding. M.C.: Conceptualization, Methodology, Investigation, Formal Analysis, Writing Original Draft, Supervision, Funding.
SeqCOVID-SPAIN consortium members include the following: Iñaki Comas, Fernando González-Candelas, Galo A. Goig-Serrano, Álvaro Chiner-Oms, Irving Cancino-Muñoz, Mariana Gabriela López, Manoli Torres-Puente, Inmaculada Gómez, Santiago Jiménez-Serrano, Lidia Ruiz-Roldán, María Alma Bracho, Neris García-González, Llúcia Martínez Priego, Inmaculada Galán-Vendrell, Paula Ruiz-Hueso, Griselda De Marco, Ma Loreto Ferrús Abad, Sandra Carbó-Ramírez, Mireia Coscollá, Paula Ruiz Rodríguez, Giuseppe D'Auria, Francisco Javier Roig Sena, Hermelinda Vanaclocha Luna, Isabel San Martín Bastida, Daniel García Souto, Ana Pequeño Valtierra, Jose M. C. Tubio, Fco. Javier Temes Rodríguez, Jorge Rodríguez-Castro, Martín Santamarina García, Nuria Rabella Garcia, Ferrán Navarro Risueño, Elisenda Miró Cardona, Manuel Rodríguez-Iglesias, Fátima Galán-Sanchez, Salud Rodríguez-Pallares, María de Toro, María Pilar Bea-Escudero, José Manuel Azcona-Gutiérrez, Miriam Blasco-Alberdi, Alfredo Mayor, Alberto L. Garcia-Basteiro, Gemma Moncunill, Carlota Dobaño, Pau Cisteró, Oriol Mitjà, Camila González-Beiras, Martí Vall-Mayans, Marc Corbacho-Monné, Andrea Alemany, Darío García de Viedma, Laura Pérez-Lago, Marta Herranz, Jon Sicilia, Pilar Catalán, Julia Suárez, Patricia Muñoz, Cristina Muñoz-Cuevas, Guadalupe Rodríguez Rodríguez, Juan Alberola Enguídanos, Jose Miguel Nogueira Coito, Juan José Camarena Miñana, Antonio Rezusta López, Alexander Tristancho Baró, Ana Milagro Beamonte, Nieves Martínez Cameo, Yolanda Gracia Grataloup, Elisa Martró, Antoni E. Bordoy, Anna Not, Adrián Antuori, Anabel Fernández, Nona Romaní, Rafael Benito Ruesca, Sonia Algarate Cajo, Jessica Bueno Sancho, Jose Luis del Pozo, Jose Antonio Boga Riveiro, Cristián Castelló Abietar, Susana Rojo Alba, Marta Elena Álvarez Argüelles, Santiago Melón García, Maitane Aranzamendi Zaldumbide, Óscar Martínez Expósito, Mikel Gallego Rodrigo, Maialen Larrea Ayo, Nerea Antona Urieta, Andrea Vergara Gómez, Miguel J. 
Martínez Yoldi, Jordi Vila Estapé, Elisa Rubio García, Aida Peiró-Mestres, Jessica Navero-Castillejos, David Posada, Diana Valverde, Nuria Estévez-Gómez, Iria Fernández-Silva, Loretta de Chiara, Pilar Gallego-García, Nair Varela, Rosario Moreno Muñoz, Ma Dolores Tirado Balaguer, Ulises Gómez-Pinedo, Mónica Gozalo Margüello, Ma Eliecer Cano García, José Manuel Méndez Legaza, Jesús Rodríguez Lozano, María Siller Ruiz, Daniel Pablo Marcos, Antonio Oliver, Jordi Reina, Carla López-Causapé, Andrés Canut Blasco, Silvia Hernáez Crespo, Ma Luz Cordón Rodríguez, Ma Concepción Lecaroz Agara, Carmen Gómez González, Amaia Aguirre Quiñonero, José Israel López Mirones, Marina Fernández Torres, Ma Rosario Almela Ferrer, José Antonio Lepe Jiménez, Verónica González Galán, Ángel Rodríguez Villodres, Nieves Gonzalo Jiménez, Ma Montserrat Ruiz García, Antonio Galiana Cabrera, Judith Sánchez-Almendro, Gustavo Cilla Eguiluz, Milagrosa Montes Ros, Luis Piñeiro Vázquez, Ane Sorrarain, José María Marimón Ortiz de Zarate, Ma Dolores Gómez Ruiz, Eva González Barberá, José Luis López Hontangas, José María Navarro-Marí, Irene Pedrosa Corral, Sara Sanbonmatsu Gámez, M. Carmen Perez Gonzalez, Francisco Javier Chamizo López, Ana Bordes Benítez, David Navarro Ortega, Eliseo Albert Vicent, Ignacio Torres, Ma Isabel Gascón Ros, Cristina Torregrosa Hetland, Eva Pastor Boix, Paloma Cascales Ramos, Begoña Fuster Escrivá, Concepción Gimeno Cardona, María Dolores Ocete Mochón, Rafael Medina González, Julia González Cantó, Olalla Martínez Macias, Begoña Palop Borrás, Inmaculada de Toro Peinado, Ma Concepción Mediavilla Gradolph, Mercedes Pérez Ruiz, Oscar González-Recio, Mónica Gutiérrez-Rivas, Encarnación Simarro Córdoba, Julia Lozano Serra, Lorena Robles Fonseca, Adolfo de Salazar, Laura Viñuela, Natalia Chueca, Federico García, Cristina Gomez-Camarasa, Ana Carvajal, Vicente Martín, Juan Fregeneda, Antonio J. 
Molina, Héctor Arguello, Tania Fernandez-Villa, Amparo Farga Martí, Rocío Falcón, Victoria Domínguez Márquez, José Javier Costa Alcalde, Rocío Trastoy Pena, Gema Barbeito Castiñeiras, Amparo Coira Nieto, María Luisa Pérez del Molino Bernal, Antonio Aguilera, Anna M. Planas, Álex Soriano, Israel Fernández-Cádenas, Jordi Pérez-Tur, Ma Ángeles Marcos Maeso, Carmen Ezpeleta Baquedano, Ana Navascués Ortega, Ana Miqueleiz Zapatero, Manuel Segovia Hernández, Antonio Moreno Docón, Esther Viedma Moreno, Jesús Mingorance, Juan Carlos Galán Montemayor, Iván Sanz Muñoz, Diana Pérez San José, Maria Gil Fortuño, Juan B. Bellido Blasco, Alberto Yagüe Muñoz, Noelia Henández Pérez, Helena Buj Jordá, Óscar Pérez Olaso, Alejandro González Praetorius, Aida Esperanza Ramírez Marinero, Eduardo Padilla León, Alba Vilas Basil, Mireia Canal Aranda, Albert Bernet Sánchez, Alba Bellés Bellés, Eric López González, Iván Prats Sánchez, Mercè García González, Miguel Martínez Lirola, Maripaz Ventero Martín, Carmen Molina Pardines, Nieves Orta Mira, María Navarro Cots, Inmaculada Vidal Catalá, Isabel García Nava, Soledad Illescas Fernández-Bermejo, José Martínez-Alarcón, Marta Torres-Narbona, Cristina Colmenarejo, Lidia García-Agudo, Jorge Alfredo Pérez García, Martín Yago López, María Ángeles Goberna Bravo, Carolina Pla Cortes, Noelia Lozano Rodríguez, Nieves Aparici Valero, Sandra Moreno Marro, Agustín Irazo Tatay, Isabel Mariscal Pieper, Ma Pilar Ramos, Mónica Parra Grande, Bárbara Gómez Alonso, Francisco José Arjona Zaragozí, Amparo Broseta Tamarit, Juan José Badiola Díez, Alicia Otero García, Eloísa Sevilla Romeo, Belén Marín González, Mirta García Martínez, Marina Betancor Caro, Diego Sola Fraca, Sonia Pérez Lázaro, Eva Monleón Moscardó, Marta Monzón Garcés, Cristina Acín Tresaco, Rosa Bolea Bailo, Bernardino Moreno Burgos, Carlos Gulin Blanco, Nora Mariela Martínez Ramírez, Miguel Ángel Jiménez Clavero, Fernando Lázaro-Perona, Manuel Ponce-Alonso, Cristina Juana Torregrosa-Hetland, Alberto 
Benguría, Jovita Fernández-Pinero, Victoria Simón García, María Eugenia Carrillo Gil, Antonio Alcamí, Gonzalo Llop Furquet, Mirian Fernández-Alonso, Pedro Luis Garcinuño Enríquez, Mario Rodríguez-Dominguez, Maria Teresa Cabezas Fernández, Laura Martínez-García, Sara Gonzalez-Bodi, Manuel Ángel Rodríguez Maresca, María Pilar Romero-Gómez, Marta Bermejo Bermejo, María Rodríguez-Tejedor, Irene Muñoz-Gallego, Julio García-Rodríguez, Nieves Felisa Martínez Cameo, Javier Temes, Juan Miguel Fregeneda-Grandes, Maria Dolores Folgueira, Ana Dopazo, Melanie Abreu Di Berardino, Víctor Manuel Fernández Soria, Raúl Recio Martínez, Sergio Callejas, Ricardo Ramos-Ruíz, Amparo Martínez-Ramírez, Jose Maria González-Alba, Maria Paz Ventero Martín, Begoña Aguado, Elias Dahdouh, Mercedes Roig Cardells, Salvador Raga Borja, Verónica Saludes, Cristina Casañ, Isabel Escribano Cañadas, and Fernando Simón Soria.