INTRODUCTION
The genus
Vibrio contains >100 described species, and around a dozen of these have been demonstrated to infect humans (
1). Infection is usually initiated by exposure to seawater or consumption of raw or undercooked seafood (
2,
3).
Vibrio parahaemolyticus is a Gram-negative, halophilic bacterium found commonly in temperate and warm estuarine waters worldwide (
4).
V. parahaemolyticus is the most prevalent food poisoning bacterium associated with seafood consumption in many regions globally, typically causing acute gastroenteritis. This bacterium grows preferentially in warm (>15°C), low-salinity (<25 ppt NaCl) marine water (
5).
A number of important factors underpin the need for a greater understanding of these food-borne pathogens in an international context. Compared to other major food-borne pathogens, the number of
V. parahaemolyticus infections is steadily increasing (
6). Indeed, according to the U.S. Centers for Disease Control and Prevention (CDC), the average annual incidence of all
Vibrio infections increased by 85% between 1996 and 2009 (
7), with
V. parahaemolyticus accounting for 52% of those infections and being responsible for a more recent and marked increase in incidence (
8). In the United States alone,
V. parahaemolyticus is estimated to cause around 35,000 human illnesses each year (range, 18,000 to 58,000 cases) (
9). Additionally,
V. parahaemolyticus infections are now being reported in areas with little previous incidence, including South America and northern Europe (
6,
10). Although the escalation in the number of infections is likely multifactorial, climate warming in particular appears to be a substantial contributor to the expansion of pathogenic vibrios, especially in temperate regions (
10). Climate modeling of future scenarios suggests that
Vibrio spp., including
V. parahaemolyticus, are likely to continue to pose a significant and expanding public health risk.
The most substantial change in the epidemiology of
V. parahaemolyticus infections over the last 2 decades has been the transition from the dominance of locally restricted strains to the emergence and transcontinental expansion of new clones with pandemic potential. Only two instances of transcontinental expansion of
V. parahaemolyticus strains have been reported: the “pandemic clone” CC3 (serotype O3:K6), which emerged in India in 1996 and subsequently spread around the world (
11), and more recently the expansion of sequence type 36 (ST36), which was responsible for numerous large
V. parahaemolyticus outbreaks in the Pacific Northwest region of the United States over the last 2 decades (
12–14). The strains associated with these outbreaks, subsequently termed the Pacific Northwest (PNW) complex (
12,
14,
15), appear to be genetically and biochemically distinct and to have a significantly lower infectious dose than other toxigenic
V. parahaemolyticus isolates (
6).
Prior to 2012, PNW complex strains were restricted to the Pacific Northwest region of the United States and Canada (
16). However, illnesses associated with this complex were reported along the northeast coastline of the United States in the spring of 2012 (
8,
15) and subsequently in the northwest of Spain in association with a large outbreak of illness in August 2012 (
17). The geographic expansion of these strains caused significant economic losses in the shellfish industry in the northeast and caused the largest known food-borne
Vibrio outbreak reported in Europe (
17). A striking observation from characterization of the 2012 outbreak-associated strains, as well as previous clinical isolates of this complex from the United States, was that their serotypes (O4:K12 or O4:Kut), pulsotypes, and STs (ST36) were indistinguishable (
15). This initial observation was noteworthy because it suggested that a single, highly pathogenic clone of
V. parahaemolyticus had radiated from the Pacific Northwest region and successfully established itself along the eastern seaboard of the United States and then potentially in western Europe (
15). Illnesses associated with this clone were documented from the northeast coast of the United States in 2013, indicating that these strains overwintered in environmental reservoirs (
8).
Given the highly pathogenic nature of the PNW complex and the fact that other pathogenic variants of
V. parahaemolyticus have expanded worldwide (e.g., O3:K6) (
11), a clear understanding of the potential source, phylogenetic nature, lineage, and timeline of transmission of this group is needed. To this end, we performed a genome-wide analysis of historical and contemporary ST36
V. parahaemolyticus isolates from the United States and Europe to gain a comprehensive understanding of the genomic variation and evolutionary processes undergone by this group during its geographic expansion and colonization of new areas. These results are critical to understanding the particular genetic signatures and evolutionary forces contributing to the expansion of this globally important emerging pathogen.
DISCUSSION
The number of reported cases of
V. parahaemolyticus infection has increased steadily over the previous 2 decades as a result of expansion on a global scale (
10,
25). In addition to the environmental factors driving this expansion, the emergence and transcontinental dissemination of some particular genetic variants of
V. parahaemolyticus are contributing to this process (
15,
17,
26–28). Classical typing techniques applied to the investigation of outbreaks (initially repetitive element PCR and pulsed-field gel electrophoresis and later multilocus sequence typing [MLST]) provided insights into the potential sources and origins of these new variants, documenting the first connections between populations implicated in outbreaks across large geographic distances (
11,
12,
29–31). This situation was particularly relevant for understanding the expansion of
V. parahaemolyticus caused by the O3:K6 pandemic clone (
11,
32). The application of whole-genome sequencing for the study of pathogenic
V. parahaemolyticus populations was crucial to determine that the pandemic clone was not the only group that underwent transoceanic dispersal. Also, almost all of the major
V. parahaemolyticus outbreaks identified in Peru and Chile over the last 25 years had been associated with the introduction of new genetic variants typically originating in Asia (
33).
Following this precedent, we used genome-wide analysis to investigate a more recent instance of transcontinental spreading of a highly pathogenic
V. parahaemolyticus group, ST36. This group, primarily identified by MLST (
12), was initially reported only from the Pacific Northwest. Over the last 6 years, its detection along the northeast coast of the United States has been associated with a rise in the number of cases in the region (
8). In addition, ST36 was identified for the first, and only, time outside North America in the northwest of Spain, where it caused a large outbreak in 2012 (
15,
17). The subsequent emergence of this clone in Europe triggered concern about the potential implications of the transcontinental spreading of a second
V. parahaemolyticus strain and the opportunities for a new pandemic expansion. Furthermore, recent studies have suggested the existence of a new population of ST36 prevailing among illnesses in the northeast of the United States (
8,
34), which has introduced an additional level of uncertainty about the evolutionary history of this group.
The present study provided an exceptional opportunity to investigate the evolution of
V. parahaemolyticus populations in the course of epidemic expansion. The distribution of the ancestral lineages of ST36 was restricted to the Pacific Northwest, and there is no record of possible introductions to any other region. It was not until the emergence of the modern lineage of this clone by 1995 that it showed effective dispersal, particularly after 2000. This modern lineage from the Pacific Northwest was repeatedly introduced to the east coast of the United States until it became endemic to the area by 2008, when it initiated a differentiation process leading to the emergence of the modern U.S. northeast population, which was responsible for large outbreaks of illness from 2013 onward. Our results identified recombination as the major source of genomic variation, with a critical contribution to the major processes of diversification within the ST36 group and clear implications for the evolution of the modern lineages in Spain and the United States. In particular, recombination was of crucial importance in the emergence and diversification of the modern populations in the United States. Homologous recombination has been previously identified as a major evolutionary driver in
V. parahaemolyticus, with a high level of recombination in environmental strains (
r/
m = 39.8) (
18) and more moderate levels in disease-related populations (
12). Here we demonstrated that most of the genetic divergence within this ST36 clonal population occurred by recombination, which introduced almost twice as many substitutions as mutation did. Furthermore, a fine-scale analysis of recombination rates for each node revealed that recombination played a fundamental role in the evolution of the modern lineages, reaching
r/
m rates of 13 (overall) and around 25 in particular subpopulations undergoing high diversification. These data stress the critical importance of recombination not only as a source of variation among the highly diverse environmental populations but also within the major clonal populations that emerged from them (
12) as a major driver of the emergence of new pathogenic variants within the population.
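The r/m values quoted above are ratios of the number of substitutions introduced by recombination to the number introduced by point mutation on a given branch or set of branches. The following minimal Python sketch illustrates only that arithmetic. The function name and the substitution counts are hypothetical and chosen for illustration; they are not data from this study, and the sketch does not reproduce the recombination detection method used here.

```python
def r_over_m(recombination_substitutions: int, mutation_substitutions: int) -> float:
    """Ratio of substitutions imported by recombination to those arising by point mutation."""
    if mutation_substitutions <= 0:
        raise ValueError("mutation_substitutions must be positive")
    return recombination_substitutions / mutation_substitutions


# Hypothetical counts for a single branch, used only to show the calculation.
example_counts = {"recombination": 260, "mutation": 20}

ratio = r_over_m(example_counts["recombination"], example_counts["mutation"])
print(f"r/m = {ratio:.1f}")
```

A ratio above 1 indicates that recombination contributed more substitutions than mutation on that branch, so per-node values can differ widely from the overall value for the whole tree.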
Another relevant aspect of the evolution of this clone was the diversity in the mutation rates found across lineages. The highest evolutionary rate was found in the old PNW lineage, which also showed a higher level of diversity and the largest genomes among the strains analyzed. These particular characteristics were uniquely retained by the strains from Spain, which strengthens the links between these populations. Moreover, the present study provided a unique perspective on the evolutionary changes that occurred within a single population of
V. parahaemolyticus in the extremely infrequent process of transition from a locally adapted clone to an epidemic clone undergoing a transcontinental pandemic expansion event. The modern populations from the United States, both western and eastern lineages, showed lower evolutionary rates and smaller genomes than their ancestral lineages, where almost all of the processes of diversification and evolution were driven by recombination. Although this needs to be examined in further detail, a first analysis suggests that the gene number reduction and lower mutation rate could be associated with a more specialized lifestyle as a result of niche adaptation. Genome reduction has been observed in many bacterial lineages in their process of specialization to new environments (
35). This pattern of genome shrinkage has been recently documented in other free-living marine organisms, such as
Prochlorococcus (
36), which has undergone a genome reduction as a result of adaptation to the environment. We hypothesize that a similar process may be occurring in the modern lineage of ST36, which may be evolving through genome reduction resulting from specialization to narrow ecologic niches, limiting its versatility and survival under changing conditions. In terms of colonization, a highly specialized population may have a higher rate of survival during dispersal and a higher rate of success upon introduction into new areas. Recent experimental observations have revealed a link between genome reduction and a growth rate decrease in bacteria (
37). Similar circumstances may have arisen during the evolution of ST36, in which multiple genomic deletions may have decreased the growth rate of modern lineages of this clone, reducing the mutation rate because of a lower number of cell divisions. Although the ecologic implications of this evolutionary pattern need to be explored further, it would be important to analyze other
V. parahaemolyticus clones undergoing similar processes of geographic expansion to assess whether this is a common strategy in the evolution of major epidemic clones. Finally, the exceptional warming trend and regime shift (from 13.8 to 16°C) identified in the northeast region of the United States coinciding with the expansion of the ST36 populations in the area (
Fig. 4) may have been decisive factors contributing to the adaptation of these populations and fostering their growth and the interactions between them.
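The reasoning in this paragraph that slower growth could lower the observed mutation rate rests on a simple relationship: if the mutation rate per site per cell division is held constant, the rate per site per year scales with the number of divisions per year. The sketch below illustrates that relationship under this assumption. All parameter values are hypothetical and are not estimates for ST36 or any other lineage.

```python
def substitutions_per_site_per_year(mu_per_division: float, divisions_per_year: float) -> float:
    """Yearly substitution rate, assuming a constant per-division mutation rate."""
    if mu_per_division < 0 or divisions_per_year < 0:
        raise ValueError("rates must be non-negative")
    return mu_per_division * divisions_per_year


# Hypothetical values, for illustration only.
MU_PER_DIVISION = 1e-10          # mutations per site per cell division (assumed)
ANCESTRAL_DIVISIONS = 1000.0     # divisions per year, faster-growing lineage (assumed)
MODERN_DIVISIONS = 500.0         # divisions per year, slower-growing lineage (assumed)

ancestral_rate = substitutions_per_site_per_year(MU_PER_DIVISION, ANCESTRAL_DIVISIONS)
modern_rate = substitutions_per_site_per_year(MU_PER_DIVISION, MODERN_DIVISIONS)

# Halving the number of divisions per year halves the yearly rate.
print(f"ancestral/modern rate ratio = {ancestral_rate / modern_rate:.1f}")
```

Under these assumptions, the yearly rate falls in direct proportion to the growth rate; other factors, such as changes in DNA repair or selection, would break this proportionality and are not modeled here.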
V. parahaemolyticus infections are currently undergoing a process of geographic expansion, reaching new regions, typically in association with the introduction of strains originating from remote areas. Despite numerous studies reporting these particular patterns of spreading (e.g., reference
5), little is known about the mechanisms and biological strategies used by this organism during dispersal. The release of ballast water transported by cargo ships has been identified as one of the potential vehicles of dispersal and sources of introduction of foreign
Vibrio strains (
38) and has been associated with outbreaks occurring in areas in close proximity to important international ports (e.g., references
39 and
40). Movement of oceanic waters was also documented as a mechanism of dispersal in some instances where the emergence and onset of infections correlated with the intrusion of warm oceanic waters into the region (
6,
32). The decrease in the extent of sea ice observed in the Arctic over the last 2 decades has potentially activated a new route for ship traffic through the Bering Strait, allowing an effective connection between the west and east coasts of the United States and the potential dispersal of
Vibrio populations. In a similar context, the melting of Arctic sea ice is removing the physical boundaries between the Pacific and Atlantic Oceans, opening a natural route for the migration of plankton species between both coasts of North America documented over recent years (
41,
42). Without ruling out these two alternatives, it seems unlikely that these natural processes could have provided the opportunity for recurrent introductions of ST36 populations on the east coast of the United States. Furthermore, the presence of ST36 strains in the northwest of Spain represents an additional obstacle to the identification of a single mechanism for the dispersal of this clone. As an additional alternative, the global trade of shellfish may have also been a contributor to the dispersal of
V. parahaemolyticus populations. Recent genetic studies tracking the global distribution and introduction of Manila clams in Europe have traced the clam populations introduced into the northwest of Spain to the Pacific coast of Canada, with frequent importations of clams from British Columbia in the late 1990s and early 2000s (
43,
44).
Fine-resolution genome-wide analysis of ST36 strains over the course of geographic expansion has facilitated a better understanding of the evolution of this clone during its dispersal and introduction into areas of the United States and Spain. A similar approach applied to the study of other clonal groups undergoing similar processes of cross-continental expansion could help to assess whether the evolutionary patterns identified here are shared by other pathogenic V. parahaemolyticus strains in their transition from local distribution to the status of an epidemic clone with a global impact. Furthermore, a more extensive analysis combining disciplines such as evolution, climate science, and oceanography will provide new insights into the complex interactions between these populations and the variable ecologic conditions of their surrounding environments during diversification. These aspects are critical to understanding the mechanisms driving the evolution of novel pathogenic clones and the initiation of geographic expansion and epidemic radiation.