INTRODUCTION
Staphylococcus aureus causes the majority of skin and soft tissue infections (SSTIs) and is also frequently associated with invasive disease, such as bloodstream infections or pneumonia (1). Over the past few decades, novel methicillin-resistant S. aureus (MRSA) genotypes have emerged due to the acquisition of variants of the staphylococcal cassette chromosome mec (SCCmec) element. Only a minority of emergent MRSA clones have disseminated globally and have caused epidemic waves of health care-associated (HA), community-associated (CA), or, more recently, livestock-associated (LA) S. aureus infections.
The advent of high-resolution whole-genome sequencing (WGS) has allowed the reconstruction of the evolution of several major MRSA clones such as multilocus sequence types (STs) ST239, ST22, ST30, ST80, and ST8/USA300 by sequencing of large data sets (2–7). Collectively, these studies have documented how specific antibiotic selective pressures, as well as interactions with human hosts, have shaped the recent evolutionary history of these MRSA clones. While these studies have included a selected number of methicillin-susceptible S. aureus (MSSA) isolates (7), the emergence, evolution, and transmission of major MSSA lineages have been largely disregarded. MSSA, however, remains of considerable public health importance as it accounts for the majority of health care- and community-based S. aureus infections throughout the world (8). In a possible reversal of past trends, it has been suggested that MSSA now accounts for higher numbers of HA-S. aureus (HA-SA) infections than does MRSA (9). Although the clonal backgrounds of MSSAs are highly diverse, a number of dominant pandemic MSSA clones have been identified (8, 10, 11). Major MSSA lineages overlap the dominant circulating MRSA clones and include ST30, ST5, ST8, and, more recently, ST398 (12). ST398 was first described as both MSSA and MRSA among pig farmers in France (13, 14). Since the original description, MRSA ST398 has quickly spread among pigs and other types of livestock and has been associated with MRSA infections in farmers (15, 16). MRSA ST398 strains are thought to spread only infrequently beyond the immediate animal and farm contacts (17–19).
In parallel, animal-independent human colonization and infections with ST398 MSSA have emerged and have now been encountered worldwide, including in several European countries (14, 20, 21), the Caribbean (22), and the northeastern United States (23, 24). The prevalence of ST398 MSSA appears to be particularly high in parts of China, where this clone accounts for almost 20% of SSTIs (25). This lineage has been notably absent from other regions such as large parts of the United States (11), although it has sporadically been reported in pigs and veterinarians in the Midwest (26). ST398 MSSA strains are encountered as both CA and HA pathogens (27, 28). Remarkably, a number of studies from such diverse settings as community households, a jail holding tank, and an intensive care unit have suggested that ST398 MSSA might be uniquely transmissible between humans (24, 27, 29). In addition to the distinct epidemiology, important genomic differences between human MSSA and LA-MRSA ST398 include a large repertoire of mobile genetic elements (MGEs) (30); LA-MRSA isolates typically carry tetracycline resistance and lack a unique SA3int β-hemolysin-converting phage (21, 24, 31). In contrast, most of the human-associated ST398 isolates harbor resistance to macrolides and are spa type t571. WGS analyses indicated a possible human origin for LA-ST398, followed by the emergence of methicillin resistance driven by antibiotic pressure in animal feeds (31). A more recent quantitative time-scaled phylogeny, however, indicated that both human MSSA and LA-CC398 emerged in parallel around 1970 (32).
To further delineate the basis of the wide geographic distribution of ST398 MSSA, we investigated the macro- and microevolution of this clone by comparative whole-genome sequencing of epidemiologically linked isolates from social networks in New York City, farms in the midwestern United States, and geographically distinct Caribbean islands. Our analysis provides insights into the recent adaptation and international spread of this highly successful MSSA clone.
DISCUSSION
There is a lack of information on the evolution and spread of dominant MSSA lineages. This is one of the first longitudinal studies to investigate the evolution of the uniquely host-species-adaptable ST398 MSSA clone in the community using a combined social network and genomic approach. Similarly to recent household studies on USA300 MRSA (5, 37), we found that households served as major sites for ST398 MSSA transmission. In addition, by taking advantage of our network enrollment strategy, we were able to demonstrate frequent spread of these isolates within social networks, as evidenced by overlapping SNP distances. We also found evidence for long- and short-distance geographic migration of ST398. First, isolates from infections that occurred during hospitalizations did not differ from colonizing isolates and were embedded within the clade of Northern Manhattan (NM) community isolates. This observation highlights the potential role of common community commensals in hospital-associated S. aureus infections. Second, isolates from France and the French overseas department Martinique were closely related to each other, providing evidence for spread from France to Martinique. French tourists account for the majority (75%) of visitors to Martinique each year (38). Bayesian phylogeographic reconstruction supported the root of Dominican ST398 isolates in NM and suggested subsequent spread to the Caribbean island, which is a frequent travel destination from this community (22). These links may help explain the disparate distribution of this clone in different geographic regions.
We observed a relatively high and overlapping SNP diversity within individuals, environmental surfaces, households, and networks. In our previous study on USA300 MRSA transmission in the same community, the SNP diversity was much lower on individuals and in households. However, the substitution rates observed here for ST398 as well as previously published rates for both clones (5, 32) are comparable. When analyzing longitudinal isolates collected from the same individual, we did not find any evidence for an increased accumulation of SNPs in ST398 isolates over this time period. We also did not detect SNPs in genes associated with hypermutability such as mutL or mutS.
Alternatively, differences in SNP diversity between lineages might be a direct reflection of long-term persistence of ST398-MSSA over years versus short-term colonization with USA300 in community households. This might be attributable to the increased virulence potential of USA300, the predominant CA-MRSA strain and major cause of SSTIs in the United States (39). Alam et al. (37), however, suggested long-term persistence of USA300 in households in Chicago and Los Angeles ranging between 2.3 and 8.4 years. In our study, we did not detect a temporal signal within households due to the relatively large number of SNPs at each sampling time point, thus precluding timing of persistence. While both studies provide evidence for long-term persistence of successful S. aureus clones, it should be noted that these stand in contrast to rapid changes in individual colonization based on spa typing alone (40). The observed high SNP diversity in individuals has direct implications for epidemiological transmission studies, including in the health care setting, when patients become infected with their colonizing bacteria. Differences in SNP diversity between S. aureus clones such as ST398 and USA300 or other bacterial clones may need to be considered in defining linkages during hospital outbreaks.
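To make the notion of within-household SNP diversity concrete, the following is a minimal sketch of how pairwise SNP distances could be summarized per group. It is not the pipeline used in this study: the function names, the household labels, and the toy sequences are all hypothetical, and the sketch assumes isolates have already been aligned to a common reference and contain no gaps or ambiguous bases.

```python
from itertools import combinations
from statistics import median


def snp_distance(seq_a, seq_b):
    """Count positions at which two aligned sequences differ."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(1 for a, b in zip(seq_a, seq_b) if a != b)


def within_group_distances(isolates):
    """Map each group label to the sorted pairwise SNP distances of its isolates.

    `isolates` is a list of (group_label, aligned_sequence) tuples.
    """
    by_group = {}
    for label, seq in isolates:
        by_group.setdefault(label, []).append(seq)
    return {
        label: sorted(snp_distance(a, b) for a, b in combinations(seqs, 2))
        for label, seqs in by_group.items()
    }


# Toy data, invented for illustration only (not from the study).
toy_isolates = [
    ("household_1", "ACGTACGT"),
    ("household_1", "ACGTACGA"),
    ("household_1", "ACGAACGA"),
    ("household_2", "ACGTACGT"),
    ("household_2", "ACGTACGT"),
]

distances = within_group_distances(toy_isolates)
for label, values in distances.items():
    print(label, values, "median:", median(values))
```

Comparing such per-group distributions between clones is one simple way to see whether a single SNP cutoff for defining linkage would behave differently for lineages with different within-host diversity.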
Although we found no evidence of selection from the dN/dS estimates, Tajima’s D neutrality test suggests deviations from neutral evolution in households and networks. Negative Tajima’s D values indicate that the average number of pairwise differences (pi) is smaller than the value expected from the number of segregating sites (theta), consistent with an excess of rare alleles at low frequencies, possibly due to a recent selective sweep or expansion after a bottleneck, such as after transmission and adaptation to new colonization sites. Our study provides preliminary evidence that this might be driven by accumulation of SNPs in specific functional pathways. Despite differences in measuring genetic diversity between our study and that by Alam et al., both investigations observed that pi was smaller than theta, resulting in negative Tajima’s D values.
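For readers less familiar with the statistic, the sketch below computes Tajima's D from a small alignment using the standard textbook formula, D = (pi - S/a1) / sqrt(e1*S + e2*S*(S - 1)), where S is the number of segregating sites and S/a1 is Watterson's theta. It is an illustration under simplifying assumptions (equal-length aligned sequences, no gaps or ambiguous bases, at least four sequences), not the software used in this study; the function name and the toy alignments are hypothetical.

```python
from itertools import combinations
from math import sqrt


def tajimas_d(seqs):
    """Return Tajima's D for aligned sequences, or None if it is undefined."""
    n = len(seqs)
    if n < 4:
        return None  # the variance term is zero for n = 2 and n = 3
    if any(len(s) != len(seqs[0]) for s in seqs):
        raise ValueError("sequences must be aligned to the same length")

    # S: number of alignment columns with more than one allele.
    seg_sites = sum(1 for column in zip(*seqs) if len(set(column)) > 1)
    if seg_sites == 0:
        return None

    # pi: mean number of differences over all sequence pairs.
    n_pairs = n * (n - 1) / 2
    pi = sum(
        sum(1 for a, b in zip(s1, s2) if a != b)
        for s1, s2 in combinations(seqs, 2)
    ) / n_pairs

    # Constants from Tajima (1989).
    a1 = sum(1 / i for i in range(1, n))
    a2 = sum(1 / i ** 2 for i in range(1, n))
    b1 = (n + 1) / (3 * (n - 1))
    b2 = 2 * (n ** 2 + n + 3) / (9 * n * (n - 1))
    c1 = b1 - 1 / a1
    c2 = b2 - (n + 2) / (a1 * n) + a2 / a1 ** 2
    e1 = c1 / a1
    e2 = c2 / (a1 ** 2 + a2)

    theta_w = seg_sites / a1  # Watterson's estimator
    return (pi - theta_w) / sqrt(e1 * seg_sites + e2 * seg_sites * (seg_sites - 1))


# Toy alignments, invented for illustration only.
rare_variants = ["TAAA", "ATAA", "AATA", "AAAT"]   # every variant is a singleton
balanced_variants = ["AA", "AA", "TT", "TT"]       # variants at intermediate frequency

d_rare = tajimas_d(rare_variants)
d_balanced = tajimas_d(balanced_variants)
print("singletons only:", d_rare)
print("intermediate frequency:", d_balanced)
```

In the first toy alignment every variant is carried by a single sequence, so pi falls below Watterson's theta and D is negative; in the second, variants at intermediate frequency push pi above theta and D is positive.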
In a prior study on differences between human MSSA and LA-CC398 MRSA, we observed pseudogenes and variations in genes encoding surface adhesins in LA isolates but not in human samples (24). This further translated into impaired binding of LA-ST398 to human keratinocytes in vitro, whereas both human and LA isolates were able to adhere to porcine keratinocytes. Many of these mutations or insertions and deletions were detected in repetitive elements of these genes. Short-read sequences obtained by Illumina sequencing here preclude a thorough and reliable assessment of these regions.
We detected evidence for a large-scale recombination event within a subset of ST398 MSSA isolates obtained from pigs and swine veterinarians in three Midwestern states. This is at least the second report of recombination in this clonal lineage (31). However, in contrast to large-scale recombinations previously described in ST398 and other S. aureus clonal complexes (CCs), the origin of this ~250-kb region cannot be attributed to a single clonal lineage. This indicates either recombination from a not-yet-identified novel sequence type or mosaic acquisition of several regions from a diversity of CCs. Homologous recombination might provide additional means of adapting to a novel host environment (34, 41).
Several limitations to our study need to be considered. Despite efforts to enroll complete social networks, we were able to recruit only about one-third of named close social contacts, introducing possible selection bias. Follow-up data were limited as individuals had moved or were not willing to participate any longer. Cost precluded a complete assessment of environmental samples. Phylogeographic reconstruction is vulnerable to sampling bias. More comprehensive sampling from other geographic origins may have changed the inferred origin of ST398.
In conclusion, we reconstructed the micro- and macroevolution of ST398-MSSA, a clone that is uniquely successful in diverse regions despite the lack of an array of antimicrobial resistance genes. Phylogeographic reconstruction suggests recent spread from France to Martinique and from NM to the DR, in accordance with cultural links and travel destinations (22), rather than spread within the Caribbean. We provide evidence for an extensive cloud of diversity of ST398-MSSA shared between individuals and the environment in households, extending to social networks, which was distinct from USA300. Recognizing these differences between clonal lineages has important implications for outbreak investigations, including in the clinical setting.