INTRODUCTION
In the past two decades,
Enterococcus faecium has become recognized as an important nosocomial pathogen. Up to the 1980s, the large majority of enterococcal hospital-associated infections (HAI) were caused by
Enterococcus faecalis, but since the beginning of the 1990s, the proportion of HAI caused by
E. faecium has increased and has now almost reached parity with that of
E. faecalis (
1). One proposed reason for this changing epidemiology is that, in comparison with
E. faecalis,
E. faecium shows relatively high rates of resistance to important antibiotics, including ampicillin and vancomycin. In addition, studies of the population biology of
E. faecium using multilocus sequence typing (MLST) data have revealed the existence of a distinct genetic subpopulation associated with nosocomial epidemics. This subpopulation was initially designated lineage C1 (
2) and was later renamed clonal complex 17 (CC17) on the basis of eBURST (
3) analysis of MLST data (
4,
5).
CC17 has been recognized as a successful hospital-associated
E. faecium (HA
E. faecium) clonal complex, exhibiting high-level vancomycin, ampicillin, and quinolone resistance, although in most European countries CC17 remained primarily vancomycin susceptible (
4, 6, 7, 8, 9, 10, 11, 12,
13). However, in addition to this distinct resistance profile, genome-wide analyses have shown that HA
E. faecium strains have a genetic repertoire distinct from that of
E. faecium strains that asymptomatically colonize the gastrointestinal (GI) tract of humans and animals in the community (
14,
15). This distinct genetic repertoire includes cell surface proteins, of which the enterococcal surface protein, Esp, is a known virulence determinant (
8, 10, 16, 17, 18, 19, 20, 21, 22,
23); genomic islands tentatively encoding novel metabolic pathways (
24); and insertion sequence elements (
14). It is now considered that these determinants may be adaptive elements that have improved the relative fitness of this HA
E. faecium subpopulation in the hospital environment (
5, 9, 25, 26).
Despite the clear importance of CC17 as the main genetic subpopulation that includes hospital isolates, its precise phylogenetic status remains uncertain. Analyses of the population structure of
E. faecium indicate that this species undergoes a high rate of homologous recombination (
4,
27). High recombination rates can lead to errors in phylogenetic analyses, especially if only a small portion of the genome is interrogated. In the case of the eBURST approach used to define CC17, a consequence of frequent recombination is the spurious grouping of diverse and distinct lineages into a single clonal complex. It has previously been suggested (
28) that phylogenetic reconstructions of
E. faecium are vulnerable to such errors. Analyses using alternative methods to eBURST suggest this may be the case, with the constituent lineages of CC17 (sequence types [STs] 17, 18, 78, and 80) representing distinct genetic lineages whose relationships to one another cannot be confidently assigned (
5,
26,
27). Together with the observation that whole genome sequences of two CC17 isolates (E1162 [ST17] and U0317 [ST78]) differ substantially in gene content (
15), this strongly indicates that HA
E. faecium may not have evolved from a single founder (i.e., ST17) and that, consequently, CC17 as presently identified may not exist but may instead be an artifact of the assumptions embedded within the eBURST algorithm.
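The chaining problem described above can be illustrated with a minimal Python sketch. It is not the eBURST implementation: it only groups sequence types by single-linkage chaining of single-locus variants (profiles that differ at exactly one locus), which is the kind of group definition commonly used with eBURST. The seven-locus allelic profiles and the function names `locus_differences` and `slv_groups` are hypothetical and chosen purely for illustration.

```python
from itertools import combinations


def locus_differences(a, b):
    """Count the loci at which two allelic profiles differ."""
    return sum(x != y for x, y in zip(a, b))


def slv_groups(profiles):
    """Group STs by single-linkage chaining of single-locus variants (SLVs).

    Simplified illustration only; this is not the eBURST software.
    """
    parent = {st: st for st in profiles}

    def find(st):
        # Follow parent pointers to the group representative.
        while parent[st] != st:
            parent[st] = parent[parent[st]]
            st = parent[st]
        return st

    for a, b in combinations(profiles, 2):
        if locus_differences(profiles[a], profiles[b]) == 1:
            parent[find(a)] = find(b)  # merge the two groups

    groups = {}
    for st in profiles:
        groups.setdefault(find(st), set()).add(st)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical allelic profiles (made-up data, not real E. faecium STs).
profiles = {
    "A": (1, 1, 1, 1, 1, 1, 1),
    "B": (1, 1, 1, 1, 1, 1, 2),  # SLV of A
    "C": (1, 1, 1, 1, 1, 3, 2),  # SLV of B
    "D": (1, 1, 1, 1, 4, 3, 2),  # SLV of C
    "X": (9, 9, 9, 9, 9, 9, 9),  # unrelated profile
}

print(slv_groups(profiles))  # [['A', 'B', 'C', 'D'], ['X']]
print(locus_differences(profiles["A"], profiles["D"]))  # 3
```

In this toy data set, A and D differ at three of seven loci yet fall into one group because each intermediate step is a single-locus change. If such single-allele changes arise by recombination rather than by descent, the resulting group need not reflect common ancestry.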
An alternative approach to eBURST is an analysis of genetic population structure, with the power to combine the identification of deep branching lineages and recombination between them. This can be done using Bayesian Analysis of Population Structure (BAPS) software (
29,
30,
31). Unlike approaches such as eBURST (
3), BAPS does not attempt to retrieve phylogenetic information or implement a phylogenetic model of clustering but rather uses a statistical genetic model to partition molecular variation based on both clonal ancestry and recombination patterns as identified from DNA sequence data. This approach has recently been used to probe genetic population structure in
Streptococcus pneumoniae (
32),
Escherichia coli (
33),
Campylobacter coli,
Campylobacter jejuni (
34,
35), and
Neisseria spp. (
36) and has been shown to be able to detect structure even in highly recombinogenic populations. Here, we used BAPS to identify groups of related
E. faecium strains and show a significant association of hospital and farm animal isolates with different BAPS groups. We suggest that hospital-associated lineages contained in different BAPS groups have nevertheless acquired similar adaptive elements.
DISCUSSION
Multiresistant
E. faecium has become one of the most important nosocomial pathogens. Emergence of multiresistant
E. faecium strains is a problem at multiple levels. From a clinical perspective, they are among the most resistant opportunistic nosocomial pathogens, with an increasing impact on patients receiving health care. Moreover, in terms of the emergence of resistance, their considerable capacity for genetic exchange, linked to high recombination rates, makes
E. faecium an ideal hub for resistance genes, facilitating horizontal gene transfer among bacterial species. A high relative recombination rate means that eBURST, a popular clustering algorithm for MLST data, cannot reliably delineate the patterns of recent evolutionary descent of
E. faecium (
28). Here, we have used BAPS software to probe the genetic structure and evolution of
E. faecium. The BAPS-based partition revealed a nonrandom distribution of animal isolates among BAPS populations and a significant association of isolates derived from hospitalized patients with specific groups that are negatively associated with isolates from farm animals. This is consistent with previous findings that demonstrated host specificity and clustering of animal and human community isolates separately from clinical isolates, based on amplified fragment length polymorphism (AFLP) profiles (
38) and on comparative genomic hybridizations using an
E. faecium mixed whole-genome array (
14). The observation is also consistent with previous analysis of MLST data at the level of individual STs, confirming that the majority of hospital outbreak and infectious isolates are genotypically distinct from the majority of human commensal and animal isolates. Consequently, antibiotic-resistant clones originating from farm animals appear not to be responsible for the emergence of antibiotic-resistant
E. faecium in hospitalized patients (
27). However, the previous finding of indistinguishable vancomycin resistance transposons in pigs, nonhospitalized persons, and hospitalized patients indicates that while animal-derived
E. faecium clones containing antibiotic resistance genes may not be responsible for infections in hospitalized patients, the resistance genes themselves may be laterally transferred from animal isolates to human clinical isolates (
16).
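As an illustration of how an association between isolate source and BAPS group can be tested, the sketch below applies a one-sided Fisher's exact test to a 2x2 table. This is an assumption made for illustration: the statistical test actually used in this study is not described in this section, and the counts are made up. The function name `fisher_exact_greater` is likewise hypothetical.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Rows: isolates inside / outside the BAPS group of interest.
    Columns: hospital / non-hospital isolates.
    Returns the probability of observing at least `a` hospital isolates
    in the group under the hypergeometric null with fixed margins.
    """
    row1 = a + b          # isolates in the group of interest
    col1 = a + c          # hospital isolates overall
    n = a + b + c + d     # all isolates
    upper = min(row1, col1)
    favorable = sum(
        comb(col1, k) * comb(n - col1, row1 - k) for k in range(a, upper + 1)
    )
    return favorable / comb(n, row1)


# Hypothetical counts (not data from this study):
# 30 hospital and 5 non-hospital isolates in the group of interest,
# 20 hospital and 45 non-hospital isolates in all other groups.
p_value = fisher_exact_greater(30, 5, 20, 45)
print(f"one-sided p = {p_value:.3g}")
```

A small p value here would indicate that hospital isolates are overrepresented in the group relative to the rest of the collection. When several groups are tested, a correction for multiple comparisons would normally be applied.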
Three
E. faecium STs, ST17, ST18, and ST78, and STs representing their recent evolutionary descendants are significantly enriched among clinical and outbreak-associated isolates from hospitalized patients and represent the major subgroup founders of the previously designated CC17, a presumed hospital-derived subpopulation of
E. faecium that has spread globally (
4, 7, 8, 39,
40). BAPS, however, resolved CC17 into two different subgroups, BAPS 2-1 and BAPS 3-3, with ST78 and descendants belonging to BAPS 2-1, separated from ST17 and ST18, both belonging to BAPS 3-3. This is consistent with the suggestion that CC17, as a monophyletic entity containing the majority of hospital isolates, is probably an artifact of documented problems with eBURST-based clustering (
28), which have led to erroneous linkage of the three main hospital lineages (lineages 17, 18, and 78) into CC17.
Two recent comparative genomic studies of
E. faecium, which included 8 to 21 isolates for which draft whole-genome sequences were available, identified a deep phylogenetic split between two
E. faecium clades that were designated clade A and clade B (using the terminology of Palmer and coworkers [41]) or commensal (CA) and hospital (HA) clades (using the terminology of Galloway-Peña and coworkers [42]). We found, using essentially the same genome sequence data, the same ancient split with isolates belonging to BAPS 1 clustering in a separate clade (B or CA), while those belonging to all the other BAPS groups clustered in clade A (or HA). Also, the topology of the phylogenetic tree described in a recent publication by Lam and coworkers (
43) that includes the first complete genome sequence of an
E. faecium strain is highly similar to the subtree we show in
Fig. 2, representing only the upper 22 non-BAPS 1 strains. Our study, however, revealed that BAPS 1 does not consist solely of community isolates: 43% of BAPS 1 isolates were recovered from hospitalized patients, indicating that clade B (or the CA clade) is not representative only of community isolates. Similarly, the other BAPS groups include not only hospital isolates but also the vast majority of farm and pet animal isolates, which indicates that clade A (or the HA clade) is not exclusively representative of hospital isolates. More importantly, our BAPS analysis demonstrated that hospital isolates belong to different evolutionary clusters and thus do not share a recent common ancestor, with "recent" here meaning since the establishment of modern hospitals and the dawn of the antibiotic era. The diversification of hospital-associated clusters happened relatively recently compared to the deep split between the two clades mentioned above, because the appearance of large-scale hospitals and the use of antibiotics are very recent events in evolutionary time. Using whole genome sequence information of 21 human
E. faecium isolates, the split between the CA clade and HA clade was calculated to have occurred between 300,000 and 3 million years ago, while strains in the HA clade were estimated to have diverged from each other ~100,000 to 300,000 years ago (
42). However, with a poor sampling of strains (a relatively limited number of strains, which were all human derived) and a relatively simple analysis taking into account mutation rates of
Escherichia coli and
Bacillus anthracis, it is nearly impossible to draw reliable conclusions about divergence times for a highly recombinogenic organism like
E. faecium. One would need to carefully purge the data and be confident that almost all remaining variation is due to mutation. Even then, it is not easy to calibrate the clock and estimate divergence times, since mutation and recombination frequencies may differ between bacterial populations that reside under different selective pressures.
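The sensitivity of such estimates can be shown with a minimal sketch of a strict molecular clock, in which the divergence time is T = d / (2 * mu), where d is the number of substitutions per site separating two lineages and mu is the substitution rate per site per year. This is a textbook simplification, not the calculation performed in reference 42; all numerical values and the `recombinant_fraction` parameter are hypothetical.

```python
def divergence_time_years(subs_per_site, rate_per_site_per_year,
                          recombinant_fraction=0.0):
    """Strict-clock divergence time estimate, T = d / (2 * mu).

    `recombinant_fraction` is a hypothetical parameter: the share of the
    observed differences assumed to have been imported by recombination
    rather than accumulated by mutation.
    """
    if rate_per_site_per_year <= 0:
        raise ValueError("rate must be positive")
    if not 0.0 <= recombinant_fraction <= 1.0:
        raise ValueError("recombinant_fraction must lie between 0 and 1")
    mutational = subs_per_site * (1.0 - recombinant_fraction)
    # Both lineages accumulate changes since the split, hence the factor 2.
    return mutational / (2.0 * rate_per_site_per_year)


# Hypothetical values for illustration only.
d = 0.01  # substitutions per site between two lineages
for mu in (1e-9, 1e-8):  # assumed rates borrowed from other species
    for frac in (0.0, 0.5):
        t = divergence_time_years(d, mu, frac)
        print(f"mu={mu:g}, recombinant fraction={frac}: T = {t:,.0f} years")
```

A tenfold difference in the assumed rate changes the estimate tenfold, and any unrecognized recombinant variation inflates it further, which is why both the calibration and the purging of recombinant sites matter.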
The finding that ST78 is part of a different BAPS group (BAPS 2-1) than ST17 and ST18 (BAPS 3-3) suggests a distinct evolutionary history for hospital lineage 78. This is supported by the neighbor-joining tree based on concatenated MLST gene sequences of BAPS 2 and BAPS 3 (sub)groups (see
Fig. S3), as well as a ClonalFrame analysis of the population presented in reference 5. While STs 17 and 18 (and descendant STs) are grouped in BAPS 3-3, which is significantly associated with hospital-derived isolates, lineage 78 coclusters with the majority of animal, specifically poultry, isolates in BAPS 2-1. Based on this observation, we hypothesize that hospital clones belonging to lineage 78 may have had an animal origin (poultry or pet animals), since poultry isolates constitute the largest proportion of animal isolates in BAPS 2-1 and because lineage 78 was also significantly associated with STs from pet animals (data not shown). In fact, it is not implausible that the hospital-associated lineages 17, 18, and 78 all arose through a connection to animals. BAPS 3-2, which is also significantly associated with animals and is evolutionarily closely related to BAPS 3-3 (which includes lineages 17 and 18), contains a relatively high proportion of pig isolates (29% of all isolates in this BAPS group).
The observed coclustering, based on gene content, of
E. faecium hospital isolates belonging to lineages 17, 18, and 78 (
14) indicates cumulative acquisition of adaptive elements, such as ampicillin resistance and the
esp virulence gene, by specific genotypes multiple times during the evolution of
E. faecium. Future whole-genome-based phylogenomics analysis will provide more insights into the evolutionary history and gene content of isolates belonging to lineages 17, 18, and 78 and the order in which particular adaptive loci and phenotypes, such as ampicillin resistance and
esp, were acquired. If such repeated acquisition is confirmed, this suggests that the continuous rise of nosocomial
E. faecium infections is not the result of clonal expansion of a single successful clone or lineage that emerged in hospitals 20 years ago but of consecutive waves of different clones/lineages that have evolved and were subsequently selected in hospitals.
Despite high estimated levels of recombination in
E. faecium (
4,
27), admixture and gene flow analysis indicated limited amounts of admixture between BAPS groups. Of the three largest BAPS groups (
1, 2, and 3),
E. faecium isolates in BAPS 3 show higher levels of admixture, with mosaic genotypes concentrated among isolates from nonhospitalized persons and pigs (see
Table S4 in the supplemental material). This may reflect an increased ability to accept foreign DNA or greater ecological opportunity for recombination. It remains to be investigated which mechanisms are responsible for the observed higher admixture level in BAPS 3. Recently, Manson and coworkers described chromosome-chromosome transfer of resistance and virulence genes as well as MLST markers between
E. faecalis strains (
44). This indicates that plasmid-mediated mobilization of chromosomal DNA contributes to MLST diversity in
E. faecalis (
44), and similar mechanisms may well exist in
E. faecium. Hospital isolates, whether contained in BAPS 2 or BAPS 3, display only low levels of admixture, which may point to genetic isolation of hospital-derived
E. faecium.
In conclusion, BAPS analysis provided new insights into the population structure of
E. faecium, suggesting that CC17 should be divided into constituent groups descending from STs 17, 18, and 78. This analysis, as well as previous typing data, indicates a certain level of host specificity and suggests ecological isolation for some
E. faecium populations. For the hospital population, we propose a model of enterococcal evolution in which strains with high invasive potential arise through horizontal gene transfer, but once adapted to the distinct pathogenic niche the population becomes isolated and recombination with other populations declines. This corroborates previous observations that hospital isolates carry a number of resistance and putative virulence genes not found among community/animal isolates. Analysis of the composition of the
E. faecium hospital population over time from literature references suggests successive waves of successful
E. faecium STs from lineages 17 and 18 in the years 1990 to 2004 (
8), followed by lineage 78 from 2005 onward (
7, 39, 45, 46, 47, 48,
49). The recently successful hospital lineage 78 (BAPS 2-1) seems to have an evolutionary history distinct from that of lineages 17 and 18 (BAPS 3-3), which dominated in the 1990s, and may have evolved from isolates of farm animals, most probably poultry, or pet animals. The emergence of
E. faecium as a leading nosocomial pathogen has paralleled the emergence of these three genetically distinct hospital lineages with increased potential of hospital spread. These lineages are enriched in proven and putative virulence genes, like the
esp gene and other genes encoding surface proteins and surface appendages like pili (
20) that have enabled specific hospital-adapted clones belonging to these lineages to colonize and invade hospitalized patients. The finding that successful hospital-adapted
E. faecium strains may evolve from different genetic backgrounds, including those that prevail in animal reservoirs, has consequences for the potential flow of genes conferring resistance or virulence through the
E. faecium population contained in various human and nonhuman ecological niches. Improved understanding of population structure can assist effective control by defining those parts of the population most associated with particular settings, such as health care or agriculture. The finding of distinct health care and agricultural populations of
E. faecium will also facilitate future research aimed at identifying genetic differences between these populations. This will improve our understanding of the pathophysiological processes that have led to adaptation of the three major hospital lineages to the hospital environment. Increased insight into genes or genetic elements implicated in hospital adaptation may lead to the identification of novel targets for antibiotics and immunotherapy to combat
E. faecium infections.