INTRODUCTION
Enterococcus faecalis is a versatile generalist commensal bacterium that colonizes the gastrointestinal tract and other niches in humans and animals and survives in the environment, including nosocomial settings (
1).
E. faecalis is a subdominant core member of the human gut microbiota, usually acquired early after birth, and its origin dates to the Paleozoic era ~400 to 500 million years ago (
2). Although
E. faecalis predominantly exhibits a commensal lifestyle, it is a conditional or opportunistic pathogen (
3,
4). It causes life-threatening opportunistic infections, including bacteremia, endocarditis, intra-abdominal infection, pneumonia, and meningitis, which are typically associated with high mortality (
5,
6). Since the 1970s,
E. faecalis has emerged as a leading cause of community-acquired and nosocomial infections, most of which have become increasingly difficult to treat due to intrinsic and acquired antibiotic resistance, making it a major threat to public health globally (
4,
6 – 9). Such increasing antibiotic resistance has reignited calls to develop enterococcal vaccines.
The commensal-to-pathogenic switch of
E. faecalis is marked by its overgrowth in the gut and subsequent translocation into the bloodstream via the intestinal epithelium (
10). Such extraintestinal translocation can lead to bacteremia, infective endocarditis, and infections in other tissues distal to the intestines. However, the specific mechanisms driving
E. faecalis bloodstream invasion, survival, and virulence are still being uncovered (
3,
5,
11,
12). Observational studies have shown that antibiotics, such as cephalosporins, promote overgrowth and extraintestinal translocation of
E. faecalis into the bloodstream (
13,
14), an observation supported by
in vivo murine experimental models (
14 – 16). Such overgrowth of
E. faecalis reflects the ecological side effects of broad-spectrum antibiotics in driving dysbiosis of the gut microbiota, a phenomenon similarly observed with
Clostridioides difficile (formerly known as
Clostridium difficile) (
17,
18).
E. faecalis also harbors a diverse arsenal of putative virulence factors (
19 – 21), which foster its adaptation and survival in the harsh clinical and midgut environments and potentially promote extraintestinal translocation into the bloodstream. These virulence factors appear to be enriched in the dominant epidemic
E. faecalis lineages (
22,
23), highlighting their importance to the success of these clones. For example, the gelatinase (
gelE) gene encodes a metalloprotease exoenzyme commonly associated with epidemic clones (
22) and is important for infective endocarditis (
24) and extraintestinal translocation into the bloodstream (
25). Other virulence factors, namely, hemolysin and enterococcal surface protein, are also important for virulence in endocarditis (
26) and biofilm formation (
27), respectively, although the role of the former in intestinal colonization and translocation has been questioned (
28,
29). Acquisition of extrachromosomal elements, including pathogenicity islands (
30,
31) and plasmids (
32), has also been associated with virulence and survival in nosocomial settings (
33). Understanding the distribution of these known and novel
E. faecalis virulence factors in strains sampled from different tissues and individuals with contrasting pathogenicity could potentially reveal mechanisms for enterococcal pathogenicity and uncover therapeutic targets.
Remarkable advances in whole-genome sequencing and computational biology have revolutionized population genomics since the sequencing of the first enterococcal genome (
34). To date, the feasibility of large-scale whole-genome sequencing and analysis has facilitated detailed population-level studies to uncover the genetic basis of bacterial phenotypes (
35). For example, the application of genome-wide association studies (GWAS) to bacteria has revealed genetic variants associated with diverse phenotypes, including antimicrobial resistance (
36), host adaptation (
37), and pathogenicity (
38). A key feature of the GWAS approach is that it can identify novel genetic variants associated with phenotypes through systematic genome-wide screening, which does not bias the analysis toward “favorite” genes and mutations commonly studied in different laboratories. Although previous studies have attempted to compare the genetic and phenotypic differences between
E. faecalis isolates causing intestinal colonization and invasive disease (
39), clinical and nonclinical strains (
40), and isolates of diverse origins (
41), these studies were limited by the small sample sizes and use of low-resolution molecular typing methods such as pulsed-field gel electrophoresis. Recent studies of
E. faecalis and
Enterococcus faecium species identified unique mutations associated with outbreak strains, highlighting the potential effects of specific genetic changes on pathogenicity (
12,
42). Despite the increasing affordability of population-scale microbial sequencing, the genetic basis of
E. faecalis pathogenicity, i.e., infection in individuals with different hospitalization statuses, and of extraintestinal infection, including that due to translocation from the gut, remains poorly understood. The application of GWAS approaches to discover the genetic changes driving the pathogenicity and virulence of
E. faecalis could expedite antibiotic and vaccine development.
Here, we leveraged a collection of 736 whole-genome sequenced
E. faecalis isolates sampled from the feces and blood specimens of hospitalized and nonhospitalized individuals (
43). We undertook a GWAS of the isolates to investigate whether specific genomic variations, including single-nucleotide polymorphisms (SNPs) and insertions/deletions, were associated with infection by hospitalization status and body isolation source. We show a predominantly higher abundance of virulence factors and antibiotic resistance determinants in
E. faecalis isolates from hospitalized than from nonhospitalized individuals, as well as in isolates from blood than from feces. This largely reflects the effects of the genetic background or lineages, as no specific individual genetic changes showed population-wide effects on the infection of individuals by hospitalization status or isolation source. Additionally, we found that infection by hospitalization status and extraintestinal infection are heritable traits partially explained by
E. faecalis genetics. Altogether, our findings provide evidence suggesting that the collective effects of several genetic variants, genetic background or lineages, and gut ecological factors drive the pathogenicity and extraintestinal infection of
E. faecalis rather than the population-wide effects of individual bacterial genetic changes. These findings have broader implications for
E. faecalis disease prevention strategies, specifically the need to target all genetic backgrounds when designing vaccines to achieve optimal protection against severe enterococcal invasive diseases.
DISCUSSION
Tremendous advances in sequencing technology and analytical approaches have occurred over the past two decades since the sequencing of the first enterococcal genome, that of
E. faecalis strain V583 (
34). However, despite the increasing availability of population-level
E. faecalis genomic data sets, no systematic studies have investigated the population-wide effects of individual genetic changes on infection in individuals with varying hospitalization status and extraintestinal infection and the overall contribution of
E. faecalis genetics to these phenotypes (
5). Such studies could reveal critical pathways for
E. faecalis virulence, including survival in the bloodstream through evasion of innate host immune defenses, and inform the development of therapeutics (
12). Here, we address this knowledge gap by investigating the effects of known and novel virulence factors, lineages, and the entire repertoire of
E. faecalis genomic changes in a large collection of human fecal isolates, representing a snapshot of the
E. faecalis diversity in the gut, and isolates sampled from blood specimens of individuals with different hospitalization statuses. Our findings demonstrate that the abundance of certain virulence and antibiotic resistance determinants is higher in
E. faecalis isolates associated with severe disease and extraintestinal infection, largely driven by the effects of strains, lineages, or genetic background rather than the population-wide effects of individual genetic changes.
E. faecalis is a versatile pathogen that survives in a wide range of challenging niches, such as the human gut, blood, and the environment, including clinical settings. Such adaptation and survival of
E. faecalis in these diverse environments are modulated by several mechanisms, including antimicrobial resistance (
52), intracellular survival (
53 – 56), and biofilm formation (
27). Although several virulence factors of
E. faecalis have been described (
24 – 27), whether and how these factors contribute to infection of individuals with varying hospitalization status and to extraintestinal infection, especially through gut-to-bloodstream translocation, remains poorly understood. Previous genetic studies shed light on how the distribution of virulence factors shapes the adaptation of
E. faecalis clones to different environments despite the limitation of small sample sizes (
39,
41). In this study, we demonstrate enrichment of known virulence genes in isolates associated with different hospitalization statuses using a larger collection of isolates. These include genes encoding aggregation substance adherence factors (EF0485 and EF0149) (
32); lantipeptide cytolysin subunits CylL-L and CylL-S (
cylL-l and
cylL-s), cytolysin subunit modifier (
cylM), and cytolysin regulator
R2 (
cylR2) exotoxins (
57); and polysaccharide capsule biosynthesis genes (
cpsC to
cpsK) involved in immune modulation or antiphagocytosis (
58). These findings suggest that the variable abundance of these virulence genes in hospitalized and nonhospitalized individuals could influence
E. faecalis pathogenicity, possibly because they primarily contribute to intestinal colonization, survival, and fitness or competitiveness in different intestinal compartments in the dysbiotic gut of hospitalized patients. Once the strains harboring these genes are established in higher numbers in the gastrointestinal tract, this promotes transmission, which in turn promotes the evolution and fixation of these virulence genes in the population. Interestingly, the observed higher antibiotic resistance, especially to aminoglycosides, in isolates from blood and hospitalized individuals than from feces and nonhospitalized individuals suggests that antibiotic-resistant
E. faecalis strains are more likely to survive and overgrow after the use of these antibiotics, consistent with findings reported elsewhere (
14 – 16,
39,
59,
60). Conversely, while a differential distribution of virulence factors and clades, or sequence types (STs), was observed, the GWAS of
E. faecalis pathogenicity, after adjusting for the genetic background of the isolates, implied that no individual genetic changes influence the severity of diseases at the population level. These findings are consistent with the notion that genetic traits influencing virulence are less likely to be selected than those promoting colonization as similarly seen in other pathogens (
61). Altogether, these findings suggest that the distribution of the
E. faecalis virulence factors may largely depend on the genetic background, implying that the lineage effects on pathogenicity may be more pronounced than the population-wide effects of individual genetic changes. Alternatively, there may be a predominance of certain lineages in some individuals, as seen with other opportunistic pathogens (
62), whose risk factors for infection, including hospital exposure history, antibiotic treatment, and other underlying conditions, make them favorable for the selection of
E. faecalis strains enriched in antibiotic resistance genes and other adaptive traits.
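The distinction drawn above between a crude enrichment and an association that survives adjustment for genetic background can be illustrated with a small sketch. The counts below are invented purely for illustration and are not data from this study, and the Mantel-Haenszel estimator is used here only as a simple stand-in for lineage adjustment; it is not the method used in the GWAS. The names `LineageTable`, `crude_odds_ratio`, and `mantel_haenszel_odds_ratio` are hypothetical.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class LineageTable:
    """2x2 counts of gene carriage by hospitalization status within one lineage."""
    present_hosp: int      # gene present, hospitalized
    present_nonhosp: int   # gene present, nonhospitalized
    absent_hosp: int       # gene absent, hospitalized
    absent_nonhosp: int    # gene absent, nonhospitalized

    @property
    def total(self) -> int:
        return (self.present_hosp + self.present_nonhosp
                + self.absent_hosp + self.absent_nonhosp)


def crude_odds_ratio(tables: List[LineageTable]) -> float:
    """Odds ratio after pooling all isolates, ignoring lineage."""
    a = sum(t.present_hosp for t in tables)
    b = sum(t.present_nonhosp for t in tables)
    c = sum(t.absent_hosp for t in tables)
    d = sum(t.absent_nonhosp for t in tables)
    return (a * d) / (b * c)


def mantel_haenszel_odds_ratio(tables: List[LineageTable]) -> float:
    """Lineage-stratified (Mantel-Haenszel) pooled odds ratio."""
    num = sum(t.present_hosp * t.absent_nonhosp / t.total for t in tables)
    den = sum(t.present_nonhosp * t.absent_hosp / t.total for t in tables)
    return num / den


# Hypothetical counts: lineage A is hospital-associated and mostly carries the
# gene; lineage B is community-associated and mostly lacks it. Within each
# lineage, carriage is unrelated to hospitalization status.
toy_tables = [
    LineageTable(present_hosp=80, present_nonhosp=20, absent_hosp=20, absent_nonhosp=5),
    LineageTable(present_hosp=5, present_nonhosp=20, absent_hosp=20, absent_nonhosp=80),
]

crude = crude_odds_ratio(toy_tables)
adjusted = mantel_haenszel_odds_ratio(toy_tables)
print(f"crude OR = {crude:.2f}, lineage-adjusted OR = {adjusted:.2f}")
```

In this toy example the gene looks strongly enriched among hospitalized individuals when isolates are pooled, yet the enrichment vanishes once isolates are compared within lineages, which is the pattern expected when the signal is carried by the genetic background rather than by the gene itself.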
Likewise, the distribution of known
E. faecalis virulence factors by isolation source mirrored the patterns observed for infection in individuals with varying hospitalization status due to the correlation between these phenotypes. These findings suggested that no individual genetic changes are overrepresented in blood and gut niches independent of the genetic background, which implied that while individual genetic changes may have an impact on extraintestinal infection, their effect at the population level is likely minimal. However, some genetic changes could be linked to specific lineages, making disentangling their effects from the genetic background a challenge. Nevertheless, the absence of genetic changes statistically associated with the body isolation source, after adjusting for the population structure, suggests that these variants are not likely under positive selection because extraintestinal infection represents an evolutionary dead-end for
E. faecalis (
63). Therefore, even if such genetic changes exist, they may be rare and likely exhibit small effect sizes, making their detection challenging without analyzing large data sets with thousands of genomes. We speculate that the observed strong but not statistically significant signals in a single prophage, integrated at chromosome coordinates 1,398,051 to 1,446,151 bp in the V583
E. faecalis genome (
34), could exemplify a potential locus with small population-wide effects on virulence. Indeed, prophages play a critical role in the pathogenicity of
E. faecalis (
64 – 67) and other bacterial pathogens, such as
Staphylococcus aureus (
37,
68). Therefore, further studies using even larger genomic data sets than the present study and adjusting for other important covariates, such as prior antibiotic usage and immune status, are required to fully investigate the impact of the identified
E. faecalis prophage in modulating extraintestinal infection. Crucially, such studies should collect samples prospectively to minimize confounding due to cohort and temporal variability between cases and controls, which was one of the limitations of this study. Furthermore, definitive
E. faecalis genetic signals for extraintestinal infection may be identified by comparing isolates obtained from the blood of patients with isolates from the feces of control individuals with confirmed negative blood cultures. Inclusion of
E. faecalis strains from community-acquired infections could also overcome the confounding effects due to factors related to hospitalization; examples include
E. faecalis from individuals with community-acquired bacteremia, who are at a higher risk of developing infective endocarditis (
69). Altogether, our findings demonstrate that no individual
E. faecalis genetic changes exhibit a population-wide statistical association with extraintestinal infection, implying that all
E. faecalis strains are capable of translocating into the bloodstream and causing severe diseases, consistent with their known opportunistic pathogenic lifestyle. Although
E. faecalis genetic changes that are important for survival in the blood may exist, these would not be fixed in the population, especially if they had no impact on colonization, as individual strains would have to accidentally “re-discover” them repeatedly. Therefore, vaccination strategies targeting all rather than specific genetic backgrounds would lead to increased protection from severe
E. faecalis diseases.
The estimated heritability based on unitig sequence variation of ~40% for infection in individuals with different hospitalization statuses and ~30% for body isolation source suggests that the contribution of
E. faecalis genetics to these phenotypes is not negligible but relatively modest compared to that observed for other phenotypes, such as antimicrobial resistance (
70). Our findings are consistent with those from recent bacterial GWAS of pathogenicity in
Streptococcus pneumoniae (
71) and Group B Streptococcus (
Streptococcus agalactiae) (
72). However, other studies have found negligible heritability for pathogenicity in
Neisseria meningitidis (
61), which suggests that the evolution of the pathogenicity trait is neutral. Previous studies have suggested that antibiotic resistance plays a major role in bloodstream invasion (
14 – 16,
59,
60). Indeed, broad-spectrum antibiotic use disrupts the stable gut microbial community by removing typically antibiotic-susceptible competitor species, leading to the overgrowth and dissemination of
E. faecalis into the bloodstream (
59,
60). Therefore, follow-up studies of
E. faecalis isolates sampled from feces of healthy individuals and bloodstream of patients, adjusting for other important variables such as antibiotic use, are required to determine specific genetic changes modulating pathogenicity and virulence and account for potential missing heritability. These studies will be better placed to assess the relative effect of host and gut environmental factors, such as microbiota perturbations due to antibiotic use, compared to the population-wide impact of individual genetic changes in modulating
E. faecalis virulence and pathogenicity (
73).
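As a reminder of what the heritability estimates above quantify, one common formulation in bacterial GWAS is a linear mixed model in which the phenotype is split into fixed effects, a genetic random effect, and residual noise. The sketch below is generic; the symbols (y, X, β, u, ε, K, σ_g², σ_e²) are assumptions introduced for illustration and are not notation taken from this study, where K would stand for a kinship or similarity matrix computed from unitig presence/absence.

```latex
\begin{align}
  \mathbf{y} &= X\boldsymbol{\beta} + \mathbf{u} + \boldsymbol{\varepsilon},
  \qquad
  \mathbf{u} \sim \mathcal{N}\!\left(\mathbf{0},\, \sigma_g^{2} K\right),
  \quad
  \boldsymbol{\varepsilon} \sim \mathcal{N}\!\left(\mathbf{0},\, \sigma_e^{2} I\right), \\
  h^{2} &= \frac{\sigma_g^{2}}{\sigma_g^{2} + \sigma_e^{2}}.
\end{align}
```

Under this reading, an estimate of h² ≈ 0.4 would mean that roughly 40% of the phenotypic variance is attributed to the bacterial genetic variation captured by K, with the remainder attributed to unmodeled host, environmental, and sampling factors.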
We acknowledge the limitations of this study, which primarily stem from the sampling biases due to the use of preexisting sequencing data sets. Firstly, there was an uneven distribution of blood and fecal isolates from hospitalized and nonhospitalized individuals. Secondly, due to the retrospective nature of the study, we did not have access to detailed clinical information, including comorbidities, previous antibiotic use, and the individual’s age. Adjusting for these factors would further strengthen our findings. Thirdly, our sample size is modest as it is based on a collection of
E. faecalis isolates from only two countries in Europe. However, our data set size is similar to or larger than those described in previous studies (
68,
74), which demonstrated sufficient power to detect statistically significant associations between specific individual loci and phenotypes. We recommend follow-up studies with larger sample sizes, balanced data sets by hospitalization status and body isolation source, and most importantly, including detailed clinical information, especially antibiotic use, comorbidities, and an individual’s age, to adjust for potential confounding effects in the GWAS analysis.
Our exploratory findings derived from a geographically and temporally diverse whole-genome data set of
E. faecalis isolates suggest that the pathogenicity of
E. faecalis infections may not be primarily driven by the specific population-wide effects of individual genetic changes. These results may further illustrate the opportunistic pathogenic lifestyle of
E. faecalis, whereby infection of individuals with different hospitalization statuses and body isolation sources could be an accidental consequence of gut colonization dynamics as seen in other gut commensals (
63). Due to the absence of specific individual genetic variants associated with body isolation source and hospitalization status, ultimately, the commensal-to-pathogen switch and virulence of
E. faecalis may be predominantly modulated by multiple genetic variants (i.e., polygenic effects), genetic background or lineages, epigenetic mechanisms, host factors, and the gut milieu, including the ecological side effects of broad-spectrum antibiotics on the gastrointestinal microbiota.
ACKNOWLEDGMENTS
The authors would like to thank the study participants and guardians, the clinical and laboratory staff who collected and processed the samples at various laboratories in the Netherlands, and the sequencing, core, and pathogen teams at the Wellcome Sanger Institute for their support.
A.K.P., S.A.-A., and J.C. were funded by the Trond Mohn Foundation (grant number: TMS2019TMT04); R.J.L.W. and T.M.C. by the Joint Programming Initiative in Antimicrobial Resistance (grant number: JPIAMR2016-AC16/00039); A.R.F. by the FCT/MCTES Individual Call to Scientific Employment Stimulus (grant number: CEECIND/02268/2017); A.R.F., C.N., and L.P. by the Applied Molecular Biosciences Unit-UCIBIO that is financed by national funds from FCT (grant numbers: UIDP/04378/2020 and UIDB/04378/2020); J.C. also by ERC (grant number: 742158); and A.K.P. also by Marie Skłodowska-Curie Actions (grant number: 801133). The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript, and the findings do not necessarily reflect the official views and policies of the authors’ institutions and funders. For the purposes of Open Access, the author has applied a CC BY public copyright license to any Author Accepted Manuscript version arising from this submission.
C.C., A.K.P., and J.C. conceived and designed the study. C.C., A.K.P., and J.C. performed the data curation. C.C. performed the formal data analysis. J.C. acquired the funding. S.D.B. and J.C. provided the resources for the study. C.C. and A.K.P. analyzed the data. C.C., A.K.P., and J.C. wrote the first draft. All authors edited and revised the manuscript.
The authors declare no competing financial or non-financial interests.