INTRODUCTION
Legionnaires’ disease (LD) is a potentially fatal form of bacterial pneumonia caused by various species of
Legionella (
1). Individuals with chronic lung diseases or immune system deficiencies, smokers, and people of advanced age are at an increased risk for LD. A milder form of legionellosis characterized by fever and “flu-like” symptoms is termed Pontiac fever. In built environments,
Legionella can multiply in water that is stagnant, is maintained at permissive temperatures (~25 to 37°C), and lacks appropriate disinfectant levels. Various devices, such as cooling towers, showerheads, fountains, and spas, can aerosolize contaminated water. Inhalation of these aerosols by susceptible individuals can result in legionellosis.
Over 60 species of
Legionella have been identified (
http://www.bacterio.net/-allnamesdl.html ) and at least one-third of these have been linked to human disease (
2).
L. pneumophila is the most frequent cause of LD in North America. Other less common, clinically relevant species include
L. longbeachae,
L. bozemanii,
L. micdadei, and
L. dumoffii (
3–6).
L. pneumophila is highly diverse, with 17 known serogroups (
1). Genome sequence analysis has revealed that much of the genetic diversity among isolates of
L. pneumophila is driven by recombination (
7).
During legionellosis outbreak investigations,
Legionella isolates from potential environmental sources are compared with clinical isolates in an effort to support epidemiological associations. Various subtyping schemes have been used for this purpose, such as pulsed-field gel electrophoresis, sequence-based typing, and more recently, whole-genome sequencing (
1,
8). Confirmation of the environmental sources of
Legionella may help to shorten the duration of an outbreak by focusing remediation efforts on a specific source and by informing ongoing prevention strategies.
The mechanisms by which sources of
Legionella from the natural environment colonize the built environment are poorly understood. However, it is likely that at least some of the
Legionella strains present in source water for built environments are derived from natural aquatic ecosystems, such as rivers, streams, and lakes, where
Legionella strains have been shown to be widely distributed (
9,
10). Various studies have demonstrated a link between
Legionella and various protozoa, including amoebae such as
Acanthamoeba spp.,
Naegleria spp., and
Hartmannella spp. (
11,
12). Many of the molecular mechanisms that legionellae use for growth in amoebae appear to overlap those for growth in human macrophages (
13). Moreover, growth within amoebae not only amplifies the number of
Legionella organisms but also may enhance bacterial virulence (
13,
14). Finally,
Legionella spp. have been detected in biofilms, which are considered a major reservoir of the organism in colonized human-made water systems (
15,
16).
Few studies have attempted a comprehensive analysis of the microbiome of natural aquatic environments. Recently, a year-long study was conducted to understand the microbial community composition in various watersheds with different land uses in British Columbia, Canada (
17,
18). In this study, sites within agricultural, urban, and protected watersheds were sampled monthly. Size fractionation methods were employed to generate templates for sequencing. 16S and 18S amplicon sequencing was conducted to quantify changes in the microbiomes of these environments, while shotgun metagenomic sequencing was conducted to understand community structure and function. Notably, this study revealed that the most abundant bacterial phyla present in the watersheds studied were
Proteobacteria (which includes
Legionella),
Actinobacteria,
Firmicutes, and
Bacteroidetes (
17).
To better understand the ecology of Legionella in the natural aquatic environment, we evaluated this extensive data set with the goals of quantifying Legionella abundance among different watersheds, determining the diversity of Legionella spp. present, and evaluating the role of amoebae that co-occur with this bacterium in such environments. Understanding the presence and diversity of Legionella in these watersheds may help to improve our ability to control the colonization of human-made water systems by these organisms from natural water sources.
DISCUSSION
The robust methodology described by Uyaguari-Diaz et al. (
17) to separate various microbial components (eukaryotic, bacterial, and viral) in natural water samples and analyze them via both amplicon (16S and 18S) and metagenomic sequencing was used to characterize the compositions of water samples from various watersheds in British Columbia. The predominant bacterial phyla in the analysis of seven watershed sampling sites using metagenomic sequencing were
Proteobacteria,
Actinobacteria,
Firmicutes, and
Bacteroidetes. A year-long study that examined the same sites showed a shift in the microbial community composition (revealed by the average genome composition and k-mer composition) among some of the sites corresponding to the season and/or nutrient concentrations (
18). In the current study, we detected
Legionella, a member of the
Gammaproteobacteria class, at all sampling sites and collection dates.
Unlike the seasonal patterns observed when examining the composition of the entire bacterial community (
18), the relative abundance of
Legionella varied throughout the year, without any discernible seasonal patterns, reaching as high as 2% of the bacterial taxa present. Although LD cases peak in summer and autumn months, natural waters are not considered a source of disease to the extent that water systems in built environments are (
1). Other researchers have detected
Legionella by using culture or quantitative PCR in natural water sources. In a recent study, investigators found ~10⁴ to 10⁵ cells/liter of
Legionella spp. in some Taiwanese river water samples via real-time PCR (
20), while a previous study of marine and freshwater sites in Puerto Rico demonstrated an abundance of
L. pneumophila of 10⁴ cells/ml via direct fluorescent antibody (DFA) testing (
21). These findings suggest that the quantity of
Legionella in the natural environment may be highly variable.
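The two reported concentrations use different volume units, so a rough comparison requires converting them to a common basis. The following is a minimal sketch of that arithmetic only; the variable and function names are illustrative, and because the two studies used different methods (real-time PCR versus DFA) and different targets (Legionella spp. versus L. pneumophila), the resulting ratio should be read as an order-of-magnitude indication rather than a direct comparison.

```python
ML_PER_LITER = 1000


def cells_per_liter(cells_per_ml: float) -> float:
    """Convert a concentration in cells/ml to cells/liter."""
    return cells_per_ml * ML_PER_LITER


# Values as cited in the text: ~1e4 to 1e5 cells/liter (20) and 1e4 cells/ml (21).
river_low, river_high = 1e4, 1e5
puerto_rico_per_liter = cells_per_liter(1e4)

# Ratio of the converted Puerto Rico value to each end of the river range.
fold_low = puerto_rico_per_liter / river_high
fold_high = puerto_rico_per_liter / river_low

print(f"Puerto Rico sites: {puerto_rico_per_liter:.0e} cells/liter")
print(f"Difference from river samples: {fold_low:.0f}- to {fold_high:.0f}-fold")
```

On a per-liter basis, the two reports differ by two to three orders of magnitude, which is consistent with the suggestion that Legionella quantities in natural waters vary widely.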
The relative abundance of
Legionella was highest in sites with limited land use: from the site upstream of agricultural activity (AUP) and from a river that empties into a drinking water reservoir (PUP). In both cases, these sources feed downstream sites where the relative abundance of
Legionella is lower. There are several possible explanations for this decrease in
Legionella relative abundance. Downstream sites may contain contaminants or lack specific nutrients for
Legionella growth. There may also be an increase in certain non-
Legionella genera in these downstream sites, resulting in a lower relative abundance of
Legionella in the community. Alternatively,
Legionella may become associated with biofilms, which would decrease their relative abundance in the surface water samples collected in this study. Notably, the water collected at the PDS site travels through a nearly 9-km pipe made of concrete and steel and lined with coal tar. It is possible that biofilms present within this pipe may trap
Legionella.
Legionella can also survive and grow within various amoeba species. Similar to the pattern observed with
Legionella relative abundances, the lowest mean relative abundances of
Amoebozoa were found at the ADS, APL, and PDS sites, supporting the possibility that
Legionella relative abundance is amplified by the presence of amoeba in these natural water sources. The differences in relative abundance of
Legionella seen in our study may also be due to the presence of other protozoa; in addition to amoebas,
L. pneumophila has been shown to infect and grow within ciliates (
22), and protozoan predators have recently been isolated that graze on virulent
Legionella spp. (
23).
Note that the abundance values for both bacteria and eukaryotes were relative rather than absolute abundance values. Furthermore, the relative abundance of eukaryotes inferred by 18S rRNA gene sequencing would be affected by the large variation in copy numbers of the 18S rRNA gene, which can vary by several orders of magnitude between species (
24).
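Both caveats can be illustrated with a small numerical sketch. All counts and gene copy numbers below are invented for illustration and are not data from this study, and the function names are assumptions. The first part shows how the relative abundance of Legionella can fall at a downstream site even when its absolute count is unchanged, simply because other genera increase. The second part shows how differences in 18S rRNA gene copy number can distort the apparent proportions of eukaryotes.

```python
def relative_abundance(counts):
    """Convert raw counts per taxon into fractions that sum to 1."""
    total = sum(counts.values())
    return {taxon: n / total for taxon, n in counts.items()}


def copy_number_corrected(read_counts, copies_per_genome):
    """Divide marker-gene read counts by gene copies per genome, then renormalize."""
    adjusted = {t: n / copies_per_genome[t] for t, n in read_counts.items()}
    return relative_abundance(adjusted)


# Part 1: hypothetical communities with an identical Legionella count.
upstream = {"Legionella": 20, "other genera": 980}
downstream = {"Legionella": 20, "other genera": 3980}  # other genera increased

up = relative_abundance(upstream)["Legionella"]
down = relative_abundance(downstream)["Legionella"]
print(f"Legionella relative abundance: upstream {up:.2%}, downstream {down:.2%}")

# Part 2: hypothetical 18S read counts for two eukaryotes whose genomes
# carry very different numbers of 18S rRNA gene copies.
reads = {"eukaryote A": 9000, "eukaryote B": 1000}
copies = {"eukaryote A": 900, "eukaryote B": 10}

print("By reads:", relative_abundance(reads))
print("Copy-number corrected:", copy_number_corrected(reads, copies))
```

In the second part, eukaryote A dominates the read counts but is the minority once reads are divided by copy number. A correction of this kind requires copy numbers that are generally not known for environmental taxa, which is why the values reported here remain uncorrected relative abundances.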
This study revealed that
Legionella spp. present in the watersheds examined are highly diverse. More than 70 OTUs were detected via 16S amplicon sequencing. The metagenomic sequence analysis used in this study demonstrated that
L. pneumophila was the most common species represented. Similarly, Fliermans et al. detected
L. pneumophila by DFA in nearly all concentrated water samples collected from 67 natural water sources in North Carolina, South Carolina, Georgia, Florida, Alabama, Indiana, and Illinois (
10). Various
Legionella spp. are frequently detected in studies of natural water sources (
20,
21,
25,
26). Notably, sequence analysis showed that the most common
Legionella 16S rRNA gene-based OTUs amplified from water samples along a French river were associated with unknown/uncultured bacteria (
25). The high diversity of
Legionella spp. among these sources may have implications for clinical disease, since several non-
pneumophila Legionella species are associated with clinical disease (including pneumonia), especially among immunosuppressed populations (
2). A study of natural water sources in the Mount St. Helens (Washington, USA) blast zone was conducted after researchers exposed to lakes and streams in the region reported symptoms consistent with Pontiac fever in the early 1980s (
27). Various known
Legionella spp. were detected in this study, with higher organism relative abundances found in water samples taken within the blast zone and in lakes receiving water from hydrothermal seeps than in sites outside the blast zone. A novel species (
L. sainthelensi) was isolated from water samples collected around Mount St. Helens (
28), and this species was subsequently found to be associated with clinical disease (
29). Although we did not attempt to isolate and grow the putative novel
Legionella species from our samples, doing so could be the next step for future studies.
Metagenomic classification programs, such as the MEGAN6 program used in this study, may overclassify reads, assigning them to incorrect species when the matching species is not present in the database (
30). More specifically, the program might assign reads to the most closely related species in the database. There are currently over 500
L. pneumophila genome sequences in the NCBI database, but only a few representatives are present for other
Legionella species. The wide range of alignment identities observed with the MEGAN6 analysis further suggests that novel or uncharacterized
Legionella strains may be present in the samples. Notably, alignment of the shotgun metagenomic reads with the
Legionella mip gene also uncovered a large number of
Legionella species (>35) among the watershed samples, but sequencing coverage of this gene may be limited. Nonetheless, the alignment identity of these matches was low (typically <80%), further suggesting the presence of additional novel
Legionella species in these watersheds.
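The overclassification concern can be made concrete with a simplified sketch. The code below does not reproduce the assignment algorithm of MEGAN6 or any other program; it is a hypothetical best-hit classifier, with invented identity values and an arbitrary 95% threshold, intended only to show why a read from a species absent from the database takes the name of its nearest represented relative unless a minimum identity is required.

```python
from typing import Optional


def best_hit(hits: dict[str, float], min_identity: Optional[float] = None) -> str:
    """Return the database species with the highest percent identity.

    If min_identity is given and the best hit falls below it, the read is
    reported as 'unclassified' rather than forced onto a species.
    """
    if not hits:
        return "unclassified"
    species, identity = max(hits.items(), key=lambda item: item[1])
    if min_identity is not None and identity < min_identity:
        return "unclassified"
    return species


# Hypothetical percent identities for one read from a species that is
# not itself represented in the reference database.
read_hits = {"L. pneumophila": 78.5, "L. longbeachae": 74.0}

print(best_hit(read_hits))                     # nearest relative is reported
print(best_hit(read_hits, min_identity=95.0))  # low identity is flagged instead
```

Without a threshold, the read is labeled with the best-represented relative despite an identity below 80%. With a threshold, it is left unclassified, which is the more cautious reading of low-identity matches such as those described above.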
While the presence of Legionella spp. in natural water samples alone is not a significant public health concern, these organisms may seed human-made water systems. In turn, these systems could become sources of Legionella dissemination under permissive conditions. Understanding the diversity of organisms present in the natural aquatic environment and factors that may contribute to increased abundance of specific Legionella spp. in these environments may help public health workers identify potential new threats to human health and respond quickly to LD by using improved diagnostic and typing assays. This study demonstrates that natural aquatic environments, including watersheds, likely harbor previously unrecognized Legionella spp. As culture-independent diagnostic tests for LD become more commonly utilized, it will be important to evaluate the ability of these assays to detect new and emerging Legionella spp. and assess their potential to cause disease.