INTRODUCTION
Mycobacteria are a diverse and ubiquitous group of actinobacteria that have been well studied given their potential importance as pathogens and their prevalence in a wide variety of habitats, including many soil and aquatic environments (
1). Obligate pathogens
Mycobacterium tuberculosis and
M. leprae are perhaps the best-studied mycobacteria, but there are almost 200 additionally described mycobacterial species that are collectively referred to as nontuberculous mycobacteria (NTM) (
2,
3). The members of this group are phylogenetically diverse, and there is recent genomic evidence to suggest that the genus
Mycobacterium should actually be split into 5 different genera (
4). Here, we studied the entire group, referring to it as “NTM” or “mycobacteria” for reasons detailed in Materials and Methods.
Most NTM are innocuous to humans, and some have metabolic capabilities that make them potentially useful for the biodegradation of environmental pollutants (
5,
6) or as industrial biocatalysts (
7). Likewise, exposure to several NTM strains has been linked to positive human health outcomes via immune system regulation and modulation of serotonin pathways (
8,
9). However, a subset of NTM are important pathogens of humans and animals (
10,
11). NTM infections are predominantly acquired from the environment and are increasingly recognized as a threat to public health worldwide (
12). Water and soil are widely considered the predominant environmental reservoirs of mycobacteria, whether pathogenic or beneficial (
1,
11,
13). A growing number of studies have focused on NTM in aquatic systems, including residential plumbing systems (
14–18), lakes (
19,
20), and other aquatic environments (
21–25). However, much less is known about the biodiversity and ecological preferences of NTM in soil and the relevance of soil-derived NTM to human and environmental health.
Mycobacteria are commonly detected across a broad range of soil types using both cultivation-dependent and cultivation-independent methods. We know from cultivation-independent surveys of soil bacterial communities that mycobacteria are nearly ubiquitous in soil and are consistently one of the more abundant genera of soil bacteria, ranging in relative abundance from ∼0.5% to 3.0% of the total community (
26–28). Mycobacterial species have been detected in soils from bogs (
29), forests (
20,
30,
31), croplands (
32,
33), and livestock farms (
34–36), and even in potting soil (
37). However, much less is known about the ecological preferences of soil mycobacteria. Current knowledge derived from a few studies suggests that mycobacteria are particularly abundant in high-latitude boreal forests (
20,
31), with some research suggesting higher abundances in more acidic soils (
24,
29,
32). However, studies identifying the specific ecological preferences of mycobacteria at both the genus and species or strain level of resolution across a range of soil and ecosystem types are lacking.
There are three main reasons why we lack a comprehensive understanding of mycobacterial diversity and distributions in soils across the globe. First, the majority of studies investigating environmental mycobacteria have relied on cultivation-based approaches to survey abundances and diversity (
25,
38–40). While such approaches are clearly useful for addressing specific research questions, typical cultivation-based approaches are likely to underestimate the mycobacterial diversity found in environmental samples due to well-known culturing limitations, including the difficulties associated with growing, isolating, and identifying many mycobacterial colonies
in vitro (
39,
41,
42). As such, it is likely that many environmental mycobacteria have remained uncharacterized (
14,
43). Second, although PCR-based methods can be used to more broadly survey soil mycobacterial diversity (
27,
34,
44,
45), most of these cultivation-independent studies have relied on the PCR amplification and sequencing of regions of the 16S rRNA gene, which often do not provide sufficient resolution to differentiate between distinct species or strains of mycobacteria (
35,
44,
45). Alternate genetic markers, such as the heat shock protein (
hsp65) gene, have proven useful for resolving closely related mycobacterial species, but to date these marker gene sequencing approaches have been used predominantly for clinical isolate identification (
46–49). Third, preexisting studies have generally focused on a relatively small number of soil samples with limited geospatial coverage. A more comprehensive understanding of the diversity and distributions of soil mycobacteria will clearly benefit from cultivation-independent analyses that provide greater taxonomic and phylogenetic resolution of those mycobacteria found across a broad range of soil types.
To advance our understanding of the ecology and biogeography of soil-dwelling mycobacteria, we analyzed a broad set of distinct soils collected from a range of ecosystem types across the United States and other sites across the globe (143 locations in total) via cultivation-independent amplicon sequencing of two different marker genes (the 16S rRNA gene for genus-level analyses and the hsp65 gene for more detailed analyses of specific mycobacterial taxa). Using this extensive survey, we addressed the following questions: (i) which soils harbor the highest relative abundances of taxa within the genus Mycobacterium? (ii) how does the diversity of mycobacterial communities vary across contrasting ecosystem types? and (iii) what are the ecological preferences of individual mycobacterial lineages?
DISCUSSION
We found mycobacteria to be ubiquitous and abundant in soil, as determined by our cultivation-independent 16S rRNA gene analyses. When grouped together as one genus,
Mycobacterium was typically one of the more abundant named genera of bacteria found in soil (ranking 12th of 398 named genera), confirming results reported in comparable studies (
20,
26,
28). We note that these results based on 16S rRNA gene surveys likely underestimate the relative abundance of mycobacteria, as the 16S rRNA gene copy number for members of this group (∼1) is typically lower than that observed for most other major bacterial groups (
58). However, we did not correct for 16S rRNA gene copy number in our analyses, following the recommendations of Louca et al. (
59). The relative abundance of mycobacteria was variable across soils, ranging from 0.03% to 2.9% of 16S rRNA gene reads, a range similar to that reported previously (
26–28). Some of this variation in total mycobacterial abundances could be explained by the measured soil and site characteristics, with mycobacterial relative abundances typically being higher in more acidic, colder, and wetter soils. This observation is consistent with previously published cultivation-dependent and -independent results suggesting that mycobacterial abundances tend to be higher in more acidic, wetter environments (
23,
24,
29). However, we note that our model explained only 25% of the variance in genus-level mycobacterial abundances across the collected samples. Although this proportion of explained variance is considered relatively high for comparable global-scale studies (
60), unexplained variation could be a product of the fact that we did not measure all possible soil or site characteristics that can influence mycobacterial abundances, including the presence of amoebae (
61,
62), specific organic carbon substrates (
5), or the presence of livestock (
32,
35,
36). Additionally, some of this unexplained variation could be a product of the coarse, genus-level analyses employed, which may not sufficiently capture the phenotypic differences or differences in niche preferences across taxa within the genus
Mycobacterium.
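The effect of 16S rRNA gene copy number on read-based relative abundances can be illustrated with a small worked example. The sketch below is not part of our analysis (no copy number correction was applied); the taxon names other than Mycobacterium, the read counts, and the copy numbers for the other taxa are hypothetical values chosen only to show why a taxon with ∼1 copy per genome tends to be underrepresented among reads.

```python
# Illustrative only: all read counts and the copy numbers assigned to
# "other_taxon_a" and "other_taxon_b" are hypothetical toy values.

def read_based_abundance(read_counts):
    """Relative abundance computed directly from 16S rRNA gene reads."""
    total = sum(read_counts.values())
    return {taxon: n / total for taxon, n in read_counts.items()}


def copy_number_adjusted_abundance(read_counts, copy_numbers):
    """Relative abundance after dividing reads by gene copies per genome."""
    genomes = {t: read_counts[t] / copy_numbers[t] for t in read_counts}
    total = sum(genomes.values())
    return {taxon: g / total for taxon, g in genomes.items()}


# Hypothetical community of 1,000 reads.
toy_reads = {"Mycobacterium": 20, "other_taxon_a": 480, "other_taxon_b": 500}
# Mycobacterium set to ~1 copy, as noted in the text; the rest are assumed.
toy_copies = {"Mycobacterium": 1, "other_taxon_a": 4, "other_taxon_b": 5}

raw = read_based_abundance(toy_reads)
adjusted = copy_number_adjusted_abundance(toy_reads, toy_copies)

print(f"read-based:    {raw['Mycobacterium']:.3f}")
print(f"copy-adjusted: {adjusted['Mycobacterium']:.3f}")
```

With these toy inputs, the copy-adjusted share for Mycobacterium is larger than the read-based share, which is the direction of bias described above. The size of the difference depends entirely on the assumed copy numbers.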
As expected, we detected a diversity of soil mycobacteria by using a higher-resolution genetic marker (the
hsp65 gene). Across all soils, we identified 472 exact sequence variants (ESVs) that represented 159 distinct phylogenetic lineages, with most of these lineages restricted to a small subset of soils (
Fig. 3). Not all mycobacteria were detected in all soils, an observation that could result from either dispersal limitations or, more likely, different mycobacterial strains having distinct environmental preferences. The latter explanation is supported by our results. In particular, we found that soil pH and climatic parameters were often the best predictors of the distributions of individual lineages (
Fig. 4; Tables S3 and S4), and most (65%) of the lineages preferred low-pH soils, as determined by significant Spearman correlations (a result consistent with the genus-level analyses) (
Fig. 2). However, we observed three individual lineages that were more commonly detected in high-pH soils (Table S4). Likewise, while a majority of lineages were mainly found in colder and wetter sites, two clades (clade 12 and clade 44) were more commonly detected in drier ecosystems (
Fig. 4; Tables S4 and S5). These results highlight that soil mycobacteria can exhibit contrasting, yet often predictable, environmental preferences.
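As a minimal sketch of how a lineage's pH preference can be assessed with a Spearman rank correlation, the example below ranks both variables (averaging tied ranks) and computes the Pearson correlation of the ranks. The function names, the pH values, and the relative abundances are hypothetical and are not taken from our data set; significance testing and multiple-comparison correction are omitted.

```python
# Illustrative only: soil pH values and lineage abundances are toy numbers.

def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the full run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman's rho: Pearson correlation computed on average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical lineage whose relative abundance declines as pH rises.
soil_ph = [4.1, 5.0, 5.8, 6.5, 7.9]
lineage_abundance = [0.9, 0.7, 0.4, 0.2, 0.0]

rho = spearman_rho(soil_ph, lineage_abundance)
print(f"Spearman rho = {rho:.2f}")
```

A negative rho indicates that the lineage tends to be more abundant in lower-pH soils, and a positive rho indicates the opposite. In practice, a library routine such as `scipy.stats.spearmanr` would also return a P value for the correlation.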
Our findings further indicate that soils harbor a large amount of undescribed mycobacterial diversity. Of the 159 lineages detected in our global survey, only 3% (5/159) encompassed described isolates. These results contrast with a comparable survey of mycobacteria in household plumbing that used the same
hsp65 marker gene sequencing approach, where the majority of the mycobacterial lineages included described taxa (
14). The proportion of undescribed mycobacterial lineages we recovered from soil is slightly higher than that reported previously in soils (
27) and lakes (
19) via 16S rRNA gene analyses. Of the small subset of soil mycobacterial clades that included described isolates, we found lineages that included
M. rutilum and
M. avium complex, taxa that have previously been identified from cultivation-dependent analyses of soil mycobacterial communities (
5,
33).
Only one of the detected lineages (
M. avium complex) included known human pathogens. Members of this group are frequently associated with clinical cases of NTM respiratory disease (
24,
63). Interestingly, we found that the lineage including the
M. avium complex (clade 31) showed strong preferences for wet and acidic soils, information that may be directly relevant to understanding the epidemiology of mycobacterial disease. However, as mycobacteria related to known pathogens were infrequently detected across the 143 samples studied, we think it is important to reexamine the widespread assumption that exposure to soil is an important mode of transmission of mycobacterial disease in humans. Specifically, outside the
M. avium complex, most of the mycobacterial isolates frequently recovered from patients with respiratory NTM disease, such as
M. abscessus,
M. mucogenicum,
M. kansasii, and
M. fortuitum (
10), were not detected in any of the soils. This differs from a similar investigation showing that mycobacteria frequently implicated in NTM disease are common and relatively abundant in showerhead biofilms (
14).
The most ubiquitous clade detected in the soils sampled here included the described members
M. rutilum and
M. novocastrense. M. rutilum was first identified in soils collected from Hawaii and is known for its ability to degrade polycyclic hydrocarbon pollutants (
5). Both species are rapidly growing and are closely related to
M. vaccae, which has been well studied for its potential benefits to human health (
64). While
M. novocastrense has been found very rarely in infected tissue, it is not generally considered to be pathogenic (
65,
66). This result suggests a potentially large reservoir of soil-dwelling mycobacteria that could be beneficial to human or environmental health.
Our study represents a comprehensive investigation of soil mycobacterial diversity and the ecological factors shaping the distributions of individual soil mycobacterial lineages, many of which remain undescribed. We found that soil mycobacterial populations (both the
Mycobacterium genus and individual lineages of mycobacteria) are often predictable from information on soil pH and site climatic conditions. Notably, most mycobacteria appear to prefer acidic, colder, and wetter soils; however, contrasting environmental preferences are found for individual clades. More generally, we show that mycobacteria can be abundant members of soil bacterial communities, but much of the soil mycobacterial diversity remains undescribed. Although the cultivation biases inherent in mycobacterial surveys are well-known (
14,
43), our results suggest that this cultivation bias is particularly important in soil. We note that those mycobacteria commonly considered to be human pathogens are rarely detected in soils worldwide, challenging widely held expectations that soil is an important source of NTM disease. Taken together, our study provides novel insights into the distribution and ecological preferences of soil mycobacteria in terrestrial ecosystems across the globe.