INTRODUCTION
Infection by fungal pathogens in the
Blastomyces dermatitidis complex (
B. dermatitidis and
B. gilchristii) causes blastomycosis, a disease observed predominantly in humans and canines (
1–4). While uncommon globally, disease rates in areas of hyperendemicity can exceed 100 per 100,000 humans (
5,
6). Untreated blastomycosis can be serious and is often misdiagnosed as bacterial pneumonia, with mortality rates over 10% (
3,
7). Outbreaks of blastomycosis are frequently documented but can be sporadic, even in regions of endemicity (
1–3,
6,
7).
Based on disease prevalence, the
B. dermatitidis complex is endemic in the Mississippi and Ohio River Valleys and around the Great Lakes, with the highest infection rates occurring in Minnesota, Ontario, and Wisconsin (
3,
8,
9). In Minnesota, most infections occur in the north or along the St. Croix River but can occur throughout the state (
7). However, the geographic range of
B. dermatitidis is expanding; blastomycosis is now frequently reported in New York state and Saskatchewan (
10,
11). Two additional species,
Blastomyces helicus and
Blastomyces percursus, cause human disease similar to blastomycosis, but epidemiologic studies suggest these fungi are endemic to western North America and Africa, respectively (
12–14).
The environmental niche of
B. dermatitidis remains unknown. To date, the environmental location of
B. dermatitidis has been inferred from epidemiologic data associated with outbreaks. Patients in one Wisconsin outbreak had explored a beaver dam, and two other outbreaks showed evidence of transmission from riverbank soil (
15,
16). These and other epidemiologic outbreak data led to the suggestion that the ecological niche for
B. dermatitidis involves decayed wood and/or moist soil (
9,
17). Epidemiologic data also indicate increased risk from behaviors that increase outdoor exposure (
7,
15,
16,
18).
B. dermatitidis is a dimorphic fungus that grows as a filamentous mold and sporulates at 25°C but grows as budding yeasts in humans at 37°C (
1,
3,
19). Yeasts are readily isolated from infected individuals, and
B. dermatitidis can be propagated in the laboratory as both mold and yeast, but isolation from the environment through culture-based methods has remained elusive (
15,
20–23). The only successful environmental isolation of
B. dermatitidis was from a single woodpile sample (
22) and has not been replicated. The most effective method of isolating
B. dermatitidis from the environment is animal passage, but this approach suffers from poor reproducibility and animal welfare concerns, is labor-intensive, and is not well suited to large-scale studies (
20,
23).
A PCR-based method to identify the presence of
B. dermatitidis DNA in environmental samples was recently developed. Burgess et al. used PCR amplification of the
BAD1 virulence gene (
3,
24) to detect
B. dermatitidis DNA in five soil samples (
25). The purpose of this study was to use this culture-independent PCR-based assay to identify
B. dermatitidis DNA in 250 environmental samples collected in Minnesota. We used the resulting data to build a random forest model that predicted the environmental presence of
B. dermatitidis DNA with approximately 75% accuracy.
DISCUSSION
The use of culture-independent sampling methods to characterize microbial communities and localize specific microbes has become common in recent years. These methods allow the identification of nonculturable organisms and decrease bias toward strains that grow better in the laboratory than in the environment. For a recalcitrant disease-causing organism such as
Blastomyces dermatitidis, where culturing from the environment ranges from extremely difficult to impossible (
15,
20–23), molecular-based techniques offer a more consistent, reproducible, and less labor-intensive method of detection. This is supported by our detection of
B. dermatitidis from the 1999 outbreak samples where previous
in vitro and mouse bioassay cultures were unsuccessful. Using a PCR-based technique, we identified
B. dermatitidis DNA in the environment in Minnesota and used those data to build a predictive ecological niche model for
B. dermatitidis. Additionally, we detected only transient associations with flooding events.
Information about the environmental niche, and modeling of that niche, in
B. dermatitidis was previously derived from epidemiological data because the organism could not be reproducibly cultured from the environment (
9,
15,
20–23). Early studies focused on specific outbreaks, but more recent studies have used patient location information and geographic information system (GIS) data to examine associations between environmental factors and the places where individuals with blastomycosis reside (
9). These modern GIS-linked analyses are invaluable tools to understand the epidemiology of blastomycosis but are limited by an increasingly mobile population. Our data highlight the difficulty in using patient-based epidemiological approaches because defining where patients acquire infections and infection rates in areas with seasonal populations is challenging (
7).
Our DNA data show locations with high geographic prevalence of B. dermatitidis in the environment. Consistent with the epidemiologic data, when we collected samples from random GPS coordinates in high-endemicity counties, with the goal of identifying basal levels of B. dermatitidis in the environment, we detected B. dermatitidis DNA in 33% of samples. In contrast, Hennepin County, where levels of blastomycosis are low, had no detectable B. dermatitidis DNA except transiently after a flooding event. These data suggest that the basal presence of B. dermatitidis in the environment may be higher in high-endemicity areas.
We also observed variability in environmental prevalence on multiple levels: (i) between regions in the same county, (ii) between sites sampled in the same region, (iii) over time at the Hennepin County site, and (iv) possibly between the basal and outbreak locations. We were able to link a spike in B. dermatitidis DNA detection with a flooding event at the Hennepin County site. The short duration of this spike may explain both the lack of reproducibility in previous attempts to localize B. dermatitidis in the environment following flooding events and the much lower prevalence of B. dermatitidis DNA at the outbreak B site than at the other outbreak sites.
There was variability in the number of samples collected and analyzed at each location due to differences in both the scale and the type of site being tested at the locations. These differences in number of samples at each location may impact the current modeling and should be taken into consideration when interpreting our data. For example, the association between the presence of B. dermatitidis and longitude could be due to the large number of samples collected in St. Louis County. Significantly larger-scale studies, with more diffuse environmental sampling strategies, will be needed to overcome this limitation and provide a better description of site-specific differences in B. dermatitidis environmental prevalence.
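The effect of unequal sampling can be illustrated with a small worked example. The sketch below uses entirely hypothetical counts (the location names and numbers are placeholders, not data from this study) to show how one heavily sampled location can dominate a pooled positivity rate, compared with an average in which each location is weighted equally.

```python
# Hypothetical (positive samples, total samples) per location.
# These numbers are illustrative placeholders, not study data.
counts = {
    "Location A": (30, 60),  # heavily sampled location
    "Location B": (2, 10),
    "Location C": (1, 10),
}


def pooled_rate(counts):
    """Positivity rate when all samples are pooled together."""
    positives = sum(pos for pos, _ in counts.values())
    total = sum(n for _, n in counts.values())
    return positives / total


def mean_of_location_rates(counts):
    """Average of per-location rates, weighting each location equally."""
    rates = [pos / n for pos, n in counts.values()]
    return sum(rates) / len(rates)


print(f"pooled rate:            {pooled_rate(counts):.3f}")
print(f"mean of location rates: {mean_of_location_rates(counts):.3f}")
```

With these placeholder counts the pooled rate is pulled toward the heavily sampled location, which is the kind of imbalance that could produce an apparent association with a geographic variable such as longitude.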
Our sampling technique was able to successfully detect the presence of the BAD1 gene, which is thought to be specific to B. dermatitidis, in environmental samples. However, this method does have limitations. Because the BAD1 PCR-based assay is culture-independent, the assay does not differentiate between DNA detected from living B. dermatitidis, dead B. dermatitidis, or a possible unknown species of Blastomyces that could also have the BAD1 gene. The fact that we observed a high percentage of positive environmental samples in areas of endemicity and from areas with previous outbreaks provides support that the assay is detecting virulent B. dermatitidis. However, additional studies confirming the viability and pathogenicity of B. dermatitidis would be beneficial. Additionally, this assay does not quantify the amount of B. dermatitidis at a given location.
Our study also identified a new
BAD1 promoter haplotype not previously defined by Meece and colleagues (
32). In their previous study, Meece et al. showed that
B. gilchristii had two large insertions in the
BAD1 promoter region (haplotype 4) compared to the haplotype present in the sequence reference strain ATCC 26199 (haplotype 1). In contrast, the two other previously identified
B. dermatitidis haplotypes had relatively minor alterations in the promoter region, including a single nucleotide deletion (haplotype 2) and a small 5-bp insertion (haplotype 3). The new haplotype (haplotype 5), present in our environmental sequences as well as in two additional strains that were previously whole-genome sequenced (
28), contains the same 5-bp insertion as haplotype 3 but also an additional A→G conversion. Interestingly, previous population genetic studies using microsatellites to analyze
B. dermatitidis strains had identified four populations that were geographically distributed across North America, with Minnesotan strains identified in both populations 1 and 4 (36). The
BAD1 haplotypes present in population 1 strains of
B. dermatitidis are currently unknown, but both haplotypes 3 and 5 were identified in population 4 strains. Given this diversity in the
BAD1 haplotype among the population 4 strains, and the recombination/admixture previously detected between populations 1 and 4 (
27), we hypothesize that
BAD1 haplotype 5 could also be present in population 1 strains of
B. dermatitidis and thus prevalent in Minnesota.
This study analyzed the largest number of samples to date for the basal environmental presence of
B. dermatitidis. We used these basal environmental samples to build three ecological niche models to better understand the unknown ecological niche by using 13 environmental characteristics for each sample. The logistic regression model indicated that longitude, water classification, and nearby fecal matter were significantly related to the presence of
B. dermatitidis DNA, but classification performance was poor, with only 64.9% accuracy. Tree-based methods are used for analysis of ecological data because they can simultaneously handle both continuous and categorical variable types without the need for variable selection (
33). The disadvantages of tree-based models include slightly lower predictive performance and a higher risk of overfitting the training data. In our case, the tree makes splits that maximize the purity of the resulting groups of sampling sites with respect to the presence of
B. dermatitidis DNA. Use of a random forest tree model improved prediction accuracy to 75.6%. The random forest model identified the latitude, longitude, elevation, landform, and site classification as major contributors to
B. dermatitidis prevalence. Surprisingly, other factors previously associated with
B. dermatitidis, such as distance to water, water classification, or decaying wood did not have a major influence on the model. Unfortunately, the error rate for this model is still too high to make definitive statements about the ecological niche or biology of
B. dermatitidis.
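The idea of a split that maximizes purity can be made concrete with a minimal sketch. The code below assumes Gini impurity as the purity measure (a common choice, but an assumption here, since the criterion used in the study is not stated) and evaluates candidate thresholds on a single continuous variable. The latitude values and PCR results are hypothetical toy data, and a random forest would repeat this kind of search across many trees built on resampled data and random subsets of variables.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a group of class labels (0.0 means perfectly pure)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def split_impurity(values, labels, threshold):
    """Size-weighted Gini impurity of the two groups made by values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    n = len(labels)
    return (len(left) / n) * gini(left) + (len(right) / n) * gini(right)


def best_threshold(values, labels):
    """Try midpoints between sorted unique values; return (threshold, impurity)."""
    uniq = sorted(set(values))
    candidates = [(a + b) / 2 for a, b in zip(uniq, uniq[1:])]
    scored = [(t, split_impurity(values, labels, t)) for t in candidates]
    return min(scored, key=lambda pair: pair[1])


# Hypothetical toy data: site latitude and PCR result (1 = DNA detected).
latitude = [44.9, 45.0, 45.2, 46.8, 47.1, 47.5, 47.9, 48.1]
detected = [0, 0, 0, 0, 1, 1, 0, 1]

threshold, impurity = best_threshold(latitude, detected)
print(f"impurity before split: {gini(detected):.4f}")
print(f"best threshold: {threshold:.2f}, impurity after split: {impurity:.4f}")
```

On this toy data the chosen threshold falls between the two middle latitudes, separating a group of all-negative southern sites from a mostly positive northern group, so the weighted impurity drops relative to the unsplit data.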
However, the random forest model did yield intriguing results. Although our data were insufficient to identify the defining characteristics of the ecological niche, the model highlighted general geographic location and site classification as surrogate information. Latitude and longitude were both associated with the presence of
B. dermatitidis, with northern and northwestern sites showing a higher number of samples with detectable
B. dermatitidis DNA. We performed sampling in areas of endemicity that tended to be in the northern latitudes of the state, so the observed associations may be based on the narrow sampling range. We do not yet know the role this geographic location plays in the presence of
B. dermatitidis. Future studies, expanded to other locations, such as New York, Canada, and Wisconsin, will likely provide further insights into geographic associations. Based on the observation that
B. dermatitidis is more prevalent in midwestern and northeastern North America than in the West and Southwest, it seems likely that geographic location is relevant in some way. Furthermore, once studies that are no longer limited to narrow geographic ranges are performed, other environmental factors that influence site classification and explain the association with forested areas will likely emerge for evaluation. More samples from additional locations may also show that soil type, vegetation, or other unexpected environmental features not specifically coded for in our modeling are associated with the presence of
B. dermatitidis. Elevation and the associated landform classification also had a significant association with the presence of
B. dermatitidis, and this may be related to flooding risk or vegetation type. Surprisingly, factors previously associated with
B. dermatitidis—such as rotting wood, distance to water, and water classification (
9,
15–17)—did not show a positive association with the presence of
B. dermatitidis, suggesting these might not be critical factors in the ecological niche. Our current model may not yet reveal the precise ecological niche of
B. dermatitidis, but it provides an important first step and critical clues that lay the foundation for future studies to further expand our understanding of the ecological niche.
Historically, we had a very poor understanding of
B. dermatitidis in the environment because the organism is so difficult to isolate from environmental samples. The PCR-based assay provides a method to identify
B. dermatitidis in the environment and can be combined with the environmental parameters associated with each sample. Unusually for the agent of an endemic mycosis,
B. dermatitidis showed an almost ubiquitous presence in the regions of endemicity of Minnesota, with many environmental samples positive without a clear environmental source. In comparison,
Coccidioides spp., dimorphic fungi that cause another environmentally associated endemic mycosis found in North America, do not have a generalized environmental presence and are instead consistently found in association with rodent burrows (
34). Unfortunately, because of the almost ubiquitous presence of
B. dermatitidis in the basal and outbreak environmental samples, the current assay cannot be used to determine the likelihood of contracting blastomycosis in a given location within the region of endemicity, the source of exposure, or the potential risk of exposure. Instead, the trend toward higher positivity in outbreak samples than in basal samples, and the changes in positivity over time in the flood-associated samples, suggest that
B. dermatitidis levels in the environment may fluctuate based on environmental conditions. Quantitative PCR assays with appropriate samples are needed to determine if these fluctuations impact blastomycosis. For example, very high basal levels of
B. dermatitidis may be present in areas of hyperendemicity, leading to the higher rates of blastomycosis observed in these areas. There may also be temporary increases in
B. dermatitidis due to environmental changes that result in outbreaks of blastomycosis. Such quantitative studies may also be able to predict a theoretical environmental infectious dose—a parameter that is completely unknown at this time.
This study is the most comprehensive analysis of the ecological niche of
B. dermatitidis to date. Still, the study lacks sufficient environmental sampling data to train a model with a low enough error rate to reliably predict environmental presence of
B. dermatitidis. In addition, an improved assay to quantify
B. dermatitidis DNA levels in the soil will be necessary to differentiate risk, particularly in areas of endemicity where basal soil prevalence is high. Finally, the modeling performed in this study suggests that combining large-scale randomized environmental sampling in areas of endemicity, culture-independent PCR-based detection of
B. dermatitidis, and analysis of environmental features has the potential to define the ecological niche for
B. dermatitidis. Importantly, this unbiased approach allows modeling areas of future spread, such as the emergence of blastomycosis that is already being observed in New York and Saskatchewan (
10,
11), as our climate continues to change.