Overview of site geochemistry and study design.
16S rRNA gene sequencing has been conducted in several alkaline hot springs in YNP and has been useful in determining putative phototrophic taxa (reviewed in reference 4). Based on previous 16S rRNA amplicon sequencing, we found that putative phototrophs, including
Synechococcus,
Roseiflexus, and
Chloroflexus, were abundant in eight different hot spring sites in YNP. These sites ranged in temperature from 62ºC to 71ºC and in pH from 7 to 9, and sample morphologies included mats, pinnacles, and filaments (Table S1 in the supplemental material) (
11). In general, the sites cluster by geothermal area, while temperature, dissolved organic carbon, sulfide, and iron were also major drivers of dissimilarity (
Fig. 1). Here, we leveraged metagenome sequencing to determine the ecological diversity and metabolic potential of phototrophic bacteria in eight alkaline springs that have not been the focus of historic work in YNP. While Mushroom and Octopus Springs are alkaline, they differ from our sites in morphology, geochemistry, and geographic location. We generated metagenomes to determine the diversity, distribution, and abundance of specific genes involved in phototrophy, autotrophy, and nitrogen fixation. Because diversity at the 16S rRNA gene level decreases with increasing temperature and geographic isolation plays a role in structuring hot spring communities (
6,
8,
35,
36), we hypothesized that these factors would also impact the distribution, diversity, and abundance of functional genes. To this end, we calculated Shannon diversity for each target gene in our eight hot spring sites and examined gene abundance by mapping metagenome reads to genes of interest in the assembled metagenomes.
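The Shannon diversity calculation mentioned above can be sketched as follows. This is a minimal illustration that assumes the natural-log form of the index and uses per-OTU normalized read abundances as input; the function name and the example abundance values are hypothetical and are not taken from the study.

```python
import math

def shannon_diversity(abundances):
    """Shannon index H' = -sum(p_i * ln(p_i)) over OTUs with nonzero abundance."""
    total = sum(abundances)
    if total <= 0:
        return 0.0
    h = 0.0
    for a in abundances:
        if a > 0:
            p = a / total  # relative abundance of this OTU
            h -= p * math.log(p)
    return h

# Hypothetical per-OTU abundances for one target gene at two sites.
site_even = [1.0, 1.0, 1.0, 1.0]          # four equally abundant OTUs
site_skewed = [3.0, 0.001, 0.001, 0.001]  # one dominant OTU plus rare OTUs

h_even = shannon_diversity(site_even)
h_skewed = shannon_diversity(site_skewed)
```

An even community of four OTUs reaches the maximum value for that richness, ln(4), while a community dominated by one OTU scores close to zero even though its richness is the same.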
Geographic isolation plays a role in the diversity and distribution of cyanobacterial photosystem genes.
Oxygenic photosynthesis is a remarkable metabolism that involves two photosystems, photosystem I (PSI) and photosystem II (PSII), working in concert to harvest electrons from water to fuel carbon fixation and other cellular processes. PSII houses the oxygen-evolving complex and antenna proteins, where light energy is captured to liberate electrons from water, a process that requires expression of several proteins encoded by psb genes (
37–39). We quantified the abundance of three key
psb genes:
psbA, psbB, and
psbD (
Fig. 2A). The
psbA and
psbD genes encode the D1 and D2 proteins, respectively, which together ligate the redox-active components in PSII and are highly transcriptionally regulated in cyanobacteria (
37). The
psbB gene encodes CP47, a chlorophyll binding protein crucial to forming a stable PSII reaction center; taxa with multiple copies of
psbB are acclimated to far-red light (
40). While we observed a range of sequence abundances from rare (0.001 normalized reads mapped) to 3 normalized reads mapped, we did not observe statistically significant differences in photosystem gene abundance in our data (Table S2) or a decrease in abundance with increasing temperature (
Fig. 2A).
We classified
psbA genes into operational taxonomic units (OTUs, 99% nucleotide similarity, reference database in supplemental material) resulting in 27
psbA OTUs (
Fig. 3, Fig. S1). To examine the taxonomic diversity of the psbA OTUs, we translated
psbA to PsbA and ran both phylogenetic and BLASTP analyses (
41) (Fig. S1). Based on the phylogenetic placement of PsbA and our BLASTP results, 14 of 24 OTUs were classified as
Synechococcus. Several OTUs were related to the high temperature strains JA-2-3B'a(2-13) and 63AY4M2. Two PsbA OTUs, OTU15 and OTU17, were most closely related to strain 63AY4M2 and were present in our highest temperature sites, WCA1 (71.0ºC) and WCA2 (69.4ºC) (Fig. S1). Both reference strains were isolated from Mushroom and Octopus Springs, where temperatures range from 60 to 65ºC (
42), and our results may suggest a temperature range for strain 63AY4M2 that extends beyond 65ºC. While strain-level distribution cannot be discerned from these data alone, future work should determine the genomic variation in Synechococcus strains beyond Mushroom and Octopus Springs (see reference
43). While the majority of the OTUs recovered here were
Synechococcus, we also recovered OTUs that were most closely related to
Gloeomargarita lithophora,
Thermosynechococcus sp., and
Leptolyngbya sp. in 62ºC sites. These observations are consistent with previous work suggesting cyanobacterial diversity increases with decreasing temperature in alkaline hot springs (
7,
9,
10).
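The idea of grouping gene sequences into OTUs at 99% nucleotide similarity can be illustrated with a greedy centroid sketch. This is not the clustering tool used in the study, which is not named in this section; it assumes sequences are already aligned to equal length, and the sequence names and toy sequences are hypothetical.

```python
def identity(a, b):
    """Fraction of identical positions between two equal-length aligned sequences."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and aligned to the same length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)

def greedy_otus(seqs, threshold=0.99):
    """Assign each sequence to the first OTU centroid it matches at >= threshold.

    A sequence that matches no existing centroid founds a new OTU.
    Returns a dict mapping sequence name -> OTU index.
    """
    centroids = []   # representative sequence for each OTU
    assignment = {}
    for name, seq in seqs.items():
        for i, centroid in enumerate(centroids):
            if identity(seq, centroid) >= threshold:
                assignment[name] = i
                break
        else:
            centroids.append(seq)
            assignment[name] = len(centroids) - 1
    return assignment

# Toy 200-nt sequences (hypothetical): one mismatch is 99.5% identical,
# three mismatches is 98.5% identical.
base = "ACGT" * 50
toy = {
    "seq_a": base,
    "seq_b": base[:10] + "T" + base[11:],  # 1 mismatch vs seq_a
    "seq_c": "TTT" + base[3:],             # 3 mismatches vs seq_a
}
otus = greedy_otus(toy)
```

At a 99% threshold, seq_a and seq_b fall into the same OTU, while seq_c founds a second one. Greedy results depend on input order, which is one reason dedicated clustering tools typically sort sequences by abundance or length first.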
While the abundance of photosynthesis reaction center genes did not correlate with temperature, we did observe differences in photosystem gene copy number. For example, in our highest temperature sites (68–71ºC), there were few highly abundant
psbA sequences, while at lower temperatures there was a larger number of less abundant
psbA sequences. We expected that the >68ºC samples would contain distinct
psbA variants compared to the lower temperature sites because temperature selects for ecotypes that vary in photosynthetic properties in Mushroom and Octopus Springs (
3,
5–7). We found that
psbA variants were largely site specific (
Fig. 3) and alpha diversity across sites did not correlate with temperature (Fig. S2A), highlighting that geographic isolation could play a selective role in this environment.
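One way to test whether alpha diversity tracks temperature is a rank correlation. The sketch below implements Spearman's rho in plain Python; the choice of Spearman is an assumption made for illustration (the statistical test is not named in this section), and the temperature and diversity values are hypothetical.

```python
import math

def average_ranks(values):
    """1-based ranks, with tied values sharing the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks

def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (>= 2)")
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    vx = sum((a - mx) ** 2 for a in rx)
    vy = sum((b - my) ** 2 for b in ry)
    if vx == 0 or vy == 0:
        raise ValueError("correlation is undefined when one variable is constant")
    return cov / math.sqrt(vx * vy)

# Hypothetical site temperatures (ºC) and Shannon diversity values.
temps = [62.0, 62.3, 62.5, 68.0, 69.4, 71.0]
diversity = [1.2, 0.4, 1.6, 0.9, 1.4, 0.7]
rho = spearman_rho(temps, diversity)
```

A rho near zero, as these made-up values give, would be consistent with no monotonic relationship between diversity and temperature; with only eight sites, any such test has limited power.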
In general,
psbA richness was higher in 62ºC sites compared to others (
Fig. 3). In the 62ºC sites, only two OTUs were present in more than one site (OTU06 and OTU07 in site RCA4 and site GCA3), and in the high temperature sites, all
psbA OTUs were unique to a single site. Both shared OTUs (OTU06 and OTU07) were associated with
Synechococcus OH strains capable of growth up to 70ºC in pure culture (
44). We observed several abundant OTUs in Rabbit Creek sites (RCA, sites RCA3, RCA4, and RCA6), where our previous 16S rRNA analysis revealed abundant
Synechococcus 16S rRNA gene sequences (
11). The recovery of multiple psbA OTUs in each RCA site is consistent with the presence of multiple
Synechococcus strains or ecotypes with several distinct copies of
psbA. The recovery of fewer distinct OTUs in sites of >63ºC is consistent with strain (or ecotype) adaptation to higher temperatures, similar to what was observed in Octopus Spring (
7).
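The distinction between site-specific OTUs and OTUs shared among sites reduces to a presence/absence tally. The sketch below shows one way to compute it; OTU06 and OTU07 in sites RCA4 and GCA3 follow the text, while the remaining OTU labels and the function names are hypothetical placeholders.

```python
from collections import defaultdict

def sites_per_otu(site_otus):
    """Invert a site -> set-of-OTUs mapping into OTU -> set-of-sites."""
    result = defaultdict(set)
    for site, otus in site_otus.items():
        for otu in otus:
            result[otu].add(site)
    return result

def split_shared_unique(site_otus):
    """Return (shared, unique): OTUs found in more than one site vs. exactly one."""
    occupancy = sites_per_otu(site_otus)
    shared = sorted(o for o, sites in occupancy.items() if len(sites) > 1)
    unique = sorted(o for o, sites in occupancy.items() if len(sites) == 1)
    return shared, unique

# Presence/absence table; labels ending in _x are hypothetical placeholders.
presence = {
    "RCA4": {"OTU06", "OTU07", "OTU_x1"},
    "GCA3": {"OTU06", "OTU07", "OTU_x2"},
    "WCA1": {"OTU_x3"},
}
shared, unique = split_shared_unique(presence)
```

Here `shared` contains OTU06 and OTU07, and every placeholder OTU is reported as unique to its site.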
Chloroflexi photosystem genes have distinct distributions with temperature and reveal novel taxa.
Given that cyanobacterial photosystem genes did not follow a distinct temperature pattern, but PsbA OTU analysis revealed that gene variants are largely site specific, we sought to determine whether anoxygenic phototrophs followed a similar pattern. Anoxygenic phototrophs commonly observed at temperatures of >60ºC have type-II reaction centers that are encoded by
puf genes (
45), and the majority of anoxygenic phototrophs in hot springs of >60ºC are phototrophic Chloroflexi (
7,
11,
20). Here we surveyed puf genes to examine whether the diversity of putative phototrophic Chloroflexi (order Chloroflexales and class
Candidatus Thermofonsia) also decreases with increasing temperature (
Fig. 2B).
pufL and
pufM encode PufL and PufM, membrane-spanning proteins that bind bacteriochlorophylls in type-II reaction centers, while the
pufC gene encodes a cytochrome involved in photosynthetic electron transfer (
19,
45,
46).
puf gene abundances ranged from rare (0.001 normalized reads mapped) to 1.5 normalized reads mapped (
Fig. 2B). We recovered more copies of
pufLC genes in sites of <68ºC, which is consistent with a decrease in genetic (or taxonomic) diversity with increasing temperature, as seen in Mushroom Spring, Octopus Spring, and Rabbit Creek (
3,
5,
10–12,
17). In contrast, several copies of
pufM genes were abundant in all sites. Together, these results suggest that taxa with type-II reaction centers could encode multiple copies of
pufM. Furthermore, our data suggest either that the diversity of anoxygenic phototrophs decreases with increasing temperature or that taxa at 62ºC contain multiple copies of
pufLC. The presence of multiple copies of
puf genes has not been confirmed in Chloroflexi isolate genomes, but in other phyla gene homologs are necessary for adaptation to changing environmental conditions (
47) and should be investigated further in phototrophic Chloroflexi.
To determine the diversity of
puf genes in these sites, we assigned OTUs to our concatenated and translated
pufLM genes (at 99% similarity) and assigned taxonomy using BLASTP (
41). We found that
pufLM diversity did not correlate with temperature (Fig. S2B). We recovered 42
pufLM OTUs across seven sites (
Fig. 4, Fig. S3). Thirty-five of the 42 OTUs were affiliated with Chloroflexi. Of the seven non-Chloroflexi OTUs, none were among the 20 most abundant OTUs; five were Proteobacteria and two were Actinobacteria. Previous work has shown that phototrophic Proteobacteria are rare in alkaline hot springs at >60ºC (
4,
9,
10), and non-Chloroflexi pufLM OTUs were not abundant in our metagenomes. Our most abundant and most common OTUs were assigned to the genera Roseiflexus (OTU05) and Chloroflexus (OTU03) (Fig. S3), which is consistent with both our previous 16S rRNA gene analysis (
10,
11), and 16S rRNA and metatranscriptomic analyses in Mushroom and Octopus Springs (
5,
7,
20,
48).
The present metagenomic sequencing data set provides higher resolution than our previous 16S rRNA gene analysis (
11). Our metagenomic sequencing approach recovered taxa that have not previously been identified in YNP hot springs. Three of our 20 most abundant OTUs were assigned to “
Candidatus Roseilinea sp. NK_OTU-006” by BLASTP. The only described species from this class is “
Candidatus Roseilinea sp. strain NK_OTU-006,” recovered from sulfidic hot springs in Japan near 56ºC (
18). Our
Ca. Roseilinea-like
pufLM OTUs (OTU23, 24, and 33) were found in two alkaline sites low in sulfide (RCA4 and GCA3), both with temperatures of 62ºC, pushing the geographic range and upper temperature limit of this novel class. Furthermore, eight of our
pufLM OTUs were assigned “Chloroflexi bacterium” by BLASTP (Table in Fig. S3B), suggesting novel Chloroflexi are present in these hot spring sites.
In Mushroom Spring, Klatt et al. (2013) observed
Roseiflexus transcripts in 60ºC sites and
Chloroflexus transcripts in 65ºC sites, indicating temperature partitioning of the two phototrophic Chloroflexi genera (
8). Our data are consistent with the Mushroom Spring study but suggest temperature partitioning of the two genera at higher temperatures: we recovered putative Roseiflexus OTUs in sites up to 68ºC and putative
Chloroflexus OTUs in sites up to 69ºC. We also observed more
Chloroflexus than
Roseiflexus OTUs in 68ºC–71ºC sites (Fig. S3C). Recovery of cyanobacterial
psb genes and Chloroflexi
puf genes from the same sites is consistent with several historical studies postulating the presence of “green non-sulfur bacteria” co-occurring with cyanobacteria in Mushroom and Octopus Spring mats (
49–53). Recent works have examined the distribution of phototrophic Chloroflexi using single marker genes (
9–14), and our data support the hypothesis that both phototrophic taxa persist at temperatures of >68ºC with two different optimal temperatures:
Roseiflexus up to 68ºC and
Chloroflexus up to 69ºC. Future work is needed to determine if this hypothesis holds true with
Roseiflexus and
Chloroflexus metagenome assembled genomes or hot spring isolates.
Calvin cycle genes have distinct distributions with temperature while 3HPB genes are widespread and abundant.
Photoautotrophic bacteria fix the majority of carbon in alkaline geothermal springs using the Calvin-Benson-Bassham (Calvin) cycle (cyanobacteria, some Chloroflexi), the reductive tricarboxylic acid (rTCA) cycle (class Chlorobia), or the 3-hydroxypropionate bicycle (3HPB, most photoautotrophic Chloroflexi) (reviewed in reference
54). Recent work has shed light on the flexibility of carbon fixation in Chloroflexi in high temperature, alkaline hot springs:
Roseiflexus and
Chloroflexus in Mushroom and Octopus Springs contain genes for the 3HPB, but a handful of studies have recovered Calvin cycle genes in phototrophic “
Candidatus Thermofonsia” (
55) and “
Candidatus Chlorohelix allophototropha” (
56) and in the nonphototrophic class Anaerolineae (
57). The carboxylation step in the Calvin cycle is carried out by the enzyme ribulose-1,5-bisphosphate carboxylase/oxygenase, or RuBisCO (encoded by
rbcL [large subunit] and
rbcS [small subunit] genes). In hot springs specifically,
Synechococcus species have evolved a thermotolerant form of RuBisCO that can function up to 74ºC (
58). Phosphoribulokinase (encoded by the
prk gene), which catalyzes a second essential step of the Calvin cycle, does not appear to have an upper temperature limit beyond that of phototrophy but is likely present only in organisms that use the Calvin cycle (
59).
Given the wide distribution of the genes for the Calvin cycle in nature (
60), we sought to constrain the distribution of
rbcL, rbcS, and
prk in alkaline hot spring samples and relate these data to our phototroph gene analysis. In contrast to the
psb analyses, pairwise comparisons of the abundance of both
prk and
rbcL showed a statistically significant difference in site RCA5 compared to all other sites, except for the highest temperature site (WCA1) (
Fig. 5A). Furthermore, we observed larger mean abundances of
rbcS than
rbcL, but more copies of
rbcL than
rbcS, suggesting the taxa encoding Calvin cycle genes could encode more copies of
rbcL or that multiple forms of RuBisCO are present in these high temperature, alkaline hot springs. Four forms of RuBisCO have been described to date. Of these, form I RuBisCO (cyanobacteria, alpha-, beta-, gamma-proteobacteria, Chloroflexi, and autotrophic eukaryotes) contains both the large and small subunits (encoded by
rbcL and
rbcS genes, respectively), while forms II (alpha-, beta-, gamma- proteobacteria) and III (only in methanogenic archaea) contain only the large subunit (
59,
61,
62). To this end, we calculated the ratio of
rbcL:rbcS with temperature (Fig. S4). A ratio of 1:1 in
rbcL:rbcS genes would be indicative of form I RuBisCO, while any larger ratio would suggest several form I RuBisCO taxa with extra copies of
rbcL or the presence of form II and form III taxa. In general, we found ratios of >1:1 in all sites, with the largest differences in sites at <63ºC. Because more
rbcL copies are present at lower temperatures, we infer that taxa encoding form II or III RuBisCO (rbcL only, noncyanobacterial Calvin cycle) persist at lower temperatures, while form I (cyanobacterial Calvin cycle) is more prevalent at temperatures of >63ºC.
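The ratio reasoning above can be written out as a small sketch. It assumes per-site copy counts of rbcL and rbcS are already available; the function names and the example counts are hypothetical, and the interpretation strings simply restate the logic given in the text.

```python
def rbcl_to_rbcs_ratio(rbcl_copies, rbcs_copies):
    """Return the rbcL:rbcS copy ratio, or None when no rbcS copies were recovered."""
    if rbcl_copies < 0 or rbcs_copies < 0:
        raise ValueError("copy counts must be non-negative")
    if rbcs_copies == 0:
        return None
    return rbcl_copies / rbcs_copies

def interpret_ratio(ratio):
    """Map a ratio onto the interpretation used in the text."""
    if ratio is None:
        return "ratio undefined: no rbcS recovered"
    if ratio > 1.0:
        return "excess rbcL: extra rbcL copies in form I taxa, or form II/III taxa present"
    if ratio == 1.0:
        return "1:1: consistent with form I RuBisCO only"
    return "fewer rbcL than rbcS copies: not covered by the reasoning in the text"

# Hypothetical (rbcL, rbcS) copy counts for two illustrative sites.
site_counts = {"site_62C": (12, 4), "site_70C": (5, 5)}
summary = {
    site: interpret_ratio(rbcl_to_rbcs_ratio(rbcl, rbcs))
    for site, (rbcl, rbcs) in site_counts.items()
}
```

A ratio above 1:1 cannot by itself distinguish between extra rbcL copies in form I taxa and the presence of form II or III taxa, so the sketch reports both possibilities together.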
We recovered 77 rbcL OTUs (99% nucleotide similarity, reference database in supplemental material) among our eight sites (Fig. S5). We observed fluctuating rbcL richness (Fig. S5) and diversity (Fig. S2D) in both sites of >68ºC and 62ºC sites. The majority of our rbcL OTUs were site specific, consistent with adaptation to local conditions and/or geographic isolation. Two exceptions were OTU01 (Armatimonadetes) and OTU02 (Synechococcus): OTU01 was present in both high temperature sites and in a 62ºC Rabbit Creek site (RCA4, 62.3ºC), while OTU02 was present in our two highest temperature sites (WCA1, 71ºC; WCA2, 68.4ºC). Given that rbcL is commonly associated with cyanobacteria and some Chloroflexi, and that our psbA and rbcL analyses suggest that a combination of local conditions, rather than temperature alone, selects for taxa that encode these two genes, we postulate that these taxa are subject to geographic isolation in alkaline hot springs.
Genes involved in 3HPB, the carbon fixation pathway in most photoautotrophic Chloroflexi, were widespread and abundant in our metagenomes (
Fig. 5B). The 3HPB requires two carboxylation steps (via acetyl-CoA carboxylase and propionyl-CoA carboxylase), followed by steps that generate 3-hydroxypropionate and glyoxylate intermediates (
54,
63). To this end, we surveyed the abundance of three genes involved in three critical steps of the 3HPB: malyl-CoA/citramalyl-CoA lyase (
mcl gene, glyoxylate generation), propionyl-CoA carboxylase (
pccA gene, CO2 carboxylation), and 3-hydroxypropionate dehydrogenase (
mcr gene, 3-hydroxypropionate generation). Only one gene (
pccA) returned statistically significant differences in abundance across sites.
pccA abundance was different in site RCA5 (62.5ºC) compared to three high temperature sites (RCA3, BG1, WCA1) and one 62ºC site (RCA4). However,
mcl and
mcr in the 3HPB pathway showed no significant difference in abundance across sites. The pccA result likely reflects our recovery of several low-abundance (<0.01 normalized reads mapped)
pccA reads in addition to the high abundance reads. This is not surprising given that
pccA is widely distributed in all domains of life and is not unique to the 3HPB (
64).
pccA converts propionyl-CoA to acetyl-CoA, which can enter the Krebs cycle and generate succinate and three equivalents of NADH, a key process that utilizes small carbon molecules for energy generation for all organisms. Furthermore, several studies have shown that
Synechococcus in alkaline hot springs release simple carbon compounds as a by-product of photosynthesis (
6,
8,
12). Therefore, presence of several high and low abundance
pccA reads, particularly in high temperature sites, is indicative of multiple organisms relying on the Krebs cycle to generate energy from simple carbon compounds at high temperatures.
Members of the class Chlorobia contain type I reaction centers and are the only phototrophic group that fixes carbon via the rTCA cycle (
4,
54). We recovered fewer reads associated with type I reaction centers (psc genes, Fig. S6A) compared to both type II reaction center and photosystem genes (
Fig. 2). We also recovered very few reads associated with either subunit of ATP citrate lyase, a critical enzyme that catalyzes an irreversible step in the rTCA cycle. Together, these results suggest that phototrophic taxa with type I reaction centers are likely photoheterotrophs or photoautotrophs that use alternative carbon fixation pathways.
(Putative) phototrophic Chloroflexi encode nifH.
Alkaline hot springs in YNP are nitrogen limited, and several studies in Mushroom and Octopus Springs have shown that phototrophic bacteria are the primary diazotrophs in these environments (
1,
4,
8,
29,
33). We examined the richness and diversity of nifH genes with respect to temperature (
Fig. 6, Fig. S2C). As in our
psbA and
pufLM analyses above, we assigned OTUs (at 99% similarity) to the
nifH sequences. We recovered 26 nifH OTUs, several of which were present in more than one site (
Fig. 6). In general, we recovered more
nifH OTUs in 62ºC sites (
Fig. 6), but our most abundant OTU (assigned to
Synechococcus sp. by BLASTP) was present in site RCA3 (68ºC). Sample GCA3 contained only unique OTUs, suggesting taxa with these
nifH genes could be adapted to the distinct conditions in this site. Similarly, OTU05 was only present in the two high sulfide sites (RCA5 and BG1), and OTU04 was the most abundant in sites with the highest temperatures (WCA2 and WCA1). Our data suggest the potential for nitrogen fixation is not evenly distributed with temperature.
Loiacono et al. (2012) recovered
nifH transcripts identified as
Synechococcus and
Roseiflexus in samples ranging from 53–73ºC, suggesting the potential for nitrogenase activity near the upper temperature limit of photosynthesis (
33). To determine the taxa associated with our
nifH sequences, we translated
nifH sequences, built a phylogenetic tree, and conducted a BLASTP search. Eleven of 26
nifH OTUs were classified as either cyanobacteria or Chloroflexi (Fig. S7A). Six
nifH sequences were closely related to Synechococcus, a common constituent of alkaline hot springs of >60ºC and a known diazotroph (Table in Fig. S7B) (
30). Three of the 20 most abundant OTUs in our data set were closely related to
Roseiflexus species (OTU02, 06, and 22), present in sites ranging from 62ºC to 68ºC in the Rabbit Creek area.
Roseiflexus genomes only encode nifHBDK, and neither of the two isolate species (
R. castenholzii or
Roseiflexus sp. RS-1) can grow in the absence of a fixed nitrogen source (
21,
65). Therefore, it is unlikely that
Roseiflexus fixes nitrogen. However,
Roseiflexus nifH genes are abundant in our data, and Roseiflexus nifH mRNA has been detected in similar hot springs (
8,
15,
17,
30), suggesting that NifH serves a functional purpose, although that function remains unknown. In cyanobacteria, NifH expression is stimulated by iron (
66). Our samples ranged in Fe2+ concentration from below detection limits to 2.3 μM, but given that
Roseiflexus genomes do not encode a full nitrogenase, future studies are required to determine the function of NifH in this genus and the conditions that result in transcription.
Roseiflexus nifH could also be important to determining the evolutionary history of nitrogenase as
Roseiflexus nif genes are deeply branching (
67).
The second most abundant
nifH OTU in our data set (OTU09) formed a separate clade near, but not within, the cyanobacteria clade (Fig. S7A). BLASTP assigned OTU09 (and four additional, low abundance OTUs; Table in Fig. S7B) as
Hydrogenobacter thermophilus, in phylum Aquificae, a deep-branching chemolithoautotrophic group with diazotrophic representatives found in high temperature (>70ºC) hot springs (
68). Previous analysis of
nifH genes across all domains of life suggested Aquificae are the oldest extant diazotrophic bacteria (
26). Thus, our data contain several
nifH-containing lineages that are of great importance for solving the evolutionary history of nitrogen fixation.
Conclusion.
Phototrophic bacteria are widely distributed and abundant in alkaline hot springs at >60ºC. By quantifying the distribution of genes involved in carbon fixation, nitrogen fixation, and phototrophy in eight alkaline hot spring metagenomes, we add to the large body of work on the metabolic potential of both cyanobacteria and anoxygenic phototrophs in situ. Additionally, we offer a glimpse into the diversity and physiology of the underrepresented Chloroflexi phylum. While the abundance of photosynthetic genes did not vary with temperature, we observed higher richness of both cyanobacterial psbA genes and Chloroflexi-affiliated pufLM genes in 62ºC sites. Furthermore, we observed more cosmopolitan psbA OTUs in 62ºC sites and unique OTUs in sites of >68ºC. This suggests that cyanobacteria at higher temperatures contain forms of psbA that could allow them to persist at those temperatures. Conversely, we observed several cosmopolitan pufLM OTUs in both high and low temperature sites, specifically OTUs shared across the Rabbit Creek area, which suggests that Chloroflexi are adapted to local geothermal conditions rather than to specific temperatures.
The abundance of photosynthesis genes associated with both cyanobacteria and phototrophic Chloroflexi did not significantly differ with temperature. Carbon fixation gene abundances were significantly different in site RCA5 compared to all others. However, in general, we did not observe trends in abundance with temperature. Rather, ratios of rbcL to rbcS genes suggest that temperature selects for specific types of RuBisCO: cyanobacterial rbcL in sites of >63ºC and noncyanobacterial rbcL in 62ºC sites. Furthermore, the majority of the rbcL OTUs were unique to certain sites, suggesting geographic isolation or adaptation to local conditions. Genes associated with autotrophic, anoxygenic phototrophs did not have distinct distributions with temperature, but we recovered abundant reads associated with the 3-hydroxypropionate bicycle (Chloroflexi, chemoautotrophs) and very few reads associated with the complete reverse TCA cycle (Chlorobia). Together, the abundance and diversity of carbon fixation genes suggest that organisms fixing CO2 via the rTCA cycle are rare near the upper temperature limit of photosynthesis, where photoautotrophic cyanobacteria and Chloroflexi are abundant.
Finally, we surveyed the distribution and abundance of genes associated with nitrogen fixation (nifH). nifH genes were abundant across sites, regardless of site temperature, and both Roseiflexus-like and Synechococcus-like nifH sequences were among the most abundant in our data. Synechococcus are known to fix nitrogen in hot springs, but Roseiflexus do not have the full suite of genes required to fix nitrogen; yet nifH-containing Roseiflexus are abundant in alkaline hot springs, and Chloroflexi are deep-branching taxa. Thus, nifH sequences recovered here could be critical to solving the evolutionary puzzle of nitrogen fixation in bacteria.