Genomes and phylogenomic placement.
Six MAGs and four SAGs were recovered from Zodletone Spring sediments as part of the effort described above. Detailed statistics for these assemblies are presented in Table S3 in the supplemental material. In addition, 2 SAGs were recovered from Lake Baikal, Irkutsk, Russia, 1 SAG was recovered from the CrabSpa hydrothermal vent, East Pacific Rise, and 1 SAG was recovered from sediment of Walker Lake, Nevada, with members of the phylum
Calditrichota (
Calorithrix insularis,
Caldithrix abyssi, and
Caldithrix palaeochoryensis) as their closest cultured relatives (12.7% to 17.3% 16S rRNA gene divergence and 39.5% to 54.5% average amino acid identity [AAI]) (
Table 1). Detailed phylogenomic analysis (
Fig. 1) grouped these 14 genomic assemblies into five distinct phylum-level lineages based on the GTDB taxonomic scheme. Group 1 (Zodletone Spring Zgenome_0241 MAG, Zodletone Spring SCGC_AG-640-A22 SAG, and Walker Lake sediment SCGC_AG-301-P11 SAG) was monophyletic with candidate phylum AABM5-125-24, a phylum currently defined by MAGs from Aarhus Bay sediments and estuaries of White Oak River, North Carolina, and SAGs from the oxygen-minimum zones of the Northeastern Subarctic Pacific Ocean (
Table 1). Group 2 (Zodletone Spring Zgenome_0002 MAG, Zodletone Spring Zgenome_0273 MAG, and CrabSpa hydrothermal vent SCGC AD-699-J03 SAG) was monophyletic with the phylum
Calditrichota, a phylum currently defined by MAGs from Guaymas Basin sediment, Guaymas Basin hydrothermal vent, and Rifle aquifer sediment, as well as the pure-culture
Caldithrix abyssi LF13 genome. Group 3 (Zodletone Spring Zgenome_0027 and Zodletone Spring Zgenome_0048 MAGs) was monophyletic with 6 MAGs belonging to candidate phylum KSB1 assembled from Guaymas Basin sediment (3 MAGs), Aarhus Bay sediments (1 MAG), Suncor tailing pond (Canada) (1 MAG), and Rifle aquifer sediment (Rifle, CO) (1 MAG) (
Table 1). Group 4 (Lake Baikal SCGC AG-636-I10 and SCGC AG-636-N09 SAGs) was monophyletic with one MAG from estuaries of White Oak River, North Carolina, belonging to the candidate phylum SM23-31. It is worth noting that the phylum names utilized here are based on GTDB taxonomic outlines and that prior publications have often used one phylum name interchangeably, e.g.,
Calditrichaeota in reference
23 or KSB1 in references
24 and
25, as a broad umbrella to describe genomes from all four phyla. Interestingly, the fifth group encompassed 3 Zodletone Spring SAGs and 1 Zodletone Spring MAG (SCGC AG-640-J10 SAG, SCGC AG-640-B15 SAG, SCGC AG-640-I23 SAG, and Zgenome_0250 MAG). These four genomes were low- to medium-quality drafts (
Table 1) with a placement suggesting that they belong to a novel, distinct sister phylum to AABM5-125-24,
Calditrichota, KSB1, and SM23-31 (
Fig. 1). This distinct phylum-level placement was corroborated by high intraphylum AAI (80.3% ± 25% [mean ± standard deviation]) and shared gene content (37.9% ± 12.3%) scores (
Table 2) and low interphylum AAI (38% to 42%) and shared gene content (15% to 18%) scores (
Table 3). Two-way intraphylum average nucleotide identities were also calculated for members of the fifth group (using alignment options of 700-bp minimum alignment length, a minimum of 50 alignments, and 70% minimum identity with a 1,000-bp window size and 200-bp step size). Values were obtained for the SCGC AG-640-J10, SCGC AG-640-B15, and SCGC AG-640-I23 SAGs (99.99 ± 0.007%). However, due to the incompleteness of the genomes, values for the Zgenome_0250 MAG in comparison to those of the three SAGs were below the detection level. Using the LSU ribosomal protein L3, three additional genotypes belonging to LCP-89 were identified in the unbinned contigs in the Zodletone Spring metagenomic assembly (Fig. S1).
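The alignment options quoted above can be read as a filter applied to windowed alignments. The following is a minimal Python sketch under stated assumptions, not the tool used in this study: the function names, the `(percent_identity, alignment_length)` hit format, and the simple averaging of the two directions are all hypothetical (established two-way ANI implementations typically rely on reciprocal best hits instead).

```python
from statistics import mean

def windows(seq_len, window=1000, step=200):
    """Return (start, end) coordinates of sliding windows along a sequence."""
    if seq_len < window:
        return []
    return [(s, s + window) for s in range(0, seq_len - window + 1, step)]

def one_way_ani(hits, min_len=700, min_identity=70.0, min_hits=50):
    """Mean identity of hits passing the filters, or None if too few remain.

    `hits` is a list of (percent_identity, alignment_length) tuples, one per
    query window (hypothetical input format).
    """
    kept = [ident for ident, length in hits
            if length >= min_len and ident >= min_identity]
    if len(kept) < min_hits:
        return None  # too few alignments to report a value
    return mean(kept)

def two_way_ani(hits_ab, hits_ba, **filters):
    """Average both directions (a simplification of reciprocal best hits)."""
    a = one_way_ani(hits_ab, **filters)
    b = one_way_ani(hits_ba, **filters)
    if a is None or b is None:
        return None
    return (a + b) / 2
```

In this sketch, a pair of fragmentary genomes that share fewer than 50 qualifying windows yields `None`, which is one way to picture why comparisons involving a highly incomplete assembly may return no ANI value.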
Structural features deduced from candidate phylum LCP-89 genomes.
We examined the salient structural features of LCP-89 genomes and compared these features to those identified in the genomes of all four sister phyla (
Calditrichota, candidate phyla SM23-31, AABM5-125-24, and KSB1). LCP-89 cells are predicted to be Gram negative, based on the identification of several enzymes of lipid A and core oligosaccharide biosynthesis (
Table 5), and rod shaped, based on the identification of the rod shape-determining proteins MreBCD and RodA. This Gram-negative rod-shaped morphology is similar in all genomes from sister phyla (
Table 5) (
26–28).
Interestingly, our analysis suggests an unusual cell wall composition within members of the LCP-89 phylum. With the exception of
d-alanine–
d-alanine ligase and two penicillin-binding proteins, all LCP-89 genomes analyzed lacked genes encoding peptidoglycan biosynthesis [e.g., UDP-
N-acetylglucosamine 1-carboxyvinyltransferase (EC 2.5.1.7), UDP-
N-acetylmuramate dehydrogenase (EC 1.3.1.98), UDP-
N-acetylmuramate–alanine ligase (EC 6.3.2.8), UDP-
N-acetylmuramoylalanine–
d-glutamate ligase (EC 6.3.2.9), UDP-
N-acetylmuramoyl-
l-alanyl-
d-glutamate–2,6-diaminopimelate ligase (EC 6.3.2.13), UDP-
N-acetylmuramoyl-tripeptide–
d-alanyl-
d-alanine ligase (EC 6.3.2.10), phospho-
N-acetylmuramoyl-pentapeptide transferase (EC 2.7.8.13), and UDP-
N-acetylglucosamine–
N-acetylmuramyl-(pentapeptide) pyrophosphoryl-undecaprenol
N-acetylglucosamine transferase (EC 2.4.1.227), as well as membrane-bound lytic murein transglycosylase A and
l,
d-transpeptidase]. Since FtsZ (the bacterial tubulin homologue) is essential for peptidoglycan remodeling during the septum formation process in cell division, we also queried the genomes of LCP-89 for FtsZ. FtsZ homologues were identified in only two LCP-89 genomes but were of apparent archaeal origin and fused with a C-terminal COG0643 (chemotaxis protein histidine kinase CheA) domain (IMG gene numbers Ga0186948_10031 and Ga0186948_10305), casting doubt on their functionality. No pseudomurein biosynthesis genes were identified. However, two genes encoding S-layer homology domain-containing proteins (Pfam accession number PF00395) were identified, as well as genes encoding enzymes for CMP-legionaminate biosynthesis from UDP-
N,N'-diacetylbacillosamine, an unusual alpha-keto sugar known to glycosylate extracellular structures in bacteria, e.g.,
Legionella and
Campylobacter (
29,
30), arguing for the possibility of an N-glycosylated S-layer in the cell walls of LCP-89 members. Interestingly, both S-layer homology domain-containing proteins in LCP-89 genomes were present upstream from a curli biogenesis system outer membrane secretion channel gene (
csgG) homologue. CsgG in curli fiber-producing bacteria is implicated in the export of the protein components of the curli fiber, a thin aggregative cell surface fiber used for adhesion to surfaces (
31). A possible function for the LCP-89 CsgG homologues in the export of the S-layer protein could therefore be hypothesized. However, S-layer protein export via type I secretion system, as reported for other S-layer-containing bacterial species (
32,
33), could not be ruled out. A lack of peptidoglycan biosynthesis genes, together with the proposed presence of an N-glycosylated S-layer in its place, has previously been suggested for members of the
Dehalococcoidia class of
Chloroflexi (
34–36), albeit members of
Dehalococcoidia seem to lack an outer lipopolysaccharide (LPS) membrane. The lack of peptidoglycan biosynthesis machinery in LCP-89 genomes is in contrast to its presence in all
Calditrichota, SM23-31, AABM5-125-24, and KSB1 genomes examined (
Table 5 and
Fig. 3). All sister phyla except AABM5-125-24 also encode S-layer homology domain-containing proteins (
Table 5 and
Fig. 3).
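The presence/absence comparison described above can be illustrated with a small screening routine. This is a minimal Python sketch, assuming each genome's annotation is available as a set of EC numbers; the genome names and annotation sets in the usage example are hypothetical, and only the EC numbers and enzyme names are taken from the text.

```python
# EC numbers of the peptidoglycan biosynthesis enzymes named in the text.
PEPTIDOGLYCAN_ECS = {
    "2.5.1.7": "UDP-N-acetylglucosamine 1-carboxyvinyltransferase",
    "1.3.1.98": "UDP-N-acetylmuramate dehydrogenase",
    "6.3.2.8": "UDP-N-acetylmuramate-alanine ligase",
    "6.3.2.9": "UDP-N-acetylmuramoylalanine-d-glutamate ligase",
    "6.3.2.13": "UDP-N-acetylmuramoyl-l-alanyl-d-glutamate-2,6-diaminopimelate ligase",
    "6.3.2.10": "UDP-N-acetylmuramoyl-tripeptide-d-alanyl-d-alanine ligase",
    "2.7.8.13": "phospho-N-acetylmuramoyl-pentapeptide transferase",
    "2.4.1.227": "UDP-N-acetylglucosamine-N-acetylmuramyl-(pentapeptide) "
                 "pyrophosphoryl-undecaprenol N-acetylglucosamine transferase",
}

def pathway_report(annotations, pathway=PEPTIDOGLYCAN_ECS):
    """Map each genome to the sorted pathway EC numbers it lacks."""
    return {genome: sorted(set(pathway) - set(ecs))
            for genome, ecs in annotations.items()}

def missing_in_all(annotations, pathway=PEPTIDOGLYCAN_ECS):
    """Return pathway EC numbers that are absent from every genome."""
    missing = set(pathway)
    for ecs in annotations.values():
        missing -= set(ecs)
    return missing

# Hypothetical annotations: EC 6.3.2.4 is d-alanine-d-alanine ligase.
example = {
    "genome_A": {"6.3.2.4"},
    "genome_B": {"6.3.2.4", "2.5.1.7"},
}
for ec in sorted(missing_in_all(example)):
    print(ec, PEPTIDOGLYCAN_ECS[ec])
```

Reporting only the EC numbers missing from every genome, rather than from any single one, reflects the reasoning that absence across several incomplete assemblies is more informative than absence from one.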
Additionally, although LCP-89 genomes possessed a nearly complete chemotaxis machinery (methyl-accepting chemotaxis protein, two-component system, chemotaxis family, sensor kinase CheA [EC 2.7.13.3], two-component system, chemotaxis family, response regulators CheB [EC 3.1.1.61] and CheY, chemotaxis protein CheD [EC 3.5.1.44], purine-binding chemotaxis protein CheW, chemotaxis protein methyltransferase CheR [EC 2.1.1.80], and chemotaxis proteins MotAB), they lacked the majority of genes for flagellar synthesis and assembly. This argues for the utilization of alternative types of motility, e.g., type IV pili (
37), for which genes were identified in LCP-89 genomes (
Table 5), as shown before for
Myxococcus and
Synechocystis spp. (
38,
39). In comparison, flagellar synthesis and assembly genes were identified in the genomes of
Calditrichota, SM23-31, KSB1, and AABM5-125-24.
Another interesting structural feature in LCP-89 genomes is their predicted capacity to synthesize bacterial microcompartments (BMCs), as suggested by the identification of homologues of the proteins with Pfam accession numbers PF03319 (EutN_CcmL) and PF00936 (BMC domain). BMCs are most probably utilized by members of LCP-89 and other sister phyla as protective shells to contain products of rhamnose or fucose metabolism (see metabolic characterization below). Such capacity to synthesize BMCs was also identified in all genomes of LCP-89’s four sister phyla. No evidence for encapsulin nanocompartment (Pfam accession number PF04454) (
40) or magnetosome biogenesis (
41) was identified in any of the genomes analyzed.
Predicted metabolic characteristics of candidate phylum LCP-89.
Genes encoding various catabolic and anabolic abilities identified in the LCP-89 genomic assemblies are presented in
Fig. 4 and
Table 5. LCP-89 genomic analysis revealed a heterotrophic lifestyle, with organic compounds acting as the sole sources of carbon, electrons, and energy. The genomes encoded an extensive sugar degradation machinery (
Fig. 4,
Table 5), enabling the channeling of a wide range of sugars (including glucose, mannose, fructose, and xylose) and sugar alcohols (including sorbitol and xylitol) to the organisms’ central glycolytic pathways. LCP-89 genomes encoded complete Embden-Meyerhof, pentose phosphate, and Entner-Doudoroff pathways for conversion of sugars to pyruvate (
Fig. 4). In addition, LCP-89 genomes encoded a complete fucose and/or rhamnose degradation machinery that breaks down these sugars into propanol and propionate. Rhamnose and/or fucose degradation produces propionaldehyde as a toxic intermediate that needs to be sequestered in the organism’s microcompartment (
42).
A complete pyruvate dehydrogenase enzyme complex and a tricarboxylic acid (TCA) cycle for pyruvate oxidation to CO
2 were identified in all LCP-89 genomes. However, the absence of functional elements of an aerobic respiratory chain (
Fig. 4,
Table 5) casts doubt on the use of oxygen as a possible electron acceptor. Nevertheless, the identification of
nrfAH (cytochrome
c nitrite reductase [NH3 forming] [EC 1.7.2.2]) suggests nitrite ammonification as a possible respiratory process in LCP-89 genomes, most probably coupled to lactate oxidation via
d-lactate dehydrogenase (EC 1.1.2.4). No genes for nitrate reduction to nitrite were identified in the LCP-89 genomes.
In addition to their respiratory capacity, elements of pyruvate reduction to fermentative end products were identified in the genomes, suggesting fermentative capabilities. Predicted metabolic end products from sugar degradation include (i) the short-chain fatty acids acetate, d-lactate, and propionate, based on the identification of genes encoding phosphate acetyltransferase (EC 2.3.1.8), acetate kinase (EC 2.7.2.1), and d-lactate dehydrogenase (EC 1.1.1.28), and (ii) ethanol, propanol, butanediol, and acetoin, based on the identification of genes encoding alcohol dehydrogenase, acetolactate synthase (EC 2.2.1.6), acetolactate decarboxylase (EC 4.1.1.5), and meso-butanediol dehydrogenase/(S,S)-butanediol dehydrogenase/diacetyl reductase (EC 1.1.1.-, EC 1.1.1.76, and EC 1.1.1.304) enzymes.
Several metabolic distinctions were identified between members of LCP-89 and its sister phyla
Calditrichota, AABM5-125-24, SM23-31, and KSB1 (
Table 5). One important distinction is the variation in respiratory chain structure and putative electron acceptors. While LCP-89 genomes lacked evidence of a functional aerobic respiratory chain, all of the sister phyla encoded complexes I, II, and III and a variety of cytochrome oxidases or reductases with different affinities for O
2 (e.g., high-affinity cytochrome
bd respiratory O
2 reductase, high-affinity
cbb3-type cytochrome
c oxidase, and/or low-affinity
aa3-type cytochrome
c oxidase). LCP-89 and AABM5-125-24 genomes contained
nrfAH (cytochrome
c nitrite reductase [NH3 forming] [EC 1.7.2.2]), which could suggest respiratory nitrite ammonification, but lacked evidence for nitrate reduction to nitrite (no
napAB or
narGHIJ genes).
Calditrichota appears to be capable of dissimilatory nitrate reduction to ammonium (DNRA). Such capacity is due to the possession of complete
napAB and
nirBD machinery for nitrate reduction to nitrite and nitrite reduction to ammonia (
43). Indeed, pure cultures of
Caldithrix abyssi were shown experimentally to use nitrate as an electron acceptor (
28). Partial evidence of elemental sulfur/polysulfide reduction to sulfide occurs in the genomes of some members of LCP-89, SM23-31, and
Calditrichota (
43). One of the AABM5-125-24 genomes (SCGC AG-640-A22 SAG) encodes a full machinery for dissimilatory sulfate reduction to sulfide, a property not encountered in any of the other genomes analyzed.
LCP-89,
Calditrichota, AABM5-125-24, SM23-31, and KSB1 genomes also differed in their oxygen detoxification mechanisms. A plethora of oxidative stress enzymes were encoded by LCP-89 genomes (including superoxide dismutase, superoxide reductase, rubrerythrin, and rubredoxin), the majority of which do not produce O
2 during their catalytic cycle (
44), further attesting to the lack of aerobic capacities in LCP-89 organisms. On the other hand, genomes from all sister phyla encode some combination of catalase/peroxidase, both of which were missing from LCP-89 genomes (
Table 5).
The levels of amino acid and cofactor auxotrophies also differed between genomes from different phyla. While genomic analysis of LCP-89, KSB1, SM23-31, and
Calditrichota suggested 0 to 2 amino acid auxotrophies, genomes of AABM5-125-24 harbored the most auxotrophies (for 7 amino acids) (
Table 5). In addition, genomes from different phyla encoded different substrate degradation capacities. Genomes of LCP-89, SM23-31, KSB1, and
Calditrichota harbored a wide range of carbohydrate degradation capacities, including both sugars and sugar alcohols (
Table 5). On the other hand, AABM5-125-24 genomes suggest a much narrower range of sugar catabolic capacities. Conversely, while LCP-89 genomes encoded amino acid degradation machineries for only 6 amino acids, genomes of all sister phyla encoded various degrees of amino acid degradation capabilities, ranging from 11 to 14 amino acids (
Table 5).
We observed differences between LCP-89 and its sister phyla in the predicted products of fermentative metabolism. On the one hand, LCP-89, SM23-31, Calditrichota, and KSB1 encoded enzymes suggestive of the production of various combinations of short-chain fatty acids and alcohols, including acetate, formate, l-lactate, d-lactate, propionate, ethanol, propanol, butanediol, and acetoin. On the other hand, genomic analysis of AABM5-125-24 suggested the production of acetate and ethanol as the only two fermentation end products.
Concluding remarks.
This study provides an overview of the structural features and metabolic capacities of a yet-uncultured bacterial phylum previously identified in 16S rRNA data sets and for which no prior genomes have been described. Current thrusts for gauging global microbial diversity utilize either amplicon-based diversity surveys for faster, high-throughput community characterization (
2,
45) or metagenomics/single-cell genomics approaches for more in-depth, genome-based predictions of organismal properties and characteristics (
15). Obtaining genome representatives of the torrent of novel bacterial lineages identified in 16S rRNA gene diversity surveys represents an important step toward the understanding of the metabolic abilities and physiological preferences of yet-uncultured microbial lineages. Moreover, such efforts help to reconcile both taxonomic outlines and facilitate the development of a unified scheme for microbial taxonomy encompassing both approaches.
Multiple interesting features were identified in the analyzed genomes of LCP-89, some of which appear to be characteristic of closely related sister phyla
Calditrichota, SM23-31, AABM5-125-24, and KSB1 (e.g., BMC possession), while others appear to be distinct characteristics representative of this phylum, e.g., respiratory nitrite ammonification and lack of peptidoglycan biosynthetic capabilities. The latter trait, coupled with the predicted possession of an outer membrane, an LPS layer, and an S-layer, is highly unusual in the bacterial world. With the exception of the intracellular
Mycoplasma genus, the lack of peptidoglycan appears to be an extremely rare trait within the domain
Bacteria, although quite common in the
Archaea. Recent reports have conclusively demonstrated the presence of peptidoglycan in the cell wall of members of the
Planctomycetes and the
Chlamydia, two phyla previously reported to have a peptidoglycanless cell wall structure (
46,
47). It is worth noting that the cell wall structure reported here partly resembles those speculated for members of the
Dehalococcoidia class of
Chloroflexi (
34–36), albeit
Dehalococcoidia lack an outer LPS membrane. This commonality between two divergent phyla suggests that gene loss through reductive evolution might be responsible for the lack of peptidoglycan observed in the bacterial world. The evolutionary and ecological drivers for this process remain to be discovered.
Finally, we acknowledge that, as with most studies that investigate genomes of uncultured phyla, the SAGs and MAGs analyzed were incomplete. However, we stress that the majority of our analysis highlights features and suggested capabilities that are present in, rather than absent from, the genomes. As such, it is possible that our analysis might underestimate the breadth of structural or metabolic capabilities of the phyla studied. Also, in instances where complete pathways were not detected, we believe that the analysis of several genomes belonging to each phylum (4 LCP-89 genomes, 3 SM23-31 genomes, 8 KSB1 genomes, 6 AABM5-125-24 genomes, and 7 Calditrichota genomes), rather than just one genome, strengthens the predicted absence of certain features or capabilities in the phyla studied.
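The argument that several incomplete genomes support a predicted absence better than one can be made roughly quantitative. The Python sketch below is illustrative only: it assumes that each genome's estimated completeness approximates the chance that any given gene was recovered and that recovery is independent across genomes. The completeness values shown are hypothetical and are not taken from Table 1.

```python
from math import prod

def prob_missed_in_all(completeness):
    """Probability that a gene present in every organism is recovered in none.

    Assumes each completeness value (a fraction in [0, 1]) is the chance that
    any given gene was recovered, independently across genomes.
    """
    for c in completeness:
        if not 0.0 <= c <= 1.0:
            raise ValueError(f"completeness must be in [0, 1], got {c}")
    return prod(1.0 - c for c in completeness)

# Hypothetical completeness values for one genome versus four genomes.
single = prob_missed_in_all([0.5])
four = prob_missed_in_all([0.5, 0.5, 0.5, 0.5])
print(single, four)  # 0.5 0.0625
```

Under these assumptions, a gene missed by chance in a single half-complete genome half of the time would be missed in all four such genomes only about 6% of the time. Real assemblies violate the independence assumption to some degree (for example, when the same genomic regions are systematically hard to assemble), so the figure should be read as an upper-level intuition rather than an estimate.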