INTRODUCTION
Marine invertebrates, such as tunicates and sponges, are known to harbor symbiotic communities of bacteria. Because these animals are sessile and have limited physical defenses, their microbiomes are thought to serve defensive functions, and bacterial symbionts in these invertebrates have been implicated in the production of many bioactive small molecules (1, 2). Such small molecules can possess therapeutically relevant activities, so there is great interest in studying the biosynthetic potential and symbiotic functions of these defensive microorganisms (1, 2). However, most symbionts (and most environmental bacteria in general [1, 3, 4]) are difficult to culture, which renders their bioactive metabolites challenging to access on an industrial or clinically useful scale. Due to fastidious or cryptic growth requirements, the study of these systems is possible only through culture-independent methods, such as direct sequencing of environment-derived DNA (termed shotgun metagenomics).
In the present work, we used shotgun metagenomics to gain a greater understanding of the uncultured bacterial symbiont “Candidatus Endobugula sertula,” which resides within the marine bryozoan Bugula neritina (Fig. 1A), where it is known to produce defensive compounds called bryostatins (5–8) (Fig. 1B). Although the adult bryozoan is covered in a protective layer of chitin, the larvae require chemical defense after release (9–11), and high concentrations of “Ca. Endobugula sertula” have been observed within free-swimming larvae (10).
Despite many unsuccessful attempts to culture the producing organism in laboratory settings, the bryostatins remain a target of therapeutic interest (7), as they are highly potent, cytotoxic protein kinase C activators that have been investigated in clinical trials for the treatment of cancer, Alzheimer's disease, and HIV infection (Fig. 1B) (7). The bry pathway for bryostatin production is a trans-acyltransferase (AT)-type polyketide synthase (PKS) pathway (8, 12). PKSs are related to fatty acid synthases, and they construct a molecule in a similar fashion from two-carbon units derived from malonate (13), using multiple enzymatic domains in a typical PKS protein. Most PKS pathways include AT domains within PKS proteins that are used to load the activated S-coenzyme A (CoA) thioester form of malonate, but in trans-AT systems, the AT exists on a separate protein. Portions of the bry pathway for the bryostatins were recovered previously using clone library methods (6), but the complete genome sequence, including missing pieces of the bry pathway, has remained elusive due to recalcitrance of the symbiont to culture. A number of factors are thought to contribute to this recalcitrance, including the possibility of substantial genome reduction (7), which has been reported in other invertebrate symbioses (14–16). Indeed, the divergence of symbionts in genetically isolated hosts (17), as well as restriction to B. neritina and allied species, the vertical mode of symbiont transmission, and the ongoing inability to isolate and culture “Ca. Endobugula sertula” suggested a lifestyle of extreme host dependence that can lead to gene loss and genome reduction (18). However, we recently found that under some circumstances, “Ca. Endobugula sertula” may be horizontally transferred between B. neritina individuals (19), which is not consistent with the strict host dependency that is associated with extreme genome reduction (18).
Our objective was to assemble the genome of “Ca. Endobugula sertula” in order to determine whether the symbiosis has evolved to a state of codependency, evidenced by bacterial genome degradation, as in some other defensive symbioses (14–16). Shotgun DNA and RNA sequencing revealed a genome with few signs of reduction, largely intact mainstream metabolic pathways, and a putative mechanism of vertical transmission of “Ca. Endobugula sertula” to host larvae.
DISCUSSION
Reductive evolution is known to occur in a number of settings, including intracellular symbiosis (18, 53), intracellular pathogenesis (54), and free-living pelagic microbes (55). In strictly intracellular organisms, reduction is thought to occur due to genetic drift as a consequence of genetic isolation and strict vertical transmission, low effective population sizes, and frequent bottlenecks (18). In free-living examples, it is thought that reduction is driven by selection rather than drift (56), potentially due to a selective advantage in not investing in the production of a metabolite that has become a “public good” in the community (57). In the “Black Queen” model (57), when a metabolic pathway becomes rare enough in the community, further loss is selected against to prevent loss of community fitness as a whole.
Features in the “Ca. Endobugula sertula” genome were inconsistent with extreme reduction due to genetic drift. The early stages of such a process are characterized by a proliferation of pseudogenes and an accompanying low coding density (18). In contrast to the hundreds of pseudogenes identified in Sodalis glossinidius, a tsetse fly symbiont in the early stages of genome degradation (58), only 30 potential pseudogenes were found in the “Ca. Endobugula sertula” genome. Furthermore, the “Ca. Endobugula sertula” genome did not have a particularly low G+C content, had a coding density within the typical bacterial range (45), and had a genome size substantially larger than the previously predicted 2 Mbp (7). However, careful examination of the metabolic capabilities implied by the “Ca. Endobugula sertula” genome suggested that it lacked a number of enzymes required for the synthesis of certain amino acids (Fig. 3), potentially explaining its dependence on its host environment and, perhaps, other microbial constituents of the host microbiome. Metatranscriptomic analysis suggested that the symbiont was actively involved in metabolic processes related to amino acid metabolism, such as arginine biosynthesis. High expression levels of the arginine biosynthesis pathway may help explain a previous observation that B. neritina larvae do not become depleted of nitrogen during the swimming stage prior to settlement, in contrast to the larvae of other Bugula species that are aposymbiotic (59). The host's larval nutrition may therefore be augmented through nitrogen recycling by “Ca. Endobugula sertula” in the form of ammonia assimilation as carbamoyl phosphate (60) before storage as arginine (61).
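The genome-level indicators discussed above (G+C content and coding density) are straightforward to compute once a sequence and gene coordinates are available. The following Python sketch is illustrative only and is not the pipeline used in this study; the function names and the assumption that genes are supplied as 1-based inclusive (start, end) coordinates on a single sequence are hypothetical choices made for the example.

```python
def gc_content(seq: str) -> float:
    """Fraction of G and C among unambiguous bases (A, C, G, T).

    Ambiguous characters such as N are ignored in both the
    numerator and the denominator.
    """
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    total = gc + seq.count("A") + seq.count("T")
    return gc / total if total else 0.0


def coding_density(genome_length: int, genes: list[tuple[int, int]]) -> float:
    """Fraction of genome positions covered by at least one gene.

    Assumption: genes are 1-based inclusive (start, end) intervals.
    Overlapping genes are merged so shared bases are counted once.
    """
    covered = 0
    cur_start, cur_end = None, None
    for start, end in sorted(genes):
        if cur_end is None or start > cur_end:
            # Close the previous merged interval before opening a new one.
            if cur_end is not None:
                covered += cur_end - cur_start + 1
            cur_start, cur_end = start, end
        else:
            # Overlap: extend the current merged interval.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start + 1
    return covered / genome_length if genome_length else 0.0
```

A genome undergoing early reductive evolution would be expected to show a lower coding density by this measure, because pseudogenes are excluded from the list of intact genes.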
Previously, it was found that different populations and sibling species of Bugula harbor different strains of “Ca. Endobugula sertula” (7), which is a pattern of distribution consistent with a vertical mode of symbiont transmission, where respective populations have been genetically isolated since host divergence. However, we previously found evidence that horizontal transmission is likely also possible (19). We found that two sibling species of B. neritina, northern (N) and shallow (S), previously thought to be allopatrically distributed, actually coexist along the Western Atlantic. Type N populations are divergent from type S animals and are aposymbiotic in their typical northern range. However, in Western Atlantic populations, both type N and type S individuals can be found harboring 100% identical “Ca. Endobugula sertula” strains (as measured by 16S and internal transcribed spacer [ITS] sequences). The most parsimonious explanation for this observation is the horizontal transfer of symbionts from type S to type N individuals.
The noncongruence of host and symbiont phylogenies (17, 38, 40) also argues against a long evolutionary history of strict vertical transmission. Although “Ca. Endobugula sertula” has proven difficult to isolate and culture, the lack of extensive genome reduction suggests that a transient host-free existence may be possible, a plausible explanation for the symbiont acquisition by type N hosts found alongside symbiotic type S hosts at low latitudes (19). A similar situation is found in the tunicate Lissoclinum patella, which harbors a photosynthetic symbiont, Prochloron didemni (62). As with “Ca. Endobugula sertula,” P. didemni has never been cultivated in the laboratory, but its large genome size and the lack of evidence for genetic drift and accelerated evolution suggest that it might not be genetically isolated inside individual hosts (63). “Candidatus Endobugula sertula” appears to have specific metabolic deficiencies, and these deficiencies might be the reason that “Ca. Endobugula sertula” is dependent on the host. If the symbiont is indeed able to withstand short periods of time outside the host (i.e., it is not genetically isolated within individual hosts), these metabolic deficiencies may have come about due to selection for not investing in “public goods” that are supplied by the host (57). However, over evolutionary time scales, further metabolic deficiencies may develop if the symbiont becomes restricted to living only within the host and undergoes genome degradation and reduction, which would likely prevent horizontal transmission of the symbiont.
Despite few signs of genome degradation, the “Ca. Endobugula sertula” genome shows some potential adaptations to symbiotic life. Several pseudogenes were annotated with functions in flagellar assembly and motility, and similar functions are lost in intracellular Rickettsiales (64). The chitinase-containing Tc locus that was found on NODE28 is related to the Yen-Tc locus from the insect pathogen Yersinia entomophaga (65) and might also be involved in symbiosis. The chitinases in Yen-Tc are thought to associate with the secreted Tc assembly, which exhibits chitinase activity in vitro, to contribute to Y. entomophaga pathogenicity (65). The chitinase-containing Tc locus in “Ca. Endobugula sertula” is highly expressed in the ovicells, consistent with its use in allowing “Ca. Endobugula sertula” to move from the funicular cords within the adult host (5) through the potentially chitinaceous ectocyst (66, 67), thus ensuring vertical transmission to the larvae. The high expression of Tc loci in ovicells suggests that these proteins are important to the symbiont during the reproductive phase of its host. Other symbiotic systems have been shown to coopt mechanisms used in virulence and immunogenicity of pathogens, such as during the acquisition of the light-producing symbiont Vibrio fischeri by the Hawaiian bobtail squid, where the immunogenic bacterial products lipopolysaccharide and peptidoglycan are central to symbiont-host interactions (68, 69). In a similar fashion, the other Tc loci in the “Ca. Endobugula sertula” genome might also be involved in aspects of recognition and communication with the bryozoan host, and their presence might signify that ancestors of “Ca. Endobugula sertula” were pathogenic.
In the present work, a draft assembly of the “Ca. Endobugula sertula” genome was recovered and estimated to be 100% complete by single-copy-marker analysis (39) using comparative shotgun metagenomic data sets. Adapting a method developed by Albertsen et al. (49), the sequence of the repeat-heavy bry biosynthetic pathway was confirmed to agree with the sequence previously constructed using a clone library method and Sanger sequencing (6). Although additional pathway components were identified, enzymes in the “Ca. Endobugula sertula” genome that could be responsible for the addition of the ester side chains could not be found (Fig. 1B), suggesting that the symbiont employs either a novel mechanism that could not be identified bioinformatically or enzymes annotated with roles in primary metabolism, as others have observed (16, 70). Alternatively, the side chains may be installed by the host, rather than the symbiont. Additionally, because bryS showed no RNA-seq coverage, this gene may not be involved in the biosynthesis of bryostatins. If BryS is not responsible for O-methylation of β-branches, this conversion is most likely to be carried out by methyltransferase domains in BryA and BryB polyketide synthases.
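Single-copy-marker completeness estimates of the kind mentioned above rest on a simple idea: a set of genes expected to occur exactly once in a complete genome is searched for in the assembly, and the fraction recovered is reported. The Python sketch below illustrates only that arithmetic. It does not reproduce the method of reference 39; the function name and marker identifiers are hypothetical, and real tools typically detect markers by profile searches against lineage-specific marker sets.

```python
from collections import Counter


def marker_completeness(expected_markers, observed_hits):
    """Estimate completeness and redundancy from single-copy markers.

    expected_markers: identifiers of genes expected once per genome.
    observed_hits: marker identifiers detected in the assembly
                   (one entry per detected copy).
    Returns (completeness_percent, redundancy_percent).
    """
    expected = set(expected_markers)
    if not expected:
        raise ValueError("expected_markers must not be empty")
    # Count only hits that belong to the expected marker set.
    counts = Counter(hit for hit in observed_hits if hit in expected)
    completeness = 100 * len(counts) / len(expected)
    # Extra copies beyond the first hint at contamination or misassembly.
    extra_copies = sum(n - 1 for n in counts.values())
    redundancy = 100 * extra_copies / len(expected)
    return completeness, redundancy
```

Under this scheme, an estimate of 100% means that every expected marker was found at least once; it does not by itself show that the assembly is closed or free of gaps.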
Interestingly, although the majority of bacterial secondary metabolite pathways consist of genes clustered into one chromosome region (71), several symbiotic bacteria have been found to contain fragmented pathways (14–16, 29), including the previously known portions of the bry pathway, which were found to be split into two separate loci in the type D genotype of “Ca. Endobugula sertula” (7). The AB1 draft genome of “Ca. Endobugula sertula” does not contain any other PKS pathway, apart from bry, that would explain the presence of the newly identified bryTU genes. The identification of these genes, which are likely implicated in bryostatin biosynthesis, highlights the importance of recovering complete or near-complete genomes and, thus, the use of unbiased shotgun sequencing in capturing and understanding the fragmented biosynthetic pathways of uncultured microorganisms. Our identification of these two additional bry genes, along with the metabolic analysis of “Ca. Endobugula sertula,” may facilitate further studies into the renewable supply of bryostatins, either through heterologous expression or targeted symbiont culture.
ACKNOWLEDGMENTS
This research was performed in part using the computer resources and assistance of the UW-Madison Center For High Throughput Computing (CHTC) in the Department of Computer Sciences. We thank Niels Lindquist (UNC) for assistance with field collections, Ben Oyserman (UW-Madison) for assistance with RNA-seq analysis, and Ahron Flowers (RMC) for assistance with B. neritina genotyping. We thank the University of Wisconsin Biotechnology Center DNA Sequencing Facility for providing sequencing facilities and library preparation services.
The CHTC is supported by UW-Madison, the Advanced Computing Initiative, the Wisconsin Alumni Research Foundation, the Wisconsin Institutes for Discovery, and the National Science Foundation, and is an active member of the Open Science Grid, which is supported by the National Science Foundation and the U.S. Department of Energy's Office of Science. Additionally, this work utilized computer resources at Future Grid, which is supported by National Science Foundation grant 0910812. This work was supported by grant R21AI121704-01 from NIAID; by funding from The Thomas F. and Kate Miller Jeffress Memorial Trust (Bank of America, Trustee) and the American Foundation for Pharmaceutical Education (to I.J.M.); and by the School of Pharmacy, the Graduate School, and the Institute for Clinical & Translational Research at the University of Wisconsin-Madison.
G.E.L.-F. and J.C.K. collected and prepared samples for analysis; I.J.M. and J.C.K. designed and performed the research; I.J.M., N.V., G.E.L.-F., S.S.F., and J.C.K. analyzed data; and I.J.M. and J.C.K. wrote the paper.