Myxobacteria are an excellent source of structurally diverse, bioactive natural products (1). Isolated from a soil sample with plant residues near Mitgamr, Egypt, in 2007, Cystobacter violaceus strain Cb vi76, DSM 14727, produces the cytotoxic polyketide natural product gephyronic acid, which is a potent and selective inhibitor of eukaryotic protein synthesis (3–6). Herein we present a draft genome sequence of the strain, obtained as part of our efforts to determine the biosynthetic pathway for gephyronic acid.
C. violaceus genomic DNA was sequenced using a Roche 454 GS FLX sequencer with a combination of shotgun sequencing and 3-kb paired-end sequencing at the Genomics and Bioinformatics Core Facility at the University of Notre Dame. Shotgun Titanium sequencing yielded 552,177 reads, which were assembled using the Roche Newbler Assembler version 2.3 into 431 contigs comprising 12,505,879 bp in total (13× coverage). An additional 2× coverage was supplied by paired-end sequences, which were mapped to the shotgun genome sequence to fill gaps and orient contigs. Based on this assembly, we generated the C. violaceus Cb vi76 draft genome sequence, which consists of 12,570,057 bp distributed among 83 scaffolds and has a GC content of 68.9%, at a total of approximately 15× coverage of the genome.
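The coverage figures above can be checked with simple arithmetic. The following is a minimal sketch using only the numbers reported in this paragraph; the function names are hypothetical, and the implied mean read length is an inference from the rounded 13× value, not a figure reported here.

```python
# Figures reported in the sequencing paragraph above.
SHOTGUN_READS = 552_177
CONTIG_BP = 12_505_879       # total length of the 431 contigs
SCAFFOLD_BP = 12_570_057     # total length of the 83 scaffolds
SHOTGUN_COVERAGE = 13        # reported as 13x (rounded)
PAIRED_END_COVERAGE = 2      # reported as an additional 2x


def total_coverage(*parts):
    """Sum the per-library coverage contributions."""
    return sum(parts)


def implied_mean_read_length(coverage, assembly_bp, n_reads):
    """Mean read length implied by coverage = reads * length / assembly size.

    This is an inference from rounded inputs, so treat the result as approximate.
    """
    return coverage * assembly_bp / n_reads


combined = total_coverage(SHOTGUN_COVERAGE, PAIRED_END_COVERAGE)
read_len = implied_mean_read_length(SHOTGUN_COVERAGE, CONTIG_BP, SHOTGUN_READS)
extra_bp = SCAFFOLD_BP - CONTIG_BP  # length difference between scaffolds and contigs

print(f"combined coverage: {combined}x")
print(f"implied mean shotgun read length: about {read_len:.0f} bp")
print(f"scaffold length minus contig length: {extra_bp:,} bp")
```

Because the reported coverage is rounded to a whole number, the implied read length is only a rough consistency check on the reported totals.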
The C. violaceus genome was analyzed and annotated using RAST version 4.0 (7) and the NCBI Prokaryotic Genomes Annotation Pipeline (PGAP) (http://www.ncbi.nlm.nih.gov/genome/annotation_prok/). Analysis of the unclosed draft genome sequence for C. violaceus provided an estimated genome size of 12.57 Mbp. The 431 contigs contain 8,201 putative coding sequences (CDSs). Genes are evenly distributed between the forward (51.2%) and reverse (48.8%) strands. The average length of the CDSs is 1,065 bp, and 50.1% of the CDSs encode proteins whose functions are unknown. In addition to the protein-coding genes, single 5S, 16S, and 23S rRNA genes were annotated using RNAmmer (8). The tRNAscan-SE search server annotated 79 tRNA genes representing all 20 common amino acids (9).
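The annotation percentages above can be converted into approximate gene counts and a rough coding density. This is a minimal sketch based only on the figures reported in this paragraph; the helper names are hypothetical, the counts are back-calculated from rounded percentages, and the density estimate assumes that CDSs do not overlap.

```python
# Figures reported in the annotation paragraph above.
GENOME_BP = 12_570_057
N_CDS = 8_201
MEAN_CDS_BP = 1_065


def coding_density(n_cds, mean_cds_bp, genome_bp):
    """Fraction of the genome covered by CDSs, assuming no overlap between them."""
    return n_cds * mean_cds_bp / genome_bp


def count_from_percent(total, percent):
    """Back-calculate an approximate count from a rounded percentage."""
    return round(total * percent / 100)


density = coding_density(N_CDS, MEAN_CDS_BP, GENOME_BP)
forward = count_from_percent(N_CDS, 51.2)
reverse = count_from_percent(N_CDS, 48.8)
unknown = count_from_percent(N_CDS, 50.1)

print(f"approximate coding density: {density:.1%}")
print(f"forward-strand CDSs: about {forward}")
print(f"reverse-strand CDSs: about {reverse}")
print(f"CDSs of unknown function: about {unknown}")
```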
Additional genome analysis with antiSMASH version 1.1.0 predicted several secondary metabolite biosynthetic gene clusters, including biosynthetic pathways for 5 lantibiotics, 2 bacteriocins, 6 terpenes, 4 polyketides, 7 nonribosomal peptides, and 8 polyketide/nonribosomal peptide hybrids (10). The products of all predicted polyketide synthase and/or nonribosomal peptide synthetase gene clusters are of unknown structure, except for that of the gephyronic acid cluster (6). Further inspection of the proposed gene clusters revealed that several are in fact gene cluster fragments, as expected given the draft quality of the C. violaceus genome sequence.
The breadth of proposed biosynthetic enzymes with unknown cognate natural products, together with the known gephyronic acid biosynthetic pathway, exemplifies the natural product potential intrinsic to myxobacteria such as C. violaceus. We believe the draft genome sequence will facilitate further investigation of secondary metabolism in myxobacteria.
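The antiSMASH cluster counts reported above can be tallied as follows. This is a minimal sketch that uses only the counts given in the text; the dictionary keys and variable names are illustrative, and because several predicted clusters are fragments, the totals are likely to overstate the number of distinct pathways.

```python
# Predicted cluster counts by class, as reported in the antiSMASH analysis above.
predicted_clusters = {
    "lantibiotic": 5,
    "bacteriocin": 2,
    "terpene": 6,
    "polyketide": 4,
    "nonribosomal peptide": 7,
    "polyketide/nonribosomal peptide hybrid": 8,
}

# Classes that involve polyketide synthase and/or nonribosomal peptide synthetase genes.
pks_nrps_classes = (
    "polyketide",
    "nonribosomal peptide",
    "polyketide/nonribosomal peptide hybrid",
)

total_clusters = sum(predicted_clusters.values())
pks_nrps_clusters = sum(predicted_clusters[name] for name in pks_nrps_classes)

print(f"predicted clusters in the listed classes: {total_clusters}")
print(f"PKS and/or NRPS clusters: {pks_nrps_clusters}")
```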
Nucleotide sequence accession numbers.
This whole-genome shotgun project has been deposited in DDBJ/ENA/GenBank under the accession number JPMI00000000. The version described in this paper is the first version, JPMI01000000.