Roseovarius sp. strain MCTG156(2b) was isolated from a phytoplankton net sample that was collected by trawl in 2009 at a sampling station designated LY1, located on the west coast of Scotland near Oban, Argyll. The strain was isolated by enrichment with phenanthrene in ZoBell’s 2216 marine medium at a 10-fold dilution. Colonies on agar plates sprayed with phenanthrene produced distinct halos, indicating the strain’s ability to degrade the hydrocarbon. Based on 16S rRNA gene sequence identity, the closest type strain was
Roseovarius mucosus strain DFL-24, which had also been isolated from a laboratory culture of a dinoflagellate (
1).
Here, we report the genome sequence of
Roseovarius sp. strain MCTG156(2b). Genomic DNA was sequenced as part of the DOE Joint Genome Institute (JGI) 2014 Genomic Encyclopedia of Type Strains, Phase III study (
2) using Pacific Biosciences (PacBio) technology. A PacBio SMRTbell library was constructed and sequenced on the PacBio RS platform, which generated 170,293 filtered subreads totaling 598.2 Mbp. General details of library construction and sequencing at the JGI are available at
http://www.jgi.doe.gov. The raw reads were assembled using HGAP (version 2.1.1) (
3). The final draft assembly comprised 8 scaffolds (8 contigs) totaling 5.1 Mbp, with an input read coverage of 121.5×.
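As a quick sanity check on the read statistics, the short Python sketch below derives the mean subread length and a nominal coverage from the figures reported for this genome (170,293 subreads, 598.2 Mbp, and an assembly of 5,113,782 bp). The function names are illustrative only. The simple ratio of subread bases to assembly size is an approximation and need not equal the reported input read coverage of 121.5×, which the assembly pipeline may compute differently.

```python
# Figures taken from the text; everything else here is an illustrative sketch.
SUBREADS = 170_293        # filtered subreads
SUBREAD_BASES = 598.2e6   # total subread bases (598.2 Mbp)
ASSEMBLY_BP = 5_113_782   # total assembly length in bp


def mean_read_length(total_bases: float, n_reads: int) -> float:
    """Average read length in bp."""
    if n_reads <= 0:
        raise ValueError("n_reads must be positive")
    return total_bases / n_reads


def nominal_coverage(total_bases: float, assembly_bp: int) -> float:
    """Total sequenced bases divided by assembly size (a rough estimate)."""
    if assembly_bp <= 0:
        raise ValueError("assembly_bp must be positive")
    return total_bases / assembly_bp


if __name__ == "__main__":
    print(f"mean subread length: {mean_read_length(SUBREAD_BASES, SUBREADS):.0f} bp")
    print(f"nominal coverage:    {nominal_coverage(SUBREAD_BASES, ASSEMBLY_BP):.1f}x")
```

With these inputs the mean subread length is about 3.5 kb and the nominal coverage is roughly 117×.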
Project information is available in the Genomes OnLine Database (
4). Genes were identified using Prodigal (
5), as part of the Joint Genome Institute (JGI) Microbial Annotation Pipeline (
6). The predicted coding sequences (CDSs) were translated and used to search the National Center for Biotechnology Information (NCBI) nonredundant, UniProt, TIGRFam, Pfam, KEGG, COG, and InterPro databases. The tRNAscan-SE tool (
7) was used to find tRNA genes, whereas rRNA genes were identified by searching against rRNA gene models built from SILVA (
8). Other noncoding RNAs, such as the RNA components of the protein secretion complex and RNase P, were identified by searching the genome for the corresponding Rfam profiles using Infernal (
http://infernal.janelia.org). Additional analysis and manual functional annotation were performed within the Integrated Microbial Genomes-Expert Review (IMG-ER) platform (
https://img.jgi.doe.gov/) developed by the Joint Genome Institute, Walnut Creek, CA, USA (
9).
The draft genome sequence was 5,113,782 bp in length, with a G+C content of 60.7%. The genome contained 5,142 genes (5,078 protein-coding genes), with functional predictions for 4,198 of them. A total of 64 RNA genes were detected. Other genes characteristic of the genus are listed in the IMG database (
10).
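
The summary statistics above can be recomputed from an assembly file with a few lines of Python. The sketch below is a minimal illustration, not the JGI pipeline: the helper names `read_fasta` and `gc_percent` are hypothetical, the toy FASTA records are invented for demonstration, and ambiguous bases such as N are assumed to count toward the total length. It also checks that the reported gene counts are internally consistent.

```python
from typing import Dict, Iterable, List


def read_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Parse FASTA-formatted lines into {record_id: sequence}."""
    records: Dict[str, List[str]] = {}
    current = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue
        if line.startswith(">"):
            header = line[1:].split()
            # Fall back to a generated name if the header is empty.
            current = header[0] if header else f"record_{len(records) + 1}"
            records[current] = []
        elif current is not None:
            records[current].append(line.upper())
    return {name: "".join(parts) for name, parts in records.items()}


def gc_percent(seqs: Dict[str, str]) -> float:
    """G+C content as a percentage of all bases (N and other codes included)."""
    total = sum(len(s) for s in seqs.values())
    if total == 0:
        raise ValueError("no sequence data")
    gc = sum(s.count("G") + s.count("C") for s in seqs.values())
    return 100.0 * gc / total


# Gene counts reported in the text.
PROTEIN_CODING = 5_078
RNA_GENES = 64
TOTAL_GENES = 5_142

if __name__ == "__main__":
    # Toy input for demonstration only; replace with lines from a real FASTA file.
    toy = ">scaffold_1\nGGCCAT\nGC\n>scaffold_2\nATAT\n".splitlines()
    seqs = read_fasta(toy)
    print(f"total length: {sum(len(s) for s in seqs.values())} bp")
    print(f"G+C content:  {gc_percent(seqs):.1f}%")
    print("gene counts consistent:", PROTEIN_CODING + RNA_GENES == TOTAL_GENES)
```

To apply this to a real assembly, pass an open file handle to `read_fasta`, for example `read_fasta(open("assembly.fna"))`, where the file name is a placeholder.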