Open access
8 November 2018

Complete Genome Sequence of Megasphaera stantonii AJH120T, Isolated from a Chicken Cecum


A novel bacterium, Megasphaera stantonii (strain AJH120T), was isolated from the cecum of a chicken during a screen for microorganisms using a brain heart infusion (BHI) medium. PacBio and Illumina MiSeq sequencing technologies were used to produce a single contiguous chromosome. The resulting genome sequence is 2,652,760 bp long with a 52.62% GC content.


The genus Megasphaera comprises butyrate-producing organisms within the Firmicutes phylum. Members of the Megasphaera genus have been isolated from a variety of anaerobic environments, including the intestinal tracts of pigs, cows, and humans (1–5). Here, we describe the complete genome sequence of a recently characterized Megasphaera species, isolated from the cecal contents of a healthy white leghorn chicken. The isolation, cultivation, and biochemical properties of Megasphaera stantonii AJH120T were previously described (6). Anaerobic commensal bacteria, such as Megasphaera spp., may play an important role in host energy acquisition, and beneficial functions, such as butyrate production, could be exploited for poultry nutrition (7).
Genomic DNA of strain AJH120T was extracted from a 48-h culture grown in peptone-yeast-fructose broth at 42°C using the PureLink genomic DNA extraction minikit (Invitrogen) according to the manufacturer’s instructions. A DNA library (350-bp insert size) was generated from 100 ng of extracted DNA using the TruSeq Nano DNA library preparation kit (Illumina) and sequenced on a MiSeq machine utilizing v3 chemistry (Illumina). High-molecular-weight genomic DNA was extracted using the Qiagen Genomic-tip 20/G genomic DNA extraction kit according to the manufacturer’s instructions. From that extracted DNA, 10 μg was submitted to the Yale University genomic facility for single-molecule real-time (SMRT) sequencing (Pacific Biosciences) to generate long sequencing reads. A total of 458,687 PacBio subreads were obtained with an N50 value of 20,052 bp. For de novo assembly of the PacBio reads, Canu v1.5 (8) was run on the raw concatenated reads using an estimated genome size of 2.8 Mb and the default settings. Canu corrected, trimmed, and assembled the PacBio reads to generate one 2,666,587-bp contig with a 52.60% GC content. The contig was circularized with Circlator v1.5 using default settings (9). Circlator identified and trimmed the overlap at the contig ends, resulting in a genome that was 2,652,026 bp long and had a GC content of 52.62%.
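The read and contig statistics quoted above (N50, GC content, and the length removed at circularization) can be illustrated with a minimal Python sketch. The function names and toy inputs are our own illustrative assumptions and are not part of the published pipeline; only the two contig lengths are taken from the text.

```python
from typing import Iterable


def n50(lengths: Iterable[int]) -> int:
    """Return the N50: the length L such that sequences of length >= L
    together contain at least half of all bases."""
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length
    return 0  # empty input


def gc_percent(sequence: str) -> float:
    """Return the percentage of G and C bases, ignoring case."""
    seq = sequence.upper()
    if not seq:
        return 0.0
    gc_count = seq.count("G") + seq.count("C")
    return 100.0 * gc_count / len(seq)


# Toy inputs for illustration only (not data from this study).
toy_lengths = [2, 3, 4, 5, 6]  # 20 bases in total, so half is 10
toy_sequence = "ATGCGC"        # 4 of 6 bases are G or C

print(n50(toy_lengths))                    # 5
print(round(gc_percent(toy_sequence), 2))  # 66.67

# Contig lengths reported in the text, before and after circularization.
CANU_CONTIG_BP = 2_666_587
CIRCLATOR_CONTIG_BP = 2_652_026
print(CANU_CONTIG_BP - CIRCLATOR_CONTIG_BP)  # 14561 bp removed as overlap
```

Applied to the real subread lengths and the assembled contig, the same functions would be expected to reproduce the reported N50 and GC values, although the sequencing and assembly tools compute these figures internally.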
To polish the genome sequence, the short MiSeq reads were used to correct errors in the PacBio assembly. Illumina MiSeq reads were processed with Trimmomatic v0.36 (10) using the default settings for paired-end mode to remove adaptors, trim sequence with a quality score of <15, and discard reads shorter than 36 bp. The Burrows-Wheeler aligner “mem” algorithm (BWA-mem) v0.7.12 (11) was used to map the Illumina reads to the genome contig, producing an alignment file. SAMtools v1.4.1 (12) was used to sort and index the alignment file. Pilon v1.18 (13) was used to correct the genome contig using the option “--fix bases,” resulting in a final genome sequence of 2,652,760 bp with a 52.62% GC content.
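The polishing steps described above can be sketched as a small Python script that only assembles and prints the command lines, without running them. All file names and jar locations are hypothetical placeholders. The Trimmomatic step list follows the paired-end example in the Trimmomatic manual, which we assume corresponds to the "default settings" mentioned in the text, and piping BWA-mem output directly into SAMtools is our own simplification rather than the authors' documented invocation.

```python
import shlex
from typing import List

# All file names and jar locations below are hypothetical placeholders.
REFERENCE = "circularized_contig.fasta"
READS_1 = "miseq_R1.fastq.gz"
READS_2 = "miseq_R2.fastq.gz"
TRIMMOMATIC_JAR = "trimmomatic-0.36.jar"
PILON_JAR = "pilon-1.18.jar"
ADAPTER_FASTA = "TruSeq3-PE.fa"  # assumed adapter file
SORTED_BAM = "aligned.sorted.bam"


def polishing_commands() -> List[str]:
    """Return shell commands for trimming, mapping, and polishing."""
    trim = [
        "java", "-jar", TRIMMOMATIC_JAR, "PE", READS_1, READS_2,
        "R1.paired.fastq.gz", "R1.unpaired.fastq.gz",
        "R2.paired.fastq.gz", "R2.unpaired.fastq.gz",
        f"ILLUMINACLIP:{ADAPTER_FASTA}:2:30:10",
        "LEADING:3", "TRAILING:3",
        "SLIDINGWINDOW:4:15",  # cut once mean window quality drops below 15
        "MINLEN:36",           # discard reads shorter than 36 bp
    ]
    index_ref = ["bwa", "index", REFERENCE]
    bwa_mem = ["bwa", "mem", REFERENCE,
               "R1.paired.fastq.gz", "R2.paired.fastq.gz"]
    sort_bam = ["samtools", "sort", "-o", SORTED_BAM, "-"]  # "-" reads stdin
    index_bam = ["samtools", "index", SORTED_BAM]
    pilon = ["java", "-jar", PILON_JAR, "--genome", REFERENCE,
             "--frags", SORTED_BAM, "--fix", "bases", "--output", "polished"]
    return [
        shlex.join(trim),
        shlex.join(index_ref),
        shlex.join(bwa_mem) + " | " + shlex.join(sort_bam),
        shlex.join(index_bam),
        shlex.join(pilon),
    ]


# Dry run: print the commands for inspection instead of executing them.
for command in polishing_commands():
    print(command)
```

Printing rather than executing keeps the sketch independent of any local tool installation; the strings could be passed to a shell once the paths are adapted.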
The polished genome sequence was annotated using the PathoSystems Resource Integration Center (PATRIC) online platform (14) to generate an annotated contiguous genome sequence. The PATRIC annotation identified 2,568 protein-coding sequences, 851 of which encode hypothetical proteins. There are 57 tRNA genes and 6 rRNA operons in the genome. Strain AJH120T has a complete butyrate production pathway and several antimicrobial resistance genes, including those for tetracycline, lincosamide, macrolide, and fosfomycin resistance.
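As a small worked example of how the annotation summary can be tallied, the following Python sketch counts hypothetical versus named products in a list of product strings and computes the hypothetical fraction from the reported totals. The function name and the toy product list are illustrative assumptions; they do not reflect the PATRIC output format.

```python
from collections import Counter
from typing import Iterable


def tally_products(products: Iterable[str]) -> Counter:
    """Count products labeled 'hypothetical protein' versus all others."""
    return Counter(
        "hypothetical" if "hypothetical protein" in name.lower() else "named"
        for name in products
    )


# Toy product names for illustration only (placeholders, not PATRIC output).
toy_products = ["hypothetical protein", "product A",
                "Hypothetical protein", "product B", "product C"]
print(tally_products(toy_products))

# Totals reported in the text.
PROTEIN_CODING = 2568
HYPOTHETICAL = 851
print(round(100 * HYPOTHETICAL / PROTEIN_CODING, 1))  # 33.1 (percent)
```

By the reported counts, roughly one-third of the protein-coding sequences lack a functional assignment.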

Data availability.

This complete genome sequence has been deposited in GenBank under the accession no. CP029462. The version described in this report is the first version.


Illumina sequencing was performed at the National Animal Disease Center in Ames, Iowa. PacBio SMRT sequencing was performed at the Yale Center for Genome Analysis in West Haven, Connecticut.
We thank Darrell Bayles and Daniel Nielsen for help with the assembly pipeline. We also thank David Alt and Lisa Lai for sequencing and technical support.
This work was supported by ARS-USDA CRIS funds. Mention of trade names or commercial products in this publication is solely for the purpose of providing specific information and does not imply recommendation or endorsement by the USDA.


