Paenibacillus is a genus of facultative endospore-forming bacteria, and a majority of its strains are found in soil (1). Paenibacillus strains are known for their plant growth-promoting traits and for producing antimicrobial compounds relevant to agriculture and medicine (2, 3). Exploring the genome of Paenibacillus polymyxa HOB6 can provide deeper insight into the mechanisms underlying its activity against a variety of phytopathogens, which may in turn help reduce the use of synthetic pesticides (
4).

Hemp seed oil was purchased from a commercial organic producer (Coco et Calendula, Inc., Montreal, Canada). Equal volumes of seed oil and LB broth were mixed in a 15-ml Falcon tube and incubated overnight at room temperature with agitation (140 rpm). An aliquot (100 μl) of the aqueous phase was streaked onto LB agar (LBA) medium and incubated at 37°C for 5 days to obtain pure single colonies. Genomic DNA (gDNA) was extracted from a single colony using a DNeasy blood and tissue kit (Qiagen, Germany). The preparation of the whole-genome shotgun library and sequencing were carried out by Admera Health (South Plainfield, NJ, USA). The gDNA was fragmented using a Covaris LE220 sonicator, and the paired-end sequencing library was prepared using a Nextera XT DNA library prep kit (Illumina, USA). Genome sequencing was performed on a HiSeq X platform, using a 2 × 150-bp protocol. A total of 5,243,904 sequencing reads were produced and uploaded to the Galaxy Web platform (
5), and we used the public server at https://usegalaxy.org/ to analyze our data using default settings unless otherwise specified. Preprocessing of the sequencing reads was carried out using FastQC version 0.11.9 software (6) for quality assessment and Trim Galore (Galaxy version 0.6.3) (7) to remove low-quality reads and adapter sequences. The filtered reads were assembled de novo using SPAdes (Galaxy version 3.12.0) (
8), with k-mer sizes of 21, 33, 55, 77, 99, 111, and 127, which resulted in a total of 219 contigs. Reference genome sequences from five P. polymyxa strains were used to order and orient the assembled contigs into scaffolds using MeDuSa version 1.6 (9). These strains and their GenBank assembly accession numbers are as follows: CR1 (GCA_000507205.2), E681 (GCA_014706575.1), SC2 (GCA_000164985.2), SQR-21 (GCA_000597985.1), and HY96-2 (GCA_002893885.1). The resultant scaffolds were subjected to NCBI’s contamination screen, and contigs containing contaminants as well as those under 0.2 kb were removed. This resulted in 59 scaffolds with a total length of 5,751,895 bp, a G+C content of 45.57%, and an
N50 value of 2,906,550 bp, with an average depth of sequencing coverage of 266.29×. The assembly statistics were computed using QUAST (Galaxy version 5.0.2) (10). Following the ribosomal multilocus sequence typing approach (https://pubmlst.org/species-id/) (11), strain HOB6 was identified as Paenibacillus polymyxa. The draft genome sequence was annotated using the NCBI Prokaryotic Genome Annotation Pipeline (PGAP) version 5.1 (12). Annotation of this genome revealed a total of 5,248 genes, comprising 4,956 coding DNA sequences (CDSs), 157 RNA genes (including 107 tRNAs, 21 complete rRNAs, and 4 noncoding RNAs [ncRNAs]), and 135 pseudogenes. antiSMASH version 5 (13) was used for biosynthetic gene cluster identification. The strain HOB6 genome comprises gene clusters for the nonribosomal peptides (NRPs) fusaricidin B, polymyxin, and tridecaptin, as well as the lanthipeptides paenicidin B and paenibacillin.
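
For readers who want to see how the assembly statistics reported above (number of scaffolds, total length, G+C content, and N50 after removal of sequences under 0.2 kb) are defined, the following is a minimal Python sketch. It is not the QUAST implementation and was not part of the analysis; the function names (`parse_fasta`, `n50`, `assembly_stats`) and the `min_len` parameter are hypothetical names chosen for illustration, and the sketch assumes a plain FASTA input.

```python
from typing import Dict, List


def parse_fasta(text: str) -> Dict[str, str]:
    """Parse FASTA-formatted text into a {sequence_id: sequence} mapping."""
    chunks: Dict[str, List[str]] = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # Use the first whitespace-delimited token of the header as the ID.
            name = line[1:].split()[0]
            chunks[name] = []
        elif name is not None:
            chunks[name].append(line.upper())
    return {seq_id: "".join(parts) for seq_id, parts in chunks.items()}


def n50(lengths: List[int]) -> int:
    """Length L such that sequences of length >= L hold at least half of all bases."""
    total = sum(lengths)
    running = 0
    for length in sorted(lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0  # empty input


def assembly_stats(seqs: Dict[str, str], min_len: int = 200) -> Dict[str, float]:
    """Summarize an assembly after dropping sequences shorter than min_len bp.

    min_len=200 mirrors the 0.2-kb cutoff described in the text (an assumption
    about how that cutoff would be applied, not the authors' actual code).
    """
    kept = [s for s in seqs.values() if len(s) >= min_len]
    lengths = [len(s) for s in kept]
    gc = sum(s.count("G") + s.count("C") for s in kept)
    # G+C content is computed over unambiguous bases only (N and others ignored).
    acgt = sum(s.count(base) for s in kept for base in "ACGT")
    return {
        "n_sequences": len(kept),
        "total_length": sum(lengths),
        "gc_percent": 100.0 * gc / acgt if acgt else 0.0,
        "n50": n50(lengths),
    }
```

Calling `assembly_stats(parse_fasta(open("scaffolds.fasta").read()))` on a scaffold file (the file name here is a placeholder) would return the four summary values in a dictionary; values may differ slightly from QUAST output depending on how ambiguous bases are handled.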