ANNOUNCEMENT
The bacterium “Candidatus Nardonella dryophthoridicola” is a Gram-negative gammaproteobacterial endocytobiont (Fig. 1). Specifically, it is an intracellular obligate mutualist associated with weevils (1). The bacterium plays a crucial role in cuticle hardening by supplying tyrosine to its host (2). Unlike the second weevil-associated symbiont, “Candidatus Sodalis pierantonius,” it is maintained within a functional bacteriome for its host’s entire life cycle (3–5).
We used long-read sequencing to determine the genome sequence of “Ca. Nardonella dryophthoridicola” strain NardRF, associated with an Italian population of Rhynchophorus ferrugineus. The insect hosts were sampled from a single palm tree in the region of Catania in 2017. The pupae were kept at 25°C in constant darkness (24 h dark) until they molted into adults. Ten newly emerged adults were dissected to extract their bacteriomes. The bacteriomes were then pooled for DNA extraction using the DNeasy blood and tissue kit (Qiagen, Italy) following the manufacturer’s instructions for animal tissue extraction. The DNA integrity was verified by 0.8% agarose gel electrophoresis at 90 V for 1 h. The DNA purity and concentration were measured with a NanoDrop 100 spectrophotometer (Thermo Fisher Scientific, Italy) and a Qubit double-stranded DNA (dsDNA) high-sensitivity assay kit.
Long-read sequencing was performed using the R9.5 flow cell on a MinION Mk1B device. For the library preparation, 2.5 μg of nonsheared and non-size-selected total genomic DNA was used following the 1D ligation sequencing kit (SQK-LSK 108) protocol. Then, 0.5 μg of the final DNA was loaded onto the flow cell. The sequencing was run for 48 h using MinKNOW v18.03.1. Base calling was then run on the fast5 files using Guppy v4.4.1 (6) with the high-accuracy algorithm and a quality cutoff of 7. Reads longer than 500 bp were used for the subsequent analyses. All tools were run with default parameters unless otherwise specified.
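The two read-level filters described above (a quality cutoff of 7 and a length above 500 bp) can be sketched as follows. This is a minimal illustration and not the authors' pipeline: it assumes Phred+33 FASTQ quality strings and assumes that the quality of a read is the mean Phred score computed from per-base error probabilities, which may differ from how Guppy applies its cutoff. The function names and toy reads are hypothetical.

```python
import math


def mean_read_quality(quality_string, phred_offset=33):
    """Return the mean Phred score of a read, averaged in error-probability space."""
    if not quality_string:
        return 0.0
    error_probs = [10 ** (-(ord(ch) - phred_offset) / 10) for ch in quality_string]
    return -10 * math.log10(sum(error_probs) / len(error_probs))


def filter_reads(records, min_length_exclusive=500, min_quality=7.0):
    """Yield (name, sequence, quality) tuples that pass both thresholds."""
    for name, sequence, quality in records:
        long_enough = len(sequence) > min_length_exclusive
        if long_enough and mean_read_quality(quality) >= min_quality:
            yield name, sequence, quality


# Toy reads (hypothetical names and sequences, for illustration only).
reads = [
    ("read_long_good", "A" * 600, "I" * 600),    # Q40, 600 nt: kept
    ("read_short", "A" * 400, "I" * 400),        # Q40 but only 400 nt: dropped
    ("read_low_quality", "A" * 600, "#" * 600),  # Q2: dropped
]
kept = [name for name, _, _ in filter_reads(reads)]
print(kept)
```

Averaging error probabilities rather than raw Phred scores keeps a few very poor bases from being masked by many good ones, which is why the sketch converts to probabilities before taking the mean.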
The metagenomic fastq reads (host and symbiont) were first assembled using miniasm (7). Contigs belonging to “Ca. Nardonella dryophthoridicola” were identified using BLASTn (E value cutoff, 10⁻⁶) against the NCBI nonredundant (nr) database. To refine the assembly, these contigs were extracted and used as references to map and extract the “Ca. Nardonella dryophthoridicola” long reads using minimap2 v2.17 (8). The 836,116 reads were then reassembled using Flye v2.8.1 (9). The resulting genome was circularized using Circlator v1.5.5 (10) with the options --merge_min_id 85 and --merge_breaklen 1000 as advised for Oxford Nanopore reads. The circular genome was corrected using the publicly available Illumina short reads (SRA accession number SRR12633329 [11]) with POLCA (MaSuRCA v4.0.1) (12, 13). After each of the assembly, circularization, and polishing steps, the genome quality was assessed using BUSCO v4.1.4 (14) with the Gammaproteobacteria database. The final genome was automatically annotated by GenBank using PGAP r2021-01-09.build5126 (Table 1) (15).
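The BLASTn screening step in the paragraph above can be illustrated with a short sketch that selects contigs from tabular BLAST output. It assumes the search was written in the standard 12-column tabular format (`-outfmt 6`), in which the query ID, subject ID, and E value are columns 1, 2, and 11; the source does not state which output format was used. The contig names, the example hit lines, and the `is_target` predicate that decides whether a subject sequence belongs to the symbiont are all hypothetical.

```python
def contigs_with_target_hits(blast_lines, is_target, max_evalue=1e-6):
    """Return the set of query contigs with at least one qualifying hit.

    blast_lines: iterable of tab-separated lines in BLAST -outfmt 6 layout.
    is_target:   callable taking a subject ID and returning True or False.
    """
    selected = set()
    for line in blast_lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.split("\t")
        query_id, subject_id = fields[0], fields[1]
        evalue = float(fields[10])
        if evalue <= max_evalue and is_target(subject_id):
            selected.add(query_id)
    return selected


# Hypothetical hits: one strong hit to a Nardonella accession, one weak hit
# to the same accession, and one strong hit to an unrelated sequence.
example_hits = [
    "contig_7\tNZ_AP018161.1\t92.5\t5000\t300\t75\t1\t5000\t100\t5100\t0.0\t7000",
    "contig_9\tNZ_AP018161.1\t80.0\t40\t8\t0\t1\t40\t200\t239\t0.5\t30.1",
    "contig_3\tXX_000000.1\t99.0\t3000\t30\t0\t1\t3000\t1\t3000\t1e-150\t5400",
]
nardonella_accessions = {"NZ_AP018161.1"}
symbiont_contigs = contigs_with_target_hits(
    example_hits, lambda subject: subject in nardonella_accessions
)
print(sorted(symbiont_contigs))
```

The selected contig IDs would then serve as the reference set for the read-mapping step; how subject sequences are assigned to the symbiont (by accession list here) is a design choice of this sketch.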
Genome comparison with the closest genome (RefSeq accession number NZ_AP018161 [2]), using ACT (Artemis v18.1.0 [16]), revealed that the gene encoding the isoleucine tRNA ligase (ileS) is complete in our genome, whereas it contains a 1-nucleotide frameshift at position 820 in the closest genome. This difference illustrates the importance of sequencing the same streamlined bacterial endocytobiont from different host populations, as genome reduction through random genetic mutations combined with a maternal transmission bottleneck can result in genomic differences within the same endosymbiont species.
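To make the effect of a 1-nucleotide frameshift concrete, the following sketch translates a short toy coding sequence before and after deleting a single base. The sequence is invented for illustration and is not taken from ileS; the codon table is the standard genetic code, whose amino acid assignments are the same as in the bacterial code.

```python
from itertools import product

# Standard genetic code, with codons ordered by the bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(dna):
    """Translate complete codons from position 0, stopping at the first stop codon."""
    protein = []
    for start in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[start:start + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


def delete_base(dna, index):
    """Return the sequence with the base at the given 0-based index removed."""
    return dna[:index] + dna[index + 1:]


intact = "ATGAAAGGTCTGTAA"          # toy open reading frame (hypothetical)
shifted = delete_base(intact, 4)    # single-base deletion inside the second codon

print(translate(intact))
print(translate(shifted))
```

Every codon downstream of the deletion is read in a different frame, so the encoded residues change and the original stop codon is no longer in frame, which is why a single missing nucleotide is enough to disrupt a gene.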
Data availability.
The assembly has been deposited in GenBank under accession number CP069383 and BioProject accession number PRJNA699994. The version described in this paper is the first version, CP069383.1. The Oxford Nanopore reads used for the assembly of “Ca. Nardonella dryophthoridicola” have been deposited under SRA accession number SRR14598013.