Open access
Genomics and Proteomics
7 June 2023

Complete Genome Sequence of “Candidatus Phytoplasma aurantifolia” TB2022, a Plant Pathogen Associated with Sweet Potato Little Leaf Disease in China


The complete genome sequence of “Candidatus Phytoplasma aurantifolia” TB2022, which consists of one 670,073-bp circular chromosome, is presented in this work. This bacterium is associated with sweet potato little leaf disease in Fujian Province, China.


Phytoplasmas are a group of phloem-inhabiting bacteria transmitted by sap-feeding insects (1). They have a broad host range and are associated with a variety of symptoms, such as witches’ broom, little leaf, and phyllody, causing significant damage to many crops. One species-level taxon, “Candidatus Phytoplasma aurantifolia,” is widespread in the Western Pacific and has been a persistent threat to sweet potato production in the coastal area of Fujian Province, China, since the 1960s (2). To improve our understanding of this pathogen, we conducted whole-genome shotgun sequencing of a field sample. All kits were used according to the manufacturer’s protocols, and all bioinformatics tools were used with default settings unless stated otherwise.
Strain TB2022 was collected from symptomatic sweet potatoes cultivated in Tianbian Village (Hui’an County, Fujian Province) in July 2022. Leaves from one plant exhibiting severe witches’ broom and little leaf symptoms were collected for DNA extraction using the Hi-DNAsecure plant kit (DP350; Tiangen Biotech). For Illumina sequencing, a library was prepared using the VAHTS universal Plus DNA library prep kit (ND617-C3-02; Vazyme), followed by sequencing on the Illumina NovaSeq 6000 platform to generate ~66.8 million reads, totaling 10.0 Gb. For Oxford Nanopore Technologies (ONT) sequencing, DNA fragments of >10 kb were collected using the BluePippin system (Sage Science), followed by processing using NEBNext formalin-fixed, paraffin-embedded (FFPE) DNA repair mix and the NEBNext Ultra II end repair/dA-tailing module (New England Biolabs). A library was prepared using the ONT ligation sequencing kit (SQK-LSK109) without shearing and sequenced using PromethION flow cells (FLO-PRO002) on the PromethION 48 system. Guppy v3.2.6 was used for base calling in fast mode, which produced 799,539 reads totaling 12.1 Gb (N50, 25.4 kb).
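The read N50 values reported for the ONT data are length-weighted medians: the N50 is the length L such that reads of length L or longer contain at least half of all sequenced bases. As a minimal illustrative sketch (the function name `n50` and the plain list input are assumptions for illustration and are not part of the pipeline used in this work), the statistic can be computed as follows.

```python
def n50(lengths):
    """Return the N50 of a collection of read (or contig) lengths.

    The N50 is the largest length L such that sequences of length >= L
    together contain at least half of the total number of bases.
    """
    if not lengths:
        raise ValueError("at least one length is required")
    total = sum(lengths)
    running = 0
    # Walk from the longest sequence downward, accumulating bases.
    for length in sorted(lengths, reverse=True):
        running += length
        if running * 2 >= total:
            return length


if __name__ == "__main__":
    # Toy example with five reads (values are illustrative, not real data).
    print(n50([2, 3, 4, 5, 6]))
```

Because long reads dominate the total yield, the N50 is usually larger than the mean read length, which is why it is the customary summary for long-read data sets.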
By mapping the Illumina reads to published phytoplasma genomes (3) using BWA v0.7.17 (4), TB2022 was found to have ~99% genome-wide average nucleotide identity with “Ca. Phytoplasma aurantifolia” NCHU2014 (5, 6) (GenBank accession number CP040925). Thus, a two-step assembly procedure was used. First, the Illumina and ONT reads were mapped to the reference genome using BWA v0.7.17 (4) and Minimap2 v2.15 (7), respectively. Second, mapped reads with an alignment score above 30 (for Illumina) or 1,000 (for ONT) were extracted and used for assembly with Trycycler v0.5.3 (8). The gene prediction and annotation procedure was based on the one described in our previous work (9, 10). For gene prediction, RNAmmer v1.2 (11), tRNAscan-SE v1.3.1 (12), and Prodigal v2.6.3 (13) were used. Annotation was performed based on the homologs in other phytoplasmas, as identified using OrthoMCL v1.3 (14), followed by manual curation using BlastKOALA v12 (15) and GenBank (16). Putative secreted proteins were identified using SignalP v5.0 (17) and filtered using TMHMM v2.0 (18).
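The read-extraction step can be illustrated with a short sketch. It assumes that the alignment score corresponds to the `AS:i` optional tag that BWA and Minimap2 write in SAM output; the text does not state which field was used, so this is an assumption, and the function names `alignment_score` and `filter_by_score` are hypothetical rather than taken from the actual workflow.

```python
def alignment_score(sam_line):
    """Return the integer AS:i value of one SAM record, or None if absent."""
    fields = sam_line.rstrip("\n").split("\t")
    # Optional tags start at the 12th column of a SAM record.
    for tag in fields[11:]:
        if tag.startswith("AS:i:"):
            return int(tag[5:])
    return None


def filter_by_score(sam_lines, min_score):
    """Yield header lines and mapped records whose AS value exceeds min_score."""
    for line in sam_lines:
        if line.startswith("@"):
            yield line  # keep header lines unchanged
            continue
        flag = int(line.split("\t")[1])
        if flag & 0x4:  # FLAG bit 0x4 marks an unmapped read
            continue
        score = alignment_score(line)
        if score is not None and score > min_score:
            yield line
```

Under these assumptions, the thresholds in the text would correspond to `min_score=30` for the Illumina alignments and `min_score=1000` for the ONT alignments.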
The reference-based assembly produced one circular chromosome of 670,073 bp, with 25.0% G+C content; no plasmids were found. The average sequencing depth was 885-fold, based on ~4.4 million Illumina reads, and 618-fold, based on 71,436 ONT reads (N50, 21.0 kb). The contig was rotated to have dnaA as the first gene. The annotation contains 6 rRNA genes, 27 tRNA genes, 535 protein-coding genes, and 22 pseudogenes.
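Two of the summary operations described here, calculating G+C content and rotating a circular contig so that it begins at a chosen gene, are simple string operations. The following is a minimal sketch, assuming the sequence is held as a plain string and that the start coordinate of dnaA (0-based, forward strand) is already known; the function names are hypothetical, and a gene on the reverse strand would additionally require reverse-complementing the contig.

```python
def gc_content(seq):
    """Return the G+C content of a nucleotide sequence as a percentage."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    return 100.0 * gc / len(seq)


def rotate_circular(seq, start):
    """Rotate a circular sequence so that 0-based position `start` comes first.

    Rotation changes only the arbitrary origin of a circular molecule;
    the length and base composition are unchanged.
    """
    start %= len(seq)
    return seq[start:] + seq[:start]


if __name__ == "__main__":
    toy = "AACCGGTT"  # illustrative toy sequence, not real genome data
    print(rotate_circular(toy, 2))
    print(gc_content(toy))
```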

Data availability.

This genome project has been deposited at NCBI under the BioProject accession number PRJNA937424. All raw reads derived from the infected plant sample have been deposited at the NCBI Sequence Read Archive under the accession numbers SRX19479157 and SRX19479158. The phytoplasma genome sequence has been deposited at GenBank under the accession number CP120449.


Sequencing was performed by Biomarker Technologies (Beijing, China). Funding for this work was provided by Academia Sinica (to C.-H.K.) and by the Natural Science Foundation of Shanghai (grant number 23ZR1470300) and the CAS Center for Excellence in Molecular Plant Sciences (to W.H.). The funders had no role in study design, data collection and interpretation, or the decision to submit the work for publication.


1. Hogenhout SA, Oshima K, Ammar E-D, Kakizawa S, Kingdom HN, Namba S. 2008. Phytoplasmas: bacteria that manipulate plants and insects. Mol Plant Pathol 9:403–423.
2. Huang Y-K, Wang X-Y, Zhang R-Y, Li J, Li Y-H, Shan H-L, Cang X-Y, Wang C-M. 2023. The diversity, distribution, and status of phytoplasma diseases in China, p 121–147. In Tiwari AK, Caglayan K, Al-Sadi AM, Azadvar M, Abeysinghe S (ed), Phytoplasma diseases in Asian countries volume one: diversity, distribution, and current status. Academic Press, Cambridge, MA.
3. Huang C-T, Pei S-C, Kuo C-H. 2023. Genomic studies on Asian phytoplasmas, p 67–83. In Tiwari AK, Oshima K, Yadav A, Esmaeilzadeh-Hosseini SA, Hanboonsong Y, Lakhanpaul S (ed), Phytoplasma diseases in Asian countries volume three: characterization, epidemiology, and management. Academic Press, Cambridge, MA.
4. Li H, Durbin R. 2009. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25:1754–1760.
5. Tan CM, Lin Y-C, Li J-R, Chien Y-Y, Wang C-J, Chou L, Wang C-W, Chiu Y-C, Kuo C-H, Yang J-Y. 2021. Accelerating complete phytoplasma genome assembly by immunoprecipitation-based enrichment and MinION-based DNA sequencing for comparative analyses. Front Microbiol 12:766221.
6. Chang S-H, Cho S-T, Chen C-L, Yang J-Y, Kuo C-H. 2015. Draft genome sequence of a 16SrII-A subgroup phytoplasma associated with purple coneflower (Echinacea purpurea) witches’ broom disease in Taiwan. Genome Announc 3:e01398-15.
7. Li H. 2018. Minimap2: pairwise alignment for nucleotide sequences. Bioinformatics 34:3094–3100.
8. Wick RR, Judd LM, Cerdeira LT, Hawkey J, Méric G, Vezina B, Wyres KL, Holt KE. 2021. Trycycler: consensus long-read assemblies for bacterial genomes. Genome Biol 22:266.
9. Chung W-C, Chen L-L, Lo W-S, Lin C-P, Kuo C-H. 2013. Comparative analysis of the peanut witches’-broom phytoplasma genome reveals horizontal transfer of potential mobile units and effectors. PLoS One 8:e62770.
10. Cho S-T, Zwolińska A, Huang W, Wouters RHM, Mugford ST, Hogenhout SA, Kuo C-H. 2020. Complete genome sequence of “Candidatus Phytoplasma asteris” RP166, a plant pathogen associated with rapeseed phyllody disease in Poland. Microbiol Resour Announc 9:e00760-20.
11. Lagesen K, Hallin P, Rodland EA, Staerfeldt H-H, Rognes T, Ussery DW. 2007. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res 35:3100–3108.
12. Lowe T, Eddy S. 1997. tRNAscan-SE: a program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res 25:955–964.
13. Hyatt D, Chen G-L, LoCascio P, Land M, Larimer F, Hauser L. 2010. Prodigal: prokaryotic gene recognition and translation initiation site identification. BMC Bioinformatics 11:119.
14. Li L, Stoeckert CJ, Roos DS. 2003. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res 13:2178–2189.
15. Kanehisa M, Sato Y, Morishima K. 2016. BlastKOALA and GhostKOALA: KEGG tools for functional characterization of genome and metagenome sequences. J Mol Biol 428:726–731.
16. Benson DA, Cavanaugh M, Clark K, Karsch-Mizrachi I, Ostell J, Pruitt KD, Sayers EW. 2018. GenBank. Nucleic Acids Res 46:D41–D47.
17. Armenteros JJA, Tsirigos KD, Sønderby CK, Petersen TN, Winther O, Brunak S, von Heijne G, Nielsen H. 2019. SignalP 5.0 improves signal peptide predictions using deep neural networks. Nat Biotechnol 37:420–423.
18. Krogh A, Larsson B, von Heijne G, Sonnhammer ELL. 2001. Predicting transmembrane protein topology with a hidden Markov model: application to complete genomes. J Mol Biol 305:567–580.

Published In

Microbiology Resource Announcements
Volume 12, Number 7, 18 July 2023
eLocator: e00308-23
Editor: David A. Baltrus, University of Arizona
PubMed: 37284786


Received: 12 April 2023
Accepted: 24 May 2023
Published online: 7 June 2023



Yalu Li
Shanghai Center for Plant Stress Biology, CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
Xiao-Hua Yan
Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan
Yanzhi Liu
Shanghai Center for Plant Stress Biology, CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China
Shen-Chian Pei
Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan
Institute of Plant and Microbial Biology, Academia Sinica, Taipei, Taiwan
Weijie Huang
Shanghai Center for Plant Stress Biology, CAS Center for Excellence in Molecular Plant Sciences, Chinese Academy of Sciences, Shanghai, China



Yalu Li and Xiao-Hua Yan contributed equally to this article. Author order was decided by agreement among all authors.
The authors declare no conflict of interest.
