Molecular typing of pathogen populations is essential to gain insight into their genetic diversity and population dynamics in order to elaborate efficient strategies for disease control (1). In agricultural systems, pests are ideally controlled by integrated approaches, including eradication or treatment of diseased organisms and planting of resistant varieties. However, the durability of resistance can be challenged if pathogen diversity is significant. Importantly, gene flow between pathogen populations can facilitate the breakdown of resistance in crop plants (3). Hence, efficient and precise molecular-typing tools for identifying strains and differentiating among related bacterial isolates are essential for microevolutionary reconstruction as a population genetics approach for integrated plant protection.
Rice, one of the major crops worldwide, is affected by two bacterial diseases that are caused by strains of Xanthomonas oryzae: bacterial leaf blight (BLB), caused by X. oryzae pv. oryzae, and bacterial leaf streak (BLS), caused by X. oryzae pv. oryzicola. Collectively, these two diseases cause significant yield losses in tropical and temperate rice-growing areas. X. oryzae pv. oryzae colonizes xylem vessels upon entry into the vascular system. X. oryzae pv. oryzicola infects the plant via natural openings and colonizes the mesophyll (4). Genomes of members of both pathovars have been sequenced; however, the determinants of tissue specificity are still largely unknown (5). While X. oryzae pv. oryzicola has been shown to be seedborne and seed transmitted (7), the evidence that X. oryzae
pv. oryzae is seedborne is still controversial (7
Because both pathogens infect the same host species and cause symptoms that may be difficult to distinguish at later stages of infection, neither BLB nor BLS is easy to diagnose unambiguously in the field. In the laboratory, strains of the two pathovars are identified by inoculation methods on susceptible host plants (10). Upon leaf inoculation, X. oryzae pv. oryzicola colonizes the mesophyll, resulting in water-soaked lesions, while X. oryzae pv. oryzae colonizes and spreads in the vascular system, resulting in long lesions along the leaf blade (4). In addition to the observation of visual symptoms, and for a more reliable diagnosis, a genomics-based multiplex PCR has been developed to differentiate the two pathovars (11). Recently, these diagnostic loci were converted into a loop-mediated isothermal amplification (LAMP) assay (12). In addition to pathovar discrimination, other assays were developed that can differentiate X. oryzae strains from Africa and Asia (13). Currently, phenotypic, multiplex PCR, and LAMP methods are routinely used by several laboratories to identify, characterize, and detect X. oryzae.
BLB was first described in Fukuoka Prefecture, Japan, in 1884 (14) and has been reported in a number of rice-growing countries from Iran to Japan and the Philippines (16). The disease was also reported in African countries, including Mali, Senegal, Niger, Gabon, Nigeria, Cameroon, Burkina Faso, Mauritania, and the Gambia (17). Reports for northern Australia (18) and South America (19
) exist but are rather sporadic and less important (16
BLS was first described in the Philippines in 1918 and is prevalent in Asia (16), as well as in Africa in Mali (20) and Burkina Faso (21), and was recently reported in Madagascar (22), Burundi (23
), and Uganda (24
So far, studies have focused on X. oryzae pv. oryzae, describing the genetic and pathotypic diversity of different populations. Most of these studies analyzed regional populations of X. oryzae pv. oryzae, mainly using pathogenicity assays and/or restriction fragment length polymorphism (RFLP) profiling (25–30). For instance, Mishra and coworkers (2013) analyzed more than 1,000 isolates of X. oryzae pv. oryzae based on their reactions to 10 resistance genes. They also differentiated a subset of strains by an RFLP analysis using IS1112
as a probe. However, the study included only strains from India (30
Much less is known about X. oryzae pv. oryzicola diversity. In the Philippines, a study evaluated the diversity of 123 X. oryzae pv. oryzicola strains by RFLP analysis, concluding from the large diversity of strains that the pathovar is endemic (31). A repetitive-element palindromic (rep)-PCR analysis of 141 Chinese X. oryzae pv. oryzicola strains revealed significant genetic diversity among X. oryzae pv. oryzicola strains in southwest China (32). One of the most exhaustive phylogenetic studies of X. oryzae was performed by Gonzalez and coworkers, who analyzed a set of 26 Asian X. oryzae pv. oryzae strains, 21 African X. oryzae pv. oryzae strains, and 14 X. oryzae pv. oryzicola strains from different origins (20). Using a polyphasic approach that included RFLP, rep-PCR, and fluorescent amplified fragment length polymorphism (AFLP), they defined three lineages within the species: Asian X. oryzae pv. oryzae, African X. oryzae pv. oryzae, and X. oryzae pv. oryzicola (from Asia and Africa). More recently, multilocus sequence typing (MLST) targeting nine housekeeping genes of a few strains of X. oryzae confirmed the designation of the three lineages (33). Other studies combining multilocus sequence analysis (MLSA) with the analysis of the type III effector repertoires of 40 X. oryzae strains belonging to different pathovars and from different geographical origins clearly confirmed that both X. oryzae pv. oryzae and X. oryzae pv. oryzicola pathogens belong to closely related but distinct phylogenetic groups formerly defined as lineages (20). Finally, MLST and RFLP analyses focusing on a large collection of X. oryzae
pv. oryzicola strains revealed a high level of genetic diversity among African strains (35
Molecular methods previously used to evaluate the genetic diversity of X. oryzae were cumbersome, poorly reproducible, insufficiently discriminative, or difficult to interpret evolutionarily. Over the past several years, MLST based on a set of housekeeping genes has become popular to investigate populations of bacterial pathogens (36). However, this technique does not have enough resolution for an in-depth study of X. oryzae populations or epidemic outbreaks. Now, with easy access to nearly complete genome sequences, single-nucleotide polymorphism (SNP) analyses of genomes are receiving more attention due to the unprecedented increase in resolution they provide and the consequent possibility of investigating epidemics more accurately (37). Although the costs of genome sequencing have decreased tremendously in the last few years, such typing methods are unlikely to become widely adopted as a standard due to bioinformatic and infrastructural constraints. Most developing countries that are concerned with epidemiological surveillance of plant diseases are not yet equipped for such analyses.
First reported in eukaryotic species, DNA motifs that are repeated in multiple copies were also identified in bacterial genomes (39). Loci at which variation between strains is reflected by a change in the size of the repeat array are called variable-number tandem repeats (VNTRs) or mini- or microsatellites. Among the different mutation models, the stepwise mutation model (SMM) has been widely adopted for microsatellites, even if large jumps in repeat numbers may occasionally occur (40). The SMM postulates that the size of a VNTR locus evolves through the addition or deletion of one repeat unit per mutation event. Consequently, VNTR loci provide us with connectible data reflecting patterns of evolutionary descent that can be used for epidemiological tracing of bacterial strains. The flanking regions next to the repeats are generally well conserved, sometimes even among different species. PCR primers can therefore be designed that allow the analysis of VNTR polymorphisms at different levels, e.g., the species or subspecies level (42). Since 2001, shortly after the first genome sequence became available, VNTR studies of bacterial plant pathogens have become increasingly popular, as exemplified by Xylella fastidiosa. In 2009, the first VNTR typing scheme was developed for Xanthomonas. Later, a 25-locus-based VNTR scheme (named MLVA-25) was developed for X. oryzae pv. oryzicola (45) and evaluated on a limited collection of X. oryzae pv. oryzicola strains. Preliminary in silico analyses indicated that a few loci would be useful to characterize the X. oryzae pv. oryzae strains, as well, but the discriminatory power would have been rather low for the two X. oryzae pv. oryzae lineages in comparison to the X. oryzae pv. oryzicola lineage. Hence, a highly discriminatory multilocus variable-number tandem-repeat analysis (MLVA) scheme that could at the same time identify the different X. oryzae pathovars and distinguish strains within pathovars would be very useful.
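The stepwise mutation model described above can be illustrated with a short simulation. The following is a minimal sketch, not part of the original study: the function name `simulate_smm`, the starting repeat number, the number of mutation events, and the lower bound of one repeat are all hypothetical choices made for illustration. It only shows the defining property of a strict SMM, namely that every mutation event adds or removes exactly one repeat unit.

```python
import random


def simulate_smm(start_repeats, n_mutations, seed=0, min_repeats=1):
    """Simulate one VNTR locus under a strict stepwise mutation model.

    Each mutation event adds or deletes exactly one repeat unit.
    The lower bound (min_repeats) is an illustrative assumption.
    """
    rng = random.Random(seed)
    sizes = [start_repeats]
    for _ in range(n_mutations):
        step = rng.choice((-1, 1))
        new_size = sizes[-1] + step
        # Reflect at the lower bound so the array never drops below min_repeats.
        if new_size < min_repeats:
            new_size = sizes[-1] + 1
        sizes.append(new_size)
    return sizes


# Hypothetical locus starting with 8 repeats and undergoing 20 mutation events.
trajectory = simulate_smm(start_repeats=8, n_mutations=20, seed=42)
print(trajectory)
```

Because consecutive states differ by a single repeat, closely related strains are expected to carry similar repeat numbers, which is what makes such data "connectible" for tracing descent. The occasional large jumps mentioned in the text are deliberately not modeled here.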
Here, we report on a new multilocus VNTR analysis that allows the production of robust and reproducible genetic data to efficiently characterize X. oryzae strains and to study epidemics of BLB and BLS on rice. For this purpose, a collection of 338 strains of the X. oryzae pathovars oryzae and oryzicola, originating from 20 countries, was analyzed. Both global and small-scale MLVAs of the three X. oryzae genetic lineages are discussed in the context of the geographical origins of the strains, the sampling dates, and the host plants. With this work, we wish to promote the worldwide use of MLVA-16 (a 16-locus-based VNTR scheme) for monitoring of BLB and BLS on various temporal and geographical scales.
We developed a new MLVA scheme that identifies the different lineages of X. oryzae, i.e., the pathovars and the lineages of different continental origin, and shows high discriminatory power on small scales in space and time.
On a global scale, the MLVA-16 scheme confirmed the lineage differentiation that was widely described in previous typing and phylogenetic studies (20). However, on a smaller scale, the MLVA-16 scheme was shown to be more discriminatory than other previously used molecular-typing tools. The MLVA-16 scheme could discriminate strains from the same region and even strains originating from the same field. Interestingly, epidemiologically related strains kept a signature of their relationship by descent on this small spatiotemporal scale.
Recombination does not strongly bias the phylogenetic signal.
Even though the association indices are quite low for some lineages, linkage disequilibrium was highly significant. Moreover, distance matrices obtained from MLVA-16 and from a different genotyping technique (MLSA) were significantly correlated. These lines of evidence support low levels of recombination, as has also been suggested for other Xanthomonas. However, one needs to be cautious, because these observations could result from sampling biases. For instance, geographical isolation might have occurred in our collections, consequently promoting high linkage disequilibrium (57). To test this hypothesis, the locus association should be tested on a densely sampled population, i.e., more strains isolated on a small spatiotemporal scale. Nevertheless, our results support the usefulness of the 16 VNTR markers to investigate demographic or epidemiological patterns.
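The comparison of two distance matrices mentioned above can be sketched with a permutation test. The text does not name the statistic that was used, so the Mantel test shown here is an assumption: it is a common choice for this kind of comparison, not necessarily the authors' method. The function names and the toy matrices are hypothetical and use only the Python standard library.

```python
import math
import random


def upper_triangle(matrix, order):
    """Return the upper-triangle distances with rows and columns taken in `order`."""
    n = len(order)
    return [matrix[order[i]][order[j]] for i in range(n) for j in range(i + 1, n)]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long sequences."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("distances must not all be identical")
    return cov / (sd_x * sd_y)


def mantel_test(d1, d2, permutations=999, seed=0):
    """One-sided Mantel test: correlation of two distance matrices.

    Strain labels of d2 are permuted (rows and columns together) to build
    the null distribution of the correlation coefficient.
    """
    rng = random.Random(seed)
    identity = list(range(len(d1)))
    x = upper_triangle(d1, identity)
    observed = pearson(x, upper_triangle(d2, identity))
    hits = 0
    for _ in range(permutations):
        perm = identity[:]
        rng.shuffle(perm)
        # Small tolerance guards against floating-point noise in ties.
        if pearson(x, upper_triangle(d2, perm)) >= observed - 1e-12:
            hits += 1
    return observed, (hits + 1) / (permutations + 1)


# Toy example: five hypothetical strains, second matrix proportional to the first.
positions = [0, 1, 2, 3, 4]
d_a = [[abs(p - q) for q in positions] for p in positions]
d_b = [[2 * abs(p - q) for q in positions] for p in positions]
r, p_value = mantel_test(d_a, d_b)
print(r, p_value)
```

Permuting strain labels rather than individual distances respects the dependence among entries of a distance matrix, which is why an ordinary correlation test on the flattened matrices would not be appropriate.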
As a special case, a distance matrix obtained from a tal gene-based RFLP data set was not correlated with an MLVA-16-based distance matrix when applied to a set of Malian and Burkina Faso X. oryzae pv. oryzicola strains. tal genes, which consist of tandem repeats with a large repeat unit (58), contribute to the colonization of host plants in a compatible interaction or trigger specific defenses in an incompatible interaction. Hence, tal gene-based markers are expected to be under selection. Moreover, the RFLP band sizes of tal genes do not always reflect their functional relatedness, i.e., identical tal RFLP bands can belong to functionally unrelated tal genes while different tal RFLP bands can correspond to functionally analogous tal genes. Together, these considerations may explain why a matrix generated by this marker is not correlated with a matrix obtained from neutral markers, such as most VNTRs.
MLVA-16 as a tool to identify lineages.
The MLVA-16 scheme produced some private and monomorphic alleles for each genetic lineage. Consequently, the different lineages of X. oryzae can be distinguished. In addition, other lineage-polymorphic loci also contribute to this differentiation. Hence, MLVA-16 directly identifies the pathogen responsible for a disease, i.e., BLB or BLS. Since BLB and BLS symptoms can be confused in the field at very early and late stages of the diseases, and also in cases of double infection, this information is important for efficient disease management. In the context of prior knowledge about prevalent BLB and BLS pathogens in a certain geographic area, the ability to distinguish between the two pathovars helps in deployment of disease-resistant rice cultivars and/or application of specific antibacterial agents. Moreover, the ability to distinguish between BLS and BLB might be helpful for quarantine purposes, especially with the increase in rice seed trade between countries.
indices of diversity suggest that African X. oryzae
pv. oryzae strains are closer to X. oryzae
pv. oryzicola than to Asian X. oryzae
pv. oryzae, as shown previously by Gonzalez and coworkers using RFLP, AFLP, and rep-PCR. However, these results are challenged by an MLSA of the three housekeeping genes gyrB
, and glnA
at a lower resolution, where the two X. oryzae
pv. oryzae lineages group together and are more distant from X. oryzae
pv. oryzicola (34
). Interestingly, another MLSA using a different set of housekeeping genes, fusA
, and gapA
, rather supported our scenario, i.e., that African X. oryzae
pv. oryzae strains are closer to X. oryzae pv. oryzicola than to Asian X. oryzae pv. oryzae (33). Apparently, the choice of markers (with only a few polymorphisms) used for such analysis plays an important role in the discrimination of lineages. To resolve this problem, a genome-wide SNP analysis or an MLSA of the core genome could be performed (55). In conclusion, MLVA-16 differentiates lineages well on a global scale.
MLVA-16 has limited value for large-scale epidemiology.
First reported in Asia, X. oryzae pv. oryzae and X. oryzae pv. oryzicola were described decades later in Africa (4). It has been assumed that BLB and BLS were introduced accidentally from Asia to Africa. The fact that both African X. oryzae pv. oryzae and X. oryzae pv. oryzicola showed less allelic richness than Asian X. oryzae pv. oryzae and X. oryzae pv. oryzicola supports an Asian ancestor for African strains. However, even though this study was conducted on a large strain collection, additional and more extensive sampling will be necessary to confirm an Asian origin of the African populations of X. oryzae.
On a global scale, the MLVA-16 scheme revealed hardly any shared or related haplotypes (SLVs and DLVs) among strains from different countries. Similar findings were obtained with a microsatellite-based MLVA-14 of X. citri. In general, rapidly evolving microsatellites cannot reliably link large numbers of strains across very large geographic scales or long time spans. Therefore, the development of another MLVA scheme based on TRs with a slower molecular clock, i.e., minisatellites, where repeat units are longer, would be more appropriate for large-scale epidemiology. Indeed, as shown by N′Guessan and coworkers for Ralstonia solanacearum, the number of alleles in a large collection of strains decreases with an increase in the repeat unit size (59). Similarly, Pruvost and coworkers demonstrated the superiority of a minisatellite-based scheme over a microsatellite-based scheme for large-scale epidemiology of X. citri. Preliminary analyses have shown that no useful VNTR loci with repeat unit sizes above 12 bp are shared among the three X. oryzae lineages, thus preventing the development of a universal scheme. In the future, lineage-specific minisatellite schemes will be developed for global epidemiological monitoring.
Despite these concerns, MLVA-16 provided some important information on a larger geographical scale. For instance, we identified two Chinese X. oryzae pv. oryzicola strains (Xoc-China and HN-DA-1) that differed greatly from the other Asian X. oryzae pv. oryzicola strains (only 7 common alleles with the closest Asian X. oryzae pv. oryzicola strain). Our results suggest that these strains are more closely related to African X. oryzae pv. oryzicola strains (10 common alleles with the closest African X. oryzae pv. oryzicola strain). This finding coincides with the recent increase of the seedborne BLS disease in Africa and with tighter commercial links between Africa and China. Moreover, the two strains from South America, which are related to two Philippine haplotypes, support a scenario with distinct events of introduction in South America, perhaps from the Philippines. Finally, two African X. oryzae pv. oryzae strains from different countries, BAI3 from Burkina Faso and NAI8 from Niger, shared the same haplotype. This finding may reflect the exchange of material between western African countries.
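The comparisons above rest on counting the alleles that two 16-locus haplotypes have in common, and on recognizing single-locus variants (SLVs) and double-locus variants (DLVs). A minimal sketch of that bookkeeping follows. The function names and the repeat numbers in the example profiles are hypothetical and are not data from this study; the label used for pairs differing at more than two loci is likewise only an illustrative choice.

```python
def count_common_alleles(profile_a, profile_b):
    """Count loci at which two haplotypes carry the same repeat number."""
    if len(profile_a) != len(profile_b):
        raise ValueError("profiles must cover the same loci")
    return sum(1 for a, b in zip(profile_a, profile_b) if a == b)


def classify_pair(profile_a, profile_b):
    """Classify two haplotypes by the number of differing loci."""
    differing = len(profile_a) - count_common_alleles(profile_a, profile_b)
    if differing == 0:
        return "identical"
    if differing == 1:
        return "SLV"  # single-locus variant
    if differing == 2:
        return "DLV"  # double-locus variant
    return "more distant"


# Hypothetical 16-locus haplotypes (repeat numbers are invented).
founder = (5, 7, 3, 9, 4, 6, 8, 2, 5, 7, 3, 9, 4, 6, 8, 2)
variant_1 = (6,) + founder[1:]       # differs at one locus
variant_2 = (6, 8) + founder[2:]     # differs at two loci
distant = tuple(x + 1 for x in founder[:9]) + founder[9:]  # differs at nine loci

print(count_common_alleles(founder, distant), classify_pair(founder, variant_1))
```

This categorical comparison ignores the size of the difference in repeat number at each locus; it treats any change at a locus as one difference.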
In conclusion, even if the MLVA-16 scheme is not ideal for large-scale epidemiology with slow migration, it is still a valuable tool for more modern situations where migration is very fast due to human transport. To demonstrate the power of this tool, we would need a larger set of strains representative of the total diversity. Currently, access to strains or even genomic DNA of X. oryzae is limited due to quarantine restrictions or regulatory issues concerning biodiversity in the different countries. The MLVAbank database, which makes our data accessible for analyses and comparisons, will help to coordinate international efforts to understand the epidemiology of X. oryzae.
MLVA-16 as a new tool for local epidemiological monitoring.
Our results show that the MLVA-16 scheme is very useful for local epidemiological surveillance on a regional or country-wide scale for all three X. oryzae lineages. First, African X. oryzae pv. oryzae strains sampled from Malian regions (Niono and Sélingue) in the same period (2010 and 2012) showed relatively close haplotypes. Their allelic profiles reveal an expansion that has occurred clonally from an unidentified founder. Nevertheless, the Malian collection showed a dynamic population structure, since Malian strains sampled in 2010 and 2012 are distant from Malian strains sampled up to 2009. Between these two complexes, a phylogenetic signature of relatedness remained, but epidemiological information was lost, because their strains differ by at least nine alleles. Second, on a slightly larger scale, a set of X. oryzae pv. oryzicola strains sampled in southern China in 2011 shared identical or similar haplotypes, as shown by the minimum spanning tree and eBURST analysis (Fig. 4; see Fig. S5 in the supplemental material). Third, 24 Philippine X. oryzae pv. oryzae strains sampled over 9 years in Laguna Bay are linked in a clonal complex. It would be interesting to clarify how such a limited number of haplotypes was maintained for many years in this region. Moreover, since several very different haplotypes of Philippine X. oryzae pv. oryzae were found to coexist within this region, further investigation by extensive sampling would be of high interest.
Our results show that the MLVA-16 scheme holds great potential for small-scale reconstruction of population dynamics. Such an improved understanding of microevolution will be key for short- and medium-term management of control strategies for X. oryzae, particularly by deciphering the pathways of dispersal both by natural transmission and by human-mediated transmission (including seed transmission).
The MLVA-16 scheme will be useful to discriminate among X. oryzae strains with high resolution and to identify the lineages and pathovars of these pathogens. MLVA-16 will further allow us to analyze new outbreaks and epidemics of both pathovars of X. oryzae. Additional samplings at the population level will be a further step to evaluate the efficiency of MLVA-16 by describing the patterns of descent of the strains and the population structures. Specifically, recently isolated X. oryzae
pv. oryzicola strains from eastern and central Africa could be characterized (22–24
Directions for use.
In cases where the pathovar and geographical origin are known, e.g., from a prior multiplex PCR (13), one could omit the one primer pool, of the four, whose loci are largely monomorphic within that lineage. Hence, one would use a subset of primer pools in a multiplex MLVA-12 scheme that is adapted to the pathovars and origins of the isolates. Asian X. oryzae pv. oryzae strains need to be analyzed with pools 1, 2, and 4 (MLVA-12a); African X. oryzae pv. oryzae strains need pools 1, 2, and 3 (MLVA-12b); and X. oryzae pv. oryzicola strains need pools 1, 3, and 4 (MLVA-12c). Results can be analyzed and compared to our data and those of others using the public database at IRD Montpellier (http://www.biopred.net/MLVA/
), which allows sharing of information with other scientists in an interactive manner and provides access to a few phylogenetic-analysis tools (46
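The pool selection described in these directions amounts to a small lookup table. The sketch below encodes it; the dictionary keys, the function name `select_pools`, and the fallback to all four pools when the lineage is unknown are hypothetical conveniences, while the pool numbers and scheme names are taken from the text.

```python
# Primer pools per lineage, as listed in the directions for use.
# The lineage keys are illustrative labels, not official identifiers.
POOLS_BY_LINEAGE = {
    "asian_xoo": ("MLVA-12a", (1, 2, 4)),    # Asian X. oryzae pv. oryzae
    "african_xoo": ("MLVA-12b", (1, 2, 3)),  # African X. oryzae pv. oryzae
    "xoc": ("MLVA-12c", (1, 3, 4)),          # X. oryzae pv. oryzicola
}

ALL_POOLS = ("MLVA-16", (1, 2, 3, 4))


def select_pools(lineage=None):
    """Return (scheme name, primer pools) for a lineage.

    If the lineage is unknown (None), fall back to the full MLVA-16 scheme.
    """
    if lineage is None:
        return ALL_POOLS
    try:
        return POOLS_BY_LINEAGE[lineage]
    except KeyError:
        raise ValueError(f"unknown lineage: {lineage!r}") from None


scheme, pools = select_pools("african_xoo")
print(scheme, pools)
```

Each reduced scheme drops exactly one of the four pools, which is consistent with the MLVA-12 naming if every pool covers four loci; that per-pool locus count is an inference from the numbers 16 and 12 rather than something stated explicitly here.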