Open access
Research Article
12 June 2014

In Silico Detection and Typing of Plasmids using PlasmidFinder and Plasmid Multilocus Sequence Typing


In the work presented here, we designed and developed two easy-to-use Web tools for in silico detection and characterization of whole-genome sequence (WGS) and whole-plasmid sequence data from members of the family Enterobacteriaceae. These tools will facilitate bacterial typing based on draft genomes of multidrug-resistant Enterobacteriaceae species by the rapid detection of known plasmid types. Replicon sequences from 559 fully sequenced plasmids associated with the family Enterobacteriaceae in the NCBI nucleotide database were collected to build a consensus database for integration into a Web tool called PlasmidFinder that can be used for replicon sequence analysis of raw, contig group, or completely assembled and closed plasmid sequencing data. The PlasmidFinder database currently consists of 116 replicon sequences that match, with at least 80% nucleotide identity, all replicon sequences identified in the 559 fully sequenced plasmids. For plasmid multilocus sequence typing (pMLST) analysis, a database that is updated weekly was generated and integrated into a Web tool called pMLST. Both databases were evaluated using draft genomes from a collection of Salmonella enterica serovar Typhimurium isolates. PlasmidFinder identified a total of 103 replicons and between zero and five different plasmid replicons within each of 49 S. Typhimurium draft genomes tested. The pMLST Web tool was able to subtype genomic sequencing data of plasmids, revealing both known plasmid sequence types (STs) and new alleles and ST variants. In conclusion, testing of the two Web tools using both fully assembled plasmid sequences and WGS-generated draft genomes showed them to be able to detect a broad variety of plasmids that are often associated with antimicrobial resistance in clinically relevant bacterial pathogens.


Plasmids are double-stranded circular or linear DNA molecules capable of autonomous replication and transferable between different bacterial species and clones. Most of the known plasmids have been identified because they confer phenotypes that are subject to positive selection on the bacterial host, such as the presence of antimicrobial resistance or virulence genes. Such features aid the successful spread of different plasmid types among bacteria of different sources and geographical origins. The acquisition of plasmids carrying antimicrobial resistance or virulence genes might drastically alter the prevalence of virulent or multidrug-resistant bacterial clones. It is thus important not only to study the molecular epidemiology of different bacterial clones but also to study and understand the molecular epidemiology of transferable plasmids. For this specific purpose, plasmid typing systems are needed.
Most plasmids include specific regions, called replicons, encoding functions that are able to activate and control replication (1). Since 2005, a PCR-based replicon typing (PBRT) scheme has been available that targets in multiplex PCRs the replicons of the major plasmid families occurring in members of the family Enterobacteriaceae (2). This method was initially developed to detect the replicons of plasmids belonging to the 18 major incompatibility (Inc) groups of Enterobacteriaceae species (3). More recently, novel replicons and plasmid types were identified by whole-genome and plasmid high-throughput sequencing, extending PBRT to the identification of 25 different replicons (4–9). However, this method is based on multiplex PCR, which is laborious to extend to cover more groups and which is not always suited for the detection of replicon variants, especially if this variation is within the primer binding sites. Together with other specific characteristics of the bacterial strain (i.e., resistance gene content, sequence type [ST] as determined by multilocus sequence typing [MLST], phylogroup, serotype, etc.), plasmid typing (by PBRT or degenerate primer MOB typing [10]) is currently used for comparative analysis of unrelated and related strains during epidemiological investigations.
Not all plasmid families occur at the same frequency in clinically relevant enterobacterial strains. For very frequent plasmids, sequence-based typing schemes were devised to identify related plasmid scaffolds. IncF, IncI1, IncN, IncHI2, and IncHI1 plasmids are currently subtyped by plasmid MLST (pMLST) (4, 7, 11–14).
With the recent rapid increase in whole-genome sequence (WGS) and whole-plasmid sequence data generated by high-throughput-sequencing platforms, there is a need to be able to identify resistance genes and plasmids using raw sequence data or contigs generated by high-throughput sequencing of entire genomes. To extract the relevant information from the large amount of data generated, a Web-based tool, ResFinder, for the identification of acquired or intrinsically present antimicrobial resistance genes in whole-genome data was recently developed (15).
Here, we describe the design of two new easy-to-use Web tools useful for the rapid identification of plasmids in Enterobacteriaceae species that are of interest for epidemiological and clinical microbiology investigations of the plasmid-associated spread of antimicrobial resistance. Currently, most of the plasmid sequences available in GenBank have been automatically annotated, and in many cases, the annotation of the replicon sequences does not refer to any plasmid classification scheme (Inc groups or relaxase groups). Therefore, direct submission by BLASTn to NCBI cannot be easily used to recognize the lineage of the plasmid under study. The PlasmidFinder Web tool is based on a curated database of plasmid replicons intended for the identification of plasmids in whole-genome sequences originating from Enterobacteriaceae species by microbiologists without specialized bioinformatics skills using direct high-throughput raw reads, assembled contigs, or assembled Sanger sequences. PlasmidFinder not only provides the detection of replicons in the WGS but also assigns the plasmids under study to lineages that trace back the information to the existing knowledge on Inc groups and suggests possible reference plasmids for each lineage.
The pMLST Web tool is able to perform pMLST analysis on the same variety of data for the five incompatibility groups that currently have a pMLST scheme available.


Plasmid database.

A total of 745 sequences corresponding to nonredundant, complete sequences of plasmids identified in bacterial species belonging to the family Enterobacteriaceae were collected from the NCBI nucleotide database. Among them, 186 plasmids were identified in bacterial species that were endosymbionts of insects (“Candidatus Ishikawaella capsulata Mpkobe,” Buchnera aphidicola, Sodalis glossinidius, and Wigglesworthia glossinidia) or nematodes (Photorhabdus asymbiotica), pathogens of plants (Pantoea spp. and Erwinia spp.) and fish (Edwardsiella ictaluri), or living in the rhizosphere and soil (Rahnella spp., Ralstonia eutropha, and Delftia acidovorans). These plasmids were excluded from the current version of the PlasmidFinder database.
Among the 559 plasmid sequences of interest downloaded from the GenBank database and analyzed in this study, 224 small plasmids with sizes ranging from 1,308 to 16,030 bp and 335 large plasmids (>20 kb in size) were identified in 40 different bacterial species of the Enterobacteriaceae family (see Table S1 in the supplemental material).
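The collection figures above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the `size_class` helper is a hypothetical illustration, and its treatment of a plasmid of exactly 20 kb as "small" is an assumption, since the text defines large plasmids as >20 kb and reports small plasmids of 1,308 to 16,030 bp.

```python
# Figures taken from the text.
COLLECTED = 745   # nonredundant complete plasmid sequences collected
EXCLUDED = 186    # plasmids from endosymbionts, plant/fish pathogens, soil bacteria
SMALL = 224       # small plasmids among the plasmids of interest
LARGE = 335       # large plasmids (>20 kb) among the plasmids of interest

plasmids_of_interest = COLLECTED - EXCLUDED


def size_class(length_bp: int) -> str:
    """Hypothetical helper: 'large' if longer than 20 kb, otherwise 'small'."""
    return "large" if length_bp > 20_000 else "small"


print(plasmids_of_interest)                    # 559
print(SMALL + LARGE == plasmids_of_interest)   # True
print(size_class(1_308), size_class(16_030))   # small small
```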

Construction and evaluation of the PlasmidFinder Web tool.

Based on the analysis described in Results and Discussion, a database containing 116 unique probes was generated. This database included all of the replicons identified in the 559 plasmids of interest. This database was used to build a Web tool utilizing the BLASTn algorithm to look for DNA homologies in both raw and assembled sequencing data from four different sequencing platforms. If assembled bacterial genomes or plasmids are uploaded to the Web service, they are immediately converted into a BLAST database. If raw sequence reads are uploaded, they are first assembled (after the sequencing platform is given by the user) as described previously (16). Upon sequence submission, a percent identity (% ID) threshold (the percentage of nucleotides that are identical between the best-matching replicon sequence in the database and the corresponding sequence in the assembled sequencing data) of 100%, 95%, 90%, 85%, 80%, or lower values down to 50% can be selected. However, it should be noted that in its current form, PlasmidFinder is designed to identify replicons with at least 80% nucleotide identity with those currently included in the database (see Table S1 in the supplemental material) and will not adequately cover plasmid diversity outside this scope.
Details on how the best match is selected are described in reference 16. For a hit to be reported, it has to cover at least 60% of the length of the replicon sequence in the database. Output data include information on what DNA fragment (contig) was found and the position of the hit within this contig. Also, information regarding the % ID, the length of the hit, and the length of the replicon sequence is included in the output.
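The reporting criteria described above (a user-selected % ID threshold and a minimum coverage of 60% of the database replicon length) can be sketched as a simple filter. This is an illustrative sketch only, not the Web tool's actual code: the `Hit` record, its field names, and all example values are hypothetical, and the selection of the best match (described in reference 16) is not modeled.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Hit:
    """Hypothetical record for one BLASTn hit of a database replicon."""
    replicon: str         # name of the replicon sequence in the database
    contig: str           # contig in which the hit was found
    start: int            # position of the hit within the contig
    identity: float       # % ID between replicon and contig sequence
    hit_length: int       # length of the hit (bp)
    replicon_length: int  # length of the database replicon sequence (bp)

    @property
    def coverage(self) -> float:
        """Percentage of the database replicon length covered by the hit."""
        return 100.0 * self.hit_length / self.replicon_length


def reportable_hits(hits: List[Hit],
                    min_identity: float = 80.0,
                    min_coverage: float = 60.0) -> List[Hit]:
    """Keep hits that meet both the % ID and the coverage thresholds."""
    return [h for h in hits
            if h.identity >= min_identity and h.coverage >= min_coverage]


# Made-up example hits (all names and numbers are illustrative).
example = [
    Hit("ColRNAI", "contig_7", 120, 89.4, 110, 130),   # passes both criteria
    Hit("Q1", "contig_12", 2181, 100.0, 450, 450),     # perfect, full length
    Hit("X1", "contig_3", 40, 78.0, 374, 374),         # fails % ID
    Hit("I1", "contig_9", 5, 99.3, 50, 142),           # fails coverage
]

for hit in reportable_hits(example):
    print(hit.replicon, hit.contig, hit.start, hit.identity,
          hit.hit_length, hit.replicon_length)
```

Raising `min_identity` mimics selecting a stricter threshold on submission, while the coverage cutoff stays at the fixed 60% stated in the text.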
Initial validation of the PlasmidFinder was done by submitting the entire replicon sequence database to the PlasmidFinder Web server to ensure perfect recognition of replicon sequences and replicon sequence lengths.

Validation of the PlasmidFinder.

A set of 24 complete sequences of plasmids associated with important resistance determinants, covering 12 different replicon groups as determined in previous studies by PBRT (5, 8, 9, 17–24), was used to test the ability of PlasmidFinder to detect the correct plasmid replicons. Here, the replicon sequences from the PlasmidFinder database were BLASTed against the complete plasmid sequences and the best-matching hits in each genome for each replicon sequence were given as output using a % ID threshold value of 80%.

Construction of the pMLST Web tool.

An automatic weekly download script was set up for collecting all pMLST allele sequences and ST profiles from the plasmid MLST Web site, and these data were used as the database for the pMLST Web tool. Similar to PlasmidFinder, this Web tool utilizes the BLASTn algorithm for finding DNA homologies in both raw and assembled sequences. Upon submission, the user must select which pMLST scheme to use. The hit in the plasmid has to cover at least 66% of the length of the gene in the database with at least 85% conservation to be reported. After identifying the pMLST allele for all genes of the pMLST scheme, the plasmid sequence type is determined based on the combination of alleles identified. A perfect hit to an allele is marked in green, whereas nonperfect hits are marked in red and should be verified by traditional Sanger sequencing before the ST is reported.
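The allele-reporting rule and the ST lookup described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pMLST Web tool's implementation: "85% conservation" is read here as 85% nucleotide identity, a "perfect hit" is taken to mean 100% identity over the full gene length, and the locus names and profile table in the usage example are hypothetical.

```python
from typing import Dict, Optional, Tuple

MIN_COVERAGE = 66.0  # % of the database gene length the hit must cover (from the text)
MIN_IDENTITY = 85.0  # "% conservation", read here as % nucleotide identity (assumption)


def is_reported(identity: float, hit_length: int, gene_length: int) -> bool:
    """True if a hit meets the coverage and identity cutoffs for reporting."""
    coverage = 100.0 * hit_length / gene_length
    return identity >= MIN_IDENTITY and coverage >= MIN_COVERAGE


def is_perfect(identity: float, hit_length: int, gene_length: int) -> bool:
    """Assumed meaning of a 'perfect hit': 100% identity over the full gene."""
    return identity == 100.0 and hit_length == gene_length


def assign_st(alleles: Dict[str, int],
              loci: Tuple[str, ...],
              profiles: Dict[Tuple[int, ...], int]) -> Optional[int]:
    """Return the ST for a combination of alleles, or None if no ST can be assigned.

    None is returned when a locus has no allele call or when the allele
    combination is not present in the profile table.
    """
    if any(locus not in alleles for locus in loci):
        return None
    key = tuple(alleles[locus] for locus in loci)
    return profiles.get(key)


# Hypothetical two-locus scheme and profile table, for illustration only.
loci = ("locus_1", "locus_2")
profiles = {(1, 2): 7, (1, 3): 8}

print(assign_st({"locus_1": 1, "locus_2": 2}, loci, profiles))  # 7
print(assign_st({"locus_1": 1}, loci, profiles))                # None
print(is_perfect(100.0, 300, 300), is_perfect(99.7, 300, 300))  # True False
```

A combination of known alleles that is absent from the profile table also yields `None`, which corresponds to the case of a candidate new ST.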
Validation of the pMLST Web tool was done by submitting a subset of completely sequenced IncF, IncHI1, IncHI2, IncN, and IncI1 plasmids (at least 6 plasmid sequences from each incompatibility group with a pMLST scheme) to the Web server. These plasmids were selected from the studies initially describing the pMLST schemes.

Identifying plasmids in bacterial whole-genome data.

Draft assemblies of short Illumina sequence reads (100-bp paired-end reads) from 49 Salmonella enterica serovar Typhimurium genomes isolated from healthy pigs, as described previously (25), were analyzed using PlasmidFinder. Here, the replicon sequences from the PlasmidFinder database were BLASTed against the assembled genomes, and the best-matching hits in each genome for each replicon sequence were given as output, using a % ID threshold value of 80%.

In vitro detection of plasmids in selected isolates.

The contents of large plasmids in the six isolates predicted by PlasmidFinder to be plasmid free were examined using DNA linearization with S1 nuclease followed by pulsed-field gel electrophoresis (S1-PFGE). Five units of S1 nuclease (Fermentas) were used per plug slice. Plug slices with XbaI-digested S. enterica serovar Braenderup (the strain recommended as a standard for PFGE analysis) were used as size ladders. Samples were run on the CHEF-DR III System (Bio-Rad Laboratories, Hercules, CA), and the conditions used were as follows: 1% agarose gel (SeaKem gold agarose; Lonza) in 0.5× Tris-borate-EDTA, voltage gradient of 6 V/cm, with phase from 6.8 to 38.4 and run time of 19 h.
The content of small plasmids in the same six isolates was analyzed using the Qiagen spin miniprep kit (catalog number 27104) according to the manufacturer's instructions. Five-microliter samples were run on a 0.8% agarose gel at 100 V for 4 h.

pMLST on selected genomes.

pMLST analysis was performed using the pMLST Web tool on the subset of draft genomes in which PlasmidFinder identified one or more of the incompatibility groups for which there is a pMLST scheme available.


Definition of the PlasmidFinder database.

A set of 559 complete plasmid sequences available in GenBank were first aligned against 39 DNA sequences of previously characterized replicons in plasmids of Enterobacteriaceae species (Table 1). Using these replicon sequences, a total of 263 plasmids from GenBank (262 large and 1 small, belonging to the X2 group; see pFL129 in Table S1 in the supplemental material) were successfully recognized, showing >95% nucleotide identity and >96% coverage with the reference replicon sequences (see Table S1 for the list of plasmids recognized by the probes in Table 1).
TABLE 1 List of probes detecting previously characterized replicons

| Probe^a | Position | Locus targeted | Reference |
|---|---|---|---|
| P_1_alpha_L27758 | 12765–12232 | Control of repA | 2 |

^a The number following the final underscore is the GenBank accession number of the sequence.
Of the 296 plasmids that were not recognized by the 39 existing PBRT probes, 72 and 224 were large and small plasmids, respectively. In the 72 large plasmids, the annotated replication protein sequences were analyzed by using BLASTp at the NCBI site. Forty of these plasmids showed replicons and replicase protein sequences matching those of plasmids previously classified in the FII, FIB, I, B/O/K/Z, N, A/C, L/M, X, and P groups (>85% amino acid identities). In particular, the replicase proteins showing the pfam02387 or pfam01051 conserved domains were assigned to the FII and FIB groups, respectively (31). Twenty-seven additional FII and FIB probes were therefore included in the PlasmidFinder database (Table 2). Thirteen probes were included for the I2-, B/O/K/Z-, N3-, A/C1-, L/M-, X-, and P-like replicons to recognize (>95% nucleotide identity) 58 of the 72 not-yet-classified plasmids (see Table S1 in the supplemental material for the list of plasmids recognized by the probes in Table 2). The last 14 plasmids encoded RepA proteins that were not clearly homologous to any of the previously assigned Inc groups; for these plasmids, 14 probes were included in the PlasmidFinder database, using as the name of the probe the name of the plasmid each derived from (Table 2; see Table S1).
TABLE 2 List of probes detecting novel replicons in large plasmids

| Probe^a | Position | Locus targeted | Source |
|---|---|---|---|
| A/C_1__FJ705807 | 684–1100 | repA | This study |
| B/O/K/Z_1_CU928147 | 119424–119574 | RNAI | This study |
| B/O/K/Z_3_GQ259888 | 24642–24793 | RNAI | This study |
| FII(p14)_1_p14_JQ418538 | 34250–33989 | repFII | This study |
| FII(p96A)_1_p96A_JQ418521 | 42799–43332 | repFII | This study |
| FII(pCoo)_1_pCoo_CR942285 | 27847–27586 | repFII | This study |
| FII(pCRY)_1_pCRY_NC_005814 | 739–1331 | repFII | This study |
| FII(pCTU2)_1_pCTU2_FN543095 | 8327–8903 | repFII | This study |
| FII(pECLA)_1_pECLA_CP001919 | 3–749 | repFII | This study |
| FII(pHN7A8)_JN232517 | 412–671 | repFII | This study |
| FII(pENTA)_1_pENTA_CP003027 | 84306–84865 | repFII | This study |
| FII(pMET)_1_pMET1_EU383016 | 690–1266 | repFII | This study |
| FII(pRSB107)_1_pRSB107_AJ851089 | 21949–21689 | repFII | This study |
| FII(pSE11)_1_pSE11_AP009242 | 75349–75085 | repFII | This study |
| FII(pseudo)_1_pseudo_NC_011759 | 76101–76490 | repFII | This study |
| FII(pYVa12790)_1_pYVa12790_AY150843 | 121–794 | repFII | This study |
| FII(SARC14)_1_SARC14_JQ418540 | 36938–37382 | repFII | This study |
| FII(Serratia)_1_Serratia_NC_009829 | 40691–40968 | repFII | This study |
| FII(29)_pCE10A_CP003035 | 1393–1135 | repFII | This study |
| FII(pSFO)_AF401292 | 3516–3773 | repFII | This study |
| FII(pKP91)_CP000966 | 89831–90060 | repFII | This study |
| FIB(pHCM2)_1_pHCM2_AL513384 | 104938–105812 | repFIB | This study |
| FIB(pB171)_1_pB171_AB024946 | 250–892 | repFIB | This study |
| FIB(pCTU1)_1_pCTU1_FN543094 | 2655–3347 | repFIB | This study |
| FIB(pCTU3)_1_pCTU3_FN543096 | 142054–142620 | repFIB | This study |
| FIB(pECLA)_1_pECLA_CP001919 | 211–771 | repFIB | This study |
| FIB(pENTAS01)_1_pENTAS01_CP003027 | 141679–142239 | repFIB | This study |
| FIB(pENTE01)_1_pENTE01_CP000654 | 142054–142620 | repFIB | This study |
| FIB(pKPHS1)_1_pKPHS1_CP003223 | 201–761 | repFIB | This study |
| FIB(pLF82)_1_pLF82_CU638872 | 211–771 | repFIB | This study |
| I2_1_Delta_AP002527 | 207–522 | repR | This study |
| L/M(pOXA-48)_1_JN626286 | 55802–56462 | repA | This study |
| L/M(pMU407)_1_pMU407_U27345 | 22–760 | repA | This study |
| N3_EF219134 | 29016–29492 | repA | This study |
| P(Beta)_1_Beta_U67194 | 15582–16163 | repA | This study |
| X1_2_CP003417 | 24754–25101 | repA | This study |
| X1_3_CP001123 | 36953–37325 | repA | This study |
| X1_4_JN935898 | 844–1220 | repA | This study |
| X5_1_NC_015054 | 34413–34786 | repA | This study |
| X6_1_AM942760 | 1–374 | repA | This study |
| p0111_AP010962 | 54482–53598 | repA | This study |
| ICC168_FN543504 | 6671–7382 | repA | This study |
| pADAP_1_AF135182 | 113796–114335 | repA | This study |
| pENTAS02_1_CP003028 | 25220–24242 | repA | This study |
| pESA2_CP000784 | 2497–3246 | repA | This study |
| pIP31758_1_CP000718 | 639–1556 | repA | This study |
| pIP31758_1_CP000719 | 150807–151715 | repA | This study |
| pIP32953_1_BX936400 | 262–1188 | repA | This study |
| pSL483_1_CP001137 | 23947–22953 | repA | This study |
| pXuzhou21_1_CP001927 | 71–791 | repA | This study |
| pYE854_1_AM905950 | 77347–78325 | repA | This study |
| pEC4115_1__NC_011351 | 93–798 | repA | This study |
| pJARS36_1__NC_015068 | 179–712 | repA | This study |
| pSM22_1__NC_015972 | 17401–18023 | repA | This study |

^a The number following the final underscore is the GenBank accession number of the sequence.
The analysis of the replication initiation protein genes in plasmid data released in GenBank provided evidence that the majority of large plasmids were still classifiable by DNA homology into phylogenetically related families corresponding to the formerly defined Inc groups (Fig. 1). Therefore, PlasmidFinder has been conceived to translate plasmid DNA sequences obtained by WGS into the well-established Inc group-based classification framework, allowing the WGS results to be traced back into the scaffold of previous knowledge and literature on relevant resistance plasmids characterized by standard methods (PBRT or Inc typing by conjugation). For each replicon, the PlasmidFinder output is linked to an NCBI GenBank entry connecting to the whole sequence of the plasmid showing that respective replicon. This allows the identification of related plasmids that can be used as references for further, more extensive and accurate comparative analysis of the WGS data.
FIG 1 Numbers of fully sequenced plasmids (y axis) classified into incompatibility groups occurring in the different bacterial species of the Enterobacteriaceae family. The collection of 335 large plasmids (>20 kb) (listed in Table S1 in the supplemental material) downloaded from the GenBank database was classified into Inc groups by BLASTn (using the criterion of >95% nucleotide identity) using reference replicon sequences (Table 1) and the novel probes in the PlasmidFinder database (Table 2).

Definition of a database for small plasmids in Enterobacteriaceae.

The 224 fully sequenced small plasmids were classified into 23 families of homology. Because of the huge sequence variety within the replication controls of this group of plasmids, the number and length of the replicon sequences recognizing these plasmids in the PlasmidFinder database were kept to the minimum needed to apply the >80% nucleotide identity and 96% coverage criteria (Table 3). This avoids multiple alignments on different replicon sequences, which would make interpretation of the output results difficult. The classification was based on multiple phylogenetic analyses of nucleotide alignments of one or more plasmid targets among the repA, RNAI, and other plasmid sequences. The largest homologous group was obtained using the RNAI of the small narrow-host-range (NHR) ColE1-like plasmids from Klebsiella pneumoniae (pIGMS31, pIGMS32, and pIGRK), which can be maintained in several closely related species of the class Gammaproteobacteria but not in members of the Alphaproteobacteria (32). In fact, one replicon sequence (probe ColRNAI_DQ298019) alone detected 146 of the 224 small plasmids and 1 large plasmid (K. pneumoniae p15S; see Table S1 in the supplemental material) at >80% nucleotide identity and >96% coverage. For the majority of the remaining plasmids, the replicon sequences were devised on the repA gene, since they lacked the RNAI. Two replicon sequences (probes Q1 and Q2) were devised to recognize the repA gene of 7 small IncQ-like plasmids, and one replicon sequence [probe P(6)] was devised to detect the small IncP plasmid pRIO-5 (see Table S1 for the list of plasmids recognized by the probes in Table 3).
TABLE 3 List of probes detecting small plasmids

| Probe^a | Position | Locus targeted^b | Source |
|---|---|---|---|
| Col156_NC_009781 | 1552–1705 | repA | This study |
| Col3M_JX514065 | 2453–2609 | Intergenic region | This study |
| Col8282_DQ995352 | 3349–3555 | repA | This study |
| Col(BS512)_1_NC_010656 | 1096–1328 | repA | This study |
| ColE10_1_AY167049 | 159–177 | Intergenic region | This study |
| Col(IMGS31)_1_NC_011406 | 700–897 | ORF2 | This study |
| Col(IRGK)_1_AY543071 | 2041–2224 | ORF2 | This study |
| ColKP3_JN205800 | 6541–6820 | repA | This study |
| Col(KPHS6)_1_NC_016841 | 326–503 | repA | This study |
| Col(MG828)_1_NC_008486 | 967–1228 | repA | This study |
| Col(MGD2)_NC_003789 | 1235–1370 | repA | This study |
| Col(MP18)_1_NC_013652 | 266–458 | repA | This study |
| Col(pVC)_JX133088 | 799–991 | repA | This study |
| Col(pWES)_1_DQ268764 | 10151–10328 | repA | This study |
| ColRNAI_DQ298019 | 5738–5867 | RNAI | This study |
| Col(SD853)_1_NC_015392 | 1363–1556 | repA | This study |
| Col(VCM04)_1_HM231165 | 776–952 | repA | This study |
| Col(YC)_1_NC_002144 | 5043–5195 | repA | This study |
| Col(Ye4449)_1_FJ696405 | 2409–2602 | mobC | This study |
| Col(YF27601)_1_JF937655 | 1–175 | RNAI | This study |
| Q1_1_HE654726 | 2181–2630 | repA | This study |
| Q2_1_NC_014356 | 4547–4996 | repA | This study |
| P(6)_1__JF785550 | 1477–2282 | repA | This study |

^a The number following the final underscore is the GenBank accession number of the sequence.
^b ORF, open reading frame.
In conclusion, a total of 77 novel replicon sequences, 54 and 23 recognizing large and small plasmids, respectively, were devised in this study and included in the PlasmidFinder database, together with the 39 sequences of previously studied replicons, thus obtaining a database of 116 specific plasmid replicon sequences.
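The composition of the database can be tallied directly from the probe counts reported in this section. The sketch below uses only numbers given in the text; the dictionary keys are descriptive labels chosen for illustration.

```python
# Probe counts as stated in the text.
previously_characterized = 39

novel_large = {
    "FII and FIB": 27,
    "I2-, B/O/K/Z-, N3-, A/C1-, L/M-, X-, and P-like": 13,
    "RepA not assignable to a known Inc group": 14,
}
novel_small = 23

novel_large_total = sum(novel_large.values())
novel_total = novel_large_total + novel_small
database_size = previously_characterized + novel_total

print(novel_large_total)  # 54
print(novel_total)        # 77
print(database_size)      # 116
```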

Evaluating PlasmidFinder on complete resistance plasmid sequences.

A collection of 24 previously characterized and fully sequenced plasmids, carrying different assortments of replicons of the FII, FIIK, HI1, FIB, FIA, X, I1, N, and A/C types and associated with the most relevant and widespread resistance genes, such as blaNDM-1, blaKPC-2, and blaCTX-M-15, was used to test the ability of PlasmidFinder to correctly identify the replicons located on these plasmids. PlasmidFinder was able to recognize each replicon correctly at a 100% match.

Using PlasmidFinder on whole-genome sequencing data.

A selection of 49 S. Typhimurium draft genomes was chosen in order to test the ability of PlasmidFinder to detect plasmid replicons in bacterial whole-genome sequencing data. These genomes originated from isolates from healthy pigs in Denmark, collected as part of the DANMAP program (33) in 2011, and represent an unbiased sample. An identity threshold of 80% was used in order to detect both large and small plasmids based on the criteria described above. Among the 49 S. Typhimurium isolates, none had more than five unique hits to replicons in the database, 1 isolate had five unique hits, 10 isolates had four hits, 8 isolates had three hits, 10 isolates had two hits, 14 isolates had one hit, and 6 isolates had no hits to any of the replicons in the database. Of the S. Typhimurium replicons identified through the PlasmidFinder Web tool, 97 sequence hits had a % ID score of 90% or above, whereas only six had % ID scores between 85% and 90%, and no hits were found with % ID scores below 85%. In 74 hits, the full length of the replicon sequence present in the database was found to be present in the draft genomes, while 29 hits had a length between the cutoff length of 60% and 100% relative to the matching sequence in the database.
The six hits with % ID below 90%, as well as all but one of the 29 hits with non-full-length sequences, mapped to replicon sequences from plasmids predicted to belong to the small-size Col plasmid group (expected size, <20 kb).
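The hit distribution reported above is internally consistent, which can be verified with a short tally. The sketch below uses only the counts given in the text; the variable names and bin labels are illustrative.

```python
# Number of isolates (values) carrying a given number of unique replicon hits (keys).
isolates_by_hit_count = {5: 1, 4: 10, 3: 8, 2: 10, 1: 14, 0: 6}

total_isolates = sum(isolates_by_hit_count.values())
total_hits = sum(n_hits * n_isolates
                 for n_hits, n_isolates in isolates_by_hit_count.items())

# The same hits, binned by % ID and by hit length, as reported in the text.
hits_by_identity = {">=90%": 97, "85% to 90%": 6, "<85%": 0}
hits_by_length = {"full length": 74, "60% to <100% of length": 29}

print(total_isolates)                   # 49
print(total_hits)                       # 103
print(sum(hits_by_identity.values()))   # 103
print(sum(hits_by_length.values()))     # 103
```

All three breakdowns give 103 hits, matching the total number of replicons stated for the 49 draft genomes.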
Among the 116 replicon sequences in the PlasmidFinder database, 22 were found to match sequences in the draft S. Typhimurium genomes (Table 4) according to the criteria listed above. The most abundant hits were to the Col(RNAI) sequence (n = 21), the IncQ1 sequence (n = 18), and the FIB(S) and FII(S) sequences (n = 13).
TABLE 4 Distribution of replicons identified among 49 whole-genome-sequenced Salmonella Typhimurium isolates by using the PlasmidFinder Web server

| Replicon^a | Isolate(s) with replicon (% identity)^b |
|---|---|
| Col(RNAI) | S2* (91.7), S4* (89.4), S8* (91.3), S9* (95.0), S11* (89.4), S16* (89.4), S17* (91.2), S18* (89.4), S19* (90.4), S20* (90.4), S32* (89.4), S33* (91.5), S37* (92.0), S39* (95.1), S40* (91.2), S41* (91.5), S43* (91.6), S44 (100), S45* (89.36), S47* (91.5), S50* (92.7) |
| Q1 | S3 (100), S6 (100), S8 (100), S9 (100), S12 (100), S20 (100), S21 (100), S22 (100), S24 (100), S26 (100), S27 (100), S28 (100), S29 (100), S30 (100), S34 (100), S36 (100), S38 (100), S44 (100) |
| FIB(S) | S1 (99.8), S2 (100), S4 (99.8), S11 (100), S13 (100), S16 (99.8), S32 (99.8), S33 (100), S40 (100), S41 (100), S45 (99.8), S47 (100), S50 (100) |
| FII(S) | S1 (100), S2 (100), S4 (100), S11 (100), S13 (100), S16 (199), S32 (100), S33 (100), S40 (100), S41 (100), S45 (100), S47 (100), S50 (100) |
| Col(pVC) | S11 (100), S33* (99.2), S41* (100), S47* (100), S50 (100) |
| Col(8282) | S5 (100), S6* (91.4), S22* (91.4), S30* (91.4), S38* (91.4) |
| I1 | S13 (100), S25 (99.3), S31 (99.3) |
| FIA(HI1) | S25 (100), S36 (100), S38 (100) |
| HI1A | S25 (99.8), S36 (99.8), S38 (99.8) |
| HI1B(R27) | S25 (100), S36 (100), S38 (100) |
| X1 | S1 (98.9), S28 (98.7) |
| X4 | S28 (100), S37 (99.7) |
| Col(156) | S39 (94.8), S40 (94.8), S49* (95.1) |
| FIA | S49 (100) |
| FII(pCoo) | S49 (96.2) |
| F(AP001918) | S48 (97.7) |
| FII(pHN7A8) | S48 (97.3) |
| HI1A(CIT) | S14 (100) |
| HI1B(CIT) | S14 (99.0) |
| HI2 | S39 (100) |
| HI2A | S39 (100) |
| Col(BS512) | S2* (100) |

^a Replicon types matching the probes in the PlasmidFinder database. Subvariants of replicons are given in parentheses.
^b Isolate names are given in boldface if more than one replicon was identified on the same contig, indicating colocalization of the replicons on the same plasmid. Hits not in full length are marked with an asterisk. No replicons were identified in strains S7, S10, S15, S23, S42, and S46.
The IncQ1 replicon is identical to the replicon of the broad-host-range RSF1010 plasmid from E. coli, containing the strA, strB, and sul2 resistance genes, which confer resistance toward streptomycin and sulfonamides. This was found to be present in 39% of the isolates and was in all cases located on sequence contigs with an approximate size of 4.5 kb. This element corresponds to the previously characterized IS26-repA, repC, sul2, strA, strB-IS26 composite transposon, in which part of the IncQ1 replicon of the RSF1010 plasmid has been mobilized by IS26, together with the streptomycin and sulfonamide resistance genes (34). This mobile element is widely disseminated on plasmids and bacterial chromosomes; for instance, it was previously reported to be present in 88% of S. Typhimurium isolates from an international collection of strains originating mostly from humans but also from various animals, including pigs (35). Interestingly, resistance gene analysis on the same draft genomes using the online search tool ResFinder also found the strA, strB, and sul2 genes to be located on this 4.5-kb fragment, as previously described in other Salmonella plasmids (34, 35). This analysis, therefore, enabled us to link antimicrobial resistance determinants directly to the replicon sequence, predicting the presence of the RSF1010-derived element in these strains.
The FIB(S) and FII(S) sequences (Table 1), present in 27% of the isolates, were always found together in the same isolates and were in more than half the cases located on the same DNA fragment (contig), with sizes between 20 kb and 93 kb, and thereby also on the same plasmid. These replicons are characteristic of the Salmonella virulence plasmids, which represent a specific subgroup within the large family of the IncF plasmids (4). Here, no antimicrobial resistance genes could be identified on the contigs carrying the FIB(S) and FII(S) replicons, which could be explained either by the lack of such genes on the virulence plasmids or by the inability of the current sequencing and de novo assembly methods to assemble complete plasmid sequences.
As the actual plasmid content of the test strains examined was not known, it was not possible to verify that all plasmids present in the isolates were also detected within the WGS data. However, this also applies to the current in vitro PBRT method, which can fail to detect replicons because of nucleotide mutations occurring at the primer site sequences (4, 8). Furthermore, the PlasmidFinder database can detect many more plasmid replicon sequences than PBRT, also including novel groups. To get an indication of the extent of underreporting from PlasmidFinder, we subjected the six isolates reported by PlasmidFinder not to contain any plasmids to in vitro plasmid analysis by S1-PFGE and DNA electrophoresis of purified plasmids. Using these methods, we were not able to detect any plasmids present in the strains, thus confirming the PlasmidFinder result (data not shown).
In silico detection using PlasmidFinder and ResFinder on WGS data offers the opportunity to link replicons, as well as other genetic features, such as multiple replicons, antimicrobial resistance genes, and virulence genes, to the same DNA fragment, because these tools provide the exact position of the genes in the uploaded sequence data. However, with the limitations of the current sequencing technology, it cannot be ruled out that genetic elements identified on different contigs are located on the same plasmid. To examine this, transfer experiments by conjugation or transformation of the relevant plasmid(s), combined with S1-PFGE to ensure transfer of a single plasmid, are most likely still required. But as the next-generation sequencing technologies mature even further to produce longer read lengths or single-molecule sequencing of larger pieces of DNA than today and thereby increase the ability to assemble sequencing data into longer contigs, this linking of genes will at some point be generally achievable. Furthermore, novel plasmid replicon groups identified by in silico BLAST homology searches or traditional replicon cloning and subsequent sequencing can easily be compared to the existing sequences and eventually be added to the database if they are found to differ significantly from these, based on the criteria used to build the PlasmidFinder database (>95% ID for large plasmids and >80% ID for Col-like plasmids).
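Linking replicons and resistance genes through a shared contig, as described above, amounts to grouping the two sets of hits by contig name. The sketch below is a hypothetical illustration: the `(feature, contig)` input format and the contig names are assumptions, not the actual output format of PlasmidFinder or ResFinder, and features on different contigs are simply left unlinked, since the text notes that they may still reside on the same plasmid.

```python
from collections import defaultdict
from typing import Dict, Iterable, Set, Tuple

FeatureHit = Tuple[str, str]  # (feature name, contig name); assumed input format


def colocalized_features(replicon_hits: Iterable[FeatureHit],
                         resistance_hits: Iterable[FeatureHit]
                         ) -> Dict[str, Dict[str, Set[str]]]:
    """Return contigs that carry at least one replicon and one resistance gene."""
    contigs: Dict[str, Dict[str, Set[str]]] = defaultdict(
        lambda: {"replicons": set(), "resistance_genes": set()})
    for name, contig in replicon_hits:
        contigs[contig]["replicons"].add(name)
    for name, contig in resistance_hits:
        contigs[contig]["resistance_genes"].add(name)
    return {contig: features for contig, features in contigs.items()
            if features["replicons"] and features["resistance_genes"]}


# Made-up example mirroring the IncQ1 case discussed in the text.
replicon_hits = [("Q1", "contig_12"), ("FIB(S)", "contig_3"), ("FII(S)", "contig_3")]
resistance_hits = [("strA", "contig_12"), ("strB", "contig_12"), ("sul2", "contig_12")]

linked = colocalized_features(replicon_hits, resistance_hits)
for contig, features in linked.items():
    print(contig, sorted(features["replicons"]), sorted(features["resistance_genes"]))
```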

pMLST analysis of plasmids.

Among the replicons identified in our S. Typhimurium collection, some belonged to four of the five incompatibility groups with available pMLST schemes. We detected strains carrying IncF, IncI1, IncHI1, and IncHI2 plasmids, while we did not detect any IncN plasmids in the S. Typhimurium WGS data.
Eighteen isolates contained an IncF replicon. In addition to the 13 with the IncFIB(S)/IncFII(S) virulence plasmid described above, this also included other IncF-related replicons, i.e., IncFIA(HI1), IncFIB(AP001918), IncFII(pRSB107), IncFIA, IncFII(pHN7A8), and IncFII(pCoo). In silico pMLST analysis using the FAB (FII:FIA:FIB) typing scheme for IncF plasmids suggested in reference 4 on the 13 sets of sequencing data carrying IncFIB(S)/IncFII(S) replicons showed 8 to belong to the S1:A:B17 FAB type previously found to be associated with the S. Typhimurium virulence plasmid (4), while 5 belonged to a highly similar FAB type only differing in a single mutation within the IncFIB fragment (assigned to a new FIB allele 35, leading to an S1:A:B35 FAB type).
A new FIA allele identified in the three isolates containing IncFIA(HI1) replicons (strains S25, S36, and S38) had only 88% identity to the closest match in the pMLST database (FIA allele 2). This allelic variant was therefore assigned FIA allele 8, resulting in the F:A8:B FAB type. Strain S48 carried IncFIB(AP001918) and IncFII(pHN7A8) replicons with an F40:A:B20 FAB type, and strain S49 carried IncFII(pCoo) and IncFIA replicons, comprising a new FII allele closely related to allele 13 (assigned FII allele 67) and FIA allele 6, giving the FAB type F67:A6:B.
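To make the FAB notation concrete, the following minimal sketch composes a FAB string from allele numbers and reproduces the types quoted above. The function name and its parameters are hypothetical and are not part of the pMLST Web tool. An absent locus is written as the bare letter to mirror the notation used in this text, and the `absent` parameter is an assumption that allows an explicit marker to be added if desired.

```python
def fab_type(fii=None, fia=None, fib=None, fii_prefix="F", absent=""):
    """Compose a FAB (FII:FIA:FIB) string from allele numbers.

    A locus with no allele (None) is written as its letter followed by
    `absent`, which defaults to the empty string to mirror the notation
    used in the text (e.g. "F:A8:B").
    """
    def part(letter, allele):
        return f"{letter}{absent if allele is None else allele}"

    return ":".join([part(fii_prefix, fii), part("A", fia), part("B", fib)])


# FAB types quoted in the text above.
print(fab_type(fii=1, fib=17, fii_prefix="S"))  # S1:A:B17
print(fab_type(fii=1, fib=35, fii_prefix="S"))  # S1:A:B35
print(fab_type(fia=8))                          # F:A8:B
print(fab_type(fii=40, fib=20))                 # F40:A:B20
print(fab_type(fii=67, fia=6))                  # F67:A6:B
```

The `fii_prefix` argument is set to "S" for the two virulence-plasmid types only because that is the letter used in the types quoted above.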
Three strains contained replicons belonging to the IncI1 group. pMLST analysis assigned them to sequence type 25 (ST25) (strain S13), ST3 (strain S25), and ST36 (strain S36). In addition, two of these strains (S25 and S36), as well as a fourth strain (S14), contained IncHI1 replicons, and one strain (S39) contained two IncHI2 replicons (HI2 and HI2A) located on the same DNA fragment after de novo assembly of the sequencing data. Isolates S25 and S36 both had perfect matches to all six alleles of the IncHI1 pMLST scheme, but the combination of alleles was new (subsequently assigned to ST10), whereas isolate S14 produced six completely new alleles with 85% to 93% identity to the alleles already present in the pMLST database for IncHI1 plasmids. These alleles have been added to the pMLST scheme, and ST13 has been assigned to the IncHI1 plasmid from isolate S14. Interestingly, five of the six IncHI1 alleles in S14 were identical to those of the IncHI1 plasmid pNDM-CIT, which was previously identified in a Citrobacter freundii isolate from a French patient with a urinary tract infection returning from India (8). Finally, isolate S39, carrying two IncHI2 replicons, produced a result for only one of the two loci of the IncHI2 scheme (smr0018 allele 1) and is therefore not typeable by this method.
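The four outcomes described above (a known ST, a new combination of known alleles, novel alleles, and an untypeable profile) can be summarized in a short sketch. This is an illustration under stated assumptions rather than the logic of the pMLST Web tool: the function, the locus names, and the profile-to-ST table are all hypothetical.

```python
def classify_profile(profile, known_sts, loci):
    """Classify a pMLST allele profile.

    profile   -- {locus: allele number, None if no hit, or "new" for a novel allele}
    known_sts -- {tuple of allele numbers in `loci` order: ST number}
    loci      -- ordered locus names of the scheme
    """
    alleles = tuple(profile.get(locus) for locus in loci)
    if any(a is None for a in alleles):
        return "not typeable (locus missing)"
    if any(a == "new" for a in alleles):
        return "new allele(s), needs curation"
    st = known_sts.get(alleles)
    if st is None:
        return "new ST (known alleles, new combination)"
    return f"ST{st}"


loci = ["locusA", "locusB"]         # hypothetical two-locus scheme
known_sts = {(1, 1): 1, (1, 2): 2}  # made-up profile-to-ST table

print(classify_profile({"locusA": 1, "locusB": 2}, known_sts, loci))
print(classify_profile({"locusA": 2, "locusB": 1}, known_sts, loci))
print(classify_profile({"locusA": "new", "locusB": 1}, known_sts, loci))
print(classify_profile({"locusA": 1}, known_sts, loci))
```

The missing-locus check comes first, so a profile with a single hit in a two-locus scheme is reported as not typeable regardless of its other alleles.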
Based on the results presented above, the PlasmidFinder and pMLST Web tools offer microbiologists without particular bioinformatics skills an opportunity to analyze whole-genome data, in both raw and assembled formats, obtained from their own benchtop sequencers and to retrieve plasmid replicon information for use in clinical and epidemiological investigations. An advantage of freely available Web-based services is that different investigators can use the same comprehensive curated database and standardized analytic settings, which makes results more reproducible and easier to compare between studies. At the moment, only one data set can be submitted per query session, but a tool for batch upload of multiple data sets is under construction. Finally, correlating plasmid replicons with antimicrobial resistance determinants is in some cases possible with the current technology; this is expected to improve further once next-generation sequencing and assembly are able to produce long contigs and, ultimately, complete plasmid sequences.


Here, we present two free, easy-to-use Web tools, PlasmidFinder and pMLST, to analyze and classify plasmids from bacterial species of the family Enterobacteriaceae. To generate the PlasmidFinder tool, plasmids in GenBank were classified into homology groups, which were nevertheless named according to the current plasmid nomenclature based on incompatibility groups. PlasmidFinder therefore not only detects replicons in WGS data but also assigns the plasmid under study to an Inc group and reports the GenBank accession number of the reference plasmid for that group. The advantage of using PlasmidFinder instead of submitting the query directly to BLASTn at the NCBI is the immediate classification of the plasmid into existing plasmid lineages, which is very useful for epidemiological tracing of the horizontal spread of antimicrobial resistance associated with particular epidemic plasmid types. Furthermore, PlasmidFinder accepts raw sequencing data directly from the sequencers and assembles these before comparing them to the plasmid replicon database, a feature that is not available at the NCBI.
Testing of these tools with both fully assembled plasmid sequences and WGS-generated draft genomes showed that they can detect a broad variety of plasmid replicons among a collection of S. Typhimurium isolates and can subtype the IncHI1, IncHI2, IncI1, and IncF plasmids present among the isolates. With the decreasing price of WGS and the increasing read lengths, plasmid replicon typing by this method, applied to plasmid epidemiology and the spread of antimicrobial resistance determinants, offers an alternative to traditional in vitro plasmid detection and subtyping.


We are grateful to Lisbeth Andersen and Rolf Sommer Kaas for excellent technical assistance.
This study was supported by the Center for Genomic Epidemiology (grant 09-067103/DSF from the Danish Council for Strategic Research) and by the InterOmics project (PB.P05), funded by the Italian Ministry of Education, University and Research (MIUR).

Supplemental Material

File (zac007143013so1.pdf)


1. Couturier M, Bex F, Bergquist PL, and Maas WK. 1988. Identification and classification of bacterial plasmids. Microbiol. Rev. 52:375–395.
2. Carattoli A, Bertini A, Villa L, Falbo V, Hopkins KL, and Threlfall J. 2005. Identification of plasmids by PCR-based replicon typing. J. Microbiol. Methods 63:219–228.
3. Datta N and Hedges RW. 1971. Compatibility groups among fi R factors. Nature 234:222–223.
4. Villa L, García-Fernández A, Fortini D, and Carattoli A. 2010. Replicon sequence typing of IncF plasmids carrying virulence and resistance determinants. J. Antimicrob. Chemother. 65:2518–2529.
5. Villa L, Poirel L, Nordmann P, Carta C, and Carattoli A. 2012. Complete sequencing of an IncH plasmid carrying the blaNDM-1, blaCTX-M-15 and qnrB1 genes. J. Antimicrob. Chemother. 67:1645–1650.
6. García-Fernández A, Fortini D, Veldman K, Mevius D, and Carattoli A. 2009. Characterization of plasmids harbouring qnrS1, qnrB2 and qnrB19 genes in Salmonella. J. Antimicrob. Chemother. 63:274–281.
7. García-Fernández A and Carattoli A. 2010. Plasmid double locus sequence typing for IncHI2 plasmids, a subtyping scheme for the characterization of IncHI2 plasmids carrying extended-spectrum beta-lactamase and quinolone resistance genes. J. Antimicrob. Chemother. 65:1155–1161.
8. Dolejska M, Villa L, Poirel L, Nordmann P, and Carattoli A. 2013. Complete sequencing of an IncHI1 plasmid encoding the carbapenemase NDM-1, the ArmA 16S RNA methylase and a resistance-nodulation-cell division/multidrug efflux pump. J. Antimicrob. Chemother. 68:34–39.
9. Johnson TJ, Bielak EM, Fortini D, Hansen LH, Hasman H, Debroy C, Nolan LK, and Carattoli A. 2012. Expansion of the IncX plasmid family for improved identification and typing of novel plasmids in drug-resistant Enterobacteriaceae. Plasmid 68:43–50.
10. Alvarado A, Garcillán-Barcia MP, and de la Cruz F. 2012. A degenerate primer MOB typing (DPMT) method to classify gamma-proteobacterial plasmids in clinical and environmental settings. PLoS One 7:e40438.
11. Jolley KA and Maiden MC. 2010. BIGSdb: scalable analysis of bacterial genome variation at the population level. BMC Bioinformatics 11:595.
12. García-Fernández A, Chiaretto G, Bertini A, Villa L, Fortini D, Ricci A, and Carattoli A. 2008. Multilocus sequence typing of IncI1 plasmids carrying extended-spectrum beta-lactamases in Escherichia coli and Salmonella of human and animal origin. J. Antimicrob. Chemother. 61:1229–1233.
13. García-Fernández A, Villa L, Moodley A, Hasman H, Miriagou V, Guardabassi L, and Carattoli A. 2011. Multilocus sequence typing of IncN plasmids. J. Antimicrob. Chemother. 66:1987–1991.
14. Phan MD, Kidgell C, Nair S, Holt KE, Turner AK, Hinds J, Butcher P, and Cooke FJ. 2009. Variation in Salmonella enterica serovar Typhi IncHI1 plasmids during the global spread of resistant typhoid fever. Antimicrob. Agents Chemother. 53:716–727.
15. Zankari E, Hasman H, Cosentino S, Vestergaard M, Rasmussen S, Lund O, Aarestrup FM, and Larsen MV. 2012. Identification of acquired antimicrobial resistance genes. J. Antimicrob. Chemother. 67:2640–2644.
16. Larsen MV, Cosentino S, Rasmussen S, Friis C, Hasman H, Marvig RL, Jelsbak L, Sicheritz-Pontén T, Ussery DW, Aarestrup FM, and Lund O. 2012. Multilocus sequence typing of total-genome-sequenced bacteria. J. Clin. Microbiol. 50:1355–1361.
17. Villa L, Carattoli A, Nordmann P, Carta C, and Poirel L. 2013. Complete sequence of the IncT-type plasmid pT-OXA-181 carrying the blaOXA-181 carbapenemase gene from Citrobacter freundii. Antimicrob. Agents Chemother. 57:1965–1967.
18. Villa L, Capone A, Fortini D, Dolejska M, Rodríguez I, Taglietti F, De Paolis P, Petrosillo N, and Carattoli A. 2013. Reversion to susceptibility of a carbapenem-resistant clinical isolate of Klebsiella pneumoniae producing KPC-3. J. Antimicrob. Chemother. 68:2482–2486.
19. García-Fernández A, Villa L, Carta C, Venditti C, Giordano A, Venditti M, Mancini C, and Carattoli A. 2012. Klebsiella pneumoniae ST258 producing KPC-3 identified in Italy carries novel plasmids and OmpK36/OmpK35 porin variants. Antimicrob. Agents Chemother. 56:2143–2145.
20. Dolejska M, Villa L, Hasman H, Hansen L, and Carattoli A. 2013. Characterization of IncN plasmids carrying blaCTX-M-1 and qnr genes in Escherichia coli and Salmonella from animals, the environment and humans. J. Antimicrob. Chemother. 68:333–339.
21. Woodford N, Carattoli A, Karisik E, Underwood A, Ellington MJ, and Livermore DM. 2009. Complete nucleotide sequences of plasmids pEK204, pEK499, and pEK516, encoding CTX-M enzymes in three major Escherichia coli lineages from the United Kingdom, all belonging to the international O25:H4-ST131 clone. Antimicrob. Agents Chemother. 53:4472–4482.
22. Bonnin RA, Poirel L, Carattoli A, and Nordmann P. 2012. Characterization of an IncFII plasmid encoding NDM-1 from Escherichia coli ST131. PLoS One 7:e34752.
23. Carattoli A, Aschbacher R, March A, Larcher C, Livermore DM, and Woodford N. 2010. Complete nucleotide sequence of the IncN plasmid pKOX105 encoding VIM-1, QnrS1 and SHV-12 proteins in Enterobacteriaceae from Bolzano, Italy compared with IncN plasmids encoding KPC enzymes in the USA. J. Antimicrob. Chemother. 65:2070–2075.
24. Carattoli A, Villa L, Poirel L, Bonnin RA, and Nordmann P. 2012. Evolution of IncA/C blaCMY-2-carrying plasmids by acquisition of the blaNDM-1 carbapenemase gene. Antimicrob. Agents Chemother. 56:783–786.
25. Zankari E, Hasman H, Kaas RS, Seyfarth AM, Agersø Y, Lund O, Larsen MV, and Aarestrup FM. 2013. Genotyping using whole-genome sequencing is a realistic alternative to surveillance based on phenotypic antimicrobial susceptibility testing. J. Antimicrob. Chemother. 68:771–777.
26. Cottell JL, Webber MA, Coldham NG, Taylor DL, Cerdeño-Tárraga AM, Hauser H, Thomson NR, Woodward MJ, and Piddock LJ. 2011. Complete sequence and molecular epidemiology of IncK epidemic plasmid encoding blaCTX-M-14. Emerg. Infect. Dis. 17:645–652.
27. Sherburne CK, Lawley TD, Gilmour MW, Blattner FR, Burland V, Grotbeck E, Rose DJ, and Taylor DE. 2000. The complete DNA sequence and analysis of R27, a large IncHI plasmid from Salmonella Typhi that is temperature sensitive for transfer. Nucleic Acids Res. 28:2177–2186.
28. Gilmour MW, Thomson NR, Sanders M, Parkhill J, and Taylor DE. 2004. The complete nucleotide sequence of the resistance plasmid R478: defining the backbone components of incompatibility group H conjugative plasmids through comparative genomics. Plasmid 52:182–202.
29. Poirel L, Bonnin RA, and Nordmann P. 2011. Analysis of the resistome of a multidrug-resistant NDM-1-producing Escherichia coli strain by high-throughput genome sequencing. Antimicrob. Agents Chemother. 55:4224–4229.
30. Hansen LH, Johannesen E, Burmolle M, Sorensen AH, and Sorensen SJ. 2004. Plasmid-encoded multidrug efflux pump conferring resistance to olaquindox in Escherichia coli. Antimicrob. Agents Chemother. 48:3332–3337.
31. Marchler-Bauer A, Lu S, Anderson JB, Chitsaz F, Derbyshire MK, DeWeese-Scott C, Fong JH, Geer LY, Geer RC, Gonzales NR, Gwadz M, Hurwitz DI, Jackson JD, Ke Z, Lanczycki CJ, Lu F, Marchler GH, Mullokandov M, Omelchenko MV, Robertson CL, Song JS, Thanki N, Yamashita RA, Zhang D, Zhang N, Zheng C, and Bryant SH. 2011. CDD: a Conserved Domain Database for the functional annotation of proteins. Nucleic Acids Res. 39:D225–D229.
32. Smorawinska M, Szuplewska M, Zaleski P, Wawrzyniak P, Maj A, Plucienniczak A, and Bartosik D. 2012. Mobilizable narrow host range plasmids as natural suicide vectors enabling horizontal gene transfer among distantly related bacterial species. FEMS Microbiol. Lett. 326:76–82.
33. DANMAP. 2012. DANMAP 2011 - use of antimicrobial agents and occurrence of antimicrobial resistance in bacteria from food animals, foods and humans in Denmark. 2012.∼/media/Projekt%20sites/Danmap/DANMAP%20reports/Danmap_2011.ashx.
34. Miriagou V, Carattoli A, and Fanning S. 2006. Antimicrobial resistance islands: resistance gene clusters in Salmonella chromosome and plasmids. Microbes Infect. 8:1923–1930.
35. Helmuth R, Stephan R, Bunge C, Hoog B, Steinbeck A, and Bulling E. 1985. Epidemiology of virulence-associated plasmids and outer membrane protein patterns within seven common Salmonella serotypes. Infect. Immun. 48:175–182.

Information & Contributors


Published In

Antimicrobial Agents and Chemotherapy
Volume 58, Number 7, July 2014
Pages: 3895 - 3903
PubMed: 24777092


Received: 29 January 2014
Returned for modification: 9 March 2014
Accepted: 19 April 2014
Published online: 12 June 2014



Alessandra Carattoli
Department of Infectious, Parasitic and Immuno-Mediated Diseases, Istituto Superiore di Sanità, Rome, Italy
Ea Zankari
Danish Technical University, National Food Institute, Division for Epidemiology and Microbial Genomics, Lyngby, Denmark
Aurora García-Fernández
Department of Infectious, Parasitic and Immuno-Mediated Diseases, Istituto Superiore di Sanità, Rome, Italy
Mette Voldby Larsen
Danish Technical University, Center for Biological Sequence Analysis, Department of Systems Biology, Lyngby, Denmark
Ole Lund
Danish Technical University, Center for Biological Sequence Analysis, Department of Systems Biology, Lyngby, Denmark
Laura Villa
Department of Infectious, Parasitic and Immuno-Mediated Diseases, Istituto Superiore di Sanità, Rome, Italy
Frank Møller Aarestrup
Danish Technical University, National Food Institute, Division for Epidemiology and Microbial Genomics, Lyngby, Denmark
Henrik Hasman
Danish Technical University, National Food Institute, Division for Epidemiology and Microbial Genomics, Lyngby, Denmark


Address correspondence to Henrik Hasman, [email protected].
