Research Article
19 May 2020

Custom Matrix-Assisted Laser Desorption Ionization–Time of Flight Mass Spectrometric Database for Identification of Environmental Isolates of the Genus Burkholderia and Related Genera


Success of discovery programs for microbial natural products is dependent on quick and concise discrimination between isolates from diverse environments. However, laboratory isolation and identification of priority genera using current 16S rRNA PCR-based methods are both challenging and time-consuming. An emerging strategy for rapid isolate discrimination is protein fingerprinting via matrix-assisted laser desorption ionization (MALDI) mass spectrometry. Using our in-house environmental isolate repository, we have created a main spectral (MSP) library for the Bruker Biotyper MALDI mass spectrometer that contains 95 entries, including Burkholderia, Caballeronia, Paraburkholderia, and other environmentally related genera. The library creation required the acquisition of over 2,250 mass spectra, which were manually reviewed for quality control and consolidated into a single reference library using a commercial software platform. We tested the effectiveness of the reference library by analyzing 49 environmental isolate strains using two different sample preparation methods. Overall, this approach correctly identified all strains to the genus level provided that suitable reference spectra were present in the MSP library. In this study, we present a fast, accurate method for taxonomic assignment of environmentally derived bacteria from the order Burkholderiales, providing a valuable alternative to traditional PCR-based methods. The MSP library described in the manuscript is available for use.
IMPORTANCE The Gram-negative proteobacterial order Burkholderiales has emerged as a promising source of novel natural products in recent years. This order includes the genus Burkholderia and the newly defined genera Caballeronia and Paraburkholderia. However, development of this resource has been hampered by difficulties with rapid and selective isolation of Burkholderiales strains from the environment. Environmental metagenome sequencing has revealed that the potential for natural products is not evenly distributed throughout the microbial world. Thus, large but targeted microbial isolate libraries are needed to effectively explore the chemical potential of natural products. To study these organisms efficiently, methods to quickly identify isolates to the genus level are required. Matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) is already used in clinical settings to reliably identify unknown bacterial pathogens. We have adapted similar methodology using the MALDI Biotyper instrument to rapidly identify environmental isolates of Burkholderia, Caballeronia, and Paraburkholderia for downstream natural product discovery.


The natural environment has long been a source of inspiration for the discovery of new classes of bioactive molecules, with natural products driving innovation in health, manufacturing, and agricultural research (1). One common strategy for chemical diversification in microbial natural product libraries is the exploration of niche organisms or environments. Indeed, novel classes of bioactive chemical compounds have been isolated from a wide array of “understudied” genera (2–4), validating this approach. However, studies show that natural product biosynthetic capacity is not evenly distributed in the environment, even among strains with similar 16S rRNA gene sequences (5, 6). Therefore, large environmental isolate libraries are required if research programs are to capture the full chemical capacity of a given genus. In many cases, challenges with selective isolation and rapid identification of priority strains limit the ability to create these libraries efficiently, precluding the development of these new sources of natural product diversity.
The genus Burkholderia has yielded a number of biomedically important natural products (7) but remains an underexplored genus with respect to natural product research (8–14). We recently presented a selective isolation strategy for this genus that relies on multiplex PCR and 16S rRNA gene sequencing for genus-level identification (15). However, the low phylogenetic and discriminatory powers of 16S rRNA genes for certain genera, including Burkholderia (16), hamper this already technically demanding and time-consuming identification method. To improve the efficiency of this approach, we have now designed, created, and validated a targeted main spectral (MSP) library for mass spectrometry-based taxonomic assignment of environmental samples. This method bypasses the need for initial 16S rRNA analysis, greatly improving the efficiency of the library creation process and improving the size and specificity of targeted libraries.
Mass spectrometric methods are now routinely used for microorganism identification in clinical settings. Matrix-assisted laser desorption ionization–time of flight mass spectrometry (MALDI-TOF MS) has accelerated identification efforts in clinical laboratories through the development of instruments such as the MALDI Biotyper (Bruker Daltonics). MALDI Biotyper analysis is attractive because samples can be generated directly from colonies on plates, and workup is both facile and inexpensive. The benchtop instrument includes a basic MALDI ionization source and a short flight path TOF mass analyzer capable of obtaining protein fingerprints for target organisms. Organisms can be identified by comparing peaks from the MS fingerprint region (3,000 to 15,000 Da) to commercial reference libraries of protein fingerprints from clinically relevant bacterial strains (Bruker Daltonics, bioMérieux) (17). In a similar fashion, Clark et al. have recently developed a MALDI-based isolate-profiling platform, termed IDBac, which profiles both the protein and small molecule fingerprints from isolate libraries to relate isolates at the protein and small molecule levels (18).
Given the robust classification, rapid acquisition times, and the low per-sample cost of Biotyper analysis, we aimed to extend the use of this technology to the identification of environmental isolates for natural product library construction (Fig. 1). This follows previous work in this area examining horseradish soil (19), arctic seawater (20), spacecraft organisms (21), and polymer-producing bacteria (22). Because the primary application for Biotyper instrumentation is in clinical settings, current reference libraries lack coverage of environmental strains for most genera. To address this shortcoming, we have used our newly created rhizosphere-derived isolate library to create a targeted Biotyper main spectral (MSP) reference library for Burkholderia strains and other related genera.
FIG 1 Workflow for environmental isolate identification using the MALDI Biotyper. Isolate grown on solid agar (A); liquid extraction of single colony (B); application of extract and HCCA matrix to MALDI target (C); protein fingerprint analysis by MALDI mass spectrometry (D); data processing of individual mass spectra (E); and taxonomic identification by comparison to mass spectral reference library (F).
The newly created MSP reference library was evaluated for accuracy of identification for environmental isolates not included in the training set, compared to traditional 16S rRNA gene sequencing. Finally, the MSP reference library was exported in the manufacturer’s standard reference library format. This reference library and associated data can be found at Dryad (


Strain selection.

The accuracy of MALDI Biotyper taxonomic assignment is heavily influenced by the size and constitution of the main spectral (MSP) library used for reference. The current commercial MSP library (MBT Compass Library, Revision F) contains 8,468 entries from 540 genera and 2,969 species, with a heavy focus on clinical pathogens. To extend the coverage of isolates derived from rhizosphere environments, we selected 95 representatives from our 390-member rhizosphere isolate library to create a new rhizosphere-specific MSP library. These isolates derived from 42 unique field samples (Table S1 in the supplemental material) and included representatives from all Burkholderiales genera present in our library (3 genera, 21 species, 75 total strains; Table S2), as well as representatives from untargeted genera (6 genera, 19 species, 20 total strains; Table S2). Although these strains do not cover all genera from the order Burkholderiales, they are representative of the diversity of proteobacteria identified in previous metagenomic studies of rhizosphere environments. This strain panel was therefore well suited to the objective of rapid characterization of Gram-negative proteobacteria from rhizosphere samples.
MSP library creation. For robust library creation, each strain requires a minimum of 20 individual MS spectra, which are then averaged to produce a single MSP entry. These spectra are derived from three technical replicates on each of eight separate MALDI target spots prepared using the “extraction” sample preparation method. Isolates were grown on LB agar plates and single colonies were picked and processed using the standard MSP library creation protocol (see the Materials and Methods section). Following colony workup, samples were prepared on eight positions on the MALDI target plate, and three separate mass spectra were acquired for each target position. For the 95 strains analyzed, this required 760 target positions and a total of 2,280 individual mass spectra. In addition, a bacterial test standard (BTS; Materials and Methods) was pipetted onto the MALDI target plate. For MSP acquisition, each MALDI target plate was required to pass quality control checks on the BTS in flexControl v. 3.4 and flexAnalysis v. 3.4 before advancing to MSP creation. In rare cases, plates failed this quality control check. In these situations, plates were remade, with the new MALDI plates passing quality control requirements in all cases.
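The acquisition workload described above follows directly from the per-strain design. The short Python sketch below is only an illustration of that arithmetic; the constant and function names are ours and are not part of any vendor software.

```python
# Acquisition plan for MSP library creation, using the counts given in the text.
STRAINS = 95              # strains selected for the library
SPOTS_PER_STRAIN = 8      # MALDI target positions per strain
SPECTRA_PER_SPOT = 3      # spectra acquired per target position
MIN_SPECTRA_PER_MSP = 20  # minimum spectra retained after quality control


def acquisition_plan(strains, spots_per_strain, spectra_per_spot):
    """Return (target positions, spectra per strain, total spectra)."""
    positions = strains * spots_per_strain
    per_strain = spots_per_strain * spectra_per_spot
    return positions, per_strain, positions * spectra_per_spot


positions, per_strain, total = acquisition_plan(
    STRAINS, SPOTS_PER_STRAIN, SPECTRA_PER_SPOT
)
# Spectra that can be discarded per strain without dropping below the minimum.
removable = per_strain - MIN_SPECTRA_PER_MSP
print(positions, per_strain, total, removable)
```

The last value shows how much headroom the design leaves for quality control: with 24 spectra acquired and 20 required, up to four spectra per strain can be rejected before a strain has to be reacquired.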
Following data acquisition, isolates were processed individually by aligning all 24 mass spectra and manually evaluating each spectrum for anomalous signals and/or poor signal to noise. Spectra that failed this quality control check were removed, and the remaining spectra (a minimum of 20) were incorporated into a single MSP entry using the vendor software. Overall the quality of the reference spectra was high, with many MSP entries having no spectra removed. For the remainder, an average of 1 to 2 spectra were removed, mostly due to the presence of erroneous high intensity signals or m/z shifted peaks. Representative MSP entries are presented in Fig. 2, including all replicates for each example as overlaid plots. In this study, all of the strains prepared for MSP creation were inserted into the library, i.e., no situations were encountered where the MS quality was too low to permit insertion. Finally, each MSP entry was annotated with the current genus and species classifications from BLASTn results (April 2019; Table S2) based on 16S rRNA sequence data for each isolate.
FIG 2 Twenty-four overlaid mass spectra with smoothing and baseline subtraction for Paraburkholderia graminis (A) and Bacillus mycoides (B). Expansions present the 4,000 to 6,000 m/z range. Each of the 24 mass spectra was individually reviewed for flatline spectra, intense peaks, dramatic mass shifts, and other anomalies. Between 20 and 24 mass spectra have been averaged to generate each MSP reference library entry.

Evaluation of spectral similarity by taxonomic rank.

The central premise of Biotyper analysis is that MALDI protein fingerprint similarities should parallel taxonomic relatedness. Figure 3 presents representative spectra from within the same genus (Fig. 3A), family (Fig. 3B), and class (Fig. 3C). Qualitative review of these spectra confirms that spectral similarities are highest for the intragenus samples, with the three examples sharing a high proportion of mass spectrometric features. At the family level this similarity is lower, with comparatively few shared features, while at the class level the spectra bear little relationship to one another.
FIG 3 Comparison of MALDI mass spectra at different taxonomic ranks. (A) Example spectra from the same genus. (B) Example spectra from the same family. (C) Example spectra from the same class.
To quantify spectral similarities between isolates, we created a distance matrix of protein fingerprint similarity scores for all members of the MSP library. MSP database entries were compared as a composite correlation index (CCI) matrix (MALDI Biotyper OC v. 3.1). The exported CCI matrix was clustered hierarchically with an average linkage algorithm in Cluster 3.0 (23) and visualized as a heatmap in Java Treeview (Fig. 4A; also, high resolution versions of Fig. 4 panels are available in the supplemental material). Overall, members of the genus Paraburkholderia showed high intragenus correlation scores (blue region) with an average of 54.2% between all members. At the species level, correlation scores were even higher, with an average of 80.4% for Paraburkholderia dipogonis, 78.8% for Paraburkholderia graminis, and 76.3% for Paraburkholderia caledonica. In contrast, members of the Paraburkholderia genus showed very low similarity scores to strains from other genera. The maximum similarity scores between Paraburkholderia and Pseudomonas (4.4%), Paenibacillus (8.1%), and even Caballeronia (16.4%) were very low, indicating the ability of protein fingerprinting to clearly discriminate between genera from this sample set. Interestingly, even for genera with no duplicate species in the MSP library (e.g., Paenibacillus), the intragenus scores were high (42.3%) compared to the scores against other genera (8.1% compared to Paraburkholderia). This suggests that the likelihood of false assignment to the incorrect genus is low, due to the high relatedness within genera and the very low spectral relatedness to strains outside the genus.
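The intragenus and intergenus comparisons above amount to averaging selected cells of the CCI matrix. The following Python sketch shows one way such averages could be computed from an exported matrix; the function name and the toy values are hypothetical and are not taken from the study data.

```python
from itertools import combinations


def mean_similarity(cci, labels, genus_a, genus_b=None):
    """Mean CCI within genus_a, or between genus_a and genus_b if given.

    cci is a symmetric matrix (list of lists) with values from 0 to 1;
    labels[i] is the genus of the entry in row/column i.
    """
    idx_a = [i for i, genus in enumerate(labels) if genus == genus_a]
    if genus_b is None or genus_b == genus_a:
        # Within one genus: every unordered pair of distinct entries.
        pairs = list(combinations(idx_a, 2))
    else:
        idx_b = [i for i, genus in enumerate(labels) if genus == genus_b]
        pairs = [(i, j) for i in idx_a for j in idx_b]
    if not pairs:
        raise ValueError("no pairs available for this comparison")
    return sum(cci[i][j] for i, j in pairs) / len(pairs)


# Toy example with made-up similarity values.
labels = ["Paraburkholderia"] * 3 + ["Pseudomonas"] * 2
cci = [
    [1.00, 0.60, 0.50, 0.04, 0.02],
    [0.60, 1.00, 0.55, 0.03, 0.05],
    [0.50, 0.55, 1.00, 0.01, 0.03],
    [0.04, 0.03, 0.01, 1.00, 0.45],
    [0.02, 0.05, 0.03, 0.45, 1.00],
]
intra = mean_similarity(cci, labels, "Paraburkholderia")
inter = mean_similarity(cci, labels, "Paraburkholderia", "Pseudomonas")
print(f"intragenus {intra:.1%}, intergenus {inter:.1%}")
```

The diagonal is excluded from the intragenus average, since the self-similarity of each entry would otherwise inflate the result.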
FIG 4 Composite correlation index (CCI) matrix visualized as a color-spectrum heatmap. (A) Hierarchical clustering of 95 strains clustered by MSP profiles. (B) Hierarchical clustering of MSP profiles (x) versus 16S rRNA gene sequences (y). Initial clustering of MSP profiles was generated by composite correlation index (CCI) on MALDI Biotyper OC 3.1 with Euclidean distance algorithms. The 16S rRNA sequences were clustered as a maximum likelihood tree under the Tamura-Nei model using MEGA X. Average linkages of MSP profiles were generated using Cluster 3.0 and visualized in Java Treeview. Distances between mass spectra are represented as a color spectrum, with bright blue denoting highest similarity and black denoting least similarity. Full-page spectra with isolate names are available in Fig. S1 and S2 of the supplemental material.
To compare taxonomic organization between the MALDI Biotyper and traditional 16S rRNA sequencing, we created a second heatmap using the CCI matrix discussed above, substituting in the hierarchical relationships derived from 16S rRNA sequencing on the vertical axis (Fig. 4B). Comparison of panels A and B from Fig. 4 illustrates the degree of conservation for taxonomic rankings between MALDI Biotyper and 16S rRNA analyses. If the two classification methods provided identical results, then Fig. 4A and B would be indistinguishable. Instead, although the majority of Burkholderia and Paraburkholderia strains have high spectral similarity scores (large blue region in both panels), in panel B this region is split by the presence of several strains that show low spectral similarities to other members of the genus. Paraburkholderia strains from the species P. megapolitana, P. fungorum, P. metrosideri, and P. ginsengisoli all have examples that show weak correlations to other members of the genus. Surprisingly, two of these species (P. fungorum and P. megapolitana) have several representatives in the MSP library that show low intraspecies spectral similarities. Inspection of the MSP library entries confirmed that sample preparation and spectral acquisition were performed appropriately, and that alignment to MSP entries was not biased by batch effects from MSP library acquisition. These are the only two species that exhibited this trait, with other species showing consistently high intraspecies similarities. This observation reinforces the requirement for extensive strain coverage in MSP libraries if species-level identifications are required.

Evaluation of MSP library performance.

The central objective of this study was to examine the suitability of MALDI Biotyper analysis for rapid taxonomic assignment of environmental isolates from rhizosphere samples. To explore this question, we evaluated 49 previously unanalyzed strains from our environmental isolate library. These strains were representative of the taxonomic diversity encountered during field sampling campaigns, including examples from the genera Burkholderia, Paraburkholderia, Caballeronia, Pseudomonas, and Dyella. The test set was designed to evaluate different assignment scenarios by including examples of both genus-level and species-level matches to the MSP library (Table S1, 2018 and 2019). Standard 16S rRNA gene sequence data were acquired for all 49 strains using 8F and 1492R universal primers (see Materials and Methods). Using all 144 strains, we created a maximum-likelihood tree that included both the 95 isolates from the MSP library and the 49 additional isolates to illustrate the full taxonomic diversity of this sample set (Fig. 5).
FIG 5 Maximum-likelihood trees of 16S rRNA sequence of isolates used in the MSP library and identification experiments. Isolates are color coded by genus. A black circle indicates isolates that are included in the MSP library.
MALDI Biotyper analysis of test set. MALDI-TOF MS data were acquired for the test set using the “extraction” sample preparation method and standard MS acquisition parameters (see Materials and Methods). For unknown isolates, the workflow involved spectrum acquisition, spectrum processing, and spectral matching against the MSP library. The MALDI Biotyper control software (flexControl v. 3.4, flexAnalysis v. 3.4, MALDI Biotyper OC v. 3.1) generates a score value for each isolate, which indicates whether it has been reliably identified to the species level (a score value of ≥2.0) or the genus level (a score value of ≥1.7), or has not received reliable identification (a score value of <1.7). This score value was calculated by taking the percentage of peaks of the unknown spectrum compared against an MSP entry (Rel_Sc) and multiplying it by the percentage of the unknown sample’s peak lists to the MSP (Rel_P_Num). It was then multiplied by a correlation value that relates to the intensities of the matching peaks (I-Corr) to generate a score value. This score value was multiplied by 1,000 and converted to a logarithmic score out of 3 that defined the confidence of the assignment.
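The score calculation described above can be written out explicitly. The Python sketch below is our reading of that description and not the vendor's implementation: the three factors are treated as fractions between 0 and 1, and the handling of very small products (reported here as 0) is an assumption made so that the result stays on the 0 to 3 scale.

```python
import math


def log_score(rel_sc, rel_p_num, i_corr):
    """Logarithmic score: log10(Rel_Sc x Rel_P_Num x I-Corr x 1,000).

    All three inputs are fractions between 0 and 1.
    """
    factors = {"rel_sc": rel_sc, "rel_p_num": rel_p_num, "i_corr": i_corr}
    for name, value in factors.items():
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1, got {value}")
    raw = rel_sc * rel_p_num * i_corr * 1000.0
    # Assumption: products below 1 are reported as 0 so the score stays in 0..3.
    return math.log10(raw) if raw > 1.0 else 0.0


def min_product_for(score):
    """Smallest product of the three factors that reaches a given score."""
    return 10.0 ** (score - 3.0)


print(log_score(1.0, 1.0, 1.0))        # perfect match on all three factors
print(round(min_product_for(2.0), 4))  # product needed for a species-level score
print(round(min_product_for(1.7), 4))  # product needed for a genus-level score
```

Because the scale is logarithmic, the species-level threshold of 2.0 corresponds to a combined product of 0.1, and the genus-level threshold of 1.7 to a product of roughly 0.05. Modest losses in any one factor therefore move a match across a threshold quickly.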
Of the 49 isolates in the test set, 39 had species-level representation in our MSP library. In these cases, all test strains were correctly identified at the genus level (100%; Fig. 6A), with 24 (61.5%; Fig. 6A) also correctly identified at the species level. For the 15 strains that were not assigned to the correct species, 6 had score values that were too low to permit confident species-level assignments, while the remaining 9 strains were incorrectly assigned to other species within the same genus. In general, species with three or more representatives in the MSP library were correctly assigned more frequently (61.5%) than instances with two or fewer MSP representatives (38.5%). This illustrates the importance of MSP library composition for the target strain set being analyzed, and highlights the need for multiple examples of each species in the candidate pool if species-level assignment is required.
FIG 6 Bar graphs illustrating identification rates within the MSP library for both extraction and eDT sample preparation methods. (A) Identification rates for isolates with species-level representation in the MSP library. (B) Identification rates for isolates without species-level representation in the MSP library.
The remaining 10 members of the test set only had representation in our MSP library at the genus level. Of these, just two were assigned to the correct genus, with the remaining 8 strains possessing score values that were too low to permit reliable identification (Fig. 6B). However, there were a further three cases where isolates matched correctly to the genus level, but had logarithmic match scores just below 1.7, so could not be reliably identified (Table S4). This result highlights the importance of species-level representation in the MSP library. Without representatives of the correct species, score values are often too low to permit strain identification. To improve environmental strain identification, it is therefore recommended that strains that do not receive an assignment are characterized by 16S rRNA analysis and added to the MSP library. Our results indicate that this step will involve predominantly strains with no existing species-level coverage in the MSP library, leading to a rapid expansion of coverage for commonly encountered species.

Comparison of extraction and eDT sample preparation methods.

The manufacturer provides two recommended protocols for MALDI Biotyper sample preparation: (i) the extraction method and (ii) the extended direct transfer (eDT) method. These two methods differ moderately in required labor, time, and cost, with the eDT method being more straightforward. The extraction method requires colony washing (water/ethanol) followed by extraction (formic acid/acetonitrile), addition of the extraction supernatant to the MALDI target plate, and overlay with alpha-cyano-4-hydroxycinnamic acid (HCCA) matrix (see Materials and Methods). In contrast, the eDT method involves direct transfer of the bacterial colony to the target plate, followed by treatment in situ with formic acid (1 μL) and application of the HCCA matrix.
To assess the suitability of each method for characterizing unknown environmental isolates, we repeated our analysis of the 49-member test set using the eDT method. Using this approach, 35 of the 39 strains with species-level representatives in the MSP library were correctly identified at the genus level (89.7%; Fig. 6A), with 21 of these also assigned to the correct species (53.8%; Fig. 6A). For the 10 strains with MSP representation at the genus level, four were assigned to the correct genus, with a further four strains matched to the correct genus, but with logarithmic scores below 1.7 (Fig. 6B). Overall, the eDT method was more susceptible to minor variations in sample handling and led to a higher rate of failed MS analyses. Other researchers have recommended the extraction and eDT methods over quicker direct transfer (DT) methods that do not require extraction of proteins using solvents other than formic acid (17). Closer analysis of the logarithmic scores using extraction and eDT methods indicated that use of the extraction method increased the logarithmic scores in 40 of 49 instances, even in cases where both methods classified the isolate to the same genus or species (Table S4). Overall, given the modest differences in materials, costs, and sample preparation time, we recommend the extraction method for analysis of unknown environmental isolate libraries.
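For readers who wish to check the reported rates, the Python sketch below tabulates the counts given in the text for both sample preparation methods. The `Tally` class and its field names are illustrative only.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tally:
    """Identification counts for one method and one group of test strains."""
    total: int
    genus_correct: int
    species_correct: int = 0

    def genus_rate(self):
        return round(100.0 * self.genus_correct / self.total, 1)

    def species_rate(self):
        return round(100.0 * self.species_correct / self.total, 1)


# Counts as reported in the text (Fig. 6A and B).
RESULTS = {
    ("extraction", "species in library"): Tally(39, 39, 24),
    ("eDT", "species in library"): Tally(39, 35, 21),
    ("extraction", "genus only in library"): Tally(10, 2),
    ("eDT", "genus only in library"): Tally(10, 4),
}

for (method, group), tally in RESULTS.items():
    print(f"{method:10s} {group:22s} "
          f"genus {tally.genus_rate():5.1f}%  species {tally.species_rate():5.1f}%")
```

Species-level rates are only meaningful for the 39 strains with species-level representation in the library; for the remaining 10 strains the species count is left at zero by construction.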

Distribution of custom MSP database.

MSP libraries are time-consuming and expensive to produce, due to the dual requirements of 16S rRNA sequencing for all strains and the need for 24-replicate MS analyses, along with detailed manual review for each entry. To facilitate the utility of this approach, we have exported the MSP library described in the manuscript in the required format for import to other Biotyper systems. Both the custom MSP database and other relevant materials are available for download (
In summary, while 16S rRNA gene sequencing remains the industry standard for identifying environmental bacterial isolates, we have demonstrated that the MALDI Biotyper is a rapid, accurate, and economical alternative to this approach. Correct assignment of all tested unknown isolates to the genus level using the extraction sample preparation method affords high assignment accuracy while cutting laboratory time from days to minutes, provided that the target species is represented in the MSP reference library. For this application, accurate assignment to target genera with low false-positive rates is preferable to comprehensive assignments with higher error rates. Given that there were no instances of false assignment at the genus level, the MALDI Biotyper method can be considered a powerful assignment tool even when specific species are not present in the MSP library. In these cases, isolates either were assigned to the correct genus (20% with the extraction method; 40% with the eDT method) or remained unassigned, with no instances of false assignment. Overall, the quality of assignment was strongly contingent on the size and quality of the MSP library. Our large-scale Burkholderiales MSP library is now freely available, and provides a strong foundation for the expansion and further development of this valuable and efficient discovery method.


Source of bacterial isolates.

Bacterial isolates were obtained using established isolation methodology (15). Briefly, 57 root and soil samples were collected from 9 locations in BC, Canada (Table S1). Root material was exposed using a sterilized metal scoop and a short segment of root stock cut from the plant using sterile scissors and placed with attached soil into a 50-ml centrifuge tube (Falcon). This tube was capped with parafilm and stored at 4°C before processing.
Samples were processed by taking a small clipping of root and soil from the Falcon tube and transferring it to a 15-ml centrifuge tube (Falcon). Sterile 1× phosphate-buffered saline (PBS) buffer was added to each 15-ml centrifuge tube, vortexed for 30 s, and allowed to settle for 30 min. Next, 100-μl aliquots were spread onto the six different selection medium plates and incubated for 5 to 7 days at 30°C. Colonies of interest were selected using a sterile plastic loop and grown in LB liquid medium (10 ml) with shaking at 200 rpm overnight to prepare cultures for DNA extraction and sequencing. For long-term storage, 500 μl of each culture was added to a sterile solution of 1:1 glycerol/water in cryo-microcentrifuge tubes and stored at −80°C.

DNA purification and 16S rRNA gene sequence analysis of MSP library isolates.

Isolates underwent 16S rRNA sequence analysis as previously reported (15). DNA was extracted from overnight cultures grown in LB medium using Promega Wizard Genomic DNA purification kit employing the Gram-negative bacterial protocol. DNA concentrations were measured using a SpectraDrop Micro-volume microplate with a SpectraMax i3x plate reader (Molecular Devices). Rehydration buffer from the DNA purification kit was used as the blank in three technical replicates. Extracted DNA was used for PCR experiments.
Reactions took place in 50-μl volumes containing 25 μl 2× MasterMix with dye (ABM), 2.5 μl of 8F and 1492R primers (0.15 μM final concentration), 20 μl of nuclease-free water, and template DNA. Cycling conditions consisted of an initial denaturation at 95°C for 5 min, followed by 35 cycles of 95°C for 1 min, 52.5°C for 1.5 min, and 72°C for 1.5 min, and a final extension at 72°C for 10 min. Standard gel electrophoresis (1% agarose in 1× Tris-acetate-EDTA [TAE] buffer) at 100 V for 30 min (Bio-Rad, Mississauga, ON) confirmed PCR products. PCR products were purified using the QIAQuick PCR purification kit according to the manufacturer’s protocol (Qiagen). DNA concentrations were measured using the SpectraMax i3x plate reader. PCR products were sequenced by the UBC-NAPS sequencing service (University of British Columbia, Vancouver, BC) with 8F and 1492R primers. Contigs were generated with CAP3 (24). BLASTn searches of the 16S rRNA sequence database were conducted to find the top BLAST homolog and percent identity. A full list of isolates used in this study is presented in Table S2.

MALDI-TOF custom main spectral library sample preparation.

Ninety-five main spectral (MSP) entries were created according to the manufacturer’s protocol for custom library preparation (Bruker Daltonics). Aliquots of 100 μl of glycerol stock solution of isolates (Table S2) were pipetted separately onto LB agar plates, spread with a sterile plastic inoculating loop, and incubated for 72 h at 30°C. An aliquot of 10 μl of biological material was added into a 1.5-ml microcentrifuge tube (Eppendorf) with 300 μl of high-pressure liquid chromatography (HPLC)-grade water. Ethanol (900 μl) was added and mixed by pipette. Tubes were centrifuged for 2 min at 20,000 × g, decanted, and then centrifuged again. Residual ethanol was removed by pipette without disturbing the pellet. The pellet was air-dried for at least 5 min at room temperature. Forty microliters of 70% formic acid (Sigma-Aldrich) was added to the pellet and mixed via pipette and/or vortex until uniformly suspended. Forty microliters of acetonitrile (Thermo Fisher Scientific) was added and mixed. Microcentrifuge tubes were centrifuged again for 2 min at 20,000 × g. One microliter of bacterial test standard (BTS, Bruker Daltonics) was pipetted onto two MALDI target spots and allowed to dry for use in calibration during acquisition and processing. For each isolate of interest, 1 μl of supernatant was pipetted onto the MALDI target plate in 8 replicates and allowed to evaporate at room temperature. Once all target spots had dried, material was immediately overlaid with 1 μl of alpha-cyano-4-hydroxycinnamic acid solution (HCCA; solubilized in 250 μl of 50% acetonitrile, 47.5% milliQ water, 2.5% trifluoroacetic acid) followed by evaporation at room temperature. All solvents used were HPLC or liquid chromatography-mass spectrometry (LCMS) grade.

MALDI-TOF custom MSP library data acquisition.

MALDI-TOF MS was performed using a Microflex LT bench-top mass spectrometer (Bruker Daltonics) following the manufacturer’s recommended protocol for custom MSP and library creation (Bruker Daltonics). Auto-calibration on the BTS spot was run within flexControl v. 3.4 (Bruker Daltonics) before advancing to MSP creation. Within flexControl v. 3.4, AutoXecute Run with method MBT_AutoX as AutoXecute Method and MBT_Standard.FAMSMethod as flexAnalysis method were used for database creation. Once each analysis was complete, a calibration check was performed in flexAnalysis v. 3.4 (Bruker Daltonics). Three separate scans were taken for each of the 8 technical replicates, yielding 24 spectra per isolate prior to MSP data processing.

MALDI-TOF custom MSP library data processing.

BTS spectra were opened and smoothing was applied to the mass spectrum and baseline. Masses of 8 peaks were compared against manufacturer’s reference masses. Quality control of spectra was done by comparison to BTS. The mass spectra from the 8 replicates were opened and smoothing was applied to the mass spectrum and baseline. Constant values for c0, c1, and c2 of one mass spectrum were measured against these values of the BTS. Outliers, flatline spectra, very intense peaks, dramatic mass shifts, or anomalies were noted, and individual spectra containing these were removed before MSP creation. At least 20 spectra were required for the creation of each MSP. Seventy peaks with greater than 25% frequency were required before incorporation of MSP into the MSP library (Table S2).
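The two acceptance criteria at the end of this section (at least 20 spectra, and 70 peaks present in more than 25% of spectra) can be expressed as a simple check. The Python sketch below is a hedged illustration: the vendor software's peak-matching algorithm is not described here, so peaks are grouped into fixed-width m/z bins, and both the binning approach and the `bin_width` parameter are assumptions of ours.

```python
from collections import Counter


def peak_frequencies(spectra, bin_width=5.0):
    """Fraction of spectra in which each m/z bin contains at least one peak.

    spectra is a list of peak lists (m/z values), one list per spectrum.
    Fixed-width binning is an assumption, not the vendor's matching method.
    """
    if not spectra:
        raise ValueError("at least one spectrum is required")
    counts = Counter()
    for peaks in spectra:
        # A bin is counted once per spectrum, even if it holds several peaks.
        counts.update({int(mz // bin_width) for mz in peaks})
    return {b * bin_width: n / len(spectra) for b, n in counts.items()}


def msp_passes(spectra, min_spectra=20, min_peaks=70,
               min_frequency=0.25, bin_width=5.0):
    """Apply the spectrum-count and peak-frequency criteria to one isolate."""
    if len(spectra) < min_spectra:
        return False
    frequencies = peak_frequencies(spectra, bin_width)
    reproducible = [mz for mz, f in frequencies.items() if f > min_frequency]
    return len(reproducible) >= min_peaks


# Toy example: four spectra with a handful of peaks each (made-up values).
toy = [
    [4001.0, 5002.0, 6100.0],
    [4002.0, 5001.0],
    [4003.0, 5003.0],
    [4000.5, 7000.0],
]
print(peak_frequencies(toy))
print(msp_passes(toy, min_spectra=4, min_peaks=2))  # relaxed limits for the toy data
```

In the toy data, the peaks near m/z 4,000 and 5,000 recur across spectra and are retained, whereas the peaks at 6,100 and 7,000 each occur in exactly 25% of spectra and therefore fail the strict "greater than 25%" criterion.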

Taxonomic comparison of 16S rRNA gene sequences to MALDI-TOF MS protein data.

Similarities between mass spectra of all MSPs were compared by generating a composite correlation index matrix in MALDI Biotyper OC v. 3.1. Three mass spectra of each represented MSP were used. CCI parameter intervals were set to ten, with mass lower and upper bounds set to 3,000 and 12,000 Da, respectively. The generated CCI matrix with values ranging from 0 to 1 was exported into a .csv format, uploaded to Cluster 3.0, and clustered using hierarchical clustering on both x and y axes using an average linkage algorithm. Java Treeview was used to visualize generated heat maps (Fig. 4A and B) (25). The heat map comparing MSP profiles (x) versus 16S rRNA gene sequences (y) (Fig. 4B) was generated using the method detailed above, except with hierarchical clustering on only the x axis. The y axis follows the maximum likelihood tree, ordered phylogenetically and generated with the following parameters. The 16S rRNA gene sequences were aligned and trimmed using MUSCLE, and a maximum likelihood tree was constructed based on the Tamura-Nei model with 1,000 bootstrap replicates using MEGA-X (26). The maximum likelihood tree was exported as a Newick file and uploaded to Interactive Tree of Life (iTOL) for visualization (27). For both heat maps (Fig. 4A and B), distances between mass spectra are represented as a full color spectrum, with bright blue denoting highest similarity and black denoting lowest similarity. Full-page heatmaps with isolate names are presented in the supplemental material (Fig. S1 and S2).
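To illustrate what average-linkage hierarchical clustering does with a similarity matrix of this kind, the following Python sketch implements the procedure directly. It is not the Cluster 3.0 implementation: converting similarity to distance as 1 minus CCI is an assumption made for simplicity, and the function name and toy matrix are hypothetical.

```python
def average_linkage(similarity):
    """Agglomerative clustering with average linkage.

    similarity is a symmetric matrix with values from 0 to 1. Distance is
    taken as 1 - similarity (an assumption made for this sketch). Returns a
    list of merges as (members_a, members_b, mean_distance) tuples.
    """
    clusters = [(i,) for i in range(len(similarity))]
    merges = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Average linkage: mean distance over all cross-cluster pairs.
                pairs = [(i, j) for i in clusters[a] for j in clusters[b]]
                d = sum(1.0 - similarity[i][j] for i, j in pairs) / len(pairs)
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        merges.append((clusters[a], clusters[b], d))
        merged = clusters[a] + clusters[b]
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return merges


# Toy matrix: entries 0 and 1 resemble each other, as do entries 2 and 3.
sim = [
    [1.00, 0.80, 0.10, 0.20],
    [0.80, 1.00, 0.15, 0.10],
    [0.10, 0.15, 1.00, 0.70],
    [0.20, 0.10, 0.70, 1.00],
]
merges = average_linkage(sim)
leaf_order = merges[-1][0] + merges[-1][1]  # row/column order for a heat map
for left, right, distance in merges:
    print(left, right, round(distance, 4))
print(leaf_order)
```

The leaf order from the final merge is what determines the row and column arrangement of a clustered heat map. Clustering only one axis, as for the MSP versus 16S rRNA comparison, means applying this order to the x axis while taking the y axis order from the phylogenetic tree.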

DNA purification and 16S rRNA gene sequence analysis of unknown bacterial isolates for identification experiment.

Isolation followed the Haeckl et al. protocol outlined above, with minor modifications. Instead of using the microplate reader, DNA concentrations after DNA purification and PCR purification were measured using the dsDNA experiment on the NanoDrop One spectrophotometer (Thermo Fisher Scientific). Rehydration buffer and nuclease-free water were used as blanks. Sequencing of the 2019 isolates was performed by GeneWiz instead of UBC-NAPS. The isolates used in identification experiments, with accession numbers, BLAST homologs, and percent identities, can be found in Table S3.

Strain identification of unknown bacterial isolates using MALDI-Biotyper and MSP library.

The 49 isolates were taken from LB plates and underwent both extraction and extended direct transfer (eDT) methods, as recommended by Bruker Daltonics. The extraction method protocol was followed as previously outlined. For the eDT method, approximately one tip of a sterile toothpick, about 10⁵ to 10⁷ bacterial cells, was spread directly onto the MALDI target plate in a circular motion. Two subsequent spots, termed a heavy and light spot, were plated and then overlaid with 1 μl of 100% ethanol (Commercial Alcohols). Once dry, 1 μl of alpha-cyano-4-hydroxycinnamic acid (HCCA) matrix (Sigma-Aldrich) was added to each spot.
Real-time identification experiments on the MALDI Biotyper were run using our new MSP library and Bruker’s internal library. Within flexControl v. 3.4, the method MBT_FC.PAR was used together with an AutoXecute run using the method MBT_AutoX. Logarithmic scores were generated from each run (Table S4). According to the manufacturer’s user manual, if the logarithmic value of the final score was between 2.3 and 3, the isolate was securely classified to the species level; for values between 2 and 2.3, the isolate was classified securely to the genus level and less securely to the species level; for values between 1.7 and 2, the isolate was classified to the genus level; and for values lower than 1.7, the isolate could not be reliably identified. In this study, logarithmic scores above 2.0 were treated as species-level matches and scores above 1.7 as genus-level matches.
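The score interpretation described above can be summarized in a short Python sketch. This is an illustration of the thresholds only and not code from the Biotyper software. The function names are hypothetical, and treating the lower bound of each manual band as inclusive is an assumption, since the text does not state how boundary values are handled.

```python
def manual_category(score):
    """Interpretation bands as quoted from the manufacturer's user manual.

    Assumption: the lower bound of each band is inclusive.
    """
    if score >= 2.3:
        return "secure species"
    if score >= 2.0:
        return "secure genus, probable species"
    if score >= 1.7:
        return "genus"
    return "not reliable"


def study_match_level(score):
    """Operational rule stated for this study.

    Scores above 2.0 count as species-level matches and
    scores above 1.7 count as genus-level matches.
    """
    if score > 2.0:
        return "species"
    if score > 1.7:
        return "genus"
    return "none"


# Example scores (illustrative values, not taken from Table S4).
for s in (2.45, 2.10, 1.85, 1.50):
    print(s, manual_category(s), study_match_level(s))
```

Keeping the manual's bands and the study's operational rule as separate functions makes it explicit that a score between 2.0 and 2.3 counts as a species-level match in this study, although the manual describes the species assignment in that range as less secure.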

Data availability.

The 16S rRNA sequences have been deposited to GenBank with accession numbers MN582690 to MN582738. The MALDI-TOF MS reference library and associated data can be found at Dryad (


This work was funded by a Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant (to R.G.L.).
We thank A. Abraham (Bruker Daltonics) for assistance with software settings for Bruker MALDI-Biotyper MSP creation and H. Chen (SFU Mass Spectrometry facility) for assistance with MALDI Biotyper operation.

Supplemental Material

File (aem.00354-20-s0001.pdf)


Cragg GM, Newman DJ. 2013. Natural products: a continuing source of novel drug leads. Biochim Biophys Acta 1830:3670–3695.
Stierle AA, Stierle DB, Decato D, Priestley ND, Alverson JB, Hoody J, McGrath K, Klepacki D. 2017. The Berkeleylactones, antibiotic macrolides from fungal coculture. J Nat Prod 80:1150–1160.
Giddings L-A, Newman DJ. 2015. Bioactive compounds from extremophiles, p 1–47. In Bioactive compounds from extremophiles. SpringerBriefs in Microbiology. Springer, Cham, Switzerland.
Giddings L-A, Newman DJ. 2015. Bioactive compounds from terrestrial extremophiles, p 1–75. In Bioactive compounds from terrestrial extremophiles. SpringerBriefs in Microbiology. Springer, Cham, Switzerland.
Duncan KR, Crüsemann M, Lechner A, Sarkar A, Li J, Ziemert N, Wang M, Bandeira N, Moore BS, Dorrestein PC, Jensen PR. 2015. Molecular networking and pattern-based genome mining improves discovery of biosynthetic gene clusters and their products from Salinispora species. Chem Biol 22:460–471.
Cimermancic P, Medema MH, Claesen J, Kurita K, Wieland Brown LC, Mavrommatis K, Pati A, Godfrey PA, Koehrsen M, Clardy J, Birren BW, Takano E, Sali A, Linington RG, Fischbach MA. 2014. Insights into secondary metabolism from a global analysis of prokaryotic biosynthetic gene clusters. Cell 158:412–421.
Kunakom S, Eustáquio AS. 2019. Burkholderia as a source of natural products. J Nat Prod 82:2018–2037.
Franke J, Ishida K, Hertweck C. 2012. Genomics-driven discovery of burkholderic acid, a noncanonical, cryptic polyketide from human pathogenic Burkholderia species. Angew Chem Int Ed 51:11611–11615.
He H, Ratnayake AS, Janso JE, He M, Yang HY, Loganzo F, Shor B, O’Donnell CJ, Koehn FE. 2014. Cytotoxic spliceostatins from Burkholderia sp. and their semisynthetic analogues. J Nat Prod 77:1864–1870.
Hermenau R, Ishida K, Gama S, Hoffmann B, Pfeifer-Leeg M, Plass W, Mohr JF, Wichard T, Saluz H-P, Hertweck C. 2018. Gramibactin is a bacterial siderophore with a diazeniumdiolate ligand system. Nat Chem Biol 14:841–843.
Seyedsayamdost MR, Chandler JR, Blodgett JV, Lima PS, Duerkop BA, Oinuma K-I, Greenberg EP, Clardy J. 2010. Quorum-sensing-regulated bactobolin production by Burkholderia thailandensis E264. Org Lett 12:716–719.
Tawfik KA, Jeffs P, Bray B, Dubay G, Falkinham JO, Mesbah M, Youssef D, Khalifa S, Schmidt EW. 2010. Burkholdines 1097 and 1229, potent antifungal peptides from Burkholderia ambifaria 2.2N. Org Lett 12:664–666.
Wang C, Flemming CJ, Cheng Y-Q. 2012. Discovery and activity profiling of thailandepsins A through F, potent histone deacetylase inhibitors, from Burkholderia thailandensis E264. Med Chem Commun (Camb) 3:976–981.
Wang C, Henkes LM, Doughty LB, He M, Wang D, Meyer-Almes F-J, Cheng Y-Q. 2011. Thailandepsins: bacterial products with potent histone deacetylase inhibitory activities and broad-spectrum antiproliferative activities. J Nat Prod 74:2031–2038.
Haeckl FPJ, Baldim JL, Iskakova D, Kurita KL, Soares MG, Linington RG. 2019. A selective genome-guided method for environmental Burkholderia isolation. J Ind Microbiol Biotechnol 46:345–362.
Janda JM, Abbott SL. 2007. 16S rRNA gene sequencing for bacterial identification in the diagnostic laboratory: pluses, perils, and pitfalls. J Clin Microbiol 45:2761–2764.
Maier T, Klepel S, Renner U, Kostrzewa M. 2006. Fast and reliable MALDI-TOF MS-based microorganism identification. Nat Methods 3:i–ii.
Clark CM, Costa MS, Sanchez LM, Murphy BT. 2018. Coupling MALDI-TOF mass spectrometry protein and specialized metabolite analyses to rapidly discriminate bacterial function. Proc Natl Acad Sci U S A 115:4981–4986.
Uhlik O, Strejcek M, Junkova P, Sanda M, Hroudova M, Vlcek C, Mackova M, Macek T. 2011. Matrix-assisted laser desorption ionization (MALDI)-time of flight mass spectrometry- and MALDI biotyper-based identification of cultured biphenyl-metabolizing bacteria from contaminated horseradish rhizosphere soil. Appl Environ Microbiol 77:6858–6866.
Timperio AM, Gorrasi S, Zolla L, Fenice M. 2017. Evaluation of MALDI-TOF mass spectrometry and MALDI BioTyper in comparison to 16S rDNA sequencing for the identification of bacteria isolated from Arctic sea water. PLoS One 12:e0181860.
Seuylemezian A, Aronson HS, Tan J, Lin M, Schubert W, Vaishampayan P. 2018. Development of a custom MALDI-TOF MS database for species-level identification of bacterial isolates collected from spacecraft and associated surfaces. Front Microbiol 9:780.
Karolski B, Cardoso LOB, Gracioso LH, Nascimento CAO, Perpetuo EA. 2018. MALDI-Biotyper as a tool to identify polymer producer bacteria. J Microbiol Methods 153:127–132.
de Hoon MJL, Imoto S, Nolan J, Miyano S. 2004. Open source clustering software. Bioinformatics 20:1453–1454.
Huang X, Madan A. 1999. CAP3: a DNA sequence assembly program. Genome Res 9:868–877.
Saldanha AJ. 2004. Java Treeview—extensible visualization of microarray data. Bioinformatics 20:3246–3248.
Kumar S, Stecher G, Li M, Knyaz C, Tamura K. 2018. MEGA X: Molecular Evolutionary Genetics Analysis across computing platforms. Mol Biol Evol 35:1547–1549.
Letunic I, Bork P. 2019. Interactive Tree Of Life (iTOL) v4: recent updates and new developments. Nucleic Acids Res 47:W256–W259.


Published In

Applied and Environmental Microbiology
Volume 86, Number 11, 19 May 2020
eLocator: e00354-20
Editor: Robert M. Kelly, North Carolina State University
PubMed: 32245762


Received: 11 February 2020
Accepted: 13 March 2020
Published online: 19 May 2020




  1. isolate identification
  2. MALDI mass spectrometry
  3. Burkholderia
  4. natural products



Claire H. Fergusson
Department of Chemistry, Simon Fraser University, Burnaby, British Columbia, Canada
Julienne M. F. Coloma
Department of Chemistry, Simon Fraser University, Burnaby, British Columbia, Canada
Mercia C. Valentine
Department of Chemistry, Simon Fraser University, Burnaby, British Columbia, Canada
F. P. Jake Haeckl
Department of Chemistry, Simon Fraser University, Burnaby, British Columbia, Canada
Roger G. Linington
Department of Chemistry, Simon Fraser University, Burnaby, British Columbia, Canada




Address correspondence to Roger G. Linington, [email protected].
