Construction of a signature-tagged transposon library and mutagenesis of S. meliloti.
STM has been used successfully to study genes important for competitiveness and survival of a number of pathogenic bacteria in the host. Here, we applied a modified STM approach to a symbiotic bacterium and combined it with determination of transposon insertion sites in the mutants. This approach was based on transposons that were each modified by two different short 24-bp signature tags, similar to the method used by Karlyshev et al. (23). The utilization of short tags with similar G+C contents and melting temperatures made it possible to construct a large library of tagged transposons because these tags were amplified with similar efficiencies and therefore no preselection of tags was required (11). A high specificity of tag detection was achieved by bar coding each transposon with two different tags.
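The requirement that all tags share similar G+C contents and melting temperatures can be illustrated with a minimal sketch. It assumes, purely for illustration, a fixed G+C fraction of 50% and uses the crude Wallace rule (2 °C per A/T, 4 °C per G/C) as a stand-in for a real melting temperature calculation; the function names and the random design procedure are hypothetical and are not the procedure used to design the actual tag set.

```python
import random

def gc_fraction(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def wallace_tm(seq):
    """Very rough melting temperature estimate (Wallace rule), in degrees C."""
    seq = seq.upper()
    at = seq.count("A") + seq.count("T")
    gc = seq.count("G") + seq.count("C")
    return 2 * at + 4 * gc

def design_tags(n_tags, length=24, gc_target=0.5, seed=0):
    """Draw unique random tags that all have exactly the same G+C content.

    Assumption: fixing the G+C count is enough to keep amplification
    efficiencies similar; a real design would also screen for
    cross-hybridization and secondary structure.
    """
    rng = random.Random(seed)
    n_gc = round(length * gc_target)
    tags = set()
    while len(tags) < n_tags:
        bases = [rng.choice("GC") for _ in range(n_gc)]
        bases += [rng.choice("AT") for _ in range(length - n_gc)]
        rng.shuffle(bases)
        tags.add("".join(bases))
    return sorted(tags)

tags = design_tags(8)
```

Because every tag in this sketch has the same base composition, any composition-based melting temperature estimate is identical across the set, which is the property that removes the need for preselection.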
The mini-transposon mTn5-GNm (13, 31) used in this study contains the nptII resistance gene and a promoterless gusA reporter gene. mTn5-GNm was additionally modified by an artificial linker containing HindIII and KpnI restriction sites for cloning and priming sites for amplification of the signature tags. pG18-STM, which carries the linker-modified transposon and a transposase gene, was constructed as a carrier plasmid for the signature-tagged transposons (Fig. 1).
A total of 1,498 different tags were designed, and 824 of them were synthesized and used to generate a collection of 412 transposons. Each transposon in this set was individually marked by two unique sequence tags. These transposons were used for random mutagenesis of S. meliloti Rm2011 (Fig. 2a). The RP4 mobilizable region (40) of pG18-STM enabled conjugal transfer of plasmids from E. coli donor strain S17-1 into the S. meliloti recipient cells by biparental mating. Mutants were selected based on resistance to neomycin conferred by the nptII gene of the transposon. Twenty-four to 30 clones were picked from each conjugation, resulting in a library of 12,000 tagged mutants. The mutant clones were rearrayed into sets, each containing mutants that differed by their signature tags. A microarray carrying tag-specific probes was constructed and used for detection and quantification of mutants in pilot competition experiments (Fig. 2b).
Mapping and statistical analysis of the mutant library suggested that the insertion sites were random.
The transposon insertion sites of 5,089 mutants were determined by sequencing the junction using primer Qseq1, which bound 64 bp upstream of the linker with the cloned tags (Fig. 1b). Therefore, it was possible not only to determine the insertion sites of the transposons but also to check the tags in the mutants. Figure 3 summarizes the results of transposon mapping. Mapped transposon insertion sites of the mutants in the test set are shown in Table S3 in the supplemental material. The complete list of mapped mutations is available on the website of the public S. meliloti GenDB Genome Project (http://www.cebitec.uni-bielefeld.de/groups/nwt/sinogate).
We performed several statistical tests to check whether the distribution of transposon insertions was random and whether the mTn5-STM transposons had hot spots in the S. meliloti genome. An important indicator of the randomness of transposon insertions is the number of genes that carry a transposon insertion. A low number of genes hit by the transposon indicates a bias in the pattern of transposition. In order to determine the theoretical number of genes expected to be hit by at least one transposon, the neutral-base-pair model (22) was used. This model allows estimation of the number of gene hits based on the genome length, the number of transposon insertions, and the gene sizes. Applying this model to the library of 5,089 S. meliloti transposon mutants, we predicted that 2,890 genes (standard deviation, 378 genes) would be mutated. The actual number of genes that were hit by at least one transposon was 2,711 (43.68% of the 6,207 predicted protein-encoding genes of S. meliloti), which was consistent with the expectations based on the neutral-base-pair model.
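The expected number of genes hit can be sketched as follows. The sketch assumes the simplest reading of a neutral-base-pair model, namely that every insertion lands at a uniformly random base pair independently of the others; the exact formulation in reference 22 may differ, and the function name and toy gene lengths are hypothetical.

```python
def expected_genes_hit(gene_lengths, genome_length, n_insertions):
    """Expected number of genes carrying at least one insertion.

    Assumption: each insertion hits a gene of length L with probability
    L / genome_length, so the gene escapes all N insertions with
    probability (1 - L / genome_length) ** N.
    """
    return sum(
        1.0 - (1.0 - length / genome_length) ** n_insertions
        for length in gene_lengths
    )

# Toy example (made-up numbers): two genes in a 10-kb genome, 10 insertions.
toy_expected = expected_genes_hit([1000, 500], 10_000, 10)

# Figures reported in the text.
observed, predicted, sd, total_genes = 2711, 2890, 378, 6207
coverage_percent = 100 * observed / total_genes   # share of all genes hit
z_score = (observed - predicted) / sd             # deviation in SD units
```

With the reported figures, the observed count lies less than half a standard deviation below the prediction, which is why the result is described as consistent with the model.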
Furthermore, we performed a genome-wide analysis of all transposon insertion sites in relation to the G+C (A+T) content. Using a 100-bp window centered at the transposon insertion position, we calculated the mean G+C (A+T) content. The differences between the mean G+C and A+T contents within all of these windows and the mean G+C and A+T contents of the whole genome were 0.2% (G+C) and 0.1% (A+T). We also performed a χ² test to exclude the possibility that opposing deviations balanced each other out in the mean G+C (A+T) content. The test yielded very low χ² scores, 12.8 for the G+C distribution and 14.2 for the A+T distribution with 2,710 degrees of freedom, implying that the mTn5-STM transposons had no preference for insertion into G+C- or A+T-rich regions.
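The window-based comparison described above can be sketched as follows. The sketch treats the replicon as a linear string and truncates windows at the ends, which is a simplifying assumption because the S. meliloti replicons are circular; the function names and the toy sequence are hypothetical.

```python
def gc_content(seq):
    """Fraction of G and C bases in a sequence."""
    seq = seq.upper()
    return (seq.count("G") + seq.count("C")) / len(seq)

def window_gc(genome, position, window=100):
    """G+C content of a window centered at an insertion position.

    Assumption: linear coordinates; windows are clipped at the sequence
    ends instead of wrapping around a circular replicon.
    """
    half = window // 2
    start = max(0, position - half)
    end = min(len(genome), position + half)
    return gc_content(genome[start:end])

def mean_gc_difference(genome, positions, window=100):
    """Mean window G+C content minus the genome-wide G+C content."""
    mean_window = sum(window_gc(genome, p, window) for p in positions) / len(positions)
    return mean_window - gc_content(genome)

# Toy sequence: a G+C-only half followed by an A+T-only half.
toy_genome = "GC" * 100 + "AT" * 100
toy_difference = mean_gc_difference(toy_genome, [100, 300])
```

A difference close to zero, as in the 0.2% reported in the text, indicates that the insertion sites sample the base composition of the genome without an obvious bias.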
In order to test whether the transposon insertions were uniformly distributed, we performed a χ² test. Using a P value of 0.01 and 29 degrees of freedom per replicon, we found that a uniform distribution was highly improbable. This could be explained by the existence of essential genes that were not represented in the mutant library. Moreover, mutants that grew slowly under the conditions used for selection of the transconjugants in this study were underrepresented in the library. We therefore repeated the χ² test, assuming that the S. meliloti genome contains essential genes. Too little is known about the number and positions of essential genes in the S. meliloti genome to exclude defined groups of genes from the χ² test. To cope with this problem, we created sets of randomly chosen genes and performed the χ² test many times, leaving out one of the gene sets each time. Each set comprised between 1 and 100% of the genes on a given replicon that did not have transposon insertions.
Such a modified χ² test showed that the distribution of transposon insertions in the genome was likely to be random. The best χ² test result for pSymB was a likelihood of 90% for a random distribution of transposons in this replicon. This result was obtained when a set containing 6% of the genes with no transposon insertion was excluded from the test. For pSymA, we observed an 87% likelihood of randomness when 10% of the genes with no hit were left out. In contrast, when all genes that did not have a transposon insertion were excluded from the χ² test, the probability that the distribution was random was less than 11% for both megaplasmids.
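A uniformity test of this kind can be sketched as follows. The binning is an assumption: 29 degrees of freedom would correspond, for example, to 30 equal-sized bins per replicon, but the text does not state how the bins were defined. The optional `weights` argument is a hypothetical way to account for excluded genes by giving each bin an expected count proportional to the base pairs that remain targetable.

```python
# Upper 1% critical value of the chi-square distribution with 29 degrees of freedom.
CHI2_CRIT_DF29_P01 = 49.588

def bin_counts(positions, replicon_length, n_bins=30):
    """Count insertions in equal-sized bins along a replicon (assumed binning)."""
    counts = [0] * n_bins
    for pos in positions:
        index = min(pos * n_bins // replicon_length, n_bins - 1)
        counts[index] += 1
    return counts

def chi_square_uniform(counts, weights=None):
    """Chi-square statistic against a uniform (or weighted) expectation.

    weights: hypothetical per-bin targetable lengths, e.g. the bin size
    minus the base pairs of genes excluded as presumably essential.
    """
    total = sum(counts)
    if weights is None:
        weights = [1.0] * len(counts)
    weight_sum = sum(weights)
    statistic = 0.0
    for observed, weight in zip(counts, weights):
        expected = total * weight / weight_sum
        statistic += (observed - expected) ** 2 / expected
    return statistic

toy_counts = bin_counts([0, 99, 100, 2999], 3000)
toy_statistic = chi_square_uniform([20] + [10] * 29)
uniform_rejected = toy_statistic > CHI2_CRIT_DF29_P01
```

Repeating the test while leaving out different randomly chosen gene sets, as described above, amounts to recomputing the weights for each set and recording the best resulting fit.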
Although the modified χ² test worked well for the megaplasmids, it failed in the analysis of the transposon insertion distribution throughout the S. meliloti chromosome. We could not find a set of genes whose exclusion from the χ² test increased the likelihood of randomness to more than 15%. The reason for this might have been a high proportion of essential genes on the chromosome (9) and the large size of the replicon, which led to a large fraction of genes not hit by a transposon. Therefore, a random search to detect essential genes seems to be unsuitable if the number of transposon insertions is not saturating. Nevertheless, based on the data for pSymA and pSymB and the results from the analysis of transposon insertion sites in relation to the G+C (A+T) content, we assumed that there was genome-wide random distribution of transposon insertion sites.
Pilot competition experiments identified signature-tagged mutants with altered growth patterns under different conditions.
In order to validate our STM approach, pilot experiments were carried out using a set of 378 signature-tagged mutants. The test conditions used were growth in rich (TY) medium and minimal medium (VMM), as well as growth in high-osmolarity medium and in medium containing SDS as a detergent. In all cases, the input pool was used as the reference. Each experiment was repeated three times using independent cultures.
The method used for processing the tag microarray data differed from the standard methods used for microarray data analysis due to the use of two signature tags per mutant and because of the comparatively small number of spots on the tag microarray.
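One way in which two tags per mutant could enter the analysis is sketched below. This is an illustration under stated assumptions, not the processing pipeline used in the study: it assumes that each tag yields a log2 ratio of output signal to input signal, that the two tags of one mutant must agree within a hypothetical threshold, and that agreeing tags are averaged.

```python
import math

def log_ratio(output_signal, input_signal):
    """log2 ratio of a tag's signal in the output pool versus the input pool."""
    return math.log2(output_signal / input_signal)

def mutant_score(tag1, tag2, max_disagreement=1.0):
    """Combine the two tags of one mutant into a single score.

    tag1, tag2: (output_signal, input_signal) pairs.
    max_disagreement: hypothetical cutoff; mutants whose two tags differ
    by more than this many log2 units are discarded (None is returned).
    """
    m1 = log_ratio(*tag1)
    m2 = log_ratio(*tag2)
    if abs(m1 - m2) > max_disagreement:
        return None
    return (m1 + m2) / 2

consistent = mutant_score((200, 100), (400, 100))
inconsistent = mutant_score((100, 100), (800, 100))
```

Requiring agreement between two independent tags is what gives the double-tag design its specificity: a signal produced by cross-hybridization of a single tag is unlikely to be mirrored by the second tag.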
After filtering steps performed to exclude technical and nonsignificant variations, 29 mutants were found to have a changed phenotype in at least one of the conditions tested. Using a K-means clustering approach, these mutants were divided into eight clusters corresponding to the pattern of competitiveness (Table 2 and Fig. 4).
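The clustering step can be illustrated with a minimal K-means sketch in plain Python. The profiles are made-up log ratios for the four test conditions (TY, VMM, high osmolarity, SDS), the initialization from the first k profiles is a simplification chosen to keep the example deterministic, and the function name is hypothetical; the software and settings actually used are not stated in this section.

```python
def kmeans(points, k, n_iter=100):
    """Plain K-means with squared Euclidean distance.

    Assumption: centroids are initialized from the first k points, which
    keeps the sketch deterministic but is not a robust initialization.
    """
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for iteration in range(n_iter):
        # Assign every profile to its nearest centroid.
        new_labels = []
        for p in points:
            distances = [
                sum((a - b) ** 2 for a, b in zip(p, centroid))
                for centroid in centroids
            ]
            new_labels.append(distances.index(min(distances)))
        if iteration > 0 and new_labels == labels:
            break  # assignments are stable
        labels = new_labels
        # Move every centroid to the mean of its members.
        for c in range(k):
            members = [p for p, label in zip(points, labels) if label == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centroids

# Made-up competitiveness profiles: [TY, VMM, high osmolarity, SDS].
profiles = [
    [1.0, 1.1, 0.9, 1.0],    # competitive everywhere
    [-2.0, -0.1, 0.0, 0.1],  # impaired in TY only
    [0.9, 1.0, 1.1, 1.0],
    [-1.9, 0.0, 0.1, -0.1],
]
labels, centroids = kmeans(profiles, k=2)
```

Mutants with similar profiles across conditions end up in the same cluster, which is the basis for grouping the 29 mutants by their pattern of competitiveness.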
Cluster 1 contained clones that were highly competitive under most of the conditions tested. In particular, it included an flgI mutant, which was impaired for synthesis of flagella. The fast-growth phenotype of this mutant supports the observation that the synthesis of flagella is energetically disadvantageous (27). Two other mutants in this cluster were paaG (SMb21633) and aroE2 (SMb20037) mutants. paaG encodes a putative enoyl-coenzyme A hydratase/isomerase involved in phenylacetate catabolism, and aroE2 codes for a putative shikimate 5-dehydrogenase protein involved in chorismate metabolism. Both genes have paralogs in the S. meliloti genome.
Cluster 2 contained clones that were highly competitive in TY medium and VMM under nonstress conditions. This cluster consisted of three mutants bearing a transposon insertion in fixI2 (encoding an E1-E2-type cation ATPase), in SMb20476 (coding for a putative ABC transporter periplasmic dipeptide-binding protein), and in the intergenic region between SMb20518 (encoding a putative endochitinase) and SMb20519 (encoding a conserved hypothetical protein), probably influencing transcription of SMb20519.
The growth in VMM of mutants that belonged to cluster 3 was strongly impaired. Characteristically, all mutants in this cluster had a transposon insertion in genes involved in the synthesis of amino acids or cofactors not present in VMM, including isoleucine/valine (ilvC), phenylalanine (pheA), ubiquinone/menaquinone (SMc01842), cysteine (cysG), and proline (proB1). It was previously shown that ilvC mutants of S. meliloti are isoleucine/valine auxotrophs (2) and that cysG mutants of Rhizobium etli are cysteine auxotrophs (41).
Cluster 4 contained six mutants that exhibited high competitiveness in stress conditions but not in nonstress conditions. Two of these mutants had a transposon insertion in intergenic regions of pSymB, one preceding SMb20088 (encoding a conserved hypothetical protein) and one upstream of SMb21337 (coding for a putative iron-sulfur-binding protein, probably a subunit of an oxidoreductase-like aldehyde oxidase or xanthine dehydrogenase). This cluster also contained a panB (SMc01881) mutant. In Salmonella enterica, a panB mutation causes auxotrophy for pantothenate (33). We suggest that in S. meliloti the function of PanB can also be performed by another protein, probably by the product of SMb20821, which at the amino acid level exhibits 31% identity with the SMc01881 product and contains a conserved PanB domain. Three other clones in cluster 4 had mutations in xylB (coding for a putative xylulose kinase protein that participates in degradation of D-xylose), SMb20360 (encoding a putative protease subunit of an ATP-dependent Clp protease), and SMb20931 (coding for a putative sugar uptake ABC transporter periplasmic solute-binding protein precursor).
Cluster 5 contained two clones, cysK2 and SMc03782 mutants, whose growth was impaired under all conditions tested and was more strongly impaired in VMM. cysK2 encodes a probable cysteine synthase A (O-acetylserine sulfhydrylase A), whereas the gene product of SMc03782 has similarities to membrane-bound metallopeptidases involved in cell division and chromosome partitioning.
The competitiveness of mutants in clusters 6 and 7 was impaired more strongly under normal conditions than under stress conditions. Such a pattern probably occurred due to the fast growth of nonstressed cultures in the exponential phase compared to the growth of SDS- and salt-stressed cultures. Mutants that grew and divided more slowly than other mutants may have been less competitive in the fast-growing cultures than in stressed slowly growing cultures, if the slow-growth phenotypes were not caused by the stress conditions themselves. Cluster 6 contained mutants with mutations in the cmk gene encoding a putative cytidylate kinase and in SMb20377 encoding a putative translation initiation inhibitor protein. Cluster 7 contained two clones with mutations in transporter genes (chrA and SMa0070), a sodC (coding for a superoxide dismutase) mutant, and an SMa0091 (encoding a hypothetical protein) mutant.
Cluster 8 contained mutants whose growth was impaired in TY medium and was partially impaired under other conditions. This cluster included an lpsB (encoding a lipopolysaccharide core biosynthesis mannosyltransferase) mutant whose competitiveness was weakened in a fast-growing TY medium culture and in a TY medium-SDS culture. Since it was previously shown (12) that lpsB mutants are sensitive to sodium deoxycholate, we expected this mutant to be attenuated in SDS-containing medium as well. The second mutant in the cluster had a transposon insertion in the ppiA gene, which encodes a peptidyl-prolyl isomerase. This enzyme has a chaperone-like activity and facilitates the cis-trans isomerization of peptide bonds N terminal to proline residues within polypeptide chains (37). Interestingly, another mutant in this cluster had a transposon insertion in the tig gene, which also encodes a peptidyl-prolyl isomerase. Trigger factor encoded by tig is a ribosome-bound protein that combines two functions, peptidyl-prolyl isomerization and chaperone-like activities (17), similar to the ppiA gene product. Cluster 8 also contained a pstC mutant, whose slow-growth phenotype was especially noticeable in the fast-growing TY medium culture and was less obvious in VMM and in the stressed cultures. In E. coli, pstC encodes a permease protein of a high-affinity Pi-specific ABC transporter (43). A comparatively high concentration of inorganic phosphate in VMM might have been the reason for the faster growth of the pstC mutant in this medium than in TY medium.
Three mutants that showed altered growth behavior during cultivation of the mutant pool in VMM compared to the growth behavior in TY medium were analyzed individually in competition with the wild type. In these competition experiments the proB mutant (cluster 3) was analyzed in VMM, whereas the chrA mutant (cluster 7) and the tig mutant (cluster 8) were tested in TY medium. In accordance with the competition experiment analyzing the mutant pool by quantification of the signature tags in microarray hybridizations, the three individually tested mutants showed reduced competitiveness compared to the wild type.
Conclusions.
In this study, we used a modified signature-tagged mutagenesis strategy, which was applied for the first time to a nitrogen-fixing symbiotic bacterium. A novel set of tags that does not require preselection of the tags was designed, and a library of 412 different double-tagged transposons was created using this set of tags. A number of previous studies demonstrated a broad host range for transposition of the mTn5 transposon (for a review, see reference 34), including S. enterica serovar Typhimurium, E. coli, Klebsiella pneumoniae, Vibrio cholerae, Proteus mirabilis, Brucella melitensis, Yersinia pestis, and Citrobacter rodentium. This broad application spectrum in combination with the large number of signature tags and the tag-specific microarray makes the mTn5-STM transposon set a powerful and easy-to-use tool that can be applied to a broad spectrum of bacteria.
An extensive library of transposon mutants containing more than 12,000 clones was created by using the set of tagged transposons. The transposon insertion sites were determined for 42% of the mutants in this library. As a result, 44% coverage of all predicted protein-encoding genes by mapped transposon insertions was achieved. Analysis of the transposon library suggested that the insertion sites of the mTn5-STM transposons were random and that there were no hot spots.
Pilot experiments performed to verify the novel signature-tagged transposon set in combination with a microarray hybridization approach designed to identify and quantify individual mutants in the pool proved the reliability of this system for identification of attenuated mutants. The statistical processing of the tag microarray data comprising normalization and clustering allowed identification of clusters of mutants that had similar growth patterns under different growth conditions. We found that clones carrying similar kinds of mutation were grouped into the same cluster.
In future experiments, sets of mutants can be generated using up to 412 mutants carrying different unique tags. These sets should allow testing of the phenotypes of the mutants in diverse conditions. Of special interest is utilization of the signature-tagged S. meliloti mutants to identify genes important for survival and competitiveness in symbiosis with the host plants.