Open access
Research Article
19 January 2023

Simple and Rapid Site-Specific Integration of Multiple Heterologous DNAs into the Escherichia coli Chromosome


Escherichia coli is the most studied and well understood microorganism, but research in this system can still be limited by available genetic tools, including the ability to rapidly integrate multiple DNA constructs efficiently into the chromosome. Site-specific, large serine-recombinases can be useful tools, catalyzing a single, unidirectional recombination event between 2 specific DNA sequences, attB and attP, without requiring host proteins for functionality. Using these recombinases, we have developed a system to integrate up to 12 genetic constructs sequentially and stably into in the E. coli chromosome. A cassette of attB sites was inserted into the chromosome and the corresponding recombinases were cloned onto temperature sensitive plasmids to mediate recombination between a non-replicating, attP-containing “cargo” plasmid and the corresponding attB site on the chromosome. The efficiency of DNA insertion into the E. coli chromosome was approximately 107 CFU/μg DNA for six of the recombinases when the competent cells already contained the recombinase-expressing plasmid and approximately 105 CFU/μg DNA or higher when the recombinase-expressing plasmid and “cargo” plasmid were co-transformed. The “cargo” plasmid contains ΦC31 recombination sites flanking the antibiotic gene, allowing for resistance markers to be removed and reused following transient expression of the ΦC31 recombinase. As an example of the utility of this system, eight DNA methyltransferases from Clostridium clariflavum 4-2a were inserted into the E. coli chromosome to methylate plasmid DNA for evasion of the C. clariflavum restriction systems, enabling the first demonstration of transformation of this cellulose-degrading species.
IMPORTANCE More rapid genetic tools can help accelerate strain engineering, even in advanced hosts like Escherichia coli. Here, we adapt a suite of site-specific recombinases to enable simple, rapid, and highly efficient site-specific integration of heterologous DNA into the chromosome. This utility of this system was demonstrated by sequential insertion of eight DNA methyltransferases into the E. coli chromosome, allowing plasmid DNA to be protected from restriction in Clostridium clariflavum and enabling genetic transformation of this organism. This integration system should also be highly portable into non-model organisms.


Escherichia coli is one of the most important model organisms for bioengineering, and it is arguably the most well understood and studied (1, 2). A large genetic toolbox has been developed for E. coli, including methods for insertion of heterologous DNA into the chromosome. These tools include homologous recombination, λ Red recombinase-based recombineering and its newer CRISPR/Cas variants (35), and site-specific recombinases for heterologous DNA integration into the chromosome (6, 7). Despite the genetic tools already available, research in E. coli can still be hindered by the speed and simplicity of genome engineering tools available. Homologous recombination tools to insert plasmid DNA into the chromosome typically take several steps and selections. Due to the labor and time involved, this technique is not often used in E. coli. The λ Red recombinase system and CRISPR/Cas systems greatly improve upon homologous recombination in E. coli and allows for the creation of markerless mutants. However, integration efficiency is limited for these systems for DNA fragments larger than 3 kb (3).
Site-specific recombinase systems such as the Conditional-Replication, Integration, and Modular plasmid (CRIM) system represent an efficient method for inserting DNA into the chromosome. It uses site-specific DNA recombinases, in this case from native E. coli bacteriophages such as λ, to integrate plasmids into specific sites in the chromosome (8). However, this system does not provide a way to remove the plasmid backbone and the selectable marker, and it is limited to 5 native phage insertion sites.
Site-specific recombinases catalyze a single recombination event between 2 specific DNA sequences, which enables DNA integration and excision events to occur (9). They are classified into 2 families based on the catalytic amino acid in the active site, tyrosine- or serine-recombinases. The tyrosine family includes the commonly used Cre and Flp recombinases and the bacteriophage λ integrase (λ Int). Cre and Flp recognize identical recombination sites (loxP and frt, respectively), and therefore recombination is reversible (10). λ Int catalyzes a single-crossover event at 2 non-identical sites, attachment site from the bacterium (attB) and attachment site from the phage (attP), to create 2 new sites, left and right (attL and attR), and so recombination is irreversible and stable (11). The att sites from this type of recombinase are about 350 to 450 bp in size with a common 15 bp core where recombination occurs (11, 12). λ Int also requires the accessory protein integration host factor (IHF) for functionality, and therefore only functions in E. coli and very close relatives (12).
The second family of recombinases includes serine recombinases, such as the archetypical large serine recombinase ΦC31 integrase (ΦC31 Int) (7, 9, 11). These serine recombinases offer benefits over recombinases like Cre, Flp, and λ Int. They function similarly to the λ Int by enabling unidirectional recombination at attB and attP sites to create attL and attR sites. However, the att sites from these recombinases are typically shorter, ranging in size from 35 to 100 bp (11). The unidirectionality enables a stable integration event that will not spontaneously excise from the genome. Furthermore, unlike λ Int, this group does not require any host-specific factors. Therefore, large serine recombinases from phylogenetically diverse sources function in a wide variety of heterologous hosts, and they have been developed as genetic tools in diverse members of the Bacteria, Archaea, and Eukaryota (6, 7, 1315). They have also been used in E. coli as synthetic biology “parts” (16, 17). Beyond the ΦC31 Int, there are several well characterized serine recombinases that have been shown to function outside of their native host (7). These characteristics allow large serine recombinases to be especially useful in genome engineering.
Some applications require more than 5 successive insertions of heterologous DNA into the E. coli chromosome, exceeding the capacity of the CRIM system. One such application is the heterologous expression of DNA methyltransferases to enable the transformation of new bacteria. Restriction-modification (RM) systems have been demonstrated to be an important barrier to bacterial transformation because they recognize and degrade foreign DNA (1823). Most RM systems are comprised of a restriction enzyme that targets and hydrolyzes non-methylated DNA at a specific DNA motif and a corresponding methyltransferase that protects the host genome by methylating DNA at the same motif (24). Bacteria often encode more than one RM system, and to successfully transform an organism, these RM systems need to be evaded through proper methylation of DNA prior to transformation (22). An example organism is Clostridium clariflavum strain 4-2a (also called Acetivibrio clariflavus), an organism capable of degrading lignocellulosic biomass via the production of an enzyme complex called the cellulosome and catabolizing both glucose and xylose at thermophilic temperatures (25). This strain has not yet been transformed, which is likely due to the presence of 8 predicted RM systems encoded in the genome. Thus, evasion of the C. clariflavum RM systems likely requires the expression of more than 5 DNA methyltransferases in E. coli.
We sought to expand upon the CRIM system by leveraging large serine recombinases to develop a system for efficiently inserting heterologous DNA into the E. coli chromosome. By designing the system to allow for simple removal of the selectable marker and vector backbone, our system enables successive integration of up to 12 constructs without the accumulation of repetitive DNA sequences and selectable markers. As a demonstration of this system in E. coli, we expressed 8 methyltransferases encoded in the C. clariflavum 4-2a genome to methylate DNA and enable transformation of C. clariflavum for the first time.


Design of the integration system.

A heterologous DNA integration system using serine recombinases was created by first integrating a cluster of attB sites into the E. coli chromosome at the HK022 phage site using the CRIM system (Fig. 1). The resulting strain contains 14 orthogonal attB sites stably inserted into the chromosome: Φ370, ΦBT1, R4, BxB1, TP901-1, RV, Spβc, TG1, ΦC1, MR11, ΦK38, A118, Wβ, and BL3.
FIG 1 Overview of the site-specific recombinase system for DNA integration into the E. coli chromosome. (A) 14 attB sites were added to chromosome at the HK022 phage site (AG2005, AG3525, and AG4277) and the 14 corresponding serine recombinases were cloned onto temperature sensitive plasmids to mediate integration between a plasmid containing an attP site (e.g., pLAR067) and the corresponding attB site on the chromosome. In this example, the attP plasmid encodes lacZɑ, and the lacZɑ-attP cassette is flanked by NdeI and BamHI unique restriction sites to facilitate future cloning. (B) The kanamycin resistance marker and plasmid origin of replication can then be removed by introducing the ΦC31 recombinase on a temperature sensitive plasmid (pLAR047). (C) Chromosomal structure of the final strain after removal of the kanamycin resistance marker.
A non-replicating integration vector, which carries the “cargo” DNA to be inserted, contains the kanamycin resistance marker, the conditional oriR6K origin of replication that only replicates in an E. coli strain that expresses the pir gene (26), and 1 of the corresponding attP sites. The oriR6K-kanR cassette is flanked by ΦC31 attB and attP sites, which facilitates marker removal (see below). Outside of the ΦC31 sites is an arabinose-inducible promoter driving lacZα in plasmid pLAR067. This serves as a convenient site for cloning of the intended cargo, with NdeI and BamHI sites flanking lacZα.
Each of the corresponding large serine recombinases were cloned into a temperature sensitive helper plasmid, pInt-Ts (8), replacing the λ integrase. These genes are under the control of the λ pR promoter and cI857, a heat-inducible promoter system. This enables transient expression of the recombinase to mediate integration and, after raising the temperature, subsequent loss of the recombinase plasmid. The recombinase helper plasmid mediates DNA recombination between an attP site from the non-replicating “cargo” plasmid and its associated attB on the E. coli chromosome, resulting in the integration of the entire non-replicating plasmid into the chromosome. Recombination creates 2 new attachment sites called attL and attR that flank the newly integrated DNA (Fig. 1B). Next, the oriR6K-kanR cassette can be removed by transient expression of the ΦC31 integrase. This results in recombination between the ΦC31 attB and attP sites, leaving behind a single ΦC31 attR site and any associated genetic cargo (Fig. 1B and C). To consistently test each recombinase, a single integration plasmid was constructed that contains each of the corresponding attP sites (poly-attP cassette; pLAR31).

Testing integration efficiency of the recombinase helper vectors.

To test integration efficiency of the recombinases, the temperature sensitive helper plasmids encoding recombinases were each transformed into E. coli strain Top10 Δdcm::frt, which contains the poly-attB cassette on the chromosome. Next, a non-replicating plasmid containing the poly-attP cassette (pLAR031) was transformed into each strain expressing a recombinase (Fig. 2). Integration was observed with 12 of the recombinases, of which 6 (A118, TG1, BL3, Wβ, TP901, and Spβ) had an efficiency of approximately 107 CFU/μg of DNA. Another 4 showed recombination frequencies in the range of 105 to 106 CFU/μg of DNA. ΦC1 enabled the lowest recombination frequency of an average of 200 CFU/μg of DNA, while 2 recombinases, ΦBT and RV, did not enable any integration.
FIG 2 Integration efficiencies of each recombinase in E. coli TOP10 Δdcm::frt via CFU per μg (CFU/μg) of the integrating attP vector. Integration efficiency for both the two-step and co-transformation constructs is shown. The data represent individual data points, the mean, and the standard deviation of 4 biological replicates for each condition.

Integration confirmation.

Serine recombinases have been shown to integrate DNA into naturally occurring secondary attB sites, called pseudo-attB sites, that have similarity to the native sequences (27). To determine if the non-replicating plasmid integrated into the intended locus, rather than a pseudo-att site, colonies from the resulting transformations were screened using a four-primer multiplexed PCR screen, with primers flanking the 2 att sites (P1, P2, P3, and P4 indicated in Fig. 1A), similar to the screen used for the CRIM system (8). The 4 primer system gives distinct band sizes to distinguish between the attB (P1 and P4), attP (P2 and P3), attL (P1 and P3), and attR (P2 and P4) sites to evaluate if integration occurred at the intended locus. Upon correct insertion, bands consistent with the attL and attR will be observed. However, if the poly-attP cassette plasmid integrated into a pseudo-attB site, a single band, indicative of the parent strain, will form. Ten colonies from each transformation were screened and all colonies showed the correct integration bands for all 10 colonies except MR11 and ΦC1, where 7 of 10 and 4 of 10 integrated at the correct locus, respectively (Fig. S1).

Streamlined integration system via co-transformation.

To determine if the process could be further streamlined to accelerate strain construction, we explored whether co-transformation of the recombinase helper plasmid and the integrating plasmid could allow for DNA insertion into the chromosome. In this scenario, the recombinase is transiently expressed, but without selection for the helper plasmid. If successful, this would eliminate one round of competent cell preparation, transformation, and plasmid curing. When using the temperature sensitive helper plasmids described above in combination with integrating poly-attP cassette plasmid at a non-permissive temperature, no kanamycin-resistant colonies were observed, indicating no recombination. Expression of the recombinases could have been too low to enable DNA integration; therefore, the temperature sensitive recombinase helper plasmids were reconstructed to use the Clostridium thermocellum enolase promoter (PCt_eno), which has been shown to be highly active in E. coli (28) and is presumably constitutively expressed. Of the 12 recombinases shown to be functional in Fig. 2, the promoter was successfully replaced in all except Spβc and Wβ, where the cloning was not successful.
To test the efficiency of these new recombinase expression plasmids, each were co-transformed with the poly-attP cassette vector into the E. coli Top10 Δdcm::frt poly-attB cassette strain and recovered at a temperature too high to allow replication of the helper plasmid. Each of the 10 enabled integration in the 1 step integration process (Fig. 2). Five of the recombinases had an insertional efficiency of 105 CFU/μg of DNA or higher: BxB1, TG1, BL3, Φ370, and R4. The ΦC1 and TP901 recombinases also enabled DNA integration in a single step but had low efficiencies of less than 100 CFU/μg of DNA.

Backbone removal of oriR6K-kanR.

Some applications will require the removal of the antibiotic resistance gene, for instance to enable further genetic modification. Therefore, we tested the ΦC31 recombinase-based marker removal system described above using a strain in which the attP cassette plasmid integrated into the R4 attB locus (Fig. 1B and C). After integration into the R4 attB site was confirmed, a temperature-sensitive plasmid encoding ΦC31 Int was transformed into the strain at 30°C. After further incubation at a 37°C to the cure the ΦC31 helper plasmid, 12 colonies were screened for successful recombination between the ΦC31 att sites and loss of the helper plasmid by patching single colonies on each antibiotic used in the process (kanamycin and carbenicillin). Ten colonies (83%) did not grow on either antibiotic and were PCR screened for removal of the backbone using primers P5 and P6 (Fig. 1C) that flank the ΦC31 sites. Of the colonies that were sensitive to both antibiotics, 100% were confirmed to have lost the oriR6K-kanR cassette by PCR (Fig. S2).

Identification and expression of methyltransferases for C. clariflavum 4-2a.

A major barrier to the genetic transformation of non-model microorganisms is the presence of RM systems that degrade foreign DNA. One approach to evading these RM systems is to express the target organism's DNA methyltransferases in E. coli. Therefore, as a demonstration of the utility of the above integration system, we targeted heterologous expression of C. clariflavum DNA methyltransferases to methylate plasmids prior to transformation.
Putative methyltransferases in C. clariflavum 4-2a were first identified using the NEB Restriction Enzyme Database (REBASE) (29) and genome annotations (GenBank Accession number CP003065.1 [30]). Nine DNA methyltransferase gene clusters were identified: CloclaDRAFT_2368 - 2369, CloclaDRAFT _1830 - 1831, CloclaDRAFT _1994, CloclaDRAFT 1996, CloclaDRAFT _1058, CloclaDRAFT 1051, CloclaDRAFT _1868 - 1869, CloclaDRAFT _2133, and CloclaDRAFT _1961. To determine which methyltransferases are functional, methylome analysis was performed using Pacific Biosciences (PacBio) single molecule real time (SMRT) analysis to identify motifs containing 6-methyladenine (m6A) and 4-methylcytosine (m4C) and whole genome bisulfite sequencing (WGBS) to identify 5-methylcytosine (m5C). Methylome analysis revealed 8 m6A motifs and no m4C or m5C motifs (Table 1).
TABLE 1 Methylome analysis of C. clariflavum 4-2a
MotifaTypeb% Modc
Methylated motifs are indicated with the methylated base bolded. If T is bolded and underlined, the methylation is on the A of the complimentary strand.
The type of methylation is indicated, all of which are 6-methyladenine (m6A).
% Mod indicates the percentage of motifs detected as modified vs total motifs identified in the genome.
To engineer E. coli to mimic the methylome of C. clariflavum to evade restriction, 8 of the identified methyltransferase were expressed from an arabinose-inducible promoter and inserted sequentially into the E. coli chromosome into different attB sites. CloclaDRAFT 1996 was predicted to target G(m6A)TC and was not included because E. coli natively methylates this sequence. The methyltransferases were cloned into the non-replicating integration vector with a single attP site. The methyltransferases were sequentially integrated into an E. coli strain that natively encodes Dam methyltransferase to methylate G(m6A)TC but lacks dcm because C(m5C)WGG is not methylated by C. clariflavum, resulting in strain AG5645 (Table S1). The plasmid backbone of each integrated vector was removed as described above using ΦC31 Int, allowing repeated use of the kanamycin resistance gene. Methylome analysis was performed to examine the functionality of each methyltransferase in E. coli (Table 2). Six motifs were identified; of these, 4 showed complete methylation, and 2 showed partial methylation. No methylation was detected for ATGCAT and CAGAAG, suggesting that some methyltransferases were not functioning in E. coli.
TABLE 2 Methylome analysis of E. coli AG5645 strain to mimic methylation in C. clariflavum 4-2a
MotifaTypeb% Modc
Methylated motifs are indicated with the methylated base bolded. If T is bolded and underlined, the methylation is on the A of the complimentary strand.
The type of methylation is indicated, all of which are 6-methyladenine (m6A).
%Mod indicates the percentage of motifs detected as modified vs total motifs identified in the genome.

Transformation of C. clariflavum 4-2a.

Even though methylation in E. coli was incomplete, we attempted to transform C. clariflavum with plasmid DNA methylated by E. coli strain AG5645. To determine the impact of DNA methylation on transformation of C. clariflavum, plasmid pMTL83151 was isolated out of the arabinose-induced E. coli methylation strain AG5477 and used to transform C. clariflavum. As a control, unmethylated plasmid was isolated out of Top10 Δdcm::frt and used to transform the strain. Unmethylated plasmid never yielded colonies, while the methylated plasmid yielded 9 ± 3 CFU per microgram of DNA (CFU/μg).


The DNA integration system developed here expands the E. coli genetic toolbox to provide a quick and simple method to integrate up to 12 heterologous DNA constructs stably and sequentially into the chromosome of E. coli. In combination with the CRIM system, this enables insertion of up to 16 plasmids. Integration by our system can occur via a single co-transformation of 2 plasmids, including 1 recombinase-expression plasmid that is common for all insertions into that attB site and 1 that can be customized to carry the desired cargo. Site-specific recombination inserts DNA into the chromosome more quickly than classic techniques like homologous recombination. Furthermore, an additional site-specific recombination step can be utilized to remove vector backbones and resistance markers, which avoids spontaneous resistant mutants than can hinder sacB and other counter-selection systems. Using this system enables stable chromosomal integration in 1 day and unmarked insertions in 3 days, while tools such as homologous recombination often requires several steps and at least 5 days.
Another characteristic of the system is the ability of the recombinases to recombine DNA efficiently and reliably. They function without a significant decrease in efficiency as the size of the insert increases; in our hands, cargo sizes upwards of 7 kb have been integrated, and no decrease in recombination efficiency was seen (unpublished data). Size-based limitations on DNA insertion should be primarily dependent on the diminishing efficiency of DNA entry into the cell as the DNA size increases. This overcomes some of the size-associated challenges with other genetic tools, like use of the lambda red recombinase system (3) for heterologous DNA insertion. The ability to rapidly integrate constructs, at single copy, will help to increase the speed of engineering in E. coli, especially for high throughput, single copy library evaluation.
The DNA integration system described here contains other useful features. Like the CRIM system (8), it uses the conditional oriR6K origin of replication, which is only able to replicate in E. coli strains engineered to express the pir gene. This enables the integration vectors to act as a suicide vector in most E. coli strains, including the one containing the poly-attB cassette. Next, the vector backbone can easily be removed after DNA integration because the oriR6K origin and the antibiotic resistance marker are flanked by ΦC31att sites. Removal of the backbone allows for reuse of the antibiotic resistance marker and reduces repetitive DNA, diminishing the likelihood of homologous recombination events when stacking multiple genes into the poly-attB cassette. Another feature of this system is the unidirectional nature of the serine recombinases. Integration creates new attachment sites, attL and attR, and these new att sites are not recognized by the recombinases. This is especially beneficial when integrating multiple constructs in close proximity on the genome. Unlike the Flp-frt and Cre-lox systems, where the resulting scars are still substrates for the recombinase, ΦC31 does not act on the remaining ΦC31 attR sites after the marker is removed. Because these resulting attR scars cannot recombine with each other, there is no genome instability if 2 scars are near each other. Of note for DNA methyltransferase expression in particular, the dam- dcm- strain is necessarily recA+ because recA and dam are synthetically lethal (31), so the chance of homologous recombination between repetitive regions, like the PBAD promoter, should be considered in this strain. Other strains in this study (Top10/DH10B and its derivatives) are recA-, which should reduce the probability of recombination. While most researchers can work in recA- genetic backgrounds, in the future, the poly- attB cassette can be divided into multiple, smaller clusters and distributed to distant sites in the chromosome to further reduce the possibility of recombination between identical parts of inserted plasmids.
Initially, the first set of recombinase helper plasmids was constructed that allowed for integration in 2 steps. Using this approach, 12 of the recombinases were functional, but 2 were not. For unknown reasons, the BT and RV recombinases did not function as expected; this could be further explored in the future to increase the number of available sites for DNA insertion. To decrease the number of steps in this process, the ability to transform both plasmids in 1 event was desirable. The second set of helper plasmids enabled 1-step integration, though 2 of the recombinases were not successfully cloned. We speculate that high expression of the Wβ and Spβ recombinases was toxic for E. coli. However, the availability of the first set of plasmids still allows the use of the Wβ and Spβ recombinases via the 2-step process. Of the recombinases that are functional in the one-step process, R4, BxB1, and TG1 are highly efficient and reliable, so these 3 insertion sites should be targeted first when using this integration system. The ΦC1 and MR11 recombinases are least reliable, commonly integrating into pseudo-att sites and with lower efficiencies; therefore, these should be the last 2 sites to be considered. There is often a decrease in transformation efficiency when co-transforming with the strong promoter helper plasmids, and A118 and TP901-1 had especially large drops in efficiency for unknown reasons. However, the speed of the 1-step process makes them more useful when speed is desired. The 2-step approach, on the other hand will be most useful when high integration efficiency is needed, such as for the creation of large libraries.
Technologies to transform diverse non-model microorganisms are desperately needed, and mimicking the methylation patterns of the target organism to avoid restriction is an important approach to enable transformation. To this end, and as a demonstration of the utility of this tool, we engineered E. coli to express C. clariflavum methyltransferases to protect plasmid DNA and enable the genetic transformation of C. clariflavum. Transformation efficiency in this strain is still low, and further optimization will likely increase the efficiency. Two of the methylated motifs found in C. clariflavum were not found in the E. coli methylation strain, and 2 other motifs were methylated less than 100% in the E. coli genome, which likely impacts the transformation efficiency. Because C. clariflavum is a thermophilic organism, the activity of the methyltransferases at E. coli growth temperatures could decrease functionality. Future utilization of mesophilic methyltransferases that target the same motifs or engineered versions of the thermophilic enzymes to increase activity at mesophilic temperatures could help to overcome this challenge. Even with the incomplete methylation, low transformation efficiency was enabled, which allows for further genetic manipulation of C. clariflavum for fundamental studies and metabolic engineering related to lignocellulose deconstruction and bioconversion.
Finally, the tools developed here are directly applicable to other organisms, including non-model microbes. Because these recombinases do not require any host factors to function, this system should be adaptable to any strain in which the poly-attB “landing pad” can be inserted into the chromosome. Recently, this has been developed for various Gamma- and Alpha- proteobacteria in a toolset called Serine recombinase-Assisted Genome Engineering (SAGE) (32). Together, these toolsets will help accelerate strain engineering across diverse organisms.


Strains and culture conditions.

E. coli Top 10 Δdcm::frt was used for maintenance of all replicating vectors and integration of the attB cassette (21). PIR-dependent plasmids, containing the oriR6K origin of replication, were propagated in E. coli GT115 (Invivogen). All E. coli strains were grown in LB broth (Miller) with antibiotics, as necessary, for maintenance of vectors and for selection of integrants. Carbenicillin was used at 50 μg/mL, kanamycin was used at 50 μg/mL for replicating vectors and 30 μg/mL for single copy integrants, and spectinomycin and streptomycin were used together at 100 μg/mL each for replicating vectors and 50 μg/mL each for single copy integrants. All strains were grown at 37°C, unless the strain contains a temperature sensitive recombinase helper plasmid, in which case growth was at 30°C. C. clariflavum 4-2a was grown in a Coy anaerobic chamber (Coy Laboratory Products) in CTFUD medium (21) at 50°C. CTFUD medium is comprised of 3 g/L sodium citrate tribasic dehydrate, 1.3 g/L ammonium sulfate, 1.43 g/L potassium phosphate monobasic, 1.8 g/L potassium phosphate dibasic trihydrate, 0.5 g/L L-Cysteine hydrochloride, 10.5 g/L MOPS sodium salt, 6 g/L glycerol-2-phosphate disodium, 5 g/L cellobioase, 4.5 g/L yeast extract, 0.13 g/L calcium chloride dehydrate, 2.6 g/L magnesium chloride hexahydrate, 0.1 mg/L ferrous sulfate heptahydrate, and 0.5 mL/L 0.2% (wt/vol) resazurin. The CTFUD pH was adjusted to 7.0 with 45% (wt/vol) potassium hydroxide and was supplemented with 5 μg/mL thiamphenicol when needed.

Plasmid construction.

Annotated plasmid maps for all plasmids used in this study are provided in the Supplemental Material, and GenBank-style files are available upon request. All plasmids and strains used in this study are listed in Table S1. The first set of recombinase helper plasmids were constructed by GenScript, where the gene encoding λ Int in pINT-ts (8) was replaced with the synthesized serine recombinases that were previously described (7), resulting in plasmids pGSs037, 038, 040, 041, 042, 043, 044, 045, 046, 047, 048, 050, 053, 054, and 082 (Table S1). Recombinases were codon optimized for expression in E. coli.
For the second set of helper plasmids, recombinases were cloned under the C. thermocellum DSM 1313 enolase promoter (28) with a ribosome binding site modified to AGGAGGA. First, ΦC31 was synthesized with the enolase promoter into a pUC replicating vector. The promoter and recombinase DNA were then amplified by PCR using Phusion High Fidelity master mix (Thermo Scientific) and inserted using Gibson Assembly (New England Biolabs) into pINT-ts, replacing the native promoter and lambda recombinase, resulting in pLAR047. Using pLAR047 as a backbone, the remaining serine recombinases were cloned and inserted, replacing ΦC31, using Gibson Assembly, resulting in plasmids pLAR052-62 and 074 (Table S1).
The poly-attB cassette was synthesized by GenScript with all 14 attachment sites previously described (7) (att sites are listed in Table S2) except ΦC31 (plasmid pAMGs177). In order, the attB sites are Φ370, BT1, R4, BxB1, TP901-1, RV, SpβC, TG1, ΦC1, MR11, ΦK38, A118, Wβ, and BL3. The attB cassette was PCR amplified and inserted into the multiple cloning site of pAH144 (8) using Gibson Assembly, resulting in plasmid pLAR080. The attB cassette was then integrated using the CRIM system into the HK022 phage attB site (8) of: (i) E. coli Top 10 Δdcm::frt, (ii) E. coli WM3188 Δdcm::frt, and (iii) BW25113 ΔmcrA::frt ΔmcrC-mrr::frt Δdcm::frt Δdam::frt, creating strains AG2005, AG3525, and AG4277 respectively. The HK022 core attB site (33) is located between bases 1,051,926 to 1,051,940 in BW2115 (34) at sequence 5′- CTTTAGGTGAAAAAG -3′.
The base integration plasmid, pGSs009, was synthesized and constructed by GenScript. It was constructed from pAH55 (8) and a synthesized insert containing ΦC31 attB and attP sites flanking the T5lac promoter and lacZα. The promoter in pGSs009 was replaced with PBAD from pLA2 (8) to create pLAR067. For recombinase efficiency experiments, a synthesized cassette containing 13 attP sites, excluding SpβC, was first inserted into the BamHI and NdeI sites of pGSs009 using Gibson Assembly. Next, oligonucleotides with the Spβ attP and homology to pGSs009 were inserted into the EcoO1091 site, creating pLAR031.
C. clariflavum methyltransferases (CloclaDRAFT _2368_2369, CloclaDRAFT _1830_1831, CloclaDRAFT _1994, CloclaDRAFT _1058, CloclaDRAFT _1051, CloclaDRAFT _1868_1869, CloclaDRAFT _2133, and CloclaDRAFT _1961) were codon optimized and synthesized by GenScript, each with a different attP site. Each methyltransferase was then cloned into pLAR067, under the control of PBAD, using NEB builder, resulting in plasmids pMTV37-44 (Table S1).

Testing integration efficiencies in 2 steps.

To test integration efficiency of the first set of recombinase helper plasmids, each plasmid was transformed into the poly-attB cassette strain, AG2005. Batches of electrocompetent cells were made in duplicate for each strain, and 200 ng of pLAR031 was electroporated into 25 μL of each competent cell batch, each in duplicate (n = 4). Cells were resuspended in 250 μL super optimal broth with catabolite repression (SOC) and recovered at 37°C for 1 h and 42°C for 30 min. The recovery was plated on LB with kanamycin and incubated at 37°C overnight. Colonies were picked into 5 mL LB with kanamycin and PCR screened after growth. PCR was performed on strains using primers P1 to 4 (all screening primers are listed in Table S3) to screen for the corresponding recombination event using Quick-Load taq master mix (NEB).

Testing integration efficiencies in 1 step.

To test integration efficiency of co-transformation of the putative constitutively expressed recombinase helper plasmids and cargo plasmid, batches of electrocompetent cells were prepared of the poly-attB cassette strain (E. coli strain AG2005) in duplicate. Both the helper plasmid and pLAR031 were co-transformed in duplicate for each batch of competent cells (n = 4), using 200 ng of each plasmid and 25 μL of electrocompetent cells. Transformants were resuspended in 250 μL of SOC medium and recovered at 37°C for 1 h. The recovery was plated on LB + kanamycin plates and incubated at 37°C overnight.

Removal of plasmid backbone.

To remove the plasmid backbone, electrocompetent cells were made from the confirmed integration strain and transformed with the ΦC31 helper plasmid. Cells were recovered at 30°C for 1 h and plated on LB + carbenicillin plates. Colonies were picked into LB (with no antibiotics) and grown at 37°C for 8 h to allow plasmid curing. Then, a 5 μL aliquot of the outgrowth was streaked on LB plates and grown overnight. Resulting colonies were then patched on LB, LB + carbenicillin, and LB + kanamycin plates to screen for loss of the kanamycin resistance marker and loss of the helper plasmid.

Methyltransferase expression and methylome analysis.

Methyltransferase expression plasmids were sequentially integrated into E. coli strain AG4277 as described above, ultimately yielding strain AG5645. Strain AG5645 was grown with 1 mM arabinose to induce methyltransferase expression. Genomic DNA was isolated from C. clariflavum and AG5645 using the Qiagen Genomic-tip kit (Qiagen). Methylome analysis was performed on these strains using PacBio SMRT sequencing (35) and WGBS as previously described (21).

Transformation of C. clariflavum 4-2a.

Electroporation was performed similarly to existing protocols for C. thermocellum (36). Two 5 mL cultures were inoculated with C. clariflavum 4-2a and grown overnight. The next day, two 200 mL cultures were inoculated with a 1% inoculum and grown until an optical density at 600 nm (OD600) of 0.9. Once the cultures were grown, they were centrifuged in 50 mL conical tubes at room temperature at 6,000 × g for 15 min. The supernatant was decanted, and 25 mL of electroporation buffer (250 mM sucrose, 10% glycerol) was added to the tube without disrupting the cell pellet. Cells were centrifuged again and washed twice more in the same way. After the last spin, the cell pellets were resuspended in ~ 100 μL electroporation buffer and transferred to a microcentrifuge tube.
Using fresh electrocompetent cells, 20 μL of cells were transformed with 1 μg of pMTL83151 (37). Cells were electroporated in a 1 mm cuvette with a square wave using a Bio-Rad Gene Pulser Xcell Electroporation System set at 1200 V with a 1.5 msec pulse. Cells were then resuspended in 1 mL CTFUD medium and incubated for 3 h at 50°C to recover. After recovery, transformants were plated within CTFUD with 1.5% agar and 5 μg/mL thiamphenicol and incubated for 4 days at 50°C. Colonies from the plates were picked into liquid CTFUD medium. Liquid culture was screened by PCR for the presence of the cat gene and further confirmed via 16S rRNA gene sequencing to verify the culture. Two batches of C. clariflavum competent cells were transformed with plasmid from each methylation state, each time in duplicate (n = 4 total).

Data availability.

Data are available at SRA under accession numbers SRX2120272, SRX2120271, and SRX2120270.


This manuscript has been authored by UT-Battelle, LLC, under Contract No. DE-AC0500OR227259 with the U.S. Department of Energy.
Funding was provided in part by The BioEnergy Science (BESC) and The Center for Bioenergy Innovation (CBI), U.S. Department of Energy Bioenergy Research Centers supported by the Office of Biological and Environmental Research in the DOE Office of Science. Funding was also provided in part by Office of Biological and Environmental Research in the DOE Office of Science under Award Number DE-SC0019401. Oak Ridge National Laboratory is managed by UT-Battelle, LLC, for the U.S. DOE under contract DE-AC05-00OR22725. The PacBio DNA sequencing work was conducted by the U.S. Department of Energy Joint Genome Institute, a DOE Office of Science User Facility, which is supported by the Office of Science of the U.S. Department of Energy under contract DE-AC02-05CH11231. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Supplemental Material

File (jb.00338-22-s0001.pdf)
ASM does not own the copyrights to Supplemental Material that may be linked to, or accessed through, an article. The authors have granted ASM a non-exclusive, world-wide license to publish the Supplemental Material files. Please contact the corresponding author directly for reuse.


Rosano GL, Ceccarelli EA. 2014. Recombinant protein expression in Escherichia coli: advances and challenges. Front Microbiol 5.
Chen X, Zhou L, Tian K, Kumar A, Singh S, Prior BA, Wang Z. 2013. Metabolic engineering of Escherichia coli: a sustainable industrial platform for bio-based chemical production. Biotechnol Adv 31:1200–1223.
Pyne ME, Moo-Young M, Chung DA, Chou CP. 2015. Coupling the CRISPR/Cas9 system with lambda red recombineering enables simplified chromosomal gene replacement in Escherichia coli. Appl Environ Microbiol 81:5103–5114.
Baba T, Ara T, Hasegawa M, Takai Y, Okumura Y, Baba M, Datsenko KA, Tomita M, Wanner BL, Mori H. 2006. Construction of Escherichia coli K-12 in-frame, single-gene knockout mutants: the Keio collection. Mol Syst Biol 2.
Ao X, Yao Y, Li T, Yang T-T, Dong X, Zheng Z-T, Chen G-Q, Wu Q, Guo Y. 2018. A multiplex genome editing method for Escherichia coli based on CRISPR-Cas12a. Front Microbiol 9:2307–2307.
Elmore JR, Furches A, Wolff GN, Gorday K, Guss AM. 2017. Development of a high efficiency integration system and promoter library for rapid modification of Pseudomonas putida KT2440. Metab Eng Commun 5:1–8.
Brown WRA, Lee NCO, Xu Z, Smith MCM. 2011. Serine recombinases as tools for genome engineering. Methods 53:372–379.
Haldimann A, Wanner BL. 2001. Conditional-replication, integration, excision, and retrieval plasmid-host systems for gene structure-function studies of bacteria. J Bacteriol 183:6384–6393.
Kilby NJ, Snaith MR, Murray JAH. 1993. Site-specific recombinases: tools for genome engineering. Trends Genet 9:413–421.
Sauer B. 1998. Inducible gene targeting in mice using the Cre/loxSystem. Methods 14:381–392.
Groth AC, Calos MP. 2004. Phage integrases: biology and applications. J Mol Biol 335:667–678.
Landy A. 2015. The lambda integrase site-specific recombination pathway. Microbiol Spectr 3.
Thyagarajan B, Olivares EC, Hollis RP, Ginsburg DS, Calos MP. 2001. Site-specific genomic integration in mammalian cells mediated by phage phiC31 integrase. Mol Cell Biol 21:3926–3934.
Xu Z, Brown WRA. 2016. Comparison and optimization of ten phage encoded serine integrases for genome engineering in Saccharomyces cerevisiae. BMC Biotechnol 16.
Guss AM, Rother M, Zhang JK, Kulkarni G, Metcalf WW. 2008. New methods for tightly regulated gene expression and highly efficient chromosomal integration of cloned genes for Methanosarcina species. Archaea 2:193–203.
Siuti P, Yazbek J, Lu TK. 2013. Synthetic circuits integrating logic and memory in living cells. Nat Biotechnol 31:448–452.
Hsiao V, Hori Y, Rothemund PW, Murray RM. 2016. A population-based temporal logic gate for timing and recording chemical events. Mol Syst Biol 12.
Mermelstein LD, Welker NE, Bennett GN, Papoutsakis ET. 1992. Expression of cloned homologous fermentative genes in Clostridium acetobutylicum ATCC 824. Natur Biotechnol 10:190–195.
Purdy D, O'Keeffe TAT, Elmore M, Herbert M, McLeod A, Bokori-Brown M, Ostrowski A, Minton NP. 2002. Conjugative transfer of clostridial shuttle vectors from Escherichia coli to Clostridium difficile through circumvention of the restriction barrier. Mol Microbiol 46:439–452.
Yasui K, Kano Y, Tanaka K, Watanabe K, Shimizu-Kadota M, Yoshikawa H, Suzuki T. 2009. Improvement of bacterial transformation efficiency using plasmid artificial modification. Nucleic Acids Res 37:e3.
Riley LA, et al. 2019. Rational development of transformation in Clostridium thermocellum ATCC 27405 via complete methylome analysis and evasion of native restriction-modification systems. J Ind Microbiol Biotechnol 46:1435–1443.
Riley LA, Guss AM. 2021. Approaches to genetic tool development for rapid domestication of non-model microorganisms. Biotechnol Biofuels 14.
Chung DW, Farkas J, Westpheling J. 2013. Overcoming restriction as a barrier to DNA transformation in Caldicellulosiruptor species results in efficient marker replacement. Biotechnol Biofuels 6.
Tock MR, Dryden DT. 2005. The biology of restriction and anti-restriction. Curr Opin Microbiol 8:466–472.
Izquierdo JA, Pattathil S, Guseva A, Hahn MG, Lynd LR. 2014. Comparative analysis of the ability of Clostridium clariflavum strains and Clostridium thermocellum to utilize hemicellulose and unpretreated plant material. Biotechnol Biofuels 7:136–136.
Rakowski SA, Filutowicz M. 2013. Plasmid R6K replication control. Plasmid 69:231–242.
Combes P, Till R, Bee S, Smith MCM. 2002. The Streptomyces genome contains multiple pseudo-attB sites for the (phi)C31-encoded site-specific recombination system. J Bacteriol 184:5746–5752.
Olson DG, Maloney M, Lanahan AA, Hon S, Hauser LJ, Lynd LR. 2015. Identifying promoters for gene expression in Clostridium thermocellum. Metab Eng Commun 2:23–29.
Roberts RJ, Vincze T, Posfai J, Macelis D. 2015. REBASE-a database for DNA restriction and modification: enzymes, genes and genomes. Nucleic Acids Res 43:D298–D299.
Rooney EA, Rowe KT, Guseva A, Huntemann M, Han JK, Chen A, Kyrpides NC, Mavromatis K, Markowitz VM, Palaniappan K, Ivanova N, Pati A, Liolios K, Nordberg HP, Cantor MN, Hua SX, Shapiro N, Woyke T, Lynd LR, Izquierdo JA. 2015. Draft genome sequence of the cellulolytic and xylanolytic thermophile Clostridium clariflavum strain 4-2a. Genome Announc 3:e00797-15.
Wang TC, Smith KC. 1986. Inviability of dam recA and dam recB cells of Escherichia coli is correlated with their inability to repair DNA double-strand breaks produced by mismatch repair. J Bacteriol 165:1023–1025.
Elmore JR, Dexter GN, Francis R, Riley L, Huenemann J, Baldino H, Guss AM, Egbert R. 2020. The SAGE genetic toolkit enables highly efficient, iterative site-specific genome engineering in bacteria. bioRxiv.
Kolot M, Yagil E. 1994. Position and direction of strand exchange in bacteriophage HK022 integration. Mol Gen Genet 245:623–627.
Grenier F, Matteau D, Baby V, Rodrigue S. 2014. Complete genome sequence of Escherichia coli BW25113. Genome Announc 2:e01038-14.
Rhoads A, Au KF. 2015. PacBio sequencing and its applications. Genomics Proteomics Bioinformatics 13:278–289.
Guss AM, Olson DG, Caiazza NC, Lynd LR. 2012. Dcm methylation is detrimental to plasmid transformation in Clostridium thermocellum. Biotechnol Biofuels 5.
Heap JT, Pennington OJ, Cartman ST, Minton NP. 2009. A modular system for Clostridium shuttle plasmids. J Microbiol Methods 78:79–85.

Information & Contributors


Published In

cover image Journal of Bacteriology
Journal of Bacteriology
Volume 205Number 222 February 2023
eLocator: e00338-22
Editor: George O'Toole, Geisel School of Medicine at Dartmouth
PubMed: 36655997


Received: 9 September 2022
Accepted: 21 December 2022
Published online: 19 January 2023


  1. site-specific recombination
  2. phage integrase
  3. genome editing



Lauren A. Riley
Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Bredesen Center, University of Tennessee, Knoxville, Tennessee, USA
Irenee C. Payne
Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Melissa Tumen-Velasquez
Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Biosciences Division, Oak Ridge National Laboratory, Oak Ridge, Tennessee, USA
Bredesen Center, University of Tennessee, Knoxville, Tennessee, USA


George O'Toole
Geisel School of Medicine at Dartmouth


The authors declare no conflict of interest.

Metrics & Citations



  • For recently published articles, the TOTAL download count will appear as zero until a new month starts.
  • There is a 3- to 4-day delay in article usage, so article usage will not appear immediately after publication.
  • Citation counts come from the Crossref Cited by service.


If you have the appropriate software installed, you can download article citation data to the citation manager of your choice. For an editable text file, please select Medlars format which will download as a .txt file. Simply select your manager software from the list below and click Download.

View Options

Figures and Media






Share the article link

Share with email

Email a colleague

Share on social media

American Society for Microbiology ("ASM") is committed to maintaining your confidence and trust with respect to the information we collect from you on websites owned and operated by ASM ("ASM Web Sites") and other sources. This Privacy Policy sets forth the information we collect about you, how we use this information and the choices you have about how we use such information.
FIND OUT MORE about the privacy policy