INTRODUCTION
Natural products that are produced by microorganisms have for decades attracted considerable attention for modern therapy. The bioactivity of these structurally complex substances extends from antibiotic to immunosuppressive and from cytostatic to antitumor (
106). Not only have these secondary metabolites been optimized for their dedicated function over millions of years of evolution, they also represent promising scaffolds for the development of novel drugs with improved or altered activities. Optimization can be achieved by the introduction of artificial modifications, which yields semisynthetic derivatives of existing structures, although total synthesis of complete natural-product-based compounds is also envisioned (
138,
144).
Peptidic products represent a large subclass of highly diverse natural products, many of which display therapeutically useful activity. They can be classified into different groups according to their synthesis pathways. The lantibiotics, for example, are ribosomally synthesized antimicrobial agents that are posttranslationally modified to their biologically active forms (
18). In contrast, another widespread class of therapeutically important peptides is produced nonribosomally by large multienzyme complexes, the nonribosomal peptide synthetases (NRPSs) (
81,
111). In contrast to ribosomally synthesized peptides, non-ribosomally assembled peptides are built not only from the 20 common amino acids (aa) but from hundreds of different building blocks. Moreover, these secondary metabolite peptides contain unique structural features, such as
d-amino acids, N-terminally attached fatty acid chains, N- and C-methylated residues, N-formylated residues, heterocyclic elements, and glycosylated amino acids, as well as phosphorylated residues (
111). Recent research using both genetic and biochemical methods has revealed deep insights into the mechanism of nonribosomal peptide synthesis. In many cases, it was possible to alter existing non-ribosomally produced peptides by the combined action of chemical peptide synthesis and subsequent enzyme catalysis. This chemoenzymatic approach, along with a brief overview of the nonribosomal peptide synthesis machinery, will be discussed in more detail later in this review. Another focus of this article will be the labeling of NRPS-derived proteins by site-specific posttranslational modification.
STRUCTURAL RIGIDITY OF NON-RIBOSOMALLY SYNTHESIZED PEPTIDES
Selected structures of some non-ribosomally produced peptides are shown in Fig.
1. A common feature of these compounds is their constrained structure, which ensures bioactivity by a precise orientation required for interaction with a dedicated molecular target (
68). In some cases, these constraints are imposed by heterocyclization. For instance, the iron-chelating siderophore vibriobactin comprises two oxazoline rings, both of which originate from threonine residues (
145). This oxazoline ring can be further oxidized to yield oxazole, as found in the potent telomerase inhibitor telomestatin (
139). In addition to oxazoles, telomestatin also contains a thiazoline ring that is synthesized by the heterocyclization of cysteine. In the case of the antibiotic bacitracin, this heterocyclic element mediates a specific cation-dependent complexation of the phosphate group of the C55 lipid carrier, leading to depletion of this carrier and subsequent blocking of bacterial cell wall synthesis (
122,
123). An additional strategy to modify and thus constrain the conformation of nonribosomal peptides is exemplified by the glycopeptide antibiotics of the vancomycin and teicoplanin class (
57). These closely related compounds contain a homologous heptapeptide scaffold, whose backbone is constrained by extensive oxidative cross-linking. The joining of electron-rich aromatic rings by aryl ether linkages and direct C-C coupling converts these acyclic, floppy heptapeptides into rigid, cup-shaped structures. The constrained glycopeptides sequester the
N-acyl-
d-Ala-
d-Ala termini of bacterial peptidoglycan strands with five hydrogen bonds and inhibit the transglycosylation and/or transpeptidation steps of bacterial peptidoglycan synthesis (
4,
143).
Macrocyclization is another common constraint of non-ribosomally synthesized peptides whereby parts of the molecule distant in the linear peptide precursor are covalently linked to one another (
68). Thus far, many biological strategies for the cyclization of nonribosomal cyclopeptides have been identified, giving rise to a high diversity in this class of compounds. For instance, the intramolecular capture by amines leads to peptidolactams, whereas cyclization via hydroxyl substituents leads to peptidolactones. The former strategy is observed for the peptide antibiotics tyrocidine A, bacitracin, gramicidin S (
110), and the immunosuppressive drug cyclosporine (
77). In the case of tyrocidine A, amide bond formation occurs head-to-tail between the N-terminal amino group and the C terminus of the decapeptide. An unusual type of head-to-tail cyclization is observed for nostocyclopeptide (
42), where the terminal ends of the peptide are linked via an imine bond. In contrast, the dodecapeptide bacitracin has a lariat structure, with the heptapeptide lactam ring arising from capture of the C-terminal carbonyl group by the ε-amino group of Lys
6. Moreover, the macrolactam gramicidin S is composed of two identical pentapeptides bridged head-to-tail yielding a symmetric dilactam ring. For macrolactones, analogous cyclization strategies lead to branched-cyclic structures as seen for the antifungal lipopeptide fengycin, the antibiotic pristinamycin (
63), and the biosurfactant surfactin A (
110). The former depsipeptides are cyclized via the side chains of hydroxy amino acids such as tyrosine and threonine, whereas the latter compound is cyclized via a β-hydroxylated fatty acid moiety. Finally, the iron-chelating siderophore bacillibactin is a cyclic trilactone that arises from cyclotrimerization of threonine (
84).
DIVERSITY OF NONRIBOSOMAL PEPTIDES
The structural diversity of non-ribosomally produced peptides is best exemplified by the class of acidic lipopeptide antibiotics, including the calcium-dependent antibiotic (CDA) from
Streptomyces coelicolor (
51), daptomycin from
Streptomyces roseosporus (
3,
99) and A54145 from
Streptomyces fradiae (
39,
86), as well as friulimicins and amphomycins from
Actinoplanes friuliensis (
136). All of these lipopeptides originate from actinomycetes, which produce over two-thirds of naturally derived antibiotics (
8). Each member of this class of lipopeptides can be subdivided into various individual compounds that differ in the structure of the N-terminally attached fatty acid moiety and/or the peptide backbone (Fig.
2). For example, A54145 is a complex of eight lipopeptides which are acylated with an 8-methylnonanoyl,
n-decanoyl, or 8-methyldecanoyl lipid side chain. These factors also contain four different cyclic peptide nuclei which differ in glutamate/3-methylglutamate (position 12) and/or valine/isoleucine (position 13) substitutions (
39). The diversity of acidic lipopeptide antibiotics is further amplified by the occurrence of
d-configured as well as nonproteinogenic amino acids, including
d-4-hydroxyphenylglycine (
d-HPG),
d-3-phosphohydroxyasparagine, 3-methylglutamate (3mGlu),
d-pipecolic acid, kynurenine (Kyn), and many more. Interestingly, all of the acidic lipopeptide antibiotics contain a branched cyclic decapeptide lactone or lactam ring. The positions of the
d-configured amino acids are strictly conserved in this macrocyclic scaffold. Moreover, two aspartic acid residues are found in equivalent ring positions of the macrolactone or macrolactam ring. Recently, a genomics-based approach revealed the existence of numerous uncharacterized lipopeptide biosynthetic gene clusters, indicating that many more antibiotics of this class have yet to be identified (
88).
The therapeutic importance of the acidic lipopeptide antibiotics is best exemplified by daptomycin. This amphipathic tridecapeptide is a member of the A21978C complex produced by
S. roseosporus (Fig.
2). Although the major components, A21978C1 through A21978C3, have 11-, 12-, or 13-carbon fatty acids, the yield of daptomycin (10-carbon fatty acid) from fermentations is significantly increased by adding decanoic acid to the medium. Daptomycin (Cubicin; Cubist Pharmaceuticals) exhibits bactericidal activity against resistant pathogens for which there are very few therapeutic alternatives, such as vancomycin-resistant enterococci, methicillin-resistant
Staphylococcus aureus, and penicillin-resistant
Streptococcus pneumoniae (
126). At present, spontaneous acquisition of resistance to daptomycin is rare, which might be due to a unique mechanism of action (
99).
Although the mechanism of action of daptomycin is not yet fully understood, it has been clearly established that calcium ions play an essential role in antimicrobial potency (
54,
55). A nuclear magnetic resonance (NMR) study indicated that the stoichiometry of Ca2+ binding to daptomycin is one to one (
2). Therefore, the net charge of Ca2+-conjugated daptomycin (−1) is smaller in magnitude than that of Ca2+-free daptomycin (−3) at a neutral pH. This would result in a more hydrophobic molecule due to charge neutralization, facilitating interaction of Ca2+-conjugated daptomycin with lipid bilayers. It has been proposed that, upon association with bacterial cytoplasmic membranes, a major Ca2+-dependent conformational change promotes deeper insertion of daptomycin into the lipid bilayer (
55). This is followed by large membrane perturbations, including lipid flip-flop and membrane leakage. Formation of any of these presumably disrupts the functional integrity of the membrane leading to cell death of gram-positive bacteria.
Although some of the key structural prerequisites for daptomycin's antibacterial activity have been identified, the exact nature of the molecular targets within the cytoplasmic membrane has yet to be established. However, the aforementioned model of the mechanism of action provides an initial step toward understanding how this antibiotic gains access to and interacts with bacterial membranes. Since the other acidic lipopeptide antibiotics CDA, A54145, friulimicins, and amphomycins share key structural features with daptomycin, they might undergo similar interactions with calcium ions and bacterial membranes. Therefore, it is essential to further probe the structure-function relationship of all acidic lipopeptide antibiotics. This knowledge will enable the design of new and improved derivatives of this remarkable class of antibiotics. However, in order to engineer more potent variants, one has to understand the biosynthesis of these complex compounds. This will be the focus of the following section.
BIOSYNTHETIC LOGIC OF NONRIBOSOMAL PEPTIDE SYNTHETASES
Despite the structural diversity of the non-ribosomally produced acidic lipopeptide antibiotics, these secondary metabolites share a common mode of synthesis, the so-called “multiple carrier thio-template mechanism” (
75,
76,
81). According to this model, peptide synthesis is performed by nonribosomal peptide synthetases (NRPSs). Figure
3 shows the NRPS assembly lines for daptomycin, A54145 and CDA. Detailed analysis of the daptomycin gene cluster revealed that the daptomycin biosynthetic system consists of three distinct NRPSs, namely, DptA (684 kDa), DptBC (815 kDa), and DptD (265 kDa) (
87). In contrast, the closely related A54145 biosynthetic system comprises four NRPSs (LptA, LptB, LptC, and LptD) (
86). It is assumed that DptBC arises from a fusion of two NRPSs similar to LptB and LptC. Finally, the nonribosomal CDA biosynthetic system is a multienzyme complex consisting of three enzymatic subunits, CdaPS1 (799 kDa), CdaPS2 (395 kDa), and CdaPS3 (259 kDa) (
51).
The multifunctional NRPSs of daptomycin, A54145, and CDA are organized into sets of repetitive catalytic units called modules (Fig.
3). Each module is responsible for the specific incorporation of one residue into the peptide backbone (
107). Therefore, the number of modules within the NRPSs exactly matches the number of residues of the corresponding peptides. Moreover, the order of modules corresponds directly to the primary sequence, because nonribosomal peptide synthesis proceeds colinearly in an N-terminal-to-C-terminal direction (
91). Such biosynthetic templates are also referred to as linear NRPSs (type A). In contrast, iterative NRPSs (type B) use their modules or domains more than once in the assembly of peptides that consist of repeated smaller sequences. Finally, nonlinear NRPSs (type C), which constitute a considerable fraction of the NRPS repertoire, yield products whose sequence does not directly correspond to the linear arrangement of modules and domains within the biosynthetic template. These various biosynthetic strategies of nonribosomal peptide synthesis were extensively reviewed by Mootz et al. (
91).
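The colinearity rule for linear (type A) NRPSs can be illustrated with a minimal Python sketch. It assumes strict colinearity (one residue per module, subunits read in their docking order). The subunit names, module names, and residues below are hypothetical placeholders and do not describe the daptomycin, A54145, or CDA assembly lines.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Module:
    """One NRPS module; `substrate` is the residue its A-domain selects."""
    name: str
    substrate: str


@dataclass(frozen=True)
class Synthetase:
    """One NRPS subunit holding an ordered list of modules."""
    name: str
    modules: List[Module]


def linear_product(assembly_line: List[Synthetase]) -> List[str]:
    """Return the peptide (N to C terminus) made by a linear, type A NRPS.

    Assumes strict colinearity: one residue per module, in module order.
    """
    return [m.substrate for s in assembly_line for m in s.modules]


# Hypothetical three-subunit assembly line (names and residues are invented).
toy_line = [
    Synthetase("SynA", [Module("M1", "Trp"), Module("M2", "D-Asn")]),
    Synthetase("SynB", [Module("M3", "Asp"), Module("M4", "Thr")]),
    Synthetase("SynC", [Module("M5", "Gly")]),
]

peptide = linear_product(toy_line)
print("-".join(peptide))  # Trp-D-Asn-Asp-Thr-Gly
# The number of residues equals the number of modules.
print(len(peptide) == sum(len(s.modules) for s in toy_line))  # True
```

Iterative (type B) and nonlinear (type C) systems would break the one-module, one-residue assumption made in `linear_product` and are deliberately not modeled here.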
The proper coordination of communication between partner NRPSs in
trans (e.g., the last module of DptA and first module of DptBC) is facilitated by short regions at the C and N termini of the corresponding proteins (
47). These communication-mediating (COM) domains, also referred to as docking domains, comprise 15 to 30 amino acid residues and prevent undesired interactions between mismatching NRPSs (e.g., the last module of DptA and first module of DptD), which would lead to the formation of truncated peptide products. Sequence alignments revealed that the overall identity among COM domains is low, reflecting the high degree of specialization for their dedicated partner COM domains. The first structural insights into the interaction between multimodular subunits were gained from NMR spectroscopy on related polyketide synthases (PKS) (
14). Studies of fused docking domains of the 6-deoxyerythronolide B synthase (DEBS) multienzyme subunits DEBS2 and DEBS3 revealed that protein-protein recognition is primarily mediated by interhelical contacts. The most important determinant of docking is a set of conserved hydrophobic interactions between four α-helices, which together form the core of a parallel four-helix bundle. In addition to the hydrophobic interface, two partially buried salt bridges between two of these α-helices may play a role in stabilizing this docking interaction. Furthermore, such ionic contacts might contribute to the destabilization of misdocked partner PKS subunits. Knowledge of the structural aspects of intersubunit communication may contribute to engineering of optimized protein-protein interfaces between NRPS, PKS, and mixed NRPS/PKS systems.
NRPS modules are further subdivided into domains that catalyze the single reaction steps, such as amino acid activation, covalent binding of activated residues, amide bond formation, epimerization of covalently bound residues, and peptide release from the NRPS complex. These autonomous catalytic units will be discussed below.
Dissecting the Modules into Domains
At least three domains are necessary for the nonribosomal production of peptides (Fig.
4): the adenylation domain (A-domain), the peptidyl-carrier protein (PCP), and the condensation domain (C-domain). The A-domain (∼550 aa) controls the first step of nonribosomal peptide synthesis, namely, the specific recognition and activation of the dedicated amino acid (
26,
83). This domain catalyzes two reactions. First, the A-domain selects the cognate building block from the pool of available substrates, followed by activation as an aminoacyl adenylate intermediate (Fig.
4). The corresponding reaction in ribosomal synthesis is performed by aminoacyl-tRNA-synthetases, although these enzyme families share neither sequence nor structural relations (
124). Two crystal structures of A-domains have been solved to date. These include the Phe
1-activating A-domain (PheA) of the gramicidin S synthetase A of
Bacillus brevis (
22) (Fig.
4) and the 2,3-dihydroxybenzoate (DHB)-activating A-domain (DhbE) of
Bacillus subtilis (
83). They are composed of a large N-terminal subunit and a small C-terminal subunit. The active site is located at the junction between the two subunits. Comparison of the residues lining the active sites of PheA and DhbE and sequence alignments of other A-domains led to the identification of 10 residues that confer substrate specificity, also referred to as the codons of nonribosomal peptide synthesis (
118). Using this nonribosomal code, it is possible to predict the substrate specificity of A-domains simply by sequence analysis.
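How such a sequence-based prediction might work can be sketched in Python as a nearest-match lookup over 10-residue signatures. This is only an illustration under stated assumptions: the signatures and substrate labels in `TOY_CODE` are invented placeholders rather than published specificity codes, and extracting the 10 positions from a full A-domain sequence (by alignment to a reference such as PheA) is assumed to have been done already.

```python
from typing import Dict, Tuple

# Hypothetical signatures (invented; real ones come from curated alignments).
TOY_CODE: Dict[str, str] = {
    "aa1": "ACDEFGHIKL",
    "aa2": "LKIHGFEDCA",
    "aa3": "MNPQRSTVWY",
}


def identity(a: str, b: str) -> int:
    """Count positions at which two equal-length signatures agree."""
    return sum(x == y for x, y in zip(a, b))


def predict_substrate(signature: str,
                      code: Dict[str, str] = TOY_CODE) -> Tuple[str, int]:
    """Return the best-matching substrate label and its number of matches.

    Ties are resolved in favor of the entry listed first in `code`.
    """
    if len(signature) != 10:
        raise ValueError("expected a 10-residue signature")
    best = max(code, key=lambda label: identity(signature, code[label]))
    return best, identity(signature, code[best])


print(predict_substrate("ACDEFGHIKY"))  # ('aa1', 9)
```

A low match count would signal that the query lies outside the known code, in which case no confident prediction should be made.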
Second, the activated aminoacyl adenylate is transferred onto the thiol group of the 4′-phosphopantetheine (ppan) cofactor of the PCP, which is the only NRPS domain without autonomous catalytic activity. The PCP (∼80 aa) facilitates the ordered transport of substrates and elongation intermediates to the catalytic centers with all intermediates covalently tethered to the 20-Å-long ppan cofactor (Fig.
4) (
34,
116). This principle facilitates substrate channeling and overcomes diffusive barriers, therefore maximizing the catalytic efficiency of the NRPS-mediated biosynthesis (
111). First insights into protein structure and function of PCPs were gained from an NMR study on PCP from
Bacillus brevis tyrocidine synthetase (
141) (Fig.
4). PCP exhibits a distorted four-helix bundle fold and an extended loop between the first two helices. An invariant Ser
45 residue, which serves as the site of ppan cofactor binding, is located at the interface between this loop and the second helix. The posttranslational apo-to-holo conversion of PCPs is catalyzed by NRPS-associated 4′-phosphopantetheinyl transferases, which use coenzyme A (CoA) as a substrate (
72) (see “MANIPULATION OF CARRIER PROTEINS BY POSTTRANSLATIONAL MODIFICATION”).
Formation of the peptide bond in nonribosomal peptide biosynthesis is mediated by the C-domain (∼450 aa) (
9,
117). This domain catalyzes the nucleophilic attack of the downstream PCP-bound amino acid with its α-amino group on the electrophilic thioester of the upstream PCP-bound amino acid or peptide (Fig.
4). The directionality of this process is realized by donor and acceptor sites on the C-domain for electrophiles and nucleophiles, respectively (
91). According to the multiple carrier thio-template mechanism (
121), the acceptor site binds the nucleophile with high affinity until the incoming electrophile completes the condensation process. First structural insights into this class of enzymes were gained from the crystal structure of the freestanding C-domain VibH of the
Vibrio cholerae vibriobactin synthetase (
60) (Fig.
4). Pseudodimeric VibH consists of a C-terminal domain and an N-terminal domain, with each domain being an αβα sandwich. The substrates DHB and norspermidine enter the active site, which is located at the interface of the two domains, from opposing sites of the C-domain, the so-called N- and C-faces. Therefore, these two faces would correspond to the assumed donor (C-face) and acceptor sites (N-face) of the C-domain, respectively. Biochemical characterization of different C-domains revealed that the acceptor site discriminates against amino acids of opposite stereochemistry and with noncognate side chains (
7,
21). In contrast, the donor site is more tolerant of the respective electrophile. Nevertheless, further investigations with the C-domain of tyrocidine elongation module 5 indicated that the donor position exhibits stereoselectivity toward the C-terminal residue for condensation reactions (
21). This shows that, in addition to A-domains, C-domains serve as a selectivity filter in nonribosomal peptide synthesis.
Proofreading of Nonribosomal Peptide Synthesis
The low substrate specificity of ppan transferases causes undesired misacylation of PCPs. Since the bacterial cell produces a large fraction of CoA in the form of acyl-CoAs (
53), it is likely that these enzymes also modify the PCPs of NRPSs with acylated ppan cofactors. Such misprimed PCPs are not recognized by later-acting domains, thereby blocking nonribosomal peptide synthesis. In order to regenerate these misprimed NRPS templates, a type II thioesterase (TEII) is assumed to catalyze hydrolysis of the undesired acyl groups (
108). Moreover, a recent study suggests that the TEII also hydrolyzes incorrectly loaded amino acids that are not processed by the nonribosomal machinery (
148). According to this model, TEII discriminates “correct” from “incorrect” residues based on the increased half-life of unprocessed aminoacyl-
S-ppan intermediates. In contrast to this, TEII does not catalyze the hydrolysis of stalled peptide intermediates, which indicates that the release of these energy-consuming intermediates is prevented by rigorous editing of misloaded amino acids prior to incorporation into the product (
132,
148).
Lipidation of Non-Ribosomally Produced Peptides
N-terminal lipidation is a key structural feature of many nonribosomal peptides, such as the acidic lipopeptide antibiotics, fengycin (Fig.
1), surfactin A (Fig.
1), syringomycin, and mycosubtilin. As discussed above (see “DIVERSITY OF NONRIBOSOMAL PEPTIDES”), lipidation is important for interaction with hydrophobic targets such as cell membranes. However, in contrast to the well-studied peptide elongation, very little is known about the mechanism of this chemical transformation. In the case of daptomycin, the deduced translation products of the
dptE and
dptF genes are likely to have a role in N-terminal lipidation (
87). DptE exhibits conserved motifs typical of adenylate-forming enzymes and may therefore activate the long-chain fatty acid as acyl-adenylate (Fig.
5). A similar mode of activation was previously described for the long-chain fatty acyl-AMP ligases of
Mycobacterium tuberculosis (
129). According to this work, long-chain fatty acids are activated as acyl-adenylates, which are then transferred onto the ppan cofactor of the N-terminal PCP of the corresponding PKS. However, the daptomycin biosynthetic system lacks such an N-terminal PCP. Instead, DptF may serve this function because of its significant sequence similarity to ppan-binding acyl carrier proteins (ACPs). This domain could then transfer the ppan-bound fatty acid to Trp
1 tethered to the N-terminal module of DptA. Acylation of Trp
1 is presumably catalyzed by the most upstream C-domain, the so-called starter C-domain. Specific starter C-domain-ACP docking may facilitate this acyl transfer reaction (Fig.
5). However, further studies are needed to clarify the specificity and biochemistry of the interaction between the ACP and the starter C-domain of the daptomycin as well as other lipopeptide-encoding biosynthetic systems.
Generation of d-Amino Acid Residues in NRPSs
One striking feature of many NRPSs is that they incorporate
d-amino acids into their peptide products. The
d-configured residues may inhibit the degradation of nonribosomal peptides by naturally
l-specific proteases or may serve structural functions by determining the bioactive conformation (
70,
79,
120). In most cases, incorporation of
d-amino acids into the peptide sequence is mediated by an interplay between the epimerization domain (E-domain; ∼450 aa) (
95,
120) and the downstream C-domain (Fig.
6A). The E-domain catalyzes racemization (equilibration between
l- and
d-enantiomers) of the PCP-bound
l-amino acid or epimerization of the C-terminal amino acid (equilibration between
l- and
d-epimers) of the growing peptide chain. In order to ensure selective incorporation of the
d-amino acid into the peptide backbone, the donor site of the downstream C-domain is
d-specific for the incoming cofactor-bound electrophile (
7). Hence, the C-domain functions as a catalyst directing the condensation of an upstream
d-amino acid with a downstream
l-amino acid (
dC
l catalyst).
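The interplay between the E-domain and the downstream C-domain can be pictured as an equilibration step followed by a stereoselective filter. The Python sketch below is a toy model only: the function names are hypothetical, the equilibrium share `d_fraction` is an assumed parameter rather than a measured value, and the acceptor site is modeled as strictly l-specific.

```python
from typing import Dict, List, Tuple


def epimerize(residue: str, d_fraction: float = 0.5) -> Dict[str, float]:
    """E-domain step: equilibrate a PCP-bound L-residue into an L/D mixture.

    `d_fraction` is an assumed equilibrium share of the D-form.
    """
    if not 0.0 <= d_fraction <= 1.0:
        raise ValueError("d_fraction must lie between 0 and 1")
    return {f"L-{residue}": 1.0 - d_fraction, f"D-{residue}": d_fraction}


def donor_site_accepts(donor: str) -> bool:
    """Donor site of a dCl-type C-domain: only D-configured donors pass."""
    return donor.startswith("D-")


def condense(donor_pool: Dict[str, float],
             acceptor: str) -> List[Tuple[str, str]]:
    """Return the (donor, acceptor) pairs the C-domain would join."""
    if not acceptor.startswith("L-"):
        return []  # acceptor site modeled as L-specific
    return [(donor, acceptor) for donor in donor_pool
            if donor_site_accepts(donor)]


pool = epimerize("Phe")          # mixture of L-Phe and D-Phe on the PCP
print(condense(pool, "L-Pro"))   # [('D-Phe', 'L-Pro')]
```

Although both stereoisomers are present after the E-domain step, only the d-configured donor appears in the condensation product in this model.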
A different mechanism for the incorporation of
d-amino acids is utilized by the cyclosporine synthetase (Fig.
6B) (
50). The corresponding biosynthetic gene cluster encodes an alanine racemase to provide substrate for the
d-Ala-selective A-domain in the first module. This shows that besides C-domains, A-domains may also represent a stereoselective filter in nonribosomal peptide synthesis.
Recently, a third strategy of
d-amino acid incorporation was observed in multiple gram-negative
Pseudomonas strains producing arthrofactin, syringomycin, and syringopeptin (
1). The lipopeptidolactone arthrofactin, for instance, contains seven
d-amino acids, yet there are no E-domains in any of the three NRPSs, ArfA, ArfB, and ArfC. Moreover, kinetic measurements revealed that at least the three most upstream A-domains activate
l-amino acids rather than
d-amino acids. Interestingly, epimerization of amino acids is catalyzed by a new type of domain, a C/E-domain, which is proposed to have dual catalytic roles for epimerization and condensation (Fig.
6C). Remarkably, the epimerization reaction does not take place unless the PCP downstream of this C/E-domain is loaded with the dedicated amino acid. Therefore, the epimerization activity may be triggered by a conformational change of the C/E-domain which is induced by the aminoacylated downstream PCP that is primed for peptide bond formation. After epimerization of the upstream aminoacyl/peptidyl thioester, the C/E-domain finally catalyzes the elongation of the peptidyl chain with
dC
l chirality.
MACROCYCLIZATION CATALYZED BY NONRIBOSOMAL THIOESTERASE DOMAINS
Nonribosomal peptides grow by consecutive addition of activated aminoacyl monomer units. The elongated chain is translocated each time from upstream to downstream PCPs during chain elongation. Once the peptide chain reaches its full length at the most downstream PCP, it has to be released in order to reactivate the NRPS machinery for the next synthesis cycle. Typically, termination of peptide synthesis is accomplished by a thioesterase domain (TE-domain; ∼280 aa) fused to the C-terminal module (
68). This enzyme uses an active site serine as a nucleophilic catalyst. Peptide release is initiated by transfer of the ppan-bound peptide chain to the active site serine of the downstream TE-domain to generate an acyl-
O-TE intermediate (
68). This covalent enzyme intermediate may break down either by the attack of a water molecule to yield a linear peptide (e.g., vancomycin) or by attack of an internal nucleophile, producing a cyclopeptide (e.g., daptomycin) (Fig.
7A).
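The two fates of the acyl-O-TE intermediate can be treated as competing pathways. The Python sketch below assumes, purely for illustration, that hydrolysis and cyclization are both first-order in the intermediate, so the product split follows from the ratio of two rate constants. The function name and the numerical values are hypothetical and are not taken from any measurement.

```python
from typing import Dict


def release_fractions(k_cyclization: float,
                      k_hydrolysis: float) -> Dict[str, float]:
    """Split an acyl-O-TE intermediate between cyclic and linear product.

    Assumes two competing first-order steps, so each product fraction
    is its rate constant divided by the sum of both.
    """
    if k_cyclization < 0 or k_hydrolysis < 0:
        raise ValueError("rate constants must be non-negative")
    total = k_cyclization + k_hydrolysis
    if total == 0:
        raise ValueError("at least one rate constant must be positive")
    return {
        "cyclic": k_cyclization / total,
        "linear": k_hydrolysis / total,
    }


# Hypothetical rate constants in arbitrary but identical units.
print(release_fractions(3.0, 1.0))  # {'cyclic': 0.75, 'linear': 0.25}
```

In this picture, a TE-domain that releases only linear peptide corresponds to a vanishing cyclization rate constant, and a dedicated cyclase to a vanishing hydrolysis rate constant.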
While TE-domains represent the most common solution to peptide release in nonribosomal biosynthesis, alternative strategies are known. For instance, in the synthesis of cyclosporine 7 (Fig.
1), the most downstream C-domain of cyclosporine synthetase is proposed to catalyze peptide release by head-to-tail condensation (Fig.
7B) (
140). Moreover, peptide release can occur by reduction of the carboxy group, mediated by an NAD(P)H-dependent reduction domain (R-domain), as in the biosynthesis of the linear peptide alcohol gramicidin A in
B. brevis (
62) and in the formation of the macrocyclic imine nostocyclopeptide 8 (Fig.
1) from
Nostoc sp. (Fig.
7C) (
6).
However, macrocyclization catalyzed by nonribosomal TE-domains seems to be the favored mechanism for peptide release, not least because of the role this structural constraint plays in resistance to proteolytic degradation and enhanced bioactivity. For example, the conformation of daptomycin is constrained by a branched cyclic decapeptide lactone derived from TE-mediated cyclization of an
l-threonine side chain onto the C terminus (
51). Considering the diversity in cyclization strategies of nonribosomal peptides, it is not surprising that the overall identity among TE-domains is only 10 to 15%, reflecting their high degree of specialization for the cyclization reactions they catalyze (
106). Structural and mechanistic aspects of these versatile macrocyclization catalysts (also referred to as peptide cyclases) are discussed in the following section.
Structural and Mechanistic Aspects of Peptide Cyclases
First structural and mechanistic insights into the mode of TE-mediated peptide cyclization were gained from the crystal structure of the surfactin cyclase (Srf TE) (
15). The crystallographic studies revealed similarities to structures previously solved for α/β hydrolase family members. However, the Srf TE most significantly differed from the canonical fold of this superfamily by an extended insertion composed of three α-helices that reach over the active site. Based on alignment, this “lid” differs significantly from the corresponding regions of other TE domains, suggesting that the substrate specificity is encoded in this predominantly nonconserved region of the cyclase (
68). The nonconserved residues in the lid may direct cyclization through specific interactions with the Srf TE-bound peptide chain. Based on further studies, the two positively charged residues Lys
111 and Arg
120 in the active site may also contribute to the proper folding of the substrate by coordination of the negatively charged residues Glu
1 and Asp
5 in the surfactin sequence (
130).
In NRPS assembly lines, the TE-domain acts in concert with the upstream PCP that donates the ppan-bound peptide chain. In the case of Srf TE, a putative interaction site allows docking of the C
α chain of PCP to the cyclase (
15). The peptide chain tethered to the 20-Å-long ppan cofactor is presumably directed via a cleft into the active site of the globular cyclase and transferred onto a conserved serine residue. This residue belongs to a catalytic triad composed of Ser
80, His
207, and Asp
107. Cocrystallization studies with a boronic acid inhibitor revealed distinct recognition and binding of the C-terminal residues Leu
7 and
d-Leu
6 of the surfactin peptide in the active site (
130). Finally, breakdown of the generated acyl-
O-TE intermediate occurs by regioselective intramolecular attack of the fatty acid β-hydroxyl group on the oxoester bond to exclusively release the macrolactone.
Autonomous Cyclization Activity of Excised TE-Domains
The great pharmacological potential of many cyclic peptides emphasizes their role in drug discovery, as they show specific interactions with defined cellular targets and high stability against proteolytic digestion (
111). They are therefore very promising scaffolds for drugs. However, modern organic chemistry still faces many difficulties in the reliable production of cyclopeptides. In many cases, the yield is poor or the reaction lacks sufficient regio- and stereoselectivity (
23,
109). These problems could be solved by using nonribosomal cyclases, which catalyze the regio- and stereoselective cyclization of linear precursor peptides without the use of protecting groups. However, the application of nonribosomal TE-domains for cell-free synthesis of cyclic peptides requires translation between the biological and chemical languages. First, the complex NRPS multienzyme machinery required for peptide elongation is replaced by well-established solid-phase peptide synthesis (SPPS), which greatly facilitates the rapid synthesis of peptides containing unnatural amino acids (
109). Second, the TE-domain is used as an isolated enzyme for in vitro peptide cyclization, because the large size of the whole multienzyme complex causes severe preparative problems. Third, to ensure acylation of the excised TE-domain, the natural PCP-bound phosphopantetheine prosthetic group is replaced by a cofactor mimic, which is attached to the C-terminal end of the chemically synthesized peptide.
This chemoenzymatic approach was first realized in a collaboration between the Walsh and Marahiel laboratories, which reported the isolation and characterization of the TE-domain of tyrocidine synthetase from
Bacillus brevis (Fig.
8) (
127). Incubation of a chemically synthesized tyrocidine decapeptidyl-
N-acetylcysteamine (SNAC) thioester and excised tyrocidine cyclase (Tyc TE) resulted in the formation of the cyclic decapeptide antibiotic tyrocidine A (Fig.
1). Hydrolysis of the substrate mimic could be detected to a lesser extent and might be due to the fact that the excised cyclase lacks the hydrophobic environment of the multienzyme complex. Recent results indicate that the interaction of the isolated Tyc TE with detergent micelles may serve to mimic the natural contacts of this domain with the larger synthetase (
149). In fact, the addition of nonionic detergent induced a significant shift in the product ratio of Tyc TE in favor of macrocyclization.
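Because macrocyclization and hydrolysis of the same peptidyl-SNAC substrate differ by exactly one water molecule, the two products can be told apart by mass. The following Python sketch illustrates this bookkeeping for the tyrocidine decapeptide. It is an illustration only and not the assay used in the cited work: the residue sequence of tyrocidine A, the standard monoisotopic residue masses, the function names, and the mass tolerance are assumptions introduced here, and neutral (uncharged) masses are compared.

```python
# Monoisotopic residue masses in Da (amino acid minus one water).
RESIDUE_MASS = {
    "Phe": 147.06841, "Pro": 97.05276, "Asn": 114.04293, "Gln": 128.05858,
    "Tyr": 163.06333, "Val": 99.06841, "Orn": 114.07931, "Leu": 113.08406,
}
WATER = 18.01056

# Assumed sequence of tyrocidine A; D/L stereochemistry does not affect mass.
TYROCIDINE_A = ["Phe", "Pro", "Phe", "Phe", "Asn", "Gln", "Tyr", "Val", "Orn", "Leu"]


def cyclic_mass(residues):
    """Neutral mass of the head-to-tail macrolactam (sum of residue masses)."""
    return sum(RESIDUE_MASS[r] for r in residues)


def linear_acid_mass(residues):
    """Neutral mass of the hydrolysis product (linear peptide free acid)."""
    return cyclic_mass(residues) + WATER


def classify_product(observed_mass, residues, tol=0.02):
    """Assign an observed neutral mass to cyclization or hydrolysis."""
    if abs(observed_mass - cyclic_mass(residues)) <= tol:
        return "macrolactam"
    if abs(observed_mass - linear_acid_mass(residues)) <= tol:
        return "hydrolysis product"
    return "unassigned"


if __name__ == "__main__":
    print(round(cyclic_mass(TYROCIDINE_A), 4))
    print(round(linear_acid_mass(TYROCIDINE_A), 4))
```

In practice the comparison would be made on the observed ions rather than on neutral masses, so the appropriate adduct mass would have to be added to both values.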
To explore the substrate specificity of Tyc TE, a scan through all 10 positions of the peptidyl-SNAC thioester was performed (
127). Notably, it was found that only the substitution of amino acids near the end of the decapeptide, namely,
d-Phe
1 and
l-Orn
9, significantly decreased the rate of TE-catalyzed cyclization. It was also observed that thioester substrates 6 to 14 residues in length could be efficiently cyclized by Tyc TE, resulting in the formation of different-size macrolactams (
67). Alterations of the peptide backbone either by the replacement of three amino acid blocks with flexible spacers or by the replacement of individual amide bonds with ester bonds provided evidence that product-like intramolecular hydrogen bonds facilitate peptide preorganization (
128). This preorganization was efficient enough to allow macrolactone formation by using a hydroxyl group as intramolecular nucleophile despite the lower nucleophilicity of hydroxyl compared to amine. Based on these findings, a model of a minimal cyclization substrate for the Tyc TE was postulated (
128).
Generality of TE-Catalyzed Peptide Cyclization
To provide evidence for the general utility of TE catalysis as a means to synthesize a wide range of macrocyclic compounds, peptide cyclases from other NRPS systems were cloned and overexpressed. The recombinant thioesterase domain SnbDE TE from
Streptomyces pristinaespiralis is a versatile cyclase for the production of streptogramin B antibiotics such as pristinamycin (Fig.
1) (
80). Although the streptogramin B (S
B) SNAC substrates with the natural phenylglycine (Phg) at the C terminus undergo rapid C-terminal racemization under assay conditions, stereoselective SnbDE TE only incorporates
l-Phg into the cyclic product (Fig.
9). This dynamic kinetic resolution (
131) simplifies challenging S
B synthesis to standard peptide chemistry and subsequent enzymatic reaction. Besides having high stereoselectivity, SnbDE TE was able to mediate both macrolactonization and macrolactamization of peptide thioester substrates. Interestingly, macrolactamic S
B derivatives are promising pharmacophores because in some cases, S
B resistance arises from lyase-catalyzed cleavage of the natural lactone bond (
92).
In addition to providing insights into stereoselectivity, biochemical studies of the recombinant
S. coelicolor CDA TE have provided important insights into the regioselectivity of peptide cyclases. Incubation of N-terminally acetylated CDA thioester analogs with CDA TE resulted in two regioisomeric macrolactones which arise from simultaneous nucleophilic attack of the two adjacent Thr
2 and Ser
1 residues onto the C-terminal Trp
11 of the acyl-enzyme intermediate (
45). This relaxed regioselectivity was used to rationally manipulate the ring size of the macrocyclic product. For instance, substitution of either Thr
2 or Ser
1 by Ala led to selective formation of a decapeptide or undecapeptide lactone ring. Interestingly, elongation of the N-terminal acyl group by four methylene groups to the natural length (C
6) of CDA turned the relaxed regioselectivity into a strict regioselectivity, yielding solely the decapeptide lactone ring derived from cyclization via Thr
2. This result suggests the crucial role of the lipid chain in controlling the regioselectivity of TE-mediated macrocyclization. Binding of the N-terminal fatty acid in the active site of CDA TE might ensure a precise positioning of the Thr
2 residue required for a regioselective attack onto the acyl-
O-TE oxoester.
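The relationship between the attacking residue and the resulting ring size can be made explicit with a short calculation. The Python sketch below simply counts the residues enclosed when a side chain nucleophile at a given position attacks the C-terminal residue of the acyl-enzyme intermediate. The function names and the dictionary layout are hypothetical and were chosen for illustration; only the positions (Ser 1, Thr 2, Trp 11) are taken from the text.

```python
def ring_size(nucleophile_position, c_terminal_position):
    """Number of residues in the ring closed by attack on the C-terminal residue."""
    if not 1 <= nucleophile_position < c_terminal_position:
        raise ValueError("nucleophile must lie N-terminal to the C-terminal residue")
    return c_terminal_position - nucleophile_position + 1


def possible_rings(nucleophiles, c_terminal_position):
    """Map each candidate nucleophile to the ring size it would produce."""
    return {
        name: ring_size(position, c_terminal_position)
        for name, position in nucleophiles.items()
    }


# CDA thioester analog: two adjacent hydroxyl nucleophiles, C-terminal Trp 11.
cda_nucleophiles = {"Ser1": 1, "Thr2": 2}
rings = possible_rings(cda_nucleophiles, 11)

# Replacing one nucleophile by Ala removes it from the set of candidates.
ser1_to_ala = {k: v for k, v in cda_nucleophiles.items() if k != "Ser1"}
rings_after_substitution = possible_rings(ser1_to_ala, 11)
```

With both nucleophiles present, the sketch returns a decapeptide ring for Thr 2 and an undecapeptide ring for Ser 1, which corresponds to the two regioisomeric macrolactones described above. Removing either entry leaves a single possible ring size.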
To further expand the set of cyclization catalysts, the peptide cyclases Syr TE from syringomycin synthetase, Fen TE from fengycin synthetase, and Myc TE from mycosubtilin synthetase were cloned and overexpressed (
112,
113). However, their inability to recognize and bind conventional peptidyl-SNAC substrates precluded examination of these cyclases. To mimic the natural substrate presentation as closely as possible, a strategy that allowed Sfp-catalyzed loading of peptidyl-CoA substrates onto apo-PCP-TE didomains was employed (
113). This strategy takes advantage of the direct interaction between the ppan-bound substrate of the PCP and the C-terminally adjacent TE-domain. Using this approach, it was possible to detect cyclization of a linear fengycin analog. However, one major drawback of this method is that the ppan cofactor remains attached to the PCP-TE didomain, thereby blocking Sfp-catalyzed transfer of additional peptidyl-CoA substrates onto PCP. To force multiple turnover catalysis, reloading of the ppan-PCP-TE didomain was attempted by chemical transthioesterification using peptidyl-thiophenol substrates (
112). Surprisingly, instead of ppan reloading, the highly electrophilic peptidyl-thiophenol substrates directly acylated the TE active-site serine. This direct acylation made it possible to biochemically characterize Syr TE, Fen TE, and Myc TE, which had displayed no activity with the less electrophilic peptidyl-SNAC substrates.
Solid-phase peptide synthesis enables the detailed analysis of NRPS-derived peptide cyclases. However, relatively little is known about the substrate specificity of macrocyclization catalysts of mixed NRPS/PKS biosynthetic systems. Biochemical studies of such systems are still hampered by the challenges involved in synthesizing suitable linear precursor compounds. Nevertheless, researchers have begun to explore the substrate tolerances of the epothilone C and cryptophycin terminal TE-domains that mediate macrolactonization of mixed NRPS/PKS-derived chain elongation intermediates (
5,
12). In the former case, the artificial linear substrate was generated from the parent compound epothilone C via hydrolytic ring opening and subsequent conversion of the free acid into the SNAC thioester. Treatment of this thioester with epothilone TE generated a mixture of the 16-membered macrolactone, epothilone C, and the hydrolysis product, seco-epothilone C (
12). In contrast, no conversion of the SNAC thioester to epothilone C was detected in high-performance liquid chromatography assays in the absence of the recombinant TE-domain. Similarly, the isolated TE-domain from the cryptophycin biosynthetic pathway was capable of generating 16-membered depsipeptide rings with high efficiency (
5). While epothilone TE was probed with only one substrate, a monomer-based chemical synthesis approach allowed for the characterization of cryptophycin TE with various SNAC substrates. These studies revealed considerable tolerance for structural variation within the seco-cryptophycin unit C β-alanine residue, whereas a terminal phenyl ring in unit A is essential for efficient cyclization. These investigations are likely to provide access to novel compounds by combining synthetic chemistry and mixed NRPS/PKS metabolic enzymes.
Chemoenzymatic Approaches toward Novel Cyclopeptides
For NRPS cyclases to be generally useful for generating small molecules with different therapeutic potential, broad substrate tolerance is highly desirable. Kohli and coworkers showed that Tyc TE was capable of cyclizing peptide substrates in which up to 7 of 10 cognate residues were simultaneously replaced (
66). Macrolactamization of these linear peptide precursors containing an integrated RGD sequence yielded potent inhibitors of ligand binding by integrin receptors, with cyclization and N-methylation being important contributors to nanomolar potency (Fig.
10). Therefore, the therapeutic activity of the cyclization product was successfully moved from infectious disease (tyrocidine A) to cardiovascular pharmacology. The ability of Tyc TE to tolerate simultaneous side chain alterations was further utilized to mediate cyclization of substrates containing nonpeptidic elements. Incorporation of ε-amino acid building blocks into the peptide backbone led to the formation of cyclic polyketide/tyrocidine hybrids (Fig.
10) (
65), which could be used to further optimize macrocyclic peptide/polyketide natural products, such as the immunosuppressant rapamycin and the anticancer agent epothilone (
31). Furthermore, the insertion of (
E)-alkene-dipeptide isosteres allows the peptide backbone to be modified postsynthetically by chemical metathesis (
40).
To evaluate the potential utility of excised TE domains for generating cyclic peptide libraries, a combinatorial approach was developed by Walsh and coworkers (
69). In a biomimetic synthetic strategy, a solid-phase PEGA [poly(ethylene glycol)acrylamide copolymer] resin functionalized with a synthetic tether substitutes for the ppan cofactor of the PCP (Fig.
10). Subsequent SPPS was used for the preparation of more than 300 linear tyrocidine derivatives. When these solid-support-bound peptides were incubated with the recombinant Tyc TE, the cyclase could productively catalyze peptide release by enzymatic on-resin cyclization. The resulting library of cyclopeptides revealed that replacement of
d-Phe
4 in tyrocidine A (Fig.
1) by a positively charged
d-amino acid led to 30-fold selectivity for bacterial membranes, thereby minimizing the hemolysis of red blood cells. These improved tyrocidine derivatives can now be translated back into an engineered NRPS template for large-scale production via fermentation.
The chemoenzymatic potential of Tyc TE was also used to generate glycosylated cyclopeptides. Using this cyclase, macrocyclized tyrocidine decapeptide analogs with unnatural propargylglycine residues incorporated at positions 3 to 8 were prepared (
74). The peptide backbones containing these alkyne residues allowed subsequent postsynthetic modification to selectively introduce azido-functionalized sugar residues by copper(I)-mediated [3 + 2] cycloaddition reactions, also referred to as “click chemistry” (Fig.
11A). Later, Lin and coworkers developed an alternative method to prepare glycosylated cyclopeptides by incorporating glycosylated amino acids into linear peptides via SPPS followed by enzyme-catalyzed macrolactamization (Fig.
11B) (
73). Numerous O-linked glycosylated peptidolactams were prepared using glycosylated serine or tyrosine residues at positions 5 to 8.
While conventional chemical glycosylation of cyclic peptides suffers from poor regiochemical control and enzymatic glycosylation is limited by the high substrate specificity of glycosyltransferases, these chemoenzymatic strategies combine regioselective incorporation of sugar moieties with the broad tolerance of Tyc TE for side chain replacements. Hence, these approaches allow carbohydrate complexity to be introduced into macrocyclic peptides and should be generalizable to other NRPS cyclases, thereby providing a powerful tool for the production of novel drug leads through the screening of large cyclic peptide libraries.
Using chemoenzymatic peptide cyclization, it should also be feasible to make libraries of lipopeptides. For instance, the approved antibiotic daptomycin (see “DIVERSITY OF NONRIBOSOMAL PEPTIDES”) is a complex lipopeptide that is difficult to approach synthetically. Moreover, chemical modifications of this nonribosomal cyclopeptide have been restricted to the α-amino group of
l-Trp
1 and the δ-amino group of
l-Orn
6 (
25,
49,
114). Interestingly, five of the amino acids in daptomycin's lactone ring are found at the same positions in CDA. In addition, both lipopeptides comprise decapeptide lactone rings. Therefore, the capability of CDA cyclase for the chemoenzymatic generation of daptomycin was investigated. Simultaneous incorporation of six daptomycin-specific residues into the CDA backbone and elongation of the N terminus by two residues yielded a daptomycin derivative which contained
l-Asn at position 2 and
l-Glu at position 12 (
44). As is typical of acidic lipopeptide antibiotics, the bioactivity of this chemoenzymatically assembled daptomycin analog depends on the presence of calcium ions (see “DIVERSITY OF NONRIBOSOMAL PEPTIDES”). To identify calcium-binding sites in the lipotridecapeptide chain, all four acidic residues were successively substituted by either Asn or Gln. Bioactivity studies revealed that only Asp
7 and Asp
9 are essential for antimicrobial potency (
44). According to a recent NMR structure of daptomycin (
101), both residues are part of a type II′ β-turn at the anionic/polar end of the amphipathic molecule. This structural feature is likely to be important for calcium binding and therefore biological activity.
Interestingly, daptomycin also contains two fluorophores: Trp at the N terminus and nonproteinogenic Kyn at the C terminus. Both fluorophores show significant spectral overlap between the donor emission (Trp) and the acceptor absorption (Kyn). Remarkably, CDA TE-mediated peptide cyclization brings Trp and Kyn in sufficiently close proximity to enable efficient fluorescence resonance energy transfer (FRET), providing a tool to track TE-mediated peptide cyclization in real time (
43).
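The steep distance dependence that makes FRET a useful reporter of cyclization follows from the standard Förster relation, E = 1 / (1 + (r/R0)^6), where r is the donor-acceptor distance and R0 is the distance at which transfer is 50% efficient. The Python sketch below evaluates this relation. The numerical values for R0 and for the distances are placeholders chosen for illustration and are not measured values for the Trp/Kyn pair.

```python
def fret_efficiency(distance, forster_radius):
    """Foerster transfer efficiency for a donor-acceptor pair.

    Both arguments must be positive and given in the same length unit.
    """
    if distance <= 0 or forster_radius <= 0:
        raise ValueError("distance and forster_radius must be positive")
    return 1.0 / (1.0 + (distance / forster_radius) ** 6)


# Hypothetical values (in angstroms), for illustration only.
R0_PLACEHOLDER = 25.0
extended_linear = fret_efficiency(40.0, R0_PLACEHOLDER)  # fluorophores far apart
cyclized = fret_efficiency(15.0, R0_PLACEHOLDER)         # fluorophores close together
```

Because of the sixth-power term, a moderate decrease in the donor-acceptor distance on ring closure is expected to produce a large increase in transfer efficiency, which is what allows the reaction to be followed by fluorescence.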
MANIPULATION OF CARRIER PROTEINS BY POSTTRANSLATIONAL MODIFICATION
Chemoenzymatic approaches are not limited to excised TE-domains. Recent developments indicated that carrier proteins (CPs) are ideal targets for chemoenzymatic labeling strategies with highly diverse compounds. Such CPs from NRPS, PKS, and fatty acid synthases are posttranslationally modified at a conserved serine residue with a ppan moiety from CoA. This modification is catalyzed by ppan transferases, such as Sfp from
B. subtilis (Fig.
12A). An interesting feature of ppan transferases is their ability to accept various functionalized CoA derivatives (Fig.
12B). This relaxed specificity has been used to tag CPs with a variety of reporter groups, such as fluorophore- and affinity-labeled CoA (
71,
85). The synthesis of such CoA conjugates is readily achieved via Michael addition once maleimide functionalities are linked to the desired small-molecule reporter group (
71).
The advantage of posttranslational modification of CPs is that this method can be selectively carried out in a complex mixture of cellular proteins. Hence, CPs can be used as peptide tags to direct the specific labeling of a target protein (Fig.
12B). Yin et al. reported the affinity labeling of target proteins that were expressed as artificial fusions to a PCP. These PCP-tagged target proteins were selectively labeled with biotin in the cell lysate followed by rapid immobilization on a streptavidin surface, thereby providing a high-throughput method for protein microarray fabrication and enzymatic screening (
151). In another application, the PCP was N-terminally fused to the phage capsid protein III (
152). Subsequent Sfp-catalyzed PCP modification with CoA-small-molecule conjugates enabled the display of small molecules on phage surfaces. By using this method, phagemid-encoded small molecule libraries could be screened for target binding.
In addition to phage surfaces, specific labeling of CPs with chemically diverse compounds can be achieved on cell surfaces of living cells. Recent publications provide evidence that posttranslational modification of CPs is suitable for fluorescence imaging of membrane proteins (
41,
137,
150). For instance, transferrin receptor 1 (TfR1) was fused to PCP, and the TfR1-PCP fusion protein was posttranslationally labeled with the fluorophore Alexa 488 by Sfp. In the presence of fluorescently labeled transferrin ligand, single-cell FRET measurements provided insights into the trafficking of the transferrin-TfR1-PCP complex. The observations agreed with current models for TfR1-mediated transferrin uptake, thus indicating that the small fused PCP (∼80 aa) did not significantly alter the function of the TfR1 receptor. In a similar approach, it was demonstrated that the a-agglutinin receptor and the G protein-coupled receptor neurokinin-1 could be fused to the
Escherichia coli ACP (
41). Instead of Sfp from
B. subtilis, the
E. coli ppan transferase AcpS was used to achieve specific labeling of these cell surface proteins with fluorophores, affinity probes, and CdSe quantum dots.
Interestingly, AcpS has high substrate specificity and modifies only ACPs, whereas Sfp modifies both PCPs and ACPs. These enzyme properties proved useful for the multicolor imaging of two different CP fusion proteins in one sample (
137).
Saccharomyces cerevisiae Sag1p cell wall protein fused to ACP was first selectively modified with fluorophore-labeled CoA in the presence of AcpS. Finally, Sfp catalyzed the labeling of the remaining PCP-Sag1p fusions with a different CoA-fluorophore conjugate.
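The order of the two labeling steps matters, and the logic can be sketched as a small simulation. The Python example below is a simplified model and not a description of the published protocol: it assumes that each carrier protein accepts exactly one label, that every step goes to completion, and that AcpS accepts only ACPs whereas Sfp accepts both ACPs and PCPs. The class, function, and dye names are hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# Carrier protein types accepted by each ppan transferase (from the text).
ACCEPTED_CARRIERS = {"AcpS": {"ACP"}, "Sfp": {"ACP", "PCP"}}


@dataclass
class Fusion:
    target: str
    carrier: str
    label: Optional[str] = None


def label_step(fusions: List[Fusion], enzyme: str, dye: str) -> None:
    """Label every still-unmodified fusion whose carrier the enzyme accepts."""
    for fusion in fusions:
        if fusion.label is None and fusion.carrier in ACCEPTED_CARRIERS[enzyme]:
            fusion.label = dye


def run_protocol(steps: List[Tuple[str, str]]) -> Dict[str, Optional[str]]:
    """Apply (enzyme, dye) steps in order to an ACP and a PCP fusion of Sag1p."""
    fusions = [Fusion("Sag1p", "ACP"), Fusion("Sag1p", "PCP")]
    for enzyme, dye in steps:
        label_step(fusions, enzyme, dye)
    return {fusion.carrier: fusion.label for fusion in fusions}


selective_first = run_protocol([("AcpS", "dye_1"), ("Sfp", "dye_2")])
promiscuous_first = run_protocol([("Sfp", "dye_1"), ("AcpS", "dye_2")])
```

Under these assumptions, applying the selective enzyme first gives each fusion its own dye, whereas applying Sfp first labels both fusions with the same dye and leaves nothing for AcpS to modify.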
The main drawback of in vivo reporter labeling of CP-tagged proteins is that the cell-impermeability of CoA derivatives limits posttranslational modification to cell surface proteins. In order to label proteins inside of cells, Clarke et al. replaced CoA-small-molecule conjugates with a cell-permeable fluorophore-labeled pantetheine analog (Fig.
13) (
20). After cellular uptake in
E. coli, this reporter-labeled pantetheine was converted to reporter-labeled CoA via a three-step enzymatic sequence involving CoAA, CoAD, and CoAE. CoAA mediates the ATP-dependent phosphorylation of the terminal hydroxyl group of the pantetheine analog. Further processing by CoAD should proceed by adenylation of the generated phosphopantetheine analog to yield a dephospho-CoA derivative. CoAE-catalyzed phosphorylation of the 3′-hydroxyl group finally yields a CoA analog. This metabolic conversion into an active, labeled CoA analog was followed by Sfp-mediated posttranslational modification of coexpressed VibB from
Vibrio cholerae, a natural fusion between a CP and isochorismate lyase. Labeling of VibB was confirmed by fluorescent sodium dodecyl sulfate-polyacrylamide gel electrophoresis of the cell lysate. These results demonstrated for the first time that one could rationally engineer a chemoenzymatic route to covalently label CPs in vivo via metabolic delivery of cell-permeable CoA precursors.
ENZYMES BELONGING TO THE HIGHER EUCARYOTIC NRPS-LIKE FAMILY
Recent results demonstrate that eukaryotes have preserved an amino acid activation mechanism that until now was considered to be specific for bacterial and fungal NRPSs. Computational sequence comparisons led to the conclusion that a specificity-conferring code, similar to that described for traditional nonribosomal A-domains (see “Dissecting the Modules into Domains”), can be recognized in eukaryotic enzymes such as Ebony (
27). This protein from
Drosophila is a three-domain multienzyme, which is involved in histamine neurotransmitter metabolism at the photoreceptor synapse of the eye (
52) (Fig.
16). It presumably functions as a fast histamine reuptake system to ensure excitation of the postsynaptic cell by disinhibition. Indeed, biochemical studies provided evidence that Ebony is capable of coupling β-alanine to biogenic histamine as well as to many other primary amines (
100). In vitro assays of
E. coli-produced wild-type Ebony indicated that the respective A-domain exclusively selects β-alanine and activates it as aminoacyl adenylate. The activated β-alanine is then transferred onto the ppan cofactor of the downstream PCP in an NRPS-related mechanism. Finally, a primary amine, such as histamine, performs a nucleophilic attack onto the thioester of the Ebony-bound β-alanine, thus leading to the release of a peptidoamine. This condensation step might be catalyzed by a C-terminal amine-selecting domain (AS-domain; 230 aa), which does not share homology with any known NRPSs. Furthermore, condensation assays suggested a broad substrate tolerance of this domain for various primary amines. A combination of an ethylamino or hydroxyethylamino group with an aromatic ring system was sufficient for rapid peptidoamine formation.
Similar to the Ebony three-domain NRPS, a multifunctional enzyme from mouse, U26, comprises an A-domain, a PCP, and a pyrroloquinoline quinone-dependent dehydrogenase domain (
58). The latter domain has been proposed to catabolize lysine, which yields α-aminoadipic acid. In conclusion, these recent developments indicate that NRPS architecture has been preserved throughout evolution to higher eukaryotes. However, multimodular NRPS systems have not yet been detected, even though dipeptides such as β-alanyl-histidine have been shown to exist in vertebrates (
38).
REPROGRAMMING OF NRPS ASSEMBLY LINES
Chemoenzymatic approaches were developed to reprogram natural peptide sequences by the combined action of chemical peptide synthesis and subsequent enzyme catalysis (see “Chemoenzymatic Approaches toward Novel Cyclopeptides”). By using the combinatorial method described above (
69), large libraries of macrocyclic peptides can be created with both natural and unnatural amino acid building blocks, which can subsequently be screened for novel or improved bioactivity. Once a target analog is unveiled by this method, the modular organization of NRPSs makes it possible to consider reprogramming of the biosynthetic machinery to generate the target analog by fermentation. Several strategies to rationally redesign NRPS templates are conceivable and will be introduced in this section.
Pioneering work was performed by Stachelhaus et al., who reported the genetic engineering of the terminal module of the surfactin synthetase, which incorporates Leu
7 into the final product (
119). To alter the substrate specificity of this module in vivo, the A
7-domain and the adjacent PCP
7 were exchanged by bacterial and fungal A-PCP didomains with various amino acid specificities. Despite the production of the predicted surfactin derivatives, the productivity of the engineered synthetases was dramatically reduced, which could be explained by the high selectivity of C-domains in the acceptor site for cognate side chains (
7). However, further attempts to obtain other surfactin variants by domain swapping were unsuccessful (
103). This can most probably be ascribed to improper artificial fusion of domains, thus indicating that a more precise definition of the domain borders was necessary. Further biochemical studies, detailed sequence analysis, and structural information led to the identification of linker regions between independently folding NRPS domains (
22,
60,
90,
141). These linker regions are about 15 amino acids in length and usually show only little or no sequence conservation. Their suitability for artificial module fusions was first confirmed by in vitro studies on the tyrocidine NRPS. Dimodular hybrid enzymes were generated by fusion of Pro-activating module 2 with Orn-activating module 9 or Leu-activating module 10 (
90). Furthermore, a TE-domain was fused to the terminal modules to ensure product release. Incubation of the engineered dimodules with the
d-Phe-activating module 1 then yielded the predicted tripeptides
d-Phe-Pro-Orn and
d-Phe-Pro-Leu. Moreover, precise linker surgery significantly improved NRPS engineering efforts in vivo. For example, deletion of the entire Leu-activating module 2 of the surfactin synthetase caused the secretion of the predicted lipopeptidolactone analog with a decreased ring size (
89). The yield of this Δ2-surfactin variant was around 10% of native surfactin A (Fig.
1) produced by the wild-type strain, which represents a major improvement over the initial in vivo studies of NRPS engineering (
119).
The aforementioned NRPS reprogramming efforts have focused on the interaction between modules within NRPS subunits in
cis. To expand the biosynthetic utility of NRPSs, recent research focused on interpolypeptide interactions between modules of partner subunits in
trans. The predictable manipulation of these interactions would provide a great combinatorial potential. Interestingly, protein-protein communication in
trans is predominantly controlled by the interplay of matching pairs of short COM domains (see “BIOSYNTHETIC LOGIC OF NONRIBOSOMAL PEPTIDE SYNTHETASES”). Recent studies of the tyrocidine NRPS subunits TycA, TycB, and TycC convincingly showed that productive interactions between nonpartner NRPS subunits can be enforced by the presence of matching pairs of COM domains (Fig.
17) (
47). Specifically, COM domain swapping enabled cross talk between the
d-Phe-activating donor module TycA and the nonpartner Asn-activating acceptor module TycC1 as well as between the
d-Phe-activating donor module TycB3 and the Pro-activating acceptor module TycB1. Formation of the expected dipeptides
d-Phe-Asn and
d-Phe-Pro-diketopiperazine was verified by mass spectrometry. Remarkably, communication between partner modules was also achieved by COM domain pairs derived from different matching modules, i.e., TycA(B3)/(C1)TycB1 and TycB3(A)/(B1)TycC1 (Fig.
17). Moreover, successful cross talk between TycA and the Leu-activating termination module SrfAC of surfactin synthetase indicated that COM domains even mediate protein-protein communication between different biosynthetic systems. In the future, it will be an important issue to clarify if COM domain swapping is also a suitable tool for the production of novel peptide products in vivo.
The engineering of intra- and interpolypeptide interactions between whole modules represents a rather drastic intervention in NRPS biosynthesis. Genetic manipulation of the nonribosomal code (see “Dissecting the Modules into Domains”) by site-directed mutagenesis of the specificity-conferring residues of A-domains, on the other hand, is a rather small alteration. Studies of CDA and surfactin synthetase have shown that this approach is suitable to produce novel peptide products in vivo (
35,
132). For example, the substrate specificity of the Asp-activating module 7 of CDA was rationally altered by changing two residues within the corresponding A-domain (
132). Only two point mutations resulted in a CDA2a analog (Fig.
2), which contained Asn instead of Asp at position 7. However, fermentation yields of the Asn
7-containing cyclopeptide were reduced, and large amounts of a linear hexapeptide by-product were isolated. Hence, this method does not overcome the limitation imposed by the specificity of the acceptor site of the C-domain, which is located upstream of the manipulated A-domain. Moreover, the substrate specificity of the C-terminal TE-domain could cause additional problems.
Interestingly, engineered biosynthesis of non-ribosomally produced peptides can be performed in vivo without genetic manipulation of the NRPS subunits. Instead, engineering of these natural products is achieved by deleting a gene required for the biosynthesis of an unusual amino acid. Feeding synthetic analogs of this unusual amino acid then results in new peptide analogs by precursor-directed biosynthesis. Using this so-called mutasynthesis approach, it was possible to engineer the biosynthesis of balhimycin (
142) and CDA (
51). In the latter case, a mutant that is blocked in HPG biosynthesis and is therefore unable to produce CDA was generated. By feeding the mutant cells a series of synthetic analogs of HPG, novel CDA peptides containing 4-fluorophenylglycine or phenylglycine in place of HPG were synthesized. However, substitutions are limited to derivatives of HPG which contain para-substituents that are no larger than the hydroxyl group (
51). Presumably, the substrate specificity of the HPG-activating A-domain imposes severe restrictions on the size of this para-substituent. Furthermore, in contrast to the approaches described above, mutasynthesis is limited to substitutions of nonproteinogenic amino acids, which originate from secondary metabolism. Knowledge of the genes involved in the synthesis of these unusual amino acids is also crucial for targeted mutation and subsequent precursor-directed biosynthesis.
CLOSING REMARKS
Non-ribosomally produced peptides exhibit antibacterial, antiviral, immunosuppressive, and antitumor properties. This broad spectrum of biological activities is reflected in the vast structural diversity of these natural products, which includes features such as d-configured residues, oxidation, methylation, halogenation, lipidation, heterocyclization, and macrocyclization. The latter structural feature can be chemoenzymatically generated by excised TE-domains. Future research will show whether these enzymes are well suited for the identification of drug leads via combinatorial synthesis of cyclopeptides. Furthermore, alteration of the substrate specificity of TE-domains by directed protein evolution should increase the utility of these macrocyclization catalysts. In contrast to the case with TE-domains, little is known about the chemoenzymatic potential of tailoring enzymes, which significantly contribute to the structural diversity and rigidity of nonribosomal peptides. It remains to be seen whether these enzymes exhibit a high substrate tolerance in vitro for their dedicated chemical transformations. In addition to single-domain catalysis, reprogramming of entire NRPS assembly lines and mutasynthesis have also proven valuable for the generation of novel peptide products.