CooA Activity Assays
The role of CooA in the cell is to activate the expression of 11 additional genes whose products are involved in CO oxidation when CO is present under reducing conditions. It performs this activation by binding to palindromic sites on the DNA that are at the 5′ ends of operons whose products oxidize CO. Immediately adjacent to the CooA-binding sites are weak promoters, and, because of protein-protein interactions between the bound CooA and RNA polymerase, the affinity of the polymerase for those promoters is increased and transcription initiation can take place (see below).
The sequences bound by CooA are reminiscent of those bound by CRP and FNR, which is consistent with the relatively high similarity among the F helices of these proteins (see below), which make the specific base contacts. For proper biological regulation of expression, the affinity of the binding site for the activator should be such that there is very little occupancy of the site in the absence of the effector but very high occupancy in the presence of the effector.
CooA has been routinely examined for its activity in vivo, using reporter systems that measure the ability of CooA to bind a specific DNA sequence and then properly interact with RNA polymerase to activate transcription and produce a product that can be assayed. When analyzing CooA activity in
R. rubrum, the assay has typically been the activity of the CODH itself, but this is somewhat indirect, since that activity is a reflection of not only gene expression but also CODH maturation (
73,
138). There are several technical advantages in using an
E. coli reporter strain with
lacZ fused to one of the two normal CooA-responsive promoters from
R. rubrum. When
cooA is expressed from a plasmid, this strain expresses very low β-galactosidase activity unless the cells are anaerobic and exposed to CO. Such an assay shows a linear response over only a limited range of CooA activity, because maximal β-galactosidase activity requires only that the CooA-binding site upstream of
lacZ be saturated (
71). Significant differences in the fraction of the CooA population in the active form can therefore be missed unless total CooA levels are tuned through regulation of the
cooA promoter.
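The saturation problem described above can be illustrated with a toy calculation. The sketch below is not the published assay analysis: it assumes simple 1:1 binding of active CooA to a single site and a reporter signal proportional to site occupancy, and every number (the dissociation constant, expression levels, and active fractions) is an arbitrary illustrative value.

```python
# Toy model of reporter saturation. All numbers are arbitrary illustrative
# units, not measured values.

def site_occupancy(active_cooa, kd):
    """Fractional occupancy of one binding site under simple 1:1 binding."""
    return active_cooa / (kd + active_cooa)


def reporter_signal(total_cooa, active_fraction, kd=1.0, max_signal=100.0):
    """Reporter output, assumed proportional to binding-site occupancy."""
    return max_signal * site_occupancy(total_cooa * active_fraction, kd)


# Two hypothetical variants: 20% versus 80% of the population active.
low_a = reporter_signal(total_cooa=1.0, active_fraction=0.2)
low_b = reporter_signal(total_cooa=1.0, active_fraction=0.8)
high_a = reporter_signal(total_cooa=1000.0, active_fraction=0.2)
high_b = reporter_signal(total_cooa=1000.0, active_fraction=0.8)

print(f"low expression:  {low_a:.1f} vs {low_b:.1f}")
print(f"high expression: {high_a:.1f} vs {high_b:.1f}")
```

Under these assumptions, the two variants differ more than twofold in signal at low total CooA but are nearly indistinguishable when CooA is strongly overexpressed, which is why tuning expression matters for this kind of assay.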
While more time-consuming, an in vitro assay of DNA binding by CooA has significant advantages. The most readily interpretable assay involves fluorescence anisotropy, in which a fluorescently tagged DNA fragment containing the CooA binding site is incubated with purified CooA (
125). This assay can provide a
Kd for any tested variant in the presence and absence of effector. It measures only DNA affinity, rather than the complex combination of DNA affinity and affinity for RNA polymerase that is measured in vivo. Other assays such as gel shifts (D. Shelver and G. P. Roberts, unpublished data), footprinting (
117), and in vitro transcription assays (
59) have also been successfully applied to CooA.
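As a rough illustration of how a Kd can be extracted from anisotropy data, the sketch below fits a single-site binding isotherm by a simple grid search. It assumes that protein is in large excess over the labeled DNA (so free protein is approximately total protein) and that the anisotropy of free and bound DNA is known. The function names, the anisotropy limits, and the data are hypothetical; the data are synthetic, generated from the model itself, not measurements on CooA.

```python
# Single-site binding isotherm for a fluorescence anisotropy titration.
# Assumption: protein is in excess over labeled DNA, so free ~ total protein.

def anisotropy(protein_nm, kd_nm, r_free=0.10, r_bound=0.20):
    """Observed anisotropy as a weighted average of free and bound DNA."""
    fraction_bound = protein_nm / (kd_nm + protein_nm)
    return r_free + (r_bound - r_free) * fraction_bound


def fit_kd(concs, observed, candidates, r_free=0.10, r_bound=0.20):
    """Return the candidate Kd with the smallest sum of squared errors."""
    def sse(kd):
        return sum(
            (anisotropy(c, kd, r_free, r_bound) - r) ** 2
            for c, r in zip(concs, observed)
        )
    return min(candidates, key=sse)


# Synthetic titration generated with a hypothetical Kd of 25 nM.
concs = [1.0, 3.0, 10.0, 30.0, 100.0, 300.0, 1000.0]
synthetic = [anisotropy(c, 25.0) for c in concs]

candidates = [float(k) for k in range(1, 201)]  # 1..200 nM in 1 nM steps
best_kd = fit_kd(concs, synthetic, candidates)
print(f"best-fit Kd: {best_kd} nM")
```

With real data one would normally use a nonlinear least-squares routine and, when protein is not in excess, the full quadratic binding equation. The grid search is used here only to keep the example dependency-free.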
Insights into the Activation Mechanism from a Structural Comparison of CooA and CRP
To understand the mechanism by which CO activates CooA, and in particular the basis for the specificity of CooA for CO, it is necessary to know the structure of both the inactive (CO-free) and active (CO-bound) forms of the protein. To this point, however, only the reduced, effector-free form of CooA has been solved structurally (Fig.
3B) (
81). The most striking initial feature of the CooA structure is its asymmetry, which is presumably an artifact of crystal packing. Because the “folded-down” monomer (form A in Fig.
3B) makes other contacts in the crystal lattice, it has been proposed that form B might be more representative of CooA in solution (
81). Although the structure of the active form of CooA is unknown, the analysis of a homolog, CRP, has been highly informative. CRP is an extremely well-studied transcription factor of
E. coli that activates the expression of genes encoding the utilization of poor carbon sources in response to cAMP binding. The results of several structures of active (cAMP-bound) CRP have been published (
98,
100,
139). As depicted in Fig.
3A, the protein is a dimer, with each monomer consisting of two domains: an effector-binding domain linked by a hinge region to a DNA-binding domain. The structure of effector-free CRP has never been solved, although there is ample evidence that a significant conformational change occurs on cAMP binding (
57,
79). The nature of the inactive form would be of significance for a variety of reasons, but primarily because it is impossible to understand the mechanism of protein activation in response to effector binding unless both the active and inactive states of the protein are known. Until recently, there were two lines of evidence that made specific predictions about the nature of the conformational change. The first was small-angle X-ray scattering analysis on CRP in the presence and absence of cAMP. Unfortunately, the results in the presence of cAMP were not interpretable because of aggregation, but the results in the absence of cAMP matched a model of a prolate ellipsoid with an axial ratio of 1:2 (
80). Because the crystal structure of the active form of CRP is roughly spheroidal, this implies a significant conformational extension of some portion of the protein without effector. The other frequently cited analysis involved determination of the Stokes radius by analytical gel chromatography (
62). This method is somewhat indirect, however, since the results reflect not only the shape of the protein but also changes in solvent interaction, and these two factors are difficult to deconvolute. It is also a concern that the published analysis showed no change in global structure until cAMP levels exceeded 100 μM, well above those at which the protein should have bound two cAMP molecules. More recently, a nuclear magnetic resonance spectroscopy (NMR) analysis of inactive CRP suggested that the F helices are not solvent exposed, in contrast to the surface position of these DNA-interacting regions in the structure of active CRP (
98,
100,
139). This is consistent with a very substantial conformational change on effector binding, although the exact nature of this change has been unclear (
141).
A comparison of the structure of effector-bound CRP with that of effector-free CooA (Fig.
3) reveals differences that might be attributable to either the activation process or inherent differences in the proteins themselves. It is therefore difficult to draw conclusions about the former process. It is also unclear to what extent the activation processes in the two proteins are mechanistically similar. Nevertheless, it is worth considering the most central differences in the two structures: structure and positioning of the DNA-binding domains, structural differences within each effector-binding domain, and positioning of the two effector-binding domains with respect to each other.
Comparison of the DNA-binding domains of CRP and CooA shows that the individual domain structures are remarkably similar to each other (
81), at least for regions that are resolved in each structure. However, it is the positioning of these domains with respect to each other and to the effector-binding domains that is very different in CooA and CRP. Irrespective of which of the two forms of inactive CooA is more similar to the solution form of the protein, they have important similarities: compared to the structure of active CRP, both forms are relatively elongated and have a dramatic repositioning of the DNA-binding domains, such that the F helices are actually turned away from the solvent. This latter point is consistent with the NMR analysis of inactive CRP noted above (
141). These results lend credence to the hypothesis that CRP and CooA might undergo a roughly similar conformational change during activation. Effector binding must in some way signal a significant change in orientation of the DNA-binding domains in each protein. The signal transduction pathway that effects this change is of central importance to understanding these proteins.
Within the effector-binding domains, the obvious difference is the heme of CooA. The next most obvious difference between CRP and CooA is in the position of the 4/5 loop within each structure; the 4/5 loop refers to a pair of β strands that extend from the effector-binding domain in each protein toward the DNA-binding domain. While this change in position cannot be definitively attributed to the activation process, it is a very reasonable hypothesis, since the position of that loop in each protein predicts different contacts with the DNA-binding domains, presumably stabilizing each structure. Because the repositioning of the DNA-binding domains is certainly relevant to the activation process, it follows that surfaces within the effector-binding domain that contact the DNA-binding domain in either the active or inactive form of the proteins might also be repositioned after effector binding, a notion that is also addressed below where the protein surfaces that interact with RNA polymerase are discussed. It therefore seems highly likely that there are specific conformational changes within the effector-binding domains on effector binding. However, the actual nature of these structural changes remains poorly understood, except for those in the immediate vicinity of the cAMP in CRP and the heme vicinity in CooA.
The repositioning of the two effector-binding domains with respect to each other on effector binding is clearly central to the process of activation. This notion was first demonstrated by the Poulos group, who solved the effector-free CooA structure and compared it to that of effector-bound CRP (
81). They noted a modest change in the relative position of the two long α helices, termed the C helices, that lie at the dimer interface of the two proteins. They suggested that this C-helix repositioning might serve as a signal transduction pathway between the heme region of CooA and the DNA-binding domains. As detailed below, this hypothesis has been strongly supported by direct mutational analysis of CooA. A similar notion was also proposed for CRP, where cAMP binding immediately adjacent to these helices might cause their repositioning (
100).
In conclusion, a structural comparison of inactive CooA with active CRP shows that there is a substantial change in the position of the DNA-binding domains of CooA after CO binding and that repositioning about the C helices is a likely factor in that response. The data are also consistent with the notion that CRP and CooA undergo similar conformational changes on effector binding, but this speculation requires substantially more experimental testing. Understanding the basis of CO recognition by CooA therefore requires an analysis of why CO leads to such a repositioning whereas other small molecules do not. This is addressed below.
Population Dynamics of the CooA Response to CO
To understand the response of CooA to CO, we must understand not only the structures of the predominant protein species in the presence and absence of CO but also the population distribution and dynamics among those different species. For example, the notion that CooA and CRP exist in completely inactive forms in the absence of effectors and exist in completely active forms in their presence is clearly simplistic. Rather, the active and inactive forms of the proteins must exist in a dynamic equilibrium, and the degree of homogeneity of either the active or inactive forms of either protein is unclear. For CooA, at least, it is our working hypothesis that its CO-bound form is inherently heterogeneous. The best evidence for this is that we have created CooA variants with substitutions based on the structure of active CRP that have severalfold-higher affinity for DNA in the presence of CO than does wild-type CooA under the same conditions. These substitutions do not lie near the DNA-contacting surfaces and are best explained by their creating a shift in the equilibrium such that a larger fraction of the CooA population is truly active. Another line of evidence, discussed below, is from kinetic analysis, which has shown that there are two different states of the heme in a population of CO-bound wild-type CooA, as revealed by substantially different CO off-rates (
103). Lastly, the failure to obtain crystals of the CO-bound CooA is consistent with the hypothesis of the heterogeneity of this form. Such a notion is easily rationalized on the basis of the physiological role of CooA. There is no detectable expression of the
coo operons of
R. rubrum in the absence of CO, suggesting that there is very little active CooA under these conditions. In the presence of CO, however, only two CooA-binding sites need to be saturated, and so it is reasonable to suppose that only a fraction of the CO-bound CooA need be in the form competent to bind DNA in order to achieve optimal gene expression. Because of the ease with which the structure of active CRP was solved, one assumes that CRP is homogeneous under these conditions. In any event, the impact of the presence of effector on the equilibrium of these two proteins between their active and inactive forms is biologically important yet poorly understood.
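The equilibrium-shift argument can be made concrete with a minimal two-state sketch. It assumes that CO-bound CooA interconverts rapidly between a DNA-binding-competent and an incompetent form, and that only the competent form binds DNA, so that the apparent Kd equals the intrinsic Kd divided by the competent fraction. The equilibrium constants and the intrinsic Kd below are hypothetical values chosen only for illustration.

```python
# Two-state population model for CO-bound CooA (illustrative assumptions only).

def fraction_active(k_eq):
    """Fraction in the active form, with k_eq = [active] / [inactive]."""
    return k_eq / (1.0 + k_eq)


def apparent_kd(intrinsic_kd, k_eq):
    """Apparent DNA Kd when only the active form binds DNA."""
    return intrinsic_kd / fraction_active(k_eq)


# Hypothetical wild type: 25% of the CO-bound population is active.
wild_type = apparent_kd(intrinsic_kd=10.0, k_eq=1.0 / 3.0)
# Hypothetical variant: substitutions shift the equilibrium to 75% active.
variant = apparent_kd(intrinsic_kd=10.0, k_eq=3.0)

print(f"wild-type apparent Kd: {wild_type:.1f}")
print(f"variant apparent Kd:   {variant:.1f}")
print(f"fold improvement:      {wild_type / variant:.1f}")
```

In this sketch a substitution far from the DNA-contacting surface produces a threefold gain in apparent affinity without any change in the intrinsic protein-DNA contacts, which is the kind of effect the variants described above are proposed to show.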
Another complication involving an equilibrium between different protein forms probably exists with inactive CooA. In both forms of inactive CooA in the known X-ray structure, the F helices are buried from the solvent. However, one would expect that each of these regions would have some low affinity for a variety of DNA sequences. One might then predict that the very high concentration of DNA found in the cell might perturb the CooA structure by interaction with one or both of the F helices in a nonspecific way. If it is true that the actual structure of inactive CooA in the cell has a different arrangement of the DNA-binding domains from that depicted in the known structure, we will have difficulty in understanding which interactions stabilize and destabilize that structure. This would also be relevant to the analysis of the response of CooA to CO, since it would affect the actual nature of the CooA population that senses CO and therefore the pathway of CO activation.
Heme Vicinity: Structure and Implications
The residues in the vicinity of the heme are the features that provide the distinctive properties to different heme-containing proteins. For CooA, these residues are the basis for the specificity in CO binding and response. The heme vicinity of CooA (Fig.
3C) lacks the common heme-binding motifs found in the PAS-domain proteins or the globins (
38,
122) but instead is somewhat similar to the effector domain of CRP. The critical residues governing this motif are highly conserved among CooA homologs, implying that there certainly are similarities among all of the CooA homologs in their heme-binding motif. The distal side of the CooA heme, where CO binds, consists of a pair of parallel α helices (the C helices), while the proximal side assumes a β-sheet structure. One of the most important features of the CooA heme region is the presence of two endogenous protein ligands to the heme in inactive CooA. The incoming CO molecule must therefore replace one of the ligands in order to trigger the conformational change leading to activation of CooA (operationally, the incoming ligand binds to a five-coordinate heme that is created by the transient deligation of an endogenous ligand). This requirement in CooA for displacement of an endogenous ligand is unusual among heme proteins and helps explain the specificity of CooA for CO. Most small molecules are not sufficiently strong heme ligands to displace these endogenous ligands, while NO displaces both protein ligands and creates a five-coordinate heme. This NO adduct is not active, consistent with the notion (explained below) that tethering of the heme by the endogenous His77 ligand is critical for CooA activation. Oxygen oxidizes the heme, and that form of CooA is also inactive. Part of the specificity of CooA for CO therefore relies on the fact that only CO has the appropriate ligand strength to displace one, but not both, of the endogenous heme ligands.
His77 serves as one heme ligand in reduced CooA and is critical for the response of CooA to CO, because substitution with any other residue at that position destroys the CO-dependent response of the protein (116; M. Conrad, H. Youn, and G. P. Roberts, unpublished data). His77 is important for two reasons. First, the His ligation is at a critical poise of ligand strength. It must be sufficiently strong to avoid displacement by CO or other small molecule ligands, since such binding to the “wrong side” of the heme does not allow proper C-helix repositioning. However, it must also not be so strong that a six-coordinate NO adduct is formed, since such a species might well be active. The second important property of His77 is that it serves as the tether to the CO-bound heme. Its precise size and positioning are therefore important for the precise positioning of the CO-bound heme with respect to the C helices, and this last interaction is important for activation of CooA. The ligand strength and positioning of His77 must also be relevant to the proper redox-mediated ligand switch between Cys75 and His77 that is described immediately below. It is therefore not surprising that all CooA homologs have a His residue at the homologous position of His77 (Fig.
4).
On the His77 (proximal) side of the heme are two other residues of importance to CO sensing. The first is Cys75, a heme ligand in Fe(III) CooA that is displaced by His77 on reduction (
6,
107,
116). While Cys75 is not a ligand in the active form of CooA, mutational studies of
R. rubrum CooA have nevertheless shown that only Cys and Ser allow high CooA activity in vivo, while Ala allows some CooA function. The similar size of these residues suggests that larger residues might either perturb heme insertion and stabilization or interfere with proper positioning of the heme on CO binding. The sequences of the CooA homologs are consistent with this, since only Ser and Cys are found at that position. In
R. rubrum CooA, the role of Cys75 as the ligand in the oxidized form predicts that its presence might be critical for stabilization of that form and that Ser, which cannot serve as a heme ligand, would not suffice. A C75S variant of CooA does show some heme stability problems when oxidized, but, surprisingly, there is a substantial population of six-coordinate heme in the oxidized form of this variant, implying that another protein residue is serving as an adventitious ligand (
116). The identity of this residue is unknown. The role of Cys75 in oxidation-reduction is addressed in the following section.
Asn42 is the other proximal-side residue (in addition to His77) that is directly perturbed when CO binds to the heme. This residue makes H-bonding contacts with His77 in the reduced form but not in the CO-bound form (
24,
81). Since His77 is tethered to the heme in both forms, this structural change of His77 suggests a repositioning of His77 with respect to Asn42 in response to CO. CooA variants with substitutions at this position are somewhat perturbed in their ability to be activated by CO, but the precise basis for this is not clear (
24). It is also interesting that the adjacent residue is Glu41, which has an effect on CooA-RNA polymerase interactions (
82). This suggests that CO binding might have a direct effect on this interaction as well, as discussed below.
The other ligand in the reduced form of CooA, Pro2, is the N terminus of the other protein monomer (
81). Proline had not previously been detected as a heme ligand in any protein because it is sterically incapable of playing that role except when it happens to be at the N terminus. The presence of such a novel ligand immediately suggested that it might be critical for the proper activation of CooA, but mutational analysis has disproved that (
125) and has shown that a variety of substitutions at this position provide substantial CooA activity. This view is supported by the observation that none of the other CooA homologs appears to have a proline positioned to serve as a ligand (Fig.
4). However, while Pro2 is not critical for the function of CooA in
R. rubrum, it does appear to be optimal.
Another residue is important in stabilizing Pro2 as a ligand, and that is Arg4, which appears to interact with a propionate of the heme (
81). Removal of this residue by deletion results in a detectable population of five-coordinate heme in both the oxidized and reduced forms of CooA (
125).
When CO binds to the heme of CooA, it replaces one of the two protein ligands, but the identity of the displaced ligand was unknown for some time. In part this reflected the fact that there was no spectroscopic data set for the novel Pro ligand to serve as a control for the CO-bound form of CooA. The issue was resolved by the application of NMR by the Aono group, which showed that CO replaces Pro2 (
144). A resonance Raman analysis has indicated that the displaced Pro2 is not in the immediate vicinity of the bound CO (
24). This result is consistent with the observation that alteration of Pro2 in CooA does not dramatically impair the ability of the protein to achieve the active conformation (
125). However, there appear to be three auxiliary roles for Pro2 in
R. rubrum CooA function. The first is that its ligation to the heme helps keep the protein in the inactive form until CO binds. In an otherwise wild-type background, alteration of Pro2 does not yield a substantial increase in CO-independent activity, which would be the expected result if Pro2 ligation were critical for this role (
125). However, an involvement of Pro2 in this process is revealed in backgrounds with other substitutions that enhance CO-independent activity and in which the replacement of Pro2 is synergistic for this response (
71). We assume that the modest effect seen in an otherwise wild-type background is because of the presence of other unidentified protein ligands that can adequately maintain the inactive form. The second role of Pro2 is that it provides a heme ligation that is weak enough to be displaced by CO yet strong enough to resist displacement by weaker small-molecule ligands. It is not clear if the residues that replace Pro2 in variants of
R. rubrum CooA or in the CooA homologs have similar properties, because binding of other small molecules has not been examined with these proteins. Finally, Pro2 and its adjacent N-terminal residues must be flexible enough to remain ligated to the heme through the oxidation-reduction process. As explained in the following section, this process involves a significant movement of the heme relative to the protein, requiring ligand flexibility.
In the CO-bound form of CooA, the residues presumed to be near the heme-bound CO are all from the two C helices at the dimer interface. These include Leu112, Ile113, Leu116, Gly117, and Leu120 (
146). The evidence for the roles of these residues is presented below, but some general comments are appropriate here. It is important to recognize that the structure of the CO-bound form of CooA is unknown, and so the exact position of the CO-bound heme with respect to amino acid residues is not clear. Despite these uncertainties, a number of CooA variants altered at positions 113, 116, and 117 show perturbations in the C-O and Fe-C stretching frequencies as determined by resonance Raman spectroscopy (
24). This suggests that the bound CO is located near these residues. As detailed below, some of these residues are critical for the activation of CooA in response to CO. Other substitutions in this region create CooA variants that respond effectively to imidazole as an effector (
146; H. Youn, R. L. Kerby, and G. P. Roberts, unpublished data), consistent with a role of this region in interacting with the small molecule bound to the heme. In the CooA homologs, Leu116, Gly117, and Leu120 are all strictly conserved while positions 112 and 113 have conservative substitutions (Fig.
4). These results are consistent with the hypothesis that it is the interaction of this portion of the protein with the CO-bound heme that leads to the repositioning of the C helices in the normal activation process.
In summary, there are two obvious local changes in the vicinity of the CooA heme in response to CO: displacement of the distal Pro2 ligand, which allows repositioning of the CO-bound heme with respect to the C-helix residues, and breakage of the His77-Asn42 H-bond. This combination of features in the unique CooA heme-binding motif ensures that only CO can trigger the structural rearrangement necessary for activation.
The Oxidation-Reduction Mechanism in CooA and Its Implications
It makes sense that the facultative aerobe
R. rubrum would avoid expressing the
coo operons under aerobic conditions, because the CODH is itself oxygen labile. However, the situation is actually more interesting than that. The NiFe metal center of the CODH is catalytically active only at reduction potentials below −300 mV (
60), which also has been reported to be the midpoint potential of the transition of CooA from the oxidized to the reduced form (
94). CooA of
R. rubrum solves the physiological problem of avoiding
coo expression under oxidizing conditions in the following way. The heme of CooA can bind CO only when reduced, and so the oxidation of the heme, either by O2 or by other oxidants, prevents CooA activation even in the presence of CO. Mutational and spectroscopic analyses have shown that there is a highly unusual ligand switch after the oxidation and reduction of CooA: oxidized CooA has Cys75 as ligand, but this is displaced by His77 on reduction (
6,
116). It seems likely that this ligand switch sets the specific poise for heme reduction, but this assumption has not been tested experimentally. The structure of the oxidized form of CooA remains unknown, but examination of the structure of reduced CooA makes it clear that this ligand switch involves a significant movement of the heme with respect to the protein backbone (
111) (Fig.
5). Indeed, this observation was one of the facts that made it clear that there was substantial flexibility in the heme position, a notion that has been expanded in our current hypothesis of heme repositioning as an essential feature of CO activation.
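To give a sense of what a midpoint potential near −300 mV implies, the sketch below applies the ideal Nernst equation to estimate the fraction of reduced heme at a given ambient potential. This is an idealization rather than a description of measured CooA behavior: it assumes a single reversible one-electron Fe(III)/Fe(II) couple (n = 1), a temperature of 298.15 K, and full equilibration. The function name and default arguments are illustrative.

```python
import math

R = 8.314462618   # gas constant, J mol^-1 K^-1
F = 96485.33212   # Faraday constant, C mol^-1


def fraction_reduced(e_mv, em_mv=-300.0, n=1, temp_k=298.15):
    """Fraction of heme in the reduced state for an ideal Nernstian couple.

    e_mv   ambient potential in mV
    em_mv  midpoint potential in mV (about -300 mV is reported for CooA)
    n      electrons transferred (assumed to be 1 here)
    """
    # [ox]/[red] = exp(n F (E - Em) / (R T)), with potentials in volts.
    ratio_ox_to_red = math.exp(n * F * (e_mv - em_mv) / 1000.0 / (R * temp_k))
    return 1.0 / (1.0 + ratio_ox_to_red)


for potential in (-400.0, -300.0, -200.0):
    print(f"{potential:7.1f} mV -> {fraction_reduced(potential):.3f} reduced")
```

Under these assumptions the population switches from mostly reduced to mostly oxidized over a window of roughly 200 mV centered on the midpoint, so CooA would be competent to bind CO only at potentials similar to those at which the CODH is catalytically active.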
Rather interestingly, all the CooA homologs have Ser at the position homologous to Cys75 of
R. rubrum CooA except
C. hydrogenoformans 2340 CooA, which also has Cys. The particular CooA homologs that have been partially purified and studied in vitro all have the ability to be oxidized. However, even in
C. hydrogenoformans 2340 CooA, it appears that the ligation structures of oxidized and reduced forms, as well as the redox poise of the transition, are somewhat different from those seen with
R. rubrum CooA (
147). This result implies that while Cys75 is important for the precise nature of the ligand switch seen in
R. rubrum CooA, it is not sufficient, and that other residues in the heme vicinity are important for this property.
A number of interesting and biologically significant questions concerning the redox switch in CooA remain unanswered. One is obviously the exact nature of the conformational change that occurs within the effector-binding domain to not only allow this switch but also stabilize both forms of the protein. At present, we have relatively little insight into the homogeneity of either of these species. Indeed, the original analysis of CooA redox properties showed a curious hysteresis such that the curves obtained for oxidation were distinct from those obtained for reduction. This behavior was rationalized by a very slow interconversion between the two forms (
94), yet a different analysis by the same group revealed that the conversion occurred in the millisecond range (
93). The basis for this discrepancy is unknown. A second question concerns the identity of the ligand
trans to Cys75 in oxidized
R. rubrum CooA. Indirect evidence also suggests that Pro2 serves as the ligand
trans to Cys75 in the oxidized form of CooA (
125,
145). Finally, a more biologically interesting question concerns the actual chemical entities sensed by CooA for this redox transition, though O2 can certainly suffice. Presumably it is some pool of small molecules such as NAD and NADH, but the identity of that small molecule remains unknown.
Basis of the CO Specificity of R. rubrum CooA
There appear to be two distinct aspects of the remarkable CO specificity of CooA: only CO can displace the appropriate heme ligand to form a six-coordinate form, and only CO, when bound to the heme, can stimulate the proper conformational change to activate the protein. The following discussion explains the basis for each of these properties and their role in CooA activation.
We have already explained that the strength of the endogenous protein heme ligands can explain the remarkable CO specificity of
R. rubrum CooA for its activation. The simple hypothesis was that only CO could form a six-coordinate species by displacing the Pro2 ligand and that this form might therefore be both necessary and sufficient for activation. The obvious prediction was that perturbation of the Pro2 ligation could weaken that bond and allow other small molecules to bind the heme on the proper side. This happens to be true, based on the following analysis of the ΔP3R4 variant of CooA, in which the codons for the third and fourth residues have been deleted. This alteration eliminates the Arg4 residue that stabilizes Pro2 ligation to the heme, producing a small but significant population of five-coordinate heme in the reduced form. Not surprisingly, this variant is able to bind CN− and imidazole very efficiently, but binding of these molecules does not activate the protein to a detectable extent (
146). This result disproves the simple hypothesis above and indicates that there is another level of discrimination for CO. What might be the basis for this discrimination, especially against CN−, which is so similar to CO in size?
It is clear that the bound CO exists in a very confined pocket in
R. rubrum CooA, because rebinding of CO after its removal by photolysis is unusually rapid and efficient (
5,
112,
130). Because the structure of the active form has not been solved, the identity of this pocket is unknown, but it is apparently not formed by the N terminus, as evidenced by the resonance Raman results cited above. It is therefore presumed that the pocket must be formed by the only other residues in the heme vicinity, which are those on the C helices of both protein monomers. The nature of this pocket is of interest for two biological reasons that are explained further below. First, the interaction of the CO-bound heme with the C helices is almost certainly a critical step in signal transduction within CooA since it causes the C-helix repositioning necessary for activation. Second, as described immediately below, the nature of the interactions in this heme pocket must certainly play an important role in the specificity of the CO response.
Under the hypothesis that this CO specificity results from a precise interaction between the CO-bound heme and the C-helix residues, a number of these have been analyzed by randomizing the codons singly or in small groups and then screening for variants that responded to CO. The expectation was that certain positions should be critical for a response to CO. The presumption that these residues were in the general vicinity of the bound CO was supported by the observation that certain substitutions at positions 113, 116, and 117 (Fig.
3C) perturbed the CO stretching frequency in resonance Raman analysis (
24). In fact, only Gly117 was absolutely required for a CO response, although position 120 was also fairly stringent, tolerating only Ile in place of Leu120 (
146,
147; R. L. Kerby and G. P. Roberts, unpublished data). While it is tempting to suppose that these might be the residues that determine CO specificity, this hypothesis has been weakened by the results presented below for CooA variants that respond to imidazole.
In a similar analysis, a variety of hydrophobic residues were found to be acceptable at positions 112, 113, and 116. However, the analysis provided the important observation that variants with hydrophilic residues at these positions accumulated less heme-containing CooA and were also unable to respond to CO (
24,
146). The first effect is consistent with the idea that a hydrophobic pocket is typically found around hemes and presumably serves to maintain the heme in the protein. The second result suggested the following hypothesis to explain why CO binding to the heme might lead to C-helix repositioning. CO binding to the heme displaces Pro2 and its attached N terminus, which apparently moves away from the heme. This then exposes the largely hydrophobic surfaces of the C helices to an aqueous environment. The repositioning of the C helices that results in activation might then be the result of an effort to reduce the solvent exposure. Alternatively, hydrophilic residues at these positions might interfere directly with the proper C-helix positioning or indirectly by affecting heme positioning. The result of the above analysis was to suggest that Gly117 and Leu120 might make critical contacts with the bound CO, whereas the other residues were less likely to do so because a variety of hydrophobic residues at those positions allowed a fairly normal response to CO (
146).
Concurrent with the analysis of the C-helix requirements for a proper response to CO, we analyzed the same region of CooA for its ability to allow activation by imidazole. Recall that the ΔP3R4 CooA variant is able to bind CN− and imidazole but is not activated by them. Under the assumption that the additional level of ligand specificity probably was due to residues in the vicinity of the bound ligand, we therefore started with ΔP3R4 CooA, randomized various C-helix residues, and screened for activation in response to imidazole in vivo. Randomization of positions 117 and 120 yielded no imidazole-responsive variants, but the simultaneous randomization of positions 113 and 116 did (Youn et al., unpublished). A variety of combinations of residues at these positions supported this phenotype, but the striking commonality was the presence of a Trp residue at one of the two positions. The basis for this is unknown, and it is clear that other aromatic residues are much less effective. The majority of the imidazole-responsive variants continued to be activated by CO as well, but some, such as ΔP3R4 Trp113 Trp116 and ΔP3R4 Arg113 Trp116, were substantially more active in response to imidazole than in response to CO. This result shows that these positions are critical for the imidazole response, presumably by some interaction with the bound imidazole itself, although more complicated mechanisms cannot be ruled out.
One imidazole-responsive CooA variant (ΔP3R4 Trp113 Trp116 CooA) was then further analyzed for the importance of Gly117 and Leu120. The rationale was that if either of these residues provided a precise contact with the bound CO or, in a related way, served as the basis for CO specificity, then the requirements at these positions would be very different for a response to imidazole. In each case, only the wild-type amino acid residues were acceptable at these positions. While this does not disprove the notion that these residues make specific contacts with the heme-bound CO in wild-type CooA, it is much simpler to imagine that there are similarities in imidazole and CO responsiveness and that these residues are both critical in that shared pathway. The nature of the shared pathway would probably be the C-helix repositioning described below.
While it is therefore obvious that there is another level of CO specificity in CooA, the molecular basis for it remains unclear. Our current hypothesis is that the CO-bound heme must move to a hydrophobic region along the C helices and that this movement is precluded by ligands other than CO. Imidazole is both bulky and hydrophilic, so that its movement into such a pocket is prevented, while the charge on CN− would also prevent its presence in a hydrophobic pocket. However, if imidazole is too bulky for normal activation, what is the basis of the imidazole-responsive variants that have been detected? Obviously the exact nature of the active forms of these variants is unclear, but our working hypothesis is that their precise mechanism of activation is different from that of wild-type CooA in response to CO. In other words, we imagine that the imidazole-bound heme interacts in a different way with the modified C-helix residues from the way in which CO interacts with the normal residues but that these different interactions both have the common result of C-helix repositioning. We then imagine that residues 117 and 120 are involved in that shared pathway. This result with the imidazole responders is particularly interesting since it indicates that CooA and its variants sense CO and imidazole by mechanistically different processes. In contrast, the models explaining the response of FixL to different small molecules assume that the sensing system of the protein is essentially identical for each effector (
56).
A recent observation is also consistent with heme movement. Kinetic analysis has revealed that CO-bound wild-type CooA is heterogeneous in terms of the CO off-rate (
103). One population shows a very low off-rate, consistent with the tight CO pocket already reported (
112,
130). However, a roughly comparable population displays a significantly higher off-rate, implying a different position of the CO-bound heme. These two populations are in slow equilibrium, suggesting that a substantial conformational change might be occurring in the transition. This result is consistent with the notion that the two populations detected by this method might reflect the populations of active and inactive CO-bound CooA described above.
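The two-population picture can be illustrated with a minimal sketch. It assumes that the slow equilibrium between the populations lets them be treated as non-interconverting on the time scale of CO release, so that the CO-bound fraction decays as a sum of two exponentials. The function name, the equal split between populations, and both rate constants are hypothetical placeholders chosen for illustration; they are not values from the cited kinetic analysis.

```python
import math

def fraction_co_bound(t, slow_fraction=0.5, k_slow=0.01, k_fast=1.0):
    """Fraction of CooA still CO-bound at time t.

    Assumes two populations that do not interconvert during the
    measurement. slow_fraction, k_slow and k_fast are hypothetical
    placeholder values (arbitrary inverse-time units).
    """
    slow = slow_fraction * math.exp(-k_slow * t)
    fast = (1.0 - slow_fraction) * math.exp(-k_fast * t)
    return slow + fast

if __name__ == "__main__":
    # The fast population is depleted early; the slow one dominates later.
    for t in (0, 1, 10, 100):
        print(t, round(fraction_co_bound(t), 4))
```

With these placeholder values the decay is visibly biphasic, which is the qualitative signature that would distinguish two populations from a single homogeneous one.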
While we do not know the precise position of the CO-bound heme in CooA, it remains a tantalizing possibility that on CO binding, the heme approaches the position occupied by cAMP in active CRP. If this is correct, then the two proteins might be responding to their respective effectors in fundamentally similar ways. Determination of the mechanistic similarities and differences between the two proteins continues to be a focus of research because it should reveal commonalities for other members of the family of related proteins as well.
Repositioning of C Helices as a Signal Transduction Mechanism
The initial observation that suggested that repositioning of the C helices of CooA might be an important signaling pathway between the heme region and the DNA-binding regions was the comparison of the structures of active CRP and inactive CooA (
81). However, because a comparison of such different proteins is obviously problematic, a more direct test was performed as follows.
CooA, together with its homologs and also CRP and FNR, has a leucine zipper motif in the paired C helices. However, an analogous heptad repeat in the leucine zipper of all of these proteins, which lies about one-third of the way down the helices from the hinge region (positions 121 to 126 of CooA), is poor in comparison to a leucine zipper consensus. This led to the hypothesis that this nonconsensus heptad permitted flexibility in the structure, allowing a transition between an active and inactive form. Support for this notion for CRP has been provided on structural grounds (
100), and it is interesting that the D154A substitution that allows FNR to be active under aerobic conditions also affects this region (
76). We reasoned that if helix repositioning was the signal pathway for CooA, then creating such a repositioning by mutation should short-circuit the signal and provide effector-independent activity. We therefore randomized the codons for positions 121 to 126 in an otherwise wild-type CooA background and screened for CO-independent variants (
71). Sixty variants were sequenced, displaying a variety of different phenotypes, but all variants with substantial CO-independent activity had Leu residues (or other appropriate residues for a leucine zipper) at positions 123, 124, and 126. This is a fairly clear result and identifies helix repositioning as a major signal pathway within CooA. As noted above, the notion has also been proposed for CRP (
100).
CooA variants with improved leucine zippers have substantial activity without CO but also show a further increase in activity, to approximately the wild-type level, in the presence of CO (
71). Apparently, repositioning of the C helices is only partially effective at shifting the equilibrium to the active form if CO is not bound to the heme. There are two general possibilities to explain this. First, in the absence of CO, the improved leucine zipper variants might be under competing forces, with the continued Pro2 ligation to the heme preventing a full and proper repositioning. Second, CO binding to the heme might cause other conformational changes within the effector-binding domain that also assist in activation of the protein. In other words, while the C-helix repositioning is very important, it might not be the only signal pathway. In fact, both of these possibilities appear to be true.
The apparent tension caused by the retained Pro2 ligation was shown as follows. Among the CooA variants randomized at positions 121 to 126, one of the most active without CO had Ala121 and Gly122 substitutions. We noted that these substitutions lie between the improved leucine zipper region and the region of the C helices that is near the heme. These substitutions might therefore create a bend or a flexible region in the C helix, reducing the adverse tension in the absence of CO. Consistent with this, when the same pair of residues was introduced into an otherwise wild-type CooA background, the response to CO was diminished. This is reasonable because it is the rigidity of the helices in wild-type CooA that should be necessary for signal transduction through the protein. Subsequent analysis of that vicinity of the C helices is consistent with this idea, although there are contacts with other parts of the protein that complicate the analysis (
71). A different confirmation of this model involved the addition of the ΔP3R4 substitution to the improved leucine zipper background. By itself, ΔP3R4 causes negligible CO-independent activity in vivo, presumably in part because an adventitious ligand is able to satisfactorily replace Pro2 in keeping CooA inactive without CO. However, in the improved leucine zipper background, ΔP3R4 allows very high CO-independent in vivo activity. This is easily rationalized by the fact that this variant can no longer efficiently tether Pro2 to the heme (nor can the adventitious ligand do this with the same effectiveness as Pro2) and therefore cannot effectively interfere with the C-helix repositioning caused by the improved leucine zipper.
The second possibility, that CO binding sends activation signals by other mechanisms, also has some support. As described below, there is good evidence that CO binding directly alters the positioning of some of the regions of CooA that interact with RNA polymerase. It is less clear that there is another pathway between the CO-bound heme and the DNA-binding domains, but this is certainly a reasonable possibility based on the structure. A comparison of the structures of active CRP with inactive CooA indicates a very different positioning of the 4/5 loop in each protein, and there are clear contacts between this loop and the DNA-binding domain of active CRP. Finally, because His77 is directly connected to the 4/5 loop that starts at approximately residue 69, any movement of the heme after CO binding would be expected to move the tethered His77 and its attached protein backbone as well. However, there is no experimental evidence in support of such a pathway in CooA.
The model of C-helix repositioning is also supported by data from a completely different set of CooA variants. These variants have been found in different mutageneses and screens for effector-independent variants under various conditions. One of the most compelling cases is that of L116K CooA, which is active when reduced but actually loses activity in the presence of CO (
149). A variety of spectroscopic analyses have suggested, albeit indirectly, that this variant is altered in its ligation state and that Lys116 appears to replace Pro2 (the position of Pro2 is shown in Fig.
3C). Modeling such a Lys116 ligation on the known structure of reduced CooA requires a substantial movement of the heme with respect to the C helix. The activity in the reduced form of L116K CooA is therefore probably the result of helix repositioning by a direct covalent bond between the heme and the C helix. This is in contrast to the mechanism already proposed for wild-type CooA, where helix repositioning results from the exposure of the hydrophobic pocket or from heme movement along the hydrophobic C helices.
Whereas important details concerning the exact interaction between the CO-bound heme and the C helices remain to be discovered, the above results provide fairly conclusive evidence that this is a major mechanism for transmission of the CO-binding signal through the protein to the DNA-binding regions.
Achieving the Active Structure of CooA
Nature of the active form of CooA.
Our image of the active form of CooA is based largely on the crystal structure and related data for CRP. Because of that and because the information is not central to the response of CooA to CO, we will merely summarize the information here.
(i) The heme in active CooA is certainly deliganded from Pro2 and appears to move with respect to the protein portion of the effector-binding domain, although it certainly remains tethered to His77. While we have hypothesized that the heme might move along the C helices, this has not been demonstrated.
(ii) In the active form of CooA, the C helices have undergone a small but important reorientation relative to each other.
(iii) The DNA-binding domains are arranged in a fashion likely to be that of active CRP, because the sequences bound are very similar. This is discussed in more detail in the following section.
(iv) The hinge region between the DNA- and effector-binding domains (roughly Phe132 in CooA and Phe136 in CRP) is dramatically rearranged. In CRP, Phe136 makes contact with the 4/5 loop, which might be an important interaction for stabilizing the active form. While this notion has not been tested experimentally for CRP, Phe is the only residue at that position that provides significant CO-dependent activity to CooA (H. Youn and G. P. Roberts, unpublished data). This result is consistent with the absolute conservation of a Phe at this position among the CooA homologs.
(v) The 4/5 loop is in a position to interact with the DNA-binding domain, the hinge region, and other residues on the C helix. This centrality to a number of regions of the protein that are likely to be critical to the stability of the active form suggests that the apparent movement of the 4/5 loop from the position in inactive CooA to the position in active CRP is another major aspect of the activation mechanism.
Interaction of CooA with specific DNA sequences.
CooA is known to bind to two naturally occurring DNA sequences in the
R. rubrum genome, which are reminiscent of the sequences bound by CRP and FNR (
42,
58,
79). Because a mutational analysis of the binding sites of CooA has not been performed and because of the paucity of naturally occurring sites, it is premature to speak about a consensus half-site. Nevertheless, the half-sites are so similar to each other that some general comments can be made. For both CRP and CooA, these regions have two 5-base inverted repeats, termed half-sites, separated by six other bases. The striking difference in these sequences is the C at the fourth position of the half-site bound by CooA (TGTCA), where a G is used by CRP (TGTGA). This can be rationalized by examination of the F helices of CRP and CooA, since the Glu181 of CRP, known to make direct contact with the G base, is replaced by a Gln (Gln178) in CooA of
R. rubrum and in all of its homologs, although the specificity of this residue for a C base in this context has not been tested experimentally. The T in the middle of the palindromes for all three proteins does not seem to be contacted by any residues in the CRP structure, and it is a reasonable hypothesis that it is necessary for the 30° bend that is known to be induced in the DNA by CRP on binding (
79). FNR uses a generally similar F helix, though its exact DNA target sequence is slightly different (
9).
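The half-site arrangement described above (two 5-base inverted repeats separated by six bases) can be made concrete with a short sketch. The half-site sequences TGTCA and TGTGA are taken from the text; the function names are hypothetical, and the spacer is written as N because the text does not specify the intervening bases.

```python
# Complement table for unambiguous DNA bases; N is left unchanged.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

COOA_HALF_SITE = "TGTCA"  # from the text
CRP_HALF_SITE = "TGTGA"   # from the text

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]

def build_site(half_site, spacer="NNNNNN"):
    """Assemble an idealized palindromic site: half-site, 6-base spacer,
    then the inverted repeat of the half-site."""
    return half_site + spacer + reverse_complement(half_site)

def differing_positions(a, b):
    """1-based positions at which two equal-length sequences differ."""
    return [i + 1 for i, (x, y) in enumerate(zip(a, b)) if x != y]

if __name__ == "__main__":
    print(build_site(COOA_HALF_SITE))
    print(build_site(CRP_HALF_SITE))
    print(differing_positions(COOA_HALF_SITE, CRP_HALF_SITE))
```

The idealized sites are perfect palindromes only by construction; natural sites need not match them exactly.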
For CRP, a consensus sequence was guessed from the comparison of many naturally occurring CRP binding sites, and this consensus displayed a dramatically higher affinity for CRP in vitro than does any natural site (
54). This implies that the physiologically appropriate affinity of CRP for DNA must be lower than the maximal possible and that each individual CRP-binding site achieves that affinity by different perturbations from the consensus sequence. As a consequence, most naturally occurring CRP-binding sites are discernibly different in their sequences. With this background, it is therefore a bit surprising that the four CooA half-sites have only a single example of a substitution. It seems unlikely that all half-sites would coincidentally diverge from the highest-affinity sequence in the same way. It is therefore possible that CooA achieves the biologically appropriate affinity with an F helix for which there is no DNA sequence with the very high affinity produced for CRP by its consensus sequence. However, a more careful comparison of the affinity of CRP and CooA for their respective DNA sequences and an analysis of the affinity of CooA for other sequences should clarify this situation.
Positioning CooA for proper interaction with RNA polymerase.
It is obviously central to their biological role that CooA, CRP, FNR, and related proteins form proper interactions with RNA polymerase in order to stimulate transcription. Although this is a great simplification, their regulated promoters have relatively poor affinity for RNA polymerase by themselves, and the presence of one of these activator proteins bound near the promoter provides protein-protein contacts that make polymerase binding more energetically favorable. CRP and FNR appear to bind at two types of sites: in the enteric bacteria, class I sites are centered from −61.5 to −93 relative to the start site of transcription, while class II sites are typically centered at −41.5 (
8,
16). This difference in positioning with respect to the promoter implies rather different contacts between the activator and the polymerase, but we discuss only the class II promoters, since that is the class into which the two natural promoters for
R. rubrum CooA fall (
42,
58). The work with
R. rubrum CooA in an
E. coli background has also used a binding site/promoter of this class.
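As a small illustration of the positional distinction between the two site classes, the following sketch classifies a binding site by the position of its center relative to the transcription start site. The numerical boundaries come from the text; the function name and the tolerance around −41.5 are hypothetical choices, since the text says only that class II sites are "typically" centered there.

```python
def classify_activator_site(center, tolerance=1.0):
    """Classify a CRP/FNR-type site by its center position (in bp,
    negative values are upstream of the transcription start site).

    tolerance is a hypothetical allowance around -41.5 for class II.
    """
    if abs(center - (-41.5)) <= tolerance:
        return "class II"
    if -93.0 <= center <= -61.5:
        return "class I"
    return "unclassified"

if __name__ == "__main__":
    for position in (-41.5, -61.5, -72.5, -10.0):
        print(position, classify_activator_site(position))
```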
At class II promoters, at least three different regions on the activator come into contact with RNA polymerase, and these regions are termed AR1, AR2, and AR3 (for “activating regions”) (
15) (Fig.
6). Because of the geometry of the activator dimer and RNA polymerase, each monomer makes specific interactions as follows. AR1 exists only in the upstream monomer, relative to RNA polymerase, and is found primarily in the DNA-binding domain of that monomer. It makes specific contacts with the carboxyl-terminal domain of the α subunit of polymerase, which reaches over the activator protein on a long, flexible arm (
8,
95,
96). AR2 lies in the effector-binding domain of the downstream activator monomer and makes specific contacts with the amino-terminal domain of the α subunit (
95,
96). Finally, AR3 is also in the effector-binding domain, specifically the 4/5 loop of the downstream monomer, and makes contacts with the σ subunit of polymerase (
8,
84). The relative importance of each specific AR is different in CRP and FNR (
109,
140), but there is no reason to suspect that the geometry of the interacting complexes is profoundly different.
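The three activating regions and their contacts at class II promoters, as described above, can be summarized in a small lookup structure. This is only a restatement of the text in tabular form; the class and field names are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class ActivatingRegion:
    name: str
    monomer: str             # position relative to RNA polymerase
    location: str            # where the region lies in the activator
    polymerase_contact: str  # part of RNA polymerase that is contacted

ACTIVATING_REGIONS = (
    ActivatingRegion("AR1", "upstream", "DNA-binding domain",
                     "carboxyl-terminal domain of the alpha subunit"),
    ActivatingRegion("AR2", "downstream", "effector-binding domain",
                     "amino-terminal domain of the alpha subunit"),
    ActivatingRegion("AR3", "downstream",
                     "effector-binding domain (4/5 loop)",
                     "sigma subunit"),
)

def regions_on(monomer):
    """Names of the activating regions carried by the given monomer."""
    return [ar.name for ar in ACTIVATING_REGIONS if ar.monomer == monomer]

if __name__ == "__main__":
    for ar in ACTIVATING_REGIONS:
        print(f"{ar.name}: {ar.monomer} monomer, {ar.location} -> "
              f"{ar.polymerase_contact}")
```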
CooA has all three ARs, at least when activating transcription with the heterologous RNA polymerase from
E. coli. Both gain-of-function and loss-of-function variants affecting AR2 and AR3 have been described (
82), and the presence of an AR1 was revealed by the requirement for a functional carboxyl terminus of the α subunit for in vitro transcription (
59).
The proper positioning of the ARs for interaction with RNA polymerase is a direct result of the proper orientation of the DNA-binding domains. This is clear, for example, because AR1 is in the DNA-binding domain and the positioning of the F helix of the latter for DNA interaction must also position the AR1 for RNA polymerase interaction. Similarly, because AR3 is at the tip of the 4/5 loop, and this loop is probably positioned in the active form through interaction with the DNA-binding domain, AR3 should also be properly positioned as a consequence. It is less clear whether there is repositioning of other AR surfaces, predominantly AR2, on effector binding that is independent of the DNA-binding domain movement. In other words, is AR2 necessarily in the proper position for the RNA polymerase interaction when the DNA-binding domains are positioned to bind DNA, or must it be separately positioned in response to effector binding?
There is suggestive evidence in support of this latter idea for both CooA and FNR. In CooA, Glu41 is important for interaction with the RNA polymerase (
82) and is adjacent to Asn42, which makes direct contact with a heme ligand, His77 (
81). It is therefore a reasonable possibility that CO binding to the heme might affect the precise positioning of Glu41 through the repositioning of the heme and therefore of Asn42 (
81). Specific changes at a few other CooA residues in the heme vicinity also have the effect of perturbing activation without perturbing DNA binding. These include Met76, Phe74, and possibly others (M. Conrad, R. L. Kerby, and G. P. Roberts, unpublished data). However, to date, the data are merely consistent with the hypothesis of a direct effect of CO on AR positioning, and it has not been rigorously demonstrated. A similar notion has been proposed for FNR as well, where a C20S substitution that alters a ligand to the 4Fe-4S cluster is defective in transcription activation yet appears to bind DNA effectively based on its ability to bind as a repressor (
114).
Although the hypothesis of a separate and direct positioning of certain AR surfaces in response to effector binding seems initially surprising, it probably should not be. It is clear for all of these activator proteins that effector binding causes a substantial conformational change within the effector-binding domain, in addition to a repositioning of the DNA-binding domains. Given that the AR2 surfaces are adjacent to the regions of the protein that bind the effector, it is actually a reasonable prediction that the hypothesis should be true for at least some AR residues. Presumably the positioning of the AR in the DNA-bound form of the protein would be more appropriate for interaction with RNA polymerase than would that of the effector-free form.
Would there be a biological consequence of direct AR positioning by effector binding? Based on the equilibrium model proposed above, it seems that there will always be a subpopulation of activator protein that positions the DNA-binding domains properly in the absence of the effector. If there were no effect of the effector on the AR regions, this subpopulation would be expected to have the same affinity for DNA and RNA polymerase as does CO-bound CooA. This would result in a background of gene expression in the absence of CO that would be wasteful. However, the AR-positioning effect would presumably reduce the ability of this subpopulation to activate transcription in the absence of CO, even if it could bind DNA. The net effect of the phenomenon would be to decrease the level of activation in the absence of the effector, which in a sense increases the apparent CO specificity of the regulatory system.
Intersubunit communication in CooA.
To this point, we have focused on the communication between the effector- and DNA-binding domains after CO binding. However, the proximity of the hemes to the shared C helices, and therefore to each other, makes communication between the two monomers a reasonable possibility. Such communication between monomers should also result from the fact that the Pro2 displaced by CO is from the other subunit. Intersubunit communication is also an obvious possibility for CRP, since each bound cAMP makes contacts with residues from both subunits. However, there has long been a disagreement about whether this communication causes cAMP binding to be positively or negatively cooperative. The current view is that binding is positively cooperative, and it appears that claims for negative cooperativity are probably based on the binding of two additional molecules of cAMP to the protein. These additional molecules bind to the DNA-binding domains themselves (
99) and reduce the affinity of the protein for DNA, although the physiological significance of this form is doubtful.
The notion of communication between the hemes of CooA was first shown by the binding of CN− to CooA variants altered at position 77 (
124). Because of the perturbation of the normal His77 ligand, the CN− binding in these cases is almost certainly on the opposite side of the heme relative to that bound by CO in wild-type CooA. Nevertheless, different substitutions showed different sorts of cooperative binding, demonstrating intersubunit communication in these variants. For CO binding to wild-type CooA, there are technical challenges presented by high CO affinity and low CO solubility. However, recent work by the laboratories of Spiro and Olson has now addressed the problem through kinetic and spectroscopic analyses (
103). They have demonstrated that CooA is positively cooperative for CO binding and that this cooperativity reflects the relative effect of a bound CO on the ability of CO and the other Pro2 ligand to compete for the heme iron. It appears that binding of CO to one heme in the dimer lowers the deligation rate of the Pro2 ligand to the other heme, which by itself would yield negative CO cooperativity. However, this CO binding also lowers the rebinding rate of the Pro2 to the other heme by a much greater degree. The result is positive cooperativity for CO binding, because the other heme is now more accessible to CO. The biological implication of this cooperativity is that low levels of CO would activate a subpopulation of the CooA and therefore lead to transcription activation, even if the CO levels were insufficient to saturate the entire population of CooA hemes.
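The kinetic argument above can be sketched numerically under a strong simplifying assumption: that the second heme exchanges rapidly between a Pro2-ligated form and a Pro2-free form, and that its accessibility to CO is proportional to the Pro2-free fraction, k_deligation / (k_deligation + k_rebinding). This is an illustrative simplification, not the kinetic scheme used in the cited work, and all function names and rate values are hypothetical.

```python
def open_fraction(k_deligation, k_rebinding):
    """Steady-state fraction of hemes with Pro2 released, assuming a
    simple two-state exchange between ligated and free forms."""
    return k_deligation / (k_deligation + k_rebinding)

def accessibility_ratio(k_deligation, k_rebinding,
                        deligation_slowdown, rebinding_slowdown):
    """Ratio of the Pro2-free fraction of the second heme after versus
    before CO binds the first heme. Values above 1 correspond to
    positive cooperativity, values below 1 to negative cooperativity.
    The slowdown factors are hypothetical fold reductions in each rate."""
    before = open_fraction(k_deligation, k_rebinding)
    after = open_fraction(k_deligation / deligation_slowdown,
                          k_rebinding / rebinding_slowdown)
    return after / before

if __name__ == "__main__":
    # Slower deligation alone makes the second heme less accessible.
    print(accessibility_ratio(1.0, 100.0, 2.0, 1.0))
    # A much larger drop in rebinding outweighs it.
    print(accessibility_ratio(1.0, 100.0, 2.0, 20.0))
```

In this sketch the sign of the cooperativity depends only on which of the two rates is reduced more, which mirrors the qualitative reasoning in the text.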
The situation in vivo is obviously more complex and has not yet been analyzed. For example, the interaction of CooA with DNA and RNA polymerase almost certainly affects the equilibrium between the active and inactive forms, which would have significant biological consequences.