Identification of EG and glycolate catabolic genes by RNA-seq
Having established that RHA1 grows on EG, we conducted transcriptomic studies to identify genes and pathways that are involved in its catabolism. Cells grown on 30 mM EG, glycolate, or acetate were harvested during exponential growth. We hypothesized glycolate to be an intermediate in the catabolism of EG by RHA1 as it is in the catabolism of EG by
P. putida (
12). Acetate was used as a control as it is a C2 compound whose metabolism is distinct from that of EG. Analysis of the RNA-seq data revealed that 433 genes were significantly upregulated during growth of RHA1 on EG versus acetate (log2FC > 2.50,
padj <0.0001), and 100 were significantly downregulated (Table S2). By comparison, 427 genes were significantly upregulated, and none were downregulated during growth on glycolate versus acetate (Table S2). Finally, 51 genes were upregulated and 58 were downregulated during growth on EG versus glycolate. The transcriptomics and bioinformatics data revealed four clusters of genes predicted to be involved in EG and glycolate catabolism (
Fig. 2;
Table 3): the
mad locus, predicted to encode
mycofactocin-dependent
alcohol
degradation, including the catabolism of EG to glycolate; two glyoxylate carboligase (GCL) clusters (
6), predicted to encode glycolate and glyoxylate catabolism; and the
mft genes, predicted to specify the biosynthesis of the mycofactocin used by one of the
mad-encoded enzymes.
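The thresholds quoted above (log2FC > 2.50, padj < 0.0001) can be illustrated with a minimal sketch. The function names, the toy rows, and the symmetric cutoff for downregulation (log2FC < −2.50) are assumptions made for illustration; the text states only the upregulation threshold, and the values below are placeholders, not data from Table S2.

```python
import math

def classify_gene(log2fc, padj, lfc_cut=2.5, padj_cut=1e-4):
    """Label one gene as 'up', 'down', or 'ns' (not significant).

    Assumption: downregulation uses the mirrored cutoff (log2fc < -lfc_cut).
    """
    if padj is None or math.isnan(padj) or padj >= padj_cut:
        return "ns"
    if log2fc > lfc_cut:
        return "up"
    if log2fc < -lfc_cut:
        return "down"
    return "ns"

def count_calls(rows, **cutoffs):
    """Count up/down/ns calls over (gene_id, log2fc, padj) tuples."""
    counts = {"up": 0, "down": 0, "ns": 0}
    for _gene_id, log2fc, padj in rows:
        counts[classify_gene(log2fc, padj, **cutoffs)] += 1
    return counts

# Placeholder rows for illustration only (not values from the study).
toy_rows = [
    ("gene_a", 10.4, 1e-30),  # strongly up, significant
    ("gene_b", -3.1, 1e-8),   # strongly down, significant
    ("gene_c", 2.6, 0.01),    # large change but padj too high
    ("gene_d", 0.2, 0.9),     # unchanged
]

print(count_calls(toy_rows))
# A log2 fold change of 2.5 corresponds to a linear fold change of 2**2.5.
print(round(2 ** 2.5, 2))
```

On a linear scale, the log2FC cutoff of 2.50 corresponds to a roughly 5.7-fold change in transcript abundance.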
The
mad locus includes
madA (
RHA1_RS29605), which was the fourth most highly upregulated gene during growth on EG versus acetate and was also highly upregulated on glycolate (
Fig. 2). RT-qPCR confirmed that
madA was upregulated ~1,400-fold on EG versus acetate, while
madD (
RHA1_RS29620) was upregulated sixfold. MadA is the reciprocal best hit of Mno from
M. smegmatis, with which it shares 92% amino acid sequence identity and the same genomic context (
Fig. 2). As noted above, Mno is a mycofactocin-dependent methanol oxidase that is required for growth on methanol and ethanol and that also acts on formaldehyde (
27,
28). MadA also shares 97% amino acid sequence identity with Mno from
R. erythropolis N9T-4, which possesses formaldehyde dismutase activity (
32). Based on the activity of these close homologs in mycobacteria and rhodococci, as well as the evidence presented below, we annotated MadA as a mycofactocin-dependent alcohol dehydrogenase that initiates EG catabolism in RHA1.
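The reciprocal best hit criterion used above can be sketched as follows. This is a generic illustration, not the procedure used in this study: the input format (per-query lists of subject ID and score, such as bit scores from a pairwise search), the function names, and the toy identifiers and scores are all hypothetical.

```python
def best_hit(hits):
    """Return the subject ID with the highest score, or None if there are no hits."""
    if not hits:
        return None
    return max(hits, key=lambda hit: hit[1])[0]

def reciprocal_best_hits(a_vs_b, b_vs_a):
    """Return sorted (a, b) pairs where a's best hit is b and b's best hit is a.

    a_vs_b: dict mapping each protein in genome A to a list of (subject_in_B, score)
    b_vs_a: dict mapping each protein in genome B to a list of (subject_in_A, score)
    """
    pairs = []
    for query_a, hits in a_vs_b.items():
        top_b = best_hit(hits)
        if top_b is None:
            continue
        # Keep the pair only if the reverse search points back to query_a.
        if best_hit(b_vs_a.get(top_b, [])) == query_a:
            pairs.append((query_a, top_b))
    return sorted(pairs)

# Hypothetical identifiers and scores, for illustration only.
a_vs_b = {"A1": [("B1", 900), ("B2", 300)], "A2": [("B1", 500)]}
b_vs_a = {"B1": [("A1", 890), ("A2", 480)], "B2": [("A1", 310)]}

print(reciprocal_best_hits(a_vs_b, b_vs_a))
```

In this toy case, A2 also hits B1, but B1's best hit is A1, so only the A1/B1 pair satisfies the reciprocal criterion.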
The
mad genes are arranged in two putative operons based on RNA-seq reads,
madABC and
madDSR, and are homologous to the
mno locus of
M. smegmatis (
Fig. 3). The
madS and
madR genes are predicted to encode the sensor kinase and response regulator, respectively, of a two-component signal transduction system that regulates the transcription of the
mad genes. MnoSR, the reciprocal best hits of MadSR, regulate the expression of
mno in
M. smegmatis (
29). MadD is the reciprocal best hit of ErcA (PA1991) of
Pseudomonas aeruginosa PAO1 (
Table 3), an iron-dependent alcohol dehydrogenase, which may play a regulatory role in ethanol catabolism (
46). The homolog encoded by the
mno cluster, MSMEG_6239, is a putative 1,3-propanediol dehydrogenase and is also regulated by MnoSR (
29). Finally, as summarized in
Table 3, the physiological functions of
madB and
madC are unknown (
29).
The MFT biosynthesis genes were identified based on their similarity to
mftABCDEF of
M. smegmatis (
28). However, the architecture of the cluster in RHA1 differs from the typical organization in mycobacteria (
53). Notably, the
mftABCD and
mftEFG genes in RHA1 are separated by three genes annotated here as
mftHIJ. These three genes, together with
mftA, were slightly upregulated on EG versus acetate, while none of the other
mft genes were differentially regulated (
Table 3). Furthermore,
mftG, encoding a predicted GMC oxidoreductase, is likely part of the
mft operon. Interestingly, the
mft locus is located within 10 kbp of the
mad genes in RHA1. By contrast, the
mft genes are more than 2 Mbp away from the
mno genes in
M. smegmatis.
The most highly upregulated genes on EG and glycolate versus acetate occur in two clusters, designated here as GCL1 and GCL2 (
Table 3;
Fig. 4). GCL1 and GCL2 are predicted to encode glyoxylate catabolic enzymes analogous to the Gcl pathways in
E. coli,
P. putida, and
Streptomyces coelicolor, which are associated with allantoin catabolism (
5,
47,
49). Glyoxylate is an intermediate in catabolism of both EG and allantoin. GCL1 includes
gcl,
glxR,
hyi, and
garK, which, based on putative promoters and RNA-seq reads, are part of an eight-gene operon predicted to encode the catabolism of allantoin to glyoxylate and urea (
Fig. 4A and B). Other highly upregulated genes in this region include
allP and
puuE, which are also involved in allantoin catabolism;
aceE2 and
pyk2, which encode glycerate catabolism (
Fig. 4); and
RHA1_RS12640-RS12655, which encode the mycothiol-dependent detoxification of formaldehyde. The
glcB gene, which is located upstream of GCL1 and codes for malate synthase, was not significantly upregulated. Moreover,
aceA, which encodes an isocitrate lyase, was downregulated on EG versus acetate. Together, these results suggest that the glyoxylate shunt was not as active during growth on EG as on acetate.
GCL2 includes a second set of
gcl,
glxR,
hyi, and
garK homologs as well as
glcD, encoding glycolate oxidase (
Fig. 4). In
E. coli and
P. putida, GlcD transforms glycolate to glyoxylate and requires GlcE and GlcF, although the precise roles of the latter are unknown (
50,
54). Interestingly, the
glcDEF operon in
E. coli and
P. putida is part of the allantoin pathway in these organisms and is not associated with the glyoxylate cluster (
5,
50). Although GCL2 does not contain
glcE and
glcF homologs, we have provisionally annotated GlcD of RHA1 as a glycolate oxidase. Four other genes of interest lie close to GCL2 (
Fig. 4). Two of these,
RHA1_RS15700 and
RHA1_RS15705, form a putative operon and encode a lactate/glycolate transporter and a second GlcD homolog, respectively. However, they were not significantly upregulated during growth on EG or glycolate. The other two genes,
RHA1_RS15635 (
pyk3) and
RHA1_RS15680, predicted to encode a pyruvate kinase and an MFS transporter, respectively, were highly expressed on EG and glycolate. This suggests that glycolate is metabolized via an independently regulated glycerate pathway and that
RHA1_RS15680 may translocate glycolate or a related metabolite.
The GCL1 and GCL2 transcripts had very similar relative abundances in EG- and glycolate-grown cells (P(GCL1) = 0.067; P(GCL2) = 0.107), consistent with the conclusion that EG and glycolate catabolism share many steps. Finally, the GCL2 genes were on average 4 and 16 times more highly expressed on EG and glycolate, respectively, compared to the GCL1 genes [P(EG) = 0.0432, P(glycolate) = 0.0003], suggesting that GCL2 plays a bigger role in catabolizing these compounds than GCL1.
Comparison of the differentially regulated genes during growth on EG versus glycolate did not provide further insight into the transformation of EG to glycolate, and a glycolaldehyde dehydrogenase candidate was not identified (Fig. S2). The majority of the highly up- and downregulated genes encode hypothetical proteins or proteins for which no characterized homologs exist. Operons putatively involved in polyketide synthesis (RHA1_RS09515 to RHA1_RS09520 and RHA1_RS02760 to RHA1_RS02775) were among the most highly upregulated genes during growth on EG versus glycolate or acetate. Moreover, eda and edd of the Entner–Doudoroff pathway were significantly upregulated on EG compared to glycolate. The functional significance of these adaptations in EG catabolism is unclear.
Adaptation of RHA1 to EG
We next tested if RHA1 can be adaptively evolved to grow on higher EG concentrations, similar to what has been achieved in KT2440 (
6). RHA1 was grown in batch culture using shake flasks and was successively transferred to M9G media amended with 30, 60, 90, 120, and 180 mM EG. This yielded the strain RHA1-EG, which grew at a rate of ~0.13 h⁻¹ on 180 mM EG (
Table 5). Furthermore, this strain grew at three times the rate of the WT strain on 30 mM EG, comparable to its growth rate on acetate, and attained ~50% higher growth yields (0.62 ± 0.03 vs 0.40 ± 0.04 mg cell dry weight (CDW)/mL;
P < 0.002). Interestingly, the lag times of RHA1-EG on acetate or glucose were similar whether the cells were pre-grown on EG (10 h) or acetate (12 h) (Fig. S6). In contrast, the lag times of WT on acetate and glucose were approximately twice as long when pre-grown on EG (19 h) versus acetate (10 h) (Fig. S6). In shake flasks, the final OD600 of RHA1-EG cultures increased linearly with substrate concentration up to 300 mM EG, which yielded an OD600 of 31.6 ± 0.7. On 600 mM EG, cultures attained an OD600 of ~34. However, the pH of the spent medium was very low, indicating that the buffering capacity of the medium had been exceeded.
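As a worked check of the growth figures reported above, the following sketch converts the specific growth rate of ~0.13 h⁻¹ into a doubling time and computes the relative increase in growth yield from the reported means (0.62 vs 0.40 mg CDW/mL). The function names are illustrative, and the doubling-time formula assumes simple exponential growth.

```python
import math

def doubling_time(mu_per_h):
    """Doubling time in hours for a specific growth rate mu (h^-1),
    assuming exponential growth: t_d = ln(2) / mu."""
    return math.log(2) / mu_per_h

def percent_increase(baseline, value):
    """Relative increase of value over baseline, in percent."""
    return 100.0 * (value - baseline) / baseline

# Growth rate of RHA1-EG on 180 mM EG, as reported in the text.
print(round(doubling_time(0.13), 2))           # hours per doubling
# Mean growth yields on 30 mM EG (mg CDW/mL): WT 0.40, RHA1-EG 0.62.
print(round(percent_increase(0.40, 0.62), 1))  # percent higher yield
```

The rate corresponds to a doubling time of about 5.3 h, and the means give a yield increase of about 55%, in line with the "~50% higher" stated above.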
We also tested growth of RHA1-EG on glycolate, glycolaldehyde, and glyoxal. Like the WT strain, RHA1-EG grew at a reduced rate on 15 mM glycolate and to half the OD600 as on 15 mM acetate (Fig. S1B). Furthermore, the growth yields of the two strains on 30 mM glycolate were comparable (0.15 mg CDW/mL). Moreover, glyoxal inhibited growth of RHA1-EG to a similar extent as RHA1 (Fig. S1D). However, RHA1-EG tolerated 50% higher concentrations of glycolaldehyde than did WT RHA1 (
Table 2; Fig. S1F). RHA1-EG also tolerated higher concentrations of formaldehyde (
Table 2), growing in the presence of 3 mM formaldehyde, while WT did not grow in the presence of 2 mM formaldehyde (Fig. S1G and H). Finally, the adaptive laboratory evolution did not affect the strain’s growth on acetate or glucose (
Table 5; Fig. S6). Moreover, the growth rates and yields were not affected by the substrate used to grow the inoculating culture (i.e., acetate vs EG). Glycerol stocks of this strain grew at similar rates on 60, 120, or 180 mM EG regardless of the growth substrate used in the starter culture, although lag times were shorter when the cells were pre-grown on EG (data not shown).
Elucidating the basis of improved growth of RHA1-EG
To elucidate the basis of the improved growth of RHA1-EG, we compared the transcriptome of the strain grown on EG with that of the WT strain (
Fig. 2). Comparison of the two strains growing on EG revealed 180 downregulated but no upregulated genes. However, compared to the transcriptomes of the WT strain grown on acetate and glycolate, 116 and 183 genes, respectively, were upregulated in the transcriptome of EG-grown RHA1-EG (Table S2). None of these were Mad, GCL1, or GCL2 pathway genes. However, the upregulated genes included a cluster on plasmid pRHL1 spanning
RHA1_RS39745 to
RHA1_RS39775 (Table S3). This cluster includes
RHA1_RS39755, annotated here as
aldA2, whose product shares 66% amino acid sequence identity with PedI (
PP_2680) and AldB-I (
PP_0545), enzymes from KT2440 that have glycolaldehyde dehydrogenation activity (
12). Intriguingly, AldA2 differs by a single amino acid from AldA1 whose gene,
RHA1_RS29725, is part of a putative operon that sits immediately downstream of the
mft operon (
Table 3). AldA1 and AldA2 are likely isofunctional.
To test the hypotheses that AldA2 catalyzes glycolaldehyde dehydrogenation and that its activity can improve the growth of RHA1 on EG, we overexpressed
aldA2 in RHA1 using pTipQC2. Similar to RHA1-EG, the strain overexpressing
aldA2 grew on 30 mM EG (Fig. S7A and B) and its growth on acetate was not inhibited by 5 mM glycolaldehyde (Fig. S7C and D). However, the strain’s growth was inhibited by 1 mM formaldehyde (Fig. S7E and F). When grown on EG, RHA1-EG accumulated approximately 50% as much glycolaldehyde as wild type (
Fig. 6D). Interestingly, the strain overexpressing
aldA2 did not detectably accumulate any glycolaldehyde when grown on EG (
Fig. 6E). Furthermore, when grown on glycolaldehyde, RHA1-EG and the
aldA2 overexpressor depleted glycolaldehyde at faster rates than wild type (Fig. S5C and D). These results indicate that AldA2 plays a role in glycolaldehyde catabolism and that the overexpression of
aldA2 may contribute to the growth phenotype of RHA1-EG.