Examination of alterations in the translatome and transcriptome of E. coli in response to varying degrees of acid stress
To mimic natural stress conditions, such as the passage of
E. coli through the gastrointestinal tract, we established the following protocol involving a sudden change to low pH, detection of a rapid response, and severe, near-lethal acid stress. Specifically,
E. coli K-12 MG1655 was cultivated in unbuffered lysogeny broth (LB) medium at pH 7.6 until the exponential growth phase (OD600 = 0.5). Then, 5 M hydrochloric acid was added directly to expose the cells to a pH of 5.8 or stepwise to a pH of 4.4, corresponding to mild and severe acid stress (
Fig. 1A). The final optical densities were comparable (pH 7.6: OD600 = ~1.1; pH 4.4: OD600 = ~0.7) (
Table S1), and the pH values hardly changed compared to
t0 (
Fig. 1A; Table S2). To investigate whether a 15-min exposure to pH 4.4 was sufficient to induce cellular adaptation to severe acid stress, we examined the temporal dynamics of
adiA expression by quantitative reverse transcription PCR (RT-qPCR). We detected an increase in
adiA mRNA levels as early as 15 min after the shift to pH 4.4 and no substantial further increase after 30 or 60 min (
Fig. S1). Additionally, we evaluated cell viability at the final experimental time points using propidium iodide (PI) staining (
36) and examined colony-forming units (CFUs). The average percentage of dead cells detected by PI staining was less than 1% at pH 7.6, 5% at pH 5.8, and 18% at pH 4.4 (
Fig. S2A). A positive control (5-min heat treatment at 80°C) resulted in an average of 97.2% non-viable cells, as determined by PI staining. The CFU count results underline that at least 2 × 10^8 viable cells were collected irrespective of pH at the moment of sample collection for Ribo- and RNA-Seq (Fig. S2B).
Cells were harvested and lysed as previously described by whole-culture flash freezing and cryogenic grinding in a freezer mill to avoid bias from translation-arresting drugs and filtering (
38). The subsequent steps of our ribosome profiling protocol were a combination of methodologies reported by Latif and colleagues (
39) and Mohammad and Buskirk (
40) (see Materials and Methods for details). Strand-specific Illumina sequencing yielded an average of approximately 30 million cDNA reads per sample for Ribo-Seq and 5–10 million for RNA-Seq. The next-generation sequencing data were analyzed using an extended version of the high-throughput HRIBO data analysis pipeline (
37). All samples achieved sufficient coverage, each with more than two million reads mapping uniquely to coding regions. The rRNA contamination was higher in the pH 4.4 Ribo-Seq samples than in other conditions but accounted for less than 15% in all cDNA libraries (Fig. S3). The length distribution of the generated ribosome-protected fragments (RPFs) was broad, ranging from 15 to 45 nucleotides (Fig. S4), consistent with previous observations in other prokaryotic ribosome profiling analyses (
38,
39). We did not detect stress-induced increased relative ribosome occupancy in the initiation region of ORFs under acid stress compared to neutral pH. This is in contrast to previous observations in
E. coli under heat stress (
41) and in yeast under oxidative stress (
42), which reported increased relative ribosome accumulation at start codons in stressed cells. In fact, ribosome occupancy in the translation initiation regions was slightly reduced at pH 4.4 and 5.8 compared with physiological pH (Fig. S5). This could be explained by diminished ribosome-RNA complex stability and increased ribosome drop-off under acidic conditions. The biological triplicates for each experimental condition clustered together on the first three principal components in a principal component analysis (PCA) plot (
Fig. 1B). Notably, the global gene expression profiles were highly distinct at pH 4.4 compared with both pH 5.8 and 7.6.
Coordinated regulation of transcription and translation in response to acid stress
The tool
deltaTE (
43) was used to assess transcriptional and translational changes (i.e., differential expression and differential translation efficiency) in response to mild and severe acid stress. Low-expression transcripts were filtered out, and we focused our analysis on 3,654 genes with mean reads per kilobase per million reads mapped (rpkm) values ≥5 across all investigated conditions. Our findings reveal that 702 transcripts were significantly altered at pH 5.8 compared with physiological pH [absolute mRNA log2 fold change (FC) ≥1 and false discovery rate (FDR) adjusted
P ≤ 0.05], and 1,030 genes showed significant differences in mRNA levels at pH 4.4 (
Fig. 2A). These results suggest that extensive transcriptional reprogramming occurred, which was influenced by the degree of acid stress. As illustrated by the Venn diagram overlaps (
Fig. 2), a large number of adaptations occurred regardless of the degree of acid stress. Nonetheless, several hundred genes were differentially expressed exclusively at pH 5.8 or 4.4 (
Fig. 2A). This suggests that in addition to universal adaptations at low pH, specific adaptations for mild and severe acid stress occur. We further determined the number of genes with stress-dependent alterations in RPF counts to be 679 at pH 5.8 and 1,440 at pH 4.4 (absolute RPF log2 FC ≥1 and FDR adjusted
P ≤ 0.05), which was in a similar range compared with the RNA-Seq data (
Fig. 2A and B). Accordingly, the global FC values for mRNA and RPF levels showed a high Pearson correlation coefficient (
r) under both conditions (
Fig. 2C and D, gray dots). This indicates that transcriptional regulation of these genes is the predominant response to acid stress. However, a subset of genes exhibited exclusive and significant regulation at either the transcriptional (red dots) or translational (blue dots) level.
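The expression filter described above can be illustrated with a minimal sketch. It assumes the standard RPKM definition and interprets the cutoff as a mean RPKM of at least 5 in every condition; the function names and example values are hypothetical and do not reproduce the HRIBO or deltaTE implementation.

```python
def rpkm(read_count, gene_length_bp, total_mapped_reads):
    """Reads per kilobase of gene length per million mapped reads."""
    return read_count / (gene_length_bp / 1_000) / (total_mapped_reads / 1_000_000)


def passes_expression_filter(rpkm_by_condition, min_mean_rpkm=5.0):
    """Keep a gene only if its mean RPKM reaches the cutoff in every condition.

    rpkm_by_condition maps a condition label to the replicate RPKM values.
    Requiring the cutoff per condition is an assumption of this sketch.
    """
    return all(
        sum(values) / len(values) >= min_mean_rpkm
        for values in rpkm_by_condition.values()
    )


# Hypothetical gene: 500 reads, 1 kb coding sequence, 10 million mapped reads.
example_rpkm = rpkm(500, 1_000, 10_000_000)
```

In this toy case, 500 reads on a 1-kb gene in a library of 10 million mapped reads correspond to an RPKM of 50.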
Specifically, at pH 5.8, 193 genes were detected to be significantly regulated exclusively by RNA-Seq, while 216 genes were exclusively affected in the Ribo-Seq data (
Fig. 2C;
Table S3). At pH 4.4, 127 differentially regulated genes were found exclusively by RNA-Seq and 570 genes by Ribo-Seq (
Fig. 2D;
Table S3). Notably, for
fruA at pH 5.8 and
yecH at pH 4.4
, opposite changes were observed at the transcriptional and translational levels (
Fig. 2C and D, yellow dots). FruA is the fructose permease of the phosphoenolpyruvate-dependent sugar phosphotransferase system (
44), whereas the function of YecH remains unknown.
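The grouping of genes into transcription-only, translation-only, and oppositely regulated sets can be sketched as a simple threshold rule. This is a simplified illustration using the cutoffs stated above (absolute log2 FC ≥1 and adjusted P ≤ 0.05); the category labels and function names are hypothetical, and deltaTE itself may assign classes differently.

```python
def is_significant(log2_fc, padj, min_abs_log2_fc=1.0, max_padj=0.05):
    """Apply the fold-change and adjusted P-value cutoffs to one measurement."""
    return abs(log2_fc) >= min_abs_log2_fc and padj <= max_padj


def classify_gene(mrna_log2_fc, mrna_padj, rpf_log2_fc, rpf_padj):
    """Assign a gene to one regulation category (labels are illustrative)."""
    mrna_sig = is_significant(mrna_log2_fc, mrna_padj)
    rpf_sig = is_significant(rpf_log2_fc, rpf_padj)
    if mrna_sig and rpf_sig:
        # Significant in both data sets: same or opposite direction?
        if mrna_log2_fc * rpf_log2_fc < 0:
            return "opposite"
        return "both"
    if mrna_sig:
        return "RNA-Seq only"
    if rpf_sig:
        return "Ribo-Seq only"
    return "unchanged"
```

With such a rule, a gene whose mRNA level rises while its RPF level falls (both significantly) would land in the "opposite" group, as described for fruA and yecH.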
Next, we investigated translation efficiency (TE) to identify genes that undergo translational regulation in response to acidic conditions. TE provides information regarding ribosome counts per mRNA and is calculated as the ratio of RPFs over transcript counts within a gene’s coding sequence normalized to mRNA abundance (
43). We identified 22 genes at pH 5.8 and 89 genes at pH 4.4 that displayed significantly altered TEs (absolute log2 TE fold change ≥1 and
P-adjust ≤0.05) (
Table S4). The highest increase in TE at pH 4.4 was found for the KpLE2 phage-like element (
topAI), a hydroxyethylthiazole kinase (
thiM), and a palmitoleoyl acyltransferase (
lpxP). In contrast,
yecH and
yjbE, both encoding uncharacterized proteins, and
malM of the maltose regulon showed the most prominent decrease in TE at pH 4.4 (
Table S4). At pH 5.8, we noted the largest increase in TE for a ferredoxin-type protein encoded by
napF, an iron transport protein (
feoA), and a tagaturonate reductase (
uxaB). Conversely, the largest decrease was observed for a protein of the fructose-specific phosphotransferase system (
fruA), a tripartite efflux pump membrane fusion protein (
emrK), and an HTH-type transcriptional regulator (
ydeO) (
Table S4).
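The arithmetic behind a TE change can be made explicit with a short sketch. It assumes normalized RPF and mRNA densities per gene and shows only the ratio itself; deltaTE estimates the change and its significance with a statistical model, which is not reproduced here. All names and numbers are hypothetical.

```python
import math


def translation_efficiency(rpf_density, mrna_density):
    """TE as the ratio of ribosome footprint density to mRNA density."""
    return rpf_density / mrna_density


def log2_te_fold_change(rpf_stress, mrna_stress, rpf_control, mrna_control):
    """log2 of TE under stress divided by TE under the control condition."""
    te_stress = translation_efficiency(rpf_stress, mrna_stress)
    te_control = translation_efficiency(rpf_control, mrna_control)
    return math.log2(te_stress / te_control)


# Hypothetical gene: RPF density rises fourfold while mRNA density is unchanged.
example_change = log2_te_fold_change(80.0, 10.0, 20.0, 10.0)
```

Equivalently, the log2 TE change equals the RPF log2 FC minus the mRNA log2 FC, so a gene can show a large TE change either because RPFs change at constant mRNA levels or because the two change in opposite directions.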
In summary, besides extensive transcriptional reprogramming, dozens of genes exhibit significant FCs either at the transcriptional or translational level in response to acid stress. This underlines that transcription and translation are not always coupled in bacteria. Similar findings were reported by Zhang and colleagues (
41), who conducted Ribo-Seq and RNA-Seq analyses for
E. coli under heat stress. Overall, such differential regulation can be explained, for example, by delayed translation relative to transcript synthesis, selective recruitment or release of ribosomes, or regulation during translation initiation, elongation, or ribosome biogenesis (
45–49), which could be beneficial under stress conditions.
Functional implications of genes with differential mRNA and RPF levels under mild acid stress
To obtain a more profound understanding of the fine-tuned response of
E. coli to different degrees of acid stress, we first analyzed all genes with differential mRNA and ribosome coverage levels during mild acid stress (pH 5.8). Under this condition, the top candidates with the highest FC values for mRNA and RPF are as follows: (i) the
cad operon, encoding the core components of the Cad AR system (see also below); (ii) the
glp regulon, responsible for glycerol and
sn-glycerol 3-phosphate uptake and catabolism (
50); (iii) the
mdtJI operon, encoding a heterodimeric multidrug/spermidine exporter (
51); and (iv) genes encoding proteins involved in motility and flagella biosynthesis (
Table 1). A comprehensive list of normalized read counts, mRNA and RPF FCs, and TEs for all
E. coli genes is provided in
Table S5. We tested a representative selection of differentially expressed genes by RT-qPCR. In all cases, the detected changes in mRNA levels were consistent with the data gathered by RNA-Seq (Fig. S6A). Both
recA and
secA were chosen as reference genes for RT-qPCR because their rpkm counts were relatively constant under the conditions tested (
Table S5;
Fig. S6B).
Next, we performed gene set enrichment analysis (GSEA) using
clusterProfiler (
52) to identify biological processes associated with differentially expressed genes at pH 5.8. Among the most enriched Gene Ontology (GO) terms for biological processes at pH 5.8 was “spermidine transmembrane transport” (
Fig. 3), which corresponds to the induction of
mdtJI (
Table 1) and a polyamine ABC transporter encoded by
potABCD (
Table S5). Polyamines are crucial for survival under acid stress, as they reduce membrane permeability by blocking OmpF and OmpC porins (
53–55). External spermidine supplementation also improved acid resistance in
Streptococcus pyogenes (
56). On the other hand, overaccumulation of polyamines can be toxic and potentially lethal for
E. coli (
51,
57). Therefore, precise transmembrane transport of polyamines in acidic environments is critical and contributes to survival in acidic conditions.
The enrichment of the GO terms “glycerol-3-phosphate catabolic process” and “glycerol catabolic process” at pH 5.8 (
Fig. 3) has not yet been associated with acid stress to our knowledge. Notably, of the 14 genes with the largest increase in RPF counts at pH 5.8, 7 belong to the
glp regulon (
Table 1). This regulon is required for the uptake and catabolism of glycerol and
sn-glycerol 3-phosphate (G3P) (
50). In this pathway, G3P is converted to dihydroxyacetone phosphate by membrane-bound dehydrogenases, either aerobically via GlpD or anaerobically by the GlpABC complex (
58,
59). Alternatively, dihydroxyacetone phosphate can be produced directly from glycerol by GldA and the protein products of the
dhaKLM operon (
60). The
dhaKLM operon was also induced at pH 5.8 (
Table S5). It remains unclear whether glycerol and G3P catabolism directly contribute to acid tolerance or whether the
glp regulon is activated as a consequence of other low pH adjustments. Expression of
glp genes is regulated by the repressor GlpR, which is inactivated upon binding of glycerol or G3P (
61). We hypothesize that changes in phospholipid composition under acid stress conditions (
62) may release G3P, which in turn induces the
glp regulon. Accordingly, the GO term “phosphatidylglycerol biosynthetic process” was enriched under acid stress (
Fig. 3).
Another observation is the upregulation of
de novo biosynthesis pathways for pyrimidine and purine nucleotides at pH 5.8 (
Fig. 3). The induction of a large proportion of the PurR-dependent regulon involved in
de novo nucleotide synthesis (
Fig. 3;
Table S5) suggests that
E. coli requires additional nucleotides to cope with the extensive transcriptional reprogramming. Besides, intracellular acidification can lead to DNA damage, such as depurination (
63), making enhanced nucleotide biosynthesis a critical compensatory mechanism. Recently,
Oenococcus oeni was reported to experience a decrease in the abundance of both purines and pyrimidines under acid stress, while nucleotide metabolism and transport increased (
64), suggesting a similar phenomenon in this species. Other enriched GO terms under mild acid stress include “choline transport,” “siderophore transmembrane transport,” “phosphate ion transmembrane transport,” “ribosomal small subunit assembly,” “tRNA aminoacylation for protein translation,” and “bacterial-type flagellum-dependent swarming motility” (
Fig. 3). pH-dependent motility has previously been observed in
E. coli,
Salmonella, and
Helicobacter (
1). These observations suggest that bacterial cells use an escape strategy to migrate to more favorable pH environments when challenged with acidic conditions.
In contrast, our findings reveal that many genes encoding membrane and periplasmic proteins (18 of the 20 genes with the most diminished RPF counts,
Table 2) were among the top candidates with decreased mRNA and RPF levels under mild acid stress. This affected, for example, genes encoding ABC transporters (
mal regulon,
dpp operon) and symporters (
actP,
melB,
gabP), highlighting the superiority of Ribo-Seq over mass spectrometry-based approaches, namely, its independence of protein biochemistry and higher sensitivity (
29). Furthermore, GSEA identified membrane transport and metabolic activities as the most downregulated biological processes in response to mild acid stress. For example, “maltose transport,” “isoleucine transport,” “heme transport,” “putrescine catabolic process,” “glycolate catabolic process,” and “aromatic amino acid family catabolic process” were among the most downregulated GO terms at pH 5.8 (
Fig. 3). Downregulation of H+-coupled transport processes represents a key mechanism by which
E. coli restricts proton influx into cells. In addition, the downregulated metabolic processes are in many cases associated with the synthesis and conversion of amino acids and carbon sources. For example, the catabolism of aromatic amino acids and arginine was also reduced at pH 5.8 (
Fig. 3). Particularly noteworthy is the downregulation of the arginine catabolic pathway, which involves the protein products of the
astEBDAC operon. At pH 5.8, hardly any reads were mapped in the
astEBDAC region, despite detectable expression at pH 7.6 and pH 4.4 (
Table S5). Presumably,
E. coli preserves the intracellular arginine pool at pH 5.8, as this amino acid serves as a substrate for the Adi system during severe acid stress (
15,
65).
In summary, the response of
E. coli to mild acid stress is characterized by the activation of the motility machinery to escape to less acidic habitats, by induction of the
cad operon, and by genes involved in polyamine transport and glycerol-3-phosphate conversion (
Tables 1 and 2;
Fig. 3). In addition,
E. coli restricts the influx of protons and conserves energy by reducing its metabolic activities.
Functional implications of genes with differential mRNA and RPF levels under severe acid stress
Next, we analyzed genes with differential mRNA and ribosome coverage levels in response to severe acid stress (pH 4.4) compared with non-stress (pH 7.6). Genes with the highest number of increased read counts, which were not already upregulated at pH 5.8, were
asr, encoding an acid shock protein, followed by
bdm, encoding a biofilm-modulation protein, and
bhsA, encoding a multiple stress resistance outer membrane protein (
Table 3). Originally, Asr was classified as a periplasmic acid shock protein, although its role in acid adaptation remained unclear (
66). Recently, Asr was shown to be an intrinsically disordered chaperone that contributes to outer membrane integrity and acts as an aggregase to prevent aggregation of proteins with positive charges (
67). Our Ribo-seq data clearly illustrate the enormous importance of Asr under severe acid stress in
E. coli, as it is one of the most abundant proteins in the cell, with approximately 2% of all reads mapping in the
asr coding region at pH 4.4 (corresponding to an ~1,000-fold upregulation compared to pH 7.6). Strikingly, almost half of the top 20 genes with increased ribosome coverage of transcripts (
ydgU,
yhcN,
yjcB,
yedR,
yhdV,
ybiJ,
ycgZ, and
ycfJ) are poorly characterized (
Table 3). So far, only YhcN from the above list has been shown to be involved in the response to acid stress (
68).
GSEA for biological processes identified the GO terms “enterobactin biosynthetic process,” “ferric-enterobactin import into cell,” “siderophore-dependent iron import into cell,” and “siderophore transmembrane transport” as significantly enriched at pH 4.4 (
Fig. 3). Specifically, the complete enterobactin biosynthesis pathway, comprising the
entCEBAH operon,
entF,
entH, and
ybdZ, revealed significant enrichment under severe acidic conditions (
Table S5). Furthermore, all subunits of the Ton complex (
tonB,
exbB,
exbD) and its putative outer membrane receptor encoded by
yncD exhibited significantly higher RPF and mRNA levels at pH 5.8 and pH 4.4 (
Table S5). The Ton complex functions as a proton motive force-dependent molecular motor that facilitates the import of iron-bound siderophores (
69,
70). Several other iron uptake systems, including a ferric dicitrate ABC transport system (
fecABCDE), an iron (III) hydroxamate ABC transport system (
fhuACDB), a ferric enterobactin ABC transport system (
fepA,
fepB,
fepCGD), and a TonB-dependent iron-catecholate outer membrane transporter (
cirA), were also induced under acidic conditions (
Table S5). Moreover, the GO terms “protein maturation by iron-sulfur cluster assembly” and “iron-sulfur cluster assembly” were enriched at pH 4.4 (
Fig. 3). Specifically, we detected a fivefold upregulation of all genes of the
isc and
suf operons (
Table S5), which encode components of the complex machinery responsible for iron-sulfur cluster assembly in
E. coli (
71). In contrast, heme transport was among the most downregulated biological processes at both pH 5.8 and 4.4 (
Fig. 3), which could potentially be the cause of iron limitation. Moreover, at low pH, the solubility of iron ions increases, which can destabilize iron-sulfur clusters (
72). The iron limitation would be consistent with our data that
E. coli upregulates the synthesis of iron-chelating siderophores and their transporters, as well as the components of the iron-sulfur assembly machinery. Given the better solubility of iron in a low pH environment, the question arises whether
E. coli synthesizes siderophores to respond to iron limitation, or rather, protects itself against an iron excess. The latter function has been demonstrated for
Pseudomonas aeruginosa, where siderophores protected cells from the harmful effects of reactive oxygen species. In this case,
P. aeruginosa no longer secreted siderophores into the extracellular environment but instead stored them intracellularly (
73). In conclusion, these results prompt the question of whether the upregulation of the iron uptake machinery counteracts iron limitation or rather provides protection against iron excess under severe acid stress.
We also detected a significant enrichment for the GO terms “cellular response to acidic pH,” “stress response to copper ion,” and “copper ion transmembrane transport” at pH 4.4 (
Fig. 3). These results are in line with previous studies that have suggested an interplay between resistance to copper and acid stress in
Escherichia coli (
74,
75). This overlap between the two stress responses is further emphasized by our findings because at pH 4.4, substantial upregulation of the Cu+-exporting ATPase CopA and CusA, a component of the copper efflux system, was detected (
Table S5). These results are physiologically relevant, given that copper is an important antibacterial component in the innate immune system (
76,
77).
Among the downregulated genes at pH 4.4, the
tnaAB operon and its leader peptide (
tnaC) showed the most significant decrease in terms of RPF counts (
Table 4).
tnaA encodes a tryptophanase, which cleaves
L-tryptophan into indole, pyruvate, and NH4+, whereas
tnaB encodes a tryptophan:H+ symporter (
78). This finding is particularly intriguing because, in a previous study, persister cell formation in
E. coli was related to a lower cytoplasmic pH associated with tryptophan metabolism (
79). It is important to note that we also detected a substantial upregulation in RPFs for
hipA (
Table S5), which encodes a serine/threonine kinase that plays a role in persistence in
E. coli (
80). Therefore, our data provide further evidence for the link between internal pH and persistence.
The expression of several outer membrane proteins and porins (
ompW,
ompF,
nmpC,
lamB) was also downregulated at pH 4.4 (
Table 4). This observation is consistent with the extensive restructuring of the
E. coli lipid bilayers to reduce membrane permeability and limit proton entry. Similar to pH 5.8, the majority of the 20 proteins with the most reduced RPF levels compared with physiological pH are membrane proteins (
Table 4). Moreover, the GO term “ATP synthesis coupled proton transport” was significantly reduced at pH 4.4 (
Fig. 3). This is explained by the reduction in RPF levels of genes encoding subunits of the FOF1-ATPase (
Table S5). FOF1-ATPase uses the electrochemical gradient of protons to synthesize adenosine 5′-triphosphate (ATP) from ADP and inorganic phosphate but can also hydrolyze ATP to pump protons out of the cytoplasm (
81,
82). As at pH 5.8, the most downregulated biological processes at pH 4.4 were almost exclusively GO terms related to transport and cellular metabolism (
Fig. 3).
In summary, the response of E. coli to severe acid stress is dominated by the activation of survival strategies that limit the entry of protons into the cell, prevent protein aggregation, and maintain iron homeostasis. Severe acid stress leads to a reduction in metabolic, transcriptional, and translational activity, thereby preparing E. coli for a dormant state. Eventually, these dormant cells may be able to withstand antibiotic attack (i.e., persister cells).
Expanding the regulatory network of enzyme-based H+-consuming acid resistance systems
Recently, we have shown that the Adi and Cad AR systems are mutually exclusively activated in individual
E. coli cells, indicating functional diversification and division of labor under acid stress (
15). To gain further insights into the fine-tuned regulation of the three major AR systems, we first studied the mRNA and RPF levels of known enzyme-based H+-consuming AR components. The core components of the Gad system (AR2) (
gadA,
gadB, and
gadC) and several transcriptional components (
gadW,
gadX,
gadY,
phoP,
phoQ) showed an increase in mRNA and RPF levels by approximately two- to sixfold at pH 4.4, but not at pH 5.8, whereas the expression of
ydeO was massively induced at pH 5.8 (particularly at the mRNA level), and RPF levels were decreased at pH 4.4 (Fig. S7). Expression of the core components of the Adi system (AR3),
adiA and
adiC, was induced at severe acid stress but not at pH 5.8, consistent with our previous study (
15). Upregulation was not detected for regulatory components of the Adi system. A novel finding was that the levels of
adiA but not
adiC were significantly higher in the Ribo-Seq data than in the RNA-Seq data (Fig. S7). In fact,
adiA had the sixth highest increase in TE among all
E. coli genes at pH 4.4 (
Table S4), indicating translational regulation by a thus far unknown mechanism. The only other component of an AR system in
E. coli, known to be subject to translational regulation, is the major regulator CadC of the Cad (AR4) system. CadC contains a polyproline motif, and its translation therefore depends on the elongation factor P, a process that keeps the copy number of CadC extremely low (
83). As expected, expression of the core components of the Cad system (AR4),
cadA and
cadB, was tremendously increased at both pH 5.8 and 4.4. Genes of the Orn system (AR5) were not induced in our experimental setup (Fig. S7).
Next, we analyzed the mRNA and RPF levels of all annotated TFs to search for other potential TFs involved in the acid stress response of
E. coli (
Fig. 4A and B). At pH 5.8, YdeO showed by far the strongest induction at the transcriptional and translational levels, but for all other TFs, the expression levels hardly changed (
Fig. 4A). At pH 4.4, the expression of numerous TFs was induced, including GadW, YdcI, and the antibiotic resistance-controlling regulator MarR. The strongest upregulation was found for the IclR-type regulator MhpR and the iron-sulfur cluster-containing regulator IscR (
Fig. 4B). Notably, while most acid-induced TFs were differentially expressed and displayed constant TE, YdcI exhibited constant mRNA levels but was differentially translated in response to acid stress (
Fig. 4B). The contribution of all TFs with high FC values to survival under acid stress (Table S6) was tested in an acid shock assay. Cells of the corresponding knockout mutants (
84) and, for comparison, the
rcsB and
gadE mutants (each lacking a TF important for acid resistance) were exposed to pH 3 for 1 h. All mutants except
marR and
ydeO showed significantly reduced survival compared to the parental strain (
Fig. 4C). For
ydeO, this result was consistent with our finding that transcript abundance and occupancy with ribosomes were upregulated at mild but not severe acid stress (
Fig. 4;
Fig. S7). Thus, YdeO appears to be only crucial under mild acid stress (
Fig. 4A). In contrast, the
mhpR mutant had a low survival rate comparable to that of
rcsB and
gadE, and the survival rates of the
iscR,
ydcI, and
gadW mutants were only slightly higher (
Fig. 4C). These results confirm the physiological relevance of these TFs for acid resistance. As controls, we re-introduced the corresponding genes
in trans using isopropyl-β-D-thiogalactopyranoside (IPTG)-inducible pCA24N plasmids from the ASKA collection (
85). Complementation of the
mhpR,
iscR,
ydcI, and
gadW mutants, as well as
rcsB and
gadE controls, resulted in strains with survival rates comparable to the wild-type (WT) strain carrying the pCA24N control vector (Fig. S8).
Subsequently, we tested whether these TFs are involved in the regulation and interconnectivity of the Gad, Adi, and Cad systems. Therefore, we examined the promoter activities of
gadBC,
adiA, and
cadBA in the corresponding knockout mutants (
84) using transcriptional reporter plasmids (promoter-
lux fusions). The cultivation conditions were the same as those used for Ribo-Seq and RNA-Seq (
Fig. 1A), and luciferase activity was monitored during growth in microtiter plates. We found that YdcI significantly affected the promoter activity of
gadBC (
Fig. 4D). Although the LysR-type regulator YdcI has been shown to affect pH stress regulation in
Salmonella enterica serovar Typhimurium and
E. coli, its precise role is still unclear (
86–88). Based on the data presented here, we hypothesize that the decreased survival of the
ydcI mutant under severe acid stress is due to decreased expression of the Gad system. The absence of YdeO resulted in an eightfold stimulation of the
adiA promoter activity (
Fig. 4E). Thus, YdeO not only activates the Gad system (
89) but also appears to be a repressor for the Adi system. This implies that the Adi system is regulated not only by the XylS/AraC-type regulator AdiY but also by YdeO. Thus, YdeO is the first example of a transcriptional activator shown to be involved in the regulation of more than one AR system in
E. coli and might play a role in the heterogeneous activation of the Adi and Gad systems within a population. Although we observed a slight decrease in
cadBA promoter activity in the
ydcI mutant, the decrease was not statistically significant. Therefore, none of the tested TFs affected the Cad system (
Fig. 4F).
In conclusion, based on the differential expression data and lower survival of mutants during acid shock, we identified two novel TFs, namely, MhpR and IscR, which are crucial under severe acid stress (
Fig. 4C), but are not associated with the Gad, Adi, and Cad systems (
Fig. 4D through F). This implies that these regulators ensure the survival of
E. coli in acidic habitats by inducing other defense mechanisms. Of particular interest is MhpR, which had the highest increase in RPFs of all TFs at pH 4.4 (
Fig. 4B), and the corresponding mutant had the lowest survival at pH 3 (
Fig. 4C). Further studies are needed to determine whether MhpR, which is a specific regulator of the
mhpABCDFE operon encoding enzymes for the degradation of phenylpropionate (
90,
91), directly or indirectly contributes to acid resistance.
Differential expression of known and novel sORFs under mild and severe acid stress
In recent years, the annotation of many bacterial genomes has been extended by previously unknown small proteins (
29), many of which are located in the membrane (
92). This progress has been achieved primarily through the development of optimized detection strategies using adapted ribosome profiling and mass spectrometry protocols (
28,
33,
93). Recently, additional sORFs were identified in
E. coli using antibiotic-assisted Ribo-Seq, which captures initiating ribosomes at start codons (
94,
95). Advanced detection strategies also revealed novel small proteins in other species, such as the archaeon
Haloferax volcanii, the nitrogen-fixing plant symbiont
Sinorhizobium meliloti,
Salmonella Typhimurium, and
Staphylococcus aureus (
33–35,
96).
Among the previously known sORFs in
E. coli K-12 and those discovered by Storz and colleagues (
94), pH-dependent differential RPF levels were observed in our data sets for 12 and 29 small proteins at pH 5.8 and pH 4.4, respectively (
Table S8). These findings validate the expression of these sORFs and highlight their physiological relevance in the acid stress response of
E. coli. For example, induction of
mdtU, an upstream ORF of
mdtJI, was observed under mild acid stress (
Fig. 5A) and corresponds to the observed upregulation of the multidrug/spermidine exporter MdtJI (
Table 1). A previous study has shown that translation of MdtU is crucial for spermidine-mediated expression of the MdtJ subunit under spermidine supplementation at pH 9 (
97). A similar mechanism could operate under acid stress conditions. The strongest induction of sORFs under severe acid stress was detected for
ydgU (located in the same transcriptional unit as the acid shock protein-encoding gene
asr) and
azuC (
Fig. 5B). AzuCR acts as a dual-function RNA and encodes a 28-amino acid protein, but it can also base pair as an sRNA (AzuR) with two target mRNAs, including
cadA (
98). AzuCR modulates carbon metabolism through interactions with the aerobic glycerol-3-phosphate dehydrogenase GlpD (
98).
In addition to known sORFs, we aimed to uncover further hidden small proteins on the basis that our Ribo-Seq data were acquired under stress conditions to which E. coli is exposed in its natural habitat, the gastrointestinal tract. In particular, we searched for novel sORFs that remained undetected in previous Ribo-Seq approaches when E. coli was grown at a neutral pH.
Initial predictions for novel sORF candidates were acquired using the neural network-based prediction tool
DeepRibo (
99). All potential candidates were filtered based on coverage (rpkm >30 across all Ribo-Seq samples) and codon count [10–70 amino acids (aa)], with the exception of sORF15 (93 amino acids) (
Table S7), which was manually discovered by inspecting the 3′ UTR of
gadW. To further refine our search, we focused on sORF candidates that were significantly induced at either pH 5.8 or pH 4.4 (RPF log2 FC >2 and P-adjust <0.05) compared to pH 7.6. Predictions that overlapped with annotated genes on the same strand were excluded because their Ribo-Seq signals could not be distinguished from those of the annotated genes. This workflow yielded 152 candidates that were visually inspected using the web-based genome browser JBrowse2 (
100). Candidates with continuous coverage across the predicted sORF, matching the ORF boundaries, and promising Shine-Dalgarno sequences were considered high-confidence candidates. In total, we identified 18 acid-induced sORF candidates (
Table S7) that had not been previously detected. Of note, most of the candidates are encoded as part of operons or are located in the 3′ UTR of annotated genes. In addition, we detected one independent antisense sORF (sORF2 encoded antisense to
tesA) and two upstream ORFs (leader peptides): sORF18, located upstream of the translation start site of the periplasmic chaperone-encoding gene
osmY, and sORF8, located close to the glucokinase-encoding gene
glk (
Table S7).
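The automated filtering steps described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the `Candidate` record, its field names, and the function `passes_filters` are hypothetical, and "rpkm >30 across all Ribo-Seq samples" is interpreted here as a threshold that every sample must exceed. The manual exception made for sORF15 and the visual inspection in JBrowse2 are not modeled.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Candidate:
    """Hypothetical record for one predicted sORF (field names are assumptions)."""
    name: str
    length_aa: int               # length of the predicted small protein
    rpkm: List[float]            # one coverage value per Ribo-Seq sample
    log2fc: Dict[str, float]     # RPF log2 FC versus pH 7.6, keyed by condition
    padj: Dict[str, float]       # adjusted P value, keyed by condition
    same_strand_overlap: bool    # overlaps an annotated gene on the same strand


def passes_filters(c: Candidate, min_rpkm: float = 30.0, min_aa: int = 10,
                   max_aa: int = 70, min_log2fc: float = 2.0,
                   max_padj: float = 0.05) -> bool:
    """Apply the coverage, length, overlap, and induction criteria from the text."""
    if c.same_strand_overlap:
        return False
    if not (min_aa <= c.length_aa <= max_aa):
        return False
    # Assumption: the rpkm threshold must be exceeded in every sample.
    if not all(value > min_rpkm for value in c.rpkm):
        return False
    # Induced at pH 5.8 or pH 4.4: one condition meeting both cutoffs suffices.
    return any(c.log2fc[cond] > min_log2fc and c.padj[cond] < max_padj
               for cond in c.log2fc)


# Made-up example values for illustration only.
example = Candidate("candidate_1", 28, [45.0, 60.0, 31.0],
                    {"pH5.8": 0.4, "pH4.4": 3.1},
                    {"pH5.8": 0.6, "pH4.4": 0.001}, False)
print(passes_filters(example))
```

Candidates that pass such a filter would still require manual inspection of coverage, ORF boundaries, and Shine-Dalgarno sequences, as described above.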
Of these 18 acid-induced candidate small proteins (
Table S7), 17 had higher RPF counts at pH 4.4 than at pH 5.8. This suggests that the contribution of sORFs to acid defense in
E. coli is more relevant under severe acid stress. Only sORF1 showed a higher expression level in cells exposed to mild acid stress (
Fig. 5C). sORF1 is located in the 3′ UTR of
tsx, which encodes a nucleoside-specific channel-forming protein. This finding is consistent with the observed increased requirement for nucleotides by
E. coli at pH 5.8 (
Fig. 3).
For the first time, we identified two sORF candidates located within genes encoding the redundant small regulatory RNAs OmrA and OmrB (
Fig. 5D).
omrA and
omrB are highly similar at their 5′ and 3′ ends, differ mainly in their central parts, and regulate the expression of numerous outer membrane proteins (
101). Our analysis suggests that both OmrA and OmrB act as dual-function RNAs under severe acid stress and encode small proteins: a 28-amino acid protein OmrA (sORF11) and an 11-amino acid protein OmrB (sORF12) (
Fig. 5D). Due to the sequence variation in the central parts, the translation of OmrB ends at an earlier stop codon. Notably, both
omrA and
omrB displayed higher RPF levels at pH 4.4, whereas transcription of
omrA but not
omrB was induced at pH 4.4 (
Fig. 5D). Thus, despite the high sequence similarity,
omrA and
omrB do not encode identical small proteins under severe acid stress and are differentially regulated at the transcriptional and translational levels. We also detected an acid-induced sORF candidate (sORF3) in
rybB (
Table S7), another sRNA involved in the regulation of outer membrane proteins (
102). To the best of our knowledge, the presence of OmrA, OmrB, and RybB peptides has not yet been reported.
Three new candidate sORFs potentially involved in the regulation of AR systems were detected. sORF10 is located in the 3′ UTR of a potassium-binding protein encoded by
kbp and is encoded antisense to the gene for the transcriptional regulator CsiR (
Fig. 5E). The latter might be involved in the regulation of the Adi system (
15,
103). Given the significant upregulation of sORF10 at pH 4.4 and its complete complementarity to the 3′ end of the
csiR mRNA, we hypothesize that sORF10 plays a role in fine-tuning the expression of the Adi system. Strikingly, we also discovered two high-confidence sORF candidates located in the relatively long 3′ UTR of gadW, which encodes one of the major transcriptional regulators of the Gad system (
Fig. 5F). sORF14 and sORF15 exhibit constant coverage across the predicted ORF and contain Shine-Dalgarno sequences (
Table S7). These results suggest that the complex Gad system may consist of even more components.
To gain further insight into the subcellular location and features of the newly identified sORF candidates, we used PSORTb (
104) and DeepTMHMM (
105) for transmembrane topology prediction. Notably, sORF15 is predicted to be located in the inner membrane and has two transmembrane helices, which were predicted with a probability of >90% (Fig. S9A). Additionally, the sORF15 protein structure prediction using AlphaFold2 (
106) in Google Colab (ColabFold) (
107) revealed a potential third helix toward the C-terminal end (Fig. S9B). Using blastp and tblastn (
108), we found homologs of sORF15 with >80% identity in
Vibrio,
Shigella,
Klebsiella,
Salmonella,
Enterococcus, and
Escherichia (Fig. S9C) and identified homologs with at least 60% identity for approximately half of the other candidate sORFs (sORF2, 4, 5, 6, 7, 9, 10, 11, 16, and 18). These results strengthen confidence in the correct prediction of these sORFs. However, homologs in other species often displayed only partial matches and were almost exclusively annotated as “hypothetical proteins,” as illustrated for sORF15 (Fig. S9C). Moreover, we evaluated whether sORF15 is translated in the absence of the upstream gene
gadW. A pBAD24-sORF15:3xFLAG plasmid, which harbors the native Shine-Dalgarno sequence of sORF15 (
Table S11), and a FLAG-tagged version of sORF15 were constructed. sORF15 translation was successfully verified by Western blotting (Fig. S9D), which exemplifies that sORFs detected in this study yield detectable protein products.
In conclusion, we identified 18 high-confidence candidates for novel sORFs that are significantly induced upon exposure of E. coli to mild or severe acid stress.
Differentiation of the acid stress and general stress responses using autoencoder-based machine learning
In general, stress response mechanisms can be broadly classified into two categories: global stress responses and adaptations to specific types of stress. Global stress responses can be triggered by various stimuli and provide protection against multiple other unrelated stress factors (
109). The global response often involves the activation of alternative sigma factors that affect hundreds of genes. In contrast, adaptations to specific types of stress are tailored to the specific stressor and involve a regulator that senses an environmental cue and modulates the expression of a set of genes, which counteract the stress (
109,
110).
Given the large number of differentially regulated genes and pathways in response to acid stress (
Fig. 2 and 3), we asked which of these adaptive mechanisms are acid-specific and which are also triggered by other stressors. To distinguish acid-specific from general stress responses, we used denoising autoencoders (DAEs), deep learning models designed for meaningful dimensionality reduction (
111,
112). DAEs accomplish this by passing data through an encoder that compresses it into activations of a bottleneck layer (
Fig. 6A1), with each node in the bottleneck layer interpretable as a coordinated expression program (
113). For our analysis, we employed an ensemble of deep DAEs (see Materials and Methods) (
113), trained on the
E. coli K-12 PRECISE 2.0 compendium (
114), augmented with additional stress conditions (
115), as well as the acid stress conditions of the current study (
Fig. 6A1). Using this method and data set, we conducted a comparative analysis of the transcriptional response of
E. coli to pH 4.4 and pH 5.8, contrasted against an extensive range of other stress conditions, including heat stress (
116), ethanol stress (
117), osmotic stress (
118), oxidative stress (
119,
120), low oxygen (LOX) (
115), and exposure to sublethal concentrations of chloramphenicol (CAM) (
115) and trimethoprim (TMP) (
115).
To identify biological processes associated with a particular stress condition, we passed the associated RNA-seq data set into the encoder of each network and identified bottleneck nodes that were uniquely turned on by that data set (
Fig. 6A2). We then manually turned on these nodes to generate gene sets that are associated with that condition, which can be further analyzed through GO term enrichment (
Fig. 6A3 and 4). Using this procedure, we identified groups of nodes that uniquely turn on for acid stress conditions and turn off for all other stress conditions, as well as groups that are simultaneously on for both acid and one additional stress condition. We observed that there are many nodes that turn on simultaneously upon both acid and ethanol exposure (
Fig. 6B). The overlap between acid stress and ethanol stress responses has been noted previously and can be explained by the fact that ethanol fluidizes the cytosolic membrane and increases the permeability for protons (
121). Furthermore, there are indications of an overlap between acid and antibiotic stress (
122,
123), reflected in the high number of acid + CAM activating nodes (
Fig. 6B).
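The node selection logic described above can be illustrated with a short Python sketch. It is a simplified stand-in, not the implementation used in the study: the function names, the dictionary layout of the bottleneck activations, and the activation threshold of 0.5 that defines a node as "on" are all assumptions made for illustration, and the activation values are made up.

```python
from typing import Dict, List, Optional, Set


def on_nodes(activations: Dict[str, List[float]], condition: str,
             threshold: float = 0.5) -> Set[int]:
    """Indices of bottleneck nodes whose activation exceeds the (assumed) threshold."""
    return {i for i, a in enumerate(activations[condition]) if a > threshold}


def specific_nodes(activations: Dict[str, List[float]], target: str,
                   partner: Optional[str] = None,
                   threshold: float = 0.5) -> List[int]:
    """Nodes on for `target` (and `partner`, if given) and off for all other conditions."""
    required = [target] + ([partner] if partner else [])
    selected = set.intersection(
        *(on_nodes(activations, cond, threshold) for cond in required))
    for cond in activations:
        if cond not in required:
            selected -= on_nodes(activations, cond, threshold)
    return sorted(selected)


# Made-up activations for four bottleneck nodes under three conditions.
activations = {
    "pH4.4":   [0.9, 0.8, 0.1, 0.7],
    "ethanol": [0.1, 0.9, 0.2, 0.1],
    "heat":    [0.0, 0.1, 0.9, 0.9],
}
print(specific_nodes(activations, "pH4.4"))                     # acid-specific nodes
print(specific_nodes(activations, "pH4.4", partner="ethanol"))  # acid + ethanol nodes
```

In this toy example, node 0 is on only at pH 4.4, whereas node 1 is shared between pH 4.4 and ethanol stress. In the study, this step is performed for each network in the ensemble.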
To pinpoint which cellular adaptations cause acid-specificity for the 48 and 91 specific bottleneck nodes at pH 5.8 and 4.4 (
Fig. 6B), respectively, we conducted GSEA for biological processes on each of the gene sets associated with acid-specific node groups (see Materials and Methods). The GO terms that were significantly enriched in the highest number of both pH 5.8- and pH 4.4-specific upregulating node gene sets were “siderophore transmembrane transport,” “response to cold,” “bacterial-type flagellum assembly,” and “chemotaxis” (
Fig. 6C), reflecting our previous differential RNA-seq analysis (
Fig. 3). The appearance of the GO term “response to cold” might be a result of the lack of cold stress in our compendium of stressors. Additionally, it should be noted that genes associated with this GO term include cold shock proteins, which may have broader roles in the survival of stress conditions (
124,
125), as well as several prophage genes and ribosome biogenesis factors. We found that mild acid stress turns on nodes that correspond to gene sets associated with nucleotide and ribosome biosynthesis, including the GO terms “ribosome large subunit assembly,” “ribosome small subunit assembly,” and
“de novo IMP biosynthetic process,” while severe acid stress turns them off (
Fig. 6C). These findings are consistent with our previous observations, namely, that
E. coli induces nucleotide and ribosome biosynthesis to cope with mild acid stress but enters a metabolically inactive state under severe acid stress. The GO terms significantly affected in the highest number of acid-specific pH 4.4 downregulated bottleneck nodes were “proton motive force-driven ATP synthesis” and “proton-transporting ATP synthase complex” (
Fig. 6C). These two GO terms exclusively involve genes encoding subunits of the FOF1 ATP synthase and can be considered paradigms for acid-specific adaptations since the FOF1 ATP synthase can also pump protons (
126).
Considering that we detected a high number of genes induced by severe acid stress with unknown functions (
Table 3) and lacking GO associations, we expanded our search for acid-specific adaptations from GO terms to single genes. To select acid-specific candidate genes, we investigated all genes associated with acid-specific bottleneck nodes and calculated the log2 FC between each acid stress and every other above-mentioned stress condition for these genes. Genes with the highest expression values under acidic conditions and log2 FCs of at least 0.5 for at least 95% of comparisons were then selected. This procedure yielded 10 candidate genes (
Fig. 6D). To experimentally validate that these genes are indeed specifically upregulated under acid stress, we exposed
E. coli to a variety of common stressors and performed RT-qPCR.
E. coli cells were either grown under acid stress (
Fig. 1A) or exposed to heat (42°C), oxidative (H2O2), osmotic (NaCl), antibiotic (chloramphenicol), or ethanol (EtOH) stress. For all investigated genes, the strongest upregulation was observed at either pH 4.4 or pH 5.8 relative to non-stress conditions (
Fig. 6D), except for
ycfJ, which was activated at pH 4.4 and under oxidative and ethanol stress. The remaining investigated genes were upregulated under at most one other stress condition (
Fig. 6D). Given that
emrE and
mdtJ encode multidrug exporters, the induction upon supplementation with sublethal concentrations of chloramphenicol is not surprising and further underscores the interplay between acid and antibiotic stress. The observed upregulation of
yhcN under oxidative stress (
Fig. 6D) was also reported previously (
127). Nevertheless, we uncovered four
bona fide examples of genes (
ybiJ,
hslJ,
yejG, and
yhjX) that displayed exclusive pH-dependent expression (
Fig. 6D). Induction of
yhjX, encoding a putative pyruvate transporter, might be related to the deamination of serine, which yields ammonia and pyruvate in uropathogenic
E. coli (
128). The precise molecular functions of YbiJ, YejG, and HslJ in the context of acid stress are currently unclear. These results highlight that our autoencoder pipeline is complementary to differential gene expression analysis, yielding biologically consistent results while also identifying expression patterns that uniquely discriminate acid stress from other stress responses.
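The fold-change criterion used above to select acid-specific candidate genes (log2 FC of at least 0.5 in at least 95% of acid-versus-other-stress comparisons) can be sketched as follows. This is an illustrative sketch only: the function name, the input layout, and the pseudocount of 1 are assumptions, the gene names and expression values are placeholders, and the additional ranking by highest expression under acidic conditions is not modeled.

```python
import math
from typing import Dict, List


def acid_specific_genes(expr: Dict[str, Dict[str, float]],
                        acid_conditions: List[str],
                        other_conditions: List[str],
                        min_log2fc: float = 0.5,
                        min_fraction: float = 0.95,
                        pseudocount: float = 1.0) -> List[str]:
    """Select genes whose log2 FC (acid vs. other stress) meets the cutoff
    in at least `min_fraction` of all pairwise comparisons."""
    selected = []
    for gene, values in expr.items():
        # One comparison per (acid condition, other stress condition) pair.
        fold_changes = [
            math.log2((values[acid] + pseudocount) / (values[other] + pseudocount))
            for acid in acid_conditions
            for other in other_conditions
        ]
        fraction = sum(fc >= min_log2fc for fc in fold_changes) / len(fold_changes)
        if fraction >= min_fraction:
            selected.append(gene)
    return selected


# Placeholder expression values (normalized, linear scale) for two hypothetical genes.
expr = {
    "geneA": {"pH4.4": 100.0, "pH5.8": 80.0, "heat": 10.0, "ethanol": 12.0},
    "geneB": {"pH4.4": 100.0, "pH5.8": 80.0, "heat": 90.0, "ethanol": 10.0},
}
print(acid_specific_genes(expr, ["pH4.4", "pH5.8"], ["heat", "ethanol"]))
```

Here, the hypothetical geneA is retained because it is higher under both acid conditions than under every other stressor, whereas geneB is rejected because its expression under heat stress is close to its expression under acid stress.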
Conclusions
Here, we present the first comprehensive study on the global transcriptome- and translatome-wide response of
E. coli exposed to varying degrees of acid stress. Our investigation goes beyond previous research, which focused on comparing
E. coli transcriptomes across different pH levels during growth (
18,
21). Instead, we report on rapid changes occurring upon sudden pH shifts, which are relevant for bacteria such as
E. coli during passage through the gastrointestinal tract (
129).
Using both Ribo- and RNA-Seq, we uncovered not only well-known acid defense mechanisms but also numerous previously unrecognized genes and pathways that help combat mild and severe acid stress (
Fig. 7). The latter include siderophore production, glycerol-3-phosphate conversion, copper export,
de novo nucleotide biosynthesis, and spermidine/multidrug export (
Fig. 3 and 7). A striking number of membrane proteins and H+-coupled transporters were found to be downregulated under both mild and severe acid stress (
Fig. 7;
Tables 2 and 4), underscoring the importance of the cytosolic membrane and its composition as a barrier for protons. Moreover, under severe stress, many outer membrane proteins were downregulated (
Fig. 7;
Table 4). Notably, a large proportion of genes with yet unknown functions were strongly induced, particularly under severe acid stress (
Table 3). Our results suggest that exposing
E. coli to culture conditions mimicking near-lethal habitats can offer valuable insights into the molecular functions of genes with low expression levels under standard growth conditions.
Our analysis revealed two new TFs, MhpR and IscR, involved in acid stress adaptation. Furthermore, we gained new insights into the role of the TFs YdeO and MarR. YdeO controls not only the transcription of genes in the Gad system but also
adiA in the Adi system (
Fig. 4), suggesting that YdeO connects the regulation of two AR systems in
E. coli. The observed upregulation of MarR under acid stress, combined with the low contribution of this TF to acid resistance (
Fig. 4B), may provide a link to antibiotic resistance and solvent stress tolerance in
E. coli (
130).
In addition to the pH-dependent differential expression levels of previously identified small proteins, such as YdgU, MdtU, and AzuC, we identified 18 high-confidence, not yet annotated, sORF candidates (
Fig. 5). Of particular interest are sORF14 (13 amino acids) and sORF15 (93 amino acids), which are located in a transcriptional unit with
gadW and
gadX, suggesting their association with the Gad AR system and a potential involvement in glutamate transport and/or glutamate decarboxylation to gamma-aminobutyrate (GABA). Considering the predicted membrane location of sORF15 and its adjacent gene
mdtF, an association with the glutamate/GABA antiporter GadC and/or the multidrug efflux pump MdtF is conceivable.
The autoencoder-based comparison with other common stressors allowed us to distinguish acid stress-specific adaptations from general stress response programs (
Fig. 6). Therefore, it was possible to differentiate between direct and indirect effects triggered by protonation and/or cellular damage. Considering the growing volume of next-generation sequencing data, denoising autoencoders will be an increasingly important tool for interpreting future studies in the full context of accumulating RNA-seq data sets. Colonizing the intestinal tract is a complex process that includes not only rapid pH changes but also alterations in oxygen and nutrient availability as well as competition with other bacteria. The ability of pathogenic
E. coli strains to respond to such rapidly changing environments ensures their fitness advantage. We have shown here, for acid stress, the complexity of the regulatory network for ensuring survival and adaptation. The use of autoencoders, successfully tested here, could allow for the identification of physiological weak points associated with the survival of specific stresses. Targeting such weak points could lead to new classes of antibiotics or antivirulence treatments that take advantage of the unique expression patterns induced by natural stress conditions encountered in the host environment.