INTRODUCTION
To ensure cell survival, bacteria have to adapt to changing environmental conditions (1). To this end, bacterial cells are equipped with an array of signal transduction systems that sense diverse environmental stimuli, such as osmolarity, oxygen tension, temperature, pH, light, nutrients, toxins, and other chemicals (2). Chemosensory pathways represent one of the primary bacterial signal transduction mechanisms, and more than half of all bacterial genomes contain chemosensory signaling genes (3). Most chemosensory pathways appear to mediate chemotaxis (3), whereas others have been associated with type IV pilus-based motility (4) or alternative cellular functions such as the control of second messenger levels (4, 5).
In a canonical chemosensory pathway, signals are perceived through the binding of specific molecules to the ligand-binding domain (LBD) of chemoreceptors (CRs), which modulates the activity of the CheA autokinase and the subsequent transphosphorylation to the CheY response regulator. In canonical CRs, the extracytosolic LBD is flanked by two transmembrane (TM) regions and is followed by a cytosolic HAMP domain and a signaling domain (MCPsignal). While the MCPsignal domain is highly conserved, LBDs are rapidly evolving domains (6), which reflects the wide variety of chemoeffectors to be sensed. To date, more than 80 different LBD families have been identified (7, 8), and new types of LBDs continue to be discovered (9). The thermodynamic parameters for ligand binding to the individual CRs are very similar to those for binding to specific LBDs (10, 11), supporting the idea that the molecular determinants for signal recognition by CRs are located in the LBD. Further evidence came from the construction of chimeric receptors that recombine LBDs with other signaling domains (e.g., autokinase domains), in which the LBD was shown to define the function of the chimera (12, 13). Thus, while the conserved MCPsignal domain can be used to identify CRs, their LBDs allow them to be classified on the basis of their function (7, 8).
In addition, there is evidence suggesting that the genomic repertoire of CRs is related to bacterial lifestyle (14, 15). For instance, it has been shown that plant-associated bacteria (PAB) possess a particularly large number of CRs (8, 16), indicating that chemosensory signaling is an important requisite for plant-bacterium interactions. This is of particular relevance for plant pathogens and symbionts, for which it has been shown that flagellum-mediated chemotaxis is required for optimal virulence or symbiosis establishment (17–25). Plants represent complex habitats for colonization by different kinds of microorganisms, and PAB species can colonize the plant rhizosphere, phyllosphere, or endosphere (26). Motile sensory behavior has been shown to play a key role in the establishment of plant-microbe interactions, since bacteria that can sense and rapidly navigate toward niches optimal for growth and survival will have a clear competitive advantage (27–29). These considerations are valid for both pathogenic and nonpathogenic relationships between microorganisms and plants (8, 16). Similarly, microbial inhabitants of the phyllosphere, which comprises the aerial parts of plants, have to deal with the challenges of life on leaf surfaces, where flagellar motility confers advantages in terms of epiphytic fitness (30). The epiphytic lifestyle also represents the initial stage of foliar colonization by many bacterial phytopathogens, preceding entry into the leaf apoplast via wounds or natural plant openings (e.g., stomata) (30). However, despite their biological significance, the function and cognate signal have been determined for only a limited number of CRs from PAB, and very little information exists on their phylogenetic and ecological specificity.
To identify the LBD types most tightly coupled to a plant-associated lifestyle, we comprehensively identified the CR genes in all known bacterial lineages and classified them according to their LBDs. To this end, we employed a novel de novo methodology to extract putative LBD regions from all CR sequences and group them into homology-based clusters (i.e., putative LBD types). This analysis allowed us to identify hundreds of LBD types highly specific for PAB species, many of them previously unknown. We further found that the taxonomic distribution of the majority of PAB-specific LBD clusters is only partially explained by phylogeny, suggesting that niche and host adaptation might have played relevant roles in their selection. Together, these results form a solid basis for the design of experiments aimed at identifying CRs that are essential for plant-microbe interactions and virulence.
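The general idea of extracting putative LBD regions and grouping them by homology can be illustrated with a minimal Python sketch. It is not the methodology used in this study. It assumes the canonical topology described above (an extracytosolic LBD flanked by two TM regions), takes hypothetical precomputed TM coordinates and pairwise homology hits as input, and uses simple single-linkage grouping. All function names, the length threshold, and the input layout are illustrative assumptions.

```python
from collections import defaultdict


def extract_lbd(sequence, tm_segments, min_length=30):
    """Return the region between the first two TM segments, or None.

    tm_segments: list of (start, end) pairs, 0-based and end-exclusive,
    assumed to come from an external TM predictor (hypothetical input).
    min_length is an arbitrary illustrative cutoff, not a published value.
    """
    if len(tm_segments) < 2:
        return None  # non-canonical topology; not handled in this sketch
    (_, tm1_end), (tm2_start, _) = sorted(tm_segments)[:2]
    lbd = sequence[tm1_end:tm2_start]
    return lbd if len(lbd) >= min_length else None


def cluster_by_homology(ids, homologous_pairs):
    """Group sequence IDs by single linkage using a union-find structure.

    homologous_pairs: iterable of (id_a, id_b) pairs that some external
    homology search has already judged significant (hypothetical input).
    """
    parent = {i: i for i in ids}

    def find(x):
        # Follow parent links to the root, compressing the path on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in homologous_pairs:
        parent[find(a)] = find(b)

    clusters = defaultdict(set)
    for i in ids:
        clusters[find(i)].add(i)
    # Deterministic output: members sorted, clusters ordered by first member.
    return sorted((sorted(c) for c in clusters.values()), key=lambda c: c[0])
```

Single linkage is chosen here only because it is short to write. It tends to merge distantly related families through chains of intermediate hits, so a real analysis would normally rely on a dedicated clustering tool and tuned thresholds.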
DISCUSSION
In the present study, we carried out a comprehensive phylogenomic analysis of the full repertoire of CRs from a wide collection of microbial genomes, classifying them according to their LBDs. To maximize the representativeness of our study, we used more than 82,000 species-level CR sequences from 11,000 species-representative genomes, significantly expanding the scope of previous works (7, 15, 45) in terms of both the number of sequences examined and the phylogenetic coverage. To achieve this, we developed a novel method to extract LBDs and classified them using a de novo homology-based clustering approach, departing from the traditional classification of CRs centered on their general protein topology (15, 45–47) or on searches for known LBDs (7). This approach allowed us to identify many new potential LBD types, suggesting that the chemosensing landscape remains largely unexplored. Additionally, we believe that our strategy of delineating large LBD families into fine-grained subcategories could provide further information (Fig. 3). Moreover, by classifying CRs based on their putative LBD type, we were able, for the first time, to quantify to what extent the chemosensory activity of PAB is linked to lifestyle.
Considering the enormous variety of LBDs in sensor proteins, establishing the forces that have driven their evolution is an important question that has never been specifically addressed. To our knowledge, we present here the first clear demonstration that environmental factors play an important role in the selection and evolution of LBDs. We found that the specificity of LBDs to a plant-associated lifestyle could not be explained by a phylogenetic signal alone, since most PAB-specific LBD types were scattered over the microbial phylogeny, at times spanning different orders and phyla. This indicates that the selection of certain CRs might indeed be guided by ecological factors, opening the possibility of identifying lifestyle biomarkers.
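The two quantities discussed above can be made concrete with a short Python sketch. For each LBD cluster, it computes the fraction of member sequences that come from PAB genomes (a simple specificity score) and the number of distinct taxonomic orders among those PAB members (a crude proxy for taxonomic scatter). This is an illustration under assumed inputs, not the statistical procedure used in this study. The record layout, the function name, and the toy records are hypothetical.

```python
from collections import defaultdict


def summarize_clusters(records):
    """Summarize PAB specificity and taxonomic spread per LBD cluster.

    records: iterable of (cluster_id, is_pab, order) tuples, one per LBD
    sequence. This layout is an assumption made for the sketch.
    """
    members = defaultdict(list)
    for cluster_id, is_pab, order in records:
        members[cluster_id].append((is_pab, order))

    summary = {}
    for cluster_id, rows in members.items():
        n_pab = sum(1 for is_pab, _ in rows if is_pab)
        pab_orders = {order for is_pab, order in rows if is_pab}
        summary[cluster_id] = {
            "size": len(rows),
            "pab_fraction": n_pab / len(rows),   # share of members from PAB
            "n_pab_orders": len(pab_orders),     # distinct orders among PAB members
        }
    return summary


# Toy records with placeholder labels; these are not data from the study.
toy_records = [
    ("cluster_1", True, "order_A"),
    ("cluster_1", True, "order_B"),
    ("cluster_1", False, "order_A"),
    ("cluster_2", False, "order_C"),
]
toy_summary = summarize_clusters(toy_records)
```

In this framing, a cluster with a high PAB fraction whose PAB members span several orders is the kind of pattern that phylogeny alone would not readily explain. A real analysis would also need a statistical test against a background distribution, which is omitted here.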
We also found that bacterial species more tightly associated with plant environments (such as plant symbionts and phytopathogens) tend to have stronger lifestyle specificity signals in their CR repertoire. For instance, plant symbionts had the largest number of PAB-specific LBDs per genome, followed by phytopathogens, with both showing significantly higher ratios than generic soil microbiota. It appears likely that even stronger links between the chemosensory capabilities of bacteria and their lifestyle will be detected in the future as more data become available on new organisms (e.g., via metagenomic sequencing) and on their niche adaptation (i.e., plant-tissue specificity).
These findings thus offer a number of research opportunities in the field of signal transduction. First, it can be explored whether similar relationships can be observed in CRs of bacteria with a different lifestyle, such as those that inhabit or infect the human intestine. Another interesting issue that needs to be addressed is whether similar LBD types are shared by members of different sensor protein families. Major families of these receptors are sensor histidine kinases; chemoreceptors; adenylate, diadenylate, and diguanylate cyclases; and certain cAMP, c-di-AMP, and c-di-GMP phosphodiesterases, as well as Ser/Thr/Tyr protein kinases and phosphoprotein phosphatases (48). As the different sensor proteins of a given strain are exposed to the same signals, it appears plausible that the same LBD types might be present in members of different sensor protein families. Several examples pointing in this direction have been reported, such as the specific sensing of nitrate by PilJ-type LBDs in both the NarQ-type sensor kinases (49) and the McpN chemoreceptor (50), and the PAS domain, which is found universally in different signal transduction systems (48). It would be of interest to estimate the global occurrence of such cases.
Overall, we believe that our study provides a comprehensive resource for future studies on bacterial chemoreception and that it sets the basis for the identification of novel CRs relevant for bacterium-plant interactions.
ACKNOWLEDGMENTS
This research has been supported by grants PGC2018-098073-A-I00 MCIU/AEI/FEDER, UE (to J.H.-C.), BIO2016-76779-P (to T.K.), AGL2017-82492-C2-1-R (to C.R.), and RTI2018-095222-B-I00 (to E.L.-S.) from the Ministerio de Ciencia, Innovación y Universidades, Spain, as well as grant P18-FR-1621 (to T.K.) from the Junta de Andalucía. C.S.-L. was supported by the FPU program (FPU19/06635, MICINN-Spain), and J.P.C.-V. by the FPI program (BES-2016-076452, MINECO-Spain).