Hiding in Plain Sight: Mining Bacterial Species Records for Phenotypic Trait Information
ABSTRACT
INTRODUCTION
RESULTS AND DISCUSSION
Description of the phenotypic database.
Category | Components |
---|---|
Ancillary data | Year of publication, article digital object identifier (DOI), taxonomic nomenclature, culture collection code |
Morphology/phenotype | Gram stain status, cell length, cell width, cell shape, cell aggregation, motility, spore and pigment formation |
Metabolism | General metabolism, sole carbon substrate use, BIOLOG information available |
Environmental preferences | Habitat of isolation; oxygen requirement; range and optimum for pH, temperature, and salt |
Sequence data | GC content, 16S rRNA accession number, genome accession number |
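
The categories in the table above map naturally onto a typed record. The following is a minimal sketch of how one database entry might be represented, not the authors' actual schema: the class names `SpeciesRecord` and `RangeOptimum`, every field name, and the `is_consistent` check are assumptions made for illustration. Units are not given in the table, so none are assumed.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class RangeOptimum:
    """Growth range and optimum for one environmental variable.

    Hypothetical helper type; units are not specified in the table.
    """
    low: Optional[float] = None
    high: Optional[float] = None
    optimum: Optional[float] = None

    def is_consistent(self) -> bool:
        """True when the known values satisfy low <= optimum <= high."""
        known = [v for v in (self.low, self.optimum, self.high) if v is not None]
        return all(a <= b for a, b in zip(known, known[1:]))


@dataclass
class SpeciesRecord:
    """One species entry; field names are illustrative assumptions."""
    # Ancillary data
    taxon_name: str
    year_of_publication: Optional[int] = None
    doi: Optional[str] = None
    culture_collection_code: Optional[str] = None
    # Morphology/phenotype
    gram_stain: Optional[str] = None
    cell_length: Optional[float] = None
    cell_width: Optional[float] = None
    cell_shape: Optional[str] = None
    cell_aggregation: Optional[str] = None
    motility: Optional[str] = None
    spore_forming: Optional[bool] = None
    pigment_forming: Optional[bool] = None
    # Metabolism
    general_metabolism: Optional[str] = None
    sole_carbon_substrates: List[str] = field(default_factory=list)
    biolog_available: Optional[bool] = None
    # Environmental preferences
    habitat: Optional[str] = None
    oxygen_requirement: Optional[str] = None
    ph: RangeOptimum = field(default_factory=RangeOptimum)
    temperature: RangeOptimum = field(default_factory=RangeOptimum)
    salt: RangeOptimum = field(default_factory=RangeOptimum)
    # Sequence data
    gc_content: Optional[float] = None
    rrna_16s_accession: Optional[str] = None
    genome_accession: Optional[str] = None


# Placeholder values only; these are not taken from the database.
record = SpeciesRecord(
    taxon_name="Hypothetical species",
    ph=RangeOptimum(low=5.0, high=9.0, optimum=7.0),
)
print(record.ph.is_consistent())
```

Every field except the taxon name defaults to "unknown" because species descriptions rarely report all traits, and a sparse record should still be storable.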


Phylogenetic signal of phenotypic traits.
Trait | Type | Phylogenetic signal |
---|---|---|
Spore | Categorical | 1.225 |
Pigment | Categorical | 0.219 |
Shape (rod) | Categorical | 0.628 |
Shape (coccus) | Categorical | 0.703 |
Aggregation (chain) | Categorical | 0.182 |
Gram stain | Categorical | 1.516 |
Flagella | Categorical | 0.495 |
Aerobe | Categorical | 0.575 |
Anaerobe | Categorical | 0.593 |
Temperature preference | Continuous | 0.226 |
pH preference | Continuous | 0.006 |
Salinity preference | Continuous | 0.023 |

Linking genomic information to pH and salinity optima.
KO ID | Optimum | Description | Sign of coefficient | Present in TCDB |
---|---|---|---|---|
K01546 | Both | K+-transporting ATPase ATPase A chain | − | Yes |
K01547 | Both | K+-transporting ATPase ATPase B chain | − | Yes |
K01548 | Both | K+-transporting ATPase ATPase C chain | − | Yes |
K03310 | Both | Alanine or glycine:cation symporter, AGCS family | + | Yes |
K03499 | Both | Trk system potassium uptake protein | + | Yes |
K07301 | Both | Cation:H+ antiporter | + | Yes |
K08974 | Both | Putative membrane protein | + | No |
K03543 | pH | Membrane fusion protein, multidrug efflux system | − | Yes |
K03446 | pH | MFS transporter, DHA2 family, multidrug resistance protein | − | Yes |
K08677 | pH | Kumamolisin | − | No |
K07799 | pH | Membrane fusion protein, multidrug efflux system | − | Yes |
K06045 | pH | Squalene-hopene/tetraprenyl-beta-curcumene cyclase | − | Yes |
K15495 | pH | Molybdate/tungstate transport system substrate-binding protein | − | Yes |
K15496 | pH | Molybdate/tungstate transport system permease protein | − | Yes |
K14393 | pH | Cation/acetate symporter | + | Yes |
K02168 | pH | Choline/glycine/proline betaine transport protein | + | Yes |
K07393 | pH | Putative glutathione S-transferase | + | No |
K06718 | pH | L-2,4-Diaminobutyric acid acetyltransferase | + | No |
K06720 | pH | L-Ectoine synthase | + | No |
K09908 | pH | Uncharacterized protein | + | No |
K06213 | pH | Magnesium transporter | + | Yes |
K05565 | pH | Multicomponent Na+:H+ antiporter subunit A | + | Yes |
K05567 | pH | Multicomponent Na+:H+ antiporter subunit C | + | Yes |
K05568 | pH | Multicomponent Na+:H+ antiporter subunit D | + | Yes |
K05569 | pH | Multicomponent Na+:H+ antiporter subunit E | + | Yes |
K05570 | pH | Multicomponent Na+:H+ antiporter subunit F | + | Yes |
K05571 | pH | Multicomponent Na+:H+ antiporter subunit G | + | Yes |
K14683 | pH | Solute carrier family 34 (sodium-dependent phosphate cotransporter) | + | Yes |
K14445 | pH | Solute carrier family 13 (sodium-dependent dicarboxylate transporter), member 2/3/5 | + | Yes |
K03451 | pH | Betaine/carnitine transporter, BCCT family | + | Yes |
K03308 | pH | Neurotransmitter:Na+ symporter, NSS family | + | Yes |
K08714 | pH | Voltage-gated sodium channel | + | Yes |
K03826 | pH | Putative acetyltransferase | + | No |
K03975 | Salinity | Membrane-associated protein | − | Yes |
K08223 | Salinity | MFS transporter, fosmidomycin resistance protein | − | Yes |
K07646 | Salinity | Two-component system, OmpR family, sensor histidine kinase KdpD | − | No |
K03549 | Salinity | KUP system potassium uptake protein | − | Yes |
K03699 | Salinity | Putative hemolysin | − | No |
K02276 | Salinity | Cytochrome c oxidase subunit III | + | No |
K07160 | Salinity | UPF0271 protein | + | No |
Future research.
MATERIALS AND METHODS
Database compilation and curation.
Phylogenetic signal analyses.
Association between genomic attributes and environmental preferences.
ACKNOWLEDGMENTS
REFERENCES