Streptococcus agalactiae, also known as group B Streptococcus (GBS), has gained worldwide attention over the past few decades because of its pathogenicity in newborns and pregnant women (1, 2). GBS is the leading cause of sepsis and meningitis in newborns due to its ability to adhere to the mother's vaginal tract (3). It has also caused an increasing rate of infection in adults, including immunocompromised patients, the elderly, and diabetics (4). In all cases, the bacteria must first adhere to the host cell surface before virulence is induced and pathogenesis ensues. Two serine-rich repeat glycoproteins (Srr1 and Srr2) identified in various S. agalactiae strains (5, 6) are known to mediate bacterial attachment to the host cell surface (3, 4, 7, 8).
Serine-rich repeat glycoproteins (SRRPs) belong to a growing family of adhesins in Gram-positive bacteria, and many of them contribute to bacterial pathogenesis (9). Besides Srr1 and Srr2, the SRRP family also includes PsrP of Streptococcus pneumoniae (10), Fap1 of Streptococcus parasanguinis (11), GspB and Hsa of Streptococcus gordonii (12, 13), and SraP of Staphylococcus aureus (14). A feature shared by these SRRPs is that all are glycosylated and that glycosylation plays a central role in SRRP biogenesis and pathogenesis (9, 15, 16). Therefore, it is important to understand the glycosylation mechanism of the SRRPs. Genes involved in the glycosylation process have been studied for various SRRPs over the past few years (12, 16–30). However, little is known about the genes involved in the glycosylation of Srr2. Srr2 is a surface protein found in hypervirulent serotype III GBS strains (6), while another SRRP, Srr1, is present in strains that are commonly associated with neonatal infection, such as those of serotypes Ia, Ib, and V and certain serotype III strains. Like other SRRP genes, the genes encoding Srr1 and Srr2 are each located within a conserved gene cluster that contains two regions: a core region, which is highly conserved in every SRRP locus (9, 16, 31), and a variable region (16). The core region encodes two essential glycosyltransferases, GtfA and GtfB, and several accessory secretory components, SecA2, SecY2, Asp1, Asp2, and Asp3. The variable region encodes a number of putative glycosyltransferases, which are species and strain dependent. For instance, the organization of the glycosyltransferases differs between S. agalactiae strains that carry Srr1 and those that carry Srr2 (6, 16). The number of glycosyltransferases in this locus also differs from the number in the PsrP locus of various S. pneumoniae strains (16).
In the variable region, a putative glycosyltransferase gene, gtfC, from the hypervirulent GBS strain COH1 is located at the 3′ end of the srr2 locus and lies downstream of gtfA and gtfB (16) (see Fig. S1 in the supplemental material). GtfC is a homolog of a glucosyltransferase (Gtf3) from S. parasanguinis, which belongs to a new subfamily of glycosyltransferases (28). Although the catalytic domain of Gtf3 has been well characterized (28), no study has determined the acceptor substrate binding site of Gtf3 or Gtf3-like glycosyltransferases. Functional acceptor substrate binding sites are hallmarks of all glycosyltransferases and are crucial for enzyme specificity and activity; however, they do not share sequence homology among different families of glycosyltransferases, and thus it is difficult to infer the binding sites from primary sequence alignment. In this study, we determined the three-dimensional X-ray crystal structure of GtfC and demonstrated its glucosyltransferase activity. Biochemical studies also revealed that GtfC catalyzes the second step of Srr2 glycosylation by transferring glucosyl residues to GlcNAc-modified Srr2. Structural analysis revealed a flexible loop region required for substrate binding, and structure-based mutagenesis and functional studies determined that this loop region is required for binding to the acceptor substrate. Furthermore, this loop region is functionally conserved in this family of glycosyltransferases.