Temperate coliphage P2 has three nonessential lysogenic conversion genes,
Z/fun,
tin, and
old, conferring resistance to infections by phage T5, T-even phages, and lambda phages, respectively (
1,
8,
11,
12,
15,
21). These genes have been shown to have a higher AT content than the rest of the genome and a codon usage that differs from that of the host, which suggests that they are horizontally transferred genes (
3). P2-like prophages are common in
Escherichia coli; about 30% of the strains in the ECOR collection (
20) contain P2-like prophages. At the region equivalent to the P2
Z/fun gene, i.e., the Z region, which is located between the well-conserved tail genes
G and
FI, these P2-like prophages have been shown to contain different gene cassettes surrounded by a highly similar inverted repeat, indicating a site-specific integration event. Similar inverted repeats, spacing other genes, can be found in genetically unstable regions in pathogenic enterobacteria (
19). The
tin and
old genes are located at the right end of the P2 genetic map close to the
cos site. The
old gene encodes an exonuclease that blocks multiplication of lambdoid phages (
16), and
tin encodes a protein that inhibits T4 DNA synthesis by poisoning the T4 single-stranded binding protein (
15). These lysogenic conversion genes have an unusual AT content and codon usage and are believed to have been acquired from foreign genomes. This is supported by the fact that genes for hypothetical proteins similar to Old are found in various bacterial genomes in the UniProt GenBank (Table
1), whereas the Tin protein is so far unique. Furthermore, P2-related phages found in other enterobacteria contain other genes at this locus (
4-
6,
14,
17,
18). To clarify the nature of this locus in the P2-like coliphages, we have sequenced and characterized the region equivalent to the P2
tin and
old genes, i.e., the TO region, in seven of the P2-like prophages in the ECOR library.
Sequence variation between the A genes and the cos sites of the prophages.
To sequence the region located between the
A genes and the
cos sites of the prophages, DNA was extracted from seven strains of the ECOR collection that are known to contain P2-like prophages (designated P2-ECnb, where nb is the ECOR strain number). A primer located at the catalytic site of the
A gene and a primer located to the left of the
cos sequence were used for the PCR amplifications, and specific amplified DNA fragments of variable length were obtained. The fragments were either sequenced directly or first cloned and then sequenced. To obtain the complete region, plasmid primers and internal primers were used. All strains contained different DNA sequences in this region except P2-EC46 and P2-EC48, which were homologous to each other. The point of sequence divergence varies at the left end but is specific at the right end close to the
cos site (Fig.
1). In most cases the point of divergence at the left end is within the
A gene, generating a different C terminus of the A protein. The A proteins of P2-EC5, P2-EC58, and P2-EC67 are also slightly truncated, having five, three, and two fewer amino acids than P2, respectively.
Possible gene content in the variable region.
The inserted sequences vary extensively in length (Fig.
2). P2 has the longest insert, about 3 kb, and P2-EC31 the shortest, about 1 kb. All inserts except that of P2-EC31 have a high AT content (62 to 67%), and a search for open reading frames (ORFs), using
http://www.ebi.ac.uk/emboss/transeq with the bacterial translation table, identified at least one ORF per insert when ORFs encoding proteins shorter than 60 amino acids (aa) were disregarded (Fig.
2). P2-EC31, which has an AT content of 51%, contains two open reading frames whose products show homology to proteins of other P2-like coliphages and thus does not seem to have any horizontally transferred gene at this locus. Database searches, using the PSI-BLAST program at
http://www.ncbi.nlm.nih.gov/BLAST and FASTAUniProt at
http://www.ebi.ac.uk/fasta33, for proteins similar to those encoded by the open reading frames in the other inserts showed that related putative proteins can be found in a variety of bacteria (Table
1). Interestingly, two open reading frames, orf570 of P2-EC30 and orf544 of P2-EC58, showed homology to genes encoding prokaryotic reverse transcriptases (RTs). The two genes show 69% identity with each other (with 100% identity in RT conserved regions), indicating either that they were inserted a very long time ago, leading to sequence divergence, or that they represent two independent integration events. The product of the other encoded open reading frame in P2-EC58 shows no significant identity to any protein in the UniProt GenBank, and a search for similar DNA sequences showed similarities to phages WΦ, PSP3, and 186, indicating that this region is of phage origin. This favors two independent integration events, but a later deletion in P2-EC30 cannot be excluded. Strain ECOR58 has previously been shown to produce multicopy single-stranded DNA (msDNA), and the reverse transcriptase promoting the synthesis of this branched DNA-RNA complex has been identified (
13). However, this RT is only distantly related to the RT integrated into the P2-like prophage of strain ECOR58. Also, the unique features of the msDNAs, described by Inouye and Inouye (
9), have not been found in P2-EC30 and P2-EC58.
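The AT content calculation and the ORF screen described above can be sketched in a few lines of Python. This is a minimal illustration and not the EMBOSS transeq implementation: the function names are hypothetical, only ATG is treated as a start codon (bacterial translation tables also allow alternative starts such as GTG and TTG), and an ORF is assumed to be a stop-free stretch beginning at the first methionine in any of the six reading frames.

```python
from typing import List, Tuple

# Codon table in the canonical TCAG ordering. The amino acid assignments are
# those of the standard code, which the bacterial table shares (the two
# differ only in the permitted start codons).
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def at_content(seq: str) -> float:
    """Percentage of A and T residues in a DNA sequence."""
    seq = seq.upper()
    if not seq:
        return 0.0
    return 100.0 * (seq.count("A") + seq.count("T")) / len(seq)


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons become 'X', stops '*'."""
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def find_orfs(seq: str, min_aa: int = 60) -> List[Tuple[str, int, str]]:
    """Return (strand, frame, protein) for every stop-free stretch that
    starts at its first ATG and encodes at least min_aa amino acids.
    The last stretch of a frame may lack a terminating stop codon."""
    hits = []
    for strand, s in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for frame in range(3):
            for segment in translate(s[frame:]).split("*"):
                start = segment.find("M")
                if start != -1 and len(segment) - start >= min_aa:
                    hits.append((strand, frame, segment[start:]))
    return hits
```

Calling `at_content(insert)` and `find_orfs(insert, min_aa=60)` on an insert sequence would reproduce the two screening criteria used here, subject to the assumptions listed above.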
orf570 encodes a functional reverse transcriptase.
To determine whether the putative reverse transcriptase of P2-EC30 indeed displays RT activity, orf570 was cloned into plasmid pCR2.1-TOPO (Invitrogen) under the control of the T7 promoter, yielding pOTEC30. orf570 was then expressed in vitro using the
E. coli T7 S30 extract system for circular DNA (Promega) and tested in a Quan-T-RT system (GE Healthcare). The reaction mixture was added to the RT assay mixture containing [3H]TTP and RT DNA-RNA substrate coupled to scintillant. The homogeneous Quan-T-RT assay makes use of the scintillation proximity assay principle. Only [3H]TTP nucleotides incorporated by a reverse transcriptase into a biotin-DNA-RNA primer-template linked to streptavidin fluomicrospheres (beads containing scintillant) are close enough to the scintillant to produce light. Unincorporated, tritiated nucleotides, free in solution, are unable to stimulate the scintillant and therefore produce no signal. As can be seen in Fig.
3, the in vitro-produced Orf570 clearly exhibits reverse transcriptase activity, comparable to the activity of 75 units of avian myeloblastosis virus reverse transcriptase in an
E. coli S30 extract environment. However, its role in P2 phage biology remained unclear. Interestingly, Orf570 of P2-EC30 showed homology to AbiK of
Lactococcus lactis (E score, 6.9E−3), which is a putative reverse transcriptase with demonstrated antiphage activity (
2,
7).
Orf570 excludes phage T5.
To elucidate the possible antiphage activity of Orf570, the capacities of different phages, i.e., lambda, T2, T4, T5, and T6, all of which are known to utilize the same hosts as P2, to form plaques on lawns of
E. coli strain BL21(DE3) (
22) transformed with plasmid pOTEC30 expressing Orf570 were investigated. All phages except T5 formed plaques with equal efficiency on
E. coli in the presence or absence of pOTEC30. The plating efficiency of T5 on cells expressing Orf570 was reduced more than 10^7-fold compared to that on cells lacking plasmid pOTEC30. Furthermore, no exclusion of phage T5 was seen when expression of Orf570 was down-regulated by transforming BL21(DE3) harboring pOTEC30 with plasmid pLysS (
22), which reduces the amount of active T7 RNA polymerase in the cell; under these conditions the efficiency of plating was 1.1 relative to that on cells lacking pOTEC30. To further establish Orf570 as the T5 exclusion agent, an internal deletion of orf570, including the RT conserved YRDD box, was constructed. When the orf570 mutant plasmid, pΔOTEC30, was transformed into BL21(DE3) and the cells were exposed to T5, the efficiency of plating was 1.6 relative to that on cells lacking pOTEC30. The fact that no exclusion of T5 was observed with pΔOTEC30 strongly suggests that Orf570 is responsible for the T5 exclusion activity, possibly through its reverse transcriptase activity.
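The efficiency of plating (EOP) values quoted above are ratios of phage titers on a test strain and on a control strain. The sketch below shows the arithmetic under the usual assumption that a titer is calculated as plaques divided by the product of dilution factor and plated volume. The function names are hypothetical and no plaque counts from this study are implied.

```python
import math


def titer_pfu_per_ml(plaques: int, dilution: float, volume_ml: float) -> float:
    """Phage titer in PFU/ml from a plaque count.

    dilution is the fraction of the stock that was plated (e.g. 1e-6),
    volume_ml is the volume of diluted phage added to the lawn.
    """
    if dilution <= 0 or volume_ml <= 0:
        raise ValueError("dilution and volume must be positive")
    return plaques / (dilution * volume_ml)


def efficiency_of_plating(test_titer: float, control_titer: float) -> float:
    """EOP: titer on the test strain relative to the control strain."""
    if control_titer <= 0:
        raise ValueError("control titer must be positive")
    return test_titer / control_titer


def fold_reduction(eop: float) -> float:
    """Fold reduction in plating; infinite when no plaques are seen."""
    return math.inf if eop == 0 else 1.0 / eop
```

With this convention, an EOP close to 1 (such as the 1.1 and 1.6 reported for the pLysS and pΔOTEC30 controls) indicates no exclusion, whereas an EOP below 10^-7 corresponds to a reduction of more than 10^7-fold.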
Although the results are based on a small sample, it seems likely that the region downstream of
A is a very old location for nonessential and presumably horizontally transferred genes, since it is the only defined region of this kind that can be found at the same place in all P2-like phages irrespective of origin. The age of the region may explain why we have not been able to find any signature in the surrounding sequences that would imply an insertion mechanism. Most of these genes have not been characterized, but
Haemophilus influenzae phages HP1 and HP2 carry an adenine methylase gene and coliphage WΦ carries a cytosine methylase gene (
5,
17), and the nonessential genes
tin and
old lie at this position in phage P2. There are also defective P2-like coliphages in
E. coli that carry a cassette containing a restriction endonuclease and a methyltransferase gene at this site (
10). Based on the function of the genes that have been characterized and the similarity of genes in this region to genes encoding non-phage-associated hypothetical proteins in bacteria, it seems likely that the majority of the genes in this region are lysogenic conversion genes of key importance for bacteria. This is also supported by the phage T5 exclusion activity of orf570 from P2-EC30. To our knowledge, this is the first report showing phage exclusion activity by a protein with demonstrated reverse transcriptase activity, although such a link was previously suggested by Fortier et al. (
7), who showed that the presence of an RT motif in AbiK was critical for its antiphage activity. AbiK, which shares 16% identity with Orf570, is an abortive infection system encoded by a lactococcal plasmid, and the molecular target of AbiK has been suggested to be phage DNA single-strand annealing proteins involved in recombination activities (
2). Possibly, Orf570 utilizes a similar phage exclusion mechanism. Thus, the next challenge would be to pinpoint by what means Orf570 inhibits T5 growth.
Considering the genetic variation reported here, the variation at the Z region reported in a previous paper (
19), and the variation of other characterized genes in P2-like coliphages, the role of P2-like coliphages in the evolution of the host seems to be to supply lysogenic conversion genes that exclude other phages. This increases the fitness of a P2-like prophage and allows it to persist in the lysogenic state. Other genes, such as genes associated with virulence, will also increase the fitness of the lysogen, but the lysogen will still be vulnerable to attacks from lytic phages that would reduce or eradicate not only the population of bacteria but also any lysogenic phage that may be integrated in the bacterial genome. Prophages that protect the host by encoding factors that exclude all foreign DNA, such as restriction/modification systems, are expected to be found in most environments, but it seems reasonable to assume that specific phage exclusion genes are correlated with the most abundant phages in a particular environment.