The AAV-2 genome is a single-stranded DNA molecule of 4.7 kb and contains two open reading frames (ORFs), rep and cap, flanked by two inverted terminal repeats (ITRs) (56). The rep and cap ORFs encode four overlapping regulatory proteins (Rep78, Rep68, Rep52, and Rep40) and three structural capsid proteins (VP1, VP2, and VP3), respectively (6). The two large Rep proteins, Rep78 and Rep68, which are expressed from the p5 promoter, are involved in every step of the viral life cycle, i.e., replication, site-specific integration, rescue, splicing, and regulation of viral-gene expression (6, 47, 48).
The biochemical properties of Rep78 and Rep68, such as DNA binding (11, 12, 22, 40) and ATPase (63), helicase (20, 23-25, 63), and endonuclease activities (23), are essential for AAV DNA replication. It was further demonstrated that Rep78 and Rep68 can specifically bind to the Rep binding site (RBS) (11, 12, 41, 50) and introduce a nick in a site- and strand-specific manner at the terminal resolution site (TRS) (7, 23, 55, 67). The TRS-RBS motifs, present in the ITR, serve as the minimal origin for Rep-mediated AAV DNA replication (54, 61). The function of the TRS-RBS, however, is not limited to AAV DNA replication; it is also one of the major components in Rep-mediated site-specific integration. Indeed, similar TRS-RBS motifs are also present within AAVS1 (62), where they represent the minimal requirements for targeted integration (39). Biochemical studies have demonstrated that Rep68 is able to simultaneously bind to both viral and cellular RBS motifs (62) and to introduce a nick at the AAVS1 TRS (34, 59). Subsequently, AAV DNA integration is speculated to occur through limited viral and cellular DNA synthesis following template strand switches, resulting in partial duplication of MBS85 sequences (19, 38, 39).
Several lines of evidence suggest that AAVS1 contains a transcriptionally active region. Kotin et al. originally reported the presence of several putative transcription factor binding sites upstream of the RBS and a CpG island, which is often the hallmark of a TATA-less promoter (28). Further studies by Lamartina et al. identified a DNase I-hypersensitive site within the same AAVS1 region, which displays transcriptional activities in an orientation-independent manner (35). It has therefore been suggested that an enhancer is present upstream of the RBS (35, 36). Studies by Tan et al. have shown that the minimal motifs necessary for AAV site-specific integration, the TRS and RBS, are located only a few nucleotides upstream of the translation initiation site of the myosin binding subunit 85 gene (MBS85), also termed PPP1R12C (for protein phosphatase 1 regulatory protein) (57). To date, there are only a few reports describing either MBS85 regulation or the function of the resulting protein. MBS85 is ubiquitously expressed in human and mouse tissues and appears to be highly expressed in the heart (16, 57). The protein is thought to be a component of the regulatory subunit of the myosin light chain phosphatase, which is involved in myosin phosphorylation, indicating that MBS85 might play a role in the regulation of assembly and disassembly of the actin cytoskeleton (57).
The fact that AAV has evolved to integrate site specifically into a ubiquitously transcribed region raises several questions with regard to the nonpathogenic character of AAV. How can the virus integrate, and thus disrupt a transcriptional unit, in the absence of any apparent deleterious effects on the cell? Using a mouse model for site-specific integration, Henckaerts et al. have recently demonstrated that integration into one allele of diploid embryonic stem (ES) cells does not interfere with either in vitro or in vivo differentiation of these cells, indicating a complex mechanism by which a functional copy of the MBS85 transcription unit is maintained (19). Through the analyses of multiple integrants, these studies also provided indirect evidence that the p5 and MBS85 promoters might interact to form an initial integration complex, introducing the possibility that shared promoter elements might be directly involved in the integration mechanism (19). A further question arises from the fact that, in order to maintain latency, AAV must put in place mechanisms that secure a level of regulation of the transcriptional activity surrounding the integrated viral genome.
In order to provide a framework for addressing these questions and to gain a better understanding of the mechanisms underlying AAV site-specific integration, we initiated the characterization of the MBS85 promoter and compared its transcriptional activities to those of the AAV p5 promoter. Our results clearly indicate that AAVS1 is defined by a complex transcriptional environment and that the MBS85 promoter shares key regulatory elements with the viral p5 promoter. Furthermore, we provide evidence for bidirectional MBS85 promoter activity and demonstrate that the minimal motifs required for AAV site-specific integration (TRS-RBS) are present in the 5′ untranslated region (UTR) of the gene and play a posttranscriptional role in the regulation of MBS85 expression.
MATERIALS AND METHODS
Cell lines.
293T (HEK 293T), RD, A673, and MCF7 cells were grown in Dulbecco's modified Eagle's medium (Mediatech Inc., Manassas, VA), 10% fetal bovine serum (Gemini Bio-Products, West Sacramento, CA). HeLa cells were grown in minimal essential medium (Mediatech Inc.), 10% fetal bovine serum, 0.1 mM nonessential amino acids, and 1 mM sodium pyruvate (Invitrogen, Carlsbad, CA). The human ES cell line H1 was grown on DR4 mouse embryonic feeder cells in Dulbecco's modified Eagle's medium-F12 supplemented with 20% (vol/vol) Knockout Serum Replacement (both from Invitrogen), basic fibroblast growth factor (20 ng/ml; R&D Systems), 50 U/ml penicillin, 50 μg/ml streptomycin, 2 mM L-glutamine, 0.1 mM nonessential amino acids (all from Invitrogen), and 0.1 mM β-mercaptoethanol (Sigma-Aldrich, St. Louis, MO). Prior to RNA isolation, H1 cells were grown under feeder-free conditions in Matrigel-coated plates (Biocoat; Becton Dickinson, San Jose, CA).
Northern blot analysis.
Total RNA was extracted using the RNeasy kit (Qiagen, Valencia, CA). Ten micrograms of RNA was separated on a 1% formaldehyde-agarose gel and transferred onto a Magna nylon membrane (Osmonics, Minnetonka, MN). All membranes were hybridized with [α-32P]dCTP-labeled probes (Prime-It RmT Random Primer Labeling Kit; Stratagene, Cedar Creek, TX). Northern blots were first hybridized to red fluorescent protein (RFP) or MBS85 (exons 18 to 22) probes; stripped by being boiled for 15 min in 0.05× SSC (1× SSC is 0.15 M NaCl plus 0.015 M sodium citrate), 10 mM EDTA (pH 8), 0.1% sodium dodecyl sulfate (SDS); and rehybridized to a β-actin cDNA probe. The RFP, MBS85, and β-actin probes were generated by PCR performed on plasmids pRFP, pND83, and pβ-actin, respectively (see Tables S1 and S2 at http://www.kcl.ac.uk/linden).
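The salt concentrations in the stripping buffer follow directly from the 1× SSC definition given in the text. As a minimal sketch of that arithmetic (the function name `ssc_molarity` is hypothetical and introduced only for illustration), the 0.05× SSC component concentrations can be checked as follows:

```python
# 1x SSC composition as defined in the text (mol/liter).
SSC_1X = {"NaCl": 0.15, "sodium citrate": 0.015}


def ssc_molarity(fold):
    """Return component concentrations (mM) for a given fold strength of SSC."""
    # Multiply each 1x molarity by the fold strength and convert M to mM.
    return {name: round(molar * fold * 1000, 6) for name, molar in SSC_1X.items()}


# The stripping step uses 0.05x SSC.
stripping = ssc_molarity(0.05)
print(stripping)  # {'NaCl': 7.5, 'sodium citrate': 0.75}
```

Under this calculation, the stripping buffer contains 7.5 mM NaCl and 0.75 mM sodium citrate in addition to the EDTA and SDS listed above.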
5′ RACE.
Transcription start sites (TSS) were identified using the 5′ rapid amplification of cDNA ends (5′ RACE) GeneRacer Kit (Invitrogen) according to the manufacturer's instructions. Total RNA was extracted from 293T cells (MBS85 TSS) and from 293T cells transfected with construct pND29 (antisense TSS) (see Table S1 at http://www.kcl.ac.uk/linden). Briefly, DNase I-treated total RNA (5 μg) was dephosphorylated with calf intestinal phosphatase, and the cap structure was removed with tobacco acid pyrophosphatase. The decapped RNA was then ligated to the GeneRacer RNA primer. RNA was reverse transcribed for 1 h at 50°C using an oligo(dT) primer and SuperScriptIII reverse transcriptase (Invitrogen). The resulting cDNAs were amplified by PCR with the GC-rich PCR kit (Roche Applied Science, Indianapolis, IN) and GeneRacer 5′ and ND189 primers (MBS85 TSS) or GeneRacer 5′ and ND192 primers (antisense TSS) (see Table S2 at http://www.kcl.ac.uk/linden). The PCR conditions were as follows: 95°C for 3 min; 10 cycles at 95°C for 30 s, 58°C for 30 s, and 72°C for 1 min 20 s (or 1 min for the antisense TSS); 25 cycles at 95°C for 30 s, 58°C for 30 s, and 72°C for 1 min 20 s with an elongation time of 5 s per cycle; and a final extension at 72°C for 7 min. The PCR products were cloned into the PCR2.1 Topo vector (Invitrogen), and the resulting clones were sequenced using a 3730xl DNA Analyzer (Applied Biosystems, Foster City, CA).
Plasmids.
Plasmids pDsRed2.1, pDsRed2-N1, and pIRES2-EGFP were purchased from Clontech (Mountain View, CA). EST R35625 (catalog number 363135) was purchased from the ATCC (Manassas, VA). Plasmid pDsRed2.1 was used to clone the MBS85 and p5 promoter regions. The p5 promoter used in this study is the region described by Chang et al. (reference 10 and Table S2 at http://www.kcl.ac.uk/linden). All constructs have been sequenced.
Transient transfections.
Transfection experiments were performed in 60-mm plates. At 50% confluence, 293T cells were transfected with 6 μg of reporter construct using Lipofectamine Plus reagent (Invitrogen). HeLa cells were transfected at 80% confluence with 6 μg of reporter construct using Fugene 6 reagent (Roche Applied Science). 293T and HeLa cells were harvested 48 h and 72 h posttransfection, respectively, and assayed for plasmid DNA uptake, along with Northern blot, Western blot, and fluorescence-activated cell sorter (FACS) analyses. Transfection efficiencies were normalized by plasmid DNA uptake, as previously described (31). All transfection experiments were repeated four times using plasmids that were independently prepared at least twice.
Plasmid DNA uptake.
Transfected cells were lysed in 0.2 M NaOH, 10 mM EDTA. Samples were heated at 90°C for 15 min and loaded onto a Hybond XL nylon membrane (Amersham Biosciences, Piscataway, NJ) using a slot blot manifold (Bio-Rad). The membranes were hybridized to an RFP or green fluorescent protein (GFP) probe, generated by PCR, to determine the amount of reporter plasmid taken up by the cells.
Western blot analysis.
Cells were solubilized in RIPA buffer (50 mM Tris-HCl [pH 8], 150 mM NaCl, 0.1% SDS, 1% Nonidet P-40, 0.5% sodium deoxycholate, 1× Complete protease inhibitor cocktail) (Roche Applied Science). Samples (10 μg) were loaded on a 15% SDS-polyacrylamide gel and transferred onto a Hybond C extra nitrocellulose membrane (Amersham Biosciences, Piscataway, NJ). The membranes were blocked in 5% fat-free milk and incubated with anti-DsRed polyclonal antibody at a dilution of 1:16,000 (catalog number 632397; Clontech) or anti-actin monoclonal antibody at a dilution of 1:10,000 (catalog number 612656; Becton Dickinson Biosciences). After being washed in Tris-buffered saline-Tween 20 buffer, the blots were incubated with horseradish peroxidase-conjugated secondary antibody at a dilution of 1:10,000 (Jackson ImmunoResearch Laboratories, West Grove, PA). RFPs were visualized by the enhanced-chemiluminescence method using Pico and Femto detection kits (Pierce, Rockford, IL) for 293T and HeLa cell extracts, respectively. Actin proteins were visualized with the Pico detection kit. Membranes were first incubated with anti-DsRed antibody, stripped with the Restore Western Blot Stripping Buffer (Pierce), and blotted with anti-actin antibody for normalization.
Protein quantification was performed using the Li-Cor Odyssey infrared imaging system (Li-Cor Biosciences UK Ltd., Cambridge, United Kingdom). Anti-mouse IRDye 680 (catalog number 926-32220; Li-Cor) or anti-rabbit IRDye 800 (catalog number 926-32211; Li-Cor) fluorescent secondary antibody was used at a dilution of 1:5,000 in 1% milk, 0.5% Tween 20.
Flow cytometry.
Cell suspensions from transfected 293T and HeLa cells were prepared and analyzed for RFP expression. The data were acquired using a FACScalibur (Becton Dickinson) and analyzed by Flowjo software (Tree Star, Inc., Ashland, OR). FACS data are presented as dot plots with linear axes for forward/side scatter and logarithmic axes for FL1 (empty; x axis) and FL2 (RFP; y axis). Gates were set to exclude dead cells based on forward/side scatter. This gated population was analyzed for RFP expression. The gate set to determine the percentage of RFP-expressing cells for each sample was obtained by the analysis of a negative population (cells transfected with a promoterless vector, pDsRed2.1) and was verified by a positive population (cells transfected with a cytomegalovirus [CMV]-controlled RFP vector, pDsRed2-N1). The geometric mean fluorescence intensity (Geo MFI) was calculated for this gated population of RFP-positive cells. FACS data collected from four experiments were used to generate graphs (see Fig. 3, 4, and 6).
Sorting experiments were performed on a Moflo cell sorter (Cytomation, Ft. Collins, CO).
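The two quantities reported from the flow cytometry data, the percentage of RFP-positive cells and the Geo MFI of that gated population, can be illustrated with a short calculation. The following is a minimal sketch only, not the Flowjo procedure used in the study: the function name `rfp_gate_stats`, the single FL2 threshold standing in for the gate, and the intensity values are all hypothetical.

```python
import math


def rfp_gate_stats(fl2_values, threshold):
    """Return (percent RFP-positive, geometric MFI of the RFP-positive cells).

    fl2_values: FL2 intensities of live-gated cells (arbitrary units).
    threshold:  hypothetical gate boundary; in practice the gate would be
                derived from the promoterless negative control.
    """
    if not fl2_values:
        raise ValueError("no events in the live-cell gate")
    positive = [v for v in fl2_values if v > threshold]
    percent = 100.0 * len(positive) / len(fl2_values)
    if not positive:
        return percent, float("nan")
    # Geometric mean = exp(mean of the log intensities).
    geo_mfi = math.exp(sum(math.log(v) for v in positive) / len(positive))
    return percent, geo_mfi


# Hypothetical FL2 intensities for five live-gated cells.
cells = [2.0, 5.0, 10.0, 100.0, 1000.0]
percent, geo_mfi = rfp_gate_stats(cells, threshold=8.0)
print(percent, round(geo_mfi, 3))
```

The geometric mean is the natural summary for intensities displayed on a logarithmic axis, because it is less dominated by a few very bright cells than the arithmetic mean.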
RT-PCR.
DNase I-treated RNA was subjected to reverse transcription-PCR (RT-PCR) using SuperScriptIII reverse transcriptase (Invitrogen) and the KCL1 primer. The resulting cDNAs were amplified by PCR with GoTaq polymerase (Promega, Madison, WI) and the KCL2 and KCL4 primers. The PCR conditions were 94°C for 2 min; 35 cycles at 94°C for 30 s, 62°C for 30 s, and 72°C for 40 s; and a final extension at 72°C for 10 min. The PCR products were cloned into the PCR2.1 Topo vector (Invitrogen), and the resulting clones were sequenced.
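As a quick check on the RT-PCR cycling program described above, the total programmed hold time can be computed from the stated steps. This is a minimal sketch; the helper name `program_seconds` is hypothetical, and ramp times between temperatures are ignored.

```python
def program_seconds(initial, cycles, steps, final):
    """Total programmed hold time in seconds, ignoring ramp times."""
    return initial + cycles * sum(steps) + final


# 94°C for 2 min; 35 cycles of 30 s, 30 s, and 40 s; final extension of 10 min.
rt_pcr = program_seconds(initial=2 * 60, cycles=35, steps=(30, 30, 40), final=10 * 60)
print(divmod(rt_pcr, 60))  # (70, 20), i.e., 70 min 20 s
```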
Epifluorescence microscopy.
Transfected 293T cells were visualized for RFP and GFP expression 48 h posttransfection. Epifluorescence microscopy was carried out by using an inverted epifluorescence microscope (Leica; DM IRB) and a ×20 magnification lens. Images were acquired with a Mintron digital camera. Exposure times and camera gain values were kept strictly identical in all pictures. All images were assembled and processed identically (adjustment level correction and contrast enhancement were identical for each image file) with Adobe Photoshop 7.0 (Adobe Systems Inc., Mountain View, CA).
Nucleotide sequence accession numbers.
Nucleotide sequence accession numbers were as follows: Homo sapiens MBS85 expressed sequence tag (EST), R35625; IMAGE clone identifier, 38310; H. sapiens MBS85 mRNA, AF312028; H. sapiens β-actin, NM_001101; AAV2 complete genome, AF043303; and upstream sequences from the MBS85 ATG start codon for H. sapiens (human) chromosome 19, NT_011109.15 from nucleotide (nt) 27897099 to nt 27898224, for Pan troglodytes (chimpanzee) chromosome 19, NW_001228247.1 from nt 1307249 to nt 1307515 and from nt 1305832 to nt 1306173 (there is a gap between the two contigs), for Bos taurus (cattle) chromosome 18, NW_001493632.2 from nt 1526744 to nt 1527802, for Equus caballus (horse) chromosome 10, NW_001867363.1 from nt 24348902 to nt 24349949, for Canis lupus familiaris (dog) chromosome 1, NW_876270.1 from nt 32959175 to nt 32960143, for Mus musculus (mouse) chromosome 7, NW_001030825.1 from nt 1246105 to nt 1247455, and for Rattus norvegicus (rat) chromosome 1, NW_047555.2 from nt 13365737 to nt 13367100.
DISCUSSION
As AAV has evolved to integrate site-specifically into a ubiquitously transcribed region (57), the question arises whether integration and the maintenance of latency are associated with MBS85 gene expression, and ultimately whether the p5 promoter of AAV has coevolved with the MBS85 regulatory elements. In order to gain a better understanding of this potential relationship, we initiated a characterization of the transcriptional activities of MBS85.
Interestingly, the RBS sequence and its position relative to the MBS85 translation initiation site are conserved among the different species, suggesting that this motif, which is essential for the viral life cycle, might also play a critical role in MBS85 regulation. It is further noteworthy that the consensus binding site for the ubiquitously expressed ZF5 transcription factor (44) overlaps with the RBS (8, 9, 12) and is one of the most frequently occurring motifs within core promoter regions (4). However, the function and biological significance of ZF5 sites have not yet been elucidated. The identification of RBS motifs within 5′ UTRs of a number of cellular genes strengthened the hypothesis of a functional role for RBS-like sequences (5, 14, 64, 65). Lackner and Muzyczka have provided evidence that the p5 RBS acts as a repressor in the presence of adenovirus and Rep proteins (33). However, currently there is no insight into the potential role of RBS sequences in cellular transcription.
In contrast to the RBS, the accompanying TRS motif shows little sequence homology among the different species. However, even though the primate and mouse MBS85 RBS-TRS sequences are somewhat different, we have previously shown by in vitro endonuclease assays that Rep68 can introduce a specific nick at the mouse TRS (16). More importantly, we have further demonstrated that Rep can mediate site-specific integration into the Mbs85 locus in mouse ES cells (19). In-depth analyses using ES cell differentiation assays, as well as the generation of chimeric mice with an Mbs85-targeted transgene, indicated no detectable effects on transcription or, in fact, the ES cell potential in any of the assays employed.
Characterization of the MBS85 regulatory region in this report revealed a single TSS located 94 nt upstream from the ATG translation initiation site, indicating that the minimal signals required for AAV site-specific integration, the TRS-RBS motifs, are located within the 5′ UTR of the gene. We further demonstrated that the MBS85 promoter has bidirectional activity and provided evidence that the TRS-RBS sequence contains an inhibitory signal that affects gene expression at a posttranscriptional level. Using bidirectional promoter constructs, we also showed that the TRS-RBS region inhibits gene expression of only the sense reporter gene without affecting the expression level of the antisense gene. Although the mechanism of inhibition remains unclear, it is possible that the TRS-RBS region might form a stable RNA secondary structure around the translation initiation site, thus potentially affecting ribosome accessibility. An alternative scenario is that trans-acting RBS binding factors might be involved in the inhibition of MBS85 expression. We are currently investigating the mechanisms underlying this regulation.
In order to study the molecular mechanisms responsible for the transcriptional regulation of MBS85, we cloned the promoter region of the human MBS85 gene. We demonstrated that the region from nt −137 to +94 relative to the TSS is sufficient for basal MBS85 expression. Comparison of proximal promoter regions from MBS85 orthologs indicated that several transcription factor binding sites, such as Sp1, CRE-ATF, E4F, Staf, YY1, and ZF5, are conserved among the different species, suggesting a possible role in the regulation of MBS85 gene expression.
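The size of the minimal promoter fragment follows from its coordinates. As a minimal sketch, assuming the common convention that the TSS is numbered +1 and that there is no position 0 (an assumption; the numbering convention is not stated explicitly here), the length of the −137 to +94 region can be computed as follows. The function name `region_length` is hypothetical.

```python
def region_length(start, end):
    """Length (nt) of a region given in TSS-relative coordinates.

    Assumes the TSS is +1 and that position 0 does not exist.
    """
    if start == 0 or end == 0 or start > end:
        raise ValueError("coordinates must be nonzero and ordered")
    length = end - start + 1
    if start < 0 < end:
        length -= 1  # correct for the nonexistent position 0
    return length


# Region reported as sufficient for basal MBS85 expression.
print(region_length(-137, 94))  # 231
```

Under the same assumption, a TSS located 94 nt upstream from the ATG places position +94 at the last nucleotide of the 5′ UTR, so this fragment would span 137 nt of upstream sequence plus the entire 5′ UTR.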
In order to determine whether common regulatory mechanisms might be involved in viral and target gene regulation, we compared the transcriptional activities of the human MBS85 promoter to those of AAV p5. Interestingly, of the three cis-acting regulatory elements that are involved in the regulation of p5 promoter activity, two (the RBS and YY1 sites) are also present within the minimal MBS85 promoter. In the absence of helper functions, the YY1 and Rep78/68 proteins repress the p5 promoter activity by direct binding to their respective recognition sites (21, 31, 32, 45, 53). It has further been demonstrated that ZF5 can repress p5 promoter activity as well. However, this repression does not require the p5 RBS but instead requires the RBS motif present within the viral ITR (9).
Our study further indicates that both MBS85 and p5 promoter activities are stronger in 293T cells than in HeLa cells. It can be speculated that the increased gene expression of MBS85 is attributable to the presence of the adenovirus E1A protein in 293T cells, although we have not yet found evidence in support of this hypothesis.
Previously, an enhancer-like activity was reported to be present in the region upstream from the minimal signals required for AAV site-specific integration (35). Our data show that this element is located within the minimal MBS85 promoter region. Emerging evidence indicates that as many as 20% of all human promoters have bidirectional activities (1, 43); it is therefore possible that the activities reported here can be attributed to promoter rather than to enhancer functions. Many bidirectional promoters contain shared transcription factor binding sites, and many genes regulated by bidirectional promoters appear to be coexpressed (58). It has therefore been suggested that a bidirectional arrangement provides a unique mechanism to regulate the expression of two divergently transcribed genes (58). Interestingly, YY1 binding sites are significantly overrepresented, and CpG islands often encompass the TSS within bidirectional promoters (1, 37), as we also observed for the MBS85 promoter.
To date, there is no evidence for any biological function of the antisense transcripts in vivo. We found that a possible transcript from this promoter could be the NUTF2 pseudogene, which is located between the TNNT1 and MBS85 genes. The NUTF2 protein facilitates the import of proteins into the nucleus through the nuclear pore complex and, in particular, mediates the nuclear import of RanGDP (49). However, overexpression of antisense ψNUTF2 transcripts did not significantly decrease NUTF2 protein synthesis in our hands, suggesting that this particular NUTF2 pseudogene has no biological relevance for NUTF2 expression. Several reports are converging toward the idea that short, unstable noncoding transcripts are frequently initiated in the opposite direction from protein-encoding genes, in close proximity to the TSS. It still needs to be determined whether these short transcripts, which are widespread throughout the human and yeast genomes, have any biological functions (13, 18, 26, 46, 52, 66) and whether the activity observed with the MBS85 promoter could fall into this category.
The ability of AAV to site-specifically integrate its genome into the MBS85 gene ultimately raises the question of the uniqueness of this locus or, alternatively, the contribution of the locus to integration and/or the maintenance of viral latency. Despite the fact that Rep78/68 can interact with a large number of RBSs scattered throughout the human genome (64, 68), MBS85 has so far remained the only locus satisfying the requirements for AAV integration (42, 62). One component, which is also shared by other integrating viruses, is the open chromatin structure (35), which is consistent with the ubiquitous expression of MBS85. A further intriguing aspect is that AAV has evolved to share what we have demonstrated to be a regulatory element, the RBS, with its target locus, highlighting the possibility of coregulation of the viral and cellular promoters. This particular aspect raises the possibility that through this and possibly more shared regulatory elements, the viral Rep protein might be able to influence the transcriptional activity of the target locus, which in turn could aid in the maintenance of viral latency in the absence of helper virus infection and rescue under permissive conditions. In addition, our finding that AAV integrates in an oriented manner, with the p5 promoter consistently forming the initial junction (19), invites the hypothesis that shared promoter binding factors (such as Rep and YY1) could be involved in the assembly of the integration complex, thus directly linking transcription, Rep-mediated replication, and subsequent integration into one mechanistic framework.