The nonpathogenic human parvovirus, adeno-associated virus type 2 (AAV), replicates only in cells coinfected with a helper virus (5, 6). In the absence of such helper conditions, AAV establishes latency, integrating its genome site-specifically into a region of human chromosome 19 called AAVS1 (12, 17-19, 27).
Both AAV DNA replication and site-specific integration require the large nonstructural Rep proteins (Rep68/78) and motifs within the inverted terminal repeats (4, 28, 32). Rep68/78 specifically bind a sequence motif within the inverted terminal repeats, the Rep binding site (RBS) (10, 11, 24, 26), and cleave in a site- and strand-specific manner at the terminal resolution site (TRS) located 13 nucleotides (nt) upstream from the RBS (8, 15, 31, 40). The TRS/RBS sequence can act as a minimal origin for Rep-mediated DNA replication (29, 35). An almost identical sequence in chromosome 19 represents the minimal sequence necessary and sufficient for AAV site-specific integration (22, 23). Biochemical assays have demonstrated that Rep can specifically interact with both viral and cellular RBS-TRS motifs (9, 11, 37); these motifs are thereby thought to target integration of AAV DNA into AAVS1 (4, 28, 32). It has recently become evident that the AAVS1 RBS is located 17 nt upstream from the translation initiation site of the protein phosphatase 1 regulatory inhibitor subunit 12C gene (PPP1R12C), also called MBS85 (myosin binding subunit 85) (33).
By using MBS85 exon sequences, we analyzed the National Center for Biotechnology Information mouse database for similarities to the human AAVS1 locus (1). This analysis revealed a homology of 90% between the 5′ end of the human MBS85 cDNA and the 969-nt mouse cDNA clone AK010836, which contains a sequence homologous to the human TRS-RBS motifs as well as the Mbs85 initiation codon (separated by 25 nt) (Fig. 1A). Interestingly, a simian AAVS1 locus containing the corresponding upstream region and a TRS-RBS motif has recently been isolated from the African green monkey genome (2).
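The spacing between sequence motifs, such as the 25 nt reported above between the TRS-RBS homology and the Mbs85 initiation codon, can be checked computationally. The following Python sketch is illustrative only: the function name `motif_spacing`, the motif patterns, and the toy sequence are assumptions made for this example, not the sequences or the procedure used in this study. It counts the nucleotides between the end of one motif and the start of the next, a convention that may differ from the one underlying the distances quoted in the text.

```python
import re

# Illustrative placeholder patterns (assumptions, not the motifs analyzed in this report).
TRS_PATTERN = "GGTTGG"          # hypothetical TRS-like hexamer
RBS_PATTERN = r"(?:GCTC){3}"    # hypothetical RBS-like run of three GCTC repeats
START_CODON = "ATG"

# Toy sequence built for the example: 2 nt, TRS, 8 nt spacer, RBS, 5 nt spacer, ATG, 2 nt.
TOY_SEQ = "TT" + "GGTTGG" + "CCCCCCCC" + "GCTCGCTCGCTC" + "TTTTT" + "ATG" + "CC"


def motif_spacing(seq, upstream_pattern, downstream_pattern):
    """Return the number of nucleotides between the end of the first
    upstream match and the start of the first downstream match that
    follows it, or None if either motif is not found."""
    seq = seq.upper()
    upstream = re.search(upstream_pattern, seq)
    if upstream is None:
        return None
    # Only look for the downstream motif after the upstream match ends.
    downstream = re.search(downstream_pattern, seq[upstream.end():])
    if downstream is None:
        return None
    return downstream.start()


if __name__ == "__main__":
    print("TRS to RBS spacing:", motif_spacing(TOY_SEQ, TRS_PATTERN, RBS_PATTERN))
    print("RBS to ATG spacing:", motif_spacing(TOY_SEQ, RBS_PATTERN, START_CODON))
```

Passing the patterns as arguments keeps the function independent of any particular motif definition, so it could be applied to a real genomic sequence once the motifs of interest are specified.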
AAVS1 is located 14.9 and 36 kb centromeric to the slow skeletal troponin T (TNNT1) and cardiac troponin I (TNNI3) genes, respectively (12). The mouse Tnni3 and Tnnt1 genes are located on chromosome 7 (13, 14), in a region previously shown to be syntenic to the human chromosome 19 region that contains AAVS1 (7). We used the Celera discovery system to search the Celera mouse genome assembly with the mouse Tnni3 and Tnnt1 genes, the AK010836 cDNA, and the human MBS85 genomic sequence. All of these sequences specifically matched the same scaffold (500 kb) in the Celera database. The mouse Mbs85 is located on chromosome 7 and is separated by only 2.5 and 16 kb from the Tnnt1 and Tnni3 genes, respectively (Fig. 1B). The Celera map revealed a gene 3.1 kb downstream of MBS85, designated DRC3, the mouse homolog of which is located 2.1 kb downstream of the Mbs85 gene (Fig. 1B).
We sequenced and assembled three mouse expressed sequence tag clones (AA021750, AW911639, and BE847281) containing Mbs85. The resulting 3.1-kb mouse cDNA was 77% identical to the human MBS85 cDNA. The mouse Mbs85 gene spans 20 kb of genomic sequence, and the 2.3-kb predicted open reading frame is composed of 22 coding exons (20). Thus, the mouse and the human homologs of MBS85 display the same overall genomic organization (Fig. 1B). The deduced mouse Mbs85 protein sequence is 781 amino acids in length and is 86% identical to its human counterpart (33).
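Percent identity values such as those quoted above depend on how identity is defined, in particular on how gapped alignment columns are treated. As a minimal sketch, assuming the two sequences have already been aligned to equal length, the following Python function computes identity over the columns in which neither sequence has a gap. The function name `percent_identity`, the gap character, and the toy sequences are assumptions for illustration; the text does not state which definition or software produced the reported values.

```python
def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity between two pre-aligned sequences of equal length.

    Columns in which either sequence carries a gap are excluded from
    both the numerator and the denominator (one of several common
    conventions)."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    matches = 0
    compared = 0
    for a, b in zip(aligned_a.upper(), aligned_b.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns
        compared += 1
        if a == b:
            matches += 1
    if compared == 0:
        return 0.0
    return 100.0 * matches / compared


if __name__ == "__main__":
    # Toy alignment: 5 ungapped columns, 4 of which match.
    print(percent_identity("ATGC-A", "ATGAAA"))
```

Counting gapped columns in the denominator instead would give a lower value for the same alignment, which is one reason identity figures from different tools are not always directly comparable.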
To assess the distribution of Mbs85 mRNAs, a mouse poly(A) multiple tissue Northern blot (Clontech, Palo Alto, Calif.) was hybridized to a mouse Mbs85 cDNA probe consisting of exons 5 to 22. As is observed in a human multiple tissue Northern blot (33), a single mRNA of approximately 3.1 kb is highly expressed in heart and testis, and to a lesser extent in kidney, brain, liver, and lung (Fig. 1C).
Current models of AAV integration predict that a Rep-mediated nick at the chromosomal TRS is a prerequisite for integration (22). To determine if Rep68 can specifically nick the putative mouse TRS, double-stranded and partially single-stranded 5′ end-labeled origin substrates were incubated with purified His-tagged Rep68 protein in a cell-free endonuclease assay as described previously (39). Rep68 nicked the AAV, human, and mouse TRS substrates, releasing the expected 14-nt labeled fragment (Fig. 2). Nicking is Rep68 dependent, since no cleavage of the AAV, human, or mouse origin substrates is observed when an endonuclease-negative mutant (Rep68Y156F) is used (30, 39). Substitution of the two thymidine residues within the mouse TRS sequence resulted in the expected loss of specific Rep-mediated cleavage (Fig. 2).
Origin interactions by Rep are thought to represent the initiating steps of integration (29, 34, 35). To test whether the mouse TRS-RBS sequence could also serve a similar function, cell-free DNA replication assays were performed as described previously (36). Linearized substrates containing the AAV, human, or putative mouse origin in a pBluescript backbone were incubated with HeLa cell extracts in the presence or the absence of purified His-tagged Rep68 protein (75 ng) and [α-32P]dCTP. Rep68 initiates replication on templates containing the AAV, human, or mouse origin but not on the vector DNA alone (Fig. 3). In all cases, replication is Rep dependent.
We further compared the human and mouse 5′ untranslated regions. It has been reported that the human AAVS1 fragment located 74 to 426 nt upstream of the translation initiation codon is sufficient to drive the expression of a reporter gene following transient transfections in both 293 and HeLa cells (21).
Alignment of the human and mouse sequences upstream of the ATG revealed an overall 62% identity in the putative promoter region (Fig. 4A). Several conserved putative cis-acting DNA elements (i.e., Sp1, CRE/ATF) suggest the presence of a TATA-less promoter and common regulatory mechanisms for the expression of the human and mouse MBS85 genes.
We identified mouse cell lines expressing Mbs85. Total RNAs were extracted from C2C12, NIH 3T3, and N2A cell lines (Tel-Test, Friendswood, Tex.). Northern blots hybridized to the mouse Mbs85 exon 5-22 cDNA probe revealed a unique 3.1-kb transcript in all three cell lines (Fig. 4B).
To test the 324-bp NaeI fragment containing the RBS and TRS motifs for transcriptional activity, we cloned it into the pDsRed2.1 promoterless red fluorescent protein vector (Clontech) in both the sense and antisense orientations. C2C12, N2A, and NIH 3T3 cells were transfected and fixed 45 h posttransfection with 3.7% paraformaldehyde, and the slides were mounted in Vectashield mounting medium with DAPI (4′,6-diamidino-2-phenylindole) (Vector Laboratories, Burlingame, Calif.). The sense, but not the antisense, construct shows transcriptional activity in all three cell lines (Fig. 4C). These results were confirmed by fluorescence-activated cell sorter analysis (data not shown).
In this report, we show that the target site for AAV site-specific integration is not restricted to primates but is also present in the mouse genome in a region that is syntenic to the human chromosome 19 region containing AAVS1 (Fig. 1).
Currently, Rep interactions with a minimal origin are defined by specific binding to the RBS followed by site- and strand-specific nicking at the TRS (8, 11, 15, 16). We demonstrate that the TRS and RBS motifs present in the 5′ untranslated region of the mouse Mbs85 gene can act as a substrate for Rep-mediated nicking and as a functional Rep-dependent origin (Fig. 2 and 3). Figure 2 also shows a second nick on the mouse substrate, which has been seen previously (34, 38) and whose significance is at present unclear.
We further demonstrate that a region containing the TRS-RBS motif upstream of the mouse Mbs85 ATG contains regulatory elements sufficient to drive the expression of a reporter gene in vitro (Fig. 4). The fact that the TRS-RBS motif and its position relative to the ATG of the MBS85 gene are conserved between human and mouse suggests that a cellular protein might interact with these sequences to potentially regulate MBS85 transcription or translation. It is likely that integration of AAV DNA into MBS85 affects MBS85 expression and thereby the function of the gene product.
Site-specific integration by AAV might make possible targeted therapeutic gene transfer, thereby minimizing the risk of insertional mutagenesis. Knowledge of the effects of AAV-mediated site-specific integration is a prerequisite for achieving this goal. Therefore, there have been considerable efforts to develop rodent models containing the human AAVS1 region (3, 25, 41). Rep-dependent site-specific nicking and DNA replication at the mouse TRS-RBS motifs suggest the possibility of AAV Rep-targeted integration at the mouse locus. The advantage of a mouse model with an innate AAVS1 would be that the overall organization of the integration site, and thus the potential effects of AAV-mediated disruption, would be similar to what occurs at the human locus. However, attempts at isolating viral-cellular junctions from latently infected mouse cell lines have not yet been successful. Currently, we are engaged in a comprehensive study comparing the efficiency of AAV integration into the mouse genome at the innate AAVS1 with that described for human cells at the human AAVS1, in order to evaluate in a meaningful manner the similarities and differences between these two systems.
Sequences were retrieved from GenBank: Homo sapiens MBS85 cDNA, AF312028; MBS85 protein, AF312028_1; AAVS1, S51329; TNNT1 gene, AJ011712/AJ011713; TNNI3 gene, X90780; DRC3 gene, AF282168; Homo sapiens chromosome 19 clone CTD-2587H24, AC010327; Mus musculus Tnnt1 gene, U92882; Tnni3 gene, Z22784; 969-nt DNA clone, AK010836; Mus musculus BAC clone RP23-313M20, AC079521.
Acknowledgments
We thank Peter Warburton for his help and Patricia Wilson for helpful discussions and critical reading of the manuscript. We also thank Robert Krauss and Jong-Sun Kang for kindly providing the C2C12 cells, Philippe Marambaud for the N2A cells, and Patricia Wilson for the NIH 3T3 cells.
This work was supported by NIH grants GM62234 and DK50795 (to R.M.L.).