INTRODUCTION
Adeno-associated virus (AAV) is a nonenveloped, single-stranded DNA virus belonging to the
Dependoparvovirus genus within the
Parvoviridae family (
1). The AAV capsid consists of 60 capsid monomers of VP1, VP2, and VP3 at a ratio of 1:1:10 that packages a 4.7-kb single-stranded genome (
2,
3). The AAV genome encodes replication (
Rep), capsid (
Cap), and assembly-activating protein (
AAP) open reading frames (ORFs) flanked by inverted terminal repeats (ITRs), which are the sole requirements for genome packaging (
4,
5). As such, the majority of the genome can be replaced by exogenous DNA sequences and packaged inside the AAV capsid to create a recombinant vector for DNA delivery both
in vitro and
in vivo (
6). Unlike other viruses that are replication competent, AAVs are partially defective since they require a helper virus (such as adenovirus or herpes simplex virus) for replication (
7). The lack of pathogenicity and ease of genome manipulation have enabled extensive evaluation of recombinant AAV vectors as candidates for clinical gene therapy (
8).
AAV encodes a unique protein, AAP, which is not found in autonomous parvoviruses and is required for AAV capsid assembly. AAP is predicted to be a 20- to 24-kDa protein, but its observed size ranges from 27 to 34 kDa, a difference that may be due to posttranslational modifications (
5,
9). AAP is encoded from a +1 frame within the
Cap ORF overlapping the junction between VP2 and VP3 (
5). Introduction of a stop codon within AAP without affecting the coding frame of VP2/3 prevents capsid assembly and virus/vector production (
10). While capsid assembly can be restored by providing AAP
in trans, overexpression of wild-type (WT) AAP does not increase the vector yield (
10). This suggests that AAP is essential for capsid assembly but is not a limiting factor for vector production. Cellular localization of AAPs overlaps with the location of capsid assembly, supporting a direct role for AAP (
11). Most AAPs show strong nucleolar localization, while AAP5 and AAP9 are predominantly nuclear and excluded from the nucleolus (
9,
12). Although the exact mechanism by which AAP supports AAV assembly is still elusive, several studies have convincingly shown that AAP is important for intracellular capsid expression and localization.
For instance, the steady-state level of capsid is dramatically reduced in the absence of AAP (
13). Such regulation must occur at the translational or posttranslational level as
Cap mRNA expression remains the same regardless of the presence or absence of AAP (
10). In addition, the AAV2 VP proteins have been shown to change their cellular localization from cytoplasmic/nuclear to nucleolar in the presence of AAP2 (
12). A previous study has shown that the N-terminal region of AAP might interact with the C terminus of the VP, albeit weakly (
13). Interaction with AAP2 also appears to alter VP conformation, as supported by the lack of binding to several conformation-specific antibodies (
13). All of the above observations support the notion that AAP2 acts as a chaperone to stabilize and translocate VP to the site of assembly. However, it should be noted that significant differences in cellular localization and cross-complementation have been reported (
9).
AAPs encoded by different serotypes are closely related, with sequence identity ranging from 48% (AAV4) to 82% (AAV7) relative to AAP1 (
11). Phylogenetic analysis of AAP shows the same relationships as AAV VP, where the serotypes 4, 5, 11, and 12 are distinct from the others (
11). The phylogenetic distance is also reflected in the biology of AAP4, -5, -11, and -12, which are unable to complement capsid assembly of other AAV serotypes (
9). Furthermore, in the absence of AAP, AAV4, AAV5, and AAV11 are capable of producing 20 to 40% of AAV particles compared to the WT level; these have been termed “AAP independent” (
9). At the secondary structure level, using previously defined nomenclature, AAP can be separated into multiple functional regions, N to C terminus: the hydrophobic region (HR), the conserved core (CC), the proline-rich region (PRR), the threonine/serine-rich region (T/S), and the basic region (BR) (
13). In the present study, we undertake a systematic dissection of these different domains and generate novel chimeric and artificial AAPs that shed light on the structure-function correlates of this unique viral assembly protein.
DISCUSSION
Adeno-associated virus (AAV) encodes different proteins within its 4.7-kb genome using alternative splicing, alternative start codons, and overlapping reading frames (
23). AAP is expressed from an alternative reading frame overlapping the VP2/3 sequences (
5). We attempted to further understand our observations by calculating the relative conservation of AAP and VP overlapping regions in the
Cap gene (
24). Briefly, we plotted the ratio (AAP/VP) of the conservation scores such that a ratio of >1 denotes higher AAP conservation and a ratio of <1 implies higher VP conservation. Most regions show a preference for VP over AAP, with values of <1 (
Fig. 10A). For instance, the sequences forming the beta-strand regions that form the jelly roll structure of VP and the highly conserved loop I have AAP/VP conservation ratios ranging from 0.3 to 0.5 due to a higher level of VP sequence conservation. These regions correspond to the nonessential T/S and PRR linker domains of AAP, respectively. Thus, in these regions VP function is favored over that of AAP (
Fig. 10A). In contrast, the HR and CC regions show a conservation ratio of >1, indicating a preference for AAP function (
Fig. 10A). In corollary, this region corresponds to the VP2 N-terminal domain, which has been shown to be nonessential for AAV capsid infectivity and essentially serves as a linker between the unique VP1 N-terminal phospholipase A2 (PLA2) domain and VP3 major subunit. Further, homology modeling of different AAP modules shows that the HR is the only region that has a strong secondary structure requirement in the form of an alpha-helix. Although AAP4 BR is also modeled as an alpha-helix, no other AAP BR can be modeled. All other domains either form undefined loop structures or are unable to be modeled (
Fig. 10B).
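The AAP/VP conservation ratio described above can be illustrated with a minimal Python sketch. The per-position scores used here are hypothetical placeholders, not values from this study, and the function names are introduced only for illustration. Leaving positions with a VP score of zero undefined is an assumption of the sketch.

```python
from typing import List, Optional


def conservation_ratio(aap_scores: List[float],
                       vp_scores: List[float]) -> List[Optional[float]]:
    """Return the per-position AAP/VP conservation ratio.

    Both lists must come from alignments in which AAP and VP positions
    correspond one to one. A VP score of zero yields None (assumption).
    """
    if len(aap_scores) != len(vp_scores):
        raise ValueError("AAP and VP scores must have the same length")
    return [a / v if v > 0 else None for a, v in zip(aap_scores, vp_scores)]


def favored(ratio: Optional[float]) -> str:
    """Label which protein is more conserved at a position."""
    if ratio is None:
        return "undefined"
    if ratio > 1:
        return "AAP"
    if ratio < 1:
        return "VP"
    return "equal"


# Hypothetical conservation scores for three aligned positions.
aap = [0.8, 0.3, 0.5]
vp = [0.4, 0.75, 0.5]

ratios = conservation_ratio(aap, vp)
labels = [favored(r) for r in ratios]
print(list(zip(ratios, labels)))
```

A ratio of >1 is labeled as favoring AAP and a ratio of <1 as favoring VP, matching the convention used for the plot.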
The latter theoretical observations corroborate our functional characterization of AAP. Indeed, functional analysis showed that the T/S is dispensable and can be replaced by exogenous sequences. All of our T/S deletion or replacement constructs had higher steady-state levels than the WT AAP1. However, we cannot rule out that the T/S could have other potential, regulatory functions. Serine and threonine are the common amino acid substrates for phosphorylation; there are examples showing that multiple phosphorylation events could mark the protein for degradation via phosphodegron modules by the proteasome (
25,
26). As previously shown, degradation of AAP can be inhibited by addition of proteasome inhibitor, MG132 (
10). Therefore, the increase in the steady-state level of the T/S deletion constructs could be the result of decreased phosphorylation and, consequently, reduced proteasomal degradation. Another possible role for the T/S that we cannot rule out at this writing is transcriptional regulation. These attributes might have evolved to exploit the host cell machinery to regulate AAP levels and thus influence AAV infection. These latter aspects are the subject of ongoing investigations.
Similar to the T/S, the PRR does not appear to have any assigned function. Deletion of the PRR and T/S together impairs capsid assembly. However, deletion of the PRR in AAP1E retains 60% of capsid assembly compared to the wild type. Replacement with the PRR from AAP5 further rescues assembly to 80%. These data suggest that the PRR plays a relatively minor role in capsid assembly and serotype specificity but may act as a key structural linker that physically separates the critical HR and CC modules from the T/S. Unlike the PRR and T/S, multiple studies have shown that the BR contains an important NLS/NoLS signal that is responsible for AAP localization and subsequent translocation of the capsid to the assembly site (
5,
11,
12). We tested whether other AAP BRs can functionally replace the AAP1 BR for capsid assembly. Surprisingly, only the AAP4 BR is able to support AAV1 capsid assembly function. Since the whole AAP4 protein is unable to rescue AAV1 capsid assembly function (
9), our data indicate that serotype specificity is independent of the BR. We further corroborate that the BR is solely acting as a NoLS in the context of AAP1 by replacing the AAP1 BR with other heterologous NLS/NoLS. Among all of the NLS/NoLS tested, AP3D1, which is supposed to be an endosomal marker in its native form, shows the best nucleolar localization and completely supports capsid assembly. The localization pattern and percent rescue of capsid assembly are highly correlated, where increased nucleolar localization indicates greater restoration of capsid assembly for AAV1. Different BR sequences have been reported for different AAPs, supporting the notion that AAP exerts a functional, rather than structural, evolutionary constraint in this region compared to VP.
The HR and the CC are the functional domains for AAV capsid assembly and the determinants of serotype specificity. Deleting the HR or CC led to an inability to pull down VP or support capsid assembly. Replacement with the HR or CC from AAP5 rescued interaction with VP; however, only the 5CC replacement was able to restore capsid assembly (to 50%). Since only two residues differ between 1CC and 5CC (T44M and Q50R), it is not surprising that the 5CC replacement had relatively little effect on AAV1 capsid assembly compared to other N-terminal mutations. However, the ability of 5HR to pull down AAV1 VP was unexpected, since there are 10 amino acid residue differences between the two serotypes. Furthermore, binding of 5HR is not sufficient for function, since capsid assembly was defective. Our experiments replacing the T/S with three different oligomerization domains suggest that artificially promoting oligomerization in the linker region of AAP does not correlate with the capsid assembly function. However, we note that the HR has been predicted by homology modeling to form an alpha-helix (
Fig. 10B). Since amphipathic helices are often found in oligomerization domains, our data do not rule out the possibility that AAP forms an oligomer. Based on our findings, we speculate that AAP interaction with the capsid is bipartite, where CC binds to a VP structural domain that is conserved among various serotypes (e.g., beta-strand), and HR either binds another site or oligomerizes to facilitate formation of a VP oligomer. Accordingly, capsid assembly only occurs when the two domains act together in a manner similar to a “lock and key” mechanism, where CC is the backbone and HR is the gear of the key, which confers serotype specificity. Alternatively, since AAP interaction with the VP subunit is required to be transient, it is also possible that these domains play a role in releasing AAP from VP subunits during capsid assembly.
Using our new knowledge of AAP structure and function, we engineered additional properties onto AAP by replacing the nonessential T/S linker region. The fluorescently traceable AAP1E, which retains the same function as the wild-type counterpart, can potentially be utilized for real-time, live-cell imaging to study intracellular trafficking of AAP and capsid assembly events. Further, we engineered an AAP1-Collagen construct that shows improved stability and, by providing this eAAP in trans, we observed a 2-fold increase in vector yield compared to the wild-type counterpart. Such engineered, hyperstable AAPs could also be utilized to solve the structure of this intriguing protein as is or in complex with AAV capsid proteins. These latter observations suggest that with careful dissection of the mechanisms involving AAP biology, we can potentially develop strategies to improve the efficiency of rAAV packaging and rAAV vector yield. Several other questions remain to be addressed. For instance, how are the building blocks for AAV capsid assembly generated: as dimers, trimers, or pentamers? How is AAP involved in facilitating the oligomerization process? Another intriguing question is why do autonomous parvoviruses not appear to require an AAP-like chaperone for capsid assembly? Although these questions provide topics for ongoing and future investigations, the present study constitutes an important step in further understanding and controlling AAV capsid assembly.
MATERIALS AND METHODS
Cells, viruses, and antibodies.
HEK293 cells were maintained in Dulbecco modified Eagle medium (DMEM) supplemented with 10% fetal bovine serum (FBS) (Thermo Fisher, Waltham, MA), as well as 100 U/ml of penicillin and 10 μg/ml of streptomycin (P/S; Thermo Fisher), in 5% CO2 at 37°C. Hybridoma supernatants of anti-AAV monoclonal antibodies B1 and A20 were produced in-house and have been described earlier (
27). Mouse anti-rhodopsin (1D4; ab5417) and mouse anti-actin (ab3280) antibodies were purchased from Abcam (Cambridge, United Kingdom). Mouse anti-C23 antibody (D-6; sc-17826) was purchased from Santa Cruz Biotechnology (Santa Cruz, CA).
Homology modeling of AAP.
The amino acid sequences of AAP1 to AAP9 were used in structural prediction using SWISS-MODEL (
https://swissmodel.expasy.org/). The template used for modeling AAP is Arabidopsis G protein-coupled receptor 2, GCR2 (PDB
3T33), for the HR and CC regions. The sequence identity of the modeling region is 12.20%, and the global model quality estimation (GMQE) and QMEAN Z-score are 0.08 and −1.42, respectively. Different domains of AAP were also predicted individually; the AAP2 PRR and T/S are modeled with cellobiohydrolase (PDB
1Q9H) (
28), and the QMEAN Z-score is −2.08. The AAP4 BR is modeled with HIV-Rev (PDB
4PMI) (
29), and the QMEAN Z-score is −3.25. The HR domain can also be modeled with bacterial RNase ligase (PDB
4XRU) and
Escherichia coli topoisomerase (PDB
1YUA), with QMEAN Z-scores of −1.81 and −2.08, respectively (
30).
Bioinformatic analysis.
Sequences from AAP and VP (only the residues corresponding to those in AAP) were aligned using MUSCLE, followed by manual adjustment. Alignments were done such that residues and gaps directly correspond in AAP and VP. The percent identities and similarities of AAPs from other serotypes compared to AAP1 were calculated with the Sequence Manipulation Suite (
31) using the following groups for similarity: GAVLI, FYW, CM, ST, KRH, DENQ, and P. Amino acid conservation scores at each position of AAP and VP were calculated using the prediction tool developed by Capra and Singh (
32). Analysis was run using the property entropy scoring method, sequence weighting, the BLOSUM62 background and scoring matrix, and a window size of 0.
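The percent identity and similarity calculation can be sketched in Python as follows. This is an illustrative approximation, not the Sequence Manipulation Suite implementation. The similarity groups are those listed above, whereas the gap handling (only columns where both sequences carry a residue are counted) and the choice to count identical residues as similar are assumptions, and the two short sequences are hypothetical.

```python
from typing import Tuple

# Similarity groups as listed in the text.
SIMILARITY_GROUPS = ["GAVLI", "FYW", "CM", "ST", "KRH", "DENQ", "P"]
GROUP_OF = {aa: idx for idx, group in enumerate(SIMILARITY_GROUPS) for aa in group}


def identity_and_similarity(ref: str, query: str, gap: str = "-") -> Tuple[float, float]:
    """Return (percent identity, percent similarity) for two aligned sequences.

    Assumptions: columns containing a gap in either sequence are skipped,
    and identical residues also count toward similarity.
    """
    if len(ref) != len(query):
        raise ValueError("aligned sequences must have the same length")
    compared = identical = similar = 0
    for a, b in zip(ref.upper(), query.upper()):
        if a == gap or b == gap:
            continue  # skip gapped columns (assumption)
        compared += 1
        if a == b:
            identical += 1
        elif a in GROUP_OF and GROUP_OF[a] == GROUP_OF.get(b):
            similar += 1
    if compared == 0:
        raise ValueError("no residue pairs to compare")
    identity = 100.0 * identical / compared
    similarity = 100.0 * (identical + similar) / compared
    return identity, similarity


# Hypothetical aligned fragments, not real AAP sequences.
identity, similarity = identity_and_similarity("MATQSQ-P", "MSTQTRAP")
print(f"identity {identity:.2f}%, similarity {similarity:.2f}%")
```

In this toy alignment, seven columns are compared: four are identical and one more (S/T) falls within the same similarity group.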
Generation of different AAP constructs.
All AAP constructs were cloned into pCDNA3.1 using the EcoRI and NotI sites. Chimera constructs were cloned using either overlapping PCR or Gibson Assembly (NEBuilder HiFi; New England BioLabs, Ipswich, MA). Detailed designs for each construct are illustrated in each figure, and the amino acid sequences are given in
Table 1.
AAP and capsid expression of different AAP constructs evaluated by Western blotting.
HEK293 cells at 60 to 70% confluence on a six-well plate were transfected with 600 ng of pXX680 (adenoviral helper plasmid containing E2A, E4orf6, and VA RNA from human adenovirus 5), 400 ng of pTR-CBA-Luc (AAV packaging plasmid containing CBA-Luciferase flanked by ITRs from AAV2), 600 ng of pXR-AAV1-no AAP (AAV1 Rep and Cap plasmid without ITRs; the AAP ORF is terminated by introducing a stop codon at position 64), and 400 ng of pCDNA3.1 AAP constructs (different AAP substitution, deletion, and fusion constructs driven by a cytomegalovirus promoter) using polyethylenimine (PEI) as the transfection reagent. At 3 days posttransfection, cell pellets were washed two times with 1× Dulbecco-modified phosphate-buffered saline (DPBS) and lysed with 200 μl of 1× passive lysis buffer (Promega, Madison, WI) for 30 min on ice with Halt protease inhibitor (Thermo Fisher). Supernatants were collected after centrifugation at 13,000 × g for 5 min at 4°C. The samples were prepared for Western blotting in 1× LDS loading dye (Thermo Fisher) and 100 mM dithiothreitol (DTT), then boiled at 95°C for 5 min, and loaded on a NuPAGE 4 to 12% Bis-Tris SDS-PAGE gel (Thermo Fisher). The protein bands were transferred to nitrocellulose membrane (Thermo Fisher) using a semidry Xcell Surelock module (Thermo Fisher). VP, AAP, and actin proteins were detected by using B1 hybridoma supernatant at a 1:50 dilution, mouse α-rhodopsin antibody (1D4; ab5417) at a 1:2,000 dilution, and α-actin antibodies (ab3280) at a 1:1,000 dilution, respectively; goat α-mouse antibody labeled with horseradish peroxidase (HRP) at a 1:10,000 dilution was used as the secondary antibody. A chemiluminescence reaction was initiated with enhanced chemiluminescent substrate (SuperSignal West Femto maximum sensitivity; Thermo Fisher), and the membrane was developed on an AI600RGB system (Amersham Biosciences, Little Chalfont, United Kingdom).
AAP-dependent capsid assembly/vector production evaluated by qPCR.
HEK293 cells at 60 to 70% confluence on a six-well plate were transfected with pXX680 (600 ng), pTR-CBA-Luc (400 ng), pXR-AAV1-no AAP (600 ng), and pCDNA3.1 AAP constructs (400 ng) using PEI. The transfection medium was replaced with fresh medium after 24 h, and the supernatant was harvested at 5 days posttransfection. Supernatants were collected after centrifugation at 13,000 × g for 2 min. The supernatants were used directly for standard quantitative PCR (qPCR) analysis to determine the vector yield and for transduction assays. Subsequent steps involved the harvesting of media, polyethylene glycol precipitation, iodixanol ultracentrifugation, and buffer exchange. Supernatants were treated with DNase (90 μg/ml) for 1 h at 37°C. DNase was inactivated by the addition of EDTA (13.2 mM), followed by proteinase K (0.53 mg/ml) digestion for 2 h at 55°C. Recombinant AAV vector titers were determined by qPCR with primers that amplify AAV2 inverted terminal repeat (ITR) regions (5′-AACATGCTACGCAGAGAGGGAGTGG-3′ and 5′-CATGAGACAAGGAACCCCTAGTGATGGAG-3′). The relative vector yields from different AAP constructs were normalized to wild-type AAP1 unless specifically indicated otherwise in the figure legend.
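The normalization of vector yields to wild-type AAP1 can be sketched in Python as follows. The titers and construct names are hypothetical placeholders rather than measured values, and the function name is introduced only for illustration.

```python
from typing import Dict


def relative_yield(titers: Dict[str, float],
                   reference: str = "WT AAP1") -> Dict[str, float]:
    """Express each qPCR titer as a percentage of the reference construct."""
    if reference not in titers:
        raise KeyError(f"reference construct {reference!r} not found")
    ref_titer = titers[reference]
    if ref_titer <= 0:
        raise ValueError("reference titer must be positive")
    return {name: 100.0 * titer / ref_titer for name, titer in titers.items()}


# Hypothetical titers in vector genomes per ml.
titers = {
    "WT AAP1": 2.0e9,
    "construct A": 1.0e9,
    "construct B": 4.0e9,
}

normalized = relative_yield(titers)
for name, percent in normalized.items():
    print(f"{name}: {percent:.0f}% of wild type")
```

With these placeholder values, construct A is reported at 50% and construct B at 200% of the wild-type yield.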
In vitro AAV transduction assays.
AAV vectors produced with different AAP constructs packaging ssCBA-Luc transgenes were prediluted in DMEM plus 5% FBS plus P/S. Portions (50 μl) of recombinant AAV vectors (1,000 to 10,000 vector genomes [vg]/cell) were mixed with 50 μl of 5 × 10⁴ HEK293 cells and added to tissue culture-treated, black, transparent-bottom 96-well plates (Corning, Corning, NY). The plates were incubated in 5% CO2 at 37°C for 48 h. The cells were then lysed with 25 μl of 1× passive lysis buffer (Promega) for 30 min at room temperature. The luciferase activity was measured on a Victor 3 multilabel plate reader (Perkin-Elmer) immediately after the addition of 25 μl of luciferin (Promega). All readouts were normalized to wild-type AAP1 or AAV1 controls.
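The dosing arithmetic for the transduction assay can be sketched in Python as follows. The sketch assumes that each well receives 5 × 10⁴ cells in total and that the whole dose is delivered in the 50-μl vector portion. The function names are hypothetical and introduced only for illustration.

```python
def vector_genomes_per_well(vg_per_cell: int, cells_per_well: int) -> int:
    """Total vector genomes needed to reach a given dose per cell."""
    return vg_per_cell * cells_per_well


def required_stock_vg_per_ul(vg_per_cell: int, cells_per_well: int,
                             portion_ul: float = 50.0) -> float:
    """Concentration of the prediluted vector (vg/ul) so that one portion
    delivers the requested dose (assumes the full portion is added)."""
    if portion_ul <= 0:
        raise ValueError("portion volume must be positive")
    return vector_genomes_per_well(vg_per_cell, cells_per_well) / portion_ul


CELLS_PER_WELL = 50_000  # assumed total cells per well

low = required_stock_vg_per_ul(1_000, CELLS_PER_WELL)
high = required_stock_vg_per_ul(10_000, CELLS_PER_WELL)
print(f"prediluted vector: {low:.1e} to {high:.1e} vg/ul")
```

Under these assumptions, the prediluted vector would need to be between 1 × 10⁶ and 1 × 10⁷ vg/μl to cover the stated dose range.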
Immunofluorescence and confocal microscopy.
HEK293 cells were seeded on a 12-mm-diameter poly-lysine-treated glass coverslip (GG-12-1.5-PDL; NeoVitro, Vancouver, WA) in a 24-well plate. Cells were transfected at 60 to 70% confluence with pCDNA3.1 AAP alone (250 ng) using PEI as the transfection reagent. At 2 days posttransfection, the cells were fixed with 4% paraformaldehyde in phosphate-buffered saline (PBS) for 15 min and permeabilized with 0.1% Triton X-100 in PBS for 10 min. The nucleolus was stained using α-C23 (D-6) antibody, followed by goat α-mouse IgG H+L Alexa Fluor 594 (Thermo Fisher). The nucleus was stained with DAPI (4′,6′-diamidino-2-phenylindole; Thermo Fisher). Coverslips were mounted onto microscope slides using ProLong Diamond mountant (Invitrogen, Carlsbad, CA). Fluorescence images were taken by using a Zeiss LSM 710 spectral confocal laser scanning microscope at the UNC Microscopy Service Laboratory.
Immunoprecipitation assays.
HEK293 cells at 60 to 70% confluence on a 15-cm plate were transfected with pXX680 (3,000 ng), pTR-CBA-Luc (2,500 ng), pXR-AAV1-no AAP (3,000 ng), and pCDNA3.1 AAP constructs (7,500 ng) using PEI as the transfection reagent. At 3 days posttransfection, the cells were harvested from the plate in cold 1× DPBS, followed by two more washes in 1× DPBS. The pellets were resuspended in 400 μl of buffer D (20 mM HEPES/KOH [pH 7.9], 25% glycerol, 0.1 M KCl, 0.2 mM EDTA) and lysed on ice for 30 min with Halt protease inhibitor (Thermo Fisher). Lysates were spun at 13,000 × g for 2 min. Then, 5% of the supernatant was retained as input and prepared in 1× LDS buffer and 100 mM DTT. Next, 10 μl of Protein G Mag Sepharose Xtra beads washed three times in buffer D (GE Healthcare, Chicago, IL) was added to the remaining lysate; samples were then placed on a rotator at 4°C for 2 h. Next, the beads were washed twice for 30 min each in buffer D, followed by resuspension in 1× LDS buffer and 100 mM DTT. Samples were denatured at 95°C for 5 min and then loaded onto a precast 10% Bis-Tris gel (Thermo Fisher) and run in MOPS-SDS buffer. The protein was transferred to a 0.45-μm-pore size nitrocellulose membrane (Thermo Fisher) in a wet transfer apparatus. AAP was detected using α-human-HRP at a 1:10,000 dilution and visualized after reaction with Femto Western blot substrate (Thermo Fisher) on an Amersham AI600RGB system (Amersham Biosciences). Membranes were incubated in 30% peroxide for 30 min at room temperature and then reblocked. VP and actin proteins were detected by using B1 hybridoma supernatant at a 1:50 dilution and α-actin antibody (ab3280) at a 1:2,000 dilution, respectively; goat anti-mouse-HRP at a 1:20,000 dilution was used as the secondary antibody. The membranes were again visualized as described above.
Statistical analysis.
All error bars shown represent one standard deviation. Statistical analysis was carried out in GraphPad Prism software using an unpaired, two-tailed Student
t test or two-way analysis of variance (ANOVA) in
Fig. 9 (*,
P ≤ 0.05; **,
P ≤ 0.01; ***,
P ≤ 0.001; ****,
P ≤ 0.0001).
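The significance notation above can be expressed as a small Python helper. This is a minimal sketch for illustration only. The function name is hypothetical, and the "ns" label for P values above 0.05 is an assumption that does not appear in the text.

```python
def significance_stars(p_value: float) -> str:
    """Map a P value to the asterisk notation used in the figures."""
    if not 0.0 <= p_value <= 1.0:
        raise ValueError("P value must lie between 0 and 1")
    # Check the strictest threshold first so the most stars win.
    thresholds = ((0.0001, "****"), (0.001, "***"), (0.01, "**"), (0.05, "*"))
    for threshold, stars in thresholds:
        if p_value <= threshold:
            return stars
    return "ns"  # assumed label for nonsignificant results


for p in (0.2, 0.03, 0.004, 0.0005, 0.00002):
    print(p, significance_stars(p))
```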