INTRODUCTION
Our model for studying viral DNA-binding proteins is one of the eight known human herpesviruses, Kaposi’s sarcoma-associated herpesvirus (KSHV, HHV-8). KSHV is the etiological agent responsible for several human diseases including Kaposi’s sarcoma (KS), multicentric Castleman’s Disease, primary effusion lymphomas, and KSHV inflammatory cytokine syndrome (
1 – 4). Drugs currently used to treat KSHV-associated malignancies are at best moderately effective, and therapies for related human herpesviruses have been shown to be ineffective (
5 – 9). A specific cure or vaccine for the treatment or prevention of KSHV has not yet been approved for clinical use (
5,
6,
10). Given the endemic spread of the disease in Africa and its prevalence in transplant and HIV-infected patients, there is still a need for novel KSHV therapeutic targets (
5,
9). Herpesviruses contain a double-stranded DNA genome and encode their own DNA replication proteins that are essential for successful viral production and dissemination. Therefore, it is critical to characterize the protein-DNA interactions that contribute to viral proliferation, with the goal being to potentially disrupt them and prevent viral spread.
Like all members of the Herpesviridae family, KSHV has a biphasic life cycle, consisting of a prolonged latency with recurring episodes of lytic reactivation. Both phases contribute to the pathogenesis of disease and promote lifelong infections of hosts (
11,
12). However, it is only during the KSHV lytic cycle that new infectious virions are produced. A key step to generating new virions is successful viral DNA replication. Latent viral DNA replication relies on host-cell proteins, while lytic viral DNA replication is controlled by the KSHV-encoded viral DNA replication proteins (
13). In this study, we focused on one of the essential KSHV DNA replication proteins, RTA (replication and transcription activator) encoded by
ORF50 (
14). RTA is known to have dual roles in lytic reactivation, including (i) activating downstream lytic genes as a viral transcription factor (
15,
16) and (ii) binding the KSHV origin of replication DNA as an origin binding protein (
17 – 19). Moreover, it has previously been demonstrated that RTA expression is sufficient and necessary to induce reactivation of the lytic phase in various KSHV cell culture model systems (
20 – 22). Thus, RTA is an appealing anti-viral therapeutic target due to its key activity as a lytic switch protein (
23,
24) and its role in initiating lytic DNA replication.
The cumulative results of previous studies characterizing RTA have shown that RTA expression is highly regulated, and RTA forms complexes with host (
24 – 27) and viral (
27,
28) proteins (
19,
29 – 31). Traditional, gel-based studies with truncated forms of RTA (C-terminus deletion, 1–321 amino acids, aa), full-length RTA (1–691 aa), and RTA with internal deletions identified the DNA-binding domain (1–390 aa), the DNA-binding inhibitory sequence (490–535 aa), and the dimerization domain (1–414 aa) (
31,
32). Similar studies have also been used to identify interaction domains and characterize cooperativity between RTA and other trans-acting cellular factors (MDM2, OCT1, RBP-Jκ, etc.) that have been shown to regulate the abundance of the protein, mediate RTA autoregulation, and activate cellular pathways during reactivation (
28,
31 – 34). The expected monomeric molecular weight of RTA is ~74 kDa; however, the addition of posttranslational modifications increases the predicted molecular weight to approximately 110 kDa (
35). It has also been suggested that the protein does not function as a monomer but forms tetramers and higher-order multimers when binding DNA (
32). Experiments combining full-length RTA and a C-terminus deletion mutant showed inhibition of RTA binding to viral promoter regions, further demonstrating that the ability of RTA to form homodimers and/or multimers may be crucial for its activity (
20). Overall, our understanding of the observed molecular weight and oligomeric state of RTA has been informed by traditional assays [immunoblotting, affinity-IP, electrophoretic mobility shift assay (EMSA)] that look at the predominant protein species (
32,
35,
36).
The previously identified RTA-DNA-binding sites are known as RTA response elements or RREs (
37). RTA, as a potent transactivator, binds to RREs in both early and late lytic viral promoters (
38). As a DNA replication protein, RTA binds to sequences within the two KSHV lytic origins of replication (OriLyt-L and -R), which share a ~1.2 kb long homologous sequence (
39). RTA binding to the KSHV lytic origins initiates DNA replication by recruiting the core viral DNA replication proteins: ORF9/polymerase, ORF59/polymerase processivity factor, ORF6/single-stranded DNA-binding protein, ORF56/primase, ORF44/helicase, and ORF40/41/primase-associated factor (40). It has also been determined that the 32 bp RTA RRE (5′ CTACCCCCAACTGTATTCAACCCTCCTTTGTTT 3′) found in origin DNA is required for OriLyt-dependent DNA replication (
19,
39,
41). These previous studies form the foundation of our understanding about RTA.
In this study, we directly visualized purified viral proteins and viral DNA via TEM to characterize the molecular interactions involved in initiating DNA replication for a human herpesvirus. We also expanded upon our understanding of the degree to which DNA-binding locations and protein conformation are heterogeneous and how protein monomers and dimers function differently with respect to DNA-binding location specificity. TEM images of individual DNA and protein molecules were collected and then quantified to map the DNA-binding positions of RTA bound to the KSHV OriLyt-L and -R, and the relative size of RTA unbound or bound to DNA was compared to protein standards (
42,
43) to ascertain whether RTA was monomeric, dimeric, or oligomeric under the different conditions tested. Using our TEM approach to directly visualize RTA and KSHV DNA, we successfully quantified a highly heterogeneous data set, identified new binding sites for RTA, and provided evidence suggesting that RTA forms dimers and higher-order multimers when bound to OriLyt DNA. This study enhances our understanding of human herpesvirus DNA replication proteins; more specifically, we show that the viral origin binding protein binds to discrete locations within the viral origin DNA and that the binding sites differ between protein monomers and dimers.
MATERIALS AND METHODS
OriLyt-L was isolated from pDA15 (
44), a generous gift from David AuCoin and Cyprian Rossetta at the University of Nevada. OriLyt-R DNA was synthesized by GeneArt (ThermoFisher) using the BAC16 KSHV sequence (accession MK733609) and subcloned into the pRSET-A plasmid via NdeI/XhoI (see
Table S1, primer pair 1). To purify the 1.8 kb OriLyt-L fragment of DNA, the plasmid containing the origin DNA sequence was digested with HindIII and EcoRI (New England Biolabs) according to the manufacturer’s protocol (
Fig. S2A and B). To isolate the approximately 2.4 kb OriLyt-R fragment, pRSET-A-OriLyt-R was digested with XhoI, ScaI, and NdeI (New England Biolabs) according to the manufacturer’s protocol (
Fig. S2C and D). DNA fragments were separated on a 0.8% agarose gel and then purified using the QIAquick gel extraction kit (Qiagen). To incorporate biotin at the 5′ end of OriLyt DNA, digested fragments were incubated with Klenow Exo- polymerase (New England Biolabs), 2.8 µM biotin-dCTP (Invitrogen), and unlabeled dGTP, dATP, and dTTP for 1 hour at 37°C prior to gel extraction. OriLyt-L restriction digest with HindIII generated a DNA overhang that, when filled in with Klenow Exo- polymerase, incorporated a biotin-dCTP opposite the G nucleotide (
Fig. S2A). OriLyt-R restriction digest with XhoI generated a DNA overhang that, when filled in with Klenow Exo- polymerase, incorporated a biotin-dCTP opposite the G nucleotide (
Fig. S2C).
Lytic origin DNA fragments A–C were generated by PCR amplification of the pDA25 OriLyt-L or pRSET-A-OriLyt-R plasmid using primer pairs 2–6 (
Table S1). Either AccuPrime Pfx DNA polymerase (Invitrogen) or Platinum Taq DNA polymerase (Invitrogen) was used, per the manufacturer’s instructions, to amplify viral DNA fragments. Fragments were then isolated by gel extraction. For AP-1 site fragments, single-stranded DNA oligos (
Table S2) were synthesized (Eurofins) and annealed.
Full-length (aa 1–691) and truncated (aa 1–321) RTA sequences were expressed and purified using the Sf9 insect cell expression system (ThermoFisher). Gene sequences were based on the JSC-1 sequence (accession
GQ994935). Alcohol dehydrogenase (Sigma) and conalbumin (Cytiva, Gel Filtration Calibration Kits) were purchased commercially and resuspended according to the manufacturer’s recommendations.
Preparation of DNA and protein for TEM
DNA-protein binding assays were conducted at a mass ratio of 2:1 (DNA:RTA) in 50 or 100 µL total reaction volume (4 mM HEPES, 10 mM NaCl, 0.1 mM DTT, and 0.1 mM EDTA) for 30 or 60 min at room temperature. Briefly, 400 ng of linearized DNA (OriLyt-L, -R) was incubated in the reaction buffer at room temperature for 10 min. Next, RTA (200 or 400 ng) was added to the reaction with DNA and incubated for an additional 30 min at room temperature. The relative protein-to-DNA mass ratio was empirically tested (data not shown) to identify conditions that yielded a mixture of DNA with and without bound protein and minimal numbers of protein aggregates, which were observed at higher protein ratios. To label the 5′ biotin DNA with streptavidin, 0.01 mg/mL streptavidin (Invitrogen) was added to the binding reaction and incubated for 20 min at room temperature. Samples were either applied directly to a size exclusion column or first fixed with 0.6% glutaraldehyde for 5 min at room temperature and then passed over a 2 mL column of 2% agarose beads (Agarose Bead Technologies) equilibrated in 10 mM Tris-HCl (pH 7.6) and 1.0 or 0.1 mM EDTA. Sample fractions were collected for direct mounting onto carbon supports for tungsten shadow casting and TEM (
45). For individual proteins, RTA, truncated RTA, conalbumin, and alcohol dehydrogenase were diluted in 4 mM HEPES, 10 mM NaCl, 0.1 mM DTT, and 0.1 mM EDTA and prepared for TEM.
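To put the DNA:RTA mass ratio in molecular terms, the short calculation below converts the stated masses into approximate molar amounts. It is only a rough sketch under two assumptions that do not come from the text: an average mass of about 650 g/mol per base pair of double-stranded DNA, and RTA counted as unmodified ~74 kDa monomers. The function names are hypothetical.

```python
# Rough mass-to-mole conversion for a DNA-protein binding reaction.
# Assumptions (not from the text): ~650 g/mol per bp of dsDNA, and RTA
# counted as unmodified ~74 kDa monomers.

AVG_BP_MASS_G_PER_MOL = 650.0     # assumed average mass of one dsDNA base pair
RTA_MONOMER_G_PER_MOL = 74_000.0  # expected monomeric molecular weight (~74 kDa)


def picomoles(mass_ng: float, molar_mass_g_per_mol: float) -> float:
    """Convert a mass in ng to pmol (ng / (g/mol) gives nmol; x1000 gives pmol)."""
    return mass_ng / molar_mass_g_per_mol * 1000.0


def rta_monomers_per_dna(dna_ng: float, dna_length_bp: int, rta_ng: float) -> float:
    """Approximate number of RTA monomers per DNA molecule in the reaction."""
    dna_pmol = picomoles(dna_ng, dna_length_bp * AVG_BP_MASS_G_PER_MOL)
    rta_pmol = picomoles(rta_ng, RTA_MONOMER_G_PER_MOL)
    return rta_pmol / dna_pmol


if __name__ == "__main__":
    # 400 ng of the 1.8 kb OriLyt-L fragment with 200 ng of RTA
    ratio = rta_monomers_per_dna(dna_ng=400, dna_length_bp=1800, rta_ng=200)
    print(f"~{ratio:.1f} RTA monomers per DNA molecule")
```

Under these assumptions, 400 ng of the 1.8 kb fragment with 200 ng of RTA corresponds to roughly eight RTA monomers per DNA molecule, so a 2:1 mass ratio is still a molar excess of protein.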
Tungsten-shadow casting
DNA or DNA-protein complexes were prepared by tungsten rotary shadow casting as previously described (
46). Sample fractions were mixed with 2 mM spermidine and incubated on glow-discharge-treated (charged), carbon-coated 400-mesh copper grids (Electron Microscopy Sciences, EMS) for 3 min. Carbon grids were washed in high-grade distilled water, dehydrated in a series of ethanol washes (25%, 50%, 75%, and 100%), air-dried, and rotary shadow-cast with tungsten (EMS). Samples were visualized on an FEI Tecnai 12 or Philips CM12 TEM at 40 kV. Micrographs were captured at 15,000× on a Gatan First Light CCD camera using Gatan Digital Micrograph software (Gatan, Pleasanton, CA). TEM micrographs were contrast adjusted and inverted using Adobe Photoshop software.
Electrophoretic mobility shift assay
For EMSA binding reactions, a mass ratio of either 6:1 or 4:1 (DNA:RTA) was used. Briefly, 12 µg of RTA and 50–80 ng DNA were incubated in 50 mM NaCl, 20 mM HEPES, 0.1 mM EDTA, 0.5 mM DTT, and 13% glycerol in a total reaction volume of 20 µL. Reactions were incubated on ice for 30 min. For competition assays, unlabeled competitor DNA (200–300 ng) was added in 4–5-fold excess to the reaction prior to the specific DNA. Reactions were loaded into a 0.7–1.0% agarose gel (Invitrogen) in 1× Tris-acetate-EDTA (TAE), pH 9.15, and run for 90 min at 75 V and then for 60–90 min at 90 V. Gels were imaged using a Bio-Rad ChemiDoc system (Bio-Rad), and DNA was visualized as total DNA via GelRed staining (Biotium) or using an oligo-specific dye (Alexa-488, Cy-5, or Cy-3).
Data acquisition and analysis
Distances along the length of the DNA and the area of each protein were manually traced using Gatan Digital Micrograph or ImageJ software (NIH). Length and area were measured in pixels and pixels², respectively. To convert pixels to nm, measurements were multiplied by a conversion factor: the 200 nm length of the scale bar visible in all collected TEM images divided by the scale bar’s length measured in pixels. To convert nm to base pairs, DNA measurements in nm were divided by the known length of one base pair in nm (
47,
48).
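The unit conversions described above can be sketched as follows. This is an illustrative outline, not the authors' analysis script: the function names are hypothetical, the nm-to-bp step is expressed as division by an assumed B-form rise of 0.34 nm per base pair (the value taken from the cited references may differ), and areas are scaled by the square of the linear factor.

```python
# Sketch of the pixel -> nm -> bp conversions for TEM measurements.
# Assumption (not from the text): 0.34 nm rise per base pair (B-form DNA).

SCALE_BAR_NM = 200.0       # scale bar length shown in every TEM image
NM_PER_BP_ASSUMED = 0.34   # assumed length of one base pair in nm


def nm_per_pixel(scale_bar_px: float) -> float:
    """Linear conversion factor from the scale bar's measured pixel length."""
    if scale_bar_px <= 0:
        raise ValueError("scale bar length in pixels must be positive")
    return SCALE_BAR_NM / scale_bar_px


def length_px_to_nm(length_px: float, scale_bar_px: float) -> float:
    """Convert a traced DNA contour length from pixels to nm."""
    return length_px * nm_per_pixel(scale_bar_px)


def area_px2_to_nm2(area_px2: float, scale_bar_px: float) -> float:
    """Convert a traced protein area from pixels^2 to nm^2 (factor squared)."""
    return area_px2 * nm_per_pixel(scale_bar_px) ** 2


def length_nm_to_bp(length_nm: float, nm_per_bp: float = NM_PER_BP_ASSUMED) -> float:
    """Convert a DNA length in nm to base pairs."""
    return length_nm / nm_per_bp


if __name__ == "__main__":
    # Hypothetical measurements: a 100-pixel scale bar and a 306-pixel contour.
    contour_nm = length_px_to_nm(306, scale_bar_px=100)
    print(f"{contour_nm:.0f} nm, about {length_nm_to_bp(contour_nm):.0f} bp")
```

Keeping the scale bar measurement as an explicit argument allows a separate factor per micrograph, which matters if images were collected at slightly different magnifications.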
Following quantification of all the parameters for each DNA-protein complex, the data were analyzed using Microsoft Excel and Prism software (GraphPad), and sequence-specific analysis was completed using CLC Main Workbench 20 (Qiagen). Statistical analysis was conducted in Prism using an unpaired t-test to compare the measured protein areas, and P values are reported in the figure legends.
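For readers who want to see what an unpaired t-test on two sets of area measurements involves, a minimal sketch using only the Python standard library is given below. It assumes the pooled-variance (Student's) form; the text does not state whether that form or Welch's correction was selected in Prism. The function name and the example areas are placeholders, not measured data, and the P value would still need to be obtained from a t distribution (for example, in a statistics package).

```python
from math import sqrt
from statistics import mean, variance


def unpaired_t(sample_a, sample_b):
    """Return (t statistic, degrees of freedom) for a pooled-variance unpaired t-test."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each sample needs at least two measurements")
    df = n_a + n_b - 2
    # Pooled variance weights each sample variance by its degrees of freedom.
    pooled_var = ((n_a - 1) * variance(sample_a) + (n_b - 1) * variance(sample_b)) / df
    std_err = sqrt(pooled_var * (1 / n_a + 1 / n_b))
    return (mean(sample_a) - mean(sample_b)) / std_err, df


if __name__ == "__main__":
    # Placeholder areas in nm^2 (illustrative only, not measured values).
    unbound_areas = [110.0, 125.0, 98.0, 130.0, 117.0]
    bound_areas = [210.0, 190.0, 225.0, 240.0, 205.0]
    t_stat, dof = unpaired_t(unbound_areas, bound_areas)
    print(f"t = {t_stat:.2f}, df = {dof}")
```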
DISCUSSION
In this study, we have provided data using a microscopy-based method for examining purified protein behavior at the molecular level, expanded upon our understanding of RTA DNA-binding position, and identified the oligomeric structure of RTA during origin DNA binding. We demonstrated that RTA bound to discrete regions of OriLyt DNA with high frequency (
Fig. 3 and 4) and does so primarily as a dimer (
Fig. 5 and 6). Our data support what has been previously posited about the RTA oligomeric state using traditional methods such as EMSA, sodium dodecyl sulfate–polyacrylamide gel electrophoresis (SDS-PAGE), and immunoblotting (
32,
36). However, our findings delineate the oligomeric state of RTA when bound to viral DNA compared to protein alone and capture the range of RTA-binding locations and oligomeric states. We presented evidence that the number of RTA dimers and multimers increased (
Fig. 5B through E) when bound to viral DNA as indicated by the nearly 50% reduction in measured monomers when RTA was bound to DNA.
While our single-molecule EM approach enables the analysis of purified proteins and DNA, this approach has several limitations. Primarily, we cannot replicate the entire cellular environment, including additional cofactors, proteins, and the DNA architecture within the nucleus, all of which may influence protein activity. In addition, we are unable to replicate the distinct stages of the KSHV lytic phase, and as a result, we cannot delineate the temporal relationship between RTA and its DNA-binding locations. Our single-molecule analysis provides “snapshots” of the DNA locations that RTA binds to with high frequencies. For comparison, consider a scenario wherein the location of a person’s vehicle on their daily commute from home to work was quantified by taking satellite photos of the vehicle at various times. In this analogy, the route taken represents the length of DNA, and the vehicle represents the protein. If hundreds of pictures were captured, some photos would show the vehicle in transit and others would show it parked. The highest positional frequency would therefore correspond to the locations the car occupied for the greatest amount of time, like home or work. As such, a major advantage of this study’s TEM and downstream quantification is that we directly visualized and measured hundreds of individual RTA proteins and RTA bound to OriLyt DNA to obtain a view of the protein’s highest occupancy locations (i.e., where a vehicle is parked most often) while also capturing the lower frequency events.
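The "snapshot" logic can be illustrated with a small positional-frequency sketch: each measured binding position is assigned to a bin along the DNA, and the bin with the most events is the highest-occupancy location. The function names, the 100 bp bin size, and the example positions are all hypothetical choices made for illustration; they are not the binning scheme or data used in this study.

```python
def positional_frequency(positions_bp, dna_length_bp, bin_size_bp=100):
    """Count binding events per bin along a DNA of known length."""
    if dna_length_bp <= 0 or bin_size_bp <= 0:
        raise ValueError("DNA length and bin size must be positive")
    n_bins = -(-dna_length_bp // bin_size_bp)  # ceiling division
    counts = [0] * n_bins
    for pos in positions_bp:
        if not 0 <= pos <= dna_length_bp:
            raise ValueError(f"position {pos} lies outside the DNA")
        # A position exactly at the far end is placed in the last bin.
        counts[min(int(pos // bin_size_bp), n_bins - 1)] += 1
    return counts


def highest_occupancy_bin(counts, bin_size_bp=100):
    """Return (start_bp, end_bp, count) for the first bin with the most events."""
    best = max(range(len(counts)), key=counts.__getitem__)
    return best * bin_size_bp, (best + 1) * bin_size_bp, counts[best]


if __name__ == "__main__":
    # Made-up positions (bp from the labeled DNA end), for illustration only.
    example_positions = [120, 150, 610, 640, 655, 690, 1210, 1750]
    counts = positional_frequency(example_positions, dna_length_bp=1800)
    print(highest_occupancy_bin(counts))
```

Low-count bins are kept rather than discarded, which mirrors the point that infrequent binding events along the rest of the DNA are also informative.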
Interestingly, our data revealed that the regions where RTA binds viral DNA most frequently do not correlate with known RTA response elements (
52). We observed that the most frequent binding correlated with the positions of AP-1 sites, TATA boxes, and/or A-T rich regions (
Fig. 3I). This pattern was consistent between the OriLyt-L and -R. The RTA localization at AT-rich regions (
36) supports RTA’s role in recruitment of the core viral DNA replication machinery to initiate viral DNA replication (
41,
53). RTA’s binding to AP-1 and TATA box locations (36), in turn, bolsters RTA’s role in transcriptional regulation during the KSHV lytic phase (
33). Together, these data support the idea that RTA plays several important roles in the KSHV lytic cycle.
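As a rough illustration of how A-T rich stretches can be flagged in a DNA sequence, the sketch below scans a sequence with a sliding window and reports windows whose A+T fraction meets a threshold. The window size and threshold are arbitrary assumptions, the function names are hypothetical, and this is not the sequence analysis performed in CLC Main Workbench for this study.

```python
def at_fraction(seq: str) -> float:
    """Fraction of A and T bases in a DNA sequence."""
    if not seq:
        raise ValueError("sequence must not be empty")
    seq = seq.upper()
    return sum(base in "AT" for base in seq) / len(seq)


def at_rich_windows(seq: str, window: int = 20, threshold: float = 0.7):
    """Return (start index, A+T fraction) for each window at or above the threshold."""
    hits = []
    for start in range(len(seq) - window + 1):
        frac = at_fraction(seq[start:start + window])
        if frac >= threshold:
            hits.append((start, frac))
    return hits


if __name__ == "__main__":
    # Toy sequence for illustration only.
    toy = "GGGGCCGGAATTTTAAATATTTAAGGCCGGCC"
    for start, frac in at_rich_windows(toy, window=10, threshold=0.9):
        print(start, f"{frac:.2f}")
```

Overlapping hits could be merged into contiguous regions before being compared with measured binding positions.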
The association between RTA and AP-1 sites is of note because of the broad role that AP-1 protein complexes may play in the transformation, tumorigenesis, and pathogenesis of KSHV (
54 – 56). Both the ORF50 gene and OriLyt DNA contain an AP-1 consensus-binding domain (
39,
57). Given that KSHV induction of IL-6 occurs early in infection (2 hpi) and that this expression is regulated by both AP-1 and RTA, our observed binding to regions containing AP-1 sites may point to the early lytic phase activities of RTA (
58,
59). Additionally, RTA activation of lytic genes such as ORF57 has been shown to be facilitated by AP-1 participation (
57). Thus, our unique single-molecule approach characterizing purified proteins and DNA successfully captured the association of RTA with AP-1 sites within the lytic origin DNA. These observations may indicate a difference in affinity for RTA alone compared to RTA complexed with additional viral or cellular proteins. We hypothesize that at the lytic origin, the abundance of additional KSHV DNA replication proteins could drive localization to RTA RREs, whereas in the absence of these additional viral proteins, we are capturing RTA’s propensity to bind to transcription-activating locations. This would further support the essential role of RTA in all phases of the lytic cycle: immediate early, early, and late.
We primarily focus on the DNA locations with the highest RTA binding. It is equally important to note RTA was captured along the length of DNAs measured, although at lower frequencies. Based on analysis of monomers, dimers, and multimers (
Fig. 6), we predict that RTA binds along the length of the viral DNA; however, dimerization localizes RTA to discrete regions. These data could support a model in which RTA binds and scans DNA and, upon dimerization, locks onto a specific DNA sequence. When we compared fixed and unfixed samples, where fixation preserves both strong and weak DNA-protein interactions, we found that the primary and secondary highest-frequency binding sites were reversed. Under fixed conditions (
Fig. 3E and F), the AP-1 site is the most prominent RTA-binding site, while under the unfixed conditions (
Fig. 3G and H), RTA bound the A-T rich region with greater frequency. These findings support the potential for RTA to bind to A-T rich regions with higher affinity since these binding events are captured at a higher prevalence in the unfixed samples. Another infrequent phenomenon we observed was the formation of RTA-induced DNA-loops (
Fig. S3C). We posit that DNA loops may indicate that RTA dimers or multimers can drive changes in DNA structure, akin to transcriptional activators looping DNA to enhance transcription.
Beyond facilitating the analysis of heterogeneous populations of single molecules, TEM is advantageous for reasons including: (i) TEM uses highly purified components interacting in solution before being preserved for TEM visualization, allowing for real-time capture of protein-DNA interactions and (ii) TEM is compatible with an array of DNA lengths that do not need to be affixed to the surface as is the case for other high-resolution microscopic approaches, such as atomic force microscopy. Our future studies will seek to improve our system by standardizing sample preparation and streamlining the data analysis process to characterize protein-DNA interactions more efficiently. This system is compatible with various lengths of DNA and can be used to assess RTA interactions with various other viral promoters. Additionally, integrating immunogold labeling (
51) will enable the analysis of multiple proteins simultaneously using TEM. Our long-term goal is to examine RTA in concert with other proteins to determine how interactions with viral transcription machinery and AP-1 complexes may alter RTA’s binding specificity for the OriLyt and other KSHV DNA. Overall, our in vitro purified system can be used to elucidate information about viral as well as cellular protein-protein and protein-DNA interactions and may be a useful tool in informing future cell-based studies.