KSHV Episome Tethering Sites on Host Chromosomes: Regulation of Latency-Lytic Switch by the ChAHP complex


 Kaposi’s sarcoma-associated herpesvirus (KSHV) establishes a latent infection in the cell nucleus, but where KSHV episomal genomes are tethered and the mechanisms underlying KSHV lytic reactivation are unclear. Here, we study the nuclear microenvironment of KSHV episomes and show that the KSHV latency-lytic replication switch is regulated via viral long non-coding (lnc)RNA-CHD4 (chromodomain helicase DNA binding protein 4) interaction. KSHV episomes localize with a CHD4 complex, ChAHP, at epigenetically active genomic regions and tether frequently near centromeric regions of host chromosomes. The ChAHP complex also occupies the 5’-region of highly inducible lncRNAs and the terminal repeats of the KSHV genome with latency-associated nuclear antigen (LANA). Viral lncRNA binding competes with CHD4 DNA binding, and KSHV reactivation is accompanied by the detachment of KSHV episomes from host chromosome docking sites. We propose a model in which elevated lncRNA expression determines the KSHV latency-lytic decision by regulating LANA/ChAHP DNA binding at inducible viral enhancers.

Introduction

[...] mediated repression. We propose that the inducible enhancer activity is regulated by a balance between local eRNA transcription activity and ChAHP complex tethering, and that this mechanism regulates KSHV reactivation.

Identification of KSHV episome tethering sites on host chromosomes.
KSHV episomes tether to the host cell chromosomes via LANA; however, the mechanism by which docking sites are selected is not well characterized. To understand how and where KSHV episomes tether to host chromosomes, we applied the Capture Hi-C (CHi-C) method to identify episome docking sites in three naturally KSHV-infected PEL cell lines, BC-1, BC-3 and BCBL-1. The schematic diagram for the CHi-C procedure is presented in Fig. 1a. To obtain high resolution, the Hi-C sequencing library was further enriched by using biotinylated tiling-oligo probes that specifically hybridized to KSHV DNA sequences (Fig. 1a). Using this method, we examined the position of host genomic regions that exhibited higher frequencies of normalized chimeric ligation reads with KSHV DNA fragments. We identified selectively enriched chimeric ligation reads throughout the host chromosomes (Fig. 1b). The mapped reads on 23 individual chromosomes for BC-1, BC-3 and BCBL-1 are presented separately in Supplementary Fig. 1a-c. Each dot represents the number of normalized heterotypic contact sequence reads between host and KSHV sequences. The CHi-C normalized chimeric read counts on chromosome 1 for the three naturally infected PEL cell lines are shown in Fig. 1c. The results indicated that there is a higher number of chimeric read counts near centromeric regions in all three cell lines (Fig. 1c). To study the frequencies of chimeric read counts located near the centromere region, we performed a mathematical characterization, in which we extracted the chimeric read counts derived from regions spanning a distance corresponding to 1% of the size of each chromosome on either the 5' or 3' side of the centromere (marked green and blue).
Counts per 100 kbp were calculated separately for each chromosome and compared to the average number across the individual chromosome. The results showed higher chimeric read counts in either the 5' or 3' region of the centromere (15 out of 23 chromosomes in BCBL-1), and in some instances (e.g., chromosomes 8, 19, and X), we observed enrichment in both 5' and 3' positions (Fig. 1d). To further confirm the CHi-C findings, we performed DNA-FISH with centromere-specific PNA (peptide nucleic acid) probes in combination with LANA immunostaining. These results further showed that a large fraction of LANA dots colocalized with centromeres (Fig. 1e, Supplementary Fig. 2). The results are consistent with a previous report, which demonstrated that LANA colocalized with centromeric protein F (CENPF) and the kinetochore protein Bub1, and was observed at centromere regions during metaphase (63). To determine whether KSHV docking sites are random, we next examined the similarity of KSHV episome tethering sites among the three cell lines using a Jaccard index. We calculated the similarity of tethering sites based on the positions of chimeric sequence reads. The index identified 97.86% (BC-1 vs BC-3), 81.99% (BC-1 vs BCBL-1) and 82.36% (BC-3 vs BCBL-1) similarity (Fig. 1f). The results demonstrated that a majority of KSHV episomes tether to similar host genomic regions in the three naturally infected PEL cell lines, suggesting that there is a preferential nuclear microenvironment that can attract/maintain KSHV latent episomes during cell divisions.

Next, we examined the nuclear protein microenvironment of KSHV episome-tethering sites in infected cells. We hypothesized that, by examining cellular proteins neighboring LANA in infected cells, we should be able to identify the repertoire of proteins important for tethering and selection of KSHV episome-docking sites.
To identify proteins in close proximity to LANA, we used a miniTurboID-based method. miniTurboID is a biotin ligase that covalently attaches biotin to lysine residues in neighboring proteins (<10 nm) in less than 10 minutes with no [...] (Fig. 2a). The iSLK-LANA mTID cells were incubated with D-biotin for 60 minutes in culture media, and biological triplicate samples were prepared (Fig. 2b). This strategy led us to identify 76 host proteins (P<0.05) that were physically neighboring KSHV LANA within a 10 nm radial distance during the period of D-biotin incubation (64). The 76 host proteins include nuclear mitotic apparatus protein (NuMA), bromodomain-containing protein 4 (BRD4), and lysine-specific demethylases 3A and 3B (KDM3A and 3B), which have previously been shown to physically interact with LANA (20, 66-68) (Fig. 2c, Supplementary Table 1). In addition to those previously identified cellular proteins, the study also precipitated, with high confidence, components of the ChAHP complex (39), which is composed of chromodomain helicase DNA-binding protein 4 (CHD4), activity-dependent neuroprotector homeobox protein (ADNP) and HP-1γ (Fig. 2c). Although we could not identify HP-1γ with our statistical criterion in the proteomics study, the protein was previously shown to interact with KSHV LANA (69). The LANA interaction with CHD4 and ADNP was further validated with in vitro pull-down assays, and the results showed that LANA could interact with CHD4 and ADNP in the absence of other viral proteins (Fig. 2d). Further, recombinant GST-tagged LANA deletion proteins (Supplementary Fig. 4) were used to map the interaction domain with CHD4; amino acid (aa) residues 870-1070, near the LANA DNA-binding domain, were found to be responsible for the interaction with CHD4 (Fig. 2e).
Immunofluorescence assays with mono-specific antibodies further confirmed that LANA and CHD4 colocalized in naturally infected BCBL-1 cells (Fig. 2f). Taken together, these results suggest that LANA is able to associate with the ChAHP complex in latently infected cells.
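
Two calculations from the tethering-site analysis above, the centromere-flank enrichment and the Jaccard comparison between cell lines, can be illustrated with a minimal Python sketch. This is not the analysis code used in the study. The binning of read positions into 100 kbp bins for the Jaccard index, the rounding of flanks to whole bins, the treatment of missing bins as zero, and all function and parameter names are assumptions made for illustration.

```python
import math
from typing import Dict, Iterable, Set, Tuple

Bin = Tuple[str, int]  # (chromosome, bin index); hypothetical representation


def contact_bins(reads: Iterable[Tuple[str, int]], bin_size: int = 100_000) -> Set[Bin]:
    """Collapse host-side positions of chimeric reads into a set of genomic bins."""
    return {(chrom, pos // bin_size) for chrom, pos in reads}


def jaccard_index(a: Set[Bin], b: Set[Bin]) -> float:
    """Jaccard index: size of the intersection divided by size of the union."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def _mean_count(counts: Dict[int, float], first_bin: int, last_bin: int) -> float:
    """Mean count over bins first_bin..last_bin-1; bins absent from counts are zero."""
    n_bins = last_bin - first_bin
    if n_bins <= 0:
        return 0.0
    return sum(counts.get(i, 0.0) for i in range(first_bin, last_bin)) / n_bins


def centromere_flank_enrichment(
    counts: Dict[int, float],   # normalized counts per bin for one chromosome
    chrom_length: int,
    cen_start: int,
    cen_end: int,
    bin_size: int = 100_000,
    flank_fraction: float = 0.01,  # flank width = 1% of chromosome size
) -> Tuple[float, float]:
    """Return (5' flank, 3' flank) mean counts per bin relative to the chromosome mean."""
    window = int(chrom_length * flank_fraction)
    n_chrom_bins = math.ceil(chrom_length / bin_size)
    chrom_mean = _mean_count(counts, 0, n_chrom_bins)
    if chrom_mean == 0:
        return 0.0, 0.0
    # Flanks are approximated to whole bins on each side of the centromere.
    left_first = max(0, (cen_start - window) // bin_size)
    left_last = cen_start // bin_size
    right_first = math.ceil(cen_end / bin_size)
    right_last = min(n_chrom_bins, math.ceil((cen_end + window) / bin_size))
    left = _mean_count(counts, left_first, left_last) / chrom_mean
    right = _mean_count(counts, right_first, right_last) / chrom_mean
    return left, right
```

A ratio above 1 for either flank indicates more chimeric reads per bin next to the centromere than the chromosome-wide average. The Jaccard value depends on the chosen bin size, so it should be compared across cell-line pairs only at a fixed bin size.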

Association of KSHV episome-tethering sites with ChAHP complex binding.
The protein interaction and colocalization between CHD4 and LANA "dots" in the nucleus led us to further investigate the localization of CHD4, ADNP and LANA on both host and KSHV chromosomes. To identify the chromatin occupancy site(s) for the ChAHP complex and LANA, we employed Cleavage Under Targets and Release Using Nuclease (CUT&RUN) (70). The LANA, CHD4, and ADNP CUT&RUN peaks clearly overlapped at multiple sites on host cell chromosomes with active histone marks (H3K27Ac), which include the previously described IRF4 super-enhancer region (71, 72) (Fig. 3a). NGS plots between LANA CUT&RUN summit peaks and CHD4 or ADNP further confirmed co-occupancy of LANA and the ChAHP complex on the host chromosomes (Fig. 3b). Because LANA and the ChAHP complex interact with each other, we hypothesized that the LANA-ChAHP complex could be important for tethering KSHV episomes to host chromosomes. To test this, we generated NGS plots between LANA, CHD4 and ADNP CUT&RUN summit peaks and the distribution of CHi-C chimeric reads. We observed that KSHV episome tethering sites were indeed primarily localized near the LANA, CHD4 and ADNP binding sites, whereas this was not observed for H3K4me1 (Fig. 3c) or H3K27me3 (data not shown). To quantify the degree of association, we measured the relative CHi-C chimeric reads per million and examined their association with relative distance from CHD4 binding sites.
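
The distance-based measurement described above can be sketched in Python as follows. The text does not define the accumulation index formally, so this sketch assumes it is the fraction of chimeric reads that fall within a chosen distance of the nearest CHD4 peak summit on one chromosome. The function names, the single-chromosome simplification, and the distance cutoff are hypothetical.

```python
from bisect import bisect_left
from typing import List, Sequence


def distance_to_nearest(pos: int, sorted_summits: Sequence[int]) -> int:
    """Distance in bp from pos to the closest summit; summits must be sorted and non-empty."""
    i = bisect_left(sorted_summits, pos)
    candidates: List[int] = []
    if i < len(sorted_summits):
        candidates.append(sorted_summits[i] - pos)      # nearest summit at or after pos
    if i > 0:
        candidates.append(pos - sorted_summits[i - 1])  # nearest summit before pos
    return min(candidates)


def accumulation_fraction(
    read_positions: Sequence[int],  # host-side positions of chimeric reads (one chromosome)
    summits: Sequence[int],         # CHD4 peak summit positions (same chromosome)
    max_distance: int,              # assumed cutoff in bp
) -> float:
    """Fraction of reads lying within max_distance of the nearest summit."""
    if not read_positions or not summits:
        return 0.0
    ordered = sorted(summits)
    near = sum(1 for p in read_positions if distance_to_nearest(p, ordered) <= max_distance)
    return near / len(read_positions)
```

Evaluating `accumulation_fraction` over a range of `max_distance` values gives a cumulative curve; a value above 0.5 at a small cutoff would correspond to the statement that more than half of the CHi-C reads lie close to CHD4 binding sites.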

The accumulation index calculated from the NGS plot suggested that more than 50% of CHi-C reads are located close to CHD4 binding sites within the genomic regions (Fig. 3d). Altogether, these results suggest that KSHV episomes tether to host chromosomes at ChAHP binding sites.

KSHV LANA colocalization with ChAHP complex on KSHV genome.
We next examined ChAHP binding sites on the KSHV episome. Consistent with the studies of cellular chromosomes, CHD4 and ADNP occupy genomic loci with active histone marks, which include the promoter regions of KSHV long non-coding RNAs (PAN RNA, T0.7, and T1.5 [Ori-RNA]) (Fig. 4a). The results also showed colocalization among LANA, CHD4, and ADNP with the active histone mark, H3K27Ac, along the KSHV genome, and exceptionally strong peaks were seen at terminal repeat (TR) regions (read counts are depicted in Fig. 4a), where multiple copies of LANA bind (73). The strong peaks at TR regions are likely due to a combination of tighter binding of the complex and the presence of multiple copies of the same sequences. The strong signals at TR regions are unlikely to be due to mapping problems, because H3K27me3 showed a lower number of sequence reads at the TR than in the unique region. The three viral lncRNAs, especially PAN RNA, are known to be expressed at significantly higher transcript copy numbers than the open reading frames during lytic replication (54, 74), and CHD4, ADNP, and LANA were clearly localized at the 5' regions of these lncRNA promoters (Fig. 4a). Next, the effects of KSHV reactivation on CHD4 occupancy were examined by CUT&RUN with qPCR. The results suggested that CHD4 occupancy on the KSHV genome was reduced during KSHV reactivation (Fig. 4b). We further studied the effects of KSHV reactivation on KSHV episome tethering with the ChAHP complex on host cell chromosomes.
For this, TREx-BCBL-1 cells were reactivated for twenty-four hours and CHi-C samples were prepared. We measured the relative chimeric DNA sequence frequencies of KSHV with host chromosomes before and after KSHV reactivation. The relative amount of chimeric sequence reads at one of the KSHV episome tethering sites was visualized with Juicebox. By subtracting the number of relative sequence reads in latent samples from those in reactivated samples, we also visualized the changes caused by induction of active viral transcription. The results showed that KSHV chimeric sequence reads were reduced, suggesting that KSHV reactivation induced detachment of the KSHV episome from host chromosomes, and a similar detachment was also seen at other putative episome-tethering sites (Fig. 4c) [...] Table 2). Among the interacting proteins, 74 proteins were common between the PAN MRE Wt and PAN MRE mutant, while 55 proteins were found only in the presence of the wild-type PAN RNA sequence (Fig. 5c, d). Deletion of MRE seemed to unleash ORF57 protein and allow ORF57 to interact more freely with other RNA binding proteins (Fig. 5c). Importantly, this proteomics approach identified CHD4 as a putative PAN RNA binding [...] Luciferase) (Fig. 5f) and performed in vitro interaction assays (depicted in Fig. 5g). The results showed that CHD4 was indeed precipitated with PAN RNA; however, the MRE mutant, PAN RNA deletion mutants, and irrelevant luciferase RNA also interacted with CHD4 protein, suggesting that CHD4 RNA binding is unlikely to be sequence specific under our binding conditions (Fig. 5h). However, the same RNA binding conditions with full-length PAN RNA did not precipitate a DNA binding protein, NF-κB (p65), or Luciferase protein, suggesting that CHD4 does possess RNA-binding capacity (Fig. 5i) [...] analyses. The results showed that CHD4 was able to bind dsDNA directly (Fig. 5j).
Importantly, increasing amounts of PAN RNA (ssRNA) antagonized CHD4 dsDNA binding, which was completely blocked in the presence of non-biotinylated PAN RNA at a 1:10 (dsDNA/RNA) molar ratio (Fig. 5j). The results suggested that locally transcribed long non-coding RNAs (which can reach up to 3 × 10^5 copies/cell for PAN RNA (54)) may remove CHD4 locally from the KSHV genome, as seen by reduced CHD4 on the KSHV genome (Fig. 4b) and detachment of KSHV episomes from host chromosomes during reactivation (Fig. 4c). These results suggest that the amount of long non-coding RNA expression near the CHD4 binding sites may play a role in both lytic gene induction and episome tethering.

The studies above suggest that LANA interacts with the ChAHP complex on both cellular and viral chromosomes, and that the ChAHP complex may restrict KSHV enhancer activity and hence KSHV lytic reactivation. Accordingly, we next examined the significance of CHD4 in the KSHV transcription program. CHD4 expression in iSLK.219 cells was knocked down with shRNA, and the degree of KSHV reactivation was examined after triggering K-Rta expression with doxycycline. The results showed that knock-down of CHD4 enhanced KSHV replication more than 8-fold over K-Rta induction alone. Conversely, functional re-introduction of CHD4 by overexpression of mouse Chd4 cDNA (used to escape the shRNA, which targets human CHD4) counteracted the effects of CHD4 knock-down (Fig. 6a). Notably, overexpression of mouse Chd4 almost completely abolished K-Rta-mediated KSHV reactivation (Fig. 6a, second bar) and inhibited the aggregation of RNAPII on the KSHV genome, which was measured by immunostaining of overexpressed CHD4 and RNAPII (Fig. 6b). The strong silencing effects depended on CHD4 ATPase activity, because a mutation in the helicase domain, the same mutation found in patients with CHD4-associated syndrome (42, 77), increased RNAPII aggregation and KSHV transcription (Fig. 6b, c). In addition to the knock-down and overexpression studies, we also performed single-cell transcriptomic studies with reactivated iSLK.219 cells. The results indicated that higher levels of CHD4 had clear inhibitory effects on triggering and/or prolonging the lytic viral gene transcription burst (Fig. 6d). Further, KSHV transcripts were extracted and sorted based on sequence counts and examined for correlation with cellular gene expression at the single-cell level (Supplementary Fig. 6a, b).
The results again showed a negative correlation between CHD4 expression and the amount of KSHV transcripts in the cell (Fig. 6e), while a similar negative correlation was not observed for [...]

Finally, the significance of CHD4 in the establishment of latency was also examined. To do this, we first knocked down CHD4 in 293T cells and infected the cells with purified KSHV r.219 virus to monitor viral gene silencing. CHD4 knock-down was confirmed at the protein and RNA levels (Fig. 6f, g). The results showed that KSHV gene expression continued to increase during a 3-day period in two independent CHD4 KD cells, while KSHV gene expression did not increase significantly in shScramble cells at day 3 (Fig. 6h). Further, KSHV lytic replication was also monitored with RFP signals within the cell population, and the results showed that the number of RFP-positive cells within the dish was higher in CHD4 knock-down 293T cells compared to negative control knock-down (siC) cells at 96 hours post infection (Fig. 6i, Supplementary Fig. 7). Altogether, these results suggest that CHD4 functions to silence KSHV lytic genes at an early [...]

[...] is also known to compete with CTCF binding and counteract chromatin looping at CTCF binding sites. Therefore, ChAHP maintains evolutionarily conserved spatial chromatin organization by preventing new CTCF binding events that emerged through short interspersed element expansions (43); this biological role of ChAHP suggests that KSHV cleverly finds/builds a "safe basecamp" with ChAHP to maintain viral episome structure during evolution.

[...] Fig. 6j, Campbell et al., in preparation). The results, combined with this study, suggest that the ChAHP-LANA complex has an architectural role in KSHV latent genomic structure, and that the TR region, which recruits a significant number of copies of both ChAHP and LANA (Fig. 4a), is critical for episome structure and therefore for epigenetic suppression of lytic genes.

Disruption of such a "backbone" by CHD4 KD or by robust expression of PAN RNA, which binds CHD4, therefore triggers "leakage" of viral lytic gene expression, similarly to premature activation [...]

[...] pAG-MNase incubation were used to normalize data as described previously (70).

The HiC-Pro 2.11.1 pipeline (94) was used to align sequences from the Hi-C experiments against a combined assembly of reference genomes: human hg19 (GRCh37) and KSHV (NC_009333.1). The reads were filtered for only uniquely mapped read pairs by identifying the intersection of each read end, and the valid reads were obtained by removing reads classified as self-circle, dangling-end, error, extra dangling-end, too short, too large, duplicated, and random [...] The normalized counts were also used as input in subsequent analyses. In the analysis of KSHV episome localization near centromeres, centromeres of human chromosomes (hg19) were downloaded from the UCSC Table Browser [...]

[...] used for real-time qPCR to determine viral copy number, as described previously (105).

Affinity purification was performed with streptavidin-coated magnetic beads (Thermo-Fisher).

Briefly, 150 µl magnetic beads/sample were pre-washed with RIPA lysis buffer (150 mM NaCl, [...]

Competing financial interests.
The authors have declared that no conflicts of interest exist.

[...] to generate KSHV LANA-mTID. KSHV LANA-mTID was transfected into iSLK cells, followed by production of viral particles by stimulating cells with doxycycline (1 µg/ml) and sodium butyrate (3 mM) for 5 days.
Virus was used to infect iSLK cells and cells were selected with hygromycin (1 mg/ml) to obtain an iSLK-