A quantitative temporal analysis of HIV-1 reactivation in latent cells
The ACH2 latent cells harbor a full-length HIV-1 provirus, which is non-responsive to Tat due to a point mutation in TAR20–22. Since Tat fluctuations account for stochasticity in HIV-1 transcription, we used these cells to identify factors other than Tat that influence stochastic reactivation of latent provirus. To quantify unspliced full-length HIV-1 RNA at the single-molecule level in reactivated cells, we employed 40 tandem 21-mer oligonucleotide probes (Stellaris®) corresponding to contiguous sequences in gag, each with a fluorescent label (Fig. 1 and Supplementary Table ST1). Exposure of ACH2 cells to PMA for as little as 2 h led to a burst of transcription from the provirus, which was visible as a single large bright spot in the nucleus corresponding to a transcription site (TS) that marks the site of proviral integration (Fig. 1a, panel 2, white arrow)23–25. Single molecules of RNA were detected as diffraction-limited spots in the nucleus (Fig. 1a, panel 2, yellow arrowheads). We employed FISH-quant to quantify the individual gag RNA molecules within a cell and within the TS (Fig. 1a, panel 3)26. A comparative analysis of the average number of RNA molecules/cell obtained by FISH-quant vs. qRT-PCR of the same sample indicated that RNA-FISH-Quant analysis is more sensitive in predicting HIV-1 RNA molecules/cell, likely owing to the minimal sample processing involved in this method (Fig. 1b).
A temporal analysis of reactivation was conducted by inducing ACH2 cells with PMA for various time points and subjecting them to smRNA-FISH. Images from 50–200 cells were captured for each time point. Distinct TSs appeared in the nucleus as early as 10 min post reactivation (p.r.) (Fig. 1c, panel 2, white arrow), and their size increased over time (Fig. 1c, panels 2–6). The individual mature mRNAs were visible by 15–30 min p.r. as punctate spots within the nucleus, which appeared to move away from the TS (Fig. 1c, panels 3–5). RNA started appearing in the cytoplasm by ~ 3 h p.r. (Fig. 1c, panel 9). By 12–24 h, accumulation of RNA could be seen in the cytoplasm as a bright cap (Fig. 1c, panels 15 and 16). HSHRS analysis of > 10,000 cells per time point indicated that by about 12 h p.r. the maximum number of cells (~ 80%) were activated and were positive for gag RNA and/or TS (Fig. 1d). The percentage of cells with TS increased over time, and by about 1–3 h p.r. 60–80% of ACH2 cells harbored distinct TSs (Fig. 1e). After 3 h p.r., the number of cells with TS decreased, and many cells at this point harbored mRNAs but no TS (Fig. 1f). These results indicated that reactivation from latent provirus is rapid, occurring within ~ 20 min p.r., and that the first round is completed within a span of 3–4 h, as many cells at this time harbor RNA but no TS (Fig. 1f).
Stochasticity in HIV-1 reactivation in ACH2 cells
We determined the absolute numbers of mature transcripts per cell and transcripts per TS (burst size) by evaluating 10–50 cells per time point using FISH-Quant (Fig. 1g-l). The total number of mature transcripts/cell increased steadily but was highly variable from cell to cell. As the magnitude of transcription increased, the variability also increased, with a range of 0–1000 transcripts/cell (Fig. 1g; note that each dot in the graph represents the total number of mature transcripts in a single cell). Quantitation of nascent gag RNA per TS indicated that the burst size varied highly from cell to cell at any given time, and that a maximum of ~ 100 nascent transcripts/TS were detected (Fig. 1h). Quantitative analysis of the subcellular distribution of mature gag RNA indicated that until ~ 2 h p.r. most transcripts were present within the nucleus, and that cytoplasmic accumulation was observed starting from ~ 3 h p.r. (Fig. 1c and 1j). The average number of nascent RNA/TS steadily increased, and several peaks of transcripts were observed, indicating that transcription initiation proceeded in waves (Fig. 1k). These results indicated that HIV-1 reactivation is intrinsically stochastic or “bursty” and that transcription proceeded in pulses/waves after stimulation.
To quantitate the degree of “noise” in transcription activity at a given time point, we calculated the Fano factor, a statistical measure of stochasticity, defined as the ratio of the cell-to-cell variance in the number of transcripts to the mean number of transcripts/cell. Variance analysis indicated that the number of transcripts/cell fluctuated more at the later time points than at the early time points (Fig. 1l and Supplementary Fig. S1a). However, the Fano factor was greater than 1 at all time points measured, indicating an inherent burstiness of transcription (Supplementary Fig. S1b). These studies indicated that activation of HIV-1 is stochastic even when it is Tat-independent.
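As an illustration of the Fano factor calculation described above, the following is a minimal Python sketch. It assumes the per-cell transcript counts for one time point are available as a list of numbers. The function name `fano_factor` and the example counts are hypothetical and are not taken from the study, and the choice of population variance (rather than sample variance) is an assumption, since the text does not specify which was used. For reference, a Poisson (non-bursty) process has a Fano factor of 1, so values above 1 suggest bursty transcription.

```python
from statistics import mean, pvariance


def fano_factor(counts):
    """Return the variance-to-mean ratio of per-cell transcript counts.

    Uses the population variance (an assumption; the sample variance
    could be substituted via statistics.variance).
    """
    m = mean(counts)
    if m == 0:
        raise ValueError("Fano factor is undefined when the mean is zero")
    return pvariance(counts) / m


# Hypothetical per-cell counts for a single time point (illustrative only,
# not data from the study).
example_counts = [0, 5, 12, 40, 3, 150, 0, 60]
print(f"Fano factor: {fano_factor(example_counts):.1f}")
```

Computing this value separately for each time point would give a curve comparable in form to the one referred to in Supplementary Fig. S1b.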
We subjected two additional latent T-cell lines, J1.1 and J-Lat, to smRNA-FISH analysis followed by HSHRS (Supplementary Fig. S2). In these cells, TSs were present and reactivation occurred rapidly. A maximum of 80% and 20% of cells were reactivated in J-Lat and J1.1, respectively (Supplementary Fig. S2). J1.1 cells exhibited fluctuations in transcription, with waves every few hours (Supplementary Fig. S2d).
Time course analysis of Gag protein synthesis
We studied the kinetics of Gag protein expression by combining smRNA-FISH with IF using α-p24 antibody (Supplementary Fig. S3). Gag protein was detected at the nuclear periphery at 6 h p.r., diffusely on one side of the cytoplasm by ~ 12 h, and throughout the cytoplasm and plasma membrane by 24 h (Supplementary Fig. S3a). In some cells, gag RNA and proteins co-localized in extreme quantities on one side of the cytoplasm, forming a cap-like structure (Supplementary Fig. S3a, panels 17–20). HSHRS analysis indicated that the maximum number of cells expressing RNA and/or Gag protein was detected at ~ 24 h p.r. and decreased at 48 and 72 h p.r. (Supplementary Fig. S3b). In DMSO controls, the percentage of positive cells remained at ~ 5% from 24 to 72 h p.r. (Supplementary Fig. S3c).
Effect of LRAs on the reactivation kinetics of latent provirus
To determine if clinically relevant LRAs lead to stochastic reactivation, we examined the effect of the PKC modulator bryostatin and the HDAC inhibitor panobinostat, in addition to PMA, on the reactivation kinetics in ACH2 cells (Fig. 2a). The drugs exhibited three distinct kinetics of reactivation (Fig. 2b-e). First, bryostatin and panobinostat exhibited rapid reactivation kinetics with a peak at 12 h p.r. (Fig. 2b-c). By 24 h, the bryostatin-induced reactivation had declined rapidly compared to that induced by PMA and panobinostat (Fig. 2b-c). Bryostatin induced larger TSs (containing up to 98 nascent transcripts/TS) as early as 1 h p.r., indicating a swift transcription initiation compared to that of PMA (up to 29 transcripts/TS) and panobinostat (up to 17 transcripts/TS) at the same time point (Fig. 2a, d and e). In contrast, panobinostat showed peak activity in transcription initiation at 12 h p.r., which also correlated with an increase in the number of cells expressing large TSs and in the number of nascent transcripts/TS (Fig. 2a, panel 7 and Fig. 2d and e). In summary, while bryostatin showed rapid initiation of transcription that steadily declined, panobinostat showed a slow initiation that peaked at 12 h p.r. and then declined rapidly (Fig. 2d and e). The proviral reactivation was stochastic in the presence of each of the three drugs, as indicated by the variability in the mature transcripts/cell and the burst size (Fig. 2b and d). Analysis of the kinetics of cytoplasmic accumulation indicated no blocks in this process with these drugs (Fig. 2f-i).
A combination of panobinostat and bryostatin facilitates sustained viral transcription
Since the LRAs exhibited different kinetics of reactivation, we surmised that a combination of drugs may lead to better activation. We tested reactivation with a combination of panobinostat and bryostatin at 6 and 12 h p.r. (Fig. 3). Combination treatment resulted in an increased number of cells that over-expressed RNA by 12 h p.r., such that the images from these cells were saturated and could not be included in the FISH-Quant analysis (Fig. 3a, panel 6; and Fig. 3b). With this limitation, the analysis of the remaining cells indicated that the combination treatment resulted in a number of mature transcripts/cell and a burst size similar to those of panobinostat treatment at 6 h p.r. (Fig. 3c-f). However, the combination drug treatment resulted in a sustained high percentage of TS-harboring cells at 12 h compared to 6 h, whereas single drug treatment exhibited a decrease in the percentage of TS-containing cells at 12 h p.r. (Fig. 3g). Thus, the combination drug treatment resulted in a sustained activation of transcription (Fig. 3g). Consistent with these observations, close to 100% of the cells were reactivated with combination drug treatment (Fig. 3h-k).
Analysis of reactivation in primary latent CD4+ T cells
To study the reactivation kinetics in primary cells, we applied RNA-FISH and HSHRS to an ex vivo latency model29. Analysis of primary CD4+ latent T-cells, reactivated using α-CD3/CD28 antibodies, indicated the clear presence of gag RNA, TS, and Gag proteins (Supplementary Fig. S4a, panels 1–16). Analysis of the percentage of cells positive for RNA, protein or both indicated that while the percentage of protein-positive cells increased, the percentage of cells containing RNA alone decreased, suggesting that the peak of transcriptional reactivation occurred at 6 h p.r. or earlier (Supplementary Fig. S4b). Interestingly, even the uninduced controls contained a low percentage of reactivated cells (Supplementary Fig. S4c, and S4a, panels 17–24). Most of the cells expressing HIV-1 RNA were not amenable to FISH-quant analysis due to near-saturating amounts of RNA in these cells.
Single-cell RNA seq to determine the factors responsible for stochastic activation
We observed transcriptional ‘noise’ even in uninduced ACH2 cells, and in primary cells (Supplementary Fig. S4c), consistent with stochastic activation. About 5–20% of the uninduced ACH2 cells were activated at any given time point, but with a smaller number of mature transcripts/cell (maximum of ~ 250 transcripts/cell, Fig. 1d, 1i, and Supplementary Fig. S5 and Supplementary Table ST2). Thus, the basal level expression reported previously as “leaky expression” of provirus using the pooled samples in uninduced conditions, is likely due to stochastic activation in some cells21. Furthermore, 2–10% of J-Lat, J1.1, and OM10.1 cells were also reactivated under uninduced conditions (Supplementary Fig. S5 and Supplementary Table ST2).
Stochastic activation is regulated by both intrinsic and extrinsic factors, where the ‘extrinsic noise’ is the phenotypic cell-to-cell variation in the levels or activity of factors required for gene expression14–16. To determine the reason for stochastic activation of cells under uninduced conditions, we performed single-cell RNA-seq (scRNA-seq) analysis using 10x Genomics to determine differences in the transcriptome of reactivated and unreactivated cells within the same pool. The resulting UMAP data of ~ 5,000 ACH2 cells indicated the presence of 7 closely related clusters, with cluster 6 showing high-level expression of HIV-1 Gag, Pol and Env (Fig. 4a-d). Heatmap analysis indicated that HIV-1 genes were among the top 10 highly upregulated genes in cluster 6 (Fig. 4b and Supplementary Table ST3). Gene ontology (GO) analysis indicated that single-stranded DNA binding, unfolded protein binding and ribonucleoprotein complex binding are the three top molecular functions, and nuclear transport, nucleocytoplasmic transport and RNA localization are the top biological functions, among the differentially expressed genes (DEGs) within cluster 6 (Fig. 4c). Consistent with the GO analysis, several lncRNA-encoding genes, including MALAT1 and NEAT1, which have been reported to influence HIV-1 infection and reactivation, were among the top ten upregulated genes in cluster 6, validating our data (Fig. 4b and Supplementary Table ST3). MALAT1 is a lncRNA that reactivates HIV-1 by displacing the polycomb repressive complex from the HIV-1 LTR27. NEAT1 is upregulated during HIV-1 replication and influences export of HIV-1 RNA from the nucleus28,29. LGALS1, or Galectin-1, is another top gene in the group that has been shown to enhance HIV-1 replication30. HIV-1 expression was not influenced by cell cycle stage, as cells at all stages of the cell cycle were represented in cluster 6 (Supplementary Fig. S6a and b).
NR4A overexpression was correlated to stochastic activation of HIV-1
The HIV-1-containing cluster 6 formed a branch that was distinct from the rest of the clusters, based on ClusterTree analysis (Supplementary Fig. S6c). The characteristic genes of node 8, which separated cluster 6 from the rest of the clusters, included HIV-1 Gag, Pol, Env, MALAT1, NEAT1, and three nuclear orphan receptor genes belonging to the family 4A, NR4A1, 2, and 3 (Supplementary Table ST4). Violin plot analysis indicated that all three NR4A genes are differentially upregulated in cluster 6 (Fig. 4d). To validate the association of cluster 6 genes with activation of HIV-1 under induced conditions as well, we carried out qRT-PCR analysis of RNA isolated from pools of induced latent cells (Fig. 5). We found that PMA upregulated all the cluster 6 genes tested along with HIV-1 RNA (Fig. 5a, top panel). Bryostatin and panobinostat treatment upregulated these genes to variable degrees (Fig. 5a, middle and bottom panels). Treatment of other latency models such as J1.1, J-Lat, U1 and OM10.1 cells21,31,32 with PMA resulted in upregulation of most but not all of the genes (Fig. 5b). Interestingly, NR4A3 was consistently upregulated in all the reactivated latent cell lines and across the three different LRAs (PMA, bryostatin and panobinostat), and its upregulation correlated with that of HIV-1, suggesting that NR4A3 could be a common factor that determines the stochastic upregulation of HIV-1.
NR4A1 (Nur77), NR4A2 (Nurr1), and NR4A3 (Nor1) are closely related proteins belonging to the orphan nuclear receptor family33. These proteins were initially defined as nerve growth factor induced receptors that bind to NGFI-β-response element33. NR4A receptors bind as homo- or hetero-dimers with each other to Nurr-responsive element33. NR4A1 and 2 can also dimerize with retinoid X receptor to bind to a D5 motif, and NR4A1 is a co-factor for Sp1-regulated genes33. Despite the similarities between NR4A1, 2, and 3, and their similar ability to bind to common cis-regulatory elements, these proteins exhibit unique activities and cell/tissue-specific functions. It is interesting to note that NR4A2 directly binds and recruits CoREST complex to HIV-1 promoter to repress HIV-1 transcription in microglia34. Our results indicated that NR4A2 was either repressed or not highly activated in various cell lines during HIV-1 reactivation, consistent with this observation.
cMYC is inversely correlated to stochastic reactivation of HIV-1 provirus
NR4A1 and 3 share overlapping functions, and loss of these genes leads to acute myeloid leukemia (AML) in mice; they are thus considered tumor suppressors33,35. Furthermore, it has been demonstrated that cMYC is the target of NR4A-mediated repression and tumor suppression36. Interestingly, we noted that cMYC was one of the downregulated genes in cluster 6 of ACH2 cells, suggesting that upregulation of NR4A genes could lead to downmodulation of cMYC expression (Supplementary Tables ST3 and ST4). In addition, cMYC has been implicated in mediating HIV-1 latency by binding to SP1 at the HIV-1 promoter37.
To test the correlation of cMYC down-modulation with NR4A upregulation and HIV-1 reactivation, we used qRT-PCR to test the effect of different LRAs on the expression of cMYC in ACH2 and other latent cell lines. cMYC was downregulated in ACH2 cells when induced with PMA, bryostatin or panobinostat (Fig. 5c, top panel). cMYC downregulation was also correlated with HIV-1 reactivation in all other cell line models tested (U1, J-Lat, OM10.1, and J1.1) (Fig. 5c, bottom panel). Furthermore, the expression of NR4A3 and cMYC proteins was inversely correlated: upon treatment of ACH2 cells with PMA, NR4A3 was induced and cMYC was repressed (Fig. 5d). These results are consistent with previous reports of a role for cMYC in inducing latency and shed additional light on the role of cMYC and NR4A3 in stochastic reactivation of HIV-1 latent cells37.
To determine whether cMYC expression was inversely correlated with HIV-1 expression at the single-cell level, we performed RNA-FISH + IF analysis of uninduced ACH2 and J1.1 cells using HIV-1 gag RNA FISH probes and α-cMYC antibody (Fig. 5e). We found that cMYC was expressed in a majority of the cells under uninduced conditions in both ACH2 and J1.1 cells, and only a few cells were positive for HIV-1 RNA (Fig. 5e, top and bottom row of panels 1–4 and 9–12). In contrast, PMA treatment induced HIV-1 RNA in a majority of ACH2 cells, and only a few cells were positive for cMYC (Fig. 5e, middle panels 5–8). Furthermore, within the majority of single cells, under both uninduced and induced conditions, cMYC and HIV-1 RNA expression were inversely correlated (Fig. 5e, panels 13–24; white and yellow arrows point to cMYC and HIV-1 RNA, respectively). These results are consistent with stochastic reactivation of HIV-1 in the absence of cMYC at the single-cell level.
To determine if cMYC was a common factor regulating HIV-1 reactivation in all the cell lines, we carried out scRNA-seq analysis of two additional cell lines, J1.1 and J-Lat, under uninduced conditions. About 5,000–10,000 cells were subjected to scRNA-seq, and the data from these two cell lines were merged with those of ACH2 to identify a common pool of cells expressing HIV-1. UMAP analysis indicated that even when combined, the HIV-1-expressing cells clustered separately from the rest of the cells (Supplementary Fig. S7). The combined analysis yielded 12 clusters (Supplementary Fig. S7a). The top 10 genes in cluster 11 were HIV-1 genes, as indicated in the heat map analysis (Supplementary Fig. S7b and Supplementary Table ST5). Further analysis indicated that most of the DEGs commonly identified in the latent cell lines were repressed targets (Supplementary Table ST5), and that the DEGs in cluster 11 formed a network containing cMYC and genes associated with its function, such as EGR1, CEBPE, NPM1, TOP2A and KPNA2, highlighting the importance of the cMYC pathway in regulating HIV-1 expression38–43 (Supplementary Fig. S7c).
Upregulation of NR4A3 and downregulation of cMYC in patient-derived samples
We tested whether NR4A3 and cMYC expression were correlated with reactivation of HIV-1 in patient-derived cells. Latent CD4+ T-cells isolated from three independent aviremic HIV-1 patients (HO27, HO51, and HO55) were reactivated by α-CD28/CD3, and RNA was isolated from the virions and the cells. The rapid ex vivo evaluation of anti-latency (REVEAL) assay was used to measure viral RNA as a readout of viral particle production44, and RNAs isolated from the cells were used to determine the expression of cMYC and NR4A genes via qRT-PCR (Fig. 6a-e). We found that cMYC levels were down-modulated and NR4A3 levels were up-regulated in induced samples compared to the uninduced controls in all three patients (Fig. 6b and c). Interestingly, NR4A2, which represses HIV-1 transcription, was either down-regulated or unchanged (Fig. 6e). NR4A1 was variably expressed in different samples (Fig. 6d). These results indicated that cMYC and NR4A3 expression were correlated with HIV-1 reactivation both in CD4+ T cell lines and in primary CD4+ cells isolated from patients.
NR4A modulator and cMYC inhibitor SN-38 acts as an LRA to reactivate latent HIV-1
NR4A genes activate transcription of cellular genes by binding to common cis elements and are induced in response to T-cell activation45–50. NR4A genes play an important role in oncogenesis and act as tumor suppressors36. In cMYC-dependent AML, small molecule activators of NR4A1 and 3 have been shown to inhibit tumor growth by repressing cMYC51. Based on these studies, we hypothesized that a drug that targets NR4A and represses cMYC could act as an LRA that induces activation of HIV-1 provirus. In addition, TOP2A was one of the commonly repressed genes in HIV-1-positive cells, as indicated by network analysis (Supplementary Fig. S7c). We identified SN-38 (7-ethyl-10-hydroxycamptothecin), a derivative of the topoisomerase-I inhibitor irinotecan (CPT-11), as one of the drugs that repress cMYC as a part of their mechanism of action52. Irinotecan is an FDA-approved anti-cancer drug, and SN-38 is a 100- to 1000-fold more active metabolite of irinotecan53. We investigated the effect of SN-38 in inducing HIV-1 in five different latent cell line models: ACH2, U1, J-Lat, OM10.1 and J1.1. Our results indicated that SN-38 repressed cMYC and upregulated NR4A3 in all five cell lines (Fig. 6f-h). Consistent with our hypothesis, transcription of HIV-1 was activated in all five cell line models compared to the DMSO-treated control (Fig. 6g). Interestingly, the expression of NR4A1 and NR4A2 was much lower than that of NR4A3 (Fig. 6i and j). Similar results were obtained when the cells were treated with PMA as a positive control (Fig. 6f-j). These results indicated that SN-38 might act as a novel LRA that can reactivate latent HIV-1 provirus by repressing cMYC and activating NR4A3.