NR4A1 suppresses cancer replication stress through R-loop-dependent inhibition of immediate early gene transcriptional elongation

Deregulation of oncogenic proliferative signals triggers replication stress in cancer cells, to which they must adapt 1,2. Immediate early genes (IEGs), identified by their rapid stress-induced transient bursts of expression, are critical to integrating downstream signaling pathways 3,4. In studying tumor initiation by patient-derived breast cancer cells, we observed acquisition of open chromatin domains at the genebody and 3'-UTR of IEGs, uniquely across the genome. Through in vivo and in vitro modeling, we show that the IEG and orphan nuclear receptor NR4A1 5 localizes across multiple IEG genebodies, where it binds to RNA Pol II, arresting transcriptional elongation and generating extensive R-loops and accessible chromatin. Acute stress promptly removes NR4A1 from IEG genebodies, triggering immediate release of their poised transcripts. In breast cancer cells, NR4A1 overexpression increases tumorigenesis; conversely, its deletion leads to uncompensated replication stress, chromosomal instability and mitotic catastrophe, driven by deregulation of its IEG target FOS. A large fraction of primary breast and other cancers exhibit open genebody chromatin at IEGs, consistent with preserved NR4A1 function. Thus, NR4A1 mediates a novel transcriptional elongation checkpoint, unique to stress-induced genes and required for their rapid bursts of expression. Cancers that have retained this mechanism in adapting to chronic replication stress may be dependent on NR4A1 for proliferation.

Main Text
Circulating tumor cells (CTCs) are metastatic precursors that may be viably isolated from blood samples of patients with metastatic breast cancer through microfluidic depletion of normal blood cells 6,7. CTCs may be cultured under anchorage-independent conditions and reintroduced into immunosuppressed mice, where they are highly tumorigenic 6,8. Patient-derived cultured CTCs demonstrate considerable heterogeneity and plasticity 6,9, revealing mechanisms by which they respond to environmental and oncogenic stress, and their associated therapeutic vulnerabilities. By studying early steps in metastatic colonization by CTCs, we uncovered a physiological pathway critical to stress-induced IEG regulation, which is co-opted by cancer cells to suppress replication stress and whose disruption triggers mitotic catastrophe and genomic instability.

Gain of chromatin accessibility at IEG genebodies during tumorigenesis
To study early steps in CTC-induced initiation of metastasis, we tagged two patient-derived, hormone receptor (HR)-positive breast CTC cell lines, BRx142 and BRx82, with both GFP and luciferase and injected them into the left ventricle of immunosuppressed NSG mice, followed by monitoring using in vivo luciferase-based imaging (IVIS). An hour after intracardiac inoculation (day 0) and at serial intervals thereafter, mouse tissues were harvested, subjected to single-cell dissociation, and individual tumor cells were collected by GFP-directed fluorescence-activated cell sorting (FACS) (Fig. 1a, Extended Data Fig. 1a, b). To search for chromatin-associated changes at the earliest possible time points, we applied the Assay for Transposase-Accessible Chromatin using sequencing (ATAC-seq) to high-purity cancer cell populations (1,000-50,000 cells) from pre-injection cultures and day 0 post-injection tissue harvesting versus early metastatic lesions (>day 30). Remarkably, we observed few loci with consistent increases in ATAC-seq signal across the genome (20 loci with >3X increase), of which only 5% are at characteristic gene promoters/enhancers indicative of transcriptional activation. Instead, the most striking chromatin accessibility marks acquired during early tumor formation by both CTC cell lines are at genebody and 3'-UTR domains (>82% of loci with increased ATAC-seq) (Fig. 1b, c). By gene ontology analysis, the top enriched pathway for these genes is "Cellular response to stimulus" (P<8.2E-4) (Extended Data Fig. 1c), and 45% of the genes gaining chromatin accessibility encode known IEGs, including NR4A1, FOS, FOSB, BTG2, BHLHE40, MAFF, HMGCS1, HSPA1A and SIK1 (Fig. 1b). Among these IEGs, NR4A1 exhibits the greatest increase in chromatin accessibility (6.5-fold) across its genebody and 3'-UTR (Fig. 1b).
This unusual IEG genebody open chromatin pattern coinciding with in vivo tumor initiation by HR-positive breast CTCs is also evident using the prototypical "triple negative" breast cancer (TNBC) cell line, MDA-MB-231 (hereafter called M231). Intracardiac injection of the highly tumorigenic M231 cells is also accompanied by striking gain of ATAC-seq signal at the genebodies and 3'-UTR of NR4A1 and other IEGs (Extended Data Fig. 1d-f). In contrast to the genebody localization of chromatin accessibility gains during tumor initiation, 18 loci show loss of ATAC-seq signal, but these are localized primarily to intergenic noncoding regions (Fig. 1b, Extended Data Fig. 1g). NR4A1 (also known as Nur77 or NGFIB) has been implicated as a key effector of a wide range of cellular processes, ranging from cancer, metabolism and angiogenesis to inflammation and immune cell differentiation 5. As an orphan nuclear receptor, NR4A1 has been thought to mediate these diverse functions as a transcriptional activator, binding to the NGFI-B response elements at gene promoters 10,11. The unexpected pattern of NR4A1 chromatin accessibility led us to test whether its own chromatin binding profile is altered during tumorigenesis. Surprisingly, NR4A1 ChIP-seq shows minimal (<3%) gene promoter localization in CTC-derived cancer cells (Fig. 1d), in marked contrast to other IEGs, like FOS, which exhibits 42% localization to promoter regions (Extended Data Fig. 1h). Instead, using two different antibodies, we observed massive NR4A1 ChIP-seq signal across the IEG genebodies, including NR4A1 itself, overlapping precisely with the ATAC-seq open chromatin domains (Fig. 1e). This NR4A1 localization pattern is not present in pre-injection CTCs or in CTCs recovered from tissues of injected mice at day 0, indicating that it is a phenomenon acquired during early tumor formation.
The top enriched pathway for NR4A1-bound genes is again "Cellular response to stimulus" (P<6.5E-9), with a predominance of IEGs (Extended Data Fig. 1i). Given this unusual chromatin genebody occupancy for a transcription factor, we tested the localization of RNA Polymerase II (RNA Pol II) at IEGs using ChIP-seq. As expected, genome-wide, RNA Pol II is detectable at the transcriptional start sites (TSS) of most genes; however, it accumulates at the genebodies and 3'-UTRs of NR4A1 and other IEGs, overlapping precisely with NR4A1 binding itself (Fig. 1e). This unusual RNA Pol II distribution is not observed in CTCs prior to tumor formation, suggesting the altered regulation of IEG transcription during tumorigenesis.
As a first step to defining the role of NR4A1 expression while CTCs transition into early metastatic tumors, we tested the consequences of NR4A1 suppression and ectopic overexpression on CTC-mediated tumorigenesis. In both BRx142 and BRx82 CTCs, shRNA knockdown of NR4A1 (67.9% and 68.7% suppression, respectively) dramatically reduces in vitro proliferation and in vivo tumorigenesis (Fig. 1f-h, Extended Data Fig. 1j, k). Conversely, overexpression of NR4A1 in these CTC lines (40- and 30-fold, respectively) strongly enhances orthotopic tumor formation, as well as intravascular metastasis, while having only a moderate effect on in vitro proliferation (Fig. 1i-k, Extended Data Fig. 1l, m). NR4A1 thus appears to regulate a rate-limiting function in both primary and metastatic tumorigenesis.

NR4A1 restrains RNA Pol II transcriptional elongation at IEG genebodies
To better understand the functional role of NR4A1 and its chromatin configuration, we sought to recapitulate the tumor-associated phenotype using in vitro models of cellular stress responses. Among various stimuli (DNA damage, heat shock, serum stimulation), we found serum starvation followed by replenishment, a classical IEG stimulation protocol that recapitulates proliferative and replication stress signals, to elicit NR4A1 induction and altered localization analogous to that observed in vivo. Since CTCs are already maintained in the absence of serum, these experiments were performed in M231 cells, as well as in MCF10A breast epithelial cells, both of which show serum-dependent proliferation. In these cells, sharp peaks of IEG expression are observed between 30-60 min after serum replenishment, with NR4A1 showing the greatest increase, demonstrated by RNA (>400-fold and >800-fold in MCF10A and M231 cells, respectively) and protein quantitation, and by immunofluorescence staining (Fig. 2a, Extended Data Fig. 2a-c). The IEG FOS is rapidly induced upon serum starvation, while FOSB and EGR1 peak following serum replenishment (Fig. 2a).
By ChIP-seq analysis of cells under baseline serum-replete conditions, IEG gene products such as FOS and MYC, and the MYC heterodimerization partner MAX are, as expected, predominantly localized to transcriptional start sites (TSS). However, no significant promoter binding by NR4A1 is observed across the genome. Instead, 78% of chromatin-associated NR4A1 protein resides at genebody and 3'-UTR regions (Fig. 2b, Extended Data Fig. 2d, e). As in the tumor analyses, 14 of 39 NR4A1-targeted genebodies encode known IEGs, and the top hit in gene ontology analysis is again "Cellular response to stimulus" (P<4.0E-10) (Extended Data Fig. 2f, Supplementary Table 1). Serial NR4A1 ChIP assays in both MCF10A and M231 cells cultured under baseline conditions, serum starvation, and at closely spaced time points following serum replenishment show striking temporal dynamics in NR4A1 localization across IEGs: NR4A1 binding to the genebodies and 3'-UTR is pronounced at baseline, modestly reduced upon serum starvation, and then disappears within 30 min of serum replenishment, beginning to reappear at 6 hrs (Fig. 2c, Extended Data Fig. 2g, h). H3K27ac ChIP-seq at these time points reveals no associated enhancer activity in NR4A1-bound regions, nor do the promoters of NR4A1-targeted genes show changes in H3K27ac marks, indicating that NR4A1 is not acting as a prototypical transcriptional activator at promoter or enhancer regions (Fig. 2c, Extended Data Fig. 2g).
To correlate the sequential pattern of NR4A1 loading at IEG genebodies with that of RNA polymerase, we also performed RNA Pol II ChIP-seq at serial intervals following serum starvation and refeeding. Across the genome, in both MCF10A and M231 cells, RNA Pol II ChIP-seq reads are strongly enriched at TSS under all culture conditions (Extended Data Fig. 2i). In stark contrast, across IEGs under baseline culture conditions, RNA Pol II residency is evident broadly across genebodies and at 3'-UTRs, precisely overlapping with the presence of NR4A1, consistent with our observations in CTC-derived tumors (Fig. 2c, Extended Data Fig. 2g). The localization of RNA Pol II along the IEG genebodies is resolved 30-60 min after serum replenishment, coinciding with disappearance of NR4A1 occupancy. Remarkably, both NR4A1 and RNA Pol II binding to genebodies are inversely correlated with the RNA expression of IEGs: IEG expression is virtually undetectable when NR4A1 and RNA Pol II are bound at IEG genebodies, and it peaks 30 min after their release from these sites (60 min after serum addition) (Fig. 2d). This tight and inverse sequential timing suggests an inhibitory role for NR4A1 on IEG transcription, associated with RNA polymerase stalling in these genebody regions. The genebody ChIP-seq patterns are observed using antibodies against total RNA Pol II, as well as against the phosphorylated forms of the polymerase C-Terminal Domain (CTD) (phospho-S2 and phospho-S5). Specific RNA Pol II phosphorylation sites have been best studied in the context of MYC-associated promoter-proximal pausing, with S5-phosphorylation characteristic of the paused polymerase and S2-phosphorylation observed with the licensed and transcriptionally active enzyme 12. The presence of polymerase phosphorylated at both residues is unexpected, and it raises the possibility that the licensed enzyme is further modified at stalling sites along the genebody.
To further confirm the RNA polymerase ChIP-seq pattern, we used a different set of antibodies against total RNA Pol II or RNA Pol II phospho-S2 to perform ChIP-qPCR. Again, abundant signal of both total RNA Pol II and RNA Pol II phospho-S2 is evident at IEG genebodies under basal culture conditions, with a marked reduction 1 hr after serum replenishment (Extended Data Fig. 2j).
The virtually immediate induction of IEG expression following acute stress stimuli is matched by its very rapid resolution (i.e. a single burst of expression). Interestingly, RNA Pol II does not reload at either the IEG TSS or genebodies before 6 hrs after serum stimulation (Fig. 2c, Extended Data Fig. 2g, j). This suggests that IEG expression is primarily mediated by the rapid completion of stalled transcription, which results in a burst of gene expression triggered by acute stress. This "pause and release" pattern may therefore account for both the precipitous onset and rapid termination of IEG expression. Supporting this model, after serum deprivation and replenishment, precision nuclear run-on followed by sequencing (PRO-seq) 13 shows a transient increase in nascent transcription across IEG genebodies, coinciding with the reduction in RNA Pol II localization (Fig. 2e). Finally, the RNA polymerase travelling ratio (TR), calculated by comparing RNA Pol II ChIP-seq read density between promoter and genebody regions at different time points 14 (see Methods), demonstrates an anti-correlation between NR4A1 binding intensity and RNA Pol II TR, again consistent with a functional role for NR4A1 in inhibiting transcriptional elongation of IEGs (Fig. 2f).
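The travelling ratio calculation can be illustrated with a minimal sketch. It assumes the common convention in which TR is the promoter-proximal RNA Pol II read density divided by the genebody read density; the exact windows and normalization used in this study are described in the Methods and are not reproduced here. The function names, region lengths and read counts below are hypothetical.

```python
import math


def read_density(read_count, region_length_bp):
    """Reads per base pair within one region (hypothetical helper)."""
    if region_length_bp <= 0:
        raise ValueError("region length must be positive")
    return read_count / region_length_bp


def travelling_ratio(promoter_reads, promoter_len_bp, body_reads, body_len_bp):
    """Promoter density divided by genebody density (assumed convention).

    A high value means RNA Pol II is concentrated near the promoter;
    a low value means the polymerase is spread along the genebody.
    """
    promoter_density = read_density(promoter_reads, promoter_len_bp)
    body_density = read_density(body_reads, body_len_bp)
    if body_density == 0:
        return math.inf  # no genebody signal at all
    return promoter_density / body_density


# Illustrative numbers only, not data from this study.
tr_bound = travelling_ratio(300, 300, 4000, 4000)    # Pol II spread over the genebody
tr_released = travelling_ratio(300, 300, 400, 4000)  # genebody signal reduced
print(tr_bound, tr_released)
```

Under this assumed convention, heavy genebody occupancy lowers the ratio, which is the direction expected from an anti-correlation between NR4A1 binding intensity and TR.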
While NR4A1 can bind to DNA with sequence specificity, ChIP-seq analyses do not identify a specific consensus sequence for its localization along the IEG genebodies. We therefore postulated that the recruitment of NR4A1 to these genebodies may result from protein-protein interactions, potentially including RNA Pol II itself. To test for such a physical interaction, we immunoprecipitated NR4A1 from MCF10A cells, followed by western blotting against the activated phospho-S2 residue within the RNA Pol II CTD. Under basal serum conditions, phospho-S2 RNA Pol II co-immunoprecipitates with NR4A1, but this protein association is no longer detectable upon serum deprivation, as NR4A1 starts to dissociate from the IEG genebodies (Fig. 2g). Thus, NR4A1 dynamically interacts with the RNA polymerase elongation complex, inhibiting its activity at baseline and releasing its inhibition in response to stress-induced signals. A similar protein association in vivo between NR4A1 and RNA Pol II is evident in CTC-derived tumors (Fig. 2h).

NR4A1-dependent R-loops contribute to chromatin accessibility and transcriptional arrest at IEG genebodies
To determine whether NR4A1 directly mediates RNA Pol II stalling at IEG genebodies, we used CRISPR/Cas9 with different pairs of guide-RNAs to generate multiple NR4A1 knockout clones in M231 cells (see Methods). Genomic PCR demonstrated successful deletion of the targeted NR4A1 gene fragments, and the absence of NR4A1 protein was confirmed by western blotting (Extended Data Fig. 3a, b). ChIP-seq analysis of NR4A1-null M231 cells under baseline culture conditions shows no change in overall RNA Pol II presence and localization across the genome, but a striking reduction of the polymerase at the genebody regions of IEGs (Fig. 3a). Consistent with the role of NR4A1 in restraining transcriptional processing of IEGs, expression of FOS, FOSB and EGR1 is markedly elevated in NR4A1-null cells cultured under baseline conditions, as well as following serum withdrawal and replenishment (  Table 2). Unlike M231 cells, MCF10A cells do not tolerate CRISPR-mediated deletion of NR4A1. We therefore achieved shRNA-mediated NR4A1 knockdown in these cells using two different sequences (77.7% and 91.4% knockdown, respectively) (Extended Data Fig. 3f). Again, RNA Pol II ChIP-seq following NR4A1 knockdown shows a striking reduction in RNA Pol II localization to IEG genebodies under baseline culture conditions, associated with increased expression of IEGs, including FOS and EGR1 (Extended Data Fig. 3g, h).
Stalled RNA polymerase processing may be associated with hybridization of the nascent transcript with the unwound matching antisense DNA strand, forming DNA-RNA hybrid structures called R-loops 15. Such structures may result from RNA Pol II pausing, but they could also directly contribute to pausing 16,17. To determine whether NR4A1-mediated RNA Pol II pausing is associated with R-loop formation, we first undertook DNA-RNA immunoprecipitation (DRIP) using the canonical S9.6 monoclonal antibody, which recognizes DNA-RNA hybrids with subnanomolar affinity 18,19. Using DRIP followed by sequencing (DRIP-seq) or qPCR (DRIP-qPCR), we observed strong R-loop signals across the genebodies of IEGs, including FOS and EGR1, in both M231 and MCF10A cells, under baseline and serum-starved conditions, with virtual disappearance of the R-loop signal upon serum stimulation (Fig. 3e, Extended Data Fig. 4a).
Similarly, in the in vivo CTC-derived tumors characterized by IEG genebody chromatin accessibility and by NR4A1 and RNA Pol II co-localization, S9.6 DRIP-qPCR analysis reveals dramatic enrichment of R-loop signal at the same IEG genebody loci (Fig. 3f, Extended Data Fig. 4b). In NR4A1-null M231 cells under baseline conditions, DRIP-seq reveals a dramatic erasure of IEG genebody R-loops (Fig. 3g), and a comparable reduction of R-loops at IEG genebodies is observed in MCF10A cells following shRNA-mediated NR4A1 knockdown (Extended Data Fig. 4c). In all these experiments, specificity of the R-loop signal was confirmed using in vitro RNase H digestion. Thus, NR4A1 localization to the IEG genebodies drives the accumulation of R-loops at these loci. The reduction in R-loops in NR4A1-null M231 cells is associated with increased expression of IEGs, notably the prototypical IEG FOS (Fig. 3b, Extended Data Fig. 3d). Since R-loop formation may be either cause or consequence of delayed transcriptional processing 20, we tested the direction of this causation by establishing ectopic expression in M231 cells of RNase-H1 (Extended Data Fig. 4d), which directly degrades the RNA strand in the DNA-RNA hybrids and resolves R-loops in vivo 19,21,22. S9.6 DRIP-qPCR analysis of RNase-H1-expressing cells growing under baseline culture conditions shows abrogation of the R-loop signal at the FOS (>13-fold reduction) and EGR1 (>5-fold reduction) genebodies (Extended Data Fig. 4e). In vivo RNase-H1 expression also leads to significantly increased expression of FOS and other IEGs (Fig. 3h, Extended Data Fig. 4f), indicating that R-loops contribute to transcriptional suppression.
Abundant R-loops along the genebody of IEGs have the potential to disrupt DNA compaction, resulting in more accessible chromatin. We therefore asked if the striking genebody ATAC-seq signal that initially drew our attention to NR4A1 and IEGs could itself be the result of extensive R-loop formation. MCF10A cells under baseline culture conditions show ATAC-seq signal across the IEG genebodies, comparable to that observed in CTC-derived tumors (Fig. 3i). In vivo expression of RNase-H1 in these cells leads to a marked diminution in this genebody chromatin accessibility (Fig. 3i, Extended Data Fig. 4g), indicating that it is indeed a consequence of R-loop accumulation. NR4A1 suppression in MCF10A cells results both in the resolution of IEG genebody R-loops and in reduced ATAC-seq signal at these loci, supporting the role of NR4A1 binding in mediating these two phenomena (Fig. 3j, Extended Data Fig. 4c). Direct resolution of R-loops through in vivo expression of RNase-H1 also reduces RNA Pol II occupancy at the FOS genebody in MCF10A cells (Extended Data Fig. 4h), while NR4A1 binding itself is not affected (Extended Data Fig. 4i). The direct effect of R-loops on IEG gene expression is consistent with the observation that the small-molecule transcriptional inhibitor 5,6-dichloro-1-β-D-ribofuranosylbenzimidazole (DRB) induces comparable RNA Pol II pausing and R-loop signal along the IEG genebodies, irrespective of NR4A1 gene status in M231 cells (Extended Data Fig. 4j, k).
Taking all of this evidence together, we propose a model whereby NR4A1 binds to RNA Pol II, arresting transcriptional elongation specifically along IEG genebodies, generating extensive R-loops that result in dramatic genebody chromatin accessibility. This phenomenon leads to a piling up of poised IEG transcripts. When NR4A1 dissociates from IEG genebodies and releases RNA Pol II in response to acute stress signals, these poised transcripts are very rapidly completed and released. This unique IEG transcriptional elongation checkpoint thus triggers an immediate, coordinated and transient burst of stress-induced IEG expression (Fig. 3k).

NR4A1 suppresses mitotic catastrophe
The mechanism whereby NR4A1 controls the acute IEG stress response may also underlie its potent role in CTC-mediated tumorigenesis (Fig. 1f-k). Much as cultured CTCs do not tolerate loss of NR4A1 expression, NR4A1-null M231 cells show impaired proliferation in vitro and retarded tumorigenesis in vivo (Extended Data Fig. 5a-d). MCF10A cells also show significantly reduced proliferation in vitro following NR4A1 knockdown (Extended Data Fig. 3f, 5e). RNA-seq in NR4A1-null M231 cells compared with parental cells identifies 501 downregulated genes (fold change>2, P<0.05), with enrichment for pathways involved in DNA replication (P<3.0E-5) and DNA repair (P<5.7E-4) (Extended Data Fig. 5f, g, Supplementary Table 3). Consistent with these findings, NR4A1-null cells show elevated levels of phosphorylated Chk1, γ-H2AX and RPA32, evidence of replication stress and activated DNA damage responses (Fig. 4a, Extended Data Fig. 5h, i). The increased Chk1 levels in NR4A1-null cells are associated with increased sensitivity to the Chk1 inhibitor, MK-8776, compared with control cells (Fig. 4b).
Underlying this activation of DNA damage response pathways are massive mitotic defects in NR4A1-deficient cells. Dual staining for α-tubulin and DAPI in NR4A1-null M231 cells reveals large numbers of chromosomal defects: a mean 39.5% of cells (range: 34.1%-44.9%) have multiple nuclei or gross chromosomal fragmentation, compared to parental cells with a mean 2.1% (range: 1.1%-3.0%, P=0.0028) (Fig. 4c, d). Similar chromosomal instability, including cells with multiple nuclei and micronuclei, is observed in MCF10A cells upon shRNA knockdown of NR4A1 (Fig. 4e, f). Flow cytometric analysis of DNA content shows a very high degree of aneuploidy and genome duplications in NR4A1-deficient M231 cells, compared with parental cells (Fig. 4g). Single-cell karyotypes show increases in the number of chromosomes per cell from a median 59 in parental M231 clones to 87, 132 and 182 in three independent NR4A1-null clones, respectively (P=0.015, P<0.0001 and P<0.0001, respectively) (Fig. 4h). NR4A1-null M231 cells also show a prolonged G2/M phase consistent with mitotic delay, both under baseline culture conditions and following serum replenishment of starved cells, as measured by 5-ethynyl-2'-deoxyuridine (EdU) incorporation assay (Extended Data Fig. 6a). In addition to chromosomal defects evident in stably generated NR4A1-null cells, acute NR4A1 knockdown using three different shRNA constructs in M231 cells (68.8%, 72.8% and 96.0% knockdown, respectively) also triggers genomic duplications and G2/M arrest (Extended Data Fig. 6b-e). Similar chromosomal defects are observed following knockdown of NR4A1 in CTC cell lines (Extended Data Fig. 6f). Together, these findings indicate that NR4A1 expression is critical to maintaining genomic stability and that its absence results in major chromosomal defects that compromise cell proliferation and survival.
To test whether the genomic instability resulting from NR4A1 deletion results in part from its role in controlling stress-induced IEG expression, we tested the consequences of individual IEG knockdown in NR4A1-null M231 cells. Remarkably, knockdown of FOS using either siRNA or shRNA in NR4A1-depleted cells effectively reverses their massive mitotic defects, genome duplication and proliferative failure (Fig. 4i, Extended Data Fig. 7a-c). No such effect is observed following knockdown of other IEGs tested, including FOSB, BHLHE40 and MYC (Fig. 4i). The reversal of genome duplication in NR4A1-null cells is most likely due to the death of already mitotically compromised cells, and the suppression of further genomic instability mediated by deregulated FOS expression. Indeed, suppressing FOS in NR4A1-null M231 cells leads to reduced replication stress and enhanced proliferation of cells with largely corrected chromosomal content (Fig. 4j, k). Consistent with these findings, ectopic expression of FOS in wild-type M231 cells suffices to trigger replication stress and mitotic defects (Extended Data Fig. 7d-i). Indeed, FOS together with other IEGs are significantly induced when NR4A1 is suppressed using shRNA in CTCs; conversely, their expression is suppressed when NR4A1 is overexpressed in CTC-derived tumors, which exhibit reduced DNA damage and apoptosis, compared with parental CTC-derived tumors (Extended Data Fig. 8a-f). Thus, restriction of inappropriate FOS expression, above all other IEGs, appears to be linked to the function of NR4A1 in suppressing replication stress and chromosomal instability.

Prevalence of NR4A1 and IEG genebody chromatin accessibility in primary human cancers
Having described the phenomenon of IEG genebody and 3'-UTR chromatin accessibility acquired by cultured breast cancer cells during in vivo tumorigenesis, we sought to determine whether this phenomenon is observed in primary human cancers. By reanalyzing a recent ATAC-seq analysis of 404 TCGA human primary cancers 23, we find NR4A1 and IEG genebody chromatin accessibility to be prevalent across different cancer subtypes, ranging from 100% of prostate cancers to 59.5% of breast cancers and 11.8% of liver cancers (genebody/promoter ATAC-seq ratio >1; Fig. 5a, b, Extended Data Fig. 9a). Across all TCGA cancers studied, NR4A1 genebody ATAC-seq signal shows strong concordance with ATAC-seq signal at other IEG genebodies, including FOS and EGR1 (Fig. 5c, Extended Data Fig. 9a). Among primary breast cancers, NR4A1 genebody ATAC-seq signal is highly detectable in 65.9% of luminal A/B cancers, which are characteristically HR-positive and well differentiated, versus 22.7% of basal cancers, which include the more aggressive TNBC, and 9.1% of HER2-amplified subtypes (Fig. 5d).
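The classification criterion used above (genebody/promoter ATAC-seq ratio >1) can be sketched as follows. This is an illustration under stated assumptions, not the analysis pipeline of this study: the input layout, the function name and the example values are hypothetical, and the signals are assumed to be already normalized per region.

```python
from collections import defaultdict


def fraction_open_genebody(samples, threshold=1.0):
    """Fraction of samples per cancer type with genebody/promoter ratio > threshold.

    `samples` is an iterable of (cancer_type, genebody_signal, promoter_signal)
    tuples; this layout is an assumption made for the example.
    """
    positive = defaultdict(int)
    total = defaultdict(int)
    for cancer_type, genebody_signal, promoter_signal in samples:
        total[cancer_type] += 1
        # Comparing against threshold * promoter avoids dividing by zero.
        if genebody_signal > threshold * promoter_signal:
            positive[cancer_type] += 1
    return {ctype: positive[ctype] / total[ctype] for ctype in total}


# Toy values for illustration only, not TCGA data.
toy_samples = [
    ("breast", 2.4, 1.2),
    ("breast", 0.8, 1.6),
    ("liver", 0.5, 2.0),
    ("liver", 0.4, 1.0),
]
fractions = fraction_open_genebody(toy_samples)
print(fractions)
```

The same per-sample call could be made on any gene of interest, for example by supplying NR4A1 genebody and promoter signals for each tumor.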
Reanalyzing a published dataset of ATAC-seq from normal mouse mammary gland development 24, we also find IEG genebody accessibility in normal mouse mammary cells (Extended Data Fig. 9b). Thus, this IEG chromatin pattern appears to reflect normal physiological mechanisms of IEG regulation, which are preserved in more differentiated breast cancers that have retained replication stress control pathways. Consistent with this concept, the presence of NR4A1 genebody chromatin accessibility is associated with lower stages of breast cancer, and it is highly correlated with a favorable clinical outcome (P=0.013) (Fig. 5e, f). Across all cancer types, NR4A1 genebody open chromatin is correlated with reduced apoptosis signaling, stress response and DNA damage response signaling, with increased developmental growth signaling pathways including TGF-β, WNT and NOTCH, as well as with estradiol response signaling, predominantly in breast cancers (Fig. 5g, Extended Data Fig. 9c). Of note, the NR4A1-associated open chromatin domains, correlated with reduced IEG expression, are functionally distinct from overexpression of total NR4A1 mRNA, which is generally associated with an adverse clinical outcome across different cancer types 25-27.
Finally, to test the potential therapeutic implications of NR4A1 targeting in breast cancer, we treated NSG mice bearing CTC-derived orthotopic mammary tumors with the NR4A1 small-molecule inhibitor DIM-C-pPhCO2Me (NR4A1-i). NR4A1 inhibition dramatically suppresses tumorigenesis by these patient-derived breast cancer cells (Fig. 5h). The residual tumors in NR4A1-i-treated mice show elevated expression of the DNA damage marker phospho γH2AX, and decreased expression of the proliferation marker Ki-67 (Fig. 5i, j, Extended Data Fig. 9d).

Perspectives
We have uncovered a new transcriptional elongation checkpoint, mediated by NR4A1, which is specific to IEGs, whose exceptionally rapid bursts of expression depend on the immediate release of poised transcripts (Fig. 5k). This non-canonical mechanism of transcriptional regulation by an orphan nuclear receptor has implications both for the physiological response to stress in normal cells and for its adaptation in cancer. The concept of non-oncogene addiction describes the dependence of cancer cells on genes that are not themselves drivers of proliferation, but which control regulatory pathways critical to cancer cell survival. The dramatic effect on tumorigenesis of NR4A1 overexpression and knockdown suggests that cancer cells may rely on this pathway to mitigate replication stress resulting from aberrant proliferative signals. This finding raises the possibility of therapeutic targeting of NR4A1, potentially in combination with inducers of cellular stress, particularly in the >60% of cancers that appear to show preservation of the NR4A1-dependent transcriptional elongation checkpoint. While analysis of TCGA data indicates that more differentiated cancer types are more likely to have preserved the IEG regulatory signatures reported here, we note the effectiveness of NR4A1 suppression in cultured CTCs from advanced HR-positive breast cancer and in the highly malignant M231 TNBC cells, which suggests that this pathway may also be relevant in advanced breast cancers.
We also note that NR4A1 has been implicated as a key regulator of T cell exhaustion, the unresponsive phenotype that follows excessive stimulation in antigen-reactive T cells, as well as in synthetic CAR-T cells 28,29. IEG induction accompanies T cell activation, and NR4A1 has been postulated to block FOS-JUN promoter binding sites and suppress AP-1-mediated transcription. However, our review of NR4A1 binding landscapes in these cells 29 suggests predominant binding to the genebodies, rather than to the promoters of IEGs, consistent with the transcriptional elongation control described here. Most recently, NR4A1 was also reported to be one of the key factors that restrain B cell responses to antigen 30. Taken together, these observations point to potentially convergent mechanisms in immune and cancer cells, with NR4A1-mediated adaptation to chronic antigen stimulation leading to an exhaustion phenotype in immune cells, and NR4A1-mediated tolerance to oncogene-driven replication stress preventing mitotic catastrophe in cancer cells. The dependence of critical IEG-mediated signals on NR4A1 may thus reveal therapeutic opportunities in both drug-based and immune cell-mediated treatments of cancer.
The genebody- and 3'-UTR-centered transcriptional elongation arrest mediated by NR4A1 differs fundamentally from the more general function of MYC in releasing the common pausing of RNA polymerase 30-50 bp downstream of the TSS 14,31,32. MYC-regulated promoter-proximal pausing release serves both to prevent leaky transcription and as a rheostat to broadly increase transcription under proliferative conditions. In contrast, the more targeted effect of NR4A1 across the genebodies and 3'-UTRs of IEGs enables their rapid and synchronous expression in response to stimuli. This immediate and limited burst of coordinated IEG expression is essential to multiple stress responses. Recently, transcription inhibitors have been shown to trigger R-loop formation along the genebody of highly expressed genes 20, a finding that is consistent with our observation of these DNA-RNA hybrids across the genebodies of IEGs whose transcription is blocked by NR4A1 binding. R-loops have been primarily studied as consequences of DNA damage-induced collisions between transcriptional and replication machineries 15, attempted transcription through heterochromatin barriers 33, or abnormalities in cleavage and polyadenylation (CPA) factors 34. In contrast to these pathology-associated R-loops, those induced by NR4A1 localization to IEG genebodies appear to be linked to the normal physiological regulation of IEG gene expression, and their removal through RNase-H1 treatment leads to aberrant IEG expression. Notably, the generation of 3'-UTR-localized R-loops has been shown to recruit repressive histone modifiers capable of suppressing transcriptional termination 33

****P<0.0001 by two-tailed Student's t-test, n>100. Image quantification was carried out with ImageJ software. (k) Rescue of in vitro cell proliferation defect in NR4A1-null M231 cells, following knockdown of FOS using two different shRNA constructs (#a-b). ****P<0.0001 by two-tailed Student's t-test, n=4.