Supplemental Figure 1 Complexity curves and read distributions of public and in-house GRO-RPR datasets,
indicating trends of lower quality for our libraries with this preparation.
Supplemental Figure 2 Discrete wavelet transform (DWT) PCA results for 210 highly transcribed genes,
demonstrating that 41.4% of genes separate on PC1.
Supplemental Figure 3 DWT PCA results of detail coefficients at the UBB locus. PCA results for the UBB locus, as
in Figure 2F, but colored by library preparation method. At this locus, the results cluster less distinctly by
library preparation method than by enrichment protocol.
Supplemental Figure 4 Schematic for the Support Vector Machine (SVM) leave-one-out cross-validation (LOOCV)
analysis. Eighteen nascent RNA sequencing samples were used as input. For a given gene, each sample in turn was
held out as the test sample, the remaining samples served as the training set, and the SVM classification was
evaluated. Based on this criterion, a majority of the genes (>75%) accurately classified the protocol for the n=18
samples.
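The hold-one-out procedure described for Supplemental Figure 4 can be sketched as follows. This is a minimal illustration and not the authors' pipeline: to stay dependency-free it swaps in a nearest-centroid classifier where the study used an SVM, and the feature vectors and the "GRO"/"PRO" labels are hypothetical.

```python
from statistics import mean


def nearest_centroid_predict(train_x, train_y, test_x):
    """Stand-in classifier (NOT an SVM): return the label whose training centroid is closest."""
    centroids = {}
    for label in set(train_y):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]

    def sq_dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b))

    # sorted() makes tie-breaking between labels deterministic
    return min(sorted(centroids), key=lambda lab: sq_dist(test_x, centroids[lab]))


def loocv_accuracy(features, labels, predict=nearest_centroid_predict):
    """Hold out each sample once, train on the rest, and report the fraction classified correctly."""
    correct = 0
    for i in range(len(features)):
        train_x = features[:i] + features[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        if predict(train_x, train_y, features[i]) == labels[i]:
            correct += 1
    return correct / len(features)


# Hypothetical per-gene feature vectors for six samples from two protocols
features = [[0.1, 1.0], [0.2, 0.9], [0.0, 1.1], [1.0, 0.1], [0.9, 0.2], [1.1, 0.0]]
labels = ["GRO", "GRO", "GRO", "PRO", "PRO", "PRO"]
accuracy = loocv_accuracy(features, labels)
```

Any classifier with the same `predict(train_x, train_y, test_x)` shape, including an SVM, could be passed in place of the stand-in.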
Supplemental Figure 5 Rank correlation of elongation regions of genes in PRO-LIG and GRO-CIRC libraries.
The initiation region was defined as in Fig. 3 (see also Materials and Methods). Genes shorter than 2000 bp were
removed, and only the resulting top 500 transcribed genes (by TPM) were considered. Initiation region rank
correlation (Top) is weaker than in the elongation region (Bottom), suggesting that most of the variability in these
libraries lies near transcription start sites.
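For readers unfamiliar with rank correlation, the sketch below computes it in the Spearman sense (Pearson correlation of average ranks). The caption does not state which rank statistic was used, so Spearman is an assumption here, and the function names are illustrative.

```python
from math import sqrt


def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared = (start + end) / 2 + 1  # mean of 1-based positions start+1 .. end+1
        for k in range(start, end + 1):
            ranks[order[k]] = shared
        start = end + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences with nonzero variance."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation computed on average ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

In this setting, `x` and `y` would hold per-gene read counts over the same region in two libraries, for example the initiation region in PRO-LIG versus GRO-CIRC.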
Supplemental Figure 6 Read count heatmap of pause regions of genes in GRO-CIRC, GRO-LIG, GRO-RPR, and
PRO-LIG libraries (TSS +/- 500 bp, 10 bp per region). RefSeq hg38 gene annotations were used. Genes shorter
than 2000 bp were not included. There is comparatively lower coverage near the TSS in many genes, representing
the center of bidirectional transcription. This is especially prevalent in GRO-RPR and PRO-LIG libraries.
Supplemental Figure 7 Read count heatmap of pause regions of genes in public MCF7 GRO-LIG and GRO-CIRC
libraries (TSS +/- 500 bp, 10 bp per region) [14, 60]. RefSeq hg38 gene annotations were used. Genes shorter than
2000 bp were not included. These heatmaps reflect those generated from our own datasets, thus reinforcing that the
patterns found in our datasets are not only a result of batch effects from our lab (See Supplemental Table 1).
Supplemental Figure 8 Metagenes of PRO-LIG libraries with varying biotin ratios. Libraries were generated from
HCT116 cells treated with DMSO, using the PRO-LIG strategy. Libraries differed in the amount of available biotin
added (25 μM vs 2.5 μM).
Supplemental Figure 9 Pause index correlations of PRO-LIG replicates.
Supplemental Figure 10 Pause index (PI) and rank correlation of PI generated from GRO-CIRC and GRO-LIG
libraries. Pause indices were generated with a different method than in Fig. 3, using a different pause region
definition (pause region: TSS to +80; elongation region: +81 to TES-1000; genes shorter than 2000 bp were not
included) and different counting software (featureCounts). In spite of these changes, the relative distribution and
correlation remain consistent with Fig. 3D, suggesting that these patterns are not merely a result of our software
or PI region definitions.
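The region definitions in Supplemental Figure 10 can be made concrete with a small sketch. It assumes the pause index is the ratio of length-normalized read density in the pause region to that in the elongation region, with both intervals treated as inclusive and coordinates measured from the TSS. The caption does not spell out these details, so they are assumptions, as is the function name.

```python
def pause_index(gene_length, pause_reads, elongation_reads,
                pause_end=80, tes_trim=1000, min_gene_length=2000):
    """Return pause density / elongation density, or None if the gene is excluded.

    Coordinates are relative to the TSS (position 0); the TES is at gene_length.
    Pause region: 0 .. +80 (inclusive, assumed). Elongation region: +81 .. TES-1000.
    """
    if gene_length < min_gene_length:
        return None  # genes shorter than 2000 bp were not included
    pause_len = pause_end + 1
    elongation_len = (gene_length - tes_trim) - (pause_end + 1) + 1
    if elongation_reads == 0:
        return None  # ratio undefined without elongation signal
    pause_density = pause_reads / pause_len
    elongation_density = elongation_reads / elongation_len
    return pause_density / elongation_density


# Hypothetical gene: 5080 bp long, 81 reads in the pause region, 400 in the elongation region
example_pi = pause_index(5080, 81, 400)
```

Read counts per region would come from a counting tool such as featureCounts; the numbers above are made up for illustration.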
Supplemental Figure 11 (Top) Metagenes of public datasets [61, 40]. Libraries were generated from K562 cells
treated with DMSO and prepped with either PRO-LIG or GRO-CIRC methods. All 4 nucleotides added to the
run-on reaction were biotin-NTPs. (Bottom) Public data [61, 40] were subjected to analysis as in Fig. 3C, left (see
Supplemental Table 1). PI regions were defined as in Fig. 3. Notably, the rank correlation remains low (R=0.44),
consistent with PI differences being driven by protocol.
Supplemental Figure 12 Density plot of read counts (TPM) over HCT116 enhancers annotated in the FANTOM
database. FANTOM annotations were generated from CAGE data, thus we reasoned that most FANTOM regions
would overlap with relatively stable bidirectional transcription. As such, read counts over these regions are much more
highly correlated between different protocols.
Supplemental Figure 13 UpSet plot of Tfit and dREG calls among PRO-LIG, GRO-LIG, and GRO-CIRC libraries.
Calls from replicates and treatments were combined using muMerge [6]. Much of the disparity in these overlaps can
be attributed to Tfit or dREG failing to call bidirectional regions despite the presence of bidirectional transcription,
as shown in Fig. 4C,D. However, there remain many enhancer regions not captured in one protocol at this depth.
Supplemental Figure 14 Density plot of read counts (TPM) over Tfit calls between replicates. Read counts for
merged Tfit calls for PRO-LIG replicates. Counts are log(TPM) normalized to correct for depth. Very low read
count calls (TPM < 0.1) were excluded as likely false positives.
Supplemental Figure 15 Metagene of enhancers differentially captured in either GRO-LIG or GRO-CIRC libraries.
Tfit calls across replicates and treatments were combined using muMerge for both GRO-LIG and
GRO-CIRC libraries. Metagenes of calls that were differentially captured (as determined by DESeq, see also Fig. 4,
Materials and Methods) were generated for both GRO-CIRC (Top) and GRO-LIG (Bottom) Tfit calls. Read counts
were normalized by CPM.
Supplemental Figure 16 Example enhancer region where libraries appear to disparately capture differential p53
enhancer activity. Darker colors represent transcription level in Nutlin-3a treated libraries, while lighter colors
represent levels found in DMSO-treated libraries. Read counts are normalized by CPM.
Supplemental Figure 17 Enrichment plot of GSEA results for GRO-LIG, PRO-LIG, and GRO-CIRC libraries. Gene
region definitions were adjusted as per Fig. 5A. In spite of library variations, the HALLMARK_P53_PATHWAY (red) is
the strongest hit in all of our library comparisons.
Supplemental Figure 18 Overlap of GSEA p53 genes in GRO-LIG and PRO-LIG libraries. Analysis was performed
using counts over gene bodies (Left), and using a 5’ correction (Right), as in Fig. 5A (see also Materials and Methods).
Hunter et al. Page 21 of 21
Supplemental Figure 19 TFEA results for PRO-LIG libraries. Regions were combined by muMerge, as in Fig. 5. Red
dots indicate transcription factors belonging to the p53 family (TP53, TP63, TP73).
Supplemental Figure 20 Rank differential of GRO-LIG and PRO-LIG enhancers. Ranks were determined within
TFEA through DESeq2. p53 enhancers that were more than 2 SD away from the mean were considered to be
differentially captured in GRO-seq or PRO-seq.
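The 2 SD criterion in Supplemental Figure 20 can be illustrated with the sketch below. It assumes the quantity compared is the per-enhancer difference in rank between the two protocols and that the sample standard deviation is used. Neither detail is stated in the caption, and the enhancer identifiers and rank values are hypothetical.

```python
from statistics import mean, stdev


def differentially_ranked(ranks_a, ranks_b, n_sd=2.0):
    """Return {enhancer_id: rank difference} for enhancers whose rank difference
    (ranks_a - ranks_b) lies more than n_sd sample standard deviations from the mean."""
    shared = sorted(set(ranks_a) & set(ranks_b))
    diffs = {eid: ranks_a[eid] - ranks_b[eid] for eid in shared}
    center = mean(diffs.values())
    spread = stdev(diffs.values())  # sample SD (assumption)
    return {eid: d for eid, d in diffs.items() if abs(d - center) > n_sd * spread}


# Hypothetical rank values for nine enhancers in two protocols
gro_ranks = {"e1": 5, "e2": 6, "e3": 4, "e4": 5, "e5": 6, "e6": 4, "e7": 5, "e8": 5, "e9": 15}
pro_ranks = {eid: 5 for eid in gro_ranks}
outliers = differentially_ranked(gro_ranks, pro_ranks)
```

The sign of each returned difference indicates which protocol ranked the enhancer higher; how that maps onto "captured in GRO-seq or PRO-seq" depends on the ranking convention used upstream.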