ChIP-Hub: an Integrative Platform for Exploring Plant Regulome

doi:10.21203/rs.3.rs-812424/v1

Article

This work is licensed under a CC BY 4.0 License

Journal Publication

published 14 Jun, 2022

Read the published version in Nature Communications →

Version 1

Plant genomes encode a complex and evolutionarily diverse regulatory grammar that forms the basis for most life on earth. A wealth of regulome and epigenome data has been generated in various plant species, but no common, standardized resource is available so far for biologists. Here we present ChIP-Hub, an integrative web-based platform following ENCODE standards that bundles > 10000 publicly available datasets reanalyzed from > 40 plant species, allowing visualization and meta-analysis. We manually curate the datasets by assessing ~ 540 original publications and comprehensively evaluate their data quality. As a proof of concept, we extensively survey the co-association of different regulators and construct a hierarchical regulatory network under a broad developmental context. Furthermore, we show how our annotation allows investigation of the dynamic activity of tissue-specific regulatory elements (promoters and enhancers) and their underlying sequence grammar. Finally, we analyze the function and conservation of tissue-specific chromatin states based on comparative genomics. Taken together, the ChIP-Hub platform and the analysis results provide rich resources for deep exploration of plant ENCODE.

Computational Biology

Bioinformatics

Plant Molecular Biology and Genetics

plant genomes

ChIP-Hub

plant ENCODE

Genome-wide charting of transcription factor (TF) binding and epigenetic status has become widely used to study gene-regulatory programs in animals and plants. Chromatin immunoprecipitation sequencing (ChIP-seq) is a powerful method to capture DNA targets for TFs or histone modifications across the entire nuclear genome of any organism^1–7. From a technical point of view, the success of ChIP-seq experiments largely depends on the development and validation of highly gene-specific antibodies or tagged transgenic lines^8–10. However, crosslinking-based ChIP techniques inherently suffer from several limitations, including low throughput, poor resolution, suboptimal signal-to-noise ratio, and a tendency to ‘detect’ false positives^11,12. In this regard, several recent techniques, such as ChIP-exo¹³ and CUT&RUN¹⁴, are alternatives to the current standard of ChIP-seq that improve the resolution in identifying protein binding locations. The in vitro DAP-seq technique^15–17, which is based on screening of a genomic DNA library with an affinity-purified TF followed by high-throughput sequencing, is fast, inexpensive, and more scalable than ChIP-seq for the generation of genome-wide TF binding-site maps. However, only a subset of the TF binding sites identified by DAP-seq is accessible in vivo, and individual TFs are typically analyzed in isolation, whereas in vivo TFs may interact in a combinatorial, tissue-specific manner with other TFs, thereby altering DNA-binding preferences. Complementary in vivo experimental approaches (for example, FAIRE-seq, DNase-seq and ATAC-seq) can identify binding sites in open chromatin regions for all associated factors simultaneously and can provide additional information about DNA-binding proteins and their regulatory functions^8,18. 
Thanks to these rapidly developing techniques, a tremendous amount of data has been generated by several large consortia (such as the ENCODE consortium in human¹⁹ and mouse²⁰, as well as the modENCODE consortium in fly²¹ and nematode²²) or various smaller projects (such as the fruitENCODE project in flowering plants²³).

Several databases^24–28 were recently established for visualization and efficient deployment of public ChIP-seq data by the research community. However, no comprehensive resource is available for plant research. Another major bottleneck in current plant research is the lack of a standardized routine for evaluation and analysis of ChIP-seq data. Therefore, the comparison of data generated by different laboratories is not straightforward, hampering data integration to generate novel hypotheses for further investigation.

The ChIP-Hub resource. ChIP-Hub collects all plant regulome data deposited at the NCBI SRA database. These data were generated by high-throughput sequencing experiments including ChIP-seq, DAP-seq, DNase-seq and ATAC-seq. At the time of finalizing this manuscript (July 2021), > 10000 individual datasets (whose experiment IDs start with SRX, DRX, or ERX) were available at NCBI SRA for > 40 plant species, with nearly exponential growth in recent years (Fig. 1a,b and Supplementary Fig. 1). Although most datasets were generated in model organisms (such as Arabidopsis, rice and maize), high-throughput regulome experiments have also been widely used in non-model plant species. We manually curated all the datasets by assessing ~ 540 original publications and > 800 biological projects (Fig. 1c) and categorized them into different experimental groups, including open chromatin (11.5%), TFs and other proteins (27.3%), histone-related (39.9%), and input control experiments (19.4%; Fig. 1d).

We adapted the working standards provided by the ENCODE consortium¹⁰ to set up computational pipelines and to systematically reanalyze all public regulome data in plants (Fig. 1e; see Methods). To make our re-analysis results easily accessible to external users, we have developed an integrative web-based platform (ChIP-Hub) to explore all the re-analyzed data sets. Additional data (e.g., sample metadata, references, TF genes, miRNAs, TF motifs, chromatin states and comparative genomics) were also collected and deposited in the database (Fig. 1e). Therefore, the resources are bundled in a well-accessible application that also allows visualization and meta-analyses (Supplementary Fig. 2). Furthermore, in order to continuously add more source data in the future, we have designed ChIP-Hub to be updated quarterly with semi-automatic pipelines, including systematic metadata curation and automatic data processing.

Comprehensive evaluation of plant regulome data. One experiment may consist of multiple replicates of ChIP-seq samples and associated control samples. We therefore obtained > 6000 individual experiments (Fig. 2a) with manual curation based on experimental designs in original publications or project description. We assigned each experiment to a specific group based on the investigated regulatory factor. ChIP-Hub covers experiments for nearly all plant TF families and well-investigated histone modifications (Fig. 2b and Supplementary Fig. 3).

We then systematically evaluated the data quality of individual experiments (n = 6055). Although 89.2% of the experiments have been published in peer-reviewed journals, nearly 40% of the experiments lack control datasets, and only 37.8% have technical or biological replicates (Fig. 2c). The lack of controls or replicates is more pronounced in earlier studies (Supplementary Fig. 4). Nevertheless, most of the evaluated experiments readily meet a variety of quality specifications based on the ENCODE criteria¹⁰ (Fig. 2d). More than 75% of the investigated datasets show moderate to high signal-to-noise ratios based on the enrichment scores FRiP (fraction of reads in peaks), NSC/RSC (normalized/relative strand cross-correlation coefficient)¹⁰ and SPOT (signal portion of tags)²⁹. Most experiments show good library complexity, as measured by PBC (PCR bottleneck coefficient) and NRF (nonredundant fraction). Not surprisingly, all these quality metrics are positively correlated with each other (Supplementary Fig. 5). As expected, the SPOT enrichment score of experimental groups is significantly higher than that of input controls (Fig. 2e). In summary, the above results indicate overall high quality of the individual experiments in the analysis.

We identified a total of 52.3 million high-confidence peaks (with an IDR, Irreproducible Discovery Rate³⁰, < 0.05; see Methods) from experiments for open chromatin, annotated TFs and widely investigated histone H3 modifications (Supplementary Table 1). As expected, peak summits of TF-bound or open chromatin regions are generally located around the transcription start site (TSS), while genomic locations of histone-modified regions vary among different types of histone modifications (Fig. 2f). For genomes with more than 20 distinct experiments, the number of identified open chromatin regions, TF binding events or histone-modified genomic locations varies from 0.21 million (Chlamydomonas reinhardtii; experiments n = 32) to 21.4 million (Arabidopsis thaliana; n = 3479); the fraction of the genome associated with TF-bound and histone-modified regions averages 22.0% (Fig. 2g), with comparable proportions found in the mouse (12.6%) and human (~ 20%) genomes^19,20. However, the proportion may be far underestimated for most plant genomes since many regulators have not yet been investigated. Of note, about 3500 individual experiments have been generated in Arabidopsis (Fig. 2b), resulting in annotated genomic regions in terms of chromatin status or TF binding encompassing at least 82.1% of the Arabidopsis genomic sequence in aggregate (Fig. 2g). Interestingly, 68.8% of the Arabidopsis genome is annotated as potential regulatory regions based on 347 ChIP-seq experiments (each with > 50 targets) for 157 distinct TFs (Supplementary Fig. 6 and Supplementary Table 2), suggesting pervasive regulatory potential in the compact Arabidopsis genome.
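The genome fractions quoted above are coverage values over the union of peak intervals. As a minimal sketch of that arithmetic, assuming half-open BED-style coordinates on a single chromosome (the function name and toy intervals are illustrative, not part of the ChIP-Hub pipeline), overlapping peaks are merged before their lengths are summed so that no base is counted twice:

```python
def covered_fraction(intervals, genome_size):
    """Fraction of a sequence covered by the union of (start, end) intervals.

    Intervals are half-open and may overlap; overlapping or touching
    intervals are merged before their lengths are added up.
    """
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Close the previous merged block and open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlaps (or touches) the current block: extend it.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / genome_size


# Toy example: two overlapping peaks and one isolated peak on a 1 kb sequence.
fraction = covered_fraction([(0, 100), (50, 150), (300, 400)], 1000)
```

In practice this step would be run per chromosome and summed over the genome, for example with an interval tool such as BEDTools.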

Extensive TF co-associations and regulatory loops in Arabidopsis. We investigated TF co-associations and TF-target gene regulatory networks using TF-related ChIP-seq experiments in Arabidopsis. Integrative analysis of TF-bound genomic regions revealed potential TF co-associations through regulation of similar sets of target genes (Fig. 3a and Supplementary Figs. 7–8), as exemplified around the APETALA1 (AP1) gene locus (Fig. 3b). We organized the pairwise TF co-associations into networks with TFs as nodes and their co-binding possibility as edges (Fig. 3c). We observed three dominant co-associated TF modules (M1-M3). M1 consists of regulators from the bZIP, bHLH and MYB TF families, while M3 includes MADS TFs responsible for flower development³¹. Interestingly, M2 contains various regulators of histone modifications, including histone acetyltransferases (GENERAL CONTROL NON-REPRESSED PROTEIN5 [GCN5]), deacetylases (HISTONE DEACETYLASE 2C [HD2C]), methyltransferases (SET DOMAIN GROUP2 [SDG2], FERTILIZATION-INDEPENDENT ENDOSPERM [FIE], SWINGER [SWN] and CURLY LEAF [CLF]), demethylases (JUMONJI 14 [JMJ14], RELATIVE OF EARLY FLOWERING 6 [REF6]/JMJ12) and bivalent histone readers (EARLY BOLTING IN SHORT DAY [EBS]) (Fig. 3c). These regulators tightly co-associate with timing regulators such as CIRCADIAN CLOCK ASSOCIATED1 (CCA1)/LUX ARRHYTHMO (LUX) for the circadian clock and FLOWERING LOCUS C (FLC)/SHORT VEGETATIVE PHASE (SVP) for the initiation of flowering.

Next, we constructed a hierarchical regulatory network by integrating potential direct TF target genes based on ChIP-seq and predicted miRNA-target interactions, with upstream TFs at the top of the hierarchy and miRNA genes and their target genes at the middle and bottom levels, respectively (Fig. 3d). This “meta-network” includes 117 upstream TFs, 134 miRNA genes, and 462 common target TF genes, and the predicted regulatory relationships tend to be specific to regulator (TF or miRNA) families. We identified a comprehensive set of miRNA-mediated feed-forward loops (FFLs; n = 13630) from the meta-network (Fig. 3d and Supplementary Table 3). In particular, most of the floral FFLs identified in our previous analysis³¹ are included in this extended list (Fig. 3e). The above analysis provides a rich resource to study the biological role of regulatory loops in specific contexts.
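As a rough illustration of how miRNA-mediated FFLs can be enumerated from such a network, the sketch below treats an FFL as a triple (TF, miRNA, target) in which the TF binds both the miRNA gene and the target gene, and the miRNA is predicted to target that same gene. This definition, the function name and the toy edge lists are assumptions made for the example; the exact criteria used for the reported FFL set may differ.

```python
from collections import defaultdict


def find_ffls(tf_edges, mirna_edges):
    """Enumerate (TF, miRNA, target) feed-forward loops.

    tf_edges: iterable of (tf, gene) pairs; gene may be a miRNA or a target gene.
    mirna_edges: iterable of (mirna, target) pairs.
    """
    tf_targets = defaultdict(set)
    for tf, gene in tf_edges:
        tf_targets[tf].add(gene)

    mirna_targets = defaultdict(set)
    for mirna, target in mirna_edges:
        mirna_targets[mirna].add(target)

    ffls = []
    for tf, bound in tf_targets.items():
        # miRNA genes bound by this TF
        for mirna in bound & set(mirna_targets):
            # genes targeted by the miRNA and also bound by the TF
            for target in mirna_targets[mirna] & bound:
                ffls.append((tf, mirna, target))
    return sorted(ffls)


# Toy network (hypothetical names, not ChIP-Hub data).
tf_edges = [("TF_A", "miR_x"), ("TF_A", "gene1"), ("TF_A", "gene2"), ("TF_B", "gene1")]
mirna_edges = [("miR_x", "gene1"), ("miR_x", "gene3")]
loops = find_ffls(tf_edges, mirna_edges)
```

Only TF_A closes a loop here, because it is the only regulator that binds both the miRNA gene and one of the miRNA's predicted targets.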

Dynamics of tissue-specific regulatory elements. More than 1100 open chromatin datasets have been reanalyzed in ChIP-Hub, which offers an opportunity to comprehensively annotate plant regulatory elements such as enhancers. As a proof of concept, we predicted a catalogue of 18753 promoters and 9976 enhancers in ten representative tissues by integrative analysis of 65 open chromatin datasets from nine studies^32–40 (Supplementary Tables 4 and 5; see Methods). Clustering analysis based on chromatin accessibility reveals that both promoters and enhancers can distinguish different types of tissues, even though the data were generated by different studies (Fig. 4a and Supplementary Fig. 9a,b). Supporting this notion, we observed instances of promoters and enhancers that are specifically active in certain types of tissues (Fig. 4b). To compare the tissue specificity of promoters and enhancers, we quantified how their chromatin accessibility diverges across tissues using a Jensen-Shannon divergence (JSD)-based index. We found that enhancers are generally more tissue-specific than promoters (Fig. 4c). Based on the distribution of JSD scores, we defined regulatory elements with JSD > 0.26 as highly specific (Fig. 4c). We summarized the number of TF binding sites (including the 157 TFs analyzed above) in both promoters and enhancers and found that the enrichment of TF binding in highly tissue-specific regulatory elements is significantly different between promoters and enhancers (Fig. 4d). However, there is no difference in TF-binding enrichment between low tissue-specific promoters and enhancers.
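The text does not spell out how the JSD index is computed. One common formulation, shown here only as a sketch under that assumption, normalizes an element's accessibility across tissues into a distribution p, compares p with the idealized "active in tissue t only" distribution for each tissue, and reports the maximum of 1 - sqrt(JSD) over tissues. The function names and toy vectors are illustrative.

```python
import math


def _entropy(p):
    """Shannon entropy in bits, ignoring zero entries."""
    return -sum(x * math.log2(x) for x in p if x > 0)


def jsd(p, q):
    """Jensen-Shannon divergence (base 2, so bounded in [0, 1])."""
    m = [(a + b) / 2 for a, b in zip(p, q)]
    value = _entropy(m) - (_entropy(p) + _entropy(q)) / 2
    return max(0.0, value)  # guard against tiny negative rounding errors


def tissue_specificity(signal):
    """Max over tissues of 1 - sqrt(JSD(p, one-hot profile of that tissue))."""
    total = sum(signal)
    if total == 0:
        return 0.0
    p = [s / total for s in signal]
    best = 0.0
    for t in range(len(p)):
        one_hot = [1.0 if i == t else 0.0 for i in range(len(p))]
        best = max(best, 1 - math.sqrt(jsd(p, one_hot)))
    return best


# Toy accessibility vectors across tissues (hypothetical values).
specific = tissue_specificity([0, 0, 5])      # accessible in one tissue only
broad = tissue_specificity([1, 1, 1, 1])      # equally accessible everywhere
```

Under this formulation an element accessible in a single tissue scores 1, and the score falls toward 0 as accessibility is spread evenly over more tissues.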

The highly tissue-specific regulatory elements (including 4702 promoters and 4234 enhancers) were grouped into ten different clusters based on their chromatin accessibility (Fig. 5a). We associated potential target genes with these regulatory elements using the “nearest neighbor” strategy⁴¹, so that one gene may have multiple regulatory elements (Fig. 5b). Regulatory elements in clusters 2 (C2) and C3 are highly active in flower-related tissues and their target genes are largely involved in biological processes such as “flower development” and “floral organ development”, whereas regulatory elements in C4 and C7 are specifically active in root- and leaf-related tissues, with target genes in “response to biotic stimulus” and “defense response”, respectively (Supplementary Table 6).

We further investigated the sequence grammar underlying the chromatin dynamics of tissue-specific regulatory elements. We applied Basset⁴² to train a convolutional neural network (CNN) to discriminate one tissue from all other tissues on the basis of the sequence content within accessible sites (Supplementary Fig. 9c,d). The convolutional filters (n = 600) in the first CNN layer detect repeatedly occurring local sequence patterns, each comprising a weighted matrix of sequence features akin to a TF motif position weight matrix (PWM). The resulting PWMs were matched to known TF motif databases using the TomTom motif comparison tool. In this way, we were able to identify known or novel motifs represented in tissue-specific promoters and enhancers for each tissue (Fig. 4e and Supplementary Fig. 9e). For example, the classification of root tissues is strongly associated with a filter matching the WOX11 motif^43,44.

Comparison and conservation of tissue-specific chromatin states. In order to predict the functional relevance of the genomic regions marked by histone modifications, we generated integrated maps of chromatin states in vegetative-, reproductive- or root-related tissues of wild-type plants for genomes with at least five distinct marks (Supplementary Table 7 and Supplementary Fig. 10), using ChromHMM⁴⁵ to segment the genome into distinct combinations of histone modification marks (Supplementary Fig. 11). As a proof of concept, a 12-state model was trained in Arabidopsis vegetative-related tissues (Fig. 6a). The resulting “marked” states included six active states, four repressed states and a bivalent state that showed distinct levels of gene expression, chromatin accessibility, TF binding and enrichment for evolutionarily conserved noncoding sequences (Fig. 6b-f), accounting for 77.8% of the genome (Fig. 6d) and covering all the major states identified in previous studies^46–48. In particular, active chromatin states 2 and 3, which are proximal to the TSS, are associated with the histone modifications H3K4me2/H3K4me3 and with TF binding for a diverse set of developmental regulators (Fig. 6a,e). State 2 differs from state 3 in that it is enriched with H3K9ac, H3K27ac and H3K36me3 towards TSS-proximal gene body regions. These two states can thus be considered active promoter states. Interestingly, state 8 is associated with both the active marks H3K4me2/H3K4me3 and the repressive mark H3K27me3, and is enriched with TF binding for Polycomb repressive complex 2 (PRC2) and Jumonji proteins, likely representing a bivalent or bistable regulatory state^49,50. This state is highly conserved in sequence between Arabidopsis and other crucifer species in terms of phastCons score⁵¹ (Fig. 6f). State 9 is a repressed Polycomb state as it is solely associated with H3K27me3 in intergenic regions (Fig. 6a,f). 
States 10–12 are constitutively enriched with heterochromatin-associated H3K9me2, which is required for the silencing of transposable elements (TEs) and other repetitive DNA^52,53. The H3K27me3-marked heterochromatin state (state 10) can be facultative as it is enriched with binding for proteins such as nucleosome remodeling complexes and DNA methyltransferases. Overall, our results reveal a previously unappreciated interplay between chromatin state and regulator binding that likely underlies dynamic gene regulation.

The generation of tissue-specific maps of chromatin states (Fig. 6a-f and Supplementary Figs. 12–16) also offers an unprecedented level of comparison of genomic features among different plant species. We thus tracked the evolution of chromatin states in vegetative-related tissues across five plant species (i.e., Arabidopsis, rice, barley, wheat and maize) using Arabidopsis as a reference (see Methods). We observed that most Arabidopsis chromatin states (except heterochromatin-related states) were highly conserved in other plant species (Fig. 6g). For example, orthologous sequences were found for 61.1% of Polycomb-repressed regions in at least one of the compared species. Moreover, we found significant epigenomic conservation at orthologous chromatin state-marked regions (Fig. 6h), consistent with results in human⁵⁴.

ChIP-seq and complementary assays are powerful methods to measure protein-DNA binding events and chemical modifications of histone proteins at a genome-wide level. In recent years, massive research efforts have resulted in the generation of regulome and epigenome data in various plant species. However, re-use and comparison of data from different source studies is not straightforward owing to the lack of a comprehensive ChIP-seq database in the plant field. Given this background, we launched a project in 2015 with the aim of uniform reanalysis and comprehensive evaluation of plant regulome data. Here we provide ChIP-Hub, which serves as a comprehensive data portal to explore plant regulomes, especially based on ChIP-seq, DAP-seq and ATAC-seq/DNase-seq experiments. Although all the data evaluated so far were taken from public databases, unpublished data provided by users can also be analyzed in the same way as published datasets (see online document under the “About” page). In principle, our computational pipeline can readily be adapted to analyze new types of profiling data, such as CUT&RUN experiments for mapping protein-DNA contacts and histone modifications. To this end, a routine to maintain and update ChIP-Hub in the future has been established. In addition, we are currently improving ChIP-Hub to support the analysis and visualization of plant single-cell sequencing data based on ATAC-seq and related techniques.

ChIP-Hub offers a new centralized resource for analysis and comparison of plant regulome and epigenome data. Integrative analysis of such large-scale datasets using machine-learning-based approaches provides an unprecedented opportunity to extract hidden regulatory genomic patterns and thus to advance our understanding of specific biological questions under investigation^55,56. For example, several integrative studies based on large-scale analysis of TF ChIP-seq data provide new perspectives on the gene regulatory networks underlying plant development and evolution^31,57−60. As more data are being generated in different plant species, direct comparisons of data among species become possible. As a starting point, we have compared the conservation of genomic DNA regions marked by different chromatin states in five plant species, and found that at least some chromatin states are highly positionally conserved among the investigated species (Fig. 6g,h), suggesting a conserved histone code in plants. In the future, ChIP-Hub will allow tracking of the evolution of TF binding sites^61–63 and of other active functional elements (such as promoters and enhancers)⁶⁴ in multiple plant species by comparative genomics.

In summary, we hope that ChIP-Hub will not only allow experimental biologists from various fields to comprehensively use all available regulome and epigenome information to obtain novel insights into their specific questions, but also allow theoretical biologists to model regulatory relationships under specific conditions and developmental regimes.

Data source, curation and collection

Metadata of ChIP-seq, DAP-seq, ATAC-seq/DNase-seq samples (equivalent to datasets, accession numbers start with SRX/ERX/DRX) and projects (start with SRP/ERP/DRP) were retrieved from NCBI SRA (https://www.ncbi.nlm.nih.gov/sra), BioSample (https://www.ncbi.nlm.nih.gov/biosample), BioProject (https://www.ncbi.nlm.nih.gov/bioproject) and/or GEO (https://www.ncbi.nlm.nih.gov/geo) databases. ChIP-Hub focuses on data in “green plants” (i.e., only considering plants in the taxonomy tree with a root ID 33090). Only data generated by Illumina platforms were kept. First, each dataset was associated with publication(s) if available (about 90% of samples can be linked with publications). Then, each dataset was manually curated to determine its investigated factor (i.e., which TF or histone modification mark), its experimental type (whether ChIP or control) and its associated replicates (an experiment may have several replicates), based on the metadata and the original publications. Note that it is important to manually check the metadata against the corresponding publication since some metadata were misannotated in the database. For example, the dataset SRX4063234 in fact contains two different samples, one for the ChIP experiment (SRR7142417) and another for the control experiment (SRR7142416). In such cases, “Run” accessions (start with SRR/ERR/DRR) were used instead as sample accessions (ca. 250 such cases). Datasets without related publications so far were marked as “unconfirmed” and will be regularly rechecked in the future. In general, one experiment may contain replicate samples (i.e., datasets), ChIP sample(s) as well as input control sample(s), and it is designed to investigate regulation of a specific factor (e.g., TF or histone modification) of interest under specific conditions. In the analysis (see the section below), each experiment was processed independently. Furthermore, annotation information for investigated factors was also manually curated. 
Broadly, factors are grouped into “TFs and other proteins”, “histone-related” or “unclassified”. For TFs, their gene IDs and family information were also determined if applicable. Finally, a meta file was obtained for each experiment after curation (see Supplementary Fig. 17 for examples), which serves as an input file for the ChIP-seq computation pipeline (see below).
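The accession prefixes mentioned above (SRX/ERX/DRX for datasets, SRP/ERP/DRP for projects, SRR/ERR/DRR for runs) lend themselves to a simple pattern check during metadata curation. The helper below is a hypothetical sketch of such a check and is not taken from the ChIP-Hub code base.

```python
import re

# First letter: archive of origin (S = SRA, E = ENA, D = DDBJ);
# third letter: record level (X = dataset/experiment, P = project, R = run).
ACCESSION_RE = re.compile(r"^[SED]R([XPR])\d+$")
LEVELS = {"X": "dataset", "P": "project", "R": "run"}


def classify_accession(accession):
    """Return 'dataset', 'project' or 'run', or None if the ID is not recognized."""
    match = ACCESSION_RE.match(accession.strip())
    if match is None:
        return None
    return LEVELS[match.group(1)]


# The example from the text: one dataset accession holding two runs.
levels = [classify_accession(a) for a in ("SRX4063234", "SRR7142417", "SRR7142416")]
```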

Raw fastq files for each experiment were downloaded from the European Nucleotide Archive (ENA, https://www.ebi.ac.uk/ena) database. If fastq files were not available at ENA, raw data in the SRA format were downloaded from the SRA database and converted into fastq format using the “fastq-dump” command provided by the SRA Toolkit (version 2.5.1). The “--split-files” option was used for paired-end reads. Fastq files were further checked for completeness before being submitted to analysis.
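The conversion step can be sketched as a small command builder. Only the “fastq-dump” command and its “--split-files” option come from the text; the “--outdir” option, the function name and the file paths are assumptions added for illustration, and the command is assembled but not executed here.

```python
def fastq_dump_command(sra_path, paired_end, outdir="."):
    """Build the fastq-dump argument list for one SRA file.

    --split-files is added only for paired-end libraries, so that the two
    mates are written to separate fastq files.
    """
    cmd = ["fastq-dump", "--outdir", outdir]
    if paired_end:
        cmd.append("--split-files")
    cmd.append(sra_path)
    return cmd


# Hypothetical local path; pass the list to subprocess.run(cmd, check=True) to execute.
cmd = fastq_dump_command("SRR7142417.sra", paired_end=True, outdir="fastq")
```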

Genome sequences and gene annotations were downloaded from public databases (Supplementary Table 8). Additional annotation data were also included in the ChIP-Hub database in order to better annotate the regulatory factors and their regulatory networks. Annotations for miRNA genes were obtained from miRBase⁶⁵ and their genomic locations were updated (by BLAST) based on current reference genomes. TF family information was retrieved from PlantTFDB⁶⁶. TF DNA binding motifs were downloaded from the JASPAR⁶⁷, CIS-BP⁶⁸ and PlantTFDB⁶⁶ databases and were scanned for occurrences in the genome using FIMO⁶⁹. These data are provided as separate data tracks in the genome browser.

Data processing

We followed the ChIP-seq data analysis guidelines¹⁰ recommended by the ENCODE project to develop a computational pipeline for the analysis of various regulome data (Fig. 1e). The analysis pipeline consists of quality control, read mapping, peak calling and assessment of reproducibility among biological replicates, and was used to analyze all annotated experiments in a standardized and uniform manner. Specifically, potential adapter sequences were removed from the sequencing reads using the Trim Galore program (version 0.4.1) and the quality of sequencing data was then evaluated by FastQC (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/). Trimmed reads were mapped to the corresponding reference genomes using Bowtie2⁷⁰ (version 2.2.6) with parameters “-q --no-unal --threads 8 --sensitive”. The parameter “-k” was set to 1, 2 and 3 for diploid genomes (e.g., Oryza sativa), tetraploid genomes (e.g., Gossypium barbadense) and hexaploid genomes (e.g., Triticum aestivum), respectively. Redundant reads and PCR duplicates were removed using Picard tools (v2.60; http://broadinstitute.github.io/picard/) and SAMtools⁷¹ (version 0.1.19).
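The ploidy-dependent mapping settings can be summarized in a short command builder. The Bowtie2 flags and the ploidy-to-“-k” mapping are those stated above; the function name, index name and file paths are placeholders, and the sketch only assembles the argument list without running the aligner.

```python
# -k values as described in the text: diploid = 1, tetraploid = 2, hexaploid = 3.
PLOIDY_TO_K = {2: 1, 4: 2, 6: 3}


def bowtie2_command(index, reads, ploidy, out_sam, mates=None):
    """Build a Bowtie2 argument list for single-end or paired-end reads."""
    cmd = [
        "bowtie2", "-q", "--no-unal", "--threads", "8", "--sensitive",
        "-k", str(PLOIDY_TO_K[ploidy]),
        "-x", index,
    ]
    if mates is None:
        cmd += ["-U", reads]                # single-end library
    else:
        cmd += ["-1", reads, "-2", mates]   # paired-end library
    cmd += ["-S", out_sam]
    return cmd


# Hypothetical hexaploid wheat sample with paired-end reads.
cmd = bowtie2_command("wheat_index", "chip_1.fq.gz", 6, "chip.sam", mates="chip_2.fq.gz")
```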

Peak calling was performed using MACS2⁷² (version 2.1.0). Duplicated reads were not considered (“--keep-dup=1”) during peak calling in order to achieve better specificity⁷³. The shifting size (“--shift”) used in the model was determined by the analysis of cross-correlation scores using the phantompeakqualtools package (https://code.google.com/p/phantompeakqualtools/). The parameter “--call-summits” was used to call narrow peaks. For broad marks of histone modifications (including H3K36me3, H4K20me1, H3K4me1, H3K79me2, H3K79me3, H3K27me3, H3K9me3 and H3K9me1), broad peaks were also called by turning on the “--broad” parameter in MACS2. A relaxed p-value threshold (p-value < 1e-2) was used in order to enable the correct computation of IDR (irreproducible discovery rate) values¹⁰, because IDR requires input peak data across the entire spectrum of high confidence (signal) and low confidence (noise) so that a bivariate model can be fitted to separate signal from noise³⁰. Following the recommendations for the analysis of self-consistency and reproducibility between replicates³⁰, replicate control samples (if available) were combined into one single control in the same experiment. Peak calling was applied to all replicates, pooled data (pooled replicates), pseudo-replicates (half subsample of reads) of each replicate and the pseudo-replicates of the pooled sample, using the same merged control as input (if applicable). By default, “reproducible” peaks across pseudo-replicates and true replicates with an IDR < 0.05 are recommended for analysis. In addition, peaks with different statistical thresholds are available upon request. For example, “significant” peaks were defined as those with a fold-change (fold enrichment above background) > 2 and a -log10 (q-value) > 3, and “lenient” peaks as those with a fold-change > 2 and a -log10 (q-value) > 2. “Relaxed” peaks without additional thresholding were also provided so that any custom threshold can be applied. 
All peak-based analyses in the pipeline (including peak overlapping, merging and summary) were performed using BEDTools ⁷⁴ (v2.25.0).
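The threshold tiers described above can be expressed as a small classifier. The sketch assumes the standard MACS2 narrowPeak layout, in which column 7 holds the fold enrichment and column 9 the -log10 (q-value); the function names and the example line are illustrative.

```python
def peak_tier(fold_enrichment, neg_log10_q):
    """Assign the most stringent tier a peak satisfies."""
    if fold_enrichment > 2 and neg_log10_q > 3:
        return "significant"
    if fold_enrichment > 2 and neg_log10_q > 2:
        return "lenient"
    return "relaxed"


def tier_narrowpeak_line(line):
    """Parse one tab-separated narrowPeak line and return (chrom, start, end, tier)."""
    fields = line.rstrip("\n").split("\t")
    fold = float(fields[6])         # column 7: fold enrichment
    neg_log10_q = float(fields[8])  # column 9: -log10(q-value)
    return fields[0], int(fields[1]), int(fields[2]), peak_tier(fold, neg_log10_q)


# Hypothetical peak record.
example = "Chr1\t100\t300\tpeak_1\t55\t.\t4.2\t6.1\t3.5\t80"
result = tier_narrowpeak_line(example)
```

Every "significant" peak also meets the "lenient" criteria, so the tiers are nested and the function reports only the strictest one.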

Various metric scores were calculated to assess different aspects of the quality of experiments (https://genome.ucsc.edu/ENCODE/qualityMetrics.html and https://www.encodeproject.org/data-standards/terms/; Fig. 2c,d and Supplementary Fig. 5). For example, library complexity is measured using the non-redundant fraction (NRF) and PCR bottlenecking coefficients 1 and 2 (PBC1 and PBC2). The SPOT (signal portion of tags) score, characterizing the enrichment of signal for each experiment, was calculated by the Hotspot⁷⁵ algorithm by subsampling ten million reads. Fraction of reads in peaks (FRiP), another measure of enrichment, is highly correlated with the SPOT score (Supplementary Fig. 5). NSC and RSC (normalized and relative strand cross-correlation coefficient) are related measures of enrichment without dependence on pre-defined peaks, which were calculated by the phantompeakqualtools program⁷⁶.
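For reference, the library-complexity and enrichment metrics named above follow the usual ENCODE definitions: NRF is the number of distinct read positions divided by the total number of reads, PBC1 is the number of positions with exactly one read divided by the number of distinct positions, PBC2 is the ratio of positions with exactly one read to positions with exactly two reads, and FRiP is the number of reads in peaks divided by the total number of reads. The sketch below computes them from toy inputs; the function names and the use of plain labels for read positions are simplifications for illustration.

```python
from collections import Counter


def library_complexity(positions):
    """Compute NRF, PBC1 and PBC2 from a list of mapped read positions.

    Each position should identify a unique alignment location, for example
    a (chromosome, start, strand) tuple.
    """
    counts = Counter(positions)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no reads provided")
    distinct = len(counts)
    m1 = sum(1 for c in counts.values() if c == 1)  # positions seen once
    m2 = sum(1 for c in counts.values() if c == 2)  # positions seen twice
    return {
        "NRF": distinct / total,
        "PBC1": m1 / distinct,
        "PBC2": m1 / m2 if m2 else float("inf"),
    }


def frip(reads_in_peaks, total_reads):
    """Fraction of reads in peaks."""
    return reads_in_peaks / total_reads


# Toy library: 7 reads over 4 distinct positions.
metrics = library_complexity(["a", "b", "b", "c", "d", "d", "d"])
```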

For visualization purposes, wiggle tracks (using pooled data across replicates) were generated by DeepTools⁷⁷ with the “bamCoverage” program; different normalization methods (including RPKM [reads per kilobase per million mapped reads], CPM [counts per million mapped reads], BPM [bins per million mapped reads], RPGC [reads per genomic content normalized to 1x sequencing depth] and None) were used to generate different types of signal files. Data signal tracks were visualized in JBrowse⁷⁸ or the WashU Epigenome Browser⁷⁹.
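As a worked illustration of two of these normalizations, the per-bin formulas documented for bamCoverage can be written out directly: CPM divides the read count of a bin by the number of mapped reads in millions, and RPKM additionally divides by the bin length in kilobases. The function names and numbers below are illustrative and do not reproduce actual track values.

```python
def cpm_per_bin(reads_in_bin, mapped_reads):
    """Counts per million mapped reads for one bin."""
    return reads_in_bin / (mapped_reads / 1e6)


def rpkm_per_bin(reads_in_bin, mapped_reads, bin_length_bp):
    """Reads per kilobase per million mapped reads for one bin."""
    return reads_in_bin / ((mapped_reads / 1e6) * (bin_length_bp / 1000))


# A 50-bp bin holding 50 reads in a library of 10 million mapped reads.
cpm = cpm_per_bin(50, 10_000_000)
rpkm = rpkm_per_bin(50, 10_000_000, 50)
```

Because RPKM scales with bin length, tracks generated with different bin sizes remain comparable, whereas CPM values depend on the bin size chosen.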

Annotation of promoters and enhancers

We adopted the same approach as in our previous study⁴¹ to predict Arabidopsis promoters and enhancers.

Assignment of target genes

Regulatory elements (in layman's terms, called “peaks”) were assigned to putative target genes based on the following rules. For a regulatory region overlapping with any gene(s) (protein-coding genes or miRNAs), the overlapping gene(s) were considered as its targets. Otherwise, the regulatory element was assigned to its nearest annotated gene within up to N bp, where N is the median size of intergenic regions (N was set to 3000 if the median size exceeded 3000). The start of genes (i.e., the transcription start site [TSS] of protein-coding genes and the 5’ end of miRNA precursors [pre-miRNAs]) was used to calculate the distance. In general, this approach associates a single regulatory element with no more than two genes, with a few exceptions in the case of the regulatory element overlapping multiple genes. This procedure was performed in each species independently.
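The assignment rules above can be sketched as follows for elements and genes on one chromosome. This is a simplified reading of the procedure: coordinates are assumed to be half-open, the distance is measured from the nearest edge of the element to the gene start, and only the single nearest gene is returned in the non-overlapping case. The function names and toy genes are hypothetical.

```python
from statistics import median


def max_distance(intergenic_sizes, cap=3000):
    """N = median intergenic size, capped at 3000 bp."""
    return min(median(intergenic_sizes), cap)


def assign_targets(region, genes, n_bp):
    """Assign a regulatory element to its putative target gene(s).

    region: (start, end); genes: list of dicts with name, start, end, strand.
    """
    r_start, r_end = region
    # Rule 1: any overlapping gene is a target.
    overlapping = [g["name"] for g in genes if g["start"] < r_end and r_start < g["end"]]
    if overlapping:
        return overlapping
    # Rule 2: otherwise take the nearest gene start within n_bp.
    best = None
    for g in genes:
        gene_start = g["start"] if g["strand"] == "+" else g["end"]
        dist = max(r_start - gene_start, gene_start - r_end, 0)
        if dist <= n_bp and (best is None or dist < best[0]):
            best = (dist, g["name"])
    return [best[1]] if best else []


genes = [
    {"name": "G1", "start": 1000, "end": 2000, "strand": "+"},
    {"name": "G2", "start": 6000, "end": 7000, "strand": "-"},
]
n_bp = max_distance([500, 4000, 9000])          # median 4000 is capped at 3000
targets = assign_targets((500, 700), genes, n_bp)
```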

Chromatin state analysis

In order to use the collected histone modification ChIP-seq data from diverse studies for chromatin state analysis and to make the data more comparable among different plant species, only well-characterized H3-related histone modification marks (including H3K9ac, H3K27ac, H3K4me1/2/3, H3K9me1/2/3, H3K27me1/2/3 and H3K36me1/2/3) were considered and only data generated in wild-type plants were used. Furthermore, the datasets were broadly categorized into vegetative-, reproductive- and root-related samples based on their tissue specificity (Supplementary Table 7). In general, these broadly defined “tissue” types (termed reference tissue types) are more comparable among different plant species, and differences in tissue collection among studies can be eliminated. Although the analysis is cell type agnostic, it is informative even when the relevant cell or tissue type has not been experimentally profiled (which is mostly the case in plants so far). In addition, we filtered out experiments with fewer than 1000 called peaks and only considered plant species with at least five distinct types of histone modification marks. In summary, 251 experiments from five plant species were retained for chromatin state analysis (Supplementary Table 7 and Supplementary Fig. 10).

ChromHMM⁴⁵ (version 1.19) was applied to the histone modification ChIP-seq data from three reference tissue types in five plant species to learn a multivariate HMM for segmenting the genome in each tissue type. Specifically, the called peaks were first pooled across ChIP-seq experiments for each histone modification in each tissue type, separately for each genome. Peaks within blacklist regions were excluded from the analysis. The remaining pooled peaks were then processed by the “BinarizeBed” command (with the parameter “-peaks”) into binarized data in 200 bp windows over the entire genome. Models were trained independently for each reference tissue type in each genome since the composition of marks varied among tissue types. We ran the “LearnModel” command with the number of states ranging from 2 to 15 and selected an “optimal-state” model as the most parsimonious one in which the states had clearly distinct emission properties and the distinctions between states were clearly interpretable (Supplementary Figs. 12-16). Furthermore, the resulting chromatin states were interpreted based on enrichment analysis of various types of functional annotations, such as gene elements, neighboring gene expression patterns, TF binding, chromatin accessibility and predicted enhancers^41,80. To this end, the “OverlapEnrichment” and “NeighborhoodEnrichment” commands were used. The mnemonics of the states for Arabidopsis vegetative-related tissues are given in Fig. 6d.
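To make the binarization step concrete, the following Python sketch mimics the idea of converting pooled peaks into presence/absence calls in 200 bp windows. It is an illustration only and not ChromHMM's implementation: the function name is hypothetical, a bin is assumed to be set to 1 if any peak overlaps it, and the last partial bin of a chromosome is assumed to be kept.

```python
def binarize_peaks(peaks_by_mark, chrom_length, bin_size=200):
    """Return {mark: [0/1 per bin]} for one chromosome.

    peaks_by_mark maps a mark name to a list of (start, end) peaks,
    0-based and half-open, already pooled across experiments.
    """
    n_bins = -(-chrom_length // bin_size)  # ceiling division (assumption)
    binarized = {}
    for mark, peaks in peaks_by_mark.items():
        row = [0] * n_bins
        for start, end in peaks:
            if end <= start:
                continue  # skip empty or malformed intervals
            first = max(start, 0) // bin_size
            last = min((end - 1) // bin_size, n_bins - 1)
            for b in range(first, last + 1):
                row[b] = 1  # the mark is "present" in every overlapped bin
        binarized[mark] = row
    return binarized
```

Each genomic bin then carries one binary value per mark, which is the kind of observation vector a multivariate HMM is trained on.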

Comparative genomics and cross-species comparisons

Whole-genome alignments were performed as previously described⁵¹. Briefly, soft-masked genomes were aligned to each other using the LastZ alignment algorithm⁸¹. Collinear alignment blocks separated by gaps of <100 kb were then “chained” according to their locations in both genomes and “netted” to choose the best sub-chain for the reference species⁸². For polyploid plants, each sub-genome was analyzed individually so that each contained non-overlapping chains. The whole-genome alignments can be visualized together with epigenomic tracks through the integrated Epigenome Browser (see below).

Pairwise comparisons of chromatin states were performed by one-to-one mapping of annotated regions between species based on the above whole-genome alignments. For regions mapped to multiple orthologous locations in the other genome (i.e., regions split over multiple alignment blocks), only the largest orthologous region in the same alignment block was considered. Marked regions were considered conserved between species when their orthologous location in the second species overlapped a marked region by a minimum of 50%. Note that the minimum required overlap had little influence on the overall results, given that the median overlap is 100% and the mean overlap is 89.9%. To make state interpretations more comparable across species (the chromatin marks available for state prediction differed slightly among species, see Supplementary Figs. 12-16), the learned chromatin states were re-interpreted (Fig. 6g,h) based on as common a set of marks as possible (Supplementary Fig. 18).
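The conservation criterion can be sketched in Python as follows. The function names are hypothetical, intervals are assumed to be 0-based and half-open on a single chromosome of the second species, the overlap fraction is assumed to be taken relative to the length of the orthologous region, and choosing the longest mapped piece is a simplification of the block-selection rule described above.

```python
def largest_block(mapped_blocks):
    """Pick the longest mapped piece of a region split over several blocks."""
    return max(mapped_blocks, key=lambda b: b[1] - b[0])


def best_overlap_fraction(ortholog, marked_regions):
    """Largest fraction of the orthologous region covered by one marked region."""
    start, end = ortholog
    length = end - start
    if length <= 0:
        return 0.0
    best = 0
    for m_start, m_end in marked_regions:
        best = max(best, min(end, m_end) - max(start, m_start))
    return best / length


def is_conserved(mapped_blocks, marked_regions, min_overlap=0.5):
    """True if the orthologous location overlaps a marked region by >= 50%."""
    if not mapped_blocks:
        return False  # region could not be mapped to the second species
    ortholog = largest_block(mapped_blocks)
    return best_overlap_fraction(ortholog, marked_regions) >= min_overlap
```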

Analysis of gene regulatory networks

To study gene regulatory networks (GRNs) controlled by TFs with available ChIP-seq data, we focused on a specific network motif, the TF-miRNA-TF feed-forward loop (FFL), in which a TF targets both a miRNA and a TF that is itself a target of that miRNA. Such three-node regulatory circuits are important for fine-tuning downstream gene expression^83,84. We focused the analysis on Arabidopsis data since a comprehensive list of TFs has been investigated by ChIP-seq experiments in this plant species (Fig. 2b). The methodology, however, can easily be applied to data from other plants as more data are generated. In the miRNA-mediated FFLs, target genes of miRNAs were predicted by the TargetFinder tool⁸⁵, with a prediction score cut-off value of 4. Other relationships (i.e., TF-miRNA and TF-TF) were supported by ChIP-seq data. The final meta-network consisted of regulatory relationships among 117 master TFs, 134 miRNAs and 462 common target TFs (Fig. 3d and Supplementary Table 3), covering nearly two-thirds of the predicted FFLs involved in flower development³¹ (Fig. 3e).
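The FFL definition above can be expressed as a small Python sketch. All names and identifiers here are hypothetical, and the sketch assumes that miRNA-target predictions are kept when their score is at or below the cut-off (i.e., that lower scores indicate better predictions); this is an assumption about how the cut-off of 4 is applied.

```python
from collections import defaultdict


def mirna_target_map(predictions, score_cutoff=4.0):
    """Build {miRNA: set of targets} from (miRNA, target, score) predictions.

    Assumption: predictions with score <= cutoff are retained.
    """
    targets = defaultdict(set)
    for mirna, target, score in predictions:
        if score <= score_cutoff:
            targets[mirna].add(target)
    return targets


def find_ffls(tf_bound, mirna_targets, tf_genes):
    """Enumerate TF-miRNA-TF feed-forward loops.

    tf_bound:      {master TF: set of bound loci (genes and miRNAs)} from ChIP-seq
    mirna_targets: {miRNA: set of predicted target genes}
    tf_genes:      set of genes annotated as TFs
    """
    ffls = []
    for tf in sorted(tf_bound):
        bound = tf_bound[tf]
        for mirna in sorted(bound & set(mirna_targets)):
            # The common target must be a TF bound by the master TF
            # and predicted to be targeted by the miRNA.
            for target in sorted(mirna_targets[mirna] & bound & tf_genes):
                ffls.append((tf, mirna, target))
    return ffls
```

Each returned triple corresponds to one loop: the master TF binds both the miRNA locus and the target TF, and the miRNA is predicted to target that same TF.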

Convolutional neural network analysis

To identify sequence motifs enriched in regions that are dynamically accessible among different tissues, we used Basset⁴², a convolutional neural network (CNN) approach. We set the input to the CNN as the 600 bp sequences centered at the summits of the top 2500 most accessible peaks for each tissue. The output of the classifier is a binary vector of length 10 (i.e., the number of tissue types). We used default Basset values for most parameters, except that we set the number of first-layer filters to 600. The CNN contained three convolutional layers, each followed by a rectified linear unit (ReLU) and a max pooling layer, and two fully connected layers. The network architecture is schematically shown in Supplementary Fig. 9c. The predictive performance of the networks was assessed with the basset_test.lua script in Basset.
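The preparation of CNN inputs and labels described above can be sketched in plain Python. This is not Basset's preprocessing code: the function names and tissue names are hypothetical, peaks are assumed to be ranked by an accessibility score, windows that would start before the chromosome start are assumed to be dropped, and ambiguous bases are assumed to be encoded as all zeros.

```python
def top_peak_windows(peaks, n_top=2500, width=600):
    """Return (chrom, start, end) windows centered on the best-scoring summits.

    peaks: list of (chrom, summit_position, score) tuples.
    """
    ranked = sorted(peaks, key=lambda p: p[2], reverse=True)[:n_top]
    half = width // 2
    # Assumption: windows running off the chromosome start are discarded.
    return [(c, s - half, s + half) for c, s, _ in ranked if s - half >= 0]


_ONE_HOT = {
    "A": (1, 0, 0, 0),
    "C": (0, 1, 0, 0),
    "G": (0, 0, 1, 0),
    "T": (0, 0, 0, 1),
}


def one_hot(sequence):
    """Encode a DNA string as a list of 4-tuples in A, C, G, T order."""
    return [_ONE_HOT.get(base, (0, 0, 0, 0)) for base in sequence.upper()]


def tissue_label(accessible_in, tissues):
    """Binary target vector with one entry per tissue type."""
    return [1 if t in accessible_in else 0 for t in tissues]


# Hypothetical placeholder names for the ten tissue types.
TISSUES = [f"tissue_{i}" for i in range(1, 11)]
```

With ten tissue types, each 600 bp input sequence is paired with a length-10 binary vector marking the tissues in which the region is accessible.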

ChIP-Hub Shiny application

To allow external users to make efficient use of our reanalyzed data, we developed an integrative web-based application (ChIP-Hub) with the Shiny framework (http://shiny.rstudio.com/), which combines the computational power of R with friendly and interactive web interfaces (Supplementary Fig. 2). All the sample metadata, curated metadata and analysis results were loaded into a MySQL database, allowing for interactive retrieval through the ChIP-Hub interface. These data are presented in tabular and chart forms in our Shiny web application. Furthermore, the data can be searched by keyword or gene to select datasets of interest. The associated result files, such as wiggle signal files, peak files and additional annotation files, can be loaded into the integrated Epigenome Browser (https://compbio.nju.edu.cn/browser/) for visualization.

Online access and updates

To make this project easier to maintain over the long term and to keep it up to date, we have developed a semi-automatic computational program (ChIPer). The program regularly (every month according to our current plan) checks whether any new datasets are available in public databases. If so, the new datasets are sent for curation via email, and the curated datasets are automatically analyzed by the data processing pipeline. New result files will be checked and uploaded to our web server quarterly. In addition, we will add more functionality to our Shiny application as required.

Statistical analysis

Unless otherwise specified, all statistical analyses and data visualization were done in R (version 3.4.1). The R packages ggplot2 and plotly were used extensively for graphics. All source data for each figure can be found in the Supplementary Information, and the newest data can be found on the ChIP-Hub website.

Code and data availability

The data can be viewed, mined and downloaded through the ChIP-Hub website (http://www.chip-hub.org). ChIP-Hub is also available at https://biobigdata.nju.edu.cn/ChIPHub/.

Acknowledgements

The authors acknowledge the Center for Information Technology and the High Performance Computing Center of Nanjing University and the North-German Supercomputing Alliance (HLRN) for providing high performance computing (HPC) resources that have contributed to the research results reported in this paper. We would like to thank all the data contributors who made this project possible. Dijun Chen appreciates the great support from the National Natural Science Foundation of China (No. 32070656), the Nanjing University Deng Feng Scholars Program and the Priority Academic Program Development (PAPD) of Jiangsu Higher Education Institutions. Kerstin Kaufmann wishes to thank the Alexander-von-Humboldt foundation and the Federal Ministry of Education and Research for support. Peijing Zhang and Ming Chen appreciate the 2018 Zhejiang University Academic Award for Outstanding Doctoral Candidates, the Fundamental Research Funds for the Central Universities and the Collaborative Innovation Center for Modern Crop Production co-sponsored by Province and Ministry.

Author contributions

D.C. and K.K. conceived and designed the study. D.C., L.-Y.F., X.Z., R.Y., Z.W. and P.Z. annotated the sample metadata. D.C. implemented the computational analysis pipeline and developed the Shiny application with support from M.C. D.C. and L.-Y.F. performed the analyses. D.C. wrote the manuscript with input from K.K. All authors reviewed and approved the submitted version.

Additional information

Competing interests: The authors declare no competing interests.

Johnson, D. S., Mortazavi, A., Myers, R. M. & Wold, B. Genome-wide mapping of in vivo protein-DNA interactions. Science (2007) doi:10.1126/science.1141319.
Barski, A. et al. High-Resolution Profiling of Histone Methylations in the Human Genome. Cell (2007) doi:10.1016/j.cell.2007.05.009.
Robertson, G. et al. Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nat. Methods (2007) doi:10.1038/nmeth1068.
Mikkelsen, T. S. et al. Genome-wide maps of chromatin state in pluripotent and lineage-committed cells. Nature (2007) doi:10.1038/nature06008.
Kaufmann, K. et al. Orchestration of floral initiation by APETALA1. Science 328, 85–89 (2010).
Park, P. J. ChIP–seq: advantages and challenges of a maturing technology. Nat. Rev. Genet. 10, 669–680 (2009).
Farnham, P. J. Insights from genomic profiling of transcription factors. Nature Reviews Genetics (2009) doi:10.1038/nrg2636.
Furey, T. S. ChIP-seq and beyond: New and improved methodologies to detect and characterize protein-DNA interactions. Nature Reviews Genetics (2012) doi:10.1038/nrg3306.
Egelhofer, T. A. et al. An assessment of histone-modification antibody quality. Nat. Struct. Mol. Biol. (2011) doi:10.1038/nsmb.1972.
Landt, S. G. et al. ChIP-seq guidelines and practices of the ENCODE and modENCODE consortia. Genome Research vol. 22 1813–1831 (2012).
He, C. & Bonasio, R. Chromatin mapping: A cut above. Elife (2017) doi:10.7554/elife.21856.
Zentner, G. E. & Henikoff, S. High-resolution digital profiling of the epigenome. Nature Reviews Genetics (2014) doi:10.1038/nrg3798.
Rhee, H. S. & Pugh, B. F. Comprehensive genome-wide protein-DNA interactions detected at single-nucleotide resolution. Cell (2011) doi:10.1016/j.cell.2011.11.013.
Skene, P. J. & Henikoff, S. An efficient targeted nuclease strategy for high-resolution mapping of DNA binding sites. Elife (2017) doi:10.7554/eLife.21856.
O’Malley, R. C. et al. Cistrome and Epicistrome Features Shape the Regulatory DNA Landscape. Cell 165, 1280–1292 (2016).
Bartlett, A. et al. Mapping genome-wide transcription-factor binding sites using DAP-seq. Nat. Protoc. (2017) doi:10.1038/nprot.2017.055.
Galli, M. et al. The DNA binding landscape of the maize AUXIN RESPONSE FACTOR family. Nat. Commun. 9, 4526 (2018).
Bell, O., Tiwari, V. K., Thomä, N. H. & Schübeler, D. Determinants and dynamics of genome accessibility. Nature Reviews Genetics (2011) doi:10.1038/nrg3017.
Consortium, T. E. P. et al. An integrated encyclopedia of DNA elements in the human genome. Nature 489, 57–74 (2012).
Yue, F. et al. A comparative encyclopedia of DNA elements in the mouse genome. Nature 515, 355–364 (2014).
modENCODE Consortium, T. et al. Identification of Functional Elements and Regulatory Circuits by Drosophila modENCODE. Science 330, 1787–1797 (2010).
Gerstein, M. B. et al. Integrative analysis of the Caenorhabditis elegans genome by the modENCODE project. Science 330, 1775–87 (2010).
Lü, P. et al. Genome encode analyses reveal the basis of convergent evolution of fleshy fruit ripening. Nat. Plants 4, 784–791 (2018).
Oki, S. et al. ChIP-Atlas: a data-mining suite powered by full integration of public ChIP-seq data. EMBO Rep. 19, e46255 (2018).
Chèneby, J., Gheorghe, M., Artufel, M., Mathelier, A. & Ballester, B. ReMap 2018: an updated atlas of regulatory regions from an integrative analysis of DNA-binding ChIP-seq experiments. Nucleic Acids Res. 46, D267–D275 (2018).
Mei, S. et al. Cistrome Data Browser: a data portal for ChIP-Seq and chromatin accessibility data in human and mouse. Nucleic Acids Res. 45, D658–D662 (2017).
Nelson, A. D. L., Haug-Baltzell, A. K., Davey, S., Gregory, B. D. & Lyons, E. EPIC-CoGe: Managing and analyzing genomic data. Bioinformatics (2018) doi:10.1093/bioinformatics/bty106.
Ran, X. et al. Plant Regulomics: a data-driven interface for retrieving upstream regulators from plant multi‐omics data. Plant J. tpj.14526 (2019) doi:10.1111/tpj.14526.
Thurman, R. E. et al. The accessible chromatin landscape of the human genome. Nature (2012) doi:10.1038/nature11232.
Li, Q., Brown, J. B., Huang, H. & Bickel, P. J. Measuring reproducibility of high-throughput experiments. Ann. Appl. Stat. 5, 1752–1779 (2011).
Chen, D., Yan, W., Fu, L.-Y. & Kaufmann, K. Architecture of gene regulatory networks controlling flower development in Arabidopsis thaliana. Nat. Commun. 9, 4534 (2018).
Lu, Z., Hofmeister, B. T., Vollmers, C., DuBois, R. M. & Schmitz, R. J. Combining ATAC-seq with nuclei sorting for discovery of cis-regulatory regions in plant genomes. Nucleic Acids Res. 45, (2017).
Maher, K. A. et al. Profiling of Accessible Chromatin Regions across Multiple Plant Species and Cell Types Reveals Common Gene Regulatory Principles and New Control Modules. Plant Cell 30, 15–36 (2018).
Pajoro, A. et al. Dynamics of chromatin accessibility and gene regulation by MADS-domain transcription factors in flower development. Genome Biol. 15, 1–19 (2014).
Potter, K. C., Wang, J., Schaller, G. E. & Kieber, J. J. Cytokinin modulates context-dependent chromatin accessibility through the type-B response regulators. Nat. Plants 4, 1102–1111 (2018).
Sijacic, P., Bajic, M., McKinney, E. C., Meagher, R. B. & Deal, R. B. Changes in chromatin accessibility between Arabidopsis stem cells and mesophyll cells illuminate cell type-specific transcription factor networks. Plant J. 94, 215–231 (2018).
Tannenbaum, M. et al. Regulatory chromatin landscape in Arabidopsis thaliana roots uncovered by coupling INTACT and ATAC-seq. Plant Methods 14, 1–12 (2018).
Sullivan, A. M. et al. Mapping and Dynamics of Regulatory DNA in Maturing Arabidopsis thaliana Siliques. Front. Plant Sci. 10, (2019).
Sullivan, A. M. et al. Mapping and dynamics of regulatory DNA and transcription factor networks in A. thaliana. Cell Rep. 8, 2015–2030 (2014).
Zhang, W., Zhang, T., Wu, Y. & Jiang, J. Genome-wide identification of regulatory DNA elements and protein-binding footprints using signatures of open chromatin in Arabidopsis. Plant Cell 24, 2719–2731 (2012).
Yan, W. et al. Dynamic control of enhancer activity drives stage-specific gene expression during flower morphogenesis. Nat. Commun. (2019) doi:10.1038/s41467-019-09513-2.
Kelley, D. R., Snoek, J. & Rinn, J. L. Basset: learning the regulatory code of the accessible genome with deep convolutional neural networks. Genome Res. 26, 990–999 (2016).
Hu, X. & Xu, L. Transcription Factors WOX11/12 Directly Activate WOX5/7 to Promote Root Primordia Initiation and Organogenesis. Plant Physiol. 172, 2363–2373 (2016).
Sheng, L. et al. Non-canonical WOX11-mediated root branching contributes to plasticity in Arabidopsis root system architecture. Development 144, 3126–3133 (2017).
Ernst, J. & Kellis, M. ChromHMM: automating chromatin-state discovery and characterization. Nat. Methods (2012) doi:10.1038/nmeth.1906.
Roudier, F. et al. Integrative epigenomic mapping defines four main chromatin states in Arabidopsis. EMBO J. 30, 1928–1938 (2011).
Wang, C. et al. Genome-wide analysis of local chromatin packing in Arabidopsis thaliana. Genome Res. 25, 246–256 (2015).
Luo, C. et al. Integrative analysis of chromatin states in Arabidopsis identified potential regulatory mechanisms for natural antisense transcript production. Plant J. 73, 77–90 (2013).
Sneppen, K. & Ringrose, L. Theoretical analysis of Polycomb-Trithorax systems predicts that poised chromatin is bistable and not bivalent. Nat. Commun. (2019) doi:10.1038/s41467-019-10130-2.
Harikumar, A. & Meshorer, E. Chromatin remodeling and bivalent histone modifications in embryonic stem cells. EMBO Rep. (2015) doi:10.15252/embr.201541011.
Haudry, A. et al. An atlas of over 90,000 conserved noncoding sequences provides insight into crucifer regulatory regions. Nat. Genet. 45, 891–898 (2013).
Liu, C., Lu, F., Cui, X. & Cao, X. Histone Methylation in Higher Plants. Annu. Rev. Plant Biol. (2010) doi:10.1146/annurev.arplant.043008.091939.
Du, J., Johnson, L. M., Jacobsen, S. E. & Patel, D. J. DNA methylation pathways and their crosstalk with histone methylation. Nat. Rev. Mol. Cell Biol. (2015) doi:10.1038/nrm4043.
Gjoneska, E. et al. Conserved epigenomic signals in mice and humans reveal immune basis of Alzheimer’s disease. Nature (2015) doi:10.1038/nature14252.
Angermueller, C., Pärnamaa, T., Parts, L. & Oliver, S. Deep Learning for Computational Biology. Mol. Syst. Biol. 12, 1–16 (2016).
Chen, D. et al. The HTPmod Shiny application enables modeling and visualization of large-scale biological data. Commun. Biol. 1, 89 (2018).
Heyndrickx, K. S., Van de Velde, J., Wang, C., Weigel, D. & Vandepoele, K. A functional and evolutionary perspective on transcription factor binding in Arabidopsis thaliana. Plant Cell 26, 3894–910 (2014).
Aerts, N., de Bruijn, S., van Mourik, H., Angenent, G. C. & van Dijk, A. D. J. Comparative analysis of binding patterns of MADS-domain proteins in Arabidopsis thaliana. BMC Plant Biol. 18, 131 (2018).
Yang, F. et al. A Maize Gene Regulatory Network for Phenolic Metabolism. Mol. Plant (2017) doi:10.1016/j.molp.2016.10.020.
Song, L. et al. A transcription factor hierarchy defines an environmental stress response network. Science 354, aag1550–aag1550 (2016).
Muino, J. M. et al. Evolution of DNA-binding sites of a floral master regulatory transcription factor. Mol. Biol. Evol. (2016) doi:10.1093/molbev/msv210.
Villar, D., Flicek, P. & Odom, D. T. Evolution of transcription factor binding in metazoans-mechanisms and functional implications. Nature Reviews Genetics (2014) doi:10.1038/nrg3481.
Schmidt, D. et al. Five-vertebrate ChIP-seq reveals the evolutionary dynamics of transcription factor binding. Science (2010) doi:10.1126/science.1186176.
Villar, D. et al. Enhancer evolution across 20 mammalian species. Cell (2015) doi:10.1016/j.cell.2015.01.006.
Kozomara, A., Birgaoanu, M. & Griffiths-Jones, S. MiRBase: From microRNA sequences to function. Nucleic Acids Res. (2019) doi:10.1093/nar/gky1141.
Jin, J. et al. PlantTFDB 4.0: Toward a central hub for transcription factors and regulatory interactions in plants. Nucleic Acids Res. (2017) doi:10.1093/nar/gkw982.
Khan, A. et al. JASPAR 2018: Update of the open-access database of transcription factor binding profiles and its web framework. Nucleic Acids Res. (2018) doi:10.1093/nar/gkx1126.
Weirauch, M. T. et al. Determination and Inference of Eukaryotic Transcription Factor Sequence Specificity. Cell 158, 1431–1443 (2014).
Grant, C. E., Bailey, T. L. & Noble, W. S. FIMO: scanning for occurrences of a given motif. Bioinformatics 27, 1017–8 (2011).
Langmead, B. & Salzberg, S. L. Fast gapped-read alignment with Bowtie 2. Nat. Methods 9, 357–359 (2012).
Li, H. & Durbin, R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics 25, 1754–1760 (2009).
Zhang, Y. et al. Model-based Analysis of ChIP-Seq (MACS). Genome Biol. 9, R137 (2008).
Bailey, T. et al. Practical Guidelines for the Comprehensive Analysis of ChIP-seq Data. PLoS Comput. Biol. 9, (2013).
Quinlan, A. R. & Hall, I. M. BEDTools: A flexible suite of utilities for comparing genomic features. Bioinformatics 26, 841–842 (2010).
John, S. et al. Chromatin accessibility pre-determines glucocorticoid receptor binding patterns. Nature Genetics vol. 43 264–268 (2011).
Marinov, G. K., Kundaje, A., Park, P. J. & Wold, B. J. Large-scale quality analysis of published ChIP-seq data. G3 Genes, Genomes, Genet. (2014) doi:10.1534/g3.113.008680.
Ramírez, F., Dündar, F., Diehl, S., Grüning, B. A. & Manke, T. DeepTools: A flexible platform for exploring deep-sequencing data. Nucleic Acids Res. 42, (2014).
Skinner, M. E., Uzilov, A. V., Stein, L. D., Mungall, C. J. & Holmes, I. H. JBrowse: A next-generation genome browser. Genome Res. (2009) doi:10.1101/gr.094607.109.
Zhou, X. et al. The human epigenome browser at Washington University. Nature Methods vol. 8 989–990 (Nature Research, 2011).
Zhu, B., Zhang, W., Zhang, T., Liu, B. & Jiang, J. Genome-Wide Prediction and Validation of Intergenic Enhancers in Arabidopsis Using Open Chromatin Signatures. Plant Cell 27, 2415–2426 (2015).
Harris, R. Improved pairwise alignment of genomic DNA. (2007).
Kent, W. J., Baertsch, R., Hinrichs, A., Miller, W. & Haussler, D. Evolution’s cauldron: Duplication, deletion, and rearrangement in the mouse and human genomes. Proc. Natl. Acad. Sci. (2003) doi:10.1073/pnas.1932072100.
Chen, K. & Rajewsky, N. The evolution of gene regulation by transcription factors and microRNAs. Nature Reviews Genetics vol. 8 93–103 (2007).
Herranz, H. & Cohen, S. M. MicroRNAs and gene regulatory networks: Managing the impact of noise in biological systems. Genes and Development vol. 24 1339–1344 (2010).
Fahlgren, N. & Carrington, J. C. miRNA Target Prediction in Plants. Methods Mol. Biol. 592, 51–57 (2010).

SupplementaryTables.xlsx
Supplementary Tables 1-8
SuppFigs.pdf
Supplementary Figures 1-19
nrreportingsummary.pdf
Reporting summary