Microdissection of joint interzones and phalanges
The interzones and the adjacent proximal part of phalange were dissected from the third digit of the hindlimb of chick embryos (stage HH32, when the interzone was distinguishable under the microscope; Fig. 1a). Next, the RNA-sequencing was performed on the separated tissues (see Materials and Methods).
Following the bioinformatic analysis, a list of DEGs was compiled. We confirmed that the interzone samples had increased mRNA levels of GDF5, ENPP2, COL3A1 and ERG, each already known to be expressed in joint interzones. Phalange samples had significantly higher expression of COL2A1, MATN1, SNAI1 and RUNX2 (Fig. 1b; Supplementary Table 1). Notably, COL10A1 mRNA expression was not detected in phalanges (Supplementary Table 2), supporting the notion that the phalanges were collected at the early developmental stage, prior to hypertrophy. We further validated differential expression of selected interzone markers (i.e. GDF5, ENPP2, ERG) using RT-qPCR (Fig. 1c), which confirmed our RNA-seq data. These results show that the we successfully dissected tissues of interest.
An enhancer atlas of joint interzone and phalange identifies candidate REs involved in the regulation of cell identity
In the separated joint and phalange we also mapped the global H3K27ac and H3K4me1 signatures by chromatin immunoprecipitation followed by next-generation sequencing (ChIP-seq). The unsupervised clustering analysis revealed that interzone and phalange have distinct profiles for both H3K27ac and H3K4me1 (Fig. 2a,b). This mapping of the regions enriched for these H3 modifications enabled us to generate a joint/phalange CE atlas: regions enriched for both H3K27ac and H3K4me1 were denoted strongly active enhancers, for H3K27ac active enhancers, and for H3K4me1 poised enhancers. Since enhancers are often evolutionarily conserved26, we decided to select regions conserved among chick, mouse and human. In doing so, we identified 14,217 strongly active, 5,479 active and 11,913 poised enhancers in the interzone, and 14,224, 6,041 and 12,997, respectively, in the phalange (Supplementary Table 3; for their frequency and similar ratios in both samples, see Fig. 3a).
For functional annotation of these REs we used GREAT, which extracts gene ontology (GO) terms linked to biological processes27. The denoted strongly active enhancers associated with cartilage and skeletal development (Fig. 3b), whereas active and poised enhancers mostly linked to general cell functions or processes not specific for skeletal development (Supplementary Fig. 1). Our results also matched with Cheung et al.28 who retrieved GO terms that associate with cartilage and skeletal development of strongly active enhancers of human chondrocytes, and more general GOs of active and poised enhancers.
This agreement with data of Cheung et al.28 prompted us to proceed exclusively with strongly active CEs. These CEs are typically located >5 kb away from the respective transcription start site (TSS) (Fig. 3c). Among them, we confirmed previously characterized enhancers of well-studied loci (Fig. 4a), specifically those expressed in interzone (e.g., GDF5)29 or chondrocytes (e.g., phalangeal IHH, SOX9, ACAN)30-32. In parallel, and further confirming the validity of our in silico selection approach, we extended the analysis by using Vista Enhancer Browser dataset33, leading to identification of 257 enhancers (Supplementary Table 4), which have been functionally validated during embryogenesis (for illustration of 6 of these, see Fig. 4b).
To further characterize strongly active CEs of developing interzone and phalange, we selected the mutually exclusive strongly active enhancers, yielding 3,406 CEs (out of the aforementioned 14,217 in total) unique for interzone and 3,407 (out of 14,224) for phalange (Supplementary Table 5). GREAT linked many of such interzone-specific CEs to mesenchymal cell differentiation, and regulation of transmembrane receptor protein serine/threonine kinase signaling (Supplementary Fig. 2, two top panels). In contrast, CEs exclusive for phalange retrieved GO terms including chondrocyte differentiation and endochondral bone morphogenesis (Supplementary Fig. 2, two bottom panels).
Next, we investigated whether the change of enhancer state from strongly active to poised would be relevant to the regulation of tissue-specific genes. Indeed, 2,111 changes occur with strongly active CEs (out of the aforementioned identified 14,217 in total) in interzone and are found poised in phalange; in comparison, 1,502 changes occur with strongly active CEs (out of 14,224 in total) in phalange and are found poised in interzone (Supplementary Table 6). Strikingly, the interzone strongly active enhancers that are poised in phalange were found to associate with regulation of transmembrane receptor protein serine/threonine kinase signaling (Supplementary Fig. 2), which is consistent with our functional annotation of interzone-specific CEs. In contrast, phalange strongly active enhancers that are poised in interzone linked to positive regulation of cartilage differentiation (Supplementary Fig. 2).
Integrative analysis of DEGs and CEs
To correlate the transcribed genes with CEs, we superimposed our RNA-seq and ChIP-seq data. First, DEG analysis revealed 116 upregulated genes in interzone, and 61 such genes in phalange (log2FC > 0.5, p.adj < 0.05) (Supplementary Table 2). Pathway enrichment analysis of these showed that genes upregulated in interzone again linked to transmembrane receptor protein serine/threonine kinase signaling (Fig. 5a), in line with our preceding annotation of interzone-exclusive CEs (see above). In particular RASL11B, LTBP1, TGFB2, GDF5, FSTL1, BMP2, DACT2, CCN3, BMP6, CILP, INHBB and BMPR2 were found upregulated in interzone as compared to phalange (Fig. 5b). Similarly, analysis of genes upregulated in phalange linked these to chondrocyte differentiation and also endochondral bone morphogenesis (Fig. 5c), which is consistent with functional annotation of phalange-specific CEs. The genes involved it these two processes are RUNX2, COL2A1, TRPV4, COL27A1, MATN1, COMP and CYTL1 (Fig. 5d).
Next, to associate the CEs to all DEGs significantly upregulated in interzone, we extracted strongly active enhancers from genomic regions located within -/+ 1 Mb from the TSS of the annotated genes, resulting in the identification of 857 interzone-specific CEs (Supplementary Table 7). Examples of enhancer analyses are presented in Fig. 6a. For phalange-upregulated DEGs we identified 547 phalange-specific CEs (again within -/+ 1 Mb from the TSS of the genes; Supplementary Table 8), with examples given in Fig. 6b. Collectively, the integrative analysis of transcriptome data with CEs assignment, and considering interzone vs. phalange signatures, showed that the DEGs involved in celltype-specific processes are regulated by cell-specific CEs.
CEs regulate skeletal malformation and disease-relevant genes, and are associated with a higher risk of OA
Mutations in genes and REs have been linked to various limb malformations and skeletal defects21,23-25. We applied two types of analysis to screen for CEs that link to molecular etiology of limb disorders in general. First, we assigned our strongly active CEs to the proximal genes (including relevant respective marker genes and DEGs) and tested whether these genes have been associated previously with limb phenotypes, either in patients (including in syndromes) or mouse models. Analysis of interzone/phalange specific strongly active CEs showed that these are indeed involved in the regulation of genes linked to joint and phalange abnormalities (Supplementary Table 9a-o).
The interzone-specific, strongly active CEs particularly associate with defective joint mobility in humans (Fig. 7a). For instance, we identified such putative enhancers of OTX2 and TGFB2, which are genes that have been linked to joint laxity (OMIM #610125 and #614816, respectively); candidate CEs of FLNB, a gene associated with joint dislocation and carpal fusion (OMIM #150250 and #272460, respectively); we also predicted enhancers of COL5A1, a gene linked to joint hypermobility (OMIM #130000) (Supplementary Table 9a). In mouse, the interzone-specific CEs associate with abnormal joint morphology and fused joints (Fig. 7b; see also Supplementary Table 9b-c). Phalange-specific CEs have in humans been linked to aplasia/hypoplasia of the phalanges, short phalanges, and abnormality of the phalanges of the toe (Fig. 7c). For example, we identified putative enhancers of BMPR1B and IHH (Supplementary Table 9d), both being genes associated with brachydactyly type A (OMIM #112500), and candidate enhancers of RUNX2, a gene linked to cleidocranial dysplasia, with brachydactyly (OMIM #119600) (Supplementary Table 9f). The phalange-specific CEs have also been linked to abnormal chondrocyte and cartilage morphology, chondrodystrophy, abnormal bone ossification and short limbs in mouse (Fig. 7d; see also Supplementary Table 9j-o).
Next, we screened the GWAS catalog (NHGRI-EBI34), which resulted in identification of 3,263 single-nucleotide polymorphisms (SNPs) within CEs, with 232 of these linking to skeleton related traits (Supplementary Table 10). For instance, we identified single-nucleotide variations (SNVs) within CEs that have been associated with OA relevant genes, such as for ALDH1A2 (rs4775006, P-value 8x10-10)35,36 and WWP2 (rs34195470, p-value 3x10-13)5,37. Additionally, we identified SNVs associated with increased risk of OA, which are located within CEs that map to LRIG3 (rs79056043, P-value 1x10-9), CRADD (rs7953280, P-value 5x10-12) and ROCR (rs8067763, P-value 2x10-9)5,38. Altogether, the analysis of the tissue-specific CEs showed the association with either synovial joint or phalange congenital abnormalities, which affect their function, as well as the identified regulatory elements of genes relevant for joint degenerative disorders, in particular OA.