The design of scRICA-seq.
To simultaneously profile chromatin accessibility, RNA expression, and transcript structure in the same cell, we employ two strategies that enable parallel capture and sequencing of DNA and RNA. First, for low-throughput single-cell samples, we use a nuclear-cytoplasmic separation method previously established by our team (ref. 5). This method isolates the cell nucleus for chromatin accessibility sequencing, while the cytoplasmic RNA from the same cell is used for full-length transcriptome analysis. Second, for high-throughput single-cell populations, we employ the 10x Genomics microfluidic system, which enables parallel analysis of chromatin accessibility and the full-length transcriptome within individual cell nuclei (Fig. 1a).
To facilitate single-cell full-length transcriptome analysis, we developed a new method called scRCAT-seq2, which builds upon our previous work on scRCAT-seq (ref. 4). The scRCAT-seq2 method comprises the following steps: (1) labeling the ends of individual RNA/cDNA molecules with a tag (barcode); (2) creating circular cDNA through end-to-end ligation; (3) performing circular amplification to generate multiple copies of the full-length cDNA, each carrying an identical tag; (4) fragmenting the full-length cDNA by random Tn5 transposase insertion around the ligated ends; (5) constructing libraries and conducting paired-end sequencing on the resulting fragments; and (6) integrating short reads that share the same molecule tag to identify key features of full-length transcripts, such as transcription start sites (TSS), transcription end sites (TES), and specific exons (Fig. 1b, Extended Data Fig. 1).
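As an illustration of step (6), the following minimal Python sketch groups aligned fragments by molecule tag and derives a putative TSS, TES, and set of covered blocks for each tagged molecule. It is not the scRCAT-seq2 pipeline itself: the `Read` record, its `end_type` field, and the function names are hypothetical, and the sketch assumes plus-strand transcripts whose fragments have already been aligned to a single chromosome and classified as carrying the 5' end, the 3' end, or neither.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    """Hypothetical record for one aligned fragment (plus strand assumed)."""
    tag: str       # molecule tag shared by all copies of one cDNA
    start: int     # 0-based leftmost aligned position
    end: int       # exclusive rightmost aligned position
    end_type: str  # "5p", "3p", or "internal" (assumed to be known upstream)


def merge_intervals(intervals):
    """Merge overlapping or touching (start, end) intervals."""
    merged = []
    for start, end in sorted(intervals):
        if merged and start <= merged[-1][1]:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(iv) for iv in merged]


def assemble_by_tag(reads):
    """Summarize each tagged molecule by its TSS, TES, and covered blocks."""
    by_tag = defaultdict(list)
    for read in reads:
        by_tag[read.tag].append(read)

    molecules = {}
    for tag, group in by_tag.items():
        five_prime = [r.start for r in group if r.end_type == "5p"]
        three_prime = [r.end for r in group if r.end_type == "3p"]
        molecules[tag] = {
            # Most upstream 5' fragment start; None if no 5' fragment was seen.
            "tss": min(five_prime) if five_prime else None,
            # Most downstream 3' fragment end; None if no 3' fragment was seen.
            "tes": max(three_prime) if three_prime else None,
            "blocks": merge_intervals((r.start, r.end) for r in group),
        }
    return molecules
```

For a tag with fragments at 100-180 (5' end), 150-220 (internal), and 400-500 (3' end), this sketch reports a TSS of 100, a TES of 500, and two covered blocks, (100, 220) and (400, 500). A molecule lacking a 3' fragment keeps `tes` as `None` rather than guessing an end.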
Accuracy and sensitivity of scRCAT-seq2.
To assess the sensitivity and accuracy of scRCAT-seq2 in analyzing single-cell full-length transcriptomes, we added an RNA standard set comprising multiple isoforms (SIRV-set 3) to a single ES cell sample. scRCAT-seq2 effectively captured the 5' and 3' ends as well as the exons of the transcripts (Fig. 2a). SIRV-set 3 comprises 7 genes and 69 isoforms, of which scRCAT-seq2 detected 61 isoforms (88%). Moreover, when compared to the known annotation of SIRV-set 3 transcripts, scRCAT-seq2 achieved an accuracy of 94.45% in identifying complete transcripts, with 30.78% identified without any reference and 46.63% identified based on the attributes of alternative isoforms documented in the reference (Fig. 2b-c, Extended Data Fig. 3).
When we compared scRCAT-seq2 with ScISOr-seq, a third-generation sequencing method based on the PacBio platform, scRCAT-seq2 detected more genes and isoforms. Specifically, scRCAT-seq2 detected 2.29-fold more genes (12,309 ± 719 versus 5,378 ± 1,189) and 2.11-fold more full-length isoforms per cell (2,772 ± 156 versus 1,313 ± 386) than ScISOr-seq (Fig. 2d-e). Moreover, the majority of genes and transcripts detected by ScISOr-seq were also detected by scRCAT-seq2 (Fig. 2f-g).
Furthermore, by analyzing the intersection of genes detected at different omics levels, we found that scRCAT-seq2 detected 85.9% of the genes detected by the 10x Genomics scRNA-seq platform (22,937 out of 26,691), and that the majority of gene and transcript promoter regions were covered by scATAC-seq (Fig. 2h). In summary, these results demonstrate the enhanced sensitivity of scRCAT-seq2 compared to third-generation sequencing methods. The method provides an effective approach for integrating and analyzing the heterogeneity and correlation among RNA expression, structural differences, and chromatin accessibility in cells.
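The headline ratios in this section follow directly from the reported counts. The short Python check below recomputes them from the numbers stated in the text; the function and variable names are illustrative only.

```python
def percent(part, whole):
    """Return part / whole expressed as a percentage."""
    return 100.0 * part / whole


def fold_change(a, b):
    """Return how many times larger a is than b."""
    return a / b


# Counts taken from the text above.
sirv_detection = percent(61, 69)          # SIRV-set 3 isoforms detected
gene_fold = fold_change(12309, 5378)      # genes, scRCAT-seq2 vs ScISOr-seq
isoform_fold = fold_change(2772, 1313)    # isoforms per cell, same comparison
tenx_overlap = percent(22937, 26691)      # genes shared with 10x scRNA-seq

print(f"SIRV isoform detection: {sirv_detection:.1f}%")  # 88.4%
print(f"Gene fold change: {gene_fold:.2f}")              # 2.29
print(f"Isoform fold change: {isoform_fold:.2f}")        # 2.11
print(f"Overlap with 10x genes: {tenx_overlap:.1f}%")    # 85.9%
```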
scRICA-seq revealed the multi-dimensional landscape of the single-cell atlas of human retinal organoids.
In order to unravel the intricate molecular mechanisms underlying cell development and diseases within the central nervous system, it is vital to investigate the comprehensive regulation of chromatin accessibility, RNA transcription, and splicing across different cell types. To accomplish this, we utilized human retinal organoids as a case study and employed scRICA-seq to build a single-cell multi-dimensional atlas and data analysis workflow.
Single-cell dissociation was performed on human retinal organoids cultured for 45 days, and the isolated cell nuclei were subjected to scRICA-seq (Figs. 1b, 3a). Clustering of the single-cell multi-omics data identified 6 clusters, and typical marker genes revealed various cell types, including RPCs, RGCs, AC/HCs, PR precursors, neurogenic RPCs, and cones (Fig. 3b). Notably, the clustering results based on single-cell transcript isoforms were highly consistent with those obtained from scATAC-seq and scRNA-seq (Fig. 3b-d, Extended Data Fig. 5). In addition, the differential expression of marker genes in different cell types, at both the RNA and isoform levels, was consistent with the corresponding chromatin accessibility levels. For example, THRB (refs. 6–8) in rod cells; CRX (refs. 9,10), PDE6H (refs. 11,12), and ARR3 (refs. 13,14) in cones; PAX6 (refs. 15,16) in AC/HCs; and GAP43 (refs. 17,18) and ISL1 (ref. 19) in RGCs showed consistent RNA expression patterns and chromatin accessibility levels (Fig. 3e). This consistency was also observed for fate-determining factors of retinal neurons, such as HES1 (ref. 20) in neurogenic RPCs, SOX2 (refs. 21,22) in RPCs, ATOH7 (refs. 23–25) in RGCs, and OTX2 (refs. 26–28) in PR precursors. Importantly, the RNA expression and chromatin accessibility patterns of these genes were aligned with their activity in different cell types. This suggests that chromatin accessibility plays a role in mediating the expression of marker genes and fate-determining factors.
scRICA-seq unveiled coordinated dynamics of chromatin accessibility and RNA expression during RPC differentiation.
By constructing pseudotime differentiation trajectories for RPCs, neurogenic RPCs, PR precursors, and cones (Fig. 4a), we identified notable alterations in multiple fate-determining factors at both the transcriptional and chromatin accessibility levels during the differentiation of RPCs into cones. These changes encompassed key genes such as ARR3, CRX, THRB, HES1, ASCL1, ATOH7, and NRL (Fig. 4b-d). Although the function of many of these transcription factors as fate-determining factors promoting RPC differentiation is widely recognized, their epigenetic regulation has remained largely unknown. Through direct comparisons of chromatin accessibility and gene expression between RPCs and cones, we observed a simultaneous increase in both the expression levels and chromatin accessibility of fate-determining factors in cones and RGCs when compared to RPCs (Fig. 4e-j, Extended Data Fig. 6a-d). To identify transcription factors specific to different stages of differentiation, we conducted motif enrichment analysis and found significant enrichment of motifs associated with GSC, GSC2, PITX3, OTX2, ATOH7, ISL1, GAP43, and others (Fig. 4k-l). Alongside the previously mentioned fate-determining factors, we identified several novel factors, including GSC and GSC2, which are known for their vital roles in embryonic development (refs. 29–31). We also identified PITX3, which is essential for neuronal and lens development and whose deficiency has been shown to lead to retinal developmental abnormalities and visual impairments (refs. 32,33).
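The text does not specify how motif enrichment was scored. One common formulation, shown here only as an assumed sketch, is a one-sided hypergeometric test that asks whether peaks containing a given motif are over-represented among stage-specific peaks relative to all peaks. The function name and its parameters are hypothetical and are not taken from the analysis described above.

```python
from math import comb


def motif_enrichment_pvalue(n_all, n_all_with_motif, n_stage, n_stage_with_motif):
    """One-sided hypergeometric p-value for motif over-representation.

    n_all:              total number of peaks considered
    n_all_with_motif:   peaks (out of n_all) that contain the motif
    n_stage:            stage-specific peaks (a subset of the n_all peaks)
    n_stage_with_motif: stage-specific peaks that contain the motif

    Returns P(X >= n_stage_with_motif) when n_stage peaks are drawn at
    random, without replacement, from the n_all peaks.
    """
    if not (0 <= n_stage_with_motif <= n_stage <= n_all):
        raise ValueError("stage counts must satisfy 0 <= hits <= n_stage <= n_all")
    if not (n_stage_with_motif <= n_all_with_motif <= n_all):
        raise ValueError("motif counts must satisfy hits <= n_all_with_motif <= n_all")

    upper = min(n_all_with_motif, n_stage)
    total = comb(n_all, n_stage)
    tail = sum(
        comb(n_all_with_motif, i) * comb(n_all - n_all_with_motif, n_stage - i)
        for i in range(n_stage_with_motif, upper + 1)
    )
    return tail / total
```

A small p-value for a motif would indicate that it occurs in stage-specific accessible regions more often than expected by chance. In practice, such p-values would also need correction for testing many motifs.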
In summary, by using scRICA-seq, we have discovered epigenetic regulation of cell fate determinants and other novel transcription factors involved in the differentiation of RPCs into different retinal neurons, such as cones and RGCs.
Chromatin accessibility correlates with the alternative usage of RNA isoforms across different cell types during RPC differentiation.
Additionally, we examined the patterns of RNA isoform usage during the differentiation of retinal progenitor cells (RPCs) into various neuronal cell types. This analysis encompassed variations in alternative promoters, alternative polyadenylation, exon skipping (SE), intron retention (RI), mutually exclusive exons (MXE), and alternative 3’/5’ splice site events (A3SS/A5SS) (refs. 34,35). We first analyzed the variation in transcription start sites (TSS) and transcription end sites (TES) between RPCs and cones, and identified 96 genes with significant variations in TSS and 66 genes with significant variations in TES (Fig. 5a-b). Interestingly, genes that displayed significant variations in TSS, such as THRB, BTBD8, VMP1, and SLF1, showed concurrent changes in their chromatin accessibility profiles (Fig. 5c-d, Extended Data Fig. 6c-d). For instance, THRB exhibited a preference for longer isoforms in RPCs, while shorter isoforms were more prevalent in cones. Moreover, the ATAC signals in the vicinity of the promoters of these shorter isoforms were significantly increased in cones compared to RPCs (Fig. 5c), pointing towards a correlation between isoform selection and chromatin accessibility.
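To make the notion of a TSS usage shift concrete, the following Python sketch computes the fraction of molecules that start at each TSS of a gene in two cell types and reports the difference. It is a minimal illustration under stated assumptions, not the statistical procedure used in this study: the function names are hypothetical, the inputs are assumed to be per-TSS molecule counts, and no significance test is included.

```python
def tss_usage(counts):
    """Convert per-TSS molecule counts for one gene into usage fractions."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("gene has no counted molecules")
    return {tss: n / total for tss, n in counts.items()}


def usage_shift(counts_a, counts_b):
    """Return usage in cell type B minus usage in cell type A, per TSS.

    A TSS absent from one cell type is treated as having zero usage there.
    """
    usage_a = tss_usage(counts_a)
    usage_b = tss_usage(counts_b)
    all_tss = sorted(set(usage_a) | set(usage_b))
    return {tss: usage_b.get(tss, 0.0) - usage_a.get(tss, 0.0) for tss in all_tss}
```

With hypothetical counts of 80 distal and 20 proximal molecules in one cell type, and 30 distal and 70 proximal in another, the proximal TSS gains 0.5 in usage and the distal TSS loses 0.5. A shift of this kind toward a downstream promoter would correspond to a preference for shorter isoforms.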
Moreover, we compared alternative splicing patterns between RPCs and cones, which uncovered exon skipping events in 463 gene loci. Specifically, 274 loci exhibited exon skipping in RPCs, while 189 loci exhibited exon skipping in cones (Fig. 5e). In the differentiation process from RPCs to cones, we identified 93 gene loci with mutually exclusive exons and 212 loci with intron retention (Fig. 5f-g). Furthermore, we found 129 gene loci with alternative 3’ splice site (A3SS) events and 83 gene loci with alternative 5’ splice site (A5SS) events (Fig. 5h-i). When comparing the variation in splice sites with the corresponding chromatin accessibility sites, we found that the majority of splicing changes were independent of chromatin accessibility. This observation was particularly evident in genes that encoded RPC-specific isoforms (e.g., GOPC, DCTN2, PTCD3) and cone-specific isoforms (e.g., RABEPK, BUD31, GUF1) (Extended Data Fig. 6e-j). Interestingly, we also found that a small proportion (0.92%) of isoforms exhibited splicing changes that corresponded to changes in chromatin accessibility at the alternative splice sites (Fig. 5j). Examples of such genes include PPP4R3A and ERI3 (Fig. 5k-l). This finding suggests that, for a small subset of isoforms, chromatin accessibility may be related to RNA splicing.
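Exon skipping between two cell types is often summarized by the percent spliced in (PSI) of an exon and its difference between conditions. The Python sketch below shows that calculation under the assumption that exon-including and exon-skipping molecules have been counted per cell type. The function names are hypothetical, and the text does not state that PSI was the metric used in this analysis.

```python
def psi(inclusion, skipping):
    """Percent spliced in: fraction of molecules that include the exon.

    Returns None when the exon has no informative molecules.
    """
    if inclusion < 0 or skipping < 0:
        raise ValueError("counts must be non-negative")
    total = inclusion + skipping
    if total == 0:
        return None
    return inclusion / total


def delta_psi(counts_a, counts_b):
    """PSI in cell type B minus PSI in cell type A.

    Each argument is an (inclusion, skipping) pair of molecule counts.
    Returns None if either cell type lacks informative molecules.
    """
    psi_a = psi(*counts_a)
    psi_b = psi(*counts_b)
    if psi_a is None or psi_b is None:
        return None
    return psi_b - psi_a
```

For example, an exon included in 30 of 40 molecules in one cell type (PSI 0.75) and in 10 of 40 in another (PSI 0.25) has a delta PSI of -0.5, indicating more skipping in the second cell type.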