Actual time-series of single-cell transcriptomics reveals circadian clocks as key regulators of cell differentiation in Arabidopsis

The basis of toti/pluripotency is elaborate regulation of cell-cycle progression and cell-fate determination. Circadian clocks are involved in this process, but the underlying mechanisms have not been studied due to technical limitations. In particular, there is a lack of research on the universality of cell differentiation mechanisms in multicellular organisms using plants with high cell-fate plasticity. Here, exploiting in vivo single-cell RNA sequencing and a new actual time reconstitution method, PeakMacth, we analyzed actual time-series of cell reprogramming and differentiation processes at single-cell resolution, and found that the circadian clock modulates cell differentiation via BES1-mediated GSK3 signaling, which has a β-catenin-like function in Arabidopsis. In this pathway, the clock gene LUX in meristematic stem cells directly targets the CYCD and RETINOBLASTOMA-RELATED (RBR) genes, which are commonly involved in cell-cycle progression and cell-fate determination in plants and animals. In addition, the rhythmicity of the circadian clock was associated with cell state, and the establishment of the circadian rhythm preceded cell differentiation. Thus, our study not only reveals the involvement of the circadian clock in the differentiation of plant stem cells but also demonstrates functionally analogous features in the regulatory system of cell differentiation across species.

time perspective delays our understanding of the molecular roles of the circadian clocks on cell differentiation from stem cells and hinders our understanding of the basic mechanisms of toti/pluripotency across species.
Here, we focused on plant cells that exhibit pluripotency and high cell-fate plasticity. By leveraging these properties of plant cells, and by conferring actual time information to pseudo time-series, we analyzed whole cell-differentiation process with high spatiotemporal resolution. We uncovered the molecular basis by which the circadian clock controls plant cell differentiation, and provided a new concept for the functional similarity of signal transduction pathways in the regulation of plant and animal cell differentiation.

Involvement of plant circadian clocks in cell differentiation process
To investigate the involvement of the plant circadian clock in cell differentiation, we applied the vascular cell differentiation induction system, VISUAL 13,14 . In this system, vascular cells (including both xylem and phloem cells) are induced from mesophyll cells through the meristematic stem cells and vascular stem cells within 3 days ( Fig. 1a and Supplementary Fig. 1a,b). Two clock mutants, cca1; lhy; timing of CAB expression 1 (cca1 lhy toc1) and a knockdown of LUX ARRHYTHMO (LUX) / PHYTOCLOCK 1 and BROTHER OF LUX ARRHYTHMO (also known as NOX) by arti cial microRNA (lux nox) 15 showed signi cant defects in vascular cell differentiation ( Fig. 1b and Supplementary Fig. 1c), similarly to a loss of function mutant of BRI1-EMS SUPPRESSOR 1 (BES1), which shows severe defects in vascular cell differentiation in VISUAL 13 . We also con rmed that the clock mutants affect development of vascular bundles, guard cells, and root cells in non-VISUAL conditions ( Supplementary Fig. 1d-f), suggesting that the plant circadian clock is generally involved in the process of cell differentiation. Our analyses thus uncovered requirement of a functional circadian clocks for plant cell differentiation.
The VISUAL assay consists of two steps, reprogramming of mesophyll cells to meristematic stem cells and differentiation of the meristematic stem cells to vascular cells. To determine which step is regulated by the circadian clocks, we next measured expression levels of cell-type-speci c markers. In the wild type (WT), mesophyll cell markers (CAB3 and LHCB2.1) 16 rapidly disappeared after induction, followed by expression of meristematic stem cell markers (RGF6 and HAM2) 17,18 . Afterward, vascular stem cell markers (TDR and ANT) 13,19 and vascular cell markers (IRX3 and VND7) 20 appeared in succession ( Fig.  1c and Supplementary Fig. 1g). Vascular stem cell markers were not fully induced in the cca1 lhy toc1, lux nox, and bes1 mutants, although the reduction of the mesophyll cell markers and the induction of the meristematic stem cell markers in both mutants were comparable to WT ( Fig. 1c and Supplementary Fig.  1g). Overall, marker gene expression revealed that the circadian clock is involved in the step of differentiation rather than reprograming.
Reconstruction of actual time-series of scRNA-seq data Since the expression of the vascular stem cell markers (TDR and ANT) and the vascular cell markers (IRX3 and VND7) overlapped each other ( Fig. 1c and Supplementary Fig. 1g), and vascular cell differentiation proceeded gradually, even in the VISUAL assay ( Supplementary Fig. 1b), it was still unclear when and how the circadian clock induces cell differentiation. To circumvent such low spatiotemporal resolution of bulk analysis, which can be attributed to the crude averaging of various cell types, we performed time-series of scRNA-seq in the VISUAL. Making protoplasts and/or nuclear isolation is the conventional method to perform scRNA-seq in plants [21][22][23] ; however, the protoplast isolation is timeconsuming and potentially stress-inducing, it is not suitable for time-series analysis of the circadian clocks. We therefore obtained total RNA from single cells using glass capillaries in vivo context 24 Table 1).
To separate xylem and phloem cell lineages, we rst applied the Wishbone method, which can order scRNA-seq data with bifurcating developmental trajectories based on an unsupervised clustering 11 . tdistributed Stochastic Neighbor Embedding (t-SNE) of the dataset was represented by a Y-shaped structure, suggesting that the Wishbone can properly reconstruct developmental trajectories of xylem and phloem cells (Fig. 2a). This assertion was further validated by overlaying expression levels of cell-typespeci c markers ( Fig. 2b and Supplementary Fig. 3a). For simplicity, we focused on the xylem cell lineage and applied the Seurat method 12 to improve temporal resolution of the pseudo time trajectory. Seurat allowed us to estimate the most likely location of cells by referring to preset expression patterns of celltype-speci c markers. Clustering of the data obtained by applying 24 reference genes in Seurat yielded four major clusters presumed to be mesophyll cells, meristematic stem cells, vascular stem cells, and xylem cells (Supplementary Fig. 3b and Supplementary Table 2). This classi cation was supported by the enriched GO-terms in each cluster ( Supplementary Fig. 3c).
Actual time-series with interval scale contains temporal information, whereas ordinal scale-based pseudo time-series lacks temporal information. Thus, time-series analysis of scRNA-seq remains a di cult problem 25 and few studies have addressed circadian rhythms in single-cell transcriptomes. To overcome this limitation and to reconstruct actual time-series from single-cell transcriptome datasets, we developed PeakMatch method. The basic concept of PeakMatch is that the timing of signi cant gene expression peaks can be comparable between scRNA-seq data based on pseudo time-series and cpRNA-seq data based on actual time-series ( Supplementary Fig. 4a, see Methods for details). By integrating estimated peak times of 2,217 genes, we reconstructed an actual time-series and succeeded in improving segregation of cell-type-speci c markers (Fig. 2c, d). The half-value width of the gene expression peaks in the reconstructed actual time-series was narrow compared to that of cpRNA-seq (Fig. 2e), demonstrating that PeakMatch reduces the crude averaging effect and improves temporal resolution. Thus, the Wishbone -Seurat -PeakMatch pipeline ( Supplementary Fig. 4b) improved spatiotemporal resolution of scRNA-seq data and enabled us to handle actual time-series data at single-cell resolution. Even though the clock genes were not used as PeakMatch references, the expression rhythms of LHY and CCA1 showed 24 h of rhythmicity (Fig. 3a and Supplementary Tables 3 and 4).

BES1 suppresses LHY expression and induce reprogramming
On the reconstructed actual time-series, periodic oscillations of circadian rhythms were prominent in differentiated cells and ceased in meristematic stem cells in the actual time-series (Fig. 3b), indicating that circadian clock system was dynamically reconstructed during the reprogramming process. To address how the reconstruction of circadian clock system is triggered, we focused on a transcription factor BES1. BES1 plays a pivotal role in the regulation of cell proliferation and cell differentiation 26 , and a bes1 mutant has been shown to severely inhibit vascular cell differentiation in VISUAL 13,27 (Fig. 1b). Like β-catenin in animals, BES1 has an inactive form phosphorylated by GSK3 and an active form that was dephosphorylated 28 . The active form of dephosphorylated BES1 was highly accumulated in the meristematic stem cell state immediately after induction (Fig. 3c). Since chromatin immunoprecipitation (ChIP)-on-chip data of BES1 has shown that BES1 directly suppressed the expression of LHY 29 , we hypothesized that perturbations to the circadian clock by BES1 trigger circadian clock oscillations that are arrested in the meristematic stem cells. To test the hypothesis, we performed ChIP-qPCR and con rmed that the enrichment of BES1-GFP at the G-box and E-box motifs in the LHY promoter (Fig. 3d). The luciferase activity of LHY::LUC was decreased by transient co-expression of bes1-D, a gain-offunction mutant protein 30 , indicating that BES1 acts as a repressor of LHY expression (Fig. 3e). The timing of the accumulation of activated dephosphorylated BES1 was consistent with the timing of the suppression of LHY and CCA1 expressions (Fig. 3f). CCA1 promoter also has BES1 binding site, the G-box and E-box. Overall, BES1 is involved in the reprogramming process through directly modulating LHY expression. In the bes1 mutant, the rhythmic expression of CAB3 regulated by LHY remained after differentiation induction, supporting the involvement of BES1 in the reprogramming process (Fig. 1c).

LUX in the meristematic stem cells simultaneously regulates cell-cycle progression and cell differentiation
The cease of circadian rhythm in stem cells has been similarly observed in mammalian ES/iPS cells [4][5][6][7] . As reprogramming resulted in loss of mesophyll cell identity, PSEUDO-RESPONSE REGULATOR 5 (PRR5) and PRR7 expression was diminished and only slight expression was detected in vascular stem cells and vascular cells (Fig. 4a). On the other hand, EARLY FLOWERING 4 (ELF4) and LUX, components of the evening complex (EC) 31 , expression was not induced in mesophyll cells but was signi cantly induced before vascular stem cell state (Fig. 4a). This tendency was also seen in the correlation coe cient between clock genes and differentiation markers (Fig. 4b). LHY, CCA1, PRR5, and PRR7 were highly correlated with mesophyll markers, whereas EC components LUX and ELF4 tended to be correlated with meristematic stem cell or vascular stem cell markers. Since LHY regulates the expression of the EC components LUX and ELF4 31,32 , and the differentiation induction is inhibited in both bes1 and cca1 lhy toc1 mutants (Fig. 1b), the BES1-LHY pathway explains the induction of LUX in the meristematic stem cells. Given that BES1 acts as an integrated hub in plant growth and development 26 , our data do not rule out the possibility that BES1 regulates LUX induction through other pathways.
To test if reconstruction of circadian clock system precedes differentiation from meristematic stem cells, kinetics of GO-term enrichment scores throughout time course was calculated. GO-terms related to DNA methylation and cell cycle, which are involved in cell differentiation, were signi cantly enriched soon after the induction of ELF4 and LUX. In addition, the reconstruction of circadian clocks occurred more than 24 h before induction of GO-terms associated with secondary cell wall biosynthesis, which indicates completion of vascular cell differentiation (Fig. 4c).
Predominant expression of LUX and ELF4 in the meristematic stem cells implies speci c functions of EC there. To elucidate direct targets of LUX in the meristematic stem cells, we then performed ChIP-seq in the LUX-GFP expressing plants and detected 5,476 peaks (Supplementary Table 5). Among them, genes involved in leading roles in cell-cycle progression and cell differentiation, such as CYCD3;1, RBR, and E2Fc, were signi cantly enriched (Fig. 5a, b). CYCD3;1 is a D-type of cyclin that controls G 1 -S transition and cell differentiation 33 . RBR also controls epigenetic regulation through DNA methylation during cell differentiation 34 , and E2Fc is reported as a key regulator for xylem differentiation 35 . Consistent with ChIPseq, the lux nox mutant showed altered CYCD3;1 and E2Fc expressions (Fig. 5c). Taken together, we concluded that the reconstructed circadian clock simultaneously regulates cell differentiation through ne-tuning of key factors for epigenetic modi cations, cell-fate determination, and cell cycle (Fig. 5d).

Discussion
From actual time-series single-cell analysis, we here revealed that circadian clock was reset during the meristematic stem cell state (Fig. 3a, b). This result explains the continuous phase resetting of CCA1 in the root tip and the rephase of the circadian clocks during the lateral root initiation 36,37 . Also, the expression pro le of circadian clocks was changed dynamically during the late meristematic cell state, prior to differentiation from the meristematic stem cell (Fig. 3b). Similarly, changes in the circadian clock gene expression pro le in calli have been observed 38 , suggesting that reconstruction of circadian clocks during cell differentiation is universal this phenomenon. These results implies that differences in the expression pro les and features of tissue/organ-speci c circadian clocks are established at an early stage of differentiation, and is consistent with the results that the circadian clock regulates the large part of gene expressions, including key factors for cell differentiation 39 . Especially, LUX (probably EC) is highly expressed in the meristematic stem cells and regulates cell differentiation. This approach will further reveal cell-type-speci c functions of circadian clocks. For example, TOC1 also play a signi cant role in regulation of cell-cycle progression 40 , and in our analysis TOC1 had a relatively high correlation coe cient score with meristematic and vascular stem cell markers as with ELF4 and LUX (Fig. 4b).
In many multicellular organisms, the circadian clock controls cell division and differentiation 41 . In plants, this study showed that LUX co-regulated both cell-cycle progression and cell-fate determination in meristem cells, achieving coupling between cell differentiation and cell division (endoreduplication). These frameworks may be extended to animals that have independently acquired circadian clock systems during evolution 1,42 . Indeed, cell differentiation mechanism in plants and animals has functionally analogous features. These include (i) nuclear localization of non-phosphorylated forms of BES1 and β-catenin 28 , (ii) inactivation by GSK3 kinase 43 , (iii) cell differentiation by CYCD / RB1 in animals and CYCD / RBR in plants 44 , and (iv) inverse correlation between circadian clock oscillations and stemness 5 (Fig. 5d). Convergent evolution of molecular mechanisms is also found in innate immunity 45 , developmental hourglass 46 , photoperiodicity, and circadian clock system 1,42 of plants and animals. Since the common mechanism is assumed to be deeply related to the essential of its regulatory system, comparative studies of cell differentiation mechanisms in terms of circadian rhythms will provide a new framework for understanding the basis of totipotency/pluripotency.
Overall, our study not only reveals a new role for the circadian clock in the meristematic stem cells in the regulation of plant differentiation, but also provides a new pipeline to enable actual time-series analysis of single-cell transcriptome and concept for understanding toti/pluripotency.

Plant material and growth conditions
All wild type and transgenic lines used here were Arabidopsis thaliana ecotype Columbia-0 (Col-0). Seeds were surface-sterilized and sown on 0.8% agar plates containing Murashige and Skoog medium with 0.5% sucrose or liquid media as described previously 14   conditions were stained with 20 mg/mL PI and observed using an FV1000 microscope as described above. Meristematic cell numbers were determined from observations of the cortical cells, using confocal microscopy images.

scRNA-seq and cpRNA-seq
For scRNA-seq, the process closely followed the method described by Kubo et al 24 . A cotyledon before and after treatment with VISUAL for the indicated time periods was placed adaxial side down on a glass slide, and xed with an adhesive tape, e.g., cellophane tape. Then the center of cotyledon was cut using a razor blade, and with the aid of a microscope, the contents of a single cell were collected using a glass capillary. Samples were subjected to UMI-tagged sequencing using a NextSeq 500 system (Illumina). Detailed methods are described in Supplementary information. For cpRNA-seq, total RNA was extracted using an RNeasy Plant Mini Kit and subjected to UMI-tagged sequencing, as for scRNA-seq, except that 10 cycles of the PCR ampli cation step were required. Sequence reads were mapped against to the TAIR10 Arabidopsis cDNA sequence by Bowtie 50 with the parameter "-a --best --strata". Gene expression level was quanti ed by counting the absolute number of UMIs mapped to each gene. scRNA-seq data was normalized together with cpRNA-seq using the R/Bioconductor DESeq 51 package (http://bioconductor.org/packages/release/bioc/html/DESeq.html; v.1.34.1).

Wishbone -Seurat -PeakMatch pipeline
To identify scRNA-seq data related to the xylem cell lineage, Wishbone was performed as previously described 11 . Samples after induction in VISUAL were subjected to Wishbone. The rst and third components of principal component analysis (PCA) were used for t-SNE. The xylem lineage was selected according to the expression of xylem cell marker genes on the t-SNE plots.
Clustering of cells was performed using the Seurat R package 12 . In brief, digital gene expression matrices were column-normalized and log-transformed. To obtain a landmark gene set for Seurat, we divided all genes in cpRNA-seq data into 17 groups according to the peak expression time of each gene. The 17 genes showing the highest correlation coe cient with scRNA-seq data in each respective group were selected as landmark genes. In addition to the 17 genes, cell-type-speci c markers (CAB3, LHCB2.1, TDR, AtHB8, IRX3, IRX8, and SEOR1) were also selected as landmark genes. In total, 24 genes were used as a landmark gene set for Seurat.
Finally, we selected the genes whose correlation coe cient between scRNA-seq and cpRNA-seq was more than 0.5. Among the selected genes, 2,217 genes whose sum total of expression levels in the scRNA-seq data were higher than the value equivalent to ten times the cell numbers were subjected to PeakMatch with the following parameters: T = 2, last = 0, intv = 1, inter = 7.

Overview of the PeakMatch algorithm
Let be the set of whole genes under consideration. Discretizing pseudo and actual times into integers for simplicity, we denote by and the sets of available pseudo and actual times, respectively. Suppose that, for each gene , we are given pseudo time-series based scRNA-seq data and actual time-series based cpRNA-seq data , where and represent the gene z's expression levels at a pseudo time p in the scRNA-seq data, and at an actual time a in the cpRNA-seq data, respectively.
To estimate the actual times of gene expressions in the scRNA-seq data, we would like to nd pairs of pseudo and actual times so that the expression levels and are likely to be "comparable" for many genes . Once such pairs are found, we may estimate the actual time of by that of .
The point is that, among the observed gene expression levels, "peaks" are the most important phenomena. Then it is desired that a peak in and a peak in should be matched. It is also required that the pseudo time order should be preserved in the time pairs. To be more precise, whenever a pseudo time p is matched to an actual time a, any pseudo time p′ > p should be matched to an actual time a′ > a.
We formulated the problem of nding such time pairs as the maximum weighted non-crossing matching (MWNCM) problem for a bipartite graph. The problem is polynomially solvable 52 , meaning that it is e ciently solvable from the standpoint of the theory of computational complexity.
We took the bipartite graph so that one vertex subset was the pseudo time set and the other vertex subset was the actual time set . For the edge set, we considered all possible pairs , where we determined the weight of an edge heuristically by how the pseudo time p and the actual time a were comparable peaks. We determined the weight of an edge as follows. For each gene , we decided whether or not the value was within a "peak area" in . We considered that was within a peak area if it was signi cantly larger than a general trend of , which was estimated by an exponential moving average. We set the weight of to a larger value if both and are among peak areas for more genes.
Given the scRNA-seq and cpRNA-seq data, the algorithm PeakMatch constructed the bipartite graph, derived an MWNCM for it, and estimated the actual times of all pseudo times in based on the derived MWNCM.
The ampli ed fragment was cloned into the XhoI site of pENTR1A (no ccdB). After sequence veri cation, this plasmid was recombined with pGWB3 53 , and introduced into Col-0 plants.
The sGFP fragment was inserted just before the stop codon of BES1 by fusion of the two fragments using In-Fusion HD Cloning Kit. After sequence veri cation, this plasmid was introduced into the bes1-3 mutant.
The ampli ed fragment was cloned into the XbaI site of pENTR1A (no ccdB) using an In-Fusion HD Cloning Kit (TaKaRa). In the same way, the coding sequence of LUC+ was ampli ed and cloned into the XhoI-EcoRV site of the plasmid using the following primers: LUC-XhoI-F, 5′-ttcgcggccgcactcgagATGGAAGACG; and LUC-EcoRV-R, 5′-tgaacgattctagatatcTTACACGGCGATCTTTCCGC.
The resulting pENTR1A (LUC-nosT) was used for LHY::LUC. The promoter of LHY was ampli ed from Col-0 genomic DNA using the following primers:

ChIP-qPCR and ChIP-seq
Chromatin immunoprecipitation assays using BES1::BES1-GFP/bes1-3 and LUX::LUX-GFP/lux-4 were performed as described 55 with modi cations. Brie y, 600 mg of seedlings at 8 h after induction in VISUAL were xed in PBS containing 1% paraformaldehyde for 10 min at room temperature, and nuclei and chromatin were isolated. The isolated chromatin was sheared with a Covaris S220 sonicator under these parameters: 4-6°C, 175 W peak power, 5% duty factor, 200 cycles/burst, for 50 s of treatment. To Dunnett's test, *p < 0.05). Representative photos of lignin staining (right, bar = 0.5 mm). c, Expression patterns of cell-type-speci c markers during VISUAL in WT, cca1 lhy toc1, lux nox, and bes1 (n = 3, mean ± s.e.). Green, yellow, purple, and red lines indicate marker genes for mesophyll cells (CAB3), meristematic stem cells (RGF6), vascular stem cells (TDR), and xylem cells (IRX3), respectively. Expression peaks of each respective gene in WT were normalized to 1. White, black, and gray boxes indicate light period, night period, and subjective night period, respectively.   Reconstruction of the circadian clock precedes cell differentiation. a, Expression patterns of clock genes in actual time-series scRNA-seq data. Expression of ELF4 and LUX originates in the meristematic stem cell. Green, yellow, purple, and red bars indicate mesophyll cell, meristematic stem cell, vascular stem cell, and xylem cell states, respectively. Black solid lines are moving average of 2 hours before and after.
Colored clouds are moving standard errors. Gray circles represent the single-cell data. b, Pearson's correlation coe cient between clock genes and cell-type-speci c markers. c, GO-term enrichment during cell-fate transition. The rst peak time of ELF4 and LUX expression is highlighted by the pale blue window. Green, yellow, purple, and red bars indicate mesophyll cell, meristematic stem cell, vascular stem cell, and xylem cell states, respectively. Black solid lines are moving average of 2 hours before and after.
Colored clouds are moving standard errors. Gray circles represent the single-cell data. LUX regulates cell-cycle progression and cell differentiation. a, FDR distributions of genes related to G1-S transition in the candidate LUX target genes. b, Visualization of ChIP-seq data around genes related to clock, cell cycle, cell-fate determination, and epigenetic regulation (bar = 1 kb). Peak counts of reads are shown. c, Expression patterns of CYCD3;1 and E2Fc during VISUAL in lux nox and corresponding WT (n = 3, mean ± s.e.). Peak expression levels of each respective gene in WT were normalized to 1. White, black, and gray boxes indicate light period, night period, and subjective night period, respectively. d, Our model proposes that BES1 represses LHY expression in meristematic stem cells to reconstruct the circadian clock, resulting in phase resetting and restart of the clock with dynamic changes of clock genes expression. LUX in the reconstructed clock modulates key factors for epigenetic regulation, cell-fate determination, and cell cycle, thereby inducing cell differentiation.

Supplementary Files
This is a list of supplementary les associated with this preprint. Click to download.