Single Cell Sequencing Reveals Early PGC-like Intermediates During Mouse Primed to Naïve Transition

Single cell analysis provides clarity unattainable with bulk approaches. Here we apply single cell RNA-seq to a newly established BMP4-induced mouse primed to naive transition (Bi-PNT) system and show that the reset is not a direct reversal of cell fate but proceeds through developmental intermediates. We first show that mEpiSCs bifurcate into c-Kit+ naïve and c-Kit- placenta-like cells, and that the naive branch undergoes a further transition through a primordial germ cell-like cell (PGCLC) intermediate capable of spermatogenesis in vivo. Indeed, deficiency of Prdm1/Blimp1, the key regulator of PGC specification, blocks the induction of both PGCLCs and naïve cells. In contrast, Gata2 knockout arrests the placenta-like fate but facilitates the generation of PGCLCs. Our results not only reveal new cell fate dynamics between primed and naive states at single-cell resolution, but also provide a model system to explore the mechanisms involved in regaining germline competence from primed pluripotency.


INTRODUCTION
Pluripotency refers to the ability to generate all cell types in an individual. In mice, it is well established that two distinct, yet interchangeable, states exist, namely naïve and primed pluripotency, represented by ESCs (embryonic stem cells) and EpiSCs (epiblast stem cells), respectively 1-3 . Despite the similarities between ESCs and EpiSCs in gene expression profile and in vitro differentiation potential, they differ markedly in their morphology, their ability to colonize blastocysts and the conditions required for their maintenance in vitro 1 .
From an embryological perspective, naive and primed pluripotency appear to represent the developmental potentials of pre- and post-implantation embryos, respectively 1 . Therefore, a detailed understanding of both states may inform our general understanding of early embryo development.
The interconvertibility between naive and primed states has been well documented 4-8 . For example, naive ESCs can be differentiated into a primed state closely resembling EpiSCs 9,10 . Likewise, EpiSCs have been converted to a naive state similar to mESCs 4,5 . Unlike the differentiation of ESCs to the primed state, the reprogramming of EpiSCs towards the naive state has largely been accomplished with the help of transcription factors (TFs) 5,11-16 , similar to the iPSC (induced pluripotent stem cell) strategy pioneered by Yamanaka and colleagues 17 . TFs are powerful cell fate regulators that can overcome barriers established during development through multiple processes, including chromatin remodeling, metabolic switches and cell morphology changes. Recently, we achieved a BMP4-dependent primed to naive transition (PNT) in a manner similar to reprogramming, and showed that a 3-day exposure of EpiSCs to BMP4, a Dot1L inhibitor and an EZH2 inhibitor, followed by a 5-day culture in 2iL, can convert the majority (~80%) of the primed cells into chimera-competent naive colonies 18 . Through bulk RNA-seq and ATAC-seq analyses, we demonstrated that many downstream targets of BMP4, including the previously unrecognized Zbtb7 family members, are activated through chromatin remodeling events during the resetting of primed to naïve states 18 . Nevertheless, it remains unclear whether intermediates govern the BMP4-driven resetting of the primed fate into a naive one, and what the precise cell dynamics are during this process, given that distinct molecular routes have been shown to mediate the acquisition of naïve pluripotency with different TFs and signals 19 . In this report, taking advantage of single cell RNA-sequencing (scRNA-seq), we describe the cell fate continuum between primed and naive states, including intermediates previously masked in bulk analysis.
Specifically, our results reveal that a Prdm1-dependent, germline-competent PGC-like cell state mediates PNT, which provides new insight into the acquisition of naïve pluripotency and germline competence from primed pluripotency.

Single cell atlas of BMP4 induced primed to naive transition (BiPNT)
We have established a robust BMP4-dependent primed to naive transition (PNT) system consisting of two stages, a 3-day incubation with BMP4, EPZ5676 and EPZ6438 followed by a 5-day culture in 2iL 18 , that can convert ~80% of EpiSCs to chimera-competent naive colonies. To map the cell fate continuum, we performed scRNA-seq on cells collected at D0, D1, D2, D3, D5 and D8 (Fig. 1a). t-SNE analysis yielded well-segregated clusters of cells for each day (Fig. 1b), indicating distinct stages of cell fate during PNT. Indeed, distinct expression profiles can be observed from D0 to D8 for selected genes characteristic of the primed and naive states (Fig. 1c).
To further identify the cell populations at the two stages, we first analyzed single cells from D0-3 (stage 1) and found that many cells emerging at D3 express naive pluripotency program genes (Fig. 1d). Among them, Klf2, Dppa5a and Dppa3 are almost exclusively expressed in the D3 cluster (Fig. 1d, lower panels). Interestingly, we also found that some D3 cells have a placenta-like program, expressing genes such as Plac1, Gata2, Krt8 and Peg10 (Fig. 1d, middle panels). These results suggest that mEpiSCs undergo fate transitions towards placenta-like and naïve-like cells at stage 1.
2iL is a well-known condition for maintaining the naive pluripotent state 20 . When the BMP4-treated cells are switched to 2iL, they are expected to mature further towards naive pluripotency. Indeed, scRNA-seq reveals a widespread naive pluripotency program among D5 and D8 cells (Fig. 1e), suggesting that 2iL further reprograms the D3 cells into the naive state. We also observe minor placenta-like populations in both D5 and D8 cultures (Fig. 1e).
We then plotted the cells with naive- and placenta-like programs to show that BMP4 triggers almost equal numbers of cells for each fate up to D2, after which the naive fate continues to increase while the placental fate levels off (Fig. 1f). As expected, 2iL continues to support the expansion of the naive fate but begins to diminish the placental one at D5 and D8 (Fig. 1f). We therefore conclude that two primary fates emerge in BMP4-treated mEpiSCs and that 2iL favors the naive fate while partially suppressing the placenta-like one (Fig. 1f).
Gata2, which is highly expressed in placental cells but not in naïve pluripotent cells, plays an important role in trophoblast development 21,22 . To trace both naïve- and placenta-like cells, we constructed an EpiSC line carrying Oct4-GFP/Gata2-Tdtomato double transgenic reporters (Extended Data Fig. 1a). We found that expression of the naive pluripotency reporter Oct4-GFP and the placental reporter Gata2-Tdtomato is mutually exclusive at the late stage of PNT (Fig. 1g and Extended Data Fig. 1b). These results suggest that BMP4 triggers primed EpiSCs to bifurcate into two distinct fates, towards naïve-like or placenta-like cells.
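The per-day tally of naive-like and placenta-like cells summarized in Fig. 1f can be illustrated with a minimal sketch. The marker lists are the genes named in the text for each program; the labeling rule (mean marker expression with a cutoff of 1.0), the function names and the input layout are assumptions made for illustration and are not the authors' method.

```python
from collections import Counter, defaultdict

# Marker genes named in the text for each program.
NAIVE_MARKERS = ("Klf2", "Dppa5a", "Dppa3")
PLACENTA_MARKERS = ("Plac1", "Gata2", "Krt8", "Peg10")


def mean_expr(cell, genes):
    """Mean normalized expression of `genes` in one cell (missing genes count as 0)."""
    return sum(cell.get(g, 0.0) for g in genes) / len(genes)


def assign_program(cell, threshold=1.0):
    """Label a cell by its dominant marker program.

    `threshold` is a hypothetical cutoff: cells whose best program score
    falls below it are labeled "other".
    """
    naive = mean_expr(cell, NAIVE_MARKERS)
    placenta = mean_expr(cell, PLACENTA_MARKERS)
    if max(naive, placenta) < threshold:
        return "other"
    return "naive-like" if naive >= placenta else "placenta-like"


def program_fractions_by_day(cells):
    """cells: iterable of (day, {gene: expression}) pairs.

    Returns {day: {label: fraction of that day's cells}}.
    """
    counts = defaultdict(Counter)
    for day, expr in cells:
        counts[day][assign_program(expr)] += 1
    return {
        day: {label: n / sum(c.values()) for label, n in c.items()}
        for day, c in counts.items()
    }
```

Plotting the returned fractions against the sampling day would give a curve of the kind described for Fig. 1f; in practice the cutoff would need to be tuned to the normalization used.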

c-Kit marks the naive branch
To further resolve the apparent naive versus placental fate choices at D2 and D3, we re-analyzed the scRNA-seq data with Harmony 23 , a program designed to resolve branches or trajectories along a cell fate continuum. As expected, Harmony resolves the cells into two main branches, the placental branch (PB) and the naïve branch (NB) (Fig. 2a).
While the PB is enriched with imprinted genes such as Igf2, Peg10 and H19 and placental genes such as Gata2 and Tead3, the NB expresses mostly naive-related genes such as Dppa5a, Klf2 and Nanog (Fig. 2b). Consistent with this naive-placenta dichotomy, the PB is enriched with genes from the placenta-like fate while the NB is enriched with those from the naive-like fate (Fig. 2c).
We then searched for cell surface markers that may distinguish the NB from the PB. Among a group of cell surface markers (Extended Data Fig. 2a), we found that c-Kit is a good candidate (Fig. 2d). c-Kit expression begins at D1 and persists in the NB (Fig. 2d). When plotted along pseudotime, c-Kit expression clearly diverges early in PNT: the NB maintains it while the PB loses it (Fig. 2e), suggesting that c-Kit may help mark the NB. To test this idea, we sorted D3 cells with c-Kit-APC and found that ~50% of D3 cells are c-Kit positive (c-Kit+) (Fig. 2f). We then replated the c-Kit+ and c-Kit- cells separately and continued culturing them in 2iL; almost all (94.2%) of the c-Kit+ cells became GFP positive, compared to 0.7% of the c-Kit- cells (Fig. 2g). We further performed RNA-seq to probe the difference between the c-Kit+ and c-Kit- populations (Fig. 2h), and showed that 1102 genes are highly expressed in c-Kit- cells, including the placenta markers Plac1, Gata2 and Krt8, while 1065 genes are highly expressed in c-Kit+ cells, including the naïve markers Pou5f1, Sox2, Nanog, Esrrb and Klf2, the mesoderm marker T and the primordial germ cell (PGC) marker Prdm1 (Fig. 2h). qRT-PCR analysis further confirmed the expression of these naïve pluripotency markers in c-Kit+ cells and of placenta markers in c-Kit- cells (Extended Data Fig. 2b). Consistently, gene ontology (GO) analysis reveals that programs such as embryonic placenta development, endothelial cell migration and placenta development are highly enriched in c-Kit- cells, while those involved in gastrulation, the BMP signaling pathway and the non-canonical Wnt signaling pathway are highly enriched in c-Kit+ cells (Fig. 2i). Intriguingly, since PGC markers such as T and Prdm1 are specifically enriched in c-Kit+ cells (Fig. 2h), we further examined the expression of other PGC-related markers by qRT-PCR and showed that Dppa3, Nanos3, Ifitm1 and Ifitm3 are also highly expressed in the c-Kit+ cells (Fig. 2j). We further performed ATAC-seq to compare chromatin accessibility between the c-Kit+ and c-Kit- populations, as we described before 24,25 . We identified 16806 peaks specifically open in c-Kit- cells, including gene loci such as Gata2, Plac1, Elf5, Krt18 and Igf2 (placenta markers) (Extended Data Fig. 2c,d), and 13048 peaks specifically open in c-Kit+ cells, including gene loci such as Prdm1, Prdm14, Nanos3

Activation of a PGC-like program during Bi-PNT
Since the c-Kit+ cells are also enriched for PGC markers (Fig. 2j), we further examined the detailed cell fate continuum along the naive branch. We generated a pseudotime plot of the NB during PNT and found that it can be divided into 4 distinct phases (I-IV) (Fig. 3a). The first phase (I) is mainly cells expressing genes of the primed fate, such as

Characteristics of BV+SC+ PGCLCs
To further characterize the Day6 BV+SC+ PGCLCs, we performed RNA-seq and compared our data to published datasets containing Day4/Day6 in vivo PGCLC or primary PGC (E9.5, 11.5, 13.5) 32,33 . PCA revealed that the transcriptomes of the BV+SC+ cells are similar to those of D4 and D6 PGCLCs and, to a lesser extent, E9.5 PGCs (Fig. 4a). We then evaluated the epigenetic profile of Day6 BV+SC+ cells. IF analysis revealed reduced H3K9me2 and increased H3K27me3 (Fig. 4b), which was further confirmed by western blot (Fig. 4c) and is consistent with a previous report 31 . We further determined the imprinting states of maternally (Snrpn) and paternally (H19) imprinted genes in Day6 BV+SC+ cells and showed that, whereas DNA methylation at the Snrpn locus was retained, methylation at the H19 locus was significantly reduced, suggesting that Day6 BV+SC+ cells may be undergoing imprint erasure (Fig. 4d). Consistently, qRT-PCR analysis also showed the
Notably, three months after transplantation, tubules with normal spermatogenesis, expressing spermatogonia markers (DDX4 and PLZF), spermatocyte markers (SYCP3 and γH2AX) and spermatid markers (DDX4 and PNA), were detected in two out of six testes sections transplanted with BV+SC+ cells (Fig. 4e, f and Extended Data Fig. 4d), whereas only Sertoli cells were found in the tubules of the control group. Together, these data indicate that the Day6 BV+SC+ cells are germline-competent PGC-like cells.

Prdm1-KO blocks the generation of PGCLC and naïve pluripotency fate
To test whether the PGCLCs are intermediates in the acquisition of naïve pluripotency, we generated OG2-EpiSCs deficient in Prdm1 (Prdm1-KO or Prdm1-/- mEpiSC) (Extended Data Fig. 5a,b), a factor that is obligatory for PGC specification but not for the derivation of pluripotency 38,39 . 5c). By comparing WT and Prdm1-/- samples using UMAP, we show that the separation between WT and Prdm1-/- populations mainly occurs at Day2-3 (Fig. 5d).
We then analyzed the expression of cell fate markers for PGC, placenta and naïve pluripotency by UMAP in these populations (Fig. 5e), and show that: 1) PGC markers, especially Nanos3 and Dppa3, are mainly present in subpopulations of WT samples (Fig. 5e, first row); 2) the naïve markers are exclusively expressed in a subgroup of the WT population but not in Prdm1-/- cells (Fig. 5e, second row); 3) the placenta markers, such as Plac1, Igf2, Ahnak and Peg10, are in fact more enriched in a subpopulation of Prdm1-/- samples than in the WT (Fig. 5e, bottom row).
These data demonstrate at single-cell resolution that, upon loss of Prdm1, EpiSCs fail to enter the PGCLC fate and the naïve pluripotent state but retain a placenta-like fate. In addition, we examined the DNA methylation state of the naïve-like cells (two naïve colonies: Naïve-1# and 2#) and showed that they had lost DNA methylation at the imprinted loci Peg1, Peg3ß and Snrpn (Extended Data Fig. 5g). Therefore, these data suggest that Prdm1-dependent PGCLCs are an intermediate for successful BiPNT.

GATA2 gates the placenta-like program while hampering the PGCLC program
To determine the key event for early cell fate choice during BiPNT, we reanalyzed the scRNA-seq data at D0, D1 and D2 by UMAP and found significant cell diversity at D2 (Fig. 6a). A heatmap of representative genes shows activation of placenta signaling (Gata2, Plac1) as early as Day1 (Fig. 6b) and diversification of pluripotency signaling (Dppa3, Nanog) at Day2 (Fig. 6b), suggesting an early cell fate choice. To further investigate the key factor(s) regulating this cell fate choice, we used pySCENIC 40 to predict the regulons of each cell subgroup (Extended Data Fig. 6a). The heatmap showed widespread activation of early BMP4-response regulons, such as Id1, Cdx2 and Tfap2c, and separate patterns for the Prdm1 (PGC) and Gata2 (placenta) regulons in D2 populations (Fig. 6c), corresponding to the final naïve pluripotent and placenta-like trajectories, respectively. Consistently, the expression of Gata2 and the trophoblast marker Elf5 was restricted to the placenta branch (Extended Data Fig. 6b) Fig. 6c, d). These two cell lines express comparable levels of primed pluripotency markers (Extended Data Fig. 6e). During BiPNT, FACS analysis showed a significant increase in BV+ cells at Day3 and BV+SC+ cells at Day6 in Gata2-KO groups compared with WT (Fig. 6d). Consistently, qRT-PCR analysis further shows failed activation of placenta signaling genes such as Plac1, Peg10 and Phlda2, but a significant increase in PGC-like signaling genes such as Prdm1, Prdm14, Dppa3 and Nanos3, in Gata2-KO cells compared with WT cells (Fig. 6e). These data suggest that Gata2-KO turns off the placenta-like program while facilitating the PGC-like program.

DISCUSSION
In this report, by applying scRNA-seq to a previously established PNT process driven by BMP4 (BiPNT) 18 , we generated a cell fate continuum between primed and naive pluripotency states. Surprisingly, we found that BiPNT, at the single-cell level, goes through previously unexpected cell fate stages, i.e., a Prdm1+ PGC-like stage, before the establishment of the naïve pluripotent cell fate. We also identified two key regulators: Prdm1/Blimp1, a primordial germ cell (PGC) determinant, whose depletion totally blocks the derivation of naïve pluripotency; and Gata2, a transcription factor necessary for regulation of the trophoblast program, whose depletion turns off the placenta-like program while facilitating the PGC-like program (Fig. 7). Interestingly, by re-analyzing our published data, we found that Zbtb7a/b, the novel targets of BMP4 18 , also regulate the expression of some PGC genes during PNT, such as c-Kit, Prdm1 and Nanos3, at the RNA and chromatin levels (Extended Data Fig. 7a
However, studies have also proposed that the BMP4-induced cells in EpiSCs and hESCs may actually correspond to a mesoderm identity 48 . Intriguingly, the placenta-like cells induced during BiPNT express Elf5, a key transcription factor in trophoblast lineage determination (Extended Data Fig. 6b), suggesting that these cells may be trophoblast
scRNA-seq analysis was adopted from 10x Genomics.

Derivation of EpiSCs from mouse BVSC-ESCs
Mouse BVSC-ESCs were dissociated into single cells using 0.05% Trypsin-EDTA and plated at a density of 2-5 x 10^4 cells per well of a 12-well plate coated with FBS in N2B27-2iL medium. The next day, the medium was changed to FAX medium for 3 days, and the cells were then re-plated at a ratio of 1:10. Cells were passaged with Accutase every 3 days.
Experiments were performed between P8 and P20.

BMP4 induction of EpiSCs into naïve state
Reprogramming of EpiSCs into naïve state was performed with our protocol published

Bisulfite Sequencing
The genomic purification kit (Promega) was used to isolate genomic DNAs, and an

Single-Cell RNA-seq Data Analysis
Fastq reads were aligned to the genome using STARsolo 49 with the settings '--outSAMattributes NH HI AS nM CR CY UR UY --readFilesCommand zcat --outFilterMultimapNmax 100 --winAnchorMultimapNmax 100 --outMultimapperOrder Random --runRNGseed 777 --outSAMmultNmax 1'. The count matrix was lightly filtered to exclude cell barcodes with low numbers of counts: cells with fewer than 1000 UMIs, fewer than 500 detected genes, or a mitochondrial count fraction above 20% were removed. The filtered matrix was normalized using scran 50
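The barcode filter described above can be expressed as a short sketch. The three thresholds come from the text; the function name, the per-barcode dictionary input and the use of the "mt-" gene-symbol prefix to identify mitochondrial genes are assumptions made for illustration, and the authors' actual filtering code may differ.

```python
def passes_qc(gene_counts, mito_prefix="mt-",
              min_umis=1000, min_genes=500, max_mito_frac=0.20):
    """Return True if one cell barcode passes the QC thresholds in the text.

    gene_counts : dict mapping gene symbol -> UMI count for this barcode.
    mito_prefix : assumed prefix marking mitochondrial genes (hypothetical
                  choice; adjust to the annotation actually used).
    """
    total_umis = sum(gene_counts.values())
    if total_umis == 0:
        return False  # empty barcode, also avoids division by zero below
    genes_detected = sum(1 for c in gene_counts.values() if c > 0)
    mito_umis = sum(c for g, c in gene_counts.items()
                    if g.startswith(mito_prefix))
    return (
        total_umis >= min_umis                      # drop cells with < 1000 UMIs
        and genes_detected >= min_genes             # drop cells with < 500 genes
        and mito_umis / total_umis <= max_mito_frac  # drop cells with > 20% mito
    )
```

Applying `passes_qc` to every barcode and keeping only those that return True yields the filtered matrix that would then be passed on to normalization.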