To characterize the Dido1 expression pattern in different human and mouse tissues, cell lines, and cancer types, we retrieved publicly available data from web bio-sources. Publicly accessible expression data using the Genevisible tool (https://genevisible.com/search). The Affymetrix microarray data (Mouse Genome 430 2.0 and Human Genome U133 Plus 2.0) explored with Genevisible web bio-source (https://genevisible.com) indicated the highest Dido1 mRNA expression in the nervous and hematopoietic systems (Suppl Fig. 1).
Loss of the DIDO3 protein results in anemia and severe peripheral lymphopenia. To examine the in vivo role of DIDO3 in hematopoiesis, we deleted exon 16 of Dido1 in murine HSC early in development. Mice homozygous for the floxed allele (Dido1flE16/flE16) were crossed with mice carrying one functional and one mutated allele of Dido1 (Dido1∆E16, which lacks the DIDO3 C terminus) and the Vav1:iCre transgene. The resulting Dido1flE16/∆E16;R26R-EYFP;iCreVav1 mice (hereafter Dido1∆E16) were used as experimental subjects, and Dido1flE16/+;R26R-EYFP;iCreVav1 littermates (hereafter wt) were used as controls, except where otherwise specified. Ablation of Dido1 exon 16 in early development was confirmed by qPCR on DNA samples from sorted lineage (lin)–Kit+Sca1+(LSK) cells derived from bone marrow, which showed >99% reduction in Dido1E16 DNA content (Suppl. Fig. 2A). Hematologic analysis showed a significant decrease in cellular components when Dido1∆E16 mice were compared with heterozygous littermate controls. White blood cell numbers dropped to 33% of wt (1.71 ± 1.32 vs. 5.11 ± 1.71 x 103 cells/ml, n = 8, p<0.001) and red cell numbers to 92% of wt values (9.65 ± 0.36 vs. 10.54 ± 0.51 x 106 cells/ml, n = 8, p<0.01). Within the leukocyte fraction, the lymphocyte subpopulation was responsible for the defect observed, with a reduction in cell numbers to 27% of wt (1.02 ± 1.20 vs. 3.84 ± 1.61 x 103 cells/ml, n = 8, p<0.0001)(Suppl. Fig. 2B). Peripheral blood samples from wt and Dido1∆E16 mice were analyzed by flow cytometry for EYFP reporter expression; >97% of circulating cells were EYFP-positive in both lymphoid and myeloid gates, as determined by forward scatter (FS) vs. side scatter (SS) gating (Suppl. Fig. 2B).
We calculated total numbers of various cell subsets in peripheral blood from complete blood count and flow cytometry data. Dido1∆E16 mice had significantly reduced lymphocyte subset counts, including B220 (58 ± 26 vs 2675 ± 423 cells/µl in control samples, n = 6, p<0.0001), CD4 (301 ± 123 vs. 749 ± 232 cells/µl, p<0.01), CD8 (109 ± 48 vs. 423 ± 87 cells/µl, p<0.001), and NK1.1 (50 ± 15 vs. 163 ± 69 cells/µl, p<0.05), while monocyte (90 ± 8 vs 90 ± 18 cells/µl) and granulocyte numbers (590 ± 348 vs. 550 ± 161cells/µl) were unaffected (Suppl. Fig. 2C).
Bone marrow analysis
As Dido1 was previously linked more closely to development and stem cell differentiation functions, we analyzed the B cell lineage in bone marrow. We used flow cytometry to identify lin–Sca1+Kit+IL7R– (LSK) and common lymphoid progenitor lin–Sca1+Kit+IL7R+ (CLP) cell populations (Fig. 1A, B), and found no significant differences in the relative proportions of each. Dido1∆E16 mice nonetheless showed reduced total bone marrow cellularity and significantly lower absolute numbers of LSK cells (Fig. 1C). From Dido1∆E16 mice, we obtained 2.85 ± 0.83 x 104 lin– cells, vs. 3.93 ± 0.45 x 104 from wt mice (n = 6, t-test ** p<0.01). The numbers for LSK (2.09 ± 0.64 x 104 vs. 3.34 ± 1.0 x 104, t-test * p<0.05) and CLP cells (1.13 ± 0.50 x 103 vs. 1.92 ± 1.17 x 103, t-test p>0.05) were also reduced.
We analyzed the distribution of the distinct B cell precursor subpopulations, namely prepro‑B, pro-B, pre-B, and immature B cells (Fig. 2A). The number of B220+ cells was substantially lower in Dido1∆E16 mice than in the controls (Fig. 2B). The populations most affected were pre-B and immature B cells, although the number of prepro-B and pro-B cells was similar in mice of both genotypes (Fig. 2C).
Functional analysis of bone marrow precursors
Signaling from the pre-BCR complex promotes the survival and proliferation of cells that have successfully rearranged the Igh locus. This is a first checkpoint in which those rearrangements that give rise to a non-functional heavy chain lead to death by apoptosis. In a second checkpoint, immature B cells whose BCR is activated by an autoantigen in the bone marrow can undergo receptor editing to modify its light chain, or they can be deleted by apoptosis. To detect cell death in distinct precursor populations, we used flow cytometry to analyze bone marrow samples from Dido1∆E16 and wt mice, labeled with appropriate antibodies and with annexin V and DAPI. Apoptosis levels were slightly higher, although non-significant, in Dido1∆E16 mice. In the immature B cell population (EYFP+B220+CD19+IgMlow), both the number of apoptotic cells and the annexin V mean fluorescence intensity (MFI) in bone marrow of Dido1∆E16 mice showed a significant increase relative to controls (Fig. 3A, B). This result suggests that DIDO3 affects the size of the immature B cell population in Dido1∆E16 mice by increasing the number of cells that undergo apoptotic processes at this stage. Our data support the idea that DIDO3 is necessary to overcome the BCR signaling-dependent checkpoint.
We used BrdU (5-bromo-2'-deoxyuridine) incorporation in DNA in vivo to identify and examine proliferating cells, and DAPI to determine DNA content and cell cycle stages. The percentage of cells that incorporated BrdU after 18 h post-intraperitoneal inoculation was quantified for pro-B/pre-B and immature/mature cells. Cells from Dido1∆E16 mice in pro-B/pre-B states showed BrdU incorporation similar to that of control mice (Fig. 3C). In the immature cell population, there was a non-significant increase in the percentage and the MFI of cells that incorporated BrdU (Fig. 3D) (MFI: pro-B/pre-B: Dido1∆E16 = 4.25 ± 1.232 vs. wt = 4.68 ± 0.852; t test: p = 0.52; n = 3; immature/mature B cells, Dido1∆E16 = 3.92 ± 1.144 vs. wt = 2.05 ± 0.443; t test: p = 0.09; n = 3). The percentage of S phase cells of the pro-B/pre-B populations of both mice was similar, although an increase in the percentage of Dido1∆E16 mature/immature B cells was observed in this phase (Fig. 3D). The increase in the percentage of immature B cells in the S phase in Dido1∆E-16 compared to control mice and the increase in cells that undergo apoptosis suggest that immature B cells from Dido1∆E16 mice undergo an S phase block that eventually leads to apoptosis.
In vitro differentiation assays were carried out to determine whether the Dido1∆E16 mouse B cell precursors showed alterations in their capacity to differentiate towards the B lineage. We purified cells lacking lineage-specific markers (lin–) by negative selection and plated them in a semi-solid matrix and medium optimized for B cell development, supplemented with IL7. In cultures from wild type mice, between 25 and 35 colonies were counted in plates seeded with 2 x 106 lin–cells/ml; between 5 and 10 smaller cell groups/plate were also registered. No colonies were found in plates seeded with lin– precursors from Dido1∆E16 mice, although we detected some clusters consisting of a small number of disaggregated and undifferentiated cells (Suppl. Fig 3). B lineage cells (B220+) from the conditional mutant mice thus showed differentiation defects when stimulated in vitro, with blocking of proliferation and maturation towards B precursors in these conditions.
Hematopoietic precursors from Dido1∆E16 mice failed to repopulate the bone marrow of irradiated recipient mice
The small numbers of early hematopoietic precursors in Dido1∆E16 mice and their failure in in vitro differentiation assays led us to test whether their function was affected in vivo in the absence of DIDO3. As transplant recipients, we used nine lethally irradiated B6-CD45.1 mice (10 Gy); at day 1 post-irradiation, the mice received a retro-orbital injection of 150 µL of a mixture of total bone marrow cells (3.75 x 106) from CD45.2+EYFP+Dido1∆E16 and CD45.2+EYFP–Dido1+ donor mice, at a 1:1 ratio (Suppl. Fig. 4).
Peripheral blood was monitored at day 14 post-inoculation; we observed a lower CD45.2+EYFP+Dido1∆E16 cell contribution to circulating blood populations; of the blood cells, 12.78 ± 1.97% were CD45.2+EYFP+Dido1∆E16 and 80.13 ± 3.43% were CD45.2+EYFP–Dido1+. A further 6.73 ± 3.45% were endogenous surviving cells (CD45.1+), mainly of the CD3+ phenotype, reported to be thymus-derived radioresistant cells21 (Fig. 4A). At day 21 post-inoculation, ~25% of circulating leukocytes were B cells, most of which were CD45.2+EYFP–Dido1+, with only 0.05% CD45.2+EYFP+Dido1∆E16 and 0.01% CD45.1 B cells. This imbalance was partially anticipated, given the severely biased composition of the transplanted bone marrow, with higher percentages of pre-B and immature B progenitors in wt mice.
At 28 days post-transplant, four mice were sacrificed and their bone marrow analyzed by flow cytometry. The highest percentage of hematopoietic cells observed was CD45.2 (5.91 ± 6.26% CD45.2+EYFP+Dido1∆E16; 55.64 ± 9.42% CD45.2+EYFP–Dido1+), with a residual percentage of CD45.1 cells (0.99 ± 0.49%). We also verified that only 3.02 ± 1.96 x 104 cells were CD45.2+EYFP+Dido1∆E16 vs. 2.24 ± 1.21 x 106 CD45.2+EYFP–Dido1+ cells; 8.45 ± 6.49 x 103 CD45.1 cells were also found. This result suggests a lower level of nesting and/or expansion capacity in DIDO3-deficient cells.
At 40 days post-transplant, the five remaining mice were sacrificed and bone marrow populations were analyzed by flow cytometry (Fig. 4B). From 6.04 ± 1.33 x 105 lin– cells/femur, only 1.68 ± 1.13 x 104 cells/femur were CD45.2+EYFP+Dido1∆E16, and 1.64 ± 0.41 x 104 cells/femur were CD45.1+.
The bone marrow analysis yielded means of 2.25 ± 1.87 x 103 LSK and 1.31 ± 1.25 x 102 CLP CD45.2+EYFP+Dido1∆E16 cells, significantly lower than the 1.91 ± 0.80 x 104 LSK and 8.58 ± 2.34 x 102 CLP CD45.2+EYFP–Dido1+ cells. B cell precursors were identified by B220 expression and encompass cell populations from the prepro-B phase to immature and mature B cells. Approximately 99% of B220+ cells were CD45.2EYFP–Dido1+ (2.15 ± 1.01 x 106 cells/femur), and only 2.96 ± 1.90 x 104 cells/femur were CD45.2+EYFP+Dido1∆E16 (Fig. 4C). These data indicate that, relative to LSK-wt cells, the LSK-Dido1∆E16 cells and the remainder of the B lineage progenitor cells had a cell-autonomous disadvantage both in nesting and in repopulating the bone marrow of irradiated mice.
Identification of essential regulators of stem cell function and the B cell pathway by ATAC-seq analysis of bone marrow LSK cells
Although the proportion of LSK cells within the lin– population is similar in wt and Dido1∆E16 mice, adoptive transfer experiments showed a lower repopulation capacity, especially considering that this undifferentiated population was overrepresented in the mutant fraction of the inoculum. As DIDO1 was demonstrated to be a chromatin interactor that recognizes accessible regions of chromatin with H3K4me3 histone marks, we analyzed this population by Omni-ATAC-seq22. This technique detects differences in chromatin accessibility, as assessed by sensitivity to DNA digestion by transposase Tn5 in a process termed “tagmentation”. The tagmented fragments were subjected to massive sequencing, and two biological replicas with accessible regions in the range of 104 were used for differential analysis between wt and Dido1∆E16 LSK cells.
Accessible regions were annotated with the ChIPseeker program23 by associating the position of the accessible regions in each replica with the structure of the genes (promoter, 5’-UTR, exon, intron, 3’-UTR, or intergenic) in the mouse reference genome assembly GRCm38/UCSC mm10. After filtering low quality reads, an average of 52 million high quality reads were obtained, and 96.1% clean reads were mapped to the GRCm38/UCSC mm10 assembly using Bowtie2 (bowtie-bio.sourceforge.net/bowtie2/) and in parallel BWA-MEM 0.7.15 (http://bio-bwa.sourceforge.net). The largest percentage of accessible regions was concentrated in gene promoters and introns and in intergenic regions, in both LSK-wt and LSK-Dido1∆E16 cells (Fig. 5A). In the Dido1∆E16 samples, however, there was a relative reduction in the number of introns and intergenic regions relative to promoters, which was related to the preferential association of DIDO1 to enhancer sequences over promoters24.
A Gene Set Enrichment Analysis (GSEA, http://www.gsea-msigdb.org) of the ATAC-seq data was performed. GSEA is a computerized method that determines whether gene sets defined a priori show concordant and significant differences between two biological states25. In seeking overlaps with genesets in the C2 collection (“curated genesets”) of MSigDB, we found that the promoters with the least accessibility in LSK-Dido1∆E16 coincided with (1) genes with mixed epigenetic marks (H3K4me2 and H3K27me3) in mouse embryonic fibroblast (MEF) HCP genes (high-CpG-density promoters), p<1.79 x 10−49; (2) silencing marks (H3K27me3) in embryonic cells, p<4.88 x 10−31; (3) human EED targets (a core subunit of PRC2), p<3.72 x 10−27, and (4) human SUZ12 targets (also a core subunit of PRC2), p<3.61 x 10−25 (Supplementary Table 1). In accordance with these data, our previous microarray data26 analyzed here by GSEA, indicate that genes silenced in DIDO3-deficient ESC correlate significantly with those silenced in mouse Suz12-deficient ESC; the genes overexpressed in these cells also coincided with those overexpressed in the Suz12–/– cells. There thus appears to be a relationship between epigenetic regulation of gene expression by PRC2 and the presence of DIDO3. An online tool (http://bioinformatics.psb.ugent.be/webtools/Venn/) was used to construct a Venn diagram that represents differentially-expressed genes (DEG) associated with distinct genesets by GSEA (Fig. 5B). There was also correlation with (5) genes expressed after p53 activation (p<1.48 x 10−24), in agreement with the Cdkn2a alterations observed in Dido1∆E16 mouse samples at later developmental stages (see below).
When analysis was limited to non-promoter regions (intra- or intergenic), using the MSigDB C3 collection (“regulatory target gene sets”), there was significant overlap with (1) the targets of the putative histone-lysine N-methyltransferase PRDM6 (p<7.5 x 10−24), and (2) the set of genes with at least one occurrence of the highly conserved motif M17 AACTTT, a potential binding site for an unknown transcription factor in the region spanning up to 4 kb around their transcription start sites (p<5.91 x 10−69) (Fig. 5C). Within the 79 genes common to the three genesets, we found a number of essential regulators for stem cell function and B cell pathway entrance, including Akt3, Runx1, Ebf1, Igfr1, Tcf4, Cdk6, and Etv6. GSEA analysis of the few identified sites that show higher accessibility in Dido1∆E16 LSK cells yielded no relevant results.
To validate the expression data in additional samples by RT-qPCR, we selected several genes in the list of regions with different Tn5 accessibility in LSK-wt and LSK-Dido1∆E16 cells: genes related to the Polycomb group (PcG) proteins (Bmi1, Ezh2), maintenance and differentiation of stem cells (Runx1, Nanog, Ikzf2 (Helios), Tcf12 (Heb)), differentiation towards lineage B (Klf3, Il7r, Ebf1, Tcf3 (E2A, E12/E47), Spi1 (PU.1), Foxo1) or towards lineage T (Gata3), and cell proliferation (Mycn (N-myc), Myc, Cdk6). The analysis showed that expression of genes related to differentiation towards the B lineage was non-significantly lower in LSK-Dido1∆E16 than in LSK-wt cells. Other genes such as Gata3 or Cdk6 were expressed more significantly in LSK-Dido1∆E16 cells, whereas Nanog levels were 16 times higher than in LSK-wt cells (Fig. 5D). Dido1∆E16 CLP samples showed reduced Ebf1 expression, whereas Nanog levels were normal.
TLDA analysis of gene expression in B cell precursors.
To analyze expression of several genes related to B cell development in the different precursor subpopulations, we used customized TaqMan Low Density Arrays, tested against preamplified samples of cDNA synthesized from total RNA from sorted subpopulations. The largest number of differences was detected in the prepro-B subpopulation (Fig. 6A), which suggests that a smaller number of cells is able to follow the B cell pathway at this less-differentiated stage. mRNA levels of transcription factors such as Tcf3, Foxo1, and Ebf1 (labeled (1) in Fig. 6A) were decreased in Dido1∆E16 compared to wild type mice. These factors are necessary for precursor differentiation towards the B lineage and are crucial for expression of Pax5, an essential factor for maintenance of B line identity. We also observed lower levels of mRNA encoded by genes related to B cell development and maturation (labeled (2) in Fig. 6A), including Pax5, Igll1(Igl5), S1pr1, Btk, Blk, Cd79a (MB-1, Iga), Cd79b (Igb), Irf4, Cr2 (Cd21), Pou2af1 (OBF1),and Cxcr4. In Dido1∆E16 cells, we detected upregulation of genes (labeled (3) in Fig. 6A) such as Cebpb, whose relationship with cancer-driven myelopoiesis is observed in humans27, and of Spn (Cd43), which is downregulated in the wt cells at this stage28, when V(D)J recombination begins.
As in almost every developmental process, the three “E-proteins” play a fundamental role in hematopoiesis. These transcription factors are characterized by a basic helix-loop-helix (bHLH) domain with capacity to homo- and hetero-dimerize with themselves and other HLH proteins, and to bind to DNA through E-boxes (CACGTG sequences). Tcf3 (which encodes E2A isoforms E12 and E47) and Tcf12 (which encodes HEB isoforms HEBcan and HEBalt) have been implicated in B cell development. To test the expression of the distinct isoforms, we designed specific primers for conventional RT-qPCR assays. As already detected in the TLDA assays, Tcf3 expression was lower in Dido1∆E16 than in wt prepro-B cells; this reduction was due equally to E12 and E47 isoforms (Fig. 6B). However, while the expression of the HEBcan isoform of Tcf12 (assessed by exon 1/2 levels) was not affected by DIDO3 absence, HEBalt isoform expression (exon X9 levels) dropped to 10% of the wt in Dido1∆E16 prepro-B cells (Fig. 6C).
Tcf3 levels remained low in Dido1∆E16 pro-B cells (Fig. 6D), and we also found low Bcl2 and high levels of the cell cycle regulator Cdkn2a (p16, cyclin-dependent kinase inhibitor 2A). E2A is necessary to maintain Ebf1 and Pax5 expression as well as the B cell genetic program, whereas BCL2 is an antiapoptotic factor, and CDKN2A inhibits the kinase-dependent kinase cyclin 4 (CDK4) and therefore blocks the cell cycle in G1.
In the pre-B Dido1∆E16 population, we again detected a pro-apoptotic environment, with low Bcl2 and high levels of cyclin D2 (Ccnd2), which codes for the regulatory subunit of CDK4 and CDK6, necessary for cell cycle G1/S transition. Our data also showed no silencing of genes that encode pre-BCR components such as Igll1, CD79a, and CD79b, or genes that inhibit pro-B cell differentiation such as Erg (Fig. 6E).
In immature B-Dido1∆E16 cells, we detected an increase in the mRNA levels of stemness-related transcription factors (Pou5f1 (Oct-4), Nanog, Kit, and Ly6a (Sca1)), in addition to high levels of pre-BCR surrogate light chain components such as Vpreb2 and Vpreb1, and of cell cycle regulatory genes such as cyclin-dependent kinase inhibitor Cdkn2b (p15INK4b). We also observed overexpression of Id3, which forms heterodimers with HLH proteins that lead to inhibition of DNA binding and thus, regulation of gene transcription (Fig. 6F). All these results suggested that Dido1∆E16 affects differentiation processes both in mouse hematopoietic precursors and in embryonic stem cells.
RNA-seq analysis of gene expression in pre-B cells
As the first checkpoint in B cell development is dependent on pre-BCR activation and takes place at the pre-B stage29, where we found the greatest effect of the absence of DIDO3 (seen as lack of progenitor cell progression), we sequenced the transcriptome (RNA-Seq) of cells in this phase.
Using pooled samples (2-3 mice/pool) to obtain sufficient starting material, we sorted pre-B cells by flow cytometry and purified RNA. Two biological replicates were used to construct libraries for massive sequencing. After filtering low quality reads, an average of 24 million clean reads were obtained and 94.91% clean reads were mapped to the mouse genome. A total of 7846 genes with more than three fragments per kilobase of transcript per million mapped reads (FPKM) was used for subsequent analyses. DEG were defined based on the absolute value of their fold change |FC| ≥1.5 (log2FC >0.6|) associated to a false discovery rate (FDR) ≤0.05. We found 120 genes with higher expression in wt pre-B cells, and 346 genes upregulated in Dido1∆E16 cells (Supplementary table 2).
The biological significance of DEG was explored using the network tool STRING v.11.0 (https://string-db.org). Gene Ontology (GO) terms with FDR <0.25 included positive regulation of leukocyte differentiation, cellular response to stimulus, positive regulation of lymphocyte activation, positive regulation of lymphocyte differentiation, positive regulation of T cell differentiation, positive regulation of B-cell differentiation, signal transduction, regulation of signaling, immune response, and regulation of leukocyte cell-cell adhesion. After gene clustering, the highlighted signaling pathways were MTOR, GPCR, Rho-GTPase, interferon, b-catenin, cytokine signaling in the immune system, and antigen processing by ubiquitination and proteosome degradation (Suppl. Fig. 5).
For a second analysis we applied the GSEA tool, using a ranked list of genes with base mean >1, ordered by FC. By analysis against the group of genesets defined as targets of specific transcription factors, we identified the genesets of targets for PRC2 components JARID2 and SUZ12 (p<0.001) (Fig. 7). Aebp2, a substoichometric PRC2 subunit that stimulates the K9 and K27 methylation of histone H330, was upregulated (log2FC = 2.61 FDR = 0.067) in Dido1∆E16 pre-B cells.
The relevance of pre-BCR in the pre-B developmental checkpoint and the PRC2 participation in V(D)J recombination led us to study the effect of DIDO3 deficiency on the transcription of Igh and Igk loci. Joining (J) and variable (V) regions of the Igk locus were not notably affected. There was over-representation of proximal Igh V regions (Q52 and 7183 families), whereas intermediate and distal V fragments (mainly J558 and 3609 families) were under-represented in Dido1∆E16 pre-B cells (Fig. 8A and Supplementary table 3). We detected unexpectedly high transcription levels of DNA located between J and D fragments (Fig. 8B). In wt cells, this DNA segment is eliminated by recombination in the previous pro-B stage, or even in prepro-B, by RAG1/2 activity. Finally, we detected expression of the Igg2b (g2b) gene, which suggested that isotype-switched IgG2b+ memory B cells co-purified with pre-B cells (B220+CD19+CD43−IgM−) in numbers substantially higher than in wt samples (Fig. 8C).