Construction of a single-cell transcriptomic map of stepwise leukemogenesis in a murine AML model
To dissect the whole process of leukemogenesis, we performed a series of scRNA-seq analyses of bone marrow (BM) cells from mice transplanted with leukemogenic HSPCs at different stages (Figure S1A). Since Myc is frequently amplified and/or overexpressed in human AML (12%), we constructed a Myc-induced murine AML model(21). All the recipient mice developed full-blown AML approximately 8 weeks after transplantation with minimal variation (Figure S1B). The leukemic cells could be identified by the expression of GFP, which was linked with Myc (Figure S1A). A majority (80.3%) of the bone marrow in all transplanted mice was leukemia cells that expressed markers of myeloid lineage (B220-; CD3-; Mac-1+; c-Kit+), and over time, the proportion of myeloid cells increased. (Figure S1C-D). Peripheral blood smears showed step-by-step leukocytosis with increased blasts (Figure S1E). The histological analysis showed that the resulting mouse AML recapitulated the features of human disease (Figure S1F). During leukemogenesis, we harvested the initiating Myc-overexpressing HSPCs (pretransplanted, T0), preleukemic BM cells at two weeks (T1), four weeks (T2) after transplantation and full-blown AML BM cells (T3, eight weeks after transplantation) for 10x Genomics scRNA-seq analyses, and a total of 41,078 cells were retained for analysis. Two independent repeats for each time point were highly similar to each other (Figure S2A).
Accordingly, we calculated the UMAP plot of all these preleukemic and leukemic cells (PLCs) together with normal bone marrow cells (Figure 1A). The normal cell subpopulations were clustered by their distinct gene signatures, and the remaining GFP-positive cells were defined as the PLC subtype (Figure S2B and Supplementary Table 1). While all the normal cell lineages from each time point largely overlapped, PLCs displayed a shift from being close to neutrophils to being close to HSPCs on the UMAP (Figure 1B). The ratios of normal populations gradually decreased, while those of PLCs increased over time (Figure S2C).
Since it has been proposed that uncontrolled proliferation and blocked differentiation are two major mechanisms of leukemogenesis(7), we analyzed the cell cycle and differentiation status of PLCs. The results showed a gradual increase in the cell cycle for PLCs from T0 to T3 (Figure 1C and Supplementary Table 2). PLCs also acquired progressively increased differentiation blocks, as indicated by reduced differentiation scores and enhanced stemness scores from T0 to T3 (Figure 1D and Supplementary Table 2). We also found that the number of UMIs and genes progressively increased in GFP-positive cells, while those of GFP-negative cells did not, which was consistent with a previous report that RNA abundance is associated with elevated expression of stem genes(22) (Figure 1E, S2D-E). Overall, we delineated a single-cell landscape of leukemogenesis that recapitulated the stepwise transition of this disease.
Clinically, more than 30% of MDS cases progress to AML. To visualize the molecular switches in PLCs during leukemogenesis, we performed correlation analysis of the transcriptome between PLC populations at different time points and patients with high-risk MDS, MDS-AML and de novo AML. Of note, the gene signature of T1 PLCs was highly expressed in patients with high-risk MDS, that of T2 PLCs was highly expressed in AML patients who developed from MDS and that of T3 PLCs was highly expressed in de novo AML (Figure 1F and Supplementary Table 3). The correlation of the stepwise molecular switches of PLCs with those of patients suggested that the molecular trajectory of PLC transformation in mice represented a potential mechanism of leukemogenesis from MDS to AML patients.
Progressively deteriorated RNA splicing abnormality during leukemogenesis
To dissect the molecular events underlying stepwise leukemogenesis, we performed gene set variation analysis (GSVA) for specific pathways of PLCs of each stage. We found that the Myc target genes gradually increased from T0 to T3, along with genes involved in oxidative phosphorylation and DNA repair, and the genes of the p53 pathways gradually decreased over time, all of which were consistent with the gradually increased aggressiveness of PLCs over time (Figure 2A).
Since myc was the sole driver in this AML model, we first checked the expression levels of Myc. We found that the expression of either ectopic or endogenous Myc was not constitutively increased from T0 to T3 (Figure S3A). However, we observed that the expression levels of Myc targets constitutively increased from T0 to T3 (Figure S3B). Since the expression level of Myc itself was not progressively changed in PLCs, it was unlikely that the increase in Myc targets was just a selection of cells with high levels of Myc, which further suggested that stepwise leukemogenesis was not a result of simply selecting Myc expression. Importantly, by analyzing the transcriptome of the TCGA AML cohort, we found that the signature of Myc target, but not the expression of Myc itself, was associated with the poor prognosis of AML patients (Figure S3C).
Furthermore, the KEGG analysis of the Myc target genes revealed that the spliceosome pathway was the most enriched among the Myc targets in PLCs, along with the cell cycle pathway and RNA transport pathway(23) (Figure 2B). The expression level of splicing factors gradually increased from T0 to T3 (Figure 2C and Supplementary Table 4), and stage-specific splicing regulatory factors (SFs) were identified by gene expression (Figure 2D). We found that the expression of these stage-specific SFs associated with PLCs, but not those associated with normal cells, had prognostic value in human AML patients, and the prognostic value of PLC SFs progressively increased from T1 to T3 (Figure 2E and Supplementary Table 5). Taken together, these data suggested that aggressively increased expression of splicing factors, independent of the Myc expression level, might underlie the progression from preleukemic to fully transformed leukemia in our stepwise model of leukemogenesis.
Abnormal RNA splicing kinetics distinguish leukemogenic initiating cells from their normal counterparts at the tipping point
Diverging from the normal development route is one of the critical steps at the initiation of tumorigenesis(24). In the landscape of leukemogenesis, we observed a diverging point where HSPCs would differentiate into GMP cells or be transformed into GMP-like preleukemic cells (Figure 3A). The differentiation route from HSPCs to GMP cells was characterized by reduced expression of leukemia signature genes and increased expression of myeloid/neutrophil lineage genes, while in sharp contrast, the transformation route from HSPCs to GMP-like cells displayed increased expression of leukemia signature genes and decreased expression of differentiation genes (Figure 3B). The kinetics of the expression of GMP-specific and GMP-like-specific genes over pseudotime indicated that there was a fate-determining tipping point for Myc-expressing HSPCs to either the GMP differentiation route or GMP-like leukemogenesis route (Figure 3C).
Given the importance of RNA splicing during the whole leukemogenesis process, we analyzed RNA splicing kinetics at the tipping point of leukemogenesis by RNA velocity, which has been suggested to indicate developmental potential and cellular dynamics(25). Consistent with the differentiation tendency of HSPCs into GMPs, HSPCs displayed relatively stationary RNA velocity, and GMPs had high velocity. There was a strong directional flow from HSPCs to GMP. Strikingly, the RNA velocity of GMP-like cells was more stationary than that of HSPCs, and more importantly, the directional flow was from GMP-like cells to HSPCs, which indicated that GMP-like cells at the tipping point may have more potential than HSPCs (Figure 3D). Importantly, we found that at the onset of leukemogenesis, GMP-like cells did not gain a significantly increased AML or leukemic stem cell (LSC) signature (Figure 3E) compared to HSPCs. However, we observed a significant decrease in the expression of RNA splicing factors from HSPCs to GMPs but a significant increase from HSPCs to GMP-like cells (Figure 3F). These active splicing factors might give preleukemic cells a selection pool for leukemia-promoting molecules.
NPM1 at the tipping point is involved in leukemogenesis
To identify the key signatures leading to the binary cell fate choice at the tipping point, we analyzed the differential expression of genes on the two evolutionary routes and then defined the intersection between the low-expressing gene set 2 in the GMP lineage and the high-expressing gene set 4 in the GMP-like lineage as tipping point signatures (TPS), with a total of 30 genes. (Figure S4A-B). We characterized the TPS pattern in the GMP lineage and GMP-like lineage and found that the KEGG_RIBOSOME pathway associated with tumorigenesis also conformed to this pattern (Figure S4C-D).
Furthermore, CRISPR/Cas9 gene editing was used to verify and screen out genes that affect the differentiation and stemness of Myc-overexpressing HSPCs at the tipping point of malignant transformation. To identify specific genes promoting the differentiation of HSPCs, we ranked the sgRNAs by differences in the mean fluorescence intensity (MFI) of Mac-1 or c-Kit staining and the proportion of Mac-1+ or c-Kit+ cells. Based on all detection indicators, Npm1 and Phgdh were finally selected as key molecules for inducing HSPCs malignant transformation during leukemogenesis (Figure S4E-F). In addition, the characteristic molecular interaction network analysis showed that Npm1 was a key node gene in the GMP-like cells, as both the differentially activated module gene and differentially expressed gene in the GMP-like cells suggested that Npm1 might be a key player in leukemogenesis (Figure 4A). As expected, Npm1 mRNA was gradually upregulated in the transformation route but gradually downregulated in the differentiation route, consistent with our TPS pattern (Figure 4B). The abnormal expression of NPM1 has been reported to be involved in human leukemogenesis (26), and consistent with the upregulation of Npm1 expression in GMP-like cells, NPM1 was also significantly upregulated in AML patients compared to normal samples. (Figure 4C). Therefore, we performed further functional studies of Npm1 to verify its effect on the development of leukemia.
To validate the function of Npm1 in leukemogenesis, we disrupted it in Myc-overexpressing HSPCs by CRISPR/Cas9 (Figure S5A-B) and found that Npm1 loss significantly extended the latency of Myc-driven AML in mice (Figure 4D). Consistently, Npm1 deficiency promoted the differentiation of HSPCs in vitro, as indicated by significantly increased expression of the differentiation marker Mac-1 and decreased levels of the stem cell marker c-Kit (Figure 4E). Transcriptomic analysis by RNA-seq also showed that Myc-overexpressing HSPCs promoted cell differentiation while decreasing stemness, with more GMP gene signatures and fewer GMP-like signatures after the deletion of Npm1. (Figure 4F-G and Supplementary Table 6). KEGG also showed that cell differentiation and apoptosis pathways were enriched in Npm1-deficient tumor-initiating cells (Figure 4H). GSEA indicated that the MYC_TARGETS_V1 and MRNA_SPLICING upregulated genes were significantly negatively enriched in HSPCs with sgNpm1 compared to those with sgScramble, which corroborates our previous analysis of global molecular regulatory events behind the evolution of leukemia from T0 to T3(Figure 4I). Consistent with our data, the GMP gene signature was significantly positively enriched in NPM1-deficient human leukemic cells compared to control shREN cells, and the GMP-like gene signature was significantly negatively enriched, suggesting that NPM1 might be a driver of leukemia (Figure 4J).
Phgdh, although rarely reported in AML, was another top gene in the tipping point of GMP-like cells. Its expression was upregulated along the transformation route but downregulated in the differentiation route (Figure S5C). Furthermore, Phgdh disruption promoted the differentiation of HSPCs (Figure S5D). We also observed that Myc-overexpressing HSPCs reduced the GMP-like gene signature, cell stemness and the cell cycle after Phgdh deletion (Figure S5E-F and Supplementary Table 6). KEGG also showed that lineage differentiation pathways were significantly enriched in Phgdh-deficient preleukemia cells, and GSEA showed that the MYC_TARGETS_V1 and MRNA_SPLICING upregulated genes were also significantly negatively enriched in HSPCs with sgPhgdh (Figure S5G-H). High expression of PHGDH was associated with poor prognosis in AML patients (Figure S5I). As outlined above, our data suggested that the AML tipping point signatures, including Npm1 and Phdgh, might be a critical driver for leukemia initiation at the tipping point.
RNA splicing creates explosive heterogeneity of leukemic cells
Once the PLCs went through the tipping point and embarked on the tumorigenesis route, they gradually gained malignancy and heterogeneity, which could be visualized on the URD map (Figure S6A). All these PLCs could be grouped into five major subpopulations, which were named after their top marker genes (Figure 5A-B and Supplementary Table 7). On the map were all the T0 cells at the “root”, followed by T1 and T2 cells, sequentially, at the “stem”. T3 cells were located at the crown and ended with multiple “branches” (Figure 5C). The expression of the human AML signature genes gradually increased along with the leukemogenesis “tree” (Figure S6B). The ratios of these subpopulations significantly varied among the time points, and Hsp90aa1hi cells did not emerge until T3, suggesting that this subpopulation might be more aggressive than others (Figure S6C). As expected, we found that these subpopulations displayed distinguishable gene signatures in the TCGA_LAML and TARGET_AML databases, and the hazard ratio calculated that the HSP90AA1hi subtype was significantly associated with high risk in human AML patients (Figure S6D-G).
We further explored the potential molecular mechanisms underlying the heterogeneity, especially those promoting aggressiveness in the Hsp90aa1hi subpopulation, and found that Hsp90aa1hi cells expressed significantly higher levels of splicing regulatory factors than any other subpopulation (Figure 5D), suggesting that the splicing machinery generates leukemia intraheterogeneity and promotes its aggressiveness.
To validate the intraheterogeneity in human AML, we analyzed the single-cell transcriptomes of bone marrow cells from five AML patients. The leukemic cells were recognized by referred copy number variations (Figure S7A). These AML cells could be divided into four distinct subpopulations (Figure 5E), and the ratios of these subpopulations significantly varied among these 5 patients (Figure S7B). Of note, the HSP90AA1hi subpopulation shared the same marker genes, HSP90AA1 and PDLIM1, with the Hsp90aa1hi subpopulation in mouse AML. Similarly, the TYROBPhi subpopulation expressed the same top marker genes, TYROBP and SAT1, as the Tyrobphi subpopulation in mouse AML (Figure 5F). Importantly, through unbiased hierarchy clustering according to their transcriptomes, we found that the human HSP90AA1hi subpopulation was most similar to the mouse Hsp90aa1hi subpopulation, the human TYROBPhi subpopulation was most similar to the mouse Tyrobphi subpopulation, and the human MPOhi and SPINK2hi subpopulations were close to the mouse Heshi subpopulation (Figure 5G). Furthermore, the HSP90AA1hi subpopulation was the most aggressive among all cells, as indicated by the highest cell cycle score, while the TYROBPhi subpopulation had the slowest cell cycle in human AML (Figure 5H). Similar to those in mouse AML, the HSP90AA1hi cells expressed the highest levels of splicing factors, while the TYROBPhi cells expressed the lowest (Figure 5I). KEGG analysis showed that splicing pathways were also significantly enriched in the HSP90AA1hi cell population (Figure S7C). The similarity of the intraheterogeneity and its associated molecular features between human and mouse AML suggested that our mouse AML model faithfully recapitulated the molecular and cellular characteristics of human disease and that abnormal RNA splicing generally promoted leukemic heterogeneity in various AML.
Exon 6 skipping of Tmem134, predominantly in the Hsp90aa1hi subpopulation, promotes aggressiveness in AML
To deeply characterize abnormal RNA splicing events underlying leukemia heterogeneity, we performed single-cell RNA-seq analysis of 178 leukemic cells in T3 by Smart-seq2 instead of the 10x Genomics platform, which would be able to analyze the full length and more transcripts (27, 28). Based on their gene expression patterns, these cells were partitioned into 5 populations, and the tSNE plot showed that the Hsp90aa1hi subpopulation was separate from others (Figure 5J and S8A). Indeed, the high expression of the signature genes of the Hsp90aa1hi cell population was also significantly associated with poor prognosis in human AML patients (Figure 5K).
Consistent with those analyzed by 10x Genomics, the signature genes of Hsp90aa1hi cells were enriched in the RNA splicing pathway and DNA metabolic process pathway (Figure S8B). Furthermore, we found that all five major subtype events of alternative RNA splicing, including exon skipping, intron retention, alternative 5’ splice site, alternative 3’ splice site, and mutually exclusive exons, were increased in Hsp90aa1hi cells compared with Hsp90aa1lo cells (Figure S8C). These data validated the RNA splicing in the Hsp90aa1hi subpopulation, which displayed more aggressiveness.
Among the alternatively spliced genes in Hsp90aa1hi cells was transmembrane protein 134 (Tmem134), a putative transmembrane protein with two transmembrane domains on the C-terminus(29). The full-length Tmem134, named Tmem134α, and exon 6 skipped Tmem134, named Tmem134β (Figure 6A). TMEM134 is highly conserved in mice and humans, and importantly, the same exon skipping also occurred in human AML (Figure 6B and S8D). We found that exon 6 of Tmem134 was skipped more in Hsp90aa1hi cells than in other cells (Figure 6C). Importantly, exon 6 skipping of TEME134 (TMEM134 PSI low) was significantly associated with poor prognosis in human AML (Figure 6D). In addition, HSPCs with TMEM134α overexpression hindered leukemogenesis in our Myc-induced leukemia murine model compared to the empty vector, while TMEM134β significantly promoted leukemogenesis (Figure 6E and S8E). We observed that compared to TMEM134α-overexpressing tumor-initiating cells expanded rapidly in vivo with a strong growth advantage (Figure S8F). Similarly, TMEM134β promoted Myc-overexpressing HSPC growth, while TMEM134α inhibited HSPC growth ex vivo (Figure 6F).
Furthermore, we tested the potential effects of exon 6 skipping of TMEM134 in Myc-induced AML cells ex vivo. To exclude the influence of endogenous TMEM134 protein, we generated a Tmem134 knockout Myc AML monoclonal cell line (i.e., M#8) with CRISPR/Cas9 and then introduced TMEM134α or TMEM134β cDNA into M8# by retrovirus (Figure S8G-I). The results showed that without endogenous TMEM134, exogenous TMEM134α significantly reduced the growth of leukemic cells compared to the empty vector, while TMEM134β significantly promoted the growth of leukemic cells (Figure 6G). In addition, we used two different strategies by abrogating the splicing site of exon 6 (sgExon6) or deleting the whole exon 6 (sgIntron5 combined with sgIntron6) for enforced exon 6 skipping of Tmem134 in leukemia cells. Both sgExon6 and sgIntron5+6 significantly increased the growth of AML cells (Figure 6H and S8J). Furthermore, we introduced synonymous mutated TMEM134α/β cDNA into the THP-1-cell line and HL-60 cell line (two kinds of human leukemia cell lines) with endogenous TMEM134 knockout. Consistently, TMEM134α significantly repressed the growth of human leukemia cells, and TMEM134β was promoted (Figure 6I and S8K-N). Then, we tested the in vivo effects of TMEM134 exon 6 skipping on leukemia maintenance by transplanting leukemic cells harboring TMEM134α or TMEM134β into recipient mice. We found that recipients with TMEM134α had significantly extended survival (Figure S8O). Despite the short period of tumor development, we have not yet had time to observe the in vivo growth advantage of TMEM134β-overexpressing leukemia cells after the second transplant. Thus, exon 6 skipping of TMEM134, which likely abrogated the tumor suppression function of full-length TMEM134, drove aggressiveness in AML.
Consistently, transcriptomic analysis by RNA-seq showed that TMEM134α significantly repressed the expression of cell cycle genes in AML (Figure S8P). EdU incorporation assays also showed that the proliferation ratios of cells with sgExon6 or sgIntron5+6 were significantly higher than those of the control cells (Figure S8Q). In both human and mouse AML, TMEM134α was significantly associated with a decreased cell cycle (Figure S8R). The genes upregulated by TMEM134α were expressed at significantly lower levels in Hsp90aa1hi cells than in Hsp90aa1lo cells, and in contrast, the genes downregulated by TMEM134α were expressed at significantly higher levels in Hsp90aa1hi cells than in Hsp90aa1lo cells (Figure 6J and Supplementary Table 8). GSEA indicated that the NUP98-HOXA9 upregulated genes were significantly negatively enriched in leukemia cells with TMEM134α compared to those with empty vector (Figure 6K). These data strongly indicated that exon 6 skipping of TMEM134 was critical for generating leukemia intraheterogeneity by promoting the aggressiveness of a subpopulation.