Single-cell transcriptome revealed seven topologically continuous BMPC subsets in AL, MGUS, and healthy controls
To demonstrate differences in the pathological progression of BMPCs between AL and MGUS, we first performed a single-cell transcriptomic analysis using 14 081 BMPCs collected from bone marrow samples donated by AL patients (n=3), MGUS patients (n=2), and healthy controls (n=1+20). The doublets were then removed (Fig. S1a) and single cells were clustered (Fig. S1b) based on the differentially expressed genes (Fig. S1c), immunoglobulin production (Fig. S1d) and marker gene expression (Fig. S1e, Table S2) facilitated by the machine-learning algorithm (Fig. S1f).
To identify subsets of BMPCs that were specifically associated with AL or MGUS, we conducted UMAP (Fig. 1a) and three-dimensional (3D) UMAP (Fig. 1b) embedding of the integrated transcriptome of BMPCs (Fig. S2a), revealing the topological continuity of the seven subsets (Fig. S2b) in the AL, MGUS, and healthy control samples. This showed differences in proportions and gene expression.
Examination of BMPC marker genes showed differences in expression among the subsets. The majority of BMPCs were syndecan 1 (SDC1, encoding CD138) (+), CD38 (+), positive regulatory domain I-binding factor 1 (PRDM1) (+), X-box-binding protein 1 (XBP1) (+), interferon regulatory factor 4 (IRF4) (+), CD19 (−), and membrane spanning 4-domains A1 (MS4A1, encoding CD20) (−) (Fig. 1c). Markers for long-lived plasma cells, such as Thioredoxin Interacting Protein (TXNIP) and C-X-C motif chemokine receptor 4 (CXCR4), were expressed in all BMPC subsets except for subset 2.
The predominant subset (subset 0) was more abundant in AL and MGUS. In contrast, subsets 2–4 were more abundant in healthy controls (Fig 1d), and subsets 5 and 6 were specifically enriched in AL and MGUS, respectively (Fig 1e). These differential proportions of subsets among AL, MGUS, and healthy controls reveal that changes in subset proportions were likely associated with the progression towards MGUS and then to AL.
Differentially expressed gene expression further showed that subsets 0 and 1 shared greater transcriptomic similarity with each other than with other subsets, while subsets 2–6 had specific characteristics (Fig. 1f). The AP-1 complex, including jun proto-oncogene (JUN) and fos proto-oncogene (FOS), was upregulated in subsets 0 and 1. Beta-1,4-galactosyltransferase 1 (B4GALT1), a protein glycosylation enzyme localised in the Golgi apparatus, was expressed in subsets 0 and 1. Amyloidosis-associated genes such as apolipoprotein E (APOE) were distinctly expressed in subset 5, and were enriched in AL. Stathmin 1 (STMN1), involved in DNA repair and microtubule disassembly, was specifically expressed in subset 6. CD27 was highly expressed in subsets 3 and 4, which were enriched in the healthy controls. The functional subsets marked by APOE, STMN1, and B4GALT1 expression (Fig. 1f) were validated using flow cytometry. Subset 5, highly expressing APOE, and subset 6, marked by STMN1, largely contained plasmablastic BMPCs (Fig. 1g, S3a). The proportions of subsets 5 and 6, revealed by scRNA-seq (Fig. 1d) and image flow cytometry (Fig. 1g, S3a), were similar. Plasmablastic BMPCs showed higher roundness, higher nuclear–cytoplasmic (NC) ratio, and lower nuclear eccentricity compared with mature BMPCs. The subcellular distribution of APOE (in vesicles), STMN1 (in the cytosol), and B4GALT1 (in the Golgi apparatus) was consistent with previous findings (Fig. 1h, S3b).
As the basis of subsequent analysis, these transcriptomic subsets, defined by differences in their gene expression and abundance in AL, MGUS, and healthy subjects, revealed the intra-individual heterogeneity of BMPCs.
Functional subsets with gradients of signalling entropy and immunoglobulin production were related to AL pathogenesis
Based on these results, we sought to determine the functional implications of these transcriptomic subsets. Interestingly, the signalling entropy (Fig. 2a) and immunoglobulin production (Fig. 2b) showed an inversely proportional relationship and formed a gradient across distinct BMPC subsets. Subset 2 exhibited the highest expression of immunoglobulins and the lowest signalling entropy, whereas subsets 0 and 6 showed the inverse. Because signalling entropy is interpreted as the capacity for differentiation of individual cells, we inferred that subset 6 was less differentiated, whereas more differentiated BMPCs showed higher immunoglobulin production.
Pseudo-time analysis revealed three distinct trajectories among these BMPC subsets, following a gradient of decreasing signalling entropy and increasing immunoglobulin production (Fig. 2c). Subset 6 was the root of all three trajectories, and the predominant subset was adjacent downstream to subset 6. The other subsets were located further downstream of the trajectory.
We then conducted functional enrichment analysis to explore the relationship between trajectories and the roles of BMPC subsets in AL pathogenesis. The results indicated that subset 6 was involved in the cell cycle, DNA repair, DNA replication, chromatin conformation change, and upregulation of chaperone-mediated unfolded protein response (UPR), which supported the role of this subset as the root. The predominant subset, downstream of subset 6 in the trajectory, showed upregulation of oxidative stress, endoplasmic reticulum stress (ER stress), and osteoclast differentiation. This subset was also more responsive to glucocorticoid and proteasome inhibitors such as bortezomib. The predominant subset then underwent a trajectory into subset 5 or subset 1 in a pseudo-time. Subset 5 was enriched for the upregulation of neutrophil degranulation pathways and supramolecular fibre organisation, while subset 1 was enriched for carbohydrate biosynthesis and chromatin organisation. In addition, subsets 2–4 showed up-regulated immunoglobulin production and Fc-gamma receptor signalling (Fig. 2d).
Analysis of subset-specific transcription factors further supported the functional roles of different subsets. In particular, SWI/SNF-related, matrix-associated, actin-dependent regulator of chromatin subfamily C member 2 (SMARCC2), a regulator of DNA helicase gene expression, dominated subset 6. AP-1 family members were predominantly expressed in the 0. X-ray repair cross complementing 1 (XRCC4), a known controller of V(D)J recombination, regulated gene expression in subset 5. DNA damage inducible transcript 3 (DDIT3) and nuclear factor I C (NFIC), which are activated by ER stress, were expressed in subset 2 (Fig. 2e, S4a).
Taken together, these results depict the functional roles of BMPCs in AL pathogenesis, associated with an inverse relationship between immunoglobulin production and signalling entropy.
Immunoglobulin idiotype, marker gene expression, and copy number variations contributed to the inter-individual heterogeneity of BMPCs in AL
To explore intra-individual and inter-individual heterogeneity, we conducted UMAP (Fig. 3a, 3b) and 3D UMAP (Fig. 3c) embedding with the merged transcriptome of BMPCs. The plots obtained revealed a flower-like topology with the healthy controls relatively tightly clustered at the centre, while the AL and MGUS samples projected out, like petals, with no clear relationship with each other except connections through healthy controls. These results suggest that BMPCs in AL and MGUS exhibit high inter-individual heterogeneity, but not in healthy controls (Fig. 3a). Projection of the functional subset types showed that the predominant subset in AL and MGUS showed the highest deviation from the healthy controls, whereas subset 2, which produced the highest level of immunoglobulin, was more similar to them (Fig. 3b, S5a).
We then investigated the inter-individual differences in immunoglobulin production. The results confirmed that subset 2 had the highest expression of immunoglobulin genes in the AL, MGUS, and healthy controls. In addition, examination of differences in immunoglobulin idiotypes revealed that the expression of the light chain was highest in subset 2 in AL. However, kappa chains were relatively abundant in the non-monoclonal BMPCs of the healthy controls (Fig. 3d).
Since the immunoglobulin idiotype affects organ tropism of light chains and is associated with organ involvement in AL patients9, 21, 22, we checked the monoclonality of several immunoglobulins including IGLV 1-44, IGLV 2-23, IGLV 3-19, IGLV 2-14, and IGLV 6-57, across samples (Fig. 3e). Previous studies demonstrated that IGLV 1-44 and IGLV 2-14 light chains tend to be deposited in cardiac muscles23, whereas IGLV 3-19 damages renal glomeruli24. Our findings on the clinical manifestations of AL patients (Table S1) are consistent with those of previous reports.
Additionally, we investigated the expression of drug-associated marker genes. We noted that BMPCs in a clinically significant proportion of AL and MGUS patients harboured t(11;14) (IgH/CCND1), a translocation conjugating the IgH promoter to cyclin D1 (CCND1). We observed predominant subsets in some cases of AL and MGUS that highly expressed CCND1, whereas cyclin D2 (CCND2) was upregulated in non-t(11;14) AL. In addition, CD27 was highly expressed in the healthy controls. CD38, the target of daratumumab, was expressed in BMPCs of AL, MGUS, and healthy subjects, and was subtly upregulated in non-t(11;14)AL (Fig. 3f).
We investigated the CNVs inferred from our single-cell transcriptome data and found patterns associated with t(11;14) and CCND1 expression (Fig. 3g, S5b). More specifically, the BMPCs of non-t(11;14)AL showed larger CNV fragments than those detected in patients with t(11;14) AL and MGUS. CNVs detected in BMPCs from t(11;14) AL samples were also found in t(11;14) MGUS, thus implicating a mutational continuum underlying the progression of MGUS to AL25. In particular, a hotspot for CNVs specific to t(11;14) AL and MGUS was identified on the long arm of chromosome 11 around the CCND1 locus. Similarly, the deletion of 13q14 and gain of 1q21 are both mutations frequently found in AL26, consistent with the CNVs observed on chromosome 13 and chromosome 1 in non-t(11;14) AL.
The combination of intra -and inter-individual heterogeneity revealed monoclonality, marker gene expression, and copy number variations of BMPCs in AL, especially those associated with t(11;14).
t(11;14) and non-t(11;14) AL had aberrant subsets that up-regulated amyloidosis-associated genes and those associated with light chain production
From the above findings on intra- and inter-individual heterogeneity in BMPCs, we observed that subset 5 was enriched in the AL (Fig. 1e, 4a). GSEA showed that the upregulated genes in subset 5 were enriched for amyloidosis-associated genes (DisGeNET database) in subset 5 of both t(11;14) (Fig. 4b) and non-t(11;14) AL (Fig. 4c). This subset upregulated genes encoding amyloid-beta binding proteins and secretory granule proteins, with APOE serving as the hub (Fig. 4d).
We also found a proportion of subset 2 that deviated from healthy controls (Fig. 4a). Compared with their counterparts in healthy controls (Fig. S6a), and MGUS (Fig. S6b), subset 2 in AL upregulated genes involved in the active transport of proteins to the Golgi apparatus, protein folding in the Golgi apparatus, neutrophil degranulation, asparagine N-linked protein glycosylation, and apoptotic pathways (Fig. 4e). Specifically, peptidylprolyl isomerase A (PPIA), B4GALT1, and transmembrane P24 trafficking protein 2 (TMED2) were shown to participate in multiple pathways (Fig. 4f).
Collectively, these findings showed that amyloidosis-associated genes and aberrant light-chain production-associated pathways were upregulated in the subsets related to AL pathogenesis in both t(11;14) and non-t(11;14) AL.
Predominant subset in t(11;14) and non-t(11;14) AL differentially expressed the gene associated with venetoclax sensitivity
In the t(11;14) and non-t(11;14) AL, the predominant subset showed subtle differences in gene expression (Fig 5a), and were highly associated with venetoclax sensitivity. Functional enrichment of differentially expressed genes revealed that t(11;14) AL upregulated B-cell differentiation pathways, whereas non-t(11;14) AL upregulated the response to growth factor stimuli. In addition, the predominant subset in t(11;14) AL up-regulated the glucose metabolic process and response to corticosteroids, while the non-t(11;14) AL group showed up-regulation of the blood vessel development, neutrophil degranulation, and Golgi-associated pathways (Fig. 5b).
Protein–protein interactions (PPI) network analysis showed that upregulated genes in the predominant subset in t(11;14) AL were in a network (Fig. 5c) and enriched in a gene set highly expressed in venetoclax-sensitive MM cell lines (Fig. 5d). The predominant subset in t(11;14) AL exhibited overexpression of anti-apoptotic BCL2 complex and B cell-associated genes, including CD79A, VPREB3, CD45, and MS4A1. In contrast, the upregulated genes in the predominant subset in non-t(11;14) AL included CCND2, PPIA, GAPDH, and S100A gene families and those encoding the cytochrome c protein family (Fig. 5e). They were significantly enriched in a gene set highly expressed in venetoclax-resistant MM cell lines (Fig. 5f).
Thus, the predominant subset contributed to the differential expression of venetoclax sensitivity-associated genes in t(11;14) and non-t(11;14) AL, which are involved in a functional PPI network.
Proportion of functional subsets defined the patient subgroups of AL exhibiting differential gene expression related to venetoclax sensitivity
Finally, we sought to validate these findings using bulk RNA-seq of BMPCs from an independent cohort of 29 patients with AL (Fig. S7a, S7b). Clustering the proportion of subsets revealed three subgroups of patients with AL (Fig. 6a, Fig. S7c). Subgroup B had a larger population of the predominant subset (subset 0), and subgroup A had a larger proportion of subset 1, involving the majority of patients with AL (Fig. 6b). Subgroups A and B showed mutually exclusive upregulation of CCND2 and CCND1, and another subgroup was double-negative for CCND2 and CCND1 (Fig. 6c, 6d).
We then validated the co-expression of genes associated with venetoclax sensitivity (Fig. 6e). CCND1, B-cell CLL/lymphoma 2 (BCL2), CD79A, and VPREB3 were co-expressed in subgroup B with a larger predominant subset, while CCND2, S100A6, and CST3 were co-expressed in subgroup A. These consistent proportions of subsets and marker gene expression demonstrated that the scRNA-seq samples of t(11;14) and non-t(11;14) AL represented subgroups B and A, respectively.
Taken together, these results validated a patient subgroup featuring a larger proportion of the predominant subset and a higher expression of CCND1 and venetoclax sensitivity-associated genes.