Multi-omic data integration reveals PCNSL molecular subtypes with clinical outcome implications
We performed a cluster-of-clusters analysis using six levels of omic information (Figure 1A and Fig. S1-S3) and identified four PCNSL subtypes (CS1 to CS4) with different overall survival (OS) (global log-rank p<0.001, Figure 1B). Patients in CS4 had the longest OS (median=66.8 months; 95% confidence interval [CI]=19.8-67.2) and lived significantly longer than those in CS2 (median=18 months; 95% CI=8.3-53.4; p=0.024) or CS3 (median=13.8 months; 95% CI=6.1-16.7; p=0.003), and longer, although not significantly, than those in CS1 (median=26.2 months; 95% CI=13.3-63.9; p=0.094). These observations remained significant after adjusting for age and Karnofsky Performance Status (KPS) in multivariate Cox proportional hazards models (Fig. S4A). CS4 was also independently associated with better progression-free survival in univariate and multivariate models (Fig. S4B-C). Finally, we did not observe significant differences between subtypes in the median number of predicted immunogenic neoantigens (p=0.44, Table S2 and Fig. S5).
Transcriptomic data correctly assign multi-omic defined PCNSL subtypes in FF and FFPE samples
Given the difficulty of acquiring FF tissue and of analyzing and implementing multi-omic data into routine clinical practice, we sought to evaluate the use of only RNA expression, obtained from FFPE or FF tissue, to categorize patients into the four PCNSL CS (Table S3). We obtained a Cohen’s kappa coefficient of 0.90 (p<0.001) when evaluating the accuracy of correctly assigning patients from the multi-omic cohort. Additionally, when expanding to the FF-RNA complete set or when using the FFPE cohort, we validated the prognostic importance of the molecular subtypes CS1-CS4 for clinical outcome (Global p<0.001) in univariate and multivariate models (Figure 1C and Fig. S6-S12).
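For readers less familiar with the agreement statistic reported above, the following is a minimal sketch of how Cohen's kappa can be computed from two sets of subtype labels. The function name and the toy label lists are hypothetical and are not taken from the study data; the sketch only illustrates the standard definition, kappa = (observed agreement - chance agreement) / (1 - chance agreement).

```python
from collections import Counter


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two equal-length sequences of categorical labels."""
    if len(labels_a) != len(labels_b) or len(labels_a) == 0:
        raise ValueError("label sequences must be non-empty and of equal length")
    n = len(labels_a)
    # Observed agreement: fraction of samples given the same label twice.
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # Chance agreement: product of marginal label frequencies, summed over labels.
    counts_a = Counter(labels_a)
    counts_b = Counter(labels_b)
    expected = sum(counts_a[k] * counts_b[k] for k in counts_a) / (n * n)
    if expected == 1.0:
        return float("nan")  # kappa is undefined when only one label is ever used
    return (observed - expected) / (1.0 - expected)


# Toy example (hypothetical labels, NOT study data): eight samples,
# one of which is assigned differently by the RNA-only classifier.
multiomic_labels = ["CS1", "CS1", "CS2", "CS2", "CS3", "CS3", "CS4", "CS4"]
rna_only_labels = ["CS1", "CS1", "CS2", "CS3", "CS3", "CS3", "CS4", "CS4"]
kappa = cohens_kappa(multiomic_labels, rna_only_labels)
print(f"kappa = {kappa:.3f}")
```

Because kappa corrects for agreement expected by chance, it is lower than raw accuracy whenever the label distribution makes chance matches likely.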
Next, we evaluated the contribution of each omic level to outcome prediction models using Harrell’s concordance index (C-index)10. Using KPS and age, the clinical features currently used in the Memorial Sloan Kettering Cancer Center prognostic score for PCNSL11, we observed a C-index of 0.60 (95% CI=0.56-0.65) in FF and 0.71 (95% CI=0.68-0.74) in FFPE. When adding different omic data to the FF cohort model, we observed higher predictive power with mRNA expression than with the other omic data (C-index=0.91±0.02 at 95% CI). We further validated these observations in the FFPE cohort, obtaining a C-index of 0.83 (95% CI=0.80-0.85) and 0.93 (95% CI=0.91-0.95) when adding the mRNA level or the TME and RNA levels to the model, respectively (Figure 1D). Altogether, these results show that RNA-seq data from FFPE or FF tissue can be used to correctly identify PCNSL subgroups and significantly increase the accuracy of outcome prediction.
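As an illustration of the metric used above, the following is a minimal sketch of Harrell's C-index for right-censored survival data. The function name and the toy values are hypothetical and do not come from the study; ties in risk score are counted as half-concordant, which is one common convention, and the study's actual implementation may differ.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored data.

    times: follow-up times; events: 1 if the event was observed, 0 if censored;
    risk_scores: higher score means higher predicted risk (shorter survival).
    """
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a censored subject cannot be the earlier member of a pair
        for j in range(n):
            # A pair is comparable when subject i had the event strictly before
            # subject j's last follow-up time.
            if times[i] < times[j]:
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Toy example (hypothetical values, NOT study data).
toy_times = [5, 10, 15, 20, 25]          # months
toy_events = [1, 1, 0, 1, 0]             # 1 = death observed, 0 = censored
toy_risk = [0.9, 0.4, 0.6, 0.3, 0.1]     # model-predicted risk
c_index = harrell_c_index(toy_times, toy_events, toy_risk)
print(f"C-index = {c_index:.3f}")
```

A value of 0.5 corresponds to random ordering and 1.0 to perfect ranking of patients by risk, which is why the increase from the clinical-only models to the models including mRNA expression is read as a gain in predictive power.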
Mutational landscape of PCNSL
We identified 32,544 mutations in the 115 PCNSL samples analyzed (median=3.23 mutations/Mb; range=0.02-85.49; Table S4 and Fig. S13). Using the dNdScv12 algorithm to identify driver mutations, we recovered the hallmark mutations of PCNSL, such as MYD88 (64%), PIM1 (59%), PRDM1 (57%), GRHPR (50%), HLA-A/B/C (49%, 30%, and 13%), BTG2 (47%), CD79B (43%), CDKN2A (28%), TBL1XR1 (25%), KLHL14 (25%), CARD11 (22%), and HIST1H1E (18%), which are involved in BCR- and TLR-mediated NF-κB signaling, antigen presentation, cell cycle, histone modification, and B-cell differentiation regulation 7,8,13 (Figure 2A, Table S5 and Fig. S14). Moreover, we detected canonical activation-induced cytidine deaminase (c-AID) off-target mutations, which represent 7.9% (95% CI=6.8-8.5%) of SNVs and fall within driver genes such as PIM1 (47%), CD79B (10%), IRF4 (9%), and HIST1H1E (6%) (Table S6, Fig. S15-S16). Interestingly, clonal mutations are significantly more driven by both c-AID and non-c-AID (COSMIC signature SBS9) mutational processes than subclonal mutations (p=0.007 and 0.018, respectively), reflecting the importance of AID activity in the early stages of PCNSL tumorigenesis (Fig. S17-S18).
Regarding focal CNA, we identified significant recurrent amplifications in 18q21.33 (42%) and 19p13.13 (34%), and deletions in 6p21 (39%), 6q21 (65%), 6q27 (49%), and 9p21.3 (28%), all at a higher frequency than observed in primary systemic DLBCL (Figure 2B)1,7. Furthermore, we found additional, not previously described amplifications in 1q32.1 (33%, IL10) and 11q23.3 (26%, CD3G), and deletions in 6p25.3 (21%, IRF4), 22q11.22 (29%, GGTLC2), and 14q32.33 (84%), which produce significant expression changes in CD3G (FC=1.25), IRF4 (FC=-1.03), and GGTLC2 (FC=-1.76, FDR q-value<0.1), respectively (Table S7, Fig. S19-S22).
Distinct genetic signatures within PCNSL subtypes and systemic DLBCL
We then characterized the differences in genetic alterations across groups for each mutation, focal SCNA, and fusion. The CS4 cluster presented ten enriched events, including mutations in SOCS1 (a negative regulator of the JAK-STAT3 pathway), MPEG1, and PIM2, and deletion of 17q25.1 involving GRB2, which indirectly regulates the NF-κB pathway. We observed 43 events within the CS1 cluster, including alterations involved in the NF-κB pathway (RIPK1/6p25.3 deletion), B-cell differentiation (IRF4/6p25.3 deletion, TOX, and BCL6), proliferation via interruption of cell cycle arrest (CDKN2A/2B fusions and FOXC1), and B-cell lymphomagenesis (e.g., ETV6, OSBPL10). The CS3 cluster exhibited 12 events, of which HIST1H1E was the most enriched; alterations in this gene have been shown to enhance self-renewal properties and disrupt chromatin architecture in B-cell lymphomas1,2,7,14,15. The CS2 cluster did not present any characteristic genomic events. Furthermore, most of these distinctive events, such as IRF4 and BCL6 in CS1, were inferred to be early (clonal) events in tumorigenesis (Figure 2C, Table S8-S9). Of note, most of these alterations were not observed in the clusters previously defined in systemic DLBCL (e.g., 9p11.2 del; Figure 2D)1.
B-cell differentiation stages, pathways, and TME distinctions between PCNSL molecular subtypes
We analyzed the expression of different previously curated gene signatures14,16 (see Methods). CS1 was characterized by the upregulation of PI3K, glycolytic activity, and cell proliferation signatures; additionally, it presented hyperactivation of the Polycomb Repressive Complex 2 (PRC2), which reduces MHC-I expression through histone methylation (Figure 3A, p<0.05, and Fig. S23-S25)17. Moreover, p53 activity was enriched in the CS2 cluster16,18. Interestingly, even though all clusters presented mutations within the NF-κB pathway, it was transcriptionally active only in clusters CS3 and CS4. Additionally, the MAPK and JAK-STAT pathways were upregulated in CS3 and CS4, respectively (Figure 3A).
Regarding B-cell differentiation programs, CS1 expressed a mixture of GC cell signatures, which is consistent with the 6p25.3-19q13.12 deletions and BCL6 mutations (Figure 2C). In contrast, cluster CS4 presented an enrichment in terminally differentiated plasma cells, in line with BCL6 downregulation, the absence of MYC induction, and BCL6 mutations. CS3 was the most heterogeneous cluster, presenting features of both GC and mature B-cells (plasma cells and memory B-cells). Intriguingly, cluster CS2 did not present any B-cell stage enrichment but instead a lymphatic endothelial cell (LEC) gene signature (Figure 3A and Fig. S26-S27).
We then described the TME differences between subtypes using CIBERSORTx-derived immune deconvolution and B-cell lymphoma-specific TME gene signatures19. The CS1 cluster is immunologically “neutral”, whereas CS2, which is immunologically depleted, exhibits expression of vascular endothelial cells (VEC), memory resting CD4+ T-cells, monocytes, and activation of GABA synthesis, which has recently been linked to B-cells that inhibit the killer function of CD8+ T-cells and promote monocyte differentiation into anti-inflammatory macrophages20. The CS4 cluster has a hot, inflammatory TME owing to the presence of active CD8+ T-cells and NK cells (with a high cytolytic activity score)21. Conversely, heterogeneity was again observed for the CS3 subtype, in which only inactivated M0 macrophages were significantly enriched (Figure 3A, Fig. S28-S41, and Table S9-S10).
CS3 subtype is associated with meningeal infiltration to cerebrospinal fluid
We next investigated whether brain MRI analysis (n=90, FFPE cohort) could provide further insight into the molecular subtypes. We observed no brain lobe preference between PCNSL subgroups but, in general, tumors arose less frequently in the occipital lobe (4/90 cases versus 86/90, p<0.001). In addition, CS4 tumors arose more frequently in the isthmus of the corpus callosum (7/34 cases versus 0/56, p<0.001). Conversely, CS2/CS3 tumors were more frequent in the brainstem than tumors of the other clusters (4/16 and 3/19 cases versus 1/55, p=0.005). We found no association with tumor size or with the presence of multiple lesions. However, meningeal infiltration of the cerebrospinal fluid (CSF) was only found within CS3 tumors (6/16 cases versus 0/74, p<0.001, Figure 3B and Table S11).
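To make the count comparisons above concrete, the following is a minimal sketch of a two-sided Fisher's exact test on a 2x2 table, applied to the meningeal infiltration counts quoted in the text (6/16 in CS3 versus 0/74 in the other subtypes). This is an assumption for illustration: the text does not state which test produced the reported p-values, and the function name is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def prob(x):
        # Probability of x events in row 1 given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )


# Counts from the text: CS3 with/without infiltration, other subtypes with/without.
p_value = fisher_exact_two_sided(6, 10, 0, 74)
print(f"p = {p_value:.2e}")
```

Under this assumed test, the resulting p-value falls below 0.001, in agreement with the threshold reported above; an exact test is a natural choice here because one cell of the table is zero.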
Epigenetic attributes across PCNSL subtypes
We next investigated epigenetic differences among subtypes (n=64). CS2 displayed greater global hypermethylation (p=0.006, Fig. S42-S43). Interestingly, GO analyses on differentially methylated promoters revealed B-cell differentiation programs to be hypomethylated in CS1 but hypermethylated in CS2, while interleukin-1 was hypermethylated in CS4 (Figure 4A and Table S12). Genomic region enrichment analysis on hypermethylated promoters identified strong enrichment of binding sites for the histone/chromatin proteins H3K27me3 and EZH2 in CS1, and for NF-κB, IRF4, and BCL6 in CS2 (Figure 4B, Fig. S44 and Table S13).
From multi-omics to potential therapeutic targets
To bridge the different multi-omic layers and, ultimately, identify potential therapeutic targets across subtypes, we integrated all multi-omic data and evaluated their contribution to specific pathways. Even though the hallmark PCNSL alterations targeting the My-T-BCR protein supercomplex, CD79A/B BCR subunits, TNFAIP3, RIPK1, TAB2, and the CBM (CARD11-BCL10-MALT1) complex were relatively constant across subgroups, the NF-κB hyperactive group (CS4) exhibited more GRB2/LYN deletions and had no PLCG2 mutations, which repress the BCR complex and activate the CBM complex, respectively. Furthermore, NF-κB activity could not be explained by upregulation of self-antigen-dependent chronic active BCR signaling, since IgVH4-34 expression was similar across groups (Figure 5) 14,22,23. These observations suggest that CS4 and CS3 may be more sensitive to BTK inhibitors (e.g., ibrutinib). The CS4 cluster also presented high JAK-STAT activity and mutated SOCS1 (a JAK1 repressor), making it potentially responsive to JAK1 inhibitors (e.g., INCB040093)24,25. Regarding antigen presentation-related genes, we observed only monoallelic deletions in HLA-A, B2M, and CD58 but not in HLA-B or HLA-C. Moreover, the absence of PRC2 complex activity and the expression of MHC-I and checkpoint molecules indicate a potential use of immune checkpoint inhibitors (ICI) for CS4. On the other hand, EZH2 inhibitors (e.g., tazemetostat) in combination with ICI could potentially increase MHC-I expression and immune detection in CS326. Interestingly, the CS3 cluster is enriched with HIST1H1E/C mutations, which have recently been demonstrated to confer enhanced fitness and self-renewal properties to B-cells15.
Additionally, we observed a higher frequency of cases with genetic alterations involved in the cell cycle for CS1 (97%, p<0.001, e.g., CDKN2A/2B fusions); hence, cyclin D-CDK4 and CDK6 plus PI3K inhibitors could be beneficial for CS1 patients.
Despite not presenting enriched genetic signatures, the CS2 cluster may be susceptible to inhibition of the TFs IRF4 (e.g., lenalidomide), SPIB, and MEIS1 (e.g., MEISi-1), and/or inhibition of GAD6714,20.