Multi-omics molecular phenotyping of childhood acute lymphoblastic leukemia cell lines reveals novel drug vulnerabilities


 Acute lymphoblastic leukemia (ALL) is the most common childhood cancer. Although standard-of-care chemotherapeutics are sufficient for most ALL cases, there are subsets of patients with poor response who relapse in disease. The biology underlying differences between subtypes and their response to therapy has only partially been explained by genetic and transcriptomic profiling. To characterize ALL subtypes and identify novel pharmacologic vulnerabilities, we performed comprehensive multi-omic analyses of 49 widely-used childhood ALL cell lines, using proteomics, transcriptomics, and pharmacoproteomic characterization. This enabled us to characterize the functional impact of genetic fusions and cellular differentiation states. The proteomics data revealed differences in spliceosome and p53 levels not evident in the transcriptomics data and with improved correlation to drug sensitivity. Focusing on BCP-ALL cell lines, we connected the genotype, molecular phenotype, and functional phenotype with drug response data on 528 oncology drugs. Here, we identified the DAG-analog Bryostatin-1 as a novel therapeutic candidate in the MEF2D-HNRNPUL1 fusion high-risk subtype, for which this drug activated pro-apoptotic ERK signaling associated with molecular mediators of pre-B cell negative selection. Our data also forms an interactive online resource with navigable proteomics, transcriptomics, and drug sensitivity profiles at https://lehtio-lab.se/forall/.


INTRODUCTION
Acute lymphoblastic leukemia (ALL) accounts for ~30% of all cancers in children, making it the most common childhood cancer. Despite the clinical success of broad chemotherapy protocols and allogeneic hematopoietic stem cell transplantation (HSCT) 1, 2 , a considerable number of patients (15-20%) continue to experience poor survival outcomes because they are non-responsive to standard of care therapeutics and likely to relapse in disease 3,4 . Additionally, approximately 80% of survivors of childhood ALL will experience a post-treatment lifethreatening medical event by age 45, an effect hypothesized to result from the severity and duration of ALL treatment protocols 5 . Targeted therapy protocols have demonstrated potential to reduce the likelihood of post-treatment health complications, improve outcomes and address resistance to chemotherapy. These targeted approaches have been successfully implemented clinically in protocols giving tyrosine kinase inhibitors to Philadelphia chromosome positive ALL patients (NCT03911128) 6 . Recently, targeted treatment strategies have expanded to include immunotherapeutic agents such as antibody and chimeric antigen receptor (CAR) -Tcell therapies 7 . However, their adoption remains limited to ALL subtypes expressing specific surface antigens, and challenges in CAR-T production and the clinical emergence of adverse events presently limits them from attaining widespread application 8,9 . Consequently, there is still an immediate need for novel therapeutic modalities for treatment of high-risk patients and patients who relapse.
Therefore, bridging the gap in therapeutic options for ALL will likely require development of both pharmacologic and cellular therapies; and this process will depend on deeper characterization of both the biomarkers and biology of non-responsive subtypes. While the genetic landscape of childhood ALL has been extensively studied by implementation of next generation genomic, transcriptomic, and epigenetic sequencing tools 10,11 , somatic mutations can only partially explain the underlying biology and phenotype, and approximately 25% of childhood ALL patients lack a detectable driving mutation 12 . Although genetic characterizations have improved risk stratification and hope of new molecular targets for therapy, the primary challenge of identifying novel and effective treatments for ALL is achieving more reliable clinical stratification that can improve therapeutic response 13 .
Therefore, new approaches are needed to identify gaps between genetic and phenotypic properties of the disease.
Considering that regulation of protein abundance by post-transcriptional and post-translational mechanisms is necessary to maintain cellular fitness, and that these mechanisms have diverse impacts on the protein levels for each cellular component 14 , the proteome represents an ideal framework for understanding cellular phenotypes. Recent studies have demonstrated poor correlation between the proteomic and the genetic phenotypes of cancers 15,16,17 , including childhood ALL 18 . Although studies of tumors and human-derived cell lines 19 have demonstrated the importance of proteomic characterization, our recent characterization of ETV6-RUNX1 and hyperdiploid ALL is the only previous study that has used these methods to understand the biology of ALL 18 .
MEF2D fusions represent ~3.6% of childhood ALL cases, and patients experience 5-year event-free survival of 71%, identifying this patient subset as high-risk of relapse or disease progression in current therapy protocols 20 . Through proteomics-guided analysis of childhood ALL cell lines, we have generated a comprehensive resource of biomarkers and drug sensitivities, and identified a potential new therapeutic vulnerability to target the MEF2D-HNRNPUL1 rearranged high-risk subgroup. Our dataset demonstrates that proteomics and transcriptomics identify different phenotypic landscapes in childhood ALL cell lines, and we used proteomics to identify biological and phenotypic characteristics that suggest promising novel insights into both on-target and potential off-target drug activities. We provide a user-friendly web application for exploration of the proteomic, transcriptomic, and drug sensitivity data described in this study, available at https://lehtio-lab.se/forall.

In-depth multi-omics profiling of childhood ALL cell lines
We performed in-depth profiling by examining protein and RNA levels as well as drug sensitivities ( Figure 1A Fig. 1C, Supp. Table 5).
Stratified by cytogenetic background, our dataset was confirmed to reflect enrichment of known markers and pathways. Fusion proteins known to act as leukemia-driving lesions or signatures were enriched in our proteomics dataset, including PBX1, ROR1 and WNT16 for the TCF3-PBX1 fusion 22 , and ABL1 for BCR-ABL1, NUP214-ABL1, and SPFQ-ABL1 fusions ( Figure 1D). Other known markers associated with specific translocations could also be detected, including IGF2BP1 and PIK3C3 for ETV6-RUNX1 23, 24 and MEIS1 for KMT2A/mixed-lineage leukemia 25 (Figure 1D, Supp. Fig. 1D).

Correlation analysis of ALL proteome and transcriptome
Cellular phenotypes are controlled by diverse mechanisms that occur at multiple points following transcription, and we sought to quantify the impact of these mechanisms by performing a systems-level analysis of matched proteome and transcriptome profiles (n=66).
We performed Spearman pairwise correlation for all mRNA-protein pairs (n=8981), and the median correlation coefficient was 0.55 (Figure 2A) 19,27 . This could be as a result of clonal derivation of cell lines, or owed to the significant technical advantages of working with cell lines compared to the complexity in sample acquisition, preparation, and cellular fitness requirements of clinical samples. To evaluate the functional impact of proteins detected at levels that diverged from their respective mRNA quantification, a list ranked by Spearman correlation coefficient was used to identify enriched KEGG pathways for the highest and lowest correlated pairs. Highly correlated mRNA-protein pairs belonged to specialized signal transduction pathways, while poorly correlating pairs belonged to house-keeping functions ( Figure 2B).
Post-transcriptional fine-tuning of transcript stability and translational efficiency is known to be achieved by miRNAs 28 . Using mRNAs experimentally validated 29 to be targeted by miRNAs, we compared Spearman correlation coefficients of these miRNA targets relative to non-targets, and we observed that miRNA targets demonstrated significantly higher mRNAprotein correlations (Supp. Fig. 2A). These results confirm the relevance of miRNAs in regulating transcript abundance and target protein levels.
In contrast, molecular mechanisms that occur post-translationally could contribute to divergent mRNA-protein correlation. Evidence of post-translational buffering has been observed to have significant biological importance for members of supernumerary protein complexes 30 , which are theorized to be especially susceptible to degradation due to instability of orphaned components and dependence on precise stoichiometries for assembly and proper activity 31 .
Using a dataset of protein complex members obtained from CORUM 32 , co-enrichment of complex members was quantified relative to random protein pairs. This demonstrated higher correlations at the protein level than the transcript level ( Figure 2C and Supp. Fig. 2B).
Previous studies of eukaryotic cells have indicated that post-translational regulation is most commonly observed at subcellular locations where protein complexes assemble 31 . By overlaying MS-based predicted localizations 33 on mRNA-protein correlations, we found that mitochondrial and nuclear proteins showed lower correlations than cytosolic and secretory proteins (Supp. Fig. 2C). In contrast, the secretory protein group, consisting of vesicle and plasma membrane proteins, had higher mRNA-protein correlations 18,33 .
Degradation of protein complex members can occur via non-exponential ubiquitin-mediated degradation or exponential degradation 34 . These mechanisms differed in their degree of posttranslational interference as assessed by mRNA-protein correlation in our dataset, and lower mRNA-protein correlations were observed for proteins following non-exponential degradation compared with proteins following exponential degradation (Supp. Fig. 2D). Interestingly, nonexponential degradation significantly differed from exponential degradation only for proteins not targeted by miRNAs (Supp. Fig. 2E) suggesting complementarity of buffering mechanisms for pervasive gene co-expression 35 .
In parallel with the above-mentioned general mechanisms, we detected unique molecular fingerprints on mRNA-protein rations related to mutational status. For the REH cell line, which harbors a GINS2 mutation (P19S) 36 , mRNA-protein correlations for GINS-complex members were significantly poorer than correlations for corresponding protein-mRNA pairs in other cell lines. This suggests that the mutation could have an impact on the regulation or function of the entire protein complex that is only evident at the proteome level. Our data implicates a potential impairment in complex formation, which would lead to collateral degradation of the other GINS-members 37 (Figure. 2D).
Considering markers that are frequently involved in ALL initiation and progression, MTOR, AKT2 and KMT2D displayed a poor mRNA-protein correlation, and TP53 showed no correlation ( Figure 2E). This is in line with the previous observations that KMT2D truncating mutations impact protein level but not transcript level 38 . Markers from the B-, and T-cell receptor KEGG pathways showed a similar pattern with a wide range of mRNA-protein correlations (Supp. Fig. 2F and G). These data further support that transcript level analysis of markers in ALL alone allows for imprecision in interpretations of protein-level abundance of markers.

Proteome and transcriptome of ALL display alternative subgroupings.
Consensus clustering of the proteome using high variance proteins (n = 3282) suggested seven distinct clusters. Hierarchical clustering including all samples and proteins stratified the cell lines into consensus leukemic clusters (CLC) (Figure 3A and Supp. Fig. 3). Although some clusters included multiple cell lines sharing the same cytogenetic subtype, the majority of the proteomics clusters could not be solely explained by shared cytogenetic background. CLC1 contained cell lines with ETV6-RUNX1, BCR-ABL1, KMT2A-MLLT1 and IGH-MYC gene fusions. CLC2 consisted primarily of TCF3-rearranged ALL cell lines (TCF3-PBX and TCF3-HLF) along with several cell lines with previously unspecified cytogenetic background. CLC3 contained 16 out of 22 T-ALL cell lines, again with various cytogenetic backgrounds and characterized by G2M checkpoint hallmark, higher spliceosome and higher E2F target activity when compared to the other six T-ALL cell lines that clustered in CLC7 (Supp. Table 6).
Given that co-transcriptional regulatory mechanisms have been implicated in the assembly and degradation of the spliceosome 39 , we performed DE analysis on the transcriptomics data between CLC3 and CLC7 followed by GSEA analysis, which failed to recapitulate the upregulation of the spliceosome (Normalized enrichment score (NES) 0.61, q = 1.0) ( Figure   3B). Altered splicing profiles and differential splicing have been implicated in drug and therapy resistance and in ALL 40,41,42 , and these results provide further support for the value of using the proteome to quantify key markers.
The B and T lineages are quite distinct cell types and to gain a better resolution on relative differences between the two different lineages, each lineage was subsequently clustered separately. This identified five clusters in the B lineage (B-CLC) and seven clusters in the T lineage (T-CLC) cell lines ( Figure 3C). Additionally, we performed consensus clustering of the transcriptomics data, which subdivided each lineage into four different subgroups ( Figure   3D). In the transcriptomics data, KMT2A-rearranged subtypes (KMT2A-AFF1 and KMT2A-MLLT1) were classified to the same cluster (cluster 1), whereas the proteomics clustering divided these subtypes into two different proteomic clusters (B-CLC1 and B-CLC5), demonstrating that the proteomics analysis delivers an additional angle to reveal phenotypic differences, which may be clinically meaningful. We performed DEqMS analysis between these two KMT2A-rearranged subtypes and identified p53 among the top upregulated proteins in KMT2A-AFF1 cell lines (Figure 3E, Supp. Table 7) suggesting differences in the p53 regulatory network between the KMT2A cell lines.

The proteome of ALL cell lines echoes distinct states in B-cell differentiation.
Mechanisms driving B-lineage hematopoiesis are also implicated in leukemogenesis, and Blineage markers are commonly used in clinical practice to characterize clinical ALL cases 43 .
To evaluate whether biologically relevant lineage states were detected by proteome characterizations of our cell lines, we used established cellular markers of B-cell developmental stages 44 , and we identified corresponding cell states for each of the 25 BCP-ALL cell lines ( Figure 4A, 4B). Together, relative protein abundance of DNTT, CD19, RAG1/RAG2, and VPREB1 by quantitative proteomics were assessed to immunotype the BCP-ALL cell lines.
Three cell lines displayed ambiguous or transitional immunotyping (noted as "pre-B other"), including a BCR-ABL1 fusion cell line and two cell lines from KMT2A-rearranged subgroups.
All other cell lines could be assigned to a corresponding cell state. This recapitulates clinical characteristics of samples with these rearrangements; BCR-ABL1 and KMT2A-rearranged leukemias have been previously demonstrated to display mixed lineage markers 45 . Among ambiguously immunotyped samples, we noted especially high abundance of SPN/CD43, a marker of hematopoietic stem cells of lymphoid lineage, which is expressed during development and downregulated during maturation, but can also be re-expressed following activation 46 (Figure 4A, 4B). This indicates that hematopoietic-like traits could contribute to acquisition of an unconventional lineage, in line with theories of the hematopoietic origins of mixed lineage leukemias 47 .
When B-cell state assignments were compared across cytogenetic subtypes, some cytogenetic subtypes appeared to be confined to one B-cell state, while others occupied multiple cell states and proteomic clusters ( Figure 4C). Uniform B-cell state assignment was identified for TCF3-PBX1 and MEF2D-HNRNPUL1 cell lines across technical and biological replicates, with all samples typed as pro-B or early-pre-B respectively, suggesting that these fusions could be linked to acquisition of an arrested cell state. These fusions were also associated with unified clustering assignments.
To explore how clinically and biologically defined lineage markers are associated with proteome-defined phenotypic subtypes, we examined relative levels of curated lineage markers across B-CLC clustering assignment ( Figure 4D, Figure 4E) These observations indicate that both conventional lineage and oncogenic traits contribute to proteome-level differences in our cell line panel. Our phenotypic profiling supports current clinical practice in leukemia stratification and suggests that proteomics could be an effective new avenue to explore the drivers that contribute to pathogenic phenotypes.
The drug sensitivity of BCP-ALL cell lines to a set of 528 oncology drugs.
There is a demand for new ALL therapeutics to improve outcomes for high-risk patients and address disease relapse, and drug repurposing could help identify promising therapeutics that are safe and clinically viable. Given that the proteome represents a cumulative set of druggable target proteins, analysis of the proteome could reveal mechanistically relevant correlations to drug sensitivity, further elucidating the processes and markers that determine sensitivity to drugs. Therefore, we performed DSRT on the BCP-ALL cell lines (n=25) against a panel of 528 FDA approved and investigational oncology drugs at five concentrations. The selective drug sensitivity scores (sDSS) were obtained by normalizing drug sensitivity against drug response of normal bone marrow, as described in 52 (Figure 5A and Supp. Table 8).
Broad-spectrum cytotoxic drugs, e.g., conventional chemotherapy drugs demonstrated high activity in a majority of the tested cell lines (Figure 5B and Supp. Fig. 4A). Interestingly, numerous inhibitors targeting kinases (i.e. Aurora kinases, PLK1 inhibitors) or binding to the anti-apoptotic protein BCL2 (i.e. venetoclax, navitoclax) demonstrated high activity across most tested cell lines regardless of cytogenetic subtype (Supp. Fig. 4A). Additionally, most cell lines were sensitive to the p53-MDM2 antagonists (e.g., Idasanutlin, SAR405838) except for four cell lines, which were consistently resistant to these antagonists (Supp. Glucocorticoids (GCs) are components of standard of care combination chemotherapy protocols for childhood ALL 53,54 , and screening of patient cells ex vivo after diagnosis has been shown to predict response or resistance to these therapeutics 55 . Most BCP-ALL cell lines demonstrated substantial sensitivity to dexamethasone, prednisolone, and methylprednisolone, except for six cell lines, among them the REH cell line (Supp. Fig. 4A and Supp. Table 8), which has previously been shown to be resistant to dexamethasone, in agreement with our data 56 . Sensitivity to GCs correlated very well with the glucocorticoid receptor (NR3C1) protein abundance ( Figure 5C), consistent with associations between favorable patient response to GCs and basal expression levels of NR3C1 in ALL and myelomas 57,58 . Anthracyclines are also widely used in childhood ALL therapy, and they demonstrated correlation with TOP2A and TOP2B (Supp. Fig. 4C). Correlation between favorable outcomes with anthracycline treatment and transcriptional expression of TOP2A and TOP2B in acute myeloid leukemia has been previously reported 59 . Together, these results demonstrate alignment with the activity profiles of clinically validated drugs and their associated mechanisms.
Identification of patients likely to be resistant or responsive to a targeted therapy is critical, and understanding the mechanism of action (MoA) of drugs is an effective first step to achieve this goal. Correlation between mRNA expression and drug sensitivities has been previously used to analyze MoA of compounds 60 . Limited datasets of Reverse-phase Protein Arrays (RPPA) 61 using ~230 proteins have also been used for protein drug correlations in cancer cell lines, including ALL cell lines (n=7), and these results identified that protein levels had stronger correlations with drug activity than corresponding mRNA levels. To investigate the association between the molecular markers in our dataset and the pharmacologic responses of our cell line panel, we performed pairwise drug sensitivity correlation analysis with our transcriptomics and proteomics data, using drugs with an sDSS score of ≥8 in at least one cell line (n=336) for the analysis. Overall, relative protein abundances assessed by sDSS score had a significantly (p=2.2e-16) higher correlation than relative protein coding mRNA levels (Supp. Fig. 4D), in line with previous observations 61 . For each drug, which was effective (sDSS > 8) for one cell line at minimum, a matrix of Pearson correlation values for proteins correlated to drug sensitivity was used as input for dimensionality reduction using UMAP 62,63 . Conventional chemotherapy and epigenetic modifying drugs each demonstrated close association in UMAP space, illustrating similar patterns of protein abundance associated with drug sensitivity, which suggests conserved MoAs for the drugs in each respective drug class ( Figure 5D). Similar patterns could be seen for many other drug target families such as ABL1, HDAC, BCL2, and PI3K inhibitors, further demonstrating a robust setting for MoA elucidation (Supp. Fig. 4E).
For these drug classes, we examined outliers which were dispersed from their counterparts, and we noted that the selective class I HDAC inhibitor Tacedinaline was substantially dispersed from other class I HDAC targeting drugs. Additionally, TCF3-PBX1 fusion cell lines demonstrated resistance to Tacedinaline, despite responding to a majority of other HDAC inhibitors (Supp. Table 8). These observations suggest that further study is warranted to understand why Tacedinaline displays these distinctions from other drugs of its class. Close associations could be seen for drugs with different proposed molecular targets, such as Mubritinib, an ERBB2 inhibitor that correlated significantly with HIF1A inhibitor BAY-87-2243 (R = 0.69, p = 1.19e-04) and VCP/p97 inhibitor NMS-873 (R = 0.83, p = 3.52e-07), indicating a shared MoA. These drugs have a common alternative molecular target, respiratory complex I 64, 65 66 . Inhibiting the electron transport chain might lead to reduced ATP availability, which could explain the strong positive correlation of these drugs and abundance of proteasomal subunits (Supp. Fig. 4F), indicating ATP-consuming high protein turnover rates.
Using a list of reported putative drug target protein(s) for each drug, we ranked the drug sensitivity-protein abundance Pearson correlations of each drug and its putative drug target to examine the relationship between target abundance and drug response across the screened drug library ( Figure 5E). Among the highly ranked drugs in this analysis, several tyrosine kinase inhibitors (e.g. imatinib and bafetinib) and receptor tyrosine kinase inhibitors (e.g. sorafenib and quizartinib) correlated well with protein abundance of their putative targets ABL1 and FLT3 respectively (Figure 5E and Supp. Fig. 4G). Additionally, PARP1 inhibitors talazoparib, niraparib and olaparib correlated well with PARP1 levels ( Figure 5E). MDM2 inhibitors positively correlated with MDM2 abundance while anticorrelating with the target of MDM2 degradation, p53 ( Figure 5E). Thus, mechanistic insights into drug activity can be inferred using correlation analysis between drug sensitivity and baseline protein abundance.
The proteomics coupled with sDSS assessment also identified previously uncharacterized associations between protein abundance and drug sensitivity. The majority of the tested cell lines (n=22) demonstrated sensitivity to Rigosertib. Rigosertib, a pan-kinase inhibitor, has undergone late phase clinical trials for treatment of several hematopoietic malignancies (NCT01167166, NCT01928537, NCT01926587). Rigosertib has several proposed MoAs, such as being a RAS-mimetic compound 67 and an inhibitor of PLK1, which triggers a G2M transition 68 . Rigosertib has also been demonstrated to exhibit microtubule-destabilizing activity 69 , although this mechanism has been questioned amid criticism of drug purity levels 69,70 . GSEA analysis of the sDSS-protein abundance correlation of Rigosertib identified enrichment of PLK1, AURORA_B and cell cycle pathways (Supp. Table 9). This supports a PLK1 inhibition and cell cycle disruption MoA of Rigosertib in the BCP-ALL cell lines.
Among the top correlating proteins with Rigosertib drug sensitivity was MASTL, a microtubule-associated serine/threonine kinase (Supp. Fig. 4H), suggesting that the debated microtubule destabilization effects could be a result of pleiotropic binding to MASTL.
Additionally, no PDPK1 interacting proteins from the STRING network database correlated with drug activity (Figure 5F), and no significant correlation was detected for markers of associated PDPK1 pathways such as MTOR or AKT signaling. Previous work has disputed the designation of OSU-03012 as a PDPK1 inhibitor, and identified that cell death following treatment with OSU-03012 occurred via induction of ER-stress signaling 71 . Additional studies identified changes in abundance and stability of heat shock proteins (HSPs) following OSU-03012 treatment, and suggested that HSPs were targets of OSU-03012 72 . Our dataset identified significant correlation with HSP90 (HSP90AB1, Figure 5F) in support of this hypothesis.
However, among the most highly correlated proteins were many components of the chaperonin-containing TCP1-complex ( Figure. 5F), which performs ATP-dependent folding of polypeptide chains and is implicated in ciliogenesis. Drug activity via functional interference with this complex would also be consistent with previous data describing induction of ERstress following drug treatment, and these novel associations suggest that the chaperonincontaining TCP1-complex could be studied further as a potential target of OSU-03012, and a vulnerability axis in ALL. Additionally, this observation is especially interesting in the context of other ongoing clinical, preclinical and investigational applications of OSU-03012, which have identified the importance of observed changes in chaperone functionality and autophagy induced by drug treatment. This has included several disease indications, namely antifungal activity, antiviral activity, and prion clearance in neurodegeneration 73,74 . We identified seven cell lines being sensitive to treatment with OSU-03012.
Another drug of interest is Tacrolimus, which demonstrated efficacy in eleven of the tested BCP-ALL cell lines. Tacrolimus is a macrolide lactone widely used clinically as an immunosuppressant due to its ability to inhibit calcineurin-mediated dephosphorylation of NFAT 75 . Previous characterizations have suggested Tacrolimus does not target this pathway by direct binding, but rather inhibits dephosphorylation of NFAT in complex with FKBP1A, a cis-trans prolyl isomerase belonging to the immunophilin protein family 76 . Despite the crucial role of FKBP1A in this mechanism of action, we found that the abundance of FKBP1A negatively correlated with sensitivity to Tacrolimus (R = -0.4, p = 0.048). However, another member of the cis-trans prolyl isomerase family, FKBP10, strongly correlated with Tacrolimus sensitivity (R = 0.81, p = 7.7e-05) ( Figure 5G). FKBP10 is known to bind to Tacrolimus, and although previous work has suggested that this interaction affects collagen assembly, the biological impact of the FKBP10/Tacrolimus interaction is under-investigated relative to other tacrolimus binding partners 77,78 . This unexpected correlation suggests that FKBP10 merits further study with regard to its binding to Tacrolimus and activity profile. Collectively, our results demonstrate that integrated proteomics and drug sensitivity analysis is an especially well-suited approach to identify drug-target specificities and deciphering their MoA in ALL.

The phenotypic signature and drug sensitivity of MEF2D-rearranged cell lines
Across our analysis of cytogenetic subtype, proteomic clustering, drug sensitivity, and cellular state, we observed striking similarities among a group of cell lines distinguished genetically by a MEF2D-HNRNPUL1 fusion. MEF2D is a member of the myocyte-specific enhancer factor 2 (MEF2) family of transcription factors, which have key roles in a variety of malignancies 79 and in B-cell development 80 . In the context of childhood ALL, MEF2D has been found fused to at least six different fusion partners 20,81 . The MEF2D rearranged cytogenetic subtype is associated with relatively poor clinical outcomes 20,81,82,83 , and is also found in adult-onset ALL.
MEF2D rearrangements are associated with increased HDAC9 levels and MEF2D transcriptional activity. HDAC9 is a class II histone deacetylase that regulates the activity of MEF2D-HNRNPUL1 leukemias display a distinct gene expression signature 20,81 , which showed substantial overlap with our data (Supp. Fig. 5A, Supp. Table 10). Phenotypically, these cell lines clustered together when analyzed by transcriptomics or proteomics, and our cell state assessment immunotyped them uniformly as early pre-B cells (Supp. Fig. 5B), which is consistent with a recent study of a fourth MEF2D-HNRNPUL1 fusion cell line (KASUMI-7) derived from an adult patient 85 . DEqMS analysis identified MEF2C and HDAC9 among the top differentially abundant proteins in comparison to the other BCP-ALL cell lines ( Figure   6A, 6B, Supp. Table 11), suggesting they may also be sensitive to the broad class I HDAC inhibitors. However, we found no significant difference in sensitivity compared to the other BCP-ALL cell lines (Supp. Fig. 5C). To investigate this further, we used TMP269, a potent class IIa HDAC inhibitor with an IC50 of 23 nM for HDAC9 86 . Again, we found no inhibitory effect in any of the tested cell lines (Supp. Fig. 5C). This suggests that HDAC9 may not be a therapeutically targetable vulnerability, at least as a leukemic cell-intrinsic toxicity in patients carrying the MEF2D-HNRNPUL1 fusion.
In contrast to their lack of specific sensitivity to HDAC inhibitors, the three childhood ALL cell lines with the MEF2D-HNRNPUL1 lesion demonstrated uniquely potent and specific sensitivity to Bryostatin-1 (p = 5.6e-06), a DAG-analog and protein kinase C (PKC) activator 87 (Figure 6C). DSRT analysis of the adult KASUMI-7 cell line also demonstrated significant sensitivity to Bryostatin-1 (sDSS = 24). By assessing proteomic markers with high Pearson correlations to sDSS score, we observed strong correlation between Bryostatin-1 toxicity and markers upregulated in the MEF2D-HNRNPUL1 cell lines ( Figure 6D).
One of the positively correlating markers was RASGRP1 (R = 0.54, p = 0.0064), a calciumand DAG-regulated guanine nucleotide exchange factor, which has been identified as a direct mediator of sensitivity to negative selection in pre-B cells 88 . Negative selection occurs at specific states of B-cell development, and it influences response to BCR stimulus by triggering programmed cell death after sustained, strong BCR signaling. In addition to limiting selfreactive mature B-cells, this mechanism has also been shown to limit the survival of dysfunctional pre-B cells 89,90 . In cells that are not vulnerable to negative selection, BCR stimulus can also contribute tonic survival signals, and therefore leukemias often demonstrate alterations in B-cell developmental progression that favor survival 91 . As cells that have expressed the pre-B cell receptor at the early pre-B stage, our MEF2D fusion cell lines are at a stage in development that could render them vulnerable to negative selection 89 . In support of an underlying manipulation of this negative selection pathway, we observed that MEF2D fusion cells had enriched abundance of the DAG degrading enzyme DGKH ( Figure 6A).
Negative selection mimicry by pharmacologic methods has been applied to target pre-B cell leukemia carrying the BCR-ABL fusion 92,93 as well as in B-cell lymphomas 94 , and has also been suggested for treating ALL 95 . The mechanism of negative selection mediated by RASGRP1 acts through calcium-regulated PKC delta activity: calcium-bound PKC delta phosphorylates RASGRP1 at S332, and this phosphorylation allows RASGRP1 to initiate proapoptotic ERK pathway activation 88,94 . Thus, we evaluated whether Bryostatin-1, as a DAGanalog, works by activating this pro-apoptotic ERK signaling. Following 2 hrs of 100 nM Bryostatin-1 treatment, we observed a significant increase of abundance of phosphorylated ERK compared to DMSO treated controls ( Figure 6E).
To validate that Bryostatin-1 acts on-target as a DAG analog, we evaluated whether MEF2D-HNRNPUL1 cell lines would be sensitive to another DAG analog, Phorbol myristate acetate (PMA). PMA also conferred potent toxicity in MEF2D-HNRNPUL1 cell lines (Figure 6F), and this effect was significantly more potent in MEF2D-HNRNPUL1 cell lines than other tested BCP-ALL cell lines (n=8 BCP-ALL cell lines, p = 8.8e-10), suggesting that the on-target effects of Bryostatin-1 are replicated by PMA. To further confirm that ERK signaling was a mediator of toxicity, we blocked ERK pathway (phosphorylation) by using the MEK inhibitors Tramatenib, UO126, and Selumetinib, and the ERK inhibitor ERK 11e. In cells treated with 100nM Bryostatin-1 or 25nM PMA, concurrent dosing with 1uM of MEK or ERK inhibitors significantly improved viable cell counts for all tested inhibitors (Figure 6G, Supp. Fig. 5D).
With the exception of UO126, treatment with 1uM MEK or ERK inhibitors alone resulted in slight but significant reductions in viability for MEF2D-HNRNPUL1 fusion cell lines, further supporting that drugs targeting ERK phosphorylation specifically benefit viability in the context of correcting DAG-analog toxicity (Supp. Fig 5E).
These characterizations demonstrate that vulnerability to negative selection at the pre-B stage may be partially conserved in leukemic subsets and targetable by DAG analogs. Thus, further study of this therapeutic strategy in MEF2D-HNRNPUL1 fusion leukemias is merited. More broadly, this finding illustrates that proteomics coupled with drug screening is an analytical approach that can support robust and replicable identification of molecular mediators of drug toxicity.

DISCUSSION
We here report an in-depth multi-omics layered analysis of 49 readily available childhood ALL cell lines encompassing more than 12,000 proteins and 19,000 protein-coding RNA transcripts as well as drug sensitivity to 528 oncology and investigational drugs. This represents the first proteomic analysis of readily available childhood ALL cell lines covering numerous cytogenetic subtypes, which complements our previous proteogenomic analysis 18 of the two most common subtypes (i.e. ETV6-RUNX1 and Hyperdiploid). Genomic profiling has been the major tool utilized for characterization of childhood ALL. However, protein levels have been demonstrated to have a more direct impact on cellular phenotypes, and our resource and analysis thus provides an improved insight into the biology of leukemia. In this study, we demonstrated that the mass spectrometry-based proteomics analysis of cell lines captures many features of ALL and subtype specific biology. Overall, our integrative data recapitulated the poor mRNA-protein correlation further highlighting the importance of addition of proteomic analysis in studying childhood ALL.
By utilizing this resource, we were able to identify novel phenotypic and biological insights.
For the TCF3-PBX1 and MEF2D-HNRNPUL1 fusion samples, separation by cytogenetic subtype drove the clustering assignments when assessed using transcriptomics, but phenotypic similarities with other cytogenetic subtypes could be identified in the clustering designations obtained using proteomics data (Figure 3D). In the drug sensitivity screening, protein abundances showed higher correlations with drug sensitivity than protein coding mRNA expression, demonstrating the value of proteomics also in characterizing drug activity ( Figure   5E). The drug sensitivity screening recapitulated clinically validated results, and conventional chemotherapeutic agents and GCs demonstrated broad activity across the majority of the BCP-ALL cell lines.
In addition, we performed drug sensitivity correlation analysis on a proteome-wide level, which identified correlations between clinically used therapeutics and abundance of their molecular targets. Tyrosine kinase inhibitors ranked as the highest correlated drug-target protein pairs.
The correlation of drug sensitivities with proteins other than their intended protein targets (e.g. OSU-03012, Tacrolimus and Rigosertib) also provide an excellent basis for elucidating their true molecular targets in ALL. Beyond ALL, this analysis provides general insights that could be used to identify novel off-targets.
Notably, we identified a DAG analog and proposed PKC activator, Bryostatin-1, that demonstrated a phenotype-and subtype-specific efficiency in the MEF2D-HNRNPUL1 cell lines. Our observations demonstrate that Bryostatin-1 activates the proapoptotic PKC/RASGRP1/ERK signaling pathway in early pre-B cells, the progeny stage at which MEF2D-HNRNPUL1 cell lines are arrested. Bryostatin-1 has been investigated as a latencyreversing agent in T-cells for HIV treatment as well as cancer 96,97 and Alzheimer's 98 treatment and has successfully completed several phase I and II clinical trials 99 . Our finding provides an initial basis for further exploration and repurposing of Bryostatin-1 for this high-risk group.
Additionally, our identification of negative selection vulnerability in MEF2D-HNRNPUL1 fusion cases represents a biologically novel therapeutic approach, and as a result of its orthogonal mechanism it could be especially useful in cases of drug resistance and relapse or pursued in a combination therapy. Collectively our proteomics, transcriptomics, and pharmacoproteomics analysis of this childhood ALL cell line panel provide a rich resource for exploration and hypothesis generation.     Drugs were correlated to gene symbol-based protein quantification data based on sDSS score, and Pearson correlation scores from this analysis were used to generate the UMAP. Each drug is annotated by its class. E. Scatter plot of the Pearson correlation versus rank for the sDSS of each drug and the relative log2 protein level of its putative protein target. The drugs are colored by drug classes.    cell lines vs other BCP-ALL cell lines. The MEF2D-HNRNPUL1 cell lines were not significantly more sensitive to class I HDAC inhibitors from the DSRT experiments, the x-axis shows sDSS. The MEF2D-HNRNPUL1 cell lines were also not more sensitive to selective HDAC9 inhibitor (TMP269) at any of the six tested concentrations ranging from 0.0001 to 10µM in 10-fold dilution, the x-axis here shows the ratio of live cells between TMP269 and DMSO treated samples.
E. Viable cell quantification normalized to corresponding mean DMSO viable cell count of MEF2D-HNRNPUL1 fusion cell lines KASUMI-9, P30-OHKUBO, LC41, and KASUMI 7, treated with 25 nM of PMA alone or in combination with 1uM MEK inhibitors UO126, Trametinib, or Selumetinib. Alternatively, to block ERK directly, 1uM ERK inhibitor ERK 11e was used, and equal volume of DMSO was used as a control in triplicates. All wells were supplemented to obtain equal DMSO volumes per well.Viable cells were quantified by flow cytometry per 15uL HTS collection, excluding zombie aqua dyed non-viable cells. Results are merged from n=3 independent experiments. p-values were obtained from unpaired t-tests.
F. Viable cell quantification normalized to corresponding mean DMSO viable cell count of MEF2D-HNRNPUL1 fusion cell lines KASUMI-9, P30-OHKUBO, and LC41, treated with 1uM MEK inhibitors UO126, Trametinib, or Selumetinib, or 1uM ERK inhibitor ERK 11e. Equal volume of DMSO was used as a control in triplicates. All wells were supplemented to obtain equal DMSO volumes per well. Viable cells were quantified by flow cytometry per 15uL HTS collection, excluding zombie aqua dyed non-viable cells. p-values were obtained from unpaired t-tests. High resolution isoelectric focusing (HiRIEF) of peptides

Supplementary information
The prefractionation method was applied as previously described in 101  subtype were obtained from Shnaghail institute of haematology and were added to St. Jude cohort and processed accordingly using the same RNA-seq analysis pipeline. The reads from bam files were re-aligned to gencode.v31.p12 human genome then the processing steps and annotation files were used identical to the cell lines RNA-seq data processing steps.
Differential expression analysis of transcriptomics data RNA-based differential expression analyses were performed using edgeR 113 . FeatureCounts was used to assemble a raw counts matrix, and edgeR used this counts matrix to perform differential expression analysis based on negative binomial distribution. The two-groups differential expression analysis has been implemented in the data portal for both cell lines and clinical EGA data.

Identification of highly variable proteins
Identification of highly variable proteins were performed as previously described in 26  The EM process converged in a two distribution solution, which we assumed to represent the highly variable and the unmodulated proteins. Using this model, we estimated the number of highly variable proteins and selected a modified standard deviation threshold, which optimized the number of highly variable minus unmodulated proteins. As the EM process inevitably produces slightly different thresholds every time it is executed, we performed ten iterations and rounded the mean of the iterations to the lower 0.5 in order to have a reproducible solution.
Clustering of proteomics data This final PSM count per protein was then used to build the empirical Bayes statistics. The significance cut-off was set to an adjusted p-value of 0.01 and fold-changes to log2(1.5)-fold difference in abundance.

Correlation analyses
Correlation between mRNA and protein levels for each gene was performed using overlapping genes (n=8981) between the transcriptomic and proteomics data using 64 matched samples.
Correlation between the mRNA and protein abundance values for each of these gene was determined using Spearman's ρ correlation method. Correlation analysis between sDSS and protein abundance of all quantified proteins (n=12446) for 25 BCP-ALL cell lines (total n=43 including biological replicates) were also performed using Pearson Rank Correlation.

UMAP of drug-protein correlations
Using the Pearson correlation coefficient results calculated as described above, a 2D matrix of values was assembled representing Pearson correlations to gene symbol-centric protein quantification values for each drug. Drugs were limited to only the screen results meeting a minimum sDSS threshold of 8 for at least one cell line (n=336 drugs, n=9786 proteins). A scaled, centered matrix was generated and used as input for a PCA using the Seurat package in R (https://cran.r-project.org/web/packages/Seurat/index.html). This PCA calculation was used to generate an elbow plot and a jack straw plot, also using functions in the Seurat package.
Statistical analysis of flow cytometry data All quantification was performed using the BDFacsDiva software (BD Biosciences). For each experiment, mean DMSO treated counts were obtained across technical replicates (per cell line, per day of experiment), and this mean value was used to normalize counts to a ratio of viable treated cell count divided by viable mean DMSO cell count. Normalization operations, statistical tests, and data plotting was performed in R using the packages ggplot2 (https://cran.rproject.org/web/packages/ggplot2/index.html), and ggpubr (https://cran.rproject.org/web/packages/ggpubr/index.html). p-values were obtained from an unpaired t-test between groups performed using the stat_compare_means function in ggpubr. Significance was validated in un-normalized raw counts data.

Data and code availability
Processed data, raw data and code will be available upon publication in a peer-reviewed journal.