Visual Barcodes for Multiplexing Live Microscopy-Based Assays


Abstract:
While multiplexing samples using DNA barcoding revolutionized the pace of biomedical discovery, multiplexing of live imaging-based applications has been limited by the number of fluorescent proteins that can be deconvoluted using common microscopy equipment. To address this limitation, we developed visual barcodes that discriminate the clonal identity of single cells by targeting different fluorescent proteins to specific subcellular locations. We demonstrate that deconvolution of these barcodes is highly accurate and robust to many cellular perturbations. We then used visual barcodes to generate 'Signalome' cell lines by multiplexing live reporters to monitor the simultaneous activity of 12 branches of signaling, in live cells, at single-cell resolution, over time. Using the 'Signalome', we identified two distinct clusters of signaling pathways that balance growth and proliferation, emphasizing the importance of growth homeostasis as a central organizing principle in cancer signaling. The ability to multiplex samples in live imaging applications, both in vitro and in vivo, may allow better high-content characterization of complex biological systems.

Introduction:
The ability to multiplex samples has revolutionized science as well as medical practice. Genetic barcoding applications enabled unprecedented multiplexing, followed by parallel processing and analysis of dozens to hundreds of thousands of samples in applications like scRNA-Seq or functional CRISPR/shRNA/open reading frame (ORF) screens. In contrast, in the field of image-based screens, high-order multiplexing is limited by the small number of channels that can be practically separated with common microscopy equipment. We developed visual barcodes that enable multiplexing of microscopy-based applications and used them to multiplex live-cell reporters for the study of signaling pathway dynamics in cancer cells.
To maintain growth homeostasis, individual cells must balance and coordinate numerous and sometimes competing demands. These requirements are met by signaling networks that integrate information from numerous branches of signaling. In cancer, genetic and non-genetic alterations in major signaling pathways have been tightly linked to tumor initiation, progression and response to anti-cancer therapies. It was also demonstrated that many of these alterations can be classified into a dozen signaling pathways, which together regulate core cellular processes such as cell fate, cell survival, and genome maintenance. [1,2] To facilitate understanding of cancer signaling, previous studies have developed genetically tagged activity reporters, i.e. fluorescent proteins that exhibit changes in abundance or localization in response to particular signaling activities. [3,4] Unlike endpoint assays, which require ending the experiment in order to measure a phenotype, these reporters helped reveal the intricate dynamics of individual branches of signal transduction pathways, in live cells, at single-cell resolution. However, since it is difficult to multiplex fluorescence reporters, mutual dependencies between these separate branches of signaling have remained less explored.
In the present study, we describe the development and application of visual barcodes, a technology that enables multiplexing of live-cell imaging applications. The visual barcodes are used as labels of clonal identity of single cells in mixed populations. To advance the understanding of cancer signaling pathways and their crosstalk, we used visual barcodes to multiplex 12 live reporters of major signaling pathways, generating an experimental system that we term the 'Signalome'.

Using the Signalome, we investigated the coordinated (multiplexed) dynamics of 12 signal transduction pathways in cancer cells that were challenged with well-characterized chemical perturbants. Our results show that multiplexing 12 reporter lines not only increases throughput but also eliminates noise associated with well-to-well variation in high-throughput drug screens. Surprisingly, our results also identify a previously undescribed binary partitioning of cancer signaling into two distinct clusters.
To maintain homeostasis, proliferating cells require mechanisms that coordinate rates of cell division with appropriate rates of biosynthesis. To sustain the proliferative demands of oncogenes, cancer cells require precisely tuned rates of biosynthesis. [5] If cells fail to double their mass between consecutive cell divisions, cell mass will progressively diminish. Conversely, rates of biosynthesis that exceed rates of cell division can result in cellular enlargement and senescence. [6][7][8] We show that the two clusters of signaling pathways identified by the Signalome system represent a general stress response that correlates with the cells' need to balance growth and proliferation. Implementing the Signalome in cancer cell lines thus revealed a previously underappreciated importance of growth homeostasis as a central organizing principle in cancer signaling.
Our findings thus demonstrate that the Signalome is a robust technology that can help study the dynamics of signaling pathways at single-cell resolution and the interconnection between different signaling branches. As the system is highly modular, reporters can easily be replaced to allow the study of many other questions across different fields of biology. Additionally, the visual barcodes can be used in numerous in vitro and in vivo applications in which visual deconvolution of multiplexed cell lines is needed.

Results:
Developing visual barcodes to allow multiplexing of live imaging applications.
To construct visual barcodes, we first stably infected the A375 melanoma cell line with a lentivirus containing the nuclear marker histone 2A fused to iRFP (iRFP-H2A) to accurately demarcate nuclear and cytoplasmic compartments (Fig. 1A). Next, iRFP-H2A cancer cells were used to generate five subclones, each labeled with a cyan fluorescent protein (CFP) tagged with a unique cellular localization sequence that targets the CFP to one of five subcellular locations: nucleus, endoplasmic reticulum (ER), cytoplasm (NES), peroxisomes (Peroxi) and whole cell (WC) (Fig. 1B). CFP localization was thus used as a visual barcode that can discriminate clonal identity.
To test the accuracy of the visual barcodes, we imaged each of the five subclones separately in phase contrast as well as in the iRFP and CFP channels. We then used CellProfiler [9] to (1) segment the cells, (2) identify both nuclear and cytoplasmic compartments and (3) extract texture, shape, and intensity-based features in the CFP channel for each cell (Sup code). CellProfiler Analyst [10] was used to classify single cells based on visual barcode readouts. In our implementation, we used 70% of the cells as a training set, with the remaining 30% as a validation set. The average false detection rate (false positive barcode labeling) was 1.15%, while the miss rate (false negative rate) was 1.17%, representing very high precision and recall rates, respectively (Fig. 1C and Sup Fig. 1A). The highest false detection rate and miss rate were both detected between nuclear and ER localizations, which were thus deprioritized from additional follow-up.
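The two error measures reported above can be computed per barcode from the validation-set labels. The following is a minimal sketch, not the CellProfiler Analyst procedure itself: the function name `barcode_error_rates` and the toy labels are hypothetical, and it assumes that the false detection rate is taken as 1 - precision and the miss rate as 1 - recall, which is how the text relates them.

```python
from collections import Counter


def barcode_error_rates(true_labels, predicted_labels):
    """Return {barcode: (false_detection_rate, miss_rate)} for each barcode.

    Assumption: false detection rate = FP / (TP + FP), i.e. 1 - precision,
    and miss rate = FN / (TP + FN), i.e. 1 - recall.
    """
    pairs = Counter(zip(true_labels, predicted_labels))
    classes = sorted(set(true_labels) | set(predicted_labels))
    rates = {}
    for c in classes:
        tp = pairs[(c, c)]
        # Cells labeled c by the classifier that truly carry another barcode.
        fp = sum(n for (t, p), n in pairs.items() if p == c and t != c)
        # Cells that truly carry c but were assigned another barcode.
        fn = sum(n for (t, p), n in pairs.items() if t == c and p != c)
        fdr = fp / (tp + fp) if tp + fp else 0.0
        miss = fn / (tp + fn) if tp + fn else 0.0
        rates[c] = (fdr, miss)
    return rates


# Toy validation set (made-up labels, for illustration only).
truth = ["ER", "ER", "nucleus", "nucleus"]
calls = ["ER", "nucleus", "nucleus", "nucleus"]
print(barcode_error_rates(truth, calls))
```

Averaging the per-barcode values over all barcodes gives summary figures of the kind quoted in the text.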
To scale up the dimensionality of visual barcodes to a 12-barcode system, we fused three of the localization signals to four fluorescent proteins (CFP, BFP, GFP, and YFP), resulting in 12 distinct visual barcodes (Fig. 1D). This increase in barcode number still maintained very high accuracy, with an average false detection rate of 1.45% and a miss rate of 1.34% (Fig. 1E and Sup Fig. 1B). Most detection errors were between subclones with the same fluorescence color, with the highest false detection rate detected between whole-cell and peroxisomal BFP (Fig. 1E).
To explore the robustness of the system under different types of perturbations, we treated each of the subclones separately with 75 drugs (Sup Table 1). We found that after 48 hours of drug treatment, both precision and recall were still very high for almost all drugs, with only eight drugs (12%) having a miss rate above 3%, five of which strongly affected cell proliferation and viability (Fig. 1F, Sup Figs 1B,C).
Lastly, we showed that the clones can also be separated when cells are in suspension, using the ImageStream high-resolution microscopy and flow cytometry system (Fig 1G-J). As our system lacked a laser to detect YFP, we multiplexed only nine of the subclones. To demonstrate that the visual barcodes can also be used in vivo, we mixed the nine clones and implanted them subcutaneously in a nude mouse. When the tumor reached a diameter of 8 mm, it was excised, dissociated into single cells and analyzed by the ImageStream system, demonstrating that all nine clones could be detected (Sup Fig. 1D-G).

Generating the 'Signalome' reporting cell lines
To generate the Signalome, we assembled 12 previously published and well-characterized reporter constructs (Sup Fig 2A), each associated with the activity of a different cancer-related signaling pathway. To enable their multiplexing, we replaced the original tagged fluorescent protein in each of the 12 reporter vectors with the mStrawberry fluorescent protein. We used two different types of reporters: reporters that drive the expression of the fluorescent protein from a specific transcription response element (TRE), and kinase translocation reporters (KTRs) that translocate the fluorescent protein from the nucleus to the cytoplasm upon activation of upstream signaling (Sup Fig 2A).
We then infected each of the 12 A375 subclones that carry visual barcodes with one of the 12 reporters (Fig 2A). Next, we validated that the proliferation rate of the single-cell-derived reporter subclones does not differ from that of the parental cell line (Sup Fig 2B). We also validated that all 12 subclones are as sensitive to the BRAF inhibitor vemurafenib as the parental A375 cell line (Sup Fig. 2C). Lastly, we pooled all 12 subclones to generate the A375 signalome cell line, and demonstrated that the proportions of the 12 different clones remained constant over 48 hours of culture (Figure 2B).

An advantage of single-cell measurements is that perturbations can be characterized not only by their influence on the population average but also by their influence on the full distribution (i.e. the frequency of cells with low, medium or high signaling activity). To quantify condition-dependent differences in the distribution of reporter activity, we used the Kolmogorov-Smirnov (KS) test, as it is a nonparametric test that can also detect changes in the distribution that are not reflected by its mean (Sup Figs. 2D,E). [11,12] To complement the KS score, activity scores were assigned a positive or negative sign based on the direction of change in the mean of these distributions. To validate our readout, we used known positive or negative regulators for each of the reporters (Sup Fig 2F).
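The signed activity score described above can be illustrated with a short sketch. This is an assumption-laden illustration rather than the analysis code used in the study: the function name `signed_ks_score` is hypothetical, the statistic is the standard two-sample KS distance (the largest gap between the two empirical cumulative distributions), and the sign is taken from the difference in means between treated and control cells.

```python
from bisect import bisect_right


def signed_ks_score(control, treated):
    """Two-sample KS statistic, signed by the direction of the mean shift.

    Assumption: the score is positive when the treated mean is higher than
    the control mean and negative when it is lower.
    """
    a, b = sorted(control), sorted(treated)
    d = 0.0
    # The ECDF gap can only change at observed values, so checking those suffices.
    for x in set(a) | set(b):
        ecdf_a = bisect_right(a, x) / len(a)
        ecdf_b = bisect_right(b, x) / len(b)
        d = max(d, abs(ecdf_a - ecdf_b))
    mean_shift = sum(b) / len(b) - sum(a) / len(a)
    return d if mean_shift >= 0 else -d


# Made-up single-cell reporter intensities, for illustration only.
dmso = [1.0, 2.0, 3.0, 4.0]
drug = [3.0, 4.0, 5.0, 6.0]
print(signed_ks_score(dmso, drug))
```

Because the KS distance is bounded between 0 and 1, a signed score of this kind lies between -1 and 1, which makes scores comparable across reporters with different intensity ranges.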
As an additional validation, we show that drug-dependent changes in the activity of all reporters of the multiplexed A375-Signalome cell line are strongly correlated with measurements performed on the single-reporter subclones (r = 0.75, p < 10^-16) (Fig 2C). Reassuringly, we found that vemurafenib and the MEK inhibitor trametinib, both inhibitors of the MAPK pathway, exerted highly similar effects on the A375-Signalome cell line (Figure 2D-F).
An advantage of multiplexed, time-dependent measurements of signaling is the ability to differentiate direct and indirect drug influences. For example, while MAPK inhibitors (vemurafenib and trametinib) promoted detectable changes in measured ERK signaling that were observed 1.5 hours into drug treatment, an influence of these same drugs on other pathways was also observed, albeit at much later times (Fig 2D-F). This is in agreement with the direct effect of these drugs on the MAPK pathway and the subsequent adaptive response of the other pathways to the inhibition of the MAPK pathway. Indeed, previous reports already described activation of PKA [13], NFkB [14], HIF [15], and YAP/TAZ [16] in response to vemurafenib and suggested that these adaptations can contribute to resistance to MAPK inhibition. In addition, we observed a significant upregulation of retinoic acid receptor (RAR) activity 48 hours after BRAF or MEK inhibition (p < 6x10^-5). Interestingly, examination of three independent cohorts of melanoma patients demonstrated that patients with high activity of the RAR/RXR pathway, as calculated from expression data by PathOlogist [17], had better overall survival (Sup Figure 2G-I). Therefore, it may be of interest to further explore the role of RAR in melanoma and its response to therapy. These measurements also demonstrate the efficiency and throughput of the Signalome system: with one 384-well plate, we screened the influence of 75 drugs (in triplicate) and 39 DMSO controls on 12 branches of signaling, at multiple time points, and at single-cell resolution. A single signalome plate thus provided measurements on half a million cells at each of the different time points.
To demonstrate that visual barcodes and a signalome system can be readily applied to other cell lines, we generated two more signalome cell lines using the PC9 EGFR-mutated non-small cell lung cancer cell line and the SK-MEL-5 BRAF-mutated melanoma cell line. We were able to demonstrate that our barcode precision and recall rates are also very high in these cell lines (Sup. Figure 3A-D) and that an early inhibition of ERK by both inhibitors can be detected, followed by cell-line-specific adaptive mechanisms. For example, while inhibition of BRAF or MEK in the A375 melanoma cell line resulted in activation of the YAP/TAZ pathway, BRAF/MEK inhibition in the SK-MEL-5 cell line resulted in inhibition of the YAP/TAZ pathway (Sup Figure 3E-H). Here again, we found that EGFR inhibition in the PC9 cells also drives upregulation of RAR activity.

Large scale correlations in signaling suggest a generalized response that is compound independent.
To investigate interdependencies in cancer signaling, we treated the pooled A375 signalome cell line with a library of 422 well-characterized chemical perturbants (Table 2, Sup Figure 4A). Of all tested compounds, 122 (28.9%) promoted significant changes in at least one of the reported pathways (absolute KS score > 0.25). As expected, different drugs with similar targets displayed similar patterns of reporter changes (Sup Figure 4B-G).
Surprisingly, in addition to target-specific signatures, unsupervised clustering of the reporter activity scores also suggested a higher-order structure that partitions the signal transduction signatures into three clusters, two of which seem anticorrelated (Fig 3A). Cluster A groups compounds that seem to all activate the PKA, AKT, ERK, p38 and JNK pathways while inhibiting WNT, p53, NFkB, RAR, HIF and YAP/TAZ, while cluster B contains drugs that orchestrate the opposite response. Cluster C contains drugs that did not follow this dichotomy. The large number of drugs with varying mechanisms of action that result in signaling clusters A and B suggests a coordinated response to drug treatment that involves all of our measured branches of signaling and is surprisingly independent of drug target. As a case in point, Fig 3B shows a negative correlation between the activities of p53 and p38 that persists across a wide diversity of chemical perturbations. Drugs that diminished p38 activity correlate with an equivalent increase in the activity of p53, and vice versa (Pearson's r = -0.517, p < 2.2x10^-16) (Fig. 3B). More generally, drugs promoted positive correlations among pathways within cluster A or B (intra-cluster correlations) and negative correlations when comparing pathways from cluster A to pathways from cluster B (inter-cluster correlations). For simplicity, we will refer to these two clusters as the p38 signaling state (cluster A) and the p53 signaling state (cluster B) (Fig 3A).
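The distinction between intra-cluster and inter-cluster correlations can be made concrete with a small sketch. The function names and the toy activity scores below are hypothetical and are not taken from the study; the sketch only assumes that each pathway has one signed activity score per drug and that Pearson's r is computed across drugs for every pair of pathways.

```python
from itertools import combinations, product
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length, non-constant lists."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def cluster_correlations(scores, cluster_a, cluster_b):
    """Return (mean intra-cluster r, mean inter-cluster r).

    scores maps pathway name -> list of activity scores, one per drug.
    """
    intra = [pearson(scores[p], scores[q])
             for cluster in (cluster_a, cluster_b)
             for p, q in combinations(cluster, 2)]
    inter = [pearson(scores[p], scores[q])
             for p, q in product(cluster_a, cluster_b)]
    return sum(intra) / len(intra), sum(inter) / len(inter)


# Made-up activity scores for four drugs, for illustration only.
toy = {
    "p38": [0.1, 0.2, 0.3, 0.4],
    "ERK": [0.2, 0.4, 0.6, 0.8],
    "p53": [0.4, 0.3, 0.2, 0.1],
    "HIF": [0.8, 0.6, 0.4, 0.2],
}
print(cluster_correlations(toy, ["p38", "ERK"], ["p53", "HIF"]))
```

In this idealized example the intra-cluster mean is close to +1 and the inter-cluster mean close to -1; real data would show weaker values of the same sign if the partition described above holds.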
Chemical perturbations can generate correlated influences by simultaneously affecting more than one target. Such correlated influences, however, should be compound specific, relating to target affinities that differ from one compound to another. By contrast, we found that the same pairs of pathways are positively or negatively correlated across a large number of drugs (n=122) that have multiple and highly different targets (Fig 3B, 3C). To demonstrate that this partition is not cell-line specific, we treated the PC9 signalome cell line with 247 drugs and found a very similar bifurcation into two anti-correlated clusters of pathways (Sup Figure 4H). The question, therefore, is as follows: how can such a wide variety of perturbations, each associated with different targets, converge onto only two main outcomes? We reasoned that the partitioning of the signaling pathways into two clusters suggests a regulatory process that is common to, and upstream of, all of our measured branches of signaling (Fig 3D).

Large scale correlations in signaling are present pre-treatment and increase over time by multiple drugs
To gain insight into the nature of this upstream regulatory process, we first performed time-course measurements to ask how soon after drug treatment the pairwise correlations become apparent. We found that pairwise correlations in pathway activity became more and more prominent throughout the 48 hours following drug treatment (Fig 4A, 4B). The segregation of the two signaling states after drug treatment is also apparent from visual inspection of the time-course measurements (Fig 4C-H). Trajectories of reporter activity in response to six representative drugs that are chemically distinct and associated with different targets demonstrated that while three drugs promoted the p38 signaling state, the other three activated the p53 signaling state. Altogether, these results demonstrate the binary partitioning of the signaling pathways by showing that a variety of different drugs, each associated with different targets, converge to promote two main signaling states.
These results suggest that the p38- and p53-signaling states are mediated by a process that, in response to drug treatments, gradually increases its influence, or activity, over time. Since it is likely that the drug treatments only promoted the activity of an already existing process, we were curious as to why the correlations shown in Fig 3C seem absent in unperturbed cells (Fig 4B). One possibility is that, prior to drug treatments, signaling pathways are subjected to the simultaneous influence of several competing regulatory demands, each pulling in a different direction. According to this interpretation, drug treatments promote correlated signaling by increasing the relative weight of one particular regulatory process, most likely a stress response, such that its effect is no longer averaged out by competing influences. This model suggests a testable hypothesis: the binary partitioning of the 12 pathways should become apparent in untreated cells if the measurements are normalized for independently existing correlations.
To this end, we used principal component analysis (PCA), a technique that transforms a dataset into a linear combination of independently existing multivariate correlations. As expected, PCA confirmed the high degree of correlation by identifying a single principal component (PC1) that explains almost 50% of the variance after 48 h of treatment (Fig 5A). Further, the first principal component clearly identified the two signaling states: drugs that promote the p38-signaling or p53-signaling states are characterized by positive or negative values of PC1, respectively (Fig 5B). Next, we repeated the PCA, but this time on measurements collected prior to drug treatments (time zero). Note that in this latter implementation of PCA, variation in measured activities did not reflect differences in drug response, as no drugs had yet been applied, but rather small well-to-well variations, such as differences in evaporation rate or oxygen concentration, that are usually referred to as noise. Since the signalome provides measurements on 12 pathways in each well, it can test whether these small variations lead to a well-specific shift in the signaling states that can be detected by PCA. Indeed, PCA identified both the p38- and p53-signaling states in the unperturbed cells as well (Fig 5C).
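To illustrate what it means for PC1 to explain a given fraction of the variance, the sketch below estimates the first principal component of a small activity matrix by power iteration on the covariance matrix. It is a didactic stand-in and not the analysis pipeline used here: the function name `first_principal_component`, the fixed starting vector and the toy matrix are assumptions, and in practice a library routine would normally be used instead.

```python
def first_principal_component(rows, iterations=200):
    """Return (PC1 loadings, fraction of total variance explained by PC1).

    rows: one list per sample (e.g. per drug or per well), one column per
    pathway. Assumes at least two rows and non-zero total variance.
    """
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    # Sample covariance matrix of the pathway activity scores.
    cov = [[sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
           for i in range(d)]
    total_variance = sum(cov[i][i] for i in range(d))

    def times_cov(v):
        return [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]

    # Power iteration: repeated multiplication converges to the top eigenvector.
    v = [1.0 + 0.1 * j for j in range(d)]  # arbitrary non-uniform start
    for _ in range(iterations):
        w = times_cov(v)
        norm = sum(x * x for x in w) ** 0.5
        if norm == 0.0:
            raise ValueError("no variance along the starting direction")
        v = [x / norm for x in w]
    eigenvalue = sum(a * b for a, b in zip(v, times_cov(v)))
    return v, eigenvalue / total_variance


# Made-up scores for two perfectly anticorrelated pathways, for illustration only.
toy = [[1.0, -1.0], [2.0, -2.0], [3.0, -3.0], [4.0, -4.0]]
loadings, explained = first_principal_component(toy)
print(loadings, explained)
```

The overall sign of the loadings returned by any PCA is arbitrary, so only the relative signs are informative: pathways with loadings of opposite sign on PC1 belong to opposing signaling states.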
To further explore whether the p38- and p53-signaling states precede drug treatments, we tested whether measurements on cells that were not exposed to drug treatments can predict the specific correlations observed after drug treatment (Sup Figure 5). This analysis identified dynamics that are classically characteristic of homeostasis (Fig 5D). In the first hour after drug treatment, chemical perturbations effectively eliminated the correlated activities that linked the different branches of signaling in the unperturbed cells. Several hours into drug treatment, however, correlated activities resumed and, in fact, gained more prominence. These results suggest a general stress program that (A) had functioned in cells that were not subject to drug treatment, (B) was effectively eliminated in the first hour of drug treatment and (C) resumed activity in the hours following drug treatment. This observation can also be visualized in Fig 4C-H, in which the two signaling states are apparent at time zero (before drug treatment), are lost at 1 hour, and regain significance at the later time points.

The p38- and p53-signaling states are linked to perturbations in cell size
The constancy of the two signaling states, in the face of diverse chemical perturbations, suggests that the large-scale correlations described by these signaling states function to support some process that is critical in our cell lines. Since our measurements were performed on cancer cells, we further reasoned that the process in question may relate to demands imposed by continuous cell divisions. To maintain homeostasis, proliferating cells must double their mass between consecutive cell divisions. In cancer, this requirement may be more critical. [18][19][20] If cancer cells fail to match the proliferative demands of an oncogene with equivalent increases in biosynthesis, cell size will decrease over time. [21] We therefore wondered whether the observed pattern of coordinated signaling results from stress-sensing systems that respond to changes in cell size resulting from imbalances in cell growth and cell division (Fig 6A).
As a first step, we asked whether drugs that selectively interfere with rates of biosynthesis trigger compensatory mechanisms that help the cell reach a new steady state between growth and proliferation rates. To inhibit the cell division rate, we used various chemical inhibitors of cyclin-dependent kinases (CDKs) such as SNS-032 [21], while to lower biosynthetic activity (cell growth), we inhibited either protein synthesis with cycloheximide or mTOR activity with rapamycin and torin. In all cases, drug doses were carefully optimized to ensure that cells were still proliferating and were not undergoing complete cell cycle arrest. For quantitative measurements of cell growth (protein synthesis per unit time), we followed a previously described protocol for single-cell measurements of total macromolecular protein mass using fluorescently labeled succinimidyl ester (SE), which labels all proteins (Sup methods). [21-23]
Our results demonstrate that during the initial hours of rapamycin treatment, while cell growth was rapidly inhibited, the rates of cell division were relatively unaffected (Fig 6B). Conversely, the CDK2 inhibitor SNS-032 lowered rates of cell division but did not affect cell growth (Fig 6B). At the later time points, however, a coordination of growth and proliferation was re-established, but at a slightly different setpoint. Cells with inhibited rates of biosynthesis adapted by promoting longer periods of biosynthetic activity (longer cell cycles). Yet this lengthening of the cell cycle fell short of a perfect adaptation, resulting in paired values of growth and division that fell slightly below the proportionality line. Similarly, to adapt to the longer growth periods imposed by CDK2 inhibitors, cells lowered the amount of protein synthesized per unit time. Yet here too, the compensation was incomplete, resulting in paired values that lay above the line.
Extending these results to multiple cell cycle or cell growth inhibitors across five cell lines demonstrated that incomplete compensation of the growth and proliferation rates is a general phenomenon (Fig 6C). In conclusion, our results suggest that (A) drugs that perturb rates of biosynthesis trigger compensatory changes in division rates, and vice versa, and (B) the adaptation of growth rates and division rates to drug treatments is typically incomplete, resulting in paired values of growth and division rates that lie above or below the proportionality line.
To investigate the possibility that the p38- and p53-signaling states that we observed are related to the homeostasis of cell size, we asked whether drugs that induce these two states differ in their influence on cell size. To that end, we scored each compound for the extent to which it promoted the p38-signaling vs the p53-signaling state (Sup methods). We then used the signalome single-cell resolution measurements to calculate the influence of each drug on cell division rate and cell size. Consistent with our hypothesis, we found that drug treatments that promote the p38 state correlated with a smaller cell size, while drugs that promote the p53 state correlated with an increased cell size (Pearson's r = -0.625, p = 1.68x10^-23) (Fig 6D).
Next, we used the signalome to test whether the two signaling states correlate with imbalances in cell growth and division rates. We found that drugs that disproportionately decrease growth rates (i.e. data points below the diagonal) were associated with the p38-signaling state, while drugs that disproportionately decrease proliferation (i.e. points above the diagonal) induced the p53-signaling state (Fig 6E,F).
In cancer, disproportional changes in growth and division can spontaneously result from intracellular genetic changes or from external stresses, including nutrient or growth factor deprivation. To test if the p38 and p53 signaling states are represented in human cancers, we mined the TCGA (https://www.cancer.gov/tcga) to retrieve proteomic measurements from 8,167 human tumors that span 32 different types of cancer. [24,25] To compare signalome reporters with TCGA, we assembled a list of 8 proteins or phosphoproteins that are known to correlate with the activity of pathways included in the signalome. Using these proteins as a surrogate for pathway activity, we separately analyzed each of the 32 cancers for correlated signaling.

Discussion
Live reporters are widely used to study signaling dynamics in cells. However, the ability to multiplex live reporters is currently limited. The integration of information from multiple signaling branches is critical for the understanding of complex biological processes. In this study we introduced visual barcodes, each consisting of a fluorescent protein coupled to a specific subcellular localization peptide, which allow multiplexing of cells in live imaging applications. We demonstrate that visual barcodes are robust to perturbations, have high precision and recall rates, and are applicable for multiplexing both in vitro and in vivo. Multiplexing of different subclones not only increases the throughput of experiments but also reduces cost and well-to-well or animal-to-animal variation. [26] Adding more fluorescent proteins or cellular localizations to the system can augment its multiplexing potential, and we predict that the system can be easily expanded to 20-plex combinations. Deconvolution of the visual barcodes was done using freeware, thus enabling the use of the system without licensing limitations. The visual barcode system can be used for a very wide variety of applications, such as competition assays between clones with different perturbations in vitro or in vivo, live tracking of cells with reduced risk of switching between subclones, as well as multiplexing of live reporters, as we demonstrated with the Signalome cell lines.
To generate the Signalome cell lines, we added a different fluorescent reporter to each of our 12 validated visual barcode subclones, reporting on major signaling pathways in cancer cells. While we generated each of the Signalome subclones using three consecutive rounds of infection (nuclear marker, visual barcode, fluorescent reporter), we envision that a visual barcode and a reporter could be integrated into a single plasmid, thus allowing one round of infection with a mix of plasmids on a cell line with a nuclear marker and enabling the generation of additional Signalome cell lines in days rather than weeks.
To better understand the interdependencies of signaling pathways, we treated the A375 and PC9 signalome cell lines with hundreds of characterized chemical perturbants. Altogether, our results suggest an explanation as to how a chemically diverse collection of drugs converged onto a much smaller number of signaling states. Growth and division are fundamental processes that are subject to multiple mechanisms of homeostasis. While different drugs affect different intracellular mechanisms, an influence on growth or division is a common denominator of many different drug targets. According to this model, the question of whether a drug promotes the p38 or p53 signaling state is answered not by the affected drug targets but rather by how those targets relate to growth and division, i.e. to cell size.
The association of p38 with changes in cell size is consistent with previous reports [23,27] that show that p38 MAPK is selectively activated in cells that are smaller than their target size. These studies, however, failed to identify why size sensing may be critical in continuously proliferating populations. The present work links size sensing in proliferating cells with an adaptation to homeostasis of growth and cell division. It is also interesting to note that, while both p38 and p53 are well-established stress proteins, their physiological responses to stress conditions are very distinct. Stress conditions that activate p38 typically promote inflammatory programs which promote growth and suppress apoptosis. [28] By contrast, the activation of p53 is both pro-apoptotic and functions to suppress mTORC1-mediated biosynthesis. [29]
Overall, the visual barcodes are an easy-to-implement system that can help researchers multiplex cells for a very wide variety of applications. The system is highly modular and can serve to generate Signalome cell lines with different reporters, and thus may be useful for research in a very wide variety of biological fields.

Figure 5. PCA suggests that the p38 and p53 signaling states exist pretreatment and increase in weight over time. A. PCA of the activity scores of 11 signaling pathways after 48 hours of treatment with 122 drugs. The color of each drug indicates its cluster in Figure 3A. B, C. Bar plots representing the PC1 loading of each pathway after 48 hours of drug treatment (B) or pre-treatment (C). D. Variance of pathway activity, when projected on the principal components calculated from measurements on cells that were not exposed to drug treatment.
An illustration depicting the correlation of p38 and ERK as represented in two different coordinate systems. In the first coordinate system (C), axes are defined by the measured activities of p38 and ERK. In the second coordinate system (D), axes are defined by performing PCA on cells that were not exposed to drug treatment. The yellow and blue shaded regions represent the area (or volume) that encloses the data in the alternate coordinate systems, which is calculated as the product of the variances. As shown in E, the volume enclosing the data is smaller when the coordinates are aligned with linear trends within the dataset. To calculate the extent to which the principal components calculated from measurements on cells that were not exposed to drug treatments represent drug-induced correlations, we compare the product of the variances in the alternate coordinate systems. F. To calculate a similarity score for each TCGA cancer, we calculated the pairwise Pearson correlation coefficients between measured activities in the tumor samples and the signalome. The similarity scores were then grouped together in a boxplot to show the distribution of similarities in every pathway for each cancer. The box plots represent the top and bottom quartiles of the distribution, with the whiskers showing the extent of the distribution excluding outliers, which are marked by diamonds. A pathway which scores by this measure has the same pattern of correlations and anti-correlations between the given cancer and the signalome. Similarity scores of indicate that all the pathways in the cancer have an identical grouping of correlations and anti-correlations as observed in the signalome. By contrast, scores of zero indicate there is no similarity in the pattern, while negative scores indicate some pathways exhibit inverse patterns in correlations to those observed in the signalome. Median similarity scores indicate the degree to which the pattern of signaling in that cancer was similar to those observed in the signalome. Those above marked by the red line indicate that most of the TCGA cancers shared a pattern of correlations with those observed in the signalome.

Growing media were supplemented with 10% fetal bovine serum (FBS) and 1% penicillin-streptomycin, pyruvate and glutamine (Invitrogen, #15140-122).

PCR for detection of Mycoplasma:
The protocol is based on the Takara Kit (#6601). The buffer, deoxynucleotide triphosphates (dNTPs) and Taq polymerase were obtained from the Takara Ex Taq kit (#RR001A). The following forward and reverse Mycoplasma-specific primers were used for PCR: 5'-ACACCATGGGAGCTGGTAAT-3', 5'-CTTCWATCGACTTYCAGACCCAAGGCAT-3'. Reactions were held at 94°C for 30 s to denature the DNA, with amplification proceeding for 40 cycles at 94°C for 30 s, 55°C for 2 min, and 72°C for 1 min.

Visual barcodes and Signalome plasmids construction:
All of the plasmids described in this paper to generate the nuclear marker, visual barcodes and live reporters have been submitted to Addgene and are available. We have also submitted the backbones used to create the visual barcodes and reporters to assist generation of additional visual barcodes or reporter cell lines. The complete list of the plasmids that we have submitted is the following:

This plasmid serves as the backbone for the TRE reporters described in our paper. It is a promoterless-mStrawberry-pGK-BSD backbone into which we insert a pathway-specific promoter before the mStrawberry (Addgene #158679).

WNT-TRE-mStrawberry reporter

JNK-KTR-mStrawberry reporter
This plasmid is a JNK pathway reporter. It has a CMV promoter driving the expression of JNK-KTR fused to mStrawberry-pGK-BSD (Addgene #158690).

Geminin-mStrawberry reporter
This plasmid is a Geminin cell cycle reporter. It has a CMV promoter driving the expression of Geminin fused to mStrawberry-pGK-BSD (Addgene #158691).

iRFP-H2A
This plasmid encodes an iRFP fused to the H2A histone nuclear marker. It has a CMV promoter driving the expression of iRFP fused to H2A. It has no selection markers.

To generate the plasmids for the visual barcode clones we used a CMV-Puromycin-F2A construct on the backbone of pLKO.1 containing a multiple cloning site following the F2A. First, the plasmid was linearized using NheI and MluI right after the F2A sequence. Next, we used Gibson assembly (New England Biolabs, Inc. #E2611) to add the fluorescent protein and localization peptide right after the F2A sequence. All plasmid sequences were verified by Sanger sequencing.
To generate the transcription response element (TRE) type of reporter plasmids as well as the AKT reporter plasmid, we first created a promoterless-mStrawberry plasmid (TRE backbone plasmid on the backbone of pLKO.1) with a SanDI recognition site before the mStrawberry. First, the plasmid was linearized using SanDI. Next, PCR products containing the promoter region of the plasmids from Sup Figure 2a were fused by Gibson assembly into the TRE backbone plasmid to generate the mStrawberry reporter plasmids.
To generate the translocation reporters, we used the KTR reporters created by Regot et al., 2014. 4 Using Gibson assembly, we introduced these reporters into our TRE backbone plasmid by adding a CMV promoter driving the expression of the KTRs fused to mStrawberry. The GEMININ reporter was constructed in an identical fashion to the KTR reporters.
Generating visual barcode reporter clones: We constructed our visual barcode signalome reporter clones in three steps: 1) for visual demarcation of the nuclear region we infected the cancer cell lines with lentiviruses containing an iRFP-H2A plasmid that we generated. We then generated a single cell-derived parent clone (see below); 2) we infected the parent clone with a lentivirus containing the visual barcode plasmids and selected for positive cells using puromycin; 3) the puromycin-positive cells were then infected with a lentivirus containing an mStrawberry Signalome reporter and positive cells were selected using blasticidin. Next, we derived single cell clones from the puromycin-blasticidin positive cells and tested the clones for their visual barcode and reporter activity using known activators/inhibitors of the signaling pathway (Sup Figure 2f).
For generating the lentiviruses, plasmids were transfected into the 293T cells-2nd generation
Following four weeks of growth, the mice were sacrificed and the tumors were extracted.
The extracted tumors were broken down into a single-cell suspension using the cold protease method described by Adam et al., 2017 (PMID: 28851704). In short: tumors were incubated at 6°C for 7 min in a dissociation buffer containing Bacillus licheniformis protease (10 mg/ml final concentration), PBS and DNase I (125 U/ml). Next, the tumors were transferred to gentleMACS C tubes (Miltenyi Biotec) and placed in the gentleMACS Dissociator (brain_03 program, Miltenyi Biotec). Following dissociation, the cells were sequentially filtered through 70 and 40 µm strainers, spun down at 500 × g for 5 minutes at 4°C and resuspended in 50 µL cold PBS.
ImageStream analysis: Cells were imaged by an Imaging Flow Cytometer (ImageStreamX Mark II, AMNIS corp., part of Luminex, TX, USA). Data were acquired using a 60X lens, and the lasers used were 405 nm (30 mW), 488 nm (30 mW), 561 nm (200 mW), 642 nm (150 mW) and 785 nm (5 mW). Data were analyzed using the manufacturer's software IDEAS 6.2 (AMNIS corp.). Images were compensated for spectral overlap using single-stained controls. Cells were first gated according to their area (in µm²) and aspect ratio (the minor axis divided by the major axis of the best-fit ellipse) of the iRFP staining.
were acquired from each well in a 384-well plate using a 10X objective in both digital phase-contrast (DPC) as well as in each of the six FP-specific wavelengths.
Image analysis and feature extraction workflow: We first used CellProfiler (Version 2.2.0) to detect nuclei (iRFP), segment cells (DPC), identify tertiary objects (cytoplasm, perinuclear, cell-specific background), and detect and quantify the fluorescent proteins of the visual barcodes (BFP, CFP, GFP, YFP) and reporters (mStrawberry) in all cellular compartments from all images. The resulting data, termed cytological profiles, consist of more than 200 features that describe the characteristics of each cell, such as its size, shape, and the intensity and texture of all FPs expressed. Results of this pipeline were exported both to spreadsheets and to an SQLite database.

Barcode deconvolution workflow: To determine the visual barcode identity of each cell, we used the supervised machine learning classifier in CellProfiler Analyst (Version 2.0). We first trained the software with images from our control wells, which were plated with only one subclone type per well (one visual barcode). We made sure that for each clone we trained on at least 200 cells, representing all timepoints in the experiment. We instructed CellProfiler Analyst to use 50 rules in order to differentiate between the clones. This training was repeated for each of our experiments, as we noticed that reusing the same set of rules across experiments reduces the overall accuracy of barcode calling. Note that a feature could be used more than once in the classifier. Finally, we applied the rules to all cells in the experiment in order to determine the barcode of each cell.

Data analysis: Data analysis and statistical tests were performed using R (R version 3.6.0) and RStudio (Version background. The result is an activity score for each reporter at a given treatment and time-point (the score was not calculated in cases where the group contained fewer than 30 cells). These activity scores were used to perform hierarchical clustering (distance = Euclidean, agglomeration method = Ward.D2) on the data using the pheatmap package (version 1.0.12). Plotting of the activity scores was done using the ggplot2 package (version 3.3.0). KS as well as other statistical tests were performed using the stats package (version 3.6.0).
Reporter activity score: To measure the effect of treatments we used a modified KS test, comparing the treated population's intensity distribution with that of the control (Supp Fig 2d). The test, which measures the largest difference between the two CDFs (cumulative distribution functions), allows us to identify almost every significant effect the treatments had on the reporters' activity. However, since the KS statistic is the absolute value of the maximum difference in CDFs, we assigned a sign to the statistic based on the location of both populations' median values. Thus, our score ranges from -1 for maximum inactivation, as shown for the ERK translocation reporter treated with trametinib (Supp Fig 2d), to 1 for maximum activation. In general we noticed a high correlation between the KS score and the effect of drugs on the distribution mean (data not shown).
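As an illustration of the signed KS score described above, the following is a minimal sketch and not the authors' R implementation. It assumes that the sign is positive when the treated median is at or above the control median, and that the 30-cell minimum applies to both populations; the function name `signed_ks_score` and the parameter `min_cells` are hypothetical.

```python
from bisect import bisect_right
from statistics import median


def signed_ks_score(treated, control, min_cells=30):
    """Signed two-sample KS statistic in [-1, 1], or None for small groups.

    Assumptions (not stated in the source): the minimum cell count is
    applied to both populations, and ties in the medians get a positive sign.
    """
    if len(treated) < min_cells or len(control) < min_cells:
        return None  # score is not calculated for groups that are too small
    t, c = sorted(treated), sorted(control)
    # Largest absolute gap between the two empirical CDFs, evaluated at
    # every observed intensity value.
    d_max = 0.0
    for x in set(t) | set(c):
        cdf_t = bisect_right(t, x) / len(t)
        cdf_c = bisect_right(c, x) / len(c)
        d_max = max(d_max, abs(cdf_t - cdf_c))
    # The sign follows the relative location of the two medians.
    sign = 1.0 if median(t) >= median(c) else -1.0
    return sign * d_max
```

With this convention, a treated population whose intensities all lie above the control scores 1, and one whose intensities all lie below the control scores -1.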
Measurements of cell growth: To measure the average cellular growth rate (protein accumulation rate) in each condition, we used a previously described method for quantification of total macromolecular protein mass in individual cells. 21-23 At intervals during drug treatment, samples were fixed and permeabilized, and cells were reacted with a succinimidyl ester that is covalently bound to a fluorescent dye (SE-A647). SE-A647 covalently binds to fixed proteins to produce a fluorescent signal that is proportional to cell mass, as shown in Sup Figure 5A and 5B. After fixation and staining with SE-A647 (protein) and DAPI (DNA), widefield fluorescence images were collected. The bulk protein content (total SE-A647 intensity of the sample) and number of cells were measured in each sample.
From these measurements, we calculated the average growth rate and cell cycle length of cells in each condition, by fitting all data points (from two replicates of each condition) to an exponential growth model. Prior to fixation, throughout the course of drug treatment, proliferation was independently monitored by periodic imaging of live cells via differential phase contrast imaging.

These measurements were used to estimate the average cell cycle length in each condition by fitting data to an exponential proliferation model.
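The exponential fits mentioned above can be sketched as follows. This is an illustrative example only: the source does not specify the fitting procedure, so the log-linear least-squares fit, the conversion of rate to doubling time, and the function names are all assumptions.

```python
import math


def fit_exponential_rate(times_h, values):
    """Fit values ~ initial * exp(rate * t) by least squares on log(values).

    Returns (rate per hour, initial value). The log-linear fit is an
    assumption; the source only states that an exponential model was used.
    """
    logs = [math.log(v) for v in values]
    n = len(times_h)
    t_mean = sum(times_h) / n
    y_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in times_h)
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in zip(times_h, logs))
    rate = sxy / sxx
    initial = math.exp(y_mean - rate * t_mean)
    return rate, initial


def doubling_time(rate):
    """Time for an exponentially growing quantity to double."""
    return math.log(2) / rate
```

Applied to cell counts over time, the doubling time gives an estimate of the average cell cycle length; applied to bulk protein content, the fitted rate gives an estimate of the average growth rate. Data points from both replicates can simply be pooled into the same input lists.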
Cells were imaged using a Perkin Elmer Operetta high content microscope, controlled by Harmony software, with an incubated chamber kept at 37°C and 5% CO2 during live-cell imaging. A Xenon lamp was used for fluorescence illumination, and a 740 nm LED light source was used for transmitted light. Differential phase contrast images were collected using a 10×, 0.4 NA objective lens. Widefield fluorescence images were collected with a 20×, 0.75 NA objective lens.
Metric to quantify PCA pre and post drug treatment: We developed a metric that quantifies how well measurements of cells that were not exposed to any drug treatment can predict the correlated signaling activities that we observed post drug treatment. As shown in Fig. 4, correlated activities in untreated cells seem absent when calculated by pairwise correlations, but are identified by PCA with significance (Fig. 5). The reason for this discrepancy is that pairwise correlations fall short of representing multivariate dependencies. By examining the dataset as separate pairs of pathways, piecemeal comparisons of pairwise correlations reduce the statistical power of any data analysis. On the other hand, performing independent PCA on treated versus untreated cells also falls short of answering our question. While PCA identified multivariate trends in measurements on both drug-treated and untreated cells, it is hard to say whether such independently identified dependencies are similar.
To circumvent this challenge, we quantified the persistence of the correlations with a different approach. In simple terms, we asked how well principal components calculated from measurements on untreated cells can explain trends that emerge post drug treatment. Intuitively, principal component analysis (PCA) is a method that calculates a new coordinate system that is optimally aligned with linear trends within the measured data (Sup Figure 5B). In the present study, any given drug is scored by measuring its influence on 12 branches of signaling (Sup Figure 5A).

In Sup Fig 5A, for example, drugs are represented as points on a 2D coordinate system, where the 'coordinates' represent the drug's influence on the measured pathways. In such a case, PCA constructs a new (orthogonal) coordinate system (Sup Figure 5B) that is optimally aligned with linear dependencies in the raw data. Further, the new coordinates calculated by PCA are hierarchically ordered such that the first coordinate (the first principal component) is aligned in the direction that captures the highest variance in the dataset, and so on.
To test whether measurements on untreated cells can predict the drug-induced multivariate correlations, we performed a PCA from measurements on cells that were not exposed to drug treatments. This resulted in a coordinate system that we call , that has twelve axes in the directions . Since results from performing PCA on untreated cells, these axes (coordinates) are trivially aligned with correlated signaling pre-drug treatment. What is not clear is whether would also align with the correlations that are promoted by drugs. To test this, we compared the product of the variances of the original measurements to the product of the variances after measurements are projected onto .

As illustrated by Supplementary Fig 5, the product of the variances describes a region that encloses the data points (in a given coordinate system). When two pathways (here we use the example of p38 and ERK) have activity that is correlated, the region enclosing the data will be smaller in a coordinate system that is aligned with those correlations. A useful aspect of is that it is naturally normalized to the range of (0,1). To begin, if measured activities are not correlated, the volume that encloses the data should not change no matter what coordinate system it is projected onto, and the value of will be near one in all coordinates. By contrast, if we imagine perfectly correlated data, would tend to zero (e.g. a line has no volume).
In our work, this was used to see whether the linear combinations of pathway activity seen in unperturbed cells reflect correlations in drug-treated cells better than the pathway activities themselves: low values of indicate that these combined signaling groups do not reflect the coordination of signaling changes in response to drugs, while higher values indicate that they do.
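The comparison of variance products can be illustrated in two dimensions, as in the p38 and ERK example. The sketch below is not the authors' implementation: it assumes the quantity of interest is the ratio of the product of variances after projection onto the principal axes of untreated cells to the product of variances in the original axes. The function names are hypothetical, and the principal axis is obtained from the closed-form 2D solution rather than a general eigendecomposition.

```python
import math
from statistics import pvariance


def principal_angle_2d(xs, ys):
    """Angle (radians) of the first principal axis of 2D data."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / len(xs)
    # Closed-form orientation of the major axis of a 2x2 covariance matrix.
    return 0.5 * math.atan2(2.0 * cov, pvariance(xs) - pvariance(ys))


def variance_product_ratio(ref_xs, ref_ys, xs, ys):
    """Variance product in reference PCA axes over that in original axes.

    ref_xs, ref_ys: two pathway activities in untreated cells (define axes).
    xs, ys: the same two pathway activities after drug treatment.
    The ratio form is an assumption made for this illustration.
    """
    theta = principal_angle_2d(ref_xs, ref_ys)
    c, s = math.cos(theta), math.sin(theta)
    # Rotate the treated data into the untreated-cell principal axes.
    pc1 = [c * x + s * y for x, y in zip(xs, ys)]
    pc2 = [-s * x + c * y for x, y in zip(xs, ys)]
    original = pvariance(xs) * pvariance(ys)
    projected = pvariance(pc1) * pvariance(pc2)
    return projected / original
```

In this sketch, treated data that lie on a line parallel to the first untreated principal axis give a ratio close to zero, while uncorrelated data with equal variances give a ratio of one regardless of the rotation.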
Qualifying drug conformity to the p38- or p53-signaling state: To score the extent to which a given compound promotes the p38-signaling vs p53-signaling states, we used Eq. 2.
Eq. 2 Here, the value is the numerical value of the drug's effect in the $i$th axis of the principal coordinate system . Intuitively, the extent to which a drug conforms with the p38- and p53-signaling clusters is represented by the extent to which its influence on the 12 pathways is aligned with the first principal component: the magnitude of which is . Eq. 2 is the ratio of a drug's influence on as compared to the size of the effect vector. This quantifies how closely aligned the drug's effect is with p38-p53 signaling.
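One possible reading of the ratio described above is the projection of a drug's effect vector onto the first principal component, divided by the length of the effect vector. The sketch below illustrates that reading only; the signed form, the handling of a zero vector, and the function name `pc1_conformity` are assumptions rather than the published Eq. 2.

```python
import math


def pc1_conformity(effect, pc1):
    """Projection of a drug's effect vector onto PC1, relative to its length.

    effect: activity scores of the drug across the measured pathways.
    pc1: loadings of the first principal component (from untreated cells).
    Returns a value in [-1, 1]; keeping the sign is an assumption, and the
    absolute value can be taken if only the degree of alignment is needed.
    """
    norm_effect = math.sqrt(sum(v * v for v in effect))
    norm_pc1 = math.sqrt(sum(v * v for v in pc1))
    if norm_effect == 0.0 or norm_pc1 == 0.0:
        return 0.0  # assumption: a drug with no effect has no alignment
    projection = sum(e * p for e, p in zip(effect, pc1)) / norm_pc1
    return projection / norm_effect
```

A value near 1 or -1 means that almost all of the drug's effect lies along the first principal component, whereas a value near 0 means the effect is mostly orthogonal to it.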