Single-cell integration through virtual labelling
Generative DL-enabled virtual labelling mediates the integration of molecular marker channels and cell populations within our ExIF framework (Figure 1). Because virtual labelling fidelity determines the quality of ExIF data integration, we begin this study by optimising virtual labelling performance, first by comparing alternative generative DL architectures, and then by testing different types and combinations of anchoring channels (DL model inputs).
Reflecting the prevailing ‘label-replacement’ paradigm of virtual labelling12,15, we first use label-free differential interference contrast (DIC) images as input (Figure 2A, left) to compare three alternative generative DL architectures: U-Net25, cGAN26 and ResViT27, the last of which incorporates self-attention and is used here for the first time in the context of virtual labelling. We compare virtual labelling performance across eight molecular marker channels (α-Tubulin, DNA (DAPI), CoxIV, fibrillarin, GM130, F-actin (phalloidin), β-Catenin, NF-κB-p65) concurrently labelled via the ‘4i’ experimental multiplexing protocol28 in 45,780 DU145 prostate cancer cells (Figure 2B). We stress that experimentally multiplexed IF datasets are not required for ExIF, which is explicitly designed to enable high-plexity dataset integration based on standard 4-plex IF data (as demonstrated across Figures 3-5). Rather, the experimentally multiplexed dataset was used here for developing and testing computational elements of the ExIF framework. All comparative results reported here and throughout this study are based on 5-fold cross-validation.
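The 5-fold cross-validation scheme can be illustrated with a minimal sketch. The text does not state the unit at which data were split, so the example assumes splitting over a hypothetical set of image fields; the function name `k_fold_indices`, the seed and the number of fields are illustrative assumptions only.

```python
import random

def k_fold_indices(n_items, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross-validation."""
    idx = list(range(n_items))
    random.Random(seed).shuffle(idx)           # reproducible shuffle
    folds = [idx[i::k] for i in range(k)]      # k disjoint, near-equal folds
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(j for f, fold in enumerate(folds) if f != i for j in fold)
        yield train, test

# Hypothetical example: 23 image fields (assumed number) split into 5 folds.
splits = list(k_fold_indices(23, k=5))
for fold_number, (train, test) in enumerate(splits, start=1):
    print(f"fold {fold_number}: {len(train)} training fields, {len(test)} held-out fields")
```

Each item is held out exactly once, so every reported metric can be computed on predictions for data that the corresponding model never saw during training.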
Virtual labelling fidelity with DIC-only inputs was first assessed using image-level metrics that compare predicted virtual labels with their real counterparts: the structural similarity index (SSIM29) and Pearson’s correlation coefficient30 at the pixel level (Pixel-PCC). U-Net achieved the highest overall SSIM scores, while the ResViT model achieved the highest Pixel-PCC scores (Figure 2C). To focus on virtual labelling fidelity at single-cell resolution, we next performed cell segmentation (Supplementary Figure 1A) and marker feature measurement using CellProfiler31 software, comparing 142 quantitative features per marker channel, per cell, spanning three feature categories: 15 ‘intensity’, 75 ‘radial distribution’ and 52 ‘texture’ features (complete feature list in Supplementary Table 1). For each quantitative feature, we calculated the Pearson correlation across cells (hereafter termed ‘Feature-PCC’) between values measured from ground-truth experimentally labelled and virtually labelled marker channels. Significantly higher Feature-PCC values were achieved across all feature categories using ResViT (Figure 2D). For per-cell mean marker intensities, which are particularly important given their relation to protein abundance, ResViT predictions achieved high correlations to ground-truth values for most markers (~0.6 - 0.9 PCC; Figure 2E), though GM130 and β-Catenin correlations were modest (<0.5 PCC). A full breakdown per marker, DL architecture and image metric/feature category is provided in Supplementary Figure 1B. Given that ResViT clearly outperformed the other architectures, particularly at the per-cell feature level, we hereafter present ResViT results only, together with Feature-PCC metrics of per-cell quantitative feature similarity between real and virtual labels, as these emphasise biologically interpretable analyses at single-cell scale.
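The two correlation metrics described above can be sketched as follows. This is an illustration on synthetic data, not the analysis code used in the study: the function names `pixel_pcc` and `feature_pcc` are hypothetical, and the feature matrices are assumed to be arranged as cells by features with rows matched between real and virtual measurements. SSIM is omitted here; an implementation is available in scikit-image as `skimage.metrics.structural_similarity`.

```python
import numpy as np

def pixel_pcc(real_img, virtual_img):
    """Pearson correlation between two images, computed over all pixels."""
    return float(np.corrcoef(real_img.ravel(), virtual_img.ravel())[0, 1])

def feature_pcc(real_feats, virtual_feats):
    """Per-feature Pearson correlation across cells.

    Both arrays have shape (n_cells, n_features) and row i must describe the
    same cell in both. Returns an array of shape (n_features,).
    A feature that is constant across cells yields NaN (zero variance).
    """
    r = real_feats - real_feats.mean(axis=0)
    v = virtual_feats - virtual_feats.mean(axis=0)
    numerator = (r * v).sum(axis=0)
    denominator = np.sqrt((r ** 2).sum(axis=0) * (v ** 2).sum(axis=0))
    return numerator / denominator

# Synthetic stand-in for 142 features measured on 500 cells (assumed sizes).
rng = np.random.default_rng(0)
real = rng.normal(size=(500, 142))
virtual = real + rng.normal(scale=0.5, size=real.shape)  # imperfect prediction
scores = feature_pcc(real, virtual)
print(f"median Feature-PCC: {np.median(scores):.2f}")
```

Computing one correlation per feature, rather than one over all pixels, is what allows the comparison to be broken down by feature category and by marker.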
We next targeted further improvements in virtual labelling fidelity by providing ResViT with multiple input image channels. We initially complemented the DIC anchoring channel with a single additional fluorescence marker channel (Figure 2A, centre), testing all alternatives, and found improvements in Feature-PCC scores in essentially all cases, although the degree of improvement varied for mean intensity (Figure 2F) as well as across each feature category (Supplementary Figure 2A). Unsurprisingly, the choice of additional input image channel made a marked difference to the degree of improvement observed, as exemplified when integrating β-Catenin or GM130 virtual labels across the cells (Figure 2G, Supplementary Figure 2B). Extending this multichannel input strategy to its limit, we built ResViT models incorporating DIC plus seven fluorescence marker channels as input to predict an eighth marker (Figure 2A, right). Feature-PCC values improved further across all target markers, though the gains beyond those obtained by adding the single best fluorescence marker to DIC were more subtle, both for mean intensity prediction (Figure 2F) and for other feature categories (Supplementary Figure 2A). Taken together, these results show significant increases in virtual labelling fidelity through use of the ResViT DL architecture, and through providing ResViT with multiple image inputs as anchoring channels.
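The three input regimes compared above (DIC only, DIC plus one fluorescence channel, DIC plus seven fluorescence channels) differ only in which channels are stacked into the model input. The sketch below illustrates that assembly step under assumed conventions: images are held as 2D arrays, the input is stacked channel-first, and `build_input` is a hypothetical helper rather than part of any ResViT implementation.

```python
import numpy as np

MARKERS = ["alpha-Tubulin", "DNA", "CoxIV", "Fibrillarin",
           "GM130", "F-actin", "beta-Catenin", "NF-kB-p65"]

def build_input(dic, fluor, target, extra=None):
    """Stack DIC with selected fluorescence channels into a (C, H, W) input.

    dic:    (H, W) label-free image
    fluor:  dict mapping marker name -> (H, W) image
    target: marker to be predicted; it is never placed in the input
    extra:  markers added to DIC; None means every marker except the target
    Returns (model_input, target_image).
    """
    if extra is None:
        extra = [m for m in fluor if m != target]
    if target in extra:
        raise ValueError("target marker must not be used as an input")
    channels = [dic] + [fluor[m] for m in extra]
    return np.stack(channels, axis=0), fluor[target]

# Synthetic 64 x 64 images standing in for one field of view (assumed size).
rng = np.random.default_rng(0)
dic = rng.random((64, 64))
fluor = {m: rng.random((64, 64)) for m in MARKERS}

x_dic_only, y = build_input(dic, fluor, "GM130", extra=[])
x_dic_plus_one, _ = build_input(dic, fluor, "GM130", extra=["beta-Catenin"])
x_dic_plus_seven, _ = build_input(dic, fluor, "GM130")
print(x_dic_only.shape, x_dic_plus_one.shape, x_dic_plus_seven.shape)
```

Excluding the target explicitly guards against accidentally leaking the ground-truth channel into the input when iterating over all marker combinations.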
Applying extensible immunofluorescence to investigate epithelial-mesenchymal cell state plasticity
We next demonstrate ExIF in the context of epithelial-mesenchymal (EM) plasticity, a phenomenon facilitating cancer progression by enabling tumour cells to switch between epithelial and mesenchymal phenotypes32,33. Epithelial-mesenchymal transition (EMT) is a complex multimolecular process involving repression of epithelial markers and upregulation of mesenchymal markers, which enables enhanced cell migration, invasion and thus metastatic progression34. To assess the utility of ExIF for interrogating EM cell state plasticity, we amplified EM state diversity in A549 lung cancer cells by comparing control cells to those treated for 48 h with either EGF (0.1 μg/mL) or TGF-β1 (0.01 μg/mL), growth factors known to weakly or strongly (respectively) drive EMT in A549 cells35. To read out EM state per cell, we labelled and imaged eight different 4-plex IF panels across each treatment condition, following the extensible labelling schema (Figure 1). Specifically, each panel consists of three recurring ‘general’ markers (DNA, F-Actin and β-Catenin) and a single variable EM state marker (E-Cadherin, EpCAM, N-Cadherin, PTEN, Vimentin, CD44total, CD44std or CD44v9)36–41 (Figure 3A).
We then performed data integration via virtual labelling, using the recurring general markers and DIC as inputs to ResViT virtual labelling models to predict labelling for variable (non-recurring) EM state markers. We highlight the role played by the recurring markers in the ExIF context, providing consistent virtual labelling inputs and acting as common anchors for accurate integration of variable markers. This enables production of a unified ExIF dataset, with each cell characterised by all 11 fluorescence markers (3 general markers and 8 EM state markers). CellProfiler-based cell segmentation and feature measurement followed, yielding morphological features (cell size, shape, neighbour contact, etc.) plus 142 features per marker per cell (as in Supplementary Table 1), with data spanning 12,103 control cells, 14,350 EGF-treated cells and 7,119 TGF-β1-treated cells.
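The extensible labelling schema and the resulting integration task can be summarised as a small data structure. The sketch below is purely illustrative: the marker names follow the text, while `integration_plan` and the dictionary layout are hypothetical conveniences, not the study's implementation.

```python
GENERAL = ("DNA", "F-Actin", "beta-Catenin")            # recurring anchors
EM_MARKERS = ("E-Cadherin", "EpCAM", "N-Cadherin", "PTEN",
              "Vimentin", "CD44total", "CD44std", "CD44v9")

# Each 4-plex panel = three general markers + one variable EM state marker.
panels = {em: GENERAL + (em,) for em in EM_MARKERS}

def integration_plan(panels, general, use_dic=True):
    """For each panel, list model inputs, the measured EM marker, and the
    EM markers that must be supplied by virtual labelling."""
    all_variable = [m for p in panels.values() for m in p if m not in general]
    inputs = (("DIC",) if use_dic else ()) + tuple(general)
    plan = {}
    for name, markers in panels.items():
        real = [m for m in markers if m not in general]
        virtual = [m for m in all_variable if m not in real]
        plan[name] = {"inputs": inputs, "real": real, "virtual": virtual}
    return plan

plan = integration_plan(panels, GENERAL)
for name, entry in plan.items():
    total = len(GENERAL) + len(entry["real"]) + len(entry["virtual"])
    print(f"{name}: 1 measured EM marker, {len(entry['virtual'])} virtual, "
          f"{total} fluorescence markers per cell")
```

Because the model inputs are identical for every panel, a model trained on the cells of one panel can be applied to the cells of all others, which is what makes the schema extensible to further panels.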
Although ExIF is designed to enable utilisation of multiple input channels to improve integration performance, we note that some research applications are suited to a label-free only approach (e.g. long-term live cell imaging, sensitive diagnostic sample imaging, etc.). Thus, in addition to assessing integration performance using ‘full ExIF’ (using general fluorescence markers and DIC as input to predict each of the 8 EM-state markers), we also assessed using DIC as the only input, termed ‘label-free ExIF’ (Figure 3B). Consistent with the DU145 results above, we found significantly higher Feature-PCC values between real and virtual labels when using full ExIF (Figure 3C; Feature-PCC values per feature category, per marker shown in Supplementary Figure 3A). The multichannel inputs achieved high correlations (~0.7 – 0.9 PCC) for per-cell mean intensity of virtual labels relative to real labels, though vimentin showed weaker performance (< 0.6 PCC) (Figure 3D).
Using EM state marker mean intensity distributions to examine differences induced by growth factor treatments (Figure 3E), we found that control and EGF conditions had relatively similar distributions, whereas TGF-β1 induced more dramatic changes. Mesenchymal markers N-Cadherin and Vimentin were especially strongly expressed after TGF-β1 treatment, while epithelial markers CD44v9, E-Cadherin, and EpCAM were strongly suppressed. These effects are consistent with established EMT biology36–41. Notably, EM marker distributions resulting from virtual labelling-mediated dataset integration closely reflected treatment-induced effects seen with real labels (i.e. those derived from the subset of cells experimentally labelled with each EM marker). Thus, ExIF achieves strong performance across various EM state markers, retaining biologically relevant information that captures heterogeneous yet distinctive responses to different EM-state perturbations.
Classifying cell treatment condition using ExIF integrated data
We next assessed whether ExIF meaningfully enhances downstream single-cell quantitative data analyses commonly used to interrogate complex cell biology, including in the contexts of omics and multi-omics analyses. We first tested accuracies for machine learning classification of treatment conditions (control vs EGF vs TGF-β1). Using standardised and scaled single-cell (CellProfiler-derived) features from standard and ExIF integrated datasets, we performed principal component analysis (PCA) before using the derived principal components as input for support-vector machine (SVM)-based classification of cell treatment conditions (Figure 4A). This was repeated for each cell population labelled with a different EM state marker. Classification results were evaluated via F1-scores, comparing different datasets in aggregate (Figure 4B).
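The classification pipeline described above (scaling, PCA, SVM, F1 evaluation) can be sketched with scikit-learn. The text does not specify the number of principal components, the SVM kernel or the F1 averaging scheme, so the values used here (10 components, RBF kernel, macro-averaged F1) are assumptions, as are the function name `classify_treatment` and the synthetic data.

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.metrics import f1_score
from sklearn.model_selection import StratifiedKFold, cross_val_predict
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def classify_treatment(features, labels, n_components=10, seed=0):
    """Scale features, reduce with PCA, classify with an SVM, and return the
    macro-averaged F1 score from 5-fold cross-validated predictions."""
    model = make_pipeline(
        StandardScaler(),
        PCA(n_components=n_components),   # assumed number of components
        SVC(kernel="rbf"),                # assumed kernel
    )
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    predicted = cross_val_predict(model, features, labels, cv=cv)
    return f1_score(labels, predicted, average="macro")

# Synthetic single-cell feature table: 3 conditions x 100 cells x 30 features.
rng = np.random.default_rng(0)
n_per_class, n_features = 100, 30
labels = np.repeat(["control", "EGF", "TGF-b1"], n_per_class)
features = rng.normal(size=(3 * n_per_class, n_features))
for c in range(3):
    # Shift the first five features so the conditions are separable.
    features[c * n_per_class:(c + 1) * n_per_class, :5] += 3.0 * c

macro_f1 = classify_treatment(features, labels)
print(f"macro F1: {macro_f1:.2f}")
```

Placing the scaler and PCA inside the pipeline means both are fitted on the training folds only, so no information from held-out cells influences the reported score. Running the same function on feature tables with and without virtually labelled markers gives a like-for-like comparison of datasets.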
We found that full ExIF achieved significantly better classification of treatment conditions (F1 ~ 0.85) compared to standard 4-plex IF (F1 ~ 0.8). We also noted that ExIF anchored using only the three general fluorescence channels (no DIC) retained classification performance (F1 ~ 0.84) that was far better than when using only features directly measured from the three general fluorescence anchoring channels (F1 < 0.75). This implies that ExIF integration of EM-state markers (via virtual labelling) adds significant discriminatory value (regarding treatment class) beyond that directly accessed from the general markers that form the basis of integration. Label-free ExIF, which virtually labels all 11 fluorescence markers from this experiment, also significantly improved classification beyond that achieved using the three (real) general markers. Indeed, label-free ExIF achieves essentially the same EM-state discriminatory power with label-free images only as can be achieved using features from a standard 4-plex IF experiment that includes at least one explicit EM-state marker. By comparison, we note that virtual labelling of only the original 4-plex panels using DIC-only inputs, an approach that mimics the prevailing label-replacement paradigm for virtual labelling, failed to match even the classification results achieved using features measured from the three general markers.
Interestingly, classification using only features from the three general markers most often misclassified control and TGF-β1 cells as EGF-treated (Figure 4C), while full ExIF integration reduced misclassifications across all classes, especially improving differentiation of TGF-β1 and EGF conditions. To explain these performance differences, we show examples of how the addition of virtual markers corrects two control cells misclassified as TGF-β1-treated (Figure 4D). Comparing mean intensity distributions, for instance, we see no clear separation between control (blue) and TGF-β1 conditions (red) when considering only the real common anchoring channels, although TGF-β1-treated cells have somewhat lower mean DNA intensities (Figure 4E). Given these minimal general marker differences between control and TGF-β1 conditions, the two misclassified control cells (mean intensities marked by green or purple vertical lines) could belong to either treatment population. Indeed, the real general marker labelling of these control cells looks similar to that of exemplar TGF-β1-treated cells. In contrast, EM-state marker integration, especially of CD44v9, E-Cadherin, N-Cadherin, EpCAM, and Vimentin, defines population distributions that more effectively discern control and TGF-β1 conditions. Accordingly, the two example cells can be identified as strongly exhibiting control phenotypes when considering the combination of all virtual markers, exemplifying how ExIF integration enhances phenotype discrimination and classification capacity.
Mapping EM-state heterogeneity and marker dynamics using ExIF data integration
Moving beyond classification of the discrete treatment conditions, we next examined whether ExIF integration could enhance the quantitative mapping of cell phenotype heterogeneity in the EM spectrum. To this end, we applied several unsupervised manifold learning methods to our quantitative single-cell feature datasets. Importantly, we excluded cell morphology features (relating to cell size, cell shape and cell neighbour proximity) from construction of these and all subsequent manifolds: because cell morphology is linked to EM state, including morphological features would undermine our capacity to assess whether ExIF-integrated marker data alone can yield ‘EM-state-sensitive’ manifolds. We present manifolds constructed with PHATE, a manifold-embedding technique that seeks to capture underlying dynamic trajectories in multivariate data42 (Figure 5), as well as manifolds constructed with t-SNE43 and UMAP44 (Supplementary Figure 3B).
For the ‘Standard’ manifolds, we used features measured directly from the three general markers only (Figure 5A), demonstrating the limited capacity of standard-plexity IF data to map heterogeneity in EM-state. We next constructed manifolds via label-free ExIF integration of all 11 markers using DIC as the only DL model input (Figure 5B). We found that the label-free ExIF integrated dataset maps EM-state heterogeneity more effectively than standard IF data, with improved treatment condition-separation and manifold extension. Lastly, we generated full ExIF manifolds (using general markers plus DIC as DL model inputs), integrating all eight EM-state markers (Figure 5C). Features measured from this optimally integrated dataset greatly improved delineation of treatment conditions in the PHATE manifold, emphasising differences between the ‘mesenchymal’ states induced by EGF versus TGF-β145–47, as well as defining a more structured and extended manifold profile than the other manifold versions. t-SNE and UMAP manifolds confirm similar tendencies.
Modelling of pseudotime48 within the PHATE-embedded manifolds further supports improved cogency of the full ExIF manifold, which defines a trajectory capturing progression from cell states enriched for the control condition to cell states enriched for EGF and then TGF-β1 treatments. Representative cell images from milestones (1-6) along the full ExIF manifold pseudotime trajectory indicate cell morphological trends aligned with EMT (Figure 5D). Indeed, colour-coding the full ExIF manifold by real (CellProfiler-measured) morphological and cell-cell contact values (not used in manifold construction) shows consistent quantitative trends aligned with expected EMT dynamics, i.e. increasing cell area, increasing cell protrusivity and reduced cell-cell contact (Figure 5E). Pseudotime analyses further delineated individual EM marker-level (mean cell intensity) variations along each inferred trajectory. These variations are unstructured and noisy in the Standard manifold, compared to clear trends and signals spanning the full ExIF manifold (Figure 5F), again supporting improved manifold characteristics achieved via optimal ExIF dataset integration. Notably, the displayed EM marker dynamics derived from the Standard manifold (Figure 5F, left) reflect real marker values from the cell subpopulations experimentally labelled for each variable marker. EM marker dynamics from the full ExIF manifold (Figure 5F, right) depict either real marker dynamics (weak line colours; from the subset of cells experimentally labelled for each marker) or virtual marker dynamics (strong line colours; from all cells). Comparisons of real versus virtual marker trends reveal strong correspondence for each marker, indicating that dynamical trends inferred using the virtual data used for dataset integration correspond to ground-truth marker dynamics, though the virtual data trends are substantially less noisy.
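The comparison of real and virtual marker trends along pseudotime can be illustrated by binning cells along the trajectory and averaging mean intensity per bin. The sketch below uses synthetic data and makes several assumptions not stated in the text: pseudotime is normalised to [0, 1], trends are summarised with 20 equal-width bins, and `marker_trend` is a hypothetical helper. The noise levels and the one-in-eight labelled fraction are chosen only to mimic one real panel among eight.

```python
import numpy as np

def marker_trend(pseudotime, intensity, n_bins=20, t_range=(0.0, 1.0)):
    """Mean marker intensity within equal-width pseudotime bins.

    Returns (bin_centres, bin_means); empty bins give NaN.
    """
    edges = np.linspace(t_range[0], t_range[1], n_bins + 1)
    # digitize returns 1..n_bins inside the range; shift to 0-based and clip
    # so that values on or beyond the edges fall into the outermost bins.
    bins = np.clip(np.digitize(pseudotime, edges) - 1, 0, n_bins - 1)
    sums = np.bincount(bins, weights=intensity, minlength=n_bins)
    counts = np.bincount(bins, minlength=n_bins)
    with np.errstate(invalid="ignore", divide="ignore"):
        means = sums / counts
    centres = 0.5 * (edges[:-1] + edges[1:])
    return centres, means

# Synthetic marker that declines along pseudotime (illustrative only).
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 1.0, size=4000)
true_signal = 1.0 - t
virtual = true_signal + rng.normal(scale=0.1, size=t.size)   # all cells
labelled = rng.random(t.size) < 1 / 8                        # one panel's cells
real = true_signal[labelled] + rng.normal(scale=0.3, size=labelled.sum())

centres, virtual_trend = marker_trend(t, virtual)
_, real_trend = marker_trend(t[labelled], real)
ok = np.isfinite(real_trend)
agreement = np.corrcoef(real_trend[ok], virtual_trend[ok])[0, 1]
print(f"correlation between real and virtual trends: {agreement:.2f}")
```

In this toy setting the virtual trend is smoother partly because it averages over all cells rather than the labelled subset alone, which is one plausible contributor to the lower noise noted above.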
Focusing on the specific expression-level dynamics inferred for each EM marker in the optimally performing full ExIF manifold, we see trends aligned with established molecular changes during transition from epithelial to mesenchymal state. E-cadherin expression declines while N-cadherin levels ultimately rise49. The epithelially associated CD44v9 isoform is suppressed, counterbalanced by increases in total CD44 and in the mesenchymally associated CD44std isoform50. Tumour-suppressor PTEN expression declines while vimentin levels peak late in the trajectory39,40. Like E-cadherin, EpCAM also declines substantially51. By contrast, these biologically informative trends are difficult or impossible to discern from pseudotime analysis of the Standard manifold based on data from the common fluorescence anchoring channels alone.
Taken together, we find that ExIF integration – even when using only label-free (DIC) inputs – can achieve analytical outcomes that compare favourably to interrogation of standard IF data alone. Moreover, our full ExIF approach, which achieves high-fidelity integration to create high-plexity fluorescence imaging data, enables powerful multimolecular analyses of discrete cell states, continuous phenotypic heterogeneity and (pseudotemporal) marker dynamics. Achievable with just a single experiment comprising several (or many more) standard 4-plex IF labelling conditions, these results are comparable to those otherwise accessible only by using complex and costly experimental multiplexing methodologies.