SEAM is a spatial single nuclear metabolomics method for dissecting tissue microenvironment

Spatial metabolomics can reveal intercellular heterogeneity and tissue organization. Here we report on the spatial single nuclear metabolomics (SEAM) method, a flexible platform combining high-spatial-resolution imaging mass spectrometry and a set of computational algorithms that can display multiscale and multicolor tissue tomography together with identification and clustering of single nuclei by their in situ metabolic fingerprints. We first applied SEAM to a range of wild-type mouse tissues, then delineated a consistent pattern of metabolic zonation in mouse liver. We further studied the spatial metabolic profile in the human fibrotic liver. We discovered subpopulations of hepatocytes with special metabolic features associated with their proximity to the fibrotic niche, and validated this finding by spatial transcriptomics with Geo-seq. These demonstrations highlighted SEAM’s ability to explore the spatial metabolic profile and tissue histology at the single-cell level, leading to a deeper understanding of tissue metabolic organization. SEAM is a platform for the analysis of high-resolution secondary ion mass spectrometry imaging that allows spatially resolved nuclear metabolomic profiling at the single-cell level.

T he hierarchical organization of multicellular organisms is stably maintained by homeostasis at different levels. At the tissue level, such homeostasis is often modulated by the combination of intracellular gene expression network and extracellular (microenvironmental) signals [1][2][3][4] . A cell and its extracellular environment interact dynamically through various signaling mediators, including metabolites, secretome and ligand-receptor interactions. Metabolites from the extracellular environment can substantially influence cell behavior or even transform cell identity. For instance, extensive alcohol intake not only activates the detoxification activity of hepatocytes but also alters the epigenetic landscape of hepatocytes 5 . Conversely, the release of metabolites by cells can also have effects on the microenvironment. One classic example is basophils and mast cells releasing histamine to increase the permeability of capillaries when encountering infection 6 . To facilitate a deeper and more systematic understanding of the multiscale (that is, molecular level, cellular level and tissue level) nature of biological processes (for example, organ development or tumor microenvironment), various single-cell omics technologies have been rapidly developed and used 7 . Currently, advanced imaging mass spectrometry (IMS)-based techniques are making it possible to profile a large number of metabolites spatially and/or temporally, providing new dimensional insights into the hierarchical processes 8,9 . For spatially resolved metabolomics studies, different techniques have been developed including matrix-assisted laser desorption/ ionization-mass spectrometry (MALDI-MS) 10 , desorption electrospray ionization-mass spectrometry (DESI-MS) 11 , laser ablationinductively coupled-plasma-mass spectrometry 12 and secondary ion mass spectrometry (SIMS) 13 . Atomspheric pressure MALDI has demonstrated an application on P. caudatum and imaged endogenic biomolecules with 1.4-μm lateral resolution 14 . MALDI-2 was introduced by adapting a t-MALDI-2 ion source to an Orbitrap mass analyzer and a pixel size of 600 nm was achieved on brain tissue 15 . DESI-MS has been used to visualize tissue-level metabolomic alterations in 256 patients with esophageal cancer 11 . Using a nanoDESI platform, a 10-μm spatial resolution was achieved on mouse pancreatic islets 16 . More recently, three-dimensional (3D) OrbiSIMS, a label-free IMS with a subcellular lateral resolution and high mass-resolving power, has been developed 17 . These techniques will be used increasingly in future spatial metabolomics applications.
Although the above techniques have achieved unprecedented subcellular resolution, several analytical complications still exist, such as single-cell segmentation and cell fingerprint extraction. Previous studies typically segmented cells using hematoxylin-eosin (H&E) staining, which suffered from either inaccurate segmentation due to imperfect registration of adjacent slides, or labeling on the same slides, which could introduce exogenous substances and SEAM is a spatial single nuclear metabolomics method for dissecting tissue microenvironment Zhiyuan Yuan 1,10 , Qiming Zhou 2,10 , Lesi Cai 3,10 , Lin Pan 4 , Weiliang Sun 4 , Shiwei Qumu 5 , Si Yu 6 , Jiaxin Feng 3 , Hansen Zhao 3 , Yongchang Zheng 6 , Minglei Shi 7 , Shao Li 1 , Yang Chen 1,9,11 ✉ , Xinrong Zhang 3,11 ✉ and Michael Q. Zhang 1,7,8,11 ✉ Spatial metabolomics can reveal intercellular heterogeneity and tissue organization. Here we report on the spatial single nuclear metabolomics (SEAM) method, a flexible platform combining high-spatial-resolution imaging mass spectrometry and a set of computational algorithms that can display multiscale and multicolor tissue tomography together with identification and clustering of single nuclei by their in situ metabolic fingerprints. We first applied SEAM to a range of wild-type mouse tissues, then delineated a consistent pattern of metabolic zonation in mouse liver. We further studied the spatial metabolic profile in the human fibrotic liver. We discovered subpopulations of hepatocytes with special metabolic features associated with their proximity to the fibrotic niche, and validated this finding by spatial transcriptomics with Geo-seq. These demonstrations highlighted SEAM's ability to explore the spatial metabolic profile and tissue histology at the single-cell level, leading to a deeper understanding of tissue metabolic organization. thereby damage sample integrity 18 . Another cell segmentation strategy exploited a convolution neural network trained on pixel-wise annotated cells, demanding huge human expert labor 19 . As for cell fingerprint extraction, the common practice that took the average of pixel profiles within each cell caused the impairment of distributive information 20,21 ; specifically, taking the average for a population of pixels will only maintain the first-order moment of this distribution, which could lose information such as covariance and shape. These deficiencies (including the aforementioned problems in segmentation and representation) could cause inaccurate preservation of single-cell metabolic profile and intercellular spatial organization. Consequently, although instruments for IMS have improved, the downstream analytical methods still require further development for users to be able to fully exploit spatial metabolomic features.
To overcome those deficiencies, we propose spatial single nuclear metabolomics (SEAM), a platform that leverages the spatial metabolic profile provided by SIMS and a comprehensive series of computational algorithms for delineating the in situ single-cell-level metabolic profile and tissue microenvironment. This study shows the segmentation and analysis of single nuclear metabolic profiles directly from tissue sections. SEAM is label free and requires only minimal experimental preparation, which avoids the introduction of exogenous substances and preserves samples' native states. As a proof of principle, we comprehensively calibrated SEAM using popular cell cultures, and then systematically scaled up to various mouse tissues, including wild-type mouse liver, stomach, pancreas, kidney, small intestine and lung. Finally, we identified different hepatocyte metabolic subpopulations and their spatial network organization within the tissue microenvironment in the human fibrotic liver.

results
Overview. SEAM is an integrated platform for qualitative and quantitative analysis of tissue metabolic cell typing and the in situ microenvironment. The pipeline is composed of two main parts: IMS assay and computational analysis suite (Fig. 1a). The full workflow of SEAM is shown in Supplementary Fig. 1.
As an IMS technology, time-of-flight-secondary ion mass spectrometry (TOF-SIMS) provides both mass spectra (chemical information) and ion images (spatial information) for biomolecules on tissue sections (Fig. 1a, top left). Typically, hundreds of peaks in a mass spectrum can be extracted from a 400 × 400 μm 2 scan area on a tissue section. Every experiment outputs multiplexed SIMS data with 256 × 256 pixels (roughly 1.5 μm per pixel) in spatial resolution, and each pixel is associated with a vector of over 200 selected m/z peaks ( Fig. 1a and Methods). To facilitate quick examination of the metabolic spatial pattern across the full spectrum, rather than manually reading hundreds of m/z images one by one, SEAM provides SIMS-View to compress the multiplexed SIMS images from hundreds of channels into three, while preserving local and global structures in the feature space (Fig. 1a, bottom left and middle). Then the three-channel images are mapped to CIELAB color spaces 22 and can be rapidly surveyed by human vision (Supplementary Fig. 1).
To compensate for the potential information loss due to dimensionality reduction, SEAM takes advantage of compositional characteristics and spatial continuity to build a spatial single-nucleus map and delineate the organization of metabolically distinct in situ cell subpopulations (Fig. 1a, bottom right). More specifically, SEAM provides three additional data analysis modules (Methods): single-nucleus segmentation (SIMS-Cut, Fig. 2a), single-nucleus representation (SIMS-ID, Fig. 2b) and differential metabolite analysis (SIMS-Diff).
SEAM resolves spatial single nuclear metabolic profiles from tissues. As a first check, we tested SEAM using wild-type mouse liver (Fig. 1a, bottom row), stomach, pancreas, kidney, lung and small intestine samples ( Fig. 1b and Supplementary Figs. 2-5). Qualitative visualization of SIMS-View illustrates the corresponding tissue structures: for example, in the liver, the metabolites show gradual changes spreading out from the central vein (CV) 23 ; in the stomach, mucosa and submucosa are shown in separate colors; in the pancreas, the pancreatic islet is highly distinguishable in the central area; in the lung and kidney, the specific structure of the local metabolic niches, such as bronchioles and glomerulus 24 ; and in the small intestine, the characteristic anatomical pattern along the intestinal villus axis can be seen 25 (Fig. 1b and Supplementary Figs. [2][3][4][5].
In addition to the visualization provided by SIMS-View, one can selectively generate histological or functional information by using the different SEAM modules. Alongside the SIMS-View, clustering results using the single nuclear representation module SIMS-ID can mark strong correspondence to well-established cell types, for example, hepatocytes, Kupffer cells and endothelial cells in the liver, Clara cells in the lung, mucosa, gland and submucosa in the stomach, as well as enterocytes and lamina propria in the small intestine ( Fig. 1 and Supplementary Figs. [2][3][4][5]. Algorithm design and modular data analysis. SIMS-View is a fast visualization tool designed for SIMS data, which takes advantage of the efficiency and the local and global structure preservation of uniform manifold approximation and projection (UMAP) 26 . It takes multiplex SIMS data as input and outputs a single human-readable image using three steps. First, SIMS data are regarded as 256 × 256 independent pixels, each represented by a fixed-length vector, and each pixel is feature-wise normalized to avoid feature bias. Next, the 65,536 pixels are fed into UMAP to reduce the dimensionality to three. Finally, each of the three resulting dimensions is scaled and color-coded by CIELAB color space, and all pixels are mapped back to their original positions ( Supplementary Fig. 1). Combining the advantages of both UMAP and CIELAB color space, and unlike related works [27][28][29] (Supplementary Notes), SIMS-View provides a global view of compressed ion distribution features in one single image at the pixel level.
To solve the cell segmentation problem, previous in situ studies used a variety of approaches. Some used matched H&E stain 18 , others took one simple measurement as input 19,30 , and most of them used supervised segmentation, via either pixel-wise classification or modeling the whole image using convolution neural networks. Based on the visualization of SIMS-View results on different samples, the nuclei of cells showed similar colors for most cells, yet were different from other nonnuclear areas ( Supplementary Fig. 2). Therefore, we decided to isolate the nucleus to demarcate every single cell. To avoid extra staining and heavy annotation labor, which would sabotage the original metabolic state of samples, we developed SIMS-Cut, an unsupervised label-free algorithm, to segment regions of interest (ROI) using corresponding metabolic markers, for example, adenine (m/z 134.04) as the nuclear marker 17 . The input data format is multiplexed by selecting ion species that are highly colocalized with nuclei, which is highly consistent across different samples ( Supplementary Fig. 6a). At the core of SIMS-Cut is an expectation-maximization algorithm, aiming to solve an optimization problem of a probabilistic graphical model (PGM) 31 , which combines a restricted Boltzmann machine (RBM) [32][33][34][35] , and a Potts model 36,37 (Fig. 2a and Supplementary Fig. 6d). The RBM ( Supplementary Fig. 6c) is suitable for modeling the appearance of a multi-image pixel given its label (foreground/background), and the Potts model ( Supplementary Fig. 6b) encourages the resulting segmentation masks to be smooth. Detailed performance descriptions can be found in Supplementary Notes and Supplementary Figs. 7-12.
After segmentation, the metabolic fingerprint of each segmented nucleus needs to be extracted and represented. Given the fact that SIMS captures the cumulative intensities along the z axis for each pixel, the metabolic fingerprint of each cell (both nucleus and cytoplasm) can be extracted by combining its segmentation mask and corresponding SIMS data. Existing works often represent cells by computing the average of all the pixels contained within each cell 20,21 , which requires strong assumptions, such as Gaussian or unimodal, and suffers from loss of pixel variation ( Supplementary  Fig. 15c). To obtain better results, SIMS-ID represents cells using the bipartite graph of pixels and cells constructed by a self-supervised learning algorithm [38][39][40] , which can soften the hard labeling produced by SIMS-Cut ( Fig. 2b and Supplementary Fig. 15a,b). The resulting representation showed superior discriminative power, noise robustness and pixel distribution preservation. A detailed performance description can be found in Supplementary Notes and Supplementary Figs. 16-27. The resulting representation of SIMS-ID lies in high-dimensional feature space. SIMLR 41 is a popular single-cell clustering algorithm, which automatically learns cell-to-cell affinity with multiple kernel ensemble learning, and shows satisfactory performance when combined with SIMS-ID ( Supplementary Fig. 22). We simply adopted SIMLR as our clustering method.
To characterize key metabolite-differentiating clusters, and account for the variation of pixels within cells, we developed SIMS-Diff as our differential analysis algorithm. SIMS-Diff regards cells as distributions of pixels and uses the earth mover's distance (Methods) 42 for the dissimilarities among cells. Using this, the discriminative power of one feature with respect to a given cluster partition can be measured as the ratio of between-cluster variation (BCV) and within-cluster variation.
Our algorithm modules of SEAM can be freely extended to other platforms, for example multiplexed ion beam imaging by TOF (MIBI-TOF) 19 , single-cell metabolic regulome profiling 43 and sequential fluorescence in situ hybridization  Table 5).
SEAM reveals cell metabolic states in mouse liver. The liver is an important metabolic organ consisting of repeating hexagonalshaped units called lobules 45 . The spatial heterogeneity of its metabolic mechanisms has been thoroughly investigated using immunohistochemistry (IHC) analyses 46 , transcriptomics 23 and epigenomics 47 , but, to our knowledge, direct spatial metabolic profiling at the single-cell level has not been reported. This allows us to fill the gap by a proof-of-concept demonstration of SEAM.
To this end, wild-type mice were used to obtain sequential liver sections, and CV-centered regions were selected for SEAM analysis. The SIMS data consist of approximately 200-300 ion species after spectral peak selection and filtering (Methods), and SIMS-Cut detected 724 nuclei in the field of view. To extract metabolic cell fingerprints, we used SIMS-ID to represent each cell with a fixed-length vector, which was fed into SIMLR for clustering. Then SIMLR resulted in metabolically distinct subpopulations corresponding to major liver cell types, including Kupffer cells, endothelial cells and hepatocytes (Fig. 2c).
The identified subpopulations showed specific spatial patterns consistent with the known liver organization ( Fig. 2c and Supplementary Fig. 29a) and were characterized and annotated by experienced pathologists. Kupffer cells are specialized macrophages in the liver, which typically line the walls of the sinusoids. Endothelial cells typically lie between the crevices of hepatocytes and receive blood from both the hepatic artery and the portal veins into the hepatic parenchyma 48 . Hepatocytes (the parenchymal cells) constitute 80% of the mass and 60% of cell composition in a healthy mammalian liver, performing various metabolic functions strongly associated with their positions 45 . The identified cell clusters are consistent between replicates ( Supplementary Fig. 29b). To further confirm the major cell types identified by pathologists, we performed immunofluorescent staining assays for immunophenotypic markers of CV-enriched hepatocytes (CYP2E1) 23 , sinusoidal endothelial cells (LYVE1) and Kupffer cells (CD68) 49 on adjacent tissue sections of TOF-SIMS (Supplementary Fig. 28 and Methods). The spatial distribution pattern of endothelial and Kupffer cell clusters identified by SEAM showed high similarity to the spatial distribution of LYVE1 and CD68 ( Supplementary Fig. 29c,d). SIMS-Diff identified differential ion species among the subpopulations (Fig. 3a,b and Supplementary Fig. 30a  of the hepatocyte subpopulations whose spatial distribution was of special interest. First, we found that hepatocyte C1 (   Left, schematic diagram of strategy of measuring cell-to-CV distance. Right, hepatocyte C1 shows significantly smaller distance to CV than other clusters. f, Ion species show zonation-like distribution. Left, schematic diagram of strategy of measuring metabolite-to-CV distance: concentric circles with distance of arithmetic sequence from CV partition the liver lobule into nine zones. Right, metabolic markers of hepatocyte C1 show gradient decrease away from CV. x axis, zone number; y axis, enrichment score of each ion species, which is the proportion of hepatocytes that highly express each ion species in each zone. g, Left, relative intensity plot shows that ion species with zonation pattern (red) stand out among all peaks. x axis, m/z; y axis, relative intensity (log 10 fold change of metabolic intensities between C1 and other hepatocyte clusters). Top right, mass spectrum peaks of glucose standard. x axis, m/z; y axis, intensity.
at CV (Fig. 3d). To unbiasedly search for the spatial niche of distinct metabolic pattern, we performed pixel-neighborhood analysis (Methods) and identified an area consistent with CV in H&E staining ( Supplementary Fig. 33e). Further quantitative analysis revealed that the cells in hepatocyte C1 showed significantly smaller distances from the CV compared with the other hepatocyte groups (P < 10 −9 , one-sided Wilcoxon rank sum test) (Fig. 3e). We also found a series of ion species markers of hepatocyte C1 that displayed gradual changes along the liver lobule, and this zonation pattern showed consistency with the reported spatial transcriptome 23 (Fig. 3f). Additionally, replicate experiments on different CV regions also showed consistent metabolic patterns and cluster-specific metabolic profiles but no such pattern was observed in the portal node regions ( Supplementary  Figs. 34a and 35). Consistent with the spatial expression of GLUL 23 , the spatial pattern of m/z 58.00, 59.01, 69.00, 71.02, 87.00 and 101.03 showed higher expression in the nearest 1-2 layers of hepatocytes from CV (Fig. 3f). We further conducted IHC of two liver zonation markers, glutamine synthetase (the protein encoded by GLUL) and cytochrome P450 2E1 (CYP2E1), on adjacent slides, and confirmed the liver zonation pattern ( Supplementary Fig. 34b-d).
To identify potential metabolites contributing to this liver zonation pattern, metabolite standards were analyzed using TOF-SIMS (Supplementary Table 9 and Supplementary Fig. 36). The series of CV-enriched ion species was consistent with the abundant ion species from the glucose mass spectrum in TOF-SIMS analysis (Fig. 3g). This suggests that the metabolite enriched at the CV region was most likely to be glucose. This finding was consistent with previous knowledge that glucose uptake takes action at the CV 50 . This example provided SEAM with a positive control that can accurately and comprehensively characterize the spatial heterogeneity within a well-studied tissue microenvironment.
SEAM identified hepatocyte subpopulations associated with fibrotic niche. Liver cirrhosis is a major killer, and progressive liver fibrosis often results in liver cirrhosis 51 . Having been proved effective in the case of wild-type mouse liver, SEAM was applied to human liver fibrosis to characterize the metabolic microenvironment around a fibrotic niche. We hypothesized that there should be metabolic alterations of hepatocytes around the fibrotic niche, and such an alteration might be associated with the distance between hepatocytes and the fibrotic boundary (FBD) at a local scale.
To test this hypothesis, we collected ten nontumor tissue regions from three patients with liver cancer (Supplementary Table 2) and made sequential 10-μm slides for SIMS and other assays. We selected four regions from one sample, each containing a fibrotic niche, and conducted SIMS experiments (Fig. 4a,b). The resulting data consist of approximately 200-300 ion species after spectral peak selection and filtering (Methods). The color-coded pixel visualizations produced by SIMS-View depicted a qualitative spatial pattern within each region (Fig. 4c left column). To quantitatively characterize the cell composition and spatial organization, SIMS-Cut detected 902, 716, 546 and 682 nuclei in four square regions, respectively. SIMS-ID and SIMLR were subsequently performed to get metabolically distinct cell subpopulations. The consistent manifolds and clusters shown by UMAP (Fig. 4c middle column) and the spatial single nucleus map (Fig. 4c right column) confirmed the reliability and robustness of SEAM. The identified subpopulations, corresponding to Kupffer cells, immune cells, fibroblasts, endothelial cells and three subpopulations of hepatocytes, exhibited the specific spatial distributions (Fig. 4d) and the matching metabolic fingerprints (Fig. 4e and Supplementary Table 10). The correspondence and incongruity between cell subpopulations of human and mouse liver samples were also analyzed (Methods and Supplementary Fig. 37).
We observed that hepatocyte C1 was visually localized near the FBD, and its associated metabolic markers, for example m/z 69.00, 55.02 and 56.99, showed a consistent spatial pattern across ten regions (Fig. 4f,g and Supplementary Figs. 38 and 39). To quantify the association between the hepatocyte metabolic alteration and the distance from the FBD, we separately conducted two statistical analyses on ten regions from three patients (Supplementary Table 2) given defined FBD (Methods and Supplementary Fig. 41 second column): the distance from FBD to hepatocytes C1/C2 (distance-based analysis) and the normalized count ratio between hepatocytes C1 and C2 (count-based analysis). Using R1 as a demonstration, we first defined five zones (zones 0-4) with increasing areas (Fig. 4h left), each representing an accumulative territory between the FBD and the corresponding parallel strip (parallel strips are indicated by gray solid lines, and the accumulative territories of zones are indicated by gray dotted brackets), then the distances from FBD to hepatocytes C1/C2 within the five zones were subsequently summarized by a series of paired boxplots (Fig. 4h right, for ten regions). Meanwhile, we calculated the normalized count ratio between hepatocytes C1 and C2 within an area as a function of the distance from the outer edge (indicated by the gray solid line in Fig. 4i left) to the FBD ( Fig. 4i right, for ten regions). The result of the distance-based analysis showed that hepatocyte C1 was significantly closer to FBD than C2 to FBD within the five zones (one-sided Wilcoxon rank sum test, Fig. 4h right, for ten regions), and the relative proximity exhibited high similarity across ten regions (Supplementary Fig. 41 third column). Complementarily, the count-based analysis showed that the normalized count of C1 was consistently higher than C2; specifically, C1 was about 30-50% denser than C2 within 100 µm (a typical hepatocyte size is roughly 25 µm) of the FBD and reduced quickly to about the same level as C2 after around 350 µm (Fig. 4i right, for ten regions), and this trend was highly similar across ten regions ( Supplementary Fig. 41 fourth column). Details of FBD determination, zone partition, distance and normalized count ratio calculation, as well as other necessary term definitions, are described in Methods. The above statistical analyses verified our hypothesis that the metabolic alteration of the hepatocyte subpopulations might be associated with spatial proximity to the fibrotic niche. To verify that the variation of microenvironment was not only reflected at the metabolic level, we subsequently performed Geo-seq, a spatial transcriptome assay at the same ROI of different hepatocyte subpopulations.
Transcriptome validation on hepatocyte subpopulations. To get a deeper understanding of the SEAM results, we performed Geo-seq of the transcribed RNA samples isolated from the tissues of the corresponding ROI from adjacent slides (Fig. 5a,b and Supplementary  Fig. 42) with a modified protocol (Methods). To increase reproducibility, multiple adjacent slides were used ( Supplementary  Figs. 43-46). The Geo-seq slides showed high continuity with the corresponding SIMS slides in terms of spatial histology (Fig. 5b). Hepatocyte C1 from SEAM's result, which was proximal to the fibrotic niche and enriched with the ion species m/z 69.00 series, was defined as Hepa 69-high , whereas hepatocyte C2, which was distal and not enriched with the ion species m/z 69.00 series, was defined as Hepa 69-low . We also collected the fibrotic regions, as was done for the FBD samples. In total, 15 complementary DNA libraries were constructed successfully (Hepa 69-high N = 6, Hepa 69-low N = 5 and FBD N = 4). Principal component analysis (PCA) indicated that two different groups (Hepa 69-high -proximal and Hepa 69-low -distal) of hepatocytes shared higher similarity relative to FBD samples (Fig. 5c). More importantly, Hepa 69-high samples were consistently closer to FBD samples than Hepa 69-low to FBD samples in PCA space (Fig. 5c and Supplementary Fig. 47). To validate the expression pattern of each group, we first compared gene expression profiles between hepatocytes (that is, Hepa 69-high  Figs. 48 and 49). Up-regulated DEGs were mainly involved in liver biosynthesis pathways for both Hepa 69-high and Hepa 69-low groups and down-regulated DEGs were highly enriched in lymphocyte activation and humoral immune response pathways. We further looked at the well-known marker genes specific for hepatocytes (ASL, HP and SAA1), fibrosis (TGFB1, PDGFB and COL4A1) and immune response (IGHM, IGHG3 and IGHV4-59). Both hepatocyte groups showed high levels of hepatocyte marker genes, whereas genes typically activated in fibrotic regions for fibrosis and immune response were highly expressed in FBD samples (Supplementary Fig. 50). There were 718 DEGs fitting the criteria of adjusted P < 0.05 and log fold-change standard error <3. The expression heatmap indicated that these genes had different expression patterns between the proximal (Hepa 69-high ) and the distal (Hepa 69-low ) hepatocytes (Fig. 5d). We used the DEGs for GO enrichment analysis (Fig. 5e). There were 17 genes enriched in the first GO entry, of which 16 were consistently higher in Hepa 69-high than Hepa 69-low (Fig. 5f). Genes for solute carrier transporter families with different functions were enriched in the fibrosis-proximal (Hepa 69-high ) group, indicating that the corresponding metabolite transmembrane exchange activities were elevated.

discussion
In this study, we developed SEAM, a platform combining experiments and computational algorithms to quantitatively characterize metabolic intra-or intercellular features with multiscale spatial resolution. Unlike other IMS instruments such as DESI (40-60 μm) 11 , SIMS can provide high spatial resolution (HSR), allowing one to visualize detailed metabolic structures in tissue histology. With fast and minimal sample processing, SIMS maximally preserves the native state of samples. Given the nature of SIMS, although it breaks most of the molecules into fragments, thereby making it more difficult to annotate (a common challenging issue for mass spectrometry studies), it produces a high multiplexity of metabolic features with the potential of characterizing cell and fine tissue microenvironment. Benefiting from both HSR and high multiplexity of SIMS, the algorithms of SEAM start solely from the features generated by SIMS and run a pipeline enabling metabolic fingerprint analysis from pixels to single nuclei, then to the selected metabolic features with spatial information annotated. Previously, there have been reports on spatial metabolic features at tissue level or in vitro single-cell level 17 . This study shows the segmentation and analysis of single nuclear metabolic feature profiles directly on tissue sections. Also, this algorithmic pipeline is, in principal, scalable to other spatial-omics studies based on other IMS platforms, transcriptomics and proteomics with minimum adjustments (Supplementary Notes and Supplementary Figs. 52-55), and it easily works together with additional bioinformatics tools such as CIPHER to predict and prioritize disease-related metabolic molecules 52 .
Apart from the scalability of SEAM's algorithms, we have demonstrated that the range of SEAM applications could cover in vitro cell culture assays to various tissue samples. First, in the mixed cell-cultured assay, SEAM could easily deconvolute the different cell lines cocultured together. Additionally, in different wild-type mouse tissue samples, SEAM successfully segmented single nuclei without requiring extra labeling. The single-nuclear metabolic profile analysis was also consistent with conventional tissue histological characterization (Supplementary Figs. 3-5). Specifically, in the liver, a spatially well-orchestrated but complex organ, the CVportal node axis zonation has been well-established at single-cell transcriptome level in wild-type mouse 23 . We observed consistent zonation patterns at the single-cell level in the CV-centered region with a gradational decrease in certain characteristic metabolites. Last, we found that hepatocyte subpopulations (including C1) differentiated by different metabolic features were also transcriptionally distinct, as shown by Geo-seq (Fig. 5c-f). The elevated expression of solute carrier genes can potentially explain the enrichment of a list of metabolite species found by SEAM (Fig. 4). These genes are involved in amino acid transport (SLC36A4, SLC3A2 and SLC38A9) [53][54][55] , phosphate transport (SLC17A2 and SLC17A4) 56 and -aminobutyric acid transport (SLC6A12) 57 . SLC3A2 has already been reported to play a central role in fibronectin matrix assembly, which also concurs with our result, as the proximal samples were closer to the fibrotic region 54 . More importantly, SLC3A2 (4F2hc) and SLC7A6 (y+LAT2) have been reported to dimerize together 58 and mediate arginine, leucine and glutamine uptake, and their expression was elevated in the Hepa 69+ group ( Fig. 5f and Supplementary Fig. 51a). These findings indicate that the transmembrane transport of amino acids was increased in these hepatocytes, and might be caused by the fibrogenesis adjacent to the Hepa 69+ group. Further, we have made putative annotation of ion species relating to small biomolecules and fatty acids by searching databases and published studies 17,[59][60][61][62][63][64] (Supplementary Tables 7  and 8). Ion peak m/z 145.04 has been assigned to glutamine. More importantly, this peak was also abundant in our glutamine standard mass spectrum (Supplementary Fig. 36). Therefore, we were confident to assign m/z 145.04 as glutamine. We compared the relative level of glutamine (m/z 145.04) between the Hepa 69+ and Hepa 69group. The Hepa 69+ group has a significantly higher level of glutamine compared to the Hepa 69group ( Supplementary  Fig. 51b,c). Therefore, combining the evidence from SEAM and Geo-seq suggests that glutamine level was elevated in the Hepa 69+ group, and that this was potentially mediated by up-regulation of SLC3A2 and SLC7A6 transmembrane amino acid transport activity. In summary, the results indicate that changes in the spatial microenvironment (for example, fibrogenesis) could influence cellular metabolic homeostasis (for example, glutamine elevation), as well as alteration of gene regulation (for example, SLC3A2 up-regulation).
Although the investigation of lipids, amino acids, carbohydrates and nucleotides can be achieved with TOF-SIMS, there might still exist challenges in metabolite annotation and coverage. However, this does not affect the usability of SEAM. On one hand, SEAM's advantage is the identification of tissue histology patterns differentiated by high-resolution metabolic features, and such detected metabolic features provide necessary clues for further identifying possible metabolites by annotation tools [65][66][67][68] . On the other hand, SEAM can be combined with other IMS technologies and spatial-omics (Supplementary Notes). We believe SEAM is not restricted by certain IMS platforms and will gain more influence in various spatial-omics profiling applications.
In summary, SEAM provides a HSR single-nuclear metabolic profiling and analysis pipeline that requires minimal sample preparation and is label free. It is scalable to different biological samples ranging from cell culture assays to complex tissue samples. It can have a great impact on differentiating subtle tissue metabolic changes undetectable by or complementary to other conventional assays. With future improvements in IMS resolution and molecule annotation capability, SEAM would be able to provide more detailed spatial metabolome profiles with higher resolution and broader functionality.

online content
Any methods, additional references, Nature Research reporting summaries, source data, extended data, supplementary information, acknowledgements, peer review information; details of author contributions and competing interests; and statements of data and code availability are available at https://doi.org/10.1038/ s41592-021-01276-3.

IMS experiments. TOF-SIMS 5 (ION-TOF GmbH) equipped with a Bi liquid metal ion gun (LMIG) was used in this study, and we collected TOF-SIMS spectra and images of tissue samples using a 30 keV Bi 3
+ LMIG with a HSR mode. The Bi 3 + current in the HSR mode was 0.1 pA (100 ns pulse width, unbunched beam). The total Bi3 + accumulated ion dose was about 2.0 × 10 10 ions per cm 2 , and the typical probe sizes of the Bi 3 + LMIG was roughly 200 nm in HSR mode. The secondary ion images were acquired using Bi 3 + LMIG rastering over a 400 × 400 μm 2 area with 256 × 256 pixels. The Bi 3 + LMIG was operated at a cycle time of 150 μs (mass range 0-2,000 u). Negative spectra were mass calibrated using CH 2 − , O − , OH − and PO 2 − . A flood gun with low energy electrons was used to compensate for charge buildup on the sample surface. A 10-keV Ar 2500 + commercial gas cluster ion gun (GCIB) was used as a sputter gun (rastering over a 550 × 550 μm 2 area, incident angle 45°) to carry out the depth profiling. A final 2D image was an overlay of 80-120 layers of depth profiling scan images.
In initial cell analysis, a high mass resolution (HMR) mode was used with 0.8 pA (<1 ns pulse width, bunched beam) Bi 3 + current, and the mass resolutions (measured at C 2 H − ) were typically >6,000. The total Bi3 + LMIG accumulated ion dose was between 10 11 and 10 12 ions per cm 2 , rastering over a 300 × 300 μm 2 area with 256 × 256 pixels. The Bi 3 + LMIG was operated at a cycle time 150 μs (mass range 0-2,000 u). Negative spectra were mass calibrated using CH 2 − , O − , OH − , and PO 2 − . A flood gun with low energy electrons was used to compensate for charge buildup on the sample surface. A 10-keV Ar 2500 + commercial gas cluster ion gun (GCIB) was used as a sputter gun (rastering over a 450 × 450 μm 2 area, incident angle 45°) to carry out the depth profiling. A final 2D image was an overlay of 50-80 layers of depth profiling scan images.

Peak selection.
To avoid noise interference and improve follow-up analysis efficiency and accuracy, picking out peaks from a full spectrum was necessary. A peak search process in SurfaceLab was carried out with the parameters as follows: mass range 50-500, minimum counts 10,000 and minimum signal to noise ratio 1,000. Typically, 200-500 peaks were picked out from a full spectrum. SIMS data preprocessing. Each peak corresponds to a highly spatially resolved and spectrally filtered ion image: the former originated from a specific one or a class of chemical substances in the tissue sample, while the latter shows its characteristic spatial distribution features in this tissue square (Fig. 1a, top right). For further data analysis, each ion image can be exported as an American Standard Code for Information Interchange mode data file by the SIMS built-in data processing software SurfaceLab, which contains three columns corresponding to the xy axes coordinates and signal intensity values. Overlaying ion images were processed by ImageJ FIJI (v.1.53a) 69 . Table 9 were dissolved into double distilled water. Cholic acid was dissolved into methanol (high-performance liquid chromatography level, Sigma) then diluted with double distilled water. Metabolite solution was dropped onto on a glass slide (CITOGLAS), dried in a sterile airflow cabinet and then processed into TOF-SIMS for analysis using the same parameters as for the HMR mode. Peaks were filtered without low intensity (<mean of intensity) and background peaks from the glass slides.
Bromodeoxyuridine (BrdU) cell mix-culture experiment. Following protocol from the previous study, A549 and HeLa cell lines were both cultured with and without 20 μM BrdU (Sigma) for 48 h before seeding. A549 with BrdU were then replated with non-BrdU Hela at the same density on microscope cover glass (CITOGLAS) for 20 h and vice versa for non-BrdU A549 and Hela with BrdU. The same mix-culture procedure for iododeoxyuridine (Sigma) was applied at Hepa 1-6 and NCTC 1469 cell lines.
Mice. C57BL/6N mice were purchased from Charles River. All mice were housed in isolated ventilated cages (maxima six mice per cage) in a barrier facility at Tsinghua University. The mice were maintained on a 12/12-h light/dark cycle, 42% humidity, 22-26 °C with sterile pellet food and water ad libitum.
The laboratory animal facility has been accredited by AAALAC (Association for Assessment and Accreditation of Laboratory Animal Care International) and the Institutional Animal Care and Use Committee of Tsinghua University approved all animal protocols used in this study (Animal Welfare Assurance no. F16-00228 (A5061-01)).

Nontumor liver tissues from patients with intrahepatic cholangiocarcinoma (ICC).
The ICC nontumor liver tissues were obtained from leftover pieces from surgery. The protocol of this study was compliant with the principles of the Declaration of Helsinki and was also approved by the Institutional Review Board and Ethics Committee of Peking Union Medical College Hospital (JS-2492). Informed consent was obtained from all sample donors.
Tissue section preparation. Mouse and human tissues were isolated individually and embedded in Optimum Cutting Temperature compound (SAKURA), then snap-frozen in liquid nitrogen. Cryosections were taken using CM1900 Cryostat (Leica) to obtain 3-10 μm continuously adjacent sections.
Histology staining. Tissue cryosections were thawed at room temperature for 5 min then washed in PBS twice for 5 min each time. Slides were fixed in 4% paraformaldehyde (PFA) for 20 min at room temperature then washed in PBS once. H&E stainings were then performed using the H&E staining kit (Leagene). Images were obtained from Axio Scan. Z1 (Zeiss) and processed via ZEN (Blue edition) v.3.1 or obtained from Cytation5 (Biotek) and processed via its software Gen 5.
IHC. Tissue cryosections were thawed at room temperature for 5 min then washed in PBS twice for 5 min each time. Slides were fixed in 4% PFA for 20 min at room temperature then washed in PBS once. Samples were permeabilized and blocked in 5% BSA solution (Sigma) with 0.4% Triton-X100 (AMRESCO) for 2 h at room temperature. We carried out primary antibody incubation using glutamine synthetase antibody (ab176562, Abcam, 1:200 dilution) and cytochrome P450 2E1 antibody (ab28146, Abcam, 1:300 dilution) in PBS with 0.1% Triton-X100, and incubated this in a humid dark chamber at 4 °C overnight. Next, it was washed three times in PBS with 0.1% Triton-X100 for 10 min each time. We carried out secondary antibody incubation using HRP conjugated goat-anti-rabbit antibody (7074, Cell Signaling, 1:500 dilution) in PBS with 0.1% Triton-X100 and incubated this in a humid dark chamber at room temperature for 2 h. We washed this three times in PBS with 0.1% Triton-X100 for 10 min each time. We applied 1× DAB (3,3′-diaminobenzidine) solution using a DAB staining kit (PA110, Tiangen) for 3-5 min. Slides were mounted using ProLong Gold Antifade Mountant (ThermoFisher). Images were obtained from Axio Scan. Z1 (Zeiss) and processed via ZEN (Blue edition) v.3.1 or obtained from Cytation5 (Biotek) and processed via its software Gen 5.
Modified Geo-seq. We modified a spatial transcriptome analysis method, Geo-seq, previously described by Chen et al. 70 . Tissue cryosections were mounted on the polyethylene naphthalate membrane slide and stored at −80 °C for short-term storage. Slides were stained in 0.5% cresyl violet and dehydrated in serial ethanol. Tissue blocks were obtained in a 0.2-ml PCR tube by LMD7000 (Leica). Buffer RLT (Qiagen) with dithiothreitol (DTT) (Sigma) was added and shaken vigorously for tissue lysis and RNA release. RNA Clean beads (Vazyme) 1.8× were added to isolate total RNA. We prepared the annealing procedure in the same tube with 3 μl of H 2 O, 1 μl of dNTP, 1 μl of Oligo(dT) and 0.5 μl of RNase Inhibitor (Life Technologies). This was incubated at 72 °C for 3 min and immediately transferred to ice for 2 min. We prepared the reverse transcription reaction in the same tube with 2 μl of 5× RT buffer, 0.5 μl of DTT, 0.5 μl of RI, 0.5 μl of Template Switch Oligo (Sangon Biotech) and 1 μl of maxima reverse transcriptase (Life Technologies).
This was incubated at 50 °C for 1 h and reverse transcriptase was deactivated at 85 °C for 5 min. We amplified the first strand product with 12. Cell-type annotation. The cell-type annotation was performed by experienced pathologists with reference to the H&E staining. To double confirm the cell-type annotations with morphology and protein markers, we performed SIMS and H&E staining on the same tissue sections, and further conducted immunofluorescence staining of CYP2E1, CD68 and LYVE1 on the adjacent section.
RNA-sequencing (RAN-seq) data processing and analysis. RNA-seq data were first obtained through adapter removal and quality filtering by Trim Galore 71 . The qualified reads were then mapped to the human gencode reference genome using STAR and used to generate BAM files 72,73 . Samtools was used to index the BAM files 74 . Duplication was removed by PICARD (http://broadinstitute.github. io/picard/) for all the BAM files. The read count for each gene was performed by HTSeq-count with reference to gencode human gene annotation, release 32 (GRCh38.p13) 72,75 . Different gene expression analyses were analyzed using DESeq2 in R 76 . We carried out GO analysis using clusterProfiler 77  For H&E staining assays, the N of mouse tissue experiments ( Fig. 1 and Supplementary Figs. 28, 34a and 35a-h) were as follows: liver (N = 3 independent mice), kidney (N = 2 independent mice), stomach (N = 2 independent mice) and pancreas (N = 2 independent mice). The N of human tissue experiments (Figs. 4a,b and 5b) were as follows: ICC nontumor liver tissues (N = 3 independent patient samples).
For IHC and immunofluorescence staining assays, the N of mouse tissue experiments ( Supplementary Figs. 28a,b and 29c,d and 34b-d) were as follows: glutamine synthetase on liver (N = 2 independent mice), Cyp2e1 on liver (N = 4 independent mice), CD68 on liver (N = 2 independent mice) and LYVE1 on liver (N = 2 independent mice).
For F-actin/neutral lipid staining assays, the N of human tissue experiments ( Supplementary Fig. 40a) were as follows: ICC nontumor liver tissues (N = 2 on one patient sample).
For SIMS-Cut comparison on public MIBI data, the N of MIBI data (Supplementary Figs. 52 and 53) were as follows: triple-negative samples from patients with breast cancer (N = 36 fields of view on 36 patients).

SIMS-Cut framework.
Given an M × N × N SIMS data, with M filtered metabolic peaks and N × N images as input, SIMS-Cut first select M metabolites colocalizing with nucleus ( Supplementary Fig. 6a), and then iteratively solves a maximum a posteriori problem (Supplementary Fig. 6d) to get an N × N binary matrix Y that indicates a nucleus.
Since the SIMS data are superimposed on a certain thickness of biological slice in its nature, we regard the segmented nucleus region as a cell containing molecular fragments in both cytoplasm and nucleus. The main part of SIMS-Cut can be formulated by finding an optimal Y * : where N], and xij ∈ R m , which is the M-dimensional metabolic density at the coordinate of (i,j). This Bayesian formulation aims to find the optimal label assignment Y* that produces the maximum posterior probability given X.
As with traditional hidden Markov random field-based image segmentation 78,79 , SIMS-Cut uses a similar graphical model, consisting of p(Y), the smoothing model for unknown label field Y before guaranteed spatial homogeneity and p(X|Y), the data model for the conditional distribution of pixel metabolic profiles X given corresponding pixel labels.
Smoothing model. The label prior, p(Y) is modeled as a special Markov random field called the Potts model 36 . According to the Hammersley-Clifford theory 80,81 , p(X) follows a Gibbs distribution 82 : where U is the energy function, which is calculated by summing over the potential of all second-order cliques V, and each clique corresponds to a pair of neighboring pixels (for example, the four-neighborhood system). Z is a partition function, making p(Y) a valid probability density function.
V is defined on doubletons, penalizing the heterogeneity of labels: Data model. According to the graphical model ( Supplementary Fig. 6b) and d separate 31 , [1,N] p(xij|yij) While the multivariate Gaussian distribution is typically suited to the data model of color image segmentation 83,84 , its model capacity is limited and its assumptions are too strong for SIMS data. Instead, we use RBMs 32-35 to model the conditional distribution of data intensities given label assignment.
RBM as a generative model is typically a two-layer bipartite undirected graph. It is composed of a visible layer that is an M-dimensional metabolic profile in our case, and a hidden layer, which is a kind of d-dimensional memory providing model capacity. In theory, RBM is a universal approximation for any probability density function with a large enough number of hidden layers 34 . Here we use two separate RBMs to model p(x ij |y ij = 0) and p(x ij |y ij = 1), respectively, and we describe one RBM in the following. For the sake of notation simplicity, in the following, we , p ∈ [1, m] to denote x ij (the subscript is removable thanks to the conditional independence given by equation (7)). The graphical model of RBM is shown in Supplementary Fig. 6c.
, q ∈ [1, d] is the hidden layer variable, and V is the visible layer variable.
are parameters. The joint probability density function is where E is the energy function and Z is the partition function The probability that an RBM model assigns a vector V, for example x ij , is given by Note that the superscripts indicate the parameters of a specific RBM.
Partition function of RBMs estimation. For a specific pixel given its segmentation label a, the log probability that the RBM assigns metabolic profiling x ij is computed as Here F a (x ij ) is the free energy of the RBM corresponding to class a, which can be rapidly calculated. To estimate the partition function Z, we build a softmax model to classify x ij at every pixel to its label y ij : Maximum a posteriori. Our objective can be expressed as As it is a nonconvex problem, we develop an EM-style algorithm to alternate between subproblems to reach a locally optimal point iteratively (Supplementary Notes).

SIMS-ID framework.
After SIMS-Cut, hundreds of separated nuclei have been detected from an N × N image, each pixel containing M-dimensional metabolic profiles. Thus, each nucleus contains a diverse number of connecting pixels, represented by fixed dimensional vectors. SIMS-ID conducted an auxiliary classification task to assign a single fixed dimensional vector to each nucleus, which is robust to over/under segmentation in SIMS-Cut. The representation learned by SIMS-ID compresses all the pixel metabolic information using a distilled softmax space 85 , regarding a nucleus as a whole while including distribution information of pixels. A fixed dimensional representation of the nucleus helps further analysis of single-nuclei data, such as clustering, visualization and so on. More detailed information can be found in the Supplementary Notes.

Clustering.
Represented by fixed-length vectors, the nuclei can be straightforwardly clustered and visualized in low-dimensional space. The number of cells that one SIMS experiment captures typically ranges from 400 to 1,000, and the length of the representation vector for each cell is equal to the number of pixels within segmented cells, typically ranging from 5,000 to 15,000. With the consideration of both data characteristics and experimental performance ( Supplementary Fig. 22), we apply SIMLR 41 , a single-cell clustering algorithm, which automatically learns the low-rank similarity matrix by means of multiple kernel ensemble. SIMLR also provides a means of estimating the number of clusters, which we can take as a guideline to explore populations of metabolic cell states in different scales.
SIMS-Diff framework. The goal of this algorithm is quantification of the feature's discriminative power to tell clusters apart. Owing to the nature of our data, the traditional two-sample test cannot be directly applied. We assume that discriminative features can produce a similarity matrix with a block diagonal structure. Therefore, we use the ratio between BCV and within-cluster variation to evaluate the compactness of the similarity matrix. For each feature, we use the earth mover's distance 42 as a metric for two nuclei represented by histograms, and the variation can be simply evaluated by summing all pairwise distances. More detailed information can be found in the Supplementary Notes.
Pixel-neighborhood analysis. Pixel-neighborhood analysis is a tool for computationally identifying spatial niches or microenvironments within a SIMS field of view. The main idea came from the cellular neighborhoods analysis 86 . We modified some portions and added some visualizations to fit our case. Pixel-neighborhood analysis contains four steps, and more detailed information can be found in the Supplementary Notes. Gradient orientation analysis. Gradient orientation analysis is a tool to visualize the gradient quantification of metabolic features within a SIMS field of view. More detailed information can be found in the Supplementary Notes.

Multimodal intersection analysis between mouse and human liver samples.
To access the correspondence between clusters identified in mouse and human samples, we adopted modified multimodal intersection analysis 87 . Specifically, we ranked metabolites by the score computed using SCANPY 88 , which is z-score underlying the computation of a P value (Student's t-test) for each gene for each cluster. Next, gene sets of each cluster were defined as genes with the top 20 associated scores. The significance of the intersection of gene sets between any pair of clusters was inferred using the hypergeometric distribution. The multimodal intersection analysis map was finally displayed as a heatmap, with each element defined as the negative logarithm P value (hypergeometric test) of the corresponding cluster pair.

Statistical analysis of human samples.
To exactly describe the statistical analysis in Fig. 4, we defined the following terms: FBD Ri is the FBD of region Ri; PSP(j, FBD Ri ) is a parallel strip whose distance to FBD Ri is equal to j µm; AREA(j,i) is the territory between FBD Ri and PSP(J, FBD Ri ); zone(j, i) is short for AREA((j + 1) × 100,i); CFBD(cell i , zone(j, k))is the distance (µm) between cell i and FBD Rk within zone(j, k); NCC(population i , area 1 , area 2 ) is the ratio between the number of cells in population i within area 1 and the number of cells in population i within area 2 .
The FBD is approximated according to SIMS-View and the spatial single-nucleus map ( Supplementary Fig. 41). Coming to cases where FBD could not be well fitted by a single line segment, polylines are used and the distance to FBD is simply adjusted to be the smallest among distances to all line segments.
All parameters of boxplots are set as defaults using Seaborn (https://seaborn. pydata.org), a Python statistical data visualization toolbox. Statistical testing was performed by Scipy 89 . Neural network modules were dealt with using Keras (https://keras.io).
Boxplots. All boxplots in the main and supplementary figures share the same settings: the lower and upper hinges show the first and third quartiles (the 25th and 75th percentiles); the center lines correspond to the median; the upper whisker extends from the upper hinge to the largest value, which should be less than 1.5× the interquartile range (or distance between the first and third quartiles) and the lower whisker extends from the lower hinge to the smallest value, whic is at most the 1.5× interquartile range. Data beyond the end of the whiskers are 'outlying' points and are plotted individually.
Datasets. Detailed information of simulated datasets (Datasets 1 to 9) and mixture cell datasets (Datasets 10 and 11) can be found in Supplementary Notes.
Reporting Summary. Further information on research design is available in the Nature Research Reporting Summary linked to this article.