Mass Spectrometry Imaging Data Analysis with ShinyCardinal

doi:10.21203/rs.3.rs-4072606/v1


Article


https://doi.org/10.21203/rs.3.rs-4072606/v1

This work is licensed under a CC BY 4.0 License

Version 1



Given the rapid growth and improvement in both mass spectrometry imaging (MSI) techniques and their applications, there is a critical need for efficient and comprehensive computational tools for MSI data analysis. We therefore introduce ShinyCardinal, an open-source and vendor-neutral software tool that covers all steps in MSI data analysis. It builds on the R package Cardinal and extends its functionality with several important additional features, such as removal of background noise and matrix peaks, deisotoping, absolute quantification, network analysis, and metabolite identification. ShinyCardinal is built as a desktop application (https://shinycardinal.sourceforge.io) with a graphical user interface designed to provide users with a stable, consistent, and user-friendly data analysis framework. The versatility and capabilities of ShinyCardinal are demonstrated with nine MSI datasets acquired on different platforms.

Biological sciences/Biological techniques/Analytical biochemistry

Biological sciences/Biological techniques/Mass spectrometry

Biological sciences/Computational biology and bioinformatics/Software

Mass spectrometry imaging (MSI) has become an important technique for spatial proteomics and metabolomics ^1–3 owing to several unique features, such as label-free and non-targeted detection, high sensitivity, exceptional mass resolution and remarkable spatial resolution. Indeed, it has been routinely applied to map the spatiotemporal organization of proteins, peptides, lipids, metabolites, drugs and elements across a wide spectrum of biological samples, in two dimensions (2D) or three dimensions (3D), on the scales of organisms, organs, tissues, or cells ^4,2.

While the field of MSI has seen significant growth in recent years, MSI data analysis is still challenging ^5,6. Advances in the mass resolution, spatial resolution, and throughput of MSI have led to a substantial increase in data size ^7,8. For instance, a typical MSI dataset consists of 5,000–50,000 spectra, each having 10,000–1,000,000 mass features ^9,10. Datasets generated using ultra-high mass resolution and spatial resolution instruments (e.g., Fourier-transform ion cyclotron resonance mass spectrometry) or through 3D MSI are several times larger ^9,11. Such large, multidimensional datasets pose significant memory and computational challenges, known as ‘the curse of dimensionality’ ⁵. An additional obstacle is that raw MSI datasets are instrument-dependent and sometimes even binary encoded ¹², which hampers MSI data processing and management. Furthermore, MSI datasets are intrinsically complicated. For instance, they can suffer from strong pixel-to-pixel spectral variations, and their spectra often contain high-frequency noise and unresolved chemical background ¹³. As such, developing efficient, robust, and MSI-specific data processing methods is imperative.

A typical MSI data analysis workflow includes raw data import, preprocessing, data mining, visualization, and identification. A plethora of software packages have been developed for MSI data analysis. These include commercial tools such as SCiLS Lab (Bruker), MSiReader ¹⁴ and LipostarMSI ¹⁵, as well as free and open-source solutions like METASPACE ^16,17, rMSI ¹⁸ and Cardinal ^19,20. For a comprehensive overview and comparison of different MSI software, see refs. ^8,12,21. Among them, Cardinal is the most promising open-source tool due to its robust support for most MSI data analysis steps along with its capability to handle large-scale data through parallel and out-of-memory computing ²⁰.

However, one major limitation of Cardinal is the absence of a graphical user interface (GUI). Users must have a certain level of R programming knowledge and a thorough understanding of MSI data structure to analyze the data. In addition, the lack of interactivity compounds the challenge of parameter selection and optimization for each data analysis step. Moreover, several crucial data processing steps, such as data cleaning, absolute quantification, and identification, are not yet supported by Cardinal. To address these issues, we introduce ShinyCardinal, a comprehensive and vendor-neutral software solution available as an R package, web application and standalone desktop app, for rapid, responsive, and effortless MSI data analysis. In particular, the introduction of data cleaning, MSI data cropping, absolute quantification, network analysis and identification steps makes it a more powerful MSI data analysis platform.

Overview of ShinyCardinal Pipeline

ShinyCardinal is a comprehensive and vendor-neutral software tool for MSI data analysis. It is built upon the R package Cardinal ^19,20 following the golem schema for production-grade app development ²². ShinyCardinal is available as an R package, web application (https://gincpm.shinyapps.io/ShinyCardinal/) and standalone software (https://shinycardinal.sourceforge.io). The workflow covers all the steps necessary for MSI data analysis: data import, preprocessing, data cleaning, image visualization, region-of-interest analysis, statistical analysis, absolute quantification, image segmentation, network analysis, metabolite identification, and data export (Fig. 1). A series of tutorial videos (https://www.youtube.com/@MSI_WIS/videos) and a built-in user guide are provided to streamline the data analysis. Given the computational cost of processing MSI datasets, particularly during data preprocessing, ShinyCardinal allows users to save the preprocessed data in RDS format for future use and thus avoid repeated preprocessing. Notably, each step in ShinyCardinal functions as a module, allowing users to analyze data selectively with preferred modules or collectively and iteratively.

Data import and Preprocessing

The purpose of preprocessing is to reduce experimental variance within the MSI dataset and prepare it for subsequent statistical analysis ²³. ShinyCardinal allows uploading and processing multiple MSI datasets simultaneously. It supports both processed and continuous imzML formats ²⁴, in profile or centroid mode (Fig. 1). The preprocessing steps of ShinyCardinal include normalization, peak picking, peak alignment, and optionally, spectral smoothing and baseline reduction. These steps are identical to those described for the R package Cardinal ²⁰, except that ShinyCardinal provides interactive mass spectra to help users choose, evaluate and optimize the parameters for each preprocessing step. The overall preprocessing time varies from several minutes to a few hours depending on the computational resources, the MSI data size (e.g., number of mass features and pixels), and notably, the format of the imzML data (i.e., processed or continuous). Further details illustrating the relationship between preprocessing time and MSI data size, format, and parallel computation can be found in Supplementary Table 1.

Data cleaning

MSI data is intrinsically complex due to the presence of multiple ‘redundant’ ion species. These include background noise originating from different sources of contaminants, MALDI matrix peaks (unique to MALDI-MSI datasets), and isotopic peaks. This complexity impedes downstream data analysis and leads to false hits in metabolite identification ^25,26. To tackle this challenge, ShinyCardinal provides two core modules for MSI data cleaning: (i) background noise and matrix peak removal, and (ii) deisotoping.

A few software tools have been developed to detect and remove MALDI matrix peaks, such as OffSampleAI ²⁶, rMSIcleanup ^27,28 and mass2adduct ²⁹. These methods either require a predefined list of matrix adducts or assume a matrix-like spatial distribution, which limits the untargeted applications of MSI. By contrast, ShinyCardinal is based on the principle that all MALDI matrix-derived peaks show very similar spatial distributions; they can therefore be extracted through colocalization analysis against a reference matrix peak (Fig. 2a). Similarly, different sources of background noise can be detected with reference noise peaks. In a test case, matrix removal was performed on a mouse testis sample analyzed in negative ion mode with 1,5-diaminonaphthalene (DAN) as the matrix. In total, 83 mass peaks were extracted using m/z 313.1458 ([2M-H-H₂]⁻), a known DAN matrix peak, as the reference peak. Among them, 70 were putatively identified as DAN-derived peaks (Fig. 2b-d, Supplementary Table S2), and all 83 peaks show very similar spatial distributions (Pearson correlation coefficient ≥ 0.9; Supplementary Fig. 1).
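
The colocalization principle described above can be illustrated with a minimal Python sketch. It is not ShinyCardinal's implementation: the function names, the toy intensities and the two non-reference m/z values are hypothetical, and each ion image is assumed to be flattened into a list of per-pixel intensities. Only the reference m/z (313.1458) and the 0.9 cutoff are taken from the text.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length intensity vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0  # a flat image carries no spatial information
    return cov / sqrt(vx * vy)

def colocalized_peaks(images, reference_mz, cutoff=0.9):
    """Return the m/z values whose ion image correlates with the reference at >= cutoff."""
    ref = images[reference_mz]
    return sorted(
        mz for mz, img in images.items()
        if mz != reference_mz and pearson(ref, img) >= cutoff
    )

# Toy data (hypothetical intensities): one flattened ion image per m/z.
toy_images = {
    313.1458: [9.0, 8.0, 1.0, 0.5, 9.5, 0.2],     # reference matrix peak from the text
    157.0771: [18.0, 17.0, 2.5, 1.0, 19.0, 0.5],  # hypothetical matrix-like peak
    760.5851: [0.3, 0.5, 7.0, 8.0, 0.2, 9.0],     # hypothetical tissue-localized peak
}
matrix_like = colocalized_peaks(toy_images, 313.1458)
```

In this toy example only the peak that follows the reference distribution is flagged for removal; the tissue-localized peak is kept.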

Deisotoping is another challenging problem for MSI data analysis due to the absence of physicochemical separation of ions generated on the sample tissue ³⁰. Currently, three software tools are capable of deisotoping MSI data: massPix ³¹, imShot ³⁰, and METASPACE ^16,17. Among them, METASPACE provides the most accurate and specific approach. For isotopic peak detection, it takes into account the mass accuracy, the spatial distribution similarity, and the spectral similarity between a predicted isotope pattern and the measured spatial intensities. Nevertheless, the primary aim of deisotoping in METASPACE is to decrease false positives in metabolite identification, and the detected isotope peaks are only displayed for identified mass features. Consequently, the deisotoped MSI data is not available for downstream data analysis. ShinyCardinal uses a similar algorithm, except that it does not consider the intensity proportions but instead applies a more stringent spatial distribution similarity score (default 0.9) for isotopic peak detection (Fig. 3a). The performance of ShinyCardinal was evaluated by manual inspection of the results and by comparing them with those obtained by METASPACE on a mouse brain MALDI-MSI dataset ³¹. Of the 698 mass features, 260 peaks were manually identified as isotopic peaks, of which 146 were detected by ShinyCardinal (Supplementary Table S3). Remarkably, the polyethylene glycol (PEG)-1450 polymer was found to be present within the mouse brain section, and all its corresponding isotopic peaks were accurately detected using ShinyCardinal (Supplementary Fig. S2).
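
As a rough illustration of pairing peaks by mass accuracy and spatial similarity, the Python sketch below searches for candidate ¹³C isotope peaks of singly charged ions. It is a simplification under stated assumptions, not the ShinyCardinal algorithm: the 5 ppm tolerance, the restriction to the first ¹³C peak, the function names and the toy similarity scores are all hypothetical. The m/z values are those of cluster 1 reported later in the text.

```python
C13_SHIFT = 1.003355  # mass difference between 13C and 12C, in Da

def ppm_error(observed, expected):
    """Relative mass error in parts per million."""
    return (observed - expected) / expected * 1e6

def find_isotope_pairs(mz_values, similarity, ppm_tol=5.0, score_cutoff=0.9):
    """Pair each peak with candidate 13C isotope peaks (singly charged ions only).

    similarity(a, b) is a caller-supplied function returning a spatial
    similarity score between 0 and 1 for two m/z values.
    """
    pairs = []
    mzs = sorted(mz_values)
    for mono in mzs:
        expected = mono + C13_SHIFT
        for cand in mzs:
            close_in_mass = abs(ppm_error(cand, expected)) <= ppm_tol
            if close_in_mass and similarity(mono, cand) >= score_cutoff:
                pairs.append((mono, cand))
    return pairs

# Toy similarity scores (hypothetical values); unlisted pairs score 0.
toy_scores = {
    (713.4532, 714.4565): 0.97,
    (772.5267, 773.5292): 0.95,
}
pairs = find_isotope_pairs(
    [713.4532, 714.4565, 772.5267, 773.5292],
    lambda a, b: toy_scores.get((a, b), 0.0),
)
```

A peak that matches in mass but not in spatial distribution is left in the dataset, which is the behaviour the stringent similarity cutoff is meant to enforce.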

Image visualization

The visualization of ion intensity distributions is a key component of MSI data analysis ¹⁵. ShinyCardinal allows users to visualize both ion images and segmentation maps. A range of parameters, such as contrast enhancement methods, smoothing techniques, and color scales, are provided for users to customize and fine-tune the ion images (Fig. 3b). Users can plot either a single ion image or multiple ion images, and multiple images can also be superimposed to compare the spatial localizations of different ions (Fig. 3c). Ion images can be exported separately or collectively as high-quality, publication-ready figures in PDF or PNG format. When the user clicks on the ion image, an interactive mass spectrum is generated for each selected pixel, allowing ion intensities to be compared across pixels (Fig. 3c).

Region of interest analysis

Region of interest (ROI) analysis holds great potential in identifying differences on a molecular level in small regions of tissue, for which the signal would be easily overlooked when employing non-imaging MS-based techniques, such as liquid chromatography (LC)-MS and gas chromatography (GC)-MS ^32,33. The accurate definition of ROIs enables the extraction of molecular abundances specific to each tissue type. This is crucial for statistically discovering molecular alterations either among different ROIs within the same sample (e.g., different tissue types and structural features) or between different samples at the same ROI (e.g., differentially expressed molecules at the same anatomical area between healthy and diseased samples) ³⁴. ShinyCardinal provides an interface for users to manually select ROIs based on either ion images or spatial segmentation maps. Apart from biomarker discovery, ShinyCardinal allows peak profiling and MSI data cropping through ROI analysis (Fig. 4).

The usage and efficacy of ShinyCardinal for ROI analysis were demonstrated on a MALDI MSI dataset collected from a purple tomato fruit section ^35,36. The purple anthocyanin-rich tomato fruit was generated by ectopic expression of the snapdragon ROSEA1 (ROS1, a MYB-type) and DELILA (DEL, a bHLH-type) transcription factors. Anthocyanin production in the ROS1/DEL tomato fruit was locally reduced by virus-induced gene silencing (VIGS), which led to the irregular accumulation of anthocyanins in tomato fruit at the red, ripe stage (Fig. 4a1). To compare the metabolic profiles between anthocyanin-free (A) and anthocyanin-rich (B) regions, two ROIs from each area were selected for statistical analysis and biomarker discovery (Fig. 4a2). The ROIs were defined based on the ion image of m/z 317.0656, which corresponds to the radical ion of the known purple pigment petunidin ([M]⁺.) in the tomato fruit. Alternatively, a segmentation map, e.g., a spatial shrunken centroids segmentation with 9 segments representing different anatomical features of the tomato fruit (see the Image Spatial Segmentation section for more details), can be used to select ROIs (Fig. 4a2). Furthermore, the defined ROIs can also be visually inspected to verify their accuracy (Fig. 4a3). Subsequently, a statistical test was performed for all mass features by comparing the ROIs belonging to the two regions. As a result, a table was generated containing the mean ion intensity for each ROI, the fold change, and the adjusted p-value for each mass feature (Fig. 4a4, Supplementary Table S4). This table is intended to help users identify potential biomarkers. In this case study, only two groups were considered, with two ROIs defined for each group, all derived from a single sample. However, users can define numerous groups and multiple ROIs originating from multiple samples in their own analyses.

ShinyCardinal also enables intensity profiling along the defined ROI line (Fig. 4b). For example, users can draw a line over the ion image or the segmentation map to capture a series of pixels along the ROI (Fig. 4b2), and subsequently visualize and compare the ion intensities of different mass features against the pixels as a line plot (Fig. 4b3). In addition, ShinyCardinal supports MSI data cropping, a feature akin to the scan scrubber tool described in MSiReader ¹⁴. Users can select ROIs (Fig. 4c2) and decide whether to keep or discard these ROIs in the MSI data (Fig. 4c3). This functionality is particularly useful for eliminating unwanted pixels and off-tissue data.
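
To make the idea of intensity profiling along a line concrete, the following Python sketch collects the pixels between two end points and reads out their intensities. The text does not describe how ShinyCardinal samples pixels along a drawn line, so Bresenham's line algorithm is used here purely as an illustrative stand-in; the function names and the toy image are hypothetical.

```python
def line_pixels(x0, y0, x1, y1):
    """Bresenham's line algorithm: integer pixel coordinates from start to end."""
    pixels = []
    dx, dy = abs(x1 - x0), -abs(y1 - y0)
    sx = 1 if x0 < x1 else -1
    sy = 1 if y0 < y1 else -1
    err = dx + dy
    while True:
        pixels.append((x0, y0))
        if x0 == x1 and y0 == y1:
            break
        e2 = 2 * err
        if e2 >= dy:   # step along x
            err += dy
            x0 += sx
        if e2 <= dx:   # step along y
            err += dx
            y0 += sy
    return pixels

def intensity_profile(image, start, end):
    """Ion intensities along the line; image is a list of rows, indexed image[y][x]."""
    return [image[y][x] for x, y in line_pixels(*start, *end)]

# Toy 4 x 4 ion image (hypothetical intensities).
toy_image = [
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 2.0, 0.0, 0.0],
    [0.0, 0.0, 3.0, 0.0],
    [0.0, 0.0, 0.0, 4.0],
]
profile = intensity_profile(toy_image, (0, 0), (3, 3))
```

Computing one such profile per mass feature and plotting them against the pixel index gives a line plot of the kind described above.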

Image Spatial Segmentation

Image spatial segmentation is a powerful tool to explore the characteristics of MSI data and identify ROIs to understand the tissue structure and metabolic patterns ^37,38. The most widely used approaches are hierarchical clustering and K-means. However, these methods treat each pixel independently and overlook the spatial information, which can lead to spatially discrete clusters ³⁹. Cardinal has introduced a unique segmentation method tailored for MSI data, named spatial shrunken centroids (SSC) ^40,20. It incorporates a spatially aware distance to regularize the distance between pixel spectra, and thus produces improved segmentations. SSC automatically determines the total number of segments, which guides the choice of an appropriate number of segments ⁴⁰.

ShinyCardinal provides full support for SSC with a user-friendly interface. The image segmentation functionality was demonstrated using a bovine lens MALDI MSI dataset ⁴¹. To select an appropriate segmentation, ShinyCardinal allows users to initialize SSC with multiple numbers of starting segments k (e.g., 3 to 6), spatial smoothing radii r (e.g., 2, 4, and 6) and shrinkage parameters s (e.g., 1 to 3) (Fig. 5a). SSC was then run with every combination of the three parameters, and a segmentation map was generated for each combination. For instance, in line with the bovine lens anatomy, the dataset was computationally segmented into up to 6 clusters (Fig. 5b). Users can explore each segmentation map by specifying the k, r and s parameters (Fig. 5c). In addition, the shrunken t-statistics of the spectral features were plotted for each cluster within the segmentation map. The interactive plots help users examine and select the most informative mass features that define each cluster, where higher t-statistic values denote larger contributions of the mass feature to that specific cluster (Fig. 5d). Furthermore, a table summarizing the shrunken t-statistics of the spectral features was provided for in-depth investigation (Fig. 5e).
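
The size of the parameter search described above is easy to underestimate. The short Python sketch below enumerates the grid for the example values quoted in the text; treating s as the integers 1 to 3 is an assumption, and the variable names are hypothetical. It only lists the parameter sets and does not run SSC.

```python
from itertools import product

k_values = [3, 4, 5, 6]  # starting numbers of segments (example range from the text)
r_values = [2, 4, 6]     # spatial smoothing radii (example values from the text)
s_values = [1, 2, 3]     # shrinkage parameters (integer steps are an assumption)

# One dictionary per segmentation run: every combination of k, r and s.
grid = [
    {"k": k, "r": r, "s": s}
    for k, r, s in product(k_values, r_values, s_values)
]
n_runs = len(grid)  # 4 * 3 * 3 combinations
```

With these example values, 36 segmentation maps would be produced, which is why browsing them interactively by k, r and s is helpful.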

Absolute Quantification for MSI

Quantitative mass spectrometry imaging (qMSI) is an emerging field that allows accurate measurement of local concentrations of molecules within complex samples. qMSI typically relies on using standards of known concentration, which is predominantly achieved through spotting standards on a reference tissue or the use of a mimetic tissue model ^42,43. In each approach, several concentrations of a standard (or an isotopically labeled analogue of the target molecule) are used to construct a calibration curve for calculating the absolute concentration of the target analyte within the ROIs.

The quantification module of ShinyCardinal supports both approaches for qMSI. It enables users to extract ion signals from each calibration standard, build a calibration curve using linear least-squares regression, and subsequently calculate the concentration of the analyte across the selected tissue ROIs (Fig. 6a). ShinyCardinal also allows users to recalculate the results by dynamically updating the calibration curve through the addition or removal of calibration points. When the ion intensity of a designated ROI falls outside the intensity range of the calibration curve, the corresponding result for that ROI is highlighted in red in the plot, reminding users that the outcome might not be accurate. As a case study, we analyzed a published DESI imaging dataset that aimed at spatial quantitation of drugs in a rat liver (Fig. 6b) ⁴⁴. Using ShinyCardinal, we were able to reproduce the original results for two selected drugs, namely olanzapine (Fig. 6c) and erlotinib (Fig. 6d). The linearity of the response for each drug was satisfactory, with R² values of 0.995 and 0.998 for olanzapine and erlotinib, respectively (Fig. 6c-d).
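
The calibration logic described above can be sketched in a few lines of Python. This is an illustrative example under stated assumptions rather than the ShinyCardinal code: the function names are hypothetical, the calibration points are invented toy values (not the olanzapine or erlotinib data), and the out-of-range check simply mirrors the warning behaviour described in the text.

```python
def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept; also returns R^2."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((b - (slope * a + intercept)) ** 2 for a, b in zip(x, y))
    ss_tot = sum((b - my) ** 2 for b in y)
    return slope, intercept, 1.0 - ss_res / ss_tot

def quantify(roi_intensity, slope, intercept, calibration_intensities):
    """Invert the calibration curve and report whether the ROI lies within its range."""
    concentration = (roi_intensity - intercept) / slope
    in_range = min(calibration_intensities) <= roi_intensity <= max(calibration_intensities)
    return concentration, in_range

# Toy calibration standards (hypothetical values, arbitrary units).
concentrations = [0.0, 1.0, 2.0, 4.0, 8.0]
intensities = [5.0, 105.0, 205.0, 405.0, 805.0]

slope, intercept, r_squared = fit_line(concentrations, intensities)
inside = quantify(305.0, slope, intercept, intensities)    # within the calibrated range
outside = quantify(1205.0, slope, intercept, intensities)  # extrapolated, should be flagged
```

Adding or removing a calibration point amounts to calling `fit_line` again on the updated lists, which is the recalculation step described above.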

Molecular Networking and Database Search Based Metabolite Identification

Metabolite identification remains a significant challenge in MSI due to the inherent limitations such as the absence of chromatographic separation and the complexities associated with implementing tandem MS ^45,21. ShinyCardinal offers three key modules, i.e., network analysis, database query, and data export to METASPACE, collectively designed to facilitate metabolite identification.

Molecular networking (MN) is a computational strategy that has been widely used for metabolomics data analysis. It calculates the degree of spectral similarity based on the principle that structurally related molecules tend to yield similar fragmentation patterns (MS/MS). The MS/MS spectra are then organized and presented in graph-based spectral networks, in which each node corresponds to an ion with an associated fragmentation spectrum, and the links among the nodes denote similarities between the spectra ^46–48. Analogous or structurally similar molecules are grouped together in the network, enabling the identification of unknown molecules through neighboring known ones. However, given that most current MSI studies are performed at the MS-1 level, without MS/MS fragmentation ⁶, MS/MS-based MN cannot be applied to MSI datasets. A co-localization-based MN, named PICA (pixel intensity correlation analysis), has been proposed for metabolite identification in MSI ³⁶. This approach assumes that ions of similar spatial distribution are also structurally related. Indeed, ions that originate from the same molecule are theoretically perfectly co-localized. These may include in-source fragments, natural isotope peaks, adduct ions, multiply charged ions, or multimers of the molecule. The efficacy of PICA has been showcased on three MSI datasets, underscoring its value for enhancing metabolite identification.

The network analysis module of ShinyCardinal has enhanced PICA in terms of speed and versatility. It supports both global network analysis and single network analysis (Fig. 7a). Global network analysis calculates the degree of spatial similarity among all mass features in a pairwise manner. It constructs a graph-based network according to the user-defined spatial similarity score cutoff, which ranges from 0 (dissimilar) to 1 (identical in spatial distribution). Global network analysis provides an overview of ion clustering and the number of ion clusters within an MSI dataset. By contrast, single network analysis seeks ions with spatial distributions akin to the user-defined ion of interest. It then builds a network using all these ions and produces a pseudo-MS/MS spectrum. Single network analysis is particularly valuable for metabolite identification.
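
A minimal Python sketch of the global network idea is given below: every pair of ions is scored, pairs at or above the cutoff are linked, and the connected components of the resulting graph are reported as ion clusters. This is an illustration under stated assumptions, not the ShinyCardinal implementation. The union-find grouping, the choice to drop unconnected ions, the function names and the toy similarity scores are all hypothetical.

```python
def ion_clusters(mz_values, similarity, cutoff=0.9):
    """Group ions into connected components of the graph whose edges join
    every pair of ions with similarity >= cutoff."""
    parent = {mz: mz for mz in mz_values}

    def find(a):
        # Follow parent links to the component root, compressing the path.
        while parent[a] != a:
            parent[a] = parent[parent[a]]
            a = parent[a]
        return a

    mzs = list(mz_values)
    for i, a in enumerate(mzs):
        for b in mzs[i + 1:]:
            if similarity(a, b) >= cutoff:
                parent[find(a)] = find(b)  # merge the two components

    groups = {}
    for mz in mzs:
        groups.setdefault(find(mz), []).append(mz)
    # Ions without any partner above the cutoff are not reported as clusters.
    return sorted(sorted(g) for g in groups.values() if len(g) > 1)

# Toy pairwise scores (hypothetical values); unlisted pairs score 0.
toy_scores = {
    frozenset((772.5267, 773.5292)): 0.96,
    frozenset((772.5267, 713.4532)): 0.93,
    frozenset((713.4532, 714.4565)): 0.95,
    frozenset((835.6685, 772.5267)): 0.40,
}
clusters = ion_clusters(
    [713.4532, 714.4565, 772.5267, 773.5292, 835.6685],
    lambda a, b: toy_scores.get(frozenset((a, b)), 0.0),
)
```

A single network analysis would correspond to keeping only the ions linked to one user-selected m/z; their m/z values and intensities then form the pseudo-MS/MS spectrum.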

To showcase the process and effectiveness of ShinyCardinal in metabolite identification, we used a MALDI MSI dataset obtained from a mouse brain section ³¹. The dataset was preprocessed with ShinyCardinal without deisotoping and matrix removal. Out of the 231 detected mass features, a total of 17 ion clusters were generated using a spatial similarity score cutoff of 0.9 (Fig. 7b). Among them, clusters 11 and 12 were identified as Na⁺ and K⁺ adducts of polyethylene glycol (PEG)-1450, respectively, each exhibiting a repeating unit of 44 Da (Supplementary Fig. S2). The remaining ion clusters were subjected to single network analysis and the identification module for metabolite annotation. The results for three representative clusters, i.e., clusters 1, 9 and 15 (Fig. 7c), are shown in Fig. 7d. Manual inspection of the resulting pseudo-MS/MS spectra of the three clusters confirmed that the ions within each cluster indeed stemmed from the same molecule. For example, four ions, with m/z values of 772.5267 (ion 1), 773.5292 (ion 2), 713.4532 (ion 3), and 714.4565 (ion 4), were found to be highly colocalized (cluster 1). Notably, m/z 773.5292 (ion 2) is the ¹³C isotopic peak of m/z 772.5267 (ion 1); similarly, m/z 714.4565 (ion 4) is the ¹³C isotopic peak of m/z 713.4532 (ion 3). The MALDI images confirmed that they shared similar spatial distributions (Fig. 7e). A database search under the identification module of ShinyCardinal, using the HMDB database ⁴⁹ with a 5 ppm mass accuracy window, revealed that m/z 772.5267 (ion 1) corresponded to either a Na⁺ adduct of a phosphatidylethanolamine (PE) lipid ([M + Na]⁺, C₄₃H₇₆NO₇P, mass accuracy: -1.99 ppm), a K⁺ adduct of a monomethyl phosphatidylethanolamine (PE-NMe) lipid ([M + K]⁺, C₄₀H₈₀NO₈P, mass accuracy: -1.79 ppm), or a K⁺ adduct of a phosphatidylcholine (PC) lipid ([M + K]⁺, C₄₀H₈₀NO₈P, mass accuracy: -1.79 ppm). Indeed, an accurate mass search alone provides poor evidence for metabolite identification, and it cannot distinguish between the isomeric PC and PE lipid classes. The detection of m/z 713.4532 (ion 3) in the pseudo-MS/MS spectrum of cluster 1 indicates a neutral loss of 59 Da, which corresponds to trimethylamine from the PC head group, confirming that m/z 772.5267 (ion 1) corresponds to a PC class lipid rather than a PE or PE-NMe class lipid (Fig. 7d). This example highlights the power of ShinyCardinal in metabolite identification. With the same approach, we have identified all the remaining ion clusters with high confidence (Supplementary Table S5).
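
The neutral-loss argument can be checked with simple arithmetic: 772.5267 − 713.4532 = 59.0735 Da, which matches the monoisotopic mass of trimethylamine (C₃H₉N, 59.0735 Da). The Python sketch below performs this comparison against a small lookup table. The table, the 0.005 Da tolerance and the function name are illustrative assumptions, not part of ShinyCardinal.

```python
# Monoisotopic masses (Da) of two neutral losses that are diagnostic for
# choline-containing lipids.
NEUTRAL_LOSSES = {
    "trimethylamine (C3H9N)": 59.0735,
    "phosphocholine (C5H14NO4P)": 183.0660,
}

def match_neutral_losses(precursor_mz, fragment_mz, tol_da=0.005):
    """Return the known neutral losses that explain the precursor/fragment difference."""
    loss = precursor_mz - fragment_mz
    return [
        (name, round(loss, 4))
        for name, mass in NEUTRAL_LOSSES.items()
        if abs(loss - mass) <= tol_da
    ]

# Ion 1 and ion 3 of cluster 1, as reported in the text.
hits = match_neutral_losses(772.5267, 713.4532)
```

Because the difference is computed between two colocalized MS-1 peaks, it is only as reliable as the assumption that both ions derive from the same molecule.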

To check the accuracy of the metabolite identification results, we submitted the same dataset (https://metaspace2020.eu/datasets?ds=2023-08-15_15h59m29s) to METASPACE ^16,17. The results for the ion clusters are consistent with or superior to those from METASPACE (Supplementary Table S5). For instance, m/z 772.5267 (cluster 1) was identified as a PE ([M + Na]⁺), PE-NMe ([M + K]⁺), or PC ([M + K]⁺) class lipid at a 10% false discovery rate (FDR) in METASPACE. However, METASPACE does not differentiate between the three lipid class isomers. Using ShinyCardinal, this ion was confidently identified as a PC ([M + K]⁺) class lipid due to the detection of the neutral loss of 59 Da. Another example involves m/z 835.6685 (cluster 8), which was not identified by METASPACE at a 10% FDR, whereas ShinyCardinal identified it as sphingomyelin (42:2) based on two typical neutral losses of 183 and 59 Da (Supplementary Table S5).

The export module of ShinyCardinal allows export of the processed MSI data in the centroid imzML format needed for METASPACE or other MSI software tools. In addition, users can choose to export the MSI dataset after deisotoping and removal of background and MALDI matrix peaks for METASPACE, which further reduces the number of false positives. In addition to the RDS and imzML formats, the preprocessed MSI data can also be exported to a comma-separated values (CSV) file. The CSV file, structured as a table with m/z values in rows and pixels in columns, can be read into R, Python, or other programming languages for machine learning purposes.
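
As an example of downstream use, the following Python sketch reads a table laid out as described above (m/z values in rows, pixels in columns) and transposes it into the pixels-by-features layout that most machine learning code expects. The exact header and column labels of the exported file are not specified in the text, so the labels used here (`mz`, `pixel_1`, ...) and the function names are assumptions, and the data are toy values.

```python
import csv
import io

def read_intensity_table(handle):
    """Parse a table with one m/z value per row and one pixel per column.

    Assumes the first row is a header and the first column holds the m/z value.
    """
    reader = csv.reader(handle)
    header = next(reader)
    pixel_ids = header[1:]
    mz_values, matrix = [], []
    for row in reader:
        mz_values.append(float(row[0]))
        matrix.append([float(value) for value in row[1:]])
    return mz_values, pixel_ids, matrix

def transpose(matrix):
    """Turn a features-by-pixels table into a pixels-by-features table."""
    return [list(column) for column in zip(*matrix)]

# Toy file content (hypothetical labels and intensities).
toy_csv = io.StringIO(
    "mz,pixel_1,pixel_2,pixel_3\n"
    "713.4532,0.0,12.5,3.1\n"
    "772.5267,1.2,40.0,9.8\n"
)
mz_values, pixel_ids, matrix = read_intensity_table(toy_csv)
pixels_by_feature = transpose(matrix)  # one row per pixel, one column per m/z
```

For a real export, `open(path, newline="")` would replace the in-memory `io.StringIO` object.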

As MSI sees rapid adoption and rising throughput, along with continuous improvements in spectral quality, the demand for advanced and specialized software tools grows. ShinyCardinal provides a powerful and user-friendly platform that enables users to analyze MSI data effectively. It offers all steps of a typical MSI data processing workflow through a GUI for rapid, interactive, and easy visualization and comparison of MS images. Furthermore, ShinyCardinal is a free and open-source framework based on the R environment ⁵⁰ that allows simple customization to the needs of the user; other developers can also modify the source code to continuously improve the software’s capabilities and incorporate more features. In particular, the encapsulation of ShinyCardinal into a standalone desktop app greatly simplifies installation, because users do not have to install R and the associated packages separately. Most importantly, this encapsulation ensures consistent performance of the software regardless of variations in the user's local environment and system configuration. Additionally, the cloud-based version of ShinyCardinal (https://gincpm.shinyapps.io/ShinyCardinal/) allows users to analyze small MSI datasets (data size less than 1 GB) without installing the software.

ShinyCardinal was designed in a modular manner, enabling users to employ individual modules separately or integrate several modules to achieve specific tasks. Users can redo or undo certain modules so as to combine the results with different modules as desired. For instance, due to the lack of chromatographic separation, deisotoping is recommended before mass-based metabolite identification in MSI because isotope peaks frequently lead to false positives. Conversely, isotope peaks hold great value for network analysis, as the isotope patterns play a crucial role in understanding the inter-peak relationships within a network. ShinyCardinal offers an intuitive interface for users to manage deisotoping results based on their analysis types and needs. Another example is that users can easily toggle between ion images and segmentation maps within the image visualization and segmentation modules to help them select ROIs precisely.

To showcase the versatility and functionalities of ShinyCardinal, we analyzed nine MSI datasets obtained from both plant and mammalian samples using different MSI technologies, including MALDI-Orbitrap, MALDI2-Orbitrap, MALDI-FTICR, MALDI-TOF, and DESI-Orbitrap imaging. Where feasible, we manually verified the results or compared them with those of other MSI software tools, showing that ShinyCardinal generated comparable or even superior outcomes.

We applied ShinyCardinal to nine different MSI datasets. (1) The mouse testis dataset was generated by a MALDI2-Orbitrap instrument (Thermo Fisher Scientific GmbH; MALDI ion source from Spectroglyph) in negative ion mode over the mass range of 67-1005 Da with 50 μm spatial resolution. (2) The mouse stomach dataset was generated by a MALDI-rapifleX TOF instrument (Bruker Daltonics) in positive ion mode over the mass range of 500-2495 Da with 50 μm spatial resolution ⁵¹. (3) The mouse brain dataset was generated by a MALDI-timsTOF Flex instrument (Bruker Daltonics) in negative ion mode over the mass range of 600-1794 Da with 20 μm spatial resolution ⁵². (4) The tomato fruit dataset was generated by a MALDI-FTICR instrument (Bruker Daltonics) in positive ion mode over the mass range of 150-3000 Da with 50 μm spatial resolution. (5) The tomato root dataset was generated by a MALDI-FTICR instrument (Bruker Daltonics) in positive ion mode over the mass range of 118-2000 Da with 30 μm spatial resolution. (6) The rat brain dataset was generated by a MALDI-Orbitrap instrument (Thermo Fisher Scientific GmbH; MALDI ion source from Spectroglyph) in negative ion mode over the mass range of 180-1999 Da. (7) The bovine lens dataset was generated by a MALDI-FTICR instrument (Bruker Daltonics) in positive ion mode over the mass range of 500-3000 Da with 150 μm spatial resolution ⁴¹. (8) The rat liver dataset was generated by a DESI-Orbitrap instrument (Thermo Fisher Scientific GmbH; DESI source from Prosolia, Indianapolis, IN, USA) in positive ion mode over the mass range of 200-600 Da with 125 μm spatial resolution ⁴⁴. (9) The mouse brain dataset was generated by a MALDI-Orbitrap instrument (Thermo Fisher Scientific GmbH) in positive ion mode over the mass range of 300-2000 Da with 50 μm spatial resolution ³¹. Dataset 1 was converted to imzML format using ImageInsight software (version 0.1.0.1361). Datasets 4 and 5 were converted to imzML format using flexImaging software (version 5.0, Bruker Daltonics). All remaining datasets were downloaded from public repositories.

Data availability

The datasets used in this study are all publicly available:

1. Mouse testis, MALDI2-Orbitrap MSI data (Fig. 2, Supplementary Table S2 and Supplementary Fig. 1): https://metaspace2020.eu/dataset/2024-03-07_09h25m45s

2. Mouse stomach, MALDI-TOF MSI data (Supplementary Table S1): https://ftp.pride.ebi.ac.uk/pride/data/archive/2020/08/PXD011104/

3. Mouse brain, MALDI-TOF MSI data (Supplementary Table S1): https://metaspace2020.eu/dataset/2022-07-11_16h35m00s

4. Tomato fruit, MALDI-FTICR MSI data (Supplementary Table S1 and Fig. 4): https://metaspace2020.eu/dataset/2024-03-08_22h01m34s

5. Tomato root, MALDI-FTICR MSI data (Supplementary Table S1): https://metaspace2020.eu/dataset/2024-03-08_22h23m31s

6. Rat brain, MALDI-Orbitrap MSI data (Fig. 3): https://metaspace2020.eu/dataset/2018-04-16_22h30m01s

7. Bovine lens, MALDI-FTICR MSI data (Fig. 5): https://proteomecentral.proteomexchange.org/cgi/GetDataset?ID=PXD025486

8. Rat liver, DESI-Orbitrap MSI data (Fig. 6): https://metaspace2020.eu/dataset/2016-12-09_10h16m23s

9. Mouse brain, MALDI-Orbitrap MSI data (Fig. 7 and Supplementary Table S3): https://www.ebi.ac.uk/metabolights/editor/MTBLS487/files

Code availability

The source code of ShinyCardinal is available on GitHub (https://github.com/YonghuiDong/ShinyCardinal). The desktop version of ShinyCardinal can be downloaded from https://shinycardinal.sourceforge.io.

References

References

1. Lundberg, E. & Borner, G. H. H. Spatial proteomics: a powerful discovery tool for cell biology. Nat Rev Mol Cell Biol 20, 285–302 (2019).
2. Alexandrov, T. Spatial Metabolomics and Imaging Mass Spectrometry in the Age of Artificial Intelligence. Annu. Rev. Biomed. Data Sci. 3, 61–87 (2020).
3. Ma, S. et al. High spatial resolution mass spectrometry imaging for spatial metabolomics: Advances, challenges, and future perspectives. TrAC Trends in Analytical Chemistry 159, 116902 (2023).
4. Petras, D., Jarmusch, A. K. & Dorrestein, P. C. From single cells to our planet—recent advances in using mass spectrometry for spatially resolved metabolomics. Current Opinion in Chemical Biology 36, 24–31 (2017).
5. Abdelmoula, W. M. et al. Peak learning of mass spectrometry imaging data using artificial neural networks. Nat Commun 12, 5544 (2021).
6. Dong, Y. & Aharoni, A. Image to insight: exploring natural products through mass spectrometry imaging. Natural Product Reports 39, 1510–1530 (2022).
7. He, J. et al. MassImager: A software for interactive and in-depth analysis of mass spectrometry imaging data. Analytica Chimica Acta 1015, 50–57 (2018).
8. Hu, H. & Laskin, J. Emerging Computational Methods in Mass Spectrometry Imaging. Advanced Science 9, 2203339 (2022).
9. Alexandrov, T. MALDI imaging mass spectrometry: statistical data analysis and current computational challenges. BMC Bioinformatics 13, S11 (2012).
10. Alexandrov, T. Spatial metabolomics: from a niche field towards a driver of innovation. Nat Metab 5, 1443–1445 (2023).
11. Fischer, C. R., Ruebel, O. & Bowen, B. P. An accessible, scalable ecosystem for enabling and sharing diverse mass spectrometry imaging analyses. Archives of Biochemistry and Biophysics 589, 18–26 (2016).
12. Weiskirchen, R., Weiskirchen, S., Kim, P. & Winkler, R. Software solutions for evaluation and visualization of laser ablation inductively coupled plasma mass spectrometry imaging (LA-ICP-MSI) data: a short overview. Journal of Cheminformatics 11, 16 (2019).
13. Gessel, M. M., Norris, J. L. & Caprioli, R. M. MALDI imaging mass spectrometry: Spatial molecular analysis to enable a new age of discovery. Journal of Proteomics 107, 71–82 (2014).
14. Bokhart, M. T., Nazari, M., Garrard, K. P. & Muddiman, D. C. MSiReader v1.0: Evolving Open-Source Mass Spectrometry Imaging Software for Targeted and Untargeted Analyses. J Am Soc Mass Spectrom 29, 8–16 (2018).
15. Tortorella, S. et al. LipostarMSI: Comprehensive, Vendor-Neutral Software for Visualization, Data Analysis, and Automated Molecular Identification in Mass Spectrometry Imaging. J. Am. Soc. Mass Spectrom. 31, 155–163 (2020).
16. Palmer, A. et al. FDR-controlled metabolite annotation for high-resolution imaging mass spectrometry. Nat Methods 14, 57–60 (2017).
17. Alexandrov, T. et al. METASPACE: A Community-Populated Knowledge Base of Spatial Metabolomes in Health and Disease. http://biorxiv.org/lookup/doi/10.1101/539478 (2019) doi:10.1101/539478.
18. Ràfols, P. et al. rMSI: an R package for MS imaging data handling and visualization. Bioinformatics 33, 2427–2428 (2017).
19. Bemis, K. D. et al. Cardinal: an R package for statistical analysis of mass spectrometry-based imaging experiments. Bioinformatics 31, 2418–2420 (2015).
20. Bemis, K. A., Föll, M. C., Guo, D., Lakkimsetty, S. S. & Vitek, O. Cardinal v.3: a versatile open-source software for mass spectrometry imaging analysis. Nat Methods (2023) doi:10.1038/s41592-023-02070-z.
21. Baquer, G. et al. What are we imaging? Software tools and experimental strategies for annotation and identification of small molecules in mass spectrometry imaging. Mass Spectrometry Reviews e21794 (2022) doi:10.1002/mas.21794.
22. Fay, C., Rochette, S., Guyader, V. & Girard, C. Engineering Production-Grade Shiny Apps. (Chapman and Hall/CRC, Boca Raton, 2021). doi:10.1201/9781003029878.
23. Norris, J. L. et al. Processing MALDI mass spectra to improve mass spectral direct tissue analysis. International Journal of Mass Spectrometry 260, 212–221 (2007).
24. Schramm, T. et al. imzML — A common data format for the flexible exchange and processing of mass spectrometry imaging data. Journal of Proteomics 75, 5106–5110 (2012).
25. Ràfols, P. et al. Signal preprocessing, multivariate analysis and software tools for MA(LDI)-TOF mass spectrometry imaging for biological applications: MSI DATA PROCESSING. Mass Spec Rev 37, 281–306 (2018).
26. Ovchinnikova, K., Kovalev, V., Stuart, L. & Alexandrov, T. OffsampleAI: artificial intelligence approach to recognize off-sample mass spectrometry images. BMC Bioinformatics 21, 129 (2020).
27. Baquer, G. et al. rMSIcleanup: an open-source tool for matrix-related peak annotation in mass spectrometry imaging and its application to silver-assisted laser desorption/ionization. Journal of Cheminformatics 12, 45 (2020).
28. Baquer, G. et al. Discovering Matrix Adducts for Enhanced Metabolite Profiling with Stable Isotope-Labeled MALDI-MSI. http://biorxiv.org/lookup/doi/10.1101/2023.06.28.546946 (2023) doi:10.1101/2023.06.28.546946.
29. Janda, M. et al. Determination of Abundant Metabolite Matrix Adducts Illuminates the Dark Metabolome of MALDI-Mass Spectrometry Imaging Datasets. Anal. Chem. 93, 8399–8407 (2021).
30. Aftab, W., Lahiri, S. & Imhof, A. ImShot: An Open-Source Software for Probabilistic Identification of Proteins In Situ and Visualization of Proteomics Data. Molecular & Cellular Proteomics 21, 100242 (2022).
31. Bond, N. J., Koulman, A., Griffin, J. L. & Hall, Z. massPix: an R package for annotation and interpretation of mass spectrometry imaging data for lipidomics. Metabolomics 13, 128 (2017).
32. Dong, Y., Li, B. & Aharoni, A. More than Pictures: When MS Imaging Meets Histology. Trends in Plant Science 21, 686–698 (2016).
33. Yajima, Y. et al. Region of Interest analysis using mass spectrometry imaging of mitochondrial and sarcomeric proteins in acute cardiac infarction tissue. Sci Rep 8, 7493 (2018).
34. Guo, A., Chen, Z., Li, F. & Luo, Q. Delineating regions of interest for mass spectrometry imaging by multimodally corroborated spatial segmentation. GigaScience 12, giad021 (2022).
35. Dong, Y. et al. High mass resolution, spatial metabolite mapping enhances the current plant gene and pathway discovery toolbox. New Phytol nph.16809 (2020) doi:10.1111/nph.16809.
36. Dong, Y. et al. PICA: Pixel Intensity Correlation Analysis for Deconvolution and Metabolite Identification in Mass Spectrometry Imaging. Anal. Chem. acs.analchem.2c04778 (2023) doi:10.1021/acs.analchem.2c04778.
37. Alexandrov, T. et al. Spatial Segmentation of Imaging Mass Spectrometry Data with Edge-Preserving Image Denoising and Clustering. J. Proteome Res. 9, 6535–6546 (2010).
38. Hu, H., Yin, R., Brown, H. M. & Laskin, J. Spatial Segmentation of Mass Spectrometry Imaging Data by Combining Multivariate Clustering and Univariate Thresholding. Anal. Chem. 93, 3477–3485 (2021).
39. Xiao, K., Wang, Y., Dong, K. & Zhang, S. SmartGate is a spatial metabolomics tool for resolving tissue structures. Briefings in Bioinformatics 24, bbad141 (2023).
40. Bemis, K. D. et al. Probabilistic Segmentation of Mass Spectrometry (MS) Images Helps Select Important Ions and Characterize Confidence in the Resulting Segments. Molecular & Cellular Proteomics 15, 1761–1772 (2016).
41. Guo, G. et al. Automated annotation and visualisation of high-resolution spatial proteomic mass spectrometry imaging data using HIT-MAP. Nat Commun 12, 3241 (2021).
42. Tobias, F. & Hummon, A. B. Considerations for MALDI-Based Quantitative Mass Spectrometry Imaging Studies. J. Proteome Res. 19, 3620–3630 (2020).
43. Unsihuay, D., Mesa Sanchez, D. & Laskin, J. Quantitative Mass Spectrometry Imaging of Biological Systems. Annu. Rev. Phys. Chem. 72, 307–329 (2021).
44. Swales, J. G. et al. Spatial Quantitation of Drugs in tissues using Liquid Extraction Surface Analysis Mass Spectrometry Imaging. Sci Rep 6, 37648 (2016).
45. Feldberg, L., Dong, Y., Heinig, U., Rogachev, I. & Aharoni, A. DLEMMA-MS-Imaging for Identification of Spatially Localized Metabolites and Metabolic Network Map Reconstruction. Anal. Chem. 90, 10231–10238 (2018).
46. Wang, M. et al. Sharing and community curation of mass spectrometry data with Global Natural Products Social Molecular Networking. Nat Biotechnol 34, 828–837 (2016).
47. Vincenti, F. et al. Molecular Networking: A Useful Tool for the Identification of New Psychoactive Substances in Seizures by LC–HRMS. Front. Chem. 8, 572952 (2020).
48. Schmid, R. et al. Ion identity molecular networking for mass spectrometry-based metabolomics in the GNPS environment. Nat Commun 12, 3832 (2021).
49. Wishart, D. S. et al. HMDB 5.0: the Human Metabolome Database for 2022. Nucleic Acids Research 50, D622–D631 (2022).
50. R Core Team. R: A Language and Environment for Statistical Computing. (R Foundation for Statistical Computing, Vienna, Austria, 2020).
51. Erich, K. et al. Spatial Distribution of Endogenous Tissue Protease Activity in Gastric Carcinoma Mapped by MALDI Mass Spectrometry Imaging. Molecular & Cellular Proteomics 18, 151–161 (2019).
52. Abu Sammour, D. et al. Spatial probabilistic mapping of metabolite ensembles in mass spectrometry imaging. Nat Commun 14, 1823 (2023).

The authors declare no competing interests.
