Non-targeted isomer-sensitive N-glycome analysis reveals new layers of organ-specific diversity in mice

N-glycosylation is one of the most common protein modifications in eukaryotes, with immense importance at the molecular, cellular, and organismal level. Accurate and reliable N-glycan analysis is essential to obtain a systems-wide understanding of fundamental biological processes. Due to the structural complexity of glycans, their analysis is still highly challenging. Here we make publicly available a consistent N-glycome dataset of 20 different mouse tissues and demonstrate a multimodal data analysis workflow that allows for unprecedented depth and coverage of N-glycome features. This highly scalable, LC-MS/MS data-driven method integrates the automated identification of N-glycan spectra, the application of non-targeted N-glycome profiling strategies and the isomer-sensitive analysis of glycan structures. Our delineation of critical sub-structural determinants and glycan isomers across the mouse N-glycome uncovered tissue-specific glycosylation patterns and the expression of non-canonical N-glycan structures, and highlights multiple layers of N-glycome complexity that derive from organ-specific regulation of glycobiological pathways.


Introduction
Protein glycosylation, the covalent attachment of simple or complex sugar structures to amino-acid side chains of polypeptides, affects virtually all aspects of biology. Over 50% of all human proteins are subject to post-translational modifications by glycans 1, which alter their functions in fundamental biological processes, such as cell adhesion, signal transduction, intracellular trafficking, essential immune functions 2 or host-pathogen interactions 3.
In the process of mammalian protein glycosylation, glycan structures are co- or post-translationally linked to nascent polypeptide chains, which then become extensively processed along the secretory pathway, throughout the Golgi network to the cell surface. In protein N-glycosylation, arguably the currently best understood form of protein glycosylation, a large, highly conserved N-glycan precursor structure is transferred en bloc and covalently linked to the side chains of specific asparagine residues within the endoplasmic reticulum (ER). Subsequent N-glycan processing is a non-template-driven process which critically depends on the coordinated action of dedicated enzymes, a highly specific set of glycosyltransferases and glycosidases 1. Perturbations in any of these tightly interconnected and subtly tuned cellular processes hold the potential to result in differentially processed glycoproteins that exhibit aberrant glycan structures.
Owing to its close resemblance to human physiology, its short generation time and its visible phenotypic variants, the mouse (Mus musculus) represents the most common mammalian model organism to study fundamental biological processes, including glycosylation 4. Nevertheless, striking differences between human and murine glycosylation have been described. Most importantly, humans lack the Gal-α1,3-Gal epitope (due to an inactivating mutation in the α1,3-galactosyltransferase (Ggta1) 5) and the biosynthetic capabilities for the generation of N-glycolylneuraminic acid (Neu5Gc), due to a mutation in the CMP-NeuAc hydroxylase 6 (Cmah). Further differences between human and murine glycosylation have recently been reported for the brain 7 and immune cells 8.
The comprehensive analysis of protein-linked N-glycan structures is a highly challenging task. Due to compositional, structural, positional, and anomeric isomers 9, this analytical challenge has long been tackled using a variety of glycomics techniques, which all profile different aspects of glycosylation and all come with specific analytical advantages and disadvantages. Importantly, however, the use of this wide range of different methodologies, including lectin arrays 10, MALDI-TOF mass spectrometry (MS) analysis of native 11, permethylated 12 or otherwise derivatized glycans 13,14, fluorescence labelling of glycans followed by either reversed- or normal-phase HPLC analysis 15,16, and various LC-MS and LC-tandem MS (MS/MS) analysis workflows 17,18, causes low comparability between studies.
Generation of glycobiological ground truth was pioneered by the Consortium of Functional Glycomics glycan profiling initiative (CFG, www.functionalglycomics.org), which generated, annotated, and made publicly available the hitherto most comprehensive, consistent, and coherent mammalian glycome datasets. From 2001 to 2011, the CFG Analytical Glycotechnology Core analyzed the compositions of N- and O-linked glycans from 11 human and 16 murine tissues, primarily using MALDI-TOF MS analysis.
While the CFG's analytical activities have come to an end, their seminal glycan profiling datasets still represent a vital and fundamental resource to the field of mammalian glycobiology. More recently, new glyco-analytical developments and the paucity of modern, comprehensive N-glycome datasets prompted several studies aiming at the systemic profiling of tissue-specific N-glycosylation patterns using MALDI-TOF MS 19-24. Despite the remarkable compositional variations that were detected by such ("single-stage") MS approaches, these methods intrinsically fall short in capturing the unique level of N-glycan structure micro-heterogeneity. Instead, glycan structures are tentatively deduced from composition and based on pre-established knowledge of glyco-biosynthetic pathways.
Here, we mapped the N-glycomes of 20 mouse tissues by performing PGC-LC-MS/MS (porous graphitic carbon liquid chromatography coupled to a tandem mass spectrometer). We used PGC-LC to chromatographically separate closely related N-glycan structure isomers and an Orbitrap mass analyzer for high-resolution MS/MS data acquisition. Conventionally, the analysis of such LC-MS(/MS) N-glycome datasets is based on the targeted, selective extraction of known or anticipated glycan compositions and masses 20,22,25. Subsequent glycan-structure assignment incorporates additional information on pre-established, relative chromatographic retention times 9,26,27, complementary MS/MS data and expert knowledge on glycan biosynthetic pathways. Despite the breadth of glycan information that can be retrieved by such (largely manual) approaches 9,25,27-30, they are hardly scalable to the size of the present dataset. Instead, we developed an automated, non-targeted and scalable MS/MS-centric N-glycomics data-analysis workflow, largely independent of prior glycobiological knowledge and anticipated glycan compositions. We showcase this approach by characterizing 20 different mouse tissues, providing a consistent and complete N-glycome atlas. Our analyses reveal tissue-specific N-glycome signatures and glycan-structural features, highlighting organ-intrinsic regulations of glycobiological pathways.

Precursor-independent MS/MS-based N-glycome profiling of 20 mouse tissues
To generate a consistent N-glycome dataset, we collected tissues, as well as serum, in duplicate from age-matched C57BL/6J mice, and processed all samples using identical protein extraction, enzymatic N-glycan release (i.e. PNGase F), chemical reduction and glycan clean-up protocols. All samples were analyzed using porous graphitic carbon liquid chromatography (PGC-LC) coupled to a high-resolution tandem (MS/MS) mass spectrometer (i.e. Orbitrap Exploris 480).
To survey this data, we first assessed the MS/MS data for the occurrence of diagnostic, N-glycan derived fragment ions, independent of intact glycan precursor mass information. For this, we automatically extracted all MS/MS spectra and retained only those that contained N-glycan specific fragment ions (i.e. 224.1 amu, diagnostic for reduced N-acetylhexosamine). This first breakdown of our dataset confirmed that, overall, more than 40 percent of all MS/MS spectra generated in this study (i.e. 220,506 out of 509,283 MS/MS spectra) contained information on chemically reduced glycan precursor ions.
Importantly, however, we also found considerable differences in the relative proportion of N-glycan-derived MS/MS spectra between tissues, ranging from approx. 20% for ileum to more than 80% for seminal vesicle (Supplementary Fig. 1.A). This tissue-dependent variability in the dynamic range and structural complexity prompted further investigation into tissue-specific N-glycome features such as sialylation or fucosylation, independent of intact glycan precursor information.
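The precursor-independent filtering step described above can be illustrated with a minimal Python sketch. It assumes that each MS/MS spectrum is available as a list of (m/z, intensity) pairs tagged with its tissue of origin; the matching tolerance of 0.02 amu and all function names are illustrative assumptions and are not taken from the study's own code.

```python
from collections import defaultdict

REDUCED_HEXNAC_MZ = 224.1  # diagnostic ion for reduced N-acetylhexosamine (from the text)
MZ_TOL = 0.02              # assumed matching tolerance in amu (not stated in the text)


def has_fragment(peaks, target_mz, tol=MZ_TOL):
    """Return True if any peak with non-zero intensity lies within tol of target_mz."""
    return any(abs(mz - target_mz) <= tol and intensity > 0 for mz, intensity in peaks)


def glycan_fraction_by_tissue(spectra):
    """Fraction of MS/MS spectra per tissue that contain the diagnostic ion.

    spectra: iterable of (tissue, peaks), where peaks is a list of (mz, intensity).
    """
    total = defaultdict(int)
    hits = defaultdict(int)
    for tissue, peaks in spectra:
        total[tissue] += 1
        if has_fragment(peaks, REDUCED_HEXNAC_MZ):
            hits[tissue] += 1
    return {tissue: hits[tissue] / total[tissue] for tissue in total}
```

Applied to the counts reported above, the same ratio gives 220,506 / 509,283, i.e. roughly 43% of all spectra.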
Similar to sialylation, fucosylation is a vital, developmentally controlled modification of N-glycans, which has been implicated in numerous cell-cell interactions 34 and massively expands the structural heterogeneity of the N-glycome. Fucose can be linked to the most proximal core-GlcNAc residue, originally connected to the protein (i.e. core-fucose). Additionally, fucose residues have been found linked to either galactoses or N-acetylhexosamines in multiple positions of the distal parts of N-glycan antennae (i.e. distal fucosylation). Interestingly, distal fucosylation also encodes a series of essential, immune-reactive glycan-epitope isomers, comprising one (e.g. Lewis X, sialyl Lewis X, Lewis A, sialyl Lewis A, blood-group H type 1 and type 2) or more fucose residues (e.g. Lewis Y, Lewis B), in humans. Fucosyltransferase 3 (FUT3), the enzyme which catalyzes the formation of the Lewis A and Lewis B epitopes in humans, was found to be a pseudogene in mice. The murine system is thus believed to lack these two important fuco-epitopes 35,36. Unfortunately, most of these fucosylated N-glycan isomers cannot be resolved by MS/MS alone 37-39. As a consequence, our precursor-independent MS/MS profiling approach, which is based on fragment ions that are indicative 37-39 of fucosylated N-glycans (i.e. one fucose linked to one HexNAc and to one hexose; fragment ion mass = 512.2 amu), merely reflects the combined expression patterns of core fucose, Lewis X, sialyl Lewis X, Lewis Y, blood-group H (bgH) type 1 and type 2, but not Lewis A or Lewis B, in murine tissues. Our analysis showed that fucosylated N-glycans were indeed present in all tissues, with exceptionally high levels in seminal vesicle (85%) 40, kidney (65%), and brain (50%) (Fig. 1., Supplementary Fig. 2.E). This suggested generally high expression levels of fucosyltransferases in these tissues, such as Fut9, which is highly expressed in kidney and brain 41, Fut2 and Fut4, which are highly expressed in colon epithelial cells, or Fut8, which is globally expressed in mouse 42.
Another distinctive structural feature of N-glycans is bisecting N-acetylglucosamine (GlcNAc). Bisecting GlcNAc residues are β1,4-linked to the core β-mannose residue by MGAT3 and have been reported to be vital to fetal development 43, immunity, and cell adhesion 44. Expression of this critical structural modification of N-glycans was profiled across all tissues, based on the diagnostic fragment ion of mass 792.3 amu (i.e. one reduced GlcNAc residue, two GlcNAc residues, and one hexose residue). As expected, we found bisecting GlcNAc expression levels to be highly tissue-specific, with the highest expression levels in brain (4%), kidney (2.5%), and colon (2%) (Fig. 1., Supplementary Fig. 2.D), all in good agreement with mouse Mgat3 gene expression data 45.
Moreover, an extensive screening of all tissues was conducted to identify fragment ions indicative of the Sda antigen. This antigen is characterized by a Neu5Ac residue α2,3-linked and a GalNAc residue β1,4-linked to the galactose within a LacNAc motif (860.3 amu) 46. Intriguingly, this specific structural modification was predominantly expressed in the colon (9%), followed by the jejunum (3.5%) and duodenum (1%) (Supplementary Fig. 1.C). Notably, the ileum exhibited only minimal expression, accounting for less than 0.25% of all N-glycan associated MS/MS spectra. In previous studies, the Sda antigen has been identified in the colon of healthy humans, on the Tamm-Horsfall glycoprotein of Sda+ individuals, and in the serum of patients with gastric cancer 46. In humans, the biosynthesis of this antigen is driven by the B4GALNT2 gene, whose mouse ortholog (B4Galnt2) indeed appears to be expressed only in the intestinal organs 42.
Automated MS/MS-data-driven reconstruction of the mouse N-glycome.
To determine the N-glycan precursors contributing to the observed N-glycome signatures, we implemented a data-aggregation workflow to reconstruct N-glycosylation patterns at the precursor level. This workflow efficiently reduced PGC-LC-MS data complexity, leveraged MS/MS information for precursor identification, and extracted quantitative precursor information. The extraction of precursor mass information from LC-MS/MS data is often complicated by the imperfection of mass-spectrometric data, due to e.g. incorrect mono-isotopic peak-picking or charge-state assignments by the mass spectrometer, unintended precursor ion co-isolation, in-source fragmentation, or excessive adduct-ion formation. To address these challenges, we conducted post-analysis charge-deconvolution and deisotoping of all MS spectra (using DeCon2 47, Supplemental Information), assigned precursor mass and intensity values to mass bins (range of 1000-4000 Da; bin-width = ± 0.05 amu), and summed the values across the entire chromatographic time-range using custom code. This transformation of time-resolved LC-MS data into two-dimensional precursor mass-to-intensity arrays not only improved the robustness of mono-isotopic precursor mass assignments, but also efficiently reduced the dimensionality of our dataset. Additionally, these data arrays facilitated the construction of tissue-specific quantitative histograms (Fig. 2.A, Supplementary data), resembling MALDI-TOF MS spectra frequently communicated in the field of glycomics, providing a convenient data visualization format that integrates seamlessly with current glycome data repositories (e.g. CFG, www.functionalglycomics.org; Fig. 2.A).
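The binning and summation step can be sketched as follows. This is a minimal illustration, not the custom code used in the study: it assumes deconvoluted features are given as (neutral mass, intensity, retention time) tuples, and it implements the ± 0.05 amu bins as 0.1-wide bins centred on multiples of 0.1 Da, which is one plausible reading of the description.

```python
from collections import defaultdict

MASS_MIN, MASS_MAX = 1000.0, 4000.0  # Da, mass range stated in the text
BIN_HALF_WIDTH = 0.05                # amu, bin half-width stated in the text


def bin_precursors(features):
    """Collapse time-resolved features into a mass-to-intensity histogram.

    features: iterable of (neutral_mass, intensity, retention_time).
    Returns {bin_centre: summed_intensity}, with retention time summed out.
    """
    width = 2 * BIN_HALF_WIDTH
    histogram = defaultdict(float)
    for mass, intensity, _retention_time in features:
        if MASS_MIN <= mass <= MASS_MAX:
            # Integer bin index avoids accumulating floating-point keys.
            histogram[round(mass / width)] += intensity
    return {round(index * width, 1): total for index, total in sorted(histogram.items())}
```

The resulting dictionary is the kind of two-dimensional mass-to-intensity array that can be plotted directly as a MALDI-like histogram.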
In a next step, raw MS/MS data were refined (i.e. mono-isotopic peak picking and charge state assignment re-evaluation) and converted into the generic .mgf file format using the proteomics software PEAKS 48. From this, to identify MS/MS spectra that derived from N-glycan precursors and to stringently control for unintended precursor ion co-isolation events, we calculated spectrum-specific score values (i.e. SNOG-score) from the intensity of an N- (and O-) glycan-specific fragment ion (i.e. oxonium ion of the reduced-end monosaccharide GlcNAc; 224.1 amu), using custom code (Supplemental data). Empirically, MS/MS spectra with SNOG-scores greater than 0.03 were determined to derive from actual N-glycan precursors (Supplementary Fig. 3., Supplementary Fig. 4., and Supplementary Fig. 5.). Subsequently, precursor mass information of SNOG-scored MS/MS spectra was aligned with the initial two-dimensional MS precursor mass-to-intensity arrays, and only MS signals of true N-glycan precursors above a cumulative intensity threshold of 5E+6 were retained.
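A possible form of this scoring and filtering step is sketched below. The text does not give the exact SNOG formula, so the sketch assumes, purely for illustration, that the score is the intensity of the 224.1 amu ion divided by the total ion intensity of the spectrum. The cutoffs of 0.03 and 5E+6 are taken from the text; the m/z tolerance and all names are hypothetical.

```python
SNOG_ION_MZ = 224.1      # oxonium ion of reduced-end GlcNAc (from the text)
SNOG_CUTOFF = 0.03       # empirical score cutoff (from the text)
INTENSITY_CUTOFF = 5e6   # cumulative intensity threshold (from the text)
MZ_TOL = 0.02            # assumed matching tolerance in amu


def snog_score(peaks):
    """Assumed definition: diagnostic-ion intensity as a fraction of total intensity."""
    total = sum(intensity for _, intensity in peaks)
    if total <= 0:
        return 0.0
    diagnostic = sum(i for mz, i in peaks if abs(mz - SNOG_ION_MZ) <= MZ_TOL)
    return diagnostic / total


def retain_glycan_bins(histogram, spectra):
    """Keep mass bins backed by a passing spectrum and sufficient summed intensity.

    histogram: {mass_bin: summed_intensity} from the MS-level aggregation.
    spectra:   iterable of (mass_bin, peaks) for the MS/MS spectra.
    """
    glycan_bins = {mass_bin for mass_bin, peaks in spectra
                   if snog_score(peaks) > SNOG_CUTOFF}
    return {mass_bin: intensity for mass_bin, intensity in histogram.items()
            if mass_bin in glycan_bins and intensity > INTENSITY_CUTOFF}
```

Because the score is relative to the whole spectrum, a co-isolated non-glycan precursor that contributes most of the fragment intensity lowers the score, which is how such a filter could control for co-isolation events.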
The number of automatically identified glycan precursor masses greatly varied between tissues. For example, while SNOG-filtering reduced the total number of glycan-derived precursor mass-bins by approx. 30% in brain, it removed approx. 75% of all precursor mass-bins in liver (Supplementary Fig. 4.B). This, again, highlighted important tissue-dependent differences in the dynamic range and the structural complexity of N-glycomes, and suggested sample-specific background signals at the precursor level.
Manual inspection of MS/MS spectra of low-scoring (i.e. rejected) mass-bins confirmed, for example, exceptionally high levels of non-N-glycan signals that derived from hexose oligomers (e.g. dextrans or maltodextrins, degradation products of glycogen 49) in liver, which were efficiently removed by our approach (Supplementary Fig. 5.).
The correlation and cluster analysis of our SNOG-filtered dataset (Fig. 3.) validated robust reproducibility among sample replicates and revealed distinctive tissue-specific clustering patterns. Notably, we found clustering even among distantly related mouse tissues, such as the exocrine organs, i.e. seminal vesicles and pancreas, clustering with brain, or the central organs of the immune system, spleen and thymus, coalescing with mammary glands, for which we observed convergent N-glycan signatures. Intriguingly, while spleen and thymus share a cluster, lymph nodes exhibit a glycosylation pattern more akin to white adipose tissue, potentially influenced by spatial/histological proximity.
A distinctive glycosylation pattern set apart the tissues of the small intestine, particularly the jejunum and duodenum, constituting a distinct cluster separate from the ileum and colon. Conversely, the ileum and colon formed a cluster with the bladder and skin, suggesting a potential association rooted in shared epithelial glycosylation patterns. Additionally, our observations extend to the highly blood-perfused organs, where the liver and lung share a cluster with serum, while heart, kidney, and testis form another distinct cluster. This clustering pattern provides insights into the glycosylation variations within these organ groups, potentially reflecting their functional relationships or physiological roles.
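As an illustration of the kind of pairwise comparison that underlies such a correlation analysis, the sketch below computes a Pearson correlation between two precursor histograms. The text does not state which correlation measure or clustering method was used, so Pearson correlation over the union of mass bins (with absent bins counted as zero) is an assumption made here for illustration only.

```python
import math


def pearson(profile_a, profile_b):
    """Pearson correlation of two {mass_bin: intensity} profiles.

    Bins missing from one profile are treated as zero intensity.
    Returns NaN if either profile has no variance.
    """
    bins = sorted(set(profile_a) | set(profile_b))
    xs = [profile_a.get(b, 0.0) for b in bins]
    ys = [profile_b.get(b, 0.0) for b in bins]
    n = len(bins)
    if n == 0:
        return float("nan")
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sd_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    sd_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    if sd_x == 0 or sd_y == 0:
        return float("nan")
    return cov / (sd_x * sd_y)
```

A matrix of such pairwise values across all samples could then serve as input to a standard hierarchical clustering routine.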
Dissecting the mouse N-glycome based on sub-structural determinants.
To systematically query and stratify N-glycan precursors based on fragment ion data, in a next step we extended our initial SNOG-scoring scheme by additional N-glycan specific fragment ions (i.e. eSNOG). MS/MS spectrum-specific eSNOG-scores were thus calculated from the relative intensities of sub-structure specific diagnostic fragment ions, and empirically determined, sub-structure specific eSNOG cutoff values were used for down-stream data-filtering (Supplementary Fig. 6.). Efficiently limiting the impact of potential gas-phase rearrangements 50, this versatile data-filtering approach allowed for automated stratification of our reconstructed N-glycome data based on sub-structure specific features.
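The extension to several diagnostic ions might look like the following sketch. The fragment masses are those quoted in this study for reduced N-glycans; the score definition (ion intensity relative to total spectrum intensity), the tolerance, and the single placeholder cutoff are assumptions, since the empirically determined, sub-structure specific cutoffs are only given in the supplementary material.

```python
# Diagnostic fragment ions (amu) quoted in the text.
DIAGNOSTIC_IONS = {
    "Neu5Ac": 292.1,
    "Neu5Gc": 308.1,
    "fucosylated Hex-HexNAc": 512.2,
    "bisecting GlcNAc": 792.3,
    "Sda antigen": 860.3,
}
MZ_TOL = 0.02            # assumed matching tolerance in amu
PLACEHOLDER_CUTOFF = 0.01  # hypothetical; real cutoffs are sub-structure specific


def esnog_scores(peaks, ions=DIAGNOSTIC_IONS, tol=MZ_TOL):
    """Relative intensity of each diagnostic ion within one MS/MS spectrum."""
    total = sum(intensity for _, intensity in peaks)
    scores = {}
    for name, target in ions.items():
        hit = sum(i for mz, i in peaks if abs(mz - target) <= tol)
        scores[name] = hit / total if total > 0 else 0.0
    return scores


def stratify(peaks, cutoffs=None):
    """Return the set of sub-structural features whose score exceeds its cutoff."""
    cutoffs = cutoffs or {}
    return {name for name, score in esnog_scores(peaks).items()
            if score > cutoffs.get(name, PLACEHOLDER_CUTOFF)}
```

Passing a per-feature dictionary as `cutoffs` mirrors the use of separate cutoff values for each sub-structure.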
Capitalizing on our automated data-analysis workflows, we next stratified all other N-glycome datasets into the same six N-glycan categories and compared their relative abundances across the 20 murine tissues analyzed (Fig. 4.B). This comparative analysis allowed us to further dissect the structural complexity of the mouse N-glycome at the precursor level and revealed a remarkable diversity in glycosylation patterns across tissues.
From our initial precursor-independent profiling analyses we found that serum was largely dominated by Neu5Gc-bearing N-glycans, representing up to 96% of the TIC 53. Additionally, our N-glycan precursor-informed data now revealed that this important trait of the murine serum N-glycome was essentially derived from only two biantennary, non-fucosylated N-glycans (Hex5HexNAc4Neu5Gc1 and Hex5HexNAc4Neu5Gc2), totaling ~57% of all its Neu5Gc-containing structures. The sialylated fraction of the brain N-glycome, on the other hand, consisted almost entirely of Neu5Ac-decorated N-glycan species, in good agreement with previous studies 25,54. Based on our eSNOG-stratified precursor information, the N-sialome of the brain presented as highly diverse, comprising truncated (e.g. Hex4HexNAc3Fuc1Neu5Ac1), hybrid-type (e.g. Hex6HexNAc3Fuc1Neu5Ac1), bisecting GlcNAc-containing (e.g. Hex5HexNAc5Fuc1Neu5Ac2) and highly complex tetra-antennary N-glycans (e.g. Hex7HexNAc6Fuc3Neu5Ac2). Additionally, we found approx. 50% of the sialylated N-glycan species in brain to also carry distal fucoses. From this data we calculated that almost 50% of the brain N-glycome is sialylated. Of note, the degree of sialylation reported for the brain varies vastly between studies, with previous estimates ranging from only ~3% 25, over 20% 29, up to ~40% 55, depending on the methodology used.
Differentiating the N-glycomes of seminal vesicle and pancreas, we next screened our data for fragment ions that are diagnostic for doubly fucosylated antennae (i.e. Hex1HexNAc1Fuc2, 658.3 amu), hence the Lewis Y-epitope. Our analysis revealed that this N-glycan motif was most specific for seminal vesicle (~1.2%) (Supplementary Fig. 7.A) and barely detected in the pancreas of mice. Lewis Y-containing precursors included a series of partially core-fucosylated, bi- (e.g. Hex5HexNAc4Fuc4, Hex5HexNAc4Fuc5) and multi-antennary N-glycans (e.g. Hex6HexNAc5Fuc5 and Hex8HexNAc7Fuc6). This unique glycosylation landscape of the seminal vesicle stands out among all tissues, and it is, to our knowledge, the first glycobiological description of this organ in mammals. Furthermore, the high expression levels of distally fucosylated N-glycans, including those with the Lewis Y epitope, correlated with gene expression data of the respective fucosyltransferases (i.e. Fut2, Fut4), as well as previous reports on the glycome of human seminal plasma 40,56. The functional implications of these exceptionally high levels of distal fucose and alpha-galactose in exocrine organs remain to be explored.
Deep mining the mouse N-glycome uncovers unusual structural modifications.
To comprehensively annotate the results of our non-targeted data-analysis approach we used an in silico constructed mouse N-glycome database, holding an extensive list of canonical mouse N-glycans (i.e. glycoDB; 1429 unique N-glycan compositions, 960 precursor ion mass bins; Supplementary). Surprisingly, the automated annotation of our N-glycome histograms with this glycoDB highlighted several precursor masses, which, despite their consistent and reproducible detection, could not be explained by known, conventionally computed or anticipated N-glycan compositions. We hypothesized that this remarkably large population of unknown N-glycan precursor masses (i.e. approx. 15% of TIC across all samples; ranging from approx. 6% TIC in ileum, to up to approx. 37% in liver; Supplementary Fig. 4.C and 4.D) would either result from experimental noise in the MS raw-data or represent unusual N-glycan compositions. Manual inspection of underlying MS/MS spectra uncovered and confirmed a series of unusual diagnostic fragment ions that were indicative of rare or non-canonical N-glycan structures, such as those bearing the HNK-1 epitope, sulfated HexNAc residues, doubly sialylated antennae, fucosylated LacdiNAc structures, or acetylated sialic acids (i.e. Ac-Neu5Ac and Ac-Neu5Gc), not covered by our in silico N-glycome model database. Systematic mining of our dataset across all tissues using our eSNOG approach revealed striking tissue-specificity for most of these non-canonical N-glycan modifications.
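The annotation of observed precursor masses against a composition database can be sketched as follows. The monoisotopic residue masses are standard values rounded to four decimals; the tiny two-entry database, the function names, and the reuse of the ± 0.05 amu bin half-width as matching tolerance are illustrative assumptions and do not reproduce the actual glycoDB.

```python
# Monoisotopic residue masses in Da (standard values, rounded to 4 decimals).
RESIDUE_MASS = {
    "Hex": 162.0528,
    "HexNAc": 203.0794,
    "Fuc": 146.0579,
    "Neu5Ac": 291.0954,
    "Neu5Gc": 307.0903,
}
WATER = 18.0106      # free glycan = sum of residues + H2O
REDUCTION = 2.0157   # two hydrogens added by reducing-end reduction
TOLERANCE = 0.05     # amu; assumed equal to the mass-bin half-width


def reduced_mass(composition):
    """Neutral monoisotopic mass of a reduced glycan, e.g. {"Hex": 5, "HexNAc": 2}."""
    residues = sum(RESIDUE_MASS[name] * count for name, count in composition.items())
    return residues + WATER + REDUCTION


def annotate(observed_masses, glyco_db, tol=TOLERANCE):
    """Split observed masses into annotated ({mass: [names]}) and unexplained ones."""
    annotated, unexplained = {}, []
    for mass in observed_masses:
        hits = [name for name, comp in glyco_db.items()
                if abs(reduced_mass(comp) - mass) <= tol]
        if hits:
            annotated[mass] = hits
        else:
            unexplained.append(mass)
    return annotated, unexplained


# Hypothetical two-entry database for demonstration only.
toy_db = {
    "Hex5HexNAc2": {"Hex": 5, "HexNAc": 2},
    "Hex5HexNAc4Neu5Gc1": {"Hex": 5, "HexNAc": 4, "Neu5Gc": 1},
}
```

Masses that end up in the unexplained list correspond to the population of precursors that would then require inspection of their MS/MS spectra.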
Sulfation is considered the most diverse glycan modification, with 35 sulfotransferases involved in the process of glycan sulfation 61. Although many of these sulfotransferases are thought to serve in decorating O-linked glycosaminoglycan chains, sulfated N-glycans have also been reported for porcine and human pancreas, with the highest levels found within the islets of Langerhans 62. Systematic MS/MS-based screening of the different tissues for the relevant fragment ion (HexNAc-SO4, 284 amu) revealed that almost all tissues exhibited at least low levels of N-glycan sulfation. In line with previous reports 32,62, the highest amount of HexNAc sulfation in our data was found in pancreas (~5%), bladder (~2.5%), skin (~2%), and brain (~1%) (Fig. 5.F). Remarkably, however, the population of sulfated N-glycans in the pancreas consisted exclusively of bi-antennary glycans, with and without core-fucose (e.g. Hex5HexNAc4Fuc1(SO4)1). Other sulfated structures in the pancreas were found to also carry one or two additional alpha-Gal residues (e.g. Hex6HexNAc4Fuc1(SO4)1), which aligned well with the high expression levels of alpha-Gal in the pancreas found by our study.
The expression of the fucosylated LacdiNAc N-glycan epitope was previously associated with different forms of cancer 63, and previous data has shown that LacdiNAc precursors can be fucosylated by FUT9 64, making it a candidate gene for regulating the tissue-specific biosynthesis of fucosylated LacdiNAc. Recently, fucosylated LacdiNAc-containing structures have been identified in the N-glycome of the human brain 27 and of the HEK-293 cell line 65,66. Systematic screening of all tissues for glycan precursors that generated the respective diagnostic fragment ion (HexNAc2Fuc1, 553.2 amu) revealed that this epitope is expressed in a highly tissue-specific manner, with the highest levels found in kidney (~3%), colon (~2%), skin (~0.5%), and brain (~0.1%) (Fig. 5.E). These observations were further corroborated by previous reports of Fut9 expression in the proximal tubule of the kidney, in intestinal epithelial cells and in neurons 42.
Neu5Ac and Neu5Gc can both be installed in either α2,3-, α2,6- or α2,8-linkage, to either galactose or (very rarely in N-glycans) to N-acetylhexosamines, such as N-acetylglucosamine (i.e. 6-sialyl Lewis C; as reported previously in small amounts in bovine fetuin 67) or N-acetylgalactosamine (e.g. sialyl LacdiNAc). We thus compared the relative abundances of the two sialic acid variants linked to hexoses, the canonical acceptor sites on N-glycans, or to N-acetylhexosamines (i.e. N-acetylglucosamine or N-acetylgalactosamine; 6-sialyl Lewis C, sialyl LacdiNAc, respectively), across all tissues. Again, we eSNOG-filtered all N-glycan derived MS/MS spectra for those that contained fragment ions diagnostic for sialic acids (i.e. fragment ion mass 292.1 amu and 308.1 amu) and for sialylated HexNAc (i.e. fragment ion mass 495.2 amu and 511.2 amu, respectively). As expected, the relative proportions of precursor signals of canonically sialylated N-glycans greatly exceeded those that were generated from the unusual sialyl-HexNAc structures, in all tissues. Also, the relative incorporation rates of Neu5Ac and Neu5Gc were essentially independent of the acceptor monosaccharide across all tissues. Remarkably, in brain Neu5Ac-HexNAc was found on approximately 10% of the sialylated N-glycan fraction.
Further dissecting the structural complexity of sialylated N-glycans, we also investigated the relative abundance of O-acetylated neuraminic acids across tissues. O-Acetylation of sialic acids is correlated with the circulatory half-life of glycoproteins in human serum and can be crucial for their biological activities 72. Most importantly, O-acetylated sialic acids are critical entry receptors for many respiratory viruses, including Influenza C virus, human coronavirus OC43 and the murine coronavirus 73. Murine coronaviruses often spread to the liver, an organ tropism that has been suggested to be partially explained by the expression pattern of Ac-Neu5Ac and Ac-Neu5Gc in both lung and liver 74. Quantifying the signals of all N-glycan precursors with O-acetylated neuraminic acids (i.e. Ac-Neu5Ac, fragment ion mass = 334.1 amu and/or Ac-Neu5Gc, fragment ion mass = 350.1 amu) allowed us to confirm expression of these important modifications almost exclusively in five tissues, namely lung (~4.5%), heart (~2%), kidney (~1.5%), and, at very low levels, spleen (< 0.1%) (Fig. 5.C). Liver gave ambiguous results, as liver 1 showed very high, and liver 2 lower expression of acetylated sialic acids (~3.5% and ~0.2%, respectively). The respective compositions in the lung are all partially core-fucosylated, bi-antennary N-glycans with one, two, or three Neu5Gc- or Neu5Ac-residues, of which one sialic acid residue was acetylated (e.g. Hex5HexNAc4Fuc1Neu5Gc1Ac1). So far, CASD1 is the only mammalian enzyme known to catalyze the acetylation of sialic acid, resulting in the formation of Neu5,9Ac2 75. In mouse, like human, CASD1 appears to be widely expressed among most organs 42. Our findings suggest additional unknown regulatory mechanisms that restrict expression of O-acetylated neuraminic acids to specific organs. Furthermore, we predominantly observed O-acetylation on Neu5Gc and only to a low degree on Neu5Ac residues (Supplementary Fig. 7.H and Supplementary Fig. 7.I). This suggests important differences in the biosynthesis, stability, or incorporation of the two O-acetylated sialic acid variants into N-glycans. As it is unclear whether CASD1 also catalyzes the O-acetylation of Neu5Gc, this warrants further investigations into the substrate specificities of CASD1 and the identification of additional O-acetyltransferases 76.
Profiling the isomeric structural complexity of the murine N-glycome.
Next to compositional variations, N-glycans exhibit a unique level of micro-heterogeneity that derives from structural, positional, and anomeric isomers. The number of unique N-glycan structures that can be constructed from a given monosaccharide composition (hence of identical molecular mass, i.e. isobaric) adds a critical layer of complexity to the N-glycome 9. Importantly, the co-existence of multiple isobaric N-glycan isomers within a single tissue cannot be captured by MS (or even MS/MS) alone and can only be resolved by either chromatographic or ion-mobility based separation techniques. To generate isomer-sensitive information, in this study, all samples were analyzed using a highly isomer-selective stationary phase (PGC-LC). Previously established N-glycan retention libraries combined with MS/MS data were used to identify the exact structures of respective isomers 9,26,27,77.
Expanding our N-glycome analyses by the integration of chromatographic information revealed the staggering structural complexity of the mouse N-glycome and provided insight into vital, organ-intrinsic regulations of glycobiological pathways. To showcase the granularity of our N-glycan isomer-sensitive dataset, we first compared the retention times of a specific, single precursor mass, that holds all doubly Neu5Ac-sialylated, core-fucosylated, biantennary, complex type N-glycan structures of composition Hex5HexNAc4Fuc1Neu5Ac2, across tissues (Fig. 6.A). As sialic acids are usually found in either α2,3- or α2,6-linkage to terminal galactose residues of N-glycans, up to four different structural isomers of unique retention times are expected for this single glycan composition: both antennae carrying α2,3-linked Neu5Ac, both antennae capped with α2,6-linked Neu5Ac, and one antenna bearing α2,6-linked while the other bears α2,3-linked Neu5Ac (two arrangements, depending on which arm carries which linkage). To compensate for experimental chromatographic elution-time shifts between samples, all retention times within a given analytical run were normalized to those of the consistently detected Man5 N-glycan structure. Elution profiles of N-glycan structures of composition Hex5HexNAc4Fuc1Neu5Ac2 showed that almost all tissues were dominated by α2,3-linked Neu5Ac (Fig. 6.A, Supplementary Fig. 9.D). Interestingly, however, brain, lung, and testis presented with a balanced ratio of α2,3- and α2,6-linked Neu5Ac isomers. Moreover, the brain displayed a notable presence of distinct N-glycan structure isomers due to the occurrence of branching Neu5Ac and/or antennary fucose.
Remarkably, these specific glycan structures were exclusive to the brain tissue and were not identified in any other organ within the mouse.
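The retention-time normalization to Man5 mentioned above can be expressed in a few lines. The text does not specify whether normalization was done by ratio or by offset; the sketch below assumes a simple ratio (relative retention time), and the function name and input layout are hypothetical.

```python
def normalize_retention_times(run, reference="Man5"):
    """Express all retention times of one analytical run relative to a reference glycan.

    run: {glycan_label: retention_time_in_minutes}; must contain the reference.
    Assumption: normalization by ratio, so the reference maps to 1.0.
    """
    reference_rt = run[reference]
    if reference_rt <= 0:
        raise ValueError("reference retention time must be positive")
    return {label: rt / reference_rt for label, rt in run.items()}
```

After this step, isomer peaks from runs with shifted absolute elution times can be compared on a common relative scale.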
Next, we compared the elution profile of the corresponding Neu5Gc-sialylated N-glycan structures (i.e. Hex5HexNAc4Fuc1Neu5Gc2) across all tissues. In stark contrast to Neu5Ac-sialylated structures, Neu5Gc-sialylated structures were predominantly found in α2,6-linkage across tissues, except for skin (i.e. ratio of α2,3- to α2,6-linked Neu5Gc approx. 50%) (Fig. 6.A, Supplementary Fig. 9.E).
PGC-LC also allowed us to discriminate pivotal positional differences in distal fucosylation (i.e. α1,2-linked to galactose or α1,3-linked to GlcNAc residues) that give rise to the important, glycan-associated immune determinants bgH and Lewis X. While Lewis X determinants are mainly synthesized by FUT4 or FUT9, bgH-epitopes are synthesized by FUT1 or FUT2 78. To discern these critical fucose structures, we compared the retention time profiles of a multiply fucosylated precursor mass of the composition Hex5HexNAc4Fuc3 across all relevant samples (Fig. 6.B). The N-glycan structures that may be deduced from this single composition comprise asialo, core-fucosylated, biantennary, complex-type N-glycan structures with two distal fucoses, either in bgH- or Lewis X-related linkage. Based on our normalized elution profile data, we found unexpected differences in antennary fucose between brain, kidney, seminal vesicles, and pancreas (Fig. 6.B). While brain and kidney were found to contain almost exclusively the Lewis X-containing isomer 79, pancreas exhibited only the bgH-epitope-containing isomer, which correlates with the high expression of FUT1 in the pancreas 80. Seminal vesicle, however, exhibited signals corresponding to both the Lewis X- and the bgH-epitope-containing isomer. This finding aligns well with the occurrence of the Lewis Y-epitope, as described above, and the high expression level of FUT1, FUT2 and FUT4 in seminal vesicle 40. Interestingly, a major carrier of Lewis X and Lewis Y in human seminal plasma has been linked to Glycodelin isoform S (GdS) 81. As Lewis X and Y are known ligands of the immune receptor DC-SIGN 82 and glycodelins are potent immunosuppressors, they have been suggested to be important for feto-embryonic defense 83 against adverse immune reactions in the early stages of gestation.
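Because all isomers of a composition share one precursor mass, only chromatography can separate them. The following Python sketch shows how a composition string maps to a single monoisotopic mass. It assumes the glycans are reduced (as in the borohydride step of the Methods) and uses standard monoisotopic residue masses; the function names are illustrative and are not part of the authors' code.

```python
import re

# Standard monoisotopic residue masses (Da).
RESIDUE_MASS = {
    "Hex": 162.052824,
    "HexNAc": 203.079373,
    "Fuc": 146.057909,
    "Neu5Ac": 291.095417,
    "Neu5Gc": 307.090331,
}
WATER = 18.010565         # free reducing end
REDUCTION_H2 = 2.015650   # two hydrogens added by reduction to the alditol

_NAMES = "HexNAc|Hex|Fuc|Neu5Ac|Neu5Gc"  # HexNAc must be tried before Hex
_TOKEN = re.compile(rf"({_NAMES})(\d+)")
_WHOLE = re.compile(rf"(?:(?:{_NAMES})\d+)+")


def parse_composition(text):
    """Parse e.g. 'Hex5HexNAc4Fuc1Neu5Ac2' into {'Hex': 5, 'HexNAc': 4, ...}."""
    compact = text.replace(" ", "")
    if not _WHOLE.fullmatch(compact):
        raise ValueError(f"cannot parse composition: {text!r}")
    counts = {}
    for name, n in _TOKEN.findall(compact):
        counts[name] = counts.get(name, 0) + int(n)
    return counts


def reduced_glycan_mass(text):
    """Neutral monoisotopic mass of the reduced glycan, in Da."""
    counts = parse_composition(text)
    residues = sum(RESIDUE_MASS[name] * n for name, n in counts.items())
    return residues + WATER + REDUCTION_H2
```

Every linkage or arm isomer of, for example, Hex5HexNAc4Fuc3 yields the same value from `reduced_glycan_mass`, which is why the bgH- and Lewis X-containing structures discussed above must be distinguished by retention time and MS/MS rather than by precursor mass.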
Chromatography also provided stratification of arm/branch isomers of tri-antennary, complex-type structures. Complex-type N-glycans are characterized by the substitution of both (i.e. α1,3- and α1,6-) terminal core-mannoses by the addition of GlcNAc residues by Mgat1 and Mgat2, respectively. Additional branching of such complex-type structures results from secondary GlcNAc substitutions on the (i.e. α1,3- and α1,6-) terminal core-mannoses by Mgat4 and Mgat5, respectively, or the installation of a "bisecting GlcNAc" on the core-central mannose by Mgat3. Our isomer-sensitive analysis thus allowed us to discern between the catalytic activities of the five different glycosyltransferases involved across all tissues. To explore the occurrence of biologically distinct N-glycan structures of identical composition, we compared the retention times of the precursor composition Hex3HexNAc5Fuc1 across all tissues, which could either be multiple tri-antennary structural isomers or a single biantennary, bisected complex-type structure (Fig. 6.C, Supplementary Fig. 9.C). The respective elution profiles indicated three dominant peaks. Intriguingly, brain, kidney, colon, ileum, duodenum, jejunum, and serum were strongly dominated by biantennary, bisected structure isomers. While this confirmed the results of our MS/MS-based profiling, it also suggests a high occurrence of bisecting GlcNAc in brain, kidney, and colon, corroborated by Mgat3 expression data 84.
Apart from oligomannose-type, N-glycans are often classified into hybrid-type or complex-type structures. Hybrid-type N-glycans result from the incomplete action of alpha-mannosidase II, giving rise to unsubstituted mannose residues on the α1,6-arm and a potentially substituted GlcNAc residue linked to the α1,3-mannose residue 1. Interestingly, many N-glycome studies infer N-glycan classification from composition. We found that this approach greatly oversimplifies the experimental data presented here. For example, the composition Hex5HexNAc3Fuc1, which is often considered an "archetypical" hybrid-type N-glycan structure, separated into several chromatographic peaks across tissues (Supplementary Fig. 9.A). Although most tissues exhibited highly similar elution profiles, pancreas, seminal vesicle, thymus, and spleen showed distinct dominant structures, which eluted much later. Manual inspection of the associated MS/MS spectra indicated a core-fucosylated, truncated N-glycan structure with an alpha-galactosylated antenna. This was further corroborated by high levels of alpha-galactose in pancreas, seminal vesicle, thymus, and spleen, as observed in our initial MS/MS-based N-glycome profiling, and implied elevated levels of hexosaminidase in these tissues.
The classification of N-glycan structures, solely relying on composition, is further complicated by the incorporation of bisecting GlcNAc into hybrid-type N-glycans. Many hybrid-type structures with bisecting GlcNAc have counterparts in the class of complex-type N-glycans, adding a further layer of complexity to their delineation based on composition (e.g. Hex5HexNAc4Fuc1, Hex6HexNAc4Fuc2, Hex5HexNAc4, Hex6HexNAc4). While the classification of the composition Hex5HexNAc4Fuc1 as complex-type was true for most tissues we analyzed, we found that brain exhibited at least two hybrid-type structures within this composition 9 (Fig. 6.D, Supplementary Fig. 9.A). Similarly, while the composition Hex6HexNAc4Fuc1 consisted of two hybrid-type structures containing bisecting GlcNAc in the brain, it represented alpha-galactosylated complex-type structures when present in other tissues (Supplementary Fig. 10.A). The same was found for the composition Hex6HexNAc4Fuc2, which consists of hybrid-type structures with a bisecting GlcNAc in the brain, yet complex-type structures when present in other tissues (Supplementary Fig. 10.B). These findings suggest that hybrid-type structures with bisecting GlcNAc are yet another highly distinctive feature of the brain N-glycome, and they underscore the merits of isomer-sensitive N-glycome analyses to uncover unexpected structural signatures.

Discussion
Here we make publicly available a consistent and structure-sensitive N-glycome dataset of 20 different mouse tissues, essential for modern integrative systems biology efforts. We used PGC-LC to chromatographically separate even closely related N-glycan isomers, an Orbitrap mass analyzer for high-resolution tandem mass-spectrometric data acquisition, and an unconventional, non-targeted data analysis approach. This entirely data-driven analytical approach provides means to automatically extract and query overall glycosylation patterns and allows for (semi-)quantitative N-glycomics. Our workflow provides an unprecedented level of depth and structural fidelity, identifies several-fold more glycan features than previous studies, and highlights organ-intrinsic regulations of glycobiological pathways whose function is not yet fully understood. Additionally, using our approach, we identified and consistently compared the expression of rare, non-canonical modifications, some of them reported here for the first time.
Unsupervised data analysis found clustering and convergent N-glycan signatures even among distantly related mouse tissues. These clustering patterns provide impartial insights into the glycosylation variations within organ groups, which potentially reflect their functional relationships or physiological roles.
Furthermore, our data suggest that different tissues use different biocatalytic strategies to specify their glycobiological identity, and thereby generate the staggering complexity revealed by our study. For example, while brain N-glycan diversity mainly results from its unique capacity to generate highly unusual and tissue-specific monosaccharide linkages to complex- and hybrid-type N-glycan core structures (e.g. "branching" Neu5Ac or HNK-1; Fig. 5), the structural complexity of lung N-glycans largely draws from the incorporation and modification of different sialic acid variants (e.g. differently O-acetylated Neu5Gc and Neu5Ac). By contrast, exocrine organs, such as pancreas or seminal vesicle, shape their diversity of N-glycan structures primarily via a uniquely tuned interplay of alpha-galactosylation and antennary fucosylation. Although large parts of such tissue-specific N-glycome signatures can be aligned with gene expression data, the expression of non-canonical N-glycan structures in particular implies additional, hitherto unknown regulatory mechanisms that restrict the catalytic activity of glyco-enzymes to specific organs.
Taken together, using a non-targeted, isomer-sensitive workflow to characterize a consistent and essentially complete mouse N-glycome atlas, our analyses reveal tissue-specific N-glycan structural features and highlight important organ-intrinsic regulations of glycobiological pathways. We anticipate that both our dataset and our analytical workflow will be instructive for fundamental glycobiological research, glycan analytical benchmarking, the development of new glycome data analysis tools, and integrative systems glycobiology.

Preparation of mouse tissues
Two male and two female C57BL/6J mice were obtained from the Jackson Laboratory (RRID: IMSR_JAX:000664, Bar Harbor, ME). C57BL/6J mice were bred in the licensed breeding facility of the Institute of Molecular Biotechnology (IMBA, Vienna, AT), under a 14 h/10 h light/dark cycle. Food and water were available ad libitum. All mice were euthanized with carbon dioxide at the age of 13 weeks. All tissues were collected in triplicate from either female (F) or male (M) mice, and included: skin from ear (F), whole pancreas (F), middle section of duodenum (F), middle section of jejunum (F), middle section of ileum (F), middle section of colon (F), whole spleen (F), left kidney (F), whole liver (F), whole heart (F), whole lung (F), whole brain (F), epididymal white adipose tissue (F), inguinal mammary fat pad including mammary glands and excluding inguinal lymph node (F), inguinal lymph node (F), whole thymus (F), left testicle (M), whole bladder (M), serum (submandibular bleeding, M), and left seminal vesicle (M). Prior to freezing, duodenum, jejunum, ileum, colon and bladder were opened and rinsed with PBS to remove food, fecal matter and/or urine. Serum was collected from blood with Microtainer SST tubes (BD Biosciences, 365968). All samples were snap frozen in liquid nitrogen immediately after collection and stored at -80°C until further processing.

N-glycan extraction
Each mouse tissue was transferred into a Falcon tube and mixed with 100 mM ammonium bicarbonate buffer containing 20 mM dithiothreitol and 2% sodium dodecyl sulfate in a total volume of 2 mL. After homogenization with an Ultra-Turrax T25 disperser, the homogenate was incubated at 56°C for 30 min. After cooling down, the solution was brought to 40 mM iodoacetamide and incubated at room temperature in the dark for 30 min. After centrifugal clarification, chloroform-methanol extraction of the proteins was carried out using standard protocols 66. In brief, the supernatant was mixed with four volumes of methanol, one volume of chloroform and three volumes of water in the given order. After centrifugation of the mixture for 3 min at 2500 rpm, the upper phase was removed, 4 mL of methanol were added, and the pellet was resuspended. The solution was centrifuged, the supernatant was removed, and the pellet was again resuspended in 4 mL methanol. The last step was repeated two more times.
After the last methanol washing step, the pellet was dried at room temperature. The dried pellet was taken up in 50 mM ammonium acetate (pH 8.4). For N-glycan release, 2.5 U N-glycosidase F were added and the resulting mixture was incubated overnight at 37°C. The reaction mixture was acidified with drops of glacial acetic acid and centrifuged at 4000×g. The supernatant was loaded onto a Hypersep C18 cartridge (1000 mg; Thermo Scientific, Vienna) that had been previously primed with 2 mL of methanol and equilibrated with 10 mL water. The sample was applied, and the column was washed with 4 mL water. The flow-through and the wash solution were collected and subjected to centrifugal evaporation. Reduction of the glycans was carried out in 1% sodium borohydride in 50 mM NaOH at room temperature overnight. The reaction was quenched by the addition of two drops of glacial acetic acid. Desalting was performed using HyperSep Hypercarb solid-phase extraction cartridges (25 mg) (Thermo Scientific, Vienna). The cartridges were primed with 450 µL methanol followed by 450 µL 80% acetonitrile in 10 mM ammonium bicarbonate. Equilibration was carried out by the addition of three times 450 µL water. The sample was applied and washed three times with 450 µL water. N-glycans were eluted by the addition of two times 450 µL 80% acetonitrile in 10 mM ammonium bicarbonate. The eluate was subjected to centrifugal evaporation and the dried N-glycans were taken up in 20 µL HQ-water. 5 µL of each sample were subjected to LC-MS/MS analysis.

LC-MS/MS analysis
LC-MS analysis was performed on a Dionex Ultimate 3000 UHPLC system coupled to an Orbitrap Exploris 480 mass spectrometer (Thermo Scientific). The purified glycans were loaded on a Hypercarb column (100 mm × 0.32 mm, 5 µm particle size, Thermo Scientific, Waltham, MA, USA) with 10 mM ammonium bicarbonate as the aqueous solvent A and 80% acetonitrile in 50 mM ammonium bicarbonate as solvent B. The gradient was as follows: 0-4.5 min 1% B, 4.5-5.5 min 1-9% B, 5.5-30 min 9-20% B, 30-41.5 min 20-35% B, 41.5-45 min 35-65% B, followed by an equilibration period at 1% B from 45-55 min. The flow rate was 6 µL/min. MS analysis was performed in data-dependent acquisition (DDA) mode with positive polarity from 500-1500 m/z using the following parameters: resolution was set to 60,000 with a normalized AGC target of 300%. The 10 most abundant precursors (charge states 2-6) within an isolation window of 1.4 m/z were selected for fragmentation. Dynamic exclusion was set at 20 s (n = 1) with a mass tolerance of ± 10 ppm. Normalized collision energy (NCE) for HCD was set to 20, 25 and 30%, and the precursor intensity threshold was set to 8000. MS/MS spectra were recorded with a resolution of 15,000, using a normalized AGC target of 100% and a maximum accumulation time of 100 ms.
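For readers who wish to reproduce or plot the gradient, the program above can be written as a small lookup function. This is a sketch that assumes linear ramps within each stated segment and an immediate return to 1% B at 45 min; neither detail is specified in the text, and the names are illustrative.

```python
# (start_min, end_min, %B at start, %B at end), taken from the gradient description.
GRADIENT_SEGMENTS = [
    (0.0, 4.5, 1.0, 1.0),
    (4.5, 5.5, 1.0, 9.0),
    (5.5, 30.0, 9.0, 20.0),
    (30.0, 41.5, 20.0, 35.0),
    (41.5, 45.0, 35.0, 65.0),
    (45.0, 55.0, 1.0, 1.0),  # re-equilibration (assumed to start as a step)
]


def percent_b(t_min):
    """Return the programmed %B at time t_min, assuming linear ramps."""
    if not 0.0 <= t_min <= GRADIENT_SEGMENTS[-1][1]:
        raise ValueError("time outside the 0-55 min method")
    for start, end, b_start, b_end in GRADIENT_SEGMENTS:
        if start <= t_min < end:
            return b_start + (b_end - b_start) * (t_min - start) / (end - start)
    return GRADIENT_SEGMENTS[-1][3]  # t_min equals the end of the method
```

For example, `percent_b(5.0)` falls halfway through the 1-9% B ramp and therefore evaluates to 5% B.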

Data Analysis
For the MS/MS-based profiling, MS2 data (in the .raw file format) were extracted, refined (i.e. precursor mass and charge state; no scan merging), and converted into the .mgf file format using PEAKS X Pro Studio 10.6 (build 20201221) or converted into .mgf files using the "vendor"-specific peak-picking algorithm implemented in MSConvert (version 3. custom-made perl-script ("MS2OxoPlot.pl"), binned according to retention time (bin size = 10 s), counted and visualized using R.
MS1 data were extracted, charge-deconvoluted, and deisotoped using Decon2 47 with custom parameters adjusted for N-glycomics data (Supplementary data). In a next step, we assigned the precursor mass and intensity values to mass bins (i.e. in the range of 1000-5000 Da; bin width = ± 0.05 amu), summed the individual precursor mass-to-intensity values across the entire chromatographic time range (i.e. retention time 0-50 min) and eventually removed signals below a cumulative intensity threshold of 5E+6, using custom code.
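The binning and thresholding step can be sketched in Python as follows. This is not the authors' custom code: the input format (tuples of deconvoluted mass, retention time and intensity) and the placement of bin centres on multiples of 0.1 Da are assumptions made for illustration, while the mass range, time range, bin width and intensity threshold are those stated above.

```python
from collections import defaultdict


def bin_ms1_features(
    features,
    mass_range=(1000.0, 5000.0),   # Da
    half_width=0.05,               # bin width = +/- 0.05 amu
    rt_range_min=(0.0, 50.0),      # minutes
    min_total_intensity=5e6,       # cumulative intensity threshold
):
    """Sum intensities per mass bin and drop bins below the threshold.

    `features` is an iterable of (mass_da, rt_min, intensity) tuples.
    Returns {bin_centre_da: summed_intensity}.
    """
    bin_width = 2 * half_width
    totals = defaultdict(float)
    for mass, rt, intensity in features:
        if not mass_range[0] <= mass <= mass_range[1]:
            continue
        if not rt_range_min[0] <= rt <= rt_range_min[1]:
            continue
        # Assumed convention: bins are centred on multiples of the bin width.
        totals[round(mass / bin_width)] += intensity
    return {
        round(index * bin_width, 4): total
        for index, total in sorted(totals.items())
        if total >= min_total_intensity
    }
```

Two signals of the same species observed at different retention times, for instance at 1234.56 Da and 1234.58 Da, fall into the same bin and are summed before the threshold is applied, so a glycan that elutes as several isomer peaks is retained if its cumulative signal is sufficient.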
Raw MS/MS data were refined (i.e. mono-isotopic peak picking and charge state assignment) and converted into the generic .mgf file format using PEAKS 48. Analogous to MS1 data processing, the refined precursor information was deconvoluted and assigned into mass bins (bin width = ± 0.05 amu). From this, to automatically identify precursors that derived from N-glycans and to stringently control for unintended precursor ion co-isolation events, we calculated mass-bin-specific score values (SNOG-score) from the intensity value of an N- (and O-) glycan-specific fragment ion (i.e. oxonium ion of the reduced-end monosaccharide GlcNAc; [M+H]+ = 224.1 amu) using custom code. To remove signals derived from non-N-glycan-associated structures, all mass bins with a SNOG-score lower than 0.03 were rejected.
Figure legend (fragment): Symbol Nomenclature for Glycans (SNFG); "WAT": white adipose tissue.
9.E). This entirely different construction of the Neu5Gc N-sialome compared to its Neu5Ac-terminated counterpart raised the question of the origin of these differently sialylated structures. The elution profile of the mixed Neu5Ac/Neu5Gc composition (Hex5HexNAc4Fuc1Neu5Ac1Neu5Gc1) closely resembled the elution profile of the Neu5Ac/Neu5Ac-sialylated structures in all tissues (Fig. 6.A), suggesting a shared origin for these structures. By contrast, the structural profile of entirely Neu5Gc-sialylated structures (i.e. Hex5HexNAc4Fuc1Neu5Gc2) markedly deviated from the Neu5Ac/Neu5Gc and Neu5Ac/Neu5Ac patterns. Notably, the elution profile of Neu5Gc/Neu5Gc structures exhibited minimal variation across tissues and closely resembled the serum elution profile, suggesting that these structures were actually derived from (contaminant) serum glycoproteins.
