(Hex)2HexNAc
Many isomers of (Hex)2HexNAc are covered by the m/z 568 curve in Fig. 1(b), but they were not well separated from each other. The eluents from the amide-80 column were therefore collected every 30 seconds, and the collected fractions were concentrated and individually injected into an HPLC with a porous graphitic carbon (PGC) column for further separation. The chromatograms of (Hex)2HexNAc obtained using the PGC column are illustrated in Fig. 2(a)–(h). Because the isomers were partially separated by the amide-80 column, the isomer distribution in the front part of the (Hex)2HexNAc curve in Fig. 1(b) differs from that in the rear part of the same curve, as shown by the changes in the relative abundances of the isomers in the chromatograms of Fig. 2(a)–(h). Notably, we used intact oligosaccharides (i.e., no reduction at the reducing end); therefore, one isomer may appear as two peaks in a chromatogram if the α and β anomers of the sugar at the reducing end were separated by the PGC column. Although the relative abundance of a given isomer changed among these chromatograms, the relative intensities of its α and β anomers remained the same because the two anomers reached equilibrium in solution through mutarotation. Consequently, a comparison of the relative intensities in these chromatograms gave a rough indication of which two peaks belong to the same isomer. For example, the peaks at retention times t = 39.1 and 47.6 min in Fig. 2(a) and 2(b) may belong to one isomer, and the peaks at t = 27.0 and 32.6 min in Fig. 2(b)–(d) may belong to another isomer.
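The pairing argument above can be sketched in a few lines of Python. This is only an illustration of the reasoning, not the procedure used in this work: the function name, the tolerance, and all peak areas below are hypothetical, and only the four retention times are taken from the text. Two peaks are treated as candidate anomers of one isomer when they appear in the same fractions and their area ratio stays roughly constant across those fractions.

```python
from itertools import combinations


def pair_anomer_candidates(areas, rel_tol=0.15):
    """Return pairs of retention times whose peak-area ratio is roughly constant.

    areas maps a retention time (min) to a list of peak areas, one per
    amide-80 fraction (0.0 when the peak is absent from that fraction).
    rel_tol is an arbitrary illustrative tolerance on the ratio.
    """
    pairs = []
    for t1, t2 in combinations(sorted(areas), 2):
        a_series, b_series = areas[t1], areas[t2]
        # Anomers of one isomer must be present (or absent) in the same fractions.
        if any((a > 0) != (b > 0) for a, b in zip(a_series, b_series)):
            continue
        ratios = [a / b for a, b in zip(a_series, b_series) if b > 0]
        if len(ratios) < 2:
            continue  # a single fraction cannot show that the ratio is constant
        mean_ratio = sum(ratios) / len(ratios)
        if all(abs(r - mean_ratio) <= rel_tol * mean_ratio for r in ratios):
            pairs.append((t1, t2))
    return pairs


# Invented peak areas for four consecutive fractions; not measured data.
example_areas = {
    39.1: [100.0, 60.0, 0.0, 0.0],
    47.6: [50.0, 30.0, 0.0, 0.0],
    27.0: [0.0, 40.0, 80.0, 30.0],
    32.6: [0.0, 20.0, 41.0, 15.0],
}
print(pair_anomer_candidates(example_areas))
```

Such a check can only suggest candidate pairs; the reinjection experiment described next is what confirms them.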
To confirm which two peaks belong to one isomer, eluents from the PGC column were collected in fractions every 30 seconds, ensuring that different peaks in Fig. 2(a)–(h) were collected into separate fractions. These eluents were stored at room temperature for several hours before being reinjected separately into the same PGC column. If two peaks in Fig. 2(a)–(h) are from the same isomer, reinjecting either fraction into the same PGC column shows both peaks again in the chromatogram, even though each fraction initially contained only one anomer. This is because the α and β anomers of the same isomer interconvert and reach equilibrium through mutarotation during storage.
Eluents from the amide-80 column in tubes 12–15 [Figure 2(a)–(d)] had similar isomer distributions, so they were combined and injected into the HPLC with a PGC column for further isomer separation. The chromatogram is illustrated in Fig. 2(i). The eluents collected from the PGC column at different retention times in Fig. 2(i) were then reinjected into the same PGC column. Some of the resulting chromatograms are illustrated in Fig. 2(j)–(n). Each chromatogram in Fig. 2(j)–(n) represents a pure isomer collected from the PGC column at a different retention time. For example, the isomer shown in Fig. 2(j) was collected at the retention time 27.0–27.5 min in Fig. 2(i). Although only the peak at t = 27.3 min was collected initially, the chromatogram in Fig. 2(j) shows two peaks, at t = 27.3 and 32.9 min, indicating that these two peaks belong to one isomer.
The isomer shown in Fig. 2(n), which was collected at the retention time t = 40.5–41.0 min, was contaminated by the isomer in Fig. 2(m) because one peak of the latter has a similar retention time. To purify the isomer shown in Fig. 2(n), the eluent collected at t = 40.5–41.0 min was repeatedly injected into the PGC column and recollected at t = 40.5–41.0 min until no contamination was found. The purified isomers were then individually introduced into a linear ion trap mass spectrometer for structural determination.
The structures of oligosaccharides were determined using two methods. The first method involved comparing them to the oligosaccharides extracted from human, bovine, and caprine milk in our previous study65. In this method, the structures of oligosaccharides extracted from caprine colostrum were determined using the following three criteria. (1) Comparison of the chromatogram retention times of the selected m/z value to those of the oligosaccharides extracted from human, bovine, and caprine milk. The retention times of both the α and β anomers of a given isomer must match those of the reference oligosaccharides. (2) Comparison of one MS2 and two MS3 mass spectra at the corresponding chromatogram retention times to those of the oligosaccharides extracted from human, bovine, and caprine milk. (3) Comparison of the relative intensities of the α and β anomers of a given isomer to those of the oligosaccharides extracted from human, bovine, and caprine milk. Structures were assigned only if all three criteria were satisfied simultaneously.
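The three criteria amount to a logical AND over independent checks, which can be sketched as follows. This is a hedged illustration only: the record layout, the tolerances, and the score threshold are all assumptions introduced here, since the text does not state numerical acceptance limits, and the spectral scores are taken as precomputed inputs.

```python
from dataclasses import dataclass


@dataclass
class IsomerRecord:
    rt_first: float      # retention time of the earlier-eluting anomer peak (min)
    rt_second: float     # retention time of the later-eluting anomer peak (min)
    anomer_ratio: float  # intensity of the first peak divided by the second


def satisfies_three_criteria(observed, reference, spectral_scores,
                             rt_tol=0.3, ratio_tol=0.2, score_min=0.9):
    """Return True only if all three criteria hold at the same time.

    spectral_scores holds three precomputed similarity scores (one MS2 and
    two MS3 spectra). All tolerances are hypothetical placeholders.
    """
    # Criterion 1: both anomer retention times must match the reference.
    rt_ok = (abs(observed.rt_first - reference.rt_first) <= rt_tol
             and abs(observed.rt_second - reference.rt_second) <= rt_tol)
    # Criterion 2: one MS2 and two MS3 spectra must match.
    spectra_ok = (len(spectral_scores) == 3
                  and all(score >= score_min for score in spectral_scores))
    # Criterion 3: the relative intensities of the two anomers must match.
    ratio_ok = (abs(observed.anomer_ratio - reference.anomer_ratio)
                <= ratio_tol * reference.anomer_ratio)
    return rt_ok and spectra_ok and ratio_ok


# Invented values for illustration; not measured data.
obs = IsomerRecord(rt_first=27.3, rt_second=32.9, anomer_ratio=1.9)
ref = IsomerRecord(rt_first=27.2, rt_second=33.0, anomer_ratio=2.0)
print(satisfies_three_criteria(obs, ref, [0.97, 0.95, 0.93]))
```

A peak that fails any one check would be passed on to the second method rather than assigned by comparison.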
The second method for structural determination is LODES/MSn. If the aforementioned three criteria were not satisfied, the peaks in the chromatogram represent oligosaccharides not reported in our previous work, and their structures were determined using LODES/MSn. LODES/MSn involves sequential collision-induced dissociation (CID) of oligosaccharide sodium (or lithium) adducts in an ion trap mass spectrometer50–55. The fragments selected for the next stage of CID in the MSn sequences are derived from carbohydrate dissociation mechanisms56–60. The dissociation mechanisms of oligosaccharide sodium ion adducts used in this study are summarized in the following three propensity rules.
(1) Dehydration mainly takes place at the reducing end of oligosaccharides.
(2) Cross-ring dissociation mainly takes place at the reducing end of oligosaccharides and follows the rules of retro-aldol reaction. Fragmentation patterns of retro-aldol reaction are used to determine the linkage position of the sugar at the reducing end. Details of fragmentation patterns are illustrated in Figure S1 and S2 of the Supplementary Information.
(3) Cleavages of the glycosidic bond can occur at any glycosidic bond (i.e., not limited to the reducing end).
The dissociation mechanisms of lithium ion adducts are similar to those of sodium ion adducts. However, cross-ring dissociation and dehydration occurring at the nonreducing end cannot be neglected. Therefore, in some cases, the O1 atom of the monosaccharide at the reducing end had to be labeled with 18O when lithium ion adducts were used for structural determination.
Here, we use the trisaccharide Galα-(1–3)-Galβ-(1–4)-GlcNAc [Figure 2(k)] as an example to illustrate how a structure was determined using LODES/MSn. The mass spectrum on the left side of Fig. 3(a) shows the fragments produced from CID of the precursor ion m/z 568 [sodium adduct of (Hex)2HexNAc]. The loss of a neutral of m = 101 from the precursor ion, resulting in the fragment ion m/z 467, represents the cross-ring dissociation of a HexNAc at the reducing end (rule 2). According to the retro-aldol reaction (refer to Figures S1–S3 of the Supplementary Information), the glycosidic bond linkage(s) between Hex and HexNAc can be 1–4 or 1–6 for linear trisaccharides, or (1–4, 1–6) for branched trisaccharides. The fragment ions m/z 365 [sodium ion adduct of (Hex)2] and m/z 347 [sodium ion adduct of (Hex)2-H2O] found in the same CID spectrum suggest that the two hexoses are connected to each other. Therefore, the trisaccharide must be linear, i.e., Hex-Hex-HexNAc, with HexNAc located at the reducing end. The corresponding CID sequence and the structures of the fragments are illustrated in the middle of Fig. 3(a), and the possible precursor structures derived from these fragments are illustrated on the right side of Fig. 3(a).
In the next step, the high intensity of the fragment ion m/z 275 (compared to ions m/z 305 and 245) found in the CID sequence 568 (sodium ion adduct of precursor)→365 [sodium ion adduct of (Hex)2]→fragments [left side of Fig. 3(b)] indicated that the linkage between the two hexoses is 1–3, according to the retro-aldol reaction (refer to Figures S1 and S2 of the Supplementary Information). In the third step, comparison of the CID spectrum 568→244 (sodium ion adduct of HexNAc)→fragments [Figure 3(c)] to the HexNAc monosaccharide database (refer to Figure S4 of the Supplementary Information) suggested that the HexNAc is GlcNAc.
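The nominal m/z values quoted in these steps can be checked with a short calculation. The sketch below is an illustration added for the reader and is not part of the LODES/MSn procedure itself: the function names are hypothetical, the residue masses are standard monoisotopic values rounded to four decimals, and the cross-ring losses of 60, 90, and 120 are simply the mass differences between m/z 365 and the ions m/z 305, 275, and 245 named above.

```python
# Monoisotopic masses (Da), rounded to four decimals.
RESIDUE_MASS = {"Hex": 162.0528, "HexNAc": 203.0794, "Fuc": 146.0579}
WATER = 18.0106
SODIUM_ION = 22.9892


def sodium_adduct_mz(composition, water_loss=0):
    """m/z of the singly charged sodium adduct of an intact oligosaccharide.

    composition maps residue names to counts; water_loss removes that many
    H2O molecules, as in the (Hex)2-H2O fragment.
    """
    neutral = sum(RESIDUE_MASS[name] * count for name, count in composition.items())
    neutral += WATER  # one water completes the free reducing-end sugar
    return neutral - water_loss * WATER + SODIUM_ION


def nominal(mz):
    """Round to the nominal (integer) m/z used in the text."""
    return int(round(mz))


def cross_ring_candidates(parent_nominal, losses=(60, 90, 120)):
    """Nominal m/z of fragments formed by the listed neutral losses."""
    return {loss: parent_nominal - loss for loss in losses}


precursor = nominal(sodium_adduct_mz({"Hex": 2, "HexNAc": 1}))
print(precursor)                                                  # 568
print(precursor - 101)                                            # 467
print(nominal(sodium_adduct_mz({"Hex": 2})))                      # 365
print(nominal(sodium_adduct_mz({"Hex": 2}, water_loss=1)))        # 347
print(nominal(sodium_adduct_mz({"HexNAc": 1})))                   # 244
print(cross_ring_candidates(365))              # {60: 305, 90: 275, 120: 245}
```

Which of the three cross-ring fragments dominates is what carries the linkage information; the arithmetic only identifies where to look in the spectrum.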
To determine the stereoisomer of each hexose, the hexose lithium ion adducts (ions m/z 187) produced from the following two CID sequences were used in the subsequent CID spectrum measurement: 552 (lithium ion adduct of precursor)→331 [lithium ion adduct of (Hex)2-H2O]→187, and 552 (lithium ion adduct of precursor)→390 (lithium ion adduct of Hex-HexNAc)→187. The ions m/z 187 produced from these two sequences represent the hexose at the terminal nonreducing end and at the center, respectively. The structures of these two hexoses were determined by comparing their CID spectra [left side of Fig. 3(d) and 3(e)] to the monosaccharide database in Figure S5 of the Supplementary Information. The spectrum similarities shown on the right side of Fig. 3(d) and (e) suggest that the hexose at the terminal nonreducing end is Galα and the hexose at the center is Galβ, because these have the highest similarity scores. In the final step, the CID sequence 568 (sodium ion adduct of precursor)→406 (sodium ion adduct of Hex-HexNAc)→fragments was studied. The intensity of the fragment ion m/z 328 (loss of m = 78) was much larger than that of the fragment ion m/z 329 (loss of m = 77) in the CID spectrum [left side of Fig. 3(f)]. Comparison to the CID spectra of the disaccharides Gal-β-(1–4)-GlcNAc and Gal-β-(1–6)-GlcNAc (Figure S6 of the Supplementary Information) suggests that the linkage between Gal and GlcNAc is 1–4. Consequently, the entire trisaccharide was determined to be Galα-(1–3)-Galβ-(1–4)-GlcNAc.
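The database comparison can be pictured as ranking reference spectra by a similarity score. The text does not define the score used in Fig. 3, so the sketch below assumes a cosine similarity purely for illustration; the function names, the fragment m/z values, and all intensities are invented and do not come from the monosaccharide database of Figure S5.

```python
import math


def cosine_similarity(spec_a, spec_b):
    """Cosine similarity of two spectra given as {m/z: intensity} dicts."""
    keys = set(spec_a) | set(spec_b)
    dot = sum(spec_a.get(k, 0.0) * spec_b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in spec_a.values()))
    norm_b = math.sqrt(sum(v * v for v in spec_b.values()))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_candidates(query, database):
    """Return (name, score) pairs sorted from most to least similar."""
    scored = [(name, cosine_similarity(query, ref)) for name, ref in database.items()]
    return sorted(scored, key=lambda item: item[1], reverse=True)


# Invented reference spectra of hexose lithium adducts (precursor m/z 187).
toy_database = {
    "Galα": {169: 100.0, 127: 40.0, 97: 10.0},
    "Galβ": {169: 30.0, 127: 100.0, 97: 20.0},
    "Glcα": {169: 10.0, 127: 20.0, 97: 100.0},
}
# Invented query spectrum with the same pattern as the "Galα" entry.
query = {169: 50.0, 127: 20.0, 97: 5.0}

for name, score in rank_candidates(query, toy_database):
    print(f"{name}: {score:.3f}")
```

The candidate with the highest score is taken as the assignment, which mirrors how the highest similarity score is used in the text.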
The structures of the other purified isomers, illustrated in Fig. 2(j)–(n), (p), were determined using LODES/MSn in a manner analogous to the aforementioned example. The CID spectra used for the structural determination are presented in Figures S11–S15 of the Supplementary Information. The remaining isomers, for which the retention times, MS2 and MS3 spectra, and relative intensities of anomers were identical to those of oligosaccharides found in bovine milk and human milk in our previous study, were assigned by comparison to those previously determined structures.
Oligosaccharides in caprine colostrum and mature milk have been studied extensively in previous research40, 41, 45, 47, 48, 65–77. Among the isomers of (Hex)2HexNAc, two isomers, GalNAcα-(1–3)-Galβ-(1–4)-Glc and GlcNAcβ-(1–3)-Galβ-(1–4)-Glc, have been reported before. Three isomers, GalNAcα-(1–3)-Galβ-(1–4)-Glc, GalNAcβ-(1–3)-Galβ-(1–4)-Glc, and Galβ-(1–4)-[GalNAcβ-(1–2)]-Glc, were reported in our previous study, and they were also found in this study. Additionally, six isomers were newly discovered in this study: Manα-(1–6)-Manβ-(1–4)-GlcNAc, Galα-(1–3)-Galβ-(1–4)-GlcNAc, GalNAcα-(1–4)-Glcβ-(1–4)-Glc, GalNAcβ-(1–3)-Glcβ-(1–4)-Glc, Galβ-(1–3)-Glcβ-(1–4)-GlcNAc, and Galβ-(1–4)-Glcβ-(1–4)-GlcNAc. The isomer Manα-(1–6)-Manβ-(1–4)-GlcNAc may be produced from the degradation of N-glycans, while the other isomers cannot be explained by biosynthesis from lactose or by degradation of larger oligosaccharides.
In Fig. 2(e)–(g), there are some peaks at retention times of 36.8, 37.2, and 40.6 min. However, the intensities of these peaks are very small and/or their separation from other isomers is difficult, so the structures corresponding to these peaks were not determined. Therefore, the total number of (Hex)2HexNAc isomers is greater than the number we have identified.
The comparison of (Hex)2HexNAc isomers in colostrum and mature milk is illustrated in Fig. 4. In general, the structural diversity of oligosaccharides in colostrum is greater than that in mature milk. Many oligosaccharides that do not have lactose at the reducing end are more abundant in colostrum than in mature milk.
(Hex)3
The purification and structural determination of (Hex)3 isomers are similar to those of (Hex)2HexNAc. Figure 5(a)–(h) shows the PGC chromatograms of isomers collected from the eluents of the amide-80 column at different retention times. Most of the isomers can be identified by comparing their retention times, MS2 and MS3 spectra, and relative intensities of anomers to those of the oligosaccharides identified in bovine and human milk in our previous study. The isomers that could not be identified in this way were purified by separating them from the other isomers. The pure isomers collected from the PGC column at different retention times were reinjected into the PGC column separately to check their purity. The chromatograms of these isomers are illustrated in Fig. 5(i)–(m). The structures of these isomers were then identified using LODES/MSn.
Here we use the isomer shown in Fig. 5(l) as an example to demonstrate the structural determination. The mass spectrum on the left side of Fig. 6(a) shows the fragments produced from CID of the precursor ion m/z 527 [sodium adduct of (Hex)3]. The loss of a neutral of m = 60 from the precursor ion, resulting in the fragment ion m/z 467, represents the cross-ring dissociation of a Hex at the reducing end (rule 2). According to the retro-aldol reaction (Figures S1 and S2 of the Supplementary Information), the glycosidic bond linkage(s) between the Hex at the reducing end and the other Hex can be 1–4 for a linear trisaccharide, or (1–4, 1–6) for a branched trisaccharide. The CID sequence and structures of the fragments are illustrated in the middle of Fig. 6(a), while the possible precursor structures derived from these fragments are illustrated on the right side of Fig. 6(a). The fragment ion m/z 347 [sodium ion adduct of (Hex)2-H2O] in the CID spectrum of 527 (sodium ion adduct of precursor)→467 (cross-ring dissociation from the Hex at the reducing end)→fragments [left side of Fig. 6(b)] suggests that the two hexoses not at the reducing end are connected to each other. Therefore, the trisaccharide must be linear with a 1–4 linkage at the reducing end, i.e., Hex-Hex-(1–4)-Hex. The ion m/z 275 found in the CID sequence 527 (sodium ion adduct of precursor)→467 (cross-ring dissociation from the Hex at the reducing end)→365 [sodium ion adduct of (Hex)2 at the nonreducing end]→fragments [left side of Fig. 6(c)] indicated that the linkage between the two hexoses at the nonreducing end is 1–3, according to the retro-aldol reaction. Therefore, the linear trisaccharide is Hex-(1–3)-Hex-(1–4)-Hex. To determine the stereoisomer of each hexose, the lithium ion adduct of the 18O-labeled trisaccharide was used in CID.
The hexose lithium ion adducts, ions m/z 187 or 189, produced through the CID sequences 513→451→331→187, 513→351→187, and 513→351→189 represent the hexose at the terminal nonreducing end, center, and reducing end, respectively. The CID spectra of these hexoses [Figure 6(d), 6(e), 6(f)] were compared to the CID spectra of the monosaccharide database provided in Figure S5 of the Supplementary Information to determine the stereoisomers of these hexoses. The spectrum similarities shown in Fig. 6 indicate that Galβ, Glcβ, and Glc have the highest similarity scores for the hexose at the terminal nonreducing end, center, and reducing end, respectively. Consequently, the entire trisaccharide was determined to be Galβ-(1–3)-Glcβ-(1–4)-Glc.
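The 2-unit shift between m/z 187 and 189 is what allows the reducing-end hexose to be told apart from the others once O1 carries the 18O label. The sketch below reproduces the nominal m/z values of the lithium adducts named above; it is an illustration only, with a hypothetical function name and standard monoisotopic masses rounded to four decimals.

```python
# Monoisotopic masses (Da), rounded to four decimals.
RESIDUE_MASS = {"Hex": 162.0528, "HexNAc": 203.0794, "Fuc": 146.0579}
WATER = 18.0106
LITHIUM_ION = 7.0155       # 7Li+
O18_MINUS_O16 = 2.0042     # mass shift from replacing one 16O with 18O


def lithium_adduct_mz(composition, labelled=False, water_loss=0):
    """Nominal m/z of a singly charged lithium adduct.

    labelled=True means the species retains the 18O-labeled O1 of the
    reducing-end sugar; water_loss removes that many H2O molecules.
    """
    mass = sum(RESIDUE_MASS[name] * count for name, count in composition.items())
    mass += WATER - water_loss * WATER + LITHIUM_ION
    if labelled:
        mass += O18_MINUS_O16
    return int(round(mass))


print(lithium_adduct_mz({"Hex": 3}, labelled=True))    # 513, labeled precursor
print(lithium_adduct_mz({"Hex": 2}, labelled=True))    # 351, disaccharide keeping the label
print(lithium_adduct_mz({"Hex": 2}, water_loss=1))     # 331, unlabeled (Hex)2-H2O
print(lithium_adduct_mz({"Hex": 1}))                   # 187, unlabeled hexose
print(lithium_adduct_mz({"Hex": 1}, labelled=True))    # 189, reducing-end hexose
```

A hexose fragment observed at m/z 189 therefore still contains the labeled O1, whereas one at m/z 187 does not.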
The CID spectra used for the structural determination of the purified isomers in Fig. 5(j), (l), (m) are presented in Figures S16–S19 of the Supplementary Information. For the remaining isomers, for which the retention times, MS2 and MS3 spectra, and relative intensities of anomers matched those of oligosaccharides found in bovine milk and human milk in our previous study, the structures were determined based on our earlier assignments.
Eleven isomers of (Hex)3 were found in caprine colostrum in this work. Among these isomers, Galα-(1–3)-Galβ-(1–4)-Glc, Galβ-(1–3)-Galβ-(1–4)-Glc, and Galβ-(1–6)-Galβ-(1–4)-Glc have been reported before41, and Galβ-(1–4)-Glcβ-(1–4)-Glc, Galβ-(1–4)-[Galβ-(1–2)]-Glc, and Galβ-(1–4)-[Glcα-(1–2)]-Glc were reported in our previous study65. Three isomers, Galβ-(1–3)-Glcβ-(1–4)-Glc, Galβ-(1–6)-Glcβ-(1–4)-Glc, and Galβ-(1–6)-[Hex-(1–4)]-Glc, as well as a nonreducing trisaccharide, Galβ-(1–4)-Glcβ-(1–1)-Gal, were newly found in this study. The stereoisomers and anomericities of another isomer, Hex-(1–6)-Hex-(1–4)-Hex, at retention times 25.9 and 27.2 min [Figure 5(a)], were not determined. Peaks labeled by question marks in Fig. 5 represent structures that remain unidentified. These structures are difficult to determine because of their low abundance and the difficulty of separating them from other isomers.
The comparison of (Hex)3 isomers in colostrum and mature milk is illustrated in Fig. 7. The structural diversity of oligosaccharides in colostrum exceeds that of mature milk, with many oligosaccharides lacking lactose at the reducing end being more abundant in colostrum than in mature milk.
(Hex)2Fuc
The purification and structural determination of (Hex)2Fuc are similar to those of (Hex)2HexNAc and (Hex)3. Figure 8(a)–(f) shows the PGC chromatograms of isomers collected from the eluents of the amide-80 column at different retention times. Tube 15 contains only one isomer [Figure 8(c)]. The isomers at retention times 25.2 and 26.0 min [Figure 8(d)] and at 13.2 and 14.4 min [Figure 8(f)] were separated from other isomers through fraction collection from the PGC column. The pure isomers collected at different retention times were reinjected separately into the PGC column to check their purity. Chromatograms of these isomers are illustrated in Fig. 8(g) and (h). Many isomers elute at retention times of 29–42 min [Figure 8(a) and 8(b)]. Figure 8(i)–(o) shows the PGC chromatograms of these isomers, obtained from fractions of Fig. 8(b) collected every two minutes. Only one isomer was successfully purified [Figure 8(l)]; the other isomers either have intensities that are too small or are too difficult to separate from the others. The purified isomer was then identified using LODES/MSn.
The isomer presented in Fig. 8(g) serves as an example to demonstrate the structural determination process. The mass spectrum on the left side of Fig. 9(a) shows the fragments generated from CID of the precursor ion m/z 511 [sodium adduct of (Hex)2Fuc]; the CID sequence and the structures of the fragments are illustrated in the middle. The loss of a neutral of m = 60 from the precursor ion yields the fragment ion m/z 451, indicating cross-ring dissociation of a Hex or Fuc at the reducing end, according to the retro-aldol reaction. This CID spectrum suggests two possible structures: a linear trisaccharide in which the reducing-end sugar (Hex or Fuc) is connected to the other sugars through a 1–4 linkage, or a branched trisaccharide with (1–4, 1–6) linkages and a Hex at the reducing end. Since the fragment m/z 365 indicates two connected hexoses, the possible precursor structures are those shown on the right side of Fig. 9(a).
In the CID spectrum of 511 (sodium ion adduct of precursor)→451 (cross-ring dissociation from the Hex or Fuc at the reducing end)→fragments [left side of Fig. 9(b)], fragment ions such as m/z 365 [sodium ion adduct of (Hex)2], m/z 347 [sodium ion adduct of (Hex)2-H2O], m/z 349 (sodium ion adduct of HexFuc), and m/z 331 [sodium ion adduct of (HexFuc)-H2O] were not found, indicating that the trisaccharide is branched. Therefore, the trisaccharide must be Hex-(1–4 or 1–6)-[Fuc-(1–4 or 1–6)]-Hex. The intensities of the fragment ions m/z 289, 259, and 229, in the ratio of 5:3:1, found in the CID spectrum through the sequence 511 (sodium ion adduct of precursor)→349 (sodium ion adduct of disaccharide Fuc-Hex)→fragments [left side of Fig. 9(c)] indicated that the Fuc-Hex linkage is 1–6, according to the retro-aldol reaction. The intensity of the fragment ion m/z 305 found in the CID spectrum through the sequence 511 (sodium ion adduct of precursor)→365 (sodium ion adduct of disaccharide Hex-Hex)→fragments [left side of Fig. 9(d)] indicated that the Hex-Hex linkage is 1–4, according to the retro-aldol reaction. Therefore, the branched trisaccharide is Hex-(1–4)-[Fuc-(1–6)]-Hex. The branched structure can be cross-checked using the lithium ion adduct of the trisaccharide labeled with 18O at the reducing end. The fragment ions m/z 351 (lithium ion adduct of 18O-labeled Hex-Hex) and m/z 335 (lithium ion adduct of 18O-labeled Fuc-Hex) found in the CID spectrum 497→fragments in Fig. 9(e) suggest that the trisaccharide is branched.
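The linear-versus-branched argument reduces to asking whether any intact nonreducing-end disaccharide ion survives after the reducing-end cross-ring loss. A minimal sketch of that test is given below; the function name, the m/z tolerance, and the peak lists are hypothetical, and a real spectrum would also need an intensity threshold to separate true peaks from noise.

```python
def classify_topology(observed_mz, diagnostic_mz, tol=0.5):
    """Classify a trisaccharide from an MS3 peak list.

    observed_mz: m/z values found after the reducing-end cross-ring loss.
    diagnostic_mz: m/z values of ions that can only form if two
    nonreducing-end sugars are linked to each other.
    Returns ("linear", matched_ions) if any diagnostic ion is present,
    otherwise ("branched", []).
    """
    found = [d for d in diagnostic_mz
             if any(abs(m - d) <= tol for m in observed_mz)]
    return ("linear" if found else "branched"), found


# Diagnostic ions named in the text for (Hex)2Fuc: 365, 347, 349, and 331.
hex2fuc_diagnostics = [365, 347, 349, 331]

# Invented peak lists for illustration; not measured data.
print(classify_topology([305.1, 289.1], hex2fuc_diagnostics))
print(classify_topology([347.1, 305.1], [347]))
```

The first call returns the branched assignment because none of the four diagnostic ions is present, and the second returns the linear assignment, as in the (Hex)3 example where m/z 347 was observed.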
To determine the stereoisomer of each hexose, the lithium ion adduct of the 18O-labeled trisaccharide was used in CID. The hexose lithium ion adducts, ions m/z 187 or 189, produced through the CID sequences 497→351→187 and 497→351→189 represent the hexose at the nonreducing end and the reducing end, respectively. The CID spectra of these hexoses [Figure 9(f) and 9(g)] were compared to the CID spectra of the monosaccharide database provided in Figure S5 of the Supplementary Information to determine the stereoisomers of these hexoses. The spectrum similarities shown in Fig. 9(f) and (g) indicate that they are Galβ and Glc, respectively. Consequently, the entire trisaccharide was determined to be Galβ-(1–4)-[Fuc-(1–6)]-Glc. At present, we do not have CID spectra of the disaccharides Fucα-(1–6)-Glc and Fucβ-(1–6)-Glc for comparison, so the anomeric configuration of the glycosidic bond between Fuc and Glc was not determined. The CID spectra for structural determination of the other isomers are illustrated in Figures S20–S22 of the Supplementary Information.
Among the isomers of (Hex)2Fuc, Fucα-(1–2)-Galβ-(1–4)-Glc and Galβ-(1–4)-[Fucα-(1–3)]-Glc in caprine colostrum have been reported before41, while Fucα-(1–6)-Galβ-(1–4)-Glc and Galβ-(1–4)-[Fuc-(1–6)]-Glc were newly found in this study. At retention times of 32–42 min, there are at least four isomers of (Hex)2Fuc, but only one of them was structurally identified. The other isomers are either too low in abundance or too difficult to separate from other isomers, and their structures remain unidentified.
Figure 10 shows the comparison of oligosaccharides in colostrum and mature milk. As with (Hex)2HexNAc and (Hex)3, the structural diversity of oligosaccharides in colostrum is greater than that in mature milk.
In conclusion, we demonstrated the application of a new mass spectrometry method, logically derived sequence tandem mass spectrometry (LODES/MSn), to the structural determination of neutral trisaccharides extracted from caprine colostrum and mature milk. Because this method does not rely on oligosaccharide standards, it is particularly useful for identifying undiscovered oligosaccharides. New oligosaccharides were found in caprine colostrum, many of which lack lactose at the reducing end. Instead, they feature Glcβ-(1–4)-Glc or Glcβ-(1–4)-GlcNAc at the reducing end, indicating the existence of undiscovered biosynthetic pathways. These unusual oligosaccharides are more abundant in colostrum than in mature milk.