Confirmation of N-Glycosylation Site
CID is the most widely used fragmentation mode in the MS characterization of therapeutic proteins. Molecular ions are accelerated by an electric potential in the vacuum region to high kinetic energy and then collide with inert gas molecules such as helium, nitrogen, or argon, which converts kinetic energy into internal vibrational energy. When the vibrational energy exceeds a certain threshold, the peptide backbone is most likely to break at its lowest-energy cleavage sites, producing b- and y-ions. In a glycopeptide, the glycan is attached to the peptide through a glycosidic bond, which is weaker than the peptide bond. The glycosidic bond is therefore cleaved preferentially in CID, which reduces the fragmentation efficiency of the peptide backbone. As a result, the MS/MS sequence coverage of glycopeptides is low, and accurate confirmation of the glycosylation site is greatly impaired. In EAD, by contrast, low-energy free electrons interact with multiply protonated proteins or peptides. The energy released during this interaction causes near-instantaneous fragmentation and mainly produces abundant c- and z-ions through cleavage of the N-Cα bond in the peptide backbone. EAD can reduce the dependence on proteolysis and can cleave long peptides or intact proteins with higher charge states. The MS/MS sequence coverage of glycopeptides is higher, and many modifications, such as phosphorylation, N- and O-glycosylation, and sulfonation, are effectively preserved on the peptides or proteins. EAD can therefore compensate for the limitations of CID in glycosylation site analysis.
To determine the better fragmentation mode for glycopeptide analysis, digested rhEPO_1 was analyzed with both CID and EAD fragmentation. Fig. 1 compares the MS/MS spectra of the glycopeptide AEN(24)ITTGCAE + (A4G3F1_OMe1) obtained with CID and EAD. B-ions are a series of oxonium ions, that is, fragment ions formed by partial fragmentation of the glycan released from the peptide. Y-ions are fragment ions that retain the entire peptide and part of the glycan after a glycosidic bond has been broken. The presence of these ions helps to determine how the glycan and the peptide of a glycopeptide fragment. As shown in Fig. 1A, the glycosidic bonds were broken preferentially in CID, resulting in abundant glycan fragment ions. Although many glycan oxonium ions could be identified, the MS/MS sequence coverage of the peptide was very low, and the glycosylation site could not be identified accurately. As shown in Fig. 1B, abundant c/z fragment ions of the glycopeptide AEN(24)ITTGCAE + (A4G3F1_OMe1) were obtained in EAD. The MS/MS sequence coverage of this glycopeptide was 89%. The MW difference between the c2-ion (m = 218.1 Da) and the c3-ion (m = 2729.0 Da), which includes the N24 site, was 2510.9 Da, which matched the sum of the MW of one N residue (114.0 Da) and that of A4G3F1_OMe1 (2396.9 Da). This result accurately located the glycosylation at the N24 site. Glycosylation at the N38 and N83 sites was identified in the same way (as shown in Fig. 2). In rhEPO_1, the glycosylation ratios (area percent of all identified glycopeptides among all peptides with the same N-glycosylation site, according to Eq. 1) at the N24, N38, and N83 sites were 82.70%, 100%, and 100%, respectively.
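The site assignment above rests on simple arithmetic: the mass step between two consecutive c-ions must equal the residue mass plus the mass of whatever is attached to that residue. The following is a minimal sketch of that check using the c2, c3, and glycan masses quoted in the text. The function names, the 0.1 Da tolerance, and the Asn residue mass constant (standard monoisotopic value) are assumptions introduced for illustration and are not part of the original workflow.

```python
# Sketch: does the c2 -> c3 mass step at N24 account for Asn plus the glycan?
ASN_RESIDUE = 114.04  # monoisotopic Asn residue mass in Da (standard value, not from the text)


def extra_mass(m_lower, m_upper, residue_mass):
    """Mass between two consecutive fragment ions that the bare residue does not explain."""
    return (m_upper - m_lower) - residue_mass


def matches(observed, expected, tol=0.1):
    """True if two masses agree within an (assumed) tolerance in Da."""
    return abs(observed - expected) <= tol


c2, c3 = 218.1, 2729.0  # c-ion masses quoted in the text (Da)
glycan = 2396.9         # A4G3F1_OMe1 mass quoted in the text (Da)

delta = extra_mass(c2, c3, ASN_RESIDUE)
print(f"c3 - c2 = {c3 - c2:.1f} Da, mass beyond Asn = {delta:.1f} Da")
print("glycan located on N24:", matches(delta, glycan))
```

If the step between the two ions equaled the bare residue mass instead, `extra_mass` would return a value near zero and the residue would be called unmodified.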
The use of EAD fragmentation effectively preserved the integrity of the glycans and the glycosidic bonds and produced abundant MS/MS fragment ions. This greatly increased the MS/MS sequence coverage of the glycopeptides and made the identification of glycosylation sites more accurate.
N-Glycosylation Heterogeneity Analysis
rhEPO has three N-glycosylation sites (N24, N38, and N83), each of which carries not only a variety of different glycans but also many O-methylation or O-acetylation modifications. This complexity makes in-depth characterization of rhEPO glycosylation challenging. In this study, a workflow based on EAD fragmentation in an LC-MS method was used for the glycosylation heterogeneity analysis. The O-methylation and O-acetylation modifications of N-glycans were the main modifications considered during data processing. Table 1 lists the main glycans identified at the N83 site in rhEPO_1. The glycans at the N24 and N38 sites were also analyzed but are not shown in this study. In total, 15, 10, and 12 main glycans were identified at the N24, N38, and N83 sites, respectively. More than 80% of them were tetra-antennary. The total relative content of sialylated glycopeptides (area percent of all glycopeptides with sialylation among all identified glycopeptides at one N-glycosylation site, according to Eq. 2) was about 55.14%, 99%, and 100%, respectively, which indicated that sialylation was more likely to occur at the N38 and N83 sites. In addition, the total relative content of O-acetylation on sialic acid among all identified glycopeptides at the N24, N38, and N83 sites was 51.15%, 97.00%, and 73.40%, respectively, which indicated that O-acetylation on sialic acid was more likely to occur at the N38 site. Among these N-glycosylation sites, Neu5Gc occurred most readily at the N83 site, where its summed relative content was 65.80%. As shown in Table 3, the total average number (the average of the per-glycopeptide count of variable x, weighted by the area percent of each glycopeptide among all identified glycopeptides, according to Eq. 3) was calculated over the N24, N38, and N83 sites for three variables: sialic acids (Neu5Ac and Neu5Gc), Neu5Gc, and O-acetylation on sialic acids.
For rhEPO_1, the total average numbers were 7.28, 0.66, and 4.21, respectively. These parameters provide important quantitative data at both the glycosylation site level and the intact protein level, which could help rhEPO manufacturers improve the activity of their products by optimizing the production process.
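The two site-level quantities described above (Eq. 2 and Eq. 3, whose exact forms are not reproduced in this section) can be sketched as an area-percent sum and an area-weighted average. The sketch below follows the verbal definitions only. The `Glycoform` class, the function names, the glycan labels, and all peak areas and counts are hypothetical placeholders, not measured data.

```python
# Sketch of the site-level metrics as described in the text (assumed reading of Eq. 2 and Eq. 3).
from dataclasses import dataclass


@dataclass
class Glycoform:
    name: str          # glycan label (placeholder)
    area: float        # extracted peak area of the glycopeptide (placeholder)
    sialic_acids: int  # Neu5Ac + Neu5Gc count on the glycan
    neu5gc: int        # Neu5Gc count on the glycan
    o_acetyl: int      # O-acetyl groups on the sialic acids


def sialylated_percent(forms):
    """Area percent of sialylated glycopeptides among all glycopeptides at one site."""
    total = sum(f.area for f in forms)
    return 100.0 * sum(f.area for f in forms if f.sialic_acids > 0) / total


def average_number(forms, attr):
    """Area-weighted average of a per-glycan count (e.g. 'sialic_acids') at one site."""
    total = sum(f.area for f in forms)
    return sum(f.area / total * getattr(f, attr) for f in forms)


# Hypothetical glycoforms at a single site, for illustration only.
site = [
    Glycoform("glycan_a", 50.0, 4, 0, 2),
    Glycoform("glycan_b", 30.0, 3, 1, 1),
    Glycoform("glycan_c", 20.0, 0, 0, 0),
]

print(f"sialylated: {sialylated_percent(site):.1f}%")
for attr in ("sialic_acids", "neu5gc", "o_acetyl"):
    print(f"average {attr}: {average_number(site, attr):.2f}")
```

Under the additional assumption that the "total" average number is the sum of the per-site averages, the protein-level value would be obtained by calling `average_number` for each of the three sites and adding the results.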
Confirmation of O-Glycosylation Site
As in the N-glycosylation analysis of rhEPO, the MS/MS sequence coverage of a single O-glycopeptide, AISPPDAAS(126)AAPLR + (Core1_S1), was compared between CID and EAD fragmentation. This glycopeptide contains two serine (Ser, S) residues, both of which are potential O-glycosylation sites. As shown in Fig. 3A, more MS/MS fragment ions of the O-glycan were found in CID, while the MS/MS sequence coverage of the peptide was low. This made it difficult to confirm the O-glycosylation site of rhEPO accurately. In comparison, abundant c-ions and z-ions were observed in EAD, and the MS/MS sequence coverage of the glycopeptide reached 93%, as shown in Fig. 3B. The MW difference between the y12-ion (m = 1808.8 Da) and the y11-ion (m = 1721.8 Da) was 87.0 Da, which matched the MW of one unmodified S residue (87.0 Da). This result confirmed that O-glycosylation did not occur at S120. Similarly, the MW difference between the c9-ion (m = 1483.6 Da) and the c8-ion (m = 740.4 Da) was 743.2 Da, which matched the sum of the MW of one S residue (87.0 Da) and that of one O-glycan residue (656.2 Da). Thus, S126 was confirmed as the O-glycosylation site.
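The reasoning for the two candidate serines can be written as a small classification over fragment-ion mass steps: a step equal to the bare residue mass means the serine is unmodified, and a step equal to the residue plus the O-glycan means it is glycosylated. The sketch below uses the ion masses and the O-glycan mass quoted in the text. The function name, the dictionary layout, the 0.1 Da tolerance, and the Ser residue mass constant (standard monoisotopic value) are assumptions introduced for illustration.

```python
# Sketch: classify each candidate serine from the mass step across it.
SER_RESIDUE = 87.03  # monoisotopic Ser residue mass in Da (standard value)
CORE1_S1 = 656.2     # O-glycan residue mass quoted in the text (Da)


def classify_step(m_small, m_large, residue, glycan, tol=0.1):
    """Label the residue spanned by two consecutive fragment ions."""
    delta = m_large - m_small
    if abs(delta - residue) <= tol:
        return "unmodified"
    if abs(delta - (residue + glycan)) <= tol:
        return "glycosylated"
    return "unassigned"


# (smaller ion, larger ion) pairs quoted in the text, in Da.
steps = {
    "S120": (1721.8, 1808.8),  # y11 -> y12
    "S126": (740.4, 1483.6),   # c8 -> c9
}

calls = {
    site: classify_step(lo, hi, SER_RESIDUE, CORE1_S1)
    for site, (lo, hi) in steps.items()
}
print(calls)
```

Returning "unassigned" instead of forcing a call keeps an unexpected mass step, for example from a different glycan, from being silently reported as one of the two expected outcomes.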
O-Glycosylation Heterogeneity Analysis
To better characterize the O-glycosylation, rhEPO_1 was selected as an example and analyzed at both the intact protein and the glycopeptide level. The N-glycans of rhEPO_1 were first removed with PNGase F. As shown in Fig. 4, three major MWs were obtained in the deconvoluted spectrum: 18238.3 Da, 18894.6 Da, and 19185.8 Da, of which 18238.3 Da corresponded to de-N-glycosylated rhEPO without an O-glycan. The MW difference between 18894.6 Da and 18238.3 Da was 656.3 Da, which corresponded to the MW of Core1_S1. The MW difference between 19185.8 Da and 18238.3 Da was 947.5 Da, which corresponded to the MW of Core1_S2. Thus, two O-glycans, Core1_S1 and Core1_S2, were identified in rhEPO_1. Based on the relative area percent of the three O-glycoforms in the deconvoluted spectrum, the glycosylation ratios of Core1_S1 and Core1_S2 were 42.2% and 38.0%, respectively, and the total O-glycosylation ratio was 80.2%, as shown in Table 2.
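The intact-level assignment above reduces to two steps: subtract the unmodified mass from each deconvoluted mass to obtain the O-glycan mass, then convert peak areas to relative percentages. The sketch below reproduces both steps with the masses and ratios quoted in the text. The variable and function names are assumptions, and the stand-in "areas" are simply the reported percentages, with the non-O-glycosylated share taken as the remainder (19.8), because the raw peak areas are not given in this section.

```python
# Sketch: O-glycan masses from intact mass differences, then relative area percent.
BASE = 18238.3  # de-N-glycosylated rhEPO without O-glycan (Da), from the text

observed = {"Core1_S1": 18894.6, "Core1_S2": 19185.8}  # deconvoluted masses (Da)
deltas = {name: mass - BASE for name, mass in observed.items()}
for name, d in deltas.items():
    print(f"{name}: +{d:.1f} Da")


def relative_percent(areas):
    """Convert peak areas to percent of the summed area."""
    total = sum(areas.values())
    return {name: 100.0 * a / total for name, a in areas.items()}


# Stand-in areas proportional to the reported ratios; 19.8 is derived as 100 - 80.2.
areas = {"none": 19.8, "Core1_S1": 42.2, "Core1_S2": 38.0}
pct = relative_percent(areas)
total_o = pct["Core1_S1"] + pct["Core1_S2"]
print(f"total O-glycosylation: {total_o:.1f}%")
```

The two mass differences are about 291 Da apart, which is close to the mass of one Neu5Ac residue and is consistent with Core1_S2 carrying one more sialic acid than Core1_S1. This interpretation of the glycan names is an inference and is not stated in the text.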
In this study, EAD fragmentation was also used to analyze the glycopeptide AISPPDAAS(126)AAPLR + (Core1_S2), which gave a high-quality MS/MS spectrum. As shown in Fig. 3B and Fig. 5, the MS/MS sequence coverage of the glycopeptides was as high as 93%. Both Core1_S1 and Core1_S2 were accurately identified at the glycopeptide level, as shown in Table 2. The total glycosylation ratio of Core1_S1 and Core1_S2 was 80.3%, which was consistent with the result at the intact protein level (80.2%). These findings further demonstrated that the results at the intact protein and glycopeptide levels were complementary for the O-glycosylation characterization of rhEPO. In addition, EAD fragmentation was helpful for identifying the O-glycans and determining the glycosylation ratio at the glycopeptide level. To date, no studies have shown an effect of O-glycosylation on the activity of rhEPO, so the sialylation of the O-glycans was not studied further.
Batch-to-Batch Consistency Analysis
Batch-to-batch consistency is very important for monitoring the stability of the production process. In this study, three different batches of rhEPO were compared at the glycopeptide level using the LC-MS method. Several important parameters were compared: the relative content of the same glycopeptides (area percent of one glycopeptide among all glycopeptides with the same glycosylation site), the total average number of sialic acids in the glycopeptides, the total average number of Neu5Gc in the identified N-glycans, and the total average number of O-acetylation on the sialic acids. The relative content of the same glycopeptide at the same site, shown in Fig. 6A-6C, differed clearly among the three rhEPO samples. Based on the differences in the numbers of sialic acids, Neu5Gc, and O-acetylation on sialic acid shown in Table 3, rhEPO_3 was concluded to be similar to rhEPO_1. These parameters could guide manufacturers in optimizing the production process of rhEPO.
At the O-glycosylation site S126, Core1_S1 and Core1_S2 were successfully identified in all three batches of rhEPO. As shown in Fig. 6D, the relative content of Core1_S1 was higher than that of Core1_S2 in all rhEPO samples. These results showed that in-depth glycosylation analysis using EAD fragmentation in an LC-MS method can be applied effectively to batch-to-batch consistency analysis of rhEPO, which would help manufacturers better monitor the stability of the production process.