Polar metabolites represent an important fraction of the metabolome. Our novel RPLC-MS/MS method uses a PGC column to quantify 14,685 features in human serum and collect MS/MS spectra to annotate 1,907 features. Fifty-five percent of annotated features were carboxylic acids and derivatives. Triplicate analyses of serum from six individuals demonstrated that 90% of the annotated features had CVs lower than the widely used untargeted metabolomics threshold of 0.30 (Brunius et al. 2016). Appropriate CV thresholds for metabolite analyses depend on the aim of the study. For example, less strict CV thresholds are typically applied to exploratory untargeted metabolomics analyses to increase metabolome coverage (Brunius et al. 2016). Thresholds for acceptable CVs for features in untargeted metabolomics vary, but thresholds from 0.20 (Dudzik et al. 2018) to 0.30 (Brunius et al. 2016) are generally accepted. Six internal standards (i.e. creatinine, hypoxanthine, phenylalanine, thymine, tyrosine and vitamin B3) had recoveries within 70–130% of the spiked blanks, five of these with CVs < 0.10. An additional four internal standards (i.e. leucine, alanine, ethanolamine and 1,4-butanediamine) met the CV < 0.15 criterion for targeted metabolomics specified by the United States Food and Drug Administration [FDA] (Fda and Cder 2018). Thus, we claim the method we developed accurately and precisely quantifies polar metabolites in human serum.
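The precision and recovery checks described above can be illustrated with a minimal Python sketch. It computes the CV of replicate abundances, the fraction of features below the 0.30 threshold, and whether a recovery falls inside the 70–130% window. The abundance values, function names and the choice of the sample standard deviation are illustrative assumptions, not the study's data or processing pipeline.

```python
from statistics import mean, stdev


def coefficient_of_variation(replicates):
    """CV = sample standard deviation / mean of replicate abundances."""
    return stdev(replicates) / mean(replicates)


def fraction_passing(features, threshold=0.30):
    """Fraction of features whose replicate CV is below the threshold."""
    cvs = [coefficient_of_variation(reps) for reps in features.values()]
    return sum(cv < threshold for cv in cvs) / len(cvs)


def recovery_ok(measured, spiked, low=0.70, high=1.30):
    """True if measured/spiked lies within the accepted recovery window."""
    return low <= measured / spiked <= high


# Hypothetical triplicate abundances for two features (not study data).
features = {
    "feature_A": [100.0, 110.0, 90.0],   # tight replicates
    "feature_B": [50.0, 100.0, 150.0],   # widely scattered replicates
}

for name, reps in features.items():
    print(name, round(coefficient_of_variation(reps), 2))
print("fraction below 0.30:", fraction_passing(features))
print("recovery 85/100 acceptable:", recovery_ok(85.0, 100.0))
```

A stricter threshold, such as 0.20 for untargeted or 0.15 for targeted work, can be applied by passing a different `threshold` value.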
Consensus criteria for evaluating the performance of untargeted metabolomics methods have yet to be universally adopted. However, we recognise the comments of Dudzik et al., Kirwan et al. and Rakusanova et al. (Dudzik et al. 2018; Kirwan et al. 2022; Rakusanova et al. 2023), which include: the need to clearly communicate a well-considered methodology; the use of QCs and blanks to assess method consistency; the use of spiked internal standards relevant to the targeted metabolites; and the need to keep the metabolite coefficient of variation as low as possible. We incorporated these suggestions into our demonstration of the accuracy, reproducibility and utility of a novel LC-MS/MS method that identifies and quantifies polar metabolites. Our results compare well with results of HILIC or derivatisation-GC-MS methods (Rakusanova et al. 2023; Zeki et al. 2020). For example, Zhang et al. (2023) detected 755 positive ions in serum samples from 131 hypertensive children. HILIC methods (e.g. Zhang et al. 2023) often require repetitive drying and reconstitution steps, using a combination of methanol and acetonitrile extractions. These methods are time-consuming and modify metabolites. Our method requires a simple single extraction and centrifugation step that can be completed within 15 minutes. Furthermore, each sample takes only 15 minutes of LC-MS time to analyse, including re-equilibration of the column. Thus, our method is quick, easy and able to accurately and precisely identify a wide range of polar metabolites.
Aqueous TCA extraction combined with porous graphitic carbon LC-MS/MS identifies metabolites with logP values ranging from −9.1 to 5.6 (Supplementary Fig. 3). Metabolites with logP < 0 are typically described as polar. Approximately 70% of annotated metabolites captured by our method are polar. Oxidised glutathione is the annotated metabolite with the lowest partition coefficient (logP = −9.1).
Derivatisation-GC-MS is useful for polar metabolomics. However, derivatisation-GC-MS tends to detect fewer features than LC-MS and requires additional pre-processing steps that often impact sample integrity. For example, a recent GC-MS study detected 384 features (Eylem et al. 2022). Similarly, Rey-Stolle et al. detected only 241 features, of which 153 had a CV < 30% and 123 had a CV < 20% (Rey-Stolle et al. 2022). Furthermore, when used in conjunction with LC-MS to extend coverage to the non-polar fraction of the metabolome, GC-MS requires extra training and is costly (Haggarty and Burgess 2017).
Previously, TCA used as a metabolite extraction solvent for NMR-based metabolomics was reported to be a poor deproteinisation reagent that introduces artifacts (Daykin et al. 2002). We have demonstrated aqueous TCA to be an effective deproteinisation and extraction solvent for a wide range of polar metabolites and did not observe similar artifacts. Ninety percent of annotated metabolites, predominantly ‘carboxylic acids and derivatives’, were consistently recovered with a CV < 0.30. The disadvantage of using TCA is that it is a relatively strong acid (pKa = 0.66) that can hydrolyse esterified compounds. Moreover, the high concentration of halogenated acid is inappropriate for ESI− mode, as the acid is anionic and quenches ionisation, although ~80% of metabolites ionise to some degree in ESI+. Additionally, the water component of the TCA solution is more polar than methanol or acetonitrile, and thus the process is less effective at extracting mid- to non-polar metabolites. However, this is advantageous when the mid- to non-polar metabolites are not the intended targets of the method. This is particularly true with PGC, which irreversibly adsorbs molecules with planar aromatic moieties, such as tryptophan.
Polar internal standards (i.e. creatinine, tyrosine, hypoxanthine, thymine and vitamin B3) were recovered well by the paired TCA extraction and PGC chromatography method. By contrast, the polar internal standards alanine and ethanolamine were not, possibly due to their loss during extraction or to ion suppression (Alseekh et al. 2021). Quantitation of creatinine, tyrosine, hypoxanthine, thymine and vitamin B3 was highly linear, with Pearson correlation coefficient (PCC) values of 0.97 between abundance and dilution, suggesting that they do not elute early enough to be affected by signal enhancement or suppression in the void volume. Tryptophan appears to be lost throughout the run, possibly because it has an aromatic moiety that is strongly retained by the PGC and only partially elutes. A stronger solvent mixture might resolve this, adding extra value to the method we have developed.
The use of internal standards is a major strength of our study, particularly as their concentrations are at the lower end of the adult physiological ranges reported in HMDB, demonstrating that the assay yields robust and biologically relevant information (Wishart et al. 2018). This matters in part because metabolite identification and quantitation are restricted by the dynamic range of the mass spectrometer. Dynamic range is determined by metabolite abundance, methodology and instrument selection, and can be tested by assessing metabolite linearity across serial dilutions. Accurate quantification of metabolites is important for deciphering differences in metabolite concentrations between groups, or within individuals over time. Non-linearity can distort these differences, making them appear smaller or larger than they are. Therefore, parametric tests, which use the numerical size of differences to determine significance, cannot be used on metabolites with a non-linear response. However, for those metabolites whose abundance is positively correlated with concentration (99.8%), non-parametric statistical tests, which use the ranks of the observed data instead of the size of the relative abundance difference, can be used to determine observable differences between samples.
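The reason rank-based tests tolerate a non-linear but monotonic response can be shown with a short Python sketch. It computes the Mann-Whitney U statistic for two groups before and after a saturating detector response is applied. The Mann-Whitney U test is used here only as one example of a rank-based test, and the concentrations and the response function are hypothetical.

```python
from statistics import mean


def mann_whitney_u(a, b):
    """U statistic for group a: count of (a_i, b_j) pairs with a_i > b_j.

    Ties contribute 0.5, following the usual definition.
    """
    u = 0.0
    for x in a:
        for y in b:
            if x > y:
                u += 1.0
            elif x == y:
                u += 0.5
    return u


def saturating_response(conc, half_max=50.0):
    """Hypothetical monotonic, non-linear detector response (assumption)."""
    return conc / (conc + half_max)


# Hypothetical true concentrations in two groups (not study data).
group_1 = [10.0, 20.0, 30.0, 40.0]
group_2 = [35.0, 60.0, 80.0, 120.0]

observed_1 = [saturating_response(c) for c in group_1]
observed_2 = [saturating_response(c) for c in group_2]

# The ratio of group means shrinks under the non-linear response...
true_ratio = mean(group_2) / mean(group_1)
observed_ratio = mean(observed_2) / mean(observed_1)

# ...but the rank-based statistic is unchanged, because ordering is preserved.
u_true = mann_whitney_u(group_1, group_2)
u_observed = mann_whitney_u(observed_1, observed_2)

print("mean ratio (true vs observed):", round(true_ratio, 2), round(observed_ratio, 2))
print("U (true vs observed):", u_true, u_observed)
```

The invariance holds only while the response stays monotonic; a response that reverses direction would change the ranks as well.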
Linearity is particularly important for internal standards, as these are used to normalise other metabolites and to reduce batch effects. Thus, non-linearity in internal standards may distort the relative abundances of the metabolites being normalised. The QreSS internal standards 1,4-butanediamine, alanine, leucine, phenylalanine, vitamin B3 and thymine had high r values between observed abundance and actual concentration and are therefore reliable internal standards. However, tyrosine was relatively poorly correlated and is therefore less reliable as an internal standard for sample preparation and detectability. Interestingly, tyrosine sensitivity seems to decline throughout the run, which would explain why its CVs are high in comparison to the other internal standards. This is likely because tyrosine, like tryptophan, is aromatic.
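The dependence of normalisation on a well-behaved internal standard can be sketched in Python as follows. The example assumes simple ratio normalisation (metabolite abundance divided by internal standard abundance in the same injection), which is one common approach and is not necessarily the procedure used in this study. All values and names are hypothetical.

```python
def normalise_to_internal_standard(metabolite, internal_standard):
    """Divide each metabolite abundance by the internal standard abundance
    measured in the same injection (ratio normalisation, an assumption)."""
    return [m / s for m, s in zip(metabolite, internal_standard)]


# Hypothetical signal drift across three injections of the same sample.
drift = [1.0, 0.8, 0.5]
metabolite = [200.0 * d for d in drift]

# Case 1: the internal standard drifts in proportion to the metabolite,
# so the ratio is constant and the drift is removed.
linear_standard = [1000.0 * d for d in drift]
corrected = normalise_to_internal_standard(metabolite, linear_standard)

# Case 2: the internal standard loses signal faster than the metabolite
# (for example through retention on the column), so the ratio is distorted.
declining_standard = [1000.0 * d * d for d in drift]
distorted = normalise_to_internal_standard(metabolite, declining_standard)

print("well-behaved standard:", [round(r, 3) for r in corrected])
print("declining standard:   ", [round(r, 3) for r in distorted])
```

In the second case the normalised values rise across the run even though the sample is unchanged, which is the kind of distortion a poorly behaved internal standard can introduce.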
Unsurprisingly, the more highly metabolite concentration is correlated with detected abundance, the higher the R² of the fitted regression. The variance explained by the fitted regression model is higher than the acceptable 0.80 threshold for 68.0% of annotated metabolites, suggesting that the actual change in concentration is driving the differences we observed for over two thirds of annotated metabolites. Most internal standards sit above the 0.80 threshold, and their detected abundance is proportional to their actual concentration. Thus, the internal standards are reliable for normalisation across the tested concentration range.
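As an illustration of the linearity criterion above, the following Python sketch fits an ordinary least-squares line to abundance against relative concentration in a dilution series and compares the resulting R² with the 0.80 threshold. The dilution levels and abundances are hypothetical, and ordinary least squares is an assumption about the regression model.

```python
from statistics import mean


def r_squared(x, y):
    """Coefficient of determination for the least-squares line y = a + b*x."""
    mx, my = mean(x), mean(y)
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    ss_res = sum((yi - (intercept + slope * xi)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - my) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


# Hypothetical two-fold dilution series (relative concentration).
relative_concentration = [0.125, 0.25, 0.5, 1.0]

# Hypothetical abundances: one proportional response, one saturating response.
linear_feature = [12.5, 25.0, 50.0, 100.0]
saturated_feature = [80.0, 95.0, 99.0, 100.0]

THRESHOLD = 0.80
for name, abundance in [("linear", linear_feature), ("saturated", saturated_feature)]:
    r2 = r_squared(relative_concentration, abundance)
    print(name, round(r2, 3), "passes" if r2 >= THRESHOLD else "fails")
```

For a simple linear regression with one predictor, R² equals the square of the Pearson correlation coefficient, so the same function also relates to the PCC values reported for the internal standards.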
‘Carboxylic acids and derivatives’ and ‘fatty acyls’ are the top classifications of our annotated metabolites that were identified in HMDB. These metabolites can be annotated to level 2 of the MSI guidelines (Sumner et al. 2007). Carboxylic acids are polar metabolites, while fatty acyls can be polar or non-polar depending on the attached hydrocarbon chain. These classes have previously been identified using more complex LC-MS and derivatisation-GC-MS methods (Bian et al. 2020; Yu et al. 2021; Zhu et al. 2016). However, many metabolites could not be annotated, and fewer than 50% (130/274) of them could be classified using the HMDB 2020 library. Increasing the number of annotated and classified metabolites would likely result in even greater diversity in our classifications.
Future developments of this method might incorporate stronger solvents to enable quantification of metabolites that bind strongly to the PGC. Further validation of this method using predefined metabolomic mixes (e.g. a credentialing kit) may be used to show that the features acquired by MS/MS are not contaminants (L. Wang et al. 2019).