Acquisition of metabolite levels in urine by NMR and HRMS. Urine samples from 46 subjects, divided into three groups (healthy controls, CTRL, n = 9; patients with chronic cystitis, CC, n = 6; and bladder cancer patients, BC, n = 31), were used to develop the new method. The NMR dataset was acquired using a 600 MHz Bruker Avance spectrometer (Bruker, MA, USA). A typical urine NMR spectrum is shown in Figure 1.
The MS dataset was acquired by UHPLC-high resolution mass spectrometry (UHPLC-HRMS), using a UHPLC system coupled to an Orbitrap QExactive™ mass spectrometer (Thermo Scientific™, MA, USA) equipped with a HESI source operated in positive and negative ion modes. We used two different chromatographic conditions: reverse phase (RP), which separates metabolites based on hydrophobic interactions, and hydrophilic interaction chromatography (HILIC) as a complement, which allows the analysis of polar compounds. The combination of the two ion modes and the two chromatographic conditions gave wide coverage of the metabolites present in urine. The MS analysis yielded 10497 hits. For each hit, a matched formula, exact mass, retention time, and relative intensity in each sample were available; for several hits a putative name was also provided (Table 1).
Table 1. Examples of two hits obtained with UHPLC-HRMS and the information available for each of them.
| Mode¹ | Name | Formula | MW | Rt [min] | S1ᵃ | S2 | … S46 |
| --- | --- | --- | --- | --- | --- | --- | --- |
| HC+ | Caffeine | C8 H10 N4 O2 | 194.0805 | 4.636 | 2.36 | 2.81 | 0.85 |
| RP- | - | C11 H15 N2 P | 206.0976 | 9.727 | 0.39 | 0.36 | 0.14 |

¹ HC+: HILIC positive; RP-: reverse phase negative. ᵃ Relative intensity of sample 1.
SYNHMET Method workflow. The SYNHMET method was developed to combine NMR and MS data so as to produce, quickly and efficiently, a single complete matrix covering the largest possible number of metabolites in human urine. The proposed workflow is shown in Scheme 1. We started with a list of metabolites that can likely be identified and quantified in urine by NMR, according to the work of Bouatra et al. 9. These metabolites were divided according to whether their exact mass was found among the MS hits in our dataset. For those for which we did not find any hit (mainly low-molecular-weight metabolites, under 60 Da), profiling was attempted using NMR exclusively. Those that could not be accurately quantified in this way were discarded.
For each of the remaining metabolites, we extracted all the hits that showed their exact molecular mass. Given the use of two chromatographic conditions and two detection polarities, the same compound can give rise to up to four different hits, hence the need to group them into what we have defined as a chemical entity. Two peaks with the same exact mass were considered to represent the same chemical entity if they showed a significant correlation (R² ≥ 0.9) between their intensity distributions across the samples. For hits obtained in the same chromatographic condition but with different polarity, the retention time must also be the same.
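The grouping rule described above can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the `Hit` class, the hit labels, the toy intensities, the 0.05 min retention-time tolerance and the use of single-linkage grouping are all assumptions. The text states the retention-time rule for opposite polarities in the same chromatographic condition; applying it to every pair of hits from the same column is a further assumption.

```python
from dataclasses import dataclass

import numpy as np


@dataclass
class Hit:
    hit_id: str              # hypothetical label
    column: str              # "RP" or "HC"
    polarity: str            # "+" or "-"
    rt_min: float            # retention time in minutes
    intensities: np.ndarray  # relative intensity in each sample


def same_entity(h1, h2, r2_min=0.9, rt_tol=0.05):
    """Decide whether two hits with the same exact mass belong together."""
    # Hits from the same column must share the retention time (assumed tolerance).
    if h1.column == h2.column and abs(h1.rt_min - h2.rt_min) > rt_tol:
        return False
    r = np.corrcoef(h1.intensities, h2.intensities)[0, 1]
    # Require a positive correlation with R^2 at or above the threshold.
    return bool(r > 0 and r * r >= r2_min)


def group_entities(hits):
    """Single-linkage grouping of hits into chemical entities (union-find)."""
    parent = list(range(len(hits)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i in range(len(hits)):
        for j in range(i + 1, len(hits)):
            if same_entity(hits[i], hits[j]):
                parent[find(i)] = find(j)

    groups = {}
    for i, hit in enumerate(hits):
        groups.setdefault(find(i), []).append(hit.hit_id)
    return sorted(groups.values())


# Toy data: five samples, four hits sharing one exact mass.
base = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
hits = [
    Hit("a", "RP", "+", 3.39, base),
    Hit("b", "RP", "-", 3.39, 2 * base + np.array([0.1, -0.1, 0.0, 0.1, -0.1])),
    Hit("c", "HC", "+", 5.35, 0.5 * base),
    Hit("d", "RP", "-", 8.45, np.array([5.0, 1.0, 4.0, 2.0, 3.0])),
]
entities = group_entities(hits)
print(entities)  # [['a', 'b', 'c'], ['d']]
```

In this toy case the first three hits merge into one entity, while the fourth stays separate because its retention time differs from the other RP hits and its intensities do not correlate with the HILIC hit.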
The next step was to identify the MS hits corresponding to each of these metabolites. As is generally established, this assignment is highly ambiguous if it is based only on the exact mass, without MS/MS spectra or, more appropriately, analytical standards. In our scheme, a mass hit is assigned to a metabolite when, in addition to presenting its exact molecular weight, the peak intensities measured by MS are highly correlated with the concentrations observed by NMR across the samples. For the more concentrated metabolites, the NMR measurement was possible in most of the samples. In many cases, however, we could quantify them only in a smaller subset. As the examples will show, a partial set of appropriately resolved NMR spectra may be sufficient to find the corresponding chemical entity in the MS dataset. A few metabolites did not show a high correlation with any of the corresponding MS hits, and for them we attempted quantification using NMR alone. As in the previous case, if this was not possible, they were discarded.
The final two steps were the validation of the results by comparison with literature values and the extraction of personalized profiles using the whole dataset of metabolites quantified by the SYNHMET approach.
(see Scheme 1 in the Supplementary Files)
Analytical synergism between NMR and MS and metabolite identification. The following cases illustrate how the synergy between the two techniques results in a fast and efficient identification of metabolites in urine. For example, our HRMS dataset contained 24 hits showing the exact mass of tyrosine. These hits were grouped into 14 chemical entities (Table 2) following the criteria described in section 2.2. In parallel, the concentration of this metabolite was measured by NMR using spectral deconvolution. To determine which entity corresponded to tyrosine, we calculated, for each of the 14 entities, a linear correlation between the HRMS intensity and NMR concentration distributions across the 46 samples. As shown in the last column of Table 2, entity 7 is the only one showing a significant R², allowing its assignment as tyrosine.
Table 2. Modes of detection, retention times and the correlation coefficient with NMR intensities for the fourteen chemical entities showing the exact mass of tyrosine (181.0739 Da). In bold chemical entity 7, which we identified as tyrosine.
| Chemical entity | Mode (Rt [min]) | R² |
| --- | --- | --- |
| 1 | RP+/-¹ (8.45) | 0.38 |
| 2 | RP+/- (8.53) | 0.31 |
| 3 | RP+ (2.89); HC+² (7.89) | 0.34 |
| 4 | RP+ (3.87); HC+ (3.77) | 0.71 |
| 5 | RP+ (5.94); HC+ (4.78) | -0.1 |
| 6 | RP+ (6.49); HC+ (0.7) | -0.06 |
| **7** | **RP- (3.39); HC+/- (5.35)** | **0.99** |
| 8 | RP+ (2.18); HC+/- (3.5) | -0.03 |
| 9 | RP+ (5.07) | 0.31 |
| 10 | HC+ (1.55) | -0.1 |
| 11 | HC+ (6.46) | 0.37 |
| 12 | HC- (3.75) | 0.08 |
| 13 | HC- (4.04) | 0.35 |
| 14 | HC+ (1.92) | 0.1 |

¹ RP+/-: reverse phase positive and negative, respectively. ² HC+/-: HILIC positive and negative, respectively.
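The assignment step illustrated by Table 2 can be sketched as a correlation between each candidate entity and the NMR concentrations, restricted to the samples in which NMR gave a value. The function names, the toy numbers, the minimum of five shared samples and the reuse of the 0.9 threshold from the grouping criterion are assumptions made for illustration; the text only requires a high correlation.

```python
import numpy as np


def signed_r(ms_intensity, nmr_conc, min_samples=5):
    """Pearson r over samples where both MS and NMR values exist."""
    ms = np.asarray(ms_intensity, dtype=float)
    nmr = np.asarray(nmr_conc, dtype=float)
    ok = ~np.isnan(ms) & ~np.isnan(nmr)
    if ok.sum() < min_samples:
        return float("nan")  # too few shared samples to judge
    return float(np.corrcoef(ms[ok], nmr[ok])[0, 1])


def assign_entity(entities, nmr_conc, r2_min=0.9):
    """Return the entity best correlated with NMR, or None if none passes."""
    best_name, best_r2 = None, r2_min
    for name, ms in entities.items():
        r = signed_r(ms, nmr_conc)
        # Keep only positive correlations that reach the threshold.
        if not np.isnan(r) and r > 0 and r * r >= best_r2:
            best_name, best_r2 = name, r * r
    return best_name


# Toy data: NMR could not quantify the metabolite in samples 2 and 5 (NaN).
nmr = [10.0, np.nan, 30.0, 40.0, np.nan, 60.0, 80.0]
candidates = {
    "entity_A": [0.11, 0.25, 0.29, 0.41, 0.05, 0.62, 0.79],
    "entity_B": [0.50, 0.10, 0.40, 0.20, 0.30, 0.45, 0.15],
}
assigned = assign_entity(candidates, nmr)
print(assigned)  # entity_A
```

Masking the samples without an NMR value is what lets a partial set of resolved spectra drive the assignment.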
To further confirm the assignment of tyrosine and other metabolites by this method, we compared the retention times of nine labeled standards co-injected with the samples with those obtained by the NMR-HRMS correlation method (Table 3). The excellent correspondence observed constitutes a strong validation of the assignment and prompted us to extend the method to the rest of the metabolites under study.
Table 3. Comparison of the retention times observed for nine labeled standards (Std) with those assigned with the NMR-HRMS intensity correlation method (Corr) in the two chromatographic conditions.
| Metabolite | RP Std Rt [min] | RP Corr Rt [min] | HC Std Rt [min] | HC Corr Rt [min] |
| --- | --- | --- | --- | --- |
| Carnitine | 0.88 | 0.88 | 8.02 | 8.02 |
| Glucose | | | 2.13 | 2.14 |
| Hypoxanthine | | | 2.35 | 2.36 |
| Inosine | | | 2.37 | 2.37 |
| Kynurenate | | | 3.60 | 3.56 |
| Kynurenine | | | 5.19 | 5.14 |
| Lactate | 1.57 | 1.57 | | |
| Tryptophan | | | 5.09 | 5.08 |
| Tyrosine | 2.05 | 2.05 | 5.38 | 5.38 |
A second example concerns the identification of isobaric compounds, such as 2-, 3- and 4-hydroxyphenylacetic acids, and 2- and 3-methylglutaric acids (Figure 2). These assignments represent an extra challenge for identification using only MS/MS data, but the compounds are relatively easy to distinguish with this method because they give very different NMR signals. As shown in Figure 2, the correlation between the MS intensities and the concentrations measured by NMR makes it possible to quickly determine which MS chemical entity corresponds to each isomer in both classes of molecule.
Synergism also manifests itself in the opposite direction: having a hypothesis of the concentration distribution of a given metabolite from MS data helps to identify the correct position of its signals in the NMR spectrum, especially in crowded regions. The combination of NMR and HRMS transforms the identification from a 1D experiment (chemical shift) into a 3D experiment (chemical shift, retention time, exact mass), significantly increasing the resolving power of the approach (Figure 3). For example, it is possible to start the profiling from the sample that, according to the HRMS measurement, has the highest concentration of a given metabolite, which facilitates locating the corresponding signals in the NMR spectrum.
It is important to underline that once a correlation is found between the NMR-derived concentration of a metabolite and the intensity of a mass peak, knowledge of its elemental composition is added to the identification. At this point, the only remaining ambiguity would be the existence of an isomer with the same chemical shift and multiplicity, which is highly unlikely, if not impossible, for small molecules.
Analytical synergism between NMR and MS and metabolite quantification. The identification of a compound in the MS database through correlation already provides the basis for its quantification. In fact, just by multiplying the MS relative intensities by the slope of the correlation line we can convert them into absolute concentrations. Transforming all mass hits and NMR concentrations to a single scale opens the possibility of calculating a more accurate mean value for quantification. Furthermore, having the absolute concentration makes it possible to normalize its value across samples by dividing by the creatinine concentration, and thus to compare it with literature data.
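The conversion described above can be sketched in a few lines. The numbers are toy values and the function name is hypothetical. Fitting the correlation line through the origin is an assumption, chosen because the text converts intensities by multiplying by the slope alone; the units (µM for the metabolite, mM for creatinine) are likewise assumed for the example.

```python
import numpy as np


def slope_through_origin(ms_intensity, nmr_conc):
    """Least-squares slope of NMR concentration vs MS intensity (no intercept)."""
    ms = np.asarray(ms_intensity, dtype=float)
    nmr = np.asarray(nmr_conc, dtype=float)
    ok = ~np.isnan(ms) & ~np.isnan(nmr)  # use only samples quantified by NMR
    return float(np.sum(ms[ok] * nmr[ok]) / np.sum(ms[ok] ** 2))


# Toy data: NMR quantified the metabolite in two of four samples.
ms = np.array([0.5, 1.0, 2.0, 4.0])               # MS relative intensities
nmr = np.array([np.nan, 100.0, 200.0, np.nan])    # NMR concentrations (assumed µM)
creatinine = np.array([5.0, 10.0, 10.0, 20.0])    # creatinine (assumed mM)

slope = slope_through_origin(ms, nmr)             # (100*1 + 200*2) / (1 + 4) = 100
ms_conc = slope * ms                              # absolute concentrations from MS
normalized = ms_conc / creatinine                 # µM per mM creatinine

print(slope)       # 100.0
print(ms_conc)     # [ 50. 100. 200. 400.]
print(normalized)  # [10. 10. 20. 20.]
```

Once the slope is known, every sample with an MS intensity receives an absolute concentration, including those in which NMR gave no value.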
The most obvious synergistic factor in extending quantification to a larger number of metabolites in almost all samples is undoubtedly the higher sensitivity of MS with respect to NMR. This is not only a consequence of the intrinsically lower sensitivity of NMR, but also originates in what we have defined as the "NMR matrix effect", since its consequence is similar to the one observed for MS when ESI is used as the ionization source 19. The great variability in urine composition often causes signals from highly concentrated compounds to cover, partially or totally, the signals of a given metabolite, so that it can be quantified in only a limited number of samples. This can be appreciated, for example, in the quantification of cysteine (Figure 4a). We could accurately measure its concentration by NMR in only 15 of the 46 samples, which was enough to identify the corresponding MS peak in RP chromatography with positive-mode detection, at a retention time of 1.16 min. Thanks to the MS intensities and their conversion using the NMR-MS correlation, all 46 absolute concentrations could be determined. As the graph in Figure 4a shows, concentrations below 50 µM could not be measured by NMR. In addition, there were some samples in which the cysteine concentration measured by MS was well above this limit but, owing to the presence of other signals in the regions of the NMR spectrum where this metabolite can be readily quantified, no accurate value could be assigned by this technique.
While this direction of synergism between NMR and MS is somewhat intuitive, there is also an effect in the opposite direction. In fact, the accuracy of MS quantification can be significantly improved when the data are cross-checked against the concentrations obtained by NMR. The most frequent causes of error in MS-based concentrations are, on the one hand, saturation of the detector by highly concentrated metabolites in certain samples and, on the other hand, the matrix effect, which varies considerably from sample to sample depending on the composition of the urine.
An example of the first effect is shown in Figure 4b for the quantification of hippuric acid. Its concentration in urine spans a wide range, from a minimum of 20 (ref. 20) to a maximum of 837 µM/mM creatinine (ref. 21). Saturation of the MS detector results in a clear deviation from linearity of the response with respect to the value measured by NMR for concentrations higher than 3.8 mM (Figure 4b). Thus, all MS intensities above this concentration were discarded and the final concentration was calculated from the NMR data alone, thereby correcting a significant error.
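The handling of saturated samples can be sketched as below. The 3.8 mM limit is the value quoted for hippuric acid; the toy concentrations, the function name and the choice of averaging MS and NMR values below the limit are assumptions made for illustration.

```python
import numpy as np


def combine_with_saturation(ms_conc, nmr_conc, limit_mM=3.8):
    """Drop MS-derived values where NMR indicates detector saturation."""
    ms = np.array(ms_conc, dtype=float)      # copy, so the input is untouched
    nmr = np.asarray(nmr_conc, dtype=float)
    saturated = nmr > limit_mM               # samples above the linear range
    ms[saturated] = np.nan                   # discard saturated MS values
    combined = np.nanmean(np.vstack([ms, nmr]), axis=0)
    return combined, saturated


# Toy data in mM: the MS response flattens in the last two samples.
nmr = [1.0, 2.0, 5.0, 8.0]
ms = [1.2, 1.8, 4.0, 5.0]
final, saturated = combine_with_saturation(ms, nmr)
print(saturated)  # [False False  True  True]
print(final)      # [1.1 1.9 5.  8. ]
```

Above the limit the result equals the NMR value alone, which mirrors the correction described for hippuric acid.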
The second cause of MS quantification errors may be more difficult to detect. This is the case of sample 2992, for which the different concentrations obtained by MS and NMR are presented in Table 4.
Table 4. Concentrations obtained for hippuric acid from the four MS hits and NMR for the sample 2992.
| Quantification mode | MW | Rt [min] | Conc [mM] |
| --- | --- | --- | --- |
| HRMS-RP+ | 179.0579 | 8.214 | 0.13 |
| HRMS-RP- | 179.0581 | 8.213 | 0.02 |
| HRMS-HC+ | 179.0583 | 3.607 | 1.38 |
| HRMS-HC- | 179.0582 | 3.598 | 1.58 |
| 1H-NMR | - | - | 1.35 |
The data show that the concentrations derived from HILIC chromatography agree with the NMR value, while those measured with the RP column are significantly lower. In this case, only the values obtained with the HILIC condition were considered. This effect was not detected in other samples for hippuric acid and is probably due to a compound, present only in this sample, that co-elutes with the metabolite and partially suppresses the peak intensity.
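The cross-check applied to sample 2992 can be illustrated with the values of Table 4. The 25% relative tolerance is an assumption introduced only for this sketch; the text does not state a numerical criterion for agreement with NMR.

```python
# Concentrations of hippuric acid in sample 2992 (mM), taken from Table 4.
ms_conc = {
    "HRMS-RP+": 0.13,
    "HRMS-RP-": 0.02,
    "HRMS-HC+": 1.38,
    "HRMS-HC-": 1.58,
}
nmr_conc = 1.35
tolerance = 0.25  # assumed maximum relative deviation from the NMR value

kept, rejected = [], []
for mode, conc in ms_conc.items():
    deviation = abs(conc - nmr_conc) / nmr_conc
    (kept if deviation <= tolerance else rejected).append(mode)

print(kept)      # ['HRMS-HC+', 'HRMS-HC-']
print(rejected)  # ['HRMS-RP+', 'HRMS-RP-']
```

Both RP values deviate from the NMR concentration by more than 90%, whereas the HILIC values lie within about 2% and 17% of it.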
Results of SYNHMET application to 46 urine samples. We applied the SYNHMET method to the urine samples described above, starting with a list of 180 metabolites whose identification and partial quantification by NMR in this biofluid has already been reported 9. Of these, 12 metabolites were quantified using only NMR because no MS hit with the corresponding exact mass was available. For the other 168 metabolites we found at least one MS hit matching their molecular weight: 145 showed a successful correlation between NMR concentrations and MS intensities, whereas 7 were quantified by NMR alone because no correlation was found. The total number of metabolites that we could quantify was therefore 164. Of the 145 metabolites quantified through correlation, 61 (42%) were identified in at least 80% of the NMR spectra. Together with the 19 metabolites quantified by NMR alone, this makes a total of 80, which is approximately the maximum number of compounds that NMR on its own could contribute to a matrix containing concentration data for most of the samples. The combined use of both techniques has thus more than doubled the number of metabolites quantified in virtually all samples.
Our dataset constitutes an almost complete matrix of 46 samples × 164 metabolites, containing 7496 concentration values and only 48 missing values, which represent 0.6% of the 7544 cells. These concentrations were converted into µM/mM of creatinine for comparison with the ranges reported in the literature (Table S1).
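The counts reported in this section can be checked with simple arithmetic. The variable names are chosen for illustration, and the number of discarded metabolites is derived here rather than stated in the text.

```python
# Counts reported in the text.
candidates = 180          # metabolites in the starting list
nmr_only_no_hit = 12      # no MS hit with the exact mass
with_ms_hit = 168         # at least one MS hit with the exact mass
correlated = 145          # successful NMR-MS correlation
nmr_only_no_corr = 7      # MS hit present but no correlation
wide_coverage = 61        # correlated metabolites seen in >= 80% of NMR spectra

quantified = nmr_only_no_hit + correlated + nmr_only_no_corr
nmr_only = nmr_only_no_hit + nmr_only_no_corr
discarded = candidates - quantified   # derived, not stated in the text

n_samples = 46
missing = 48
cells = n_samples * quantified
measured = cells - missing
missing_pct = 100 * missing / cells

print(quantified, nmr_only, discarded)         # 164 19 16
print(wide_coverage + nmr_only)                # 80
print(round(100 * wide_coverage / correlated)) # 42
print(cells, measured, round(missing_pct, 2))  # 7544 7496 0.64
```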
The metabolites quantified using SYNHMET cover a wide range of biochemical markers, including amino acids and their metabolism, markers of vitamins, dysbiosis, diet and toxin exposure, carbohydrates and their metabolism, energy, fatty acid/lipid and glycine/serine metabolism, ketone bodies and others. In this way, it is possible to cover some of the main metabolic pathways, both for metabolomics studies with the purpose of discovering biomarkers related to pathological states, as well as for individual profiling.
Method validation by comparison with literature normal ranges. To validate the results obtained, we compared the extreme values observed for the three groups of individuals with those reported in the literature. For this purpose, we considered the ranges reported for adults over 18 years of age. The heat map in Figure 5 shows the results obtained. The very low number of grey cells, which represent missing values, shows the completeness of the dataset. Eight out of nine individuals in the control group did not show any values significantly outside the range reported in the literature for the 164 metabolites studied. Only one person showed higher values for threonine and carnosine concentration. The fact that almost all of the concentration values measured with our approach for the control group fall within the normal range accepted in the literature can be read as a validation of the method. However, it is important to consider that the reference values taken from the literature are very heterogeneous in the number of persons involved in their definition. In particular, our values for 2-methylglutarate, 4-aminohippurate, adenine, ADP, anserine, choline, cinnamate, cytosine, gluconate, glucuronate, isobutyrate, levoglucosan, maltose, N-methylhydantoin and Sumiki’s acid represent a significant improvement in the knowledge of normal values, because very few studies report them according to the HMDB site 22. In this respect, our data can be added to the existing data to make the definition of the normal and abnormal ranges of a given metabolite more robust.
A different picture emerged for the groups of subjects with chronic cystitis and bladder cancer, which show a much higher number of metabolites with abnormal values. In particular, the values shown in black in Figure 5 are more than 4 times higher than the maximum value reported previously. This most likely reflects different metabolic imbalances related to the pathologies of these patients. Specifically, for the BC group, 82 values were observed outside the literature ranges. Dietary components account for the highest number of abnormal values, followed by metabolites belonging to fatty acid/lipid, carbohydrate, energy and branched-chain amino acid metabolism. Nine metabolites previously found to be significantly altered in bladder cancer, namely O-acetylcarnitine, gluconate, lactate, phenylacetylglutamine, citrate, hippurate, succinate, valine and erythritol 23, also showed abnormal values in our BC group (Figure 5).
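The comparison with literature ranges can be sketched as a simple classification of each normalized concentration. The metabolite names, ranges and values below are hypothetical, as are the category labels; only the 4-fold criterion for the cells shown in black comes from the text.

```python
import math


def classify(value, low, high, fold=4.0):
    """Place a concentration relative to a literature range [low, high]."""
    if value is None or math.isnan(value):
        return "missing"        # grey cell in the heat map
    if value > fold * high:
        return "above_4x"       # more than 4 times the literature maximum
    if value > high:
        return "above"
    if value < low:
        return "below"
    return "within"


# Hypothetical literature ranges and one hypothetical profile (µM/mM creatinine).
ranges = {
    "metabolite_A": (10.0, 100.0),
    "metabolite_B": (10.0, 100.0),
    "metabolite_C": (20.0, 120.0),
    "metabolite_D": (1.0, 5.0),
}
profile = {
    "metabolite_A": 50.0,
    "metabolite_B": 450.0,
    "metabolite_C": 150.0,
    "metabolite_D": float("nan"),
}

status = {name: classify(profile[name], *ranges[name]) for name in ranges}
abnormal = [name for name, s in status.items() if s in ("below", "above", "above_4x")]

print(status)
print(len(abnormal))  # 2
```

Counting the abnormal entries per subject gives the kind of summary used for the individual profiles.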
Extraction of personalized profiles from absolute concentrations. Obtaining absolute concentrations for all the metabolites, and then normalizing them to the creatinine concentration, allowed us to build a profile for each individual. Subject 2852, belonging to the BC group, presented 24 metabolite concentrations outside the literature range. His complete metabolic profile, comprising the 164 metabolites quantified, is shown in Figure S1, while a detail of the metabolites with abnormal values is shown in Figure 6.
As in the case of the mean profile of the BC group, most of the patient's metabolites with abnormal values belong to components of diet, fatty acid metabolism and energy metabolism. Figures S1 and 6 illustrate the degree of detail that can be achieved with the SYNHMET methodology, which can be used in clinical practice to monitor the health status and disease progression of a given patient.