Metabolite detection and identification
Full-scan detection of plasma metabolites was performed by UPLC-Q-TOF/MS on samples from 50 CSG, 7 EGC and 25 AGC cases, and the analysis involved the principal components that account for the majority of the differences in the data. The data were processed with the Progenesis QI package; three typical total ion current (TIC) chromatograms from the UPLC-Q-TOF/MS analysis of plasma metabolites are shown in Fig. 1. A total of 2666 peaks were detected and 50 differential metabolites were identified, including L-proline, L-isoleucine, L-leucine, L-valine, lysinoalanine, lysophosphatidylcholines (LysoPC) (12), phosphatidylcholines (PC) (16), phosphatidylethanolamines (6), L-carnitine, creatine, cholesterol, cholic acid, tyramine, uric acid, capryl carnitine, pyruvaldehyde, docosatrienoic acid, malonaldehyde and 1-sphingosine phosphate. Metabolites were identified on the basis of the following criteria: a VIP score > 1 and P < 0.05 in the EZinfo software, and a match in the HMDB, LIPID MAPS and SERUM databases.
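The selection criteria above (VIP > 1, P < 0.05 and a database match) can be illustrated with a minimal Python sketch. The `Feature` class, the function name and all feature values below are hypothetical and are not taken from the study data or from the EZinfo software.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    vip: float        # VIP score from the multivariate model
    p_value: float    # P value from the univariate test
    db_match: bool    # True if the feature matched a reference database


def select_differential(features, vip_cutoff=1.0, p_cutoff=0.05):
    """Keep features with VIP > cutoff, P < cutoff and a database match."""
    return [
        f for f in features
        if f.vip > vip_cutoff and f.p_value < p_cutoff and f.db_match
    ]


# Hypothetical example values, for illustration only.
features = [
    Feature("feature_A", 1.8, 0.012, True),   # passes all three criteria
    Feature("feature_B", 0.7, 0.003, True),   # fails the VIP criterion
    Feature("feature_C", 1.3, 0.200, True),   # fails the P criterion
    Feature("feature_D", 2.1, 0.001, False),  # no database match
]
selected = select_differential(features)
print([f.name for f in selected])
```

Only features that satisfy all three conditions at once are retained, which is why the number of identified metabolites is much smaller than the number of detected peaks.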
Metabolite differences in EGC and AGC
Similarly, PCA was used to explore differences in metabolic profiles between the EGC and AGC patients, and the results are presented in Fig. 2C. No distinct separation was observed between the EGC and AGC groups. An OPLS-DA model was then built (Fig. 2D). Based on the OPLS-DA criteria (VIP > 1 and P < 0.05), 16 differentially expressed metabolites were screened out, of which 6 were finally identified as potential metabolite biomarkers between the two groups. These 6 significantly changed metabolites are listed in Table 2. PC(O-18:0/0:0) and LysoPC(20:4(5Z,8Z,11Z,14Z)) were up-regulated, whereas L-proline, L-valine, adrenic acid and pyruvaldehyde were down-regulated in AGC patients.
Table 2
Statistically significant differences in metabolite levels between EGC and AGC groups
No. | Retention time (min) | m/z | Compound | VIP value | P value | The trend of EGC |
1 | 2.04 | 138.06 | L-Proline | >1 | 0.018 | ↓ |
2 | 0.73 | 118.08 | L-Valine | >1 | 0.015 | ↓ |
3 | 4.31 | 355.26 | Adrenic acid | >1 | 0.038 | ↓ |
4 | 7.96 | 551.43 | PC(O-18:0/0:0) | >1 | 0.027 | ↑ |
5 | 5.91 | 544.34 | LysoPC(20:4(5Z,8Z,11Z,14Z)) | >1 | 0.041 | ↑ |
6 | 0.68 | 114.06 | Pyruvaldehyde | >1 | 1.50E-09 | ↓ |
Discriminant models establishment based on the ROC analysis
Receiver operating characteristic (ROC) analysis is generally considered a standard method for assessing the effectiveness of diagnostic biomarkers. In this study, an in-depth ROC curve analysis was performed for the 6 potential biomarkers (CSG vs EGC) and possible biomarker combinations. The area under the ROC curve (AUC) represents the overall accuracy of the GC diagnostic test, and the results (optimal cut-off values, sensitivities, specificities and AUC values) are listed in Table 3. The four up-regulated metabolites, L-carnitine, L-proline, pyruvaldehyde and PC(14:0/18:0), provided AUC values of 0.723 (sensitivity 57.1%, specificity 86.0%), 0.820 (sensitivity 71.4%, specificity 88.0%), 0.794 (sensitivity 100.0%, specificity 52.0%) and 0.769 (sensitivity 100.0%, specificity 50.0%), respectively, implying a good ability to discriminate GC. The other two, down-regulated, metabolites also showed good independent predictive potential, with AUC values from 0.780 to 0.849 (Table 3). Because GC is a complex disease involving biochemical dysfunction in multiple pathways, a single biomarker may not be powerful enough for discrimination in clinical practice. Identifying a combination of biomarkers with greater predictive power was therefore particularly important. The 6 selected metabolites were combined as independent variables in a binary logistic regression model, and ROC curves were used to build the best biomarker panel. As a result, all six metabolites were used in the panel, which was termed Mixmodel 1. The performance was calculated according to Eq. (1) as follows.
Table 3
ROC curve analysis of potential biomarkers in GC
No. | Groups | Metabolites | AUC | Sensitivity (%) | Specificity (%) |
1 | CSG-EGC | L-Carnitine | 0.723 | 57.1% | 86.0% |
2 | CSG-EGC | L-Proline | 0.820 | 71.4% | 88.0% |
3 | CSG-EGC | Pyruvaldehyde | 0.794 | 100% | 52.0% |
4 | CSG-EGC | PC(14:0/18:0) | 0.769 | 100% | 50.0% |
5 | CSG-EGC | LysoPC(14:0) | 0.849 | 100% | 62.0% |
6 | CSG-EGC | Lysinoalanine | 0.780 | 100% | 52.0% |
7 | CSG-EGC | Mixmodel 1 | 1 | 100% | 100% |
8 | EGC-AGC | L-Proline | 0.611 | 88.0% | 42.9% |
9 | EGC-AGC | L-Valine | 0.526 | 64.0% | 57.1% |
10 | EGC-AGC | Adrenic acid | 0.720 | 44.0% | 100% |
11 | EGC-AGC | PC(O-18:0/0:0) | 0.749 | 100% | 57.1% |
12 | EGC-AGC | LysoPC(20:4(5Z,8Z,11Z,14Z)) | 0.731 | 76.0% | 71.4% |
13 | EGC-AGC | Pyruvaldehyde | 0.657 | 40.0% | 100% |
14 | EGC-AGC | Mixmodel 2 | 0.931 | 76.0% | 100% |
where x1, x2, x3, x4, x5 and x6 represent the peak values of L-carnitine, L-proline, pyruvaldehyde, PC(14:0/18:0), LysoPC(14:0) and lysinoalanine, respectively.
The calculated results showed that the proposed biomarker panel model had an AUC value of 1 (Fig. 3A), which meant that the multivariate model showed 100% discrimination power in separating EGC patients from CSG patients.
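For readers who wish to see how the quantities reported in Table 3 are obtained, the following is a minimal Python sketch of a single-marker ROC analysis. It assumes that the optimal cut-off is chosen by maximising Youden's index (sensitivity + specificity - 1), which is a common convention but is not stated in the text. The function names and the peak values are hypothetical. For a down-regulated marker, the values would be negated before applying the same functions.

```python
def roc_auc(cases, controls):
    """AUC as the probability that a randomly chosen case scores higher
    than a randomly chosen control (ties count as 0.5)."""
    wins = 0.0
    for c in cases:
        for k in controls:
            if c > k:
                wins += 1.0
            elif c == k:
                wins += 0.5
    return wins / (len(cases) * len(controls))


def best_cutoff(cases, controls):
    """Return (cutoff, sensitivity, specificity) maximising Youden's index.
    A sample is called positive when its value is >= cutoff."""
    best = None
    for cut in sorted(set(cases) | set(controls)):
        sens = sum(v >= cut for v in cases) / len(cases)
        spec = sum(v < cut for v in controls) / len(controls)
        j = sens + spec - 1.0
        if best is None or j > best[0]:
            best = (j, cut, sens, spec)
    return best[1:]


# Hypothetical peak values for one up-regulated marker.
cases = [5.1, 6.0, 4.2, 7.3]
controls = [3.0, 4.5, 2.8, 3.9, 5.0]

auc = roc_auc(cases, controls)
cutoff, sensitivity, specificity = best_cutoff(cases, controls)
print(auc, cutoff, sensitivity, specificity)
```

The pairwise definition of the AUC used here is equivalent to the area under the empirical ROC curve and avoids any dependence on a plotting library.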
Similarly, to confirm the diagnostic potential for the early detection of GC, we examined the AUC values for EGC versus AGC patients. As listed in Table 3, 3 metabolites showed good discrimination ability, with AUC values above 0.7, and the other 3 metabolites had AUC values above 0.5. In addition, a biomarker panel was applied to distinguish AGC patients from EGC patients, in order to explore its potential for staging diagnosis. On the basis of binary logistic regression analysis, the plasma biomarker panel consisting of three metabolites, L-proline, adrenic acid and PC(O-18:0/0:0), was defined as Mixmodel 2. The performance was calculated according to the following Eq. (3).
where y1, y2 and y3 represent the peak values of L-proline, adrenic acid and PC(O-18:0/0:0), respectively. The results show that the biomarker panel had better diagnostic ability than any single metabolite in distinguishing EGC patients from AGC patients, with an AUC value of 0.931, a sensitivity of 76.0% and a specificity of 100.0% at the best cut-off point (Fig. 3B, Table 3). These results indicate that the biomarker combinations presented here not only discriminate EGC from CSG patients but are also capable of distinguishing stage I and II GC models with relatively high diagnostic accuracy.
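The way a binary logistic regression model combines several metabolite peak values into one panel score can be sketched as follows. This is an illustrative pure-Python implementation fitted by plain gradient descent. It is not the software used in the study, and the fitted coefficients of Mixmodel 1 and Mixmodel 2 are not reproduced. The input values are hypothetical and are assumed to be standardised beforehand.

```python
import math


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=2000):
    """Fit intercept b and weights w by gradient descent on the mean log-loss."""
    n, d = len(X), len(X[0])
    w = [0.0] * d
    b = 0.0
    for _ in range(epochs):
        grad_w = [0.0] * d
        grad_b = 0.0
        for xi, yi in zip(X, y):
            p = sigmoid(b + sum(wj * xj for wj, xj in zip(w, xi)))
            err = p - yi
            grad_b += err
            for j in range(d):
                grad_w[j] += err * xi[j]
        b -= lr * grad_b / n
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
    return b, w


def panel_score(b, w, x):
    """Predicted probability of belonging to the positive group."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


# Hypothetical standardised peak values for three markers (y1, y2, y3).
X = [
    [1.0, 0.5, -0.2], [0.8, 0.9, 0.1], [1.2, 0.3, 0.4],       # positive group
    [-1.0, -0.4, 0.3], [-0.7, -0.8, -0.1], [-1.1, -0.2, -0.5],  # negative group
]
y = [1, 1, 1, 0, 0, 0]

b, w = fit_logistic(X, y)
scores = [panel_score(b, w, x) for x in X]
print([round(s, 3) for s in scores])
```

The resulting panel score is a single number per sample, so it can be evaluated with the same ROC procedure as an individual metabolite.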
Metabolic pathway analysis
On the basis of the detected differential metabolites, pathway analysis was performed with MetaboAnalyst 4.0 to uncover the global metabolic disorders in CSG and GC patients. Figure 4A-B presents the major impacted pathways in the CSG-EGC and EGC-AGC groups, indicated by the red and orange colors (-log(p) > 2 or impact value > 0.1). As shown in Fig. 4, amino acid metabolism was markedly disturbed, including glycine, serine and threonine metabolism and valine, leucine and isoleucine biosynthesis, among others. Perturbations of central carbon metabolism (e.g., pyruvate metabolism) and lipid metabolism (e.g., glycerophospholipid metabolism, linoleic acid metabolism, alpha-linolenic acid metabolism and ether lipid metabolism) were also observed. The changes in the detected differential metabolites were related to these abnormal metabolic pathways, providing clues to the potential metabolic mechanisms underlying GC.
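The threshold used above to flag impacted pathways (-log(p) > 2 or impact value > 0.1) can be written as a short Python check. The sketch assumes that -log(p) denotes the natural logarithm, which the text does not state explicitly. The pathway names and values below are hypothetical placeholders, not results from the study.

```python
import math


def is_impacted(p_value, impact, neglog_cutoff=2.0, impact_cutoff=0.1):
    """Flag a pathway when -ln(p) or the impact value exceeds its cutoff.
    Assumption: -log(p) is the natural logarithm."""
    return -math.log(p_value) > neglog_cutoff or impact > impact_cutoff


# Hypothetical (p value, impact value) pairs, for illustration only.
pathways = {
    "pathway_1": (0.01, 0.05),  # flagged by the p-value criterion
    "pathway_2": (0.50, 0.30),  # flagged by the impact criterion
    "pathway_3": (0.40, 0.02),  # meets neither criterion
}
flagged = [name for name, (p, imp) in pathways.items() if is_impacted(p, imp)]
print(flagged)
```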