Metabolite detection and identification
Plasma metabolites were profiled by full-scan UPLC-Q-TOF/MS in 50 cases of CSG, 7 cases of intestinal-type EGC and 25 cases of intestinal-type AGC, and the analysis focused on the principal components that account for most of the differences in the data. Data were processed with the Progenesis QI package, and three typical total ion current (TIC) chromatograms of plasma metabolites are shown in Fig. 1. A total of 2666 peaks were detected and 50 differential metabolites were identified, including L-proline, L-isoleucine, L-leucine, L-valine, lysinoalanine, lysophosphatidylcholines (LysoPC) (12), phosphatidylcholines (PC) (16), phosphatidylethanolamines (6), L-carnitine, creatine, cholesterol, cholic acid, tyramine, uric acid, capryl carnitine, pyruvaldehyde, docosatrienoic acid, malonaldehyde and sphingosine 1-phosphate. Metabolites were identified according to the following criteria: VIP score > 1 and P < 0.05 in the EZinfo software, and a match in the HMDB, LIPID MAPS and Serum databases.
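The screening rule described above (VIP > 1 together with P < 0.05) can be sketched in Python as follows. This is a minimal illustration only: the `Feature` class, the function name and all numeric values are hypothetical and are not taken from the study or from the EZinfo software.

```python
from dataclasses import dataclass


@dataclass
class Feature:
    name: str
    vip: float      # VIP score from the multivariate model
    p_value: float  # P value from the univariate test


def select_differential(features, vip_cutoff=1.0, p_cutoff=0.05):
    """Keep features with VIP strictly above vip_cutoff and P strictly below p_cutoff."""
    return [f for f in features if f.vip > vip_cutoff and f.p_value < p_cutoff]


# Hypothetical values, for illustration only (not data from the study).
features = [
    Feature("feature_A", vip=1.8, p_value=0.004),  # passes both criteria
    Feature("feature_B", vip=0.7, p_value=0.010),  # fails the VIP criterion
    Feature("feature_C", vip=1.3, p_value=0.200),  # fails the P criterion
    Feature("feature_D", vip=1.0, p_value=0.001),  # VIP is not strictly > 1
]
selected = select_differential(features)
print([f.name for f in selected])
```

Both conditions must hold at once, so a feature with a very small P value but a VIP of exactly 1 is still excluded under this reading of the criteria.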
Differential plasma metabolic profiles among groups
Metabolite differences in CSG and intestinal-type EGC
Using SIMCA-P software, PCA (an unsupervised method) and OPLS-DA (a supervised method) were performed to compare the overall metabolic profiles of CSG and intestinal-type EGC patients. The score plots of PCA and OPLS-DA are presented in Fig. 2A-B. In the inter-group PCA, no obvious clustering pattern was observed between the two groups. In contrast, the OPLS-DA score plot (CSG vs EGC) showed tight within-group clustering and complete separation of the two groups. Of all peaks detected by UPLC-Q-TOF/MS, the areas of 30 peaks differed significantly between CSG and intestinal-type EGC patients (VIP > 1, P < 0.05), and these signals were matched against the HMDB, LIPID MAPS and Serum databases. The molecules responsible for these peaks were then further identified by comparing their MS/MS spectra with the Metlin database. Six metabolites were identified in the metabolic profiles: L-carnitine, L-proline, pyruvaldehyde, PC(14:0/18:0), LysoPC(14:0) and lysinoalanine. Identification and statistical analysis indicated a significant elevation of L-carnitine, L-proline, pyruvaldehyde and PC(14:0/18:0), and a significant reduction of LysoPC(14:0) and lysinoalanine, in intestinal-type EGC compared with CSG, as shown in Table 1. Actual fold-change values are represented as scatter dot plots in Fig. S3.
Table 1 Statistically significant metabolites for comparison of CSG and intestinal-type EGC patients
NO | Retention time (min) | m/z | Compound | VIP value | P value | Means ± SEM | Confidence interval (95%) | The trend of EGC
1 | 0.67 | 162.11 | L-Carnitine | >1 | 0.0081* | 3014 ± 882.5 | (1246 to 4783) | ↑
2 | 2.04 | 138.06 | L-Proline | >1 | 0.0003* | 399.7 ± 85.42 | (228.6 to 570.9) | ↑
3 | 0.68 | 114.07 | Pyruvaldehyde | >1 | 0.0036 | 800.2 ± 263.2 | (272.7 to 1328) | ↑
4 | 10.30 | 734.57 | PC(14:0/18:0) | >1 | <0.0001* | 4044 ± 906.4 | (2227 to 5860) | ↑
5 | 5.65 | 468.31 | LysoPC(14:0) | >1 | 0.0081* | -1340 ± 558.0 | (-2458 to -221.7) | ↓
6 | 2.85 | 251.17 | Lysinoalanine | >1 | 0.002 | -750.6 ± 421.5 | (-1595 to 94.11) | ↓
Parameters remaining significant after FDR correction are marked with a *
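The 95% confidence intervals in Table 1 appear consistent with a t-based interval around a difference between group means, that is, difference ± t × SEM with n1 + n2 - 2 degrees of freedom. The Python sketch below checks this for the L-carnitine row. Reading the "Means ± SEM" column as a difference between group means, the unpaired design and the tabulated t quantiles are assumptions made for illustration, not statements from the study.

```python
# Two-sided 95% critical values of Student's t distribution (standard table values).
# Assumption: unpaired comparison, so degrees of freedom = n1 + n2 - 2.
T_CRIT_975 = {
    55: 2.004,  # 50 CSG + 7 EGC - 2
    30: 2.042,  # 7 EGC + 25 AGC - 2
}


def ci95(diff, sem, df):
    """Return (lower, upper) bounds of diff +/- t * sem for the given degrees of freedom."""
    half_width = T_CRIT_975[df] * sem
    return diff - half_width, diff + half_width


# L-carnitine row of Table 1: 3014 ± 882.5, with 50 CSG and 7 EGC cases.
low, high = ci95(3014, 882.5, df=50 + 7 - 2)
print(low, high)  # close to the tabulated interval (1246 to 4783)
```

The small remaining gap to the tabulated bounds is expected, because the inputs in the table are themselves rounded. The same check with 30 degrees of freedom can be applied to the EGC vs AGC rows of Table 2.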
Metabolite differences in intestinal-type EGC and AGC
Similarly, PCA was used to explore differences in metabolic profiles between intestinal-type EGC and AGC patients, and the results are presented in Fig. 2C. No distinct separation was observed between the intestinal-type EGC and AGC groups. An OPLS-DA model was then constructed (Fig. 2D). Based on the OPLS-DA criteria (VIP > 1 and P < 0.05), 16 differential metabolic features were screened out, of which 6 were identified as potential metabolite biomarkers between the two groups. These 6 significantly changed metabolites are listed in Table 2. PC(O-18:0/0:0) and LysoPC(20:4(5Z,8Z,11Z,14Z)) were up-regulated, whereas L-proline, L-valine, adrenic acid and pyruvaldehyde were down-regulated, in intestinal-type AGC patients. Actual fold-change values are represented as scatter dot plots in Fig. S3.
Table 2 Statistically significant differences in metabolite levels between intestinal-type EGC and AGC groups
NO | Retention time (min) | m/z | Compound | VIP value | P value | Means ± SEM | Confidence interval (95%) | The trend of AGC
1 | 2.04 | 138.06 | L-Proline | >1 | 0.0113* | 2200 ± 977.3 | 204.4 to 4196 | ↓
2 | 0.73 | 118.08 | L-Valine | >1 | 0.0051* | 2058 ± 936.6 | 144.8 to 3971 | ↓
3 | 4.31 | 355.26 | Adrenic acid | >1 | 0.0159* | 425.5 ± 314.2 | -216.2 to 1067 | ↓
4 | 7.96 | 551.43 | PC(O-18:0/0:0) | >1 | 0.0076* | -1497 ± 523.1 | -2565 to -428.2 | ↑
5 | 5.91 | 544.34 | LysoPC(20:4(5Z,8Z,11Z,14Z)) | >1 | 0.0153* | -2644 ± 1028 | -4743 to -543.8 | ↑
6 | 0.68 | 114.06 | Pyruvaldehyde | >1 | 0.0352 | 954.0 ± 650.4 | -374.2 to 2282 | ↓
Parameters remaining significant after FDR correction are marked with a *
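The table footnotes do not state which FDR procedure was applied. The Benjamini-Hochberg step-up procedure is a common choice and is sketched below in Python purely as an assumption; the P values are hypothetical and the function is not the authors' implementation.

```python
def benjamini_hochberg(p_values, alpha=0.05):
    """Return a list of booleans: True where the test is significant at FDR level alpha."""
    m = len(p_values)
    # Indices of the P values in ascending order.
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) such that p_(k) <= (k / m) * alpha.
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * alpha:
            max_rank = rank
    # Every hypothesis ranked at or below max_rank is declared significant.
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant


# Hypothetical P values, for illustration only.
p_hypothetical = [0.001, 0.008, 0.039, 0.041, 0.20]
flags = benjamini_hochberg(p_hypothetical)
print(flags)
```

In this example the third and fourth P values are below 0.05 but do not survive the correction, which is the situation the asterisks in the tables are meant to flag.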
Construction of the Random Forest model
Random Forest, a supervised machine-learning algorithm, was used as a classifier to distinguish CSG, EGC and AGC patients. ROC curves were generated from the classification results; they summarize all classification runs and estimate classification performance over all available data. As shown in Fig. 2E, the area under the receiver operating characteristic curve (AUC) was 97.65% (95% confidence interval [CI], 93.83–100%) in the training set, suggesting that patients with CSG could be effectively distinguished from EGC patients. In the testing set, the AUC for distinguishing EGC from AGC was 60.94% (95% CI, 42.79–79.09%) (Fig. 2F).
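The AUC reported above can be read as the probability that a randomly chosen positive case receives a higher classifier score than a randomly chosen negative case, with ties counted as one half. The Python sketch below computes the AUC from this pairwise definition. The scores are hypothetical and the function is an illustration, not the pipeline used in the study.

```python
def roc_auc(scores_pos, scores_neg):
    """AUC as the fraction of positive/negative pairs ranked correctly (ties count 0.5)."""
    wins = 0.0
    for sp in scores_pos:
        for sn in scores_neg:
            if sp > sn:
                wins += 1.0
            elif sp == sn:
                wins += 0.5
    return wins / (len(scores_pos) * len(scores_neg))


# Hypothetical classifier scores, for illustration only.
egc_scores = [0.9, 0.8, 0.55]       # positive class
csg_scores = [0.7, 0.4, 0.3, 0.55]  # negative class
auc = roc_auc(egc_scores, csg_scores)
print(auc)
```

Under this definition an AUC of 0.5 corresponds to a classifier that ranks cases no better than chance, which is a useful reference point when reading a confidence interval for the AUC.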
Establishment of discriminant models based on ROC analysis
Table 3 ROC curve analysis of potential biomarkers in intestinal-type GC
No. | Groups | Metabolites | AUC | Sensitivity (%) | Specificity (%)
1 | CSG-EGC | L-Carnitine | 0.723 | 57.1% | 86.0%
2 | CSG-EGC | L-Proline | 0.820 | 71.4% | 88.0%
3 | CSG-EGC | Pyruvaldehyde | 0.794 | 100% | 52.0%
4 | CSG-EGC | PC(14:0/18:0) | 0.769 | 100% | 50.0%
5 | CSG-EGC | LysoPC(14:0) | 0.849 | 100% | 62.0%
6 | CSG-EGC | Lysinoalanine | 0.780 | 100% | 52.0%
7 | CSG-EGC | Mixmodel 1 | 1 | 100% | 100%
8 | EGC-AGC | L-Proline | 0.611 | 88.0% | 42.9%
9 | EGC-AGC | L-Valine | 0.526 | 64.0% | 57.1%
10 | EGC-AGC | Adrenic acid | 0.720 | 44.0% | 100%
11 | EGC-AGC | PC(O-18:0/0:0) | 0.749 | 100% | 57.1%
12 | EGC-AGC | LysoPC(20:4(5Z,8Z,11Z,14Z)) | 0.731 | 76.0% | 71.4%
13 | EGC-AGC | Pyruvaldehyde | 0.657 | 40.0% | 100%
14 | EGC-AGC | Mixmodel 2 | 0.931 | 76.0% | 100%
|
Receiver operating characteristic (ROC) analysis is generally considered a standard method for assessing the effectiveness of diagnostic biomarkers. In this study, an in-depth ROC curve analysis was performed for the 6 potential biomarkers (CSG vs EGC) and possible biomarker combinations. The area under the ROC curve (AUC) represents the overall accuracy of the diagnostic test for intestinal-type GC, and the results (AUC values, and sensitivities and specificities at the optimal cut-off values) are listed in Table 3. The four up-regulated metabolites, L-carnitine, L-proline, pyruvaldehyde and PC(14:0/18:0), provided AUC values of 0.723 (sensitivity 57.1%, specificity 86.0%), 0.820 (sensitivity 71.4%, specificity 88.0%), 0.794 (sensitivity 100.0%, specificity 52.0%) and 0.769 (sensitivity 100.0%, specificity 50.0%), respectively, which implied a good ability to discriminate intestinal-type GC. The two down-regulated metabolites also showed good independent predictive potential, with AUC values of 0.849 for LysoPC(14:0) and 0.780 for lysinoalanine (Table 3). Because intestinal-type GC is a complex disease involving biochemical dysfunction in multiple pathways, a single biomarker may not be powerful enough for discrimination in clinical practice. Identifying a combination of biomarkers with greater predictive power was therefore particularly important. The 6 selected metabolites were entered as independent variables into a binary logistic regression model, and ROC curves were used to build the best biomarker panel. As a result, all six metabolites were retained in the panel, which was termed Mixmodel 1. The performance was calculated according to equation (1) as follows.
where x1, x2, x3, x4, x5 and x6 represent the peak values of L-carnitine, L-proline, pyruvaldehyde, PC(14:0/18:0), LysoPC(14:0) and lysinoalanine, respectively.
The calculated results showed that the proposed biomarker panel had an AUC value of 1 (Fig. 3A), which meant that the multivariate model provided an acceptable fit to the data across test samples.
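For orientation, a binary logistic regression panel of this kind combines the peak values into a single predicted probability through the logistic function. The Python sketch below shows the generic form only. The coefficients, intercept and peak values are placeholders chosen for illustration; they are not the fitted values of Mixmodel 1, which are not reproduced here.

```python
import math


def panel_probability(peaks, coefficients, intercept):
    """Generic logistic model: p = 1 / (1 + exp(-(intercept + sum(b_i * x_i))))."""
    z = intercept + sum(b * x for b, x in zip(coefficients, peaks))
    return 1.0 / (1.0 + math.exp(-z))


# Placeholder coefficients for six variables x1..x6 (NOT the fitted Mixmodel 1 values).
coefs = [0.001, 0.002, 0.001, 0.0005, -0.002, -0.001]
intercept = -2.0
# Hypothetical peak values for one sample, in the same order as the coefficients.
peaks = [1000.0, 500.0, 800.0, 2000.0, 600.0, 400.0]

p = panel_probability(peaks, coefs, intercept)
print(p)
```

The resulting probability is the quantity to which an ROC analysis would then be applied, so that a single curve summarizes the combined panel rather than each metabolite separately.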
Similarly, to confirm the diagnostic potential for the early detection of intestinal-type GC, we examined the AUC values for EGC versus AGC patients. As listed in Table 3, 3 metabolites showed good discrimination ability, with AUC values above 0.7, and the other 3 had AUC values above 0.5. In addition, a biomarker panel was applied to distinguish AGC patients from EGC patients, to explore its potential for staging diagnosis. On the basis of binary logistic regression analysis, the plasma biomarker panel consisting of three metabolites, L-proline, adrenic acid and PC(O-18:0/0:0), was defined as Mixmodel 2. The performance was calculated according to the following equation (3).
where y1, y2 and y3 represent the peak values of L-proline, adrenic acid and PC(O-18:0/0:0), respectively. The results show that the biomarker panel had better diagnostic ability than any single metabolite in distinguishing intestinal-type EGC patients from AGC patients, with an AUC value of 0.931, a sensitivity of 76.0% and a specificity of 100.0% at the best cut-off point (Fig. 3B, Table 3). These results indicate that the biomarker combinations presented here not only discriminate EGC from CSG patients, but are also capable of distinguishing between stage I and II intestinal-type GC with relatively high diagnostic accuracy.
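The text does not say how the best cut-off point was chosen. One common convention, assumed here, is to maximize Youden's index J = sensitivity + specificity - 1 over the observed scores. The Python sketch below illustrates that convention with hypothetical scores; it is not the authors' procedure.

```python
def sens_spec(pos, neg, cutoff):
    """Call a score >= cutoff positive; return (sensitivity, specificity)."""
    true_pos = sum(s >= cutoff for s in pos)
    true_neg = sum(s < cutoff for s in neg)
    return true_pos / len(pos), true_neg / len(neg)


def best_cutoff(pos, neg):
    """Scan all observed scores and return the one maximizing Youden's J."""
    candidates = sorted(set(pos) | set(neg))
    return max(candidates, key=lambda c: sum(sens_spec(pos, neg, c)) - 1.0)


# Hypothetical panel scores, for illustration only.
agc_scores = [0.9, 0.8, 0.7, 0.35]  # positive class
egc_scores = [0.4, 0.3, 0.2]        # negative class

cut = best_cutoff(agc_scores, egc_scores)
sens, spec = sens_spec(agc_scores, egc_scores, cut)
print(cut, sens, spec)
```

A cut-off chosen this way can give perfect specificity with incomplete sensitivity, which is the same pattern as the sensitivity and specificity reported for Mixmodel 2.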
Metabolic Pathway Analysis
On the basis of the detected differential metabolites, pathway analysis was performed with MetaboAnalyst 4.0 to uncover global metabolic disorders in CSG and intestinal-type GC patients. Fig. 4A-B presents the major impacted pathways in the CSG-EGC and EGC-AGC comparisons, indicated by red and orange colors (-log(p) > 2 or impact value > 0.1). As shown in Fig. 4, amino acid metabolism was strikingly disturbed, including glycine, serine and threonine metabolism and valine, leucine and isoleucine biosynthesis. Perturbations of central carbon metabolism (e.g., pyruvate metabolism) and lipid metabolism (e.g., glycerophospholipid metabolism, linoleic acid metabolism, alpha-linolenic acid metabolism and ether lipid metabolism) were also observed. The changes in the detected differential metabolites were related to these abnormal metabolic pathways, providing clues to the underlying metabolic mechanism of intestinal-type GC.
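The pathway selection rule quoted above (-log(p) > 2 or impact value > 0.1) can be written as a simple filter. In the Python sketch below the logarithm is taken as the natural logarithm, which is an assumption about how the pathway tool reports -log(p). The pathway names and values are hypothetical.

```python
import math


def is_impacted(p_value, impact, neg_log_p_cutoff=2.0, impact_cutoff=0.1):
    """True if -ln(p) exceeds the cutoff OR the pathway impact exceeds its cutoff."""
    return -math.log(p_value) > neg_log_p_cutoff or impact > impact_cutoff


# Hypothetical (p value, impact) pairs, for illustration only.
pathways = {
    "pathway_A": (0.01, 0.00),  # passes on -log(p) alone
    "pathway_B": (0.50, 0.25),  # passes on impact alone
    "pathway_C": (0.40, 0.02),  # passes on neither
}
impacted = [name for name, (p, impact) in pathways.items() if is_impacted(p, impact)]
print(impacted)
```

Because the two conditions are joined by "or", a pathway with a large topological impact is retained even when its enrichment P value is unremarkable.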