Key gene modules associated with the progression of chronic gastritis to early GC
A total of 18,340 genes in GSE55696 were used for WGCNA. β=9 was the lowest power at which the scale-free fit index (R2) reached 0.85, and it was chosen as the soft-thresholding power to ensure a scale-free network in GSE55696. Genes with similar expression patterns were grouped into 5 co-expression modules in GSE55696: the brown module (R=0.59, P=1e-04), turquoise module (R=0.58, P=1e-04), blue module (R=-0.77, P=1e-08), yellow module (R=0.77, P=1e-08) and gray module (R=0.38, P=0.02) (Fig. 2). The gray module is the set of genes that could not be assigned to any module. We identified the blue module in GSE55696 (R=0.05, P=0.8) as a key gene module associated with the progression of chronic gastritis to early GC. The blue module in GSE55696 contained 140 genes, and this gene set was defined as GS1 for the next step of analysis.
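The selection rule described above (take the lowest power whose scale-free fit reaches the cutoff) can be illustrated with a minimal Python sketch. The function name `pick_soft_threshold` and the fit values in `toy_fit` are hypothetical and are not the values obtained for GSE55696; the WGCNA analysis itself is not reproduced here.

```python
def pick_soft_threshold(fit_by_power, r2_cutoff=0.85):
    """Return the lowest power whose scale-free fit R^2 meets the cutoff, or None."""
    eligible = [power for power, r2 in fit_by_power.items() if r2 >= r2_cutoff]
    return min(eligible) if eligible else None


# Toy fit indices for illustration only (NOT taken from GSE55696).
toy_fit = {6: 0.71, 7: 0.78, 8: 0.83, 9: 0.85, 10: 0.88, 12: 0.91}

chosen_power = pick_soft_threshold(toy_fit)
print(chosen_power)
```

If no candidate power reaches the cutoff, the function returns `None` rather than guessing, so the caller has to decide how to proceed.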
Key module gene protein interaction network construction and enrichment analysis
The 137 genes in GS1 were analyzed for protein-protein interactions (PPI) using the STRING database, and the PPI network was visualized with Cytoscape 3.8.0; it contained 67 nodes and 64 edges (Fig. 3A). KOBAS database analysis showed that these genes were mainly enriched in lipid metabolism pathways, such as fat digestion and absorption and cholesterol metabolism, and in pathways associated with H. pylori, such as the NOD-like receptor signaling pathway and epithelial cell signaling in H. pylori infection (Fig. 3B, C).
Validation cohort: differential gene analysis in chronic gastritis and early GC
To validate our results, we performed differential gene analysis on the GSE130823 dataset. A total of 1,494 DEGs were identified in GSE130823, including 758 up-regulated and 736 down-regulated genes. These DEGs were defined as GS2 and displayed as a volcano plot drawn with the R ggplot2 package (Fig. 4A). GS1 and GS2 shared 15 crossover genes: PEBP4, PTPRZ1, CARNS1, PLA2G2A, CHI3L1, RGL3, OSM, CLIC5, CHAD, HOTAIR, IFI44L, CXCL5, CIRBP, KRT6B and CA1 (Fig. 4B). GO and KEGG analyses showed that these crossover genes were enriched in fat digestion and absorption and epithelial cell signaling in H. pylori infection, among other terms, which was consistent with the GS1 enrichment analysis (Fig. 4C, D).
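The crossover step is a plain set intersection between the module gene list and the DEG list. A minimal sketch follows, assuming both lists are available as gene symbols; `crossover_genes` is a hypothetical helper, and the `PLACEHOLDER_*` entries stand in for genes that are not listed in the text.

```python
def crossover_genes(module_genes, deg_genes):
    """Return the gene symbols present in both lists, sorted alphabetically."""
    return sorted(set(module_genes) & set(deg_genes))


# Toy inputs: three symbols from the reported overlap plus made-up placeholders.
toy_gs1 = ["CA1", "PEBP4", "PLACEHOLDER_A", "OSM"]
toy_gs2 = ["OSM", "PLACEHOLDER_B", "CA1", "PEBP4"]

shared = crossover_genes(toy_gs1, toy_gs2)
print(shared)  # ['CA1', 'OSM', 'PEBP4']
```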
Survival analysis and validation of crossover genes
To investigate the prognostic value of the 15 crossover genes, we analyzed them with the Kaplan-Meier plotter platform. Nine of the 15 genes were associated with unfavorable OS in GC patients, suggesting that CA1, CARNS1, CHAD, CLIC5, CXCL5, KRT6B, OSM, PEBP4, and RGL3 could be used as biomarkers for the progression of chronic gastritis to early GC (Fig. 5). The mRNA expression levels of these nine survival-related genes were then validated between normal and GC samples using the GEPIA database.
Candidate compound screening and compound-target network related to progression of chronic gastritis into early GC
Nine targets were related to poor OS in patients with GC and were eligible as biomarkers for the progression of chronic gastritis into early GC. Among them, the targets that could be matched with small-molecule compounds in the HERB database (CA1, CARNS1, CHAD, CXCL5, KRT6B, OSM, PEBP4 and RGL3) were defined as potential targets.
A total of 87 target-related components were retrieved from the HERB database, and 59 candidate compounds were further selected according to ADME parameters and Lipinski's rule (Table 1). These compounds correspond to 5 potential targets. A target-compound network of the potential targets was established, consisting of 34 nodes and 33 edges (Fig. 6A). A higher degree indicates that a node is a stronger hub and plays a greater regulatory role in the whole network. The compounds ranked in the top two by degree (including those tied for second) were glycerin, immune globulin, D-malic acid (DMR), and aminobutyric acid (D-2-Aminobutyrate), corresponding to 3, 2, 2 and 2 targets, respectively. These compounds may act on multiple target proteins related to the progression of chronic gastritis to early GC. The top three targets by degree were CXCL5, CA1 and CARNS1.
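Node degree in a target-compound network is simply the number of edges touching each node. A minimal sketch follows, assuming the network is stored as a list of (compound, target) pairs; the edge list and all node names below are placeholders and do not reproduce the network in Fig. 6A.

```python
from collections import Counter


def node_degrees(edges):
    """Count how many edges touch each node in an undirected edge list."""
    degree = Counter()
    for compound, target in edges:
        degree[compound] += 1
        degree[target] += 1
    return degree


# Placeholder edges for illustration only.
toy_edges = [
    ("compound_A", "target_1"),
    ("compound_A", "target_2"),
    ("compound_A", "target_3"),
    ("compound_B", "target_1"),
    ("compound_B", "target_2"),
    ("compound_C", "target_1"),
]

degrees = node_degrees(toy_edges)
compounds = {compound for compound, _ in toy_edges}
# Rank compounds by degree, highest first; ties are broken alphabetically.
ranked_compounds = sorted(compounds, key=lambda name: (-degrees[name], name))
print(ranked_compounds)
```

Here `compound_A` ranks first because it touches three targets, which mirrors how a compound linked to 3 targets outranks those linked to 2.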
Table 1. Abbreviations corresponding to compounds and OB values.
| NO. | Ingredient name | OB | NO. | Ingredient name | OB |
|---|---|---|---|---|---|
| HBIN1 | 1-methoxyindole-3-carbaldehyde | 0.55 | HBIN28 | pedatisectine b | 0.55 |
| HBIN2 | 2-Diazo-2,3-dihydro-3-methyl-1H-inden-1-one | 0.55 | HBIN29 | cis-9-octadecenoic acid | 0.85 |
| HBIN3 | 2-Methyl-N-phenylmaleimide | 0.55 | HBIN30 | oleic acid | 0.85 |
| HBIN4 | (2R)-3-oxo-2-phenylbutanenitrile | 0.55 | HBIN31 | absinthin | 0.55 |
| HBIN5 | 3-methoxy-1H-quinazoline-2,4-dione | 0.55 | HBIN32 | acetylcholine | 0.55 |
| HBIN6 | 8-dihydroxy-4,5-dimethyl-3,4-dihydronaphthalen-1(2H)-one | 0.55 | HBIN33 | allyl isothiocyanate | 0.55 |
| HBIN7 | 7-hydroxy-8-(2-hydroxyethyl)coumarin | 0.55 | HBIN34 | cascarillin | 0.55 |
| HBIN8 | Acetovanillin | 0.55 | HBIN35 | cascarillin c | 0.55 |
| HBIN9 | FMT | 0.85 | HBIN36 | colchine | 0.55 |
| HBIN10 | fraxetin | 0.55 | HBIN37 | coumari | 0.55 |
| HBIN11 | methyl (2S)-3-hydroxy-2-phenylpropanoate | 0.55 | HBIN38 | coumarin | 0.55 |
| HBIN12 | Methyl 4-hydroxyphenylacetate | 0.55 | HBIN39 | ethylpyrazine | 0.55 |
| HBIN13 | noradrenaline | 0.55 | HBIN40 | evodin | 0.55 |
| HBIN14 | Phlegmariuine-N | 0.55 | HBIN41 | grosheimin | 0.55 |
| HBIN15 | procurcumadiol | 0.55 | HBIN42 | grossheimin | 0.55 |
| HBIN16 | Scopoletol | 0.55 | HBIN43 | isohumulone a | 0.56 |
| HBIN17 | (S)-tropic acid | 0.85 | HBIN44 | limonin | 0.55 |
| HBIN18 | Tuberosine B | 0.85 | HBIN45 | mustard oil | 0.55 |
| HBIN19 | 3-methylhistidine | 0.55 | HBIN46 | nigakilactone d | 0.55 |
| HBIN20 | 6-aminopurine | 0.55 | HBIN47 | (-)-noradrenaline | 0.55 |
| HBIN21 | aminobutyric acid | 0.55 | HBIN48 | obakulactone | 0.55 |
| HBIN22 | D-malic acid | 0.56 | HBIN49 | Propanoic acid | 0.85 |
| HBIN23 | glycerin | 0.55 | HBIN50 | Pyrazine | 0.55 |
| HBIN24 | Immune globulin from | 0.55 | HBIN51 | serotonine | 0.55 |
| HBIN25 | L-lysine | 0.55 | HBIN52 | uccinic acid | 0.85 |
| HBIN26 | lysine acid | 0.55 | HBIN53 | 17-beta-estradiol | 0.55 |
| HBIN27 | n-thirteen(carbon)alkyl | 0.55 | HBIN54 | (-)-nicotine | 0.55 |
Target-compound-herb network construction
A total of 87 candidate compounds were matched with 248 TCM herbs. Based on the relationships between compounds and herbs, a compound-herb network was constructed, which contained 337 nodes and 405 edges. The top 13 herbs by degree were Fructus Hippophae, Radix Bupleuri, Radix Peucedani, Radix Isatidis, Cortex Phellodendri Chinensis, Fructus Mori, Arum Ternatum Thunb., Rhizoma Typhonii, Radix Stemonae, Radix Saposhnikoviae, Herba Portulacae, Fructus Amomi and Fructus Evodiae (Fig. 6B). The targets of each herb were collected by bridging through its candidate compounds. The results showed that Radix Bupleuri, Radix Isatidis, Fructus Mori, Fructus Hippophae and Rhizoma Typhonii could be associated with 5, 4, 4, 4 and 4 targets, respectively. These five herbs may therefore have strong modulating effects on the progression of chronic gastritis to early GC, which can provide a basis for the selection of herbs in clinical or experimental settings. Ramulus Cinnamomi, Cortex Phellodendri Chinensis, Radix Stemonae, Herba Ephedra, Radix Saposhnikoviae, Herba Portulacae, Radix Angelicae Dahuricae, Cortex Periplocae Radicis, Arum Ternatum Thunb., Silybum Marianum, Lilii Bulbus, Lygodii Spora, Fructus Chaenomelis, Radix Aucklandiae, Folium Ginseng and Fructus Amomi could each be associated with 2 targets (Fig. 6C). Compounds with a degree value >6 were regarded as potential core compounds; these included oleic acid, coumarin, scopoletol, limonin, glycerin, fraxetin, noradrenaline, 6-aminopurine, pedatisectine b, (-)-nicotine, acetylcholine, FMT, Propanoic acid (PPI), evodin and obakulactone (Fig. 6D). The comparison table of Chinese medicine abbreviations is shown in Appendix 1.
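Collecting herb targets "through the bridging of candidate compounds" amounts to taking, for each herb, the union of the targets of its compounds. A minimal sketch follows, assuming two lookup tables (herb to compounds, compound to targets); the helper `herb_targets` and all names in the toy data are placeholders, not entries from the HERB database.

```python
def herb_targets(herb_to_compounds, compound_to_targets):
    """Map each herb to the union of targets reached via its compounds."""
    result = {}
    for herb, compounds in herb_to_compounds.items():
        targets = set()
        for compound in compounds:
            # Compounds without a known target contribute nothing.
            targets |= set(compound_to_targets.get(compound, ()))
        result[herb] = targets
    return result


# Placeholder mappings for illustration only.
toy_herb_to_compounds = {
    "herb_X": ["compound_A", "compound_B"],
    "herb_Y": ["compound_B", "compound_D"],
}
toy_compound_to_targets = {
    "compound_A": ["target_1", "target_2"],
    "compound_B": ["target_2", "target_3"],
}

bridged = herb_targets(toy_herb_to_compounds, toy_compound_to_targets)
target_counts = {herb: len(targets) for herb, targets in bridged.items()}
print(target_counts)
```

Using a set per herb means a target reached through two different compounds is counted once, which is what a per-herb target count requires.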
The results of molecular docking of core compounds and targets
Docking of the five potential core compounds against the five core targets, CA1 (PDB ID: 7Q0D), CARNS1 (UniProt ID: A5YM72), CXCL5 (PDB ID: 2MGS), CHAD (PDB ID: 5LFN) and KRT6B (UniProt ID: P04259), produced 25 sets of results, of which 20 (80%) had an affinity < -5 kcal/mol and 8 (32%) had an affinity < -7 kcal/mol (Fig. 7). Each ligand was embedded in the active pocket of its target and interacted with multiple residues of the target through hydrophobic interactions and hydrogen bonding. Among the compound-target pairs already present in the network, the strongest binding was CXCL5-limonin (-8.7 kcal/mol) and the weakest was CXCL5-coumarin (-5.7 kcal/mol), indicating that the screened potential core compounds may have good binding activity with the core targets. Twenty pairs were new, that is, outside the network; the top three by affinity were CHAD-limonin (-11.3 kcal/mol), CA1-limonin (-9.8 kcal/mol), and CARNS1-limonin (-9.1 kcal/mol). Fifteen of the off-network pairs had an affinity < -5 kcal/mol.
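The threshold counts reported above can be derived from a table of docking scores, where a more negative affinity means stronger predicted binding. The sketch below uses only the five affinities quoted in the text, so its counts cover that subset and are not the 20/25 and 8/25 figures for the full set; `summarize_docking` is a hypothetical helper.

```python
def summarize_docking(affinities, cutoffs=(-5.0, -7.0)):
    """Count pairs below each cutoff and return the strongest-binding pair."""
    counts = {
        cutoff: sum(1 for value in affinities.values() if value < cutoff)
        for cutoff in cutoffs
    }
    # The most negative affinity (kcal/mol) is the strongest predicted binding.
    best_pair = min(affinities, key=affinities.get)
    return counts, best_pair


# Only the five pairs whose affinities are quoted in the text (kcal/mol).
quoted_affinities = {
    "CXCL5-limonin": -8.7,
    "CXCL5-coumarin": -5.7,
    "CHAD-limonin": -11.3,
    "CA1-limonin": -9.8,
    "CARNS1-limonin": -9.1,
}

counts, best_pair = summarize_docking(quoted_affinities)
print(counts, best_pair)
```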