2.1. The active components and targets of SJZT
In this study, "Renshen", "Gancao", "Baizhu" and "Fuling" were used as keywords to obtain the active components and corresponding targets of SJZT in the Traditional Chinese Medicine Systems Pharmacology Database and Analysis Platform (TCMSP). Threshold values for our ADME evaluation system were oral bioavailability (OB) ≥ 30% and drug-likeness (DL) ≥ 0.18, as recommended by TCMSP guidelines. After ADME screening, 135 unduplicated active compounds were obtained (Supplemental Table 1), and these components corresponding to 247 unduplicated targets. The components were renamed according to the source of the active components, e.g., celabenzine is derived from ginseng and is renamed RS1; aposiopolamine is also derived from ginseng and is renamed RS2; kaempferol is contained in both liquorice and ginseng and is renamed A1 (Supplemental Table 2). The results showed that these four botanical drugs have a potential synergistic effect at the target level, without many overlaps at the component level. Liquorice was the botanical drug contributing to the highest proportion of collected components and targets.
2.2. Potential targets of SJZT in the treatment of HCC
To obtain the potential targets of SJZT in the treatment of HCC, HCC-related targets were collected in four databases, including GeneCards, DisGeNet, GEO and TCGA. By retrieving the keywords "liver cancer", "hepatoma", "hepatic carcinoma" and "hepatocellular carcinoma" in the GeneCards and DisGeNet databases, 17763 and 5978 HCC-related genes were collected, respectively. By using the R limma package with FDR < 0.05 and | log2 (fold change) | > 1 as thresholds, 6235 differentially expressed genes in HCC were identified in six HCC-associated microarrays (GSE36376, GSE101685, GSE62232, GSE112790, GSE84402, GSE121248) in the GEO database (Fig. 1A-L), and 9038 differentially expressed genes in HCC were collected from the TCGA database. Then, we crossed the HCC-related targets obtained from different databases with the targets of SJZT through Venn diagram. Finally, 43 genes were identified as potential targets of SJZT in the treatment of HCC (Fig. 2).
2.3. Herb-component-target network analysis
As TCM formulas exhibit multiple pharmacological activities via multiple targets, it was constructive for us to explore the underlying mechanisms of TCM formulas on complex diseases by network analysis. An herb-component-target network of SJZT was created based on the targets and their corresponding components. The network included 359 nodes and 1875 edges represented the interactions between the components and the targets (Fig. 3A). This network visually displayed the regulatory relationship among drugs, ingredients and targets. We discovered that the same active component could act on multiple targets. The exact target also corresponds to separate chemical components, which fully reflect the multicomponent and multitarget characteristics of SJZT in HCC treatment.
To obtain the main active components of SJZT for the treatment of HCC, the network was screened based on the network node topological parameter “degree”. Based on the degree value ranking, we initially screened the main active compounds of SJZT for the treatment of HCC, including quercetin, kaempferol, beta-sitosterol, 7-methoxy-2-methylisoflavone, formononetin, naringenin, stigmasterol, medicarpin and isorhamnetin (Fig. 3B).
2.4. GO and KEGG enrichment analysis
To further elucidate the biological functions of the potential target genes of SJZT in the treatment of HCC, we performed GO enrichment and KEGG pathway analyses. The results of GO enrichment analysis included 823 BP (biological process) terms, 85 MF (molecular function) terms and 41 CC (cellular component) terms. In BP, the cellular response to inorganic substances, response to cadmium ion, and response to metal ion were the most significant (Fig. 4A). In CC, caveola and plasma membrane raft were the most significant (Fig. 4B). In MF, steroid binding was the main components (Fig. 4C). Sixty-three relevant pathways were obtained by KEGG pathway enrichment, mainly pathways involved in chemical carcinogenesis, fluid shear stress and altheroscierosis, IL-17 signaling pathway, AGE-RAGE signaling pathway in the pathogenesis of diabetes (Fig. 4D).
2.5. Exploring hub genes
Based on the previous results, 43 potential targets were imported into the STRING 11.0 database to construct a PPI network. The minimum combined score between the targets was adjusted as the medium confidence (0.400). The PPI network of the potential targets was saved as a TSV file and then entered into Cytoscape 3.8.2 for further analysis. The network included 42 nodes and 174 edges (Fig. 5A), and then we used two methods to screen the hub genes. First, we explored the significant modules by using the MCODE plug-in in Cytoscape, and one significant module from the PPI network was screened, which consisted of 12 nodes and 118 edges (Supplemental Fig. 1A). At the same time, we also used cytoHubba in Cytoscape to find the hub genes, and the top 10 hub genes were generated using the multiscale curvature classification algorithm. Interestingly, we found that these genes were all contained in the modules we had previously identified through the MCODE plug-in (Fig. 5B). Therefore, we inferred that these 10 genes could act as hub genes in SJZT treating on HCC, including MMP9, CCL2, PTGS2, SERPINE1, MMP1, SPP1, ESR1, CRP, IGFBP3 and HMOX1.
2.6. Molecular docking
We investigated the interactions between the main active compounds and the proteins of hub genes through molecular docking to further screen for active compounds with good anticancer activity. The molecular docking results are shown as free binding energies (kcal/mol), and the lower the binding energy, the better the affinity. Binding affinities below − 5.0 kcal/mol indicated confirmation of good interactions. The binding energies of the main compounds and hub targets were all below − 5.0 kcal/mol, indicating good interaction with the targets (Fig. 6). Among them, quercetin, stigmasterol and formononetin have a stronger affinity with most of the hub targets. Moreover, previous studies have also shown that these compounds have good anticancer activity, which suggesting that quercetin, stigmasterol and formononetin could be the most important active compounds of SJZT in the treatment of HCC.
2.7. Prognostic analysis of the hub genes
To further mine the most valuable genes in the above hub genes, the clinical significance of the hub genes was analyzed. After downloading the gene expression profiles from TCGA, machine learning was adopted to determine the best prognostic markers among the hub genes. We evaluated the importance of 10 hub genes in the prognosis of patients with hepatocellular carcinoma using Lasso regression, Random Forest, XGBoost and AdaBoost. Because of their difference in modeling approaches, these methods could help us identify the critical genes among the hub genes. The Lasso and Adaboost methods agreed that SPP1 is the essential feature in the dataset. Random Forest and XGBoost also suggested that SPP1 plays an essential role in the prognosis of patients with HCC (Fig. 7).
At the same time, univariate and multifactorial Cox regression analysis were performed to assess the relationship between hub genes and survival in the patients with HCC. The results showed that SPP1 could be used as an independent prognostic factor to assess the prognosis of patients with HCC (Fig. 8), which displayed coincidence with the result of machine learning.
Furthermore, we performed bioinformatics analysis to verify our conclusions. KMplot analysis was used to elucidate the influence of the differential expression of SPP1 on the survival rate. The results showed that patients with high SPP1 expression had significantly shorter overall survival, disease-free survival, relapse-free survival and progression-free survival (Fig. 9A-D). Based on the ULCAN dataset, we found there were significant differences in SPP1 expression levels among individual cancer stages (Fig. 9E). Moreover, the HPA database showed that the SPP1 protein is highly expressed in HCC tissues and located in Golgi apparatus (Fig. 9F-H). These results indicated that SPP1 is expected to be a new prognostic marker.
2.8. Immune microenvironment analysis
Considering that HCC is a typical inflammation-related tumor, we further performed ssGSEA to examine the correlation between SPP1 expression and immune cell infiltration in HCC. The results showed that macrophages positively correlated with SPP1 expression, with a spearman correlation up to 0.388 (p < 0.001) (Fig. 10A, Supplemental Table 3). Moreover, the expression of SPP1 was also correlated with immature DCs, Th1 cells, activated DCs, Th2 cells, T effector memory cells, NK CD56 bright cells and Th17 cells (Fig. 10A, Supplemental Table 3).
Since the expressing of SPP1 positively correlated with macrophages, while tumor-associated macrophages (TAMs) can be classified as M1-like (pro-inflammatory and usually anti-tumor) and M2-like (anti-inflammatory and pro-tumor) (Vitale et al., 2019), we further analyzed the relationship between SPP1 and different types of TAMs in HCC. First, we performed Kaplan-Meier survival analysis to evaluate the OS in HCC patients with different types of macrophage infiltration. The high abundance of M2-like TAMs was significantly associated with poor cumulative survival rate in HCC (Fig. 10C), while M1-like TAMs had no significant effect on survival (Fig. 10B). Second, by using GEPIA 2021 database, we found that SPP1 expression was higher in M2 macrophages than in M1 macrophages in hepatocellular carcinoma tissues (Fig. 10D). Further analysis revealed that a highly positive correlation between SPP1 and M2-like TAMs molecular biomarkers (CD163, VSIG4 and MS4A4A; Fig. 10F), whereas M1-like TAMs biomarkers (IRF5, MARCO and NOS2; Fig. 10E) showed only weak or no correlations. Collectively, our findings indicate that upregulated SPP1 correlates with M2-like TAMs polarization, which may contribute to HCC.
To further analyze the association between SPP1 and immune regulation, we utilized the TISIDB database to estimate the relationship between SPP1 and several well-known immune suppressive molecules in the HCC microenvironment. The results exhibited that SPP1 was related to colony stimulating factor 1 receptor (CSF1R) (r = 0.336, p < 0.001; Fig. 11A), Hepatitis A virus cellular receptor 2 (HAVCR2; also known as TIM-3) (r = 0.401, p < 0.001; Fig. 11B), V-Set Domain Containing T Cell Activation Inhibitor 1 (VTCN1) (r = 0.415, p < 0.001; Fig. 11C), Galectin 9 (LGALS9) (r = 0.400, p < 0.001; Fig. 11D). Other well-known immunosuppressors, such as cytotoxic T-lymphocyte-associated protein 4 (CTLA4) (r = 0.268, p < 0.001; Fig. 11E), and programmed cell death protein 1 (PDCD1; also known as PD1) (r = 0.141, p < 0.001; Fig. 11F) were moderately related to SPP1 expression. These results suggest that SPP1 may participate in mediating immunosuppression in the HCC microenvironment.