System Pharmacology-Based Strategy to Investigate Core Component Group and Molecular Mechanisms of Xuefu Zhuyu Decoction in the Treatment of Gbm

Xuefu Zhuyu Decoction (XFZYD) is a well-known Traditional Chinese Medicine (TCM) formula that has many pharmacological effects, including enhancing immune function, improving hemorheology and regulating blood vessels bidirectionally. Modern pharmacological and clinical studies showed that XFZYD could ameliorate curative effect of glioblastoma (GBM). The aim of this study was to interpret core components and the hidden molecular mechanisms of XFZYD on GBM. Here, a novel network pharmacology strategy, which combined pharmacological data, next generation sequencing data, pharmacokinetic parameters and a novel node importance calculation method was designed to decipher the potential therapeutic mechanism of XFZYD on GBM. The partial components in core component group (CCG) were evaluated by in vitro expriments. We identied 117 chemical components analysis through ADME screening, then component-target network and GBM related genes were integrated as the component-target-pathogenic gene (C-T-P) network. The results show that the enriched pathways of targets in the key functional network could cover 77.92% of the enriched pathways of pathogenic genes. A novel cumulative contribution rate (CCR) calculation model was designed and captured CCG with 21 components. The statistics results indicate that 15 enriched pathways of the targets of CCG were overlap with pathogenic genes enriched pathways. Finally, some core components in CCG were validated by in vitro experiments. The results show that our proposed stategy for decoding CCG and infering the underlying mechanism with good reliability and accuracy. The validation results indicate that the CCG play a therapeutic role on GBM by targeting to PI3K-Akt signaling pathway and Toll-like receptor signaling pathway. Our strategy provides methodological reference for the optimization and secondary development of TCM formula. It can systematically and comprehensively observe the intervention and inuence of drugs on disease networks, thus revealing the synergistic mechanism of key components in drugs on the complex disease. Network pharmacology is usually used to analyze the mechanism of drug action, nd lead compounds or new indications, and determine new drug targets from the system level. The overall and systematic characteristics of network pharmacology are not only consistent with the concept of diagnosis and treatment of diseases by TCM, but also consistent with the cooperative action principle of TCM formula.

At present, a novel systematic pharmacological model was designed to detect the core component group (CCG) and explore the therapeutic mechanism of XFZYD in treating GBM. Our proposed strategy could combine pharmacokinetics synthesis screening data, high throughput data, target shing data, network analysis, and molecular docking to detect CCG and infer the potential mechanisms of XFZYD in treating GBM. Speci cally, the chemical components of XFZYD are collected in public databases. Then we use the previously proposed ADME model to screen potential active components of XFZYD, and then predicted targets of these components to construct a C-T network. After that, we integrate the C-T network and pathogenic genes into a comprehensive C-T-P network and then use the node importance calculation method to obtain the key functional network. Based on the key functional network, we use cumulative contribution rate (CCR) to select the CCG that can target most of the genes in the key functional network and speculate the molecular mechanism of XFZYD in treating GBM on CCG.

Methods
Construct Weighted GBM-related Pathogenetic Gene network Two datasets, GSE50161 and GSE15824 in GEO database [16] recorded the expression pro le data of normal and GBM patients were kept for futher analysis..
We use "limma" package to obtain the differential expression genes (p < 0.05 and fold change > 2) of the two datasets, respectively. the shared part of these two differential expression genes is considered to be reliable GBM pathogenic genes. Each pathogenic gene has two fold change values, Then we used the square root of the two fold change value as the weight of pathogenic gene.
PPI data collected from the public databases STRING, Dip, BioGRID, Mint, and HPRD [17] were combined as a comprehensive PPI network. The GBM-related pathogenetic genes with de ned weight were mapped to the comprehensive PPI network to construct the weighted GBM-related pathogenetic gene network.
Component Identi cation TCMSP could provide detail items for each component, mainly include drug-likeness (DL), oral bioavailability, (OB), blood-brain barrier (BBB), and so on.
Shanghai Institute of Organic Chemistry, Chinese Academy of Sciences. Chemical database [DB/OL] can provide information on the structure and identi cation of compounds, natural products and pharmaceutical chemistry, safety and environmental protection, chemical literature, chemical reactions and comprehensive information. TCMID contains about 47000 prescriptions, 8159 herbs, 25210 compounds, 6828 medicines, 3791 diseases and 17521 related targets, which is the most integrated collection of big data in the related eld of TCM research. TCM @ Taiwan contains 443 kinds of TCM formula and more than 20000 pure ingredients, which are divided into different categories and provide enquiries, including inquiries about the ingredients of TCM. All the ingredients of XFZYD come from the above databases. Open Babel Toolkit (version 2.4.1) was uesd to prepare all chemical structures and converte to canonical SMILES. HitPick [18], Similarity Ensemble Approach (SEA ) [19], STITCH [20] and Swiss Target Prediction [21] were used to predict the targets of XFZYD.

ADME Screening
The preliminary evaluauion of ADMET properties is becoming an important process in modern drug discovery. In our study, three ADMET-related models of DL, OB, and BBB were used to select the active components in XFZYD. Anatomically, BBB is characterized by the continuous tight junctions between continuous non-fenestrated endothelial cells. The function of this connection is usually to restrict the entry of protein and underlying drugs into the brain parenchyma (Tattersall et al., 1975). It is important to assess the ability of each component to enter the CNS, and components with > -0.3 are considered as candidate components. OB (%F) represents the percentage of the drug reaching the systemic circulation with unchange in oral administration, that indicates the trends of ADMET process (Xu et al., 2012). Higher OB usually represent the core index to de ne the characteristics of active components as potential drugs. The compounds with OB≥30% were selected as candidate drugs. Drug-likeness (DL) is a digital indicator that is widely used in drug design to assess the drug likeness of an expected components, which helps to capture pharmacokinetics and drug characteristics. The DL value is more than 0.14 can be used as the selection standard for the TCM components [19].

Construct C-T network
The Cytoscape (version 3.5.1) [22] was employed to constructed and visualized the component-target (C-T) network of XFZYD. The Cytoscape plugin NetworkAnalyzer [23] was uesd to calculate topological parameters of networks.
Develop a PageRank algorithm model to Select key functional network In tumor research, the PageRank has employed to analysis the important nodes in the protein network [24,25]. In order to nd key functional network of XFZYD in treating GBM, the following mathematical algorithm was designed and described: Here, the PageRank formula was revised by adding damping factor d to the simple formula: PR(A) represent the importance of the node A. Ti is a component or target linking to A. C(Ti) is the number of links of Ti. D is the damping coe cient.

Cumulative Contribution Rate (CCR) Calculation
In order to get the core component group (CCG), the CCR model was designed to extract CCG from the key functional network, which would be used to infer the hidden mechanism of XFZYD in the therapy of GBM. Given n components and cumulative contribution rates, the number of pathways enriched in component i is W i , the coverage rate of the enriched pathway in components compare to the pathogenic gene is P i , and the total restriction condition C is the number of components was used to calculate the CCR. Select the appropriate components to calculate the contribution rate based on coverage of enriched pathways in components compare to that in pathogenic genes. When the CCR reaches or exceeds 90%, the iterative cumulative calculation of this model will be terminated. There are only two options for each component i: included or not, that is, component i can only be included in the cumulative probability once. The model of this problem can be expressed as the following 0/1 integer programming model:

Subject to
Where x i is the decision variable of 0-1, x i =1 represents the item is chosen into the knapsack, and x i = 0 represents the item is not selected into the knapsack.

Gene Ontology and Pathway Analysis
The R package clusterPro ler (Yu et al., 2012) was used to perform gene ontology (GO) analysis and pathway enrichment analyses [26]. In enrichment analysis, P value less than 0.05 is considered to be signi cant, and these enriched images are displayed by ggplot2 in R. Different colors in the pathway diagram are marked by Pathview package in R Bioconductor (Luo et al., 2017).

Cell Culture
The human LN229 glioma cells (the American Type Culture Collection) were cultured inwith complete medium in an incubator with 37 °C. When LN229 cells reached 80-90% con uency, they were treated with different concentrations of baicalein and wogonin for 24 h, respectively.
Cell Viability Assay CCK8 assay was utilized to measure cell viability. Cells were seeded in 96-well plates. Following incubation for 24h, the original medium was replaced with medium containing different concentrations of baicalein and wogonin (Jingzhu Biotechnology Co., Ltd, Nanjing, China) for 24h. Subsequently, 10 μL of CCK8 (KeyGEN BioTECH, KGA317) reagent was added to each well and the 96-well plates were incubated for 2h in the dark. Finally, the optical density (OD) values were measured at 450 nm using a microplate reader.

Western blot analysis
The LN229 cell was lysed by RIPA Buffer (Solarbio) at 4 °C for 30 min to obtain protein lysates. 30 mg proteins was separated using a 10% SDS-PAGE and transferred onto PVDF membranes (Invitrogen, Carlsbad, CA). The antibodies were used in this analysis. chemiluminescence signals were detected with ECL reagent (Millipore, USA).

Measurement and detection of cell apoptosis
Cell apoptosis was assessed using FITC Annexin V Apoptosis Detection Kit (KeyGEN BioTECH) according to manufacturer's instructions. Cells were treated with 60 μM baicalein and 80 μM wogonin for 24 h. We washed with PBS and collected cells. The cells were suspended in annexin V binding buffer. Finally, 2 μl of annexin V-FITC and 2 μl of propidium lodide (PI) were utilized to incubated in dark conditions for 5min.

Statistical Analysis
In order to compare the molecular characteristics of each component in XFZYD, statistical analysis was performed using SPSS22.0.The student's t-test for comparison was used to analysis data.

Results
In our study, we designed a systematic pharmacological method to analyze the molecular mechanism of XFZYD in treating GBM (Figure 1). The key point of the method is to construct the weighted pathogenetic gene network and C-T network, then integrate the two networks through PPI data to construct a comprehensive C-T-P network, and then optimize the C-T-P network by using the novel node importance method to capture the key functional network. The dynamic programming algorithm was employed to predict CCG, nally, the CCG and their targets were used to infer the molecular mechanism of XFZYD in treating GBM. This strategy provided a methodological reference for formula optimization in TCM.

Construct Weighted Pathogenetic Gene network
The weighted pathogenetic gene network re ects the pathogenesis of glioma assisted by multiple genes. thus, the construction of pathogenic gene regulatory network is the key to provide evidence for glioma therapy. In this study, we use the pathogenetic genes ( Figure S1) and comprehensive PPI network ( Figure S2) to construct the weighted pathogenetic network. In this network, the highest weight gene is RRM2, following by TOP2A, MELK, NUSAP1, NDC80, DLGAP5, CHI3L1, IGF2BP3, METTL7B, and COL3A1. Most of these pathogenic genes were reported associated with cell apoptosis related pathways of GBM ( Figure   S3). Such as, Rikke et al. nd that regulated RRM2 expression promoted tumorigenicity of GBM [27]. Arivazhagan et al found that the expression of TOP2A was a prognostic biomarker for GBM patients treated with TMZ [28]. Marie et al. con rmed that there exists a signi cant correlation between the expression of MELK and malignant grade of astrocytoma [29]. Qian et al. demonstrated that NUSAP1 could be used as a new prognostic indicator and a underlying target for GBM [30]. Clinical trials could be conducted to verify the e cacy of gene therapy by the down-regulating BUB1B (or NDC80) in GBM patients [31]. These results show that the weighted pathogenetic genes network and weighted pathogenetic genes can re ect the pathogenesis of GBM, and also provide a reliable reference for the next step of constructing an intervention network ( Figure S4).

Chemical components analysis
Chemical analysis is one of the main means to study the material basis and action mechanism of herbal medicine. Through literature search, we collected the chemical constituents and concentrations of various components in XFZYD. The detailed information was shown in Table 1. The results show that the chemical components of some herbs have high concentrations, suggesting that these components may have therapeutic effects and provide an extended experiment-aided chemical space for screening of active components and providing a more comprehensive reference for further analysis [3,32,33].

Active components of XFZYD
To fully understand the potential anti-GBM mechanism of XFZYD, we collected components of the 11 herbs in XFZYD from published databases, a total of 1365 components were collected. To assess the pharmacokinetic properties of the XFZYD formula, we used published ADME system to predict the pharmacological properties of BBB, OB, and DL. Only components with BBB -0.3, OB 30%, and DL 0.14 were predicted as active components. After screening, 117 active components were kept for further analysis. The detailed information of the active components can be found in Table 2.

C-T network construction
In order to observe the relation pattern of active components and their corresponding targets in XFZYD, Cytoscape was empolyed to construct C-T networks ( Figure 2). The C-T network consists of 117 active components targeting to 855 genes through 4488 edges. Then the plug-in of Cytoscape, NetworkAnalyzer was used to analyze the topology parameters of th C-T network and showed that the average degree of targets and components in XFZYD were 5.25 and 39.40, respectively. The average number of components for per target is 5.25. This result indicates that XFZYD has multi-components feature in treating GBM.

Key Functional Network Identi cation
The multi-component and multi-target characteristics of TCM determine that it exists network intervention in the process of treating complex diseases, and the active components in the formula can transmit the intervention effect to pathogenic genes through PPI networks by in uencing their targets, thus playing a therapeutic role on diseases. Components-targets-pathogenic genes form a complex intervention network, in which some components play a crucial role and some components play an auxiliary role. How to nd the core component groups and key functional networks from this network is the foundation to understand the treatment of complex diseases of formula in TCM and is also the foundation for the secondary development of TCM formula. PageRank (PRK) algorithm usually used in the Google search engine to measure the importance of web pages, is also widely used in biomedical researches. Syed et al. used PRK centrality to consistently nd convincing eloquent brain areas (Perry et al., 2019). Hao Zhang et al. predicted human and plant target genes using RNAhybrid by PRK algorithm (Zhang et al., 2016). In this paper, we combined C-T network, weighted pathogenetic gene network and PPI network to form a comprehensive network, in this network, we use PRK value to describe the importance and in uence of nodes, the nodes with higher PRK value than the average and median PRK value were de ned as A functional network and M functional network, respectively. In the further analysis of A functional network and M functional network, we found that A functional network includes 37 components and 728 targets, M functional network includes 33 components and 688 targets. Then we performed pathway analysis by using targets of A functional network, M functional network and C-T network respectively. Among them, the number of targets in A functional network enriched pathways was 131 (p < 0.05), the number of targets in M functional network enriched pathways was 125 (p < 0.05), the number of genes in C-T network enriched pathways was 154 (p < 0.05). Compared with enriched pathways of genes in the C-T network, the coverage rates of the two strategies were 77.92% and 74.68%, respectively (Figure 3). According to this result, we used the A functional network as key functional network for further analysis.

CCG Prediction and validation CCG Prediction
The key functional network contains a large number of chemical components, action targets and interactions between targets, which form a complex response network. There are potentially effective component groups and modes of action in this network. How to detect these effective component groups and action patterns is key to understand the treatment of glioma with XFZYD. Here we design CCR model to obtain CCG by heuristic optimization of key functional networks using reasonable global metrics. According to the results of CCR model, the top 5 components including gadelaidic acid (MOL004996), icos-5-enoic acid (MOL004985), acacetin (MOL001689), 7-Methoxy-2-methyl iso avone (MOL003896), and Glypallichalcone (MOL004835) contribute to 55.63% target coverage of genes in key functional network. For further analysis, 21 components can contribute to 90.66% targets coverage of genes in key functional network were selected as CCG ( Figure 4 and Table 3). Higher targets coverage of genes in key functional network proved that the CCG may play a leading role and produced combination actions in treating GBM.

CCG Validation
To analysis of XFZYD in the treatment of GBM at the functional level, we used CCG targets for pathway analysis, the number of CCG targets enriched pathways was 121 (p < 0.05), and the number of pathogenic genes enriched pathways was 26 (p < 0.05). We de ned the shared enrichment pathways of pathogenic genes and targets in key functional network as the reference pathways. The CCG targets enriched pathways account for 78.95% of the reference pathways ( Figure 5A). This results con rmed the accuracy and reliable of our CCG selection model. For futher analysis, we found that these main targets of CCG were frequently involved in PI3K/Akt signaling pathway (hsa04151), Toll-like receptor signaling pathway (hsa04620), Proteoglycans in cancer (hsa05205), Cell cycle (hsa04110), Cellular senescence (hsa04218), and Platelet activation (hsa04611), etc ( Figure 5B). For example, PI3K/AKT pathway plays a key role in the occurrence and development of GBM [45]. AKT, as a regulatory molecule and a potential drug target, has been widely studied for its carcinogenicity [46,47]. AKT activation promotes tumor progression by affecting tumor proliferation and growth [48]. Up to now, three isomers of AKT (AKT1, AKT2 and AKT3) have been discovered. According to reports, AKT3 can signi cantly activate DNA repair to resist radiotherapy and chemotherapy in GBM patients (Turner et al. (2015). Results show that our combined optimized strategy of key functional network detection and CCR model is accurate and reliable, and our predicted core component group plays an critical role in treating GBM.

GO enrichment analysis of CCG targets
To further analyze the combined effects of XFZYD, all targets of CCG were enriched by GO enrichment analysis ( Figure 6B). We de ned the shared enriched GO terms of key functional network and pathogenetic gene as the therapeutic response GO terms. The enriched GO terms of CCG targets account for 73.68% of the therapeutic response GO terms ( Figure 6A). ). Important genes involved in lipid metabolism in these GO terms include ABCB1, ABCC1, FABP4. These genes are involved in the lipid metabolism of GBM by in uencing cell growth, proliferation, angiogenesis and invasion [52,53]. GO analysis con rmed that XFZYD treats GBM by regulating of oxidative stress and lipid metabolism.
Pathway analysis to explore the potential therapeutic mechanisms of XFZYD In order to investigate the potential therapeutic mechanisms of XFZYD on GBM, We analyzed the intersection of enrichment pathways of CCG target gene and GBM pathogenic genes, totally get 15 shared pathways. Among the 15 common pathways, remove the pathways associated with other diseases, the PI3K-Akt signaling pathway (hsa04151) and Toll-like receptor signaling pathway (hsa04620) were the literature proved GBM-related pathways. For example, PI3K-Akt signaling pathway plays an important role in cell cycle, apoptosis, and cell proliferation in GBM [54,55].
In order to analysis the synergetic mechanism of XFZYD in treating of GBM, we constructed a comprehensive signaling pathway by combining two principal molecular pathways. As shown in Figure 7, we de ned the rst three columns as the upstream position of the pathway and the remaining columns as the pathway downstream position. Among them, targeting to the PI3K-Akt signaling pathway is the main approach for XFZYT to treat GBM. XFZYD regulates 7 targets located the upstream, such as Syk, RTK, TLR2 / 4, and 32 targets located PI3K-Akt signaling pathway downstream, such as MEK, AKT, and mTOR. Downstream targets account for 82.05%. XFZYD may activate downstream of AKT proteins through upstream RTK, leading to cascade ampli cation of downstream GSK3 and CREB, which was closely related to GBM cell proliferation and protein synthesis [48].
Toll-like receptor signaling pathway is also a critical way for XFZYD to treat GBM. The target of XFZYD regulation is located the pathway downstream. Such as, 11 targets of XFZYD, including AKT are located downstream of the Toll-like receptor signaling pathway. XFZYD can affect upstream TLR2/4 and Rac1 and then activate downstream AKT, thereby further affecting a series of malignant biology with GBM behavior-related proteins, such as STAT1, NFκB, and IL-6, thus play an important role in the treatment of GBM.

Experimental Validation in Vitro
The effects of baicalein and wogonin in CCG with different concentrations on the viabilities of human LN229 glioma cells were detected by CCK8 assay. After  Figure 8B). Further, we found that treatment with baicalein and wogonin could signi cantly reduced Bcl-2 protein expression ( Figure  8C). Our data suggested that the baicalein and wogonin induced apoptosis phenomenon play a important role in inhibiting the development of GBM.

Discussion
For thousands of years, TCM adopts the method of holistic treatment, emphasizing the holistic view, systematic view and cooperative view in the application of formulas in treating complex diseases. Network pharmacology has distinguishing feature of systematicness and integrity, which accord with the theoretical basis of TCM. Network pharmacology emphasizes the regulation of multi-component and multi-target of signal pathways to improve the e cacy of drugs and reduce toxic and side effects. Now, network pharmacology is widely applied to study the complex diseases treatment in TCM.
More and more experiments show that some components in TCM formula have synergistic and antagonistic effects, and even some components may produce toxic and side effects. How to use the current computing technology and chemoinformatics technology to detect the core components and analyze the mechanism of TCM formula are a major challenge of network pharmacology. The main purpose of optimization of TCM formula is to decrease unnecessary functional components and improve the therapeutic effect of the formula. Through the optimization of the TCM formula, medicinal materials or components with certain drug effects can be screened, thus the formula is more clari ed, and the therapeutic effect is more enhanced. In order to better optimize the TCM formula with clinical effect and obtain the main components and key mechanisms, the network pharmacological model and mathematical model are adopted to obtain the optimal target group by using the change of the proportion of the target enrichment pathway to the pathogenic gene enrichment pathway. Based on these target groups, the optimal key component group is speculated, and the function evaluation and mechanism speculation are carried out. It provides a methodological reference for the optimization and secondary development of TCM formula.
The emergence of accumulated high-throughput data provides rich data support for the characterization of human diseases, such as 1) looking for application of blood-based biomarkers in early diagnosis of tumors [31]. 2) integrate epidemiological, pharmacological, genetic and intestinal microbiome data into the drug metabolite map [56]. 3) the next generation tool for rapid disease screening [57]. These data make it possible to interpret the therapeutic mechanism of TCM in the treatment of complex diseases based on network pharmacology and high-throughput sequencing data. How to use this strategy is still in its infancy.
In this study, the mathematical methods and network pharmacology were used to study the optimization strategy of component prescription, functional analysis of target genes of key component groups after optimization and changes in the coverage rate of related pathogenic gene enrichment pathways, so as to better optimize the clinical effect of classical prescriptions. The results show that the enrichment pathway of the key functional network can cover 77.92% of the gene enriched pathways of C-T network, further con rming that our strategy of constructing a key functional network and selecting effective proteins is reasonable and reliable. Based on the effective protein provided by the key functional network, the cumulative contribution rate is calculated by CCR model, and nally the target coverage rate was calculated, CCG with 21 active components reaches more than 90% target coverage rate. Bioinformatics analysis demonstrated that our optimized CCG was closely related to the pathogenesis of GBM at functional level. Cellular experiments veri ed the reliability of the network pharmacology strategy by verifying the inhibitory effects of key components in CCG on LN29 cells. This proves once again the reliability and accuracy of our key functional network and CCR model.

Conclusions
At present, our network pharmacology model provides a powerful tool for optimizing the core components group and decoding the underlying mechanism of TCM formula. There are still some limitations. For example, in animal medicine research of XFZYD, how to select less active ingredients to verify the therapeutic effect and molecular mechanism of them in treating GBM by vivo experiments.

Competing interests
The authors declare that they have no competing interests.
Author's contributions * ZYL and KXW contributed equally to this work.
GLH and DGG provided conception and design of the work. ZYL, KXW, RWY, GZY, ZYL, YWL, DGG, and GLH conducted acquisition, analysis and interpretation of data. ZYL, KXW, DGG, and GLH contributed to critical revision for important intellectual content.
All Authors read and approved the manuscript.  Figure 2 The C-T network of 117 active components and their targets. The component is represented by red and the target is represented by green.   The biological processes analysis of CCG. A: Venn diagram shows the coincidence of GO analysis in CCG and the therapeutic response GO terms. B: Bubble diagram for top 30 GO enrichment of CCG.