The molecular mechanism of TiaoGanYiPi formula for treatment of chronic hepatitis B by network pharmacology and molecular docking verification


 The Chinese herbal formula Tiao-Gan-Yi-Pi (TGYP) showed effective against Chronic Hepatitis B (CHB). In this study, we aimed to clarify the mechanisms and potential targets between TGYP and CHB through network pharmacology and molecular docking verification. The compounds of TGYP were identified in the TCMSP and CNKI databases, and their putative targets were predicted through SwissTargetPrediction and STITCH databases. The targets of CHB were obtained from the GeneCards, NCBI Gene, and DisGeNET databases. The above mentioned data were visualized using Cytoscape, and molecular docking showed the relationship between them. The expression of key targets was verified in GEO databases. Hence, we screened out 11 TGYP-related key targets for CHB included ABL1, CASP8, CCNA2, CCNB1, CDK4, CDKN1A, EP300, HIF1A, IGF1R, MAP2K1 and PGR. The key targets were predominantly enriched in the cancer, cell cycle and hepatitis B pathways and involved in the positive regulation of fibroblast proliferation, signal transduction, and negative regulation of gene expression biological processes, and expression of key target genes was related to HBV infection and liver inflammation. Through this newly constructed interaction network between TGYP and CHB, we identified active compounds and targets which could be further used for providing clinical guidance.


Introduction
It has been a global public health problem for the Hepatitis B virus (HBV) infection for a long time 1 .
Worldwide estimates suggested that more than 2 billion people have been infected with HBV, with 248 million of these people are chronic hepatitis B (CHB) patients, and about 15% ~ 25% of CHB patients died from cirrhosis or liver cancer 2 , which caused a substantial global burden 3 . In China, there are 20 ~ 30 million CHB patients, and the mortality rate of HBV-related complications is higher than that in the world 4 . The disease process of CHB is affected by both HBV and the host immune system, while the pathogenesis of CHB is complex and still unclear. Meanwhile, Existing treatment options [nucleos(t)ide analogs (NAs) and interferons (IFN)] are limited in their e cacy and scope of application. Studies have suggested that discontinuation of NAs lacked operability 5 , whilst oral antiviral drugs had limitations in improving complications 6 . Meanwhile, the contraindications and side effects hindered the clinical application of IFN 7 .
For a long time, Chinese herbal medicines (CHMs) play a crucial role in CHB treatment and show superiority 8,9 . Among CHMs, the Tiao-gan-yi-pi formula (TGYP) was reported with e cacy for CHB treatment, as reported by our previous multi-center, double-blind, randomized placebo-controlled trial 10 . In Chinese Medicine theory, stagnation of Gan and de ciency of Pi qi, and damp-heat in the Gan and Dan, is a compound core syndrome of chronic hepatitis B 11 . The TGYP is targeting at the core syndrome of CHB, which reinforces and motivate qi, and dispel damp-heat. Prescription components may play a role in the various multi-targeting and synergistic effects of the CHM formula. However, the pharmacological underlying curative mechanisms of TGYP regarding the therapeutic effect of CHB have not been fully explained. Here, we aimed to explore the relationship between CHB and TGYP, identify the targets and molecular mechanisms of TGYP on CHB.
In this study, we performed network pharmacology based on system biology and molecular docking, screened out 11 key target genes that exert essential functions during the treatment of TGYP on CHB. These results revealed the relationship between TGYP and CHB, indicating directions for further research.

Work ow of TGYP
Based on the previous work, which showed the anti-HBV effect of TGYP, we designed the owchart of this study (Fig. 1). We aimed to determine the key target genes and regulatory mechanism of TGYP treatment against CHB through network pharmacology and database mining methods.

Compound-Putative Targets Network
We rst analyzed the four main herbs in TGYP by CNKI and TCMSP databases and found 26 active components in Chaihu, 25 active components in Huangqi, 25 active components in Danggui, and 17 active components in Kudiding, respectively ( Figure S1A). Among them, Chaihu and Huangqi have three common components, Chaihu and Danggui have two common components which were showed in Table 1, the detailed information of all components was shown in Table S1. Next, according to the active components, we identi ed 775 candidate target genes by SwissTargetPrediction and 1,198 candidate target genes by STITCH analysis for TGYP (Table S2). After eliminating the redundancy, a total of 1,764 putative targets were collected ( Figure S1B) and visualized as the compound-putative target network by Cytoscape ( Figure S1C). The compound-putative target network showed 1,856 nodes and 4,941 edges. According to combine degree, the top three ranked compounds were Isorhamnetin (Degree = 147), Quercetin (Degree = 145), and Kaempferol (Degree = 142), indicated their vital roles in TGYP and de ned as key compounds.  To explore the biological function of TGYP target genes, the GO and KEGG pathway enrichment analysis were performed by DAVID. The GO enrichment analysis revealed that the GO-CC of TGYP target genes were predominantly involved in the cytoplasm, plasma membrane and nucleus ( Fig. 2A); the GO-MF of TGYP target genes mainly included protein binding, ATP binding, and protein homo-dimerization activity (Fig. 2B); and the GO-BP of TGYP target genes concentrated primarily on signal transduction, oxidationreduction process, and positive regulation of transcription from RNA polymerase II promoter (Fig. 2C).
Meanwhile, the KEGG enrichment analysis suggested that TGYP target genes were primarily associated with metabolic, neuroactive ligand-receptor interactions and cancer pathways (Fig. 2D). The results above suggest that TGYP target genes are enriched in viral carcinogenesis and hepatitis B pathways closely related to CHB, and the GO results broadly correspond to this. However, the pathway analysis also enriched in metabolic, neuroactive ligand-receptor interaction, calcium signaling pathway, and more, suggesting that the TGYP formula has a wide range of action pathways and complex mechanisms that are consistent with the properties of traditional Chinese medicine compound in treating diseases. Of course, partial pathways that do not appear to be closely related to CHB still need to be further explored.

Identi cation and PPI Network Construction of OGEs
We analyzed and screened CHB disease targets, where we identi ed 8,037 targets in the GeneCards database, 415 targets in the DisGeNET database, and 236 targets in the NCBI Gene database, respectively (Table S3), and subsequently obtained 8,083 disease targets of CHB after eliminating the redundancy and taking the union of these three sets ( Figure S2A). Next, we examined the intersection of TGYP and CHB target genes and obtained 1,200 OGEs from the Venn diagrams tool ( Figure S2B). After that, we constructed a PPI network of OGEs and analyzed their corresponding secondary proteins from the STRING tool ( Figure S2C). The PPI network results showed the interactions between OGEs, and there were highly tight connections existed which indicated key target genes.

Identi cation of the key target genes through modules analysis
Since one gene lacked a corresponding protein structure, and while seven proteins did not interact with others based on PPI network, the remaining 1,192 proteins from STRING data were imported into the Cytoscape to further obtain interaction networks (Fig. 3A). Filtering with the network topological parameters, when the degree is greater than twice the median (degree>78), we get the preliminary hub network (Fig. 3B). Based on the fact that BC, CC, DC, LAC, NC, and SC are above the median (BC>0.001, CC>0.574, DC>61, LAC>38.452, NC>89.011, and SC>5.964E + 35), we identi ed 44 highly connected nodes as hub networks (Fig. 3C). The MCODE plugin in Cytoscape was used for a module analysis of the targets in the hub network. The module analysis identi ed four clustering modules, and we chose the key module, which had the highest network heterogeneity and network centralization at 0.242 and 0.333, respectively. The clustering key module we obtained represents a key target for treating CHB with TGYP.

Functional enrichment analysis of 11 key target genes
Functional enrichment was performed after data screening to further understand the biological behaviors of key targets (Table S4). The GO-CC analysis showed the main targets to be mainly enriched in the nucleoplasm, nucleus, and cytosol ( Fig. 4A). GO-MF analysis suggested that the key targets were strongly associated with protein binding, protein kinase binding, and protein tyrosine kinase activity (Fig. 4B). And the key targets were signi cantly related to multiple GO-BP, including positive regulation of broblast proliferation, signal transduction, and negative regulation of gene expression (Fig. 4C). The pathway enrichment results suggested that the key targets were mostly involved in pathways of cancer, cell cycle, hepatitis B and others (Fig. 4D). Based on the aforementioned results of GO-BP and KEGG, TGYP might have an effect on CHB related complications, such as liver brosis and liver cancer, which require further exploration. In general, these biological processes and signaling pathways may be linked to bene cial effects of TGYP against CHB.

The molecular docking of key compounds and key target genes
To better understand the interactions between key compounds and key target genes predicted by SwissTargetPrediction and STITCH, we constructed an interaction network between key target genes and corresponding active compounds in TGYP (Fig. 5A). According to the compound-putative target network analysis results, the key compounds were Isorhamnetin, Quercetin, and Kaempferol. Their 2D structures were shown in Fig. 5B. To further explore the interactions between key compounds and key target genes, we performed molecular docking by AutoDock Vina software (Fig. 5C). The docking capabilities of the protein-compounds are shown in Table 2. The docking simulations results included key target genes docked with key compounds and the most probable active compounds, respectively. The results further clari ed the regulation mechanism of TGYP on the key targets on the protein level.

Expression of key genes in CHB and correlation with liver in ammation
We analyzed the key target genes' expression and their correlations with liver in ammation in the GEO dataset GSE83148. Compared to healthy controls, some key target genes were signi cantly higher expressed in CHB patients (P<0.05), including ABL1, CASP8, CCNA2, CCNB1, CDK4, CDKN1A, HIF1A, and IGF1R (Fig. 6A). Results suggested the expression of these key genes were possibly related to the disease status of CHB. Furthermore, CHB patients with signi cant liver in ammation could bene t from improved antiviral effect 12 . To further explore the relationship between key target genes and liver in ammation in CHB patients, we compared the expression of 71 patients with abnormal serum ALT or AST levels (ALT ≥ 40 or AST ≥ 35) and 34 patients with normal serum ALT and AST levels (ALT < 40 and AST < 35). Results showed that expression of ABL1, CCNA2, CCNB1, CDKN1A, and HIF1A was signi cantly higher in the abnormal group than in the normal group, whereas MAP2K1 and PGR were reversed (Fig. 6B). These results indicate that expression of these key genes might be related to in ammatory CHB. TGYP might alter the disease status and the in ammatory state of CHB by regulating the expression of these genes.

Discussion
CHB is a refractory disease which poses great threats to public health, especially in the Asia-Paci c region. It is closely related to the occurrence and development of liver brosis and liver cancer 13 . At present, NAs and IFNs are the rst options for treatment, whereas both of them have shortcomings. NAs cannot eliminate HBV cccDNA in hepatocytes to achieve complete cure, while IFNs has many contraindications and side effects 14 . In contrast, traditional Chinese medicines have unique advantages against CHB, such as better safety pro le and lower price 8 . Our preliminary study con rmed the capability and safety of TGYP for improving HBeAg loss after combination with entecavir (ETV), thereby reducing disease-related risks in long terms 10 . For TGYP formula, the main components are Chaihu, Diding, Huangqi and Danggui. It is noteworthy that other studies have reported that Huangqi could relieve inhibiting HBV replication 16,17 . Combined these studies, it suggested that multiple active compounds of TGYP could have inhibitory effect on HBV in different ways. It is necessary to explore internal mechanisms to nd more treatment targets.
To further determine the mechanism of TGYP on CHB, we performed network pharmacology and database mining in this study: we rstly screened out TGYP target genes based on drug active compounds of the four main herbs. Among the active molecules, the top three active compounds with the highest network connectivity were Isorhamnetin, Quercetin, and Kaempferol. These compounds were observed inhibiting liver in ammation or anti-HBV effects, including that Isorhamnetin could signi cantly reduce serum levels of liver enzymes and pro-in ammatory cytokines in rats 18 ; Quercetin could strongly inhibit HBeAg synthesis and slightly anti-HBV surface antigen activity 19 , and Kaempferol could signi cantly reduce TNF-α expression and in ammatory responses 20 . These results suggested that TGYP may exert anti-HBV function via these active compounds and their target genes and pathways. We then combined CHB related genes to construct the interaction network of TGYP and CHB, where 11 key target genes were identi ed. Molecular docking analyses predicted direct binding between core active compounds and key target genes. We also found that in the GSE83148 database, compared with healthy controls, the expression of key target genes including ABL1, CASP8, CCNA2, CCNB1, CDK4, CDKN1A, HIF1A, and IGF1R were all up-regulated in liver tissues of CHB patients. Thus, the core active compounds of TGYP might exert functions by regulating the expression and protein activities of the key target genes.
Additionally, we found that the key target genes were mainly enriched in cancer, cell cycle, and hepatitis B signaling pathways using the KEGG pathway enrichment analysis. Speci cally, among the 11 key target genes, CCNB1, CDKN1A, EP300, ABL1, CDK4, and CCNA2 are enriched in the cell cycle pathway, and previous studies have shown that the expression of these factors could regulate the cell cycle to varying degrees, promote or inhibit cell proliferation [21][22][23][24][25][26] . Hepatocyte proliferation has been reported to accelerate during HBV infection 27 , leading to the downregulation of NTCP in hepatocytes, which contributes to HBV cccDNA loss and further accelerates HBV clearance 12,28 . Our study suggested that TGYP might exert anti-HBV effects by regulating the cell-cycle-related genes which manipulate hepatocyte proliferation. However, the speci c regulatory mechanism needs to be further investigated in HBV replication models. Meanwhile, studies have shown that HBV infection could inhibit the innate immune pathways in hepatocytes and the liver's adaptive immune system, which was critically important for the development of CHB 37 . In the current study, we found the expression of multiple target genes in liver tissue was correlated with the severety of hepatocyte in ammation. It is hypothesized that the TGYP might slow down the progression of CHB by regulating the immune response in the liver, which needs to be further investigated in animal models.
Finally, IGF1R and HIF1A genes enriched in cancer pathways were closely related to the occurrence and development of HBV-related HCC. Studies have shown that HBx could trans-activate endogenous IGFIR expression and promote cell mitosis, thereby promoting the proliferation of liver cancer cells 38 .
Meanwhile, HBx could enhance the transcriptional activity of HIF1A, thus promoting para-tumor angiogenesis 39,40 . IGF1R and HIF1A promote the occurrence and metastasis of HBV-associated liver cancer. Our results suggest that TGYP might affect the expression and activation of these genes, thereby playing a role in the inhibition of HBV-related HCC. Recent studies have shown that although NAs may inhibit HBV replication in hepatocytes, they cannot completely reverse HBV-induced signal transduction pathway activation or cell cycle regulatory protein change 41 . Those infected hepatocytes were still in transition to cancer cells. In contrast, our preliminary study demonstrated the superiority of integrative therapy regarding HBeAg clearance, the above results suggested TGYP may exert bene cial to prevent the risk of HCC in CHB patients. These results need to be validated in a larger sample, whilst the mechanisms are to be explored with in vitro and in vivo models.
In this study, a network was constructed between active compounds of TGYP and CHB related genes, demonstrating that this CHM formular might exert an anti-HBV effect through regulating hepatocyte proliferation, liver in ammation, and immune response. These ndings could provide further insight into the mechanisms of CHM-based CHB therapy and assist in screening potential therapeutic targets in the future.

Prediction of putative targets in TGYP.
The canonical SMILES or SDF of the active compounds were imported into the SwissTargetPrediction (http://www.swisstargetprediction.ch/) 47 and the STITCH databases (http://stitch.embl.de/) 48 , respectively. Putative targets of TGYP were collected under the following conditions: selecting the species is homo sapiens. The probability value of each potential target listed in the SwissTargetPrediction database was used to investigate the accuracy of the current predictions, whose probability value ≥ 0.1 was identi ed. In addition, the potential target proteins with con dence score ≥ 1.5 in the STITCH database.

Identi cation of CHB Targets
The target genes related to CHB were gathered from followed three databases. The GeneCards database (https://www.genecards.org/) 49 , the DisGeNET database (https://www.disgenet.org/) 50 and the NCBI gene database (https://www.ncbi.nlm.nih.gov/gene/) 51 . The search condition was using the keyword "Chronic Hepatitis B" and selecting the organisms "Homo sapiens" from the three platforms. The above gene ID was identi ed and standardized in the UniProt database (https ://www.unipr ot.org) 52 . We used Retrieve/ID mapping tool of UniProt to convert identi ers, which are of a different type to UniProt identi ers.

Network construction
The overlapped genes (OGEs) of the compound-putative targets and disease targets were charted by the Venn tool. We inputted the OGEs into the STRING database (https://string-db.org/) 53 with con dence scores ≥ 0.4 and the species limited to "Homo sapiens", and exported protein-protein interaction (PPI) data. Analyzed the PPI data and constructed network by the Cytoscape 3.8.0 (https://cytoscape.org/) 54 , and the non-connection genes were removed. We use the Analyzer plugin to analyze the PPI network and get the degree, through taking the over a double median of degree to get the preliminary hub network 55 60 . The ltering of retrieval results is with a threshold value of p < 0.05 and count in descending order. The value of P re ected the signi cance of protein biological function, and the purpose of ltering is to select the most enriched biological annotation from a given list of genes. Bubble Diagrams were drawn by R software with package ggplot2 (https://cran.r-project.org/web/packages/ggplot2/).

Molecular Docking Simulation
Molecular docking simulation was performed to verify the critical components' binding ability to the key targets and explore their accurate binding modes. The macromolecular protein target receptors were acquired from the RCSB PDB database (https://www.rcsb.org) 61 62 . Moreover, the 2D structures of the compounds were obtained from the PubChem Database. AutoDock Vina could markedly improve the average accuracy of the combined mode prediction 63 . Thereby the Molecular docking simulation was performed by AutoDock Vina software, and the result was visualized by PyMOL software 64 . In this analysis, the value of the vina score indicates the binding activity between a compound and a protein, and the lower the vina score, the more stably the compound binds to the protein.

Veri cation the expression of key genes
The expression of key genes was veri ed in the GEO database (http://www.ncbi.nlm.gov/geo/) 65 . The keywords were set as "hepatitis B" and "homo sapiens", the tissue type was limited to liver tissue, and the gene expression data of GSE83148 66  All the relevant data is provided within the paper and its supporting information les. The datasets analyzed during the current study available from the corresponding author on reasonable request.

Data statement
This work described was original research that has not been published previously, and not under consideration for publication elsewhere, in whole or in part.