2.1 Collecting drug molecular information and screening for active molecules
Chemical compounds in ADI were obtained mainly from the TCM Systems Pharmacology database (TCMSP, http://tcmspw.com/tcmsp.php) and TCM Database@Taiwan (http://tcm.cmu.edu.tw), the two largest pharmacological data platforms for Traditional Chinese Medicine. They contain information on all herbs, chemical components and pharmacokinetic properties (namely absorption, distribution, metabolism and excretion, or ADME) in the Pharmacopoeia of the People’s Republic of China (2010 edition) [17]. In addition, the China National Knowledge Infrastructure databases were used to supply any omitted components. In total, 313 components of ADI were collected, including 190, 87, 9 and 27 from Renshen, Huangqi, Ciwujia and Banmao, respectively (Supplementary Table S1).
2.2 Pharmacokinetic ADME assessment
Screening followed the standard recommended by the TCMSP database (TCMSP, http://ibts.hkbu.edu.hk/LSP/tcmsp.php) and used only two predicted properties: the octanol-water partition coefficient (AlogP) and drug-likeness (DL). Compounds satisfying both AlogP < 5 and DL ≥ 0.18 were retained; compounds that did not meet the standard were removed from the list, yielding the final compound set (Supplementary Table S2).
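The screening rule above can be sketched in a few lines of Python. This is a minimal illustration rather than the pipeline actually used: the `Compound` record, its field names and the example values are hypothetical, and only the two thresholds (AlogP < 5, DL ≥ 0.18) come from the text.

```python
from dataclasses import dataclass

ALOGP_MAX = 5.0   # retain compounds with AlogP strictly below this value
DL_MIN = 0.18     # retain compounds with drug-likeness at or above this value


@dataclass(frozen=True)
class Compound:
    """Hypothetical record for one database entry (field names are assumptions)."""
    name: str
    alogp: float
    dl: float


def passes_adme(c: Compound) -> bool:
    """Return True when the compound meets both screening criteria."""
    return c.alogp < ALOGP_MAX and c.dl >= DL_MIN


def screen(compounds):
    """Keep only the compounds that satisfy the AlogP and DL thresholds."""
    return [c for c in compounds if passes_adme(c)]


# Made-up values for illustration only.
example = [
    Compound("compound_A", alogp=2.1, dl=0.25),  # kept
    Compound("compound_B", alogp=6.3, dl=0.40),  # removed: AlogP too high
    Compound("compound_C", alogp=1.0, dl=0.10),  # removed: DL too low
]
retained = screen(example)
```

Note that the AlogP bound is strict while the DL bound is inclusive, mirroring the inequality signs in the stated criteria.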
The screened compounds were then searched in the PubChem database (https://pubchem.ncbi.nlm.nih.gov/). For each compound found, the PubChem CID and SMILES string were recorded and the molecular SDF file was saved. For molecules not found in PubChem, the structure was drawn in the Swiss Target Prediction database (http://www.swisstargetprediction.ch/) according to the formula given in the TCMSP database, and the SMILES string and SDF file of the molecule were then recorded and saved.
2.3 Prediction of compound-related target genes
We uploaded the SDF files saved in the previous step to the PharmMapper server (http://www.lilab-ecust.cn/pharmmapper/), supplied an e-mail address, and selected "Human protein targets only" to obtain all predicted targets for each molecule. We then looked up the official names of the predicted genes on the UniProt website (http://www.uniprot.org/) using "Search / ID Mapping" with the species set to "Homo sapiens", so that the various identifier types were all converted to UniProt IDs. These data were arranged into the ADI drug component-target relationship data set (Supplementary Table S3).
2.4 Obtaining genetic data related to BC
BC-related genes were obtained from two databases: GeneCards (https://www.genecards.org/) and Online Mendelian Inheritance in Man (OMIM, http://www.omim.org/). Only "Homo sapiens" proteins linked to BC were selected (Supplementary Table S4).
2.5 Network construction
Following the method of Li et al. and previous work in our laboratory [18], we used Cytoscape (http://www.cytoscape.org, version 3.7.1) [19] to construct the network of common targets shared by ADI and BC (Supplementary Table S5).
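Selecting the common targets amounts to intersecting the drug-target list with the disease-gene list. The sketch below illustrates that step under stated assumptions: the function names, the dictionary layout and the placeholder identifiers (`T1`, `cmpd_1`, and so on) are hypothetical, and the resulting edge list is only one plausible way of preparing a table for import into Cytoscape.

```python
def common_targets(drug_targets, disease_genes):
    """Return the sorted identifiers present in both inputs."""
    return sorted(set(drug_targets) & set(disease_genes))


def compound_target_edges(compound_to_targets, shared):
    """List (compound, target) edges restricted to the shared targets."""
    shared = set(shared)
    return sorted(
        (compound, target)
        for compound, targets in compound_to_targets.items()
        for target in targets
        if target in shared
    )


# Placeholder identifiers, not real UniProt IDs.
compound_to_targets = {
    "cmpd_1": ["T1", "T2"],
    "cmpd_2": ["T2", "T3"],
}
disease_genes = ["T2", "T3", "T4"]

all_drug_targets = {t for ts in compound_to_targets.values() for t in ts}
shared = common_targets(all_drug_targets, disease_genes)
edges = compound_target_edges(compound_to_targets, shared)
```

Both lists must use the same identifier type before they are intersected, which is why the conversion to UniProt IDs described earlier matters.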
PPI network construction. Proteins are biological macromolecules, and organisms rely on their cooperative action to carry out life activities. The functions of an organism arise from the interaction of many proteins under specific conditions rather than from individual proteins. A PPI network links predicted targets to other human proteins, an approach that has become promising for drug discovery [20]. We used the Cytoscape plug-in Bisogenet to build the networks and analyze them visually at different levels of detail. Bisogenet draws on six main PPI databases, including the Biomolecular Interaction Network Database (BIND), the IntAct molecular interaction database (IntAct), the Human Protein Reference Database (HPRD), the Molecular INTeraction database (MINT) and the Biological General Repository for Interaction Datasets (BioGRID) [21]. In Bisogenet we set the identifiers to "Homo sapiens, protein identifier only", selected the PPI data sources, set the distance from the input set to the new nodes to "1", and represented the output as "proteins". PPI networks were established for the BC-related targets and for the ADI-related targets.
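Setting the distance from the input set to "1" means that each seed protein is extended by its direct interaction partners. The following sketch shows that idea on a plain edge list. It is a simplified illustration, not Bisogenet's implementation: the function name and the toy interaction list are hypothetical, and only interactions that touch at least one seed are kept.

```python
def expand_one_step(seeds, interactions):
    """Return (nodes, edges) for the seeds plus their direct neighbours.

    `interactions` is an iterable of (protein_a, protein_b) pairs.
    Edges are stored as sorted tuples so each undirected pair appears once.
    """
    seeds = set(seeds)
    nodes = set(seeds)
    edges = set()
    for a, b in interactions:
        if a in seeds or b in seeds:
            nodes.update((a, b))
            edges.add(tuple(sorted((a, b))))
    return nodes, edges


# Toy interaction list with placeholder protein names.
toy_interactions = [("P1", "P2"), ("P2", "P3"), ("P3", "P4"), ("P2", "P1")]
nodes, edges = expand_one_step({"P1"}, toy_interactions)
```

In this toy case, P3 and P4 are left out because neither interacts directly with the seed P1.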
Central network evaluation. With the rapid development of bioinformatics, evaluating the central network of a PPI network that contains large amounts of gene and proteomic data has become the main method for screening core proteins. The PPI networks of the BC-related targets and the ADI-related targets were merged and their intersection was taken. We then evaluated the intersection with the Cytoscape plug-in CytoNCA, which provides six centrality measures: degree centrality (DC), closeness centrality (CC), network centrality (NC), betweenness centrality (BC), local average connectivity (LAC) and eigenvector centrality (EC) [22]. For the preliminary screening we used "DC ≥ 2 × median DC" as the criterion; for the secondary screening the criterion was "DC, BC, EC, CC, LAC and NC greater than or equal to their medians" [23], where BC denotes betweenness centrality. Nodes passing both screens were taken as the core targets (Supplementary Table S6).
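The two-stage screening can be expressed compactly in code. The sketch below assumes the six centrality values have already been exported per node (for example from CytoNCA) into a dictionary; that layout, the function names and the toy values are hypothetical. It also assumes that the medians for the second screen are computed over the nodes that survive the first screen, which the text does not state explicitly.

```python
from statistics import median

MEASURES = ("DC", "BC", "EC", "CC", "LAC", "NC")


def first_screen(scores):
    """Keep nodes whose degree centrality is at least twice the median DC."""
    if not scores:
        return {}
    cutoff = 2 * median(s["DC"] for s in scores.values())
    return {n: s for n, s in scores.items() if s["DC"] >= cutoff}


def second_screen(scores):
    """Keep nodes whose six measures are all at or above their medians."""
    if not scores:
        return {}
    cutoffs = {m: median(s[m] for s in scores.values()) for m in MEASURES}
    return {
        n: s for n, s in scores.items()
        if all(s[m] >= cutoffs[m] for m in MEASURES)
    }


def flat(value):
    """Toy helper: give a node the same value for every measure."""
    return {m: value for m in MEASURES}


# Made-up centrality values; the median DC is 3, so the first cutoff is 6.
toy_scores = {"a": flat(1), "b": flat(1), "c": flat(2),
              "d": flat(4), "e": flat(6), "f": flat(8)}
stage1 = first_screen(toy_scores)
core = second_screen(stage1)
```

If the medians were instead taken over the whole network, `second_screen` would be called with cutoffs computed from `toy_scores`, and more nodes would typically pass.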
Cluster analysis. Cluster analysis extracts nodes with the same or similar attributes as clusters, allowing sub-regional analysis of complex PPI networks. In this study, the core-target PPI network (Supplementary Table S7) was clustered with MCODE, a cluster analysis algorithm available in Cytoscape [24,25].
2.6 Enrichment analysis
The Database for Annotation, Visualization and Integrated Discovery (DAVID, https://david.ncifcrf.gov/, version 6.8) was used for gene ontology (GO) enrichment analysis with P ≤ 0.05 as the screening criterion [26, 27]. DAVID was also used for Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway enrichment analysis, again with P ≤ 0.05 as the screening criterion [28], to reveal the gene pathway annotation network for functional groupings. Furthermore, KEGG Mapper (https://www.genome.jp/kegg/mapper.html), a collection of tools that includes "Search Pathway" and "Color Pathway", was used to analyze the connections between upstream and downstream components of the main signaling pathways [29]. The mechanism of ADI in treating BC was explored through enrichment analysis of the common-target network and the PPI networks (Supplementary Tables S8, S9 and S10).
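To make the P ≤ 0.05 criterion concrete, the sketch below computes an over-representation p-value for each term and keeps the terms at or below the cutoff. It uses a plain one-sided hypergeometric test as a stand-in: DAVID reports its own statistic (the EASE score, a modified Fisher exact test), so these numbers would not match DAVID output exactly. The function names and the toy counts are hypothetical, and no multiple-testing correction is applied.

```python
from math import comb

P_CUTOFF = 0.05  # screening criterion stated in the text


def hypergeom_pvalue(k, n, K, N):
    """Upper-tail probability P(X >= k) for a hypergeometric draw.

    N: genes in the background, K: background genes annotated to the term,
    n: genes in the query list, k: query genes annotated to the term.
    """
    total = comb(N, n)
    tail = sum(comb(K, i) * comb(N - K, n - i)
               for i in range(k, min(n, K) + 1))
    return tail / total


def significant_terms(term_counts, n, N, cutoff=P_CUTOFF):
    """Return {term: p} for terms whose p-value is at or below the cutoff."""
    result = {}
    for term, (k, K) in term_counts.items():
        p = hypergeom_pvalue(k, n, K, N)
        if p <= cutoff:
            result[term] = p
    return result


# Toy numbers, not real annotation counts: background of 10 genes, query of 5.
toy_terms = {"term_X": (5, 5), "term_Y": (2, 5)}
enriched = significant_terms(toy_terms, n=5, N=10)
```

In the toy data, all five query genes fall in `term_X`, which gives a small p-value, while `term_Y` has only two hits out of five and is not retained.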