3.1. Active Chemical Compounds
We retrieved 102 compounds from the TCMSP. After screening against the preset OB and DL thresholds, only 7 compounds remained. Because their SMILES could not be found in PubChem, 2-O-methyl-3-O-β-D-glucopyranosyl platycogenate A and dimethyl 2-O-methyl-3-O-a-D-glucopyranosyl platycogenate A were excluded from further analysis. In the subsequent target prediction, Robinin yielded no effective protein target and was therefore also excluded. The remaining 4 compounds were analyzed; their basic information is shown in Table 1.
3.2. Construction of the Compounds-Targets Network
After the targets retrieved from the TCMSP, SwissTargetPrediction and STITCH were sorted, we obtained 24 targets from the TCMSP, 42 from SwissTargetPrediction and 24 from STITCH. In total, 123 compound targets were obtained after deleting duplicates. The compounds-targets network was constructed using Cytoscape. Cluster analysis with the clusterMaker plugin of Cytoscape yielded four clusters, each centered on one of the four active compounds. The Luteolin and Acacetin clusters contained more targets than the other two. The result is shown in Fig. 2.
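The deduplication step can be illustrated with a minimal sketch. It assumes each database export has already been reduced to a list of gene symbols; the function name `merge_targets` and the normalization rule (trim whitespace, upper-case) are illustrative assumptions, not the procedure reported by the authors.

```python
def merge_targets(*sources):
    """Union gene symbols from several sources, dropping duplicates.

    Each source is an iterable of gene symbols (assumed input format).
    Symbols are trimmed and upper-cased so that 'akt1 ' and 'AKT1'
    count as the same target (an assumed normalization rule).
    """
    merged = set()
    for source in sources:
        merged.update(symbol.strip().upper() for symbol in source)
    return sorted(merged)
```

Calling `merge_targets(tcmsp, swiss, stitch)` with three such lists would return one sorted list in which each target appears once.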
3.3. The Targets of Lung Cancer
A total of 45 lung cancer-related genes were identified in the TTD database. Three disease data sets were selected from DisGeNET, with the Disease IDs C0684249, C0242379 and C0007121; they yielded 1963, 1991 and 37 target genes, respectively. After removing duplicates, we obtained 2129 disease targets.
3.4. The Common Target of Compounds and Diseases and PPI Network
A Venn diagram was used to identify the targets common to the compounds and the disease. In total, 80 targets were shared, of which Luteolin contributed the most (64), followed by Acacetin (22), cis-Dihydroquercetin (7) and Spinasterol (2). Because some targets are shared by more than one compound, the per-compound counts sum to more than 80. The result is shown in Fig. 3A. These 80 targets were submitted to the STRING database to build a PPI network, which consisted of 80 nodes and 1057 edges. We used Cytoscape to visualize and analyze the PPI network. The result is shown in Fig. 3B.
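The intersection underlying a Venn diagram of this kind can be sketched as follows. The function name `shared_targets` and the input layout (a dictionary mapping each compound to its set of targets) are assumptions made for illustration only.

```python
def shared_targets(compound_targets, disease_targets):
    """Intersect each compound's targets with the disease target set.

    compound_targets: dict mapping compound name -> iterable of gene symbols
                      (assumed layout).
    disease_targets:  iterable of gene symbols.
    Returns (common, per_compound): the set of all shared targets and a
    dict of the shared targets contributed by each compound.
    """
    disease = set(disease_targets)
    per_compound = {
        compound: set(targets) & disease
        for compound, targets in compound_targets.items()
    }
    # A target hit by several compounds is counted once in `common`.
    common = set().union(*per_compound.values())
    return common, per_compound
```

Since `common` is a set union, a target hit by two compounds is counted once there but once per compound in `per_compound`, so the per-compound counts can sum to more than the overall total.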
3.5. Hub Targets
After the PPI network was visualized in Cytoscape and analyzed topologically with the NetworkAnalyzer tool, topological indexes such as Degree and Betweenness Centrality were obtained. Under the hub target criterion we defined, a target whose Degree and Betweenness Centrality were both more than twice the respective median was identified as a hub target. Based on this criterion, 7 hub targets were identified. Their topological indexes are shown in Table 2.
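The hub criterion (Degree and Betweenness Centrality both above twice the median) can be written as a short filter. This is a sketch under the assumption that the topological indexes have been exported as a mapping from node name to a (degree, betweenness) pair; the name `select_hubs` is hypothetical.

```python
from statistics import median


def select_hubs(nodes):
    """Return nodes whose degree and betweenness both exceed 2x the median.

    nodes: dict mapping node name -> (degree, betweenness_centrality),
           an assumed export layout for the topological indexes.
    """
    degree_cutoff = 2 * median(d for d, _ in nodes.values())
    betweenness_cutoff = 2 * median(b for _, b in nodes.values())
    return sorted(
        name
        for name, (degree, betweenness) in nodes.items()
        if degree > degree_cutoff and betweenness > betweenness_cutoff
    )
```

Both conditions must hold, so a node with a high Degree but an unremarkable Betweenness Centrality is not selected.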
To clarify the expression differences of the 7 hub targets between lung cancer and normal lung tissues, we retrieved data series GSE19804 from the GEO database (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE19804. Accessed 23 May 2020) [31]. We obtained 120 samples from paired tumor and normal tissues. We used the heatmap.2 function in R (gplots package) to normalize the data and draw the heatmap. We found that the expression of AKT1, CASP3, EGFR and TP53 was up-regulated and the expression of IL-6, MAPK1 and VEGFA was down-regulated in lung cancer. The expression level of each gene is shown as a scatter plot. The results are shown in Fig. 4.
3.6. GO Enrichment
DAVID was used to perform GO enrichment of the 7 hub targets, and 81 enrichment results were obtained, including 67 biological processes (BP), 10 molecular functions (MF) and 4 cellular components (CC). The top ten GO enrichment results of each group are shown in Fig. 5.
3.7. KEGG Pathway Enrichment
DAVID was used to perform KEGG pathway enrichment of the 7 hub targets. After removing the overly broad pathways, 23 pathways were obtained. The top three were Proteoglycans in cancer, HIF-1 signaling pathway and PI3K-Akt signaling pathway. The 23 pathways are shown in Fig. 6.
Then, a compounds-targets-pathways network was constructed with the 7 hub targets. The network had 32 nodes and 114 edges, comprising 2 compounds, 7 targets and 23 pathways, connected by 21 target-target, 10 compound-target and 83 target-pathway interactions. The result after modular analysis is shown in Fig. 7.
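The node and edge totals follow directly from the reported components, as this short tally shows. The counts are taken from the text; the dictionary keys are only labels chosen for the example.

```python
# Component counts as reported for the compounds-targets-pathways network.
node_counts = {"compounds": 2, "targets": 7, "pathways": 23}
edge_counts = {"target-target": 21, "compound-target": 10, "target-pathway": 83}

total_nodes = sum(node_counts.values())  # 2 + 7 + 23
total_edges = sum(edge_counts.values())  # 21 + 10 + 83

print(total_nodes, total_edges)  # prints "32 114"
```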
3.8. Molecular Docking Result
AutoDock Vina was employed to conduct molecular docking, and the results are listed in Table 3. The reliability of the docking results was evaluated by the affinity score, which represents the binding strength between compound and target: the more negative the value, the stronger the predicted binding. The best binding protein for Luteolin was 5UG9, a structure of EGFR. The best binding protein for Acacetin was 1NME, a structure of CASP3. Their three-dimensional structures and protein-ligand interactions are shown in Fig. 8.
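Selecting the best binding protein for a compound amounts to taking the most negative affinity. The sketch below assumes the docking scores for one compound have been collected into a dictionary keyed by receptor identifier; the function name `best_binding` and the values in the usage comment are placeholders, not values from Table 3.

```python
def best_binding(affinities):
    """Return the (receptor, affinity) pair with the most negative affinity.

    affinities: dict mapping receptor identifier -> affinity score
                (assumed layout; more negative means stronger binding).
    """
    if not affinities:
        raise ValueError("no docking results supplied")
    return min(affinities.items(), key=lambda item: item[1])


# Usage with placeholder scores (hypothetical, for illustration only):
# best_binding({"receptor_A": -6.1, "receptor_B": -8.4})
```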