Integrating Data Mining, Network Pharmacology, and Molecular Docking Verification to Investigate the Molecular Mechanism of Traditional Chinese Medicine Prescriptions for Treating Male Infertility

Background: Male infertility (MI) affects almost 5% of adult men worldwide, and 75% of these cases are unexplained (idiopathic). Current treatments are limited by the unclear mechanism of MI, which highlights the urgent need for a more effective strategy or drug. Traditional Chinese Medicine (TCM) prescriptions have been used to treat MI for thousands of years, but their molecular mechanism is not well defined. Methods: To reveal the molecular mechanism of TCM prescriptions for MI, we applied a comprehensive strategy integrating data mining, network pharmacology, and molecular docking verification. First, we collected 289 TCM prescriptions for treating MI, issued over 6 years, from the National Institute of TCM Constitution and Preventive Medicine. Then, the Core Chinese Materia Medica (CCMM), the crucial combination in TCM prescriptions, was obtained with the TCM Inheritance Support System from the China Academy of Chinese Medical Sciences. Next, the components and targets of CCMM in TCM prescriptions and the MI-related targets were collected and analyzed through a network pharmacology approach. Results: The results showed that the molecular mechanism of TCM prescriptions for treating MI involves regulating hormones and inhibiting apoptosis, oxidative stress, and inflammation. The estrogen signaling pathway, PI3K-Akt signaling pathway, HIF-1 signaling pathway, and TNF signaling pathway are the most important signaling pathways. Molecular docking experiments were used to further validate the network pharmacology results. Conclusions: This study not only identifies the CCMM and the molecular mechanism of TCM prescriptions for treating MI, but may also be helpful for the popularization and application of TCM treatment.

Traditional Chinese Medicine (TCM) can be effective in the treatment of many diseases, including MI [17].
A characteristic of TCM treatment is that different TCM prescriptions are formulated according to the constitutional signs and symptoms of each patient [18,19]. Within TCM prescriptions, the Core Chinese Materia Medica (CCMM) plays a crucial role in medical cases and is used to find the underlying laws and associations of TCM treatment. Various TCM prescriptions have been used for MI, but their molecular mechanism is still unclear, which limits the clinical application of TCM. Therefore, to accelerate the clinical application of TCM, it is essential to explore the CCMM and the underlying molecular mechanism of TCM prescriptions for treating MI.
The holistic concept of TCM has much in common with the major points of network pharmacology, in which the general "one target, one drug" mode is shifted to a new "network target, multi-components" mode [20,21]. In such a mode, combining network pharmacology with TCM prescriptions opens a novel direction for discovering bioactive components and potential targets, revealing the molecular mechanism, and examining the scientific evidence for the numerous herbs in TCM prescriptions, based on the complex biological systems of the human body. Molecular docking, a method for predicting binding sites, is performed to estimate the associations between components and targets [22]. Correspondingly, this study not only uncovers the CCMM and the underlying molecular mechanism of TCM prescriptions for treating MI, but may also be helpful for the popularization and application of TCM treatment.
In view of the "multi-component", "multi-target", and "multi-pathway" characteristics of TCM prescriptions [23], we adopted a comprehensive approach integrating data mining, network pharmacology, and molecular docking verification. First, 6 years of TCM prescriptions for treating MI were taken from the outpatient medical records of the National Institute of TCM Constitution and Preventive Medicine, affiliated to Beijing University of Chinese Medicine. Next, the TCM Inheritance Support System (TCMISS) software was used to screen for and discover the CCMM in TCM prescriptions. Then, the components and targets of CCMM in TCM prescriptions, and the MI targets, were obtained from various databases. Subsequently, network pharmacology was used to investigate the core targets and signaling pathways in the mechanism of CCMM in TCM prescriptions against MI. To evaluate the network pharmacology results, a molecular docking approach was used to investigate the interactions between representative components and key targets of CCMM in TCM prescriptions for MI.

TCM prescriptions analysis
TCM prescriptions used to treat MI were obtained from the outpatient departments of the National Institute of TCM Constitution and Preventive Medicine, affiliated to Beijing University of Chinese Medicine (from January 2013 to June 2019); all were prescribed by Professor Qi Wang. The inclusion criteria for TCM prescriptions were: (1) the patient was first diagnosed with MI, including azoospermia, oligozoospermia, asthenospermia, and teratospermia; (2) the patient was older than 23; (3) the patient had been married for more than 1 year and had had regular intercourse without contraception for 12 months, with failure to conceive attributable to male factors; (4) the patient had no family history related to MI. The exclusion criterion was that the patient's wife had a disease that makes it difficult to conceive. The TCMISS software (V2.5) is provided by the China Academy of Chinese Medical Sciences. It focuses specifically on data mining and analysis of TCM prescriptions, and ultimately uncovers their core laws and combinations [24,25]. The software contains six functional modules: clinical collection, platform management, data management, knowledge retrieval, statistical report, and data analysis. Three graduate students were responsible for the accuracy of prescription collection. One of them used the "clinical collection" function to collect the prescriptions, and the others used the "platform management" function to check the data. The "data analysis" function was used to analyze the frequency of Chinese Materia Medica (CMM) in TCM prescriptions, and the combinations of CCMM were then obtained. The support degree was 140 and the confidence score was greater than or equal to 0.95.
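The support and confidence thresholds above follow the usual vocabulary of association rule mining. The sketch below illustrates how such thresholds could be applied to prescription data; it is not the TCMISS implementation, whose internals are not described here. It assumes that the support degree is an absolute count of prescriptions containing a combination, and the herb names and the functions `frequent_pairs` and `confidence` are hypothetical placeholders.

```python
from itertools import combinations


def frequent_pairs(prescriptions, min_support):
    """Return herb pairs that co-occur in at least `min_support` prescriptions.

    Assumption: support is an absolute count, not a fraction.
    """
    counts = {}
    for herbs in prescriptions:
        # Deduplicate and sort so each pair has one canonical ordering.
        for pair in combinations(sorted(set(herbs)), 2):
            counts[pair] = counts.get(pair, 0) + 1
    return {pair: n for pair, n in counts.items() if n >= min_support}


def confidence(prescriptions, antecedent, consequent):
    """Fraction of prescriptions containing `antecedent` that also contain `consequent`."""
    with_antecedent = [set(p) for p in prescriptions if antecedent in p]
    if not with_antecedent:
        return 0.0
    hits = sum(consequent in p for p in with_antecedent)
    return hits / len(with_antecedent)


# Toy data with placeholder herb names (not taken from the study).
toy_prescriptions = [
    ["herb_A", "herb_B", "herb_C"],
    ["herb_A", "herb_B"],
    ["herb_A", "herb_B", "herb_D"],
    ["herb_C", "herb_D"],
]

pairs = frequent_pairs(toy_prescriptions, min_support=3)
conf_ab = confidence(toy_prescriptions, "herb_A", "herb_B")
print(pairs, conf_ab)
```

With the study's settings, the corresponding call would use `min_support=140` and keep only rules whose confidence is at least 0.95.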

Targets of bioactive components of CCMM in TCM prescriptions
Swiss Target Prediction (http://www.swisstargetprediction.ch/) [34] was used to obtain the targets of bioactive components, with the species limited to "Homo sapiens" and probability value > 0. Finally, the names of targets were standardized by UniProtKB (https://www.uniprot.org/) [35] (Additional file 3: Table S3). The CCMM component-target network was constructed using Cytoscape (http://www.cytoscape.org, version 3.8.0) [36]. The degree value of each node in the network was calculated by Network Analyzer [37], a plugin for the Cytoscape software.
CCMM-MI common-target network with a percentage greater than 60%, indicating that they are the CCMM in the prescriptions for MI. The combinations of CCMM in TCM prescriptions are shown in Table 2. In addition, the application mode of CCMM in TCM prescriptions was visualized as a network using the TCMISS software (Figure 1). Then, we constructed a CCMM component-target network using the 98 components and 816 targets (Figure 2). We found that beta-sitosterol, sitosterol, quercetin, kaempferol, isorhamnetin, CLR, campesterol, stigmasterol, and beta-carotene occur in more than one CCMM. The structure, OB, and DL of these duplicate components are shown in Table 3. We therefore considered that these duplicate components merit further exploration in subsequent experiments.
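The degree value reported by Network Analyzer is the number of edges attached to a node. As a minimal sketch of that calculation outside Cytoscape, the following computes node degrees for a small component-target network. The edge list uses placeholder names only and does not reproduce any interaction from the study, and `degree_table` is a hypothetical helper.

```python
from collections import defaultdict


def degree_table(edges):
    """Return {node: degree} for an undirected network given as (component, target) pairs."""
    adjacency = defaultdict(set)
    for component, target in edges:
        # Sets make repeated edges count once, as in a simple graph.
        adjacency[component].add(target)
        adjacency[target].add(component)
    return {node: len(neighbours) for node, neighbours in adjacency.items()}


# Placeholder edges (illustrative only, not study data).
toy_edges = [
    ("component_1", "target_X"),
    ("component_1", "target_Y"),
    ("component_2", "target_X"),
    ("component_2", "target_Z"),
    ("component_3", "target_X"),
]

degrees = degree_table(toy_edges)
# Rank nodes from most to least connected.
ranked = sorted(degrees.items(), key=lambda item: (-item[1], item[0]))
print(ranked)
```

In this toy network `target_X` has the highest degree because three components point to it, which is the kind of hub a degree ranking is meant to surface.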

GO and KEGG enrichment analyses of the targets of CCMM in TCM prescriptions
The GO enrichment analysis contains three sections: biological process, cellular component, and molecular function. We found that CCMM could inhibit apoptosis, promote cell proliferation, and regulate the cytosolic calcium ion concentration through negative regulation of apoptotic process (GO:0043066), positive regulation of cytosolic calcium ion concentration (GO:0007204), and positive regulation of cell proliferation (GO:0008284). Additionally, Prostate cancer (hsa05215), HIF-1 signaling pathway (hsa04066), Progesterone-mediated oocyte maturation (hsa04914), and Acute myeloid leukemia (hsa05221) are related to male reproductive function, as shown in Figure 3.
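The enrichment tool and statistical test are not stated in this section. Over-representation analysis of GO terms and KEGG pathways is commonly based on a hypergeometric (one-sided Fisher) test, so the sketch below shows that calculation under this assumption. The function name `hypergeom_pvalue` and all numbers in the usage example are hypothetical.

```python
from math import comb


def hypergeom_pvalue(background, annotated, drawn, overlap):
    """Upper-tail hypergeometric p-value P(X >= overlap).

    background: number of genes in the background set
    annotated:  background genes annotated to the term or pathway
    drawn:      size of the target list being tested
    overlap:    target-list genes annotated to the term or pathway
    """
    upper = min(annotated, drawn)
    total = comb(background, drawn)
    favourable = sum(
        comb(annotated, i) * comb(background - annotated, drawn - i)
        for i in range(overlap, upper + 1)
    )
    return favourable / total


# Hypothetical example: 20,000 background genes, 100 in a pathway,
# 800 targets in the list, 15 of which fall in the pathway.
p_value = hypergeom_pvalue(20000, 100, 800, 15)
print(p_value)
```

In practice the p-values for all tested terms would also be corrected for multiple testing before terms are ranked.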

MI-related targets
A total of 671 targets of MI were collected from four different databases. Among these, 225 targets were from CTD, 210 targets were from DisGeNET, 197 targets were from GeneCards, and 181 targets were from OMIM (Figure 4A). Subsequently, we performed GO and KEGG pathway enrichment analyses on the MI-related targets (Figure 4B and C). The results showed that pathways in cancer (hsa05200), the PI3K-Akt signaling pathway (hsa04151), and the MAPK signaling pathway (hsa04010) were the most significant signaling pathways. Moreover, positive regulation of transcription from RNA polymerase II promoter (GO:0045944), response to drug (GO:0042493), negative regulation of apoptotic process (GO:0043066), and spermatogenesis (GO:0007283) were the most significant biological process terms. Based on these results, we suggest that MI is related to apoptosis and spermatogenesis.

GO and KEGG pathway enrichment analyses of the key targets
Furthermore, GO and KEGG enrichment analyses were performed on the key targets of the PPI network (Figure 9). The results showed that the estrogen signaling pathway is the most significant signaling pathway, and DNA damage induced protein phosphorylation is the most significant GO term. The targets shared by the PI3K-Akt signaling pathway, estrogen signaling pathway, HIF-1 signaling pathway, TNF signaling pathway, and the key targets in the PPI network are AKT1, MAPK3, MAPK1, EGFR, GAPDH, and TNF (Table 6, Figures 10-13).
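When targets are pooled from several databases, the same gene can appear in more than one source, so per-database counts need not add up to the pooled total. How the four lists were merged is not described here. The sketch below shows one plausible approach, a set union followed by an intersection with the drug targets. The gene symbols, database labels, and the helper `merge_targets` are placeholders.

```python
def merge_targets(sources):
    """Pool gene symbols from several sources, counting each symbol once."""
    merged = set()
    for genes in sources.values():
        # Strip stray whitespace so identical symbols compare equal.
        merged |= {gene.strip() for gene in genes}
    return merged


# Placeholder disease-target lists (not study data).
disease_sources = {
    "database_1": ["GENE1", "GENE2"],
    "database_2": ["GENE2", "GENE3 "],
    "database_3": ["GENE3", "GENE4"],
}

# Placeholder drug-target list, standing in for the CCMM targets.
drug_targets = {"GENE2", "GENE4", "GENE9"}

disease_targets = merge_targets(disease_sources)
common_targets = disease_targets & drug_targets
print(len(disease_targets), sorted(common_targets))
```

Here the three lists hold six entries in total but only four distinct symbols, and two of those are shared with the drug targets.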
Finally, molecular docking was conducted to verify the strong interactions between the representative components and key targets, supporting the reliability of the network pharmacology results (Figure 18).
In detail, kaempferol could protect sperm from estrogen-induced oxidative DD [51]. An in vitro study has shown that kaempferol restored the motility of aluminum-exposed human sperm cells and decreased the production of malondialdehyde (MDA), a lipid peroxidation marker [52]. Quercetin was confirmed to indirectly affect the stimulation of the sex organs, at both the cellular and organ levels [53], and to show outstanding beneficial effects on serum total testosterone [54]. Isorhamnetin is a flavonoid and a direct metabolite of quercetin. It persists longer in plasma than quercetin [55] and has anti-inflammatory and antioxidant effects [56,57]. Beta-sitosterol is a naturally occurring phytosterol with a steroidal moiety; it can inhibit tumor growth, modulate the immune response, and has antioxidant capacity. Beta-sitosterol is regarded as a potential chemopreventive agent for a variety of cancers, including prostatic carcinoma and breast cancer [58].
Among the signaling pathways obtained, the estrogen signaling pathway is the most notable. Estrogens are involved in the pathophysiology of varicocele-associated male infertility [59]. Estrogen stimulation can directly affect the apoptosis of germ cells, and it can also alter the communication between germ cells and thereby change their apoptosis [60], which may have a profound impact on MI. Aberrant activation of the PI3K-Akt signaling pathway may contribute to increased cell invasiveness and facilitate prostate cancer progression [61]. Hypoxia-Inducible Factor (HIF)-1 plays an integral role in responding to low oxygen concentrations, or hypoxia, in humans [62]. The TNF family is thought to stimulate NF-κB and is thus implicated in varicocele-mediated pathogenesis [63].
AKT1 is considered a moderator of cellular growth, survival, metabolism, and proliferation [64]. AKT1 also suppresses radiation-induced germ cell apoptosis in vivo [65] and enhances the effects of thyroid hormone on postnatal testis development [66]. MAPKs have been linked to disturbances in spermatogenesis and dysfunction of germ cells and Sertoli cells, resulting in reduced semen quality and male reproductive dysfunction [67]. In humans, MAPK3 and MAPK1 may play a crucial role in cell cycle progression and apoptosis [68]. In the capacitation process, EGFR is partially activated by protein kinase A (PKA), resulting in phospholipase D (PLD) activation and actin polymerization [69]. In the testis, GAPDH is of particular importance for spermatogenesis and is associated with the reduced sperm motility seen in male infertility [70].

Conclusion
In summary, based on the comprehensive approach integrating data mining, network pharmacology, and molecular docking verification, we found that

Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.

Competing interests
The authors declare no conflicts of interest.