Virtual Screening of Potential AEC2 Inhibitors for COVID-19 from Traditional Chinese Medicine

Background: Angiotensin-converting enzyme 2 (ACE2), a negative regulator of the renin-angiotensin system and the severe acute respiratory syndrome-coronavirus (SARS-CoV) and SARS-CoV-2 receptor, plays an important role in viral genome replication and immune response. ACE2 has been proposed as a potential therapeutic target. Traditional Chinese medicine (TCM) could be considered as a promising complementary therapeutic option in the management of COVID-19. However, the active components and action mechanisms that account for its therapeutic effects remain controversial. Methods: ACE2 was employed as a target related to COVID-19 to explore the active ingredients from the Traditional Chinese Medicine Systems Pharmacology Database and Analysis Platform (TCMSP) database. And the PharmMapper database and TCMSP database were used to predict the targets of the compounds. Moreover, the potential therapeutic targets of COVID-19 were acquired by intersection among genes differentially expressed in COVID-19 patients, genes screened from ve public databases and genes targeted by active compounds. Finally, molecular docking verication and function analysis were performed. Results: In this study, puerarin, the common active compound for targeting ACE2 across the ve TCMs (Radix Cyathulae, Flos Puerariac, Radix Bupleuri, Radix Puerariac and Radix Hemerocallis), was found. Due to the comprehensive analysis, we revealed that puerarin might prevent the entrance of SARS-CoV-2 entry into cells by blocking ACE2, chiey modulated the T cell immunity and regulate pro-inammatory cytokine response by affecting TNF, STAT1 and RNASE3. Conclusion: Taken together, the present study found that puerarin might have a therapeutic effect on COVID-19 through regulation of the immune system, inhibition of inammation and prevention of virus entry into cells.

Traditional Chinese medicine (TCM) has over thousands of years of clinical practice with types of infectious diseases. TCM has a good effect on viral infectious pneumonia, and has been employed as effective treatments for severe acute respiratory syndrome (SARS) (8). According to recent data, TCM plays a critical role in prevention, treatment and rehabilitation of COVID-19. Early intervention by TCM could improve the cure rate, delay the disease progression, and reduce the mortality rate (9). Puerarin [7-hydroxy-3-(4-hydroxyphenyl)-1-benzopyran-4-one 8-(β-DGlucopyranoside)] is an iso avonoid found in Chinese herbs like radix pueraria, which possesses diverse pharmacological activities, including anti-in ammatory, and anti-viral activities (10)(11)(12). In this study, we found that puerarin might be a potential candidate to treat COVID-19. To further explore the molecular mechanism for puerarin against COVID-19, we integrated the network pharmacology tools and bioinformatics to predict the genes and proteins targeted by puerarin. By combination of clinical data of COVID-19, we speculated that ACE2, STAT1, RNASE3, LCK, TNF, FOS, PDE5A, IMPDH2, MAPK8 and MAPKAPK2 might be potential therapeutic targets of COVID-19 by puerarin.

Retrieval of candidate protein targets for COVID-19 from Traditional Chinese Medicine
Angiotensin-converting enzyme 2 (ACE2) is the cellular receptor for SARS-CoV-2 that related to COVID-19.
We employed ACE2 as a target to nd the potential traditional Chinese medicine from TCMSP databases.
Five traditional Chinese medicines, including Radix Cyathulae, Flos Puerariac, Radix Bupleuri, Radix Puerariac and Radix Hemerocallis were screened out. The representative compounds of the ve traditional Chinese medicines and their targets genes were showed in Fig. 2A. Meanwhile, the screening result also showed that puerarin was the common active compound across the ve Chinese medicines, which would target ACE2 ( Fig. 2A and B). The study owchart is illustrated in Fig. 1.
Moreover, to predict more protein targets of puerarin other than ACE2, the three-dimensional structure of puerarin was obtained from PubChem database and uploaded to PharmMapper database. As a result, a total of 317 protein targets from TCMSP database and PharmMapper database were obtained (Fig. 1, purple). Furthermore, functional enrichment analysis showed that the targets of puerarin were signi cantly involved in COVID-19-associated gene sets (Fig. 2C). In our previous study, we constructed a protein-protein interaction (PPI) network involved in cytokine secretion and found hub proteins which were critically important in ACE2-associated in ammatory responses (13). We found 6 puerarin targets were participated in the PPI network, including the most important hub protein, SRC (Fig. 2D). Therefore, we speculated that puerarin would have a potential effect on COVID-19.
Functional enrichment analysis of COVID-19 associated genes targeted by puerarin COVID-19 associated genes or proteins were screened in GeneCards, DrugBank, PharmGkb, TTD and OMIM databases. Firstly, we searched for "Novel Coronavirus" or "COVID-19" in these public databases and obtained 744 COVID-19 related genes (Fig. 1, pink). By intersecting with puerarin protein targets, the Venn diagram showed that 57 co-targets were identi ed (Fig. 3A). In addition, functional enrichment analysis of the 57 genes suggested these genes were signi cantly related to regulation of immune cell (e.g., T cell activation, and Leukocyte Differentiation), signaling by interleukins and virus response (e.g., Hepatitis B, Human Cytomegalovirus Infection and Human Immunode ciency Virus 1 Infection) ( Fig. 3B and C).

Functional Enrichment Analysis Of Covid-19 Associated Genes
The abundance of different immune cell types while infection can give us more insights into the immune regulation features in COVID-19. We rst assessed the immune cell abundance in lung samples of COVID-19 patients using xCell (Fig. 4A). Compared with those in healthy subjects, the result showed that the in ltrations of CD4 + T cells, B cells, CD8 + Tem, NK cells, iDC, macrophages, mast cells and neutrophils were increased in the lung of COVID-19 patients. In the meantime, to systematically evaluate the transcriptomic alterations in COVID-19, we applied the differential expression analysis to identify the aberrantly expressed genes in COVID-19 patients (|Log2 fold change| > 1, p.value < 0.05). A total of 2587 differentially expressed genes were identi ed, with 1800 up-regulated genes and 787 down-regulated genes. To further obtain the speci c genes associated with COVID-19 affected by SARS-COV-2, the intersection between the differentially expressed genes of COVID-9 and novel coronavirus related genes was performed, and 136 genes were found. Functional enrichment analysis revealed that these genes were involved in virus infection (e.g., In uenza A, Measles, Herpes simplex Infection and Epstein-Barr virus Infection), viral process (e.g., regulation of viral process), response to virus (e.g., defense response to virus, response to virus), signaling by interleukins, and response to interferon (e.g., interferon alpha/beta signaling, response to interferon gamma, type I interferon signaling pathway) ( Fig. 4B and C).

The potential therapeutic targets of COVID-19 by puerarin
To explore the promising role of puerarin in COVID-19 treatment, the intersection among genes targeted by puerarin, genes screened from public databases and genes differentially expressed in COVID-19 patients was performed. 10 genes, ACE2, STAT1, RNASE3, LCK, TNF, FOS, PDE5A, IMPDH2, MAPK8 and MAPKAPK2, were identi ed as potential therapeutic targets of COVID-19 by puerarin (Fig. 5A). The volcano plot also showed that TNF, STAT1, ACE2, RNASE3 were up-regulated, FOS, MAPKAPK2, LCK, IMPDH2, MAPK8 and PDE5A were down-regulated after infection (Fig. 5B). Secondly, we analyzed the protein-protein interactions among the 10 potential therapeutic targets. The PPI network showed that these proteins were tightly interactive with each other, except ACE2 and IMPDH2 (Fig. 5C). In addition, we found that TNF was the hub protein in the PPI regulation network, indicating that this molecule was critical in COVID-19 (Fig. 5D). KEGG pathway enrichment analysis also revealed that these genes were mainly involved in T-cell-associated pathway (e.g., Human T-Cell Leukemia Virus 1 Infection, T Cell Receptor Signaling Pathway, Th17 Cell Differentiation, and Th1 And Th2 Cell Differentiation) ( Fig. 5E and F).

Molecular docking veri cation and function analysis of potential therapeutic targets
Molecular docking was performed by Autodock vina. The interaction between ligand (puerarin) and receptors (10 proteins) were showed in Fig. 6, indicating that the small-molecule compound was tightly bound to the residues of proteins in binding pockets. To obtain further insight into the function of the 10 potential therapeutic targets, gene set enrichment analysis was conducted. 59 normal lung samples from The Cancer Genome Atlas (TCGA) were used. The result showed that high expression of TNF correlated with cellular response to virus, viral entry into host cell, cytokine secretion, immune cell-mediated immunity (Fig. 6A). Low expression of MAPK8 was related to negative regulation of humoral immune response, and negative regulation of myeloid leukocyte mediated immunity (Fig. 6B). High expression of ACE2 was found to be involved in adaptive immune response, nature killer cell, T cell, B cell, lymphocyte mediated immunity and response to virus (Fig. 6C). As to STAT1, it was associated with cytokine production, defense response to virus, humoral immune response, and type 2 immune response (Fig. 6D). For LCK, high expression correlated with humoral and adaptive immune response, cytokine production and secretion, viral gene expression and viral genome replication (Fig. 6E). RNASE3 displayed correlation with defense response to virus, macrophage activation, mast cell mediated immunity and cytokine ( Fig. 6F). All above genes were related to immune responses and viral response, suggesting their important position as the therapeutic targets in COVID-19 ( Fig. 6).

Discussion
As the global outbreak of COVID-19 caused by SARS-CoV-2 is rapidly evolving and expanding, its effects is becoming evident-from mild, self-limiting respiratory tract illness to severe acute respiratory distress syndrome (ARDS), multiple organ failure, and death (14). Most patients with COVID-19 have mild symptoms like low-grade fevers and mild fatigue, but no symptoms suggestive of pneumonia. About 5% develop severe symptoms, which can include ARDS, septic shock, and severe metabolic acidosis and coagulation disorders (15).
Innate immune sensing serves as the rst line of antiviral defense and is essential for immunity to viruses. Based on recent transcriptional and high dimensional cytometry studies on COVID-19 patients, crucial roles for impaired innate antiviral response, high pro-in ammatory response and dysfunction of adaptive immune response in disease severity of SARS-CoV-2 infection has been established (16,17). In this present study, immune cell abundance analysis showed signi cant increases of adaptive immune cells including CD4 + T cells, B cells, CD8 + Tem and iDC, as well as of pro-in ammatory neutrophils, macrophages and mast cells among COVID-19 patients. And we also found an increase of NK cells in the lung of COVID-19 patients, which are identi ed as vital mediators of early eradication of virus (18). Interferons (IFNs) are considered the most important for antiviral defense. Study demonstrated that SARS-CoV-2 suppresses IFN release in vitro and in vivo, as evidenced by signi cantly decreasing the production of type I/III IFN signatures in infected cell lines and a ferret model (19). In the meantime, compared to mild or moderate cases, patients with severe COVID-19 present remarkably impaired IFN-I signatures (20). Numerous studies also demonstrated that SARS-CoV-2 is sensitive to IFN-I/III pretreatment in vitro (19,21,22). By evaluating the transcriptomic alterations in COVID-19, we identi ed 136 genes associated with SARS-CoV-2 infection. Functional enrichment analysis of the 136 genes revealed that they were involved in response to type I interferon, type I interferon signaling pathway, response to interferon gamma, interferon alpha/beta signaling. In addition, these genes were signi cantly related to regulation of cytokine production, cytokine-mediated signaling pathway and signaling by interleukins, which supports the immunopathology of excessive in ammatory cytokine release in systemic infection patients.
It has been found that SARS-CoV-2 is able to bind to the ACE2 receptor on the surface of cells. Zhou et al. demonstrated that ACE2 was the cell entry receptor for SARS-CoV-2 in in vitro studies (23). Our previous study also analyzed the importance of ACE2 in the susceptibility of SARS-CoV-2 infection (24). Therefore, we employed ACE2 as the therapeutic target of COVID-19 to explore the related TCM. Unexpectedly, we obtained ve TCMs, including Radix Cyathulae, Flos Puerariac, Radix Bupleuri, Radix Puerariac and Radix Hemerocallis. We also found that puerarin was the common active compound across the ve TCMs, which targeted ACE2. Besides mediation of the virus entry, it has been documented that ACE2 is related to the regulation of in ammatory cytokine response in coronavirus infected cells. In this current study, we found that 6 puerarin target proteins were signi cantly involved in our previously constructed PPI network of in ammatory cytokine production which was crucial in ACE2-associated in ammation. This is consistent with previous studies. Puerarin was also able to reduce in ammation by regulation of neutrophils and eosinophils in rabbit models (25). In addition, puerarin had no toxic effect on host cells but only had a moderate ability to reduce viral production (26). Our functional enrichment analysis also suggested that the puerarin target genes were principally participated in COVID-19-associated gene sets.
Therefore, we speculated that puerarin would have a potential effect on COVID-19.
However, due to the complex targets by puerarin, the detailed mechanisms need to be further addressed.
To gure it out, we intersected the potential targets of puerarin with COVID-19 associated genes. Several genes, ACE2, STAT1, RNASE3, LCK, TNF, FOS, PDE5A, IMPDH2, MAPK8 and MAPKAPK2, were identi ed as potential therapeutic targets of COVID-19 by puerarin. Molecular docking simulation indicated that puerarin with effective score when docking with these potential therapeutic targets, which is in agreement with the above integrated analysis results. These target genes were signi cantly enriched in T-cellassociated pathway, such as virus indued T cell response, T Cell Receptor Signaling, Cell Differentiation of Th1, Th2 and Th17. A wealth of studies con rmed that Th1 cells are the leading Th cells in T cell immunity during SARS-CoV infection. Th1 cells generally secret IL-2, IFN-γ, IL-12, and TNF-a in the site of viral infection (27). However, recent studies on COVID-19 patients proposed that the production of Th2 cytokines along with Th17-speci c cytokines are also increased. TCR and cytokines released from Th cells are indispensable for activation of CD8 + T cells. Cheng MH and his colleague analyzed the TCR repertoire in adult SARS-CoV-2 infected patients, and found that SARS-CoV-2 spike glycoprotein as a superantigen induces cytokine storm via binding to TCR followed by TCR skewing (28). Indeed, puerarin has been shown to regulate expression of Th1/Th2 cytokines and inhibit Th17 cells in rats (29). Thus, we speculated that puerarin probably exert a protective effect on COVID-19 through modulation of T cell immunity.
Of all the targets, TNF, STAT1 and RNASE3, in addition to ACE2, were mainly up-regulated after SARS-CoV-2 infection. According to our GSEA analysis, high expression of TNF mainly associated with virus response and cytokine secretion related immune cell response. And STAT1 was primarily correlated with virus defense response, cytokine production and humoral immunity. It has been previously reported that serum TNF-α levels in severe COVID-19 patients are signi cantly higher than patients who have mild symptoms (30). Karki et al. identi ed that TNF-α is a key factor, working together with IFN-γ, to induce immune cell death and cytokine shock syndrome during SARS-CoV-2 infection through the of JAK/STAT1/IRF1 axis (31). Meanwhile, RNASE3 was highly related to virus defense, macrophage activation and mast cell mediated cytokine response based on our GSEA analysis. Earlier, KEGG pathway analysis of the up-regulated differentially expressed genes conducted by Lu and co-workers suggested that TNF signaling pathway are signi cantly enriched in RNase3 treated macrophages. STAT1 are also activated in RNase3 exposed macrophages, and this activation is followed by the hyper-production of chemokines and cytokines including TNF-α (32). Moreover, uncontrolled activation of mast cells can rapidly release large amounts of TNF-α and proteases stored in granules to further aggravate cytokine secretion storm and exacerbate disease severity (33). Additionally, our PPI analysis also showed that TNF was the hub protein in the regulation network, indicating that this molecule was critical in COVID-19. As ARDS with cytokine storm might be the main cause of death due to severe viral infection, antiin ammatory active compounds could be effective options for COVID-19. It has been established that chloroquine, a classic antimalarial compound, can inhibit production of TNF-α on the mRNA level, the pretranslational level and the post-transitional level, as well as block the receptor induced TNF pathway during virus infection (34). Early study demonstrated that puerarin greatly suppresses systemic in ammation by inhibiting the transcription of in ammatory factor TNF-α, IL-6, and IL-1β in sepsis mouse (35). Additionally, puerarin signi cantly inhibited in ammation through the down-regulation of nuclear factor-κB (NF-κB), MAPK activation and the secretion of pro-in ammatory mediators (36,37). Puerarin administration also reduces the expression of TNF-α and STAT1 in bone mesenchymal stem cells by regulation of miR-155-3p (38). From this point, we speculated that puerarin may against COVID-19 by mainly mediation of proin ammatory cytokine response, especially targeting the TNF pathway.

Conclusion
In this study, we presented an integrative network-based methodology for systematic identi cation of putative TCMs and the active compound of TCMs for potential treatment of COVID-19. Integration of TCM-target networks, SARS-CoV-2-induced transcriptomic alterations in human and SARS-CoV-2-host interactions are essential for the identi cation. Due to the comprehensive analysis, 5 candidate TCMs and 1 active compound for targeting SARS-CoV-2 were identi ed. Furthermore, 10 potential therapeutic targets of COVID-19 by puerarin were explored. Puerarin may against COVID-19 by regulation of the immune system, inhibition of in ammation and prevention of virus entry into cells. As a whole, this study offers a scienti c basis on TCMs or its active ingredients as a potential option in the management of COVID-19. However, all network-predicted results must be validated in various SARS-CoV-2 experimental assays and randomized clinical trials before being used in patients.

Methods
Identi cation of the active compound and its targets ACE2 was used as a target to nd the potential Chinese medicine from the Traditional Chinese Medicine Systems Pharmacology Database and Analysis Platform (TCMSP) (http://tcmspw.com/tcmsp.php) (39). Radix Cyathulae, Flos Puerariac, Radix Bupleuri, Radix Puerariac and Radix Hemerocallis, were screened out. Puerarin was identi ed as a common compound. The three-dimensional structure of puerarin was obtained and stored in sdf format from PubChem database. The sdf le was uploaded to PharmMapper (http://www.lilab-ecust.cn/pharmmapper/) to obtain human-related protein targets for puerarin (40). The UniProt database (https://www.uniprot.org/) was applied to get the gene annotation of each target.

Protein-protein interaction prediction
The intersection of each datasets was obtained by using Venny 2.1 (https://bioinfogp.cnb.csic.es/tools/venny/) (41). STRING (https://string-db.org) can be used to study the PPI. The potential targets were imported into the STRING database to obtain the PPI among each target. Cytoscape 3.7.2 was used to generate the network. Hub proteins were identi ed using Cytoscape plugin CytoHubba (42).

Data collection and Functional annotation
RNA-Seq count data of COVID-19 patient samples were downloaded from GEO data repository under the accession number GSE147507 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE147507). Differential expression analysis was conducted in the R computing environment using DESeq2 R package (43). The count data of lung samples were converted to CPM (Counts Per Million) followed by TMM normalization using edgeR R package (44). The normalized values were used for immune cell abundance quanti cation by using xCell (45). Gene ontology (GO) functional enrichment analysis and signaling pathway enrichment analysis were performed using Metascape (46). Funding paired normal samples were downloaded from Xena datahub (https://xenabrowser.net/, TCGA Pan-Cancer cohort). The samples were classi ed into high and low expression groups based on the median expression value of each gene. Then, differential expression analysis was conducted to by comparing high-expression group with low-expression group. GSEA analysis was performed by using GSEA software and iBio Tools (47,48).

Molecular Docking Veri cation Of Core Compounds
Firstly, the three-dimensional structure diagram of puerarin was downloaded from the PubChem database. Then, the sdf le was imported into Pymol software to transfer to pdb format, imported into AutoDockTools-1.5.6 software and saved in pdbqt format. Secondly, the protein crystal structures corresponding to the key genes were downloaded from the PDB database, imported into Pymol software to remove water molecules and heteromolecules, imported into AutoDockTools-1.5.6 software to add hydrogen atoms and charge operations, saved to pdbqt format. The pocket of each protein was calculated with Pymol. Finally, puerarin was used as ligands, and the proteins corresponding to the core target genes were used as receptors for molecular docking by Autodock vina.

Availability of data and materials
The datasets used and/or analyzed during the current study are available from the corresponding author upon reasonable request.
Ethics approval and consent to participate Not applicable.

Consent for publication
Not applicable. Study owchart. Identi cation of TCMs and the active compound was detailed in purple panel. ACE2 was used as a target to nd the potential Chinese medicine from the TCMSP database. Radix Cyathulae, Flos Puerariac, Radix Bupleuri, Radix Puerariac and Radix Hemerocallis, were screened out. Puerarin was identi ed as a common compound. The three-dimensional structure of puerarin was uploaded to PharmMapper database to obtain human-related protein targets for puerarin. In pink part, novel coronavirus associated genes or proteins were screened from public database. We searched for "Novel Coronavirus or COVID-19" in GeneCards, DrugBank, PharmGkb, TTD and OMIM and obtained 744 Novel coronavirus related genes. In green, RNA-Seq data of COVID-19 patient samples were downloaded from GEO data repository under the accession number GSE147507. The differential expression analysis was applied to identify the aberrantly expressed genes in COVID-19 patients (|Log2 fold change| > 1, p.value< 0.05). A total of 2587 differentially expressed genes were identi ed, with 1800 up-regulated genes and 787 down-regulated genes.  Functional enrichment analysis of novel coronavirus associated genes targeted by puerarin. A. Venn diagram showed that 57 novel coronavirus associated genes targeted by puerarin were identi ed. B and C. Bar plot and scatter plot showed functional enrichment of analysis.  hub-gene ranking of the 10 genes. E and F. Bar plot and scatter plot showed functional enrichment of analysis.