ACE2 Protein-Protein Interaction Networks Reveal Potential Druggable Targets For SARS-CoV-2

Background: The SARS-CoV-2 pandemic has infected more than 130 million people and killed over 2.3 million so far. No effective drugs are currently available to treat this infectious disease, partly because of limited knowledge of the molecular mechanisms of SARS-CoV-2 infection. ACE2 (angiotensin I converting enzyme 2) has been identified as the major receptor for coronavirus entry into host cells. Methods: In this study, we constructed protein-protein interaction networks (PPINs) based on ACE2 and its interacting proteins, taking into account expression alterations and co-expression relationships. Potential drugs targeting the proteins in the PPINs were explored. Results: The expression of ACE2 and its interacting proteins AAMP and HRAS is markedly increased, and their PPINs show distinct expression patterns during COVID-19 progression. At least six pathways are activated in the host cell in response to the virus. Moreover, drug-target networks were built to provide clues for blocking ACE2 and its interacting proteins. In addition to the four drugs reported for ACE2, its interacting proteins CALM1 and HRAS have great potential as drug targets. We also considered the paths from ACE2 to the nucleus through cascades of interactions, especially to the transcription factors in the PPIN, which are also druggable. Conclusion: In summary, this study provides new insight into disrupting the biological response to the virus that is mediated not only by ACE2 but also by its cascade of interacting proteins in the PPIN.


Introduction
The global SARS-CoV-2 pandemic has profoundly threatened the health of billions of individuals and overburdened local health care systems, and millions of lives had been lost by the end of May 2021 (https://coronavirus.jhu.edu/map.html). The most distinctive clinical feature of COVID-19 patients is respiratory dysfunction, primarily due to acute respiratory distress syndrome, which carries a greater risk of mortality [1]. The first step of viral infection is the entry of the virus into the host cell, followed by the replication of multiple viral copies using the host cellular machinery. It is widely acknowledged that angiotensin-converting enzyme 2 (ACE2, EC 3.4.17.23) acts as a major receptor through which SARS-CoV-2 enters host cells [2]. ACE2 is a membrane-bound protein whose NH2-terminal domain, which contains the catalytic site, is oriented extracellularly [3]. ACE2 is normally expressed in many human tissues, including the lung, small intestine, heart, brain stem, and nasal and oral mucosa [4].
The mechanism by which COVID-19 causes lung dysfunction has not been fully elucidated. ACE2 is considered a major receptor for coronavirus entry into cells, and infection causes diffuse alveolar damage through imbalances in the renin-angiotensin system. The immunologic reactions in severe COVID-19 may be characterized by the cytokine storm phenomenon: a massive release of cytokines, such as interleukin (IL)-6, IL-1, IL-8 and TNFα, from host cells and immune cells, such as monocytes and dendritic cells [5,6]. These inflammatory cells and factors lead to severe respiratory complications such as ARDS (acute respiratory distress syndrome), which is characterized by acute diffuse alveolar damage, pulmonary oedema and the formation of hyaline membranes [7].
However, the detailed molecular mechanism by which ACE2 is stimulated by the virus, and the resulting phenotypic progression in the host cell, remain unclear. Many proteins carry out multiple biological functions through interactions with other proteins, so their functions may be predicted from their interacting proteins. Analyses based on protein-protein interaction networks (PPINs) have become a prevalent method for high-throughput data, especially protein interaction data [8]. PPIN analysis can systematically integrate single or multiple high-throughput datasets to obtain a global understanding of cellular functions or processes under various conditions, e.g., different tumor subtypes, stages of cancer progression and cell subgroups [9,10].
In this study, PPINs for ACE2 and its interacting proteins were constructed, and the expression changes and expression patterns during COVID-19 progression were integrated into the networks. Moreover, a drug-target network was built to provide clues for blocking the potential molecular paths initiated from ACE2.

Search for proteins that interact with ACE2
The reported ACE2-interacting proteins, confirmed by previous publications, were obtained from two databases, BioGRID (https://thebiogrid.org/) and MINT (https://mint.bio.uniroma2.it/). These proteins were curated to obtain a unique list of interacting proteins.

Construction of the protein-protein interaction network (PPIN)
The latest known human protein-protein interaction data were obtained from several databases, including BioGRID (http://thebiogrid.org/), HPRD (http://www.hprd.org/), MINT (https://mint.bio.uniroma2.it/) and IntAct (http://www.ebi.ac.uk/intact/). These protein interaction data were integrated, and redundant entries were removed to obtain a unique, clean dataset of up-to-date human protein-protein interactions. This clean dataset contained a total of 24,046 unique proteins and 438,656 interactions and was used as the parental PPIN, from which new PPINs or sub-PPINs were constructed with Cytoscape [11]. First, a sub-network was built by mapping ACE2 and its interacting proteins to the parental PPIN as seed proteins and extracting their first-class (directly) interacting proteins. This sub-network was named the "ACE2 Full-PPIN". Second, a smaller PPI sub-network, named the "ACE2 Core-PPIN", was constructed, in which ACE2 and its interacting proteins are linked through one or more partner proteins.
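The two extraction steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Cytoscape workflow used in the study: the function names are hypothetical, the Full-PPIN is taken to be every parental edge that touches a seed protein, and the Core-PPIN is approximated by dropping non-seed nodes that have a single link in the Full-PPIN.

```python
from collections import Counter, defaultdict


def build_adjacency(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adj = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            adj[a].add(b)
            adj[b].add(a)
    return adj


def full_ppin(adj, seeds):
    """Edges of the parental network that touch at least one seed protein.

    Assumption: this stands in for the "first-class directly interacting
    proteins" extraction described in the text.
    """
    return {frozenset((s, n)) for s in seeds for n in adj.get(s, ())}


def core_ppin(full_edges, seeds):
    """Keep edges whose endpoints are seeds or have at least two links.

    Assumption: a single pruning pass over non-seed nodes approximates the
    removal of nodes with only one interaction.
    """
    degree = Counter(node for edge in full_edges for node in edge)
    return {
        edge for edge in full_edges
        if all(node in seeds or degree[node] >= 2 for node in edge)
    }
```

Calling `full_ppin(build_adjacency(edges), seeds)` on an edge list and then `core_ppin` on its result gives the two sub-networks as sets of unordered protein pairs.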
Topological parameter analysis for the PPIN
NetworkAnalyzer was applied to analyze the topological parameters, which provide insight into the organization and structure of a complex network [12]. As one of the most important properties of many large networks, the power-law distribution of node degrees was analyzed as we described previously [13]. The degree of a node n is defined as the number of nodes connected with n. Considering the degrees of all nodes in the network, the degree distribution is $p(k) = N_k/N$, where $N_k$ is the number of nodes with degree k and $N$ is the total number of nodes. Moreover, several other important topological parameters, such as closeness centrality, shortest path length and topological coefficients, were analyzed for visualization.
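As a rough illustration of the degree distribution and the power-law fit, the sketch below computes p(k) from a list of node degrees and fits y = a·x^b by ordinary least squares on log-transformed values. The function names are hypothetical, and the log-log regression is an assumption about how such a fit can be done; it is not confirmed to be the exact procedure used by NetworkAnalyzer.

```python
import math
from collections import Counter


def degree_distribution(degrees):
    """Return p(k) = N_k / N for every degree k that occurs."""
    n = len(degrees)
    counts = Counter(degrees)
    return {k: counts[k] / n for k in sorted(counts)}


def fit_power_law(nodes_per_degree):
    """Fit y = a * x**b by least squares on log10(x) versus log10(y).

    nodes_per_degree maps a degree k > 0 to a positive count (or frequency).
    Returns (a, b, r_squared).
    """
    xs = [math.log10(k) for k in nodes_per_degree]
    ys = [math.log10(v) for v in nodes_per_degree.values()]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = 10 ** (mean_y - b * mean_x)
    r_squared = (sxy * sxy) / (sxx * syy)
    return a, b, r_squared
```

A negative exponent b with a high R² on the log-log scale is what is usually read as an approximate power law.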

Subcellular distribution of the PPIN
Subcellular locations for the proteins in the "ACE2 Full-PPIN" were downloaded from the HPRD database and used as a node attribute in the network, as we previously described [14]. For proteins annotated with multiple locations, e.g., membrane and nucleus, the locations were merged (e.g., membrane/nucleus). The nodes in the "ACE2 Full-PPIN" were redistributed into different layers by the Cerebral plugin according to their annotated subcellular localizations, generating a pathway-like graph that retains their physical interactions [15].
Enrichment analyses for functions and pathways of network
Functional enrichment analysis of the "ACE2 Full-PPIN" was performed with the DAVID database (https://david.ncifcrf.gov/) to identify the enriched Gene Ontology (GO) "Biological Process" terms. GO terms with a statistically significant P-value (less than 0.05) were retained for visualization with the ggplot2 R package.
Expression pattern for the Core-network in COVID infected lung cells
GSE147507, an expression dataset containing SARS-CoV-2-infected normal lung cells submitted by Blanco-Melo et al. [16], was obtained from the GEO database (https://www.ncbi.nlm.nih.gov/geo). In this dataset, primary human lung epithelium (NHBE) was mock treated or infected with SARS-CoV-2 in biological triplicates. To mimic the effect of inflammatory factors, NHBE cells were treated with human interferon-beta (IFNB) for 4 h, 6 h and 12 h. The RNA-seq data were normalized and differential expression was analyzed with SangerBox (http://soft.sangerbox.com/). The Pearson expression correlation for each pair of proteins in the "ACE2 Core-PPIN" was calculated from the expression data with an R script. The expression fold changes and correlations were integrated into the "ACE2 Core-PPIN" as node and edge attributes, respectively, to illustrate the dynamic changes of the Core-PPIN under different conditions. The correlation matrices and their expression profiles were clustered with Cluster 3.0 and visualized with TreeView [17].
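The per-interaction co-expression step can be illustrated with a short Python sketch. The study used an R script that is not shown, so the code below is a stand-in under assumptions: the function names are hypothetical, and the expression input is assumed to be a mapping from gene symbol to a list of normalized values across the same samples.

```python
import math


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences.

    Returns NaN when either sequence has zero variance.
    """
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return float("nan")
    return sxy / math.sqrt(sxx * syy)


def edge_correlations(edges, expression):
    """Attach a correlation coefficient to every edge with data for both ends."""
    return {
        (a, b): pearson(expression[a], expression[b])
        for a, b in edges
        if a in expression and b in expression
    }
```

The resulting dictionary can be written out as an edge-attribute table and imported into a network viewer alongside the node-level fold changes.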

Activated pathways analyses
The expression matrix for the proteins in the "ACE2 Core-PPIN" was submitted to Pathview (https://pathview.uncc.edu/) for visualization; this tool provides easy interactive access and generates high-quality pathway graphs [18].

Drug-target network construction
The latest release of DrugBank (version 5.1.6, https://www.drugbank.ca/) was downloaded. It contains 13,580 drug entries, including FDA-approved small-molecule drugs, biologics, 131 nutraceuticals and 6,376 experimental drugs, as well as 5,223 non-redundant target proteins. The proteins in the "ACE2 Full-PPIN" were intersected with the DrugBank target proteins to construct a drug-target network.
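The intersection step amounts to filtering drug-target pairs by membership in the network. The sketch below assumes the DrugBank data have already been parsed into a mapping from drug identifier to a set of target gene symbols; that input format and the function names are hypothetical.

```python
def drug_target_network(drug_targets, ppin_proteins):
    """Return sorted (drug, protein) edges whose protein is in the PPIN."""
    return sorted(
        (drug, target)
        for drug, targets in drug_targets.items()
        for target in targets
        if target in ppin_proteins
    )


def druggable_fraction(edges, ppin_proteins):
    """Fraction of PPIN proteins that are targeted by at least one drug."""
    targeted = {target for _, target in edges}
    return len(targeted) / len(ppin_proteins)
```

Counting edges per protein or per drug in the returned list gives the ranking of the most-targeted proteins and the most promiscuous drugs.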

Shortest path from ACE2 to transcription factor
The shortest path problem is to find the path(s) with the minimum number of steps from one node to another given node in the network. The proteins in the "ACE2 Full-PPIN" were compared with the list of human transcription factors (http://humantfs.ccbr.utoronto.ca/) to identify the transcription factors in this network. The paths from ACE2 to these transcription factors were analyzed with the R "igraph" package as we previously described [19]. The paths from ACE2 to the transcription factors, as well as the drugs targeting them, were visualized with Cytoscape.
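For an unweighted interaction network, all shortest paths between two proteins can be found with a breadth-first search that records every predecessor on a minimal route. The sketch below is a plain-Python stand-in for the igraph call used in the study; the function name and the adjacency-map input are assumptions made for illustration.

```python
from collections import deque


def all_shortest_paths(adj, source, target):
    """Return every shortest path from source to target as a list of nodes.

    adj maps each node to an iterable of its neighbours (undirected graph).
    Returns an empty list when the target cannot be reached.
    """
    dist = {source: 0}
    preds = {source: []}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            break  # every predecessor of the target has been recorded by now
        for neighbour in adj.get(node, ()):
            if neighbour not in dist:
                dist[neighbour] = dist[node] + 1
                preds[neighbour] = [node]
                queue.append(neighbour)
            elif dist[neighbour] == dist[node] + 1:
                preds[neighbour].append(node)  # another equally short route
    if target not in dist:
        return []

    def unwind(node):
        # Rebuild paths by walking predecessor links back to the source.
        if node == source:
            return [[source]]
        return [path + [node] for p in preds[node] for path in unwind(p)]

    return unwind(target)
```

The number of steps in a path is `len(path) - 1`, so a path such as receptor, partner protein, transcription factor counts as two steps.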

Visualization of the PPIN for ACE2 and its interacting proteins
The currently known interacting proteins of ACE2 were obtained from the BioGRID and MINT databases.
At the beginning of this study, there were 12 reported ACE2-interacting proteins, which are listed in Table 1 (Fig. 1A). A PPIN, called the "ACE2 Full-PPIN", was built by using ACE2 and its interacting proteins as seed nodes to extract all their interacting proteins from the parental PPIN (Fig. 1B).
The "ACE2 Full-PPIN" contained 1,318 nodes and 1,292 edges, suggesting that ACE2 and its interacting proteins can connect with more than a thousand partner proteins through cascaded interactions, extending their biological effects beyond those of a single protein. The three genes with the highest numbers of edges were HRAS (620 edges), CALM1 (472 edges) and CAT (119 edges).
In the "ACE2 Full-PPIN", we found that multiple ACE2-interacting proteins were connected by many common partner proteins. To better illustrate these inner links, a smaller PPI sub-network, named the "ACE2 Core-PPIN", was constructed to show ACE2 and its interacting proteins linked by one or more partner proteins; nodes with only one interaction were removed (Fig. 1C).
This "ACE2 Core-PPIN" contained 80 nodes and 154 edges, in which ACE2 and its interacting proteins are shown in light pink and the partner proteins in light blue. Interestingly, HRAS and CALM1 have the highest number of shared nodes, which indicates that they may have the greatest potential to transduce the stimulus from ACE2 (Fig. 1C). Moreover, we consider that these common nodes between ACE2 and its interacting proteins might act as switch proteins that determine the direction of molecular signaling paths through their expression levels and the strength of their co-expression correlations.
Topological structure of the "ACE2 Full-PPIN"
A real biological network, including a PPIN, is distinguishable from chaotic or random networks by its distinct structural properties. Many real, complex networks are characterized as "scale-free", with a power-law degree distribution [20]. For the "ACE2 Full-PPIN", the distribution of node degrees followed an approximate power law, with the equation $y = 288.18x^{-1.07}$ and $R^2 = 0.833$ (Additional file 1A). The shortest path lengths (the minimum number of links from one protein to another) in the "ACE2 Full-PPIN" were mainly 2 and 3 steps (Additional file 1B). This suggests that one protein can reach another through only a few nodes, enabling the transformation of different protein complexes and/or component maintenance. The topological coefficient is a measure of the extent to which a node shares neighbors with other nodes. A topological coefficient of 0 is assigned to nodes that have one or no neighbors (Additional file 1C). Closeness centrality measures how fast information would flow from a given node to other nodes in the network, reflecting the efficiency of signal transduction in the network. The closeness centrality curve is flat for nodes with fewer than 10 links but gradually increases for nodes with more than 10 links (Additional file 1D).
Functional enrichment of the "ACE2 Full-PPIN"
Because the network is tightly connected, we presumed that ACE2 and its interacting proteins are involved in various biological processes, especially in the pathology of COVID-19, through cascaded interactions that amplify their biological effects. To test this speculation, GO "Biological Process" enrichment analysis of the "ACE2 Full-PPIN" was performed, yielding more than a hundred significantly enriched GO terms (data not shown). Two large groups of GO terms were of particular interest (Fig. 2B). One group relates to viral processes, including "GO:0046718~viral entry into host cell", "GO:0039694~viral RNA genome replication", "GO:0019083~viral transcription" and "GO:0016032~viral process". The other group relates to immunity and includes "GO:0050900~leukocyte migration", "GO:0031295~T cell costimulation", "GO:0050690~regulation of defense response to virus by virus" and "GO:0042110~T cell activation". These significant GO terms and their enriched genes are listed in Additional file 2.
Dynamic expression pattern for the Core-PPIN in COVID infected lung cells
In living cells, PPINs are not static but change dynamically under different stimuli or at different stages of disease. To capture this, the GSE147507 expression data, which cover different conditions of lung cells, were obtained. Two important parameters were analyzed for the "ACE2 Core-PPIN": the expression log2(fold change) (logFC) and the co-expression correlation coefficient of each protein interaction. The logFC and correlation coefficient were then imported into the network as a node attribute and an edge attribute, respectively, to observe the dynamic changes under various treatments. In Series1 of GSE147507, primary human lung epithelium (NHBE) was infected with SARS-CoV-2 (USA-WA1/2020), with mock-treated cells as a control (Fig. 3A). ACE2 was significantly increased, followed by its interacting proteins AAMP and HRAS. MYC in the network was also significantly upregulated (Fig. 3B). For the co-expression coefficients, about 70-80% of correlations were positive in mock NHBE, increasing to more than 90% in SARS-CoV-2-treated cells. The co-expression patterns of normal and treated cells are rather different, as indicated by the heatmaps (Fig. 3C-D). In Series9 of GSE147507, NHBE cells were treated with human interferon-beta for 4 h, 6 h and 12 h to mimic the inflammatory stimulus. After the treatment, ACE2 and several interacting proteins, such as CALM1, DLEU2, ISYNA1 and NTS, clearly increased over the time series, whereas GHRL consistently decreased. HRAS was gradually upregulated from 4 h to 12 h, although it did not reach a very high expression level. For the co-expression relationships, it is interesting that both CALM1 and HRAS are positively correlated with their interacting proteins in the control, but after treatment CALM1 remains mostly positively correlated while HRAS becomes mostly negatively correlated with its interacting proteins (Fig. 4A-D, left panel). For the global co-expression pattern shown by the heatmaps, the total number of positive correlations is reduced while the total number of negative correlations is increased (Fig. 4A-D, right panel). In addition, the expression pattern of the "ACE2 Core-PPIN" could distinguish the control group from the three treatment groups (Additional file 3).

Activated pathways in IFN stimulated lung cells
To examine which pathways are active under the mimicked COVID-19 stimulus, the expression matrix of NHBE cells treated with IFN was analyzed with Pathview, which maps omics data to KEGG pathways. There were six significantly enriched pathways: "PI3K-Akt signaling pathway", "Focal adhesion", "ECM-receptor interaction", "Cell adhesion molecules", "Antigen processing and presentation" and "Regulation of actin cytoskeleton" (Additional file 4). Notably, the genes mapped to these pathways were mostly upregulated, suggesting that these pathways are activated in the mimicked COVID-19 inflammatory environment.
The proteins in Full-network are potential drug targets
Many scientists and doctors are working hard to find ways to cure COVID-19, including drugs that target ACE2 itself or the replication of the SARS-CoV-2 virus. We consider that it would be very helpful to find drugs that target the ACE2 PPIN, in order to restrict the biological activity stimulated by the virus and thus reduce the replication and spread of SARS-CoV-2. To achieve this, we searched the DrugBank database and constructed drug-protein target networks. Currently, there are four registered drugs targeting ACE2: DB01611 (Hydroxychloroquine), DB05203 (SPP1148), DB15643 (N-(2-Aminoethyl)-1-aziridineethanamine) and DB00608 (Chloroquine), although some of these drugs are now highly controversial (Figure 6A). In addition, five ACE2-interacting proteins (CALM1, HRAS, AGT, ISYNA1 and CAT) are reported to be druggable. Among them, CALM1 (calmodulin 1) has the largest number of drugs (29). The well-known signaling molecule HRAS (HRas proto-oncogene, GTPase) has five tested drugs (Fig. 5A). Targeting the "ACE2 Full-PPIN" gives a drug-protein network with 2075 nodes (1728 drugs and 347 proteins) and 2396 edges (targeting relations) (Fig. 5B). At least 26% of the proteins in the "Full-PPIN" are druggable, suggesting great potential for treatment. The 10 proteins with the highest numbers of drugs and the 10 drugs with the highest numbers of target proteins are listed in Fig. 5C. ESR1 (estrogen receptor 1) has the highest number of drugs (118), while DB12010 (Fostamatinib) targets the highest number of proteins (61). Some drugs have more than one target. Detailed information about the drugs in these networks (Fig. 5) is provided in Additional file 5.
The shortest paths from ACE2 to the downstream transcription factors
An external stimulus, or the overexpression or knockdown of a single gene, can usually cause wide-ranging changes in the gene expression profile. We presumed that the transcription factors involved play vital roles in this alteration of the gene expression profile. We applied the shortest path algorithm to illustrate how ACE2 reaches a specific transcription factor through cascades of interactions in the PPIN. Thirty-seven transcription factors are present in the "ACE2 Full-PPIN". Consistent with the shortest path distribution described above, there are only two steps from ACE2 to these transcription factors (Fig. 6A), suggesting that a quick response exists from the extracellular stimulus to the nucleus, triggering changes in the expression profile and then in cellular activities. These transcription factors are also ideal targets for treatment. We therefore constructed a small PPIN in which drugs target the ACE2-linked transcription factors (ACE2-TF). There are 278 drugs in the small PPIN, targeting the 37 transcription factors (Fig. 6B). It is critical to understand why overexpression or knockdown of a single gene causes wide-ranging alterations of the expression profile. The shortest path algorithm was applied to identify all possible shortest paths from ACE2 to the transcription factors in the PPIN, which represent the most effective and economical routes of cellular information transduction.

The verification cohort from public data
During the preparation of this manuscript, Gordon et al. reported a comprehensive SARS-CoV-2 protein interaction map [23]. They expressed 26 of the 29 SARS-CoV-2 proteins to identify their interacting human proteins. They reported 332 high-confidence SARS-CoV-2-human protein-protein interactions, in which 66 druggable human proteins or host factors are targeted by 69 compounds [23]. To test the reliability of the ACE2-based PPIN in this study, we compared the "ACE2 Full-PPIN" with the 332 SARS-CoV-2-interacting proteins and found 44 shared proteins, which were used to construct a co-network containing 510 nodes and 1264 edges (Additional file 6A). Among these nodes, at least six proteins from the "ACE2 Full-PPIN" (CALM1, HRAS, DEFA5, CAT, S and ISYNA1; 6 of the 12 ACE2-interacting proteins) and fifteen SARS-CoV-2-encoded proteins are present. After removing the nodes with a single connection, a smaller core PPIN was obtained that shows a clear relationship between the ACE2-interacting proteins and the SARS-CoV-2-encoded proteins (Additional file 6B). HRAS and CALM1 remain the nodes with a large number of interactions, suggesting consistency between the networks in this study and those of Gordon et al.

Discussion
COVID-19 is still a prevalent global pandemic that has already caused more than 3 million deaths worldwide. Biological activities, including stimulus response and signal transduction, are built on physical protein interactions. In this study, several high-confidence PPINs were constructed based on ACE2 and its interacting proteins. Many proteins may be linked through partner proteins, which could act as switch proteins that determine the direction of signal transduction.
Many previously reported biological networks are static networks, which present only a specific state or a single time point. It is widely recognized that cellular systems, including PPINs, are highly flexible in their response to environmental stimuli, which enables cells to adapt to different pathological stages or physiological conditions [24]. The Pearson correlation coefficient is a popular method for measuring the co-expression of each pair of interacting proteins in expression profiles [25]. In this study, we constructed not only the static PPI network for ACE2 and its interacting proteins but also dynamic networks based on their expression profiles under different conditions that mimic COVID-19 progression. Several proteins changed consistently during this progression, showing a dynamic co-expression pattern.
Since there is no single effective drug for COVID-19 so far, we asked whether multiple drugs are available that target the ACE2-based PPIN and could disturb the biological activities caused by the virus. Inspired by this hypothesis, we found hundreds of drugs targeting ACE2 and its cascade of interacting proteins. With the development of network pharmacology, the drug discovery paradigm has changed from the traditional model of "one drug → one target → one disease" to the network model of "multi-drugs → multi-targets → multi-diseases" [26,27]. One successful application of drug cocktails is the clinical treatment of HIV/AIDS patients [28]. The cocktail treatment strategy has also been applied in other diseases, such as breast cancer and leukemia [29-31]. Because the number of possible drug and dose combinations grows rapidly with the number of drugs, testing cocktails faces a combinatorial explosion; to overcome this problem, Zimmer et al. developed a model that predicts the effects of cocktails at all doses based on measurements of pairs of drugs [32]. We therefore believe that drug cocktails drawn from the ACE2 drug-target network might provide important clues for treating the virus in clinical trials. Some ACE2-interacting proteins are reported drug targets in different diseases. HRAS is a proto-oncogene that is overexpressed in many tumors. Tipifarnib is a potent and highly selective inhibitor of farnesyltransferase, a critical enzyme required for HRAS activation, and is now in a phase II trial in urothelial carcinoma [33]. CALM1 is an EF-hand calcium-binding protein that modulates the function of ion channels through calmodulin regulation. CBP501, a drug currently in a phase II clinical trial for lung cancer, may sensitize tumors to the chemotherapeutic agents bleomycin and cisplatin by inhibiting CALM1 [34].
Stimulation of cells usually causes large-scale alterations in the gene expression profile. We presumed that the transcription factors or transcriptional regulators in the PPIN play crucial roles in this change. We also speculated that information transmitted from one protein to another takes the most economical route through cascades of protein-protein interactions. Accordingly, we found that there are only two steps from ACE2 to the transcription factors, which then alter the gene expression profile so that the cell can adapt to the stimulus. These transcription factors are considered drug targets. The transcription factor MYC, which is clearly upregulated in both SARS-CoV-2-infected and IFN-treated lung cells, is a promising therapeutic target in multiple human cancers. Small molecules capable of repressing the transcriptional regulation of c-Myc have been found, including IIA6B17, NY2267 and 28RH-NCN-1, which inhibit the binding of MYC to DNA [35]. Based on the above analyses, we propose a working model for ACE2 and its interacting proteins that shows the cascaded interactions and their potential drugs for treatment (Fig. 7).

Conclusion
In summary, this study provides new insight into disrupting the biological response to SARS-CoV-2 that is mediated not only by ACE2 but also by its cascade of interacting proteins in the PPIN.

Declarations
Figure 1. PPINs for ACE2 and its interacting proteins. A. The reported ACE2-interacting proteins. B. The "ACE2 Full-PPIN" consists of ACE2 and its interacting proteins together with all their publicly reported interacting proteins. C. The "ACE2 Core-PPIN" was derived from the Full-PPIN by removing the nodes with only one link, to illustrate that ACE2 and its interacting proteins are connected through many partner proteins.
Figure 2. A. Subcellular layers of the "ACE2 Full-PPIN", which is separated into eleven layers while maintaining the protein interactions. ACE2 and its interacting proteins are shown in pink. B. Gene Ontology "Biological Process" enrichment of the Full-PPIN shows significant virus activity and immune-related terms.