Reported ACE2-interacting proteins were collected from BioGRID and other databases. At the start of this study, 12 ACE2-interacting proteins had been reported (Figure 1A; details in Table 1). First, a PPIN was constructed by using ACE2 and its interacting proteins as seed nodes and extracting all of their known interacting proteins from the parental PPIN; the resulting network is referred to as the “ACE2 Full-PPIN” (Figure 1B). The Full-PPIN contained 1,318 nodes (proteins) and 1,292 edges (interactions), suggesting that ACE2 and its interacting proteins could be linked to more than a thousand other proteins through cascaded interactions that expand their biological effects. The three proteins with the most interacting partners were HRAS (620 edges), CALM1 (472 edges) and CAT (119 edges).
In the Full-PPIN, multiple ACE2-interacting proteins were linked to one another through shared partner proteins. For clearer illustration, ACE2 and its interacting proteins that were linked through one or more partner proteins are shown in a smaller PPI sub-network, referred to as the “ACE2 Core-PPIN”, in which nodes with only one link in the Full-PPIN were removed (Figure 1C). The Core-PPIN contained 80 nodes and 154 edges; ACE2 and its interacting proteins are shown in light pink and the linker proteins in light blue. Interestingly, HRAS and CALM1 had the most common interacting proteins, numbering in the dozens. This suggests that HRAS and CALM1 have the greatest potential to transduce the stimulus from ACE2 (Figure 1C). Moreover, we propose that the linker proteins between ACE2 and its interacting proteins might serve as switch proteins that determine the trend or direction of cellular signal transduction through the strength of their co-expression correlation.
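The pruning step that yields the Core-PPIN can be sketched as follows. This is a minimal illustration, assuming the network is available as a list of undirected protein pairs and that pruning is a single pass over degrees measured in the full network; the function names and the partner proteins `P1` to `P3` are hypothetical, and the edges are toy data rather than the published network.

```python
from collections import defaultdict

def build_adjacency(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adjacency = defaultdict(set)
    for a, b in edges:
        if a != b:  # ignore self-interactions
            adjacency[a].add(b)
            adjacency[b].add(a)
    return adjacency

def prune_single_link_nodes(edges):
    """Drop every node that has exactly one neighbor in the full network.

    Degrees are measured once, on the full network (single pass, an
    assumption); nodes are not removed iteratively.
    """
    adjacency = build_adjacency(edges)
    keep = {node for node, neighbors in adjacency.items() if len(neighbors) > 1}
    kept_edges = {tuple(sorted((a, b))) for a, b in edges
                  if a != b and a in keep and b in keep}
    return sorted(kept_edges)

# Toy edges; P1..P3 are hypothetical partner proteins.
toy_edges = [
    ("ACE2", "HRAS"), ("ACE2", "CALM1"),
    ("HRAS", "P1"), ("CALM1", "P1"),  # P1 links HRAS and CALM1
    ("HRAS", "P2"),                   # P2 has a single link
    ("CALM1", "P3"),                  # P3 has a single link
]
core_edges = prune_single_link_nodes(toy_edges)
print(core_edges)
```

In this toy network only `P1` survives as a linker protein, because it is the only partner connected to more than one other node.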
Table 1
ACE2 interacting proteins from database
Gene ID | Official Symbol | Full name | Experimental System | Pubmed ID
801 | CALM1 | calmodulin 1 (phosphorylase kinase, delta) | Affinity Capture-Western | 18070603
183 | AGT | angiotensinogen (serpin peptidase inhibitor, clade A, member 8) | Biochemical Activity | 10969042
14 | AAMP | angio-associated, migratory cell protein | Affinity Capture-MS | 26186194
847 | CAT | catalase | Co-fractionation | 26344197
51477 | ISYNA1 | inositol-3-phosphate synthase 1 | Co-fractionation | 26344197
8847 | DLEU2 | deleted in lymphocytic leukemia 2 (non-protein coding) | Affinity Capture-RNA | 28977802
3265 | HRAS | Harvey rat sarcoma viral oncogene homolog | Proximity Label-MS | 30639242
43740568 | S | Spike glycoprotein | Co-crystal Structure | 32132184
340024 | SLC6A19 | solute carrier family 6 (neutral amino acid transporter), member 19 | Co-crystal Structure | 32132184
7431 | VIM | vimentin | Affinity Capture-Western | 26801988
1670 | DEFA5 | defensin, alpha 5, Paneth cell-specific | Reconstituted Complex | DOI:10.1101/2020.03.29.013490
51738 | GHRL | ghrelin/obestatin prepropeptide | | 11815627
The topology parameters of the “ACE2 Full-PPIN”
True biological networks, including PPINs, can be distinguished from random or otherwise disordered networks by their characteristic topological parameters. Many networks have been shown to be scale-free, with a degree distribution that follows a power law[20]. For the “ACE2 Full-PPIN”, the node degree distribution followed an approximate power law, y = 288.18x^(−1.07), with R² = 0.833 (Supplementary Figure S1A). The shortest path lengths (the number of edges from one node to another) in the Full-PPIN were concentrated mainly at 2 and 3 steps (Supplementary Figure S1B). This suggests that one protein can reach another in only a few steps, enabling the formation of different protein complexes and/or component switching. The topological coefficient measures the extent to which a node shares neighbors with other nodes; a topological coefficient of 0 is assigned to nodes that have one or no neighbors (Supplementary Figure S1C). Closeness centrality measures how quickly information flows from a given node to the other nodes in the network, reflecting the efficiency of information spreading. Some nodes with fewer than 10 links had high closeness centrality, whereas for nodes with more than 10 links closeness centrality increased gradually with the number of links (Supplementary Figure S1D).
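A power-law fit of this kind can be sketched as a straight-line fit on log-log axes. The text does not state which fitting procedure was used, so the least-squares approach below is an assumption, and the function name `fit_power_law` and the toy degree list are hypothetical.

```python
import math
from collections import Counter

def fit_power_law(degrees):
    """Fit count = a * degree**b by least squares on log10-log10 axes.

    Returns (a, b, r_squared). The fitting procedure is an assumption;
    the source does not state how its fit was obtained.
    """
    counts = Counter(d for d in degrees if d > 0)
    if len(counts) < 2:
        raise ValueError("need at least two distinct positive degrees")
    ks = sorted(counts)
    xs = [math.log10(k) for k in ks]
    ys = [math.log10(counts[k]) for k in ks]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    r_squared = 1.0 - ss_res / ss_tot
    return 10 ** intercept, slope, r_squared

# Toy degree list constructed so that count = 64 * degree**-2 exactly.
toy_degrees = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1
coefficient, exponent, r_squared = fit_power_law(toy_degrees)
print(round(coefficient, 3), round(exponent, 3), round(r_squared, 3))
```

For real degree distributions the points scatter around the line, which is what an R² below 1 reflects.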
Subcellular layers of the PPIN indicate stimulus from the extracellular to the nucleus
The diverse functions of a given protein depend not only on its interactions with other proteins but also on its cellular location[21, 22]. The subcellular location and/or translocation of a protein is therefore critical for its function. In this study, the “ACE2 Full-PPIN” was divided into 11 layers, with the following percentages: secreted (2.43%), membrane (22.53%), cytoskeleton (0.15%), cytoskeleton/cytoplasm (0.23%), cytoplasm (19.35%), secreted/nucleus (0.46%), membrane/nucleus (0.23%), cytoskeleton/nucleus (0.61%), cytoplasm/nucleus (18.06%), nucleus (8.19%) and uncertain (27.77%, placed near their interacting proteins) (Figure 2). These results suggest that nearly 20% of ACE2-interacting proteins and their partners have multiple subcellular locations. They also indicate that ACE2 could transfer a stimulus from the extracellular space or membrane into the intracellular environment and eventually to the nucleus, forming non-canonical pathways through cascades of interactions.
Functional enrichment of the Full-PPIN
Because many proteins have multiple functions, we presumed that ACE2 and its interacting proteins are involved in diverse biological functions, especially in the pathology of COVID-19, through cascading protein-protein interactions that expand their biological effects. To examine this possibility, GO “Biological Process” enrichment analysis of the Full-PPIN was performed, yielding more than a hundred significantly enriched GO terms (data not shown). Two large groups of GO terms were of particular interest (Figure 3). One group relates to viral processes, including “GO:0046718~viral entry into host cell”, “GO:0039694~viral RNA genome replication”, “GO:0019083~viral transcription” and “GO:0016032~viral process”. The other group relates to immunity and includes “GO:0050900~leukocyte migration”, “GO:0031295~T cell costimulation”, “GO:0050690~regulation of defense response to virus by virus” and “GO:0042110~T cell activation”. These significant GO terms and their enriched genes are listed in Supplementary file 1.
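The statistic behind a GO enrichment analysis can be illustrated with a one-sided hypergeometric test. The text does not name the enrichment tool or its exact statistic, so this is only a sketch of one common choice; the function name and all numbers in the example are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(population_size, term_size, sample_size, overlap):
    """One-sided hypergeometric p-value, P(X >= overlap).

    population_size: annotated background genes
    term_size:       background genes carrying the GO term
    sample_size:     genes in the network being tested
    overlap:         network genes carrying the GO term
    """
    upper = min(term_size, sample_size)
    total = comb(population_size, sample_size)
    tail = sum(
        comb(term_size, k) * comb(population_size - term_size, sample_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total

# Toy numbers (hypothetical): 10 background genes, 4 carry the term,
# 3 genes are in the tested set, and 2 of those carry the term.
p_value = hypergeom_enrichment_p(10, 4, 3, 2)
print(round(p_value, 4))
```

In practice the p-values for all tested terms would also be corrected for multiple testing before terms are called significant.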
Dynamic expression pattern for the Core-PPIN in COVID infected lung cells
PPINs in living cells are not static; they vary dynamically between tissues, between disease types, and even between stages within the same tissue. In this study, the expression profiles of GSE147507, which cover lung cells under different conditions, were obtained. The expression log2(fold-change) (logFC) of the proteins in the Core-PPIN and the co-expression correlation coefficient of each pair of proteins were analyzed. The logFC and the correlation coefficient were then integrated into the Core-PPIN as a node attribute and an edge attribute, respectively, to illustrate the dynamic changes under different treatments. In Series1 of GSE147507, primary human lung epithelial cells (NHBE) were infected with SARS-CoV-2 (USA-WA1/2020), with mock-treated cells as the control (Figure 4A). ACE2 was significantly increased, followed by its interacting proteins AAMP and HRAS. MYC, another protein in the network, was also significantly upregulated (Figure 4B). For the co-expression coefficients, about 70-80% of correlations were positive in mock-treated NHBE, rising to more than 90% in SARS-CoV-2-infected cells. The co-expression patterns of mock-treated and infected cells were clearly different, as shown by the heatmaps (Figure 4C-D). In Series9 of GSE147507, NHBE were treated with human interferon-beta for 4 h, 6 h and 12 h to mimic an inflammatory stimulus. After treatment, ACE2 and several interacting proteins, such as CALM1, DLEU2, ISYNA1 and NTS, clearly increased over the time course, whereas GHRL consistently decreased. HRAS was gradually upregulated from 4 h to 12 h, although it did not reach a very high expression level. Regarding co-expression, both CALM1 and HRAS were positively correlated with their interacting proteins in the control, but after treatment only CALM1 remained mostly positively correlated, while HRAS became mostly negatively correlated with its interacting proteins (Figure 5A-D, left panel).
For the global co-expression pattern shown in the heatmaps, the total number of positive correlations decreased while the total number of negative correlations increased (Figure 5A-D, right panel). In addition, the expression pattern of the “ACE2 Core-PPIN” could distinguish the control group from the three treatment groups (Supplementary Figure S2).
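Attaching logFC to nodes and co-expression to edges can be sketched as follows. This is a minimal illustration under stated assumptions: the correlation is taken to be Pearson's (the text says only "correlation coefficient"), logFC is computed from group means with a pseudocount, and the expression values, function names and the `pseudocount` parameter are hypothetical.

```python
import math

def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of mean treated to mean control expression.

    The pseudocount is an assumption that avoids division by zero.
    """
    mean_treated = sum(treated) / len(treated)
    mean_control = sum(control) / len(control)
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))

def pearson(xs, ys):
    """Pearson correlation; returns None if either series is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return None
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy / math.sqrt(sxx * syy)

def annotate_network(edges, treated, control):
    """Return (node logFC, edge correlation within the treated samples)."""
    nodes = sorted({protein for edge in edges for protein in edge})
    node_logfc = {p: log2_fold_change(treated[p], control[p]) for p in nodes}
    edge_corr = {(a, b): pearson(treated[a], treated[b]) for a, b in edges}
    return node_logfc, edge_corr

def positive_fraction(edge_corr):
    """Share of edges with a defined, positive correlation."""
    values = [r for r in edge_corr.values() if r is not None]
    return sum(r > 0 for r in values) / len(values)

# Toy expression values per sample (hypothetical, not from GSE147507).
control = {"ACE2": [3, 3, 3], "HRAS": [5, 6, 7], "CALM1": [4, 4, 5]}
treated = {"ACE2": [6, 7, 8], "HRAS": [2, 4, 6], "CALM1": [9, 8, 7]}
edges = [("ACE2", "HRAS"), ("ACE2", "CALM1")]
node_logfc, edge_corr = annotate_network(edges, treated, control)
print(node_logfc["ACE2"], edge_corr, positive_fraction(edge_corr))
```

Running the same annotation separately on the control samples would give the second set of edge attributes needed to compare the share of positive correlations between conditions.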
Activated pathways in IFN stimulated lung cells
To examine which pathways are active under the mimicked COVID-19 stimulus, the expression matrix of IFN-treated NHBE was analyzed with Pathview, which maps omics data onto KEGG pathways. Six pathways were significantly enriched: “PI3K-Akt signaling pathway”, “Focal adhesion”, “ECM-receptor interaction”, “Cell adhesion molecules”, “Antigen processing and presentation”, and “Regulation of actin cytoskeleton” (Supplementary Figure S3). Of particular interest, most of the genes mapped to these pathways were upregulated, suggesting that these pathways are activated in the mimicked COVID-19 inflammatory environment.
The proteins in the Full-PPIN are potential drug targets
Many scientists and doctors are working hard to find ways to cure COVID-19, including drugs that target ACE2 itself or the replication of SARS-CoV-2. We consider that it would also be very helpful to find drugs that target the ACE2 PPIN in order to restrict the biological activity stimulated by the virus and thereby reduce the replication and spread of SARS-CoV-2. To this end, we searched the DrugBank database and constructed drug-protein target networks. Currently, four registered drugs target ACE2: DB01611 (Hydroxychloroquine), DB05203 (SPP1148), DB15643 (N-(2-Aminoethyl)-1-aziridineethanamine) and DB00608 (Chloroquine), although some of these drugs are now highly controversial (Figure 6A). In addition, five ACE2-interacting proteins (CALM1, HRAS, AGT, ISYNA1 and CAT) are reported to be druggable. Among them, CALM1 (calmodulin 1) has the largest number of drugs (29). The well-known signaling molecule HRAS (HRas proto-oncogene, GTPase) has five tested drugs (Figure 6A). For the whole “ACE2 Full-PPIN”, the drug-protein network contains 2,075 nodes (1,728 drugs and 347 proteins) and 2,396 edges (targeting relations) (Figure 6B). At least 26% of the proteins in the Full-PPIN are druggable, suggesting great potential for treatment. The 10 proteins targeted by the most drugs and the 10 drugs with the most target proteins are listed in Figure 6C. ESR1 (estrogen receptor 1) is targeted by the most drugs (118), while DB12010 (Fostamatinib) targets the most proteins (61). Some drugs have more than one target. Detailed information about the drugs in these networks (Figure 6) is provided in Supplementary file 2.
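The summary statistics of a drug-protein network (drugs per protein, targets per drug, druggable fraction) can be sketched as below. This assumes the drug-target relations are available as (drug, protein) pairs; the function names, `DRUG_X`, `DRUG_Y`, `P1` and `NOT_IN_NETWORK` are hypothetical, and only the two ACE2 pairs reflect relations named in the text.

```python
from collections import defaultdict

def summarize_drug_targets(pairs, network_proteins):
    """Group drug-target pairs, keeping only targets inside the network."""
    drugs_per_protein = defaultdict(set)
    targets_per_drug = defaultdict(set)
    for drug, protein in pairs:
        if protein in network_proteins:
            drugs_per_protein[protein].add(drug)
            targets_per_drug[drug].add(protein)
    druggable_fraction = len(drugs_per_protein) / len(network_proteins)
    return drugs_per_protein, targets_per_drug, druggable_fraction

def top_n(mapping, n):
    """Top n keys by set size, ties broken alphabetically."""
    ranked = sorted(((key, len(values)) for key, values in mapping.items()),
                    key=lambda item: (-item[1], item[0]))
    return ranked[:n]

# Toy pairs; DRUG_X, DRUG_Y, P1 and NOT_IN_NETWORK are hypothetical.
pairs = [
    ("DB01611", "ACE2"), ("DB00608", "ACE2"),
    ("DRUG_X", "HRAS"), ("DRUG_X", "CALM1"),
    ("DRUG_Y", "CALM1"), ("DRUG_Y", "NOT_IN_NETWORK"),
]
network_proteins = {"ACE2", "HRAS", "CALM1", "P1"}
by_protein, by_drug, fraction = summarize_drug_targets(pairs, network_proteins)
print(top_n(by_protein, 2), top_n(by_drug, 1), fraction)
```

With the counts reported for the Full-PPIN, the same ratio is 347 / 1,318 ≈ 26.3%, which matches the "at least 26%" figure.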
The shortest paths from ACE2 to transcription factors
An external stimulus, or the overexpression or knockdown of a single gene, can usually cause wide-ranging alterations of the expression profile. We assumed that the transcription factors involved play critical roles in these alterations of the mRNA expression profile. We therefore applied a shortest-path algorithm to illustrate how ACE2 can reach a specific transcription factor through cascades of interactions in the PPIN. Thirty-seven transcription factors are present in the “ACE2 Full-PPIN”. Consistent with the shortest path length distribution described above, these transcription factors are only two steps from ACE2 (Figure 7A), suggesting that a rapid response exists from extracellular stimulus to the nucleus, triggering changes in the expression profile and subsequently in cellular activities. These transcription factors are also ideal targets for treatment. We therefore constructed a small PPIN in which drugs target the ACE2-linked transcription factors (ACE2-TFs). This small PPIN contains 278 drugs targeting the 37 transcription factors (Figure 7B).
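The shortest-path search from ACE2 to transcription factors can be sketched with a breadth-first search, which finds a minimum-edge path in an unweighted network. The text does not say which shortest-path implementation was used, so this is an illustration only; the function names, the transcription factors `TF_A` to `TF_C` and the toy edges are hypothetical.

```python
from collections import deque

def build_adjacency(edges):
    """Build an undirected adjacency map from (protein_a, protein_b) pairs."""
    adjacency = {}
    for a, b in edges:
        adjacency.setdefault(a, set()).add(b)
        adjacency.setdefault(b, set()).add(a)
    return adjacency

def shortest_path(adjacency, source, target):
    """Breadth-first search; returns one shortest path as a list, or None."""
    if source not in adjacency or target not in adjacency:
        return None
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        if node == target:
            path = []
            while node is not None:
                path.append(node)
                node = parent[node]
            return path[::-1]
        for neighbor in sorted(adjacency[node]):  # sorted for reproducibility
            if neighbor not in parent:
                parent[neighbor] = node
                queue.append(neighbor)
    return None

# Toy network; TF_A..TF_C are hypothetical transcription factors.
toy_edges = [
    ("ACE2", "HRAS"), ("ACE2", "CALM1"),
    ("HRAS", "TF_A"), ("CALM1", "TF_B"), ("TF_B", "TF_C"),
    ("X", "Y"),  # disconnected from ACE2
]
adjacency = build_adjacency(toy_edges)
steps = {}
for tf in ("TF_A", "TF_B", "TF_C"):
    path = shortest_path(adjacency, "ACE2", tf)
    steps[tf] = len(path) - 1  # number of edges on the path
print(shortest_path(adjacency, "ACE2", "TF_A"), steps)
```

A transcription factor that is two steps from ACE2 is reached through exactly one intermediate protein, as `TF_A` is reached through HRAS in this toy network.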
The verification cohort from public data
During the preparation of this manuscript, Gordon et al. reported a comprehensive SARS-CoV-2 protein interaction map[23]. They expressed 26 of the 29 SARS-CoV-2 proteins to identify their interacting human proteins and reported 332 high-confidence SARS-CoV-2-human protein-protein interactions, in which 66 druggable human proteins or host factors are targeted by 69 compounds[23]. To test the reliability of the ACE2-based PPIN in this study, we compared the “ACE2 Full-PPIN” with the 332 SARS-CoV-2-interacting proteins and found 44 proteins in common, which were used to construct a co-network containing 510 nodes and 1,264 edges (Supplementary Figure S4A). Among these nodes, at least six proteins from the “ACE2 Full-PPIN” (CALM1, HRAS, DEFA5, CAT, S and ISYNA1; 6 of the 12 ACE2-interacting proteins) and fifteen SARS-CoV-2-encoded proteins are present. After removal of the nodes with a single connection, a smaller core PPIN was obtained that shows a clearer relationship between the ACE2-interacting proteins and the SARS-CoV-2-encoded proteins (Supplementary Figure S4B). It also shows that HRAS and CALM1 remain the nodes with a large number of interactions, suggesting consistency between the networks in this study and those of Gordon et al.