Visualization of the PPIN for ACE2 and its interacting proteins
The currently known ACE2-interacting proteins were obtained from BioGRID and other databases. At the beginning of this study, 12 ACE2-interacting proteins had been reported; they are listed in Table 1 and shown in Fig. 1A. A PPIN was then built by using ACE2 and its interacting proteins as seed nodes and extracting all of their interacting proteins from the parental PPIN; this network is referred to as the “ACE2 Full-PPIN” (Fig. 1B). The “ACE2 Full-PPIN” contained 1,318 nodes and 1,292 edges, suggesting that ACE2 and its interacting proteins can connect with more than a thousand partner proteins through cascaded interactions, extending their biological effects beyond those of a single protein. The three genes with the highest number of edges were HRAS (620 edges), CALM1 (472 edges) and CAT (119 edges).
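The seed-based extraction described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the pipeline used in the study: the edge list is a toy example, the partner names P1 to P4 are hypothetical, and the function names are introduced here for illustration only.

```python
from collections import defaultdict

def extract_seed_network(parent_edges, seeds):
    """Keep every parental edge that touches at least one seed node."""
    seeds = set(seeds)
    return {frozenset((a, b)) for a, b in parent_edges
            if a != b and (a in seeds or b in seeds)}

def build_adjacency(edges):
    """Undirected adjacency map from two-element edges."""
    adj = defaultdict(set)
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    return adj

# Toy parental PPIN (hypothetical edges, for illustration only)
toy_parent = [("ACE2", "CALM1"), ("ACE2", "HRAS"), ("HRAS", "P1"),
              ("HRAS", "P2"), ("CALM1", "P2"), ("P3", "P4")]
toy_seeds = ["ACE2", "CALM1", "HRAS"]

full_edges = extract_seed_network(toy_parent, toy_seeds)
full_adj = build_adjacency(tuple(e) for e in full_edges)

# Rank nodes by number of edges (degree), ties broken by name
degree = {node: len(nbrs) for node, nbrs in full_adj.items()}
top = sorted(degree.items(), key=lambda kv: (-kv[1], kv[0]))
```

In this toy case the edge between P3 and P4 is dropped because neither endpoint is a seed, and HRAS ranks first by degree.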
In the “ACE2 Full-PPIN”, we found that multiple ACE2-interacting proteins were connected through many common partner proteins. To better illustrate these links, a smaller PPI sub-network, named the “ACE2 Core-PPIN”, was constructed to show ACE2 and its interacting proteins linked by one or more partner proteins; nodes with only one interaction were removed (Fig. 1C). The “ACE2 Core-PPIN” contained 80 nodes and 154 edges, with ACE2 and its interacting proteins shown in light pink and the partner proteins in light blue. Interestingly, HRAS and CALM1 had the highest number of shared nodes, which indicates that HRAS and CALM1 may have the greatest potential to transduce the stimulus from ACE2 (Fig. 1C). Moreover, we consider that the nodes shared between ACE2 and its interacting proteins might act as switch proteins that determine the direction of molecular signaling paths through their expression levels and the strength of their co-expression correlations.
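A minimal sketch of this pruning step is given below. It assumes a single pass that removes nodes with exactly one interaction while always keeping the seed proteins; the text does not state whether pruning was repeated or whether seeds were exempt, so both are assumptions. The edges and the partner names P1, P2 and P5 are hypothetical.

```python
def prune_single_link_nodes(edges, keep=()):
    """Drop nodes with exactly one interaction (one pass), except those in `keep`."""
    degree = {}
    for a, b in edges:
        degree[a] = degree.get(a, 0) + 1
        degree[b] = degree.get(b, 0) + 1
    removable = {n for n, d in degree.items() if d == 1 and n not in keep}
    return [(a, b) for a, b in edges
            if a not in removable and b not in removable]

def shared_partner_counts(edges, seeds):
    """Per seed, count non-seed partners that are also linked to another seed."""
    seeds = set(seeds)
    partner_to_seeds = {}
    for a, b in edges:
        for seed, partner in ((a, b), (b, a)):
            if seed in seeds and partner not in seeds:
                partner_to_seeds.setdefault(partner, set()).add(seed)
    counts = {s: 0 for s in seeds}
    for linked in partner_to_seeds.values():
        if len(linked) >= 2:          # partner shared by two or more seeds
            for s in linked:
                counts[s] += 1
    return counts

# Toy network (hypothetical edges, for illustration only)
toy_edges = [("ACE2", "CALM1"), ("ACE2", "HRAS"), ("HRAS", "P1"),
             ("HRAS", "P2"), ("CALM1", "P2"), ("CALM1", "P5")]
toy_seeds = {"ACE2", "CALM1", "HRAS"}

core_edges = prune_single_link_nodes(toy_edges, keep=toy_seeds)
shared = shared_partner_counts(core_edges, toy_seeds)
```

Here P1 and P5 are removed because each has a single interaction, and P2 remains as the one partner shared by HRAS and CALM1.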
Topological structure of the “ACE2 Full-PPIN”
A real biological network, including a PPIN, is distinguishable from a random network by its distinct structural properties. Many real, complex networks are characterized as “scale-free”, with a power-law degree distribution [20]. For the “ACE2 Full-PPIN”, the node degree distribution followed an approximate power law, y = 288.18 x^(−1.07), with R² = 0.833 (Additional file 1A). The shortest path lengths (the minimum number of links from one protein to another) in the “ACE2 Full-PPIN” were mainly 2 or 3 steps (Additional file 1B). This suggests that one protein can reach another through only a few intermediate nodes, enabling the transformation of different protein complexes and/or the maintenance of their components. The topological coefficient is a measure of the extent to which a node shares neighbors with other nodes; a topological coefficient of 0 is assigned to nodes that have one or no neighbors (Additional file 1C). Closeness centrality measures how fast information would flow from a given node to the other nodes in the network, and thus reflects the efficiency of signal transduction in the network. The closeness centrality curve is flat when the number of links is less than 10, but it gradually increases for nodes with more than 10 links (Additional file 1D).
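A power-law fit of the form y = a·x^b can be obtained by ordinary least squares on log-transformed data. The sketch below shows one way to do this, assuming that x is the node degree, y is the number of nodes with that degree, and R² is computed on the log-log scale; the text does not state which fitting procedure was used, so these are assumptions, and the degree list is a synthetic example.

```python
import math
from collections import Counter

def fit_power_law(degrees):
    """Fit log(count) = log(a) + b*log(degree); return (a, b, r2)."""
    hist = Counter(d for d in degrees if d > 0)
    xs = [math.log(k) for k in hist]
    ys = [math.log(c) for c in hist.values()]
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx                      # exponent (slope on the log-log scale)
    log_a = my - b * mx                # intercept on the log-log scale
    ss_res = sum((y - (log_a + b * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - my) ** 2 for y in ys)
    return math.exp(log_a), b, 1 - ss_res / ss_tot

# Synthetic degrees following count = 64 * degree^-2 exactly
toy_degrees = [1] * 64 + [2] * 16 + [4] * 4 + [8] * 1
a, b, r2 = fit_power_law(toy_degrees)
```

For the synthetic input the fit recovers a ≈ 64 and b ≈ −2 with R² ≈ 1; a real degree distribution would scatter around the fitted line.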
Signal transduction from cell surface to nucleus through the PPIN
Evidence suggests that a protein may perform diverse functions depending not only on its interactions with other proteins but also on its cellular localization and translocation [21, 22]. Subcellular location and/or translocation is therefore critical when considering the function of a protein, and of the PPIN as well. After the nodes were redistributed by subcellular location, the “ACE2 Full-PPIN” was divided into 11 layers, with the following percentages from the extracellular space to the nucleus: secreted (2.43%), membrane (22.53%), cytoskeleton (0.15%), cytoskeleton/cytoplasm (0.23%), cytoplasm (19.35%), secreted/nucleus (0.46%), membrane/nucleus (0.23%), cytoskeleton/nucleus (0.61%), cytoplasm/nucleus (18.06%), nucleus (8.19%) and uncertain (27.77%, placed near their interacting proteins) (Fig. 2A). Nearly 20% of the nodes in this PPIN have multiple subcellular locations. This also suggests that ACE2 might transduce stimuli from the extracellular space or membrane into the cytoplasm and on to the nucleus, forming various non-classical pathways through cascades of interactions.
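The share of nodes with multiple subcellular locations can be checked directly from the layer percentages reported above. The short sketch below assumes that a layer whose label joins two locations with "/" counts as a multi-location layer; the function name is introduced here for illustration.

```python
def multi_location_share(layer_percentages):
    """Sum the percentages of layers whose label combines locations with '/'."""
    return sum(pct for layer, pct in layer_percentages.items() if "/" in layer)

# Layer percentages as reported in the text
reported_layers = {
    "secreted": 2.43, "membrane": 22.53, "cytoskeleton": 0.15,
    "cytoskeleton/cytoplasm": 0.23, "cytoplasm": 19.35,
    "secreted/nucleus": 0.46, "membrane/nucleus": 0.23,
    "cytoskeleton/nucleus": 0.61, "cytoplasm/nucleus": 18.06,
    "nucleus": 8.19, "uncertain": 27.77,
}

multi_share = multi_location_share(reported_layers)   # about 19.59
total_share = sum(reported_layers.values())           # about 100.01 (rounding)
```

The five combined layers sum to about 19.59%, which matches the statement that nearly 20% of the nodes have multiple locations.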
Functional enrichment of the “ACE2 Full-PPIN”
Because the network is tightly connected, we presumed that ACE2 and its interacting proteins are involved in various biological processes, especially in the pathology of COVID-19, through cascaded interactions that amplify their biological effects. To test this hypothesis, GO “Biological Process” enrichment analysis of the “ACE2 Full-PPIN” was performed, which yielded more than a hundred significantly enriched GO terms (data not shown). Two large groups of GO terms were of particular interest (Fig. 2B). One group relates to viral processes, including “GO:0046718~viral entry into host cell”, “GO:0039694~viral RNA genome replication”, “GO:0019083~viral transcription” and “GO:0016032~viral process”. The other group relates to immunity, including “GO:0050900~leukocyte migration”, “GO:0031295~T cell costimulation”, “GO:0050690~regulation of defense response to virus by virus” and “GO:0042110~T cell activation”. These significant GO terms and their enriched genes are listed in Additional file 2.
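The statistical idea behind GO term enrichment can be illustrated with a one-sided hypergeometric test. The text does not state which test or tool settings were used, so the sketch below is a common choice shown under that assumption, and the numbers in the example are hypothetical.

```python
from math import comb

def hypergeom_enrichment_p(hits, list_size, term_size, background_size):
    """P(X >= hits): chance of seeing at least `hits` term-annotated genes
    when drawing `list_size` genes from a background of `background_size`
    genes, of which `term_size` carry the annotation."""
    upper = min(list_size, term_size)
    total = comb(background_size, list_size)
    tail = sum(comb(term_size, k) * comb(background_size - term_size, list_size - k)
               for k in range(hits, upper + 1))
    return tail / total

# Hypothetical example: 3 of 4 network genes carry a term annotated
# to 5 of 20 background genes
p_value = hypergeom_enrichment_p(hits=3, list_size=4, term_size=5, background_size=20)
```

In practice the p-values of all tested terms would also be corrected for multiple testing before terms are called significant.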
Dynamic expression pattern for the Core-PPIN in COVID infected lung cells
In living cells, PPINs are not static but change dynamically under different stimuli or at different stages of disease. To examine this, the expression dataset GSE147507, which contains lung cells under different conditions, was obtained. Two parameters were analyzed for the “ACE2 Core-PPIN”: the log2(fold-change) of expression (logFC) and the co-expression correlation coefficient of each protein interaction. The logFC and the correlation coefficient were then imported into the network as a node attribute and an edge attribute, respectively, to observe the dynamic changes under the various treatments. In Series1 of GSE147507, primary human lung epithelium (NHBE) was infected with SARS-CoV-2 (USA-WA1/2020), with mock-treated cells as the control (Fig. 3A). ACE2 was significantly increased, followed by its interacting proteins AAMP and HRAS. In addition, MYC in the network was also significantly upregulated (Fig. 3B). For the co-expression coefficient, about 70-80% of the correlations were positive in mock-treated NHBE, whereas more than 90% were positive in SARS-CoV-2-treated cells. The co-expression patterns of the normal and treated cells were rather different, as indicated by the heatmaps (Fig. 3C-D). In Series9 of GSE147507, NHBE cells were treated with human interferon-beta for 4 h, 6 h and 12 h to mimic an inflammatory stimulus. After the treatment, ACE2 and several interacting proteins, such as CALM1, DLEU2, ISYNA1 and NTS, were clearly increased over the time series, whereas GHRL was consistently decreased. HRAS was upregulated gradually from 4 h to 12 h, although it did not reach a very high expression level. For the co-expression relationships, it is interesting to note that both CALM1 and HRAS were positively correlated with their interacting proteins in the control, but after treatment only CALM1 remained mostly positively correlated, while HRAS became mostly negatively correlated with its interacting proteins (Fig. 4A-D, left panel).
For the global co-expression pattern shown in the heatmaps, the total number of positive correlations decreased while the total number of negative correlations increased (Fig. 4A-D, right panel). In addition, the expression pattern of the “ACE2 Core-PPIN” could distinguish the control group from the three treatment groups (Additional file 3).
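The two network attributes used in this section, logFC per node and a co-expression coefficient per edge, can be sketched as follows. This is an illustration only: it assumes a Pearson correlation and a pseudocount of 1 in the fold-change, neither of which is stated in the text, and the gene names G1 to G3 and their values are hypothetical.

```python
import math

def log2_fold_change(treated, control, pseudocount=1.0):
    """log2 ratio of mean expression, with a pseudocount to avoid division by zero."""
    mean_treated = sum(treated) / len(treated)
    mean_control = sum(control) / len(control)
    return math.log2((mean_treated + pseudocount) / (mean_control + pseudocount))

def pearson(x, y):
    """Pearson correlation; returns 0.0 when either profile is constant."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / math.sqrt(sxx * syy)

def annotate(edges, treated, control):
    """Node attribute: logFC. Edge attribute: correlation within the treated samples."""
    node_logfc = {g: log2_fold_change(treated[g], control[g]) for g in treated}
    edge_corr = {(a, b): pearson(treated[a], treated[b]) for a, b in edges}
    positive_share = sum(r > 0 for r in edge_corr.values()) / len(edge_corr)
    return node_logfc, edge_corr, positive_share

# Hypothetical expression values (three samples per condition)
toy_control = {"G1": [3, 3, 3], "G2": [2, 4, 6], "G3": [3, 2, 1]}
toy_treated = {"G1": [6, 7, 8], "G2": [2, 4, 6], "G3": [3, 2, 1]}
toy_edges = [("G1", "G2"), ("G1", "G3")]

node_logfc, edge_corr, positive_share = annotate(toy_edges, toy_treated, toy_control)
```

Running the same function on each condition separately gives the per-condition share of positive correlations, which is the quantity compared between mock and treated cells above.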
Activated pathways in IFN stimulated lung cells
To examine which pathways are active under the mimicked COVID-19 stimulus, the expression matrix of NHBE cells treated with IFN was analyzed with Pathview, which maps omics data onto KEGG pathways. Six pathways were significantly enriched: “PI3K-Akt signaling pathway”, “Focal adhesion”, “ECM-receptor interaction”, “Cell adhesion molecules”, “Antigen processing and presentation”, and “Regulation of actin cytoskeleton” (Additional file 4). Notably, most of the genes mapped to these pathways were increased, suggesting that these pathways are activated in the mimicked COVID-19 inflammatory environment.
Proteins in the “ACE2 Full-PPIN” are potential drug targets
Many scientists and doctors are working hard to find ways to cure COVID-19, including drugs that target ACE2 itself or the replication of SARS-CoV-2. We consider that it would also be very helpful to find drugs that target the ACE2 PPIN, in order to restrict the biological activity stimulated by the virus and thus reduce the replication and spread of SARS-CoV-2. To this end, we searched the DrugBank database and constructed drug-protein target networks. Currently, four registered drugs target ACE2: DB01611 (Hydroxychloroquine), DB05203 (SPP1148), DB15643 (N-(2-Aminoethyl)-1-aziridineethanamine) and DB00608 (Chloroquine), although some of these drugs are now highly controversial (Fig. 6A). In addition, five ACE2-interacting proteins (CALM1, HRAS, AGT, ISYNA1 and CAT) are reported to be druggable. Among them, CALM1 (calmodulin 1) has the largest number of drugs (29). The well-known signaling molecule HRAS (HRas proto-oncogene, GTPase) has five tested drugs (Fig. 5A). For the whole “ACE2 Full-PPIN”, the drug-protein network contains 2075 nodes (1728 drugs and 347 proteins) and 2396 edges (targeting relations) (Fig. 5B). At least 26% of the proteins in the “ACE2 Full-PPIN” (347 of 1,318) are druggable, suggesting great potential for treatment. The 10 proteins with the highest number of drugs and the 10 drugs with the highest number of target proteins are listed in Fig. 5C. ESR1 (estrogen receptor 1) has the highest number of drugs (118), while DB12010 (Fostamatinib) targets the highest number of proteins (61). Some drugs have more than one target. Detailed information about the drugs in these networks (Fig. 5) is provided in Additional file 5.
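The drug-protein network statistics reported above can be derived from a list of (drug, target) pairs, as in the following sketch. The pair list, the drug identifiers D1 to D3 and the protein names P1 to P9 are hypothetical, and the function is an illustration rather than the procedure used with DrugBank in this study.

```python
from collections import defaultdict

def summarize_drug_targets(pairs, network_proteins):
    """Summarize a bipartite drug-protein network restricted to network proteins."""
    network_proteins = set(network_proteins)
    drugs_per_protein = defaultdict(set)
    targets_per_drug = defaultdict(set)
    for drug, protein in pairs:
        if protein in network_proteins:      # ignore targets outside the PPIN
            drugs_per_protein[protein].add(drug)
            targets_per_drug[drug].add(protein)
    return {
        "n_drugs": len(targets_per_drug),
        "n_druggable": len(drugs_per_protein),
        "n_edges": sum(len(d) for d in drugs_per_protein.values()),
        "druggable_fraction": len(drugs_per_protein) / len(network_proteins),
        "top_proteins": sorted(drugs_per_protein,
                               key=lambda p: (-len(drugs_per_protein[p]), p)),
    }

# Hypothetical drug-target pairs and network proteins
toy_pairs = [("D1", "P1"), ("D2", "P1"), ("D2", "P2"), ("D3", "P9")]
toy_network = ["P1", "P2", "P3", "P4"]
summary = summarize_drug_targets(toy_pairs, toy_network)

# Consistency check on the figures reported in the text
reported_fraction = 347 / 1318               # about 0.263, i.e. at least 26%
```

The final line reproduces the reported druggable share from the node counts given in the text (347 druggable proteins among 1,318 nodes).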
The shortest paths from ACE2 to the downstream transcription factors
Usually, an external stimulus, or the overexpression or knockdown of a single gene, can cause wide-ranging changes in the gene expression profile. We presumed that the transcription factors involved play vital roles in this alteration of the gene expression profile. We therefore applied a shortest path algorithm to illustrate how ACE2 reaches a specific transcription factor through cascades of interactions in the PPIN. Thirty-seven transcription factors are present in the “ACE2 Full-PPIN”. Consistent with the shortest path distribution described above, there are only two steps from ACE2 to these transcription factors (Fig. 6A), suggesting that a quick response exists from an extracellular stimulus to the nucleus, triggering a change in the expression profile and then a change in cellular activities. These transcription factors are also ideal targets for treatment. We therefore constructed a small PPIN in which drugs target the ACE2-linked transcription factors (ACE2-TF). There are 278 drugs in this small PPIN, targeting the 37 transcription factors (Fig. 6B). Understanding why the overexpression or knockdown of one gene causes wide-ranging alterations of the expression profile is critical, and the shortest paths identified from ACE2 to the transcription factors in the PPIN represent the most effective and economical routes of cellular information transduction.
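In an unweighted network, shortest paths from a single source can be found with breadth-first search. The sketch below returns one shortest path per reachable node rather than all possible shortest paths; the text does not name the specific algorithm or tool used, and the adjacency list and the transcription factor names TF1 to TF3 are hypothetical.

```python
from collections import deque

def shortest_paths_from(adj, source):
    """Breadth-first search; returns {node: one shortest path from source}."""
    parent = {source: None}
    queue = deque([source])
    while queue:
        node = queue.popleft()
        for nbr in adj.get(node, ()):
            if nbr not in parent:            # first visit gives a shortest route
                parent[nbr] = node
                queue.append(nbr)
    paths = {}
    for node in parent:
        path, cur = [], node
        while cur is not None:               # walk back to the source
            path.append(cur)
            cur = parent[cur]
        paths[node] = path[::-1]
    return paths

# Hypothetical adjacency list; TF3 is not connected to ACE2
toy_adj = {
    "ACE2": ["CALM1", "HRAS"],
    "CALM1": ["ACE2", "TF2"],
    "HRAS": ["ACE2", "TF1"],
    "TF1": ["HRAS"],
    "TF2": ["CALM1"],
    "TF3": [],
}
toy_tfs = ["TF1", "TF2", "TF3"]

paths = shortest_paths_from(toy_adj, "ACE2")
tf_steps = {tf: len(paths[tf]) - 1 for tf in toy_tfs if tf in paths}
```

In this toy network both reachable transcription factors lie two steps from ACE2, each through one interacting protein, while the unreachable TF3 is simply absent from the result.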
The verification cohort from public data
During the preparation of this manuscript, Gordon et al. reported a comprehensive SARS-CoV-2 protein interaction map [23]. They expressed 26 of the 29 SARS-CoV-2 proteins to identify their interacting human proteins, and reported 332 high-confidence SARS-CoV-2-human protein-protein interactions, in which 66 druggable human proteins or host factors are targeted by 69 compounds [23]. To test the reliability of the ACE2-based PPIN in this study, we compared the “ACE2 Full-PPIN” with the 332 SARS-CoV-2-interacting proteins and found 44 proteins in common, which were used to construct a co-network containing 510 nodes and 1264 edges (Additional file 6A). Among these nodes, at least six proteins from the “ACE2 Full-PPIN” (CALM1, HRAS, DEFA5, CAT, S and ISYNA1; 6 of the 12 ACE2-interacting proteins) and fifteen SARS-CoV-2-encoded proteins are present. After the nodes with a single connection were removed, a smaller core PPIN was obtained that shows a clear relationship between the ACE2-interacting proteins and the SARS-CoV-2-encoded proteins (Additional file 6B). HRAS and CALM1 remain the nodes with a large number of interactions, suggesting consistency between the networks in this study and those of Gordon et al.