Statistics of H2V data
Because the number of available studies differs, the H2V datasets vary among the three viruses. As shown in Table 2, seven datasets of genes/proteins that respond to SARS-CoV-2 infection are available, namely, DEGs, PPIs, DEPs, DPPs, DTPs, DUPs and SAPs. In comparison, only three datasets (DEGs, PPIs and DEPs) are available for SARS-CoV infection and two (DEGs and PPIs) for MERS-CoV infection. DEG datasets are available for all three viruses. A total of 9321 human genes responded to MERS-CoV infection, while fewer genes (2249) responded to SARS-CoV infection and even fewer (1395) to SARS-CoV-2 infection. PPI datasets are also available for all three viruses. There are 1581, 1150 and 296 interaction pairs between human proteins and SARS-CoV-2, SARS-CoV and MERS-CoV proteins, respectively. DEP datasets are available for SARS-CoV-2 and SARS-CoV infections and include 253 and 66 human proteins, respectively, that responded to the infections. DPP, DTP, DUP and SAP datasets are available only for SARS-CoV-2 infection, and include 2198 (5046 phosphorylation sites), 232, 516 (730 ubiquitination sites) and 610 response proteins, respectively.
To determine whether common proteins participate in different processes in response to SARS-CoV-2 infection, the intersection of DEPs, DPPs, DTPs and DUPs was analyzed. Figure 1a shows that both expression and translation of 11 proteins changed markedly upon infection, that both phosphorylation and ubiquitination of 180 proteins changed markedly upon infection, and that one protein underwent noticeable changes in expression, phosphorylation, translation and ubiquitination. We then used Venn diagrams to analyze genes/proteins that are common across responses to the different viral infections, because such shared genes/proteins may help to elucidate the fundamental mechanisms of viral pathogenesis. Figure 1b shows that 130 common genes exhibited significant differences in expression upon infection. Figure 1c shows that 62 human proteins could interact with proteins of all three viruses.
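The overlap analysis described above amounts to set intersection over protein identifiers. The following is a minimal sketch of that idea, not the H2V implementation; the function names and the identifiers "P1" to "P6" are hypothetical placeholders rather than real H2V records.

```python
from itertools import combinations


def pairwise_overlaps(datasets):
    """Return the shared members for every pair of named sets."""
    return {
        (a, b): datasets[a] & datasets[b]
        for a, b in combinations(sorted(datasets), 2)
    }


def common_to_all(datasets):
    """Return the members present in every set."""
    return set.intersection(*datasets.values())


# Placeholder identifiers for illustration only (not real H2V data).
datasets = {
    "DEPs": {"P1", "P2", "P3"},
    "DPPs": {"P2", "P3", "P4"},
    "DTPs": {"P3", "P5"},
    "DUPs": {"P3", "P4", "P6"},
}

overlaps = pairwise_overlaps(datasets)
shared = common_to_all(datasets)
```

The size of each pairwise overlap corresponds to a two-set region of a Venn diagram, and `shared` corresponds to the central region common to all four datasets.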
Overview of H2V
As shown in Figure 2a, the web page header contains a navigation bar and a search box. The search box accepts user queries and attempts to match them against gene or protein names. The navigation bar provides access to all resources in the database. The “SARS2” drop-down menu links to the SARS-CoV-2 infection response genes/proteins. Similarly, the “SARS1” and “MERS” drop-down menus link to the SARS-CoV and MERS-CoV infection response genes/proteins, respectively. The “Utilities” drop-down menu provides useful utilities, including links for downloading data from, or uploading data to, H2V. On the page listing the response genes/proteins, each gene/protein is shown in a row of a table, with additional information about it shown in columns (Figure 2b). The “Score” column in the table indicates the reliability of the gene/protein, calculated as the number of studies in which the gene/protein was identified [29]. The genes/proteins in the table are clickable. Clicking on a gene/protein opens another page showing details of how the gene/protein responds to viral infection. This page includes two helpful features: one is to examine changes in the gene/protein at different timepoints post infection (Figure 2c), and the other is to discover known drugs that target the gene/protein. For PPIs, an embedded sequence viewer, as shown in Figure 2d, is provided for easy inspection of the gene/protein annotation in the viral genome. In addition, PPIs can be visualized as an interaction network on the page (Figure 2e).
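The “Score” described above is a count of the studies that identified a gene/protein. A minimal sketch of such a count follows; it assumes the input is a list of (gene, study) pairs, and the function name, gene labels and study labels are hypothetical and do not reflect the H2V data model.

```python
from collections import defaultdict


def reliability_scores(records):
    """Count the distinct studies reporting each gene/protein.

    `records` is an iterable of (gene, study) pairs; a pair that is
    repeated for the same study is counted only once.
    """
    studies_by_gene = defaultdict(set)
    for gene, study in records:
        studies_by_gene[gene].add(study)
    return {gene: len(studies) for gene, studies in studies_by_gene.items()}


# Placeholder records for illustration only (not real H2V data).
records = [
    ("GENE_A", "study_1"),
    ("GENE_A", "study_2"),
    ("GENE_B", "study_1"),
    ("GENE_A", "study_2"),  # duplicate report, ignored by the set
]

scores = reliability_scores(records)
```

Collecting studies in a set, rather than incrementing a counter, keeps a gene reported twice by the same study from being scored as if two independent studies had found it.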
Application cases
To facilitate rapid drug discovery for the treatment of COVID-19 during the pandemic, H2V provides a drug finder that identifies drugs targeting a given protein based on its UniProt accession number. The matching drugs and their DrugBank identifiers are then displayed on the lower part of the same page. For example, a search for Q9BYF1 identifies a few drugs, including chloroquine and hydroxychloroquine (Figure 3a).
To give users a concrete picture of how genes/proteins change dynamically over time post infection, H2V provides a utility called “Data animation”. On its page, a settings panel is provided for selecting the data to animate. For example, Figure 3b shows the settings used to animate DPPs in response to SARS-CoV-2 infection. The results of this example (Figures 3c and 3d) demonstrate that more human proteins are differentially phosphorylated at 24 h than immediately after SARS-CoV-2 infection. This suggests that the human response to SARS-CoV-2 infection involves continuous rewiring of cellular pathways.
H2V can be used to analyze integrated findings from different studies. Figure 4 shows an example in which the “Enrichment” analysis utility is used to identify enriched pathways among DPPs that respond to SARS-CoV-2 infection. DPPs identified in at least two studies were analyzed first (analysis 1). After the parameters were set on the left of Figure 4a, the analysis was run by clicking the button at the bottom. Once the analysis was complete, the input DPPs were listed on the right of Figure 4a, and the result was shown in Figure 4b. Seven pathways were enriched, including the FAS signaling pathway, p38 MAPK pathway and PDGF signaling pathway. Findings repeated in independent studies are expected to be more reliable than those from a single study. To examine the effect of including less well supported findings, the same analysis (analysis 2) was performed for DPPs identified in at least one study. This time, more pathways were enriched, and the top seven are shown in Figure 4c. The comparison shows that the top two pathways identified in analysis 1 were not among the top seven pathways identified in analysis 2. This suggests that the inclusion of low-confidence DPPs could distort the analysis result. By filtering on the number of supporting studies, H2V can help to exclude low-confidence findings and support more reliable biological inferences.
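The comparison of analysis 1 and analysis 2 can be sketched as a score filter followed by an over-representation test. The text does not state which statistical test the “Enrichment” utility uses, so the one-sided hypergeometric test below is an assumption, and all names, pathway labels and numbers are hypothetical placeholders rather than H2V data.

```python
from math import comb


def hypergeom_sf(k, N, K, n):
    """P(X >= k) when drawing n items from N, of which K are in the pathway."""
    total = comb(N, n)
    return sum(
        comb(K, i) * comb(N - K, n - i) for i in range(k, min(K, n) + 1)
    ) / total


def filter_by_score(scores, min_studies):
    """Keep proteins identified in at least `min_studies` studies."""
    return {p for p, s in scores.items() if s >= min_studies}


def enrich(inputs, pathways, background_size):
    """Rank pathways by p-value (assumed test: one-sided hypergeometric)."""
    results = []
    for name, members in pathways.items():
        hits = len(inputs & members)
        p = hypergeom_sf(hits, background_size, len(members), len(inputs))
        results.append((name, hits, p))
    return sorted(results, key=lambda r: r[2])


# Placeholder data for illustration only (not real H2V data).
scores = {"A": 2, "B": 3, "C": 1, "D": 1, "E": 2}
pathways = {"pathway_x": {"A", "B", "E"}, "pathway_y": {"C", "D", "F"}}
background_size = 20  # hypothetical number of background proteins

strict = filter_by_score(scores, 2)       # analogous to analysis 1
permissive = filter_by_score(scores, 1)   # analogous to analysis 2
analysis_1 = enrich(strict, pathways, background_size)
analysis_2 = enrich(permissive, pathways, background_size)
```

In this toy example, lowering the threshold adds proteins C and D to the input, which gives "pathway_y" two hits instead of none and weakens the p-value of "pathway_x". This mirrors, in miniature, how proteins supported by a single study can change an enrichment ranking.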