Drug and vaccine repurposing workflow
To study BCG’s polypharmacology and potential beneficial effects of this vaccine in preventing the fatal consequences of COVID-19, we have devised and implemented a ‘network biology’ workflow (Figure 1) to interrogate the hypothesis that BCG vaccination may protect from COVID-19 fatalities. This workflow is based on our drug repurposing chemocentric informatics workflow[12], which has been validated previously for small-molecule drug repurposing. The current workflow is tweaked towards vaccine repurposing by employing novel bioinformatic approaches to computationally model and connect molecular networks in an effort to understand the underlying ‘network’ biology of vaccines, and pinpoint the regulatory genes and proteins responsible for causing the observed beneficial multitherapeutic effects. Although we are not the first group to use network biology approaches to study the transcriptional changes of vaccines, to our knowledge, this is the first study that uses these approaches to support vaccine repurposing, specifically for COVID-19.
BCG consensus gene signature
Our workflow starts with the prioritization of a gene signature to study BCG’s network pharmacology. First, we derived a consensus gene signature (CGS) for BCG based on GEO’s dataset GSE58636[13]. Details on BCG-CGS signature are found in table S1 (Supporting Information). Twenty-two differentially-expressed genes across all 4 experiments (2 Groups x 2 time points discussed in Methods) formed BCG’s consensus gene signature (BCG-CGS) shown in Figure 2A.
BCG protein-protein interactions (PPIs)
All 22 genes in BCG-CGS were used as seed nodes to build a protein-protein interaction network for signature genes (Figure 2B). Interactions were extracted from STRING database and included high confidence interactions including: physical interactions (e.g., binding), functional interactions (e.g., activation, inhibition, catalysis), or gene co-expression. Two types of networks were generated: 1) high-confidence ‘core’ network restricted to BCG signature genes as network nodes and high confidence (interactions as network edges, and 2) medium-confidence interaction network obtained from expanding the core network by 20 additional nodes (Figure 3).
Enrichment analysis results performed in Cytoscape, using STRING’s protein-protein interactions, indicated that BCG-CGS is enriched in inflammatory cytokines and immune response modulators (Figure 2B). Some signature genes are also involved in the negative control of important viral processes (e.g., (FCN1, TNF and CCL3), and others are involved in the response to viral infections (e.g., IFNG, RNASE6, IL6 and TNF). The complete lists of enriched pathways are included in tables S2 and S3 (Supporting Information).
Identification of key hubs
We identified 291 key hubs using the causal reasoning method which seeks to identify molecular regulators that will directly cause the observed transcriptional changes in response to BCG vaccination. Key regulators can be transcriptional factors and proteins with potentially altered activity that explains the transcriptional changes. Top five statistically significant inhibited key hubs were: HEY1, DSIPI (GILZ), Jagged1, HAND1 and miR-129-1-3p. And top five statistically significant activated key hubs were: PHF20, TAFII70, Glutaredoxin, RUNX2 and NOTCH1 (NICD). Top 30 causal key hubs are shown in table 1 and all identified 291 key hubs are included in table S4 (Supporting Information).
Table 1. Top twenty key hubs predicted by causal reasoning.
Key Hub
|
Molecular Function
|
Gene Symbol
|
Predicted Activity†
|
Correct/Total network predictions‡
|
Activity Prediction P-value*
|
Calculation Distance§
|
HEY1
|
Transcription factor
|
HEY1
|
-
|
15/15
|
3.05E-05
|
3
|
PHF20
|
Binding protein
|
PHF20
|
+
|
15/15
|
3.05E-05
|
3
|
DSIPI (GILZ)
|
Transcription factor
|
TSC22D3
|
-
|
14/14
|
6.10E-05
|
3
|
TAFII70
|
Transcription factor
|
TAF6
|
+
|
14/14
|
6.10E-05
|
3
|
DSIPI (GILZ)
|
Transcription factor
|
TSC22D3
|
-
|
13/13
|
1.22E-04
|
2
|
Glutaredoxin 1
|
Enzyme
|
GLRX
|
+
|
13/13
|
1.22E-04
|
3
|
Jagged1
|
Receptor ligand
|
JAG1
|
-
|
13/13
|
1.22E-04
|
3
|
RUNX2
|
Transcription factor
|
RUNX2
|
+
|
13/13
|
1.22E-04
|
2
|
NOTCH1 (NICD)
|
Transcription factor
|
NOTCH1
|
+
|
16/17
|
1.37E-04
|
3
|
HAND1
|
Transcription factor
|
HAND1
|
-
|
12/12
|
2.44E-04
|
3
|
PRMT6
|
Enzyme
|
PRMT6
|
+
|
12/12
|
2.44E-04
|
2
|
miR-129-1-3p
|
RNA
|
MIR129-1
|
-
|
12/12
|
2.44E-04
|
3
|
SOX10
|
Transcription factor
|
SOX10
|
+
|
12/12
|
2.44E-04
|
3
|
HAND2
|
Transcription factor
|
HAND2
|
-
|
12/12
|
2.44E-04
|
3
|
MSK1
|
Protein kinase
|
RPS6KA5
|
+
|
12/12
|
2.44E-04
|
2
|
USP28
|
Protease
|
USP28
|
+
|
15/16
|
2.59E-04
|
3
|
c-Fos
|
Transcription factor
|
FOS
|
+
|
15/16
|
2.59E-04
|
3
|
UBF
|
Transcription factor
|
UBTF
|
+
|
11/11
|
4.88E-04
|
3
|
miR-520e-3p
|
RNA
|
MIR520E
|
-
|
11/11
|
4.88E-04
|
2
|
TMEM119
|
Protein
|
TMEM119
|
+
|
11/11
|
4.88E-04
|
3
|
LRP16
|
Binding protein
|
MACROD1
|
+
|
11/11
|
4.88E-04
|
2
|
LRP16
|
Binding protein
|
MACROD1
|
+
|
14/15
|
4.88E-04
|
3
|
CaMK II gamma
|
Protein kinase
|
CAMK2G
|
+
|
11/11
|
4.88E-04
|
2
|
CaMK II gamma
|
Protein kinase
|
CAMK2G
|
+
|
14/15
|
4.88E-04
|
3
|
miR-4500
|
RNA
|
MIR4500
|
-
|
14/15
|
4.88E-04
|
3
|
NOTCH1 (NICD)
|
Transcription factor
|
NOTCH1
|
+
|
14/15
|
4.88E-04
|
2
|
miR-4516
|
RNA
|
MIR4516
|
-
|
11/11
|
4.88E-04
|
3
|
NDPK B
|
Protein kinase
|
NME2
|
-
|
11/11
|
4.88E-04
|
3
|
KLF11 (TIEG2)
|
Transcription factor
|
KLF11
|
-
|
11/11
|
4.88E-04
|
2
|
miR-320d
|
RNA
|
MIR320D1
|
-
|
14/15
|
4.88E-04
|
3
|
† Predicted activity of the key hub by causal reasoning is denoted by – if the hub is inhibited, and denoted by + if the hub is activated.
‡ Correct/total network predictions: correct for the genes in the dataset predicted correctly; total for the total number of genes in the causal reasoning network.
§ Calculation distance: Using causal reasoning one-step key hubs are defined as statistically significant transcriptional factors that are associated with experimental differential expressed genes regulation. Two-step and three-step key hubs are distant key hubs that regulate one-step transcriptional factors.
*P-value calcualted for the polynomial test.
Identifying BCG ‘mimics’
In order to identify experimentally validated upstream regulators that cause transcriptional changes similar to those induced by BCG, we queried the Connectivity Map (CMap)[14] database of the Broad Institute with BCG-CGS and identified proteins and small-molecule drugs that have strong connectivity scores with BCG (Figure 1). The CMap approach enabled us to compare BCG-CGS with ‘experimentally’ predefined signatures of therapeutic compounds and genetic perturbations (i.e., over expression or knockdown) included in the CMap and ranked according to a connectivity scores (ranging from +100 to −100), representing relative similarity to BCG-CGS. The connectivity score itself is derived using a nonparametric, rank-based, pattern-matching strategy based on the Kolmogorov-Smirnov statistic[15]. All instances in the database are then ranked according to their connectivity scores with BCG-CGS; those at the top (+) are most strongly correlated to the query signature and looked at as BCG mimics, and those at the bottom (−) are most strongly anticorrelated and can reverse BCG’s gene signature.
Our analysis identified three highly enriched classes of genetic knockdown (KD) perturbagens and one pharmacological class of drugs that have positive connectivity scores in alveolar A549 cells (i.e., caused similar transcriptional changes to those induced by BCG in alveolar A549 cells). These hits can be considered as BCG mimics capable of inducing transcriptional changes similar to those caused by BCG vaccine. Therefore, we suggest that BCG mimics can be used as alternatives to BCG vaccination to promote long-lasting beneficial effects on immune cells. The three enriched protein classes are: protein phosphatases (with best positive connection for PPP4C KD), histone deacetylases (with best positive connection for HDAC10 KD followed by HDAC11 KD), and mediator complex proteins (with best positive connection for MED6 KD followed by MED7 KD). Additionally, protein kinase C (PKC) activators were enriched as a drug class; and top 3 PKC activators with highest CMap connectivity scores to BCG-CGS prostratin, phorbol-12-myristate-13-acetate, and ingenol. It is evident that all of the above 4 classes of proteins share one common feature: they participate in the transcriptional and metabolic regulation of immune cells in response to environmental cues including responses to pathogens[16–19]. All top-scoring PKC activators from the CMap, are also known to have antiviral effects or affect T cell activation[20–24].
Remarkably, analyzing top ten CMap positive connections with BCG-CGS obtained from nine cell lines indicated that two compounds are approved antiviral drugs: raltegravir (top 3rd positive connection, an HIV integrase inhibitor) and lopinavir (top 6th positive connection, an HIV protease inhibitor). More interestingly, emetine (top 4th positive connection) and lopinavir were recently validated to inhibit SARS-CoV-2 replication in vitro[25]. We also found evidence in the biomedical literature indicating that MST-312[26], narciclasine[27] and verrucarin-a[28] possess antiviral activities. All CMap hits are provided in tables S5 and S8 (Supporting Information).
In order to prioritize high confidence BCG genetic mimics, we integrated hypotheses derived independently from the CMap with those predicted by causal reasoning, and accepted common hits only (i.e., CMap positive connections with BCG-CGS that are also predicted as beneficial drug targets by causal reasoning). This analysis resulted 30 high confidence common hits reported in table S9 (Supporting Information).
Any validation for functional connections with SARS-CoV-2?
We tested whether BCG-CGS, CMap positive connections, or predicted key hubs will have any impact on COVID-19 by identifying overlaps with SARS-CoV-2 interactome, i.e., human proteins that were experimentally validated to interact with SARS-CoV-2 and extracted from two recent reports[29,30]. This analysis (Figure 4A) validated 3 proteins hits to have physical links to SARS-CoV-2. The three proteins are transcribed by: BRD4, PRKACA and SIRT5; they all were positive connections from the CMap, predicted as statistically significant key hubs, and were also validated SARS-CoV-2 interacting proteins[7].
Additionally, 14 high-confidence CMap positive connections, were validated to make physical interactions SARS-CoV-2 proteins. These proteins are: PSEN2, PABPC1, HMOX1, CIT, PLAT, IGF2R, RIPK1, NDUFS3, NDUFA5, GGH, NEU1, SCARB1, CSNK2B, F2RL1. And two positive connections, MARK2 and MARK3, were reported to have interactions with corona viruses[29]. Predicted causal key hubs, SIGMAR1 and GNB1, were also validated to have physical links to SARS-CoV-2[7], and a third key hub PPIA was known as human protein interacting with proteins from corona viruses[29].
Additionally, we mined the biomedical literature to identify evidence for linking BCG small molecule mimics with SARS-CoV-2, corona viruses or viral infections in general. We found that two out of ten top positive compound connections (emetine and lopinavir), were recently validated to inhibit SARS-CoV-2 replication in vitro[25]. Other compounds we found to inhibit the growth of corona viruses, or had general antiviral activities (Table 2).
Table 2. Small-molecule BCG mimics with potential antiviral effects.
Compound
|
Score†
|
Description
|
Validation‡
|
prostratin
|
98.65
|
PKC activator
|
Antiviral[21]
|
ingenol
|
98.52
|
PKC activator
|
Antiviral[20]
|
raltegravir
|
97.85
|
HIV integrase inhibitor
|
Antiviral[54]
|
emetine
|
97.25
|
Protein synthesis inhibitor
|
SARS-CoV-2[25]
|
phorbol-12-myristate-13-acetate
|
96.72
|
PKC activator
|
Antiviral[23,24]
|
mebendazole
|
95.32
|
Tubulin inhibitor
|
Antiviral[55]
|
lopinavir
|
95.06
|
HIV protease inhibitor
|
SARS-CoV-2[25,26]
|
MST-312
|
95.04
|
Telomerase inhibitor
|
Antiviral[23,24,26]
|
narciclasine
|
94.71
|
Coflilin signaling pathway activator
|
Antiviral[27]
|
verrucarin-a
|
94.51
|
Protein synthesis inhibitor
|
Antiviral[56]
|
anisomycin
|
94.40
|
DNA synthesis inhibitor
|
Corona viruses[57]
|
azacitidine
|
94.29
|
DNA methyltransferase inhibitor
|
Antiviral[58]
|
cytochalasin-b
|
93.90
|
Microtubule inhibitor
|
Antiviral[59]
|
cephaeline
|
93.88
|
Protein synthesis inhibitor
|
Antiviral[60]
|
homoharringtonine
|
93.42
|
Protein synthesis inhibitor
|
Antiviral[61]
|
ruxolitinib
|
92.81
|
JAK inhibitor
|
COVID-19 CT§
|
HU-211
|
92.64
|
Glutamate receptor antagonist
|
Unknown
|
vinblastine
|
92.36
|
Microtubule inhibitor
|
Unknown
|
RO-28-1675
|
92.12
|
Glucokinase activator
|
Unknown
|
vincristine
|
91.61
|
Tubulin inhibitor
|
Unknown
|
†Score refer to the CMap score. It represents the level of similarity between transcriptional effects induced by BCG and each of the compounds.
‡ Validation refers to the presence of any supporting evidence from the biomedical literature that the predicted BCG mimics have any antiviral activities. Antiviral means there is evidence that the compound is used as or has antiviral activity; SARS-CoV-2 means that the compound should antiviral activity against SARS-CoV-2; Corona viruses means that the compound showed antiviral activity against corona viruses other than SARS-CoV-2.
§ COVID-19 CT: there is evidence that the compound is being tested in clinical trials for COVID-19. There are 12 Studies found for Ruxolitinib in COVID-19 on clinicaltrials.gov.