Bioinformatic approaches towards identification of potential repurposable drugs for COVID-19

Repurposing existing drugs approved for other conditions is crucial to identifying specific therapeutics against SARS-CoV-2, the virus causing the COVID-19 (coronavirus disease-2019) pandemic. To this end, it is important to understand how the virus hijacks the host system during the course of infection and to determine potential virus- and host-targeted inhibitors. This study elucidates the underlying virus-host interaction based on differentially expressed gene profiling, functional enrichment and pathway analysis, and protein-protein and protein-drug interactions, utilizing the information on transcriptional response to SARS-CoV-2 infection from the GSE147507 dataset, which contains a COVID-19 case relative to healthy control and infected cell culture compared to uninfected culture. Low IFN signaling, elevated chemokine levels, and proinflammatory cytokine release were markedly observed. We identified MYC-rapamycin and ABCG2-rapamycin interactions, and unique gene signatures in the case (regulation of protein modification and MAPK signaling) as well as in the cell line (metabolic dysregulation and interferon signaling) that differ from known COVID-19 genes.
proto-oncogene, N: nucleocapsid protein, NendoU: uridylate-specific endoribonuclease, NFATC3: nuclear factor of activated T cells 3, NFKB: nuclear factor kappa B, NKRF: NFKB repressing factor, NLRP3: NLR family pyrin domain containing 3, nsp3/PL2-PRO: papain-like protease, nsp5/MPRO: main protease, nsp12/RdRp: RNA-dependent RNA polymerase, nsp7-nsp8: primase complex, nsp13: helicase, nsp14: exoribonuclease, nsp15: endonuclease, OAS1: 2’–5’-oligoadenylate synthetase 1, OASL: 2’–5’-oligoadenylate synthetase like, ORF: open reading frame, PDGFRB: platelet derived growth factor receptor beta, PE: phosphatidyl ethanolamine, PIAS1: protein inhibitor of activated STAT1, PKR: protein kinase R, PLEKHA4: pleckstrin homology domain containing A4, PPP3R1: protein phosphatase 3 regulatory subunit B alpha, PRR: pattern recognition receptor, p-STAT1: phosphorylated signal transducer and activator of transcription 1, RBD: receptor-binding domain, RPS6: ribosomal protein S6, RSAD2: radical S-adenosyl methionine domain containing 2, RTC: replicase-transcriptase complex, S: surface spike protein, SFN: epithelial cell marker protein 1, S100A9: S100 calcium binding protein A9, TF: transferrin, TMPRSS2: transmembrane serine protease 2, TNF: tumor necrosis factor, TNFSF10: TNF superfamily member 10, TNNI3: troponin I3, TSC1: tuberous sclerosis 1, ULK: Unc–51 like autophagy activating kinase, XAF1: X-linked inhibitor of apoptosis protein associated factor 1.


Introduction
As of June 25, 2020, the ongoing COVID-19 (coronavirus disease-2019) pandemic had caused 9,296,202 infections globally, including 479,133 deaths (WHO, 2020), because of the lack of specific treatments or vaccines. Repurposing drugs may curtail the time and costs compared to de novo drug discovery, in addition to providing principal information on pharmacology and toxicology, as well as ascertaining novel indications for prompt clinical trials and regulatory assessment (Shaha et al., 2020; Gordon et al., 2020; Guy et al., 2020). The present study addresses this by mapping the interaction between viral proteins and human proteins in a SARS-CoV-2 infected case compared to healthy control, and in SARS-CoV-2 infected compared to uninfected cell culture. This was achieved by downloading the GSE147507 dataset, related to the transcriptional response to SARS-CoV-2 infection (Blanco-Melo et al., 2020), from the Gene Expression Omnibus (GEO) database (www.ncbi.nlm.nih.gov/geo/). The underlying virus-host interactions were elucidated based on differentially expressed gene (DEG) profiling, functional enrichment and pathway analysis, protein-protein interaction (PPI) network, and protein-drug interaction (PDI) in the context of COVID-19. The current study explores the major regulatory genes as potential biomarkers of COVID-19, the transcriptional response to SARS-CoV-2 infection in cell culture as well as in the case, a comparison of the known gene signatures in COVID-19 with those in case and culture, and the identification of potential repurposable drugs for COVID-19 treatment.

COVID-19 disease associated genes and network
The top 100 COVID-19 associated human genes were retrieved from the disease query of the STRING (Search Tool for the Retrieval of Interacting Genes) (http://string-db.org/cgi/input.pl; version 1.5.0) App (Doncheva et al., 2019) of the Cytoscape software, version 3.7.2 (https://cytoscape.org/), according to Shannon et al. (2003). A STRING network was created with the top 44 COVID-19 related genes ranked in order of disease score and degree connectivity; this set of 44 genes was named group A, with disease scores ranging from 1.043351 to 3.922025 and degree connectivity ranging from 1 to 29.
The DEGs were identified in the COVID-19 case relative to the uninfected control (group B), and in the ACE2-induced A549 cell line with SARS-CoV-2 infection compared to the uninfected cell line (group C); the thresholds for DEGs were set at |log2FoldChange| (|log2FC|) > 10 and |log2FC| > 3, respectively, for the two groups. A PPI network was constructed with the DEGs in group B and group C using the STRING App of the Cytoscape software, version 3.7.2 (https://cytoscape.org/), as per Shannon et al. (2003). The significant DEGs were expressed as heatmaps using the R programming (https://www.r-project.org/; version 3.6.1 [2019-07-05]) gplots tools from the Biobase packages (https://CRAN.R-project.org/package=gplots) of the Bioconductor project (Huber et al., 2015).
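The fold-change cut-offs described above can be illustrated with a minimal Python sketch. It is not the pipeline used in this study: the function and variable names are hypothetical, the log2FC values are a handful of chemokine-related values quoted in the Results, and no significance (p-value) filter is applied because none is specified here.

```python
# Illustrative log2 fold-change values quoted in the Results of this paper.
# All names below are hypothetical and chosen only for this sketch.
CASE_LOG2FC = {"CCR7": -11.08, "CCL23": -8.9, "CXCL2": -7.7, "CCR6": -6.74, "CCL4": 2.28}
CELL_LOG2FC = {"CXCL11": 5.25, "CXCL2": 4.24, "CCR6": 3.22, "CCL4": 2.96, "CCL26": -2.76}


def filter_degs(log2fc_by_gene, threshold):
    """Keep genes whose absolute log2 fold change is strictly above the threshold."""
    return {gene: fc for gene, fc in log2fc_by_gene.items() if abs(fc) > threshold}


# Group B (case vs. control) uses |log2FC| > 10; group C (cell line) uses |log2FC| > 3.
group_b = filter_degs(CASE_LOG2FC, threshold=10)
group_c = filter_degs(CELL_LOG2FC, threshold=3)

print(sorted(group_b))
print(sorted(group_c))
```

Taking the absolute value keeps both up- and downregulated genes, so the sign of the fold change is preserved in the result and can still be used to separate the two directions afterwards.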

SARS-CoV-2 protein and drug interaction (PDI) network analysis
The PDI network was constructed with the STITCH protein/compound query for the three groups A, B, and C, comprising the top 44 COVID-19 related genes, 255 DEGs in the COVID-19 case compared to uninfected control, and 363 DEGs in the ACE2-induced cell line treated with and without SARS-CoV-2, using the STRING App of the Cytoscape software, version 3.7.2 (https://cytoscape.org/), according to Shannon et al. (2003); the approved and experimental repurposable drugs against COVID-19 were included to create the PDI network (Guy et al., 2020; Li et al., 2020).

Results And Discussion
The putative physiological response to SARS-CoV-2 infection from virus entry to release from the host, as expressed in the COVID-19 case and the SARS-CoV-2 infected cell line in the present study, is depicted in Figure 1. SARS-CoV-2 has evolved strategies to evade the host immune response against viral infection, including inhibition of host IFN signaling (by nsp1 to decrease p-STAT1 in infected cells) (Wathelet et al., 2007), PKR-mediated apoptosis (NendoU by evading dsRNA sensors in host cell) (WikiPathways, 2020; Deng et al., 2017), NKRF repression by nsp9 and nsp10, as well as hijacking of ubiquitination by nsp10 with DUB activity (WikiPathways, 2020). The current dataset GSE147507 contains the expression of CDK; N protein mediated restriction of cell cycle by CDK4/6 (Surjit et al., 2006), which was significantly decreased in both the COVID-19 case and the SARS-CoV-2 infected cell line, and nsp14 mediated upregulation of viral replication and transcription by DDX1 (Wu et al., 2014), which was decreased in both the COVID-19 case and the SARS-CoV-2 infected cell line in the current study. The virus counteracts IL6 by inducing DUSP1 (decreased in the COVID-19 case, elevated in the SARS-CoV-2 infected cell line), a negative regulator of p38 MAPK (decreased in both case and cell line), as also reported by Liao et al. (2011). The viroporin E protein activates the NLRP3 inflammasome (decreased in case, elevated in cell line) to trigger production of the proinflammatory cytokines TNF, IL6, and IL1B, causing host immunopathological conditions: hypercytokinemia, ARDS and multi-organ failure (Josh and Manuel, 2020).
Herein, the DEGs of COVID-19 retrieved from the STRING disease query of Cytoscape (https://cytoscape.org/) were designated as group A, whereas the dataset GSE147507 was analyzed to identify DEGs in the COVID-19 case versus uninfected healthy control, categorized as group B, and in the ACE2-induced A549 cell line infected with SARS-CoV-2 versus the uninfected A549 cell line, categorized as group C (Figure 2). The PPI networks constructed with top genes from group A, B, and C are depicted in Figure 2 (IFI6, IFIT1, IFIT2, IFIT3, IFITM1, IFITM3, OAS1, OAS3, OAS2, MX1, MX2, RSAD2, OASL) and activating apoptosis (XAF1, IRF2, IRF7) (Figure 3d). SARS-CoV-2 viral replication restricted the expression of ISG15 (in both case and cell line in the GSE147507 dataset analyzed), which plays a role in the induction of a myriad of antiviral interferon-stimulated genes (Schoggins and Rice, 2011). ISG20 (in the current dataset GSE147507), downregulated in the COVID-19 case, acts as an IFN-mediated ssRNA antiviral exoribonuclease and is related to IFNG signaling. Despite virus replication (strong expression in case compared to cell line), the host response to SARS-CoV-2 was ineffective at instigating strong IFN-I and -II pathways while concurrently inducing elevated levels of chemokines required to engage effector cells. The differentially decreased genes involved in inflammasome activation and activity include NLRP3, CASP5, IL1A, IL1B, IL18RAP, and IL1R2, and those involved in chemokine signaling for recruiting innate immune cells to the epithelium include CCL2, CCL3, and CCL4, while the most significant ISGs in SARS-CoV-2 infection include IFI6, IFI44L, IFI27 and OAS2 (Mick et al., 2020).
The present dataset featured a wide array of gene expression of chemokine subfamilies (Figure 3b). The most significantly expressed CCL chemokines were CCL4 and CCL23, with log2FC values of 2.28 and -8.9, respectively, in the COVID-19 case compared to the healthy control group. CCL4 and CCL26 were expressed most significantly, with log2FC 2.96 and -2.76, respectively, along with CCL3 (disease score = 1.8; log2FC 1.1) in the SARS-CoV-2 infected compared to the uninfected cell line. In addition to the expression of monocyte associated CCL8 in the COVID-19 case, CCL2 (disease score = 1.66) was also expressed significantly in both case and cell line. CCR7 in the COVID-19 case (log2FC -11.08), and CCR6 in both case (log2FC -6.74) and cell line (log2FC 3.22), were the most significantly expressed chemokine receptors. Among the CXC subfamily, CXCR4 (disease score = 1.85; log2FC -6.8 in case and 2.42 in cell line), CXCL10 (disease score = 1.79; log2FC 2.07 in cell line), CXCL11 (log2FC -1.62 in case and 5.25 in cell line) and CXCL2 (log2FC -7.7 in case and 4.24 in cell line) were significant. CXCR4 is involved with the binding of CD4 to support HIV entry into cells and is highly expressed in cancer; CXCL11 is the dominant ligand for CXCR3 (CXCL3 log2FC -7.15 in case and 2.98 in cell line) and is induced by IFNG, while CXCL2 is expressed at the inflammatory site and may inhibit hematopoietic progenitor cell proliferation (GeneCards, 2020). The current dataset expressed CX3CL1 (log2FC -5.73) in the COVID-19 case and CX3CR1 (log2FC -1.32) in the cell line. CX3CL1 plays a role in cancer, atherosclerosis, AIDS, and inflammatory diseases; the CX3CR1 gene is involved in adhesion and migration of leucocytes, and acts as a co-receptor for HIV1 (GeneCards, 2020).
The major candidate genes in group A were ACE2 (highest disease score 2.55, significantly downregulated in case) and IL6 (highest degree connectivity 29, significantly downregulated in case and upregulated in cell line), with cytokine-cytokine receptor interaction as the top enriched pathway. The enriched pathway (Figure 5a), and ACE2 underexpression along with IL6 activation, are related to angiogenesis and cancer (Feng et al., 2011; Kumari et al., 2016). The major candidate genes among the group B DEGs were MYC (with the highest degree connectivity) and SFN (with the highest log2FC), both related to cancer, along with FOS, PPP3R1, TSC1, PDGFRB, and MFN2 in the PPI network (Figure 2b). The enriched MAPK and PI3K-AKT pathways are important mediators of cellular processes, dysregulation of which is associated with cancer pathogenesis (Raouf et al., 1996; Vara et al., 2004). The key candidate genes among the downregulated group C DEGs were ALB (highest degree connectivity) and EPHX1 (highest log2FC) (Figure 2c), and among the upregulated DEGs were TNF, with the highest degree connectivity, and IFNB1, with the highest log2FC (Figure 2d). The gene families for ALB and TNF are plasma proteins and intracellular and secreted proteins, in addition to cancer-related and candidate cardiovascular disease genes. Similarly, EPHX1 is related to intracellular proteins and cancer-related genes along with potential drug targets, whereas IFNB1 belongs to the cancer-related and secreted protein gene families (Human Protein Atlas, 2020). Several other cancer associated genes (GSTA1, ABCG2, CAT, CTSD, TF, EGR1, TNFSF10) that appeared in the network (Figure 2c-d) were linked with the transcriptional response of SARS-CoV-2 infected cell lines affecting innate immunity (CHRNB4, CAT, CTSD, IFNB1), and with endothelial and vascular inflammation (TNF, ICAM) in COVID-19.
These DEGs altogether displayed association with old and new drugs potentially useful in COVID-19 therapy and tested or used in oncological settings as well (Ciliberto et al., 2020), including rapamycin, chloroquine, lopinavir, ritonavir, and ribavirin as appeared in the PDI network in our study (Figure 6), and remdesivir, tocilizumab and sarilumab as shown in the virus-human protein interaction (Figure 1). The PDI network of group A genes showed top scores of 0.985 and 0.923 for rapamycin (degree of interaction: 5) with IL10, IL2A, CD8A, CD40LG, and CSF3 (Figure 6a). Cyclosporine possessed the highest degree of interaction (= 7), with CD40LG, IL1B, IFNG, CD8A, IL6, CSF3, and IL2RA, with scores ranging from 0.453 to 0.897 (Figure 6a). The IL6 gene had the highest degree of interaction (= 4), with azithromycin, oseltamivir, cyclosporine, and prednisone. The PDI network of group B genes showed top scores of 0.926 and 0.876, respectively, in PPP3R1-cyclosporine and MYC-doxycycline interactions; rapamycin had the highest degree of interaction (= 5), with TSC1, MYC, PDGFRB, MFN2, and PPP3R1; MYC was the commonly interacting gene for doxycycline, rapamycin and cyclosporine (Figure 6b). In group C, TNF-chloroquine and ABCG2-cyclosporine had top interaction scores of 0.969 and 0.955. The highest degree of interaction (= 5) was achieved for rapamycin (scores: 0.683-0.884), with ABCG2, GSTA1, TF, ALDH3A1 and ALDH3A2, while the gene with the highest degree (= 6) was ABCG2, interacting with cyclosporine, rapamycin, ivermectin, ritonavir, CHQ, and macrolides (scores: 0.466-0.955) (Figure 6c). In general, the PDI network showed rapamycin as the top interactor with the IL6, MYC and ABCG2 genes in group A, B and C, respectively (Figure 6a-c).
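The "degree of interaction" reported for the PDI networks is the number of edges touching a node. A minimal Python sketch of that count follows, using only the group B drug-gene pairs named in the text above; the full network in Figure 6b may contain further edges, interaction scores are omitted, and all names in the code are hypothetical.

```python
from collections import Counter

# Group B drug-gene pairs as named in the text (assumed incomplete; scores omitted).
GROUP_B_EDGES = [
    ("rapamycin", "TSC1"),
    ("rapamycin", "MYC"),
    ("rapamycin", "PDGFRB"),
    ("rapamycin", "MFN2"),
    ("rapamycin", "PPP3R1"),
    ("cyclosporine", "PPP3R1"),
    ("cyclosporine", "MYC"),
    ("doxycycline", "MYC"),
]


def node_degrees(edges):
    """Count the distinct edges touching each node of an undirected drug-gene network."""
    degrees = Counter()
    for drug, gene in set(edges):  # set() guards against duplicated pairs
        degrees[drug] += 1
        degrees[gene] += 1
    return degrees


degrees = node_degrees(GROUP_B_EDGES)
print(degrees["rapamycin"], degrees["MYC"])
```

On these pairs the sketch gives rapamycin a degree of 5 and MYC a degree of 3, in line with the description of rapamycin as the drug with the highest degree and MYC as the gene shared by doxycycline, rapamycin and cyclosporine.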

Conclusions
Overall, the plausible repurposable drugs for COVID-19 treatment include cyclosporine, doxycycline, chloroquine and rapamycin, which emerged as top scoring interactors with MYC, PPP3R1, FOS, ABCG2, and TNF in SARS-CoV-2 infection, compared to the drugs (cyclosporine, azithromycin, rapamycin, and oseltamivir) interacting with the known COVID-19 genes IL6, CD8A, CD40LG, IL10, and CSF3. Among the plethora of available repositionable drugs, those appearing in this investigation, together with their targets, might be helpful in combating the ongoing COVID-19 pandemic.