Translating GWAS Findings to Inform Drug Repositioning Strategies for COVID-19 Treatment

Abstract
We developed a computational framework that integrates Genome-Wide Association Studies (GWAS) and post-GWAS analyses, designed to facilitate drug repurposing for COVID-19 treatment. The approach combines transcriptome-wide associations, polygenic priority scoring, 3D genomics, viral-host protein-protein interactions, and small-molecule docking. Through GWAS, we identified nine druggable host genes associated with COVID-19 severity and SARS-CoV-2 infection, all of which show differential expression in COVID-19 patients. These genes include IFNAR1, IFNAR2, TYK2, IL10RB, CXCR6, CCR9, and OAS1. We performed an extensive molecular docking analysis of these targets using 553 small molecules derived from five therapeutically enriched categories, namely antibacterials, antivirals, antineoplastics, immunosuppressants, and anti-inflammatories. This analysis, which comprised over 20,000 individual docking runs, enabled the identification of several promising drug candidates. All results are available via the DockCoV2 database (https://dockcov2.org/drugs/). The computational framework ultimately identified nine potential drug candidates: Peginterferon alfa-2b, Interferon alfa-2b, Interferon beta-1b, Ruxolitinib, Dactinomycin, Rolitetracycline, Irinotecan, Vinblastine, and Oritavancin. While its current focus is on COVID-19, the framework can be applied more broadly to assist drug repurposing efforts for a variety of diseases. Overall, this study underscores the potential of human genetic studies and the utility of a computational framework for drug repurposing in the context of COVID-19 treatment, providing a valuable resource for researchers in this field.


Introduction
The COVID-19 pandemic has presented significant challenges in finding appropriate treatments. Over 620 drug development programs have been reported, with numerous antivirals, cell and gene therapies, immunomodulators, and neutralizing antibodies being investigated 1. Drug repurposing is a cost-effective and expedited approach to finding effective therapeutic agents 2,3. Approved drugs, such as interferon, chloroquine, remdesivir, lopinavir, ritonavir, and angiotensin receptor blockers, have been studied for their potential to combat COVID-19. Remdesivir, initially evaluated in clinical trials for the Ebola outbreak, has been approved to treat COVID-19 patients. Recent studies have shown that remdesivir reduced respiratory infection and shortened recovery time in hospitalized COVID-19 adults 4. The FDA has approved remdesivir for COVID-19 treatment in hospitalized patients with positive SARS-CoV-2 viral testing results. Another repurposed drug is Olumiant, a Janus kinase inhibitor used to treat severe alopecia areata, which has been approved to treat severe COVID-19 in hospitalized adults. Combining Olumiant and remdesivir has further shortened recovery time and improved clinical status 5.
Human genetics studies have long been used to elucidate the pathophysiology of diseases and to identify novel therapeutic targets for conditions such as type 2 diabetes, rheumatoid arthritis, ankylosing spondylitis, psoriasis, osteoporosis, schizophrenia, and dyslipidaemia 6. With the COVID-19 pandemic, researchers are conducting genome-wide association studies (GWAS) to identify host genetic susceptibility to the disease and its causative agent, SARS-CoV-2. Recently, GWAS loci on chromosomes 3p21.31 and 9q34.2 have been associated with respiratory failure in 1,980 severe COVID-19 patients compared to 2,381 control subjects from Italy and Spain 7. The COVID-19 Host Genetics Initiative (COVID-19 HGI) has conducted meta-analyses of 46 GWAS from 19 countries with different genetic ancestries, providing insights into the host genetic effects of COVID-19 8. Gene prioritization in GWAS is an important strategy for identifying the most recent and most effective treatment targets 9.
Understanding virus-host gene interactions is essential to determine how viruses infect hosts and replicate within them, and how the host immune system responds to pathogens. Gordon et al. generated a SARS-CoV-2 virus-host interactome containing 332 high-confidence protein-protein interactions between SARS-CoV-2 proteins and human proteins using affinity purification mass spectrometry. This interactome serves as a valuable reference for prioritizing host genes as potential targets for treating COVID-19 10.
We present here a novel computational framework designed to facilitate drug repurposing for the treatment of COVID-19. This framework combines transcriptome-wide associations, polygenic priority scores, 3D genomics, viral-host protein-protein interactions, and small-molecule docking to identify potential drug candidates. Using a GWAS approach, we identified nine druggable host genes that are associated with COVID-19 severity and SARS-CoV-2 infection and are differentially expressed in COVID-19 patients. Subsequent molecular docking analysis of 553 small molecules from five therapeutic categories (antibacterial, antiviral, antineoplastic, immunosuppressant, and anti-inflammatory drugs) enabled us to identify several promising drug candidates, such as Peginterferon alfa-2b, Interferon alfa-2b, Interferon beta-1b, Ruxolitinib, Dactinomycin, Rolitetracycline, Irinotecan, Vinblastine, and Oritavancin. Importantly, the framework is versatile and can be applied beyond COVID-19 to aid drug repurposing efforts for other diseases. Our study emphasizes the potential of human genetics studies and bioinformatics analysis in prioritizing gene targets, which can be further evaluated through molecular docking analyses with known and approved drug molecules. By providing a comprehensive approach to prioritizing gene targets from GWAS, our framework enables researchers to identify underlying causal genes within each GWAS locus and to further prioritize them based on expression levels, tissue specificity, and enrichment analyses.
Identification of repurposing drug categories using gene-set enrichment analysis
To gain further insight into the biological functions of the 41 prioritized genes, we performed gene-set enrichment analysis using four gene-set libraries. Enriched gene sets were identified using a combination of the ranked combined score and an adjusted p-value cutoff of 0.05. We identified six gene sets from the WikiPathways database (Fig. 2.a), five from the Elsevier database (Fig. 2.b), five from the Gene Ontology (GO) molecular function category (Fig. 2.c), and 12 from the GO cellular component category (Fig. 2.d). The details of the enriched gene sets are shown in Supplementary Tables 3 to 6.
After conducting a manual review of the top enriched biological mechanisms and pathways in each library, we identified five drug categories: antibacterial, antiviral, antineoplastic, immunosuppressive, and anti-inflammatory. To further investigate the potential for molecular docking of these drugs with the prioritized genes, we collected molecules in these five categories from the ChEMBL database.

Molecular docking with a searchable docking browser
We collected drug ligands from ChEMBL in five categories: antibacterials (n = 221), antivirals (n = 67), antineoplastics (n = 172), immunosuppressants (n = 22), and anti-inflammatories (n = 71). We conducted comprehensive molecular docking for all 41 prioritized genes, amounting to more than 20,000 molecular docking analyses. All docking results are available in the DockCoV2 database (https://dockcov2.org/drugs/). DockCoV2 also provides COVID-19-related literature mining for genes or compounds of interest. Details of the DockCoV2 implementation are provided in the supplemental information.
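
As a quick consistency check on the figures above, the sketch below sums the reported category counts and multiplies by the number of prioritized genes. It assumes exactly one docking run per ligand-gene pair, which the text does not state explicitly; under that assumption the total is consistent with the reported "more than 20,000" analyses.

```python
# Ligand counts per therapeutic category, as reported in the text.
ligand_counts = {
    "antibacterials": 221,
    "antivirals": 67,
    "antineoplastics": 172,
    "immunosuppressants": 22,
    "anti-inflammatories": 71,
}
n_targets = 41  # prioritized genes used as docking targets

total_ligands = sum(ligand_counts.values())

# Assumption: exactly one docking run per ligand-target pair.
n_docking_runs = total_ligands * n_targets

print(total_ligands)   # 553
print(n_docking_runs)  # 22673
```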

Discovery of repurposable drugs from prioritized drug-gene interactions for COVID-19
The approved drug-gene interactions are shown in Table 2. Peginterferon alfa-2b (DrugBank ID: DB00022), Interferon alfa-2b (DrugBank ID: DB00105), and Interferon beta-1b (DrugBank ID: DB00068) are all approved drugs that target IFNAR1 and IFNAR2. Peginterferon alfa-2b is an antiviral and immunoregulatory drug that stimulates the innate antiviral response and is used to treat hepatitis B, hepatitis C, and some cancers. Interferon alfa-2b is a recombinant human interferon used to treat hepatitis B and C infection, genital warts, hairy cell leukemia, follicular lymphoma, malignant melanoma, and AIDS-related Kaposi's sarcoma. Interferon beta-1b is another form of recombinant human interferon, used to slow the progression of relapsing multiple sclerosis and reduce the frequency of clinical symptoms. Ruxolitinib, an approved drug that targets TYK2, is a kinase inhibitor used to treat patients with myelofibrosis, patients with polycythemia vera who have not responded to or cannot tolerate hydroxyurea, and patients with graft-versus-host disease in whom steroid treatments have failed. The prioritized drug-gene interactions, selected based on the COVID-19 GWAS analysis, strong binding in the molecular docking analysis (binding affinity < -10 kcal/mol), and relevant literature, are shown in Table 3: Dactinomycin (DB00970), Rolitetracycline (DB01301), Irinotecan (DB00762), Vinblastine (DB00570), and Oritavancin (DB04911).
The COVID-19 HGI has identified six phenotypes that can be used to group individuals into three distinct disease statuses: critical illness, hospitalization, and reported infection 8. A2 (very severe respiratory confirmed COVID) was classified as critical illness, B1 and B2 were classified as hospitalization, and C1 and C2 were classified as reported infection. To further understand the genetic basis of these disease statuses, we identified independent significant SNPs, defined as SNPs with a genome-wide significant P-value (< 5 × 10⁻⁸) that are independent of each other at r² < 0.6. Furthermore, if the linkage disequilibrium (LD) blocks of independent significant SNPs were located close to each other (< 250 kb between the outermost SNPs of each LD block), they were merged into one genomic locus.
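
The locus-merging rule described above can be sketched as a simple interval merge. This is a minimal illustration under stated assumptions, not the authors' implementation: the function name `merge_ld_blocks`, the tuple representation of an LD block, and the example coordinates are all hypothetical, and all blocks are assumed to lie on the same chromosome.

```python
from typing import List, Tuple

# (start_bp, end_bp): leftmost and rightmost SNP positions of one LD block.
Block = Tuple[int, int]


def merge_ld_blocks(blocks: List[Block], max_gap: int = 250_000) -> List[Block]:
    """Merge LD blocks whose edge-to-edge distance is below max_gap base pairs."""
    loci: List[Block] = []
    for start, end in sorted(blocks):
        if loci and start - loci[-1][1] < max_gap:
            # Close to (or overlapping) the previous locus: extend its right edge.
            prev_start, prev_end = loci[-1]
            loci[-1] = (prev_start, max(prev_end, end))
        else:
            # Far enough away: start a new genomic locus.
            loci.append((start, end))
    return loci


# Hypothetical coordinates, for illustration only.
example_blocks = [
    (1_000_000, 1_050_000),
    (1_200_000, 1_260_000),  # 150 kb from the first block, so merged with it
    (2_000_000, 2_010_000),  # 740 kb from the merged locus, so kept separate
]
print(merge_ld_blocks(example_blocks))
# [(1000000, 1260000), (2000000, 2010000)]
```

Sorting by start position first means each block only needs to be compared with the most recently built locus, so the merge runs in a single pass.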

A computational framework for prioritizing the drug-gene interactions
We propose a computational framework for prioritizing drug-gene interactions relevant to COVID-19. This framework consists of four main components: gene prioritization, identification of drug categories, target functional characterization, and comprehensive docking analysis with literature review (Fig. 1). Gene prioritization uses five different methods to prioritize target genes relevant to COVID-19 phenotypes; details are provided in the supplementary materials. Drug categories are identified using gene-set enrichment analyses. Target functional characterization draws on genes that are differentially expressed in SARS-CoV-2-related RNA-seq profiles of cells and tissues, genes with significant TWAS results in lung or whole blood tissues, and genes involved in the top gene-set enrichment results. Finally, comprehensive docking analysis and literature review are used to identify highly druggable drug-gene interactions for COVID-19 drug repurposing.

Gene prioritization from COVID-19 GWAS loci
Our gene prioritization approach considers multiple facets of the evidence. We employed five state-of-the-art methods to discern target genes within each GWAS locus: 1) transcriptome-wide association study (TWAS); 2) Polygenic Priority Score (PoPS); 3) 3D chromosomal topology, using the Activity-by-Contact (ABC) model of enhancer-gene connections; 4) Capture Hi-C Omnibus Gene Score (COGS); 5) nearest gene with protein-protein interactions (PPIs) within each GWAS significant and suggestive locus. In-depth explanations for each method are provided in the supplementary materials.

Gene-Set Enrichment Analysis
We performed gene-set enrichment analysis to identify enriched pathways and functions and thereby explore potential drug categories from the gene prioritization results. We used GSEApy 15 and Enrichr 16,17 to analyze gene-set enrichment. Each gene set was evaluated with a combined score, obtained by multiplying the natural logarithm of the p-value computed using the Fisher exact test by the z-score of the deviation from the expected rank: c = ln(p) · z. This study used four gene-set libraries: 1) WikiPathways Human 2021 18, 2) Elsevier Pathway Collection by Enrichr 17, 3) Gene Ontology Molecular Function 2021 19, and 4) Gene Ontology Biological Process 19.
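
To make the combined score concrete, the sketch below computes a one-sided Fisher exact p-value (the hypergeometric upper tail) for the overlap between a gene list and a gene set, then multiplies its natural logarithm by a z-score, c = ln(p) · z. This is a minimal sketch rather than the GSEApy or Enrichr code. The function names and the example counts are hypothetical, and the z-score is passed in as a plain number because Enrichr derives it from expected ranks that are not reproduced here.

```python
from math import comb, log


def fisher_right_tail(overlap: int, set_size: int, list_size: int, universe: int) -> float:
    """One-sided Fisher exact p-value: P(X >= overlap) under the hypergeometric model.

    overlap   -- genes shared by the input list and the gene set
    set_size  -- genes in the gene set
    list_size -- genes in the input list
    universe  -- total number of background genes
    """
    upper = min(set_size, list_size)
    total = comb(universe, list_size)
    tail = sum(
        comb(set_size, k) * comb(universe - set_size, list_size - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


def combined_score(p_value: float, z_score: float) -> float:
    """Combined score c = ln(p) * z."""
    return log(p_value) * z_score


# Hypothetical counts and z-score, for illustration only.
p = fisher_right_tail(overlap=3, set_size=5, list_size=4, universe=20)
print(p, combined_score(p, z_score=-2.0))
```

Because ln(p) is negative for p < 1, a negative z-score (a rank better than expected) yields a positive combined score, so larger values indicate stronger enrichment.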

Functional Characterization of druggable targets relevant to COVID-19
The druggability of the target is critical for drug repurposing. The Drug Gene Interaction Database (DGIdb) 20 organizes genes with known drug interactions obtained from the literature or publicly available databases, as well as genes that may be druggable according to categories such as kinases, druggable genome, cell surface, and transcription factor. We used the "druggable genome" category to identify potentially druggable targets among the target genes from the gene prioritization results. Furthermore, it is crucial to characterize the function of the druggable target genes and to assess whether they are relevant to COVID-19 pathways or biological mechanisms. We utilized three functional resources: 1) differentially expressed genes (DEGs) from COVID-19-related RNA-seq profiles (Supplementary Table 7) or TWAS results from lung and whole blood tissues; 2) the enriched pathways from WikiPathways Human and Elsevier Pathway; 3) enriched gene sets from Gene Ontology. A functional target had to be present in at least two of these functional resources.
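
The "at least two functional resources" rule can be illustrated with a small counting sketch. The resource keys, the function name `genes_with_support`, and the placeholder gene identifiers below are hypothetical and do not reflect the study's actual gene lists.

```python
from collections import Counter
from typing import Dict, List, Set


def genes_with_support(resources: Dict[str, Set[str]], min_resources: int = 2) -> List[str]:
    """Return genes that appear in at least min_resources of the given resources."""
    counts = Counter(gene for genes in resources.values() for gene in set(genes))
    return sorted(gene for gene, n in counts.items() if n >= min_resources)


# Placeholder gene identifiers, for illustration only.
resources = {
    "deg_or_twas": {"GENE_A", "GENE_B", "GENE_C"},
    "enriched_pathways": {"GENE_A", "GENE_C"},
    "enriched_gene_ontology": {"GENE_A", "GENE_D"},
}
print(genes_with_support(resources))  # ['GENE_A', 'GENE_C']
```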

Comprehensive Molecular Docking
We performed molecular docking to prioritize drug-gene interactions, using the potential drugs identified through enrichment analysis and the functional druggable genes identified through GWAS. A structure-data file (SDF) containing the three-dimensional structure of each drug was obtained from the ChEMBL 21 database. The protein sequences of the functional druggable genes were retrieved from the UniProt 22 database and used as templates to build approximate structures in SDF format using SWISS-MODEL 23.
This study used the CB-Dock framework 24, which predicts the binding sites of a given protein using a novel curvature-based cavity detection approach and calculates the centers and sizes of these binding sites. Within the CB-Dock framework, we used QuickVina2 to perform a large-scale molecular docking analysis 25. QuickVina2 gains speed through heuristics that prevent unnecessary local searches. Since QuickVina2 only accepts inputs in the Protein Data Bank, Partial Charge (Q), and Atom Type (T) (PDBQT) format, we used OpenBabel 26 to convert SDF files to PDBQT format. The docking results were further processed to build a protein heatmap corresponding to the affinity score at the ligand docking positions, as described previously 27. Next, we updated all docking results in the DockCoV2 database 27.
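
The SDF-to-PDBQT conversion step could be scripted roughly as follows. This is a sketch under stated assumptions, not the authors' pipeline: it assumes the Open Babel command-line tool `obabel` is installed and on the PATH, relies on `obabel` inferring the input and output formats from the file extensions, and uses hypothetical helper names and file paths.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def obabel_command(sdf_path: str, out_dir: str = ".") -> List[str]:
    """Build an `obabel` command that writes <ligand>.pdbqt into out_dir.

    Formats are inferred by obabel from the .sdf and .pdbqt extensions.
    """
    out_path = Path(out_dir) / (Path(sdf_path).stem + ".pdbqt")
    return ["obabel", sdf_path, "-O", str(out_path)]


def convert_sdf_to_pdbqt(sdf_path: str, out_dir: str = ".") -> Path:
    """Run the conversion; raises if obabel is missing or exits with an error."""
    if shutil.which("obabel") is None:
        raise FileNotFoundError("obabel executable not found on PATH")
    cmd = obabel_command(sdf_path, out_dir)
    subprocess.run(cmd, check=True)
    return Path(cmd[-1])


# Hypothetical ligand file name; only the command is built here, nothing is executed.
example_cmd = obabel_command("ligands/LIGAND_X.sdf", "pdbqt")
print(" ".join(example_cmd))
```

Separating command construction from execution keeps the conversion easy to inspect or log before thousands of ligands are processed in a batch.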

Discussion
In this study, we developed a computational framework to translate GWAS findings into drug repositioning for the treatment of COVID-19. Our approach began by utilizing five state-of-the-art methods to identify 41 prioritized genes associated with COVID-19 infection and/or severity. We then refined this list by identifying 10 druggable gene targets among the 41 prioritized genes, based on their functions in DGIdb, COVID-19 DEGs, TWAS results from lung and whole blood tissues, and enriched gene sets. According to the enriched gene sets, we collected 553 drugs/ligands from five treatment categories: antibacterial, antiviral, antineoplastic, immunosuppressive, and anti-inflammatory. Our computational framework then identified nine drugs/ligands based on their mechanism of action or molecular docking, followed by a review of their known toxicities in the literature and drug databases. The nine potential drugs for COVID-19 treatment are Peginterferon alfa-2b, Interferon alfa-2b, Interferon beta-1b, Ruxolitinib, Dactinomycin, Rolitetracycline, Irinotecan, Vinblastine, and Oritavancin.
Of the nine drugs identified by our computational framework, four have clear drug-gene interactions: Peginterferon alfa-2b (DB00022), Interferon alfa-2b (DB00105), Interferon beta-1b (DB00068), and Ruxolitinib (DB08877). Interferon alfa-2b, peginterferon alfa-2b, and interferon beta-1b are associated with Type I interferon (IFN) responses, which are the primary defense against viral infection. Gene-set enrichment results showed that interferon receptor activity from Gene Ontology (GO:0004904, Odds Ratio: 332.58) and Type 1 interferon induction and signaling during SARS-CoV-2 infection (WP4868, Odds Ratio: 132.89) were highly enriched. TYK2 is a prioritized target gene in this study and is targeted by an approved immunosuppressant, Ruxolitinib. TYK2 and IL10RB are involved in Type III interferon signaling, and IL10RB is the top key regulator of COVID-19 host susceptibility 30.
Five highly druggable drug-gene interactions for COVID-19 drug repurposing were identified from functional analyses, relevant literature, and docking analysis (Table 3): Dactinomycin, Rolitetracycline, Irinotecan, Vinblastine, and Oritavancin. Dactinomycin binds IL10RB strongly (binding affinity: -13.5 kcal/mol) and has been widely used as an inhibitor of viral cellular transcription in infected cells 31. The combination of dactinomycin and sirolimus exhibited synergistic effects against the host proteins of SARS-CoV-2 32. Oritavancin binds strongly to OAS1 (binding affinity: -16.6 kcal/mol) and has also been suggested as a potential COVID-19 treatment option that inhibits cathepsin L and cathepsin B in host cells (late endosomal pathway) 33. Irinotecan is a topoisomerase I inhibitor that binds to the CCR9 protein (binding affinity: -17.1 kcal/mol) and has also been suggested as a potential candidate therapy to counter cytokine storms in critically ill COVID-19 patients 34. XCR1 has a strong binding affinity with vinblastine (binding affinity: -16.3 kcal/mol), and vinblastine showed a robust anti-SARS-CoV-2 response in human VeroE6 cells 35.
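
The strong-binding criterion used in this study (binding affinity < -10 kcal/mol) amounts to a threshold filter over docking results, sketched below. The four drug-gene affinities are those quoted in the paragraph above; the last entry is a hypothetical weak binder added only to show the filter excluding a pair, and the function name `strong_binders` is illustrative.

```python
from typing import List, Tuple

Result = Tuple[str, str, float]  # (drug, gene, binding affinity in kcal/mol)

THRESHOLD = -10.0  # strong binding: affinity below this value

docking_results: List[Result] = [
    ("Dactinomycin", "IL10RB", -13.5),
    ("Oritavancin", "OAS1", -16.6),
    ("Irinotecan", "CCR9", -17.1),
    ("Vinblastine", "XCR1", -16.3),
    ("LIGAND_X", "GENE_Y", -6.2),  # hypothetical weak binder, not from the study
]


def strong_binders(results: List[Result], threshold: float = THRESHOLD) -> List[Result]:
    """Keep pairs with affinity below the threshold, strongest (most negative) first."""
    hits = [r for r in results if r[2] < threshold]
    return sorted(hits, key=lambda r: r[2])


for drug, gene, affinity in strong_binders(docking_results):
    print(f"{drug:<14} {gene:<7} {affinity:6.1f} kcal/mol")
```

More negative affinities indicate stronger predicted binding, so sorting in ascending order puts the strongest predicted interactions first.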
The challenge in post-GWAS analysis is to identify the causal variants and their target genes within each GWAS locus. To address this challenge, many advanced approaches have been developed, such as fine-mapping, functional annotation, and gene prioritization 36,37. In this study, we applied state-of-the-art gene prioritization methods to prioritize target genes from COVID-19 GWAS. These approaches take different perspectives on prioritizing target genes: gene-based methods that use transcriptome gene expression coupled with genetic variants from GWAS; enhancer-promoter identification based on three-dimensional chromatin interaction data (e.g., Hi-C, Capture Hi-C) and open chromatin information; and similarity-based methods that incorporate data from publicly available RNA-seq datasets, curated biological pathways, and protein-protein interaction data.
Further cellular and animal experiments are needed to validate our findings. The percentage of drug mechanisms directly supported by human genetic studies increases throughout the drug development pipeline. Therefore, selecting genetically supported drug targets could improve the success rate of drug development and reduce the time and costs of developing new drugs 38. In summary, this study presents a bioinformatics framework for drug repurposing that translates GWAS findings and provides druggable candidates in immune reactions and host-viral interaction pathways. In addition to COVID-19, this computational framework can be applied to other GWASs for drug repurposing to speed up the discovery of potential therapeutics.
The computational framework for prioritizing drug-gene interactions from GWAS summary statistics.

Declarations
2.a), five from the Elsevier database (Fig. 2.b), five from the Gene Ontology (GO) molecular function category (Fig. 2.c), and 12 from the GO cellular component category (Fig. 2.d). The details of enriched gene sets are shown in Supplementary Tables 3

(GO:0004904, Odds Ratio: 332.58) and Type 1 interferon induction and signaling during SARS-CoV-2 infection (WP4868, Odds Ratio: 132.89) were highly enriched. A number of viruses, including SARS-CoV-2, have evolved mechanisms that enable them to evade the antiviral effects of Type I interferon (IFN-I) 28. Numerous experiments have been conducted both in vitro and in vivo to study the efficacy of IFN-I treatment against MERS-CoV and SARS-CoV 29. The Janus family kinases (JAK) and tyrosine kinase 2 (TYK2) are associated with the membrane-proximal part of the cytoplasmic domains of IFNAR1 and IFNAR2, respectively 28.

Table 1
Prioritized genes located in GWAS significant loci

Table 3
Highly druggable drug-gene interactions for COVID-19 drug repurposing, supported by several lines of evidence