Shared genetics between breast cancer and predisposing diseases identifies novel breast cancer treatment candidates

Background Current effective breast cancer treatment options have severe side effects, highlighting a need for new therapies. Drug repurposing can accelerate improvements to care, as FDA-approved drugs have known safety and pharmacological profiles. Some drugs for other conditions, such as metformin, an antidiabetic, have been tested in clinical trials for repurposing for breast cancer. Here, we exploit the genetics of breast cancer and linked predisposing diseases to propose novel drug repurposing. We hypothesize that if a predisposing disease contributes to breast cancer pathology, identifying the pleiotropic genes related to the risk of cancer could prioritize drug targets, among all drugs treating a predisposing disease. We aim to develop a method to not only prioritize drug repurposing, but also to highlight shared etiology explaining repurposing. Methods We compile breast cancer’s predisposing diseases from literature. For each predisposing disease, we use GWAS summary statistics to identify genes in loci showing genetic correlation with breast cancer. Then, we use a network approach to link these shared genes to canonical pathways, and similarly for all drugs treating the predisposing disease, we link their targets to pathways. In this manner, we are able to prioritize a list of drugs based on each predisposing disease, with each drug linked to a set of implicating pathways. Finally, we evaluate our recommendations against drugs currently under investigation for breast cancer. Results We identify 84 loci harboring mutations with positively correlated effects between breast cancer and its predisposing diseases; these contain 194 identified shared genes. Out of the 112 drugs indicated for the predisposing diseases, 76 drugs can be linked to shared genes via pathways (candidate drugs for repurposing). 
Fifteen out of these candidate drugs are already in advanced clinical trial phases or approved for breast cancer (OR = 9.28, p = 7.99e-03, one-sided Fisher’s exact test), highlighting the ability of our approach to identify likely successful candidate drugs for repurposing. Conclusions Our novel approach accelerates drug repurposing for breast cancer by leveraging shared genetics with its known risk factors. The result provides 59 novel candidate drugs alongside biological insights supporting each recommendation.


Background
Breast cancer is a major public health issue and the most common type of cancer in women worldwide (1). Current treatment options, such as chemotherapy, hormone therapy, and immunotherapy (2), while effective, often have severe side effects that impair patients' quality of life and adherence to treatment (3,4). In contrast, existing FDA-approved drugs have established safety and pharmacological profiles, which makes testing them for repurposing for breast cancer treatment an attractive strategy (5). For instance, metformin, the most commonly prescribed anti-diabetic drug, has shown anti-proliferative effects in pre-clinical studies and is currently undergoing clinical trials for breast cancer treatment. This surprising effect may be explained by its downstream activation of AMP-activated protein kinase (AMPK), a tumor suppressor critical in regulating cell proliferation (6,7). However, oncology clinical trials are slow, expensive and have up to 95% attrition rates, mainly due to a lack of understanding of the mechanism of action of the tested drug prior to the clinical trial (8-10). Therefore, new approaches are needed to both accelerate drug repurposing for breast cancer treatment and provide biological insights to design clinical trials with higher success rates.
Genome-wide association studies (GWAS) associate genes with a disease; drugs targeting those genes are twice as likely to successfully treat that disease (11-13). For example, a recent breast cancer GWAS found a strong signal in the ESR1 locus, which is the target of tamoxifen, a widely used drug for breast cancer treatment (14). This highlights the great potential of genetics in informing drug discovery and repurposing (15). However, the inherent limitations of GWAS, such as the need for large sample sizes and the challenge of identifying true causal genes, limit their ability to inform drug repurposing (16,17). This creates the need for innovative ways to build on genetic findings to discover drugs for diseases of high public importance, such as breast cancer.
To this end, we propose that shared genetics between breast cancer and other diseases can inform drug repurposing for breast cancer. This idea is supported by prior research. For one, it has been shown that clinically co-occurring diseases share genetics (18,19). Some diseases predispose individuals to breast cancer, and genetics may point to the biological processes driving this relationship. Consequently, drugs approved for such predisposing diseases and targeting shared biology with breast cancer might hold potential for breast cancer treatment. In further support of this concept, we have previously built on work showing that bearers of Mendelian disease mutations may suffer increased risk of certain cancers, due to a hidden role of the Mendelian disease genes in the comorbid cancer (19). In a follow-up study, we showed that the Mendelian disease genes implicated for a cancer are enriched for successful drug targets (20). Building on these observations, here we propose the repurposing of drugs based on shared genetic etiology. Ultimately, we propose 59 drugs that have not been previously tested for breast cancer, accompanied by biological insights supporting their potential therapeutic effect on breast cancer.

Genetics data
We download the most recent, publicly available GWAS summary statistics data of European ancestry for breast cancer (21), depression (22), high HDL (23), high LDL (23), prostate cancer (24), schizophrenia (25) and type 2 diabetes (26). For each GWAS, we keep only bi-allelic SNPs in autosomal chromosomes that match to unique rsIDs. We also harmonize the effect and alternate allele for each SNP using the 1000 Genomes reference panel of Europeans. The GWAS for schizophrenia included an imputation quality score for each SNP, and we filter for SNPs with imputation score ≥ 0.3.
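The SNP-level filters described above can be sketched in a few lines. This is a minimal illustration only, not the pipeline used in the study: the record layout (keys `rsid`, `chr`, `a1`, `a2`, `info`) and the function name `filter_snps` are assumptions, bi-allelic SNPs are approximated as pairs of distinct single-base alleles, and allele harmonization against the reference panel is not shown.

```python
AUTOSOMES = {str(c) for c in range(1, 23)}
BASES = {"A", "C", "G", "T"}


def filter_snps(rows, min_info=None):
    """Keep bi-allelic autosomal SNPs whose rsID occurs exactly once.

    rows: list of dicts with hypothetical keys rsid, chr, a1, a2 and,
    optionally, info (imputation quality score).
    """
    # Count rsID occurrences so that duplicated identifiers can be dropped.
    counts = {}
    for r in rows:
        counts[r["rsid"]] = counts.get(r["rsid"], 0) + 1

    kept = []
    for r in rows:
        a1, a2 = r["a1"].upper(), r["a2"].upper()
        if str(r["chr"]) not in AUTOSOMES:
            continue  # sex chromosomes, mitochondrial DNA, etc.
        if a1 not in BASES or a2 not in BASES or a1 == a2:
            continue  # indels or malformed alleles (assumed definition)
        if not r["rsid"].startswith("rs") or counts[r["rsid"]] != 1:
            continue  # missing or non-unique rsID
        info = r.get("info")
        if min_info is not None and info is not None and info < min_info:
            continue  # poorly imputed SNP
        kept.append(r)
    return kept
```

For a GWAS that reports imputation quality, such as the schizophrenia study, the call would be `filter_snps(rows, min_info=0.3)`; for the others the `min_info` argument can be left unset.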

Protein-protein interaction network
We download the STRING protein-protein interaction (PPI) network for humans (version 11.5, date of download: May 8, 2023). This network contains both physical and functional interactions of proteins and assigns a confidence score to each interaction (edge) that represents the amount of evidence supporting it. We use this score to filter for edges with high confidence (score ≥ 0.7). We also remove multiple edges and loops and map each gene-node to entrezIDs using a file provided by STRING ("9606.protein.aliases.v11.5.txt"). The final PPI network contains 16,115 nodes and 240,541 edges.
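As a rough illustration of the network clean-up, the sketch below builds an undirected adjacency map from scored edges. The function names and the input layout (tuples of two node identifiers and a confidence score already scaled to the 0-1 range) are assumptions; downloadable STRING files typically store the combined score on a 0-1000 scale, so a conversion step would be needed first, and the mapping to entrezIDs is not shown.

```python
def build_ppi(edges, min_score=0.7):
    """Build an undirected adjacency map from (node_a, node_b, score) tuples.

    Keeps only high-confidence edges, drops self-loops, and collapses
    duplicate edges by storing neighbors in sets.
    """
    adj = {}
    for a, b, score in edges:
        if a == b or score < min_score:
            continue
        adj.setdefault(a, set()).add(b)
        adj.setdefault(b, set()).add(a)
    return adj


def count_edges(adj):
    """Each undirected edge appears in two neighbor sets."""
    return sum(len(neighbors) for neighbors in adj.values()) // 2
```

Storing neighbors in sets makes the removal of multiple edges implicit, which keeps the later degree calculations consistent.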

Canonical pathways
We download a list of gene sets for canonical pathways from the MSigDB (version 2023.2). This is a list of gene sets manually curated by domain experts and includes information from five databases: BioCarta, KEGG (Kyoto Encyclopedia of Genes and Genomes), Reactome, PID (Pathway Interaction Database) and WikiPathways. For each canonical pathway, we filter for genes found in the STRING PPI network. Eventually, we have a list of 3,795 canonical pathways that we use to connect shared genes to drugs.

Drugs indicated for breast cancer predisposing diseases
For each of the six breast cancer predisposing diseases (depression, high HDL, high LDL, prostate cancer, schizophrenia, type 2 diabetes), we compile a list of FDA-approved drugs using RxNORM and annotate them with gene-targets from DrugBank (version 5.19, date of download: June 16, 2022). We do not filter the list of targets based on pharmacological action. Finally, we keep the drug gene-targets with unique entrezIDs that are found in the STRING PPI network.

Drugs investigated or indicated for breast cancer
To find drugs currently investigated for breast cancer, we download clinical trial data from the Aggregate Content of ClinicalTrials.gov (AACT) database in a pipe-delimited format (date of download: November 4, 2022; note that it is updated daily). AACT is a publicly available relational database that contains information about all the studies registered in ClinicalTrials.gov. We obtain information for 432,597 clinical trials that were registered in ClinicalTrials.gov by the date of download. Using a manually curated list of MeSH codes related to breast cancer (D001943, D000071960, D018270, D018275, D061325, D058922, D064726, D000069584), we keep 4,237 clinical trials that tested 774 drugs for breast cancer treatment.
To find drugs currently indicated for breast cancer, we obtain the indications of all FDA-approved drugs from RxNORM and filter for those that treat breast cancer using the MeSH terms mentioned above. We find 68 breast cancer indicated drugs and combine them with the investigated drugs to get a complete list of drugs for breast cancer. As before, all drugs are coded as DrugBank IDs.
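Filtering a pipe-delimited export by MeSH code can be sketched as follows. The column names `nct_id` and `mesh_id` and the function name are hypothetical placeholders rather than the actual AACT schema; only the MeSH codes are taken from the text above.

```python
import csv
import io

# MeSH codes related to breast cancer, as listed in the text.
BREAST_CANCER_MESH = {
    "D001943", "D000071960", "D018270", "D018275",
    "D061325", "D058922", "D064726", "D000069584",
}


def trials_for_mesh(pipe_text, mesh_ids):
    """Return the set of trial identifiers annotated with any given MeSH code.

    pipe_text: contents of a pipe-delimited table with a header row and
    the hypothetical columns nct_id and mesh_id.
    """
    reader = csv.DictReader(io.StringIO(pipe_text), delimiter="|")
    return {row["nct_id"] for row in reader if row["mesh_id"] in mesh_ids}
```

A trial annotated with several matching codes is counted once because the result is a set.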

Finding likely shared genes between breast cancer and its predisposing diseases
To identify shared genes between each pair of breast cancer and predisposing diseases, we use GWAS summary statistics data and LOGODetect (default settings) to perform a local genetic correlation analysis. LOGODetect scans the entire genome to identify loci with correlated SNP effects in two GWAS, accounting for linkage disequilibrium. Following the identification of shared loci for each disease pair, we use the FUMA SNP2GENE function to compile a list of protein-coding genes located within, or up to 10 kb upstream or downstream of, the positively correlated loci (default setting). We do this by providing to FUMA SNP2GENE a list of all SNPs within the identified loci, not only the genome-wide significant ones. Eventually, for each breast cancer and predisposing disease pair, we obtain a list of protein-coding genes (entrezIDs) that are part of their shared etiology (termed shared genes).
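The positional mapping step can be approximated by a simple interval overlap with a 10 kb window. The sketch below is a simplified stand-in and not FUMA's implementation (FUMA maps individual SNPs to genes); the tuple layouts and the function name `genes_in_loci` are assumptions.

```python
def genes_in_loci(loci, genes, window=10_000):
    """Return IDs of genes overlapping any locus extended by `window` bp.

    loci:  iterable of (chrom, start, end)
    genes: iterable of (gene_id, chrom, start, end)
    """
    shared = set()
    for chrom, start, end in loci:
        lo, hi = start - window, end + window
        for gene_id, g_chrom, g_start, g_end in genes:
            # Two intervals overlap when each starts before the other ends.
            if g_chrom == chrom and g_start <= hi and g_end >= lo:
                shared.add(gene_id)
    return shared
```

The nested loop is adequate for a few hundred loci; an interval tree would be a natural replacement for genome-scale inputs.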
We then seek to prioritize genes that are more likely to contribute to the shared biology for each disease pair. To achieve this, we utilize two commonly used tools: MAGMA (gene-based) and S-MultiXcan (cis-eQTL based). First, we run MAGMA with default settings to obtain gene-based p-values for each disease and adjust them for multiple hypothesis testing using the Benjamini-Hochberg correction method. Second, we use S-MultiXcan to obtain the association of gene regulation with disease risk, summarized across 43 GTEx tissues. Then, for each disease pair, we filter the previously identified shared genes for those that are significantly associated with the predisposing disease (MAGMA p.adjusted < 0.05) and are predicted to be dysregulated in the same direction in both diseases (sign of the S-MultiXcan z-score). Ultimately, these two filtering steps, one gene-based and one cis-eQTL-based, help us prioritize shared genes more likely to contribute to the shared biology of a breast cancer and predisposing disease pair.
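The two filters can be illustrated with a small sketch that takes gene-level results as plain dictionaries. It does not run MAGMA or S-MultiXcan; the input dictionaries (gene to p-value, gene to z-score) and the function names are assumptions, and the Benjamini-Hochberg adjustment is the standard step-up procedure.

```python
def benjamini_hochberg(pvals):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def prioritize(shared_genes, magma_p, z_breast, z_predisposing, alpha=0.05):
    """Keep shared genes that pass both filters.

    magma_p:        gene -> MAGMA p-value for the predisposing disease
    z_breast:       gene -> S-MultiXcan z-score for breast cancer
    z_predisposing: gene -> S-MultiXcan z-score for the predisposing disease
    """
    genes = list(magma_p)
    padj = dict(zip(genes, benjamini_hochberg([magma_p[g] for g in genes])))
    kept = []
    for g in shared_genes:
        if g not in padj or g not in z_breast or g not in z_predisposing:
            continue  # gene not tested by one of the tools
        same_direction = z_breast[g] * z_predisposing[g] > 0
        if padj[g] < alpha and same_direction:
            kept.append(g)
    return kept
```

The adjustment is applied across all genes tested by MAGMA before restricting to shared genes; adjusting only within the shared genes would be a different, less conservative choice.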

Connecting shared genes to drugs via canonical pathways
Our goal here is to link the shared genes with drugs and gain insights into the implicated biological pathways. To do so, we first find which canonical pathways are significantly connected to the shared genes using the STRING PPI network and the Personalized PageRank (PPR) algorithm. PPR uses a user-defined gene, named the seed gene, as a starting point and performs random walks in the PPI network, with a probability (default = 0.8) of returning to the seed gene after each step. This process assigns a score to each gene based on its connectivity with the seed gene: genes with higher connectivity receive higher scores. Using each shared gene for a disease pair as a seed gene, we calculate the average PPR score of genes within a canonical pathway. To determine if the observed score is higher than expected by chance, we compare it to random PPR scores obtained through 1,000 permutations (more details in "Statistical analysis - Permutation tests"). Ultimately, for each breast cancer and predisposing disease pair, this process yields a list of canonical pathways that are significantly connected (p_permutation < 0.05) to the previously identified shared genes (referred to as shared canonical pathways).
We then seek to link the shared canonical pathways to drugs approved for a predisposing disease. To do so, we repeat the above analysis, but this time we use a target of a drug as a seed gene (instead of a shared gene). Similarly, we calculate the average PPR score of genes within a shared canonical pathway and compare it to average PPR scores obtained after 1,000 permutations. Drugs approved for a predisposing disease and significantly connected to a shared canonical pathway (p_permutation < 0.05) are considered to target the shared etiology with breast cancer. Therefore, these drugs constitute the prioritized candidate drugs for repurposing for breast cancer, and the shared genes and canonical pathways they target provide biological insights for their effect on breast cancer.
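A single-seed Personalized PageRank and the pathway-level average can be sketched with plain power iteration. This is an illustrative implementation under stated assumptions, not the software used in the study: the network is an undirected adjacency map in which every neighbor is also a key, `restart` is interpreted as the probability of jumping back to the seed at each step, and the function names are hypothetical.

```python
def personalized_pagerank(adj, seed, restart=0.8, tol=1e-12, max_iter=1000):
    """Stationary visiting probabilities of a random walk with restart.

    adj: dict mapping each node to the set of its neighbors (undirected).
    """
    score = {node: 0.0 for node in adj}
    score[seed] = 1.0
    for _ in range(max_iter):
        nxt = {node: 0.0 for node in adj}
        nxt[seed] = restart  # mass that jumps back to the seed
        for node, mass in score.items():
            if mass == 0.0:
                continue
            neighbors = adj[node]
            if not neighbors:
                nxt[seed] += (1 - restart) * mass  # isolated node: return to seed
                continue
            share = (1 - restart) * mass / len(neighbors)
            for neighbor in neighbors:
                nxt[neighbor] += share
        delta = sum(abs(nxt[n] - score[n]) for n in adj)
        score = nxt
        if delta < tol:
            break
    return score


def pathway_score(adj, seed, pathway_genes):
    """Average PPR score of the pathway genes present in the network."""
    scores = personalized_pagerank(adj, seed)
    present = [g for g in pathway_genes if g in scores]
    return sum(scores[g] for g in present) / len(present) if present else 0.0
```

With a high restart probability most of the score stays on the seed and its immediate neighborhood, so pathway averages mainly reflect genes within a few steps of the seed. Whether the seed itself should count when it belongs to the pathway is not specified in the text and is left as is here.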

Statistical analysis
Evaluating candidate drugs for repurposing
We find 76 unique drugs approved for a predisposing disease that can be linked with its shared biology with breast cancer. Using these drugs alongside those currently in clinical trials or approved for breast cancer treatment, we test for significant overlap using the one-sided Fisher's exact test.
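The overlap test can be written directly from the hypergeometric distribution. The sketch below is a generic one-sided Fisher's exact test on a 2x2 table; the function name is hypothetical, and the odds ratio it returns is the simple sample cross-product ratio, which is an assumption and may differ from a conditional maximum-likelihood estimate reported by other software.

```python
from math import comb


def fisher_exact_greater(a, b, c, d):
    """One-sided (greater) Fisher's exact test for the table [[a, b], [c, d]].

    Returns (sample odds ratio, p-value), where the p-value is the
    probability of observing at least `a` in the top-left cell given
    fixed row and column totals.
    """
    n = a + b + c + d
    row1 = a + b
    col1 = a + c
    denom = comb(n, col1)
    max_a = min(row1, col1)
    p = sum(comb(row1, k) * comb(n - row1, col1 - k) for k in range(a, max_a + 1)) / denom
    odds = (a * d) / (b * c) if b * c else float("inf")
    return odds, p


# Illustrative table, assuming the background is the 112 drugs approved for
# the predisposing diseases: rows = candidate (76) vs. non-candidate (36),
# columns = tested or approved for breast cancer (16) vs. not (96).
odds_ratio, p_value = fisher_exact_greater(15, 61, 1, 35)
```

Because the exact contingency table and odds-ratio estimator behind the reported statistics are not spelled out in this section, the values produced by this illustrative table should not be expected to match them exactly.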

Permutation tests
To assess the significance of the observed associations between shared genes, drugs and canonical pathways, we conduct permutation tests.
First, for each breast cancer and predisposing disease pair, we connect each canonical pathway to the identified shared genes. To do so, we use each shared gene as a seed gene and we calculate the average PPR score for the genes in a canonical pathway. To determine the significance of these connections, we create a null distribution of average PPR scores for genes within a canonical pathway. This involves grouping all genes in the PPI network into four bins based on degree quantiles. Subsequently, for each seed (shared) gene, we randomly select 1,000 genes from the same degree bin and, using each as the seed, calculate the average PPR score for genes within the canonical pathway. These scores are then compared to the observed average PPR score, and a permuted p-value is calculated based on how often the observed score is lower than the permuted scores. This process yields a permuted p-value for every canonical pathway-shared gene pair, which is then adjusted for the number of shared genes and canonical pathways tested using the Bonferroni correction method.
Second, we link the shared canonical pathways to drugs approved for a predisposing disease to prioritize candidate drugs for repurposing. To do so, we follow a similar approach as above, but this time we use random genes matched to the gene-target of each drug using the four degree bins. This process yields a permuted p-value for every shared canonical pathway-drug target pair, which is then adjusted for the number of drug targets tested using the Bonferroni correction method.
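The degree-matched permutation scheme can be sketched as follows. Several details are assumptions made for illustration: bins are formed by rank so that they are of roughly equal size (ties in degree may therefore be split across bins), random seeds are drawn with replacement, the empirical p-value uses the common add-one correction, and `score_fn` stands for any function that returns the average pathway PPR score for a given seed gene.

```python
import random


def degree_bins(adj, n_bins=4):
    """Assign each gene to one of n_bins rank-based degree bins."""
    ranked = sorted(adj, key=lambda g: (len(adj[g]), g))
    return {g: min(i * n_bins // len(ranked), n_bins - 1) for i, g in enumerate(ranked)}


def permutation_p(observed, seed, adj, score_fn, n_perm=1000, rng=None):
    """Empirical p-value of `observed` against degree-matched random seeds."""
    rng = rng or random.Random(0)
    bins = degree_bins(adj)
    pool = [g for g in adj if bins[g] == bins[seed] and g != seed]
    if not pool:
        raise ValueError("no degree-matched genes available for this seed")
    null = [score_fn(rng.choice(pool)) for _ in range(n_perm)]
    exceed = sum(s >= observed for s in null)
    return (exceed + 1) / (n_perm + 1)  # add-one correction (assumption)


def bonferroni(p, n_tests):
    """Bonferroni-adjusted p-value, capped at 1."""
    return min(1.0, p * n_tests)
```

Matching random seeds on degree guards against hub genes appearing connected to every pathway simply because they have many interaction partners.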

Overview of approach
Health conditions including high cholesterol and type 2 diabetes confer an increased risk for breast cancer. Previous genetics research has supported a biological explanation: these diseases, which we will refer to as predisposing diseases, share genetic variation with breast cancer. Building on this, we hypothesize that finding the shared genetics between breast cancer and its predisposing diseases can help us discover new drugs for breast cancer. Specifically, we hypothesize that drugs approved for a predisposing disease and targeting its shared biology with breast cancer can treat the latter disease (Fig. 1A).
To test this hypothesis, we first search the scientific literature (epidemiological and statistical genetic studies) for diseases with genetic variation known to predispose individuals to breast cancer. We find six such diseases (Table 1).
Next, we aim to identify the shared genetics between each pair of breast cancer and a predisposing disease and connect them to drugs approved for the predisposing disease. We first use publicly available GWAS summary statistics data (Table 1) and a local genetic correlation analysis to find the shared genetics (Fig. 1B-D). Then, for each predisposing disease, we use a network biology method to link the shared genes to canonical pathways, and similarly, for all drugs treating the predisposing disease, we link their targets to the pathways (Fig. 1E). By finding drugs that target shared pathways, we both prioritize candidate drugs for repurposing for breast cancer and provide biological insights that support their effect in disease treatment.
Finally, we evaluate our list of candidate drugs. To do so, we compile a list of 583 drugs either in clinical trials (N = 451) or approved (N = 132) for breast cancer and test for enrichment within our candidate drugs.

Discovering shared genetics between breast cancer and its predisposing diseases
The first step in testing our hypothesis is to identify genomic loci likely to drive the shared etiology between breast cancer and a predisposing disease. Although cross-trait Linkage Disequilibrium Score Regression (LDSC) is a widely adopted method to identify genetic correlation between a pair of phenotypes using GWAS summary statistics, its genome-wide nature means it does not provide insight into particular genes driving the correlation. To gain that insight, we perform a local genetic correlation analysis for each pair of breast cancer and predisposing disease, using LOGODetect (32). Across all disease pairs, we identify 59 negatively (per disease pair: median = 5.5; min = 0; max = 34) and 84 positively (per disease pair: median = 10; min = 5; max = 37) correlated genomic loci (Fig. 2A, Supplementary Tables 1 & 2). Notably, the identified loci are distributed across all autosomal chromosomes and not localized in specific parts of the genome (Fig. 2B). To confirm that LOGODetect does not discover shared loci for disease pairs without known epidemiological or clinical associations, such as high LDL-depression and high LDL-schizophrenia, we repeat the analysis for these pairs. LOGODetect does not identify any significantly correlated loci.
Next, for each predisposing disease, we seek to prioritize genes in the correlated genomic loci that are the most likely drivers of shared etiology with breast cancer. Since our ultimate goal is to recommend candidate drugs for repurposing for breast cancer, we are more interested in genes exhibiting effects in the same direction in both breast cancer and the predisposing disease. Therefore, for the downstream analysis, we focus on genomic loci found to be positively correlated for each disease pair. Using the SNP2GENE function from FUMA (33), we extract a total of 194 protein-coding genes that fall within all the positively correlated loci (median of 37 protein-coding genes per disease pair; min = 8; max = 81). For each disease pair, these genes constitute the list of shared genes that likely drive the shared etiology with breast cancer.
From this list, we seek the subset most likely to contribute to druggable shared etiology. That is, we wish to prioritize genes more likely to 1) participate in shared pathophysiological processes and 2) relate to the effect of drugs on the predisposing disease. First, we use MAGMA, a tool that aggregates the effect of all SNPs within a gene, to keep genes significantly associated with the predisposing disease in every disease pair (34). This filter is important as the predisposing disease increases the risk for breast cancer, implying that any genes underlying that shared risk should, at a minimum, impact the predisposing disease. Second, we use S-MultiXcan, a tool that uses GWAS summary statistics data to associate genetic variation impacting gene regulation with disease risk, summarized across 49 GTEx tissues (35).
We keep only genes with dysregulation consistently aligned in the same direction (either downregulated or upregulated) in both diseases in a disease pair, as dysregulation in opposite directions is not easily interpretable and not useful for therapeutic purposes. For instance, ABCA1 is found to be one of the genes shared between high LDL and breast cancer. Probucol, an approved anti-cholesterol drug, inhibits ABCA1. However, ABCA1 is upregulated in high LDL and downregulated in breast cancer, suggesting that inhibition of this gene might not lead to the desired treatment outcome for breast cancer (36). Figure 2C shows the number of shared genes within positively correlated loci between breast cancer and each predisposing disease, before and after applying the aforementioned filters. The full list of shared genes for each disease pair is provided in Supplementary Table 3. Notably, the list includes genes that have also been identified by other independent studies as being shared between disease pairs, such as the GATAD2A gene for breast cancer and schizophrenia (37).

Connecting shared genetics to drugs
The next step is to connect the identified shared genes to drugs treating the predisposing diseases. We make this connection through canonical pathways. In that way, we can capture the biological processes which 1) are involved with identified shared genes and 2) are related to drugs indicated for the predisposing disease. First, for each breast cancer and predisposing disease pair, we connect the identified shared genes to canonical pathways using the STRING protein-protein interaction network and a network propagation algorithm (Supplementary Table 4). Notably, our workflow identifies pathways with known roles in tumorigenesis and progression that are also disrupted in the breast cancer predisposing diseases (Figure S1). For instance, we identify the WNT signaling pathway to be part of the shared etiology between breast cancer (38) and each of high HDL (39), prostate cancer (40) and type 2 diabetes (41). Additionally, the mTOR signaling pathway, another cancer-related pathway (42,43), is found to be shared between breast cancer and high LDL (44).
After connecting the shared genes for every disease pair to pathological processes, we seek to find which drugs target them. To do so, we use a similar approach and connect targets of a drug currently approved for a predisposing disease to the shared canonical pathways between breast cancer and that disease (Fig. 3). Drugs significantly connected to at least one shared canonical pathway are considered candidate drugs for repurposing for breast cancer.

Evaluation of drug repurposing and prioritization of new candidates
The final step is to assess the efficacy of our approach in identifying promising candidate drugs for repurposing for breast cancer. To do so, we compare our list of candidate drugs to those currently under investigation or approved for breast cancer treatment. In total, out of 112 approved drugs for the six breast cancer predisposing diseases, 16 have undergone testing or received approval for breast cancer treatment (Fig. 4A). Remarkably, our method identifies 15 out of these 16 drugs, while also offering insights into the specific genes and pathways that might explain their effect on breast cancer (Supplementary Table 5). When considering the recommended candidate drugs for repurposing, we find a significant enrichment for drugs currently investigated or approved for breast cancer (OR = 9.28, p = 7.99e-03, one-sided Fisher's exact test).
Our workflow identifies HMGCR to be shared between high LDL and breast cancer. HMGCR is also the target of statins, a group of drugs approved for lowering LDL blood levels. Therefore, to ensure that the significance of our results is not driven by this instance where a shared gene is also the target of an approved drug for the predisposing disease, we repeat the analysis excluding the high LDL-breast cancer pair. Again, we find strong enrichment (OR = 9.35, p = 9.66e-03, one-sided Fisher's exact test), highlighting the ability of our approach to connect shared biology to drugs even when a shared gene is not directly targeted by a drug.
In conclusion, we recommend 76 candidate drugs for repurposing while providing biological insights (specific genes and pathways) supporting their potential use in breast cancer treatment. Among these candidates, 15 are already in advanced clinical trial phases or approved for breast cancer treatment, while 59 are novel candidate drugs (Fig. 4B, Supplementary Table 5).
In our previous work, we exploited monogenic diseases that predispose their bearers to complex diseases, showing that drugs targeting the causal gene of the monogenic disease are good candidates for treating the associated complex disease (20). With that study, we showed that the genetics of health conditions predisposing to a complex disease can inform drug repurposing for the complex disease. Building on that work, here we develop an approach to investigate whether complex disease risk factors that predispose individuals to another complex disease can inform the repurposing of existing drugs for the latter. To test this, we use breast cancer as an example of a well-studied and common complex disease with many predisposing risk factors. We show that shared genes and pathways between breast cancer and its predisposing diseases can point to promising candidate drugs for repurposing for breast cancer treatment. Importantly, our recommended candidate drugs are enriched for those currently undergoing clinical trials or already approved for breast cancer treatment, highlighting the ability of our approach to identify likely successful candidate drugs for repurposing.
Pleiotropy has been previously used for the identification of shared biology between diseases from the same body systems, such as psychiatric and autoimmune diseases, suggesting a potential role in drug repurposing (45,46). However, no studies have used pleiotropy to inform drug repurposing for diseases of different body systems. In contrast, our study aims to identify shared genetic factors between breast cancer and diverse predisposing diseases, such as depression and high HDL. Despite affecting distinct anatomical regions, we find that these disease pairs share pathophysiological processes, which suggests that the association is due to pleiotropic effects of genes rather than obvious physiological similarity between the phenotypes. Our approach identifies these pleiotropic genes using a local genetic correlation analysis and links them to canonical pathways; we then use these pathways to prioritize candidate drugs for repurposing. With that, our study is the first to propose an alternative way to analyze publicly available GWAS summary data that can both suggest new uses for existing drugs and point towards the genes and pathways supporting drug repurposing.
We present an example to illustrate the power of our approach in both identifying shared biology between diseases and providing a biological basis for the repurposing of a drug. It is known that elevated levels of HDL increase the risk for breast cancer (28). By analyzing the genetics of this pair of diseases, we identify MLXIPL as a shared gene. We also find that MLXIPL is significantly connected to the FOXA2 pathway. Interestingly, the FOXA2 pathway is known to play a role in both breast cancer pathogenesis and lipid metabolism (pleiotropic effect) (47,48). Among the approved lipid-lowering medications, we find two fibric acids (fenofibrate and gemfibrozil) and one statin (rosuvastatin) significantly connected to the FOXA2 pathway. Notably, both drug categories have demonstrated anticancer properties in preclinical and clinical studies, respectively (49,50). This example shows that our approach can identify meaningful biological signals. It also shows that by leveraging the shared biology between breast cancer and its predisposing risk factors, we can prioritize candidate drugs for repurposing while also providing plausible biological mechanisms through which these drugs may impact breast cancer.
Our approach has some limitations. First, we define shared genes as those located within the detected, positively correlated shared genomic loci for each disease pair. However, a SNP in a shared locus might have distal regulatory effects on genes located outside that locus, and including those genes could potentially result in a more complete list of shared genes for each disease pair. Second, we rely on pleiotropy to identify shared genetic factors between breast cancer and its predisposing diseases. However, it is possible that a disease might increase the risk for breast cancer through indirect or interaction effects, which are not captured by our approach. Third, redundant canonical pathways in the Molecular Signatures Database (MSigDB) could result in the identification of fewer significantly shared pathways between diseases than their actual number due to multiple testing corrections. Fourth, we analyzed GWAS summary statistics data only from the European population, due to greater sample sizes and data availability. Future studies analyzing data from diverse populations may discover shared loci missing from our analysis.

Conclusions
In conclusion, our approach detects and leverages the shared biology between breast cancer and each of its predisposing diseases to suggest novel drugs for breast cancer treatment, while also providing a biological basis for each drug recommendation. Future work can exploit our list of candidate drugs for repurposing and evaluate their efficacy in experimental settings. While we specifically applied our approach to the case of breast cancer, we expect that it can be applied to recommend novel candidate drugs for repurposing for any complex disease with a known set of predisposing diseases and available GWAS summary statistics. Therefore, it serves as a valuable tool for not only identifying promising candidate drugs for repurposing across various complex diseases, but also suggesting the disease networks involved in repurposing.

Figures
Figure 1 Outline of the approach. A. Tested hypothesis: a drug that treats a health condition known to increase the risk for breast cancer and targets the shared biology of both conditions may also treat breast cancer. B-E. Proposed workflow to identify the most likely shared genes between breast cancer and a predisposing disease and connect them to drugs approved for the predisposing disease.

Drugs treating high HDL and targeting its shared biology with breast cancer. These drugs are recommended for repurposing for breast cancer and the shared pathways they target provide a biological basis for the repurposing. Only canonical pathways significantly linked to both identified shared genes and high HDL drugs are shown.

Figure 2 Shared genetics between breast cancer and its predisposing diseases. A. Number of negatively and positively correlated loci between each breast cancer and predisposing disease pair. B. Position in the genome of the identified correlated loci for each breast cancer and predisposing disease pair. The y-axis represents the -log10(q-value) of each locus as provided by LOGODetect. C. Number of protein-coding genes located within the positively correlated loci for each breast cancer and predisposing disease pair, before and after applying the MAGMA & S-MultiXcan filters.

Table 1
Evidence for shared etiology of predisposing conditions, and genome-wide association study (GWAS) summary statistics for breast cancer and its six predisposing diseases used in this study. All GWAS include samples of European ancestry.
BCAC: Breast Cancer Association Consortium; PGC: Psychiatric Genomics Consortium; UKBB: United Kingdom BioBank; GLGC: Global Lipids Genetics Consortium; PRACTICAL: Prostate Cancer Association Group to Investigate Cancer-Associated Alterations in the Genome; DIAMANTE: Diabetes Meta-Analysis of Trans-Ethnic association studies