Shared etiology of Mendelian and complex disease supports drug discovery

Background Drugs targeting disease causal genes are more likely to succeed for that disease. However, complex disease causal genes are not always clear. In contrast, Mendelian disease causal genes are well-known and druggable. Here, we seek an approach to exploit the well characterized biology of Mendelian diseases for complex disease drug discovery, by exploiting evidence of pathogenic processes shared between monogenic and complex disease. One way to find shared disease etiology is clinical association: some Mendelian diseases are known to predispose patients to specific complex diseases (comorbidity). Previous studies link this comorbidity to pleiotropic effects of the Mendelian disease causal genes on the complex disease. Methods In previous work studying incidence of 90 Mendelian and 65 complex diseases, we found 2,908 pairs of clinically associated (comorbid) diseases. Using this clinical signal, we can match each complex disease to a set of Mendelian disease causal genes. We hypothesize that the drugs targeting these genes are potential candidate drugs for the complex disease. We evaluate our candidate drugs using information of current drug indications or investigations. Results Our analysis shows that the candidate drugs are enriched among currently investigated or indicated drugs for the relevant complex diseases (odds ratio = 1.84, p = 5.98e-22). Additionally, the candidate drugs are more likely to be in advanced stages of the drug development pipeline. We also present an approach to prioritize Mendelian diseases with particular promise for drug repurposing. Finally, we find that the combination of comorbidity and genetic similarity for a Mendelian disease and cancer pair leads to recommendation of candidate drugs that are enriched for those investigated or indicated. Conclusions Our findings suggest a novel way to take advantage of the rich knowledge about Mendelian disease biology to improve treatment of complex diseases.


Background
The traditional drug development pipeline is costly and slow. It is estimated that around $2.6 billion and approximately 12-15 years are required for just one new drug to reach the market (1,2). Additionally, clinical trial success rates remain low (3). Therefore, there is a pressing need for new approaches to predict which drugs will succeed.
Recently, genetics has emerged as a resource for predicting drug success. Genome-wide association studies (GWAS) have identified genetic variants associated with complex diseases in genes that are also known therapeutic drug targets (4). For instance, mutations in the IL23R locus have been associated with Crohn's disease (5). Ustekinumab, a monoclonal antibody originally approved for the treatment of psoriasis, targets the IL23 p-40 subunit (6). Based on genetics, ustekinumab was successfully repurposed for Crohn's disease (7)(8)(9)(10). This example highlights the importance of genetics in both target prioritization and drug discovery. Nelson et al. analyzed historical data of clinical trials and found that drug uses supported by human genetic evidence are twice as likely to succeed in clinical trials (11). Four years later, King et al. confirmed these findings by analyzing data not available at the time of the Nelson et al. study, further emphasizing the importance of genetics in the drug development process (12). Moreover, King et al. found that the success rate of a drug is higher when the disease causal gene is clearly identified, as in the case of monogenic (Mendelian) diseases. However, identifying the disease causal genes from GWAS can be challenging, as the majority of GWAS hits are located in non-coding regions (13). In contrast, in monogenic (Mendelian) diseases, the causal genes are both well-known and druggable (14). Thus, developing a way to translate knowledge about Mendelian disease biology to complex diseases could have a significant impact on their treatment.
We have previously exploited clinical data to discover associations between Mendelian and complex diseases (15,16). In a systematic analysis of 90 Mendelian and 65 complex diseases, Blair et al. used health records to identify which complex diseases individuals with a Mendelian disease are predisposed to, finding 2,908 clinically associated (comorbid) pairs of Mendelian and complex diseases (15). That study showed evidence that comorbidity can be tied to pleiotropic effects of disease genes. In a follow-up study focusing on cancers, Melamed et al. showed that Mendelian disease causal genes are likely to be frequently mutated in comorbid cancers (16). For instance, patients with Rubinstein-Taybi syndrome, a Mendelian disease caused by mutations in CREBBP (17), are predisposed to lymphoma (18). CREBBP is one of the most frequently inactivated genes in lymphoma (19), meaning that the observed comorbidity can be attributed to a pleiotropic effect of CREBBP mutation, which causes Rubinstein-Taybi syndrome and contributes to lymphoma. Both of the above studies systematically demonstrate that Mendelian disease comorbidity can suggest a role of Mendelian causal genes in complex disease. However, Mendelian disease comorbidity has not previously been used for complex disease drug discovery. Building on these findings, we hypothesize that if the Mendelian disease causal genes contribute to the development of a comorbid complex disease, then these genes can be novel therapeutic targets for that disease (Fig. 1A).

Methods
Recommending drugs for complex disease based on Mendelian disease comorbidities
We download results assessing comorbidity between 95 Mendelian diseases and 65 complex diseases from the supplementary materials of Blair et al. (available online) (15). Our goal is to find drugs targeting Mendelian disease causal genes and recommend them as candidate drugs for the comorbid complex disease. Therefore, we remove 5 Mendelian diseases (and their comorbidities) that are due to chromosomal abnormalities and for which the causal gene is not obvious (Down Syndrome, Edward Syndrome, Klinefelter Syndrome, Patau Syndrome, Turner Syndrome). We use the remaining 2,908 comorbidity pairs between 90 Mendelian diseases and 65 complex diseases in our main analysis. Additionally, we group the 65 complex diseases into 6 major disease categories: cardiovascular (4), hormonal (8), immune (19), neoplasms (14), neurological (15), ophthalmological (5).
We systematically code all 65 complex diseases using MeSH codes. To do this, we download the supplementary "Table S2" from Blair et al. (15) (available online), which contains the ICD-10 billing codes that the authors used to identify the complex diseases. We manually match them to relevant MeSH codes. This yields 230 unique MeSH codes for the 65 complex diseases (median of 3 MeSH codes per complex disease).
We download information for 5,800 drugs from DrugBank (version 5.19, date of download: June 16, 2022, https://www.drugbank.com/). After filtering for gene targets in humans, for each drug, we keep its DrugBank ID and its gene targets (HGNC symbols). Drugs included in this file can be approved, investigational, small molecule, biotech, experimental, nutraceutical, illicit, or withdrawn. We do not filter the list of drug-gene targets based on pharmacological action.
To suggest drug repurposing candidates for a complex disease, we first find its comorbid Mendelian diseases. We obtain the genes causally associated with these Mendelian diseases from OMIM. Using drug-gene target information from DrugBank, we find drugs targeting the Mendelian disease causal genes and suggest them as candidate drugs for the complex disease.
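The recommendation step above amounts to a pair of set lookups. The following is a minimal sketch for illustration only, not the code used in this study. The three dictionaries are toy stand-ins for the comorbidity table, the OMIM gene lists, and the DrugBank target lists; their contents are a small excerpt based on the Type 1 Diabetes example in the Discussion, and `hypothetical_drug_x` and its target are invented placeholders.

```python
# Toy inputs (assumptions for illustration; the study uses Blair et al.,
# OMIM, and DrugBank as the real sources).
comorbid_mendelian = {  # complex disease -> comorbid Mendelian diseases
    "Type 1 Diabetes": {
        "Long QT Syndrome", "Spinocerebellar Ataxia", "Erythromelalgia",
    },
}
causal_genes = {  # Mendelian disease -> causal genes
    "Long QT Syndrome": {"CACNA1C", "SCN5A"},
    "Spinocerebellar Ataxia": {"CACNA1A"},
    "Erythromelalgia": {"SCN9A"},
}
drug_targets = {  # drug -> gene targets
    "verapamil": {"CACNA1C", "CACNA1A"},
    "carbamazepine": {"SCN5A", "SCN9A"},
    "hypothetical_drug_x": {"GENE_NOT_MENDELIAN"},  # invented placeholder
}


def candidate_drugs(complex_disease):
    """Return {drug: supporting Mendelian genes} for one complex disease."""
    # Pool the causal genes of every comorbid Mendelian disease.
    genes = set()
    for mendelian in comorbid_mendelian.get(complex_disease, ()):
        genes |= causal_genes.get(mendelian, set())
    # A drug is a candidate if it targets at least one pooled gene.
    candidates = {}
    for drug, targets in drug_targets.items():
        shared = targets & genes
        if shared:
            candidates[drug] = shared
    return candidates


result = candidate_drugs("Type 1 Diabetes")
```

Keeping the supporting genes alongside each drug makes it easy to trace a recommendation back to the Mendelian diseases that motivated it.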

Finding investigated drugs for the complex diseases
To find drugs currently investigated for the 65 complex diseases in our sample, we download clinical trial data from the Aggregate Analysis of ClinicalTrials.gov (AACT) database in a pipe-delimited format (date of download: November 4, 2022; note that it is updated daily) (31). AACT (https://aact.ctticlinicaltrials.org/) is a publicly available relational database that contains extensive information about every study registered in ClinicalTrials.gov. We obtain information for 432,597 clinical trials that were registered in ClinicalTrials.gov by the date of download. For each clinical trial, we keep the clinical trial ID, clinical trial phase, conditions (diseases) studied, and interventions (drugs) tested. We filter out clinical trials that tested behavioral, device, diagnostic test, dietary supplement, procedure, radiation, or "other" interventions, as well as trials that do not provide MeSH terms for both conditions and interventions. For the total of 109,430 clinical trials that remain, we match the MeSH terms of both conditions and interventions to MeSH codes using the Unified Medical Language System (UMLS) database (https://www.nlm.nih.gov/research/umls/index.html). We group clinical trial phases into Phase I (Phase I and Early Phase I), Phase II (Phase II and Phase I/Phase II), Phase III (Phase III and Phase II/Phase III), or unknown phase (no information provided). Phase IV studies are conducted after a drug is approved, to find long-term benefits and side effects that could not be discovered in the duration of a clinical trial. Therefore, we consider drugs in Phase IV clinical trials as indicated drugs (see "Finding indicated drugs for the complex diseases"). Eventually, we have 31,053 clinical trials that tested interventions for the 65 complex diseases in our sample.
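The phase grouping described above can be written as a simple lookup. This is a sketch under stated assumptions: the phase labels are spelled as in the text, whereas the raw strings in an AACT export may be formatted differently, and the output labels "Indicated" and "Unknown" are names chosen here for illustration.

```python
# Mapping from reported trial phase to the grouped phase used in the analysis.
# Combined phases are assigned to the later phase, as described in the text.
PHASE_GROUPS = {
    "Early Phase I": "Phase I",
    "Phase I": "Phase I",
    "Phase I/Phase II": "Phase II",
    "Phase II": "Phase II",
    "Phase II/Phase III": "Phase III",
    "Phase III": "Phase III",
    "Phase IV": "Indicated",  # Phase IV drugs are treated as indicated drugs
}


def group_phase(raw_phase):
    """Return the grouped phase, or "Unknown" when no phase is provided."""
    if raw_phase is None or not raw_phase.strip():
        return "Unknown"
    return PHASE_GROUPS.get(raw_phase.strip(), "Unknown")
```

For example, `group_phase("Phase I/Phase II")` returns `"Phase II"`, and a missing phase returns `"Unknown"`.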
For these clinical trials, we convert the MeSH codes of interventions to DrugBank IDs using the UMLS API (crosswalk function). However, DrugBank does not assign IDs to drug combinations. In order to include them in our analysis, we convert the MeSH codes that did not match directly to a DrugBank ID to RxNORM CUIs using the UMLS API (crosswalk function). Then, for each drug combination, we obtain each active pharmaceutical ingredient using the UMLS API ("Retrieving Source-Asserted Relations"; vocabulary = RXNORM; relation label = has_part).
Moreover, the MeSH vocabulary assigns different codes to each form of an active pharmaceutical ingredient, whereas DrugBank assigns IDs only to the general forms. For example, liposomal doxorubicin and doxorubicin are two separate entries in the MeSH vocabulary but a single entry in DrugBank (doxorubicin). To deal with this discrepancy, we follow the steps described above and obtain the active pharmaceutical ingredients using the UMLS API ("Retrieving Source-Asserted Relations"; vocabulary = RXNORM; relation label = form_of). Then, for each drug, we convert its RxNORM CUI to a DrugBank ID using the UMLS API (crosswalk function).
Eventually, we have 29,758 clinical trials that tested 1,795 drugs for 64 complex diseases in our sample.
Note that the complex disease "Dermatitis herpetiformis" did not have any investigated drugs at the time of this study.

Finding indicated drugs for the complex diseases
To find drugs that are currently indicated for the 65 complex diseases, we download 4,225 approved drugs from DrugBank (version 5.19, date of download: June 16, 2022). We get their indications by combining information from RxNORM (https://www.nlm.nih.gov/research/umls/rxnorm/index.html) and repoDB (32), as described below.
Using the RxNORM API (getClassByRxNormDrugName function), we obtain diseases (in MeSH terms) with a relationship of "may_treat" or "may_prevent" with each approved drug. We then match the diseases to MeSH codes using the UMLS database. repoDB is a publicly available database that contains drug repositioning successes and failures, integrating data from DrugCentral and ClinicalTrials.gov. We download the full database (last update: 2017) and, for each drug, we keep its DrugBank ID and approved indication(s), after excluding those with a note of suspended, terminated, or withdrawn. All indications are coded as UMLS CUIs, so we convert them to MeSH codes using the UMLS database.
After combining the data from RxNORM and repoDB, we have 939 unique drugs indicated for 58 complex diseases. We then add to this data set the drugs in Phase IV clinical trials to get a total of 1,373 unique drugs indicated for 64 complex diseases. Note that the complex disease "hypotony of the eye" did not have any indicated drug at the time of this study.

Statistical analysis
Logistic regression to evaluate candidate drugs
We find 781 unique drugs that target the causal genes of the 90 Mendelian diseases. Using these drugs and the 65 complex diseases, we create a table where each row is a drug-complex disease pair. Therefore, in our main analysis, the number of rows in this table is 50,765 (781 drugs multiplied by 65 complex diseases). We assess whether our recommended drug-disease pairs are predictive of currently investigated or indicated drug-disease pairs in a logistic regression model that also adjusts for i) the disease category of each complex disease and ii) the number of known gene targets per drug.
We account for the disease category due to differences in the number of drugs investigated or indicated among the 6 disease categories tested in this study. For example, significantly more drugs are tested in clinical trials for neoplasms than ophthalmological diseases (Fig. 1B). Additionally, we account for the number of targets per drug as drugs with a higher number of known targets are more likely to be linked to a Mendelian disease and may be more likely to be subject to research investment for new indications.
To ensure that class imbalance does not bias the coefficient estimate in our model, we also conduct a weighted logistic regression. As shown in Figure S1, the results from the weighted logistic regression are also significant and comparable to the non-weighted logistic regression. Therefore, we use a non-weighted logistic regression in all of our analyses.
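The drug-disease table described in this subsection can be sketched as follows. This is an illustrative sketch, not the study's code: the function and field names are assumptions, and the data are toy values. The second function computes only a crude, unadjusted odds ratio from the 2x2 table; it does not reproduce the adjusted estimate, which comes from a logistic regression of the outcome on the candidate indicator, the disease category, and the number of targets per drug.

```python
from itertools import product


def build_pair_table(drugs, diseases, candidate_pairs, known_pairs,
                     category, n_targets):
    """One row per (drug, complex disease) pair with outcome and covariates."""
    rows = []
    for drug, disease in product(drugs, diseases):
        rows.append({
            "drug": drug,
            "disease": disease,
            # Predictor: recommended through Mendelian disease comorbidity.
            "candidate": int((drug, disease) in candidate_pairs),
            # Outcome: currently investigated or indicated.
            "known": int((drug, disease) in known_pairs),
            "category": category[disease],  # covariate i
            "n_targets": n_targets[drug],   # covariate ii
        })
    return rows


def crude_odds_ratio(rows):
    """Unadjusted odds ratio; assumes both off-diagonal cells are nonzero."""
    a = sum(1 for r in rows if r["candidate"] and r["known"])
    b = sum(1 for r in rows if r["candidate"] and not r["known"])
    c = sum(1 for r in rows if not r["candidate"] and r["known"])
    d = sum(1 for r in rows if not r["candidate"] and not r["known"])
    return (a * d) / (b * c)


# Toy example with 4 drugs and 2 diseases (8 rows).
drugs = ["d1", "d2", "d3", "d4"]
diseases = ["A", "B"]
rows = build_pair_table(
    drugs, diseases,
    candidate_pairs={("d1", "A"), ("d2", "A"), ("d3", "B")},
    known_pairs={("d1", "A"), ("d3", "B"), ("d4", "B")},
    category={"A": "immune", "B": "neoplasms"},
    n_targets={"d1": 1, "d2": 3, "d3": 2, "d4": 5},
)
toy_or = crude_odds_ratio(rows)
```

With the full inputs, the same construction gives 781 × 65 = 50,765 rows.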

Permutation tests
To assess the significance of the observed associations, we perform permutation tests.
In our main analysis, we want to assess the significance of the observed associations between drug repurposing candidates and currently investigated or indicated drugs for a complex disease. The random permutation changes which complex diseases are comorbid with each Mendelian disease while keeping unchanged all the intrinsic Mendelian disease characteristics, such as prevalence, number of comorbidities, and associated causal genes. This allows us to assess whether the observed association is attributable specifically to the Mendelian disease comorbidity.
In the per-Mendelian-disease analysis, we want to assess whether the higher druggability of a Mendelian disease gene, rather than the information about comorbidity, drives the results. The random permutation changes which drugs target a Mendelian disease gene while keeping unchanged the total number of drugs targeting that gene.
In both cases, we perform 1,000 permutations to create a null distribution of odds ratios using the logistic regression model above. We then compare the observed odds ratio to this null distribution. We calculate the probability of observing an odds ratio at least as extreme as the original one as the fraction of permutations in which the permuted odds ratio is greater than or equal to the observed odds ratio (OR_observed ≤ OR_permutation). A result is considered significant if the calculated probability is less than 0.05 (p_permutation < 50/1000).
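The two ingredients of the permutation test can be sketched in a few lines. This is a minimal illustration rather than the study's implementation. In particular, drawing a random set of complex diseases of the same size for each Mendelian disease is one assumed way to permute comorbidity while preserving the number of comorbidities, and the function names are hypothetical.

```python
import random


def permute_comorbidities(comorbid, complex_diseases, rng):
    """Randomly reassign comorbid complex diseases to each Mendelian disease,
    keeping each Mendelian disease's number of comorbidities fixed."""
    pool = sorted(complex_diseases)
    return {
        mendelian: set(rng.sample(pool, len(partners)))
        for mendelian, partners in comorbid.items()
    }


def permutation_p_value(observed_or, permuted_ors):
    """Fraction of permuted odds ratios >= the observed odds ratio."""
    hits = sum(1 for value in permuted_ors if value >= observed_or)
    return hits / len(permuted_ors)


# Toy usage: two Mendelian diseases, four complex diseases.
rng = random.Random(0)
comorbid = {"M1": {"A", "B"}, "M2": {"C"}}
shuffled = permute_comorbidities(comorbid, {"A", "B", "C", "D"}, rng)
p_toy = permutation_p_value(2.0, [1.0, 2.0, 3.0, 0.5])
```

In the full analysis, each permuted comorbidity table would be passed through the candidate-drug and regression steps to produce one permuted odds ratio, and 1,000 such values form the null distribution. With this estimator, zero hits out of 1,000 corresponds to the reported p_permutation < 0.001.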

Genetic similarity between Mendelian diseases and cancers
We assess the genetic similarity between a Mendelian disease and a cancer using two sources of evidence. First, we consider the extent of genetic overlap between two diseases. This simple metric captures the shared driver genes between two diseases. To capture a wider range of functional relationships between two diseases, we also use gene co-expression across diverse human tissues. We define a Mendelian disease and cancer pair as genetically similar if at least one of the above metrics is significant (p < 0.05). Both metrics are described below.
The genetic overlap metric tests the significance of the overlap between the Mendelian disease causal genes and the genes significantly altered in a cancer. For each Mendelian disease, we compile a list of causally associated genes using the OMIM database. For each cancer, we compile a list of driver genes using the Broad GDAC Firehose database (https://gdac.broadinstitute.org/), including genes significantly mutated (as identified by MutSig v2.0, q < 0.05) and genes with significant copy number alterations (as identified by Gistic2; q < 0.05; peaks with at most 50 genes). Then, for a Mendelian disease and cancer pair, we test the significance of the overlap between the set of Mendelian disease causal genes and the set of genes significantly altered in cancer (Fisher's exact test, p < 0.05).
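The overlap test can be illustrated with a small, dependency-free sketch. Two choices here are assumptions not specified in the text: the test is taken to be one-sided (enrichment of overlap), and the size of the background gene universe is passed in as a parameter. The function name is hypothetical.

```python
from math import comb


def overlap_p_value(mendelian_genes, cancer_genes, n_background):
    """One-sided Fisher's exact test for gene-set overlap.

    Returns the probability of an overlap at least as large as the observed
    one when the cancer gene set is drawn at random from a background of
    n_background genes (hypergeometric upper tail).
    """
    k = len(mendelian_genes & cancer_genes)  # observed overlap
    m = len(mendelian_genes)
    n = len(cancer_genes)
    total = comb(n_background, n)
    tail = sum(
        comb(m, i) * comb(n_background - m, n - i)
        for i in range(k, min(m, n) + 1)
    )
    return tail / total


# Toy usage with a background of 4 genes.
p_partial = overlap_p_value({"A", "B"}, {"A", "C"}, 4)
p_full = overlap_p_value({"A", "B"}, {"A", "B"}, 4)
```

Exact integer arithmetic with `math.comb` avoids floating-point underflow for large gene universes, at the cost of speed.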
The co-expression metric tests correlation in expression between a cancer driver gene and a set of Mendelian disease genes. To assess genetic similarity using this metric, we first download summarized expression data for 20,162 genes across 37 GTEx tissues from the Human Protein Atlas (https://www.proteinatlas.org/download/rna_tissue_gtex.tsv.zip). We remove 889 genes that do not have expression data across all 37 tissues. The remaining data contain expression for 574 out of the 594 Mendelian disease causal genes. Consequently, the co-expression metric of one Mendelian disease ("Familial Dysautonomia") with any cancer could not be measured (it was tested using only the genetic overlap metric). The metric tests co-expression between any cancer-related gene and the set of Mendelian disease causal genes. More specifically, it tests whether the set of Mendelian disease genes exhibits stronger correlation in expression with the cancer gene compared to the correlation distribution of all other genes with the same cancer gene. We test this using the Wilcoxon rank-sum test and we adjust the resulting p-values to account for the number of cancer genes tested (Benjamini-Hochberg method; p < 0.05).
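The multiple-testing adjustment in the last step can be sketched as below. This covers only the Benjamini-Hochberg correction; the per-gene p-values are assumed to come from the rank-sum tests described above, and the input values in the usage example are invented.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy usage: one p-value per cancer gene tested (invented values).
adjusted = benjamini_hochberg([0.01, 0.04, 0.03, 0.5])
significant = [q < 0.05 for q in adjusted]
```

In this toy case only the first gene stays below 0.05 after adjustment, although three raw p-values were below that threshold.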

Results
Integrating data to test Mendelian diseases as a resource for drug repurposing candidates
From Blair et al. (15), we obtain clinical associations between 2,908 pairs of a Mendelian and a complex disease. The data include 90 Mendelian diseases with known causal genes and 65 complex diseases across six disease categories (cardiovascular, hormonal, immune, neoplasms, neurological, ophthalmological). Figure 1B shows the distribution of the number of comorbid complex diseases per Mendelian disease. Using drug-gene target information from DrugBank and Mendelian causal genes from the Online Mendelian Inheritance in Man (OMIM), we compile a list of 781 drugs that target the Mendelian disease causal genes. This allows us to suggest candidate drugs for each complex disease based on its comorbid Mendelian diseases (Fig. 1A).
To test our hypothesis, we compare our candidate drugs against drugs currently investigated or indicated for the complex diseases. We curate 29,758 clinical trials that investigate 1,795 drugs for 64 complex diseases (median of 110 investigated drugs per complex disease). In addition to the investigated drugs, we compile current approved drug uses, including 1,373 indicated drugs for 64 complex diseases (median of 42 indicated drugs per complex disease) (Fig. 1C).

Mendelian disease comorbidity identifies drugs under current investigation or indication
First, we assess whether the candidate drugs for a complex disease are enriched for those currently investigated or indicated for that disease. Accounting for the number of gene targets per drug and the category of disease, we find that the candidate drugs are significantly enriched for drugs currently investigated or indicated (odds ratio = 1.834, p = 5.98e-22) (Fig. 2A).
Next, we seek to exclude artifactual explanations for this signal. One such artifact is variation in disease frequency: disease frequency can impact both the ability to discover disease genes and the power to discover clinical associations. To exclude this spurious source of association, we randomly permute which complex diseases each Mendelian disease is comorbid with. This random permutation preserves the characteristics of each Mendelian disease, such as the number of complex disease comorbidities, but not the list of candidate drugs for each complex disease. After 1,000 permutations, we find that the observed association is significantly stronger than expected by chance (p_permutation < 0.001). For Mendelian diseases with many comorbidities, the permutation does not impact recommendations; therefore the permuted signal, though weaker, still has an odds ratio > 1. However, when these Mendelian diseases are removed, comorbidity is significantly predictive of drug uses, while the permuted comorbidity is not predictive (Figure S2).
We repeat the analysis at the disease category level and find significant results for neurological, immune, neoplasm, ophthalmological, and hormonal diseases. However, after permutation analyses, only the neurological, immune, and neoplasm disease categories remain significant (p_permutation < 0.05) (Figs. 2A, S3-S8). This may be due to the low number of analyzed complex diseases that fall under the cardiovascular (n = 4), ophthalmological (n = 5), and hormonal (n = 9) disease categories, compared to neurological (n = 15), immune (n = 19), and neoplasms (n = 14), potentially reducing the statistical power to detect a significant association. Additionally, these disease categories have a lower number of current therapies (Fig. 1C). Figure 2B shows an example of our recommended candidate drugs for 14 neoplasms, illustrating an extensive overlap between the candidate drugs and the drugs currently investigated or indicated for these neoplasms. The full list of recommended candidate drugs for repurposing for each complex disease can be found in Supplementary Tables 1 and 2.
Next, we ask whether the candidate drugs are more likely to be in advanced drug development phases for the relevant complex diseases. To test this, we stratify drugs by their drug development phase for a complex disease: phase I, phase II, phase III, indicated. First, we find a significant enrichment of candidate drugs for a complex disease among drugs in any of phase I, II, or III (odds ratio = 1.59, p = 2.27e-09; p_permutation < 0.001) (Figure S9). Stratified per clinical trial phase (phase I, II, or III), we find a progressive increase in the enrichment for drug success with increasing phase (p_permutation < 0.05) (Figs. 2C, S10-12).
Additionally, when considering only indicated drugs for a complex disease, we find an even greater enrichment of its candidate drugs for drug success (odds ratio = 2.18, p = 5.94e-11; p_permutation < 0.001) (Figs. 2C, S13). Overall, our predicted drug candidates show more enrichment in categories with more clinical evidence, supporting the potential of our approach for identifying new successful drugs.

Prioritizing Mendelian diseases targeted by high number of drugs
Mendelian disease causal genes are known to be good drug targets (14). We find 193 out of 593 Mendelian genes (32.6%) to be targeted by at least one drug (median: 2 drugs per Mendelian gene).
However, outliers exist: androgen receptor (AR), a gene mutated in Androgen Insensitivity Syndrome, is targeted by 82 drugs (20). This variation in drug targeting of Mendelian genes may suggest that certain disease processes are more druggable. We hypothesize that the most druggable Mendelian diseases are the most promising for providing insight into complex disease therapeutics.
To test this hypothesis, we repeat the above analysis for each Mendelian disease individually, for Mendelian diseases targeted by at least one drug (n = 68) (Fig. 3A). That is, we test whether the drugs targeting the causal genes of each Mendelian disease are enriched for drugs currently investigated or indicated for its comorbid complex diseases. Although testing only the drugs targeting a single Mendelian disease reduces the statistical power of the analysis, we find 8 significant Mendelian diseases (p_permutation < 0.05). Further, we find that these 8 Mendelian diseases are targeted by a significantly higher number of drugs than other Mendelian diseases (p = 9.1e-05, one-sided Wilcoxon rank-sum test) (Fig. 3B). To exclude the possibility that this is due only to higher numbers of drugs increasing power to discover an association, we compare the result against a permutation analysis that permutes the drugs targeting each Mendelian disease (p_permutation = 0.018).
In another test of this hypothesis, we ask which Mendelian disease genes successfully point to new drug indications. That is, for each Mendelian disease gene, we use comorbidity to suggest which complex diseases may benefit from drugs targeting that gene. Under our hypothesis, we expect that highly druggable genes can more successfully be used for finding new drug uses. To test this, we repeat the above analysis for each gene targeted by at least one drug (n = 193) (Fig. 3C), comparing the association to permutations. We find 12 significant genes, and these successful genes are again targeted by a higher number of drugs compared to the other Mendelian disease genes (p = 2.3e-05, one-sided Wilcoxon rank-sum test) (Fig. 3D). Altogether, these results imply that Mendelian diseases associated with more druggable genes are a particularly promising resource for complex disease therapeutics.

Combining comorbidity with genetic similarity enhances drug predictions
Comorbidity is a way to discover diseases sharing a biological basis, but it is not the only way. Comorbid Mendelian and complex diseases have been shown to be more likely to share related or overlapping genes, which is known as genetic similarity (15,16). Additionally, genetic similarity between drug targets and disease-linked genes has also been shown to predict successful drugs for a disease (11,12). Building on these results, we propose that genetic similarity could contribute to discovering therapeutically relevant shared etiology of Mendelian and complex diseases (21,22). Specifically, we propose that by combining comorbidity with genetic similarity, the two forms of evidence can more robustly point to diseases with shared etiology, increasing the predictive success of our approach.
To test this hypothesis, we focus on cancers, one of the disease categories with the strongest association in our analysis (Fig. 2A). Cancers are also of interest because each type of cancer has been associated with a set of recurrently mutated driver genes in The Cancer Genome Atlas (TCGA); we previously showed that Mendelian diseases comorbid with a cancer are enriched for genetic similarity to somatically mutated cancer driver genes (16). Building on that work, we ask whether candidate drugs supported by both comorbidity and genetic similarity between Mendelian disease and cancer have a greater probability of success. Among the 10 cancers in TCGA, Mendelian disease comorbidity again predicts drugs enriched for those currently investigated or indicated (odds ratio = 1.69, p = 7.42e-06, p_permutation = 0.014) (Figure S14). However, when comorbidity is combined with genetic similarity, drugs with both forms of evidence are even more enriched for drugs with clinical support (odds ratio = 2.19, p = 6.33e-13, p_permutation = 0.001) (Figs. 4A, S15).
In order to investigate the contributions of genetic similarity and comorbidity individually and combined, we stratify the 600 pairs of 60 Mendelian diseases and 10 cancers into those that are comorbid and those with no detected comorbidity relationship. As genetic similarity was not previously evaluated for non-comorbid disease pairs, we establish two measures for genetic similarity between two diseases, gene overlap and gene co-expression, similar to the measures used in Melamed et al. (16) (see Methods) (Supplementary Table 3). Among 314 comorbid disease pairs, 135 are also genetically similar (43%). These comorbid and genetically similar pairs greatly overlap with the ones identified by Melamed et al. (16) (p = 7.13e-08, one-sided Fisher's exact test), indicating that our genetic similarity metrics are consistent with the prior work. Among the remaining 286 non-comorbid pairs, 87 are genetically similar (30.4%) (Fig. 4B). The higher rate of genetic similarity among the comorbid diseases is consistent with the prior literature (16).
Using all the drugs targeting the causal genes of the 60 Mendelian diseases, we compile a list of 6,850 possible drug-cancer pairs (685 drugs × 10 cancers) (Supplementary Table S4). Among 2,727 drug-cancer pairs not supported by comorbidity, we find that those supported by genetic similarity have an increased probability of drug success (odds ratio = 2.32, p = 1.07e-04). This implies that genetic similarity might be able to detect shared etiology between Mendelian disease and cancer pairs that cannot be detected with comorbidity. Further, among 4,123 drug-cancer pairs supported by comorbidity, those additionally supported by genetic similarity have a greater probability of drug success (odds ratio = 1.39, p = 0.01). As we expect that candidate drug recommendations supported by comorbidity are already enriched for shared etiology, it is logical that the effect of genetic similarity would be smaller for this category of recommendations, but the effect is still significant. Notably, drug uses supported by both comorbidity and genetic similarity are most enriched for known drug uses (Fig. 4C, leftmost bar).
In conclusion, these findings suggest that by combining the two forms of evidence we can prioritize candidate drugs that target the shared biology between two comorbid diseases, enhancing the use of Mendelian disease biology for drug discovery.

Discussion
Previous studies have suggested that Mendelian disease genes pleiotropically contribute to the development of complex diseases, resulting in significantly increased risk of the complex disease in individuals with the Mendelian disease (15,16). However, this insight has not been harnessed for drug discovery. Here, we have shown that comorbidity between Mendelian and complex diseases can recommend candidate drugs for the complex diseases. Importantly, these candidate drugs are more likely to be in advanced drug development phases or have received regulatory approval, suggesting that Mendelian disease comorbidity can be used to prioritize drugs with high potential of eventual approval.
Our findings provide a novel way to leverage the well-known biology of Mendelian diseases to enhance the treatment of complex diseases. For instance, verapamil, an approved calcium channel inhibitor for the treatment of angina (23), is among our recommended candidate drugs for Type 1 Diabetes (T1D). This recommendation is supported by the comorbidities of T1D with Long QT Syndrome (CACNA1C) and Spinocerebellar Ataxia (CACNA1A). Studies in mice have previously demonstrated verapamil's potential to promote the survival of insulin-producing β-cells and reverse T1D (24). Notably, verapamil has recently been tested in a phase III clinical trial for T1D treatment (25). Additionally, we recommend carbamazepine, an approved sodium channel inhibitor for the control of seizures (26), as a candidate drug for the treatment of T1D based on its comorbidities with Long QT Syndrome (SCN5A) and Erythromelalgia (SCN9A). This recommendation is further supported by preclinical studies showing that inhibition of sodium channels increases the expression of INS1 and INS2 and thus protects from the development of T1D (27)(28)(29). Looking ahead, future clinical trials should consider testing the efficacy of this drug category for preventing T1D.
We also present an approach for identifying a subset of Mendelian diseases with the most utility for drug discovery. In general, Mendelian diseases are enriched for drugged genes (14), but some Mendelian diseases appear to be targeted by even more drugs than the average. Focusing on both the Mendelian disease and gene level, we find that diseases associated with highly drugged genes hold greater promise for future drug discovery efforts.
In addition, building on previous work that prioritizes drugs functionally related to disease genes (16), we explore genetic similarity as an additional way to identify diseases sharing likely pleiotropic causal genes. In an analysis of ten cancers, we find that candidate drugs supported by both comorbidity and genetic similarity between a Mendelian disease and a cancer have a greater probability of success. By combining two independent sources of evidence for shared disease etiology, future research can use Mendelian disease genes to prioritize new drug uses.
Our work has some limitations. First, we could not compile a complete list of investigated drugs for the 65 complex diseases due to annotation inconsistencies. Second, the list of genes causally associated with a Mendelian disease might be incomplete due to the rarity of the disease. Third, comorbidity may not always be due to pleiotropic effects of the Mendelian disease genes on the development of the complex diseases; it can also be due to indirect or interaction effects. Similarly, lack of measurable comorbidity between a pair of diseases does not definitively mean an absence of shared pathological processes, but could be due to disease frequency or interaction effects. Finally, we find that the enrichment of candidate drugs for success varies across disease categories: our results were not significantly predictive for cardiovascular, ophthalmological, and hormonal diseases. This may be because we were not able to test a diverse set of diseases in these categories, leading to reduced statistical power.

Conclusion
In conclusion, we leverage the well-known biology of Mendelian diseases to improve treatment of common diseases. To our knowledge, this is the first study that suggests the use of clinical associations of Mendelian diseases to inform drug discovery. Future work can both exploit the drugs we suggest for each disease and explore Mendelian disease genes currently lacking drugs as novel drug targets. In fact, according to Finan et al. (30), almost one fourth (24.4%) of the undrugged Mendelian genes have high druggability potential. Additionally, disease comorbidity might improve other drug repurposing efforts when considered as an additional source of evidence for prioritizing drug repurposing candidates.

Figure 1 Outline of the approach. A. Proposed method where the drugs targeting genes causally associated with a Mendelian disease are suggested as candidate drugs for its clinically associated (comorbid) complex disease. This hypothesized connection between the drug and the complex disease is based on the previously shown pleiotropic effects of the Mendelian disease causal genes on the development of the comorbid complex disease. B. Distribution of the number of comorbid complex diseases per Mendelian disease. C. Number of investigated (per clinical trial phase) and approved drugs for each complex disease.

Figure 3 [...] B. [...] for their comorbid complex diseases are targeted by a higher number of drugs compared to the other Mendelian diseases (p = 9.1e-05, one-sided Wilcoxon rank-sum test comparing the number of drugs in each group). C. Histogram of the number of drugs targeting a Mendelian gene, for genes targeted by at least one drug (n = 193). D. Genes linked to Mendelian diseases that significantly predict candidate drugs already investigated or indicated for their comorbid complex diseases are targeted by a higher number of drugs compared to the other genes (p = 2.3e-05, one-sided Wilcoxon rank-sum test comparing the number of drugs in each group).

Figure 2 Clinical

Figure 3 Highly