Untangling the relationships between complex phenotypes can prove elusive because the molecular and biological mechanisms that influence their presentation interact across multiple levels of biological information. In this study, we performed a de novo discovery analysis of conditions that are multimorbid with asthma using a whole blood spatial GRN. In effect, the whole blood GRN acted as a Rosetta stone connecting the asthma-associated SNPs with eQTLs from an extended set of curated protein-protein interactions. Thus, we were able to use molecular interactions to identify genetic variants that affect traits known to be multimorbid with asthma as well as other traits having an unconfirmed association with asthma. Our de novo discovery approach can be applied to other complex polygenic disorders and will help to identify known and unknown interacting phenotypes, thereby providing information on the genetic variation and biological mechanisms that are potentially responsible for the interaction.
This study is not without limitations. Primary amongst them is that associations can only be identified for traits previously investigated in GWAS. An additional consequence of this limitation is that not all genetic variation will be represented, because GWAS participants are primarily of European ancestry (41). Second, our method relies on regulatory connections having known protein interactions. eQTLs whose target genes do not form protein-protein interactions, or form interactions that are currently unknown, will thus be missed by our analysis. The third limitation is that the blood GRN is not dynamic, i.e., it represents a snapshot in time of the captured interactions and is thus subject to change. Fourth, our analysis combines information across multiple levels of biological organization that does not originate from the same biological samples (i.e., eQTL data from GTEx, Hi-C datasets, population studies [GWAS], and protein-protein interaction data). Notwithstanding these limitations, the ability to discover multimorbid traits de novo through described molecular pathways and by integrating resources across biological levels provides a potential step-change in our ability to identify why and how genetic variation contributes to complex phenotypes.
Asthma is a complex inflammatory airway disease whose features include airway remodeling and respiratory obstruction (1, 3). Consistent with these observations, we identified genetic variants that are associated with forced expiratory volume (FEV) and forced vital capacity (FVC) related phenotypes, which characterize respiratory function, as being directly connected to protein-encoding genes that are affected by asthma-associated SNPs. We also observed direct connections to proteins that are targeted by eQTLs associated with conditions such as Alzheimer’s disease (AD), blood protein levels (e.g., interleukin-18 (IL-18) (42), macrophage inflammatory protein-1b (MIP-1b) (43, 44), and 90K protein (45, 46)), SLE, and sarcoidosis, all of which are conditions known to be multimorbid with asthma. For instance, meta-analyses report an increased risk for the development of SLE in asthmatics (pooled odds ratio (OR): 1.37; 95% CI 1.14–1.65; I² = 67%) (23). Additionally, the genetic risk of sarcoidosis, which is a granulomatous multisystem disease of unknown etiology and pathology that predominantly affects the lungs (47, 48), had significant pleiotropy with the genetic risk of asthma (R² = 2.03%; p = 8.89×10⁻⁹) (26). These findings suggest that the remaining identified traits may also play a role in the multimorbidity of asthma.
The strength of the network approach we outlined is that it provides molecular insights for interactions between asthma multimorbidities. For example, despite reported links between asthma and an increased risk for AD (hazard ratio (HR): 2.62; 95% CI 1.71–4.02) (49), there are no known mechanisms through which this could occur. Our analysis, however, implicates 19 genes and 56 eQTLs in the asthma-AD multimorbid interaction. These include ERCC1 (spatially regulated by the AD-associated eQTL, rs55923289), which interacts with both KAT5 and XRCC6 in the asthma PPIN. While no population studies associate ERCC1 with AD, ERCC1 mutant and knockout mice were reported to present with age-dependent neuronal pathology and cognitive decline (50, 51). Moreover, AD patients were reported to have decreased mRNA levels of ERCC1 compared to healthy controls, suggesting a role for ERCC1 in AD pathology (52). Similarly, the risk of developing lung cancer is reported to increase in adults with active asthma (HR 1.29; 95% CI 0.95–1.75) (24). This could be mediated through mechanisms such as chronic inflammation, which is associated with a higher risk of developing cancer (53); structural and functional changes of the lungs caused by chronic asthma exacerbations, such as thickening of the bronchial wall, fibrosis, and formation of scar tissue (54, 55); and asthma-associated molecular disruptions that contribute to the development of lung cancer (22). Consistent with this, our findings implicate 13 genes, which are spatially regulated by 24 eQTLs, as contributors to the observed multimorbidity. One of these is the aminopeptidase CTSH, which is produced by lung macrophages and is spatially regulated by the lung cancer-associated eQTLs rs4886591, rs11639372, and rs28408315. CTSH interacts with 8 different asthma eGenes (TNFSF4, HLA-DRB5, HLA-DQA2, HLA-DQB2, CHIT1, HLA-DRB1, HLA-DQA1, and HLA-DQB1) and has been previously associated with lung cancer (61).
Interestingly, CTSH was found to be overexpressed in both smoking lung cancer patients (63) and asthmatics (62). Collectively, our results identify potential regulatory mechanisms contributing to the observed asthma multimorbidity, and we propose that these mechanisms be subjected to further empirical investigation.
Conditions identified in our analysis that have not been reported to be associated with asthma, or whose association with asthma is inconclusive, include disc degeneration, hip circumference (56, 57), mean arterial pressure, blood pressure related to alcohol consumption, and leprosy. Of those, we predict that traits found closer to the asthma PPIN (e.g., disc degeneration, identified in the “+1” level) have stronger relationships with asthma. TAP1 is one of the genes that we identified as modulating the interaction between the asthma PPIN (through interactions with HLA-B) and disc degeneration-associated eQTLs (11 eQTLs). TAP1 mediates the unidirectional translocation of peptide antigens across the endoplasmic reticulum for loading into MHC I molecules (58). Interestingly, previous studies report it to be associated with lumbar disc degeneration (59), asthma, and other inflammatory conditions such as rhinitis and dermatitis (60). Moreover, “TAP binding” was one of the enriched molecular functions of the asthma PPIN proteins (Supplementary Fig. 3b). These findings suggest potential increased risks for asthma-disc degeneration multimorbidities mediated through genes involved in inflammatory pathways such as TAP1. Asthma multimorbidities were also reported for traits found in the “+3” level including glycated hemoglobin levels (61), plasma-free amino acid levels (62), and linoleic acid (63, 64), as well as traits found in the “+4” level such as circulating chemerin levels (65). Thus, while proximity of the identified traits to the asthma module suggests stronger associations with asthma, higher-level associations could also contribute to the risk of multimorbidity.
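The notion of “+1” to “+4” levels can be illustrated with a minimal sketch, assuming that a level is simply the number of protein-protein interaction steps separating a gene from the asthma PPIN. The function names (ppi_levels, traits_by_level), the placeholder genes GENE_A and GENE_B, and the "hypothetical trait" label are illustrative assumptions and not part of our pipeline. The toy edges reuse interactions named in the text, but the resulting level assignments are for demonstration only and do not reproduce our results.

```python
from collections import deque


def ppi_levels(edges, seed_genes, max_level=4):
    """Breadth-first expansion over an undirected PPI edge list.

    Level 0 holds the seed genes (here, genes in the asthma PPIN);
    level k holds proteins that are k interactions away from any seed.
    """
    neighbours = {}
    for a, b in edges:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    level = {gene: 0 for gene in seed_genes}
    queue = deque(seed_genes)
    while queue:
        gene = queue.popleft()
        if level[gene] == max_level:
            continue  # do not expand beyond the outermost level
        for nxt in sorted(neighbours.get(gene, ())):
            if nxt not in level:
                level[nxt] = level[gene] + 1
                queue.append(nxt)
    return level


def traits_by_level(level, gene_to_traits):
    """For each trait, keep the closest level at which one of its
    eQTL target genes appears in the expanded network."""
    closest = {}
    for gene, lvl in level.items():
        for trait in gene_to_traits.get(gene, ()):
            if trait not in closest or lvl < closest[trait]:
                closest[trait] = lvl
    return closest


# Toy data (illustrative only). GENE_A and GENE_B are placeholders.
edges = [
    ("HLA-B", "TAP1"),
    ("KAT5", "ERCC1"),
    ("XRCC6", "ERCC1"),
    ("HLA-DRB1", "CTSH"),
    ("TAP1", "GENE_A"),
    ("GENE_A", "GENE_B"),
]
seeds = ["HLA-B", "KAT5", "XRCC6", "HLA-DRB1"]

# Genes targeted by trait-associated eQTLs (assumed mapping for the toy example).
gene_to_traits = {
    "TAP1": ["disc degeneration"],
    "ERCC1": ["Alzheimer's disease"],
    "CTSH": ["lung cancer"],
    "GENE_B": ["hypothetical trait"],
}

levels = ppi_levels(edges, seeds)
closest = traits_by_level(levels, gene_to_traits)
for trait, lvl in sorted(closest.items(), key=lambda item: item[1]):
    print(f"+{lvl}\t{trait}")
```

Under this reading, a trait reached at a lower level is separated from the asthma PPIN by fewer protein interactions, which is the sense in which we describe it as more proximal to asthma.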
To further demonstrate the applicability of our approach to other complex disorders, we used it to identify the conditions that are proximal to ALL in regulatory space. Some of the conditions that we identified were previously reported to be associated with ALL (e.g., nephropathy, asthma (OR = 1.43; 95% CI 1.10–1.85) (66), and disc degeneration (67)). Drug associations were found to be common contributors to the observed multimorbidity. For instance, anticancer treatment in ALL has been associated with kidney function decline (68) and administration of immunosuppressive agents in systemic sclerosis patients has been associated with increased risk of hematological cancer (69). For disc degeneration, 90% of children with ALL are reported to have decreased bone mineralization, a contributor to disc degeneration (67, 70). Our analysis implicates 9 eQTLs and 3 eGenes in the disc degeneration-ALL multimorbidity, including HLA-DQB2. While no population studies associate HLA-DQB2 with disc degeneration, its significant enrichment has been characterized by unique lower back pain DNA methylation signatures in human T cells (71). These findings demonstrate the utility of our approach in identifying multimorbidities of various complex diseases.
The whole blood GRN we constructed integrated common genetic variants, chromatin interactions, and eQTL data. This network provides insights that expand beyond the identification of multimorbidity. Firstly, the spatially regulated protein-coding genes (median TPM > 0.1) within the GRN are not statistically different in their expression levels (mean ± SD, 29.5 ± 330.5) from protein-coding genes that are not spatially regulated (mean ± SD, 203.1 ± 5597.3). Yet, protein-coding genes that are regulated distally (in trans) are statistically more intolerant to LOF (p < 0.0001) when compared to protein-coding genes that are expressed but not included in the GRN. Secondly, we identified a spike in eQTL-eGene interactions within the HLA region (6p21.3) in the whole blood GRN. This spike was specific to blood and was not present in the other tissues that were tested. This observation is consistent with the relative importance of this region in immune processes (e.g., regulation of inflammation, innate and adaptive immunity, antigen processing and presentation, autoimmunity, and the complement system (72)). The fact that tissue-specific spikes were observed at other chromosomal locations possibly highlights loci that are important for features specific to those tissues.
We have presented an approach that identifies multimorbid conditions of the disease of interest without the need for a priori selection of the interacting phenotypes. The molecular connections we have identified in this study represent high-value targets for subsequent investigation into asthma development, multimorbidity, and future therapeutic development.