3.1. eQTL analysis identifies spatially constrained eQTLs in the lung and blood
Disease-associated SNPs are typically located within non-coding DNA, can participate in long-range chromatin interactions, and have small effect sizes [25]. These factors make attributing function to disease-associated SNPs an ongoing challenge [26]. We combined chromatin interaction data (Hi-C) and eQTL data from GTEx to build spatially constrained GRNs in which putative functions are assigned to SNPs by associating them with their target genes (Fig. 1a) (17) The resulting L-GRN has 740,028 spatial eQTLs regulating the expression of 15,855 genes expressed in the lung, constituting 873,133 spatial eQTL-gene interactions. The B-GRN has 1,077,379 spatial eQTLs regulating the expression of 14,871 genes expressed in whole blood, making a total of 1,713,885 spatial eQTL-gene interactions. The L-GRN and B-GRN are described extensively in Zaied et al. [7]
3.2. Mendelian randomisation identifies genes causally associated with asthma in the lung and blood
MR infers the causal effect of exposure (genes) on the disease outcomes (development of asthma) using genetic variants as IVs [17]. We conducted a two-sample MR analysis to test whether L-GRN and B-GRN genes were causally associated with the development of asthma using spatial eQTLs as IVs (Fig. 1b). Among exposures with single IVs, the Wald ratio method identified 65 genes regulated by 64 spatial eQTLs as causally associated with asthma in the L-GRN (Fig. 2a; Supplementary Data 5). No exposures with multiple IVs were identified to be causally linked to asthma in the L-GRN. Of the 65 asthma causal genes in the lung, 43 were protein-coding, and 22 were “other” biotypes, e.g., antisense [n = 10] and pseudogenes [n = 6], etc., Fig. S1). These genes were enriched for 17 pathways in KEGG, including asthma, inflammatory bowel disease, and Th1 and Th2 cell differentiation (Fig. S2a). In the B-GRN, 63 genes regulated by 62 spatial eQTLs were identified using the Wald ratio method (Fig. 2b), and five genes regulated by 11 spatial eQTLs were identified using the IVW method (Fig. 2c; Supplementary Data 6). 53 of the identified asthma causal genes in blood were protein-coding, and 15 were non-protein-coding, e.g., antisense [n = 8] and processed transcripts [n = 3], etc., Fig. S1). The identified causal genes in blood were enriched for 12 KEGG pathways, including Th1 and Th2 cell differentiation, asthma, and Th17 cell differentiation (Fig. S2b).
Amongst the causal genes we identified, 69 were lung or blood-specific, and 32 were shared between the L-GRN and B-GRN. Of the shared genes, 27 had the same direction of effect on asthma risk in both tissues, and only three were regulated by the same spatial eQTLs across both tissues (Fig. 2d). Valette et al. identified 50 causal asthma genes in blood using non-spatial eQTLs as IVs [15]. Notably, our study independently replicated the causality of 19 of the causal genes: six in both the lung and blood (i.e., GSDMB, RERE, PGAP3, UNC13D, MEI1, and SCRN2), three exclusively in the lung (i.e., TMEM258, SLC22A5, and FADS1), and ten specific to the blood (i.e., IKZF3, SLC22A4, ING5, AHI1, SIK2, TEF, HOXB4, GNGT2, TLR10, and EFEMP2).
More than 200 genomic loci have been associated with the development of asthma through GWAS studies [6, 27]. The methods employed to identify the potential target genes of these loci include the nearest gene approach [28] and eQTL analysis [29, 30]. In this study, spatially constrained eQTLs were a critical component of identifying putative asthma causal genes. Of the identified causal genes in lung and blood, all were regulated in cis (spatial eQTLs located within +/-1 Mb of their target genes on the same chromosome) except for three eQTL-gene pairs (i.e., rs10167431-ATF2, rs9267123-HLA-DQB1, and rs61044849-RABL3) which were regulated by trans-intrachromosomal eQTLs (spatial eQTLs located > 1 Mb away on the same chromosome; Supplementary Data 6 and 7).
3.3. Seven protein-coding genes within the 17q12-21 locus are causal for asthma
49 of the 101 genes causally associated with asthma (48.5%) identified in lung and blood are located on chromosomes 17 (17 genes), five (17 genes), and six (15 genes) (Supplementary Data 5 and 6). Within chromosome 17, we identified seven protein-coding genes that are modifiable and causal for asthma in the 17q12-21 asthma locus (i.e., PGAP3, IKZF3, GSDMB, ORMDL3, GSDMA, MED24, and CASC3; Fig. 3). PGAP3, GSDMB, ORMDL3, and MED24 were identified as causal in both blood and lung, but are regulated by tissue-specific spatial eQTLs. Notably, some eQTLs coordinate the regulation of multiple genes (e.g., rs8067378 regulates both GSDMB and ORMDL3 in the blood). Furthermore, the eQTL site that regulates CASC3 in blood (rs3884477) is located within MED24, again providing the potential for coordination between these genes.
3.4 Shared regulatory patterns for causal asthma genes across immune cell types suggests overlapping impacts in the etiology of asthma
Various immune cell types have been associated with the development of asthma. Therefore, we mapped the causal asthma eQTL-gene pairs from lung and blood onto immune cell type-specific cis-eQTL data to prioritise potential driver immune cell types. Cell type-specific cis-eQTL data (not spatially constrained) was obtained from DICE across 15 immune cell types (i.e., innate immune cell types [natural killer cells [NK], classical monocytes, and non-classical monocytes [M2 cells]], naïve adaptive immune cells [naïve B, naïve CD4 + T cells, activated naïve CD4 + T cells [CD4_stim], naïve CD8 + T cells, activated naïve CD8 + T cells [CD8_stim], and naïve TREG cells], and CD4 + T memory subtypes [Memory TREG, TH1, TH 1/17 [TH star], TH17, TH2, and follicular helper T cells [TFH]]). Of the 63 causal asthma eQTL-gene pairs identified in the lung, 23 were identified as cis eQTL-gene pairs across 15 immune cell types (Fig. 4a). To distinguish the direction of effect in different cell populations, we performed unsupervised hierarchical clustering of effect sizes (beta) of cis-eQTLs across the cell types. In the lung, hierarchical clustering identified three main clusters of causal asthma eQTL-gene pairs: causal eQTL-gene pairs with A) negative, B) both positive and negative, and C) positive effect sizes (Fig. 4a). Additionally, four cell-type clusters (clusters 1–4) were enriched for various causal eQTL-gene pairs. For example, cluster 1 is the monocyte cluster and is enriched for causal eQTL-gene pairs from cluster A. Cluster 2 is primarily CD8 + and CD4 + T cells, and cell clusters 3 and 4 are predominantly CD4 + T memory subtypes.
Of the 65 single-IV causal eQTL-gene pairs identified in blood, 25 were also identified across the 15 DICE immune cell types (Fig. 4b). Hierarchical clustering identified three main clusters of causal asthma eQTL-gene pairs (A-C) and four cell-type clusters (clusters 1–4).
3.5. Asthma endotypes share regulatory mechanisms with the asthma GRNs
To identify traits that share regulatory connections with the causal asthma genes (i.e., level 0 traits), we retrieved the significantly enriched GWAS traits associated with spatial eQTLs (plus LD SNPs) targeting the asthma L-GRN (level 0). We identify 63 enriched traits (hypergeometric test, FDR ≤ 0.05 and 500 sets of Monte Carlo simulation) associated with spatial eQTLs (plus LD SNPs; n = 421) targeting 52 level 0 genes in lung. Convex biclustering of level 0 genes and traits revealed four distinct clusters (Fig. 5a). Cluster 1 includes genes known to play a role in cholesterol metabolism (e.g., HLA-DQA1, FADS1, and SLC22A5) and the spatial eQTLs regulating them were enriched for two cholesterol level traits. Cluster 2 is the metabolites cluster marked by two genes in 11q12.2 involved in fat metabolism (i.e., FADS1 and TMEM258). Cluster 3 includes immune response and allergy-related traits and includes D2HGDH, RERE, RP5-1115A15.1 and WDR36. Cluster 4 includes various endotypes of asthma, asthma medication, and other immune-mediated traits; it largely involves genes located within the 17q12-21 asthma locus (i.e., GSDMA, GSDMB, ORMDL3 and PGAP3; Fig. 5a).
Spatially constrained eQTLs and their LD SNPs within the asthma B-GRN were also analysed to identify traits sharing regulatory interactions with asthma in the blood (level 0 traits). 54 significantly enriched traits were associated with 512 SNPs regulating 57 level 0 genes in the blood. Convex biclustering of the level 0 genes and traits revealed five distinct clusters. Cluster 1 has two immunoglobulin-related traits and five HLA genes (i.e., HLA-DQA1, HLA-DQB1, HLA-DQB1-AS1, HLA-DQB2 and HLA-DRB1). The traits within cluster 2 include blood cell counts and genes related to cellular processing (i.e., MEI1, TEF, NDFIP1, and PMM1). Clusters 3 and 4 included immune-response and allergy-related traits and contained combinations of genes that resemble those in cluster 1. Cluster 5 included asthma endotypes and asthma medication-related traits. In addition to the HLA genes observed in cluster 1, cluster 5 also included genes within the 17q12-21 locus (i.e., GSDMB, ORMDL3, and IKZF3; Fig. 5b; Supplementary Data 7).
This brings the total number of traits enriched in level 0 to 88 across both tissues. Of these 88 traits, hepatitis C induced liver cirrhosis, IgA nephropathy, primary biliary cholangitis, Epstein Barr virus nuclear antigen 1 IgG levels and primary central nervous system lymphoma have not been previously identified as associating with asthma. The remaining traits we identified have known associations with asthma (e.g. systemic lupus erythematosus (SLE) [31], alphalinolenic acid [32], high density lipoprotein cholesterol [33], Immunoglobulin A vasculitis [34], eosinophil count [35], trans fatty acid levels [36], and Crohn’s disease [37]). Notably, 29 of the 88 level 0 traits are shared between both tissues; this can be attributed to the 24 shared asthma causal genes enriched at level 0 (hypergeometric test, FDR ≤ 0.05, and 500 sets of Monte Carlo simulations) as well as the multigenic nature of complex traits (Fig. 6a).
3.6. Additional traits are associated with asthma through indirect interactions with the asthma GRNs
Proteins do not work in isolation, and proteins associated with comorbid or biologically related diseases are more likely to interact within a PPIN [10, 38]. Therefore, we expanded four edges outwards from the asthma L-GRN (level 0; Fig. 6a) to identify its curated interacting protein partners (20,21)(Fig. 1c). This identified a total of 1,801 curated proteins across levels 1–4. Subsequently, Monte Carlo simulations reduced it to a set of 137 significant protein-coding genes. Level 1 included four interacting proteins and seven SNPs. At level 2, 123 genes associated with 231 SNPs were identified. While level 3 included 11 genes associated with 28 SNPs. No significant enrichment was identified at level 4 due to Monte Carlo simulation. To determine the traits enriched in the regulatory space proximal to the asthma L-GRN, we identified the enriched traits associated with SNPs that regulate level 1–3 proteins. In total, 34 enriched traits were identified across levels 1–3. Of those, seven traits, including asthma, were specific to levels 1–3 and unique to the lung (Fig. 6b). All of these had known associations with asthma, including keratinocyte cancer [39, 40], sphingolipid levels[41, 42] and amyotrophic lateral sclerosis (ALS) [43].
Performing this expansion analysis starting from the asthma B-GRN (Fig. 6a) identified 2,676 proteins over levels 1–4. Monte Carlo simulation reduced the total number of proteins to 43 protein-coding genes. Of these, 11 genes and 50 SNPs, ten genes and 25 SNPs, 16 genes and 30 SNPs, and six genes and 15 SNPs were identified at levels 1 through 4, respectively. SNPs that regulated level 1–4 genes were associated with 18 traits. Of those, 14 traits were blood-specific (Fig. 6c). Apart from aspartate aminotransferase platelet ratio index in high alcohol intake, all identified traits had reported associations with asthma. These include follicular lymphoma [44], shingles [45], Intracerebral hemorrhage [46], caffeine metabolism [47] and age-related macular degeneration [48].
A total of five enriched traits were shared between the lung and blood across levels 1–3 (Fig, 6d). Apart from nasopharyngeal carcinoma, all shared traits have previously been reported to be associated with asthma. These include chronic lymphocytic leukemia [49], IgE levels [50] and Hepatitis B [51, 52]. Importantly, this analysis also identifies the molecular mechanisms (i.e., genes and SNPs) that mediate the asthma-trait interaction. For instance (54,55)the enriched IgE levels trait is associated with three SNPs that regulate the expression of FCER1A and ACKR1, these have direct protein interactions with the asthma causal genes KCER1G and TNFRSF14, respectively (Fig. 6e).
3.7. Analysis of gene-disease associations corroborates identified asthma comorbidities observed in New Zealand patients.
Various disorders have been observed in asthmatics (e.g., diabetes and cardiovascular disease), consistent with the possible existence of common molecular mechanisms influencing their onset [53]. We identified 113 GWAS traits interacting with asthma (Fig. 6, Supplementary Data 7). However, some of these traits are not diseases (e.g., IgE levels). Therefore, we used DisGeNet [24] to identify the diseases associated with the genes in levels 0–4 across both tissues. This revealed 38 genes associated with 337 UMLS CUI identifiers in DisGeNet (Supplementary data 8), mapping to 71 three-digit ICD-10 disease terms and 72 curated UMLS CUI disease identifiers.
Hospitalisation records (n = 2,051,661; between 31/12/2015 and 01/01/2021) in New Zealand were analysed to identify comorbid disorders in patients with asthma (n = 26,781). The comorbidity analysis revealed 194 three-digit ICD-10-AM terms to have a statistical association with asthma (i.e., diseases that are more or less likely to be observed in the asthmatic population; q-value < 0.05; Supplementary data 9). Of those, 13 disease terms (mapping to 18 ICD-10-AM terms) were also identified in the GDA analysis. These diseases all had higher likelihoods of being comorbid with asthma (Fig. 7a). Moreover, the 13 disease terms were associated with nine genes identified in levels 0–4 across both tissues (Fig. 7b).
3.8. Genes bridging asthma-trait interactions are part of the druggable genome
To identify opportunities for drug repurposing, we investigated which of the genes identified in levels 0–4 (hypergeometric test, FDR ≤ 0.05 and 500 sets of Monte Carlo simulation) across both tissues are present in the druggable genome [54] and the drug-gene interaction database (DGIdb [55]). Of the 257 genes in levels 0–4, 61 are a part of the druggable genome, while 22 have known interactions with drugs in DGIdb (Supplementary data 10). Importantly, some of the identified genes are targets of known asthma drugs, including TSLP (Tezepelumab), IL2RA (daclizumab) and IL4R (Dupilumab).