UK Biobank participants with LV and RV CMR measurements
CMR measurements were obtained from a sample of 36,548 UKB subjects utilizing an extensively validated deep-learning approach7. On average, subjects were 63.9 (standard deviation, SD: 7.6) years old, and 18,879 (51.8%) were women. Participants had a mean SBP of 138.2 mmHg (SD: 18.4), a mean diastolic blood pressure (DBP) of 78.6 mmHg (SD: 10.0), and a mean heart rate of 62.5 bpm (SD: 10.2); see Supplementary Table 4.
Genomic loci associated with measures of RV and LV structure and function
We performed GWAS on 16 CMR traits, leveraging genotyped and imputed variants from the Affymetrix BiLEVE and Axiom arrays, and applying BOLT-LMM conditional on age, sex, body surface area, SBP, genotype measurement batch, 40 PCs and assessment centre.
The 91 unique lead variants (Figures 1-2, Supplementary Table 5) were mapped to 53 genes based on independent review of annotated local-view plots (Supplementary file 2). This resulted in 16 genes for RV-ESV, 15 for LV-EF, 14 for LV-MVR, 14 for LV-ESV, 12 for RV-EF, 10 for RV-EDV, 6 for LV-EDV and LV-EDM, 5 for RV-SV, 4 for LV-SV, 2 for RV-PER, 1 for RV-PAFR, and none for RV-PFR, LV-PFR, LV-PER, and LV-PAFR. We identified five novel locus-trait associations for RV-PAFR (SCN10A), RV-PER (ALDH2 and HLA-B), and LV-SV and LV-MVR (both HLA-B). Twenty-six genes were associated with multiple measures. For example, variants mapped to TTN were associated with 11 measures, BAG3 with 6, and both TMEM43 and ATXN2 with 5. Of these multi-trait genes, 12 were associated with both LV and RV measures: TTN, BAG3, TMEM43, ATXN2, PROB1, DMPK, ZNF572, PLEC, HSPB7, HLA-B, SPON1, and OBSCN (Figure 1 and Supplementary Table 4).
Genetic heritability of CMR traits and pairwise genetic correlation
BOLT-REML was used to estimate the proportion of phenotypic variation that could be explained by narrow-sense genetic heritability (Figure 2). Heritability estimates ranged between 31% and 36% for both RV and LV measurements of EDV and ESV, as well as LV-EDM. For LV-MVR, and for EF and SV of both ventricles, heritability ranged between 20% and 29%. Despite an absence of GWAS hits for PFR, LV-PER and LV-PAFR, heritability of these traits was between 6% and 12%.
The pairwise genetic correlation (Figure 3) indicated that genetic variants for SV and PER measurements (both LV and RV) were highly correlated (correlation coefficient close to 1.0), as were genetic variants associated with EDV and ESV traits from both ventricles, and variants for LV-PFR and RV-PFR. LV-EDM had a moderately strong correlation (around 0.70) with SV, PER, ESV, EDV of both ventricles. Finally, variants for LV-MVR, RV-EF, and LV-EF showed a positive correlation among themselves (maximum 0.68), and negative correlation with EDV, ESV, EDM, and SV traits (maximum -0.86).
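For reference, the two quantities reported above can be written in standard variance-component notation. The symbols are assumptions introduced here for illustration and are not taken from the manuscript: \sigma_g^2 and \sigma_e^2 denote the additive genetic and residual variance of a trait, and \sigma_{g,12} the additive genetic covariance between traits 1 and 2.

```latex
h^2 = \frac{\sigma_g^2}{\sigma_g^2 + \sigma_e^2},
\qquad
r_g = \frac{\sigma_{g,12}}{\sqrt{\sigma_{g,1}^2 \, \sigma_{g,2}^2}}
```

Under these definitions h^2 lies between 0 and 1, and r_g lies between -1 and 1, which matches the scale of the estimates quoted in this section.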
Broader phenotypic effects of the CMR genes
Extracting data from the GWAS Catalog (Figure 2, Supplementary Figure 3), we found that genes identified for their association with one or more RV and LV measures were frequently associated with CMR traits from previous studies (e.g., LV dilatation, LV mass, and fractal dimension), with electrocardiographic traits (e.g., PR segment, QRS duration, QT interval), with blood pressure and heart rate, and with plasma concentrations of various apolipoproteins and cholesterol-containing lipoproteins. Several CMR genes were previously associated with cardiac diseases, including AF (SYNPO2L, TBX5, IGF1R, GOSR2, TTN, SCN10A, CDKN1A, MYO18B, KCNH2), hypertrophic cardiomyopathy (HSPB7, SYNPO2L, BAG3, NSF, FHOD3, CDKN1A, SMARCB1), DCM (BAG3, FHOD3, TTN, SMARCB1), HF (SYNPO2L, BAG3), and CHD (ATXN2, ALDH2, PTPN11, GOSR2); see Supplementary Figure 3.
Identifying plasma protein levels with an effect on CMR traits
We initially linked our putative CMR genes to BNF and ChEMBL and identified 18 genes that encoded a druggable protein (Supplementary Tables 4-6). These include genes encoding drug targets for compounds with indications and/or side-effects for AF, HF, CHD, chronic obstructive pulmonary disease (COPD) and diabetes (IGF1R, KCNH2, KCNK3, PDE5A, SCN10A; Supplementary Tables 5-6).
We next expanded this analysis to use drug target MR to directly identify causal plasma proteins for CMR traits, which additionally provides effect directions that can inform the type of drug compound effect (i.e., inhibiting or activating compound effects). Specifically, we identified circulating proteins with a causal effect on RV and/or LV measures by performing a two-sample cis-MR, combining aggregated genetic data on protein levels from three sources (SCALLOP, Framingham, and INTERVAL) with the GWAS discovery analysis undertaken here (see Methods). We found that 304 proteins were associated with at least one CMR trait (Supplementary Figure 4), with the number of associated proteins ranging from 62 for LV-ESV to 33 for LV-PAFR (Figure 4). The Kolmogorov-Smirnov test provided strong evidence that these results were not driven by multiple testing (Supplementary Figure 5).
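One common way to apply a Kolmogorov-Smirnov test in this setting is to compare the empirical distribution of MR p-values against the uniform distribution expected if no protein had an effect. The sketch below illustrates that idea only; the exact procedure used in this analysis is not described in this section, so the function name `ks_uniform`, the asymptotic p-value approximation, and the toy inputs are all assumptions made for illustration.

```python
import math

def ks_uniform(pvalues):
    """One-sample Kolmogorov-Smirnov test of p-values against Uniform(0, 1).

    Hypothetical helper for illustration; returns (D, approximate p-value).
    """
    xs = sorted(pvalues)
    n = len(xs)
    if n == 0:
        raise ValueError("need at least one p-value")
    # D is the largest gap between the empirical CDF and the uniform CDF F(x) = x.
    d_plus = max((i + 1) / n - x for i, x in enumerate(xs))
    d_minus = max(x - i / n for i, x in enumerate(xs))
    d = max(d_plus, d_minus)
    # Asymptotic Kolmogorov tail: P(D > d) ~ 2 * sum_k (-1)^(k-1) exp(-2 k^2 n d^2).
    # This approximation is rough for small n.
    lam_sq = n * d * d
    tail = 2.0 * sum((-1) ** (k - 1) * math.exp(-2.0 * k * k * lam_sq)
                     for k in range(1, 101))
    return d, min(1.0, max(0.0, tail))

# Toy inputs (not study data): an evenly spread set and a set enriched for small values.
null_like = [(i + 0.5) / 200 for i in range(200)]
enriched = [p ** 4 for p in null_like]
```

Under this reading, a small KS p-value for the observed MR p-values would indicate more small p-values than chance alone would produce across the tested proteins.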
Next, we identified I) 18 “drugged” CMR associated proteins that were targeted by a licensed drug compound (Supplementary Figure 6), II) 21 “druggable” proteins which were amenable to small-molecule perturbation or monoclonal antibody inhibition (Supplementary Figure 7), and III) 30 proteins with directionally “concordant” effects on three or more CMR traits which either all had beneficial or detrimental effects (Supplementary Figure 8). The set of concordant proteins contained 25 entries which were not part of the drugged or druggable sets. Through linkage to IntAct data we therefore identified the “nearest” set of druggable proteins, which contained 8 proteins for which we had plasma pQTL data. The nearest druggable proteins were identified by counting the number of protein-protein interactions between the indexing protein and the nearest druggable protein. This resulted in a set of 72 proteins associated with LV and/or RV measurements (18 drugged, 21 druggable, 25 concordant and 8 nearest druggable), which we prioritized further through cis-MR to identify 33 proteins which were involved with the following cardiac outcomes: HF, DCM, non-ischemic CM, AF, and/or CHD.
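Counting interaction steps between an indexing protein and the closest druggable protein amounts to a shortest-path search on the interaction network. A minimal sketch using breadth-first search is given below; it assumes interactions are supplied as undirected pairs, and the function name `nearest_druggable` and the protein labels are hypothetical placeholders rather than IntAct records.

```python
def nearest_druggable(index_protein, interactions, druggable):
    """Return (steps, proteins): the druggable proteins fewest interactions away.

    interactions: iterable of (protein_a, protein_b) pairs, treated as undirected.
    Returns (None, []) if no druggable protein is reachable.
    """
    # Build an undirected adjacency map from the interaction pairs.
    neighbours = {}
    for a, b in interactions:
        neighbours.setdefault(a, set()).add(b)
        neighbours.setdefault(b, set()).add(a)

    seen = {index_protein}
    frontier = [index_protein]
    steps = 0
    # Expand one interaction step at a time and stop at the first level with a hit.
    while frontier:
        steps += 1
        next_frontier = []
        for protein in frontier:
            for partner in sorted(neighbours.get(protein, ())):
                if partner not in seen:
                    seen.add(partner)
                    next_frontier.append(partner)
        hits = sorted(p for p in next_frontier if p in druggable)
        if hits:
            return steps, hits
        frontier = next_frontier
    return None, []

# Toy network (placeholder names, not real interaction data).
toy_interactions = [("IDX", "A"), ("A", "DRUG1"), ("IDX", "B"),
                    ("B", "C"), ("C", "DRUG2")]
toy_druggable = {"DRUG1", "DRUG2"}
```

In the toy network, `nearest_druggable("IDX", toy_interactions, toy_druggable)` reports DRUG1 at two steps, that is, one intermediate protein between the indexing protein and the druggable one.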
Drugged CMR proteins: repurposing opportunity
Nine of the 18 drugged proteins could be associated with cardiac traits through cis-MR (Figure 5): IL6RA, CO6A1, CD33, CAH6, COFA1, TIE2, LAMC2, I17RA, and SLAF7.
CD33, I17RA, and SLAF7 affected HF; CO6A1 and I17RA affected non-ischemic CM; CAH6 and LAMC2 were associated with DCM; IL6RA, CD33, and COFA1 with AF; and finally IL6RA and TIE2 affected CHD. Focussing on the proteins affecting multiple cardiac traits, we found that increased levels of I17RA (Interleukin-17 receptor A) predominantly improved LV function (Supplementary Figure 6) and decreased the risk of HF OR 0.97 (95%CI 0.96; 0.98) and non-ischemic CM OR 0.94 (95%CI 0.92; 0.96). I17RA is targeted by the anti-inflammatory monoclonal antibody (mAb) brodalumab (Figure 6). IL6RA (Interleukin-6 receptor subunit alpha) had a directionally discordant effect on 9 LV and RV traits (Supplementary Figure 6), with increased levels being associated with decreased risk of AF OR 0.95 (95%CI 0.94; 0.96) and CHD OR 0.94 (95%CI 0.93; 0.94). Noting that genetic instruments for IL6R are associated with reduced membrane bound IL611, we found that this is directionally concordant with IL6R-inhibiting compounds such as tocilizumab decreasing cardiovascular risk (Figure 6). CD33 (Myeloid cell surface antigen CD33) was found to reduce LV-EDM and is targeted by mAbs such as gemtuzumab, which are indicated in oncology and have documented cardiovascular side effects (Figure 6). Increased levels of CD33 decreased the risk of HF OR 0.96 (95%CI 0.95; 0.98) and AF OR 0.96 (95%CI 0.89; 1.03). Similarly, SLAF7 (SLAM family member 7) and TIE2 (Angiopoietin-1 receptor) are both inhibited by compounds with an oncological indication with known cardio-metabolic side-effects, and non-oncological indications such as amyloidosis (SLAF7) and CHD (TIE2); Figure 6. Through MR we found that SLAF7 improved RV-EF and RV-PAFR function, but nevertheless increased the risk of HF OR 1.07 (95%CI 1.05; 1.08). TIE2 beneficially affected CMR traits, with an LV-EF effect of 0.43% (95%CI 0.32; 0.55), an RV-ESV effect of -0.68 ml (95%CI -0.89; -0.48), and an RV-PAFR effect of 5.47 ml/s (95%CI 4.15; 6.79), while increasing the risk of CHD OR 1.10 (95%CI 1.06; 1.15).
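The odds ratios above are MR estimates expressed per unit increase in protein level. As a rough illustration of how such an estimate and its 95% CI can be obtained from summary statistics, the sketch below uses a fixed-effect inverse-variance weighted (IVW) estimator and assumes uncorrelated variants. This is a simplification chosen for illustration: the estimator actually used is described in the Methods, not here, and the function names and input values are hypothetical.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate (log-odds scale), assuming independent variants.

    beta_exposure: variant effects on protein level (pQTL effects).
    beta_outcome, se_outcome: variant effects on the outcome and their standard errors.
    """
    if not (len(beta_exposure) == len(beta_outcome) == len(se_outcome) > 0):
        raise ValueError("inputs must be non-empty and of equal length")
    # Weighted regression of outcome effects on exposure effects through the origin.
    numerator = sum(bx * by / se ** 2
                    for bx, by, se in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx ** 2 / se ** 2
                      for bx, se in zip(beta_exposure, se_outcome))
    return numerator / denominator, math.sqrt(1.0 / denominator)

def to_odds_ratio(beta, se, z=1.96):
    """Convert a log-odds estimate to (OR, lower, upper) using a normal-theory 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Toy summary statistics (not study data): both variants imply a ratio of -0.1.
beta, se = ivw_estimate([0.2, 0.4], [-0.02, -0.04], [0.01, 0.01])
odds_ratio, lower, upper = to_odds_ratio(beta, se)
```

With the toy inputs, the estimate is -0.1 on the log-odds scale, which corresponds to an OR just above 0.90 with a CI that excludes 1.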
Druggable CMR proteins: de novo developmental targets
We identified 11 druggable proteins (out of 21 in total) with an effect on a cardiac outcome (Figure 5, Supplementary Figure 7, Supplementary Tables 11-12): TNF12, ICOSL, IL8, TDGF1, LYAM1, PA2GA, TNR5, MK03, MFGM, ERAP2, ERAP1.
PA2GA and MK03 affected HF; TNF12, TDGF1, TNR5, and MFGM affected non-ischemic CM; TNF12, ICOSL, TNR5, and MFGM were associated with DCM; TNF12, IL8, TDGF1, LYAM1, MK03, and ERAP1 were associated with AF; and finally IL8, TDGF1, ERAP2, and ERAP1 with CHD. Focussing on proteins with an effect on multiple cardiac traits, we found that TNF12 (Tumor necrosis factor ligand superfamily member 12) decreased the risk of non-ischemic CM OR 0.82 (95%CI 0.77; 0.88), AF OR 0.90 (95%CI 0.89; 0.91), and DCM OR 0.80 (95%CI 0.75; 0.85). Higher concentration of TNF12 improved LV dimensions but increased LV-EDM. TNF12 is inhibited by two phase 1 compounds indicated for neoplasm and rheumatoid arthritis. Higher levels of IL8 (Interleukin-8) increased LV-EDM, while improving RV-PER, and decreased the risk of HF OR 0.74 (95%CI 0.69; 0.81) and AF OR 0.83 (95%CI 0.77; 0.89), while increasing the risk of CHD OR 1.18 (95%CI 1.11; 1.25) (Figure 5, Supplementary Figure 7, and Supplementary Tables 11-12). IL8 is the target of mAbs in development for treatment of neoplasms and chronic lung disease (Figure 6, Supplementary Table 8). TDGF1 (teratocarcinoma-derived growth factor 1), targeted by the developmental immunoconjugate BIIB015 for treatment of tumours, improved LV and RV cardiac traits (EF, SV, PER, RV-PFR), decreased the risk of CHD and of non-ischemic CM OR 0.93 (95%CI 0.92; 0.94), and increased the risk of AF OR 1.01 (95%CI 1.01; 1.01) (Figure 5, Supplementary Figure 7, and Supplementary Tables 11-12). MK03 (Mitogen-activated protein kinase 3) is inhibited by multiple ERK1/2 kinase compounds for treatment of neoplasms; it was associated with improved RV-ESV and RV-EF, and decreased the risk of HF OR 0.85 (95%CI 0.80; 0.91) and AF OR 0.86 (95%CI 0.82; 0.91). ERAP1 and ERAP2 (Endoplasmic reticulum aminopeptidase 1 and 2, forming a protein complex39) both improved LV and RV CMR measurements (Supplementary Figure 6), and are both inhibited by the same compound, tosedostat (currently in development for oncology).
Higher ERAP1 was associated with an increased risk of non-ischemic CM OR 1.10 (95%CI 1.07; 1.13), and decreased risk of AF OR 0.99 (95%CI 0.98; 0.99) and HF OR 0.98 (95%CI 0.97; 0.98), while higher levels of ERAP2 in turn increased the risk of CHD OR 1.03 (95%CI 1.02; 1.03); Figure 5.
Nearest druggable proteins with directionally concordant CMR effects
Next, we identified 30 proteins with directionally concordant effects on three or more CMR traits (i.e., with all beneficial or all detrimental effects), and mapped these indexing proteins to their (next) nearest druggable protein (Figure 7, Supplementary Figure 8). This resulted in drugged and druggable proteins that either directly interacted with an indexing protein or were separated by at most one protein-protein interaction (Figure 7). Some of these indirectly drugged and druggable proteins had known cardio-metabolic indications and/or side-effects (Figure 7, Supplementary Figure 9). For example, PPAC (low molecular weight phosphotyrosine protein phosphatase) beneficially affected LV-PFR, RV-EDV, and RV-ESV, and while not druggable itself, interacted with eight druggable proteins (Figure 7, Supplementary Figure 7). Six of these PPAC-related proteins (PDE4D, GBRG1, PRS7, VWF, RARA, 5HT1E) were targeted by inhibiting compounds with a recorded cardio-metabolic indication or side effect (Figure 7, Supplementary Figure 9, Supplementary Tables 13-14).
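The concordance criterion can be expressed as a simple filter: keep a protein if it has significant effects on at least three CMR traits and all of those effects point the same way. The sketch below assumes each significant effect has already been labelled as beneficial or detrimental; how that label is derived per trait is not specified in this section, and the function name and protein labels are hypothetical.

```python
def concordant_proteins(effects, min_traits=3):
    """Select proteins whose significant CMR effects are all in one direction.

    effects: {protein: {trait: "beneficial" | "detrimental"}}, listing
    significant associations only. Returns {protein: shared direction}.
    """
    selected = {}
    for protein, by_trait in effects.items():
        directions = set(by_trait.values())
        # Require enough traits and exactly one distinct direction among them.
        if len(by_trait) >= min_traits and len(directions) == 1:
            selected[protein] = next(iter(directions))
    return selected

# Toy input (placeholder proteins and labels, not study results).
toy_effects = {
    "P1": {"LV-EF": "beneficial", "RV-EF": "beneficial", "LV-ESV": "beneficial"},
    "P2": {"LV-EF": "beneficial", "RV-EF": "detrimental", "LV-ESV": "beneficial"},
    "P3": {"LV-EF": "detrimental", "RV-EF": "detrimental"},
    "P4": {"LV-SV": "detrimental", "RV-SV": "detrimental", "LV-EDM": "detrimental"},
}
```

Here P1 and P4 pass the filter, P2 is dropped for mixed directions, and P3 is dropped for having fewer than three associated traits.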
The set of concordant proteins included 25 that were not included in the drugged or druggable set (Supplementary Figure 8), and through identification of the nearest druggable protein we identified an additional 8 druggable proteins with plasma pQTL data (EGFR, FA10, PAI1, MET, LYAM2, SYUA, IL6RB, IL18) that were not included previously. Pruning this combined set of proteins (i.e., the set of directionally concordant proteins and the indirectly drugged or druggable proteins they interacted with) on the presence of cardiac outcome effects (Figure 5) resulted in the following set of prioritized proteins: BAG3, C1QC, and PGLT1 affected HF; BAG3 and PATE4 affected non-ischemic CM; MANBA, NCAM2, BAG3, C1QC, GPC5, and IL18R were associated with DCM; LYAM2, PPAC, and BGH3 were associated with AF; and finally MANBA, UD16, and SPA12 affected CHD.
Focussing on proteins with an effect on multiple cardiac traits, we found that higher concentrations of MANBA improved 5 CMR traits (ESV, EF, LV-PFR), and decreased CHD and DCM risk: OR 0.93 (95%CI 0.91; 0.96) and OR 0.76 (95%CI 0.72; 0.81), respectively. BAG3 (BAG family molecular chaperone regulator 3) improved 6 CMR traits and decreased the risk of HF OR 0.75 (95%CI 0.72; 0.79), non-ischemic CM OR 0.30 (95%CI 0.25; 0.36), and DCM OR 0.14 (95%CI 0.11; 0.17). Higher levels of C1QC (Complement C1q subcomponent subunit C) detrimentally affected 3 CMR traits but nevertheless decreased the risk of HF OR 0.97 (95%CI 0.96; 0.98) and DCM OR 0.86 (95%CI 0.82; 0.90). Higher UD16 (UDP-glucuronosyltransferase 1-6) worsened 4 LV CMR traits, and increased the risk of DCM OR 1.62 (95%CI 1.46; 1.80) and CHD OR 1.06 (95%CI 1.04; 1.08).
Tissue expression and phenome-wide scan of likely on-target clinical effects
We next explored mRNA expression and performed a phenome-wide scan of the anticipated on-target effects of increased protein concentration for the 33 prioritized proteins that affected LV and RV measurements and also had an effect on cardiac outcomes (Figures 5 & 8, Supplementary Figures 10-11).
Tissue specificity did not differ between CMR prioritized proteins and non-prioritized proteins (p-value = 0.20). We did, however, observe a significant difference in tissue-specific expression (p-value = 9.01×10⁻³), with prioritized plasma proteins more frequently showing higher expression in spleen, lymph node, liver, granulocytes, kidney, pancreas, and lung tissues (Supplementary Figure 11).
In addition to the cardiac outcomes on which these proteins were prioritized, the cis-MR phenome-wide scan showed that these proteins were frequently associated with DBP, SBP, ECG measurements during exercise, lipid fractions (HDL-C, Apo-A1, triglycerides, LDL-C, and Apo-B), estimated glomerular filtration rate (eGFR), body mass index (BMI), glycosylated haemoglobin (HbA1c), C-reactive protein, lung function (FEV1, FVC, PEF), and carotid intima-media thickness (cIMT) (Figure 8); protein-specific results are presented in Figure 5, Supplementary Figure 10 and Table 16.