Data sources
Genetic association data on plasma LDL-C, HDL-C and TG concentrations were extracted from GLGC, which released aggregated data (i.e., point estimates and standard errors) for 146,492 East Asian participants and 1,320,016 European participants.
The following outcome data were sourced for the drug target MR analyses to estimate the on-target effects of CETP inhibition (Supplementary Data 1). For individuals of European ancestry, we leveraged data on apolipoprotein (Apo) A1 and B and lipoprotein a (Lp[a]) from 361,194 UK Biobank (UKB) participants, systolic blood pressure (SBP), diastolic blood pressure (DBP) and pulse pressure (PP) from 757,601 participants14, glucose and HbA1c from 196,991 participants15, C-reactive protein (CRP, n = 204,402)16, CHD (60,801 cases)17, any stroke (110,182 cases) and ischemic stroke (86,668 cases)18, HF (47,309 cases)19, T2D (180,834 cases)20, CKD (41,395 cases)21, glaucoma (15,655 cases)22 and subarachnoid haemorrhage (5,140 cases)23. Additional outcome data were sourced from a FinnGen and UKB meta-analysis by Sakaue et al.12 on angina (30,025 cases), ventricular arrhythmia (1,018 cases), PAD (7,114 cases), asthma (38,369 cases), intracerebral haemorrhage (1,935 cases) and pneumonia (16,887 cases), with COPD (58,559 cases) included from the Global Biobank Meta-analysis Initiative (GBMI)24.
The corresponding outcomes in the East Asian participants were accessed through the Pan-ancestry GWAS of the UK Biobank (Pan-UKB)25 on Apo-A1 (n = 2,325), Apo-B (n = 2,553) and Lp[a] (n = 2,275). Additional cardiometabolic biomarker data were sourced from Biobank Japan (BBJ)12 on SBP, DBP, PP, glucose, HbA1c and CRP for between 71,221 and 145,505 participants (Supplementary Data 1). BBJ provided data on CHD (32,512 cases), angina (14,007 cases), PAD (4,112 cases), ischemic stroke (22,664 cases), subarachnoid (1,203 cases) and intracerebral (1,456 cases) haemorrhage, ventricular arrhythmia (1,673 cases), T2D (45,383 cases), CKD (2,117 cases), glaucoma (8,448 cases) and pneumonia (7,423 cases). Finally, the following outcomes were sourced from the East Asian GBMI release: HF (12,665 cases), COPD (19,044 cases), and any stroke (23,345 cases).
Cross-ancestry colocalization of the LDL-C and HDL-C CETP signals
Due to sampling variability as well as linkage disequilibrium (LD), the most significant variant at a given locus may not reflect the causal variant. Colocalization identifies potential shared causal variants between two traits26 while accounting for sampling variability and LD. Owing to the larger sample size available in the European GLGC GWAS, rs183130 (16:g.56991363C>T, GRCh37) has been robustly identified as a causal CETP variant for both LDL-C and HDL-C. We leveraged coloc27 to determine whether this European fine-mapped variant was also causal for LDL-C and HDL-C in East Asian participants. We considered genetic variants within a ±50 kb flank of the CETP genomic region with a MAF ≥ 0.01, applying the following prior probabilities: 10⁻⁴ that a genetic variant was associated with the plasma lipids in Europeans only (H1) or in East Asians only (H2), and 10⁻⁶ that a variant was associated with plasma lipids in both populations (H4) at the CETP locus. A posterior probability for a shared genetic signal (PP.H4) larger than 0.80 was considered as evidence of colocalization26.
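As an illustration of how such posterior probabilities arise, the following is a minimal sketch of a colocalization calculation based on approximate Bayes factors under a single-causal-variant assumption. It is not the coloc package itself. The function names, the input layout (one `(beta, se)` pair per variant, in the same variant order for both populations) and the prior effect-size standard deviation of 0.15 are assumptions made for this example; the prior probabilities of 10⁻⁴ and 10⁻⁶ are those quoted above.

```python
import math


def log_sum_exp(values):
    """Numerically stable log(sum(exp(v))) for a list of log-scale values."""
    if not values:
        return float("-inf")
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def log_abf(beta, se, prior_sd=0.15):
    """Log approximate Bayes factor for one variant (prior_sd is an assumption)."""
    v = se ** 2
    w = prior_sd ** 2
    r = w / (v + w)
    z = beta / se
    return 0.5 * (math.log(1.0 - r) + r * z * z)


def coloc_posteriors(stats_1, stats_2, p1=1e-4, p2=1e-4, p12=1e-6):
    """Posterior probabilities for hypotheses H0..H4.

    stats_1, stats_2: lists of (beta, se), one entry per variant,
    aligned so that index i refers to the same variant in both lists.
    """
    l1 = [log_abf(b, s) for b, s in stats_1]
    l2 = [log_abf(b, s) for b, s in stats_2]
    n = len(l1)
    # H3: two distinct causal variants, so sum over all pairs with i != j.
    distinct = [l1[i] + l2[j] for i in range(n) for j in range(n) if i != j]
    # H4: one shared causal variant, so sum over matching indices.
    shared = [a + b for a, b in zip(l1, l2)]
    log_h = {
        "H0": 0.0,
        "H1": math.log(p1) + log_sum_exp(l1),
        "H2": math.log(p2) + log_sum_exp(l2),
        "H3": math.log(p1) + math.log(p2) + log_sum_exp(distinct),
        "H4": math.log(p12) + log_sum_exp(shared),
    }
    total = log_sum_exp(list(log_h.values()))
    return {h: math.exp(v - total) for h, v in log_h.items()}
```

In this sketch a posterior for H4 above 0.80 would correspond to the colocalization threshold used here; the pairwise sum for H3 is quadratic in the number of variants, which is acceptable for a single locus.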
To visualise the CETP association with plasma lipids across ancestries, we generated regional association plots of CETP using the lipids summary statistics for East Asian and European populations from GLGC. The plots were created with the skyline genomic plotting library (https://gitlab.com/cfinan/skyline) implemented in Python, based on the East Asian and European LD references from UKB.
Mendelian randomization analysis
To proxy the effect of CETP inhibition we capitalised on CETP variants strongly associated with HDL-C in both populations and performed a biomarker weighted drug target MR, by exploring the causal effects of CETP inhibition scaled towards a standard deviation (SD) increase in HDL-C. Despite weighting by an intermediate biomarker, the inference of such a “biomarker” drug target MR analysis is on the protein, not on the potential causality of the intermediate biomarker (Supplementary Methods and Supplementary Fig. 1).
To identify instruments for CETP inhibition, weighted by HDL-C, genetic variants within ±50 kb of the CETP gene (Chr 16:56,995,762–57,017,757, GRCh37) were selected in both populations, based on an F-statistic of at least 15 and a MAF ≥ 0.01, and were LD-clumped to an r-squared < 0.3 against their respective reference populations. Ancestry specific LD reference matrices were generated by selecting a random subset of 5,000 unrelated Europeans, and the entire subset of East Asians (n = 2,000) from UKB. The self-defined East Asian individuals were assigned to the East Asian ancestry group based on principal component analysis, implemented with PC-AiR for the detection of population structure, followed by PC-Relate to account for cryptic relatedness28, as described by Giannakopoulou et al.29.
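The selection rules above can be sketched as a filter followed by greedy clumping. This is an illustrative sketch only, not the pipeline used in the analysis. It assumes the F-statistic of a single variant is approximated by (beta/se)², that variants are ranked by F-statistic before clumping, and that pairwise r-squared values are supplied in a hypothetical dictionary keyed by variant pairs, with missing pairs treated as uncorrelated.

```python
def f_statistic(beta, se):
    """Per-variant F-statistic, approximated as the squared z-score."""
    return (beta / se) ** 2


def select_instruments(variants, ld_r2, f_min=15.0, maf_min=0.01, r2_max=0.3):
    """Return the ids of variants passing the F, MAF and LD-clumping rules.

    variants: list of dicts with keys "id", "beta", "se", "maf".
    ld_r2: dict mapping frozenset({id_a, id_b}) to r-squared
           (hypothetical layout; absent pairs are assumed to have r2 = 0).
    """
    eligible = [
        v for v in variants
        if v["maf"] >= maf_min and f_statistic(v["beta"], v["se"]) >= f_min
    ]
    # Strongest instruments first, so clumping keeps the lead variant.
    eligible.sort(key=lambda v: f_statistic(v["beta"], v["se"]), reverse=True)
    kept = []
    for v in eligible:
        in_ld = any(
            ld_r2.get(frozenset((v["id"], k["id"])), 0.0) >= r2_max
            for k in kept
        )
        if not in_ld:
            kept.append(v)
    return [v["id"] for v in kept]
```

Because the r-squared threshold of 0.3 leaves residual correlation between the retained variants, the downstream estimators need to account for LD, as described next.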
Residual LD was modelled through generalised least squares30 implementations of the inverse variance weighted (IVW) and MR-Egger estimators, where the MR-Egger estimator is more robust to the presence of potential horizontal pleiotropy31. To further minimise the potential influence of horizontal pleiotropy, we excluded variants with large leverage or outlier statistics and used the Q-statistic to identify possible remaining violations32. A model selection framework was applied to select the most appropriate estimator between IVW and MR-Egger for each specific exposure-outcome relationship32,33. This model selection framework, originally developed by Gerta Rücker34, utilises the difference in heterogeneity between the IVW Q-statistic and the Egger Q-statistic, preferring the latter model when the difference is larger than 3.84 (i.e., the 95% quantile of a Chi-square distribution with 1 degree of freedom). The results were reported as odds ratios (OR) or mean differences (MD) with 95% confidence intervals.
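The model selection rule can be sketched as follows. For brevity this example treats the variants as independent and uses ordinary weighted least squares with weights 1/se², which is a simplification of the generalised least squares estimators described above (those additionally incorporate the LD correlation between variants). The function names and the returned dictionary are assumptions made for this sketch, and variant effects are assumed to be already oriented to the exposure-increasing allele.

```python
def weighted_fit(x, y, se_y, intercept):
    """Weighted regression of outcome effects (y) on exposure effects (x).

    Returns (slope, intercept, Q), where Q is the weighted residual
    sum of squares. Weights are 1 / se_y**2 (independent variants).
    """
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    swx = sum(wi * xi for wi, xi in zip(w, x))
    swy = sum(wi * yi for wi, yi in zip(w, y))
    swxx = sum(wi * xi * xi for wi, xi in zip(w, x))
    swxy = sum(wi * xi * yi for wi, xi, yi in zip(w, x, y))
    if intercept:
        # MR-Egger: slope and intercept are both estimated.
        slope = (sw * swxy - swx * swy) / (sw * swxx - swx ** 2)
        icpt = (swy - slope * swx) / sw
    else:
        # IVW: regression line is forced through the origin.
        slope = swxy / swxx
        icpt = 0.0
    q = sum(wi * (yi - icpt - slope * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    return slope, icpt, q


def select_model(x, y, se_y, threshold=3.84):
    """Prefer MR-Egger when Q_IVW - Q_Egger exceeds the threshold."""
    ivw_slope, _, q_ivw = weighted_fit(x, y, se_y, intercept=False)
    egger_slope, egger_icpt, q_egger = weighted_fit(x, y, se_y, intercept=True)
    use_egger = (q_ivw - q_egger) > threshold
    return {
        "model": "MR-Egger" if use_egger else "IVW",
        "estimate": egger_slope if use_egger else ivw_slope,
        "egger_intercept": egger_icpt,
        "q_ivw": q_ivw,
        "q_egger": q_egger,
    }
```

The intuition is that adding the Egger intercept costs one degree of freedom, so the drop in heterogeneity is compared against a Chi-square distribution with 1 degree of freedom.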
Interaction test
Potential differences between European and East Asian participants in the drug target MR effects of on-target CETP inhibition were formally tested using interaction tests35. Briefly, an interaction effect represents the difference between the ancestry specific MR effects, where the standard error of this difference is equal to the square root of the sum of the variances of the ancestry specific effect estimates. For binary outcomes, where the ancestry specific effect represents an OR instead of a difference, the interaction effect was calculated as the ratio between the European and East Asian ancestry specific ORs (i.e., representing a difference on the logarithmic scale). We additionally applied the interaction test to assess the difference in CETP effects between the East Asian population in our MR study and a previous MR analysis conducted by Millwood et al. in China Kadoorie Biobank.
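A minimal sketch of this interaction test is given below. The function name and return layout are hypothetical, and the two-sided p-value assumes the difference is approximately normally distributed, which the text does not state explicitly. For binary outcomes the inputs would be log ORs, and the exponentiated difference gives the ratio of ORs.

```python
import math


def interaction_test(est_eur, se_eur, est_eas, se_eas, binary=False):
    """Test the difference between two ancestry specific MR estimates.

    For binary outcomes, est_* are log odds ratios and the reported
    effect is the ratio of ORs (European over East Asian).
    """
    diff = est_eur - est_eas
    # SE of a difference of independent estimates.
    se_diff = math.sqrt(se_eur ** 2 + se_eas ** 2)
    z = diff / se_diff
    # Two-sided p-value under a normal approximation (assumption).
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    effect = math.exp(diff) if binary else diff
    return {"effect": effect, "se": se_diff, "z": z, "p": p_value}
```

For example, `interaction_test(math.log(0.90), 0.03, math.log(0.90), 0.04, binary=True)` would return a ratio of ORs of 1, reflecting identical effects in both ancestries.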
Multiple testing
The focus of the presented analysis was evaluation of potential differential effects of CETP inhibition between participants of East Asian and European descent. To guard against multiplicity, interaction tests were evaluated against a corrected alpha of 0.05/26 = 1.9×10⁻³, accounting for the 26 evaluated traits. Similarly, when comparing against the previous analysis by Millwood et al., we had 14 traits in common, resulting in a multiplicity corrected interaction alpha of 0.05/14 = 3.6×10⁻³. We did not apply a similar multiple testing corrected alpha for the ancestry specific findings, and instead focussed on outcomes significant in both ancestries. Focussing on replicated associations resulted in an alpha of 0.05² = 0.0025, and an expected number of false positive results close to zero: 26 × 0.05² = 0.065.
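The thresholds quoted above can be reproduced with a few lines of arithmetic. The variable names are chosen for this sketch, and the replicated-association calculation assumes that the two ancestry specific tests are independent and that every tested null hypothesis is true.

```python
ALPHA = 0.05
N_TRAITS = 26          # traits evaluated in both ancestries
N_TRAITS_MILLWOOD = 14  # traits shared with the earlier analysis

# Bonferroni-corrected thresholds for the interaction tests.
alpha_interaction = ALPHA / N_TRAITS
alpha_millwood = ALPHA / N_TRAITS_MILLWOOD

# Requiring significance in both ancestries: the chance that two
# independent tests both pass at 0.05 under the null is 0.05 squared.
alpha_replicated = ALPHA ** 2
expected_false_positives = N_TRAITS * alpha_replicated

print(f"{alpha_interaction:.1e} {alpha_millwood:.1e}")
print(f"{alpha_replicated:.4f} {expected_false_positives:.3f}")
```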
Inferential consideration in biomarker weighted drug target MR
As detailed in Schmidt et al. 202036, Schmidt et al. 202237, and described in the Supplementary Methods, the inference in biomarker weighted drug target MR is on the drug target itself, not on the downstream biomarker (e.g., HDL-C). Furthermore, the biomarker does not need to cause disease if the drug target affects disease through alternative pathways (i.e., post-translational horizontal pleiotropy) (Supplementary Fig. 1). We further expand these derivations to show that the biomarker weighted drug target MR will approximate an interaction test of the difference in protein effects only when the protein effect on the biomarker is equal in both populations (Supplementary Methods). Alternatively, assuming directional concordance of the protein effect on the biomarker, more robust inference will be obtained by applying interaction testing to identify directionally discordant outcome effects.