We combined information from publicly accessible GWAS summary statistics of European ancestry in a two-sample MR setting.
BMI and waist circumference: The summary statistics for BMI and waist circumference were obtained from the UK Biobank, from 461,460 and 462,166 individuals respectively. Measures from the GIANT consortium were also included to replicate the estimates obtained with the UK Biobank. Summary statistics for BMI were obtained from a meta-analysis of up to 125 GWAS for 339,224 European individuals (Locke et al. 2015). Summary statistics for waist circumference were obtained from a meta-analysis of 232,101 individuals (Shungin et al. 2015). Measures of BMI and waist circumference were self-reported or measured in a laboratory or healthcare setting. Measures were corrected for age, age squared, sex, principal components and study sites. The resulting residuals were transformed to approximate normality with an SD of 1 using inverse normal scores.
WHR adjusted for BMI: WHR adjusted for BMI was calculated as the ratio of waist and hip circumferences adjusted for BMI in 210,088 individuals from the GIANT consortium (Shungin et al. 2015).
Non-alcoholic fatty liver disease: Genetic association estimates for a clinical diagnosis of NAFLD were obtained from a recent GWAS (8,434 cases and 770,180 controls) of European ancestry from four cohorts (Ghodsian, 2021). Briefly, we performed a fixed-effect GWAS meta-analysis of the Electronic Medical Records and Genomics (eMERGE) network (Namjou et al. 2019), the UK Biobank, the Estonian Biobank and FinnGen using the METAL package (Willer, Li, and Abecasis 2010). NAFLD was defined using electronic health record codes or hospital records. Logistic regression analysis was performed with adjustment for age, sex, genotyping site and the first three ancestry-based principal components.
Coronary artery disease: GWAS summary statistics for CAD were obtained from a GWAS of 122,733 cases and 424,528 controls from CARDIoGRAMplusC4D and the UK Biobank (van der Harst and Verweij 2018). Samples from CARDIoGRAMplusC4D were drawn from a mixed population (European, East Asian, South Asian, Hispanic and African American), with the majority (77%) of the participants of European ancestry. Case status was defined by CAD diagnosis, including myocardial infarction, acute coronary syndrome, chronic stable angina or coronary stenosis. As a replication dataset, we also used GWAS summary statistics from CARDIoGRAMplusC4D excluding the UK Biobank (60,801 CAD cases and 123,504 controls) (Nikpay et al. 2015).
Type 2 diabetes: GWAS summary statistics for type 2 diabetes (T2D) were obtained from the DIAbetes Genetics Replication and Meta-analysis (DIAGRAM) consortium and the UK Biobank (74,124 cases and 824,006 controls) (Mahajan et al. 2018). Case status was defined by electronic health records, self-reports, or laboratory-derived clinical diagnostics of T2D. As a replication dataset, we also used GWAS summary statistics from the DIAGRAM consortium excluding the UK Biobank (26,676 T2D cases and 132,532 controls) (Scott et al. 2017).
Several of the GWAS used for our exposures and outcomes included the UK Biobank, which leads to sample overlap. In univariable MR, sample overlap will bias the estimated results towards the null only when weak instruments are present. In multivariable MR (MVMR), the direction of the bias is unclear, but bias will occur only in the presence of weak instruments (Sanderson and Windmeijer 2016). We included the UK Biobank in our primary MR analysis to increase power, and performed sensitivity analyses excluding the UK Biobank to remove sample overlap.
Selection of genetic variants and variant harmonization
For univariable MR analysis, we selected all genome-wide significant SNPs (p-value < 5e-8). We then ensured the independence of genetic instruments by clumping SNPs in a 10 Mb window at a linkage disequilibrium threshold of R2 < 0.001, using the European 1000 Genomes LD reference panel. SNPs and relevant association statistics can be found for each exposure in Supplementary Table 4. For multivariable MR analyses, we first extracted all genetic instruments that were previously selected for univariable MR analysis. We then pooled these SNPs and clumped them on the lowest p-value corresponding to any of the exposures, using the same parameter settings as in univariable MR (R2 = 0.001, window = 10 Mb). We also included results from two sensitivity analyses: 1) prioritizing variants with the lowest p-value for BMI; 2) prioritizing SNPs with the lowest p-value for waist circumference. When NAFLD was used as an exposure in MVMR, we clumped the combined list of SNPs by selecting the SNP with the lowest p-value for NAFLD. This procedure was implemented to retain a maximum number of strong genetic instruments, as fewer genetic instruments are available for NAFLD. SNPs in a 2 Mb window of the HLA, ABO and APOE genetic regions were excluded due to their complex genetic architecture and their widespread pleiotropy. Excluding pleiotropic genetic regions makes violations of MR assumptions two and three less likely and strengthens the inference of MR analyses. Harmonization was performed by aligning the effect sizes of different studies on the same effect allele. All GWAS summary statistics were reported on the forward strand. When a particular SNP was not present in the outcome datasets, we used a proxy SNP (R2 > 0.6) obtained using the linkage disequilibrium matrix of European samples from the 1000 Genomes Project.
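The harmonization step described above can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function name `harmonize`, the dictionary layout and the example SNP identifiers and effect sizes are all hypothetical. It assumes, as stated above, that all alleles are reported on the forward strand, so no strand flipping is attempted.

```python
def harmonize(exposure, outcome):
    """Align outcome effects to the exposure effect allele.

    Both inputs map a SNP id to (effect_allele, other_allele, beta).
    Returns a dict mapping SNP id to (beta_exposure, beta_outcome).
    """
    aligned = {}
    for snp, (ea_x, oa_x, beta_x) in exposure.items():
        if snp not in outcome:
            continue  # a proxy SNP lookup would happen here; omitted in this sketch
        ea_y, oa_y, beta_y = outcome[snp]
        if (ea_x, oa_x) == (ea_y, oa_y):
            aligned[snp] = (beta_x, beta_y)   # same effect allele: keep as is
        elif (ea_x, oa_x) == (oa_y, ea_y):
            aligned[snp] = (beta_x, -beta_y)  # alleles swapped: flip the sign
        # any other allele combination is incompatible and the SNP is dropped
    return aligned


# Hypothetical example data (not taken from the study).
exposure = {
    "rs1": ("A", "G", 0.10),
    "rs2": ("C", "T", 0.20),
    "rs3": ("A", "C", 0.30),
}
outcome = {
    "rs1": ("A", "G", 0.05),
    "rs2": ("T", "C", 0.04),  # reported on the other allele
    "rs3": ("G", "T", 0.01),  # alleles do not match the exposure
}
aligned = harmonize(exposure, outcome)
```

In this toy input, `rs2` has its outcome effect sign-flipped and `rs3` is dropped because its alleles cannot be reconciled.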
Instrument strength was quantified using the F-statistic (Stephen Burgess, Thompson, and CRP CHD Genetics Collaboration 2011), and the variance explained was quantified using the R2 (Pierce, Ahsan, and VanderWeele 2011). These statistics can be found in Supplementary Table 5.
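As an illustration of these two quantities, the sketch below uses commonly applied approximations. The exact expressions used in the study are not given in the text, so both formulas are assumptions: variance explained per SNP is taken as 2 x EAF x (1 - EAF) x beta^2 for an exposure in SD units, and the F-statistic as R2 / (1 - R2) x (N - k - 1) / k for k instruments. The function names and the example SNP values are hypothetical.

```python
def snp_variance_explained(beta, eaf):
    """Approximate R2 for one SNP, assuming beta is in SD units of the exposure."""
    return 2.0 * eaf * (1.0 - eaf) * beta ** 2


def f_statistic(r2, n, k=1):
    """F-statistic for k instruments explaining a fraction r2 of variance in n samples."""
    return (r2 / (1.0 - r2)) * ((n - k - 1) / k)


# Hypothetical SNP: 0.05 SD per effect allele, effect allele frequency 0.30,
# evaluated at the UK Biobank BMI sample size quoted in the text.
r2 = snp_variance_explained(beta=0.05, eaf=0.30)
f_stat = f_statistic(r2, n=461460)
```

An F-statistic above 10 is the conventional rule of thumb for an instrument that is not weak.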
For the primary univariable MR analysis, we used the inverse-variance weighted (IVW) method with multiplicative random effects and a standard error correction for underdispersion (Stephen Burgess, Foley, and Zuber 2018). MR must respect three core assumptions (relevance, independence and exclusion restriction) for correct causal inference. MR estimates are biased if the genetic instruments influence several traits on different causal pathways. This phenomenon, referred to as horizontal pleiotropy, can be mitigated by using multiple genetic variants combined with robust MR methods (Slob and Burgess 2020). To verify whether pleiotropy likely influenced the primary MR results, we performed six different robust MR analyses: MR-Egger (Bowden, Davey Smith, and Burgess 2015), the MR-Robust Adjusted Profile Score (MR-RAPS) (Zhao et al. 2018), the contamination mixture (Stephen Burgess et al. 2020), the weighted median, the weighted mode and MR-PRESSO (Verbanck et al. 2018), each making a different assumption about the underlying nature of the pleiotropy. Consistent estimates across methods provide further support for the causal links. All continuous exposure estimates were normalized and reported on a standard deviation scale. For dichotomous traits (i.e., disease status for NAFLD, T2D and CAD), odds ratios were reported.
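The IVW estimator with multiplicative random effects can be sketched as a weighted regression through the origin of the SNP-outcome effects on the SNP-exposure effects. The code below is a minimal illustration under that standard formulation, not the study's implementation: the function name `ivw_multiplicative_random` and the example effect sizes are hypothetical. The correction for underdispersion is implemented by never letting the residual standard error scale the standard error below its fixed-effect value.

```python
import math


def ivw_multiplicative_random(beta_x, beta_y, se_y):
    """Return (estimate, standard_error) for IVW with multiplicative random effects.

    beta_x: SNP-exposure effects; beta_y: SNP-outcome effects;
    se_y: standard errors of the SNP-outcome effects. Needs at least two SNPs.
    """
    weights = [1.0 / s ** 2 for s in se_y]
    sum_wxx = sum(w * bx * bx for w, bx in zip(weights, beta_x))
    sum_wxy = sum(w * bx * by for w, bx, by in zip(weights, beta_x, beta_y))
    estimate = sum_wxy / sum_wxx

    # Cochran's Q: weighted sum of squared residuals around the fitted line.
    q = sum(w * (by - estimate * bx) ** 2
            for w, bx, by in zip(weights, beta_x, beta_y))
    residual_se = math.sqrt(q / (len(beta_x) - 1))

    # Underdispersion correction: the scaling factor is never below 1.
    se = math.sqrt(1.0 / sum_wxx) * max(1.0, residual_se)
    return estimate, se


# Hypothetical effect sizes chosen so that the causal estimate is 0.5.
estimate, se = ivw_multiplicative_random(
    beta_x=[0.1, 0.2, 0.3],
    beta_y=[0.05, 0.10, 0.15],
    se_y=[0.01, 0.01, 0.01],
)
odds_ratio = math.exp(estimate)  # for a binary outcome analysed on the log-odds scale
```

For a dichotomous outcome, exponentiating the estimate gives the odds ratio per SD increase in the exposure, which is the scale on which results are reported.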
For the primary multivariable MR analysis, we used the IVW method (S. Burgess and Thompson 2015). The use of multivariable MR is analogous to the inclusion of measured covariates in a multivariable linear regression. Multivariable MR uses a set of overlapping genetic instruments to estimate the direct effect of an exposure on an outcome. As robust MVMR analyses, we used multivariable MR-Egger (Rees, Wood, and Burgess 2017), the multivariable median method, and the multivariable MR-Lasso method (Grant and Burgess 2021). As with the robust univariable MR analyses, each method makes different assumptions about the underlying nature of the pleiotropy, and consistent estimates give confidence in the robustness of the causal findings. Multivariable MR analyses were performed using the MendelianRandomization V.0.5.1 package (Yavorska and Burgess 2017). Mediation analyses were performed using the formula () where θ2 is the direct effect estimated with IVW-MVMR and θt is the total effect estimated with univariable IVW-MR (Stephen Burgess et al. 2017).
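The regression analogy can be made concrete with a small sketch of the multivariable IVW point estimates for two exposures: the SNP-outcome effects are regressed on both sets of SNP-exposure effects, without an intercept, using inverse-variance weights. This is an illustration under that standard formulation and not the MendelianRandomization package code; the function name `mvmr_ivw_two_exposures` and the example effect sizes are hypothetical, and standard errors are omitted for brevity.

```python
def mvmr_ivw_two_exposures(beta_x1, beta_x2, beta_y, se_y):
    """Return the direct-effect estimates (theta_1, theta_2) for two exposures.

    Solves the 2x2 weighted normal equations of a no-intercept regression
    of beta_y on beta_x1 and beta_x2 with weights 1 / se_y**2.
    """
    w = [1.0 / s ** 2 for s in se_y]
    s11 = sum(wi * a * a for wi, a in zip(w, beta_x1))
    s22 = sum(wi * b * b for wi, b in zip(w, beta_x2))
    s12 = sum(wi * a * b for wi, a, b in zip(w, beta_x1, beta_x2))
    t1 = sum(wi * a * y for wi, a, y in zip(w, beta_x1, beta_y))
    t2 = sum(wi * b * y for wi, b, y in zip(w, beta_x2, beta_y))

    det = s11 * s22 - s12 * s12
    if det == 0:
        raise ValueError("exposure effects are collinear; direct effects are not identified")
    theta_1 = (s22 * t1 - s12 * t2) / det
    theta_2 = (s11 * t2 - s12 * t1) / det
    return theta_1, theta_2


# Hypothetical SNP effects; the outcome is built with direct effects 0.3 and 0.2.
beta_x1 = [0.10, 0.20, 0.05, 0.30]
beta_x2 = [0.05, 0.01, 0.20, 0.10]
beta_y = [0.3 * a + 0.2 * b for a, b in zip(beta_x1, beta_x2)]
theta_1, theta_2 = mvmr_ivw_two_exposures(beta_x1, beta_x2, beta_y, se_y=[0.01] * 4)
```

Each coefficient is the effect of one exposure on the outcome holding the other exposure fixed, which is why it is interpreted as a direct effect rather than a total effect.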
Institutional review board approval
All GWAS summary statistics were publicly available and accessible through URL. For all included genetic association studies, all participants provided informed consent and study protocols were approved by their respective local ethics committees.