A two-sample MR study was designed to investigate the causal association between BW and the risk of AF (Figure 1). This method rests on three key assumptions. First, the genetic instrumental variables, i.e., SNPs, should be strongly associated with BW. Second, the instrumental variables should be independent of confounders that may affect the association between BW and the risk of AF. Third, the instrumental variables should be associated with the risk of AF only via BW.
The exposure in this study was genetically predicted BW, in standard deviations (SDs). The SNPs that proxied for BW were extracted from the largest genome-wide association study (GWAS) meta-analysis on BW to date, which used data from the Early Growth Genetics (EGG) Consortium and the UK Biobank (n = 321,223). This trans-ethnic meta-analysis (92.8% of participants were of European ancestry) consisted of three components: (i) 80,745 individuals of European ancestry from 35 studies within the EGG Consortium, (ii) 12,948 individuals of diverse (non-European) ancestries from 9 studies within the EGG Consortium, and (iii) 227,530 individuals of all ancestries from the UK Biobank (Table S1). The data on BW were collected in heterogeneous ways (i.e., measured at delivery, obtained from birth records and medical registries, obtained from parental interviews, or self-reported in adulthood).
The outcome in this study was AF. Summary statistics on the associations of SNPs with AF were derived from a recently published GWAS (n = 1,030,836). This GWAS, the largest on AF to date, analyzed a total of 34,740,186 genotyped SNPs in up to 60,620 cases and 970,216 controls from 6 sources: the Nord-Trøndelag Health Study (HUNT), deCODE, the Michigan Genomics Initiative (MGI), DiscovEHR, UK Biobank, and the AFGen Consortium (Table S2). The majority (98.6%) of the individuals were of European ancestry. AF was mainly diagnosed according to the International Classification of Diseases (ICD-9 and ICD-10).
Selection and validation of instrumental variables (SNPs)
To ensure a close relationship between the genetic instrumental variables and BW, SNPs were identified at the genome-wide significance level (p-value < 5×10⁻⁸) from the corresponding GWAS summary dataset. To check for correlations between SNPs, pairwise linkage disequilibrium (LD) was calculated using LDlink (https://ldlink.nci.nih.gov/) based on European populations. When r² > 0.001, only the SNP with the lower p-value was retained. In addition, the effects of the SNPs on AF were obtained from the corresponding dataset. If a specified SNP was not available for AF, a highly correlated SNP (r² > 0.8) was selected as a proxy. Any palindromic SNPs were removed from our analysis. Additionally, the known effects of the SNPs on other traits were checked in PhenoScanner (http://www.phenoscanner.medschl.cam.ac.uk). SNPs associated with confounders at the genome-wide significance level (p-value < 5×10⁻⁸) were dropped. Finally, the F statistic was calculated for each SNP to determine whether the SNP was a valid, i.e., sufficiently strong, instrument (F > 10).
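The locally computable parts of this selection procedure can be illustrated with a minimal Python sketch. It is not the authors' pipeline (the analyses were run in R), and it covers only the significance threshold, the palindromic-SNP exclusion, and the F > 10 check; LD pruning, proxy lookup, and PhenoScanner screening depend on external resources and are left out. The `Snp` record, its field names, and the example values are hypothetical. The per-SNP F statistic is approximated here as (beta/SE)², a common approximation that the text does not itself specify.

```python
from dataclasses import dataclass

GWAS_P_THRESHOLD = 5e-8  # genome-wide significance level from the text
MIN_F = 10.0             # instrument-strength cutoff from the text


@dataclass
class Snp:
    """Hypothetical record of one SNP's association with BW."""
    rsid: str
    beta_bw: float
    se_bw: float
    p_bw: float
    effect_allele: str
    other_allele: str


def is_palindromic(allele_1: str, allele_2: str) -> bool:
    """A/T and C/G SNPs read the same on both strands."""
    return {allele_1.upper(), allele_2.upper()} in ({"A", "T"}, {"C", "G"})


def f_statistic(beta: float, se: float) -> float:
    """Assumed approximation: F = (beta / SE)^2 for a single SNP."""
    return (beta / se) ** 2


def select_instruments(snps):
    """Keep SNPs that pass the p-value, palindrome, and F-statistic filters."""
    kept = []
    for snp in snps:
        if snp.p_bw >= GWAS_P_THRESHOLD:
            continue  # not genome-wide significant
        if is_palindromic(snp.effect_allele, snp.other_allele):
            continue  # strand-ambiguous SNP
        if f_statistic(snp.beta_bw, snp.se_bw) <= MIN_F:
            continue  # weak instrument
        kept.append(snp)
    return kept


# Made-up example values, for illustration only.
example = [
    Snp("rs_a", 0.05, 0.005, 1e-20, "A", "G"),  # passes all filters
    Snp("rs_b", 0.05, 0.005, 1e-20, "A", "T"),  # palindromic
    Snp("rs_c", 0.01, 0.005, 0.04, "C", "T"),   # not significant
]
selected = [snp.rsid for snp in select_instruments(example)]
```

In this sketch the filters are applied independently per SNP, so their order does not change which SNPs are retained.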
Primary MR analysis
The two-sample MR method was employed to evaluate the causal association between BW and AF in this study. Specifically, the causal effect of each SNP was estimated using the Wald estimator, and the corresponding standard error was calculated using the delta method. The inverse variance-weighted (IVW) method with fixed effects was used to meta-analyze the Wald ratios as our primary analysis. Results were presented as odds ratios (ORs) with 95% confidence intervals (CIs) for AF per SD increase in BW. The association of each SNP with BW was further plotted against its effect on AF.
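As an illustration of the estimators named above, the following Python sketch computes a Wald ratio with a delta-method standard error and combines the ratios by fixed-effect IVW. It is a sketch under stated assumptions rather than the code used in this study: the delta-method variance includes the uncertainty in the SNP-BW estimate and assumes zero covariance between the SNP-BW and SNP-AF estimates, and all function names are hypothetical.

```python
import math


def wald_ratio(beta_bw, se_bw, beta_af, se_af):
    """Per-SNP causal estimate (log OR per SD of BW) and its SE.

    Delta-method variance, assuming the two estimates are uncorrelated:
    var = se_af^2 / beta_bw^2 + beta_af^2 * se_bw^2 / beta_bw^4
    """
    theta = beta_af / beta_bw
    variance = se_af**2 / beta_bw**2 + (beta_af**2 * se_bw**2) / beta_bw**4
    return theta, math.sqrt(variance)


def ivw_fixed(ratios):
    """Fixed-effect inverse variance-weighted mean of (theta, se) pairs."""
    weights = [1.0 / se**2 for _, se in ratios]
    total = sum(weights)
    theta = sum(w * t for w, (t, _) in zip(weights, ratios)) / total
    return theta, math.sqrt(1.0 / total)


def odds_ratio_ci(theta, se, z=1.96):
    """Convert a log-odds estimate to an OR with a 95% CI."""
    return math.exp(theta), math.exp(theta - z * se), math.exp(theta + z * se)


# Made-up summary statistics: (beta_bw, se_bw, beta_af, se_af) per SNP.
snps = [(0.10, 0.01, 0.020, 0.010), (0.08, 0.01, 0.024, 0.012)]
ratios = [wald_ratio(*s) for s in snps]
theta_ivw, se_ivw = ivw_fixed(ratios)
or_ivw, ci_low, ci_high = odds_ratio_ci(theta_ivw, se_ivw)
```

Because the SNP-AF associations are on the log-odds scale, exponentiating the pooled estimate gives the OR of AF per SD increase in genetically predicted BW.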
In addition to searching the PhenoScanner database, MR-Egger regression was performed to evaluate potential directional pleiotropy. In MR-Egger regression, the intercept represents the estimated average horizontal pleiotropic effect across SNPs. When the p-value of the intercept was larger than 0.05, there was considered to be no evidence of directional horizontal pleiotropy. The slope was interpreted as an unbiased estimate of the causal effect of BW on AF even if all the SNPs were invalid (i.e., the intercept differed significantly from zero). However, the slope estimate relied on an additional assumption known as the Instrument Strength Independent of Direct Effect (InSIDE) assumption, a violation of which could also bias the estimate. Moreover, MR-Egger regression is statistically inefficient and was expected to have considerably larger standard errors (SEs) than the other analyses. As such, we mainly focused on whether the intercept test suggested evidence of potential horizontal pleiotropy. Subsequently, a funnel plot was generated to visually inspect for pleiotropy, in which symmetry provided evidence against directional pleiotropy.
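The intercept test described above can be sketched as a weighted linear regression of the SNP-AF effects on the SNP-BW effects, with an intercept and with weights equal to the inverse variance of the SNP-AF effects. The Python sketch below is illustrative only and makes several assumptions that the text does not state: SNPs are oriented so that the SNP-BW effect is positive, the residual dispersion is constrained to be at least 1, and the intercept p-value uses a normal approximation rather than a t-distribution. The function name and return keys are hypothetical.

```python
import math


def mr_egger(beta_bw, beta_af, se_af):
    """Weighted regression of beta_af on beta_bw with an intercept."""
    n = len(beta_bw)
    if n < 3:
        raise ValueError("MR-Egger needs at least 3 SNPs")
    # Orient every SNP so that its effect on BW is positive.
    x = [abs(b) for b in beta_bw]
    y = [a if b >= 0 else -a for b, a in zip(beta_bw, beta_af)]
    w = [1.0 / s**2 for s in se_af]
    sw = sum(w)
    mean_x = sum(wi * xi for wi, xi in zip(w, x)) / sw
    mean_y = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mean_x) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mean_x) * (yi - mean_y) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Residual dispersion, floored at 1 (assumption of this sketch).
    rss = sum(wi * (yi - intercept - slope * xi) ** 2 for wi, xi, yi in zip(w, x, y))
    dispersion = max(1.0, rss / (n - 2))
    se_slope = math.sqrt(dispersion / sxx)
    se_intercept = math.sqrt(dispersion * (1.0 / sw + mean_x**2 / sxx))
    # Two-sided p-value from a normal approximation (assumption).
    z = intercept / se_intercept
    p_intercept = math.erfc(abs(z) / math.sqrt(2.0))
    return {
        "slope": slope,
        "se_slope": se_slope,
        "intercept": intercept,
        "se_intercept": se_intercept,
        "p_intercept": p_intercept,
    }


# Made-up effects lying on the line beta_af = 0.01 + 0.5 * beta_bw.
egger = mr_egger([0.1, 0.2, 0.3, 0.4], [0.06, 0.11, 0.16, 0.21], [0.05] * 4)
```

An intercept p-value above 0.05 in such a fit would be read, as in the text, as no evidence of directional pleiotropy, not as proof of its absence.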
In the follow-up sensitivity analyses, the IVW with multiplicative random effects, penalized IVW, robust IVW, penalized robust IVW, maximum likelihood, simple median, weighted median, and Mendelian Randomization Pleiotropy Residual Sum and Outlier (MR-PRESSO) methods were applied to test the robustness of our primary analysis. These methods are more robust to potential heterogeneity and pleiotropy among SNPs. In comparison with the fixed-effect IVW, the SE in the random-effects IVW was expected to be larger when there was heterogeneity across SNPs. The penalized and robust methods could improve the robustness of the estimates in the presence of heterogeneity and outliers. The weighted median method can generate a consistent estimate of the causal effect if valid SNPs contribute more than 50% of the weight. The MR-PRESSO method can detect and correct for outliers and thereby provide a robust estimate. Subsequently, a leave-one-out analysis was conducted to determine whether the estimated causal effect was disproportionately affected by a single SNP.
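Three of these sensitivity analyses are simple enough to sketch in a few lines of Python: the multiplicative random-effects IVW, the weighted median, and the leave-one-out analysis. The code is a hedged illustration, not the implementation used in this study. It assumes that per-SNP ratio estimates and their SEs are already available, that the random-effects dispersion is floored at 1, and that the weighted median interpolates between neighboring ordered estimates; the bootstrap SE of the weighted median is omitted, and all names are hypothetical.

```python
import math


def ivw(estimates, random_effects=False):
    """IVW mean of (theta, se) pairs; optionally inflate the SE for heterogeneity."""
    weights = [1.0 / se**2 for _, se in estimates]
    total = sum(weights)
    theta = sum(w * t for w, (t, _) in zip(weights, estimates)) / total
    se = math.sqrt(1.0 / total)
    if random_effects and len(estimates) > 1:
        # Cochran's Q divided by its degrees of freedom, floored at 1.
        q = sum(w * (t - theta) ** 2 for w, (t, _) in zip(weights, estimates))
        se *= math.sqrt(max(1.0, q / (len(estimates) - 1)))
    return theta, se


def weighted_median(estimates):
    """Point estimate at the 50th percentile of the inverse-variance weights."""
    if len(estimates) < 2:
        raise ValueError("need at least 2 SNPs")
    ordered = sorted(estimates, key=lambda e: e[0])
    thetas = [t for t, _ in ordered]
    weights = [1.0 / se**2 for _, se in ordered]
    total = sum(weights)
    # Cumulative weight up to the midpoint of each SNP's own weight.
    cumulative, running = [], 0.0
    for w in weights:
        running += w
        cumulative.append((running - 0.5 * w) / total)
    below = max(i for i, c in enumerate(cumulative) if c < 0.5)
    span = cumulative[below + 1] - cumulative[below]
    fraction = (0.5 - cumulative[below]) / span
    return thetas[below] + (thetas[below + 1] - thetas[below]) * fraction


def leave_one_out(estimates):
    """Fixed-effect IVW estimate after dropping each SNP in turn."""
    results = []
    for i in range(len(estimates)):
        rest = estimates[:i] + estimates[i + 1:]
        results.append((i,) + ivw(rest))
    return results


# Made-up ratio estimates; the last one is a deliberate outlier.
est = [(0.1, 0.1), (0.2, 0.1), (0.3, 0.1), (5.0, 0.1)]
theta_fixed, se_fixed = ivw(est)
theta_random, se_random = ivw(est, random_effects=True)
median_est = weighted_median(est)
loo = leave_one_out(est)
```

With the made-up outlier, the IVW point estimate is pulled upward while the weighted median stays near the bulk of the estimates, and the leave-one-out row that drops the outlier shows how much that single SNP drives the pooled result.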
To investigate the statistical power, a power calculation was carried out using the web-based tool mRnd (https://shiny.cnsgenomics.com/mRnd/). Specifically, the sample size, type-I error rate (α), proportion of AF cases, OR of AF per SD of BW, and total phenotypic variance explained by all SNPs were entered. In this study, the statistical power was required to be at least 80%.
A two-sided p-value < 0.05 was considered significant evidence for a causal association. All analyses were implemented using the “MendelianRandomization” and “TwoSampleMR” packages in the R software environment (version 3.6.2).[37, 38]
Our study used only publicly available data; hence, no additional ethics approval was required.