Study design
These two-sample MR analyses were based on de-identified, publicly available summary-level data; ethical approval was obtained in all original studies. MR uses SNPs as instrumental variables (IVs) to assess the causal effect of an exposure on an outcome in the presence of unmeasured confounding, given that genotypes are conditionally independent of disease status[17-19]. MR inference rests on three assumptions: first, the genetic variant is robustly associated with the exposure; second, the genetic variant affects the outcome only through its effect on the exposure; third, the genetic variant and the outcome do not share common causes (Fig. 1)[20]. In the present study, the two-sample MR analyses proceeded in four stages. First, we evaluated whether SNPs were independently associated with the exposure. Second, we explored whether each SNP was associated with the outcome. Third, we combined these findings to estimate the unconfounded causal relationship between exposure and outcome. Fourth, we performed sensitivity analyses to confirm the robustness of our findings.
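To illustrate how the first three stages combine for a single instrument, the following minimal Python sketch computes the Wald ratio, in which the SNP-outcome effect is divided by the SNP-exposure effect. The function name and the numeric inputs are hypothetical and are not taken from the study data; the standard error uses a first-order approximation that ignores uncertainty in the SNP-exposure estimate.

```python
import math


def wald_ratio(beta_exposure, beta_outcome, se_outcome):
    """Per-SNP causal estimate: SNP-outcome effect / SNP-exposure effect.

    The standard error is a first-order approximation that treats the
    SNP-exposure effect as known without error (an assumption of this sketch).
    """
    if beta_exposure == 0:
        raise ValueError("SNP-exposure effect must be non-zero")
    estimate = beta_outcome / beta_exposure
    se = abs(se_outcome / beta_exposure)
    return estimate, se


# Hypothetical summary statistics for one SNP (illustrative values only).
estimate, se = wald_ratio(beta_exposure=0.10, beta_outcome=0.02, se_outcome=0.01)
print(f"Wald ratio = {estimate:.3f} (SE {se:.3f})")
```

In a multi-SNP analysis, such per-SNP ratios are then pooled, for example by inverse variance weighting.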
Data sources and instruments
Two-sample MR analyses were performed using GWAS summary data. The two GWAS datasets were required to have similar genetic ancestry. Three plasma lipid fractions (LDL-C, high-density lipoprotein cholesterol (HDL-C), and triglycerides (TG)) were included in this study as the exposure variables. For the exposure dataset, publicly available summary statistics based on 188,577 European-ancestry individuals were obtained from a GWAS meta-analysis by the Global Lipids Genetics Consortium (GLGC) (Sample 1)[21, 22]. For the outcome dataset, we extracted summary statistics from a recent GWAS of POAG conducted among United Kingdom (UK) Biobank participants, comprising 463,010 European-ancestry individuals (Sample 2), which we accessed through the MR-Base database (http://www.mrbase.org/). MR-Base hosts a large collection of summary data from many GWASs[23]. For two-sample MR, it was important to ensure that the IVs for each exposure were mutually independent. We therefore applied linkage disequilibrium (LD) clumping with an r2 threshold of 0.001. In addition, if a particular SNP was not available in the outcome dataset, an LD ‘proxy’ SNP was used instead; these lookups are provided automatically by MR-Base.
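The clumping step can be sketched as a greedy procedure: SNPs are visited in order of increasing P value, and a SNP is retained only if its r2 with every SNP already retained does not exceed the threshold. The Python sketch below is a simplification and is not the MR-Base implementation; it assumes a precomputed table of pairwise r2 values (in practice these come from a reference panel and are restricted to a genomic window), and the SNP names and values are hypothetical.

```python
def clump(pvalues, r2, threshold=0.001):
    """Greedy LD clumping sketch.

    pvalues: dict mapping SNP id -> P value for the exposure association.
    r2: dict mapping frozenset({snp_a, snp_b}) -> pairwise LD r2.
        Pairs missing from the table are treated as r2 = 0 (an assumption).
    Returns the retained SNP ids, most significant first.
    """
    kept = []
    for snp in sorted(pvalues, key=pvalues.get):
        in_ld = any(
            r2.get(frozenset((snp, other)), 0.0) > threshold for other in kept
        )
        if not in_ld:
            kept.append(snp)
    return kept


# Hypothetical inputs: snpB is in LD with the more significant snpA.
pvalues = {"snpA": 1e-20, "snpB": 1e-10, "snpC": 1e-12}
r2 = {frozenset(("snpA", "snpB")): 0.4}
print(clump(pvalues, r2))
```

With these inputs, snpB is dropped in favour of snpA, and snpC is retained because no LD with the other SNPs is recorded.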
In the GLGC GWASs, a total of 185 independent SNPs robustly associated with plasma lipid concentrations reached the threshold of genome-wide significance (P < 5.0 × 10⁻⁸) and were selected as IVs for MR analysis (Supplementary Table 1). These 185 SNPs accounted for 6.9% of the variance in LDL-C, 6.4% of the variance in HDL-C, and 5.2% of the variance in TG. Of these selected SNPs, 82 were associated with LDL-C, 96 were associated with HDL-C, and 60 were associated with TG. Of the 82 SNPs associated with LDL-C, 26 (rs1998013, rs1010167, rs903319, rs646776, rs515135, rs3817588, rs4148218, rs2030746, rs2247056, rs2297374, rs4722551, rs217386, rs10102164, rs2980885, rs8176720, rs174532, rs11220462, rs6489818, rs1186380, rs2288002, rs4791641, rs688, rs10401969, rs6859, rs7264396, rs6016381) were excluded as potential IVs because they had a pairwise LD coefficient of determination (r2) > 0.001 with another variant in the set. Of the 96 SNPs associated with HDL-C, 24 (rs6680658, rs355838, rs13326165, rs442177, rs6450176, rs17286602, rs4332136, rs894210, rs4871137, rs2980885, rs2472509, rs17788930, rs11246602, rs12226802, rs12801636, rs653178, rs10773105, rs2412710, rs261342, rs2652834, rs5880, rs11660468, rs1800961) were excluded as potential IVs because they had a pairwise LD r2 > 0.001 with another variant in the set. Of the 60 SNPs associated with TG, 14 (rs10493326, rs3817588, rs10029254, rs799160, rs4921914, rs894210, rs2980885, rs603446, rs2412710, rs261342, rs2652834, rs9930333, rs5880, rs1688030) were excluded as potential IVs because they had a pairwise LD r2 > 0.001 with another variant in the set.
Statistical analyses
The primary analyses of the causal relationship between plasma lipid concentrations and POAG were performed using the standard inverse variance weighted (IVW) method. Other MR models (weighted median estimator, MR-Egger regression, and weighted mode-based estimator) make different assumptions regarding instrument validity; if they produce estimates of the causal effect similar to the IVW estimate, we can be more confident in the robustness of our findings. Because the IVW method yields biased estimates in the presence of horizontal pleiotropy, several sensitivity analyses were conducted to assess the validity of the MR findings. First, results were compared across MR methods that rely on different assumptions (IVW, MR-Egger, weighted mode, and weighted median)[24-26]. Second, the MR-Egger intercept was used to detect and adjust for directional pleiotropy. Third, a leave-one-out analysis was performed by omitting a single SNP each time. Heterogeneity among SNPs was assessed using Cochran’s Q test and funnel plots. All analyses were performed using the MR-Base platform (http://app.mrbase.org/)[23]. Forest plots, scatter plots, and funnel plots were also generated using this platform. A P value < 0.05 was considered statistically significant.
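As a rough illustration of these estimators, the Python sketch below computes a fixed-effect IVW estimate with Cochran's Q, the MR-Egger intercept and slope (point estimates only), and leave-one-out IVW estimates from per-SNP summary statistics. It is a simplified sketch and is not the MR-Base implementation: the function names and input values are hypothetical, SNP-exposure effects are assumed to be non-zero and measured without error, and the weighted median and weighted mode estimators are omitted.

```python
import math


def ivw(bx, by, se_y):
    """Fixed-effect IVW estimate, its standard error, and Cochran's Q."""
    weights = [(x / s) ** 2 for x, s in zip(bx, se_y)]
    ratios = [y / x for x, y in zip(bx, by)]  # per-SNP Wald ratios
    total = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    return estimate, se, q  # Q has len(bx) - 1 degrees of freedom


def mr_egger(bx, by, se_y):
    """MR-Egger intercept and slope via weighted least squares."""
    # Orient every SNP so that its exposure effect is positive.
    signs = [1.0 if x >= 0 else -1.0 for x in bx]
    x = [s * v for s, v in zip(signs, bx)]
    y = [s * v for s, v in zip(signs, by)]
    w = [1.0 / (s * s) for s in se_y]
    total = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total
    sxx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar  # non-zero suggests directional pleiotropy
    return intercept, slope


def leave_one_out(bx, by, se_y):
    """IVW estimate recomputed with each SNP omitted in turn."""
    estimates = []
    for i in range(len(bx)):
        keep = [j for j in range(len(bx)) if j != i]
        est, _, _ = ivw(
            [bx[j] for j in keep], [by[j] for j in keep], [se_y[j] for j in keep]
        )
        estimates.append(est)
    return estimates


# Hypothetical summary statistics for four SNPs (illustrative values only).
bx = [0.10, 0.20, 0.30, 0.40]
by = [0.05, 0.10, 0.15, 0.20]
se_y = [0.01, 0.02, 0.015, 0.02]
print(ivw(bx, by, se_y), mr_egger(bx, by, se_y), leave_one_out(bx, by, se_y))
```

In this toy example every SNP implies the same causal effect, so the IVW and MR-Egger slopes coincide and both Q and the Egger intercept are close to zero; divergence between the methods or a large Q would point to heterogeneity or pleiotropy.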