Summary statistics of UK Biobank (UKB) and International Consortium for Blood Pressure (ICBP). UKB comprises 458,577 UK participants of European ancestry and ICBP comprises 299,024 participants of European descent. GWAS of BP traits were conducted in UKB and ICBP separately and the results were meta-analyzed.3; 6 Our analysis was based on the summary results of the UKB and ICBP GWAS meta-analysis for SBP, DBP and PP, which included up to 757,601 participants and ~7.1M genotyped and imputed SNPs with MAF ≥ 1% that were present in both the UKB data and the ICBP meta-analysis.
Summary statistics of the Million Veteran Program (MVP). The MVP summary statistics comprise 318,891 participants of European ancestry and 18.2M genotyped and imputed SNPs for SBP, DBP and PP.12 They were used for replication analysis.
UKB phenotypic data. We analyzed three BP traits in UKB: SBP, DBP and pulse pressure (PP; SBP minus DBP). We calculated the mean SBP and DBP values from two baseline BP measurements. We added 15 and 10 mmHg to SBP and DBP, respectively, for individuals taking antihypertensive medications. Hypertensive cases were defined as SBP ≥ 140 mmHg, DBP ≥ 90 mmHg or use of antihypertensive medications. CVD cases in UKB were defined using self-reported baseline information on CVD prevalence and the ICD9 and ICD10 diagnostic codes from hospital admissions. The CVD cases included ICD10 codes (I210, I211, I212, I213, I214, I219, I21X, I220, I221, I228, I229, I230, I231, I232, I233, I234, I235, I236, I238, I240, I241, I248, I249, I250, I251, I252, I253, I254, I255, I256, I258, I259) and ICD9 codes (4109, 4119, 4129, 4139, 4140, 4141, 4148, 4149). This procedure resulted in 35,968 CVD cases across European, African and Asian ancestries.
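The phenotype derivation described above can be sketched for a single participant as follows. This is a minimal illustration, not the authors' pipeline: the function name and the returned field names are hypothetical, and computing PP from the medication-adjusted SBP and DBP is an assumption that the text does not state explicitly.

```python
from statistics import mean

def derive_bp_phenotypes(sbp_readings, dbp_readings, on_medication):
    """Return SBP, DBP, PP (mmHg) and hypertension status for one participant.

    sbp_readings / dbp_readings: the two baseline measurements.
    on_medication: True if the participant takes antihypertensive medication.
    """
    sbp = mean(sbp_readings)
    dbp = mean(dbp_readings)
    if on_medication:
        # Medication adjustment from the text: +15 mmHg SBP, +10 mmHg DBP.
        sbp += 15
        dbp += 10
    # Assumption: PP is taken from the adjusted values.
    pp = sbp - dbp
    # Medicated participants are cases regardless of their readings, so using
    # adjusted or raw values in the thresholds gives the same classification.
    hypertensive = sbp >= 140 or dbp >= 90 or on_medication
    return {"sbp": sbp, "dbp": dbp, "pp": pp, "hypertensive": hypertensive}
```

For example, readings of 130/80 and 134/84 mmHg give SBP 132, DBP 82 and PP 50 for an unmedicated participant, and SBP 147, DBP 92 and PP 55 for a medicated one.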
Mendelian Randomization analysis. We applied the iterative Mendelian randomization and pleiotropy analysis (IMRP)22 and the MR mixture model (MRmix)23 for bi-directional MR analysis of SBP and DBP, as well as to estimate the causal contributions of BP to CAD, MI and stroke. IMRP is an iterative approach that combines a pleiotropy test with MR analysis. The iteration starts by performing MR-Egger analysis40 to estimate the causal effect of an exposure on an outcome, followed by inverse variance weighted (IVW)41; 42 analysis until the causal effect estimate converges. At each iteration step, IMRP performs a pleiotropy test to update the set of genetic instrument variants showing evidence of pleiotropy (P<0.05), using the test statistic $t = (\hat{\Gamma} - \hat{\beta}\hat{\gamma}) / \mathrm{se}(\hat{\Gamma} - \hat{\beta}\hat{\gamma})$, where $\hat{\gamma}$ and $\hat{\Gamma}$ are the estimated effect sizes of an IV on the exposure and the outcome, respectively, and $\hat{\beta}$ is the causal estimate, which is updated at each iteration step. IMRP combines the advantages of MR-Egger, which is less biased, and IVW, which is efficient. IMRP can be applied to GWAS summary statistics of an exposure and an outcome obtained from overlapping or non-overlapping samples. To ensure that the causal estimate is robust, we also applied a substantially different MR approach, MRmix,23 which is an estimating equation approach that assumes $\hat{\Gamma} - \hat{\beta}\hat{\gamma}$ follows a normal mixture model. MRmix usually shows a good trade-off between bias and variance even with more than 50% invalid IVs.23
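The iteration described above (an MR-Egger starting value, then alternating a pleiotropy test with IVW re-estimation on the remaining instruments) can be sketched as below. This is a simplified illustration under stated assumptions, not the IMRP software: all function names are hypothetical, `gamma` and `Gamma` denote the per-variant effect estimates on the exposure and the outcome, and the standard error of `Gamma - beta * gamma` ignores the covariance between the two estimates, which is only appropriate for non-overlapping samples.

```python
import math
import numpy as np

def mr_egger_slope(gamma, Gamma, se_Gamma):
    """Starting value: weighted regression of Gamma on gamma with an intercept."""
    sign = np.where(gamma < 0, -1.0, 1.0)   # orient exposure effects to be positive
    g, G = gamma * sign, Gamma * sign
    w = 1.0 / se_Gamma**2
    X = np.column_stack([np.ones_like(g), g])
    XtW = X.T * w
    return float(np.linalg.solve(XtW @ X, XtW @ G)[1])

def ivw(gamma, Gamma, se_Gamma):
    """Inverse-variance-weighted estimate (weighted regression through the origin)."""
    w = 1.0 / se_Gamma**2
    return float(np.sum(w * gamma * Gamma) / np.sum(w * gamma**2))

def pleiotropy_p(gamma, Gamma, se_gamma, se_Gamma, beta):
    """Two-sided normal P-values for t = (Gamma - beta*gamma) / se."""
    # Assumption: no covariance term (non-overlapping samples).
    se = np.sqrt(se_Gamma**2 + beta**2 * se_gamma**2)
    t = np.abs(Gamma - beta * gamma) / se
    return np.array([math.erfc(x / math.sqrt(2.0)) for x in t])

def imrp_sketch(gamma, Gamma, se_gamma, se_Gamma, alpha=0.05, tol=1e-8, max_iter=100):
    """Return (causal estimate, boolean mask of instruments kept as valid)."""
    beta = mr_egger_slope(gamma, Gamma, se_Gamma)
    keep = np.ones(len(gamma), dtype=bool)
    for _ in range(max_iter):
        # Flag instruments with pleiotropy evidence at the current estimate.
        keep = pleiotropy_p(gamma, Gamma, se_gamma, se_Gamma, beta) >= alpha
        if keep.sum() < 2:
            break                            # too few instruments left to re-estimate
        new_beta = ivw(gamma[keep], Gamma[keep], se_Gamma[keep])
        converged = abs(new_beta - beta) < tol
        beta = new_beta
        if converged:
            break
    return beta, keep

# Toy data: true causal effect 0.5, two instruments with a direct (pleiotropic) effect.
gamma = np.linspace(0.05, 0.20, 20)
Gamma = 0.5 * gamma
Gamma[[3, 12]] += 0.1
se = np.full(20, 0.01)
beta_hat, valid = imrp_sketch(gamma, Gamma, se, se)
```

On this toy input the two perturbed instruments are flagged as pleiotropic and the estimate from the remaining 18 is the simulated slope of 0.5. Real summary statistics are noisy, so convergence behaviour and the flagged set will depend on the data.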
GWAS of pleiotropy analysis for SBP and DBP. We first performed a bi-directional MR analysis with IMRP to estimate the causal effects, using genome-wide significant independent variants associated with SBP and DBP separately in the UKB-ICBP summary statistics. We then applied the pleiotropy test to all 7.1M SNPs with the causal effects fixed at the IMRP estimates. This is equivalent to performing GWASs for two new traits: $\text{BPpleio1} = \text{DBP} - \hat{\beta}_{1}\,\text{SBP}$ and $\text{BPpleio2} = \text{SBP} - \hat{\beta}_{2}\,\text{DBP}$, where $\hat{\beta}_{1}$ and $\hat{\beta}_{2}$ are the estimated causal effects of SBP on DBP and of DBP on SBP, respectively. We noted that BPpleio1 and BPpleio2 were highly correlated and represented essentially one phenotype, BPpleio. We declared a variant significant when its P-value was less than 5×10⁻⁸. In the replication analysis using the MVP summary statistics, we applied the causal effect estimates obtained in the UKB-ICBP data. Thus, the replication analysis had the same analytic model as the UKB-ICBP analysis.
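The equivalence between the genome-wide pleiotropy test and a GWAS of a derived trait follows from the linearity of the marginal regression estimator. The short derivation below is a sketch under simplifying assumptions: a single SNP with genotype $G$, both traits measured in the same sample with the same covariates, and the causal estimate treated as a fixed constant. The symbols $\hat{\gamma}$, $\hat{\Gamma}$, $\hat{\beta}_{1}$ and $\hat{b}_{\mathrm{pleio1}}$ are notation chosen for this illustration.

```latex
% \hat{\gamma}, \hat{\Gamma}: marginal effects of the SNP on SBP and DBP
% \hat{\beta}_{1}: causal effect of SBP on DBP, held fixed
\begin{align*}
\hat{\gamma} &= \frac{\widehat{\mathrm{cov}}(G,\mathrm{SBP})}{\widehat{\mathrm{var}}(G)}, \qquad
\hat{\Gamma} = \frac{\widehat{\mathrm{cov}}(G,\mathrm{DBP})}{\widehat{\mathrm{var}}(G)}, \\
\hat{b}_{\mathrm{pleio1}} &= \frac{\widehat{\mathrm{cov}}\bigl(G,\ \mathrm{DBP}-\hat{\beta}_{1}\,\mathrm{SBP}\bigr)}{\widehat{\mathrm{var}}(G)}
 = \hat{\Gamma} - \hat{\beta}_{1}\hat{\gamma}, \\
\mathrm{var}\bigl(\hat{\Gamma} - \hat{\beta}_{1}\hat{\gamma}\bigr) &= \mathrm{var}(\hat{\Gamma}) + \hat{\beta}_{1}^{2}\,\mathrm{var}(\hat{\gamma}) - 2\hat{\beta}_{1}\,\mathrm{cov}(\hat{\Gamma},\hat{\gamma}).
\end{align*}
```

The second line is the effect estimate a GWAS of BPpleio1 would report for the SNP. The third line shows that the standard error involves the covariance of the two marginal estimates, which is non-zero when SBP and DBP are measured in the same individuals.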
Genomic inflation and confounding. We applied the LD score regression (LDSR) method24 to test for genomic inflation in the GWAS pleiotropy analysis. BPpleio is expected to have a large genomic control (GC) inflation coefficient because of the large sample size and the dense genetic variants in high LD43. The GC lambda was 1.533 and the LDSR intercept was 1.057 (0.013), with an inflation ratio of 4.23%, suggesting little inflation in the pleiotropic analysis.
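The two inflation measures quoted above can be computed as sketched below. The GC lambda is the median association chi-square statistic divided by the median of a 1-df chi-square distribution, and the ratio uses the definition commonly reported by LD score regression software, (intercept - 1) / (mean chi-square - 1). The function names are hypothetical, and the mean chi-square used in the usage note is an illustrative input, not a value from this study.

```python
from statistics import median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364

def lambda_gc(chi2_stats):
    """Genomic control inflation factor from per-SNP 1-df chi-square statistics."""
    return median(chi2_stats) / CHI2_1DF_MEDIAN

def ldsr_ratio(intercept, mean_chi2):
    """Share of the mean chi-square inflation attributed to the LDSR intercept."""
    return (intercept - 1.0) / (mean_chi2 - 1.0)
```

For instance, `ldsr_ratio(1.057, 2.0)` gives 0.057 for a hypothetical mean chi-square of 2.0. A small ratio alongside a large lambda is the pattern expected when inflation is driven mainly by polygenicity rather than confounding.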
Novel locus definition. Novel loci were defined as variants that reached genome-wide significance, were located more than 1 Mb from any known BP variant and had LD r2 < 0.05 with all known BP variants. Novel signals at known loci were defined as variants that reached genome-wide significance and were located within 1 Mb of a known BP variant but were not in LD with any known BP variant (r2 < 0.05). The 1000 Genomes (1000G) European ancestry data were used as the reference panel for LD calculation.
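The classification rule above can be written as a small decision function. This is an illustrative sketch: the function name, the labels and the input layout (a list of distance and r2 pairs to known BP variants on the same chromosome) are hypothetical, and treating a distance of exactly 1 Mb as "within" the window is an assumption.

```python
def classify_signal(p_value, known_hits, window_bp=1_000_000, r2_max=0.05, alpha=5e-8):
    """Classify a variant relative to known BP variants.

    known_hits: list of (distance_bp, r2) pairs, one per known BP variant
                on the same chromosome (empty if there are none).
    """
    if p_value >= alpha:
        return "not significant"
    # In LD with any known variant: not an independent signal.
    if any(r2 >= r2_max for _, r2 in known_hits):
        return "known signal"
    # Independent of all known variants; location decides the label.
    if any(dist <= window_bp for dist, _ in known_hits):
        return "novel signal at known locus"
    return "novel locus"
```

A variant with P = 1e-9 whose nearest known BP variant is 2.5 Mb away with r2 = 0 is labelled a novel locus, while the same variant 300 kb from a known variant with r2 = 0.01 is labelled a novel signal at a known locus.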
Functional analyses. We evaluated all sentinel SNPs at novel loci for evidence of mediation by expression quantitative trait loci (eQTL) and alternative splicing quantitative trait loci (sQTL) in all 44 tissues using the Genotype-Tissue Expression (GTEx) database. Following the method of Evangelou et al.6, a locus was annotated with a given eGene (sGene) only if the most significant eQTL (sQTL) SNP for that eGene (sGene) was in high LD (r2 ≥ 0.8) with the sentinel SNP. We performed overall enrichment testing separately for the mediation and pleiotropic variants. We used DEPICT27 (Data-driven Expression Prioritized Integration for Complex Traits) to identify tissues and cell types in which genes at the BP mediation loci, as well as at the BP pleiotropic loci, are highly expressed. We also used DEPICT to test for enrichment in gene sets associated with biological annotations, including Gene Ontology (GO) terms, mouse knockout phenotype studies and protein-protein interaction subnetworks. We reported significant enrichments at a false discovery rate < 0.05. The analysis was done using the Complex-Traits Genetics Virtual Lab platform44.
Genetic risk score (GRS) and pleiotropic genetic risk score (PGRS). We constructed a traditional genetic risk score using 1,616 independent genome-wide significant BP variants from UKB-ICBP. We first constructed SBP- and DBP-weighted GRSs and then derived a single BP GRS as the average of the SBP and DBP GRSs. This approach has been used in the literature6 to estimate the combined effect of the BP variants on BP and on the risks of hypertension and CVD. In addition, we constructed a pleiotropic genetic risk score (PGRS) using the 906 pleiotropic variants in a similar way. We first constructed BPpleio1- and BPpleio2-weighted PGRSs and next derived a single PGRS as the difference of SBP and DBP GRSs. The composite GRS was constructed from a joint model of GRS and PGRS in a linear regression. We also performed a linear regression that jointly modeled GRS and PGRS, as well as the age*GRS and age*PGRS interactions, on blood pressure in the UKB data. Similarly, we performed logistic regression of the risk of hypertension and cardiovascular events at baseline on GRS, PGRS, age*GRS and age*PGRS in the UKB data. We examined whether PGRS predicts additional variation in BP, hypertension and CVD after accounting for GRS. We also examined the age-varying effects of GRS and PGRS by testing the interaction effects. Our analysis included 386,752 unrelated individuals of European ancestry with phenotypes measured at baseline. To assess the association of GRS, PGRS and their interactions with age with BP and the risks of hypertension and CVD, we performed the regression analyses with adjustment for sex, age, BMI, geographical region and 10 genetic principal components. CVD was defined in unrelated participants in the UKB data on the basis of self-reported medical history and linkage to hospitalization and mortality data6.
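The score construction and the joint interaction model can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: the function names are hypothetical, dosages are assumed to count the allele that each weight refers to, and the regression omits the covariates listed in the text (sex, BMI, geographical region and principal components) for brevity.

```python
import numpy as np

def weighted_score(dosages, weights):
    """Weighted risk score: sum over variants of dosage * per-variant weight.

    dosages: (n_individuals, n_variants) array of effect-allele dosages (0 to 2).
    weights: (n_variants,) array of GWAS effect sizes.
    """
    return np.asarray(dosages, dtype=float) @ np.asarray(weights, dtype=float)

def combined_grs(dosages, sbp_betas, dbp_betas):
    """Single BP GRS as the average of the SBP- and DBP-weighted scores."""
    return 0.5 * (weighted_score(dosages, sbp_betas) + weighted_score(dosages, dbp_betas))

def fit_interaction_model(bp, grs, pgrs, age):
    """Least-squares fit of BP on GRS, PGRS, age and the two age interactions."""
    X = np.column_stack([np.ones_like(age), grs, pgrs, age, age * grs, age * pgrs])
    coef, *_ = np.linalg.lstsq(X, bp, rcond=None)
    names = ["intercept", "grs", "pgrs", "age", "age_x_grs", "age_x_pgrs"]
    return dict(zip(names, coef))
```

The `age_x_grs` and `age_x_pgrs` coefficients correspond to the age-varying effects tested in the text. For binary outcomes such as hypertension or CVD the same design matrix would be passed to a logistic regression instead of least squares.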
We assessed the association of the GRS, PGRS and their interactions with age on blood pressure in unrelated Africans (n = 7,904) and South Asians (n = 8,509) from the UKB to see whether blood pressure–associated SNPs identified from GWAS predominantly in Europeans are also associated with blood pressure in populations of non-European ancestry. All the analyses were performed using the residuals after adjusting for sex, age, BMI, geographical region and 10 genetic principal components.
Cross-trait lookups of novel loci. We supplied the index SNPs at the novel loci observed in the UKB-ICBP pleiotropic analyses to FUMA26 and the GWAS Catalog25 to investigate pleiotropy with traits other than BP, extracting all association results with P < 5 × 10⁻⁸ for all SNPs in high LD (r2 ≥ 0.8) with the index SNPs.