Genetic and Causal Relationship Between Coffee Intake and Cardiometabolic Risks: Cross-Phenotype Association And Mendelian Randomization Analysis

Background: Many epidemiological studies have shown a significant association between coffee intake and cardiometabolic diseases, which may be due to a shared genetic structure or a causal relationship.
Methods: We used linkage disequilibrium score regression to estimate the genetic correlation between coffee intake and 23 cardiometabolic traits (diseases), and then used cross-phenotype association analysis to identify the shared genetic loci for the trait pairs with significant genetic correlation. In addition, a bi-directional Mendelian randomization analysis was used to explore the causal relationship between coffee intake and the 23 cardiometabolic traits (diseases).
Results: Coffee intake has a significant genetic correlation (after Bonferroni correction) with body mass index (BMI) (Rg = 0.3713, P-value = 4.13 × 10⁻⁶⁴), body fat percentage (BF%) (Rg = 0.2810, P-value = 1.81 × 10⁻¹³), type 2 diabetes (T2D) (unadjusted for BMI) (Rg = 0.1189, P-value = 8.80 × 10⁻⁶), heart failure (HF) (Rg = 0.2626, P-value = 6.00 × 10⁻⁹) and atrial fibrillation (AF) (Rg = 0.1007, P-value = 4.30 × 10⁻⁵). There are 203, 18, 86, 13 and 38 independent shared loci between coffee intake and BMI, BF%, T2D, HF and AF, respectively, of which 22, 2, 23, 4 and 13 loci do not reach genome-wide significance in single-trait GWAS. Coffee intake has a significant causal effect on BMI (b = 0.0717, P-value = 2.33 × 10⁻⁵), T2D (unadjusted for BMI, OR = 1.27, P-value = 1.46 × 10⁻⁷) and intracerebral haemorrhage (ICH) (all types of ICH: OR = 1.86, P-value = 3.37 × 10⁻⁴; deep ICH: OR = 2.12, P-value = 2.93 × 10⁻⁴). Conversely, BMI (b = 0.3694, P-value = 3.64 × 10⁻¹⁵⁴), BF% (b = 0.5500, P-value = 1.68 × 10⁻⁴), T2D (adjusted for BMI, b = -0.0252, P-value = 4.83 × 10⁻⁶) and triglycerides (TG) (b = -0.1209, P-value = 4.56 × 10⁻¹⁵) have significant causal effects on coffee intake.
Conclusions: Our study identified the shared genetic structure and causal relationships between coffee intake and several cardiometabolic traits (diseases), providing new insight into the mechanisms linking coffee intake and cardiometabolic traits (diseases).


Introduction
Coffee is one of the most widely consumed beverages in the world. In North America, each person consumes about two cups of coffee a day. In many European countries, the average is four cups (1). Coffee contains more than one thousand compounds. In addition to caffeine, these include diterpenes, chlorogenic acid and other antioxidant and anti-inflammatory substances (2). Coffee is therefore thought to have long-term effects on human health (2).
Many observational studies have examined the association between coffee intake and cardiometabolic traits (diseases). For example, both P Mirmirmiran (3) and Yejee Lim (4) found a negative association between coffee intake and the prevalence of type 2 diabetes (T2D). Vijaykumar Bodar and P Bazal found that moderate coffee intake was associated with a reduced risk of atrial fibrillation (AF) (5,6), while Long Mo reported that drinking more than three cups of coffee a day was significantly associated with an increased risk of myocardial infarction (MI) (7). Moreover, previous genome-wide association studies (GWAS) have identified a number of genetic variants that have significant effects on coffee intake as well as on some cardiometabolic traits (diseases) (8)(9)(10)(11). This suggests that the association between coffee intake and cardiometabolic traits (diseases) may be due to a shared genetic structure or a causal relationship.
Based on this, we performed this large-scale genome-wide cross-phenotype association analysis and bidirectional Mendelian randomization (MR) analysis. We analysed the genetic and causal relationship between coffee intake and 23 cardiometabolic traits (diseases). Our results provide new insights into the association between coffee intake and cardiometabolic traits (diseases).
First, we calculated the genetic correlation between coffee intake and 23 cardiometabolic traits (diseases). For the trait pairs with significant genetic correlations, we carried out cross-phenotype association analysis to identify the shared genetic loci. Then, we conducted a bi-directional MR analysis to explore whether there is a causal relationship between coffee intake and these 23 traits (diseases).

Data Sources
For each trait (disease), we used summary statistics from a recent GWAS. The coffee intake data came from a GWAS of 501,625 UK Biobank participants, who were asked how many cups of coffee they drank per day; this yielded the association between coffee intake and each single nucleotide polymorphism (SNP) (12). BMI data were obtained from a meta-analysis of ∼700,000 individuals (13). BF% data came from a GWAS meta-analysis of more than 100,000 participants (14). The T2D data (adjusted and unadjusted for BMI) came from the largest T2D GWAS meta-analysis, with a total of 898,130 participants (15). The data for fasting glucose (FG), fasting insulin (FI), HOMA-IR and HOMA-β were obtained from the meta-analysis of 21 GWAS by Josée Dupuis et al. (16). The data for LDL, HDL, TG and TC were from the meta-analysis by Cristen J. Willer et al. (17). HF data were derived from a large-scale GWAS meta-analysis involving 47,309 cases and 930,014 controls (18). AF data came from a meta-analysis of more than 500,000 participants (19). Both the CAD and MI data were from a large-scale GWAS meta-analysis conducted by the CARDIoGRAMplusC4D Consortium (20). Data for ICH and its subtypes were derived from a meta-analysis of people of European descent over 55 years of age (21). Data for ischemic stroke and its subtypes were derived from a meta-analysis of 15 ischemic stroke cohorts (22). For details of these studies, please refer to the corresponding literature.

Statistical Analysis
Linkage disequilibrium score regression analysis
We used linkage disequilibrium score regression (LDSC) (23) to calculate the genetic correlation between coffee intake and 23 cardiometabolic traits (diseases), and applied Bonferroni correction to the resulting P-values. A trait pair with P-value < 0.05/23 was considered to have a significant genetic correlation, and a trait pair with 0.05/23 < P-value < 0.05 was considered to have a suggestive genetic correlation.
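The two thresholds above can be illustrated with a minimal Python sketch. The function name `classify_rg` is hypothetical and not part of LDSC; the P-values are those reported in the abstract for the five significant trait pairs.

```python
# Sketch: classify LDSC genetic-correlation P-values against the
# Bonferroni thresholds described in the text (23 tests, alpha = 0.05).
N_TRAITS = 23
ALPHA = 0.05


def classify_rg(p_value, n_tests=N_TRAITS, alpha=ALPHA):
    """Return 'significant', 'suggestive' or 'not significant'."""
    if p_value < alpha / n_tests:   # P < 0.05/23
        return "significant"
    if p_value < alpha:             # 0.05/23 <= P < 0.05
        return "suggestive"
    return "not significant"


# P-values reported in the abstract.
reported = {
    "BMI": 4.13e-64,
    "BF%": 1.81e-13,
    "T2D (unadjusted for BMI)": 8.80e-6,
    "HF": 6.00e-9,
    "AF": 4.30e-5,
}
labels = {trait: classify_rg(p) for trait, p in reported.items()}
for trait, label in labels.items():
    print(trait, label)
```

With 23 tests the significance threshold is roughly 0.0022, so all five reported pairs fall in the "significant" class.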

Cross-phenotype association analysis
After calculating the genetic correlation between coffee intake and 23 cardiometabolic traits (diseases), we selected the trait pairs with significant genetic correlation for cross-phenotype association analysis (24) to identify their shared genetic loci. SNPs with a cross-phenotype association P-value < 5 × 10⁻⁸ and a single-trait GWAS P-value < 0.01 were considered to have a significant influence on both traits (diseases). We then used PLINK 1.9 (25) to divide these SNPs into independent clumping regions. SNPs less than 10,000 kb apart and with linkage disequilibrium R² > 0.001 were assigned to the same clumping region, and the SNP with the lowest P-value in each region was taken as the index SNP.
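The selection and clumping steps can be sketched in Python as follows. This is a simplified greedy illustration of the idea, not PLINK's implementation; the SNP identifiers, positions, P-values and the `r2` lookup table are hypothetical, and the sketch assumes the single-trait threshold applies to both traits.

```python
# Sketch: select shared SNPs, then group them greedily into clumps.
GENOME_WIDE_P = 5e-8         # cross-phenotype threshold from the text
SINGLE_TRAIT_P = 0.01        # single-trait threshold from the text
WINDOW_BP = 10_000 * 1_000   # 10,000 kb expressed in base pairs
R2_MIN = 0.001               # LD threshold from the text


def select_shared_snps(snps):
    """Keep SNPs passing the cross-phenotype and both single-trait thresholds."""
    return [
        s for s in snps
        if s["p_cross"] < GENOME_WIDE_P
        and s["p_trait1"] < SINGLE_TRAIT_P
        and s["p_trait2"] < SINGLE_TRAIT_P
    ]


def clump(snps, r2):
    """Greedy clumping. r2 maps frozenset({id_a, id_b}) to LD r^2 (missing = 0)."""
    remaining = sorted(snps, key=lambda s: s["p_cross"])
    clumps = []
    while remaining:
        index = remaining.pop(0)   # lowest remaining P-value becomes the index SNP
        members, rest = [index], []
        for s in remaining:
            near = (s["chrom"] == index["chrom"]
                    and abs(s["pos"] - index["pos"]) < WINDOW_BP)
            in_ld = r2.get(frozenset((index["id"], s["id"])), 0.0) > R2_MIN
            (members if near and in_ld else rest).append(s)
        clumps.append({"index": index["id"],
                       "members": [m["id"] for m in members]})
        remaining = rest
    return clumps


# Hypothetical toy input, for illustration only.
toy = [
    {"id": "snpA", "chrom": 1, "pos": 1_000_000, "p_cross": 1e-12, "p_trait1": 1e-6, "p_trait2": 1e-4},
    {"id": "snpB", "chrom": 1, "pos": 1_500_000, "p_cross": 1e-9, "p_trait1": 1e-3, "p_trait2": 5e-3},
    {"id": "snpC", "chrom": 1, "pos": 50_000_000, "p_cross": 1e-10, "p_trait1": 1e-5, "p_trait2": 1e-3},
    {"id": "snpD", "chrom": 2, "pos": 1_000_000, "p_cross": 1e-20, "p_trait1": 0.2, "p_trait2": 1e-9},
]
toy_r2 = {frozenset(("snpA", "snpB")): 0.4}

selected = select_shared_snps(toy)
clumps = clump(selected, toy_r2)
print(clumps)
```

In this toy input, snpD is dropped because it fails the single-trait threshold for the first trait, snpB joins the clump led by snpA, and snpC forms its own clump because it lies outside the window.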

Co-localization analysis
To further estimate the probability that each shared locus we identified contains a common causal variant for both traits (diseases), we conducted co-localization analysis with the R package coloc (26-28). A locus with H4 > 0.5 was considered to be related to both traits (diseases) simultaneously.
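For readers unfamiliar with the H4 quantity, the following Python sketch shows how the five hypothesis posteriors can be combined from per-SNP log approximate Bayes factors, following the published approximate Bayes factor formulation of co-localization as we understand it. It is not the coloc package itself. The prior values are assumptions (the source does not state which priors were used), the input Bayes factors are hypothetical, and the sketch requires at least two SNPs per region.

```python
import math


def logsumexp(values):
    """Numerically stable log(sum(exp(v)))."""
    m = max(values)
    return m + math.log(sum(math.exp(v - m) for v in values))


def logdiff(a, b):
    """log(exp(a) - exp(b)), valid for a > b."""
    return a + math.log1p(-math.exp(b - a))


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0..H4 from per-SNP log ABFs.

    p1, p2, p12 are assumed priors that a SNP is causal for trait 1 only,
    trait 2 only, or both; they are not taken from the source text.
    """
    l1 = logsumexp(labf1)
    l2 = logsumexp(labf2)
    l12 = logsumexp([a + b for a, b in zip(labf1, labf2)])
    log_h = [
        0.0,                                                  # H0: no association
        math.log(p1) + l1,                                    # H1: trait 1 only
        math.log(p2) + l2,                                    # H2: trait 2 only
        math.log(p1) + math.log(p2) + logdiff(l1 + l2, l12),  # H3: distinct variants
        math.log(p12) + l12,                                  # H4: shared variant
    ]
    denom = logsumexp(log_h)
    return [math.exp(h - denom) for h in log_h]


# Hypothetical three-SNP regions.
post_shared = coloc_posteriors([10.0, 0.0, 0.0], [12.0, 0.0, 0.0])    # same top SNP
post_distinct = coloc_posteriors([10.0, 0.0, 0.0], [0.0, 12.0, 0.0])  # different top SNPs
print("H4 when signals coincide:", post_shared[4])
print("H4 when signals differ:  ", post_distinct[4])
```

When both traits peak at the same SNP, H4 dominates and the locus would pass the H4 > 0.5 rule; when the peaks fall on different SNPs, most of the posterior mass moves to H3.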

Fine-mapping credible set analysis
For each shared locus, we produced a list of SNPs that has a 99% probability of containing the common causal variant(s), using the Fine-Mapping (FM) summary method. This method maps the primary signal and uses a flat prior with a steepest descent approximation (29,30).
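The general idea of a 99% credible set can be sketched as follows. This is the generic construction (rank SNPs by posterior probability under a flat prior and accumulate until 99% coverage), not the FM-summary implementation, and it omits the steepest descent approximation. The SNP names and log Bayes factors are hypothetical.

```python
import math


def credible_set(log_abf, coverage=0.99):
    """Return the smallest set of SNPs whose posterior mass reaches `coverage`.

    log_abf maps SNP id -> log approximate Bayes factor. With a flat prior,
    each SNP's posterior is its Bayes factor divided by the regional sum.
    """
    m = max(log_abf.values())
    weights = {snp: math.exp(v - m) for snp, v in log_abf.items()}  # stable rescaling
    total = sum(weights.values())
    ranked = sorted(weights.items(), key=lambda kv: kv[1], reverse=True)
    chosen, cumulative = [], 0.0
    for snp, w in ranked:
        chosen.append(snp)
        cumulative += w / total
        if cumulative >= coverage:
            break
    return chosen


# Hypothetical region with one strong and one moderate signal.
toy_region = {"snpA": 8.0, "snpB": 6.0, "snpC": 1.0, "snpD": 0.0}
cs99 = credible_set(toy_region)
print(cs99)
```

Here snpA alone carries about 88% of the posterior mass, so snpB must be added to exceed 99%.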

Transcriptome-wide association analysis
To gain more insight into the genetic basis of each trait (disease), we combined genotype, gene expression and phenotype data in large-scale transcriptome-wide association studies (TWAS) using the FUSION software (31). We applied False Discovery Rate (FDR) correction, and gene-tissue pairs with FDR < 0.05 were considered significant. Then, on the basis of the TWAS results, we used the R package RHOGE (32) to estimate the correlation of effect sizes across traits (diseases) in each tissue. Coffee intake and a cardiometabolic trait (disease) were considered to have a significant gene expression correlation in tissues with P-value < 0.05/48.
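The FDR step can be illustrated with a short Python sketch of the Benjamini-Hochberg procedure. The source does not say which FDR procedure was applied, so Benjamini-Hochberg is an assumption here, and the P-values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    n = len(p_values)
    order = sorted(range(n), key=lambda i: p_values[i])  # indices by ascending P
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(n, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * n / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical TWAS P-values for five gene-tissue pairs.
toy_p = [0.001, 0.008, 0.039, 0.041, 0.20]
fdr = benjamini_hochberg(toy_p)
significant = [q < 0.05 for q in fdr]
print(fdr)
print(significant)
```

Only the first two gene-tissue pairs pass FDR < 0.05 in this toy input; the third and fourth have raw P-values below 0.05 but adjusted values just above it.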

Mendelian Randomization analysis
To explore the causal relationship between coffee intake and these 23 cardiometabolic traits (diseases), we conducted a bi-directional MR analysis (33,34) using the TwoSampleMR package (35,36). Coffee intake and each of the 23 cardiometabolic traits (diseases) were analysed in turn as the exposure in a two-sample MR analysis. For example, when taking coffee intake as the exposure, we first selected SNPs with P-value < 5 × 10⁻⁸ in the GWAS summary statistics for coffee intake, and then selected independent SNPs as instrumental variables. We used the inverse variance weighting method (37) to combine the causal effect estimates from the different instrumental variables, and tested whether the MR-Egger regression (36) intercept differs from zero to check for unbalanced horizontal pleiotropy. We also carried out a series of sensitivity analyses, including single-SNP analysis and leave-one-out analysis.
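The inverse variance weighting step can be sketched in Python as below. This is a minimal fixed-effect version using first-order weights, and it may differ in detail from what the TwoSampleMR package computes; the SNP effect sizes and standard errors are hypothetical.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect inverse-variance-weighted MR estimate.

    Each instrument gives a Wald ratio beta_outcome / beta_exposure, weighted
    by beta_exposure^2 / se_outcome^2. Returns (estimate, standard error,
    two-sided P-value from a normal approximation).
    """
    weights = [bx * bx / (se * se) for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    total = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)
    p_value = math.erfc(abs(beta / se) / math.sqrt(2.0))
    return beta, se, p_value


# Hypothetical instruments: SNP-exposure and SNP-outcome associations.
bx = [0.10, 0.20, 0.05]
by = [0.02, 0.04, 0.01]
se_y = [0.01, 0.01, 0.01]

beta_ivw, se_ivw, p_ivw = ivw_estimate(bx, by, se_y)
print(beta_ivw, se_ivw, p_ivw)
```

In this toy input every instrument implies the same Wald ratio of 0.2, so the pooled estimate is 0.2 and only its precision depends on the weights.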

Genetic correlation
We calculated the genetic correlation between coffee intake and 23 cardiometabolic traits (diseases) (
The most significant loci in the cross-phenotype association analyses of coffee intake with BMI and BF% are rs3751813 (P-value = 3.42 × 10⁻³⁰⁶) on chromosome 16 and rs936227 (P-value = 3.97 × 10⁻⁴⁹) on chromosome 15, respectively. The most significant locus in the cross-phenotype association analysis of coffee intake and T2D is rs2472297 (P-value = 6.38 × 10⁻²²¹) on chromosome 15. This SNP has repeatedly been shown to be significantly associated with coffee intake, but no previous study has reported an association with T2D (38). In the cross-phenotype association analysis of coffee intake and HF, rs476828 (P-value = 4.15 × 10⁻²²) on chromosome 18 showed the strongest signal, as did rs12898551 (P-value = 2.19 × 10⁻⁷⁰) for coffee intake and AF.
We mapped 5895, 47, 1044, 75 and 382 shared genes in the shared regions of coffee intake and BMI, BF%, T2D, HF and AF, respectively (Supplementary Table 2, Fig. 2). Interestingly, some of these mapped genes are shared by more than two trait pairs. In particular, the TMEM18, FTO and MC4R genes are significant in all five cross-phenotype association analyses, indicating that these genes have a significant influence on all six traits (diseases). TMEM18 (transmembrane protein 18) is a protein-coding gene that is more highly expressed in the ovaries, thyroid, endometrium and other tissues. Previous GWAS have found significant associations between TMEM18 and obesity (39), age at menarche (40), etc. FTO (FTO alpha-ketoglutarate dependent dioxygenase) is a protein-coding gene expressed mainly in the brain, adrenal gland and other tissues. It encodes a nuclear protein of the AlkB-related non-haem iron and 2-oxoglutarate-dependent oxygenase superfamily, but its exact physiological function is unknown (38). MC4R (melanocortin 4 receptor) encodes a membrane-bound receptor that is a member of the melanocortin receptor family. A defect in this gene is responsible for autosomal dominant obesity (38).

Co-localization and fine-mapping analysis
We also used co-localization analysis to calculate the probability that each shared locus identified in the cross-phenotype association analysis contains the common causal variant(s) of the two traits (diseases). The main results are presented in Supplementary Table 3. To identify the causal variants in each region more precisely, we carried out fine-mapping. Supplementary Table 4 lists, for each region, the SNPs that have a 99% probability of containing the common causal variant(s).
Then, based on the TWAS results, we calculated the gene expression correlation between coffee intake and these five traits (diseases) in each tissue (Fig. 3, Supplementary Table 11). After Bonferroni correction, coffee intake and BMI are positively associated in all 48 tissues. There is a significant positive correlation between coffee intake and BF% in thyroid, brain cerebellar hemisphere, cells transformed fibroblasts and skin sun exposed lower leg; with T2D in oesophagus mucosa; and with AF in brain cortex. We did not find any significant gene expression association between coffee intake and HF in any of the 48 tissues.

MR
To explore the causal relationship between coffee intake and cardiometabolic traits (diseases), we conducted a bi-directional MR analysis with coffee intake and 23 cardiometabolic traits (diseases) as exposure in turn (Fig. 4, Supplementary Table 12).
In MR analysis with coffee intake as exposure, we found that coffee intake had a significant causal effect on BMI (b = 0.0717, P-value = 2. 33 Table 13). In addition to the above results, we also conducted several sensitivity analyses to examine the effect of a single SNP on the results (Supplementary Tables 14 and 15).

Discussion
To our knowledge, this is the first study to simultaneously analyse the shared genetic structure and causal direction between coffee intake and cardiometabolic traits (diseases). Using large-scale GWAS summary statistics, we found significant positive genetic correlations and identified shared genetic loci between coffee intake and several cardiometabolic traits/diseases (BMI, BF%, T2D, HF, AF). Furthermore, we found that coffee intake could significantly increase BMI and the risk of T2D and ICH. Higher BMI and BF% could increase coffee intake, while T2D (adjusted for BMI) and higher TG reduce it.
Our results show a significant positive genetic correlation between coffee intake and T2D, but this correlation is no longer significant after adjusting for BMI. We also found a positive genetic correlation between coffee intake and BMI/BF%, which is consistent with the results of Gene ATLAS (12). These results suggest that BMI may mediate the observed genetic correlation between coffee intake and T2D. In addition, we found a significant genetic correlation between coffee intake and HF/AF, indicating that the association between coffee intake and HF/AF may be due to shared genetic structure.
We then identified the genes shared between coffee intake and BMI, BF%, T2D, HF and AF by cross-phenotype association analysis, finding 6316 shared genes in total. TMEM18, FTO and MC4R were found in all five cross-phenotype association analyses, indicating that these three genes can simultaneously affect coffee intake and BMI, BF%, T2D, HF and AF. All three genes are closely related to obesity, which again highlights the importance of obesity in the association between coffee intake and cardiometabolic traits (diseases).
Our cross-phenotype association analysis revealed many novel genetic regions for which previous studies have not found a significant association with coffee intake or cardiometabolic traits (diseases). Many of these regions are associated with cognitive and intellectual performance (41)(42)(43)(44), and some are associated with mental disorders such as anxiety (45), schizophrenia (46) and depression (47), with immune system disorders such as systemic lupus erythematosus (48), and with digestive diseases such as Barrett's esophagus and esophageal adenocarcinoma (49). These findings suggest that the genetic structure shared by coffee intake and cardiometabolic traits (diseases) may also play an important role in these diseases, which needs further research.
We also integrated genotype, gene expression and phenotype data in a TWAS to characterise the expression of each phenotype in 48 tissues, and calculated the gene expression association between coffee intake and the five other phenotypes in each tissue. Our results indicate a significant gene expression correlation between coffee intake and these five cardiometabolic traits (diseases) in the nervous, digestive and endocrine systems. This is consistent with our findings about novel genetic regions. These results suggest that some biological metabolic processes in these tissues may affect coffee intake and these five cardiometabolic traits (diseases) at the same time.
In addition to identifying the shared genetic structure, we conducted a bi-directional MR analysis to explore the causal relationship between coffee intake and cardiometabolic traits (diseases), which yielded several significant results. A one-SD increase in coffee intake (cups/day) increases BMI (kg/m²) by 0.0717 SD and raises the odds of T2D by 27% (OR = 1.27), of deep ICH by 112% (OR = 2.12) and of all types of ICH by 86% (OR = 1.86). These results highlight potential harmful effects of coffee intake on health. We did not find any causal effect of coffee intake on AF or stroke, which is consistent with previous MR studies (50,51).
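The relative changes quoted for T2D and ICH follow directly from the odds ratios reported in the abstract, as this small Python sketch shows. The helper name `odds_change_percent` is hypothetical, and the result is a change in odds, which approximates a change in risk only when the outcome is rare.

```python
def odds_change_percent(odds_ratio):
    """Percentage change in odds implied by an odds ratio."""
    return (odds_ratio - 1.0) * 100.0


# Odds ratios per unit increase in genetically predicted coffee intake (abstract).
reported_or = {
    "T2D (unadjusted for BMI)": 1.27,
    "all types of ICH": 1.86,
    "deep ICH": 2.12,
}
percent_change = {
    trait: round(odds_change_percent(value), 1)
    for trait, value in reported_or.items()
}
print(percent_change)
```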
It should be noted that our MR results show that coffee intake can increase the risk of T2D, which is contrary to the results of most previous observational studies (52). There are several possible reasons. First, our study found that coffee intake can increase BMI, in agreement with previous MR studies on coffee intake and obesity (53). BMI is a recognized risk factor for T2D (54), so BMI is likely to be a mediator between coffee intake and T2D. However, previous observational studies on coffee intake and T2D adjusted for BMI in their analyses (52,55), which can lead to underestimation or misestimation of the effect of coffee intake on T2D. Moreover, our MR results also showed that T2D (adjusted for BMI) had a negative causal effect on coffee intake. Therefore, we speculate that the negative correlation between coffee intake and T2D found in many observational studies may be due to improper adjustment for BMI and reverse causality. Second, our coffee intake data are based on a survey of coffee intake among UK Biobank participants. Among all coffee drinkers, participants who drank instant coffee accounted for the majority (54%). Instant coffee usually contains a lot of dairy creamer and sugar, and previous studies have found instant coffee to be associated with an increased risk of metabolic diseases (56).
Although studies on coffee ingredients such as chlorogenic acids and caffeine have found beneficial effects on health (57), such effects may be offset by other substances added to instant coffee. Third, our MR results reflect the effect of lifetime exposure to coffee intake on cardiometabolic traits (diseases), while experimental studies focus on the effect of short-term exposure, which may be different.
Our study has several strengths. First, this is the first systematic study of the shared genetic structure and causal relationship between coffee intake and multiple cardiometabolic traits (diseases). We identified many shared genes and found significant causal relationships, which provide new insights into the mechanism of their association. Second, the scale of our study is large. We analysed a total of 23 cardiometabolic traits (diseases), covering obesity, blood glucose and lipid homeostasis, and cardiovascular disease, which helps to build a comprehensive picture of the relationship between coffee intake and cardiometabolic diseases. Third, the GWAS summary statistics selected for our study are all from large-scale, high-quality GWAS. We identified many shared loci and found many significant causal relationships that were not found in previous, less powerful MR studies. Fourth, our study combines phenotypic, genetic and gene expression information to explain the relationship between coffee intake and cardiometabolic traits (diseases).
Our study also has some limitations. First, we found that obesity played an important role in the association between coffee intake and cardiometabolic traits (diseases), but for some traits (diseases) we do not have data adjusted for obesity. If appropriate data become available, more results may be found. Second, the UK Biobank survey of coffee intake does not distinguish between types of coffee, and different types of coffee may have different effects on health. Studies of specific coffee types may yield more specific results.

Conclusion
In this study, we found significant positive genetic correlations between coffee intake and several cardiometabolic traits/diseases (BMI, BF%, T2D, HF, AF) and identified many shared genetic loci among them, including some novel genetic regions. By MR analysis, we found that coffee intake can significantly increase BMI and the risk of T2D and ICH. Our results highlight the important role of BMI in the relationship between coffee intake and cardiometabolic diseases, which has been overlooked in most previous observational studies. This study provides new insights and evidence on the health effects of coffee intake.

Availability of data and materials
All data generated or analysed during this study are included in these published reference articles:

Genetic correlation of coffee intake and cardiometabolic traits (diseases). The vertical axis represents the 23 cardiometabolic traits (diseases); the horizontal axis represents the genetic correlation; an asterisk indicates that the genetic correlation between the trait/disease and coffee intake is significant (P-value < 0.05/23); the three colours of the bars represent the three categories of the 23 traits (diseases).

Number of shared genes between coffee intake and cardiometabolic traits (diseases). A. The circles with different colours represent the cross-phenotype association analyses of different cardiometabolic traits (diseases) and coffee intake; the overlapping parts of the circles represent the number of genes jointly identified in two or more cross-phenotype association analyses; the unique part of each circle represents the number of genes identified only in that cross-phenotype association analysis. B. The horizontal axis represents the five cardiometabolic traits (diseases) that have a significant genetic correlation with coffee intake; the vertical axis represents the number of genes shared by each cardiometabolic trait (disease) and coffee intake.