Coffee Types and Cardiac Metabolic Risks: Large-scale Cross-phenotype Association Study and Mendelian Randomization Analysis

Purpose: To explore, from a genetic perspective, whether coffee intake is associated with cardiac metabolic risks and whether this association is the same across consumers of different types of coffee. Methods: We used summary-level results from 28 genome-wide association studies (total sample size: ~5,000,000). First, we used linkage disequilibrium score regression and cross-phenotype association analysis to estimate genetic correlations and identify genes shared between coffee intake and various cardiac metabolic risks. Second, we used Mendelian randomization (MR) analysis to test whether there was a significant genetically predicted causal association between coffee intake and cardiac metabolic risks. In addition to total coffee intake, we repeated all of these analyses separately for consumers of different coffee types. Results: Genetically, coffee intake and choice for decaffeinated or instant coffee were significantly positively correlated with body mass index (BMI) and some other cardiac metabolic risks, whereas choice for ground coffee was significantly negatively associated with these risks. Across these genetically related phenotypes, there were 1708 shared genomic regions, of which 139 loci were novel. Enrichment analysis showed that the shared genes were significantly enriched in biological processes related to antigen processing. MR analysis indicated that higher genetically proxied coffee intake may increase BMI (b: 0.35, p-value: 1.80 × 10^-5), while genetically proxied choice for ground coffee may reduce BMI (b: -0.08, p-value: 6.50 × 10^-5) and the risk of type 2 diabetes (T2D) (T2D: b: -0.2, p-value: 4.70 × 10^-10; T2D adjusted for BMI: b: -0.11, p-value: 4.60 × 10^-5). Conclusions: Compared with other types of coffee, ground coffee has a significant negative genetic and genetically predicted causal relationship with cardiac metabolic risks, and this association is likely to be mediated by immunity.
The effect of different coffee types on cardiac metabolic risks is not equal; researchers studying coffee should pay more attention to distinguishing between coffee types.


Introduction
Coffee is one of the most widely consumed beverages in the world, with about 500 billion cups consumed yearly [1]. Coffee contains a variety of chemical substances. Some have clear health effects, such as chlorogenic acids (CGAs), which can protect the body from oxidative damage [2], and ochratoxin A, which can aggravate obesity [3], but the functions of other coffee constituents have not yet been clarified. The content of these components is also affected by factors such as processing [4]. Research on the health effects of different coffee types is therefore complicated but important.
In recent years, a plethora of studies have focused on the association between coffee intake and cardiac metabolic risks. Cumulative evidence from observational studies suggests a significant relationship between coffee intake and cardiac metabolic risks [5][6][7], but observational research is susceptible to confounding and reverse causality [8]. Randomized clinical trials (RCTs) partly overcome this problem, but current RCTs on the impact of coffee intake on cardiac metabolic risks are limited by short intervention times and inconsistent results [9][10][11]. Some researchers have used genetic variants as instrumental variables (IVs) to study the effect of genetically proxied coffee intake on cardiac metabolic risks by Mendelian randomization (MR), but these studies rarely distinguished between coffee types, and their results are inconsistent with, or even opposite to, those of observational studies [12][13][14]. Because findings on the impact of habitual coffee intake on cardiac metabolic risks are inconsistent, more research is necessary before health care workers can make evidence-based recommendations.

Data Sources
The data used in this study were obtained from 28 large-scale GWASs (Supplementary Table 1). Data on coffee intake and coffee type choice came from GWASs of more than 320,000 UK Biobank (UKB) participants, who were asked how many cups of coffee they drank per day and what type of coffee they usually consumed (decaffeinated coffee (any type), instant coffee, ground coffee (including espresso, filter, etc.), or other types of coffee) [21]; the summary statistics were released by the Neale lab [19].
BMI data were obtained from a meta-analysis of ~700,000 individuals [22]. BF% data came from a GWAS meta-analysis of more than 100,000 participants [23]. The T2D data (adjusted and unadjusted for BMI) came from the largest T2D GWAS meta-analysis, with a total of 898,130 participants [24]. Data on FG, FI, HOMA-IR, and HOMA-β were obtained from the meta-analysis of 21 GWASs by Josée Dupuis et al. [25]. Data on LDL cholesterol, HDL cholesterol, TG, and TC were from the meta-analysis by Cristen J. Willer et al. [26]. HF data were derived from a large-scale GWAS meta-analysis involving 47,309 cases and 930,014 controls [27]. AF data came from a meta-analysis of more than 500,000 participants [28]. Both CAD and MI data were from a large-scale GWAS meta-analysis conducted by the CARDIoGRAMplusC4D Consortium [29]. Data on ICH and its subtypes were derived from a meta-analysis of people of European descent over 55 years of age [30]. Data on ischaemic stroke and its subtypes were derived from a meta-analysis of 15 ischaemic stroke cohorts [31]. For details of these studies, please refer to Supplementary Table 1 and the corresponding literature.

Statistical Analysis
Linkage disequilibrium score regression analysis
Linkage disequilibrium score regression (LDSC) is a convenient and frequently used tool for studying the genetic correlation between phenotypes. It relies on the fact that, under a polygenic model, the product of the z-scores from two studies of phenotypes with non-zero genetic correlation is related to the LD score of the SNP [32]. This method requires only GWAS summary statistics and is therefore faster than other methods. We used LDSC to calculate the genetic correlation between coffee intake or choice for different coffee types and cardiac metabolic risks. Bonferroni correction was applied to the resulting p-values, and tests were judged statistically significant at p-value < 4.35 × 10^-4 (0.05/5/23).
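As a small illustration of the significance threshold quoted above, the following sketch reproduces the arithmetic 0.05/5/23. The function name and the reading of 5 and 23 as counts of coffee phenotypes and cardiac metabolic outcomes are our assumptions for illustration; the text gives only the division itself.

```python
def bonferroni_threshold(alpha: float, *test_counts: int) -> float:
    """Divide a family-wise alpha by each count of tests in turn."""
    threshold = alpha
    for count in test_counts:
        threshold /= count
    return threshold


# Assumption: 5 coffee phenotypes x 23 outcomes, as implied by "0.05/5/23".
threshold = bonferroni_threshold(0.05, 5, 23)
print(f"{threshold:.2e}")  # 4.35e-04


def is_significant(p_value: float, cutoff: float = threshold) -> bool:
    """True when a p-value passes the Bonferroni-corrected cutoff."""
    return p_value < cutoff
```

For example, a p-value of 5.4 × 10^-5 passes this cutoff, whereas a nominal p-value of 0.01 does not.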

Cross-phenotype association analysis
Having identified the phenotype pairs with significant genetic correlation, we conducted cross-phenotype association analysis [33] to identify the genes shared by each pair. This was implemented using the R software package 'CPASSOC' [33]. The package uses the square root of the sample size of each phenotype as the weight and estimates the correlation matrix from the summary statistics of all independent SNPs in the two phenotypes [33]. The method considers heterogeneity within the same phenotype and between different phenotypes, and it also accounts for potential kinship or population stratification among the participants [33].
SNPs with a cross-phenotype association p-value < 5 × 10^-8 and a single-trait GWAS p-value < 0.05 were considered to have a significant influence on both phenotypes. We then used PLINK 1.9 [34] to divide these SNPs into independent clumping regions. SNPs less than 10,000 kb apart and with LD r² > 0.001 were assigned to the same clumping region, and the SNP with the lowest p-value in each region was taken as the index SNP.
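The selection rule described above can be sketched as follows. This is an illustration under stated assumptions, not the study's pipeline: the record type, field names, and toy values are hypothetical; the clumping region of each SNP is taken as given (in the study it comes from PLINK); the single-trait cutoff is read as applying to both traits; and the index SNP is chosen by the cross-phenotype p-value.

```python
from dataclasses import dataclass


@dataclass
class SnpResult:
    rsid: str        # placeholder identifier, not a real SNP
    region: int      # hypothetical clumping-region id (assigned upstream)
    p_cross: float   # cross-phenotype association p-value
    p_trait1: float  # single-trait GWAS p-value, trait 1
    p_trait2: float  # single-trait GWAS p-value, trait 2


def shared_snps(results, p_cross_max=5e-8, p_single_max=0.05):
    """Keep SNPs passing the cross-phenotype and both single-trait cutoffs."""
    return [
        s for s in results
        if s.p_cross < p_cross_max
        and s.p_trait1 < p_single_max
        and s.p_trait2 < p_single_max
    ]


def index_snps(results):
    """Pick the SNP with the lowest cross-phenotype p-value in each region."""
    best = {}
    for s in results:
        if s.region not in best or s.p_cross < best[s.region].p_cross:
            best[s.region] = s
    return best


toy = [
    SnpResult("snpA", 1, 1e-9, 1e-4, 2e-3),
    SnpResult("snpB", 1, 3e-12, 5e-6, 1e-2),
    SnpResult("snpC", 2, 2e-8, 0.20, 1e-5),   # fails the single-trait cutoff
    SnpResult("snpD", 2, 4e-8, 0.01, 0.03),
]
index = index_snps(shared_snps(toy))
print({region: s.rsid for region, s in index.items()})
```

Filtering before choosing the index SNP means that a region's index SNP is always one that passes all three cutoffs.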

KEGG and GO enrichment analysis
To gain a deeper understanding of the mechanisms shared between coffee intake and cardiac metabolic risks, we performed Kyoto Encyclopedia of Genes and Genomes (KEGG) and Gene Ontology (GO) enrichment analysis on the shared genes identified in the previous step. All analyses were performed with the R package 'clusterProfiler' [35].
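To make the idea of enrichment concrete, the sketch below computes an over-representation p-value with a one-sided hypergeometric test, which is the usual basis of this kind of analysis. It is a minimal illustration with hypothetical function and parameter names and toy numbers; it is not the clusterProfiler implementation and it omits multiple-testing adjustment.

```python
from math import comb


def enrichment_p_value(universe: int, term_genes: int, query: int, overlap: int) -> float:
    """P(X >= overlap) when `query` genes are drawn from a `universe` of genes
    of which `term_genes` are annotated to the pathway or GO term."""
    upper = min(term_genes, query)
    total = comb(universe, query)
    tail = sum(
        comb(term_genes, i) * comb(universe - term_genes, query - i)
        for i in range(overlap, upper + 1)
    )
    return tail / total


# Toy numbers: 10 genes in total, 5 annotated to a term, 5 shared genes,
# all 5 of which fall in the term.
print(enrichment_p_value(10, 5, 5, 5))
```

A small p-value indicates that the shared genes overlap the term more than expected by chance given the sizes of the gene sets.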

Mendelian Randomization analysis
MR is a widely accepted method for assessing potential causal relationships because of its unique advantages over observational studies and RCTs [8]. The core of MR is the IV. In genetic epidemiology, IVs are genetic variants that are related to the exposure but not directly related to the outcome or to confounders. The selected IVs must satisfy three basic assumptions [8]: first, they must be related to the exposure of interest; second, they must be independent of confounding factors; third, given the exposure and confounders, they must be independent of the outcome.
We used two widely available packages to conduct MR: "TwoSampleMR" [36,37] and "CAUSE" [38]. "TwoSampleMR" is a convenient and frequently used tool for two-sample MR. Using this package, we selected independent genetic variants as IVs (p-value < 5 × 10^-8, clump-kb 10000, clump-r2 0.01), calculated causal estimates from the effect sizes of the IVs on the exposure and the outcome, and then obtained the overall causal estimate with a series of methods: the inverse variance weighted (IVW) method, MR-Egger regression, simple median, weighted median, penalized weighted median, simple mode, and weighted mode. We also tested whether the intercept of the MR-Egger regression [36] was zero to check whether horizontal pleiotropy was balanced.
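Two of the estimators named above can be sketched from summary statistics alone. The code below is a simplified illustration, not the TwoSampleMR implementation: the function names and toy effect sizes are hypothetical, the IVW standard error is the fixed-effect one, and the MR-Egger part returns only point estimates after orienting each variant so that its exposure effect is positive.

```python
import math


def ivw_estimate(beta_x, beta_y, se_y):
    """Fixed-effect IVW: weighted regression of outcome effects on exposure
    effects through the origin, with weights 1 / se_y**2."""
    num = sum(bx * by / s ** 2 for bx, by, s in zip(beta_x, beta_y, se_y))
    den = sum(bx ** 2 / s ** 2 for bx, s in zip(beta_x, se_y))
    return num / den, math.sqrt(1.0 / den)


def egger_estimate(beta_x, beta_y, se_y):
    """MR-Egger point estimates: the same weighted regression, but with an
    intercept, which captures average (directional) pleiotropy."""
    xs, ys = [], []
    for bx, by in zip(beta_x, beta_y):
        sign = -1.0 if bx < 0 else 1.0  # orient to positive exposure effects
        xs.append(sign * bx)
        ys.append(sign * by)
    w = [1.0 / s ** 2 for s in se_y]
    sw = sum(w)
    sx = sum(wi * x for wi, x in zip(w, xs))
    sy = sum(wi * y for wi, y in zip(w, ys))
    sxx = sum(wi * x * x for wi, x in zip(w, xs))
    sxy = sum(wi * x * y for wi, x, y in zip(w, xs, ys))
    slope = (sw * sxy - sx * sy) / (sw * sxx - sx * sx)
    intercept = (sy - slope * sx) / sw
    return intercept, slope


# Toy data in which every variant implies the same causal effect of 0.5.
beta_x = [0.10, 0.20, -0.30, 0.40]
beta_y = [0.05, 0.10, -0.15, 0.20]
se_y = [0.02, 0.02, 0.03, 0.04]

ivw_beta, ivw_se = ivw_estimate(beta_x, beta_y, se_y)
egger_intercept, egger_slope = egger_estimate(beta_x, beta_y, se_y)
print(ivw_beta, ivw_se, egger_intercept, egger_slope)
```

In this toy case the two slopes agree and the Egger intercept is essentially zero, which is the pattern expected when horizontal pleiotropy is absent or balanced.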
Unlike "TwoSampleMR", the recently published "CAUSE" method uses all SNPs in estimating causal effects, not just variants that are strongly associated with the exposure [38]. It relies on the idea that if the exposure has a causal effect on the outcome, then for any SNP with a non-zero effect on the exposure, its effect sizes on the exposure and on the outcome should be related. On this basis, the method is thought to be able to distinguish causality from horizontal pleiotropy (both related and unrelated). If the causal model fits the data better than the shared model (p-value < 0.05), we consider the exposure to have a causal effect on the outcome.

Genetic correlation
There were extensive significant genetic correlations between the amount of coffee intake or choice for different coffee types and cardiac metabolic risks (Fig. 1, Supplementary Tables 2-6). Without distinguishing between coffee types, we found that the amount of coffee intake was positively genetically associated with BMI (Rg = 0.2617, p-value = 1.12 × 10^-30) and BF% (Rg = 0.1619, p-value = 5.4 × 10^-5).
When considering different coffee types, there were considerable differences between the coffee type choices. Choosing decaffeinated coffee had a [...] There was no significant genetic association between choice for coffee other than the above-mentioned types and cardiac metabolic risks.

Cross-phenotype association analysis
We identified multiple genomic regions shared between the amount of coffee intake or choice for coffee types and cardiac metabolic risks through cross-phenotype association analysis of trait pairs with significant genetic correlations (Supplementary Tables 7-29), including many novel regions that have not been reported in previous studies (Table 1).
In summary, there were 326 shared genomic regions between the total amount of coffee consumed and cardiac metabolic risks, of which 304 were shared with BMI and 22 with BF%. Choice for decaffeinated coffee had 238 shared genomic regions with BMI, 75 with T2D, and 30 with AF. Choice for instant coffee had 260, 6, 1, 1, 25, and 9 shared regions with BMI, BF%, FI, HOMA-IR, TG, and MI, respectively. Choice for ground coffee shared the most genomic regions with cardiac metabolic risks: 321 with BMI, 14 with BF%, 136 with T2D, 88 with T2D (adjusted for BMI), 15 with FG, 7 with FI, 10 with HOMA-IR, 48 with HDL cholesterol, 35 with TG, 15 with HF, 31 with CAD, and 17 with MI.
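The counts listed above can be tallied in a few lines; the sketch below simply sums the figures as reported in the text (the dictionary layout and variable names are ours) and confirms that they add up to the 1708 shared regions stated in the abstract.

```python
# Shared genomic regions per trait pair, copied from the counts in the text.
shared_regions = {
    "total coffee intake": {"BMI": 304, "BF%": 22},
    "decaffeinated coffee": {"BMI": 238, "T2D": 75, "AF": 30},
    "instant coffee": {
        "BMI": 260, "BF%": 6, "FI": 1, "HOMA-IR": 1, "TG": 25, "MI": 9,
    },
    "ground coffee": {
        "BMI": 321, "BF%": 14, "T2D": 136, "T2D (adjusted for BMI)": 88,
        "FG": 15, "FI": 7, "HOMA-IR": 10, "HDL cholesterol": 48,
        "TG": 35, "HF": 15, "CAD": 31, "MI": 17,
    },
}

per_exposure = {name: sum(counts.values()) for name, counts in shared_regions.items()}
total = sum(per_exposure.values())

for name, count in per_exposure.items():
    print(f"{name}: {count}")
print(f"all trait pairs: {total}")  # all trait pairs: 1708
```

Note that this total counts regions per trait pair, so a region shared with several outcomes is counted once for each pair in which it appears.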
One of the most important findings of this analysis was the set of novel genomic regions that had not been reported by previous studies. We identified a total of 139 novel regions. For example, rs10492872 on chromosome 16 was shared between BMI and both choice for instant coffee and choice for ground coffee, and its effects on the two coffee choices were in opposite directions; the same was true of rs11660753 on chromosome 18. Another SNP, rs4988483, also located on chromosome 16, was shared between choice for ground coffee and T2D (adjusted and unadjusted for BMI), suggesting that this SNP may affect both choice for ground coffee and risk of T2D through a pathway that is not mediated by BMI. Further analysis of these sites will help to clarify the association between coffee and cardiac metabolic risks.
The discovery of these shared sites suggested that the genetic correlations between coffee intake or choice for different coffee types and cardiac metabolic risks were likely to be driven by these shared regions. Further research on these regions will help to gain a deeper understanding of the pathogenesis of these cardiac metabolic risks and their association with coffee.

KEGG and GO enrichment analysis
The KEGG pathways and GO terms enriched among the shared genes are shown in Supplementary Figures 2-8. We performed enrichment analysis on four shared gene sets: those shared between cardiac metabolic risks and total coffee intake, choice for decaffeinated coffee, choice for ground coffee, and choice for instant coffee. In the KEGG analysis, all gene sets except that for total coffee intake showed significant enrichment in allograft rejection (hsa05330), autoimmune thyroid disease (hsa05320), and viral myocarditis (hsa05416).
In the GO analysis, all four gene sets were significantly enriched in the following five biological processes related to antigen processing: antigen processing and presentation (GO:0019882), antigen processing and presentation of endogenous peptide antigen (GO:0002483), antigen processing and presentation of endogenous peptide antigen via MHC class I (GO:0019885), antigen processing and presentation of exogenous peptide antigen (GO:0002478), and antigen processing and presentation of peptide antigen (GO:0048002).

MR
We used eight different methods to estimate the genetically predicted causal effects of coffee intake and coffee type choice on cardiac metabolic risks. The results are given in Supplementary Tables 30-34 and shown graphically in Figure 2.
For the amount of total coffee intake, harmful associations with BMI (b_CAUSE: 0.36, p-value_CAUSE: 1.8 × 10^-5) and T2D (b_CAUSE: 0.31, p-value_CAUSE: 3.30 × 10^-2) were found by seven methods (all except MR-Egger regression). Even after adjusting for BMI, an increase in T2D risk was still found by six methods (all except CAUSE and MR-Egger regression). In addition, some methods suggested that coffee intake could significantly increase TG (simple median, penalized weighted median, simple mode) and the risk of CAD (weighted median, penalized weighted median).
Analysis of the different coffee types gave varied results. Compared with other types of coffee, choice for decaffeinated coffee was found by some methods to increase BMI (IVW, simple median, weighted median, penalized weighted median) and the risk of T2D (CAUSE; b_CAUSE: 0.29, p-value_CAUSE: 2.30 × 10^-2). It was also suggested that choice for decaffeinated coffee could decrease TC (weighted median, penalized weighted median, weighted mode) but increase the risk of ICH (weighted median, penalized weighted median).
For instant coffee, the results obtained by different methods were not fully consistent. The CAUSE result supported the view that choice for instant coffee could increase the risk of T2D (b_CAUSE: 0.17, p-value_CAUSE: 2.90 × 10^-2) and that this causal effect was mediated by BMI, but the weighted median, penalized weighted median, simple mode, and weighted mode methods showed the opposite result.
Unlike the above two types of coffee, choice for ground coffee showed a protective association with cardiovascular health. CAUSE indicated that it decreases BMI (b_CAUSE: -0.08, p-value_CAUSE: 6.50 × 10^-5) and BF% (b_CAUSE: -0.06, p-value_CAUSE: 2.40 × 10^-2). Regardless of whether BMI was adjusted for, most methods (all except IVW and MR-Egger regression) showed that choice for ground coffee can reduce the risk of T2D (T2D: b_CAUSE: -0.2, p-value_CAUSE: 4.70 × 10^-10; T2D adjusted for BMI: b_CAUSE: -0.11, p-value_CAUSE: 4.60 × 10^-5).

Discussion
In the present study, we found that genetically proxied total coffee intake and choice for decaffeinated or instant coffee were significantly associated with increased cardiac metabolic risks, including BMI, BF%, and T2D, whereas genetically proxied choice for ground coffee was associated with decreased risks. We identified the genes shared by these trait pairs and determined the biological processes and pathways in which they were enriched; the results indicated that the association between coffee intake and cardiac metabolic risks is likely mediated by the immune system. This is of great significance for a better understanding of the impact of coffee on health and of the pathogenesis of cardiac metabolic risks.
In the genetic correlation analysis, our results showed that those opting for ground coffee (including espresso, filter, etc.) were less genetically susceptible to cardiac metabolic risks, while those who tended to choose instant coffee or decaffeinated coffee (any type) were more genetically susceptible (Fig. 1). Given this heterogeneity among consumers of different coffee types, we suggest that data from consumers of different coffee beverages should be combined in statistical analyses only with caution.
We identified many genomic regions shared between coffee intake or choice for different coffee types and cardiac metabolic risks (Supplementary Tables 7-29). Some of these regions have been discovered in single-trait GWASs, but a considerable proportion have not been reported before (Table 1). These regions deserve more attention because they suggest unknown common pathways that may affect both phenotypes, and they may partly account for the "missing heritability" [39] in single-phenotype GWASs.
The results of the GO enrichment analysis showed that the shared genes were significantly enriched in biological processes related to antigen processing (Supplementary Fig. 2, 4, 6, 8). This result suggests that the association between coffee intake and cardiac metabolic risks is likely to be related to immune regulation. The KEGG enrichment results also supported this inference (Supplementary Fig. 3, 5, 7). Moreover, in the cross-phenotype association analysis, the notable SNP rs10492872, mapped to FTO, is shared by the two trait pairs ground coffee/BMI and instant coffee/BMI, and its effects on ground coffee and instant coffee are in opposite directions, as are those of rs11660753, mapped to SMIM21 (Table 1). FTO was the first candidate gene identified for obesity, and it is also related to a variety of other phenotypes, such as alcohol intake [40] and C-reactive protein levels [41]. A recent study found that FTO can help cancer cells escape immune surveillance [42]. These findings suggest that FTO is likely to participate in immune activities. There is less research on SMIM21 than on FTO, but epidemiological studies have found a significant association between it and rheumatoid arthritis [43]. Together, these results suggest that the relationship between coffee intake and cardiac metabolic risks, and the differences among drinkers of various coffee types, are likely to be related to the immune system.
In our MR analysis, we used eight different methods, each with its own characteristics (Fig. 2, Supplementary Tables 30-34). Heterogeneity tests showed that heterogeneity was not uncommon (Supplementary Tables 35-39), which means that the IVs selected according to conventional principles may be invalid or at least partially invalid. Under this circumstance, the results obtained by IVW may be problematic [44]. The test of the intercept term of the MR-Egger regression [36] showed that horizontal pleiotropy either did not exist or was balanced (Supplementary Table 40). However, MR-Egger regression can only test for unrelated pleiotropy, that is, horizontal pleiotropy of the IVs on the outcome that is not related to confounding factors; it cannot be used to identify related pleiotropy. Our genetic correlation results showed that these phenotypes had extensive genetic correlations (Fig. 1), which suggested that related pleiotropy was likely to exist. The median and mode methods [45] place lower requirements on the IVs, but because they use little information, their results need to be interpreted carefully. As a newly proposed method, CAUSE [38] can account for both related and unrelated pleiotropy when distinguishing causal models from shared models, but the validity of its results needs to be tested in more applications. In short, we recommend that MR results be drawn from a combination of different methods, as we did here, to best address relationships between lifestyle factors and health.
Several MR studies on coffee intake and cardiac metabolic risks have already been published [13,46]. For example, Nordestgaard [13] used 5 genetic variants as instrumental variables to study the relationship between genetically derived coffee intake and obesity, metabolic syndrome, and T2D, and Shuai Yuan [46] used 12 variants to assess the potential causal role of coffee consumption in cardiovascular disease. Neither study found a significant result. Compared with these studies, our study has several advantages. First, in addition to total coffee intake, we analysed the association between choice for different coffee types and cardiac metabolic risks and found significant differences between drinkers of different coffee types. Second, instead of using only a few variants related to caffeine metabolism, we used more variants as instrumental variables in order to explain more of the variance in coffee intake. Finally, we used multiple MR methods with different advantages to overcome the limitations of any single method. We therefore believe that our results are more reliable and informative.
Both the genetic correlation analysis and the MR analysis showed that choice for ground coffee was associated with better cardiac metabolic conditions than choice for instant or decaffeinated coffee. This suggests that the content of coffee's components that benefit cardiovascular health differs between coffee types, which is consistent with laboratory investigations. A study of the content of CGAs and caffeine in 83 commercially available coffees found that unblended ground coffee had the highest CGA content and the lowest mean caffeine/CGA ratio [16]. A study of total phenol content reached similar conclusions [4]. In instant coffee, the beneficial effects of its lower content of chlorogenic acids and total phenols are likely to be offset by the harmful effects of added sugar, creamer, and other ingredients, as observational studies on instant coffee have shown [47].
When we analysed total coffee intake directly, without distinguishing between types, we found significant positive associations. This may be because, among the UKB participants, people who drink instant or decaffeinated coffee account for a larger proportion than those who drink ground coffee. The actual data support this inference (decaffeinated coffee: N = 64,717; instant coffee: N = 185,482; ground coffee: N = 73,906; other types of coffee: N = 5,566). Although the cardiovascular benefits of certain components of coffee are well established, previous studies on the association between coffee intake and cardiovascular risks have not reached consistent results. Some studies suggested that moderate coffee drinking can benefit cardiovascular health [48], but other research suggests that coffee is harmful [12,14]. Our results suggest a possible reason for this discrepancy. We suggest that research on the health effects of coffee should distinguish between coffee types.
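The proportions implied by the sample sizes quoted above can be worked out directly. The sketch below uses only the four N values given in the text; the variable names are ours, and the shares are relative to the sum of these four groups rather than to the whole UKB cohort.

```python
# Number of participants reporting each usual coffee type (from the text).
consumers = {
    "decaffeinated coffee": 64_717,
    "instant coffee": 185_482,
    "ground coffee": 73_906,
    "other types of coffee": 5_566,
}

total = sum(consumers.values())
shares = {name: round(100 * n / total, 1) for name, n in consumers.items()}

print(total)  # 329671
for name, pct in shares.items():
    print(f"{name}: {pct}%")
```

On these figures, instant coffee drinkers alone make up more than half of the sample (about 56%), and instant plus decaffeinated coffee drinkers together about three quarters, against roughly 22% for ground coffee.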
There are several notable strengths of our work. First, this is the first article to study simultaneously the genetic and the potential causal relationships between coffee intake and cardiac metabolic risks. We identified many shared genomic regions and found significant causal relationships, which provide new insights into the mechanism of their association. In addition to the amount of total coffee intake, we studied the relationship between choice for different types of coffee and cardiac metabolic risks. The differences in the results suggest the importance of distinguishing between types of coffee in coffee research. In addition to cardiovascular disease, we investigated its risk factors (obesity, blood glucose and insulin homeostasis, blood lipids), which will facilitate a more thorough understanding of the association between coffee and cardiovascular disease. The GWAS summary statistics selected for our study all come from large-scale, high-quality GWASs. We identified many shared loci and found many significant causal relationships that were not found in previous, less powerful MR studies.
Our research can still be improved in several respects. First, for the different types of coffee, we studied the relationship between choosing a given type of coffee and cardiac metabolic risks, not the amount of that type of coffee consumed. In the future, if relevant data become available, the association between the amount of each type of coffee consumed and cardiac metabolic risks can be studied further. Second, although there is a large difference between the results for instant coffee and ground coffee, our results cannot determine whether this difference is due to differences in composition caused by processing or to the usual addition of saccharin and other additives to instant coffee. Later, we will analyse whether the preference for adding milk, cream, or sugar to coffee affects its association with cardiovascular risks. We believe that these sites will be very useful for studying the pathogenesis of cardiac metabolic risks and their relationship with coffee.

Conclusions
In summary, we have shown that choice for ground coffee (including espresso, filter, etc.) has extensive significant negative genetic correlations with T2D and other cardiac metabolic risks. MR analysis using all variants indicated that genetically proxied choice for ground coffee can decrease BMI and the risk of T2D, while other types of coffee may increase the risk of T2D. This study provides new insights and evidence on the health effects of coffee. The results for different coffee types suggest that research on coffee's health effects should pay more attention to distinguishing between coffee types.

Declarations
• Ethics approval and consent to participate: Not applicable.
• Consent for publication: Not applicable.
• Availability of data and materials: All data generated or analysed during this study are included in these published reference articles:
• Authors' contributions: XW, JJ, and TH designed the study. XW performed the statistical analysis. XW wrote the manuscript. All authors helped interpret the data, reviewed and edited the final paper, and approved the submission.

Figure 2
Forest plot for Mendelian randomization results. The vertical axis represents the different coffee types. Within each type of coffee, the rows from top to bottom show the results obtained from CAUSE, IVW, MR-Egger regression, simple median, weighted median, penalized weighted median, simple mode, and weighted mode. The horizontal axis represents the size of the causal effect estimate. This figure shows only the MR results for cardiac metabolic risks with significant causality in at least three methods. The different colours of the squares are used only to distinguish the methods.
