Genome-wide association study adjusting for familial relatedness identifies novel loci for food intake in the UK Biobank

doi:10.21203/rs.3.rs-3212631/v1

Article

https://doi.org/10.21203/rs.3.rs-3212631/v1

This work is licensed under a CC BY 4.0 License

Version 1

This study aimed to identify genetic risk loci associated with dietary intake using recently released data of over 93 million variants from the UK Biobank. By adjusting for familial relatedness among individuals in a linear mixed model, we identified a total of 399 genomic risk loci for the consumption of red meat (n = 15), processed meat (n = 12), poultry (n = 1), total fish (n = 28), milk (n = 50), cheese (n = 59), total fruits (n = 82), total vegetables (n = 50), coffee (n = 33), tea (n = 40), and alcohol (n = 57). Of these, 13 variants did not reach the suggestive significance level (p < 1.0e-5) in a previous study. Under the LDAK model, the heritability (h²) was highest for the consumption of cheese (h² = 10.48%), alcohol (h² = 9.71%) and milk (h² = 9.01%), followed by tea (h² = 8.34%) and fruits (h² = 7.83%). Among the dietary factors, the highest genetic correlation (r²) was observed between milk and tea consumption (r² = 0.86). Post-GWA analyses were further conducted to identify variant annotations and functional pathways using summary statistics. Overall, by analyzing the updated data with adjustment for familial relatedness in this large-scale database, we identified several novel loci for food intake. Further investigations in other populations are needed to understand the contribution of genetic factors to dietary habits in populations of various ethnic backgrounds.

Biological sciences/Genetics/Quantitative trait

Health sciences/Risk factors

Genome-wide association study

food intake

relatedness

linear mixed model

UK Biobank

Accumulating evidence has demonstrated the key roles of diet consumption in modifying the risk of noncommunicable diseases (1–3). The Global Burden of Disease Study estimated that approximately 7.9 million deaths and 187.7 million disability-adjusted life years from noncommunicable diseases could be attributed to dietary risk factors (4). The interactions between genetic factors and the interindividual variability of the disease response to specific diets have been hypothesized in the science of nutrigenetics (5, 6). The choice and preference of foods have been suggested to be influenced by not only physiologic, behavioral, and environmental factors but also genetic traits (7, 8).

A study of a large population-based cohort of British adolescent twins found that a substantial proportion of the preference for foods was explained by genetic variation, e.g., for meat or fish (49%), dairy (44%), fruits (49%), and vegetables (54%) (9). In particular, individual food preferences and dietary habits have been shown to be affected by the senses of taste and smell and by metabolic processes (10). To date, genes and corresponding polymorphisms linked to eating preferences, food addiction, taste sensations, fat and carbohydrate absorption, food intolerances, vitamin metabolism, and xenobiotic metabolism have been identified (11). Additionally, several genome-wide association studies (GWAS) have reported genetic variants related to dietary habits (12–34). In most studies, the presence of relatedness among individuals, which may confound the association between exposures and outcomes in large-scale databases (35), has not been fully accounted for. Specifically, population stratification (ancient relatedness) refers to any differences that may lead to systematic differences in allele frequencies and thus spurious associations, whereas familial relatedness (recent relatedness) refers to the presence of related individuals within the sample, which may violate the assumptions of common analytical tools and artificially inflate test statistics, leading to false-positive findings (35). The most comprehensive GWAS adjusting for familial relatedness reported hundreds of significant loci for single foods and dietary patterns in participants of the UK Biobank (19). However, the underlying biological mechanisms contributing to genetic variation in the intake of several food items (e.g., pork vs. beef vs. lamb/mutton, oily vs. nonoily fish, fresh vs. dried fruits, cooked vs. raw vegetables) remain unclear. To justify the selection of dietary factors, we combined food items into more common food groups and analyzed updated data.

To address this gap, we performed a GWAS of food intake adjusting for both population stratification and cryptic relatedness, using updated data that included more than double the number of single nucleotide polymorphisms (SNPs) analyzed in the previous study. We included red meat, processed meat, poultry, fish, milk, cheese, fruits, vegetables, coffee, tea, and alcohol in our analysis.

Characteristics of the study participants in the UK Biobank

Participants in the UK Biobank had a mean age of 56.9 years and a greater representation of women (N = 220,608, 54.1%). Descriptive statistics for food consumption are also presented in Table 1. Less than 1% of study participants had missing information on dietary intake, except for dairy foods (milk, 5.2%, and cheese, 2.3%).

Table 1

General characteristics and dietary intake of 408,093 participants
Factor	Mean ± SD	Median (IQR)	Missing (%)
Age (years)	56.9 ± 8.0	58 (51–63)
Sex
Men (N = 187,485)
Women (N = 220,608)
Red meat (times/week)	2.1 ± 1.4	2.0 (1.5–2.5)	552 (0.1%)
Processed meat (times/week)	1.5 ± 1.4	1.0 (0.5–3.0)	578 (0.1%)
Poultry (times/week)	1.9 ± 1.2	1.0 (1.0–3.0)	651 (0.2%)
Total fish (times/week)	2.3 ± 1.6	2.0 (1.0–3.5)	490 (0.1%)
Milk (100 mL/day)	2.3 ± 1.0	2.3 (1.7–2.9)	21,205 (5.2%)
Cheese (times/week)	2.5 ± 1.7	3.0 (1.0–3.0)	9,530 (2.3%)
Total fruits (servings/day)	2.6 ± 1.9	2.3 (1.0–3.5)	606 (0.1%)
Total vegetables (servings/day)	4.8 ± 3.1	4.0 (3.0–6.0)	2,297 (0.6%)
Coffee (cups/day)	2.1 ± 2.0	2.0 (0.5–3.0)	659 (0.2%)
Tea (cups/day)	3.4 ± 2.5	3.0 (1.0–5.0)	779 (0.2%)
Alcohol (times/week)	2.8 ± 2.5	1.5 (0.5–3.5)	294 (0.1%)
SD, standard deviation; IQR, interquartile range

Loci and annotation of SNPs related to dietary intake

The results from the genome-wide association analysis for significant SNPs (p < 5 × 10⁻⁸) associated with food intake are presented as Manhattan plots (Fig. 1 and Supplementary Fig. S1). Overall, the Manhattan plots from the relatedness-adjusted GWAS of red meat, fish, fruit, and vegetable intake showed fewer apparent false-positive signals than those from models that did not account for relatedness. In contrast, we observed similar patterns for GWAS of processed meat, poultry, milk, cheese, coffee, tea, and alcohol consumption between models with and without relatedness adjustment. Table 3 presents summary statistics of the top five genomic risk loci for each dietary factor.

Table 2

Heritability estimations for the proportion of dietary intake variance due to genetic differences between individuals and number of significant loci for dietary factors
Dietary factor	Heritability (%)	Standard deviation (%)	Number of significant loci
Red meat	5.81	0.18	983
Processed meat	5.42	0.34	293
Poultry	3.50	0.15	1
Total fish	5.58	0.17	1,792
Milk	9.01	0.21	3,315
Cheese	10.48	0.22	2,047
Total fruits	7.83	0.19	2,990
Total vegetables	5.30	0.17	997
Coffee	6.26	0.18	2,789
Tea	8.34	0.20	3,292
Alcohol	9.71	0.21	2,967

Table 3

Summary statistics of top five genomic risk loci identified from genome-wide association analysis
Dietary factor	Chr	Variant	Position	Nearest gene	Reference allele	Alternative allele	RAF	Beta	SE	P-value
Red meat	1	rs2055145	45926495	TESK2	C	G	0.762	0.0217	0.0036	2.06E-09
	1	rs10789340	72940273	RPL31P12	A	G	0.376	0.0196	0.0032	8.15E-10
	6	rs4486004	26167710	HIST1H2BD	G	T	0.780	0.0253	0.0037	1.09E-11
	9	rs141229573	15576114	CCDC171	T	TATC	0.525	0.0283	0.0031	7.74E-20
	16	rs12931387	5676432	RP11-420N3.2	A	C	0.808	-0.0249	0.0039	2.26E-10
Processed meat	3	rs13091492	81891476	RP11-359D24.1	A	G	0.625	-0.0185	0.0031	1.40E-09
	8	rs2980508	8171732	SGK223	G	A	0.511	0.0194	0.0030	5.76E-11
	8	rs113442811	10772255	XKR6	C	CACAGAAGA	0.508	-0.0230	0.0030	9.45E-15
	11	rs11032362	33759092	CD59	G	A	0.910	0.0323	0.0052	3.61E-10
	19	rs8103840	49254955	FUT1	C	T	0.531	-0.0196	0.0030	5.90E-11
Poultry	16	rs34473833	64876684	CDH11	A	G	0.984	0.0588	0.0108	4.61E-08
Total fish	6	rs3734543	26468545	BTN2A1	G	C	0.878	0.0370	0.0053	1.81E-12
	12	rs35287743	110057250	MVK	G	T	0.883	0.0476	0.0054	1.17E-18
	16	rs7187250	53810546	FTO	C	A	0.607	-0.0261	0.0035	1.35E-13
	19	rs429358	45411941	APOE	T	C	0.844	-0.0380	0.0047	1.16E-15
	19	rs8103840	49254955	FUT1	C	T	0.531	-0.0278	0.0035	1.34E-15
Milk	4	rs2199936	89045331	ABCG2	A	G	0.114	-0.0405	0.0036	9.59E-30
	7	rs4410790	17284577	AC003075.4	T	C	0.365	-0.0645	0.0024	4.72E-164
	7	7:73042302	73042302	MLXIPL	GCTTT	G	0.867	-0.0356	0.0034	2.57E-26
	7	rs17685	75616105	POR	G	A	0.722	-0.0345	0.0025	3.45E-42
	15	rs12909335	75214789	COX5A	T	G	0.439	-0.0386	0.0023	1.71E-63
Cheese	2	rs549814	45153508	RP11-89K21.1	C	T	0.672	-0.0440	0.0041	1.82E-26
	3	3:49800212	49800212	IP6K1	CT	C	0.492	0.0302	0.0039	1.25E-14
	8	rs7012814	9173358	RP11-115J16.1	G	A	0.525	0.0307	0.0039	2.98E-15
	18	rs2960578	21143739	NPC1	T	G	0.501	-0.0271	0.0039	2.27E-12
	20	rs6029941	35519475	TLDC2:SAMHD1	G	A	0.456	0.0271	0.0039	2.89E-12
Total fruits	1	rs1620977	72729142	NEGR1	A	G	0.267	0.0413	0.0047	7.53E-19
	7	rs6967154	143713124	OR6B1	A	T	0.627	-0.0591	0.0042	4.93E-44
	10	rs10828266	22098701	DNAJC1	A	G	0.283	-0.0460	0.0046	9.45E-24
	14	rs34162196	22038125	OR10G3	C	T	0.899	0.0593	0.0068	3.30E-18
	19	rs429358	45411941	APOE	T	C	0.844	-0.0484	0.0057	1.27E-17
Total vegetables	1	1:153761809	153761809	Y_RNA	AT	A	0.441	0.0498	0.0069	5.62E-13
	3	rs7619139	25110415	AC133680.1	T	A	0.411	-0.0562	0.0069	3.72E-16
	7	rs139549768	54932540	SNORA73	A	T	1.000	-3.7828	0.4597	1.89E-16
	7	rs145929636	57391143	MIR3147	C	G	1.000	-4.0848	0.5178	3.04E-15
	15	rs181495534	54039101	WDR72	A	T	1.000	-3.3383	0.4483	9.60E-14
Coffee	7	rs4410790	17284577	AC003075.4	T	C	0.366	-0.1218	0.0046	8.90E-157
	7	rs34060476	73037956	MLXIPL	A	G	0.866	-0.0611	0.0065	2.67E-21
	7	rs1057868	75615006	POR	C	T	0.715	-0.0619	0.0049	3.27E-37
	15	rs2017998	74721905	SEMA7A	G	C	0.379	-0.0662	0.0045	3.57E-48
	18	rs476828	57852587	RP11-795H16.2	T	C	0.762	-0.0482	0.0052	8.93E-21
Tea	6	rs2465018	51241140	RP3-437C15.2	G	A	0.769	-0.0553	0.0067	2.05E-16
	7	rs4410790	17284577	AC003075.4	T	C	0.366	-0.1150	0.0059	9.38E-86
	7	rs17685	75616105	POR	G	A	0.722	-0.0665	0.0063	3.96E-26
	15	rs12909335	75214789	COX5A	T	G	0.439	-0.0756	0.0057	2.31E-40
	22	rs4820593	24887087	ADORA2A-AS1:UPB1	A	T	0.415	-0.0665	0.0057	5.28E-31
Alcohol	2	2:27748992	27748992	GCKR	AT	A	0.089	-0.0882	0.0056	1.36E-55
	4	rs11940694	39414993	KLB	A	G	0.506	-0.0766	0.0056	1.43E-42
	4	rs29001570	99994405	ADH5	T	C	0.664	0.5175	0.0358	1.85E-47
	6	rs9482094	98364895	RP11-436D23.1	A	G	0.250	0.0514	0.0056	6.39E-20
	16	rs7191618	28565667	CCDC101	C	G	0.260	0.0546	0.0055	3.49E-23
Chr, chromosome; RAF, reference allele frequency; SE, standard error.

We identified a total of 399 genomic risk loci for the consumption of red meat (n = 15), processed meat (n = 12), poultry (n = 1), total fish (n = 28), milk (n = 50), cheese (n = 59), total fruits (n = 82), total vegetables (n = 50), coffee (n = 33), tea (n = 40), and alcohol (n = 57) in the linear mixed model adjusting for cryptic relatedness (Supplementary Tables S1-S11). Of these, variants rs2199936 (chromosome 4, ABCG2 gene, Supplementary Fig. S2), rs139797380 (chromosome 6, SLC35D3 gene, Supplementary Fig. S3), and rs4410790 (chromosome 7, AC003075.4 gene, Supplementary Fig. S4) were associated with milk, coffee, and tea consumption. Variant 2:27748992 (chromosome 2, GCKR gene, Supplementary Fig. S5) was associated with the consumption of milk, coffee, and alcohol. Variant rs8103840 (chromosome 19, FUT1 gene, Supplementary Fig. S6) was associated with the intake of processed meat, fish, and fruits. In addition, some SNPs were associated with two dietary factors, including rs201406724 (milk and tea), rs11940694 (milk and alcohol), rs2465018 (milk and tea), rs17685 (milk and tea), rs4726481 (tea and alcohol), rs7012814 (cheese and tea), 8:73433232 (milk and tea), rs11032362 (processed meat and fruits), 12:11271915 (coffee and tea), rs12591786 (milk and tea), rs12909335 (milk and tea), rs9937521 (tea and alcohol), rs12459249 (milk and coffee), and rs429358 (fish and fruits).

Functional gene set enrichment analysis

For total fish intake, chromatin assembly (p = 2.14e-40), protein heterodimerization activity (p = 1.21e-26), and histone modifications (p = 7.07e-21) were detected as the most significant pathways (Supplementary Fig. S7). For milk consumption, xenobiotic metabolic process (p = 0.007) and hexosaminidase activity (p = 0.044) were the only significant biological process and molecular function, respectively (Supplementary Fig. S8). For cheese and fruit intake, only molecular functions were found, with the most significant pathways being growth hormone receptor binding (p = 5.79e-5 for cheese intake, Supplementary Fig. S9) and interleukin-1 receptor activity (p = 0.001 for fruit intake, Supplementary Fig. S10). For coffee and tea consumption, sensory perception of bitter taste (p = 1.14e-8 for coffee and p = 2.18e-16 for tea) and bitter taste receptor activity (p = 1.18e-10 for coffee and p = 1.25e-20 for tea) were the most significant gene ontology pathways. In addition, oxidation by cytochrome P450 (p = 5.81e-3) was the only WikiPathway for coffee consumption, whereas aryl hydrocarbon receptor (p = 0.017), circadian rhythm related genes (p = 0.022), fatty acid omega oxidation (p = 0.035), and genes involved in male infertility (p = 0.035) were found for tea consumption (Supplementary Fig. S11-S12). For alcohol consumption, ethanol oxidation (p = 3.51e-5) and zinc-dependent alcohol dehydrogenase activity (p = 1.81e-8) were the most significant biological process and molecular function, respectively, and fatty acid omega oxidation (p = 7.24e-4) was the only WikiPathway (Supplementary Fig. S13).

Heritability, and genetic and phenotypic correlation

The heritability was highest for the consumption of cheese (h² = 10.48%), alcohol (h² = 9.71%), and milk (h² = 9.01%), followed by tea (h² = 8.34%) and fruits (h² = 7.83%). Other foods had a heritability of approximately 5%–6%, except poultry (h² = 3.50%) (Table 2). Furthermore, we found relatively high genetic correlations between the intake of milk and tea (r² = 0.86), fish and vegetables (r² = 0.52), fruits and vegetables (r² = 0.49), red meat and processed meat (r² = 0.48), processed meat and fruits (r² = −0.46), cheese and alcohol (r² = 0.44), and red meat and poultry (r² = 0.43). The highest Pearson correlation coefficients between food consumption were found for coffee and tea (r² = −0.32) and milk and tea (r² = 0.30) (Fig. 2).

By using the most recently released imputation data of more than 93 million variants for 408,093 participants in the large-scale UK Biobank, we identified 399 genomic risk loci for self-reported traits reflecting daily consumption of red meat, processed meat, poultry, fish, milk, cheese, fruits, vegetables, coffee, tea, and alcohol. Of these, 231 SNPs were either unavailable or did not reach the significance level in a previous study (19). Overall, the heritability of these foods ranged between 3.5% and 10.5%, which reflects the proportion of dietary intake variation explained by genetic factors. Our gene set enrichment analysis found several significant functional pathways for the intake of total fish, milk, cheese, fruits, coffee, tea, and alcohol.

When we searched PubMed up to September 2022 for GWAS of dietary traits, a total of 23 GWAS were identified, of which seven included the UK Biobank population (12–34) (Supplementary Table S12). The traits of interest were diverse, including bitter and sweet beverages (coffee, tea, alcohol, and juices), nutrients (carbohydrates, fats, and proteins), dietary patterns (meat-related diet and fish and plant-related diet), and a dietary index (Dietary Approaches to Stop Hypertension) (12, 15, 16, 20, 29, 31), or all 85 single item traits in the food frequency questionnaire and their corresponding 85 PC diets (19). In Cole et al.’s study, some single food item quantitative traits shared a substantial number of significant loci (Supplementary Table S13) (19). Therefore, it was more feasible to combine single food items that might exert similar patterns of genetic effects (beef, lamb, and pork as red meat; oily and nonoily fish as total fish; fresh and dried fruits as total fruits; and cooked and salad/raw vegetables as total vegetables).

In a large-scale population-based study, the presence of relatedness among individuals may confound the association between exposures and outcomes (35). Among the seven GWAS analyzing the UK Biobank data, relatedness was considered only in Cornelis et al.’s and Zhong et al.’s studies, which excluded participants with kinship coefficients > 0.0442 (20, 31). According to data released for the entire UK Biobank, only 0.04% of participants had ten or more third-degree relatives, whereas more than 30% of participants were related to at least one other participant (36). The cutoff value of 0.0442, which refers to pairs of individuals with third-degree or closer relationships (37), therefore did not fully account for the relatedness among study participants. In Cole et al.’s study, the mixed model approach implemented in the BOLT-LMM software fully adjusted for cryptic relatedness (19). By using the dense genetic relatedness matrix (GRM) and a leave-one-chromosome-out scheme, BOLT-LMM generally achieved higher power than the sparse GRM used in the FastGWA tool (38). However, FastGWA was suggested to be more robust because its estimate of the ‘genetic variance’ also captures the variance attributable to common environmental effects and is therefore higher than that of BOLT-LMM (38). In the present study, we further excluded those who were genetically identified as having non-White British ethnic backgrounds to control for population stratification. Further studies may compare the performance of FastGWA and BOLT-LMM in identifying genetic determinants of food intake.

Having obtained dietary habits from the questionnaire, we treated the amount of food consumption as a continuous variable and applied the linear mixed model. A previous study converted food-liking traits into numerical values (range 0–9) without justification (39). Given that converting food preference phenotypes measured on a hedonic scale into numeric values is not appropriate, the proportional odds logistic mixed model (POLMM) has been shown to handle ordinal categorical phenotypes, especially when the phenotype is extremely imbalanced (40). The authors further applied the POLMM to the consumption frequency of food items (never or almost never, once every few months, once a month, once a week, 2–4 times per week, and almost daily) in the UK Biobank without converting the responses into numeric values, and identified loci in the top 10 genes that were replicated in our current study (e.g., CCDC171 for beef, pork, and lamb, XKR6 for processed meat, LY6H for poultry, and MLLT10 for oily fish) (40).

In this study, we found some variants associated with more than one dietary trait, primarily milk, coffee, tea, and alcohol consumption. In particular, variants in the ABCG2, SLC35D3, GCKR, and AC003075.4 genes were each identified as genomic risk loci for three dietary factors. Cornelis et al. also detected variants at the ABCG2 gene in relation to coffee consumption; however, Zhong et al. found variants at the ABCG2 gene associated with total bitter beverage intake but not coffee or tea consumption (20, 31). ABCG2, which is metabolized by cytochrome P450s, is expressed in the apical membranes of several organs, such as the liver, kidney, intestine, and brain, and is involved in preventing the absorption and excessive accumulation of xenobiotic and endogenous substrates in certain tissues (20, 31, 41). ABCG2 is highly induced during lactation, plays an important role in the transport of uric acid into milk, and might affect the redox potential of milk (42). Although the influence of the SLC35D3 and GCKR genes on dietary habits is unknown, SLC35D3 was suggested to regulate dopamine signaling and to be involved in metabolic control in the central nervous system (43). GCKR polymorphisms were linked to lactate levels, multiple lipids, and metabolic traits (44–46). AC003075.4 was suggested to have a negative feedback mechanism with the expression of AHR, which is involved in caffeine metabolism and is suppressed by catechins in tea (20, 21, 31, 47, 48).

The identification of the rs8103840 variant, near the FUT1 and FGF21 genes, was aligned with findings from Niarchou et al., in which FGF21 reached genome-wide association significance for both meat-related and fish- and plant-related dietary patterns (15). The FGF21 gene exerts its endocrine action in both the central nervous system and adipose tissue and was involved in the metabolism of glucose, lipids, and proteins (49, 50). We found decreased consumption of processed meat and fish and increased consumption of fruits in individuals carrying the C allele for rs8103840.

By obtaining summary statistics of more than 11 million SNPs from the most recent comprehensive GWAS for dietary intake (19), we identified 231 variants associated with the intake of red meat (n = 9), processed meat (n = 7), poultry (n = 1), total fish (n = 12), milk (n = 50), cheese (n = 38), total fruits (n = 41), total vegetables (n = 42), coffee (n = 11), tea (n = 13), and alcohol (n = 16) which were either not available in the previous study or did not reach the significance level (p < 5e-8) (Supplementary Tables S1-S11) (19). Of these, almost half of SNPs (n = 107) were not available in Cole et al.’s study due to their smaller scale of imputed genetic data, and 70 SNPs reached the threshold for suggestive significance (5e-8 ≤ p < 1e-5). Among 54 loci that did not reach the suggestive significance level (p ≥ 1e-5), 41 loci were for milk intake (Supplementary Table S14). However, the amount of daily milk consumption was not evaluated in the previous study, and we identified the remaining 12 novel loci for red meat (rs12144834, AL592205.1; rs150877559, DLEU1; rs12938702, ST6GALNAC1; and rs7251466, ZNF574), poultry (rs34473833, CDH11), total fish (rs4600686, RNU6-812P), cheese (rs12472445, RBMS1; rs4886168, RNU7-88P; and rs276950, LINC01082), total fruits (rs34156224, AQP4-AS1:CHST9), and coffee (rs2682909, RP11-307C19.2). Since the previous study performed genome-wide association analysis for types of milk only, all genomic risk loci for our estimated milk intake were either unavailable or did not reach a significance level of 5e-08, and thus were determined to be novel loci in this study.

Functional annotations of dietary traits have not been elucidated in previous GWASs. In this study, our gene set enrichment test identified several gene ontology terms and WikiPathways related to the consumption of fish, milk, cheese, fruit, coffee, tea, and alcohol. None of the pathways for red meat, processed meat, poultry, and vegetable intake were significant. However, among 122 novel variants that were not significant in previous studies, some SNPs related to red meat intake were linked to pathogenesis. Variants rs150877559 (DLEU1), rs12938702 (ST6GALNAC1), and rs7251466 (ZNF574) were found in genes involved in the pathogenesis of colorectal, gastric, and ovarian cancers (51–54). This may suggest the role of red meat intake-related genes in cancer cell proliferation and migration. Furthermore, rs150877559 (DLEU1) and rs7251466 (ZNF574) were identified as novel loci in our present study. However, the underlying mechanisms of novel variants involved in dietary habits of food intake remain unclear. For processed meat intake, we identified the rs17676243 variant, which is in the NR3C2 gene. However, the expression of NR3C2 was found to be upregulated by the consumption of red meat but not processed meat in a gene expression study. The inconsistency may be due to the limited sample size and the low consumption of meats in study participants in the previous study (55).

By including more than double the number of SNPs compared to the previous study and adjusting for familial relatedness, we obtained point estimates of heritability from summary statistics that were slightly lower than those calculated in Cole et al.’s study (processed meat, 5.42% vs. 6.6%; poultry, 3.50% vs. 4.9%; cheese, 10.48% vs. 10.8%; coffee, 6.26% vs. 7.9%; tea, 8.34% vs. 9.1%; and alcohol, 9.71% vs. 12.1%; Table 2 and Supplementary Table S13) (19). Nevertheless, no statistical test was available to assess whether these differences were significant. The heritability of food groups (red meat, total fish, total fruits, and total vegetables) appeared to be in the range of the heritability of the corresponding food items. However, we were unable to compare the heritability of milk because milk intake was not assessed as a quantitative trait in the previous study.

Although the present GWAS included much more genetic information of imputed SNPs compared to earlier GWAS (12–34) and applied the recent methodology to account for confounding effects of both population stratification and cryptic relatedness in large-scale biobank data, our results were limited to the White British population only. Given that disparities in dietary intake according to different ethnic groups may exist due to cultural knowledge and food-related skills (56, 57), analyses for individuals from ethnic backgrounds other than White British require additional investigations. Furthermore, due to the lack of replication samples, our findings need to be validated in other independent studies.

In summary, the present study comprehensively assessed the influence of genetic variants and their functional mechanisms on the dietary behaviors of participants in the UK Biobank. By carefully accounting for population stratification and cryptic relatedness in this large-scale, recently released imputation dataset, we identified several novel loci for food consumption. In practice, genetic variants associated with dietary intake may cluster into groups that are associated differently with diseases via several biological mechanisms. Furthermore, the summary statistics of our GWAS can be used as a source of instrumental variables in the Mendelian randomization framework to address the causal relationship between dietary intake and health outcomes.

Ethics approval

The UK Biobank study (https://www.ukbiobank.ac.uk) received ethical approval from the North West Multi-center Research Ethics Committee (REC reference: 11/NW/03820). Written informed consent was obtained from all participants before enrolment in the study, which was conducted in accordance with the principles of the Declaration of Helsinki. This research was conducted using the UK Biobank Resource (Application Number: 94695). The study protocol was approved by the Institutional Review Board of Seoul National University (No. 2101-153-1191). All research was performed in accordance with relevant guidelines/regulations.

Study population

The UK Biobank is a prospective cohort study that included 502,539 participants aged 37–73 years, who resided within 25 miles of 22 recruiting centers across England, Wales, and Scotland between 2006 and 2010. The study was approved by the North West Multi-centre Research Ethics Committee. The methodological details and rationale of the UK Biobank have been published elsewhere (58–60).

In the present study, we excluded participants without genetic information (N = 15,130), sex mismatch (N = 367), putative sex chromosome aneuploidy (N = 651), and those who were either genetically identified or self-reported as having ethnic backgrounds other than White British (N = 78,378). After exclusion, the sample available for the final analysis was restricted to 408,093 individuals (Fig. 3).

Genotyping and quality control

Genotyping was performed using either the custom UK Biobank Axiom Array or the Affymetrix Axiom Array, as described elsewhere (59, 60). Genotyping data were imputed using the combined UK10K and 1000 Genomes Phase 3 reference panel and the Haplotype Reference Consortium reference panel, which resulted in a total of 93,095,623 markers (59). In the quality control procedure, we excluded SNPs with low imputation quality (imputation score < 0.3, n = 15,368,777), high missingness (geno > 0.05, n = 909,502), or low minor allele frequency (maf < 0.0002, n = 55,398,429), and those that deviated from the expected Hardy-Weinberg equilibrium (p < 1e-6, n = 8,717,604) (61). A total of 27,503,596 SNPs passed the quality filtering and were retained.
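
The four exclusion criteria above can be read as a single pass/fail rule per variant. The following is a minimal sketch in Python, not the pipeline used in the study: the `Variant` record and its field names (`info`, `missing_rate`, `maf`, `hwe_p`) are hypothetical, and only the threshold values are taken from the text.

```python
from dataclasses import dataclass

# Thresholds as stated in the text; everything else here is illustrative.
MIN_INFO = 0.3       # imputation quality score
MAX_MISSING = 0.05   # genotype missingness
MIN_MAF = 0.0002     # minor allele frequency
MIN_HWE_P = 1e-6     # Hardy-Weinberg equilibrium test p-value


@dataclass
class Variant:
    """Hypothetical per-variant summary record."""
    snp_id: str
    info: float
    missing_rate: float
    maf: float
    hwe_p: float


def passes_qc(v: Variant) -> bool:
    """Return True only when the variant survives all four filters."""
    return (
        v.info >= MIN_INFO
        and v.missing_rate <= MAX_MISSING
        and v.maf >= MIN_MAF
        and v.hwe_p >= MIN_HWE_P
    )


def filter_variants(variants):
    """Keep the variants that pass quality control, preserving order."""
    return [v for v in variants if passes_qc(v)]
```

Because a variant can fail more than one criterion, the per-filter exclusion counts reported above need not sum to the total number of variants removed.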

Phenotype assessment

Baseline information about dietary habits in the preceding year was obtained from a touchscreen questionnaire (62, 63). Of these, frequency traits of beef, pork, lamb, processed meat, poultry, oily fish, nonoily fish, cheese, and alcohol intake, and quantitative traits of fresh and dried fruits (pieces/day), cooked and raw vegetables (tablespoons/day), and coffee and tea (cups/day) consumption were included in our analyses. For alcohol intake, we used the corresponding numeric values (times/week) for the analysis, where 0 = never, 0.125 = special occasions only, 0.5 = one to three times a month, 1.5 = 1–2 times a week, 3.5 = 3–4 times a week, and 7 = daily or almost daily. For other categorical phenotypes, we used the corresponding numeric values (times/week) for the analysis, where 0 = never, 0.5 = less than once a week, 1 = once a week, 3 = 2–4 times a week, 5.5 = 5–6 times a week, and 7 = once or more daily. We grouped single items to obtain the total intake of red meat (including pork, beef, and lamb, times/week), total fish (including oily and nonoily fish, times/week), total fruits (including fresh and dried fruits, servings/day), and total vegetables (including cooked and raw vegetables, servings/day) (64). Milk consumption (mL/day) was estimated based on the type of milk (never/rarely, full cream, semiskimmed, or skimmed), breakfast cereal (bowls per day), coffee (cups per day), and tea (cups per day) intake (64). The 24-hour dietary data were used to validate the estimation of milk intake, and 94% of the total milk consumption was found to come from milk added to breakfast cereal, coffee, and tea (64). Details of the process of converting dietary information from the food frequency questionnaire to quantitative traits are summarized in Table 4.

Table 4

Summary of the process converting dietary items from the food frequency questionnaire into quantitative traits
Food item	Response and conversion	Food group
Pork	Never: 0; Less than once a week: 0.5 time/week; Once a week: 1 time/week; 2–4 times a week: 3 times/week; 5–6 times a week: 5.5 times/week; Once or more daily: 7 times/week	Red meat (times/week) = pork + beef + lamb
Beef	As for pork	Red meat (times/week) = pork + beef + lamb
Lamb	As for pork	Red meat (times/week) = pork + beef + lamb
Processed meat	As for pork	-
Poultry	As for pork	-
Oily fish	As for pork	Total fish (times/week) = oily fish + non-oily fish
Non-oily fish	As for pork	Total fish (times/week) = oily fish + non-oily fish
Cheese	As for pork	-
Milk type	Never/rarely having, full cream, semi-skimmed, or skimmed milk; Milk (mL/day) = 0 if never/rarely having milk; otherwise Milk (mL/day) = 100 × bowls of breakfast cereal + 25 × cups of coffee + 35 × cups of tea	-
Fresh fruits	Pieces/day	Total fruits (servings/day) = fresh fruits + ½ × dried fruits
Dried fruits	Pieces/day	Total fruits (servings/day) = fresh fruits + ½ × dried fruits
Cooked vegetables	Tablespoons/day	Total vegetables (servings/day) = cooked vegetables + raw vegetables
Raw vegetables	Tablespoons/day	Total vegetables (servings/day) = cooked vegetables + raw vegetables
Coffee	Cups/day	-
Tea	Cups/day	-
Alcohol	Never: 0; Special occasions only: 0.125 time/week; One to three times a month: 0.5 time/week; One to two times a week: 1.5 times/week; Three to four times a week: 3.5 times/week; Daily or almost daily: 7 times/week	-
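
The conversions in Table 4 can be expressed as lookup tables and a few arithmetic helpers. The sketch below is illustrative only: the function names and the `"Never/rarely"` label for the milk-type response are assumptions made for this example, while the numeric values and formulas are those given in the table.

```python
# Response-to-frequency maps (times/week), copied from Table 4.
FREQ_PER_WEEK = {
    "Never": 0.0,
    "Less than once a week": 0.5,
    "Once a week": 1.0,
    "2-4 times a week": 3.0,
    "5-6 times a week": 5.5,
    "Once or more daily": 7.0,
}

ALCOHOL_PER_WEEK = {
    "Never": 0.0,
    "Special occasions only": 0.125,
    "One to three times a month": 0.5,
    "One to two times a week": 1.5,
    "Three to four times a week": 3.5,
    "Daily or almost daily": 7.0,
}


def red_meat(pork, beef, lamb):
    """Red meat (times/week) = pork + beef + lamb."""
    return sum(FREQ_PER_WEEK[r] for r in (pork, beef, lamb))


def total_fish(oily, non_oily):
    """Total fish (times/week) = oily fish + non-oily fish."""
    return FREQ_PER_WEEK[oily] + FREQ_PER_WEEK[non_oily]


def total_fruits(fresh_pieces, dried_pieces):
    """Total fruits (servings/day) = fresh + 1/2 * dried."""
    return fresh_pieces + 0.5 * dried_pieces


def total_vegetables(cooked_tbsp, raw_tbsp):
    """Total vegetables (servings/day) = cooked + raw."""
    return cooked_tbsp + raw_tbsp


def milk_ml_per_day(milk_type, cereal_bowls, coffee_cups, tea_cups):
    """Estimated milk intake (mL/day); 0 for never/rarely milk users.

    "Never/rarely" is a hypothetical label for that response category.
    """
    if milk_type == "Never/rarely":
        return 0.0
    return 100.0 * cereal_bowls + 25.0 * coffee_cups + 35.0 * tea_cups
```

For example, a participant who uses semi-skimmed milk and reports one bowl of cereal, two cups of coffee, and three cups of tea per day would be assigned 100 + 50 + 105 = 255 mL/day.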

Statistical analysis

Genome-wide association analyses

In the genome-wide association analyses of SNPs and food intake phenotypes, beta coefficients were estimated per additional copy of the minor allele in a linear mixed model framework (38, 65). Analyses were adjusted for age, sex, and the first 6 genetic principal component scores, which were released by the UK Biobank and used to define the White British ancestry subset (59), as fixed effects.
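
For orientation, a model of this kind is commonly written in the following general form. The notation is ours and is a sketch of the model structure described in this section, not a formula taken from the authors' analysis scripts.

```latex
y = X_c \beta_c + x_s \beta_s + g + e, \qquad
g \sim N(0, \pi \sigma_g^2), \qquad
e \sim N(0, I \sigma_e^2)
```

Here y is the vector of a food intake phenotype, X_c holds the fixed-effect covariates (age, sex, and the first 6 principal components), x_s is the vector of minor allele counts at the tested SNP with per-allele effect β_s, g is the random genetic effect whose covariance is proportional to the genetic relationship matrix π, and e is the residual.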

To account for cryptic relatedness in the variance of genetic effects, we extracted genotype data for 93,183 SNPs that were among those used for the final kinship inference (n = 93,511) (59) and that remained available after our quality control process (n = 27,503,596). These markers were used to estimate the genetic relationship matrix (GRM) of pairwise relatedness between individuals. A cutoff value of 0.05 was applied to remove one individual from each pair with a relatedness greater than 0.05 (66). The computed GRM was included as a random effect in the linear mixed model.
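
As a rough illustration of what a GRM and a 0.05 relatedness cutoff mean, the sketch below computes a GRM from standardized allele counts and greedily drops one member of each pair above the cutoff. It is a toy implementation under stated assumptions: the function names and the tiny genotype matrix are hypothetical, the rule for which member of a pair to drop is arbitrary, and the actual analysis relied on dedicated software rather than code like this.

```python
def grm(genotypes):
    """Genetic relationship matrix from minor-allele counts (0/1/2).

    genotypes: list of individuals, each a list of counts, one per SNP.
    Entry (j, k) is the mean over SNPs of the product of standardized counts.
    """
    n = len(genotypes)
    m = len(genotypes[0])
    freqs = [sum(ind[i] for ind in genotypes) / (2 * n) for i in range(m)]
    # Skip SNPs that are monomorphic in the sample (variance would be zero).
    used = [i for i in range(m) if 0 < freqs[i] < 1]
    z = [
        [(ind[i] - 2 * freqs[i]) / (2 * freqs[i] * (1 - freqs[i])) ** 0.5 for i in used]
        for ind in genotypes
    ]
    return [
        [sum(a * b for a, b in zip(z[j], z[k])) / len(used) for k in range(n)]
        for j in range(n)
    ]


def prune_related(matrix, cutoff=0.05):
    """Greedily drop the later individual of each pair with relatedness > cutoff."""
    removed = set()
    n = len(matrix)
    for j in range(n):
        if j in removed:
            continue
        for k in range(j + 1, n):
            if k not in removed and matrix[j][k] > cutoff:
                removed.add(k)
    return [j for j in range(n) if j not in removed]


# Toy data: individuals 0 and 1 are identical; 2 and 3 share some alleles.
toy_genotypes = [
    [0, 1, 2, 0, 1, 2],
    [0, 1, 2, 0, 1, 2],
    [2, 1, 0, 2, 1, 0],
    [1, 1, 1, 1, 1, 1],
]
toy_grm = grm(toy_genotypes)
kept = prune_related(toy_grm)
```

With so few individuals and SNPs the relatedness values are far from realistic; the example only shows the mechanics of the calculation and of the cutoff.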

GWA analyses under the linear mixed model were conducted using the fastGWA tool (38), and Manhattan plots of the summary statistics were created in R with the ‘qqman’ package (67). The results were further compared with findings from the plink2 tool (68, 69), which did not account for relatedness between individuals.

Genomic risk loci and regional annotation

The identification of genomic risk loci and SNP annotation were performed with the SNP2GENE process of the web-based FUMA tool (70). Using summary statistics from the GWAS, independent significant SNPs were identified as those with p < 5 × 10⁻⁸ that were independent of each other at r² < 0.6. Among these, SNPs that were independent of each other at r² < 0.1 were defined as lead SNPs, and SNPs in linkage disequilibrium (LD) with independent significant SNPs (r² ≥ 0.6) were used to define genomic risk loci. The 1000G Phase3 EUR population was used as the reference panel, and the maximum distance between LD blocks to merge into a genomic locus was set at 250 kb (70).
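
The selection logic described above can be sketched as a greedy procedure. This is a simplified illustration, not FUMA's implementation: the function names, the made-up SNPs, and the `r2` lookup are hypothetical, all SNPs are assumed to lie on one chromosome, and loci are merged by SNP position rather than by full LD blocks.

```python
def select_independent(snps, r2, p_threshold=5e-8, r2_threshold=0.6):
    """Greedy selection: visit significant SNPs from smallest p value upward
    and keep a SNP only if its r² with every SNP already kept is below the threshold."""
    kept = []
    significant = [s for s in snps if s["p"] < p_threshold]
    for snp in sorted(significant, key=lambda s: s["p"]):
        if all(r2(snp["id"], k["id"]) < r2_threshold for k in kept):
            kept.append(snp)
    return kept


def merge_loci(snps, max_gap=250_000):
    """Group SNPs (one chromosome) into loci when neighbours are <= max_gap bp apart."""
    loci = []
    for snp in sorted(snps, key=lambda s: s["pos"]):
        if loci and snp["pos"] - loci[-1][-1]["pos"] <= max_gap:
            loci[-1].append(snp)
        else:
            loci.append([snp])
    return loci


# Made-up summary statistics and pairwise LD values, for illustration only.
toy_snps = [
    {"id": "rsA", "pos": 1_000_000, "p": 1e-12},
    {"id": "rsB", "pos": 1_050_000, "p": 1e-10},
    {"id": "rsC", "pos": 1_200_000, "p": 1e-9},
    {"id": "rsD", "pos": 5_000_000, "p": 2e-8},
    {"id": "rsE", "pos": 5_010_000, "p": 1e-6},
]
toy_ld = {frozenset(("rsA", "rsB")): 0.8, frozenset(("rsA", "rsC")): 0.3}


def r2(a, b):
    """Look up pairwise r²; pairs not listed are treated as unlinked."""
    return toy_ld.get(frozenset((a, b)), 0.0)


independent = select_independent(toy_snps, r2)                   # r² < 0.6
leads = select_independent(independent, r2, r2_threshold=0.1)    # r² < 0.1
loci = merge_loci(independent)                                   # 250 kb window
```

In this toy example rsB is dropped because it is in strong LD with rsA, rsC is an independent significant SNP but not a lead SNP, and rsA and rsC fall into the same locus because they are 200 kb apart.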

To map candidate SNPs to genes, three methods were used: positional, eQTL, and chromatin interaction mapping. For positional mapping, SNPs were mapped to genes based on annotations from ANNOVAR. For eQTL mapping, both independent significant SNPs and their LD SNPs were mapped to eQTLs in user-defined tissue types. For chromatin interaction mapping, independent significant SNPs and their LD SNPs that overlapped one end of a significant interaction were mapped to genes whose promoters overlapped the other end of that interaction (70).

Functional analysis

For the input genes, the GENE2FUNC process of the FUMA tool tested for enrichment in public functional gene sets, using databases such as the Molecular Signatures Database (MSigDB) and WikiPathways (70). Individual genes that shared certain biological functions were grouped and evaluated for associations with the dietary trait. The cutoff for adjusted p values was set at 0.05, and the minimum number of input genes overlapping a gene set was set at 2, to detect significant Gene Ontology biological processes and molecular functions and WikiPathways.
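
A gene set enrichment test of this kind can be sketched with a hypergeometric test followed by multiple-testing adjustment. The sketch below assumes a Benjamini-Hochberg adjustment and uses hypothetical function names and made-up gene sets; it illustrates the two thresholds stated above (adjusted p < 0.05 and at least 2 overlapping genes) and does not reproduce FUMA's code.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws)."""
    upper = min(successes, draws)
    numerator = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return numerator / comb(population, draws)


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def enrich(input_genes, gene_sets, background_size, alpha=0.05, min_overlap=2):
    """Return names of gene sets passing both the adjusted-p and overlap thresholds."""
    names = list(gene_sets)
    overlaps = [len(input_genes & gene_sets[name]) for name in names]
    pvals = [
        hypergeom_sf(ov, background_size, len(gene_sets[name]), len(input_genes))
        for ov, name in zip(overlaps, names)
    ]
    adjusted = benjamini_hochberg(pvals)
    return [
        name
        for name, ov, adj in zip(names, overlaps, adjusted)
        if adj < alpha and ov >= min_overlap
    ]


# Made-up genes and gene sets against a background of 100 genes.
toy_input = {"g1", "g2", "g3", "g4", "g5"}
toy_sets = {
    "setA": {"g1", "g2", "g3", "g4"},
    "setB": {"g5"} | {f"x{i}" for i in range(19)},
}
enriched = enrich(toy_input, toy_sets, background_size=100)
```

Only setA is reported here: four of its four genes are among the five input genes, whereas setB overlaps the input by a single gene and so fails the minimum-overlap rule.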

Estimation of heritability and genetic and phenotypic correlation

The tagging file, which records the expected heritability tagged by each SNP, was obtained for the LDAK-Thin model (71). Summary statistics from the genome-wide association analysis for food intake, together with the tagging file, were used to estimate the heritability (h²) and genetic correlation (r²) of food intake (71). Pearson correlation coefficients between the consumption of each pair of dietary factors were calculated and visualized as a heatmap in R with the ‘ggplot2’ package (72).
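
For reference, the phenotypic (Pearson) correlation between two intake traits can be computed as in the minimal sketch below. The function names and the intake values are hypothetical and shown only to make the calculation concrete; the analysis itself was carried out in R.

```python
def pearson(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def correlation_matrix(traits):
    """Pairwise Pearson correlations for a dict of trait name -> values."""
    names = list(traits)
    return {a: {b: pearson(traits[a], traits[b]) for b in names} for a in names}


# Made-up intake values for four participants.
toy_traits = {
    "tea": [1, 2, 3, 4],
    "milk": [2, 4, 6, 8],
    "coffee": [8, 6, 4, 2],
}
corr = correlation_matrix(toy_traits)
```

The resulting nested dictionary has the same shape as the matrix that would be passed to a heatmap routine.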

Source of support

This work was supported by a grant from the National Research Foundation of Korea (NRF) (No: 2022R1A2C1004608).

Acknowledgement

We thank Professor Seunggeun Lee (Shawn), from Seoul National University Graduate School of Data Science, for his useful comments on this study.

Author contributions

TH, AS, and SC designed research; TH, AS, SC, JC, and DK conducted research; TH analyzed data; and TH wrote the paper. AS had primary responsibility for final content. All authors read and approved the final manuscript.

Competing interests

The authors declare no competing interests.

Data availability

The datasets used and/or analysed during the current study are available from the corresponding author on reasonable request.

Noce A, Romani A, Bernini R. Dietary intake and chronic disease prevention. Nutrients 2021;13(4). doi: 10.3390/nu13041358.
Di Renzo L, Gualtieri P, Romano L, Marrone G, Noce A, Pujia A, Perrone MA, Aiello V, Colica C, De Lorenzo A. Role of personalized nutrition in chronic-degenerative diseases. Nutrients 2019;11(8). doi: 10.3390/nu11081707.
Neuhouser ML. The importance of healthy dietary patterns in chronic disease prevention. Nutr Res 2019;70:3-6. doi: 10.1016/j.nutres.2018.06.002.
Qiao J, Lin X, Wu Y, Huang X, Pan X, Xu J, Wu J, Ren Y, Shan PF. Global burden of non-communicable diseases attributable to dietary risks in 1990-2019. J Hum Nutr Diet 2022;35(1):202-13. doi: 10.1111/jhn.12904.
de Toro-Martin J, Arsenault BJ, Despres JP, Vohl MC. Precision nutrition: a review of personalized nutritional approaches for the prevention and management of metabolic syndrome. Nutrients 2017;9(8). doi: 10.3390/nu9080913.
Barrea L, Annunziata G, Bordoni L, Muscogiuri G, Colao A, Savastano S, Obesity Programs of nutrition ER, Assessment G. Nutrigenetics-personalized nutrition in obesity and cardiovascular diseases. Int J Obes Suppl 2020;10(1):1-13. doi: 10.1038/s41367-020-0014-4.
Vink JM, van Hooijdonk KJM, Willemsen G, Feskens EJM, Boomsma DI. Causes of variation in food preference in the Netherlands. Twin Res Hum Genet 2020;23(4):195-203. doi: 10.1017/thg.2020.66.
Grimm ER, Steinle NI. Genetics of eating behavior: established and emerging concepts. Nutr Rev 2011;69(1):52-60. doi: 10.1111/j.1753-4887.2010.00361.x.
Smith AD, Fildes A, Cooke L, Herle M, Shakeshaft N, Plomin R, Llewellyn C. Genetic and environmental influences on food preferences in adolescence. Am J Clin Nutr 2016;104(2):446-53. doi: 10.3945/ajcn.116.133983.
Boesveldt S, de Graaf K. The differential role of smell and taste for eating behavior. Perception 2017;46(3-4):307-19. doi: 10.1177/0301006616685576.
Vesnina A, Prosekov A, Kozlova O, Atuchin V. Genes and eating preferences, their roles in personalized nutrition. Genes (Basel) 2020;11(4). doi: 10.3390/genes11040357.
Mompeo O, Freidin MB, Gibson R, Hysi PG, Christofidou P, Segal E, Valdes AM, Spector TD, Menni C, Mangino M. Genome-wide association analysis of over 170,000 individuals from the UK Biobank identifies seven loci associated with dietary approaches to stop hypertension (DASH) diet. Nutrients 2022;14(20). doi: 10.3390/nu14204431.
Suzuki H, Nakamura Y, Matsuo K, Imaeda N, Goto C, Narita A, Shimizu A, Takashima N, Matsui K, Miura K, et al. A genome-wide association study in Japanese identified one variant associated with a preference for a Japanese dietary pattern. Eur J Clin Nutr 2021;75(6):937-45. doi: 10.1038/s41430-020-00823-z.
Suzuki T, Nakamura Y, Matsuo K, Oze I, Doi Y, Narita A, Shimizu A, Imaeda N, Goto C, Matsui K, et al. A genome-wide association study on fish consumption in a Japanese population-the Japan Multi-Institutional Collaborative Cohort study. Eur J Clin Nutr 2021;75(3):480-8. doi: 10.1038/s41430-020-00702-7.
Niarchou M, Byrne EM, Trzaskowski M, Sidorenko J, Kemper KE, McGrath JJ, O'Donovan MC, Owen MJ, Wray NR. Genome-wide association study of dietary intake in the UK biobank study and its associations with schizophrenia and other traits. Transl Psychiatry 2020;10(1):51. doi: 10.1038/s41398-020-0688-y.
Meddens SFW, de Vlaming R, Bowers P, Burik CAP, Linner RK, Lee C, Okbay A, Turley P, Rietveld CA, Fontana MA, et al. Genomic analysis of diet composition finds novel loci and associations with health and lifestyle. Mol Psychiatry 2020. doi: 10.1038/s41380-020-0697-5.
Matoba N, Akiyama M, Ishigaki K, Kanai M, Takahashi A, Momozawa Y, Ikegawa S, Ikeda M, Iwata N, Hirata M, et al. GWAS of 165,084 Japanese individuals identified nine loci associated with dietary habits. Nat Hum Behav 2020;4(3):308-16. doi: 10.1038/s41562-019-0805-1.
Furukawa K, Igarashi M, Jia H, Nogawa S, Kawafune K, Hachiya T, Takahashi S, Saito K, Kato H. A genome-wide association study identifies the association between the 12q24 locus and black tea consumption in Japanese populations. Nutrients 2020;12(10). doi: 10.3390/nu12103182.
Cole JB, Florez JC, Hirschhorn JN. Comprehensive genomic analysis of dietary habits in UK Biobank identifies hundreds of genetic associations. Nat Commun 2020;11(1):1467. doi: 10.1038/s41467-020-15193-0.
Zhong VW, Kuang A, Danning RD, Kraft P, van Dam RM, Chasman DI, Cornelis MC. A genome-wide association study of bitter and sweet beverage consumption. Hum Mol Genet 2019;28(14):2449-57. doi: 10.1093/hmg/ddz061.
Jia H, Nogawa S, Kawafune K, Hachiya T, Takahashi S, Igarashi M, Saito K, Kato H. GWAS of habitual coffee consumption reveals a sex difference in the genetic effect of the 12q24 locus in the Japanese population. BMC Genet 2019;20(1):61. doi: 10.1186/s12863-019-0763-7.
Kranzler HR, Zhou H, Kember RL, Vickers Smith R, Justice AC, Damrauer S, Tsao PS, Klarin D, Baras A, Reid J, et al. Genome-wide association study of alcohol consumption and use disorder in 274,424 individuals from multiple populations. Nat Commun 2019;10(1):1499. doi: 10.1038/s41467-019-09480-8.
Hwang LD, Lin C, Gharahkhani P, Cuellar-Partida G, Ong JS, An J, Gordon SD, Zhu G, MacGregor S, Lawlor DA, et al. New insight into human sweet taste: a genome-wide association study of the perception and intake of sweet substances. Am J Clin Nutr 2019;109(6):1724-37. doi: 10.1093/ajcn/nqz043.
Gelernter J, Sun N, Polimanti R, Pietrzak RH, Levey DF, Lu Q, Hu Y, Li B, Radhakrishnan K, Aslan M, et al. Genome-wide association study of maximum habitual alcohol intake in >140,000 U.S. European and African American Veterans yields novel risk loci. Biol Psychiatry 2019;86(5):365-76. doi: 10.1016/j.biopsych.2019.03.984.
Nakagawa-Senda H, Hachiya T, Shimizu A, Hosono S, Oze I, Watanabe M, Matsuo K, Ito H, Hara M, Nishida Y, et al. A genome-wide association study in the Japanese population identifies the 12q24 locus for habitual coffee consumption: The J-MICC Study. Sci Rep 2018;8(1):1493. doi: 10.1038/s41598-018-19914-w.
Jiang L, Penney KL, Giovannucci E, Kraft P, Wilson KM. A genome-wide association study of energy intake and expenditure. PLoS One 2018;13(8):e0201555. doi: 10.1371/journal.pone.0201555.
Mozaffarian D, Dashti HS, Wojczynski MK, Chu AY, Nettleton JA, Mannisto S, Kristiansson K, Reedik M, Lahti J, Houston DK, et al. Genome-wide association meta-analysis of fish and EPA+DHA consumption in 17 US and European cohorts. PLoS One 2017;12(12):e0186456. doi: 10.1371/journal.pone.0186456.
Guenard F, Bouchard-Mercier A, Rudkowska I, Lemieux S, Couture P, Vohl MC. Genome-Wide Association Study of Dietary Pattern Scores. Nutrients 2017;9(7). doi: 10.3390/nu9070649.
Clarke TK, Adams MJ, Davies G, Howard DM, Hall LS, Padmanabhan S, Murray AD, Smith BH, Campbell A, Hayward C, et al. Genome-wide association study of alcohol consumption and genetic overlap with other health-related traits in UK Biobank (N=112 117). Mol Psychiatry 2017;22(10):1376-84. doi: 10.1038/mp.2017.153.
Pirastu N, Kooyman M, Robino A, van der Spek A, Navarini L, Amin N, Karssen LC, Van Duijn CM, Gasparini P. Non-additive genome-wide association scan reveals a new gene associated with habitual coffee consumption. Sci Rep 2016;6:31590. doi: 10.1038/srep31590.
Cornelis MC, Kacprowski T, Menni C, Gustafsson S, Pivin E, Adamski J, Artati A, Eap CB, Ehret G, Friedrich N, et al. Genome-wide association study of caffeine metabolites provides new insights to caffeine metabolism and dietary caffeine-consumption behavior. Hum Mol Genet 2016;25(24):5472-82. doi: 10.1093/hmg/ddw334.
Rudkowska I, Perusse L, Bellis C, Blangero J, Despres JP, Bouchard C, Vohl MC. Interaction between Common Genetic Variants and Total Fat Intake on Low-Density Lipoprotein Peak Particle Diameter: A Genome-Wide Association Study. J Nutrigenet Nutrigenomics 2015;8(1):44-53. doi: 10.1159/000431151.
Melkonian SC, Daniel CR, Hildebrandt MA, Tannir NM, Ye Y, Chow WH, Wood CG, Wu X. Joint association of genome-wide association study-identified susceptibility loci and dietary patterns in risk of renal cell carcinoma among non-Hispanic whites. Am J Epidemiol 2014;180(5):499-507. doi: 10.1093/aje/kwu158.
Baik I, Cho NH, Kim SH, Han BG, Shin C. Genome-wide association studies identify genetic loci related to alcohol consumption in Korean men. Am J Clin Nutr 2011;93(4):809-16. doi: 10.3945/ajcn.110.001776.
Thomson R, McWhirter R. Adjusting for familial relatedness in the analysis of GWAS data. Methods Mol Biol 2017;1526:175-90. doi: 10.1007/978-1-4939-6613-4_10.
. Internet: https://biobank.ctsu.ox.ac.uk/crystal/field.cgi?id=22021 (accessed February 14 2022).
Manichaikul A, Mychaleckyj JC, Rich SS, Daly K, Sale M, Chen WM. Robust relationship inference in genome-wide association studies. Bioinformatics 2010;26(22):2867-73. doi: 10.1093/bioinformatics/btq559.
Jiang L, Zheng Z, Qi T, Kemper KE, Wray NR, Visscher PM, Yang J. A resource-efficient tool for mixed model association analysis of large-scale data. Nat Genet 2019;51(12):1749-55. doi: 10.1038/s41588-019-0530-8.
May-Wilson S, Matoba N, Wade KH, Hottenga JJ, Concas MP, Mangino M, Grzeszkowiak EJ, Menni C, Gasparini P, Timpson NJ, et al. Large-scale GWAS of food liking reveals genetic determinants and genetic correlations with distinct neurophysiological traits. Nat Commun 2022;13(1):2743. doi: 10.1038/s41467-022-30187-w.
Bi W, Zhou W, Dey R, Mukherjee B, Sampson JN, Lee S. Efficient mixed model approach for large-scale genome-wide association studies of ordinal categorical phenotypes. Am J Hum Genet 2021;108(5):825-39. doi: 10.1016/j.ajhg.2021.03.019.
Esteves F, Rueff J, Kranendonk M. The central role of cytochrome P450 in xenobiotic metabolism-a brief review on a fascinating enzyme family. J Xenobiot 2021;11(3):94-114. doi: 10.3390/jox11030007.
Garcia-Lino AM, Alvarez-Fernandez I, Blanco-Paniagua E, Merino G, Alvarez AI. Transporters in the mammary gland-contribution to presence of nutrients and drugs into milk. Nutrients 2019;11(10). doi: 10.3390/nu11102372.
Zhang Z, Hao CJ, Li CG, Zang DJ, Zhao J, Li XN, Wei AH, Wei ZB, Yang L, He X, et al. Mutation of SLC35D3 causes metabolic syndrome by impairing dopamine signaling in striatal D1 neurons. PLoS Genet 2014;10(2):e1004124. doi: 10.1371/journal.pgen.1004124.
Zahedi AS, Akbarzadeh M, Sedaghati-Khayat B, Seyedhamzehzadeh A, Daneshpour MS. GCKR common functional polymorphisms are associated with metabolic syndrome and its components: a 10-year retrospective cohort study in Iranian adults. Diabetol Metab Syndr 2021;13(1):20. doi: 10.1186/s13098-021-00637-4.
Fernandes Silva L, Vangipurapu J, Kuulasmaa T, Laakso M. An intronic variant in the GCKR gene is associated with multiple lipids. Sci Rep 2019;9(1):10240. doi: 10.1038/s41598-019-46750-3.
Lopez Rodriguez M, Fernandes Silva L, Vangipurapu J, Modi S, Kuusisto J, Kaikkonen MU, Laakso M. Functional Variant in the GCKR Gene Affects Lactate Levels Differentially in the Fasting State and During Hyperglycemia. Sci Rep 2018;8(1):15989. doi: 10.1038/s41598-018-34501-9.
Josse AR, Da Costa LA, Campos H, El-Sohemy A. Associations between polymorphisms in the AHR and CYP1A1-CYP1A2 gene regions and habitual caffeine consumption. Am J Clin Nutr 2012;96(3):665-71. doi: 10.3945/ajcn.112.038794.
Fukuda I, Nishiumi S, Mukai R, Yoshida K, Ashida H. Catechins in tea suppress the activity of cytochrome P450 1A1 through the aryl hydrocarbon receptor activation pathway in rat livers. Int J Food Sci Nutr 2015;66(3):300-7. doi: 10.3109/09637486.2014.992007.
Szczepanska E, Gietka-Czernel M. FGF21: a novel regulator of glucose and lipid metabolism and whole-body energy balance. Horm Metab Res 2022;54(4):203-11. doi: 10.1055/a-1778-4159.
Chu AY, Workalemahu T, Paynter NP, Rose LM, Giulianini F, Tanaka T, Ngwa JS, Group CNW, Qi Q, Curhan GC, et al. Novel locus including FGF21 is associated with dietary macronutrient intake. Hum Mol Genet 2013;22(9):1895-902. doi: 10.1093/hmg/ddt032.
Liu T, Han Z, Li H, Zhu Y, Sun Z, Zhu A. LncRNA DLEU1 contributes to colorectal cancer progression via activation of KPNA3. Mol Cancer 2018;17(1):118. doi: 10.1186/s12943-018-0873-2.
Ogawa T, Hirohashi Y, Murai A, Nishidate T, Okita K, Wang L, Ikehara Y, Satoyoshi T, Usui A, Kubo T, et al. ST6GALNAC1 plays important roles in enhancing cancer stem phenotypes of colorectal cancer via the Akt pathway. Oncotarget 2017;8(68):112550-64. doi: 10.18632/oncotarget.22545.
Wang WY, Cao YX, Zhou X, Wei B, Zhan L, Sun SY. Stimulative role of ST6GALNAC1 in proliferation, migration and invasion of ovarian cancer stem cells via the Akt signaling pathway. Cancer Cell Int 2019;19:86. doi: 10.1186/s12935-019-0780-7.
Zhang J, Wu X, Huang L. ZNF574 promotes ovarian cancer cell proliferation and migration through regulating AKT and AMPK signaling pathways. Ann Clin Lab Sci 2022;52(4):611-20.
Pellatt AJ, Slattery ML, Mullany LE, Wolff RK, Pellatt DF. Dietary intake alters gene expression in colon tissue: possible underlying mechanism for the influence of diet on disease. Pharmacogenet Genomics 2016;26(6):294-306. doi: 10.1097/FPC.0000000000000217.
Mackenbach JD, Dijkstra SC, Beulens JWJ, Seidell JC, Snijder MB, Stronks K, Monsivais P, Nicolaou M. Socioeconomic and ethnic differences in the relation between dietary costs and dietary quality: the HELIUS study. Nutr J 2019;18(1):21. doi: 10.1186/s12937-019-0445-3.
Wang Y, Chen X. How much of racial/ethnic disparities in dietary intakes, exercise, and weight status can be explained by nutrition- and health-related psychosocial factors and socioeconomic status among US adults? J Am Diet Assoc 2011;111(12):1904-11. doi: 10.1016/j.jada.2011.09.036.
Canela-Xandri O, Rawlik K, Tenesa A. An atlas of genetic associations in UK Biobank. Nat Genet 2018;50(11):1593-9. doi: 10.1038/s41588-018-0248-z.
Bycroft C, Freeman C, Petkova D, Band G, Elliott LT, Sharp K, Motyer A, Vukcevic D, Delaneau O, O'Connell J, et al. The UK Biobank resource with deep phenotyping and genomic data. Nature 2018;562(7726):203-9. doi: 10.1038/s41586-018-0579-z.
Sudlow C, Gallacher J, Allen N, Beral V, Burton P, Danesh J, Downey P, Elliott P, Green J, Landray M, et al. UK Biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med 2015;12(3):e1001779. doi: 10.1371/journal.pmed.1001779.
Orliac EJ, Trejo Banos D, Ojavee SE, Lall K, Magi R, Visscher PM, Robinson MR. Improving GWAS discovery and genomic prediction accuracy in biobank data. Proc Natl Acad Sci U S A 2022;119(31):e2121279119. doi: 10.1073/pnas.2121279119.
Greenwood DC, Hardie LJ, Frost GS, Alwan NA, Bradbury KE, Carter M, Elliott P, Evans CEL, Ford HE, Hancock N, et al. Validation of the Oxford WebQ Online 24-Hour Dietary Questionnaire Using Biomarkers. Am J Epidemiol 2019;188(10):1858-67. doi: 10.1093/aje/kwz165.
Liu B, Young H, Crowe FL, Benson VS, Spencer EA, Key TJ, Appleby PN, Beral V. Development and evaluation of the Oxford WebQ, a low-cost, web-based method for assessment of previous 24 h dietary intakes in large-scale prospective studies. Public Health Nutr 2011;14(11):1998-2005. doi: 10.1017/S1368980011000942.
Bradbury KE, Murphy N, Key TJ. Diet and colorectal cancer in UK Biobank: a prospective study. Int J Epidemiol 2020;49(1):246-58. doi: 10.1093/ije/dyz064.
Yang J, Lee SH, Goddard ME, Visscher PM. GCTA: a tool for genome-wide complex trait analysis. Am J Hum Genet 2011;88(1):76-82. doi: 10.1016/j.ajhg.2010.11.011.
Yang J, Benyamin B, McEvoy BP, Gordon S, Henders AK, Nyholt DR, Madden PA, Heath AC, Martin NG, Montgomery GW, et al. Common SNPs explain a large proportion of the heritability for human height. Nat Genet 2010;42(7):565-9. doi: 10.1038/ng.608.
Turner S. Internet: https://github.com/stephenturner/qqman.
Purcell S, Chang C. Internet: www.cog-genomics.org/plink/2.0/.
Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigascience 2015;4:7. doi: 10.1186/s13742-015-0047-8.
Watanabe K, Taskesen E, van Bochoven A, Posthuma D. Functional mapping and annotation of genetic associations with FUMA. Nat Commun 2017;8(1):1826. doi: 10.1038/s41467-017-01261-5.
Speed D, Kaphle A, Balding DJ. SNP-based heritability and selection analyses: Improved models and new results. Bioessays 2022;44(5):e2100170. doi: 10.1002/bies.202100170.
Wickham H, Chang W, Henry L, Pedersen TL, Takahashi K, Wilke C, Woo K, Yutani H, Dunnington D. Internet: https://ggplot2.tidyverse.org.


GWASdietSupplementary230728.pdf
