Genome-Wide Association Studies and fine-mapping of genomic loci for n-3 and n-6 Polyunsaturated Fatty Acids in Hispanic American and African American Cohorts

Omega-3 (n-3) and omega-6 (n-6) polyunsaturated fatty acids (PUFAs) play critical roles in human health. Prior genome-wide association studies (GWAS) of n-3 and n-6 PUFAs in European Americans from the CHARGE Consortium have documented strong genetic signals in/near the FADS locus on chromosome 11. We performed a GWAS of four n-3 and four n-6 PUFAs in Hispanic American (n = 1454) and African American (n = 2278) participants from three CHARGE cohorts. Applying a genome-wide significance threshold of P < 5 × 10⁻⁸, we confirmed association of the FADS signal and found evidence of two additional signals (in DAGLA and BEST1) within 200 kb of the originally reported FADS signal. Outside of the FADS region, we identified novel signals for arachidonic acid (AA) in Hispanic Americans located in/near genes including TMX2, SLC29A2, ANKRD13D and POLD4, and spanning a >9 Mb region on chromosome 11 (57.5 Mb to 67.1 Mb). Among these novel signals, we found associations unique to Hispanic Americans, including rs28364240, a POLD4 missense variant for AA that is common in CHARGE Hispanic Americans but absent in other race/ancestry groups. Our study sheds light on the genetics of PUFAs and the value of investigating complex trait genetics across diverse ancestry populations.


Introduction
Omega-3 (n-3) and omega-6 (n-6) polyunsaturated fatty acids (PUFAs) are critical structural components of cell membranes, which can influence cellular activities by promoting the fluidity, flexibility, and permeability of a membrane. [1][2][3] Additionally, PUFAs affect a variety of other biological processes and molecular pathways, including modulating membrane channels and proteins, regulating gene expression through nuclear receptors and transcription factors, and conversion of the PUFAs themselves into bioactive metabolites. 4 Levels of circulating PUFAs and long chain (≥ 20 carbons) PUFAs (LC-PUFAs) are associated with reduced risk of cardiovascular disease 5,6, type 2 diabetes mellitus 7, cognitive decline 8, Alzheimer's disease 9, metabolic syndrome 10 and breast cancer 11, as well as all-cause mortality. 12 PUFAs and LC-PUFAs are characterized by the position of the first double bond from the methyl terminal (omega; ω; or n − FAs) and fall into two primary families, n-3 and n-6. The most abundant n-3 PUFAs are alpha-linolenic acid (ALA), eicosapentaenoic acid (EPA), docosapentaenoic acid (DPA) and docosahexaenoic acid (DHA), while the primary n-6 PUFAs are linoleic acid (LA), gamma-linolenic acid (GLA), dihomo-γ-linolenic acid (DGLA) and arachidonic acid (AA). ALA and LA are essential n-3 and n-6 PUFAs that must be obtained from the diet; they can then be converted to more unsaturated LC-PUFAs through a series of desaturation and elongation enzymatic steps. For example, DGLA and AA can be synthesized from LA, while EPA, DPA and DHA can be produced from ALA (Fig. 1). Due to the lower abundance of ALA in Western diets and the inefficiency of conversion of ALA to longer chain n-3 LC-PUFAs such as EPA and DHA, dietary intake of these via fatty fish or marine oil supplementation is often recommended. 13,14
Previous studies have shown that African ancestry populations have higher circulating levels of LC-PUFAs compared to European Americans. 15 These large differences can be explained in part by variation in the allele frequencies of FADS variants associated with different biosynthetic efficiencies in these two populations. 16 Mathias et al. also revealed that African Americans have significantly higher levels of AA and lower levels of the AA precursor DGLA, and that FADS1 variants were significantly associated with AA, DGLA and the AA/DGLA ratio in a sample of fewer than 200 African Americans from the GeneSTAR study. 15 In addition, African ancestry populations have higher frequencies of the "derived" FADS haplogroup (represented by the variant rs174537 allele G) 17 that is associated with more efficient conversion of PUFAs. 16 In contrast, Amerind ancestry Hispanic populations have higher frequencies of the "ancestral" FADS haplogroup (represented by rs174537 allele T) that has a reduced capacity to synthesize PUFAs. Accordingly, we demonstrated that higher global proportions of Amerind ancestry are associated with lower levels of PUFAs in Hispanic populations. 17
Genome-wide association studies (GWAS) of n-3 and n-6 PUFAs were performed by the CHARGE consortium in European ancestry (EUR) participants. [18][19][20] The CHARGE GWAS of n-3 PUFAs in 8,866 European Americans identified genetic variants in/near FADS1 and FADS2 associated with higher levels of ALA and lower levels of EPA and DPA, as well as SNPs in ELOVL2 associated with higher EPA and DPA and lower DHA. The CHARGE GWAS of n-6 PUFAs in 8,631 European Americans confirmed that variants in the FADS gene cluster were associated with LA and AA, and it revealed that variants near NRBF2 were associated with LA and those in NTAN1 were associated with LA, GLA, DGLA, and AA (Fig. 1).
In the Framingham Heart Offspring Study, variants in/near PCOLCE2, LPCAT3, DHRS4L2, CALN1, FADS1/2, and ELOVL2 were associated with PUFAs in European ancestry participants. 21,22 Collectively, these studies played an important role in identifying the genetic associations of n-3 and n-6 PUFAs in European ancestry populations.
To address the paucity of GWAS of PUFAs in non-European ancestry cohorts, we performed a meta-analysis of genome-wide association studies for n-3 and n-6 PUFAs for Hispanic American (HIS) and African American (AFA) participants from three CHARGE consortium cohorts: the Multi-Ethnic Study of Atherosclerosis (MESA), the Cardiovascular Health Study (CHS) and the Framingham Heart Study (FHS) Omni cohort. The major goals of the study were (1) to examine whether the major loci identified in European Americans are shared across race/ancestry groups, and (2) to examine evidence for genetic association unique to HIS and AFA populations. As GWAS approaches are not sufficient to identify the causal variants and determine the number of independent signals, especially in the context of long stretches of linkage disequilibrium (LD) within the FADS locus 15,23, we conducted statistical fine-mapping 24 to identify the most likely causal variants within each n-3 and n-6 PUFA-associated locus. We performed cross-ancestry replication analysis in CHARGE and MESA, with validation using the multi-ancestry GWAS of lipids from the Global Lipids Genetics Consortium (GLGC). 25 Subsequently, we performed integrative analysis leveraging gene expression data from MESA 26,27 and the Genotype-Tissue Expression (GTEx) project 28 to identify genes that could contribute to our identified genetic association results. Finally, we examined open chromatin defined by ATAC-seq to determine the impact and physical contact of the identified variants with nearby genes. Our study demonstrates the vital importance of diverse ancestry genetic studies for the study of complex traits, and particularly for metabolites that have been subject to evolutionary pressures and are closely regulated by specific protein-coding genes.

Participant characteristics
The participants in the meta-analysis of GWAS for PUFAs included 1,454 HIS and 2,278 AFA unrelated participants (Table 1; fatty acid levels are expressed as the percentage of total fatty acids throughout the entire manuscript). There were some differences in the distributions of fatty acid levels observed across cohorts, which were likely due to the sources of biospecimens for the assays (plasma phospholipids for MESA and CHS versus erythrocytes for FHS). For example, mean levels of DPA varied from 0.85% (CHS: plasma phospholipids) to 2.54% of total fatty acids (FHS: erythrocytes) in AFA, and mean levels of AA varied from 11.01% (MESA: plasma phospholipids) to 16.56% (FHS: erythrocytes) in HIS (Table 1). In addition, n-6 PUFAs, especially LA and AA, had relatively higher mean levels than n-3 PUFAs in all cohorts (Table 1). Regardless of whether the fatty acids were measured in plasma phospholipids or erythrocytes, AFA populations had higher levels of AA and elevated ratios of AA to DGLA and AA to LA relative to Hispanic populations. This result would be expected given the frequency differences in the derived (efficient) to ancestral (inefficient) FADS haplogroups between these two populations. As expected, due to the lower levels of dietary ALA relative to LA entering the biosynthetic pathway, levels of n-3 LC-PUFAs including EPA, DPA and DHA were significantly lower than the n-6 LC-PUFA, AA. Additionally, African Americans had higher levels of n-3 LC-PUFAs than Hispanic Americans, again likely due to differences in the ratio of the derived to ancestral FADS haplogroups. These differences are similar to those observed examining the same PUFAs and LC-PUFAs and ratios when comparing African Americans and European Americans. 15
Table S1). The directions of effect observed in HIS and AFA for these variants were consistent with those reported for European ancestry populations in prior CHARGE GWAS meta-analyses of n-3 and n-6 PUFAs (Table S1).

GWAS And Fine-mapping Identify Novel PUFA-associated Genetic Signals In CHARGE HIS And AFA
Based on a genome-wide significance threshold of P < 5 × 10⁻⁸, our complete GWAS of n-3 and n-6 PUFAs identified associations on chromosomes 11, 15 and 16 in CHARGE HIS (Table 2) and chromosomes 6, 7, 10 and 11 in CHARGE AFA (Table 3). For regions with more than one genome-wide significant variant, we applied statistical fine-mapping to identify the independent putative causal signals (credible sets) for each genome-wide significant locus. We carried out these analyses separately for our CHARGE HIS and CHARGE AFA GWAS meta-analysis results.
Table 2 shows the signals (credible sets) of putative causal variants identified on each chromosome for each PUFA from SuSiE in HIS. All variant positions are presented based on Human Genome Build 37. Variants previously documented in the CHARGE GWAS meta-analysis of n-3 and n-6 PUFAs were considered known prior to the current meta-analysis. Additionally, those variants demonstrating linkage disequilibrium (LD) R-squared > 0.2 with one or more previously reported GWAS variants were considered known. The remaining variants that were not in LD with known GWAS variants were considered novel in the current study. Because there was only one genome-wide significant variant on chromosome 15 for DGLA (rs57112407) in HIS, this signal was not carried forward for fine-mapping.
Table 3 shows the signals (credible sets) of putative causal variants identified on each chromosome for each PUFA from SuSiE in AFA. All variant positions are presented based on Human Genome Build 37. Variants previously documented in the CHARGE GWAS meta-analysis of n-3 and n-6 PUFAs were considered known prior to the current meta-analysis. Additionally, those variants demonstrating linkage disequilibrium (LD) R-squared > 0.2 with one or more previously reported GWAS variants were considered known. The remaining variants that were not in LD with known GWAS variants were considered novel in the current study. Because there was only one genome-wide significant variant on chromosome 10 for DHA (rs114622288) in AFA, this signal was not carried forward for fine-mapping.
We identified multiple independent putative causal signals for the PUFA traits (Table 2, Table 3, Table S2 and Table S3). We examined the overlap of signals identified from fine-mapping in HIS versus AFA. We observed that the credible sets were generally smaller in AFA (average number of variants in credible set: HIS: 3.4; AFA: 2.2), possibly driven by the lower average LD in AFA.
Among the independent credible sets identified, most were novel associated signals within a ±5 Mb region of the previously reported FADS signal on chromosome 11 (Tables 2-3). Examining all the signals for PUFAs in HIS and AFA, we observed that the lead signal (reflecting the strongest evidence of association) on chromosome 11 represents the FADS signal reported in the previous GWAS. 20 For example, rs174547, the FADS1 variant reported in the previous CHARGE EUR GWAS, is one of the variants in the first credible set for AA in HIS. 19,20 In addition to the known FADS signals, we also observed multiple novel independent signals at other regions of chromosome 11 for PUFAs in HIS [AA: 5 novel signals (credible sets) and LA: 3], for example, in/near ANKRD13D, TMX2, POLD4 and SLC29A2, spanning a long range (57.5 Mb to 67.1 Mb) on chromosome 11 for AA (Table 2). Additionally, we observed several novel independent signals on other chromosomes showing associations with the PUFA traits in AFA [AA: 1 novel signal on chromosome 7 and DPA: 1 on chromosome 6] (Table 3).
Additional independent PUFA-associated signals on chromosome 11 demonstrate chromatin contacts with FADS and other genes
While prior studies have represented the FADS signal as primarily just one signal, 19,20 our study demonstrates numerous independent signals within the FADS region (±1 Mb of the top variant, rs107724) (Fig. 2A). We examined this region to identify the subset of variants that may affect cis-regulatory elements in physical contact with nearby genes. Four variants within the credible sets in this region were [hESC_HypothalamicNeurons], Enteroids, and HepG2s). Almost all of the interactions we detected were bait-to-bait interactions, meaning that they reflected physical contact between promoters of two different genes (Table S4). For example, the region surrounding rs2668898 near BEST1 showed evidence of physical contact with the TMEM258, FADS1 and FADS2 region in multiple cell types, and the TMEM258 region also showed evidence of physical contact with the FADS1 and FADS2 region (Fig. 3A and Table S4). Besides the FADS region, we further found evidence of physical contact between POLD4 and ANKRD13D (Fig. 3B and Table S4), corresponding to the regions surrounding two signals identified in fine-mapping of AA in HIS (Fig. 2A).
Additionally, one novel signal was also replicated across race/ancestry groups (Table 4). LRP4 variant rs11039018 in credible set 5 for LA was replicated in the CHARGE AFA (CHARGE AFA: P = 1.90 × 10⁻¹³).
Table 4 shows the novel putative causal variants in each signal (credible set) identified from fine-mapping for PUFAs with replication and validation evidence in HIS. Variants that were not previously documented in the CHARGE GWAS meta-analysis of n-3 and n-6 PUFAs and were not in LD with known GWAS variants were considered novel in the current study.
Some of the novel signals without cross-ancestry replication demonstrated large differences in allele frequencies across groups. For example, the effect allele frequency of rs28364240, a POLD4 missense variant in credible set 3 for AA in Hispanics, is 0.204 in our CHARGE HIS group, but close to zero in the other race/ancestry groups examined (EUR: 0.003, AFR: 0.007, CHN: 0.005) (Fig. 2B and Table S5). Similarly, the effect allele frequency of rs142068305, an ANKRD13D intron variant, is 0.196 in our CHARGE HIS group, compared with 0.007, 0.004 and 0.005 in AFR, EUR and CHN, respectively. These results suggest evidence of genetic association signals unique to HIS or other groups carrying Amerindian ancestry or admixture.
As some variants could not be interrogated using independent GWAS of PUFA traits, because those studies focused on specific race/ancestry groups that may not include our variants of interest and/or had limited sample sizes, we performed validation analyses using the results of multi-ancestry GWAS of lipid levels from the GLGC including ~ 1.65 million individuals from five genetic ancestry groups (admixed African or African, East Asian, European, Hispanic and South Asian). We focused on the most significant putative causal variants from each credible set and applied multiple testing correction for the number of validated variants (HIS: P < 0.05/19 = 0.0026 and AFA: P < 0.05/11 = 0.004). Interestingly, we observed that two novel signals without cross-ancestry replication did demonstrate association with one or more lipid levels (Table 4, Table S7 and Table S8).

Integrative Analyses Identify Putative Causal Genes For The PUFA Loci
Using colocalization with eQTL resources, we identified candidate genes underlying the genetic association signals for the PUFA traits. In HIS, we found colocalization with expression of the genes MED19, TMEM258, PACS1, RAD9A, C11orf24, CTTN on chromosome 11 and PDXDC1 on chromosome 16 based on MESA multi-ancestry eQTL resources 26 (Table 5 and Table S9). In further analysis using eQTL resources from GTEx whole blood, we confirmed colocalization with TMEM258 and MED19 identified using the MESA multi-ancestry eQTLs, and also identified colocalization with FADS1, RPS4XP13, AP001462.2, PGA5, PGA5, TPCN2, MEN1 on chromosome 11 and RP11-156C22.5 on chromosome 16 (Table 5 and Table S10).
Table 5 shows the results of integrative analysis, including colocalization analysis and PrediXcan, in HIS using MESA data and GTEx data. For colocalization analysis, eQTL resources include MESA multi-ethnic eQTL from purified monocytes and GTEx European ancestry whole blood tissue eQTL. GWAS signals with posterior colocalization probability of hypothesis 4 (PP.H4) > 0.80, or PP.H4 > 0.50 and the ratio of PP.H4 / PP.H3 > 5, were considered colocalized with eQTL. For PrediXcan, reference gene expression prediction models include MESA purified monocytes and GTEx European ancestry whole blood.
We also performed complementary integrative analysis using PrediXcan, identifying significant associations for predicted expression of TMEM258 with AA, ALA, DGLA, DPA, EPA, GLA and LA (after multiple testing correction for all genes examined: P < 0.05/4470 = 0.00001), based on integration with eQTL from both MESA and GTEx. PrediXcan also identified TMEM109, ZBTB3, TTC9C, POLD4, INCENP and FERMT3 on chromosome 11 and PDXDC1 on chromosome 16 as putative genes associated with PUFAs in HIS (Table 5, Table S11 and Table S12). For AFA, colocalization and PrediXcan analyses did not identify any genes of interest that met our pre-specified thresholds for statistical significance.
Incorporating the prior chromatin contacts identified (Table S4), we found that several of our GWAS regions had physical contact with one or more genes identified by integration with eQTL resources. For example, RAD9A was supported by colocalization with MESA eQTL and also showed chromatin contact with POLD4 in nearly all cell types examined (Fig. 3B). In addition, INCENP was supported by PrediXcan using both MESA and GTEx resources and also showed chromatin contact with TMEM258, FADS1 and FADS2 in nearly all cell types examined (Fig. 3A). We further observed that CLCF1, RAD9A, FADS2, TMEM258, INCENP and FADS1, identified from colocalization or PrediXcan, were additionally supported by chromatin contact analyses (Table S4, Fig. 3A and 3B).
To follow up on the genes of interest identified by colocalization and PrediXcan analyses, we examined their co-expression with FADS1 using GTEx whole blood gene expression with multiple testing correction for the number of genes under consideration (HIS: P < 0.05/39 = 0.0012). In both unadjusted and age/sex-adjusted regression models, multiple genes showed statistically significant co-expression with FADS1, for example, TMEM258, MED19, POLD4, RAD9A and SSH3 (Table S13), suggesting these genes have shared patterns of expression.

Discussion
To address the relative lack of prior studies examining genetics of PUFA levels in non-European ancestry populations, we carried out a meta-analysis of GWAS of n-3 and n-6 PUFAs in HIS and AFA across three cohorts: MESA, CHS and FHS. Examining genetic variants identified in prior CHARGE GWAS of the same traits in European Americans, we demonstrated evidence of association with n-3 and n-6 PUFAs for the signals in/near FADS1/2 on chromosome 11, PDXDC1 on chromosome 16, and GCKR on chromosome 2 in both HIS and AFA from our current CHARGE GWAS, as well as for ELOVL2 on chromosome 6 in AFA only.
Through genome-wide analysis and subsequent statistical fine-mapping of our ancestry-specific results, we demonstrated evidence of multiple independent novel signals within the FADS1/2 locus in both HIS and AFA, and in/near ELOVL2 in AFA. Among these independent novel signals, one of the novel signals for LA identified in HIS demonstrated evidence of replication in AFA based on association with the same PUFA traits in MESA and CHARGE (HIS: rs11039018 intronic to LRP4 [LDL receptor related protein 4]). This finding is supported by animal studies showing that deficiency of Lrp4 in adipocytes increased glucose and insulin tolerance and reduced serum fatty acids. 30 Additionally, multiple novel signals without cross-ancestry replication did show evidence of validation based on association with lipid levels in GLGC. For example, rs518804, a TMX2 (thioredoxin related transmembrane protein 2) intron variant associated with AA and LA, was validated based on its association with HDL and triglycerides, while a MARK2 (microtubule affinity regulating kinase 2) intron variant, rs10751002, associated with LA was validated based on its association with LDL and total cholesterol.
While we identified one signal in HIS with evidence of cross-ancestry replication, we also found a large number of signals in HIS that could not be replicated across race/ancestry groups (European, African American and Chinese), in part due to differences in allele frequencies. For example, the chromosome 11 POLD4 (DNA polymerase delta 4, accessory subunit) missense variant rs28364240 and ANKRD13D (ankyrin repeat domain 13D) intron variant rs142068305 identified in association with AA have minor allele frequencies of 0.204 and 0.196 in HIS, compared to frequencies close to zero in other race/ancestry groups.
Examining the distance between the putative causal variants in different credible sets identified in HIS, we observed that the signals on chromosome 11 span a long range (57.5 Mb to 67.1 Mb). The extended physical distance covered by these independent PUFA-associated variants, combined with their subsequent validation in association with selected lipid traits, suggests there may be long-range chromatin interactions or other forms of physical interaction that may have yielded distinct genetic associations across this region. 31 Interestingly, prior studies have reported the FADS signal on chromosome 11 as primarily just one genetic signal. 19,20 However, our study provides evidence of two more independent signals (BEST1 and DAGLA) within this FADS region. In order to understand the chromatin interactions of the FADS region on chromosome 11, we used ATAC-seq peaks and chromatin loops to perform the chromatin contact analyses. We identified multiple genes from colocalization or PrediXcan also supported by chromatin contacts, including CLCF1, RAD9A, FADS2, TMEM258, INCENP and FADS1, providing support for the role of our identified genetic signals in regulating these genes. In addition, we observed evidence of chromatin contacts among multiple distinct credible sets identified based on our fine-mapping of genetic signals on chromosome 11. For example, the region surrounding rs2668898 near BEST1 also showed evidence of physical contact with the TMEM258, FADS1 and FADS2 region in multiple cell types, and TMEM258 also showed evidence of physical contact with the FADS1 and FADS2 region. This support for physical contact among some of the multiple independent signals within the FADS region opens the possibility of coordinated regulation among these distinct genetic signals. Besides the FADS region, POLD4 also showed evidence of physical contact with the ANKRD13D region in multiple cell types.
The cell types examined for chromatin interaction correspond to pancreas, liver, and other cell types that could play a role in synthesis and regulation of fatty acids. While the cell types used to examine chromatin interactions are distinct from those used for our integrative eQTL analyses, the chromatin interaction results do provide support for the plausible role of the genes identified by colocalization and PrediXcan.
Through integrative analyses including colocalization analysis and PrediXcan and overlapping our GWAS of PUFA levels with selected eQTL resources, we identified putative candidate genes that may shed light on the functional mechanisms of our identified genetic association signals. On chromosome 11 containing the FADS genes, we identified overlap with eQTL for multiple other genes including MED19 (Mediator Complex Subunit 19), TMEM258 (Transmembrane Protein 258), PACS1 (Phosphofurin Acidic Cluster Sorting Protein 1), RAD9A (RAD9 Checkpoint Clamp Component A) and CTTN (Cortactin), suggesting additional complexity within this region beyond the FADS genes. For the signals on chromosome 16 identified based on analyses of DGLA in HIS and AFA, in/near NTAN1 and PDXDC1, our integrative PrediXcan analyses identified PDXDC1 (Pyridoxal Dependent Decarboxylase Domain Containing 1) (but not NTAN1) as a putative gene for DGLA. Additionally, having identified association with AA in HIS for the POLD4 missense variant rs28364240, our subsequent identification of POLD4 (DNA Polymerase Delta 4, Accessory Subunit) based on the PrediXcan analyses brings additional support for this gene. To follow up on the genes of interest identified by colocalization and PrediXcan analyses, we examined their co-expression with FADS1 using GTEx whole blood gene expression. Multiple genes on chromosome 11 identified in our integrative analyses combining the GWAS of PUFAs with whole blood expression from GTEx showed evidence of co-expression with FADS1, for example, TMEM258, POLD4, TMEM109 and ZBTB3. This finding suggests some genomic regions at a considerable distance from FADS1 may play a role in regulating its expression, and ultimately influence circulating PUFA levels.
While our genetic association study of PUFA levels in HIS and AFA provides novel insights, our work has several limitations. First, while we have combined data from multiple CHARGE cohorts, the overall sample size of our study is still relatively small for a GWAS. Second, as we began this GWAS effort some years ago, our work makes use of older imputation panels based on the 1000 Genomes. We expect future work could leverage newer resources including imputation based on the Trans-omics for Precision Medicine (TOPMed) reference panel or newer whole genome sequence data from TOPMed. 32 Third, the circulating PUFA levels examined in this study are derived from heterogeneous sources (plasma phospholipids in MESA and CHS vs. erythrocytes in FHS), which could have resulted in heterogeneity of genetic associations across the included studies and overall loss of power. Finally, while our integration of GWAS with eQTL proved useful in some cases, our efforts were driven in part by the available resources. We made use of multi-ancestry eQTL resources based on purified monocytes in MESA, as we knew these resources were well-matched with our GWAS cohorts in terms of LD structure, although purified monocytes were likely not the most relevant cell type for our study. We complemented those efforts with whole blood eQTL from GTEx, through which we were able to capture colocalization of FADS1 that was not observed in MESA due to the lack of significant cis-eQTL for FADS1. This limitation underscores the need for more diverse ancestry eQTL resources across a wider range of tissues and cell types.
In summary, working with the CHARGE Consortium, we conducted the first consortium-based GWAS of circulating PUFA levels in HIS and AFA cohorts. Our study demonstrated evidence of shared genetic influences on PUFA levels across race/ancestry groups, and demonstrated for the first time the large number of distinct genetic association signals within a neighborhood of the well documented FADS region on chromosome 11. 19,20 Our findings provide new insight into the complex genetics of circulating PUFA levels that reflect, in part, their response to evolutionary pressures across the course of human history. 33,34 Overall, our study demonstrates the value of investigating complex trait genetics in diverse ancestry populations and highlights the need for continued and expanded genetic association efforts in cohorts with genetic ancestry that reflects that of the general population within the United States and worldwide.

Study participants
The participants in this study were recruited from three population-based cohorts: the Multi-Ethnic Study of Atherosclerosis (MESA) 35

Meta-analysis Of Genome-wide Association Study
Genome-wide association analysis was carried out separately in each cohort and stratified by race/ancestry with covariate adjustment for age, sex, study site and principal components of ancestry. Cohort-specific GWAS results were filtered using EasyQC based on minor allele count (MAC) > 6 and imputation R-squared > 0.3. Cohort-specific results were combined using weighted sum of z-score meta-analysis in METAL 37 and filtered using Effective Heterozygosity Filter (effHET) > 60. A threshold of P < 5 × 10⁻⁸ was applied to identify genome-wide significant loci.
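The weighted sum of z-scores described above can be sketched as follows. This is a minimal illustration rather than the study pipeline: it assumes each cohort's z-score is weighted by the square root of its sample size, as in METAL's sample-size scheme, and the function name and the example inputs are hypothetical.

```python
import math

GENOME_WIDE_P = 5e-8  # significance threshold stated in the text


def weighted_z_meta(z_scores, sample_sizes):
    """Combine per-cohort z-scores for one variant.

    Assumption: weights are sqrt(N) per cohort (sample-size weighting).
    Returns the combined z-score and its two-sided P value.
    """
    if len(z_scores) != len(sample_sizes) or not z_scores:
        raise ValueError("need one sample size per z-score")
    weights = [math.sqrt(n) for n in sample_sizes]
    numerator = sum(w * z for w, z in zip(weights, z_scores))
    denominator = math.sqrt(sum(w * w for w in weights))
    z_meta = numerator / denominator
    # Two-sided normal tail probability; erfc stays accurate for large |z|.
    p_value = math.erfc(abs(z_meta) / math.sqrt(2))
    return z_meta, p_value


if __name__ == "__main__":
    # Hypothetical z-scores and cohort sizes for a single variant.
    z, p = weighted_z_meta([4.1, 3.6, 2.9], [700, 500, 250])
    print(z, p, p < GENOME_WIDE_P)
```

All z-scores are assumed to be aligned to the same effect allele before they are combined; a sign flip in one cohort would cancel real signal.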

Identification Of Novel Versus Previously Reported Variants
Variants previously documented in the CHARGE GWAS meta-analysis of n-3 (n = 8,866) 19 and n-6 (n = 8,631) 20 PUFAs in European ancestry cohorts were considered known for the current meta-analysis. Additionally, those variants demonstrating linkage disequilibrium (LD) R-squared > 0.2 with one or more previously reported GWAS variants were considered known. The remaining variants were considered novel in the current study.
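The known-versus-novel rule above amounts to a simple lookup, sketched below. The function name, the pairwise LD dictionary and the placeholder variant IDs are hypothetical; only the R-squared > 0.2 cut-off and rs174547 (a previously reported FADS1 variant) come from the text, and the LD values shown are invented for illustration.

```python
LD_R2_THRESHOLD = 0.2  # cut-off stated in the text


def classify_variant(variant_id, reported_ids, ld_r2):
    """Label a variant 'known' or 'novel'.

    reported_ids: set of previously reported GWAS variant IDs.
    ld_r2: dict mapping frozenset({id_a, id_b}) to pairwise LD R-squared;
           pairs missing from the dict are treated as R-squared = 0
           (an assumption of this sketch).
    """
    if variant_id in reported_ids:
        return "known"
    for reported in reported_ids:
        r2 = ld_r2.get(frozenset((variant_id, reported)), 0.0)
        if r2 > LD_R2_THRESHOLD:
            return "known"
    return "novel"


if __name__ == "__main__":
    reported = {"rs174547"}
    # Illustrative LD values only; not taken from the study.
    ld = {
        frozenset(("variant_A", "rs174547")): 0.85,
        frozenset(("variant_B", "rs174547")): 0.01,
    }
    for vid in ("rs174547", "variant_A", "variant_B"):
        print(vid, classify_variant(vid, reported, ld))
```

Keying the LD table by an unordered pair avoids having to store each R-squared value twice.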

Statistical Fine-mapping Using SuSiE
For each chromosome with more than one genome-wide significant variant (at P < 5 × 10⁻⁸), we carried out statistical fine-mapping to identify the putative causal variants and estimate the number of independent signals. We used the Sum of Single Effects (SuSiE) model 24 to identify the credible sets of putative causal variants, providing as input all variants with P < 5 × 10⁻⁸ from the meta-analysis results. For fine-mapping of signals identified in our meta-analysis of HIS and AFA, we used imputed genotype dosage for the same set of variants in MESA HIS and AFA, respectively. To select the parameter L (prior number of independent signals) for fine-mapping in SuSiE, we ran DAP-G (Deterministic Approximation of Posteriors) 38 to provide a starting value for L, based on the number of credible sets with posterior inclusion probability greater than 0.95.
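To make the notion of a credible set concrete, the sketch below builds one from the posterior probabilities of a single signal: variants are ranked by probability and added until the cumulative probability reaches the target coverage. This is only the core idea and not the SuSiE implementation, which also fits the model and filters sets by purity. The 0.95 coverage, the function name and the example probabilities are assumptions made for illustration.

```python
def credible_set(posterior, coverage=0.95):
    """Smallest set of variants whose summed posterior probability
    reaches `coverage` for one signal.

    posterior: dict of variant ID -> posterior probability that the
               variant is the causal one for this signal (sums to ~1).
    """
    ranked = sorted(posterior.items(), key=lambda kv: kv[1], reverse=True)
    chosen, total = [], 0.0
    for variant, prob in ranked:
        chosen.append(variant)
        total += prob
        if total >= coverage:
            break
    return chosen


if __name__ == "__main__":
    # Hypothetical posterior probabilities for four variants at one signal.
    example = {"v1": 0.60, "v2": 0.30, "v3": 0.06, "v4": 0.04}
    print(credible_set(example))
```

Lower LD tends to concentrate posterior probability on fewer variants, which is one way to read the smaller credible sets reported for AFA.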

Follow-up Replication And Validation Analyses
Following statistical fine-mapping, cross-ancestry replication analyses were conducted for the most highly supported putative causal variant from each credible set using data on n-3 and n-6 PUFAs from other race/ancestry groups. The resources for replication analyses included the following:
Given the limited number of cohorts available for ethnic-specific and cross-ethnic replication of PUFA traits, additional validation analyses were conducted for the same set of variants using multi-ancestry genetic association with lipid traits (HDL, LDL, total cholesterol and triglycerides) from the Global Lipids Genetics Consortium (GLGC). 25 Multiple testing correction was applied to account for the number of variants examined in cross-ethnic replication (HIS: P < 0.05/19 = 0.0026 and AFA: P < 0.05/11 = 0.004).

Bayesian Colocalization Analysis
Bayesian colocalization analysis has proven an effective approach for identification of downstream genes underlying GWAS loci. 35 We used the R/coloc package to conduct Bayesian colocalization analysis 39 to identify the putative gene(s) corresponding to each credible set of variants using MESA multi-ancestry eQTL data from purified monocytes 26 and GTEx multi-ancestry whole blood tissue eQTL data. 40 Bayesian colocalization analysis tested the following hypotheses: H0. neither GWAS of PUFAs nor eQTL has a genetic association in the region (within 1 Mb of the transcription start site); H1. only GWAS of PUFAs has a genetic association in the region; H2. only eQTL has a genetic association in the region; H3. both GWAS of PUFAs and eQTL are associated, but with different causal variants; H4. both GWAS of PUFAs and eQTL are associated and share a single causal variant. Colocalization for variants in credible sets was defined by (1) a posterior colocalization probability of hypothesis 4 (PP.H4) > 0.80, or (2) a PP.H4 > 0.50 and the ratio of PP.H4 / PP.H3 > 5.
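The two-part colocalization criterion can be written as a small decision function. The sketch below is illustrative only: it takes posterior probabilities that would already have been computed elsewhere (for example by R/coloc), and the function name and example values are hypothetical. Treating PP.H3 = 0 as an infinite ratio is an assumption of this sketch.

```python
def is_colocalized(pp_h3, pp_h4):
    """Apply the stated rule: PP.H4 > 0.80, or
    PP.H4 > 0.50 together with PP.H4 / PP.H3 > 5."""
    if pp_h4 > 0.80:
        return True
    if pp_h4 > 0.50:
        # Assumption: with PP.H3 == 0 the ratio is treated as infinite.
        ratio = float("inf") if pp_h3 == 0 else pp_h4 / pp_h3
        return ratio > 5
    return False


if __name__ == "__main__":
    # Hypothetical (PP.H3, PP.H4) pairs for four gene-signal combinations.
    for pp_h3, pp_h4 in [(0.10, 0.85), (0.05, 0.60), (0.30, 0.60), (0.02, 0.40)]:
        print(pp_h3, pp_h4, is_colocalized(pp_h3, pp_h4))
```

The second branch admits signals with moderate support for a shared variant, provided the competing hypothesis of distinct causal variants (H3) is at least five times less probable.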
PrediXcan is a gene-based association method that identifies trait-associated genes by quantifying the effect of gene expression on the phenotype of interest. 41 In this study, we applied summary-statistics based PrediXcan (S-PrediXcan) 42 using reference gene expression prediction models from MESA purified monocytes 26 and GTEx multi-ancestry whole blood. 43 S-PrediXcan associations were considered genome-wide significant if they passed the multiple testing correction for all genes (MESA: P < 0.05/4470 = 0.00001 and GTEx: P < 0.05/4350 = 0.00001).
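The gene-level thresholds above follow from a Bonferroni correction, sketched below for the two gene counts given in the text. The function name is hypothetical. Both quotients are approximately 1.1 × 10⁻⁵, which the text reports as 0.00001.

```python
def bonferroni_threshold(n_tests, alpha=0.05):
    """Per-test significance threshold under Bonferroni correction."""
    if n_tests <= 0:
        raise ValueError("n_tests must be positive")
    return alpha / n_tests


if __name__ == "__main__":
    # Gene counts taken from the text: 4470 (MESA) and 4350 (GTEx).
    for label, n_genes in (("MESA", 4470), ("GTEx", 4350)):
        print(label, bonferroni_threshold(n_genes))
```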

Chromatin Contact Analysis
To identify variants located in open chromatin regions in contact with gene promoters, we used GenomicRanges (v. 1.46.1; R version 4.1.1) to intersect the genomic coordinates (hg19) of the variants contained in the credible sets with the open chromatin peaks (called using the ENCODE pipeline) in significantly enriched contact with gene promoters, as determined by Promoter Capture C (Chicago Score > 5). We queried chromatin accessibility and promoter contacts in human mesenchymal stem cells (hMSC) and adipocytes differentiated in vitro from these (hMSC_Adipocytes), embryonic stem cell-derived hypothalamic neurons (hESC Hypothalamic Neurons), induced pluripotent stem cell-derived hepatocytes (IPS-Hepatocytes), Enteroids, and the hepatic carcinoma HepG2 cell line. [44][45][46][47][48][49] Details on Promoter Capture C and ATAC-seq library generation and analyses have been previously described. 44

Gene Co-expression Analysis
We used the GTEx whole blood gene expression version 8 TPM dataset to examine co-expression with FADS1 for genes identified by integrative analyses, including colocalization and PrediXcan. Two models for gene co-expression analysis were used for the trait of interest: (1) an unadjusted model, FADS1 ~ gene expression; and (2) a covariate-adjusted model, FADS1 ~ age + gender + gene expression.
Gene co-expression associations were considered statistically significant if they passed the multiple testing correction for all genes examined from colocalization and PrediXcan (P < 0.05/39 = 0.0012).

Figure 1. PUFAs metabolic pathway and summary of genome-wide association from previous CHARGE GWAS of n-3 and n-6 PUFAs in European Americans