DOI: https://doi.org/10.21203/rs.3.rs-467119/v1
Background
Metabolomics is a promising approach for understanding the pathophysiological pathways of Alzheimer’s disease (AD). However, the relationships between metabolism and AD are poorly understood. The aim of this study was to investigate the causal association between circulating metabolites and risk of AD by combining metabolomics with genomics through a two-sample Mendelian randomization (MR) approach.
Method
Genetic associations with 123 circulating metabolic traits were used as exposures. Large-scale summary statistics from the International Genomics of Alzheimer’s Project, including 21,982 AD cases and 41,944 controls, were used in the primary analysis. Validation was performed using family history of AD data from the UK Biobank (27,696 cases of maternal AD, 14,338 cases of paternal AD, and 272,244 controls). We used the inverse-variance weighted method for the primary analysis and four additional MR methods (MR-Egger, weighted median, weighted mode, and MR pleiotropy residual sum and outlier) for sensitivity analyses.
Results
We found an increased risk of developing AD per standard deviation increase in the levels of circulating ApoB (odds ratio (OR) = 3.18; 95% confidence interval (CI): 1.52–6.66, P = 0.0022), glycoprotein acetyls (OR = 1.21; 95% CI: 1.05–1.39, P = 0.0093), serum total cholesterol (OR = 2.73; 95% CI: 1.41–5.30, P = 0.0030), and low-density lipoprotein (LDL) cholesterol (OR = 2.34; 95% CI: 1.53–3.57, P = 0.0001). In contrast, glutamine (OR = 0.81; 95% CI: 0.71–0.92, P = 0.0011) was significantly associated with lower risk of AD. We also detected causal effects of several LDL subfractions of differing composition on increased AD risk, which were confirmed in the validation analysis. However, we found no association between circulating high-density lipoprotein cholesterol and AD.
Conclusions
Our findings provide robust evidence supporting causal effects of circulating glycoprotein acetyls, ApoB, LDL cholesterol, and serum total cholesterol on higher risk of AD, whereas glutamine showed a protective effect. Further research is required to decipher the biological pathways underpinning these associations.
Alzheimer’s disease (AD) is the leading cause of dementia, affecting a rapidly growing number of older adults and placing an enormous burden on patients, families, and health-care systems [1]. The hallmarks of AD include amyloid plaques, tau neurofibrillary tangles, neurodegeneration, and synaptic loss [1, 2]. Unfortunately, there is currently no effective prevention or treatment for AD [1]. Metabolomics is a recent systems biology approach that measures the biochemical products of cell processes downstream of genomic, transcriptomic, and proteomic systems, as well as influences from the environment. It offers great potential for the diagnosis and prognosis of neurodegenerative diseases by capturing snapshots of the complex and multifactorial biochemical pathways that may be altered in AD [3]. Recent studies have shown that metabolomics can be used to measure alterations in biochemical pathways related to AD [4–7]. However, one of the key challenges in these metabolomics studies is the inability to establish whether the relationships between circulating metabolites and AD are causal. Understanding the causality between metabolites and AD, as well as the potential pathophysiological pathways of AD, is important for inspiring drug discovery and for identifying biomarkers that aid early detection of high-risk individuals, so that prevention, monitoring, and treatment can be initiated.
Mendelian randomization (MR) is an analytic approach that uses genetic variants as instrumental variables (IVs) to infer the causal relationship between an exposure and an outcome of interest [8]. The MR approach is largely free from the unmeasured confounding and reverse causality inherent in observational studies, given that the allocation of genotypes from parents to offspring is random and genetic variation is unlikely to be affected by environmental factors [8, 9]. This is particularly relevant in a metabolomic study, where the inter-individual variability of circulating metabolites can be affected by a wide range of potential confounders such as age, sex, physical diseases, medication, exercise, weight, and time of sampling. In addition, two-sample MR analysis is an extension in which the effects of the genetic instrument on the exposure and on the outcome are obtained from separate, publicly available genome-wide association study (GWAS) summary data [10]. The two-sample MR approach enables us to link circulating metabolites with risk of AD using GWAS estimates for both metabolic phenotypes and AD.
The aim of this study was to apply a two-sample MR approach to investigate the causal relationship between circulating metabolites and risk of AD, combining genome-wide and metabolome-wide datasets generated from large-scale cohorts.
Genetically determined metabolites
We obtained summary statistics from a GWAS of 123 circulating metabolites [11]. Briefly, Kettunen et al. conducted a comprehensive GWAS with quantitative human serum/plasma metabolite levels as phenotypes [11]. Metabolomics data were acquired from human fasting blood samples; otherwise, the effect of fasting time was adjusted for in the original study. The 123 metabolites represent a broad molecular signature of systemic metabolism and were assigned to 12 classes (carboxylic acids and derivatives, fatty acyls, glycerolipids, glycerophospholipids, hydroxy acids and derivatives, keto acids and derivatives, lipoprotein, organooxygen compounds, protein, ratio, sphingolipids, steroids and steroid derivatives) based on the Human Metabolome Database classification and expert opinion [12, 13]. Up to 24,925 individuals from 14 European cohorts were meta-analyzed. The mean age was 44.6 years, and females accounted for 54.6% of participants. Written informed consent was obtained from all participants, and the study was approved by ethical committees.
Selection of instrumental variables
For each of the 123 metabolites, single nucleotide polymorphisms (SNPs) associated at genome-wide significance (P < 5 × 10⁻⁸) with a minor allele frequency greater than 0.01 were considered as potential instruments. Independent SNPs were selected at a linkage disequilibrium threshold of r² < 0.05 within a distance of 1000 kb. For palindromic SNPs, we aligned strands using allele frequency and discarded palindromic SNPs with a minor allele frequency above 0.42. The exposure and outcome datasets were then harmonized. We also checked the palindromic SNPs against the original datasets to avoid reversed effect directions.
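The filtering rules described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Snp` record, its field names, and the `select_candidates` helper are hypothetical and are not taken from the original analysis, which used the TwoSampleMR R package. Linkage disequilibrium clumping is deliberately omitted because it requires a reference panel.

```python
from dataclasses import dataclass

# Hypothetical record for one SNP-metabolite association (field names are assumptions).
@dataclass
class Snp:
    rsid: str
    effect_allele: str
    other_allele: str
    eaf: float   # effect allele frequency
    pval: float  # P-value for association with the metabolite

COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}

def minor_allele_frequency(eaf):
    """Minor allele frequency derived from the effect allele frequency."""
    return min(eaf, 1.0 - eaf)

def is_palindromic(snp):
    """A SNP is palindromic when its two alleles are complementary (A/T or C/G)."""
    return COMPLEMENT[snp.effect_allele] == snp.other_allele

def select_candidates(snps, p_threshold=5e-8, maf_min=0.01, palindromic_maf_max=0.42):
    """Keep genome-wide significant, sufficiently common SNPs and drop
    palindromic SNPs whose strand cannot be inferred from allele frequency."""
    kept = []
    for snp in snps:
        maf = minor_allele_frequency(snp.eaf)
        if snp.pval >= p_threshold or maf <= maf_min:
            continue
        if is_palindromic(snp) and maf > palindromic_maf_max:
            continue
        kept.append(snp)
    return kept
```

The palindromic cut-off matters because an A/T or C/G SNP with a frequency near 0.5 looks the same on both strands, so allele frequency cannot tell which strand a study reported.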
The proportion of variance explained by the IVs was computed, and an F-statistic was calculated for each metabolite to assess instrument strength. Typically, a strong instrument is defined as one with an F-statistic > 10 [14]. Additionally, power calculations were conducted using the R code provided by Burgess with a two-sided type-I error rate α = 0.05 [15]. The proportion of variance explained by the IVs, F-statistics, and power are presented in Additional file 1.
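As an illustration of the instrument-strength check, the sketch below uses one commonly used approximation, F = (R² / (1 − R²)) × ((N − k − 1) / k), where R² is the proportion of variance explained, N the sample size, and k the number of SNPs. The text does not state which formula was applied, so this choice and the function name are assumptions.

```python
def f_statistic(r2, n, k):
    """Approximate F-statistic for a set of k instruments explaining a
    proportion r2 of the exposure variance in a sample of size n."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    if k < 1 or n <= k + 1:
        raise ValueError("need k >= 1 and n > k + 1")
    return (r2 / (1.0 - r2)) * ((n - k - 1) / k)

# Illustrative values only: 6 SNPs explaining 3% of variance in 24,925 individuals.
example_f = f_statistic(r2=0.03, n=24925, k=6)
is_strong = example_f > 10  # conventional weak-instrument threshold
```

With the sample sizes in this study, even a small R² gives an F-statistic far above 10, which is why the number of SNPs and the variance explained are more informative to report alongside it.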
IGAP AD dataset
In the primary analysis, genetic variants associated with late-onset AD were obtained from a GWAS meta-analysis performed by the International Genomics of Alzheimer's Project (IGAP) [16]. There is no sample overlap between IGAP and the cohorts used for circulating metabolites. IGAP is a large two-stage study based upon GWAS of individuals of European ancestry. Data from stage 1 were used in the present study, including 63,926 individuals (21,982 AD cases and 41,944 cognitively normal controls) of European descent from four consortia: Alzheimer Disease Genetics Consortium (ADGC), Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium (CHARGE), The European Alzheimer’s Disease Initiative (EADI), and the Genetic and Environmental Risk in AD/Defining Genetic, Polygenic and Environmental Risk for Alzheimer’s Disease Consortium (GERAD/PERADES) [16]. All AD diagnoses were autopsy-confirmed or satisfied the NINCDS-ADRDA criteria or DSM-IV guidelines [16]. The average age at onset of AD ranged from 74.4 to 81.9 years, and the average age at examination was ≥ 76 years for 83% of controls.
UK Biobank proxy-AD dataset
We additionally set out to validate our results using a proxy-AD dataset based on individuals in the UK Biobank (http://www.ukbiobank.ac.uk) [17, 18]. The high heritability of AD implies that individuals with a family history of AD are likely to have a higher genetic AD risk load. Thus, individuals with one or two parents with AD were defined as proxy cases, that is, as having a family history of AD [19]. In this dataset, proxy-AD case-control status was ascertained via self-report. Over 500,000 community-dwelling individuals aged between 37 and 73 years were recruited in the United Kingdom between 2006 and 2010 [17]. A total of 314,278 participants with available AD information on at least one parent were meta-analyzed, including 27,696 cases of maternal AD, 14,338 cases of paternal AD, and 272,244 controls [18]. All data sources used in this MR study received approval from an ethics standards committee on human experimentation and obtained informed consent from all participants.
Statistical analysis for Mendelian randomization
We used the inverse-variance weighted (IVW) method as the primary analysis to determine the causal relationships between genetically determined circulating metabolites and AD. The IVW method returns an unbiased estimate in the absence of horizontal pleiotropy or when horizontal pleiotropy is balanced [20]. Because the outcome was dichotomous, results are presented as odds ratios (ORs) for AD per standard deviation (SD) increase in genetically determined metabolite levels.
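To make the IVW estimate concrete, the following is a minimal fixed-effect sketch in Python, assuming harmonized per-SNP effects on the exposure (`beta_exposure`) and on the outcome (`beta_outcome`, on the log-odds scale) with outcome standard errors (`se_outcome`). The function and argument names are hypothetical. MR software may additionally rescale the standard error for heterogeneity, which this sketch does not do.

```python
import math

def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate: an inverse-variance weighted mean of the
    per-SNP Wald ratios (outcome effect divided by exposure effect)."""
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    # First-order weight of each ratio: exposure effect squared over outcome variance.
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    total = sum(weights)
    beta = sum(w * r for w, r in zip(weights, ratios)) / total
    se = math.sqrt(1.0 / total)
    return beta, se

def to_odds_ratio(beta, se, z=1.96):
    """Convert a log-odds estimate into an OR with an approximate 95% CI."""
    return math.exp(beta), math.exp(beta - z * se), math.exp(beta + z * se)

# Illustrative numbers only (not from the study).
beta, se = ivw_estimate([0.10, 0.20, 0.15], [0.02, 0.04, 0.03], [0.01, 0.01, 0.01])
odds_ratio, ci_low, ci_high = to_odds_ratio(beta, se)
```

Because the exposure effects are expressed per SD of the metabolite, exponentiating the pooled log-odds gives the OR per SD increase reported in the results.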
We conducted sensitivity analyses using the weighted median [21], weighted mode-based estimate (MBE) [22], MR-Egger regression [23], and Mendelian randomization pleiotropy residual sum and outlier (MR-PRESSO) [24] methods, and displayed the results in scatter plots. These methods rely on different assumptions at the cost of reduced statistical power. The weighted median allows for up to 50% of the IVs to be invalid or pleiotropic [21]. The weighted MBE method uses the mode of the IVW empirical density function as the estimate and provides a causal effect estimate robust to horizontal pleiotropy [22]. MR-Egger regression allows more than 50% of the variants to be invalid [23]. MR-Egger relies on the “NO Measurement Error” (NOME) assumption (no measurement error in the SNP-exposure effects), which is evaluated by the regression dilution I²GX statistic (a value below 0.9 indicates a violation of NOME) [25]. Thus, an I²GX statistic was calculated to test for bias in the MR-Egger estimates.
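The idea behind the weighted median can be illustrated with a short sketch. It sorts the per-SNP ratio estimates, accumulates their weights, and linearly interpolates to the point where half of the total weight has been reached. The function name and inputs are hypothetical, and the bootstrap normally used to obtain the standard error is omitted.

```python
def weighted_median(ratios, weights):
    """Weighted median of per-SNP ratio estimates using linear interpolation
    between the two estimates that straddle 50% of the cumulative weight."""
    if len(ratios) != len(weights) or len(ratios) < 2:
        raise ValueError("need at least two ratios with matching weights")
    if any(w <= 0 for w in weights):
        raise ValueError("weights must be positive")
    pairs = sorted(zip(ratios, weights))
    values = [r for r, _ in pairs]
    sorted_weights = [w for _, w in pairs]
    total = sum(sorted_weights)
    # Cumulative weight at the midpoint of each sorted estimate, scaled to (0, 1).
    cumulative = []
    running = 0.0
    for w in sorted_weights:
        running += w
        cumulative.append((running - 0.5 * w) / total)
    below = max(i for i, c in enumerate(cumulative) if c < 0.5)
    span = cumulative[below + 1] - cumulative[below]
    step = (0.5 - cumulative[below]) / span
    return values[below] + (values[below + 1] - values[below]) * step

# Illustrative values only: four concordant instruments and one outlier.
example_ratios = [0.20, 0.21, 0.19, 0.20, 1.50]
example_median = weighted_median(example_ratios, [1.0] * 5)
example_mean = sum(example_ratios) / len(example_ratios)
```

In this toy example the single outlying ratio pulls the equally weighted mean well away from the other four estimates, while the weighted median stays with the majority, which is the robustness property the sensitivity analysis relies on.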
We checked each SNP prioritized in the present study using the GWAS Catalog (http://www.ebi.ac.uk/gwas) to ensure that no genetic instruments were directly associated with AD. Where SNPs significantly (P < 5 × 10⁻⁸) associated with AD were detected, they were excluded from the IVs and the analysis was re-run to determine whether removing those potentially pleiotropic SNPs changed the effect estimates.
Furthermore, we used the MR-Egger intercept and the MR-PRESSO global test to detect the presence of horizontal pleiotropy or heterogeneity. In the case of horizontal pleiotropy, the MR-PRESSO outlier test compares the observed and expected distributions of the tested variants to identify outlier variants. If significant outliers (P < 0.05) were detected, they were removed from the analysis to return an unbiased causal estimate [24]. Heterogeneity in the IVW estimates was tested using Cochran’s Q test and displayed in forest plots. Moreover, visual inspection of funnel plots and leave-one-out plots was used to assess the MR “no horizontal pleiotropy” assumption (see Additional file 6).
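Cochran's Q for the IVW estimate can be sketched as the weighted sum of squared deviations of each per-SNP ratio from the pooled estimate. The snippet below is a simplified illustration with hypothetical names. It returns Q and its degrees of freedom only; a P-value would come from the upper tail of a chi-square distribution with k − 1 degrees of freedom (for example via `scipy.stats.chi2.sf`, which is not imported here).

```python
def cochran_q(beta_exposure, beta_outcome, se_outcome):
    """Cochran's Q statistic for heterogeneity among per-SNP Wald ratios
    around the fixed-effect IVW estimate, with its degrees of freedom."""
    ratios = [by / bx for bx, by in zip(beta_exposure, beta_outcome)]
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    beta_ivw = sum(w * r for w, r in zip(weights, ratios)) / sum(weights)
    q = sum(w * (r - beta_ivw) ** 2 for w, r in zip(weights, ratios))
    return q, len(ratios) - 1

# Illustrative values only: the third SNP gives a clearly different ratio.
q_stat, q_df = cochran_q([0.10, 0.20, 0.15], [0.02, 0.04, 0.09], [0.01, 0.01, 0.01])
```

A Q value much larger than its degrees of freedom, as in this toy example, suggests that the instruments do not all estimate the same causal effect, which is one possible sign of pleiotropy.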
To correct for multiple comparisons, we applied false discovery rate (FDR) correction to the IVW results. An FDR-corrected P-value < 0.05 was considered significant, and an unadjusted P-value < 0.05 was considered evidence of a suggestive association. Metabolites that showed at least suggestive evidence of association (PIVW < 0.05) with late-onset AD were assessed for validation using the UK Biobank GWAS. Results with a consistent direction of point estimates across IVW, sensitivity analyses, validation, and estimates after correction for potential pleiotropy were considered robust causal associations.
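The two-tier classification described above can be sketched as follows. The text does not name the FDR procedure, so the Benjamini-Hochberg step-up method is assumed here, and the function names are hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotone adjusted values.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted

def classify(pvalues, alpha=0.05):
    """Label each test as significant (FDR < alpha), suggestive (raw P < alpha), or none."""
    labels = []
    for raw, adj in zip(pvalues, benjamini_hochberg(pvalues)):
        if adj < alpha:
            labels.append("significant")
        elif raw < alpha:
            labels.append("suggestive")
        else:
            labels.append("none")
    return labels

# Illustrative P-values only (not from the study).
example_labels = classify([0.001, 0.008, 0.039, 0.041, 0.6])
```

Under this scheme a metabolite can carry a raw P-value below 0.05 yet fall short after adjustment, which is the situation the "suggestive" label is meant to capture.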
Analyses were conducted using R version 3.6.3, with the MR analysis performed using the “TwoSampleMR” package version 0.5.2 [26, 27].
Pathway enrichment analysis
Metabolic enrichment analysis was conducted using the web-based MetaboAnalyst 5.0 (https://www.metaboanalyst.ca/). We analyzed the metabolite sets from the Kyoto Encyclopedia of Genes and Genomes (KEGG) database [28]. All metabolites that were associated with AD risk at a P-value < 0.05 were used to identify metabolic pathways.
A flow diagram depicting the process of the MR analyses is shown in Fig 1. The IVW method identified 32 circulating metabolites that were associated with AD risk, including 27 significant (FDR PIVW < 0.05) traits and five suggestive (PIVW < 0.05) causal traits (Fig 2). For these 32 metabolites, the IVs comprised 6–52 SNPs and explained 3–20% of the variance of the corresponding metabolites (see Additional file 1). The F-statistics were all well above the weak-instrument threshold of 10 (see Additional file 1). Results of the sensitivity analyses are shown in Additional file 2. The negative MR results are presented in Additional file 3.
Effects of Genetically Determined Metabolites on AD
The associated metabolites included two proteins, two carboxylic acids and derivatives, 26 lipoproteins, and two steroids and steroid derivatives (Fig 2). We observed an increased risk of developing AD per SD increase in the levels of circulating ApoB (OR = 3.18; 95% confidence interval (CI): 1.52–6.66, PIVW = 0.0022), glycoprotein acetyls (Gp; OR = 1.21; 95% CI: 1.05–1.39, PIVW = 0.0093), serum total cholesterol (serum-C; OR = 2.73; 95% CI: 1.41–5.30, PIVW = 0.0030), and esterified cholesterol (OR = 2.42; 95% CI: 1.07–5.48, PIVW = 0.0345). Citrate (OR = 0.83; 95% CI: 0.73–0.95, PIVW = 0.0055) and glutamine (OR = 0.81; 95% CI: 0.71–0.92, PIVW = 0.0011) were significantly associated with lower risk of AD. Among lipoproteins, five intermediate-density lipoprotein (IDL) subfractions (i.e., IDL-C, IDL-FC, IDL-L, IDL-P, and IDL-PL) and 15 low-density lipoprotein (LDL) subfractions (i.e., LDL-C, L-LDL-C, L-LDL-CE, L-LDL-FC, L-LDL-L, L-LDL-P, L-LDL-PL, M-LDL-C, M-LDL-CE, M-LDL-L, M-LDL-P, M-LDL-PL, S-LDL-C, S-LDL-L, and S-LDL-P) were significantly associated with higher risk of AD. Four high-density lipoprotein (HDL) subfractions (i.e., XL-HDL-C, XL-HDL-CE, XL-HDL-FC, and XL-HDL-P) were associated with lower risk of AD, while S-HDL-P showed an association with higher risk of AD at a suggestive level. In addition, XS-VLDL-PL was significantly associated with higher risk of AD using the IVW method (OR = 1.49; 95% CI: 1.11–2.02, PIVW = 0.0087).
Pleiotropy and heterogeneity analysis
The Q-test detected no evidence of heterogeneity in the results for Gp, citrate, glutamine, S-HDL-P, XL-HDL-C, XL-HDL-CE, and XL-HDL-FC (Table 1). By checking each SNP in the GWAS Catalog, nine AD-associated SNPs were identified among the IVs of 24 metabolites (see Additional file 4). IVW results for these 24 metabolites after removing the potentially pleiotropic SNPs are shown in Additional file 5. Although these sensitivity results were all non-significant, the direction of the point estimates was generally consistent with that in the primary analysis. Only one metabolite, esterified cholesterol, showed a point estimate in the opposite direction (OR = 0.95) to the primary result (OR = 2.42; 95% CI: 1.07–5.48), indicating potential bias in its effect estimate.
The MR Egger intercept test showed no evidence of pleiotropy in these IVW-identified metabolites (Table 1). MR-PRESSO global test also showed no pleiotropy in the results of Gp, citrate, glutamine, XL-HDL-C, XL-HDL-CE and serum-C. Others were analyzed by MR-PRESSO after removing the outlying SNPs identified by MR-PRESSO outlier test. The corrected results were significant except for XS-VLDL-PL (PcMR-PRESSO = 0.396) and IDL-FC (PcMR-PRESSO = 0.074). The corrected point estimates of MR-PRESSO were all consistent with IVW except for XL-HDL-FC (ORcMR-PRESSO = 1.64 vs. ORIVW = 0.90). Thus, effects of XS-VLDL-PL, IDL-FC, and XL-HDL-FC were considered unreliable.
Sensitivity analysis
The results of the sensitivity analyses of the 32 IVW-identified metabolites are shown in Fig 4 and Additional file 2. The I²GX statistics for all MR-Egger results were above 0.9. ApoB, Gp, glutamine, and serum-C showed robust associations: all additional MR methods yielded point estimates in the same direction as IVW, though not all methods yielded significant results. For citrate, however, the sensitivity analyses showed effect estimates in the opposite direction (ORweighted-median = 1.06, ORweighted-mode = 1.05) to the primary result (ORIVW = 0.83). Among lipoproteins, 16 IVW-identified significant metabolites (i.e., IDL-C, IDL-L, IDL-P, LDL-C, L-LDL-C, L-LDL-CE, L-LDL-FC, L-LDL-L, L-LDL-P, L-LDL-PL, M-LDL-CE, M-LDL-L, M-LDL-P, M-LDL-PL, S-LDL-C, and S-LDL-P) also showed concordant results across the sensitivity methods and IVW. Sensitivity analysis indicated non-robust effects for S-LDL-L (ORweighted-median = 0.80, ORweighted-mode = 0.80 vs ORIVW = 2.69), XL-HDL-CE (ORweighted-median = 1.20, ORweighted-mode = 1.23 vs ORIVW = 0.90), and XL-HDL-FC (ORMR-PRESSO = 0.0002 vs ORIVW = 0.90), for which significant sensitivity results yielded point estimates discordant with IVW. Another six lipoproteins (IDL-FC, IDL-PL, M-LDL-C, S-HDL-P, XL-HDL-C, and XL-HDL-P) also showed weighted median and weighted mode results inconsistent with IVW.
Validation
In the validation analysis, we used the UK Biobank dataset to verify the associations between IVW-identified metabolites and proxy-AD (Fig 3). Of the 32 identified metabolites, 20 remained significantly associated with proxy-AD, notably ApoB, Gp, LDL-C, and several LDL subfractions of differing composition. Among the 32 metabolites, the direction of the point estimates in validation was concordant with the primary IVW results for all except citrate (ORvalidation = 1.029), which showed a reversed effect on AD risk relative to the primary analysis (ORIVW = 0.83).
Pathway enrichment analysis
Our study identified six significant metabolic pathways that may be involved in the pathogenesis of AD (Table 2). The most significant metabolic pathway was D-glutamine and D-glutamate metabolism (P = 1.30 × 10⁻⁵) from the KEGG database. L-glutamine and L-glutamine were involved in this metabolic pathway. Another two metabolic pathways involving two circulating metabolites (i.e., glutamine and citrate) survived FDR correction, namely alanine, aspartate, and glutamate metabolism (P = 0.0003) and glyoxylate and dicarboxylate metabolism (P = 0.0004). We also identified three pathways at nominal P < 0.05: nitrogen metabolism (P = 0.008), arginine biosynthesis (P = 0.018), and the citrate cycle (i.e., tricarboxylic acid (TCA) cycle; P = 0.026).
Using a two-sample MR analysis, the present study supports the hypothesis that circulating metabolite levels can be causally related to risk of AD. We found a significant association between higher levels of Gp and higher risk of AD, and genetically predicted glutamine levels were significantly associated with lower risk of AD. Our results also reinforce the idea that circulating lipid-related metabolites may play a role in the pathophysiological process of AD. In particular, we observed robust evidence of causal effects of ApoB, serum-C, three IDL subfractions (i.e., IDL-C, IDL-L, IDL-P), and 13 LDL subfractions (i.e., LDL-C, L-LDL-C, L-LDL-CE, L-LDL-FC, L-LDL-L, L-LDL-P, L-LDL-PL, M-LDL-CE, M-LDL-L, M-LDL-P, M-LDL-PL, S-LDL-C, and S-LDL-P) on higher risk of AD. To our knowledge, this is the most comprehensive MR analysis to examine the causal associations between circulating metabolites and risk of AD.
The measured Gp is mainly α-1-acid glycoprotein (AGP) [11], also called orosomucoid. Gp is an acute-phase plasma α-globulin glycoprotein involved in many activities, including modulating immunity, binding and carrying drugs, maintaining the barrier function of capillaries, and mediating sphingolipid metabolism [29–31]. Gp is associated with AD owing to its important role in modulating neuroinflammation [32]. Higher levels of plasma AGP were found in patients with cognitive impairment than in normal subjects [32]. A previous meta-analysis reported that plasma levels of Gp were associated with increased risk of dementia and lower cognitive function [33]. These results support our findings suggesting a relationship between circulating Gp and AD.
ApoB is synthesized in the liver and circulates in the plasma as the major protein component of LDL, where it is involved in the transport of cholesterol to peripheral tissues [34]. Previous studies have demonstrated that AD patients have significantly higher levels of ApoB in serum [35, 36] and plasma [37] than controls, especially AD subjects carrying the APOE ε4 allele [38]. In AD patients, higher serum levels of ApoB are significantly correlated with higher Aβ42 levels in the brain [36]. Additionally, genetic variants in the APOB gene are strongly associated with early-onset AD [39], suggesting a link between ApoB and AD risk. However, previous studies of circulating ApoB levels in humans are conflicting, with a large population study finding no association between circulating ApoB levels and incident AD [40]. Our MR results therefore provide important evidence that strengthens the association between circulating ApoB and AD and suggests that it is causal.
Our findings are biologically plausible, as several experimental studies have reported consistent evidence. Plasma ApoB was found to co-localize with cerebral amyloid plaques in a transgenic mouse model of AD [41] and was positively correlated with Aβ plaque abundance in the brain [42]. Overexpressing APOB in a transgenic mouse model induced significant memory impairment and increased Aβ levels compared with wild-type mice, suggesting that increased ApoB levels can contribute to the development of AD-like pathology [43].
Because ApoB is involved in LDL-C metabolism and is regarded as a promising link between cholesterol and AD [44], much of the epidemiological evidence on the association between LDL-C and AD is consistent with that for ApoB [36]. Observational studies have indicated that LDL-C levels are significantly increased in AD patients [45–47]. Likewise, Zhou et al. suggested that an elevated concentration of LDL-C (> 121 mg/dl) may be a potential risk factor for AD [48]. Our MR analysis supports these results and suggests a causal effect of high circulating LDL-C levels in increasing the risk of AD. Consistent with our findings, two other published MR studies revealed similar effects of LDL-C using different datasets [49, 50], enhancing the reliability of the results.
According to molecular size, LDLs were further categorized as large (L), medium (M), and small (S) LDLs in the original study [11]. Variation in the circulating levels and composition of these fractions may have different pathophysiological significance. In particular, plasma levels of L-LDL particles were significantly associated with greater cerebral amyloidosis and lower hippocampal volumes independent of LDL-C [51]. In addition to LDL-C, our findings suggest that six L-LDL subfractions, four M-LDL subfractions, and two S-LDL subfractions can influence AD risk, but further investigations are needed to fully understand the molecular mechanisms involved.
In observational studies, the effects of serum-C on risk of AD have been highly heterogeneous [52]. Several meta-analyses revealed a non-significant effect of serum-C on AD [53], whereas other epidemiological studies reported that serum-C levels were significantly increased in AD patients [45–47]. The significance of serum-C differs between mid-life and older adults [54]. Several studies state that high mid-life serum-C levels represent a risk factor for subsequent AD [55], but that there are no detectable differences in serum-C levels at older ages [56]. Additionally, beyond long-term average serum-C levels, higher total cholesterol variability is significantly associated with increased risk of all-cause dementia and AD in the general population, independent of mean total cholesterol levels [57]. Thus, the inconsistent results across serum-C studies may be explained by variations in total cholesterol levels and by disease progression. Because MR is less affected by the unmeasured confounders inherent in observational studies [8], our MR results are more robust, suggesting that high serum-C levels may have a causal effect in increasing AD risk.
Much epidemiological evidence suggests a protective association of circulating HDL cholesterol (HDL-C) levels with AD risk [58]. In contrast, we found that HDL-C levels were not associated with AD risk despite sufficient power (see Additional file 3), in line with a large population study [40]. In our study, four very large HDL subfractions (i.e., XL-HDL-C, XL-HDL-CE, XL-HDL-FC, and XL-HDL-P) showed inverse effects on AD risk; however, the sensitivity analyses showed effect estimates inconsistent with IVW. Further investigations are needed to clarify whether the relationship between HDL and AD is causal.
Our study also identified several metabolic pathways that might be involved in the pathogenesis of AD, among which D-glutamine and D-glutamate metabolism has been reported to be associated with AD [4, 59]. An observational study showed that glutamine concentrations in plasma are positively correlated with those in the posterior cingulate cortex [60], which is associated with cognitive impairment in AD [61]. Consistent with this, we found that genetically determined circulating glutamine showed a protective effect against AD. Nevertheless, a cohort study found that higher glutamine levels were associated with lower cognitive function and higher risk of dementia [33]. Because observational studies are prone to reverse causation and confounding bias, an MR analysis with balanced horizontal pleiotropy is more credible [9]. Consistent with our results, a published two-sample MR study reached a similar conclusion for glutamine using a different AD dataset [62]. Furthermore, by conducting a series of rigorous sensitivity, pleiotropy, and validation analyses, our results are more comprehensive and robust. There is also biological evidence supporting this result. Anderson et al. observed that reduced glutamine metabolism, reduced TCA activity, and impaired oxidative glutamine metabolism precede amyloid plaque formation in an AD mouse model compared to controls [63]. In addition, glutamine has been shown to protect against oxidative stress-induced injury, which is intimately related to AD, in an AD mouse model [64].
Citrate is a key constituent of the TCA cycle and serves as a substrate in cellular energy metabolism, with involvement in fatty acid synthesis, glycolysis, and gluconeogenesis [65]. Very few studies have explored the relationship between citrate and AD. However, our current analysis found a significant protective effect of citrate on AD risk. Although additional evidence is needed, this finding might provide valuable information to help understand the underlying biological mechanisms in the pathogenesis of AD.
Our study also has several limitations. First, a general challenge of MR is the persistent possibility of horizontal pleiotropic associations between exposure and outcome. In the present study, we conducted up-to-date analyses to detect and correct for potential pleiotropy. One limitation is that the Q-test and MR-PRESSO global test were significant for some metabolites. Nevertheless, the MR-PRESSO outlier test was further performed to correct for horizontal pleiotropy and return an unbiased causal estimate. Second, some metabolites yielded effect estimates in opposite directions between the sensitivity analyses and IVW. It is generally recommended that the emphasis of sensitivity analysis be placed on the direction of point estimates across the IVW and sensitivity analyses, rather than on the P values alone. Although this standard excluded several metabolites from the robust results, a stringent screening protocol ensures the reliability of our results. For instance, Gp and glutamine showed the most robust causality. However, we do not have enough evidence to conclude that the “non-robust” metabolites are not associated with AD. Third, we used a proxy-AD GWAS dataset to verify our analysis. Hence, the phenotypes used for validation differed from those used in the primary analysis, resulting in smaller effect sizes. However, the validation is an independent replication analysis because there is no overlap between the primary AD dataset and the validation dataset.
Despite these limitations, the study has notable strengths. Our study provides novel insight by combining metabolomics with genomics to help understand the pathogenesis of AD. The use of a two-sample MR approach also enabled us to use very large AD case–control data, giving sufficient power to detect even small effects. In addition, there is no overlap between the exposure and outcome datasets; such overlap is unavoidable in many MR studies and may bias effect estimates. Stronger evidence of causal relationships is of great importance because the pathophysiological mechanisms underlying AD are unclear. If these circulating metabolite levels truly influence AD risk, they would be promising markers for early detection and potential avenues for effective therapeutic intervention in AD.
In conclusion, our study suggests that increased levels of circulating Gp, ApoB, LDL-C, and serum-C are most robustly associated with higher risk of AD, whereas glutamine showed the opposite effect. We found strong evidence for causal effects of several LDL subfractions of differing composition on increased AD risk. The present study provides little evidence that raising circulating HDL-C would help to prevent AD. Further research is required to decipher the biological pathways underpinning these associations.
AD, Alzheimer’s disease; ADGC, Alzheimer Disease Genetics Consortium; AGP, α-1-acid glycoprotein; CI, confidence interval; CHARGE, Cohorts for Heart and Aging Research in Genomic Epidemiology Consortium; EADI, The European Alzheimer’s Disease Initiative; FDR, false discovery rate; GERAD/PERADES, Genetic and Environmental Risk in Alzheimer’s Disease/Defining Genetic, Polygenic and Environmental Risk for Alzheimer’s Disease Consortium; GWAS, genome-wide association studies; Gp, glycoprotein acetyls; HDL, high-density lipoprotein; IDL, intermediate-density lipoprotein; KEGG, Kyoto Encyclopedia of Genes and Genomes; LDL, low-density lipoprotein; NOME, NO Measurement Error; OR, odds ratio; SD, standard deviation; SNP, single nucleotide polymorphism; IGAP, International Genomics of Alzheimer's Project; IV, instrumental variable; IVW, inverse variance weighting; MBE, mode-based estimate; MR, Mendelian randomization; MR-PRESSO, Mendelian randomization pleiotropy residual sum and outlier; serum-C, serum total cholesterol; TCA, tricarboxylic acid.
Ethics approval and consent to participate
All data sources used in this MR study received approval from an ethics standards committee on human experimentation and obtained informed consent from all participants.
Consent for publication
Not applicable.
Availability of data and materials
All the data used in this study can be acquired from the original genome-wide association studies that are mentioned in the text. Any other data generated in the analysis process can be requested from the corresponding author.
Competing interests
The authors declare that they have no competing interests.
Funding
This study was supported by grants from the National Natural Science Foundation of China (91849126), the National Key R&D Program of China (2018YFC1314702), the Shanghai Municipal Science and Technology Major Project (No. 2018SHZDZX03) and ZHANGJIANG LAB, the Tianqiao and Chrissy Chen Institute, and the State Key Laboratory of Neurobiology and Frontiers Center for Brain Science of the Ministry of Education, Fudan University.
Authors' contributions
Jin-Tai Yu had full access to all the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis.
Concept and design: Jin-Tai Yu.
Acquisition, analysis, or interpretation of data: All authors.
Drafting of the manuscript: Shu-Yi Huang.
Critical revision of the manuscript for important intellectual content: All authors.
Statistical analysis: Shu-Yi Huang.
Obtained funding: Jin-Tai Yu.
Administrative, technical, or material support: Jin-Tai Yu.
Supervision: Jin-Tai Yu.
Acknowledgments
This work was made possible by the generous sharing of GWAS summary statistics. We thank the participants, researchers, and staff associated with the many other studies from which we used data for this report. We thank the UK Biobank for providing summary statistics for these analyses. We also thank the IGAP for providing summary results data for these analyses. The investigators within IGAP contributed to the design and implementation of IGAP and/or provided data but did not participate in analysis or writing of this report. IGAP was made possible by the generous participation of the control subjects, the patients, and their families. The i–Select chips were funded by the French National Foundation on Alzheimer's disease and related disorders. EADI was supported by the LABEX DISTALZ grant, Inserm, Institut Pasteur de Lille, Université de Lille 2 and the Lille University Hospital. GERAD was supported by the Medical Research Council (Grant n° 503480), Alzheimer's Research UK (Grant n° 503176), the Wellcome Trust (Grant n° 082604/2/07/Z) and German Federal Ministry of Education and Research (BMBF): Competence Network Dementia (CND) grant n° 01GI0102, 01GI0711, 01GI0420. CHARGE was partly supported by the NIH/NIA grant R01 AG033193 and the NIA AG081220 and AGES contract N01–AG–12100, the NHLBI grant R01 HL105756, the Icelandic Heart Association, and the Erasmus Medical Center and Erasmus University. ADGC was supported by the NIH/NIA grants: U01 AG032984, U24 AG021886, U01 AG016976, and the Alzheimer's Association grant ADGC–10–196728.
| Metabolites | MR-Egger intercept P-value | Q test P-value | MR-PRESSO global test P-value | Corrected MR-PRESSO OR (95% CI) | Corrected MR-PRESSO P-value |
|---|---|---|---|---|---|
| Protein | | | | | |
| ApoB | 0.80 | <0.001 | <0.001 | 2.20 (1.66, 2.92) | 2.74E-04 |
| Glycoprotein acetyls | 0.83 | 0.12 | 0.20 | - | - |
| Carboxylic acids and derivatives | | | | | |
| Citrate | 0.48 | 0.93 | 0.96 | - | - |
| Glutamine | 0.92 | 1.00 | 0.88 | - | - |
| Lipoprotein | | | | | |
| IDL-C | 0.53 | <0.001 | <0.001 | 1.72 (1.48, 1.99) | 1.65E-07 |
| IDL-FC | 0.27 | <0.001 | <0.001 | 1.08 (0.99, 1.17) | 0.074 |
| IDL-L | 0.26 | <0.001 | <0.001 | 1.24 (1.09, 1.40) | 0.002 |
| IDL-P | 0.33 | <0.001 | <0.001 | 1.35 (1.16, 1.57) | 7.22E-04 |
| IDL-PL | 0.83 | <0.001 | <0.001 | 1.69 (1.43, 1.98) | 1.19E-06 |
| LDL-C | 0.44 | <0.001 | <0.001 | 1.59 (1.40, 1.81) | 2.83E-08 |
| L-LDL-C | 0.41 | <0.001 | <0.001 | 1.67 (1.46, 1.91) | 6.58E-08 |
| L-LDL-CE | 0.73 | <0.001 | <0.001 | 1.69 (1.49, 1.91) | 8.22E-09 |
| L-LDL-FC | 0.41 | <0.001 | <0.001 | 1.69 (1.47, 1.94) | 9.34E-08 |
| L-LDL-L | 0.51 | <0.001 | <0.001 | 1.61 (1.40, 1.85) | 1.77E-07 |
| L-LDL-P | 0.34 | <0.001 | <0.001 | 1.68 (1.47, 1.92) | 2.38E-08 |
| L-LDL-PL | 0.34 | <0.001 | <0.001 | 1.64 (1.45, 1.84) | 3.45E-09 |
| M-LDL-C | 0.42 | <0.001 | <0.001 | 1.81 (1.57, 2.09) | 4.10E-08 |
| M-LDL-CE | 0.40 | <0.001 | <0.001 | 1.74 (1.50, 2.01) | 8.98E-08 |
| M-LDL-L | 0.42 | <0.001 | <0.001 | 1.89 (1.62, 2.21) | 1.45E-07 |
| M-LDL-P | 0.17 | <0.001 | <0.001 | 1.78 (1.52, 2.09) | 3.32E-07 |
| M-LDL-PL | 0.33 | <0.001 | <0.001 | 1.91 (1.59, 2.30) | 3.53E-06 |
| S-LDL-C | 0.82 | <0.001 | <0.001 | 1.96 (1.61, 2.39) | 4.07E-06 |
| S-LDL-L | 0.44 | <0.001 | <0.001 | 2.09 (1.62, 2.69) | 5.46E-05 |
| S-LDL-P | 0.14 | <0.001 | <0.001 | 2.09 (1.62, 2.70) | 4.38E-05 |
| S-HDL-P | 0.19 | 0.45 | <0.001 | 1.93 (1.60, 2.33) | 2.62E-06 |
| XL-HDL-C | 0.81 | 0.11 | 0.11 | - | - |
| XL-HDL-CE | 0.47 | 0.12 | 0.13 | - | - |
| XL-HDL-FC | 0.72 | 0.09 | <0.001 | 1.64 (1.45, 1.84) | 3.45E-09 |
| XL-HDL-P | 0.96 | <0.001 | <0.001 | 0.91 (0.86, 0.98) | 0.012 |
| XS-VLDL-PL | 0.13 | <0.001 | <0.001 | 1.03 (0.97, 1.09) | 0.396 |
| Steroids and steroid derivatives | | | | | |
| Serum-C | 0.78 | <0.001 | 0.60 | - | - |
| Esterified cholesterol | 0.48 | <0.001 | <0.001 | 2.18 (1.54, 3.08) | 0.004 |
CI, confidence interval; MR-PRESSO, Mendelian randomization pleiotropy residual sum and outlier; OR, odds ratio.
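The Q test column reports Cochran's heterogeneity test across the SNP-specific causal estimates. As a rough illustration of what that statistic measures, the sketch below computes a fixed-effect inverse-variance weighted (IVW) estimate from per-SNP Wald ratios together with Cochran's Q. The function name `ivw_and_q` and the three-SNP input are hypothetical, the weights use a first-order approximation that ignores uncertainty in the SNP-exposure associations, and this is not the authors' analysis code.

```python
import math

def ivw_and_q(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate and Cochran's Q from per-SNP summary statistics.

    Hypothetical helper for illustration only. Weights are first-order:
    they ignore uncertainty in the SNP-exposure associations.
    """
    # Wald ratio per SNP: SNP-outcome effect divided by SNP-exposure effect
    ratios = [bo / be for be, bo in zip(beta_exposure, beta_outcome)]
    # Inverse-variance weight of each Wald ratio: (beta_exposure / se_outcome)^2
    weights = [(be / so) ** 2 for be, so in zip(beta_exposure, se_outcome)]
    total_w = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_w
    se = math.sqrt(1.0 / total_w)
    # Cochran's Q: weighted squared deviations of SNP estimates from the pooled one
    q = sum(w * (r - estimate) ** 2 for w, r in zip(weights, ratios))
    return {
        "log_or": estimate,
        "or": math.exp(estimate),
        "ci95": (math.exp(estimate - 1.96 * se), math.exp(estimate + 1.96 * se)),
        "q": q,
        "q_df": len(ratios) - 1,
    }

# Made-up numbers for three SNPs (not taken from the study)
result = ivw_and_q(
    beta_exposure=[0.10, 0.20, 0.15],
    beta_outcome=[0.05, 0.10, 0.075],
    se_outcome=[0.02, 0.03, 0.025],
)
```

Under the null hypothesis of no heterogeneity, Q is compared against a chi-square distribution with `q_df` degrees of freedom. A small Q test P-value, as reported for most LDL traits in the table, indicates that the instruments disagree more than sampling error alone would explain.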
| Metabolic pathway | Metabolites involved | P value | FDR | Database |
|---|---|---|---|---|
| D-Glutamine and D-glutamate metabolism | L-Glutamine; L-Glutamine | 1.30E-05 | 0.001 | KEGG |
| Alanine, aspartate and glutamate metabolism | L-Glutamine; Citric acid | 0.0003 | 0.012 | KEGG |
| Glyoxylate and dicarboxylate metabolism | L-Glutamine; Citric acid | 0.0004 | 0.012 | KEGG |
| Nitrogen metabolism | L-Glutamine | 0.0079 | 0.165 | KEGG |
| Arginine biosynthesis | L-Glutamine | 0.0183 | 0.308 | KEGG |
| Citrate cycle (TCA cycle) | Citric acid | 0.0261 | 0.366 | KEGG |
FDR, false discovery rate; KEGG, Kyoto Encyclopedia of Genes and Genomes; TCA, tricarboxylic acid.
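The FDR column adjusts each pathway P value for multiple testing. This section does not state which procedure was used, so the sketch below assumes the Benjamini-Hochberg step-up method, a common choice. The function name is hypothetical, and the example applies the adjustment to the six listed P values only. The published correction would have been applied across all pathways tested, so the values computed here are smaller than those in the FDR column.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Illustrative helper; assumes p_values holds every test in the family.
    """
    m = len(p_values)
    # Indices of the P values sorted from smallest to largest
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P value down, enforcing monotonicity
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# The six P values listed in the table (the full set of tested pathways is not given)
listed_p = [1.30e-05, 0.0003, 0.0004, 0.0079, 0.0183, 0.0261]
adjusted = benjamini_hochberg(listed_p)
```

Each adjusted value is the raw P value scaled by the number of tests over its rank, then capped by the adjusted values of all larger P values so that the sequence stays monotone.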