In this study, we analyzed samples from the CN and AD subjects separately, as we reasoned that the CSF biomarker-associated DNAm discovered in CN samples would most likely be associated with AD risk; in contrast, after the onset of disease, the CSF biomarker-associated DNAm in AD samples would most likely be associated with both AD risk and changes caused by AD pathologies that accumulate in the brain. Supporting this premise, we found that the significant DNAm identified in AD and CN samples were largely distinct (Supplementary Fig. 2). There was also little correlation between DNAm-to-AD biomarker associations in the two groups of subjects, at the level of both CpGs (Supplementary Fig. 4) and pathways (Supplementary Fig. 5). These results suggest that the epigenetic changes associated with pathological processes differ between cognitively normal subjects (some of whom might later go on to develop AD) and AD patients, supporting the recommendation to consider the patients' disease stage when developing treatment strategies 77,78.
Our comprehensive analyses identified a number of DNAm differences significantly associated with CSF biomarkers Aβ42, pTau181, and tTau, many of which were associated with genes previously implicated in AD pathogenesis. Specifically, in the analysis of CN subjects, we identified 1 CpG (cg06171420) mapped to around 5 kb upstream of the PCBP3 gene, significantly associated with tTau at 5% FDR (Supplementary Table 4, Supplementary Fig. 9). The PCBP3 gene encodes the RNA-binding protein hnRNPE3 (poly(rC) binding protein 3), which regulates alternative splicing of the tau gene79,80. In Down Syndrome, AD, and other neurodegenerative diseases, an abnormal ratio of tau protein isoforms often results in aggregated tau, a major component of neurofibrillary tangles. In the region-based analysis, the most significant CSF Aβ42-associated DMR is located in the promoter of the THRB gene (Supplementary Fig. 10), which encodes a receptor for the thyroid hormone, previously observed to be dysregulated in AD subjects81–83.
In AD subjects, we identified considerably more DNA methylation differences associated with the CSF biomarkers; a total of 112, 4, and 3 CpGs reached 5% FDR in their association with Aβ42, pTau181, and tTau, respectively. Among the top 10 most significant CpGs associated with Aβ42 (Table 2), cg24037493 maps to the promoter of the SFXN1 gene and is significantly associated with CSF Aβ42 in AD subjects (Supplementary Fig. 11). SFXN1 encodes the mitochondrial serine transporter, which helps to maintain mitochondrial iron homeostasis 84. It has been observed that iron accumulates in the brains of AD subjects and that iron levels correlate significantly with cognitive decline85–87. Similarly, among the top 10 most significant pTau181 and tTau-associated CpGs (Table 3), cg03037740 maps to the promoter of the RING1 gene and is significantly associated with CSF pTau181 (Supplementary Fig. 12). RING1 encodes a protein that interacts with the polycomb protein BMI1, which plays a critical role in AD pathogenesis. Remarkably, it has been demonstrated that reduced expression of BMI1 protein alone is sufficient to induce both amyloid and tau pathologies in both cellular and animal models88,89. The most significant promoter DMR associated with Aβ42 is located at the TMEM204 gene (Supplementary Fig. 13), which encodes a transmembrane protein that functions as a cell surface marker for infiltrating microglia in the CNS during neuroinflammation90. Similarly, the most significant promoter DMR associated with pTau181 is located at the FBP1 gene (Supplementary Fig. 14), which encodes an enzyme that regulates glucose and energy metabolism. It has been observed that the expression levels of FBP1 are reduced in the brains of patients at risk for AD91,92, consistent with our observed hypermethylation at the promoter of the FBP1 gene in samples with increased levels of pTau181.
Taken together, these results demonstrated that our analysis nominated biologically meaningful DNA methylation loci in the blood associated with AD and, importantly, that changes in the different pathological processes in the CSF, both before and after the clinical diagnosis of AD, are reflected in the epigenome.
In AD samples, the most significant pathways that reached 5% FDR are cardiac conduction (P-value = 2.76 × 10⁻⁴, FDR = 2.54 × 10⁻²) and muscle conduction (P-value = 1.42 × 10⁻⁴, FDR = 2.54 × 10⁻²), which also achieved 25% FDR in CN samples (P-value = 3.58 × 10⁻⁴, FDR = 6.58 × 10⁻²; P-value = 5.63 × 10⁻⁴, FDR = 7.85 × 10⁻²). In recent years, the interaction between the heart and brain has increasingly been recognized93. Cardiovascular disease, even subclinical cardiac damage, has been shown to be a significant risk factor for dementia94–97.
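As an aside on how the FDR values above can arise, the following is a minimal sketch of the Benjamini-Hochberg adjustment. This section does not state which FDR procedure was used, so Benjamini-Hochberg is an assumption here. The function name `bh_adjust` and the list `pvals` are hypothetical. Only the first two P-values are taken from the AD-sample results quoted above; the remaining six are made-up filler, so the adjusted values will not match the reported FDRs. The sketch shows one way two pathways with different P-values can receive the same FDR: the monotonicity step gives the smaller P-value the adjusted value of the next-ranked one.

```python
def bh_adjust(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    # Indices of the p-values, sorted from smallest to largest p.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest, so that an adjusted
    # value never exceeds the adjusted value of any larger p-value.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# First two values: P-values quoted in the text for the AD samples.
# Remaining values: illustrative filler only (NOT from the study).
pvals = [2.76e-4, 1.42e-4, 0.004, 0.02, 0.15, 0.40, 0.73, 0.91]
fdr = bh_adjust(pvals)

for p, q in zip(pvals, fdr):
    print(f"P = {p:.3g}  FDR = {q:.3g}")
```

In this toy input, the two smallest P-values end up with identical adjusted values, mirroring the pattern in which both pathways share a single FDR.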
Interestingly, in CN samples, the KEGG pathway “Alzheimer’s disease” is among the most significant pathways enriched with significant CpGs; this pathway was curated based on recent AD literature and includes genes that confer AD risk, such as APOE, PSENEN, MAPT, CALM3, MME, and others. Also, in CN samples, the most significant pathway is the calcium signaling pathway (P-value = 2.39 × 10⁻⁴, FDR = 9.09 × 10⁻³), consistent with the calcium hypothesis of AD, which posits that dysregulated neuronal calcium homeostasis induces impaired synaptic plasticity and defective neurotransmission, promotes accumulation of Aβ and tau proteins, and subsequently leads to neuronal apoptosis in the brain98,99. Moreover, increased levels of free intracellular calcium have also been observed in normal aging, the strongest risk factor for AD100,101. The second most significant pathway is the regulation of actin cytoskeleton (P-value = 1.61 × 10⁻³, FDR = 2.51 × 10⁻²), consistent with the observation that synapse degeneration is a key early feature of AD pathogenesis102,103, and that stability of the actin cytoskeleton is crucial for maintaining functional integrity of the dendritic spines at sites for neurotransmission in the brain104. These results suggest that some of the brain impairment during the early stages of the disease (i.e., preclinical) is also reflected in the blood epigenome.
Although the majority of the CSF biomarker-associated DNAm differed in CN and AD samples, our analyses also identified a small number of DMRs that were significantly associated with CSF biomarkers in both groups (Supplementary Fig. 2), which could serve as candidate biomarkers in future studies of AD progression. Specifically, three DMRs, all of which were associated with Aβ42, reached Sidak adjusted P-value < 0.05 in both CN and AD sample analyses. The first DMR, chr15:69744390–69744763, is located at the promoter of the RPLP1 gene, which encodes a subunit protein of the ribosome. Defective ribosomal function is associated with a decreased capacity for protein synthesis and a reduced number of synapses, and has been observed as an early feature of AD preceding neuronal loss105,106. Another noteworthy result is a pair of overlapping DMRs significantly associated with CSF Aβ42, at chr6:30130819–30131284 in AD samples and chr6:30130819–30131362 in CN samples. Both are located in the promoter of the TRIM15 gene, which encodes a member of the TRIM protein family. This family is involved in the ubiquitin system responsible for degrading misfolded protein aggregates and plays important roles in neurodegenerative diseases107,108.
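For reference, the general form of the Šidák adjustment mentioned above can be written as follows. The symbols are introduced here for illustration only: p is the unadjusted region-level P-value and n is the number of independent tests being corrected for. How n was chosen for the DMR analysis is not stated in this section, so the expression is a generic sketch rather than the exact computation used.

```latex
\[
  p_{\mathrm{adj}} = 1 - (1 - p)^{n}
\]
```

When np is small, this is approximately np, so the Šidák adjustment gives results close to, and slightly less conservative than, a Bonferroni correction.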
To validate our findings, we studied premortem blood DNAm associated with postmortem Braak stage measured on prefrontal cortex samples in an independent dataset, previously described as the London dataset7. Encouragingly, we found that a number of CSF-biomarker-associated blood DNAm also correlated significantly with the Braak stage, which corresponds to neurofibrillary tangle tau pathology burden in the brain (Supplementary Tables 17–18). In the London dataset, we observed a strong blood DNAm to Braak stage association signal located at a DMR in the promoter region of the HOXA5 gene. Interestingly, this locus also showed a significant association with CSF pTau181 in the ADNI dataset (Supplementary Table 18, Supplementary Fig. 15). Moreover, we also observed a significant correlation between brain DNAm and blood DNAm at a subset of 7 CpGs within the DMR (Supplementary Fig. 7), as well as a significant association between the DMR and downstream target gene expression (Supplementary Fig. 8). Consistent with previous studies, which discovered that extensive hypermethylation in the brain at the HOXA gene clusters is significantly associated with tau neuropathology7, our study provided strong evidence that these hypermethylated CpGs can also be observed in the blood epigenome and are significantly associated with pTau181 levels in the CSF (Supplementary Table 18). Taken together, these results nominate hypermethylation at the HOXA5 locus in the blood as a plausible biomarker for tau pathology.
On the other hand, given that brain and blood cells originate from different developmental cell lineages, previous studies also suggested that DNA methylation profiles are, by and large, distinct between brain and blood7,17,109. Consistent with these previous results, when we compared the blood DNAm from this study with brain DNAm associated with AD pathology in two large recent meta-analyses of postmortem brain tissues9,110, only a few DNAm (3 CpGs and 8 DMRs), mapped to the PRSSL1, LINGO3, SPRED2, HOXA2, NR2F1, CPT1B, HOXA5, and ZFPM1 genes and to intergenic regions, were significant in both the blood DNAm-to-CSF Aβ42/pTau181 association analysis and the brain DNAm-to-brain Aβ/tau association analysis (Supplementary Tables 4–9). Also, there was no overlap between blood DNAm associated with the CSF AD biomarkers and blood DNAm associated with clinical AD in our previous meta-analyses of two large clinical AD datasets17,111. This is not surprising, given the disconnection between brain pathology and clinical diagnosis in AD; it has been observed that a substantial proportion of cognitively normal subjects also have AD pathology in the brain20,21.
This study has several limitations. First, we analyzed the methylation levels measured on whole blood, which contains a complex mixture of cell types. To reduce confounding effects due to different cell types, we included estimated cell-type proportions as covariate variables in all our analyses. Future studies that utilize single-cell technology for gene expression and DNAm could improve power and shed more light on the particular cell types affected by the DNAm loci discovered in this study. Second, to study DNAm associated with CSF biomarkers in subjects at different stages of the disease (i.e., preclinical or clinical), we separately analyzed samples from cognitively normal and AD subjects, which reduced the sample sizes of the analysis datasets considerably. Given the modest sample size, we pre-defined a more liberal significance threshold (i.e., P-value < 10⁻⁵) based on previous analyses of blood DNA methylation data 17,37,43,112, to select a small number of loci that were then further prioritized using additional integrative analyses. Future studies with larger sample sizes are needed to identify and replicate DNAm loci at more stringent significance thresholds. Third, we did not consider MCI subjects in this study because there is considerable heterogeneity among MCI subjects, with subjects converting to AD along different trajectories113. As ADNI is currently conducting additional phases of the study, future analyses with a larger sample size will make it possible to detect DNA methylation to CSF AD biomarker associations in different subgroups of MCI subjects. Fourth, although women make up about two-thirds of AD patients in the general U.S. population1, our study cohort (which had both CSF biomarkers and blood DNAm available in ADNI) had a lower proportion of females in the AD group than in the CN group (37% vs. 51%) (Table 1). Therefore, our study cohort may not represent a random sample from the general population.
In all our analyses, we adjusted for sex in addition to other covariate variables, so the DNAm-to-CSF biomarker associations we identified are independent of sex. Large and diverse community-based cohort studies that validate our findings are needed. Fifth, as recent autopsy studies revealed that about a quarter of CN subjects also show AD neuropathology in the brain20,21, the CSF biomarker-associated methylation we observed in CN subjects could potentially mark an early feature of AD that precedes clinical diagnosis. Future studies that develop DNAm-based prediction models for diagnosing AD and compare their performance with state-of-the-art plasma biomarkers of AD are needed. Finally, the associations we identified do not necessarily reflect causal relationships. Future studies are needed to establish the causality of the nominated DNA methylation markers.